BUG: masked array fill_value inconsistencies with float32 · Issue #22141 · numpy/numpy · GitHub

Open
mwtoews opened this issue Aug 16, 2022 · 1 comment

mwtoews commented Aug 16, 2022

Describe the issue:

There are two, probably related, issues involving a masked array of 32-bit floats:

  1. fill_value is expected to have the same dtype as the parent array, but whether it does depends on how the attribute is accessed (shown in the example).
  2. The dtype of fill_value depends on the order in which the fill_value attribute is read and set, which is also not expected.

Reproduce the code example:

import numpy as np

print(np.__version__)

# Example 1: the dtype of fill_value is set to float64
ar1 = np.ma.arange(6, dtype=np.float32)
getattr(ar1, "fill_value")
ar1.fill_value = -999.
print("Example 1: {0}, {1}".format(ar1.dtype, ar1.fill_value.dtype))

# Example 2: the dtype of fill_value is set to float32
ar2 = np.ma.arange(6, dtype=np.float32)
ar2.fill_value = -999.
print("Example 2: {0}, {1}".format(ar2.dtype, ar2.fill_value.dtype))

output:

1.23.2
Example 1: float32, float64
Example 2: float32, float32

NumPy/Python version information:

1.23.2 3.8.10 (default, Jun 22 2022, 20:18:18) 
[GCC 9.4.0]

Note that versions as far back as NumPy 1.16.5 (included with Ubuntu) also have the same behavior, so it's not new.
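
As a possible stopgap while the inconsistency exists, the fill value can be cast to the array's dtype at the point of use. This is only a minimal sketch: `fill_value_as_array_dtype` is a hypothetical helper name, not part of NumPy, and it does not change what `fill_value` itself stores.

```python
import numpy as np


def fill_value_as_array_dtype(marr):
    """Return marr.fill_value cast to marr.dtype (hypothetical helper)."""
    # np.asarray wraps the scalar in a 0-d array, astype performs the cast,
    # and [()] extracts the resulting NumPy scalar again.
    return np.asarray(marr.fill_value).astype(marr.dtype)[()]


arr = np.ma.arange(6, dtype=np.float32)
_ = arr.fill_value       # read first: the ordering from Example 1 in the report
arr.fill_value = -999.0  # then set

coerced = fill_value_as_array_dtype(arr)
# The middle value may be float64 or float32 depending on the NumPy version;
# the last one is always the array's own dtype.
print(arr.dtype, np.asarray(arr.fill_value).dtype, coerced.dtype)
```

The cast is safe here because -999 is exactly representable in float32; for values outside the target dtype's range, the cast could silently change the value.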

cmarmo commented Aug 18, 2022

Hello, I was thinking of submitting a PR, but I have some doubts about how to solve the issue, in particular where integer arrays are concerned.

If I understand correctly, calling getattr() is the same as not setting fill_value, so the following code

import numpy as np
print(np.__version__)
ar1 = np.ma.arange(6, dtype=np.int8)
getattr(ar1, "fill_value")
print("Example 1: {0}, {1}, {2}".format(ar1.dtype, ar1.fill_value.dtype, ar1.fill_value))
ar2 = np.ma.arange(6, dtype=np.int8)
print("Example 2: {0}, {1}, {2}".format(ar2.dtype, ar2.fill_value.dtype, ar2.fill_value))

outputs

1.24.0.dev0+653.ge8c45599c
Example 1: int8, int64, 999999
Example 2: int8, int64, 999999

When fill_value is set explicitly,

import numpy as np
print(np.__version__)
ar1 = np.ma.arange(6, dtype=np.int8)
getattr(ar1, "fill_value")
print("Example 1: {0}, {1}, {2}".format(ar1.dtype, ar1.fill_value.dtype, ar1.fill_value))
ar2 = np.ma.arange(6, dtype=np.int8)
ar2.fill_value = -999.
print("Example 2: {0}, {1}, {2}".format(ar2.dtype, ar2.fill_value.dtype, ar2.fill_value))

the output is

1.24.0.dev0+653.ge8c45599c
Example 1: int8, int64, 999999
Example 2: int8, int8, 25
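
The value 25 in Example 2 is what -999 becomes if it wraps around modulo 256 into the signed 8-bit range. The arithmetic can be checked in plain Python; that the cast wraps this way is an assumption inferred from the output above, and `wrap_to_int8` is a hypothetical helper, not a NumPy function.

```python
def wrap_to_int8(value):
    """Wrap a Python int into the signed 8-bit range [-128, 127]."""
    # Shift into [0, 256), reduce modulo 256, then shift back.
    return (value + 128) % 256 - 128


print(wrap_to_int8(-999))  # 25, matching the fill_value reported for int8
```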

This is because, when fill_value is set explicitly, the function

def _check_fill_value(fill_value, ndtype):
checks whether fill_value is consistent with the array dtype and, if needed, forces it to be consistent.

I was thinking of enforcing dtype consistency in the defaults, but then the test

def test_str_repr(self):
fails for arrays of dtype int8.
Enforcing consistency would force users to choose fill values within the range of the specific dtype, whereas at present there is at least one way to use a value that is likely to be very different from the data range. Is that acceptable?
Or would better documentation of the current behavior be sufficient to resolve this issue?
Or am I on a completely wrong path?
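
To make the trade-off concrete, here is a small sketch that checks whether a candidate fill value is representable in a given integer dtype, using `np.iinfo`. The helper name `fits_in_int_dtype` is hypothetical and only illustrates the range constraint discussed above; it is not how NumPy validates fill values.

```python
import numpy as np


def fits_in_int_dtype(value, dtype):
    """Return True if an integer value lies within the dtype's range."""
    info = np.iinfo(dtype)  # min/max limits of the integer dtype
    return info.min <= value <= info.max


# The default integer fill value shown above is 999999.
print(fits_in_int_dtype(999999, np.int8))   # False: int8 covers -128..127
print(fits_in_int_dtype(999999, np.int64))  # True
```

With strict dtype consistency, a default such as 999999 could not be kept for int8 arrays, which is the situation the questions above are about.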

Thanks for listening.
