Operation on masked_array changes fill_value #3762

agartland · 2013-09-19T00:13:20Z

I first raised this issue on Stack Overflow (see the link at the bottom).

It seems like the new masked_array should inherit the fill_value from the masked_array being summed.

Can someone explain this behavior of a numpy masked_array? The fill_value changes after applying the sum operation, which is confusing if you intend to use the filled result.

from numpy import ma, nan, nansum, ones, zeros

data = ones((5, 5))
m = zeros((5, 5), dtype=bool)

# Mask out row 3
m[3, :] = True
arr = ma.masked_array(data, mask=m, fill_value=nan)

print(arr)
print('Fill value:', arr.fill_value)
print(arr.filled())

# Sum along each row; row 3 is fully masked
farr = arr.sum(axis=1)
print(farr)
print('Fill value:', farr.fill_value)
print(farr.filled())

# I was expecting this
print(nansum(arr.filled(), axis=1))

Prints output:

[[1.0 1.0 1.0 1.0 1.0]
 [1.0 1.0 1.0 1.0 1.0]
 [1.0 1.0 1.0 1.0 1.0]
 [-- -- -- -- --]
 [1.0 1.0 1.0 1.0 1.0]]
Fill value: nan
[[  1.   1.   1.   1.   1.]
 [  1.   1.   1.   1.   1.]
 [  1.   1.   1.   1.   1.]
 [ nan  nan  nan  nan  nan]
 [  1.   1.   1.   1.   1.]]
[5.0 5.0 5.0 -- 5.0]
Fill value: 1e+20
[  5.00000000e+00   5.00000000e+00   5.00000000e+00   1.00000000e+20
   5.00000000e+00]
[  5.   5.   5.  nan   5.]

http://stackoverflow.com/questions/18879272/why-does-sum-operation-on-numpy-masked-array-change-fill-value-to-1e20
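
For readers who need the filled result right away, here is a minimal workaround sketch. It is not a fix for the behavior reported above. It only uses `filled()` with an explicit argument and assignment to the `fill_value` attribute; the variable names `row_sums`, `filled_explicit` and `filled_copied` are made up for this illustration.

```python
import numpy as np

data = np.ones((5, 5))
mask = np.zeros((5, 5), dtype=bool)
mask[3, :] = True  # mask out row 3
arr = np.ma.masked_array(data, mask=mask, fill_value=np.nan)

row_sums = arr.sum(axis=1)

# Option 1: pass the desired fill value explicitly when filling,
# so the result's own fill_value does not matter.
filled_explicit = row_sums.filled(arr.fill_value)

# Option 2: copy the fill value onto the result, then fill as usual.
row_sums.fill_value = arr.fill_value
filled_copied = row_sums.filled()

print(filled_explicit)
print(filled_copied)
```

Both options put nan in the fully masked row regardless of which fill_value the reduction itself returns.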

akleeman · 2013-12-12T01:49:24Z

A similar thing happens with the take operation in numpy 1.7 ...

np.version.version
Out: '1.7.0'
tmp = np.random.normal(size=(3, 4))
masked = np.ma.masked_array(tmp, mask=tmp > 0, fill_value=np.nan)
masked.fill_value
Out: nan
masked.take([0, 1, 2], axis=0).fill_value
Out: 1e+20
np.all(masked.take([0, 1, 2], axis=0) == masked)
Out: True
np.all(masked.take([0, 1, 2], axis=0).filled() == masked.filled())
Out: False

Looks like the same thing would happen in numpy 1.8
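
One caveat when reading the last comparison above: with a nan fill value, comparing filled arrays with `==` reports a mismatch at every masked position even if both arrays use the same fill value, because nan never compares equal to itself. The sketch below fills both arrays with the same explicit value and uses a nan-aware comparison. The input values are arbitrary and chosen only for illustration.

```python
import numpy as np

# Fixed values (instead of random ones) so the mask is reproducible.
tmp = np.array([[-1.0, 0.5, -0.3, 2.0],
                [0.1, -0.2, 0.3, -0.4],
                [-2.0, -1.0, 1.5, 0.7]])
masked = np.ma.masked_array(tmp, mask=tmp > 0, fill_value=np.nan)
taken = masked.take([0, 1, 2], axis=0)

# Fill both arrays with the same explicit value so the comparison
# does not depend on which fill_value take() returns.
a = masked.filled(np.nan)
b = taken.filled(np.nan)

naive = np.all(a == b)                      # nan == nan is False
nan_aware = np.allclose(a, b, equal_nan=True)

print(naive, nan_aware)
```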

charris · 2014-02-15T04:29:54Z

Sounds like something reasonable to do.

alimuldal · 2016-02-10T23:30:18Z

A related question popped up on SO today: http://stackoverflow.com/q/35324836/1461210

In versions >= 1.10.0:

import numpy as np
x = np.random.randn(10, 10)
m = np.ma.masked_array(x, x < 0.5, fill_value=1)
print((m * m).fill_value)
# 1e+20
print(np.multiply(m, m).fill_value)
# 1.0

Prior to 1.10.0 I get a fill value of 1.0 in both cases, which seems much more reasonable to me. The behaviour in question was introduced in 3c6b6ba.
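
Until the operator form and the function form agree, one option is to copy the fill value onto the result by hand. The sketch below does that with a small helper; `carry_fill_value` is a hypothetical name introduced here, not a NumPy function, and the input values are arbitrary.

```python
import numpy as np


def carry_fill_value(source, result):
    """Hypothetical helper (not part of NumPy): give `result` the
    fill_value of `source` and return it."""
    # np.ma.masked is a shared singleton, so it is left untouched.
    if isinstance(result, np.ma.MaskedArray) and result is not np.ma.masked:
        result.fill_value = source.fill_value
    return result


x = np.array([[0.1, 0.7], [0.9, 0.2]])
m = np.ma.masked_array(x, x < 0.5, fill_value=1)

# What the operator form reports here depends on the NumPy version.
print((m * m).fill_value)

product = carry_fill_value(m, m * m)
print(product.fill_value)
print(product.filled())
```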

charris · 2016-02-11T16:26:45Z

Probably related: #7122.

alimuldal · 2016-02-14T01:28:41Z

@charris Definitely looks that way - I also did a bisect and identified 3c6b6ba as the culprit

jjhelmus · 2016-04-01T21:01:01Z

It looks like most (if not all) of the reduction operations do not preserve the fill_value attribute:

import numpy as np
a = np.ma.arange(12, fill_value=88).reshape(3, 4)
print(np.__version__)
print("Original fill_value:", a.fill_value)
print("a.sum fill_value:", a.sum(axis=1).fill_value)
print("a.cumsum fill_value:",  a.cumsum().fill_value)

1.12.0.dev0+2af06c8
Original fill_value: 88
a.sum fill_value: 999999
a.cumsum fill_value: 999999

This could be fixed (and I'm willing to submit a PR) by explicitly setting the fill value of the returned array in the various MaskedArray methods. Is this the desired behavior, or should the current behavior of using the default fill_value for these arrays be maintained?
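
To make the proposal above concrete, here is a rough sketch of what "explicitly setting the fill value of the returned array" could look like, written as a user-side subclass rather than as a patch to NumPy. `FillPreservingArray` and `_keep_fill` are hypothetical names invented for this illustration, and only `sum` and `cumsum` are covered.

```python
import numpy as np


class FillPreservingArray(np.ma.MaskedArray):
    """Hypothetical subclass (not part of NumPy) whose sum and cumsum
    results keep the fill_value of the array they came from."""

    def _keep_fill(self, result):
        # Reductions over everything may return a plain scalar or the
        # np.ma.masked singleton; only real arrays get the fill value.
        if isinstance(result, np.ma.MaskedArray) and result is not np.ma.masked:
            result.fill_value = self.fill_value
        return result

    def sum(self, *args, **kwargs):
        return self._keep_fill(super().sum(*args, **kwargs))

    def cumsum(self, *args, **kwargs):
        return self._keep_fill(super().cumsum(*args, **kwargs))


a = FillPreservingArray(np.arange(12).reshape(3, 4), fill_value=88)
row_sums = a.sum(axis=1)
running = a.cumsum()

print("Original fill_value:", a.fill_value)
print("a.sum fill_value:", row_sums.fill_value)
print("a.cumsum fill_value:", running.fill_value)
```

A change inside MaskedArray itself would presumably apply the same idea to every reduction method; the subclass is only a way to try the behavior without modifying NumPy.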

sannant · 2019-03-01T06:59:59Z

A related problem occurs when multiplying by a scalar.

Example test code:

import numpy as np
h = np.array([0.0,2,3.2,4.5])
print("h: {}".format(h))

for fill_value in [float("inf"), 1e20, 0]:
    print("\ntesting with fill_value={}".format(fill_value))
    hma=np.ma.masked_array(h, [True, False, True, False], fill_value=fill_value)
    print("hma: {}".format(hma.__repr__()))
    print("np.asarray(hma)): {}".format(np.asarray(hma)))
    print("1.0 * hmat: {}".format((1.0 * hma).__repr__()))
    print("np.asarray(1.0 * hma)): {}".format(np.asarray(1.0 * hma)))

And the output for fill_value=0:

testing with fill_value=0
hma: masked_array(data=[--, 2.0, --, 4.5],
             mask=[ True, False,  True, False],
       fill_value=0.0)
np.asarray(hma)): [0.  2.  3.2 4.5]
1.0 * hmat: masked_array(data=[--, 2.0, --, 4.5],
             mask=[ True, False,  True, False],
       fill_value=0.0)
np.asarray(1.0 * hma)): [1.  2.  1.  4.5]

The fill_value doesn't change according to hma.__repr__(), but when applying np.asarray the fill_value is systematically replaced by 1.

System information:

numpy: 1.16.0
python: Python 3.7.0
os: macOS 10.13.6
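
A note on the output above: `np.asarray` on a masked array returns the underlying data buffer and ignores both the mask and the fill_value, so the values it shows at masked positions are whatever the operation left there. To get an array with the fill value substituted, `filled()` is the call to use. A minimal sketch, passing the fill value explicitly so it does not depend on what the product carries:

```python
import numpy as np

h = np.array([0.0, 2, 3.2, 4.5])
hma = np.ma.masked_array(h, [True, False, True, False], fill_value=0)

scaled = 1.0 * hma

# Underlying data: masked slots hold leftover values, not the fill value.
raw = np.asarray(scaled)

# Masked slots replaced by the requested fill value.
filled = scaled.filled(hma.fill_value)

print(raw)
print(filled)
```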

charris added Defect and removed Defect labels Feb 23, 2014

mattip removed the priority: normal label Oct 21, 2018

segasai mentioned this issue Dec 1, 2024

Broken FITS table round-trip with any nan in a float column. astropy/astropy#14558

Open
