BUG/TST: Fix for #6723 including test: force fill_value.ndim==0 #6728

gerritholl · 2015-11-25T20:39:35Z

Fix issue #6723. Given an exotic masked structured array, where one of
the fields has a multidimensional dtype, make sure that, when accessing
this field, the fill_value still makes sense. As it stands prior to this
commit, the fill_value will end up being multidimensional, possibly with a
shape incompatible with the mother array, which leads to broadcasting
errors in methods such as .filled(). This commit uses the first element
of this multidimensional fill value as the new fill value.

Also add a test to verify that fill_value.ndim remains 0 after indexing.
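A minimal sketch of the situation described above, assuming a float field so the default fill value fits the dtype. The field names, shapes, and data are illustrative and not taken from the PR's test; on a NumPy that includes this fix, the indexed field keeps a 0-d fill value and .filled() succeeds.

```python
import numpy as np
import numpy.ma as ma

# Structured dtype whose field "A" is itself a (2,) sub-array (illustrative).
dt = np.dtype([("A", "f8", (2,)), ("B", "f8")])

data = np.zeros(3, dtype=dt)
data["A"] = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Build a mask with the matching structured bool dtype and hide one element.
mask = np.zeros(3, dtype=ma.make_mask_descr(dt))
mask["A"][0, 0] = True

arr = ma.masked_array(data, mask=mask)
_ = arr.fill_value       # reading the property materializes the stored fill value

sub = arr["A"]           # shape (3, 2): one more axis than arr
fill = sub.fill_value    # expected to be 0-d after the fix
filled = sub.filled()    # the kind of call that hit broadcasting errors in #6723
```

Only the masked element is replaced by the scalar fill value; every other entry of the field is returned unchanged.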

charris · 2015-12-01T17:19:53Z

numpy/ma/core.py

@@ -3121,6 +3121,13 @@ def __getitem__(self, indx):
            if isinstance(indx, basestring):
                if self._fill_value is not None:
                    dout._fill_value = self._fill_value[indx]
+                    # If we're indexing a multidimensional field in a
+                    # structured array (such as dtype("(2)i2,(2)i1")),


Would be clearer if written dtype("(2,)i2,(2,)i1"). Current success looks like a parse oddity.
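For reference, a short sketch of what the explicit tuple spelling produces. The field names f0 and f1 are the defaults NumPy assigns to comma-separated dtype strings.

```python
import numpy as np

# Two fields, each a length-2 sub-array; the trailing comma makes the shape
# an unambiguous 1-tuple.
dt = np.dtype("(2,)i2,(2,)i1")

names = dt.names               # default field names
shape_f0 = dt["f0"].shape      # sub-array shape of the first field
base_f0 = dt["f0"].base        # element type of the first field
base_f1 = dt["f1"].base        # element type of the second field

# Indexing a field of a 1-d array of this dtype adds the sub-array axis.
field_shape = np.zeros(3, dtype=dt)["f0"].shape
```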

ahaldane · 2015-12-02T22:05:16Z

It's a messy situation, and this may be the best solution.

Another option that comes to mind is that maskedarray could always broadcast the fill value to the data shape. Perhaps this can be achieved by a few judicious uses of broadcast_to in MaskedArray.filled? This way if you create an N x M masked array, you can also set the fill value to a 1d array of length M (or shape (N,1) or (1,M))
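A rough sketch of the broadcasting idea, written as a standalone function rather than a change to MaskedArray.filled. The helper name filled_with_broadcast is hypothetical and not part of numpy.ma; it only illustrates how np.broadcast_to could expand a per-column or per-row fill value to the data shape.

```python
import numpy as np

def filled_with_broadcast(data, mask, fill_value):
    """Hypothetical helper (not part of numpy.ma): replace masked entries
    using a fill value that is first broadcast to the shape of ``data``."""
    fill = np.broadcast_to(np.asarray(fill_value), data.shape)
    return np.where(mask, fill, data)

data = np.arange(6.0).reshape(2, 3)   # N = 2 rows, M = 3 columns
mask = np.array([[True, False, False],
                 [False, True, True]])

# Fill value of shape (M,): one value per column.
per_column = filled_with_broadcast(data, mask, [-1.0, -2.0, -3.0])

# Fill value of shape (N, 1): one value per row.
per_row = filled_with_broadcast(data, mask, [[-10.0], [-20.0]])
```

np.broadcast_to raises ValueError when the fill value's shape is incompatible with the data shape, so in this sketch a bad fill value fails at fill time rather than when it is set.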

gerritholl · 2015-12-03T18:02:25Z

@ahaldane Is it certain that such broadcasting is always possible? My gut feeling says it is, but my gut feeling is not very reliable. I'm still not clear on what practical purpose changing the fill_value serves for a user at all.

ahaldane · 2015-12-04T17:50:16Z

It's probably possible, but after thinking I'm pretty sure it will open up many new cans of worms and would take effort. For instance, we would have to worry about slicing the fill_value if the array gets sliced. I think this PR is a fair fix for this rare case, for now.

If/when we end up reimplementing masked arrays to fix all sorts of issues like this (there have been suggestions to do so once __numpy_ufunc__ behavior is decided), it might be worth thinking about allowing both the mask and the fill_value to broadcast, if it's not too messy. It might actually be convenient, e.g. we could mask out entire rows and columns easily.

I looked at the code in this PR, it looks good to me. I vote for merging.

seberg · 2015-12-05T00:03:48Z

@ahaldane, just the side discussion.... But I tend to think right now that the "correct" solution from a future point of view would be to not allow it at all, each dtype should have a single mask value (of course only the last layer of a dtype has multiple dtypes inside).
Dtype field arrays belong to a single dtype and in some sense this dtype is associated with the fill value, if we come from the "dtypes should be the carriers of mask" side.

Fix issue numpy#6723. Given an exotic masked structured array, where one of the fields has a multidimensional dtype, make sure that, when accessing this field, the fill_value still makes sense. As it stands prior to this commit, the fill_value will end up being multidimensional, possibly with a shape incompatible with the mother array, which leads to broadcasting errors in methods such as .filled(). This commit uses the first element of this multidimensional fill value as the new fill value. When more than one unique value existed in fill_value, a warning is issued. Also add a test to verify that fill_value.ndim remains 0 after indexing.
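A sketch of the warning path mentioned in this commit message, assuming a NumPy that includes the fix. The fill value below is deliberately heterogeneous and purely illustrative; the first element is expected to be kept and a warning recorded when the field is accessed.

```python
import warnings
import numpy as np
import numpy.ma as ma

dt = np.dtype([("A", "f8", (2,))])
arr = ma.masked_array(np.zeros(2, dtype=dt))

# A 0-d structured fill value whose "A" entries differ from each other.
fv = np.zeros((), dtype=dt)
fv["A"] = [1.0, 2.0]
arr.fill_value = fv

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    sub = arr["A"]            # accessing the multidimensional field

sub_fill = sub.fill_value     # first element of the original fill value
```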

charris · 2016-01-12T19:32:42Z

@ahaldane @seberg So this should go in? If so, one of you can do the honors.

ahaldane · 2016-01-13T01:30:55Z

Well, I don't think @seberg objected above, and looking over it again it seems like a reasonable band-aid. The most important part is just to warn the user if something strange is happening. Merging.


charris added the "Critical Defect", "component: numpy.ma masked arrays", and "00 - Bug" labels and removed the "Critical Defect" label Nov 26, 2015

gerritholl force-pushed the structured_multidim_masked_array_fillvalue branch from 9f8f739 to f667c81 Compare December 1, 2015 14:08

charris reviewed Dec 1, 2015

gerritholl force-pushed the structured_multidim_masked_array_fillvalue branch 2 times, most recently from fd68517 to 8d1ce6e Compare December 2, 2015 00:53

gerritholl force-pushed the structured_multidim_masked_array_fillvalue branch from 8d1ce6e to 090e85e Compare December 7, 2015 16:45

ahaldane added a commit that referenced this pull request Jan 13, 2016

Merge pull request #6728 from gerritholl/structured_multidim_masked_a…

bdd4558

…rray_fillvalue BUG/TST: Fix for #6723 including test: force fill_value.ndim==0

ahaldane merged commit bdd4558 into numpy:master Jan 13, 2016

eric-wieser mentioned this pull request Feb 28, 2017

Indexing multidimensional structured masked array fails to translate fill_value correctly, leading to broadcasting errors ValueError when calling .filled() #6723

Closed
