MaskedArray heuristic for memory overlap seems simplistic and slow #10234

mhvk · 2017-12-18T16:32:31Z

Currently, in MaskedArray.__array_finalize__, the following is done to check whether the new object may be a view of an old one:

if (obj.__array_interface__["data"][0] != self.__array_interface__["data"][0]):

If the two data pointers differ, the new object is assumed not to be a view and the mask is copied.

This seems unnecessarily restrictive and fails even simple slicing:

ma = np.ma.MaskedArray(np.arange(100.))
ma2 = ma[10:20]
ma.__array_interface__["data"][0] == ma2.__array_interface__["data"][0]
# False
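To make the failure concrete, here is a minimal sketch of why the pointer comparison rejects a slice. The names `base_ptr`, `slice_ptr` and `offset` are only illustrative. The slice still lives in the parent's buffer, but its data pointer is shifted by the start index times the item size, so an equality test on the start addresses cannot recognise it as a view.

```python
import numpy as np

ma = np.ma.MaskedArray(np.arange(100.))
ma2 = ma[10:20]

# Start address of each array's data buffer, as reported by the array interface.
base_ptr = ma.__array_interface__["data"][0]
slice_ptr = ma2.__array_interface__["data"][0]

# The slice starts 10 elements into the parent's buffer, so the two addresses
# differ by 10 * itemsize bytes even though the memory is shared.
offset = slice_ptr - base_ptr
print(offset, 10 * ma.itemsize)
```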

This means that slices by default do not share the mask with the original object, which doesn't seem a good idea given that we tried to change this behaviour in #5580 (hence, cc @jakirkham).

A relatively straightforward solution would be to replace it with not np.may_share_memory(ma, ma2)
(this, perhaps surprisingly, is also much faster than the above for the simple slice case).
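A rough sketch of the proposed check, and of how one might compare the cost of the two tests. The helper names `pointer_check` and `bounds_check` and the repeat count are assumptions for illustration, and the timings will depend on the machine and NumPy version, so no numbers are quoted here.

```python
import timeit

import numpy as np

ma = np.ma.MaskedArray(np.arange(100.))
ma2 = ma[10:20]
unrelated = ma.copy()  # separate buffer, used as a negative control

# may_share_memory compares memory bounds, so a slice is recognised as
# overlapping its parent while an independent copy is not.
shares_with_slice = np.may_share_memory(ma, ma2)
shares_with_copy = np.may_share_memory(ma, unrelated)
print(shares_with_slice, shares_with_copy)


def pointer_check():
    # Current heuristic: True means "not a view, copy the mask".
    return ma.__array_interface__["data"][0] != ma2.__array_interface__["data"][0]


def bounds_check():
    # Proposed replacement: True means "not a view, copy the mask".
    return not np.may_share_memory(ma, ma2)


n = 10000
t_pointer = timeit.timeit(pointer_check, number=n) / n
t_bounds = timeit.timeit(bounds_check, number=n) / n
print(f"pointer comparison: {t_pointer * 1e6:.2f} us per call")
print(f"may_share_memory:   {t_bounds * 1e6:.2f} us per call")
```

For the slice, `pointer_check()` asks for a mask copy while `bounds_check()` does not, which is the behavioural difference being proposed.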


eric-wieser · 2017-12-18T16:39:15Z

Are you sure that slicing isn't already handled correctly by __getitem__?

mhvk · 2017-12-18T16:56:50Z

Yes, you're right, it is. I guess that if we changed the behaviour here, some of the work-arounds in __getitem__ could be removed. But maybe it is not worth the hassle, though the present comparison takes 4 µs on my computer, while may_share_memory takes 0.3 µs.
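For reference, a small check that slicing already shares the mask, assuming a recent NumPy and an explicit (non-nomask) boolean mask. The variable `mask_is_shared` is only illustrative.

```python
import numpy as np

# Give the array a real boolean mask so there is a buffer that can be shared.
ma = np.ma.MaskedArray(np.arange(100.), mask=np.zeros(100, dtype=bool))
ma2 = ma[10:20]

# __getitem__ slices the mask alongside the data, so the two masks overlap
# in memory even though the data-pointer comparison would say "not a view".
mask_is_shared = np.may_share_memory(ma.mask, ma2.mask)
print(mask_is_shared)
```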

eric-wieser · 2017-12-18T17:10:42Z

Most of the __getitem__ workarounds are due to:

Decay into scalars when indexing
Not being able to reliably inspect the indexing tuple, due to #9686 (DEP: Deprecate non-tuple nd-indices)
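As an illustration of the first point, a minimal sketch of how integer indexing leaves MaskedArray behind (the variable names are illustrative): an unmasked element comes back as a plain NumPy scalar, a masked element comes back as the `np.ma.masked` singleton, and only a slice stays a MaskedArray.

```python
import numpy as np

ma = np.ma.MaskedArray([1.0, 2.0, 3.0], mask=[False, True, False])

unmasked_item = ma[0]  # plain NumPy scalar, no mask attached
masked_item = ma[1]    # the np.ma.masked singleton
sliced = ma[0:1]       # still a MaskedArray

print(type(unmasked_item), masked_item is np.ma.masked, type(sliced))
```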

mhvk added 15 - Discussion 23 - Wish List component: numpy.ma masked arrays labels Dec 18, 2017

mhvk mentioned this issue Dec 18, 2017

MAINT,ENH: remove MaskedArray.astype, as the base type does everything. #10211

Merged

rgommers removed the 23 - Wish List label Aug 1, 2018
