ENH: Avoid memory peak when creating a MaskedArray with mask=True/False. by saimn · Pull Request #6734 · numpy/numpy · GitHub
Merged 3 commits on Dec 1, 2015
22 changes: 14 additions & 8 deletions numpy/ma/core.py
@@ -2756,13 +2756,19 @@ def __new__(cls, data=None, mask=nomask, dtype=None, copy=False,
                     _data._sharedmask = True
         else:
             # Case 2. : With a mask in input.
-            # Read the mask with the current mdtype
-            try:
-                mask = np.array(mask, copy=copy, dtype=mdtype)
-            # Or assume it's a sequence of bool/int
-            except TypeError:
-                mask = np.array([tuple([m] * len(mdtype)) for m in mask],
-                                dtype=mdtype)
+            # If mask is boolean, create an array of True or False
+            if mask is True and mdtype == MaskType:
+                mask = np.ones(_data.shape, dtype=mdtype)
+            elif mask is False and mdtype == MaskType:
+                mask = np.zeros(_data.shape, dtype=mdtype)
Contributor Author

Maybe I should also check here that mdtype is dtype('bool')?
If I understand correctly, mdtype can be a list for record arrays, in which case mask = np.array(mask, copy=copy, dtype=mdtype) throws a TypeError, and we then go to the except branch below, which supposes that mask is a list.
So in theory one should not use mask=True or mask=False with a record array, or at least this case is not handled?

Member

I don't know, but a check on mdtype sounds good if the wrong type raises an error. A try block isn't the best sort of flow control.

Member

Are these cases currently covered by tests?

Contributor Author

OK, I will add a test for mdtype. As for the unit tests, this is covered by tests that use mask=True or mask=False in the constructor, but maybe I can add a test specific to the constructor.

Contributor Author

@charris done! I have added a small test for mask=True/False, so this is now explicitly tested.
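
To make the `mdtype == MaskType` guard discussed in this thread concrete, here is a small sketch that builds mask dtypes with the public `np.ma.make_mask_descr` helper for a plain and a structured data dtype. It assumes the constructor derives `mdtype` in an equivalent way, which the diff excerpt does not show; the field names `a` and `b` are made up for illustration.

```python
import numpy as np

# Plain data dtype: the mask dtype is a simple boolean.
plain_mdtype = np.ma.make_mask_descr(np.dtype(float))

# Structured (record) data dtype: the mask dtype has one boolean per field.
# The field names "a" and "b" are illustrative only.
record_dtype = np.dtype([("a", float), ("b", int)])
record_mdtype = np.ma.make_mask_descr(record_dtype)

# The guard holds for the plain case, so the np.ones/np.zeros branch may run.
print(plain_mdtype == np.ma.MaskType)

# The guard does not hold for the record case, so the generic path is used.
print(record_mdtype == np.ma.MaskType)
print(record_mdtype.names)
```

With the guard in place, a scalar `True` or `False` combined with a record dtype never reaches `np.ones` or `np.zeros`; it falls through to the pre-existing conversion code instead.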

+            else:
+                # Read the mask with the current mdtype
+                try:
+                    mask = np.array(mask, copy=copy, dtype=mdtype)
+                # Or assume it's a sequence of bool/int
+                except TypeError:
+                    mask = np.array([tuple([m] * len(mdtype)) for m in mask],
+                                    dtype=mdtype)
             # Make sure the mask and the data have the same shape
             if mask.shape != _data.shape:
                 (nd, nm) = (_data.size, mask.size)
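
The branch added in this hunk can be sketched in isolation as follows. `build_mask` is a hypothetical helper written for this illustration, not a NumPy function; it only mirrors the order of the checks shown in the diff and omits the `copy` argument and the `TypeError` fallback.

```python
import numpy as np

def build_mask(shape, mask, mdtype=np.ma.MaskType):
    """Hypothetical helper mirroring the checks added in this diff."""
    # Scalar True/False with a plain boolean mask dtype: allocate the
    # full-size mask directly at its final shape.
    if mask is True and mdtype == np.ma.MaskType:
        return np.ones(shape, dtype=mdtype)
    if mask is False and mdtype == np.ma.MaskType:
        return np.zeros(shape, dtype=mdtype)
    # Anything else: let NumPy convert the input with the mask dtype.
    return np.array(mask, dtype=mdtype)

all_masked = build_mask((2, 3), True)
none_masked = build_mask((4,), False)
from_list = build_mask((3,), [1, 0, 1])
print(all_masked.shape, none_masked.shape, from_list.tolist())
```

Note that `mask is True` is an identity check against the Python singleton, so a NumPy boolean scalar such as `np.True_` does not match and takes the generic conversion path.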
@@ -4690,7 +4696,7 @@ def dot(self, b, out=None, strict=False):
See Also
--------
numpy.ma.dot : equivalent function

"""
return dot(self, b, out=out, strict=strict)

9 changes: 9 additions & 0 deletions numpy/ma/tests/test_core.py
@@ -191,6 +191,15 @@ def test_creation_maskcreation(self):
         dma_3 = MaskedArray(dma_1, mask=[1, 0, 0, 0] * 6)
         fail_if_equal(dma_3.mask, dma_1.mask)

+        x = array([1, 2, 3], mask=True)
+        assert_equal(x._mask, [True, True, True])
+        x = array([1, 2, 3], mask=False)
+        assert_equal(x._mask, [False, False, False])
+        y = array([1, 2, 3], mask=x._mask, copy=False)
+        assert_(np.may_share_memory(x.mask, y.mask))
+        y = array([1, 2, 3], mask=x._mask, copy=True)
+        assert_(not np.may_share_memory(x.mask, y.mask))

     def test_creation_with_list_of_maskedarrays(self):
         # Tests creaating a masked array from alist of masked arrays.
         x = array(np.arange(5), mask=[1, 0, 0, 0, 0])
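
As a usage-level illustration of what the new test checks, the sketch below uses only the public `np.ma.array` constructor. The 2-D input is an arbitrary choice that the test itself does not exercise, and the variable names are illustrative.

```python
import numpy as np

data = np.arange(6).reshape(2, 3)

# A scalar mask is expanded to a full boolean array with the data's shape.
fully_masked = np.ma.array(data, mask=True)
unmasked = np.ma.array(data, mask=False)

# With no mask argument, no mask array is allocated at all.
default = np.ma.array(data)

print(fully_masked.mask.shape, fully_masked.mask.dtype)
print(unmasked.mask.shape, unmasked.mask.any())
print(np.ma.getmask(default) is np.ma.nomask)
```

The difference between `mask=False` and omitting `mask` matters for memory: the former allocates a full-size array of `False`, while the latter keeps the `nomask` sentinel.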