8000 BUG: fix AttributeError on accessing object in nested MaskedArray. by rena-xu · Pull Request #15949 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

BUG: fix AttributeError on accessing object in nested MaskedArray. #15949

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Apr 25, 2020

Conversation

rena-xu
Copy link
Contributor
@rena-xu rena-xu commented Apr 11, 2020

Closes #15895

@rena-xu rena-xu changed the title BUG: fix AttributeError on accessing object in nested MaskedArray, refers to issue #15895 BUG: fix AttributeError on accessing object in nested MaskedArray. Apr 11, 2020
@rena-xu rena-xu marked this pull request as draft April 11, 2020 03:59
@rena-xu rena-xu marked this pull request as ready for review April 11, 2020 04:08
@rena-xu
Copy link
Contributor Author
rena-xu commented Apr 13, 2020

Sorry I'm really new to Travis-CI. I tried to check the details of the run but it seems like everything passed. I'm wondering why the check is still considered pending? Is there anything I could fix here? Thanks!

numpy/ma/core.py Outdated
# only initialize to empty array if it dout does not
# have ._fill_value before. If it has, do not change
# the original behavior
dout._fill_value = np.array([], dtype)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't 8000 strike me as correct, _fill_value should always be a 0d array, this is a 1d array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops sorry I'll look into that!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect you want

Suggested change
dout._fill_value = np.array([], dtype)
dout._fill_value = np.empty((), dtype)
dout._fill_value[()] = self._fill_value[indx]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes thank you! I'll try to add some tests for this in the next commit!

def test_masked_all_with_dtype_str(self):
# Test masked_all works with nested array with dtype of an 'object'
# refers to issue #15895
my_dtype = np.dtype([('b', [('c', object)], (1,))])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other issue case I identified above is when the dtype is my_dtype = np.dtype([('b', (object, (1,)))])

rena-xu and others added 2 commits April 14, 2020 12:09
Co-Authored-By: Eric Wieser <wieser.eric@gmail.com>
Co-Authored-By: Eric Wieser <wieser.eric@gmail.com>
@rena-xu
Copy link
Contributor Author
rena-xu commented Apr 15, 2020

Hi this is really my first time contributing to an open-source project! I'm wondering if it's possible for these changes to be accepted? If the code and tests are not good enough I'll definitely try to modify something to it!

numpy/ma/core.py Outdated
@@ -3305,7 +3305,8 @@ def _scalar_heuristic(arr, elem):
# which should have shape () to avoid breaking several
8000 # methods. There is no great way out, so set to
# first element. See issue #6723.
if dout._fill_value.ndim > 0:
dtype = self._fill_value.dtype[indx]
if dtype.ndim > 0:
Copy link
Member
@eric-wieser eric-wieser Apr 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change necessary?

I'm tempted to add assert isinstance(self._fill_value, np.ndarray) before this line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I'm kinda confused here, so if the _fill_value is a string instead of ndarray, are we suppose to throw an error or change it to a 0darray?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_fill_value should never be anything other than a 0d array. If it's somehow a string here, then something else has a bug, and we should track down that bug.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, guess I was not understanding the expected behavior correctly.

Co-Authored-By: Eric Wieser <wieser.eric@gmail.com>
@rgommers
Copy link
Member

Hi this is really my first time contributing to an open-source project! I'm wondering if it's possible for these changes to be accepted? If the code and tests are not good enough I'll definitely try to modify something to it!

Hi @rena-xu, welcome! You're doing everything right, no worries. Review typically takes a while, looks like this one is getting close to merging.

numpy/ma/core.py Outdated
@@ -3305,6 +3305,8 @@ def _scalar_heuristic(arr, elem):
# which should have shape () to avoid breaking several
# methods. There is no great way out, so set to
# first element. See issue #6723.
# something like gh-15895 has happened if this fails
assert isinstance(dout._fill_value, np.ndarray)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems wrong. If there is a way for dout._fill_value to ever not be an ndarray, then this should raise a ValueError, not an AssertionError. And it would be even better if there was a test to show how this could ever happen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be even better if there was a test to show how this could ever happen.

It can't ever happen, as far as a I know - but this code is bug-prone, and this makes it easier for us to work out the issue if it does fire.

We could use RuntimeError("internal numpy error") or similar if you prefer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. As far as I can tell we have no other asserts in the code, we do have RuntimeErrors for this kind of thing.

@vrakesh
Copy link
Member
vrakesh commented Apr 25, 2020

@rena-xu Requesting an update on this. Changes to assertion as suggested above should move this forward

@rena-xu
Copy link
Contributor Author
rena-xu commented Apr 25, 2020

@vrakesh Sorry for the delay! I'll change this ASAP.

@rena-xu
Copy link
Contributor Author
rena-xu commented Apr 25, 2020

Sorry I'm kinda new to this. I'm wondering if it's possible to re-run the azure-pipeline test? It failed due to an ImportError, which I feel like should not be cause by this code change. I was able to pass this part by running runtests.py locally, so I'm not sure if I can reproduce this error

@rgommers
Copy link
Member

I'm wondering if it's possible to re-run the azure-pipeline test? It failed due to an ImportError, which I feel like should not be cause by this code change

Yes, you're right - that failure is happening on other PRs as well, re-running isn't going to fix it right now. We're fine ignoring unrelated failures, so you're all good here. Time for a re-review of your most recent changes.

numpy/ma/core.py Outdated
Comment on lines 3308 to 3311
# something like gh-15895 has happened if this check fails.
# _fill_value should always be an ndarray.
if not isinstance(dout._fill_value, np.ndarray):
raise RuntimeError('Internal NumPy error.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can you move these lines up to line 3299? Where you've put them has separated the comment above from the code it describes.

Co-Authored-By: Eric Wieser <wieser.eric@gmail.com>
rena-xu and others added 2 commits April 25, 2020 11:09
Co-Authored-By: Eric Wieser <wieser.eric@gmail.com>
Copy link
Member
@eric-wieser eric-wieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now, thanks for persevering!

@mattip mattip merged commit 619422b into numpy:master Apr 25, 2020
@rgommers rgommers added this to the 1.19.0 release milestone Apr 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: MaskedArray with nested dtype and object elements cause AttributeError on access
6 participants
0