Problem with views on masked structured arrays · Issue #10483 · numpy/numpy · GitHub

Closed
neishm opened this issue Jan 26, 2018 · 3 comments

neishm commented Jan 26, 2018

Creating a view on a structured array (e.g. by selecting a list of columns) causes problems with comparisons when the array is masked.

Example

Create a view on a masked structured array:

import numpy as np
data = np.ma.array([
  ('Mon', 'Sunny',  30, 20),
  ('Tue', 'Cloudy', 28, 17),
  ('Wed', 'Rain',   25, 17),
  ('Thu', 'Sunny',  29, 18),
  ('Fri', 'Sunny',  30, 20),
], dtype=[('day','|S3'),('weather','|S10'),('high','i4'),('low','i4')])
data2 = data[['day','high']]
print (data2)
[('Mon', 30) ('Tue', 28) ('Wed', 25) ('Thu', 29) ('Fri', 30)]

Try to do a comparison on the array:

print (data2[1:] == data2[:-1])
Traceback (most recent call last):
  File "test_numpy14.py", line 12, in <module>
    print (data2[1:] == data2[:-1])
  File "XXX/lib/python2.7/site-packages/numpy/ma/core.py", line 4033, in __eq__
    return self._comparison(other, operator.eq)
  File "XXX/lib/python2.7/site-packages/numpy/ma/core.py", line 3993, in _comparison
    sdata = sbroadcast.filled(odata)
  File "XXX/lib/python2.7/site-packages/numpy/ma/core.py", line 3709, in filled
    fill_value = _check_fill_value(fill_value, self.dtype)
  File "XXX/lib/python2.7/site-packages/numpy/ma/core.py", line 459, in _check_fill_value
    raise ValueError(err_msg % (fill_value, fdtype))
ValueError: Unable to transform [('Mon', 30) ('Tue', 28) ('Wed', 25) ('Thu', 29)] to dtype [('day', '|S3'), ('', '|V10'), ('high', '<i4'), ('', '|V4')]

A similar problem happens when calling numpy.ma.unique on the view. The problem persists even after copying the data (e.g. data2 = data2.copy()).

I am using numpy 1.14.0 with Python 2.7.6. The problem does not appear in numpy 1.13.3, or when using a regular (non-masked) structured array.
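The dtype in the error message above contains unnamed padding entries (`('', '|V10')` and `('', '|V4')`), and `data2.copy()` keeps that layout. One possible workaround, sketched here and not proposed anywhere in this thread, is to copy the selected fields into a freshly built dtype that has no padding. The helper name `pack_fields` is hypothetical, and the sketch has not been checked against numpy 1.14.0 itself.

```python
import numpy as np

def pack_fields(arr, names):
    """Hypothetical helper: copy the chosen fields of a masked structured
    array into a new array whose dtype has no padding bytes."""
    packed_dtype = np.dtype([(name, arr.dtype[name]) for name in names])
    plain = np.empty(arr.shape, dtype=packed_dtype)
    mask = np.zeros(arr.shape, dtype=np.ma.make_mask_descr(packed_dtype))
    for name in names:
        # Copy data and mask field by field, avoiding multi-field indexing.
        plain[name] = np.ma.getdata(arr[name])
        mask[name] = np.ma.getmaskarray(arr[name])
    return np.ma.array(plain, mask=mask)

data = np.ma.array([
    (b'Mon', b'Sunny',  30, 20),
    (b'Tue', b'Cloudy', 28, 17),
    (b'Wed', b'Rain',   25, 17),
    (b'Thu', b'Sunny',  29, 18),
    (b'Fri', b'Sunny',  30, 20),
], dtype=[('day', 'S3'), ('weather', 'S10'), ('high', 'i4'), ('low', 'i4')])

data2 = pack_fields(data, ['day', 'high'])
# Compare each row with the previous one, as in the report.
same_as_previous = data2[1:] == data2[:-1]
print(same_as_previous)
```

Because the new dtype is built from the field names and field dtypes only, its itemsize is 7 bytes (3 + 4) rather than the 21 bytes of the original record layout.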

@ahaldane
Member

Thanks for the report, confirmed.

This is very similar to, or caused by, #10387, which itself is indirectly caused by our recent change in numpy 1.14.0 to return a view instead of a copy for multi-field indexing.

We're on it. We are planning to revert that change in 1.14.1, and push it off to 1.15. That will give us time to fix this and similar bugs that need to be fixed first (#3176, #8100).

This report is a little different from #10387, because it is likely that the numpy.ma code should not be using dtype.descr in the first place. So in addition to fixing the padding bytes in .descr, we should also remove use of .descr in np.ma.
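To illustrate the padding entries mentioned above, here is a minimal sketch. It builds by hand a dtype with the same field offsets that a multi-field view of the reported array would have (an assumption made so the example does not depend on which numpy version returns a view), then inspects `.descr`.

```python
import numpy as np

# Full record layout from the report.
full = np.dtype([('day', 'S3'), ('weather', 'S10'), ('high', 'i4'), ('low', 'i4')])

# Hand-built stand-in for the dtype of data[['day', 'high']] as a view:
# same offsets and itemsize as the full record, but only two named fields.
view_dtype = np.dtype({
    'names': ['day', 'high'],
    'formats': ['S3', 'i4'],
    'offsets': [full.fields['day'][1], full.fields['high'][1]],
    'itemsize': full.itemsize,
})

# .descr fills the gaps between fields with unnamed void entries.
descr_names = [entry[0] for entry in view_dtype.descr]
print(view_dtype.descr)

# Rebuilding a dtype from .descr does not give back the two-field dtype.
roundtrip = np.dtype(view_dtype.descr)
print(roundtrip.names)
```

The unnamed entries in `.descr` correspond to the `('', '|V10')` and `('', '|V4')` items in the reported error message, which is why code that rebuilds a dtype from `.descr` ends up with a different set of fields.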

@eric-wieser
Member

fill_value has some really weird semantics that are far too complex, when the logic should be nothing more than self.fill_value = np.array(fill_value, dtype=self.dtype). Of course, that will certainly break compatibility.
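A minimal sketch of the simpler logic suggested above, written as a standalone function. The name `simple_fill_value` is hypothetical, and this is not what numpy.ma currently does; it only shows what the one-line conversion yields for a structured dtype.

```python
import numpy as np

def simple_fill_value(fill_value, dtype):
    """Hypothetical helper: convert fill_value with a plain np.array call,
    with no special-casing of structured dtypes."""
    return np.array(fill_value, dtype=dtype)

dt = np.dtype([('day', 'S3'), ('high', 'i4')])
fv = simple_fill_value((b'N/A', -1), dt)
print(fv, fv.dtype)
```

With a tuple that matches the fields, the result is a zero-dimensional structured array of the requested dtype.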

Perhaps we should start deprecating cases that give a different result.

@charris added this to the 1.15.0 release milestone Feb 3, 2018
@charris
Member
charris commented Jun 8, 2018

Pushing off to 1.16. @ahaldane How do you want to handle the 1.15 fix? If you want to keep master as is we should wait until 1.15 is branched.
