Description
I think this breaks reading sub-array dtypes:
# After this PR
>>> import numpy as np
>>> with open('/dev/zero', 'rb') as f:
...     a = np.fromfile(f, dtype='(8,8)u8', count=1)
...
>>> (a == 0).all()
False
The reason, I guess, is that creating an array with a sub-array dtype moves the sub-array dimensions from the dtype into the shape of the array itself, so the number of bytes to read is computed incorrectly.
Originally posted by @pv in https://github.com/numpy/numpy/pull/14586/files
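To make the guess above concrete, here is a minimal sketch of how a sub-array dtype is absorbed into the array shape. It only illustrates the Python-level behavior and is not the C code path touched by the PR; the variable names are chosen for illustration.

```python
import numpy as np

# The sub-array dtype describes one 8x8 block of uint64 values.
dt = np.dtype("(8,8)u8")
print(dt.shape)     # (8, 8)
print(dt.itemsize)  # 512 bytes per record (8 * 8 * 8)

# Creating an array with this dtype moves the (8, 8) part into the shape
# and leaves the plain base dtype on the array.
a = np.zeros(1, dtype=dt)
print(a.shape)           # (1, 8, 8)
print(a.dtype)           # uint64
print(a.dtype.itemsize)  # 8 bytes per element
```

If the read size is derived from the array's dtype (8 bytes) rather than from the requested dtype (512 bytes), only a fraction of each record would be filled, which would be consistent with the failing check above.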
While fixing this and writing tests, I found that similar issues seem to occur in FromString. It is probably best to switch back to the old pattern of using the given dtype, even though the resulting array represents it differently. At least FromString currently has a use-after-free bug with the current pattern.
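As a rough Python-level illustration of "using the given dtype" for sizing, the sketch below reads with the base dtype and restores the sub-array dimensions afterwards. The helper name `read_subarray_records` is hypothetical and this is not how the C implementation is structured; it only shows that the record size has to come from the requested dtype.

```python
import math
import os
import tempfile

import numpy as np


def read_subarray_records(path, dtype, count=-1):
    """Hypothetical helper: read `count` records of a sub-array dtype."""
    dtype = np.dtype(dtype)
    base, subshape = dtype.base, dtype.shape
    # Elements per record come from the requested dtype, not the result array.
    per_record = math.prod(subshape)
    n_items = -1 if count < 0 else count * per_record
    flat = np.fromfile(path, dtype=base, count=n_items)
    # Put the sub-array dimensions back into the shape.
    return flat.reshape((-1,) + subshape)


x = np.arange(24, dtype="i4").reshape(2, 3, 4)
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.bin")
    x.tofile(path)
    res = read_subarray_records(path, "(3,4)i4")
    first = read_subarray_records(path, "(3,4)i4", count=1)

print(res.shape)    # (2, 3, 4)
print(first.shape)  # (1, 3, 4)
```

This may serve as a workaround on affected versions, assuming the file holds a whole number of records; a partial trailing record would make the reshape fail.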
EDIT: I am planning to look at this, but if someone beats me to it, great :). It should not be difficult, but reference counting issues can be confusing...
Test for anyone who wants to start (it may be worth looking at some other similar code paths, though):
def test_subarray_into_shape(self):
    # Test subarray dtypes which are absorbed into the shape
    x = np.arange(24, dtype="i4").reshape(2, 3, 4)
    x.tofile(self.filename)
    res = np.fromfile(self.filename, dtype="(3,4)i4")
    assert_array_equal(x, res)

    x_str = x.tobytes()
    with assert_warns(DeprecationWarning):
        # binary fromstring is deprecated
        res = np.fromstring(x_str, dtype="(3,4)i4")
        assert_array_equal(x, res)