structured dtype descr method returns incorrect descriptor when void space is at the end of dtype built through a dict #6359

ikehall · 2015-09-25T15:06:57Z

Using Python 2.7.10 and NumPy 1.9.2.
If I define a structured dtype in the following manner:

>>> import numpy as np
>>> my_dtype = np.dtype({
...     'names': ['A', 'B'],
...     'formats': ['f4', 'f4'],
...     'offsets': [0, 8],
...     'itemsize': 16})

And I then try to create a new dtype from this dtype's descr:

>>> new_dtype = np.dtype(my_dtype.descr)

Then the two dtypes will not have the same itemsize:

>>> my_dtype.itemsize
16
>>> new_dtype.itemsize
12

Examining the descr of my_dtype, we see that it leaves off the 4 void bytes at the end:

>>> my_dtype.descr
[('A', '<f4'), ('', '|V4'), ('B', '<f4')]

What should happen instead:

>>> my_dtype.descr
[('A', '<f4'), ('', '|V4'), ('B', '<f4'), ('', '|V4')]
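
To make the expected layout concrete, here is a minimal pure-Python sketch of how the padding entries could be derived from the offsets and the itemsize. The helper names `nbytes`, `padded_descr` and `descr_itemsize` are hypothetical, and this is not NumPy's implementation. It only mirrors the arithmetic, assuming every format is a type string whose trailing digits give the field size in bytes.

```python
import re


def nbytes(typestr):
    """Field size in bytes, read from the trailing digits of e.g. '<f4' or '|V4'."""
    return int(re.search(r'(\d+)$', typestr).group(1))


def padded_descr(names, formats, offsets, itemsize):
    """Build a descr-like list with explicit void entries for every gap."""
    descr = []
    pos = 0  # first byte not yet covered by a field or padding entry
    fields = sorted(zip(names, formats, offsets), key=lambda f: f[2])
    for name, fmt, offset in fields:
        if offset > pos:
            # Gap before this field: describe it as an unnamed void field.
            descr.append(('', '|V%d' % (offset - pos)))
        descr.append((name, fmt))
        pos = offset + nbytes(fmt)
    if itemsize > pos:
        # Trailing gap up to itemsize: the entry this issue reports as missing.
        descr.append(('', '|V%d' % (itemsize - pos)))
    return descr


def descr_itemsize(descr):
    """Itemsize implied by a descr-like list: the sum of all entry sizes."""
    return sum(nbytes(typestr) for _, typestr in descr)


fixed = padded_descr(['A', 'B'], ['<f4', '<f4'], [0, 8], 16)
reported = fixed[:-1]  # the same list without the trailing void entry

print(fixed, descr_itemsize(fixed))
print(reported, descr_itemsize(reported))
```

Without the trailing entry the sizes sum to 12 rather than 16, which matches the itemsize mismatch shown above.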

This matters when using structured arrays with IPython.parallel, because this is how structured arrays are reconstructed after being serialized and sent to engines. A workaround is for the user to define a field that marks the end of the structured data, but that should not be necessary.
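
As an illustration of why the missing trailing padding matters for serialized records, here is a small sketch using only the standard-library `struct` module. It is an analogy under stated assumptions, not how IPython.parallel or NumPy reconstruct arrays: the names `FULL` and `SHORT` are hypothetical, and the format strings stand in for the 16-byte and 12-byte layouts above.

```python
import struct

# 16-byte record: float A, 4 pad bytes, float B, 4 trailing pad bytes.
FULL = struct.Struct('<f4xf4x')
# 12-byte record: the same layout with the trailing padding dropped.
SHORT = struct.Struct('<f4xf')

records = [(1.0, 2.0), (3.0, 4.0)]
buf = b''.join(FULL.pack(a, b) for a, b in records)

# Reading back with the full layout recovers every record.
decoded = list(FULL.iter_unpack(buf))

# Reading back with the short layout fails, because the buffer length
# (32 bytes) is not a multiple of the 12-byte record size.
try:
    list(SHORT.iter_unpack(buf))
    short_ok = True
except struct.error:
    short_ok = False

print(FULL.size, SHORT.size, len(buf), decoded, short_ok)
```

In this sketch the mismatch is caught as an error. With a buffer whose length happened to be a multiple of both sizes, the records after the first would instead be read from the wrong offsets.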

dtype.descr returns void fields to explain the padding part of the dtype. The last void field for the itemsize itself was however not included. Closes numpygh-6359

seberg · 2015-09-25T15:44:45Z

Uploaded a PR to fix this. I wonder, though, whether the extra void fields are somewhat annoying in any case, and whether there is a better way to do this.

ahaldane · 2015-09-26T02:30:49Z

Yes, that is indeed a bug, but I'd also like to warn you that there is a deeper bug here.

dtype.descr is documented to be the "Array-interface compliant full description of the data-type." It keeps the 'unseen' padding in order to conform to the Array Interface requirements -- the fact that it keeps 'padding' bytes is intentional.

It is used in two places in numpy: (1) in the Array Interface (for serializing), and (2) in the io code to save the .npy format.

Unfortunately, numpy currently cannot reliably convert dtype.descr back to a dtype! This is a real problem in the io code, because it means we cannot load structured arrays with padding bytes from .npy files. See #2215, #3176 and related.

@ikehall, unfortunately this means numpy probably cannot currently do what you want (though I encourage you to check to be sure). Numpy knows how to serialize your array, but it probably cannot currently deserialize it because of that bug (in addition to the bug you just found).

ikehall · 2015-09-28T14:10:21Z

Thank you for bringing that to my attention.

For my purposes, the fix for this bug solves my problems completely. The other bugs in dtype.descr all seem to be related to automatic naming of unnamed and invisible fields. My fields are all either named (with names other than 'f1', 'f2', etc.) or 'invisible', so it doesn't really matter to me if the invisible ones come back on deserialization with automatically generated names, because I am not going to use them. Further, when I ultimately save the array, I discard the dtype information and just save the string of bytes (because I am saving to an industry-defined standard, not with np.save).

I agree that those are bugs, and could affect me if I relied on automatic field naming, or named my fields badly, but currently this is not the case for me.

seberg mentioned this issue Sep 25, 2015

BUG: Add void field at end of dtype.descr to match itemsize #6361

Merged

ahaldane closed this as completed in #6361 Sep 27, 2015

ahaldane mentioned this issue Oct 12, 2016

BUG: np.save() and np.load() are not idempotent when align=True or fields are discontiguous #8100

Closed
