8000 MAINT: Move ctype -> dtype conversion to python by eric-wieser · Pull Request #12254 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

MAINT: Move ctype -> dtype conversion to python #12254

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 26, 2018

Conversation

eric-wieser
Copy link
Member
@eric-wieser eric-wieser commented Oct 24, 2018

Follows in the vein of #10602.

A step towards fixing #12207.


This comes with a number of changes:

  • Propagating the error if parsing a ctypes object goes wrong, rather than silencing it and trying more things.
  • Rejecting ctypes bitfields, which were incorrectly treated as subarrays.
  • Rejecting ctypes Unions, which were previously treated as if they were structs
  • Rejecting ctypes pointers, which were silently converted into their dereferenced types.
    This was particularly dangerous behavior when parsing structs with nested pointers, where swapping out the type changes the struct layout!
  • No longer support ctypes duck-types, which includes...
  • No longer supporting a _fields_ attribute on np.void subclasses. This was a bug due to a poorly-shared code-path between void and ctypes types.

The upshot is it should be easier to fix future problems in parsing ctypes types.

@eric-wieser eric-wieser force-pushed the dtype-from-ctypes-type branch 4 times, most recently from 96b6c79 to b36a70c Compare October 24, 2018 08:19
@eric-wieser eric-wieser force-pushed the dtype-from-ctypes-type branch 3 times, most recently from 9587794 to fc1fe5a Compare October 25, 2018 06:02
@eric-wieser
Copy link
Member Author

The mystery travis warning machine strikes again...

@eric-wieser eric-wieser force-pushed the dtype-from-ctypes-type branch from fc1fe5a to c4bf24d Compare October 25, 2018 06:43
@eric-wieser eric-wieser force-pushed the dtype-from-ctypes-type branch from c4bf24d to 0820813 Compare October 25, 2018 15:40
_fields_ = [
('a', ctypes.c_uint8, 7),
('b', ctypes.c_uint8, 1)
]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously this produced dtype([('a', 'u1', (7,)), ('b', 'u1')], align=True), which was garbage.

_fields_ = [
('a', ctypes.c_uint8),
('b', ctypes.c_uint16),
]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this one produced dtype({'names':['a','b'], 'formats':['u1','<u2'], 'offsets':[0,2], 'itemsize':4}, align=True), which is not a union

@charris
Copy link
Member
charris 8000 commented Oct 25, 2018

Could you also fix the circular import problem that came in with #10602? Not in this PR, but I don't like the alternative fix in #12064.


#include "npy_import.h"

/*!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the '!' markup?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it's doxygen. Copied from npy_import, but it seems we don't use it elsewhere - removing

@@ -796,13 +796,13 @@ def _ufunc_doc_signature_formatter(ufunc):
)


def _is_from_ctypes(obj):
# determine if an object comes from ctypes, in order to work around
def npy_ctypes_check(cls):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I see that the static type is checked in the C code.

PyObject *dtypedescr;
PyArray_Descr *newdescr;
int ret;
static PyArray_Descr *_arraydescr_from_ctypes_type(PyTypeObject *o) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C style, return type on separate line.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'o' is not a good choice for variable name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both fixed

}
PyErr_Clear();
if (!PyObject_TypeCheck(res, &PyArrayDescr_Type)) {
// sanity check that dtype_from_ctypes_type returned the right type,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically, C++ style comments should be reserved until we go full C99, which will be in 1.17 after we drop Python 2.7.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, fixed.

* type might override the dtype if numpy does not otherwise
* support it.
*/
if (npy_ctypes_check((PyTypeObject *)obj)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally prefer to check against value, especially as the error return is '0'.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe NPY_SUCCEED as that seems to be the interface for this bunch of functions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather leave this as is, since it matches things like PyString_Check

@@ -1423,10 +1387,20 @@ PyArray_DescrConverter(PyObject *obj, PyArray_Descr **at)
check_num = NPY_VOID;
}
else {
*at = _arraydescr_fromobj(obj);
*at = _arraydescr_from_dtype_attr(obj);
if (*at) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error return here is NULL, be nice to use that in the error check.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole function seems to use the style of not comparing against NULL, so I'd be inclined to leave it consistent

fname, ftyp = item
fields.append((fname, dtype_from_ctypes_type(ftyp)))

# by default, ctypes structs are aligned
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

9E81

Are they ever otherwise?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a _pack_ attribute that makes them otherwise: #10532

@charris
Copy link
Member
charris commented Oct 25, 2018

Some style nits. I'm going to rely for correctness on the existing and new tests.

This comes with a number of changes:

* Propagating the error if parsing a ctypes object goes wrong, rather than silencing it and trying more things.
* Rejecting ctypes bitfields, which were incorrectly treated as subarrays.
* Rejecting ctypes Unions, which were previously treated as if they were structs
* Rejecting ctypes pointers, which were silently converted into their dereferenced types.
  This was particularly dangerous behavior when parsing structs with nested pointers, where swapping out the type changes the struct layout!
* No longer support ctypes duck-types, which includes...
* No longer supporting a `_fields_` attribute on `np.void` subclasses. This was a bug due to a poorly-shared code-path between void and ctypes types.

The upshot is it should be easier to fix future problems in parsing ctypes types.
Without this, the build will not rerun due to changes in this file
@eric-wieser eric-wieser force-pushed the dtype-from-ctypes-type branch from 0820813 to 0709d92 Compare October 26, 2018 04:24
@charris charris merged commit 763be37 into numpy:master Oct 26, 2018
@charris
Copy link
Member
charris commented Oct 26, 2018

Thanks Eric.

@eric-wieser
Copy link
Member Author

In hindsight, this probable needs a release note

@danielhrisca
Copy link
Contributor

@eric-wieser @mattip
does this mean that there is no plan to ever support bit fields in dtype?

This would be a very nice and useful feature since it is very common when dealing with data that originates in embedded systems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0