8000 WIP: refactor dtype to be a type subclass by mattip · Pull Request #12585 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

WIP: refactor dtype to be a type subclass #12585

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
MAINT: use PyObjec_New, support MSVC
  • Loading branch information
mattip committed Aug 19, 2019
commit fed8ecf43280e2b18169c6ca6990b022ff5a1da3
1 change: 1 addition & 0 deletions numpy/core/include/numpy/ndarraytypes.h
Original file line number Diff line number Diff line change
Expand Up @@ -601,6 +601,7 @@ typedef struct _PyArray_Descr {
#ifdef USE_DTYPE_AS_PYOBJECT
PyObject_HEAD
#else
/* may need to be PyHeapTypeObject for new tp_as_* functions? */
PyTypeObject descrtype;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this needs to be PyHeapTypeObject. The Cython team explored this and ran into the problem before until they made meta-types inherit from PyHeapTypeObject. See, for example, https://mail.python.org/pipermail/cython-devel/2013-September/003813.html

Suggested change
PyTypeObject descrtype;
PyHeapTypeObject descrtype;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this applies to us. From that thread:

However, when MyClass is created, an instance of struct __pyx_obj_4meta_MetaClass (cc. 200 + 4 bytes) is dynamically allocated by the Python memory management machinery. The machinery then tries to initialize the allocated memory.

In our case, the instance of the dtype is malloc'd and initialized by arraydescr_new, so we're in complete control.

As far as I can tell, the main purpose of PyHeapTypeObject is to convert builtin magic methods into slots, something we probably don't care about too much for dtypes.

#endif
/*
Expand Down
5 changes: 1 addition & 4 deletions numpy/core/src/multiarray/arraytypes.c.src
Original file line number Diff line number Diff line change
Expand Up @@ -4437,10 +4437,8 @@ set_typeinfo(PyObject *dict)
* #DO_ELSIZE = 1*19, 0*3, 1*2#
*/

#if USE_DTYPE_AS_PYOBJECT
dtype = PyObject_New(PyArray_Descr, &PyArrayDescr_Type);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the advantage of heap allocating here vs the previous static allocation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personal preference. At some point perhaps subinterpreters support may become mature enough that we will want to allow reloading the NumPy module, and thinking about what that means for static allocation gets me confused.

#else
dtype = PyObject_GC_New(PyArray_Descr, &PyArrayDescr_Type);
#ifndef USE_DTYPE_AS_PYOBJECT
/* Don't copy PyObject_HEAD part */
memset((char *)dtype + sizeof(PyObject),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me (and thus for future people looking at this...), it would be helpful to expand on why this is done. Does it get filled out later by actually instantiating? I realize links to documentation can get out of sync, but might still be better than nothing.

Looking at the documentation for PyObject_New, I see "Fields not defined by the Python object header are not initialized" - so, is the point here to undo the initialization that is done? But why?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the goal here is to ensure all the tp_ slots are initialized to null, without having to list them all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My question was whether it should be zeroed: presumably PyObject_New initializes this section (since it states that other parts are not initialized, suggesting that these parts are), and, again presumably, this is with stuff from PyArrayDescr_Type. If those presumptions hold, are those items wrong? Or might they have, e.g., init/new functions that can just be used?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, PyObject_New does not initialize this section, because it's not part of the PyObject head - in the same way that PyObject_New does not initialize new instance of PyArrayObject

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment here is wrong though - the word copy is no longer appropriate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arggh, I just looked so poorly - of course the code zeros everything beyond PyObject. Which definitely makes sense.

0,
Expand Down Expand Up @@ -4540,7 +4538,6 @@ set_typeinfo(PyObject *dict)
infodict = PyDict_New();
if (infodict == NULL) return -1;


/**begin repeat
*
* #name = BOOL,
Expand Down
15 changes: 2 additions & 13 deletions numpy/core/src/multiarray/descriptor.c
Original file line number Diff line number Diff line change
Expand Up @@ -1762,11 +1762,7 @@ PyArray_DescrConverter(PyObject *obj, PyArray_Descr **at)
NPY_NO_EXPORT PyArray_Descr *
PyArray_DescrNew(PyArray_Descr *base)
{
#if USE_DTYPE_AS_PYOBJECT
PyArray_Descr *newdescr = PyObject_New(PyArray_Descr, &PyArrayDescr_Type);
#else
PyArray_Descr *newdescr = PyObject_GC_New(PyArray_Descr, &PyArrayDescr_Type);
#endif

if (newdescr == NULL) {
return NULL;
Expand Down Expand Up @@ -3576,14 +3572,6 @@ static PyMappingMethods descr_as_mapping = {
};

/****************** End of Mapping Protocol ******************************/
#ifdef USE_DTYPE_AS_PYOBJECT
#pragma message "USE_DTYPE_AS_PYOBJECT defined, using old dtypes"
#define BASETYPE 0
#else
#pragma message "USE_DTYPE_AS_PYOBJECT not defined, using new dtypes"
/* may need to be PyHeapTypeObject to allow overriding tp_as_* functions? */
#define BASETYPE &PyType_Type
#endif

NPY_NO_EXPORT PyTypeObject PyArrayDescr_Type = {
#if defined(NPY_PY3K)
Expand Down Expand Up @@ -3631,7 +3619,7 @@ NPY_NO_EXPORT PyTypeObject PyArrayDescr_Type = {
arraydescr_methods, /* tp_methods */
arraydescr_members, /* tp_members */
arraydescr_getsets, /* tp_getset */
BASETYPE, /* tp_base */
0, /* tp_base */
0, /* tp_dict */
0, /* tp_descr_get */
0, /* tp_descr_set */
Expand All @@ -3653,3 +3641,4 @@ NPY_NO_EXPORT PyTypeObject PyArrayDescr_Type = {
0, /* tp_del */
0, /* tp_version_tag */
};

7 changes: 7 additions & 0 deletions numpy/core/src/multiarray/multiarraymodule.c
Original file line number Diff line number Diff line change
Expand Up @@ -4639,6 +4639,13 @@ PyMODINIT_FUNC init_multiarray_umath(void) {
goto err;
}

#ifdef USE_DTYPE_AS_PYOBJECT
#pragma message "USE_DTYPE_AS_PYOBJECT defined, using old dtypes"
#else
#pragma message "USE_DTYPE_AS_PYOBJECT not defined, using new dtypes"
PyArrayDescr_Type.tp_base = &PyType_Type;
PyArrayDescr_Type.tp_free = PyObject_Free;
#endif
PyArrayDescr_Type.tp_hash = PyArray_DescrHash;
if (PyType_Ready(&PyArrayDescr_Type) < 0) {
goto err;
Expand Down
0