DOC: Add and fixup/move docs for descriptor changes #25946
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -83,6 +83,49 @@ used to explicitly implement different behavior on NumPy 1.x and 2.0. | |||||
|
||||||
Please let us know if you require additional workarounds here. | ||||||
|
||||||
.. _migration_c_descr: | ||||||
|
||||||
The ``PyArray_Descr`` struct has been changed | ||||||
--------------------------------------------- | ||||||
One of the most impactful C-API changes is that the ``PyArray_Descr`` struct | ||||||
is now more opaque. This allows us to add additional flags, to have | ||||||
itemsizes that are not limited by the size of ``int``, to improve structured | ||||||
dtypes in the future, and to not burden new dtypes with their fields. | ||||||
|
||||||
Code which only uses the type number and other initial fields is unaffected. | ||||||
Most code will hopefully mainly access the ``->elsize`` field. When the | ||||||
dtype/descriptor itself is attached to an array (e.g. ``arr->descr->elsize``), | ||||||
this is best replaced with ``PyArray_ITEMSIZE(arr)``. | ||||||
|
||||||
Where not possible, new accessor functions are required: | ||||||
|
||||||
* ``PyDataType_ELSIZE`` and ``PyDataType_SET_ELSIZE`` (note that the result | ||||||
  is now ``npy_intp`` and not ``int``). | ||||||
Review comment: Formatting is off in the rendered page
* ``PyDataType_ALIGNMENT`` | ||||||
* ``PyDataType_FIELDS``, ``PyDataType_NAMES``, ``PyDataType_SUBARRAY`` | ||||||
* ``PyDataType_C_METADATA`` | ||||||
|
||||||
Cython code should use Cython 3, in which case the change is transparent. | ||||||
(Struct access is available for elsize and alignment when compiling only for | ||||||
NumPy 2.) | ||||||
|
||||||
To compile with both 1.x and 2.x while using these new accessors, it is | ||||||
unfortunately necessary to either define them locally via a macro like:: | ||||||
|
||||||
#if NPY_ABI_VERSION < 0x02000000 | ||||||
#define PyDataType_ELSIZE(descr) ((descr)->elsize) | ||||||
#endif | ||||||
|
||||||
or add ``npy2_compat.h`` to your code base and explicitly include it | ||||||
when compiling with NumPy 1.x (as the accessors are new API). | ||||||
Including the file has no effect on NumPy 2. | ||||||
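The fallback-macro approach described above can be sketched in a self-contained way. The following is only an illustration, not NumPy's own compatibility header: ``MockDescr`` and ``MOCK_NPY_ABI_VERSION`` are hypothetical stand-ins (real code would include the NumPy headers and test ``NPY_ABI_VERSION``), ``ptrdiff_t`` stands in for ``npy_intp``, and ``buffer_nbytes`` is an invented helper.

```c
#include <stddef.h>

/* Hypothetical stand-in for the NumPy 1.x descriptor layout; real code
 * would include <numpy/ndarraytypes.h> and use PyArray_Descr instead. */
typedef struct {
    int type_num;
    int elsize;
    int alignment;
} MockDescr;

/* Assumed value that pretends we are building against a 1.x ABI. */
#define MOCK_NPY_ABI_VERSION 0x01000009

#if MOCK_NPY_ABI_VERSION < 0x02000000
/* On 1.x the fields are plain struct members, so the accessors can be
 * defined locally. The results are widened to a pointer-sized integer
 * to match the 2.x return type. */
#define PyDataType_ELSIZE(descr) ((ptrdiff_t)(descr)->elsize)
#define PyDataType_ALIGNMENT(descr) ((ptrdiff_t)(descr)->alignment)
#define PyDataType_SET_ELSIZE(descr, size) ((descr)->elsize = (int)(size))
#endif

/* Hypothetical helper: bytes needed to store n elements of this dtype.
 * It only goes through the accessor, never through the struct field. */
static ptrdiff_t buffer_nbytes(const MockDescr *descr, ptrdiff_t n)
{
    return PyDataType_ELSIZE(descr) * n;
}
```

Because ``buffer_nbytes`` never touches ``->elsize`` directly, the same function body would compile unchanged once the accessors come from NumPy 2 itself rather than from the local fallback.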
|
||||||
Please do not hesitate to open a NumPy issue if you require assistance or | ||||||
the provided functions are not sufficient. | ||||||
|
||||||
**Custom User DTypes:** | ||||||
Existing user dtypes must now use ``PyArray_DescrProto`` to define their | ||||||
dtype and slightly modify the code. See the note in :c:func:`PyArray_RegisterDataType`. | ||||||
Review comment: Link to |
||||||
|
||||||
Functionality moved to headers requiring ``import_array()`` | ||||||
----------------------------------------------------------- | ||||||
If you previously included only ``ndarraytypes.h`` you may find that some | ||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -761,29 +761,79 @@ cannot not be accessed directly. | |
|
||
.. versionchanged:: 2.0 | ||
Prior to NumPy 2.0 the ABI was different but unnecessarily large for user | ||
DTypes. These accessors were all added in 2.0. | ||
DTypes. These accessors were all added in 2.0 and can be backported | ||
(see :ref:`migration_c_descr`). | ||
|
||
.. c:function:: npy_intp PyDataType_ELSIZE(PyArray_Descr *descr) | ||
|
||
The element size of the datatype (``itemsize`` in Python). | ||
|
||
.. note:: | ||
If the ``descr`` is attached to an array, ``PyArray_ITEMSIZE(arr)`` | ||
can be used and is available on all NumPy versions. | ||
|
||
.. c:function:: void PyDataType_SET_ELSIZE(PyArray_Descr *descr, npy_intp size) | ||
|
||
Allows setting the itemsize. This is *only* relevant for string/bytes | ||
datatypes, as it is the current pattern to define one with a new size. | ||
|
||
.. c:function:: npy_intp PyDataType_ALIGNMENT(PyArray_Descr *descr) | ||
|
||
The alignment of the datatype. | ||
|
||
.. c:function:: PyObject *PyDataType_METADATA(PyArray_Descr *descr) | ||
|
||
The metadata attached to a dtype, either ``NULL`` or a dictionary. | ||
|
||
.. c:function:: PyObject *PyDataType_NAMES(PyArray_Descr *descr) | ||
|
||
``NULL`` or a list of structured field names attached to a dtype, | ||
this list should not be mutated, NumPy may change the way fields are | ||
stored in the future. | ||
``NULL`` or a tuple of structured field names attached to a dtype. | ||
|
||
.. c:function:: PyObject *PyDataType_FIELDS(PyArray_Descr *descr) | ||
|
||
``NULL``, ``None``, or a dict of structured dtype fields. This dict must | ||
not be mutated; NumPy may change the way fields are stored in the future. | ||
|
||
This is the same dict as returned by `np.dtype.fields`. | ||
|
||
.. c:function:: NpyAuxData *PyDataType_C_METADATA(PyArray_Descr *descr) | ||
|
||
C-metadata object attached to a descriptor. This accessor should not | ||
be needed usually. The C-Metadata field does provide access to the | ||
datetime/timedelta time unit information. | ||
|
||
.. c:function:: PyArray_ArrayDescr *PyDataType_SUBARRAY(PyArray_Descr *descr) | ||
|
||
Information about a subarray dtype, equivalent to the Python `np.dtype.base` | ||
and `np.dtype.shape`. | ||
|
||
If this is non- ``NULL``, then this data-type descriptor is a | ||
C-style contiguous array of another data-type descriptor. In | ||
other-words, each element that this descriptor describes is | ||
actually an array of some other base descriptor. This is most | ||
useful as the data-type descriptor for a field in another | ||
data-type descriptor. The fields member should be ``NULL`` if this | ||
is non- ``NULL`` (the fields member of the base descriptor can be | ||
non- ``NULL`` however). | ||
|
||
.. c:type:: PyArray_ArrayDescr | ||
Review comment: Is this indented on purpose? See the rendered page |
||
|
||
.. code-block:: c | ||
|
||
typedef struct { | ||
PyArray_Descr *base; | ||
PyObject *shape; | ||
} PyArray_ArrayDescr; | ||
|
||
.. c:member:: PyArray_Descr *base | ||
|
||
The data-type-descriptor object of the base-type. | ||
|
||
.. c:member:: PyObject *shape | ||
|
||
The shape (always C-style contiguous) of the sub-array as a Python | ||
tuple. | ||
|
||
|
||
Data-type checking | ||
~~~~~~~~~~~~~~~~~~ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -279,18 +279,24 @@ PyArrayDescr_Type and PyArray_Descr | |
char kind; | ||
char type; | ||
char byteorder; | ||
char flags; | ||
char _former_flags; // unused field | ||
int type_num; | ||
int elsize; | ||
int alignment; | ||
PyArray_ArrayDescr *subarray; | ||
PyObject *fields; | ||
PyObject *names; | ||
PyObject *metadata; | ||
/* | ||
* Definitions after this one must be accessed through accessor | ||
* functions (see below) when compiling with NumPy 1.x support. | ||
*/ | ||
npy_uint64 flags; | ||
npy_intp elsize; | ||
npy_intp alignment; | ||
NpyAuxData *c_metadata; | ||
npy_hash_t hash; | ||
void *reserved_null; // unused field, must be NULL. | ||
} PyArray_Descr; | ||
|
||
Some dtypes have additional members which are accessible through | ||
:c:func:`PyDataType_NAMES`, :c:func:`PyDataType_FIELDS`, :c:func:`PyDataType_SUBARRAY`, and | ||
in some cases (times) :c:func:`PyDataType_C_METADATA`. | ||
|
||
Review comment: These reference links are not working. Do you need a domain? |
||
.. c:member:: PyTypeObject *typeobj | ||
|
||
Pointer to a typeobject that is the corresponding Python type for | ||
|
@@ -320,7 +326,7 @@ PyArrayDescr_Type and PyArray_Descr | |
endian), '=' (native), '\|' (irrelevant, ignore). All builtin data- | ||
types have byteorder '='. | ||
|
||
.. c:member:: char flags | ||
.. c:member:: npy_uint64 flags | ||
|
||
A data-type bit-flag that determines if the data-type exhibits object- | ||
array like behavior. Each bit in this member is a flag which are named | ||
|
@@ -342,67 +348,26 @@ PyArrayDescr_Type and PyArray_Descr | |
A number that uniquely identifies the data type. For new data-types, | ||
this number is assigned when the data-type is registered. | ||
|
||
.. c:member:: int elsize | ||
.. c:member:: npy_intp elsize | ||
|
||
For data types that are always the same size (such as long), this | ||
holds the size of the data type. For flexible data types where | ||
different arrays can have a different elementsize, this should be | ||
0. | ||
|
||
.. c:member:: int alignment | ||
See :c:func:`PyDataType_ELSIZE` and :c:func:`PyDataType_SET_ELSIZE` for a way to access | ||
this field in a NumPy 1.x compatible way. | ||
|
||
.. c:member:: npy_intp alignment | ||
Review comment: The links are broken. Do you need a domain? |
||
|
||
A number providing alignment information for this data type. | ||
Specifically, it shows how far from the start of a 2-element | ||
structure (whose first element is a ``char`` ), the compiler | ||
places an item of this type: ``offsetof(struct {char c; type v;}, | ||
v)`` | ||
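The ``offsetof`` definition of alignment given above can be tried out in plain C. This is a minimal sketch that does not use any NumPy API; the ``probe_*`` structs and ``alignment_of_*`` functions are names invented for the illustration.

```c
#include <stddef.h>

/* Probe structs: a single char followed by the type of interest. The
 * compiler pads after `c` so that `v` starts at a suitably aligned
 * offset, which is exactly the number described above. */
struct probe_char   { char c; char v; };
struct probe_int    { char c; int v; };
struct probe_double { char c; double v; };

static size_t alignment_of_char(void)
{
    return offsetof(struct probe_char, v);
}

static size_t alignment_of_int(void)
{
    return offsetof(struct probe_int, v);
}

static size_t alignment_of_double(void)
{
    return offsetof(struct probe_double, v);
}
```

The concrete values for ``int`` and ``double`` depend on the platform and compiler, which is one reason to read the alignment from the descriptor rather than hard-coding it.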
|
||
.. c:member:: PyArray_ArrayDescr *subarray | ||
|
||
If this is non- ``NULL``, then this data-type descriptor is a | ||
C-style contiguous array of another data-type descriptor. In | ||
other-words, each element that this descriptor describes is | ||
actually an array of some other base descriptor. This is most | ||
useful as the data-type descriptor for a field in another | ||
data-type descriptor. The fields member should be ``NULL`` if this | ||
is non- ``NULL`` (the fields member of the base descriptor can be | ||
non- ``NULL`` however). | ||
|
||
.. c:type:: PyArray_ArrayDescr | ||
|
||
.. code-block:: c | ||
|
||
typedef struct { | ||
PyArray_Descr *base; | ||
PyObject *shape; | ||
} PyArray_ArrayDescr; | ||
|
||
.. c:member:: PyArray_Descr *base | ||
|
||
The data-type-descriptor object of the base-type. | ||
|
||
.. c:member:: PyObject *shape | ||
|
||
The shape (always C-style contiguous) of the sub-array as a Python | ||
tuple. | ||
|
||
.. c:member:: PyObject *fields | ||
|
||
If this is non-NULL, then this data-type-descriptor has fields | ||
described by a Python dictionary whose keys are names (and also | ||
titles if given) and whose values are tuples that describe the | ||
fields. Recall that a data-type-descriptor always describes a | ||
fixed-length set of bytes. A field is a named sub-region of that | ||
total, fixed-length collection. A field is described by a tuple | ||
composed of another data- type-descriptor and a byte | ||
offset. Optionally, the tuple may contain a title which is | ||
normally a Python string. These tuples are placed in this | ||
dictionary keyed by name (and also title if given). | ||
|
||
.. c:member:: PyObject *names | ||
|
||
An ordered tuple of field names. It is NULL if no field is | ||
defined. | ||
See :c:func:`PyDataType_ALIGNMENT` for a way to access this field in a NumPy 1.x | ||
compatible way. | ||
Review comment: The link is broken. |
||
|
||
.. c:member:: PyObject *metadata | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -486,6 +486,19 @@ The ``metadata`` field is kept, but the macro version should also be preferred. | |
|
||
(`gh-25802 <https://github.com/numpy/numpy/pull/25802>`__) | ||
|
||
|
||
Descriptor ``elsize`` and ``alignment`` access | ||
---------------------------------------------- | ||
Unless compiling only with NumPy 2 support, the ``elsize`` and ``alignment`` | ||
fields must now be accessed via :c:func:`PyDataType_ELSIZE`, | ||
:c:func:`PyDataType_SET_ELSIZE`, and :c:func:`PyDataType_ALIGNMENT`. | ||
Review comment: The links above are broken. Maybe you can fix the other warnings in the release notes, see the warnings starting here |
||
In cases where the descriptor is attached to an array, we advise | ||
using ``PyArray_ITEMSIZE`` as it exists on all NumPy versions. | ||
Please see :ref:`migration_c_descr` for more information. | ||
|
||
(`gh-25943 <https://github.com/numpy/numpy/pull/25943>`__) | ||
|
||
|
||
NumPy 2.0 C API removals | ||
======================== | ||
|
||
|