DOC: Add and fixup/move docs for descriptor changes · numpy/numpy@7653a32 · GitHub
Commit 7653a32

DOC: Add and fixup/move docs for descriptor changes
[skip actions] [skip cirrus] [skip azp]
1 parent 955d980 commit 7653a32

4 files changed

+131
-60
lines changed


doc/source/numpy_2_0_migration_guide.rst

Lines changed: 43 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -83,6 +83,49 @@ used to explicitly implement different behavior on NumPy 1.x and 2.0.
8383

8484
Please let us know if you require additional workarounds here.
8585

86+
.. _migration_c_descr:
87+
88+
The ``PyArray_Descr`` struct has been changed
89+
---------------------------------------------
90+
One of the most impactful C-API changes is that the ``PyArray_Descr`` struct
91+
is now more opaque to allow us to add additional flags and have
92+
itemsizes not limited by the size of ``int`` as well as allow improving
93+
structured dtypes in the future and not burden new dtypes with their fields.
94+
95+
Code which only uses the type number and other initial fields is unaffected.
96+
Most code will hopefully mainly access the ``->elsize`` field. When the
97+
dtype/descriptor itself is attached to an array (e.g. ``arr->descr->elsize``)
98+
this is best replaced with ``PyArray_ITEMSIZE(arr)``.
99+
100+
Where not possible, new accessor functions are required:
101+
* ``PyDataType_ELSIZE`` and ``PyDataType_SET_ELSIZE`` (note that the result
102+
is now ``npy_intp`` and not ``int``).
103+
* ``PyDataType_ALIGNMENT``
104+
* ``PyDataType_FIELDS``, ``PyDataType_NAMES``, ``PyDataType_SUBARRAY``
105+
* ``PyDataType_C_METADATA``
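
To illustrate why the wider return type matters, here is a minimal sketch in plain C. It does not use the NumPy headers: ``demo_intp``, ``demo_wide_descr``, ``demo_elsize`` and ``demo_elsize_fits_int`` are hypothetical stand-ins (the first for ``npy_intp``, assumed here to be a pointer-sized signed integer). The only point is that the accessor result should be kept in the wide type and range-checked before it is handed to anything that takes an ``int``.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for npy_intp (assumed pointer-sized and signed). */
typedef intptr_t demo_intp;

/* Hypothetical stand-in for a descriptor with a wide elsize field. */
typedef struct {
    demo_intp elsize;
} demo_wide_descr;

/* Plays the role of PyDataType_ELSIZE: returns the full-width itemsize. */
demo_intp demo_elsize(const demo_wide_descr *descr)
{
    assert(descr != NULL);
    return descr->elsize;
}

/* Returns 1 if the itemsize can be stored in an int without truncation. */
int demo_elsize_fits_int(const demo_wide_descr *descr)
{
    demo_intp size = demo_elsize(descr); /* keep the full width here */
    return size >= 0 && size <= INT_MAX;
}
```

Code that previously assigned ``descr->elsize`` to an ``int`` variable may want a check of this kind, or simply a switch to the wider type throughout.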
106+
107+
Cython code should use Cython 3, in which case the change is transparent.
108+
(Struct access is available for elsize and alignment when compiling only for
109+
NumPy 2.)
110+
111+
For compiling with both 1.x and 2.x if you use these new accessors it is
112+
unfortunately necessary to either define them locally via a macro like::
113+
114+
#if NPY_ABI_VERSION < 0x02000000
115+
#define PyDataType_ELSIZE(descr) ((descr)->elsize)
116+
#endif
117+
118+
or add ``npy2_compat.h`` to your code base and explicitly include it
119+
when compiling with NumPy 1.x (as they are new API).
120+
Including the file has no effect on NumPy 2.
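
As a rough illustration of the local-macro approach, the sketch below compiles without NumPy. Everything prefixed ``demo_``, ``Demo`` or ``DEMO_`` is a hypothetical stand-in: ``DEMO_ABI_VERSION`` plays the role of ``NPY_ABI_VERSION`` and ``demo_old_descr`` mimics a 1.x-style layout with ``int`` fields. It only shows how a guarded accessor keeps call sites free of direct struct access.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for NPY_ABI_VERSION; the value pretends that we
 * are compiling against 1.x headers. */
#define DEMO_ABI_VERSION 0x01000009

/* Hypothetical stand-in for the 1.x descriptor layout (int-sized fields). */
typedef struct {
    int type_num;
    int elsize;
    int alignment;
} demo_old_descr;

#if DEMO_ABI_VERSION < 0x02000000
/* The old headers lack the accessors, so define them locally. */
#define DemoDataType_ELSIZE(descr) ((ptrdiff_t)(descr)->elsize)
#define DemoDataType_ALIGNMENT(descr) ((ptrdiff_t)(descr)->alignment)
#endif

/* Call sites go through the accessor and never touch the field itself. */
ptrdiff_t demo_buffer_size(const demo_old_descr *descr, ptrdiff_t count)
{
    assert(descr != NULL);
    return DemoDataType_ELSIZE(descr) * count;
}
```

With real headers the guard would test ``NPY_ABI_VERSION`` as shown above, and on NumPy 2 the accessors provided by NumPy would be used instead of the local definitions.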
121+
122+
Please do not hesitate to open a NumPy issue if you require assistance or
123+
the provided functions are not sufficient.
124+
125+
**Custom User DTypes:**
126+
Existing user dtypes must now use ``PyArray_DescrProto`` to define their
127+
dtype and slightly modify the code. See note in `PyArray_RegisterDataType`.
128+
86129
Functionality moved to headers requiring ``import_array()``
87130
-----------------------------------------------------------
88131
If you previously included only ``ndarraytypes.h`` you may find that some

doc/source/reference/c-api/array.rst

Lines changed: 54 additions & 4 deletions
@@ -761,29 +761,79 @@ cannot not be accessed directly.
761761
762762
.. versionchanged:: 2.0
763763
Prior to NumPy 2.0 the ABI was different but unnecessarily large for user
764-
DTypes. These accessors were all added in 2.0.
764+
DTypes. These accessors were all added in 2.0 and can be backported
765+
(see :ref:`migration_c_descr`).
766+
767+
.. c:function:: npy_intp PyDataType_ELSIZE(PyArray_Descr *descr)
768+
769+
The element size of the datatype (``itemsize`` in Python).
770+
771+
.. note::
772+
If the ``descr`` is attached to an array ``PyArray_ITEMSIZE(arr)``
773+
can be used and is available on all NumPy versions.
774+
775+
.. c:function:: void PyDataType_SET_ELSIZE(PyArray_Descr *descr, npy_intp size)
776+
777+
Allows setting the itemsize. This is *only* relevant for string/bytes
778+
datatypes as it is the current pattern to define one with a new size.
779+
780+
.. c:function:: npy_intp PyDataType_ALIGNMENT(PyArray_Descr *descr)
781+
782+
The alignment of the datatype.
765783
766784
.. c:function:: PyObject *PyDataType_METADATA(PyArray_Descr *descr)
767785
768786
The Metadata attached to a dtype, either ``NULL`` or a dictionary.
769787
770788
.. c:function:: PyObject *PyDataType_NAMES(PyArray_Descr *descr)
771789
772-
``NULL`` or a list of structured field names attached to a dtype,
773-
this list should not be mutated, NumPy may change the way fields are
774-
stored in the future.
790+
``NULL`` or a tuple of structured field names attached to a dtype.
775791
776792
.. c:function:: PyObject *PyDataType_FIELDS(PyArray_Descr *descr)
777793
778794
``NULL``, ``None``, or a dict of structured dtype fields, this dict must
779795
not be mutated, NumPy may change the way fields are stored in the future.
780796
797+
This is the same dict as returned by `np.dtype.fields`.
798+
781799
.. c:function:: NpyAuxData *PyDataType_C_METADATA(PyArray_Descr *descr)
782800
783801
C-metadata object attached to a descriptor. This accessor should not
784802
be needed usually. The C-Metadata field does provide access to the
785803
datetime/timedelta time unit information.
786804
805+
.. c:function:: PyArray_ArrayDescr *PyDataType_SUBARRAY(PyArray_Descr *descr)
806+
807+
Information about a subarray dtype equivalent to the Python `np.dtype.base`
808+
and `np.dtype.shape`.
809+
810+
If this is non- ``NULL``, then this data-type descriptor is a
811+
C-style contiguous array of another data-type descriptor. In
812+
other-words, each element that this descriptor describes is
813+
actually an array of some other base descriptor. This is most
814+
useful as the data-type descriptor for a field in another
815+
data-type descriptor. The fields member should be ``NULL`` if this
816+
is non- ``NULL`` (the fields member of the base descriptor can be
817+
non- ``NULL`` however).
818+
819+
.. c:type:: PyArray_ArrayDescr
820+
821+
.. code-block:: c
822+
823+
typedef struct {
824+
PyArray_Descr *base;
825+
PyObject *shape;
826+
} PyArray_ArrayDescr;
827+
828+
.. c:member:: PyArray_Descr *base
829+
830+
The data-type-descriptor object of the base-type.
831+
832+
.. c:member:: PyObject *shape
833+
834+
The shape (always C-style contiguous) of the sub-array as a Python
835+
tuple.
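
The relationship between ``base`` and ``shape`` can be sketched without Python or NumPy. In the snippet below all names are hypothetical stand-ins, and ``shape`` is modelled as a plain C array plus a length rather than the Python tuple that the real struct holds. It shows that one item of a subarray dtype packs the product of the shape entries worth of base elements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the base descriptor; only the itemsize matters. */
typedef struct {
    ptrdiff_t elsize;
} demo_base_descr;

/* Hypothetical stand-in for PyArray_ArrayDescr.  The real `shape` member is
 * a Python tuple; a C array is used so the sketch builds without Python. */
typedef struct {
    const demo_base_descr *base; /* dtype of each element in the subarray */
    const ptrdiff_t *shape;      /* C-contiguous shape */
    int ndim;                    /* number of entries in shape */
} demo_subarray;

/* Number of base elements packed into one item of the subarray dtype. */
ptrdiff_t demo_subarray_count(const demo_subarray *sub)
{
    assert(sub != NULL);
    ptrdiff_t count = 1;
    for (int i = 0; i < sub->ndim; i++) {
        count *= sub->shape[i];
    }
    return count;
}

/* Itemsize implied by the base itemsize and the shape. */
ptrdiff_t demo_subarray_itemsize(const demo_subarray *sub)
{
    return sub->base->elsize * demo_subarray_count(sub);
}
```

For a 4-byte base type and shape ``(2, 3)`` this gives 6 elements and 24 bytes per item, which matches what ``np.dtype((np.int32, (2, 3))).itemsize`` reports in Python.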
836+
787837
788838
Data-type checking
789839
~~~~~~~~~~~~~~~~~~

doc/source/reference/c-api/types-and-structures.rst

Lines changed: 21 additions & 56 deletions
@@ -279,18 +279,24 @@ PyArrayDescr_Type and PyArray_Descr
279279
char kind;
280280
char type;
281281
char byteorder;
282-
char flags;
282+
char _former_flags; // unused field
283283
int type_num;
284-
int elsize;
285-
int alignment;
286-
PyArray_ArrayDescr *subarray;
287-
PyObject *fields;
288-
PyObject *names;
289-
PyObject *metadata;
284+
/*
285+
* Definitions after this one must be accessed through accessor
286+
* functions (see below) when compiling with NumPy 1.x support.
287+
*/
288+
npy_uint64 flags;
289+
npy_intp elsize;
290+
npy_intp alignment;
290291
NpyAuxData *c_metadata;
291292
npy_hash_t hash;
293+
void *reserved_null; // unused field, must be NULL.
292294
} PyArray_Descr;
293295
296+
Some dtypes have additional members which are accessible through
297+
`PyDataType_NAMES`, `PyDataType_FIELDS`, `PyDataType_SUBARRAY`, and
298+
in some cases (datetime/timedelta) `PyDataType_C_METADATA`.
299+
294300
.. c:member:: PyTypeObject *typeobj
295301
296302
Pointer to a typeobject that is the corresponding Python type for
@@ -320,7 +326,7 @@ PyArrayDescr_Type and PyArray_Descr
320326
endian), '=' (native), '\|' (irrelevant, ignore). All builtin data-
321327
types have byteorder '='.
322328

323-
.. c:member:: char flags
329+
.. c:member:: npy_uint64 flags
324330
325331
A data-type bit-flag that determines if the data-type exhibits object-
326332
array like behavior. Each bit in this member is a flag which are named
@@ -342,67 +348,26 @@ PyArrayDescr_Type and PyArray_Descr
342348
A number that uniquely identifies the data type. For new data-types,
343349
this number is assigned when the data-type is registered.
344350

345-
.. c:member:: int elsize
351+
.. c:member:: npy_intp elsize
346352
347353
For data types that are always the same size (such as long), this
348354
holds the size of the data type. For flexible data types where
349355
different arrays can have a different elementsize, this should be
350356
0.
351357

352-
.. c:member:: int alignment
358+
See `PyDataType_ELSIZE` and `PyDataType_SET_ELSIZE` for a way to access
359+
this field in a NumPy 1.x compatible way.
360+
361+
.. c:member:: npy_intp alignment
353362
354363
A number providing alignment information for this data type.
355364
Specifically, it shows how far from the start of a 2-element
356365
structure (whose first element is a ``char`` ), the compiler
357366
places an item of this type: ``offsetof(struct {char c; type v;},
358367
v)``
359368

360-
.. c:member:: PyArray_ArrayDescr *subarray
361-
362-
If this is non- ``NULL``, then this data-type descriptor is a
363-
C-style contiguous array of another data-type descriptor. In
364-
other-words, each element that this descriptor describes is
365-
actually an array of some other base descriptor. This is most
366-
useful as the data-type descriptor for a field in another
367-
data-type descriptor. The fields member should be ``NULL`` if this
368-
is non- ``NULL`` (the fields member of the base descriptor can be
369-
non- ``NULL`` however).
370-
371-
.. c:type:: PyArray_ArrayDescr
372-
373-
.. code-block:: c
374-
375-
typedef struct {
376-
PyArray_Descr *base;
377-
PyObject *shape;
378-
} PyArray_ArrayDescr;
379-
380-
.. c:member:: PyArray_Descr *base
381-
382-
The data-type-descriptor object of the base-type.
383-
384-
.. c:member:: PyObject *shape
385-
386-
The shape (always C-style contiguous) of the sub-array as a Python
387-
tuple.
388-
389-
.. c:member:: PyObject *fields
390-
391-
If this is non-NULL, then this data-type-descriptor has fields
392-
described by a Python dictionary whose keys are names (and also
393-
titles if given) and whose values are tuples that describe the
394-
fields. Recall that a data-type-descriptor always describes a
395-
fixed-length set of bytes. A field is a named sub-region of that
396-
total, fixed-length collection. A field is described by a tuple
397-
composed of another data- type-descriptor and a byte
398-
offset. Optionally, the tuple may contain a title which is
399-
normally a Python string. These tuples are placed in this
400-
dictionary keyed by name (and also title if given).
401-
402-
.. c:member:: PyObject *names
403-
404-
An ordered tuple of field names. It is NULL if no field is
405-
defined.
369+
See `PyDataType_ALIGNMENT` for a way to access this field in a NumPy 1.x
370+
compatible way.
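
The ``offsetof`` expression above can be tried out in plain C. The sketch below is independent of NumPy: the macro ``DEFINE_ALIGNMENT_PROBE`` and the generated ``alignment_of_*`` functions are hypothetical helper names. Named probe structs are used instead of an anonymous struct inside ``offsetof``, since some compilers warn about defining a type there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Define a two-member probe struct whose first member is a char, plus a
 * function that reports where the compiler placed the second member. */
#define DEFINE_ALIGNMENT_PROBE(name, type)           \
    struct probe_##name { char c; type v; };         \
    size_t alignment_of_##name(void)                 \
    {                                                \
        return offsetof(struct probe_##name, v);     \
    }

DEFINE_ALIGNMENT_PROBE(char, char)
DEFINE_ALIGNMENT_PROBE(int32, int32_t)
DEFINE_ALIGNMENT_PROBE(double, double)

/* Sanity check: every reported alignment is a non-zero power of two. */
void check_alignment_probes(void)
{
    size_t values[] = {
        alignment_of_char(), alignment_of_int32(), alignment_of_double()
    };
    for (size_t i = 0; i < sizeof(values) / sizeof(values[0]); i++) {
        assert(values[i] != 0 && (values[i] & (values[i] - 1)) == 0);
    }
}
```

The exact values are platform dependent (apart from ``char``, which is always 1), which is one reason to read the alignment from the descriptor rather than hard-coding it.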
406371

407372
.. c:member:: PyObject *metadata
408373

doc/source/release/2.0.0-notes.rst

Lines changed: 13 additions & 0 deletions
@@ -486,6 +486,19 @@ The ``metadata`` field is kept, but the macro version should also be preferred.
486486

487487
(`gh-25802 <https://github.com/numpy/numpy/pull/25802>`__)
488488

489+
490+
Descriptor ``elsize`` and ``alignment`` access
491+
----------------------------------------------
492+
Unless compiling only with NumPy 2 support, the ``elsize`` and ``alignment``
493+
fields must now be accessed via `PyDataType_ELSIZE`,
494+
`PyDataType_SET_ELSIZE`, and `PyDataType_ALIGNMENT`.
495+
In cases where the array is available, we advise using ``PyArray_ITEMSIZE``
496+
as it exists on all NumPy versions. Otherwise, please see
497+
:ref:`migration_c_descr`.
498+
499+
(`gh-25943 <https://github.com/numpy/numpy/pull/25943>`__)
500+
501+
489502
NumPy 2.0 C API removals
490503
========================
491504
