DOC: Add and fixup/move docs for descriptor changes by seberg · Pull Request #25946 · numpy/numpy · GitHub

DOC: Add and fixup/move docs for descriptor changes #25946

Merged
merged 2 commits into from
Mar 6, 2024
43 changes: 43 additions & 0 deletions doc/source/numpy_2_0_migration_guide.rst
@@ -83,6 +83,49 @@ used to explicitly implement different behavior on NumPy 1.x and 2.0.

Please let us know if you require additional workarounds here.

.. _migration_c_descr:

The ``PyArray_Descr`` struct has been changed
---------------------------------------------
One of the most impactful C-API changes is that the ``PyArray_Descr`` struct
is now more opaque to allow us to add additional flags and have
itemsizes not limited by the size of ``int`` as well as allow improving
structured dtypes in the future and not burdon new dtypes with their fields.
Review comment (Member):

Suggested change
structured dtypes in the future and not burdon new dtypes with their fields.
structured dtypes in the future and not burden new dtypes with their fields.


Code which only uses the type number and other initial fields is unaffected.
Most code will hopefull mainly access the ``->elsize`` field, when the
Review comment (Member):

Suggested change
Most code will hopefull mainly access the ``->elsize`` field, when the
Most code will hopefully mainly access the ``->elsize`` field, when the

dtype/descriptor itself is attached to an array (e.g. ``arr->descr->elsize``)
this is best replaced with ``PyArray_ITEMSIZE(arr)``.

Where not possible, new accessor functions are required:
* ``PyDataType_ELSIZE`` and ``PyDataType_SET_ELSIZE`` (note that the result
is now ``npy_intp`` and not ``int``).
Review comment (Member):

Formatting is off in the rendered page

Suggested change
is now ``npy_intp`` and not ``int``).
is now ``npy_intp`` and not ``int``).

* ``PyDataType_ALIGNMENT``
* ``PyDataType_FIELDS``, ``PyDataType_NAMES``, ``PyDataType_SUBARRAY``
* ``PyDataType_C_METADATA``

Cython code should use Cython 3, in which case the change is transparent.
(Struct access is available for elsize and alignment when compiling only for
NumPy 2.)

For compiling with both 1.x and 2.x if you use these new accessors it is
unfortunately necessary to either define them locally via a macro like::

#if NPY_ABI_VERSION < 0x02000000
#define PyDataType_ELSIZE(descr) ((descr)->elsize)
#endif

or add ``npy2_compat.h`` to your code base and explicitly include it
when compiling with NumPy 1.x (as they are new API).
Including the file has no effect on NumPy 2.
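
The local-macro approach described above can be illustrated with a small, self-contained sketch. It deliberately does not use the NumPy headers: ``demo_descr``, ``DEMO_ABI_VERSION``, ``DEMO_ELSIZE`` and ``demo_buffer_bytes`` are hypothetical stand-ins (for ``PyArray_Descr``, ``NPY_ABI_VERSION``, ``PyDataType_ELSIZE`` and user code respectively), and ``ptrdiff_t`` is assumed here as a stand-in for ``npy_intp``. The sketch only shows the pattern: the accessor is defined locally when building against the old layout, and calling code stays the same for both.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for NPY_ABI_VERSION; defaults to an "old" ABI. */
#ifndef DEMO_ABI_VERSION
#define DEMO_ABI_VERSION 0x01000000
#endif

#if DEMO_ABI_VERSION < 0x02000000
/* Old layout: the element size is stored as a plain int. */
typedef struct {
    int type_num;
    int elsize;
} demo_descr;

/* The accessor does not exist in the old API, so define it locally.
 * The cast gives the same (wider) result type as the new accessor. */
#define DEMO_ELSIZE(descr) ((ptrdiff_t)((descr)->elsize))
#else
/* New layout: the element size is a pointer-sized signed integer. */
typedef struct {
    int type_num;
    ptrdiff_t elsize;
} demo_descr;

/* Pretend the library itself provides the accessor in this case. */
static inline ptrdiff_t DEMO_ELSIZE(const demo_descr *descr)
{
    return descr->elsize;
}
#endif

/* User code: identical for both layouts because it only uses the accessor. */
static ptrdiff_t demo_buffer_bytes(const demo_descr *descr, ptrdiff_t count)
{
    assert(descr != NULL);
    return DEMO_ELSIZE(descr) * count;
}
```

Compiling the same file with ``-DDEMO_ABI_VERSION=0x02000000`` selects the second layout without any change to ``demo_buffer_bytes``. Storing the accessor result in an ``int`` would reintroduce the size limit that the wider return type is meant to remove.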

Please do not hesitate to open a NumPy issue if you require assistance or
the provided functions are not sufficient.

**Custom User DTypes:**
Existing user dtypes must now use ``PyArray_DescrProto`` to define their
dtype and slightly modify the code. See note in `PyArray_RegisterDataType`.
Review comment (Member):

Link to PyArray_RegisterDataType is broken in the rendered page.


Functionality moved to headers requiring ``import_array()``
-----------------------------------------------------------
If you previously included only ``ndarraytypes.h`` you may find that some
58 changes: 54 additions & 4 deletions doc/source/reference/c-api/array.rst
@@ -761,29 +761,79 @@ cannot not be accessed directly.

.. versionchanged:: 2.0
Prior to NumPy 2.0 the ABI was different but unnecessarily large for user
DTypes. These accessors were all added in 2.0.
DTypes. These accessors were all added in 2.0 and can be backported
(see :ref:`migration_c_descr`).

.. c:function:: npy_intp PyDataType_ELSIZE(PyArray_Descr *descr)

The element size of the datatype (``itemsize`` in Python).

.. note::
If the ``descr`` is attached to an array ``PyArray_ITEMSIZE(arr)``
can be used and is available on all NumPy versions.

.. c:function:: void PyDataType_SET_ELSIZE(PyArray_Descr *descr, npy_intp size)

Allows setting of the itemsize, this is *only* relevant for string/bytes
datatypes as it is the current pattern to define one with a new size.

.. c:function:: npy_intp PyDataType_ALIGNMENT(PyArray_Descr *descr)

The alignment of the datatype.

.. c:function:: PyObject *PyDataType_METADATA(PyArray_Descr *descr)

The Metadata attached to a dtype, either ``NULL`` or a dictionary.

.. c:function:: PyObject *PyDataType_NAMES(PyArray_Descr *descr)

``NULL`` or a list of structured field names attached to a dtype,
this list should not be mutated, NumPy may change the way fields are
stored in the future.
``NULL`` or a tuple of structured field names attached to a dtype.

.. c:function:: PyObject *PyDataType_FIELDS(PyArray_Descr *descr)

``NULL``, ``None``, or a dict of structured dtype fields, this dict must
not be mutated, NumPy may change the way fields are stored in the future.

This is the same dict as returned by `np.dtype.fields`.

.. c:function:: NpyAuxData *PyDataType_C_METADATA(PyArray_Descr *descr)

C-metadata object attached to a descriptor. This accessor should not
be needed usually. The C-Metadata field does provide access to the
datetime/timedelta time unit information.

.. c:function:: PyArray_ArrayDescr *PyDataType_SUBARRAY(PyArray_Descr *descr)

Information about a subarray dtype equivalent to the Python `np.dtype.base`
and `np.dtype.shape`.

If this is non- ``NULL``, then this data-type descriptor is a
C-style contiguous array of another data-type descriptor. In
other-words, each element that this descriptor describes is
actually an array of some other base descriptor. This is most
useful as the data-type descriptor for a field in another
data-type descriptor. The fields member should be ``NULL`` if this
is non- ``NULL`` (the fields member of the base descriptor can be
non- ``NULL`` however).

.. c:type:: PyArray_ArrayDescr
Review comment (Member):

Is this indented on purpose? See the rendered page


.. code-block:: c

typedef struct {
PyArray_Descr *base;
PyObject *shape;
} PyArray_ArrayDescr;

.. c:member:: PyArray_Descr *base

The data-type-descriptor object of the base-type.

.. c:member:: PyObject *shape

The shape (always C-style contiguous) of the sub-array as a Python
tuple.


Data-type checking
~~~~~~~~~~~~~~~~~~
77 changes: 21 additions & 56 deletions doc/source/reference/c-api/types-and-structures.rst
@@ -279,18 +279,24 @@ PyArrayDescr_Type and PyArray_Descr
char kind;
char type;
char byteorder;
char flags;
char _former_flags; // unused field
int type_num;
int elsize;
int alignment;
PyArray_ArrayDescr *subarray;
PyObject *fields;
PyObject *names;
PyObject *metadata;
/*
* Definitions after this one must be accessed through accessor
* functions (see below) when compiling with NumPy 1.x support.
*/
npy_uint64 flags;
npy_intp elsize;
npy_intp alignment;
NpyAuxData *c_metadata;
npy_hash_t hash;
void *reserved_null; // unused field, must be NULL.
} PyArray_Descr;

Some dtypes have additional members which are accessible through
`PyDataType_NAMES`, `PyDataType_FIELDS`, `PyDataType_SUBARRAY`, and
in some cases (times) `PyDataType_C_METADATA`.

Review comment (Member):

These reference links are not working. Do you need a domain?

.. c:member:: PyTypeObject *typeobj

Pointer to a typeobject that is the corresponding Python type for
@@ -320,7 +326,7 @@ PyArrayDescr_Type and PyArray_Descr
endian), '=' (native), '\|' (irrelevant, ignore). All builtin data-
types have byteorder '='.

.. c:member:: char flags
.. c:member:: npy_uint64 flags

A data-type bit-flag that determines if the data-type exhibits object-
array like behavior. Each bit in this member is a flag which are named
@@ -342,67 +348,26 @@ PyArrayDescr_Type and PyArray_Descr
A number that uniquely identifies the data type. For new data-types,
this number is assigned when the data-type is registered.

.. c:member:: int elsize
.. c:member:: npy_intp elsize

For data types that are always the same size (such as long), this
holds the size of the data type. For flexible data types where
different arrays can have a different elementsize, this should be
0.

.. c:member:: int alignment
See `PyDataType_ELSIZE` and `PyDataType_SET_ELSIZE` for a way to access
this field in a NumPy 1.x compatible way.

.. c:member:: npy_intp alignment
Review comment (Member):

The links are broken. Do you need a domain?


A number providing alignment information for this data type.
Specifically, it shows how far from the start of a 2-element
structure (whose first element is a ``char`` ), the compiler
places an item of this type: ``offsetof(struct {char c; type v;},
v)``
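
The ``offsetof`` definition above can be tried out in plain C without NumPy. The following is a minimal sketch under that assumption: the struct names, the ``DEMO_ALIGNMENT_*`` macros and ``demo_is_aligned`` are hypothetical names introduced only for illustration and are not part of the NumPy API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two-element structs whose first member is a char, as in the definition. */
struct char_then_int    { char c; int v; };
struct char_then_double { char c; double v; };

/* The offset of the second member is the alignment the compiler applies. */
#define DEMO_ALIGNMENT_INT    offsetof(struct char_then_int, v)
#define DEMO_ALIGNMENT_DOUBLE offsetof(struct char_then_double, v)

/* Returns 1 if ptr is located on a multiple of `alignment` bytes, else 0. */
static int demo_is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment > 0);
    return ((uintptr_t)ptr % alignment) == 0;
}
```

The values are platform dependent. In particular, the offset obtained this way for ``double`` may be smaller than ``sizeof(double)`` on some 32-bit ABIs, so it is safer to query the value than to assume it.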

.. c:member:: PyArray_ArrayDescr *subarray

If this is non- ``NULL``, then this data-type descriptor is a
C-style contiguous array of another data-type descriptor. In
other-words, each element that this descriptor describes is
actually an array of some other base descriptor. This is most
useful as the data-type descriptor for a field in another
data-type descriptor. The fields member should be ``NULL`` if this
is non- ``NULL`` (the fields member of the base descriptor can be
non- ``NULL`` however).

.. c:type:: PyArray_ArrayDescr

.. code-block:: c

typedef struct {
PyArray_Descr *base;
PyObject *shape;
} PyArray_ArrayDescr;

.. c:member:: PyArray_Descr *base

The data-type-descriptor object of the base-type.

.. c:member:: PyObject *shape

The shape (always C-style contiguous) of the sub-array as a Python
tuple.

.. c:member:: PyObject *fields

If this is non-NULL, then this data-type-descriptor has fields
described by a Python dictionary whose keys are names (and also
titles if given) and whose values are tuples that describe the
fields. Recall that a data-type-descriptor always describes a
fixed-length set of bytes. A field is a named sub-region of that
total, fixed-length collection. A field is described by a tuple
composed of another data- type-descriptor and a byte
offset. Optionally, the tuple may contain a title which is
normally a Python string. These tuples are placed in this
dictionary keyed by name (and also title if given).

.. c:member:: PyObject *names

An ordered tuple of field names. It is NULL if no field is
defined.
See `PyDataType_ALIGNMENT` for a way to access this field in a NumPy 1.x
compatible way.
Review comment (Member):

The link is broken.


.. c:member:: PyObject *metadata

13 changes: 13 additions & 0 deletions doc/source/release/2.0.0-notes.rst
@@ -486,6 +486,19 @@ The ``metadata`` field is kept, but the macro version should also be preferred.

(`gh-25802 <https://github.com/numpy/numpy/pull/25802>`__)


Descriptor ``elsize`` and ``alignment`` access
----------------------------------------------
Unless compiling only with NumPy 2 support, the ``elsize`` and ``alignment``
fields must now be accessed via `PyDataType_ELSIZE`,
`PyDataType_SET_ELSIZE`, and `PyDataType_ALIGNMENT`.
Review comment (Member):

The links above are broken. Maybe you can fix the other warnings in the release notes, see the warnings starting here

In cases where the descriptor is attached to an array, we advise
using ``PyArray_ITEMSIZE`` as it exists on all NumPy versions.
Please see :ref:`migration_c_descr` for more information.

(`gh-25943 <https://github.com/numpy/numpy/pull/25943>`__)


NumPy 2.0 C API removals
========================
