ENH: Configurable allocator by mattip · Pull Request #17582 · numpy/numpy · GitHub

ENH: Configurable allocator #17582


Merged
merged 83 commits into from
Oct 25, 2021
Changes from 1 commit
83 commits
55f2f6c
ENH: add and use global configurable memory routines
mattip Oct 11, 2020
23da73e
ENH: add tests and a way to compile c-extensions from tests
mattip Oct 16, 2020
94b9f25
fix allocation/free exposed by tests
mattip Oct 18, 2020
81b45fd
DOC: document the new APIs (and some old ones too)
mattip Oct 18, 2020
fc32c2f
BUG: return void from FREE, also some cleanup
mattip Oct 19, 2020
de22327
MAINT: changes from review
mattip Oct 19, 2020
38274a4
fixes from linter
mattip Mar 11, 2021
59b520a
setting ndarray->descr on 0d or scalars mess with FREE
mattip Apr 16, 2021
5264019
make scalar allocation more consistent wrt np_alloc_cache
mattip Apr 16, 2021
7c396d7
change formatting for sphinx
mattip Apr 17, 2021
de9001c
remove memcpy variants
mattip Apr 19, 2021
9e7c3ed
update to match NEP 49
mattip May 3, 2021
953cc88
ENH: add a python-level get_handler_name
mattip May 6, 2021
ad9329b
ENH: add core.multiarray.get_handler_name
mattip May 6, 2021
6d10fdb
Allow closure-like definition of the data mem routines
eliaskoromilas Jun 30, 2021
18bea05
Fix incompatible pointer warnings
eliaskoromilas Jun 30, 2021
4368023
Note PyDataMemAllocator and PyMemAllocatorEx differentiation
eliaskoromilas Jul 1, 2021
d243313
Redefine default allocator handling
eliaskoromilas Jul 1, 2021
c9b6854
Always allocate new arrays using the current_handler
eliaskoromilas Jul 5, 2021
a8fd378
Search for the mem_handler name of the data owner
eliaskoromilas Jul 5, 2021
a17565b
Sub-comparisons don't need a local mem_handler
eliaskoromilas Jul 5, 2021
2ec5912
Make the default_handler a valid PyDataMem_Handler
eliaskoromilas Jul 14, 2021
227c4b8
Fix PyDataMem_SetHandler description (NEP discussion)
eliaskoromilas Jul 14, 2021
fb8135d
Pass the allocators by reference
eliaskoromilas Jul 14, 2021
7291484
Implement allocator context-locality
eliaskoromilas Aug 8, 2021
c7a9c22
Fix documentation, make PyDataMem_GetHandler return const
eliaskoromilas Aug 8, 2021
99f8250
remove import of setuptools==49.1.3, doesn't work on python3.10
mattip Aug 9, 2021
b43c1fe
Fix refcount leaks
eliaskoromilas Aug 9, 2021
a7a5435
fix function signatures in test
mattip Aug 9, 2021
e3723df
Return early on PyDataMem_GetHandler error (VOID_compare)
eliaskoromilas Aug 9, 2021
144acc6
Add context/thread-locality tests, allow testing custom policies
eliaskoromilas Aug 9, 2021
8539f5f
Merge branch 'configurable_allocator' into configurable_allocator
eliaskoromilas Aug 9, 2021
6ab00d0
ENH: add and use global configurable memory routines
mattip Oct 11, 2020
e7e8754
ENH: add tests and a way to compile c-extensions from tests
mattip Oct 16, 2020
5f08532
fix allocation/free exposed by tests
mattip Oct 18, 2020
7266029
DOC: document the new APIs (and some old ones too)
mattip Oct 18, 2020
5d547ff
BUG: return void from FREE, also some cleanup
mattip Oct 19, 2020
4617c50
MAINT: changes from review
mattip Oct 19, 2020
8f739c4
fixes from linter
mattip Mar 11, 2021
90205b6
setting ndarray->descr on 0d or scalars mess with FREE
mattip Apr 16, 2021
5c0d3f9
make scalar allocation more consistent wrt np_alloc_cache
mattip Apr 16, 2021
3b385d9
change formatting for sphinx
mattip Apr 17, 2021
e6e12a3
remove memcpy variants
mattip Apr 19, 2021
048552d
update to match NEP 49
mattip May 3, 2021
ad6f8ad
ENH: add a python-level get_handler_name
mattip May 6, 2021
50f8b93
ENH: add core.multiarray.get_handler_name
mattip May 6, 2021
c7438f5
Allow closure-like definition of the data mem routines
eliaskoromilas Jun 30, 2021
f823ba4
Fix incompatible pointer warnings
eliaskoromilas Jun 30, 2021
ad13161
Note PyDataMemAllocator and PyMemAllocatorEx differentiation
eliaskoromilas Jul 1, 2021
0a08acd
Redefine default allocator handling
eliaskoromilas Jul 1, 2021
1f0301d
Always allocate new arrays using the current_handler
eliaskoromilas Jul 5, 2021
3d56aa0
Search for the mem_handler name of the data owner
eliaskoromilas Jul 5, 2021
8ea6818
Sub-comparisons don't need a local mem_handler
eliaskoromilas Jul 5, 2021
fb2af4d
Make the default_handler a valid PyDataMem_Handler
eliaskoromilas Jul 14, 2021
f05a1c6
Fix PyDataMem_SetHandler description (NEP discussion)
eliaskoromilas Jul 14, 2021
660e0a4
Pass the allocators by reference
eliaskoromilas Jul 14, 2021
a4f8d71
remove import of setuptools==49.1.3, doesn't work on python3.10
mattip Aug 9, 2021
d7b1a1d
fix function signatures in test
mattip Aug 9, 2021
ab1a0eb
try to fix cygwin extension building
mattip Aug 9, 2021
b92e36c
YAPF mem_policy test
eliaskoromilas Aug 9, 2021
76cda3a
Merge branch 'configurable_allocator' into configurable_allocator
eliaskoromilas Aug 9, 2021
3eadf2f
Less empty lines, more comments (tests)
eliaskoromilas Aug 10, 2021
dbe9d73
Apply suggestions from code review (set an exception and)
eliaskoromilas Aug 10, 2021
1bfb870
Merge pull request #57 from eliaskoromilas/configurable_allocator
mattip Aug 11, 2021
0511820
skip test on cygwin
mattip Aug 11, 2021
23c4bc0
update API hash for changed signature
mattip Aug 13, 2021
ed8649b
TST: add gc.collect to make sure cycles are broken
mattip Aug 13, 2021
9aacefa
Implement thread-locality for PyPy
eliaskoromilas Aug 12, 2021
79712fa
Update numpy/core/tests/test_mem_policy.py
mattip Aug 25, 2021
a2ae4c0
fixes from review
mattip Aug 25, 2021
09b9c0d
update circleci config
mattip Aug 25, 2021
efb3c77
fix test
mattip Aug 25, 2021
2945c64
make the connection between OWNDATA and having a allocator handle mor…
mattip Aug 25, 2021
3a97d9a
improve docstring, fix flake8 for tests
mattip Aug 26, 2021
1df805c
update PyDataMem_GetHandler() from review
mattip Aug 26, 2021
ef607bd
Implement allocator lifetime management
eliaskoromilas Aug 27, 2021
8bdc9a1
Merge pull request #59 from eliaskoromilas/configurable_allocator
mattip Aug 29, 2021
a3256e5
update NEP and add best-effort handling of error in PyDataMem_UserFREE
mattip Aug 29, 2021
4d6ea65
merge main into branch
mattip Aug 30, 2021
5941d7c
merge main into branch
mattip Oct 17, 2021
522c368
Merge branch 'main' into configurable_allocator
mattip Oct 25, 2021
442b0e1
ENH: fix and test for blindly taking ownership of data
mattip Oct 25, 2021
8ca8b54
Update doc/neps/nep-0049.rst
seberg Oct 25, 2021
update to match NEP 49
mattip committed Aug 8, 2021
commit 9e7c3ed40bcf51a0cdbc46dc1dfcb9550bded991
52 changes: 26 additions & 26 deletions doc/source/reference/c-api/data_memory.rst
@@ -7,11 +7,10 @@ to hold `numpy.ndarray.strides`, `numpy.ndarray.shape` and
after creating the python object in `__new__`. The ``strides`` and
``shape`` are stored in a piece of memory allocated internally.

These allocations are small relative to the ``data``, the homogeneous chunk of
memory used to store the actual array values (which could be pointers in the
case of ``object`` arrays). Since that memory can be significantly large, NumPy
has provided interfaces to manage it. This document details how those
interfaces work.
The ``data`` allocation used to store the actual array values (which could be
pointers in the case of ``object`` arrays) can be very large, so NumPy has
provided interfaces to manage its allocation and release. This document details
how those interfaces work.

Historical overview
-------------------
@@ -22,15 +21,23 @@ which are backed by `alloc`, `free`, `realloc` respectively. In that version
NumPy also exposed the `PyDataMem_EventHook` function described below, which
wrap the OS-level calls.

Python also improved its memory management capabilities, and began providing
Since those early days, Python also improved its memory management
capabilities, and began providing
various :ref:`management policies <memoryoverview>` beginning in version
3.4. These routines are divided into a set of domains, each domain has a
:c:type:`PyMemAllocatorEx` structure of routines for memory management. Python also
added a `tracemalloc` module to trace calls to the various routines. These
tracking hooks were added to the NumPy ``PyDataMem_*`` routines.

Configurable memory routines in NumPy
-------------------------------------
NumPy added a small cache of allocated memory in its internal
``npy_alloc_cache``, ``npy_alloc_cache_zero``, and ``npy_free_cache``
functions. These wrap ``alloc``, ``alloc-and-memset(0)`` and ``free``
respectively, but when ``npy_free_cache`` is called, it adds the pointer to a
short list of available blocks marked by size. These blocks can be re-used by
subsequent calls to ``npy_alloc*``, avoiding memory thrashing.
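
The caching idea in the paragraph above can be sketched as a small size-bucketed free list. This is only an illustration under stated assumptions, not NumPy's implementation: every name (``sketch_alloc_cache``, ``sketch_free_cache``, ``sketch_cache``) and every constant below is hypothetical, and the sketch uses plain ``malloc``/``free`` underneath.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical constants; NumPy's real cache uses its own values. */
#define SKETCH_NBUCKETS 16 /* size classes of 1..16 units */
#define SKETCH_UNIT 16     /* bytes per unit */
#define SKETCH_DEPTH 4     /* blocks retained per size class */

typedef struct {
    size_t available;         /* number of cached pointers */
    void *ptrs[SKETCH_DEPTH]; /* cached blocks of this size class */
} sketch_bucket;

/* Zero-initialized: every bucket starts empty. */
sketch_bucket sketch_cache[SKETCH_NBUCKETS];

/* Map a byte size to a bucket index, or SKETCH_NBUCKETS if too large. */
size_t
sketch_bucket_index(size_t size)
{
    size_t units;
    if (size == 0) {
        size = 1;
    }
    units = (size + SKETCH_UNIT - 1) / SKETCH_UNIT;
    return units <= SKETCH_NBUCKETS ? units - 1 : SKETCH_NBUCKETS;
}

/* Reuse a cached block of the same size class if one is available. */
void *
sketch_alloc_cache(size_t size)
{
    size_t i = sketch_bucket_index(size);
    if (i < SKETCH_NBUCKETS) {
        sketch_bucket *b = &sketch_cache[i];
        if (b->available > 0) {
            return b->ptrs[--b->available];
        }
        /* Allocate the full size class so the block fits any later
         * request that maps to the same bucket. */
        return malloc((i + 1) * SKETCH_UNIT);
    }
    return malloc(size);
}

/* Keep small blocks for reuse; release large ones or overflow. */
void
sketch_free_cache(void *ptr, size_t size)
{
    size_t i = sketch_bucket_index(size);
    if (ptr == NULL) {
        return;
    }
    if (i < SKETCH_NBUCKETS && sketch_cache[i].available < SKETCH_DEPTH) {
        sketch_cache[i].ptrs[sketch_cache[i].available++] = ptr;
        return;
    }
    free(ptr);
}
```

Freeing a 40-byte block and then requesting 33 bytes hands back the same pointer in this sketch, because both sizes fall in the same 48-byte size class; that reuse is what avoids a round trip to the system allocator.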

Configurable memory routines in NumPy (NEP 49)
----------------------------------------------

Users may wish to override the internal data memory routines with ones of their
own. Since NumPy does not use the Python domain strategy to manage data memory,
@@ -44,9 +51,7 @@ call :c:func:`PyTraceMalloc_Track`, :c:func:`PyTraceMalloc_Untrack`, and will
use the :c:func:`PyDataMem_EventHookFunc` mechanism. Since the functions may
change during the lifetime of the process, each ``ndarray`` carries with it the
functions used at the time of its instantiation, and these will be used to
reallocate or free the data memory of the instance. As of NumPy version 1.21,
the copy functions are not yet implemented, all memory copies are handled by
calls to ``memcpy``.
reallocate or free the data memory of the instance.

.. c:type:: PyDataMem_Handler

@@ -55,14 +60,11 @@ calls to ``memcpy``.
.. code-block:: c

typedef struct {
Review comment (Member):
It would be good to reference the builtin PyMemAllocatorEx somewhere in this section and compare the two (https://docs.python.org/3/c-api/memory.html#customize-memory-allocators).

Reply (Member Author):
Will add. The original npy_*_cache set of functions uses PyDataMem_* functions, which, it seems, preceded these more sophisticated interfaces and directly used malloc/calloc/free.

Reply (Member Author):
Documented. The Python ones do not use our non-documented-but-public PyDataMem_EventHookFunc callbacks. So I could see a path where we deprecate PyDataMem_EventHookFunc and move to the Python memory management strategies, although that would mean:

  • if someone by chance implemented a PyDataMem_EventHookFunc callback it would no longer work.
  • in order to override data allocations, a user would also override other random memory allocations

char name[128]; /* multiple of 64 to keep the struct unaligned */
char name[128]; /* multiple of 64 to keep the struct aligned */
PyDataMem_AllocFunc *alloc;
PyDataMem_ZeroedAllocFunc *zeroed_alloc;
PyDataMem_FreeFunc *free;
PyDataMem_ReallocFunc *realloc;
PyDataMem_CopyFunc *host2obj; /* copy from the host python */
PyDataMem_CopyFunc *obj2host; /* copy to the host python */
PyDataMem_CopyFunc *obj2obj; /* copy between two objects */
} PyDataMem_Handler;

where the function's signatures are
@@ -73,40 +75,38 @@ calls to ``memcpy``.
typedef void *(PyDataMem_ZeroedAllocFunc)(size_t nelems, size_t elsize);
typedef void (PyDataMem_FreeFunc)(void *ptr, size_t size);
typedef void *(PyDataMem_ReallocFunc)(void *ptr, size_t size);
typedef void *(PyDataMem_CopyFunc)(void *dst, const void *src, size_t size);
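
A minimal sketch of how a handler could be filled in, assuming the struct as it reads after this commit (no copy functions). The typedefs and struct are redeclared locally only so the example compiles without NumPy headers; real code would take them from the NumPy headers. The ``counting_*`` functions and ``counting_live_blocks`` are hypothetical names. Installing the handler would go through ``PyDataMem_SetHandler``, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Local copies of the declarations from this diff (assumption: they
 * match the NumPy headers at this commit). */
typedef void *(PyDataMem_AllocFunc)(size_t size);
typedef void *(PyDataMem_ZeroedAllocFunc)(size_t nelems, size_t elsize);
typedef void (PyDataMem_FreeFunc)(void *ptr, size_t size);
typedef void *(PyDataMem_ReallocFunc)(void *ptr, size_t size);

typedef struct {
    char name[128];
    PyDataMem_AllocFunc *alloc;
    PyDataMem_ZeroedAllocFunc *zeroed_alloc;
    PyDataMem_FreeFunc *free;
    PyDataMem_ReallocFunc *realloc;
} PyDataMem_Handler;

/* Hypothetical bookkeeping: number of blocks currently allocated. */
size_t counting_live_blocks = 0;

void *
counting_alloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL) {
        counting_live_blocks++;
    }
    return p;
}

void *
counting_zeroed_alloc(size_t nelems, size_t elsize)
{
    void *p = calloc(nelems, elsize);
    if (p != NULL) {
        counting_live_blocks++;
    }
    return p;
}

void
counting_free(void *ptr, size_t size)
{
    (void)size; /* the size is passed in but not needed by free() */
    if (ptr != NULL) {
        counting_live_blocks--;
        free(ptr);
    }
}

void *
counting_realloc(void *ptr, size_t size)
{
    void *p = realloc(ptr, size);
    /* Growing from NULL creates a new block; otherwise the count is
     * unchanged whether or not the block moved. */
    if (ptr == NULL && p != NULL) {
        counting_live_blocks++;
    }
    return p;
}

PyDataMem_Handler counting_handler = {
    "counting_allocator",
    counting_alloc,
    counting_zeroed_alloc,
    counting_free,
    counting_realloc,
};
```

Calling the routines through ``counting_handler.alloc`` and ``counting_handler.free`` keeps ``counting_live_blocks`` balanced, which is one way a test could check that every allocation made under a policy is released by the same policy.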

.. c:function:: const PyDataMem_Handler * PyDataMem_SetHandler(PyDataMem_Handler *handler)

Sets a new allocation policy. If the input value is NULL, will reset
the policy to the default. Returns the previous policy, NULL if the
Sets a new allocation policy. If the input value is ``NULL``, will reset
the policy to the default. Returns the previous policy, ``NULL`` if the
previous policy was the default. We wrap the user-provided functions
so they will still call the python and numpy memory management callback
hooks.

.. c:function:: const char * PyDataMem_GetHandlerName(PyArrayObject *obj)

Return the const char name of the PyDataMem_Handler used by the
PyArrayObject. If NULL, return the name of the current global policy that
will be used to allocate data for the next PyArrayObject
Return the const char name of the `PyDataMem_Handler` used by the
``PyArrayObject``. If ``NULL``, return the name of the current global policy
that will be used to allocate data for the next ``PyArrayObject``.

For an example of setting up and using the PyDataMem_Handler, see the test in
:file:`numpy/core/tests/test_mem_policy.py`

.. c:function:: void PyDataMem_EventHookFunc(void *inp, void *outp, size_t size, void *user_data);

This function will be called on NEW,FREE,RENEW calls in data memory
manipulation
This function will be called during data memory manipulation
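
A hedged sketch of what a hook with this signature might look like. It assumes, and this hunk does not spell out, that the hook receives ``(NULL, result, size)`` after a new allocation, ``(ptr, NULL, 0)`` after a free, and ``(ptr, result, size)`` after a renew. The typedef is redeclared locally so the example compiles without NumPy headers; ``hook_stats`` and ``stats_hook`` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the signature documented above. */
typedef void (PyDataMem_EventHookFunc)(void *inp, void *outp, size_t size,
                                       void *user_data);

/* Hypothetical counters, passed to the hook through user_data. */
typedef struct {
    size_t n_new;
    size_t n_free;
    size_t n_renew;
    size_t bytes_requested;
} hook_stats;

/* Classify each event by which pointers are NULL (assumed convention).
 * A failed allocation, (NULL, NULL, size), is counted as a "new". */
void
stats_hook(void *inp, void *outp, size_t size, void *user_data)
{
    hook_stats *s = user_data;
    if (inp == NULL) {
        s->n_new++;
        s->bytes_requested += size;
    }
    else if (outp == NULL) {
        s->n_free++;
    }
    else {
        s->n_renew++;
        s->bytes_requested += size;
    }
}
```

Such a hook would be registered with ``PyDataMem_SetEventHook``, passing a pointer to a ``hook_stats`` instance as ``user_data``.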



.. c:function:: PyDataMem_EventHookFunc * PyDataMem_SetEventHook(PyDataMem_EventHookFunc *newhook, void *user_data, void **old_data)

Sets the allocation event hook for numpy array data.

Returns a pointer to the previous hook or NULL. If old_data is
non-NULL, the previous user_data pointer will be copied to it.
Returns a pointer to the previous hook or ``NULL``. If old_data is
non-``NULL``, the previous user_data pointer will be copied to it.

If not NULL, hook will be called at the end of each PyDataMem_NEW/FREE/RENEW:
If not ``NULL``, hook will be called at the end of each ``PyDataMem_NEW/FREE/RENEW``:

.. code-block:: c

Expand Down
1 change: 0 additions & 1 deletion numpy/core/include/numpy/ndarraytypes.h
@@ -671,7 +671,6 @@ typedef void *(PyDataMem_AllocFunc)(size_t size);
typedef void *(PyDataMem_ZeroedAllocFunc)(size_t nelems, size_t elsize);
typedef void (PyDataMem_FreeFunc)(void *ptr, size_t size);
typedef void *(PyDataMem_ReallocFunc)(void *ptr, size_t size);
typedef void *(PyDataMem_CopyFunc)(void *dst, const void *src, size_t size);

typedef struct {
char name[128]; /* multiple of 64 to keep the struct aligned */
Expand Down