ENH: Configurable allocator by mattip · Pull Request #17582 · numpy/numpy · GitHub

ENH: Configurable allocator #17582

Merged
merged 83 commits into from
Oct 25, 2021
Changes from 1 commit
55f2f6c
ENH: add and use global configurable memory routines
mattip Oct 11, 2020
23da73e
ENH: add tests and a way to compile c-extensions from tests
mattip Oct 16, 2020
94b9f25
fix allocation/free exposed by tests
mattip Oct 18, 2020
81b45fd
DOC: document the new APIs (and some old ones too)
mattip Oct 18, 2020
fc32c2f
BUG: return void from FREE, also some cleanup
mattip Oct 19, 2020
de22327
MAINT: changes from review
mattip Oct 19, 2020
38274a4
fixes from linter
mattip Mar 11, 2021
59b520a
setting ndarray->descr on 0d or scalars mess with FREE
mattip Apr 16, 2021
5264019
make scalar allocation more consistent wrt np_alloc_cache
mattip Apr 16, 2021
7c396d7
change formatting for sphinx
mattip Apr 17, 2021
de9001c
remove memcpy variants
mattip Apr 19, 2021
9e7c3ed
update to match NEP 49
mattip May 3, 2021
953cc88
ENH: add a python-level get_handler_name
mattip May 6, 2021
ad9329b
ENH: add core.multiarray.get_handler_name
mattip May 6, 2021
6d10fdb
Allow closure-like definition of the data mem routines
eliaskoromilas Jun 30, 2021
18bea05
Fix incompatible pointer warnings
eliaskoromilas Jun 30, 2021
4368023
Note PyDataMemAllocator and PyMemAllocatorEx differentiation
eliaskoromilas Jul 1, 2021
d243313
Redefine default allocator handling
eliaskoromilas Jul 1, 2021
c9b6854
Always allocate new arrays using the current_handler
eliaskoromilas Jul 5, 2021
a8fd378
Search for the mem_handler name of the data owner
eliaskoromilas Jul 5, 2021
a17565b
Sub-comparisons don't need a local mem_handler
eliaskoromilas Jul 5, 2021
2ec5912
Make the default_handler a valid PyDataMem_Handler
eliaskoromilas Jul 14, 2021
227c4b8
Fix PyDataMem_SetHandler description (NEP discussion)
eliaskoromilas Jul 14, 2021
fb8135d
Pass the allocators by reference
eliaskoromilas Jul 14, 2021
7291484
Implement allocator context-locality
eliaskoromilas Aug 8, 2021
c7a9c22
Fix documentation, make PyDataMem_GetHandler return const
eliaskoromilas Aug 8, 2021
99f8250
remove import of setuptools==49.1.3, doesn't work on python3.10
mattip Aug 9, 2021
b43c1fe
Fix refcount leaks
eliaskoromilas Aug 9, 2021
a7a5435
fix function signatures in test
mattip Aug 9, 2021
e3723df
Return early on PyDataMem_GetHandler error (VOID_compare)
eliaskoromilas Aug 9, 2021
144acc6
Add context/thread-locality tests, allow testing custom policies
eliaskoromilas Aug 9, 2021
8539f5f
Merge branch 'configurable_allocator' into configurable_allocator
eliaskoromilas Aug 9, 2021
6ab00d0
ENH: add and use global configurable memory routines
mattip Oct 11, 2020
e7e8754
ENH: add tests and a way to compile c-extensions from tests
mattip Oct 16, 2020
5f08532
fix allocation/free exposed by tests
mattip Oct 18, 2020
7266029
DOC: document the new APIs (and some old ones too)
mattip Oct 18, 2020
5d547ff
BUG: return void from FREE, also some cleanup
mattip Oct 19, 2020
4617c50
MAINT: changes from review
mattip Oct 19, 2020
8f739c4
fixes from linter
mattip Mar 11, 2021
90205b6
setting ndarray->descr on 0d or scalars mess with FREE
mattip Apr 16, 2021
5c0d3f9
make scalar allocation more consistent wrt np_alloc_cache
mattip Apr 16, 2021
3b385d9
change formatting for sphinx
mattip Apr 17, 2021
e6e12a3
remove memcpy variants
mattip Apr 19, 2021
048552d
update to match NEP 49
mattip May 3, 2021
ad6f8ad
ENH: add a python-level get_handler_name
mattip May 6, 2021
50f8b93
ENH: add core.multiarray.get_handler_name
mattip May 6, 2021
c7438f5
Allow closure-like definition of the data mem routines
eliaskoromilas Jun 30, 2021
f823ba4
Fix incompatible pointer warnings
eliaskoromilas Jun 30, 2021
ad13161
Note PyDataMemAllocator and PyMemAllocatorEx differentiation
eliaskoromilas Jul 1, 2021
0a08acd
Redefine default allocator handling
eliaskoromilas Jul 1, 2021
1f0301d
Always allocate new arrays using the current_handler
eliaskoromilas Jul 5, 2021
3d56aa0
Search for the mem_handler name of the data owner
eliaskoromilas Jul 5, 2021
8ea6818
Sub-comparisons don't need a local mem_handler
eliaskoromilas Jul 5, 2021
fb2af4d
Make the default_handler a valid PyDataMem_Handler
eliaskoromilas Jul 14, 2021
f05a1c6
Fix PyDataMem_SetHandler description (NEP discussion)
eliaskoromilas Jul 14, 2021
660e0a4
Pass the allocators by reference
eliaskoromilas Jul 14, 2021
a4f8d71
remove import of setuptools==49.1.3, doesn't work on python3.10
mattip Aug 9, 2021
d7b1a1d
fix function signatures in test
mattip Aug 9, 2021
ab1a0eb
try to fix cygwin extension building
mattip Aug 9, 2021
b92e36c
YAPF mem_policy test
eliaskoromilas Aug 9, 2021
76cda3a
Merge branch 'configurable_allocator' into configurable_allocator
eliaskoromilas Aug 9, 2021
3eadf2f
Less empty lines, more comments (tests)
eliaskoromilas Aug 10, 2021
dbe9d73
Apply suggestions from code review (set an exception and)
eliaskoromilas Aug 10, 2021
1bfb870
Merge pull request #57 from eliaskoromilas/configurable_allocator
mattip Aug 11, 2021
0511820
skip test on cygwin
mattip Aug 11, 2021
23c4bc0
update API hash for changed signature
mattip Aug 13, 2021
ed8649b
TST: add gc.collect to make sure cycles are broken
mattip Aug 13, 2021
9aacefa
Implement thread-locality for PyPy
eliaskoromilas Aug 12, 2021
79712fa
Update numpy/core/tests/test_mem_policy.py
mattip Aug 25, 2021
a2ae4c0
fixes from review
mattip Aug 25, 2021
09b9c0d
update circleci config
mattip Aug 25, 2021
efb3c77
fix test
mattip Aug 25, 2021
2945c64
make the connection between OWNDATA and having a allocator handle mor…
mattip Aug 25, 2021
3a97d9a
improve docstring, fix flake8 for tests
mattip Aug 26, 2021
1df805c
update PyDataMem_GetHandler() from review
mattip Aug 26, 2021
ef607bd
Implement allocator lifetime management
eliaskoromilas Aug 27, 2021
8bdc9a1
Merge pull request #59 from eliaskoromilas/configurable_allocator
mattip Aug 29, 2021
a3256e5
update NEP and add best-effort handling of error in PyDataMem_UserFREE
mattip Aug 29, 2021
4d6ea65
merge main into branch
mattip Aug 30, 2021
5941d7c
merge main into branch
mattip Oct 17, 2021
522c368
Merge branch 'main' into configurable_allocator
mattip Oct 25, 2021
442b0e1
ENH: fix and test for blindly taking ownership of data
mattip Oct 25, 2021
8ca8b54
Update doc/neps/nep-0049.rst
seberg Oct 25, 2021
MAINT: changes from review
mattip committed Aug 8, 2021
commit de22327d7adf121b9210a3abb46ed2f1cf8c1d1d
60 changes: 44 additions & 16 deletions doc/source/reference/c-api/data_memory.rst
@@ -1,24 +1,52 @@
-Memory management
------------------
+Memory management in NumPy
+==========================

-The `numpy.ndarray` is a python class. It requires additinal memory allocations
+The `numpy.ndarray` is a python class. It requires additional memory allocations
 to hold `numpy.ndarray.strides`, `numpy.ndarray.shape` and
-``numpy.ndarray.data`` attributes. These attributes are specially allocated
+`numpy.ndarray.data` attributes. These attributes are specially allocated
 after creating the python object in `__new__`. The ``strides`` and
-``dimensions`` are stored in a piece of memory allocated internally.
+``shape`` are stored in a piece of memory allocated internally.

 These allocations are small relative to the ``data``, the homogeneous chunk of
 memory used to store the actual array values (which could be pointers in the
-case of ``object`` arrays). Users may wish to override the internal data
-memory routines with ones of their own. They can do this by using
-``PyDataMem_SetHandler``, which uses a ``PyDataMem_Handler`` structure to hold
-pointers to functions used to manage the data memory. The calls are wrapped
-by internal routines to call :c:func:`PyTraceMalloc_Track`,
-:c:func:`PyTraceMalloc_Untrack`, and will use the `PyDataMem_EventHookFunc`
-mechanism. Since the functions may change during the lifetime of the process,
-each `ndarray` carries with it the functions used at the time of its
-instantiation, and these will be used to reallocate or free the data memory of
-the instance.
+case of ``object`` arrays). Since that memory can be significantly large, NumPy
+has provided interfaces to manage it. This document details how those
+interfaces work.
+
+Historical overview
+-------------------
+
+Since version 1.7.0, NumPy has exposed a set of ``PyDataMem_*`` functions
+(:c:func:`PyDataMem_NEW`, :c:func:`PyDataMem_FREE`, :c:func:`PyDataMem_RENEW`)
+which are backed by `alloc`, `free`, `realloc` respectively. In that version
+NumPy also exposed the `PyDataMem_EventHook` function described below, which
+wrap the OS-level calls.
+
+Python also improved its memory management capabilities, and began providing
+various :ref:`management policies <memoryoverview>` beginning in version
+3.4. These routines are divided into a set of domains, each domain has a
+:c:type:`PyMemAllocatorEx` structure of routines for memory management. Python also
+added a `tracemalloc` module to trace calls to the various routines. These
+tracking hooks were added to the NumPy ``PyDataMem_*`` routines.
+
+Configurable memory routines in NumPy
+-------------------------------------
+
+Users may wish to override the internal data memory routines with ones of their
+own. Since NumPy does not use the Python domain strategy to manage data memory,
+it provides an alternative set of C-APIs to change memory routines. There are
+no Python domain-wide strategies for large chunks of object data, so those are
+less suited to NumPy's needs. User who wish to change the NumPy data memory
+management routines can use :c:func:`PyDataMem_SetHandler`, which uses a
+:c:type:`PyDataMem_Handler` structure to hold pointers to functions used to
+manage the data memory. The calls are still wrapped by internal routines to
+call :c:func:`PyTraceMalloc_Track`, :c:func:`PyTraceMalloc_Untrack`, and will
+use the :c:func:`PyDataMem_EventHookFunc` mechanism. Since the functions may
+change during the lifetime of the process, each ``ndarray`` carries with it the
+functions used at the time of its instantiation, and these will be used to
+reallocate or free the data memory of the instance. As of NumPy version 1.20,
+the copy functions are not yet implemented, all memory copies are handled by
+calls to ``memcpy``.

 .. c:type:: PyDataMem_Handler
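
The documentation text in this hunk says each array keeps the memory routines that were current when it was created, so a later change of the global handler does not affect how existing data is freed. The following is a minimal, standalone sketch of that idea. It does not use the NumPy headers: `toy_handler`, `toy_set_handler`, `toy_array` and the counting allocator are hypothetical names introduced only for illustration, and the struct is a simplified stand-in for `PyDataMem_Handler`, not its real definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for PyDataMem_Handler. */
typedef struct {
    char name[128];
    void *(*alloc)(size_t size);
    void (*free)(void *ptr, size_t size);
} toy_handler;

static size_t counted_allocs = 0;
static size_t counted_frees = 0;

static void *plain_alloc(size_t size) { return malloc(size); }
static void plain_free(void *ptr, size_t size) { (void)size; free(ptr); }

/* A user-supplied policy: same as malloc/free, but counts the calls. */
static void *counting_alloc(size_t size) { counted_allocs++; return malloc(size); }
static void counting_free(void *ptr, size_t size) { (void)size; counted_frees++; free(ptr); }

static toy_handler toy_default = {"toy_default", plain_alloc, plain_free};
static toy_handler toy_counting = {"toy_counting", counting_alloc, counting_free};

/* The handler that newly created arrays will pick up. */
static toy_handler *toy_current = &toy_default;

/* Install a handler (NULL restores the default) and return the old one. */
toy_handler *toy_set_handler(toy_handler *handler)
{
    toy_handler *old = toy_current;
    toy_current = (handler != NULL) ? handler : &toy_default;
    return old;
}

typedef struct {
    void *data;
    size_t nbytes;
    toy_handler *mem_handler;  /* captured at creation time */
} toy_array;

int toy_array_init(toy_array *arr, size_t nbytes)
{
    arr->mem_handler = toy_current;
    arr->nbytes = nbytes;
    arr->data = arr->mem_handler->alloc(nbytes ? nbytes : 1);
    return (arr->data != NULL) ? 0 : -1;
}

void toy_array_clear(toy_array *arr)
{
    /* Free with the handler stored on the array, not the global one. */
    arr->mem_handler->free(arr->data, arr->nbytes);
    arr->data = NULL;
}

/* Returns how many frees were routed through the counting handler. */
size_t demo_handler_swap(void)
{
    toy_array arr;
    toy_set_handler(&toy_counting);
    if (toy_array_init(&arr, 64) != 0) {
        return 0;
    }
    toy_set_handler(&toy_default);  /* the global handler changes here */
    toy_array_clear(&arr);          /* still freed by toy_counting */
    return counted_frees;
}
```

Calling `demo_handler_swap()` from a `main` function exercises the swap: the array is allocated while the counting handler is installed and is released through that same handler even though the default has been restored in between.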

@@ -62,7 +90,7 @@ the instance.
 will be used to allocate data for the next PyArrayObject

 For an example of setting up and using the PyDataMem_Handler, see the test in
-``numpy/core/tests/test_mem_policy.py``
+:file:`numpy/core/tests/test_mem_policy.py`

 .. c:function:: typedef void (PyDataMem_EventHookFunc)(void *inp, void *outp,
 size_t size, void *user_data);
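
To make the event-hook signature above more concrete, here is a small standalone sketch of an allocator wrapper that reports each call to an optional hook. It is an illustration only: `toy_set_event_hook`, `toy_new`, `toy_free` and `toy_stats` are hypothetical names, and the convention used here (a `NULL` input pointer marks an allocation, a `NULL` output pointer marks a free) is an assumption of the sketch rather than a statement about NumPy's internals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Same shape as the PyDataMem_EventHookFunc typedef in the docs. */
typedef void (toy_event_hook)(void *inp, void *outp, size_t size,
                              void *user_data);

static toy_event_hook *toy_hook = NULL;
static void *toy_hook_data = NULL;

/* Hypothetical setter: install a hook and return the previous one. */
toy_event_hook *toy_set_event_hook(toy_event_hook *hook, void *user_data,
                                   void **old_data)
{
    toy_event_hook *old = toy_hook;
    if (old_data != NULL) {
        *old_data = toy_hook_data;
    }
    toy_hook = hook;
    toy_hook_data = user_data;
    return old;
}

/* Allocation wrapper: call malloc, then report the result to the hook. */
void *toy_new(size_t size)
{
    void *result = malloc(size);
    if (toy_hook != NULL) {
        toy_hook(NULL, result, size, toy_hook_data);
    }
    return result;
}

/* Free wrapper: report the pointer while it is still valid, then free. */
void toy_free(void *ptr)
{
    if (toy_hook != NULL) {
        toy_hook(ptr, NULL, 0, toy_hook_data);
    }
    free(ptr);
}

typedef struct {
    size_t allocs;
    size_t frees;
    size_t bytes;
} toy_stats;

/* Example hook: tally allocations, frees and requested bytes. */
void count_events(void *inp, void *outp, size_t size, void *user_data)
{
    toy_stats *stats = user_data;
    if (inp == NULL && outp != NULL) {
        stats->allocs++;
        stats->bytes += size;
    }
    else if (inp != NULL && outp == NULL) {
        stats->frees++;
    }
}

toy_stats demo_event_hook(void)
{
    toy_stats stats = {0, 0, 0};
    toy_set_event_hook(count_events, &stats, NULL);
    void *p = toy_new(32);
    toy_free(p);
    toy_set_event_hook(NULL, NULL, NULL);  /* uninstall before stats dies */
    return stats;
}
```

The `user_data` pointer is what lets the hook keep state without globals; the demo uninstalls the hook before returning so the hook never sees a dangling `stats` pointer.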
2 changes: 1 addition & 1 deletion numpy/core/include/numpy/ndarraytypes.h
@@ -674,7 +674,7 @@ typedef void *(PyDataMem_ReallocFunc)(void *ptr, size_t size);
 typedef void *(PyDataMem_CopyFunc)(void *dst, const void *src, size_t size);

 typedef struct {
-    char name[128];  /* multiple of 64 to keep the struct unaligned */
+    char name[128];  /* multiple of 64 to keep the struct aligned */
     PyDataMem_AllocFunc *alloc;
     PyDataMem_ZeroedAllocFunc *zeroed_alloc;
     PyDataMem_FreeFunc *free;
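
One way to read the corrected comment: because the `name` buffer has a size that is a multiple of the pointer alignment, the function pointers that follow start at an aligned offset with no padding inserted. The sketch below checks this with `offsetof` on a hypothetical look-alike struct (`toy_handler_layout` is not the NumPy type, and the member signatures are assumptions), and contrasts it with an odd-sized buffer where the compiler typically has to pad.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical look-alike of the struct in the hunk above. */
typedef struct {
    char name[128];
    void *(*alloc)(size_t size);
    void *(*zeroed_alloc)(size_t nelems, size_t elsize);
    void (*free)(void *ptr, size_t size);
} toy_handler_layout;

/* Same members, but with an odd-sized name buffer for comparison. */
typedef struct {
    char name[125];
    void *(*alloc)(size_t size);
} toy_handler_odd;

/* Offset of the first function pointer in the 128-byte-name layout. */
size_t toy_first_pointer_offset(void)
{
    return offsetof(toy_handler_layout, alloc);
}

/* Padding bytes the compiler inserted after the 125-byte name buffer. */
size_t toy_odd_padding(void)
{
    return offsetof(toy_handler_odd, alloc) - sizeof(((toy_handler_odd *)0)->name);
}
```

On platforms where pointer alignment divides 128, `toy_first_pointer_offset()` equals the size of the name buffer, while `toy_odd_padding()` reports whatever padding the platform needs after the 125-byte buffer.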
2 changes: 1 addition & 1 deletion numpy/core/src/multiarray/alloc.c
@@ -326,7 +326,7 @@ PyDataMem_RENEW(void *ptr, size_t size)
 }

 /* Memory handler global default */
-static PyDataMem_Handler default_allocator = {
+PyDataMem_Handler default_allocator = {
     "default_allocator",
     npy_alloc_cache,         /* alloc */
     npy_alloc_cache_zero,    /* zeroed_alloc */
1 change: 1 addition & 0 deletions numpy/core/src/multiarray/alloc.h
@@ -40,6 +40,7 @@ npy_free_cache_dim_array(PyArrayObject * arr)
 }

 extern PyDataMem_Handler *current_allocator;
+extern PyDataMem_Handler default_allocator;

 #define PyArray_HANDLER(arr) ((PyArrayObject_fields*)(arr))->mem_handler

6 changes: 4 additions & 2 deletions numpy/core/src/multiarray/ctors.c
@@ -804,10 +804,10 @@ PyArray_NewFromDescr_int(
         fa->flags |= NPY_ARRAY_C_CONTIGUOUS|NPY_ARRAY_F_CONTIGUOUS;
     }

-    /* Store the functions in case the global hander is modified */
-    fa->mem_handler = current_allocator;

     if (data == NULL) {
+        /* Store the functions in case the global hander is modified */
+        fa->mem_handler = current_allocator;
         /*
          * Allocate something even for zero-space arrays
          * e.g. shape=(0,) -- otherwise buffer exposure
@@ -836,6 +836,8 @@ PyArray_NewFromDescr_int(
         fa->flags |= NPY_ARRAY_OWNDATA;
     }
     else {
+        /* The handlers should never be called in this case, but just in case */
+        fa->mem_handler = &default_allocator;
         /*
          * If data is passed in, this object won't own it by default.
          * Caller must arrange for this to be reset if truly desired
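
The two branches changed in this file can be summarised in a standalone sketch: when no data pointer is passed, the array allocates through the current handler and is marked as owning its data; when the caller passes a buffer, the array does not own it and only gets the default handler as a safety net. Everything below is hypothetical scaffolding (`own_handler`, `own_array`, `OWN_FLAG_OWNDATA` and friends), written to mirror the control flow of the hunk rather than NumPy's actual constructor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical minimal handler; not the NumPy struct. */
typedef struct {
    const char *name;
    void *(*alloc)(size_t size);
    void (*free)(void *ptr, size_t size);
} own_handler;

static void *own_malloc(size_t size) { return malloc(size); }
static void own_release(void *ptr, size_t size) { (void)size; free(ptr); }

static own_handler own_default = {"own_default", own_malloc, own_release};
static own_handler *own_current = &own_default;

#define OWN_FLAG_OWNDATA 0x1

typedef struct {
    void *data;
    size_t nbytes;
    int flags;
    own_handler *mem_handler;
} own_array;

int own_array_init(own_array *arr, void *data, size_t nbytes)
{
    arr->nbytes = nbytes;
    arr->flags = 0;
    if (data == NULL) {
        /* Capture the handler only when this array allocates its data. */
        arr->mem_handler = own_current;
        /* Allocate at least one byte, even for zero-size arrays. */
        data = arr->mem_handler->alloc(nbytes ? nbytes : 1);
        if (data == NULL) {
            return -1;
        }
        arr->flags |= OWN_FLAG_OWNDATA;
    }
    else {
        /* Caller-provided data: never freed here, handler is a fallback. */
        arr->mem_handler = &own_default;
    }
    arr->data = data;
    return 0;
}

void own_array_clear(own_array *arr)
{
    if (arr->flags & OWN_FLAG_OWNDATA) {
        arr->mem_handler->free(arr->data, arr->nbytes);
    }
    arr->data = NULL;
}

/* Returns 1 when the owned array has OWNDATA set and the borrowed one does not. */
int demo_ownership(void)
{
    char buffer[16];
    own_array owned, borrowed;
    if (own_array_init(&owned, NULL, 0) != 0) {
        return -1;
    }
    own_array_init(&borrowed, buffer, sizeof buffer);
    int result = ((owned.flags & OWN_FLAG_OWNDATA) != 0) &&
                 ((borrowed.flags & OWN_FLAG_OWNDATA) == 0);
    own_array_clear(&owned);     /* frees through the captured handler */
    own_array_clear(&borrowed);  /* leaves the caller's buffer alone */
    return result;
}
```

Tying the handler to the ownership flag keeps the free path simple: the stored handler is only ever invoked for memory that the same handler allocated.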