ENH: Configurable allocator #17582
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,119 @@ | ||
Memory management in NumPy | ||
========================== | ||
|
||
The `numpy.ndarray` is a Python class. It requires additional memory allocations | ||
to hold the `numpy.ndarray.strides`, `numpy.ndarray.shape` and | ||
`numpy.ndarray.data` attributes. These attributes are allocated separately, | ||
after the Python object is created in `__new__`. The ``strides`` and | ||
``shape`` are stored in a piece of memory allocated internally. | ||
|
||
The ``data`` allocation used to store the actual array values (which could be | ||
pointers in the case of ``object`` arrays) can be very large, so NumPy has | ||
provided interfaces to manage its allocation and release. This document details | ||
how those interfaces work. | ||
|
||
Historical overview | ||
------------------- | ||
|
||
Since version 1.7.0, NumPy has exposed a set of ``PyDataMem_*`` functions | ||
(:c:func:`PyDataMem_NEW`, :c:func:`PyDataMem_FREE`, :c:func:`PyDataMem_RENEW`) | ||
which are backed by ``malloc``, ``free`` and ``realloc`` respectively. In that version | ||
NumPy also exposed the `PyDataMem_EventHook` function described below, which | ||
wraps the OS-level calls. | ||
|
||
Since those early days, Python has also improved its memory management | ||
capabilities and, beginning in version 3.4, provides | ||
various :ref:`management policies <memoryoverview>`. These routines are | ||
divided into a set of domains; each domain has a | ||
:c:type:`PyMemAllocatorEx` structure of routines for memory management. Python also | ||
added a `tracemalloc` module to trace calls to the various routines. These | ||
tracking hooks were added to the NumPy ``PyDataMem_*`` routines. | ||
|
||
NumPy added a small cache of allocated memory in its internal | ||
``npy_alloc_cache``, ``npy_alloc_cache_zero``, and ``npy_free_cache`` | ||
functions. These wrap ``alloc``, ``alloc-and-memset(0)`` and ``free`` | ||
respectively, but when ``npy_free_cache`` is called, it adds the pointer to a | ||
short list of available blocks marked by size. These blocks can be re-used by | ||
subsequent calls to ``npy_alloc*``, avoiding memory thrashing. | ||
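The caching idea described above can be sketched in plain C. This is only an illustration of a size-bucketed free list, not NumPy's actual implementation: the names `sketch_alloc_cache`, `sketch_alloc_cache_zero` and `sketch_free_cache`, and the constants `NBUCKETS`, `UNIT` and `NCACHE`, are hypothetical and chosen for readability.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical tuning constants, for illustration only. */
#define NBUCKETS 16   /* buckets cover sizes up to NBUCKETS * UNIT bytes */
#define UNIT     16   /* bucket granularity in bytes */
#define NCACHE   4    /* freed blocks retained per bucket */

typedef struct {
    int available;        /* number of cached blocks in this bucket */
    void *ptrs[NCACHE];   /* the cached blocks themselves */
} cache_bucket;

static cache_bucket buckets[NBUCKETS];

/* Map a byte size to a bucket index, or -1 if the size is not cached. */
static int bucket_index(size_t size)
{
    size_t units;
    if (size == 0) {
        return -1;
    }
    units = (size + UNIT - 1) / UNIT;
    return units <= NBUCKETS ? (int)(units - 1) : -1;
}

/* Like malloc, but reuse a cached block of the same bucket if one exists. */
void *sketch_alloc_cache(size_t size)
{
    int i = bucket_index(size);
    if (i >= 0 && buckets[i].available > 0) {
        return buckets[i].ptrs[--buckets[i].available];
    }
    /* Round cached sizes up to the bucket size so any block in a
     * bucket can serve any request that maps to that bucket. */
    return malloc(i >= 0 ? (size_t)(i + 1) * UNIT : size);
}

/* Like sketch_alloc_cache, but zero the returned memory. */
void *sketch_alloc_cache_zero(size_t size)
{
    void *p = sketch_alloc_cache(size);
    if (p != NULL) {
        memset(p, 0, size);
    }
    return p;
}

/* Return a block to the cache, or to the system if the bucket is full. */
void sketch_free_cache(void *p, size_t size)
{
    int i = bucket_index(size);
    if (p == NULL) {
        return;
    }
    if (i >= 0 && buckets[i].available < NCACHE) {
        assert(buckets[i].available >= 0);
        buckets[i].ptrs[buckets[i].available++] = p;
        return;
    }
    free(p);
}
```

Note that the free function needs the block size to find the right bucket, which is one reason a size argument on `free` is useful for this kind of cache.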
|
||
Configurable memory routines in NumPy (NEP 49) | ||
---------------------------------------------- | ||
|
||
Users may wish to override the internal data memory routines with ones of their | ||
own. Since NumPy does not use the Python domain strategy to manage data memory, | ||
it provides an alternative set of C-APIs to change memory routines. There are | ||
no Python domain-wide strategies for large chunks of object data, so those are | ||
less suited to NumPy's needs. Users who wish to change the NumPy data memory | ||
management routines can use :c:func:`PyDataMem_SetHandler`, which uses a | ||
:c:type:`PyDataMem_Handler` structure to hold pointers to functions used to | ||
manage the data memory. The calls are still wrapped by internal routines that | ||
call :c:func:`PyTraceMalloc_Track` and :c:func:`PyTraceMalloc_Untrack`, and that | ||
use the :c:func:`PyDataMem_EventHookFunc` mechanism. Since the functions may | ||
change during the lifetime of the process, each ``ndarray`` carries with it the | ||
functions used at the time of its instantiation, and these will be used to | ||
reallocate or free the data memory of the instance. | ||
|
||
.. c:type:: PyDataMem_Handler | ||
|
||
A struct to hold function pointers used to manipulate memory | ||
|
||
.. code-block:: c | ||
|
||
typedef struct { | ||
char name[128]; /* multiple of 64 to keep the struct aligned */ | ||
PyDataMemAllocator allocator; | ||
} PyDataMem_Handler; | ||
|
||
|
||
where the allocator structure is | ||
|
||
.. code-block:: c | ||
|
||
/* The declaration of free differs from PyMemAllocatorEx */ | ||
typedef struct { | ||
void *ctx; | ||
void* (*malloc) (void *ctx, size_t size); | ||
void* (*calloc) (void *ctx, size_t nelem, size_t elsize); | ||
void* (*realloc) (void *ctx, void *ptr, size_t new_size); | ||
void (*free) (void *ctx, void *ptr, size_t size); | ||
} PyDataMemAllocator; | ||
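As a rough illustration of how the two structs fit together, the sketch below fills in a handler whose functions forward to the C library and count calls through `ctx`. The struct definitions are local copies of the ones shown above so that the example compiles without NumPy; `counting_ctx`, `counting_state`, `counting_handler` and the `counting_*` functions are hypothetical names. Registering the handler with `PyDataMem_SetHandler` is deliberately not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Local copies of the structs from the documentation above. */
typedef struct {
    void *ctx;
    void* (*malloc) (void *ctx, size_t size);
    void* (*calloc) (void *ctx, size_t nelem, size_t elsize);
    void* (*realloc) (void *ctx, void *ptr, size_t new_size);
    void (*free) (void *ctx, void *ptr, size_t size);
} PyDataMemAllocator;

typedef struct {
    char name[128];
    PyDataMemAllocator allocator;
} PyDataMem_Handler;

/* Hypothetical per-handler state, reached through the ctx pointer. */
typedef struct {
    size_t n_alloc;      /* malloc + calloc calls that succeeded */
    size_t n_realloc;    /* realloc calls that succeeded */
    size_t n_free;       /* free calls */
    size_t bytes_freed;  /* sum of the size arguments passed to free */
} counting_ctx;

static void *counting_malloc(void *ctx, size_t size)
{
    void *p = malloc(size);
    if (p != NULL) {
        ((counting_ctx *)ctx)->n_alloc++;
    }
    return p;
}

static void *counting_calloc(void *ctx, size_t nelem, size_t elsize)
{
    void *p = calloc(nelem, elsize);
    if (p != NULL) {
        ((counting_ctx *)ctx)->n_alloc++;
    }
    return p;
}

static void *counting_realloc(void *ctx, void *ptr, size_t new_size)
{
    void *p = realloc(ptr, new_size);
    if (p != NULL) {
        ((counting_ctx *)ctx)->n_realloc++;
    }
    return p;
}

/* Unlike PyMemAllocatorEx, free also receives the block size. */
static void counting_free(void *ctx, void *ptr, size_t size)
{
    counting_ctx *state = (counting_ctx *)ctx;
    state->n_free++;
    state->bytes_freed += size;
    free(ptr);
}

static counting_ctx counting_state = {0, 0, 0, 0};

static PyDataMem_Handler counting_handler = {
    "counting_allocator",
    {
        &counting_state,
        counting_malloc,
        counting_calloc,
        counting_realloc,
        counting_free
    }
};
```

Keeping the state behind `ctx` rather than in globals read directly by the functions means the same four functions could back several handlers with separate counters.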
|
||
.. c:function:: PyObject * PyDataMem_SetHandler(PyObject *handler) | ||
|
||
Set a new allocation policy. If the input value is ``NULL``, reset the | ||
policy to the default. Return the previous policy, or | ||
return ``NULL`` if an error has occurred. The user-provided functions are wrapped | ||
so they will still call the Python and NumPy memory management callback | ||
hooks. | ||
|
||
.. c:function:: PyObject * PyDataMem_GetHandler() | ||
|
||
Return the current policy that will be used to allocate data for the | ||
next ``PyArrayObject``. On failure, return ``NULL``. | ||
|
||
For an example of setting up and using the ``PyDataMem_Handler``, see the test in | ||
:file:`numpy/core/tests/test_mem_policy.py`. | ||
|
||
.. c:function:: void PyDataMem_EventHookFunc(void *inp, void *outp, size_t size, void *user_data) | ||
|
||
This function will be called during data memory manipulation. | ||
|
||
.. c:function:: PyDataMem_EventHookFunc * PyDataMem_SetEventHook(PyDataMem_EventHookFunc *newhook, void *user_data, void **old_data) | ||
|
||
Sets the allocation event hook for numpy array data. | ||
|
||
Returns a pointer to the previous hook or ``NULL``. If old_data is | ||
non-``NULL``, the previous user_data pointer will be copied to it. | ||
|
||
If not ``NULL``, hook will be called at the end of each ``PyDataMem_NEW/FREE/RENEW``: | ||
|
||
.. code-block:: c | ||
|
||
result = PyDataMem_NEW(size) -> (*hook)(NULL, result, size, user_data) | ||
PyDataMem_FREE(ptr) -> (*hook)(ptr, NULL, 0, user_data) | ||
result = PyDataMem_RENEW(ptr, size) -> (*hook)(ptr, result, size, user_data) | ||
|
||
When the hook is called, the GIL will be held by the calling | ||
thread. The hook should be written to be reentrant if it performs | ||
operations that might cause new allocation events (such as the | ||
creation/destruction of NumPy objects, or creating/destroying Python | ||
objects which might cause a gc). |
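The calling convention listed above is enough to write a simple hook that tracks how many data blocks are live. The following is a minimal sketch that compiles without NumPy; `hook_stats` and `counting_hook` are hypothetical names. It allocates nothing and touches no Python objects, so it cannot trigger a nested allocation event. With NumPy available, it would presumably be installed with `PyDataMem_SetEventHook(counting_hook, &stats, NULL)`, following the signature documented above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical statistics block, passed to the hook as user_data. */
typedef struct {
    long live_blocks;   /* blocks allocated and not yet freed */
    size_t last_size;   /* size argument of the most recent event */
} hook_stats;

/* Matches the PyDataMem_EventHookFunc signature shown above. */
static void counting_hook(void *inp, void *outp, size_t size, void *user_data)
{
    hook_stats *stats = (hook_stats *)user_data;

    if (inp == NULL && outp != NULL) {
        /* PyDataMem_NEW succeeded: (NULL, result, size) */
        stats->live_blocks++;
    }
    else if (inp != NULL && outp == NULL && size == 0) {
        /* PyDataMem_FREE: (ptr, NULL, 0). The size check is an
         * assumption used here to tell a free apart from a failed
         * RENEW, which leaves the original block alive. */
        stats->live_blocks--;
    }
    /* PyDataMem_RENEW: (ptr, result, size) leaves the count unchanged. */

    stats->last_size = size;
}
```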
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -49,3 +49,4 @@ code. | |
generalized-ufuncs | ||
coremath | ||
deprecations | ||
data_memory |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -355,12 +355,10 @@ struct NpyAuxData_tag { | |
#define NPY_ERR(str) fprintf(stderr, #str); fflush(stderr); | ||
#define NPY_ERR2(str) fprintf(stderr, str); fflush(stderr); | ||
|
||
/* | ||
* Macros to define how array, and dimension/strides data is | ||
* allocated. | ||
*/ | ||
|
||
/* Data buffer - PyDataMem_NEW/FREE/RENEW are in multiarraymodule.c */ | ||
/* | ||
* Macros to define how array, and dimension/strides data is | ||
* allocated. These should be made private | ||
*/ | ||
|
||
#define NPY_USE_PYMEM 1 | ||
|
||
|
@@ -666,6 +664,24 @@ typedef struct _arr_descr { | |
PyObject *shape; /* a tuple */ | ||
} PyArray_ArrayDescr; | ||
|
||
/* | ||
* Memory handler structure for array data. | ||
*/ | ||
/* The declaration of free differs from PyMemAllocatorEx */ | ||
typedef struct { | ||
void *ctx; | ||
void* (*malloc) (void *ctx, size_t size); | ||
void* (*calloc) (void *ctx, size_t nelem, size_t elsize); | ||
void* (*realloc) (void *ctx, void *ptr, size_t new_size); | ||
void (*free) (void *ctx, void *ptr, size_t size); | ||
} PyDataMemAllocator; | ||
|
||
typedef struct { | ||
char name[128]; /* multiple of 64 to keep the struct aligned */ | ||
PyDataMemAllocator allocator; | ||
} PyDataMem_Handler; | ||
|
||
|
||
|
||
/* | ||
* The main array object structure. | ||
* | ||
|
@@ -716,6 +732,10 @@ typedef struct tagPyArrayObject_fields { | |
/* For weak references */ | ||
PyObject *weakreflist; | ||
void *_buffer_info; /* private buffer info, tagged to allow warning */ | ||
/* | ||
* For malloc/calloc/realloc/free per object | ||
*/ | ||
PyObject *mem_handler; | ||
} PyArrayObject_fields; | ||
|
||
/* | ||
|
@@ -1659,6 +1679,12 @@ PyArray_CLEARFLAGS(PyArrayObject *arr, int flags) | |
((PyArrayObject_fields *)arr)->flags &= ~flags; | ||
} | ||
|
||
static NPY_INLINE NPY_RETURNS_BORROWED_REF PyObject * | ||
PyArray_HANDLER(PyArrayObject *arr) | ||
Should this be public API? Otherwise can we move it or at least hide it behind the "internal" define to be clear about it? I actually also wonder if we should call it
Actual use-cases:
while (arr != NULL && PyArray_Check(arr)) {
if (PyArray_CHKFLAGS((PyArrayObject *) arr, NPY_ARRAY_OWNDATA)) {
PyObject *_handler_ = PyArray_HANDLER((PyArrayObject *) arr);
if (!_handler_) {
PyErr_SetString(PyExc_RuntimeError, "no memory handler found but OWNDATA flag set");
return -1;
}
PyDataMem_Handler *handler = (PyDataMem_Handler *) PyCapsule_GetPointer(_handler_, "mem_handler");
if (!handler) {
return -1;
}
printf("### %s ###", handler->name);
return 0;
}
arr = PyArray_BASE((PyArrayObject *) arr);
}
PyErr_SetString(PyExc_ValueError, "argument must be an ndarray");
return -1;
PyObject *a_arr_handler = PyArray_HANDLER(a_arr);
if (!a_arr_handler) {
PyErr_SetString(PyExc_RuntimeError, "no memory handler found");
return -1;
}
PyObject *old_handler = PyDataMem_SetHandler(a_arr_handler);
Py_DECREF(a_arr_handler);
if (!old_handler) {
return -1;
}
// construct b_arr
if (!PyDataMem_SetHandler(old_handler)) {
Py_DECREF(old_handler);
return -1;
}
Py_DECREF(old_handler);
// a_arr and b_arr are handled by the same allocator
return 0;
EDIT: Sorry forgot to add :). I agree it is no different, but we cannot discuss any modifications to that, while we could maybe fathom modifications here? OK, yeah, we do want to be able to fetch the allocator. I guess the question I have is then whether we want to define the
Based on my other comment:
I think it's clean enough.
I guess, since this is the same
If yes, a nice way to do it could be through the capsule name. For example:
user:
handler_capsule = PyCapsule_New(my_handler, "v1", destructor);
numpy:
if (PyCapsule_IsValid(arr->mem_handler, "v1")) {
// get pointer and cast it as PyDataMem_Handler_v1
} else if (PyCapsule_IsValid(arr->mem_handler, "v2")) {
// get pointer and cast it as PyDataMem_Handler_v2
} else {
// unknown version
} |
||
{ | ||
return ((PyArrayObject_fields *)arr)->mem_handler; | ||
} | ||
|
||
#define PyTypeNum_ISBOOL(type) ((type) == NPY_BOOL) | ||
|
||
#define PyTypeNum_ISUNSIGNED(type) (((type) == NPY_UBYTE) || \ | ||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -31,8 +31,8 @@ | |||||
'count_nonzero', 'c_einsum', 'datetime_as_string', 'datetime_data', | ||||||
'dot', 'dragon4_positional', 'dragon4_scientific', 'dtype', | ||||||
'empty', 'empty_like', 'error', 'flagsobj', 'flatiter', 'format_longfloat', | ||||||
'frombuffer', 'fromfile', 'fromiter', 'fromstring', 'inner', | ||||||
'interp', 'interp_complex', 'is_busday', 'lexsort', | ||||||
'frombuffer', 'fromfile', 'fromiter', 'fromstring', 'get_handler_name', | ||||||
I think we should maybe add an additional word to make "handler" explicit, since this is top-level. EDIT: sorry, meant to make a "review" not individual comment
Nevermind, it is only in
Let's leave this for a future PR? I am really bad with names. I didn't want to use "allocator" since it is really an "allocator/free policy handler", and ended up with "handler" |
||||||
'inner', 'interp', 'interp_complex', 'is_busday', 'lexsort', | ||||||
'matmul', 'may_share_memory', 'min_scalar_type', 'ndarray', 'nditer', | ||||||
'nested_iters', 'normalize_axis_index', 'packbits', | ||||||
'promote_types', 'putmask', 'ravel_multi_index', 'result_type', 'scalar', | ||||||
|
It would be good to reference the builtin `PyMemAllocatorEx` somewhere in this section and compare the two (https://docs.python.org/3/c-api/memory.html#customize-memory-allocators).
Will add. The original `npy_*_cache` set of functions uses `PyDataMem_*` functions, which it seems preceded these more sophisticated interfaces and directly used `malloc`/`calloc`/`free`.
Documented. The Python ones do not use our non-documented-but-public `PyDataMem_EventHookFunc` callbacks. So I could see a path where we deprecate `PyDataMem_EventHookFunc` and move to the Python memory management strategies, although that would mean the `PyDataMem_EventHookFunc` callback would no longer work.