ENH: Configurable allocator by mattip · Pull Request #17582 · numpy/numpy · GitHub

ENH: Configurable allocator #17582

Merged
merged 83 commits into from
Oct 25, 2021
55f2f6c
ENH: add and use global configurable memory routines
mattip Oct 11, 2020
23da73e
ENH: add tests and a way to compile c-extensions from tests
mattip Oct 16, 2020
94b9f25
fix allocation/free exposed by tests
mattip Oct 18, 2020
81b45fd
DOC: document the new APIs (and some old ones too)
mattip Oct 18, 2020
fc32c2f
BUG: return void from FREE, also some cleanup
mattip Oct 19, 2020
de22327
MAINT: changes from review
mattip Oct 19, 2020
38274a4
fixes from linter
mattip Mar 11, 2021
59b520a
setting ndarray->descr on 0d or scalars mess with FREE
mattip Apr 16, 2021
5264019
make scalar allocation more consistent wrt np_alloc_cache
mattip Apr 16, 2021
7c396d7
change formatting for sphinx
mattip Apr 17, 2021
de9001c
remove memcpy variants
mattip Apr 19, 2021
9e7c3ed
update to match NEP 49
mattip May 3, 2021
953cc88
ENH: add a python-level get_handler_name
mattip May 6, 2021
ad9329b
ENH: add core.multiarray.get_handler_name
mattip May 6, 2021
6d10fdb
Allow closure-like definition of the data mem routines
eliaskoromilas Jun 30, 2021
18bea05
Fix incompatible pointer warnings
eliaskoromilas Jun 30, 2021
4368023
Note PyDataMemAllocator and PyMemAllocatorEx differentiation
eliaskoromilas Jul 1, 2021
d243313
Redefine default allocator handling
eliaskoromilas Jul 1, 2021
c9b6854
Always allocate new arrays using the current_handler
eliaskoromilas Jul 5, 2021
a8fd378
Search for the mem_handler name of the data owner
eliaskoromilas Jul 5, 2021
a17565b
Sub-comparisons don't need a local mem_handler
eliaskoromilas Jul 5, 2021
2ec5912
Make the default_handler a valid PyDataMem_Handler
eliaskoromilas Jul 14, 2021
227c4b8
Fix PyDataMem_SetHandler description (NEP discussion)
eliaskoromilas Jul 14, 2021
fb8135d
Pass the allocators by reference
eliaskoromilas Jul 14, 2021
7291484
Implement allocator context-locality
eliaskoromilas Aug 8, 2021
c7a9c22
Fix documentation, make PyDataMem_GetHandler return const
eliaskoromilas Aug 8, 2021
99f8250
remove import of setuptools==49.1.3, doesn't work on python3.10
mattip Aug 9, 2021
b43c1fe
Fix refcount leaks
eliaskoromilas Aug 9, 2021
a7a5435
fix function signatures in test
mattip Aug 9, 2021
e3723df
Return early on PyDataMem_GetHandler error (VOID_compare)
eliaskoromilas Aug 9, 2021
144acc6
Add context/thread-locality tests, allow testing custom policies
eliaskoromilas Aug 9, 2021
8539f5f
Merge branch 'configurable_allocator' into configurable_allocator
eliaskoromilas Aug 9, 2021
6ab00d0
ENH: add and use global configurable memory routines
mattip Oct 11, 2020
e7e8754
ENH: add tests and a way to compile c-extensions from tests
mattip Oct 16, 2020
5f08532
fix allocation/free exposed by tests
mattip Oct 18, 2020
7266029
DOC: document the new APIs (and some old ones too)
mattip Oct 18, 2020
5d547ff
BUG: return void from FREE, also some cleanup
mattip Oct 19, 2020
4617c50
MAINT: changes from review
mattip Oct 19, 2020
8f739c4
fixes from linter
mattip Mar 11, 2021
90205b6
setting ndarray->descr on 0d or scalars mess with FREE
mattip Apr 16, 2021
5c0d3f9
make scalar allocation more consistent wrt np_alloc_cache
mattip Apr 16, 2021
3b385d9
change formatting for sphinx
mattip Apr 17, 2021
e6e12a3
remove memcpy variants
mattip Apr 19, 2021
048552d
update to match NEP 49
mattip May 3, 2021
ad6f8ad
ENH: add a python-level get_handler_name
mattip May 6, 2021
50f8b93
ENH: add core.multiarray.get_handler_name
mattip May 6, 2021
c7438f5
Allow closure-like definition of the data mem routines
eliaskoromilas Jun 30, 2021
f823ba4
Fix incompatible pointer warnings
eliaskoromilas Jun 30, 2021
ad13161
Note PyDataMemAllocator and PyMemAllocatorEx differentiation
eliaskoromilas Jul 1, 2021
0a08acd
Redefine default allocator handling
eliaskoromilas Jul 1, 2021
1f0301d
Always allocate new arrays using the current_handler
eliaskoromilas Jul 5, 2021
3d56aa0
Search for the mem_handler name of the data owner
eliaskoromilas Jul 5, 2021
8ea6818
Sub-comparisons don't need a local mem_handler
eliaskoromilas Jul 5, 2021
fb2af4d
Make the default_handler a valid PyDataMem_Handler
eliaskoromilas Jul 14, 2021
f05a1c6
Fix PyDataMem_SetHandler description (NEP discussion)
eliaskoromilas Jul 14, 2021
660e0a4
Pass the allocators by reference
eliaskoromilas Jul 14, 2021
a4f8d71
remove import of setuptools==49.1.3, doesn't work on python3.10
mattip Aug 9, 2021
d7b1a1d
fix function signatures in test
mattip Aug 9, 2021
ab1a0eb
try to fix cygwin extension building
mattip Aug 9, 2021
b92e36c
YAPF mem_policy test
eliaskoromilas Aug 9, 2021
76cda3a
Merge branch 'configurable_allocator' into configurable_allocator
eliaskoromilas Aug 9, 2021
3eadf2f
Less empty lines, more comments (tests)
eliaskoromilas Aug 10, 2021
dbe9d73
Apply suggestions from code review (set an exception and)
eliaskoromilas Aug 10, 2021
1bfb870
Merge pull request #57 from eliaskoromilas/configurable_allocator
mattip Aug 11, 2021
0511820
skip test on cygwin
mattip Aug 11, 2021
23c4bc0
update API hash for changed signature
mattip Aug 13, 2021
ed8649b
TST: add gc.collect to make sure cycles are broken
mattip Aug 13, 2021
9aacefa
Implement thread-locality for PyPy
eliaskoromilas Aug 12, 2021
79712fa
Update numpy/core/tests/test_mem_policy.py
mattip Aug 25, 2021
a2ae4c0
fixes from review
mattip Aug 25, 2021
09b9c0d
update circleci config
mattip Aug 25, 2021
efb3c77
fix test
mattip Aug 25, 2021
2945c64
make the connection between OWNDATA and having a allocator handle mor…
mattip Aug 25, 2021
3a97d9a
improve docstring, fix flake8 for tests
mattip Aug 26, 2021
1df805c
update PyDataMem_GetHandler() from review
mattip Aug 26, 2021
ef607bd
Implement allocator lifetime management
eliaskoromilas Aug 27, 2021
8bdc9a1
Merge pull request #59 from eliaskoromilas/configurable_allocator
mattip Aug 29, 2021
a3256e5
update NEP and add best-effort handling of error in PyDataMem_UserFREE
mattip Aug 29, 2021
4d6ea65
merge main into branch
mattip Aug 30, 2021
5941d7c
merge main into branch
mattip Oct 17, 2021
522c368
Merge branch 'main' into configurable_allocator
mattip Oct 25, 2021
442b0e1
ENH: fix and test for blindly taking ownership of data
mattip Oct 25, 2021
8ca8b54
Update doc/neps/nep-0049.rst
seberg Oct 25, 2021
make the connection between OWNDATA and having a allocator handle more explicit
mattip committed Aug 25, 2021
commit 2945c644d20ee2d5abd0e4062d181ee6bd6058e6
5 changes: 3 additions & 2 deletions numpy/core/_add_newdocs.py
@@ -4690,11 +4690,12 @@

 add_newdoc('numpy.core.multiarray', 'get_handler_name',
     """
-    get_handler_name(a: ndarray) -> str
+    get_handler_name(a: ndarray) -> str,None

     Return the name of the memory handler used by `a`. If not provided, return
     the name of the current global memory handler that will be used to allocate
-    data for the next `ndarray`.
+    data for the next `ndarray`. May return None if `a` does not own its
+    memory, in which case you can traverse ``a.base`` for a memory handler.
     """)

 add_newdoc('numpy.core.multiarray', '_set_madvise_hugepage',
25 changes: 8 additions & 17 deletions numpy/core/src/multiarray/alloc.c
@@ -552,9 +552,10 @@ PyDataMem_SetHandler(PyDataMem_Handler *handler)
 }

 /*NUMPY_API
- * Return the PyDataMem_Handler used by the PyArrayObject. If NULL, return
+ * Return the PyDataMem_Handler used by obj. If obj is NULL, return
  * the current global policy that will be used to allocate data
- * for the next PyArrayObject. On failure, return NULL.
+ * for the next PyArrayObject. On failure, return NULL. Can also return NULL
+ * if obj does not own its memory
  */
 NPY_NO_EXPORT const PyDataMem_Handler *
 PyDataMem_GetHandler(PyArrayObject *obj)
@@ -588,20 +589,7 @@ PyDataMem_GetHandler(PyArrayObject *obj)
         return handler;
     }
 #endif
-    /* Try to find a handler */
-    PyArrayObject *base = obj;
-    handler = PyArray_HANDLER(obj);
-    while (handler == NULL) {
-        base = (PyArrayObject*)PyArray_BASE(base);
-        /*
-         * If the base is an array which owns its own data, return its allocator.
-         */
-        if (base == NULL || ! PyArray_Check(base)) {
-            break;
-        }
-        handler = PyArray_HANDLER(base);
-    }
-    return handler;
+    return PyArray_HANDLER(obj);
 }
Contributor

Should we then just export the PyArray_HANDLER macro and turn PyDataMem_GetHandler into a getter for the current allocator (with no obj argument, since PyDataMem_SetHandler is already the setter for the current allocator)? I think this final touch would make the API crystal clear.

Member Author

I think there is still this scenario that justifies having an obj argument:

  • I allocate some ndarrays, these will be using policy A
  • I call PyDataMem_SetHandler(B)
  • I allocate more ndarrays

Now I want to know, for debugging purposes, whether handler A or B was used for each of the ndarrays. How can I do that?
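One way to approach this with the Python-level helper from this PR is to ask for the handler name of each array and bucket the arrays by the answer. The sketch below is illustrative only: `group_by_handler` is a hypothetical helper, and `get_name` is a parameter standing in for `np.core.multiarray.get_handler_name`, so the snippet does not depend on a particular NumPy build.

```python
from collections import defaultdict


def group_by_handler(arrays, get_name):
    """Map each reported handler name to the indices of the arrays using it.

    `get_name` is assumed to behave like np.core.multiarray.get_handler_name:
    it returns a string, or None for an array that does not own its data.
    """
    groups = defaultdict(list)
    for index, arr in enumerate(arrays):
        # Indices are stored instead of the arrays themselves so the
        # result is cheap to print while debugging.
        groups[get_name(arr)].append(index)
    return dict(groups)
```

With a NumPy build that includes this PR, a call might look like `group_by_handler([x, y], np.core.multiarray.get_handler_name)`. Arrays that do not own their data would end up under the `None` key.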

Contributor
@eliaskoromilas eliaskoromilas Aug 26, 2021

Currently:

  • PyDataMem_GetHandler(obj) == PyArray_HANDLER(obj)
  • PyDataMem_GetHandler/PyDataMem_SetHandler do not return a PyObject (e.g. a PyCapsule), which means that they are meant for C extensions only.

Based on these two facts, why can't we make PyArray_HANDLER a public Array API macro (like e.g. PyArray_BASE or PyArray_SHAPE, etc.)?

To summarize the API would look like this:

  • PyDataMem_Handler *PyDataMem_GetHandler() # current allocator getter
  • PyDataMem_Handler *PyDataMem_SetHandler(PyDataMem_Handler *) # current allocator setter
  • PyDataMem_Handler *PyArray_HANDLER(PyArrayObject *) # Return the memory handler of an array (or NULL if no memory handler, which means that the user should search in the bases)
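The search "in the bases" mentioned in the last bullet can be sketched in Python on top of `get_handler_name`. It mirrors the loop that this commit removes from `PyDataMem_GetHandler`. The names `owning_handler_name`, `get_name` and `is_array` are hypothetical: `get_name` stands in for `np.core.multiarray.get_handler_name`, and `is_array` for an `isinstance(obj, np.ndarray)` check.

```python
def owning_handler_name(arr, get_name, is_array=lambda obj: True):
    """Walk `arr.base` until an object reports a memory handler name.

    Returns None when the chain ends, or reaches a non-array base,
    before any handler is found.
    """
    obj = arr
    # Stop at the end of the chain or at a base that is not an array
    # (for example a buffer object), as the removed C loop did.
    while obj is not None and is_array(obj):
        name = get_name(obj)
        if name is not None:
            return name
        # A view keeps a reference to the object it was created from.
        obj = getattr(obj, "base", None)
    return None
```

Passing the lookup function and the type check as arguments keeps the sketch independent of where `get_handler_name` lives in a given NumPy release.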

Contributor

Of course, get_handler_name will be modified to invoke either PyArray_HANDLER or PyDataMem_GetHandler based on the input.

Member Author

got it. Adopted


NPY_NO_EXPORT PyObject *
@@ -617,7 +605,10 @@ get_handler_name(PyObject *NPY_UNUSED(self), PyObject *args)
     }
     const PyDataMem_Handler * mem_handler = PyDataMem_GetHandler((PyArrayObject *)arr);
     if (mem_handler == NULL) {
-        return NULL;
+        if (PyErr_Occurred()) {
+            return NULL;
+        }
+        Py_RETURN_NONE;
     }
     return PyUnicode_FromString(mem_handler->name);
 }
8 changes: 6 additions & 2 deletions numpy/core/src/multiarray/ctors.c
@@ -843,8 +843,7 @@ PyArray_NewFromDescr_int(
         /* The handlers should never be called in this case */
         fa->mem_handler = NULL;
         /*
-         * If data is passed in, this object won't own it by default.
-         * Caller must arrange for this to be reset if truly desired
+         * If data is passed in, this object won't own it.
          */
         fa->flags &= ~NPY_ARRAY_OWNDATA;
     }
@@ -3412,6 +3411,7 @@ array_from_text(PyArray_Descr *dtype, npy_intp num, char const *sep, size_t *nre
         dptr += dtype->elsize;
         if (num < 0 && thisbuf == size) {
             totalbytes += bytes;
+            /* The handler is always valid */
             tmp = PyDataMem_UserRENEW(PyArray_DATA(r), totalbytes,
                                       &PyArray_HANDLER(r)->allocator);
             if (tmp == NULL) {
@@ -3435,6 +3435,7 @@ array_from_text(PyArray_Descr *dtype, npy_intp num, char const *sep, size_t *nre
         const size_t nsize = PyArray_MAX(*nread,1)*dtype->elsize;

         if (nsize != 0) {
+            /* The handler is always valid */
             tmp = PyDataMem_UserRENEW(PyArray_DATA(r), nsize,
                                       &PyArray_HANDLER(r)->allocator);
             if (tmp == NULL) {
@@ -3541,6 +3542,7 @@ PyArray_FromFile(FILE *fp, PyArray_Descr *dtype, npy_intp num, char *sep)
         const size_t nsize = PyArray_MAX(nread,1) * dtype->elsize;
         char *tmp;

+        /* The handler is always valid */
         if((tmp = PyDataMem_UserRENEW(PyArray_DATA(ret), nsize,
                                       &PyArray_HANDLER(ret)->allocator)) == NULL) {
             Py_DECREF(dtype);
@@ -3826,6 +3828,7 @@ PyArray_FromIter(PyObject *obj, PyArray_Descr *dtype, npy_intp count)
              */
             elcount = (i >> 1) + (i < 4 ? 4 : 2) + i;
             if (!npy_mul_with_overflow_intp(&nbytes, elcount, elsize)) {
+                /* The handler is always valid */
                 new_data = PyDataMem_UserRENEW(PyArray_DATA(ret), nbytes,
                                                &PyArray_HANDLER(ret)->allocator);
             }
@@ -3868,6 +3871,7 @@ PyArray_FromIter(PyObject *obj, PyArray_Descr *dtype, npy_intp count)
         /* The size cannot be zero for realloc. */
         goto done;
     }
+    /* The handler is always valid */
     new_data = PyDataMem_UserRENEW(PyArray_DATA(ret), i * elsize,
                                    &PyArray_HANDLER(ret)->allocator);
     if (new_data == NULL) {
10 changes: 8 additions & 2 deletions numpy/core/src/multiarray/getset.c
@@ -393,8 +393,14 @@ array_data_set(PyArrayObject *self, PyObject *op, void *NPY_UNUSED(ignored))
             PyArray_Descr *dtype = PyArray_DESCR(self);
             nbytes = dtype->elsize ? dtype->elsize : 1;
         }
-        PyDataMem_UserFREE(PyArray_DATA(self), nbytes,
-                           &PyArray_HANDLER(self)->allocator);
+        PyDataMem_Handler *handler = PyArray_HANDLER(self);
+        if (handler == NULL) {
+            /* This can happen if someone arbitrarily sets NPY_ARRAY_OWNDATA */
+            PyErr_SetString(PyExc_RuntimeError,
+                            "no memory handler found but OWNDATA flag set");
+            return -1;
+        }
+        PyDataMem_UserFREE(PyArray_DATA(self), nbytes, &handler->allocator);
     }
     if (PyArray_BASE(self)) {
         if ((PyArray_FLAGS(self) & NPY_ARRAY_WRITEBACKIFCOPY) ||
10 changes: 8 additions & 2 deletions numpy/core/src/multiarray/methods.c
@@ -2053,8 +2053,14 @@ array_setstate(PyArrayObject *self, PyObject *args)
              * Allocation will never be 0, see comment in ctors.c
              * line 820
              */
-            PyDataMem_UserFREE(PyArray_DATA(self), n_tofree,
-                               &PyArray_HANDLER(self)->allocator);
+            PyDataMem_Handler *handler = PyArray_HANDLER(self);
+            if (handler == NULL) {
+                /* This can happen if someone arbitrarily sets NPY_ARRAY_OWNDATA */
+                PyErr_SetString(PyExc_RuntimeError,
+                                "no memory handler found but OWNDATA flag set");
+                return NULL;
+            }
+            PyDataMem_UserFREE(PyArray_DATA(self), n_tofree, &handler->allocator);
             PyArray_CLEARFLAGS(self, NPY_ARRAY_OWNDATA);
         }
         Py_XDECREF(PyArray_BASE(self));
9 changes: 8 additions & 1 deletion numpy/core/src/multiarray/shape.c
@@ -120,9 +120,16 @@ PyArray_Resize(PyArrayObject *self, PyArray_Dims *newshape, int refcheck,
         }

         /* Reallocate space if needed - allocating 0 is forbidden */
+        PyDataMem_Handler *handler = PyArray_HANDLER(self);
+        if (handler == NULL) {
+            /* This can happen if someone arbitrarily sets NPY_ARRAY_OWNDATA */
+            PyErr_SetString(PyExc_RuntimeError,
+                            "no memory handler found but OWNDATA flag set");
+            return NULL;
+        }
         new_data = PyDataMem_UserRENEW(PyArray_DATA(self),
                                        newnbytes == 0 ? elsize : newnbytes,
-                                       &PyArray_HANDLER(self)->allocator);
+                                       &handler->allocator);
         if (new_data == NULL) {
             PyErr_SetString(PyExc_MemoryError,
                     "cannot allocate memory for array");
21 changes: 12 additions & 9 deletions numpy/core/tests/test_mem_policy.py
@@ -151,12 +151,14 @@ def test_set_policy(get_module):
     orig_policy_name = np.core.multiarray.get_handler_name()

     a = np.arange(10).reshape((2, 5)) # a doesn't own its own data
-    assert np.core.multiarray.get_handler_name(a) == orig_policy_name
+    assert np.core.multiarray.get_handler_name(a) == None
+    assert np.core.multiarray.get_handler_name(a.base) == orig_policy_name

     orig_policy = get_module.set_secret_data_policy()

     b = np.arange(10).reshape((2, 5)) # b doesn't own its own data
-    assert np.core.multiarray.get_handler_name(b) == 'secret_data_allocator'
+    assert np.core.multiarray.get_handler_name(b) == None
+    assert np.core.multiarray.get_handler_name(b.base) == 'secret_data_allocator'

     if orig_policy_name == 'default_allocator':
         get_module.set_old_policy(None) # tests PyDataMem_SetHandler(NULL)
@@ -171,16 +173,17 @@ def test_policy_propagation(get_module):
     class MyArr(np.ndarray):
         pass

+    # The memory policy goes hand-in-hand with flags.owndata
     orig_policy_name = np.core.multiarray.get_handler_name()
-    a = np.arange(10).view(MyArr).reshape((2, 5)) # a doesn't own its own data
-    assert np.core.multiarray.get_handler_name(a) == orig_policy_name
+    a = np.arange(10).view(MyArr).reshape((2, 5))
+    assert np.core.multiarray.get_handler_name(a) == None
+    assert a.flags.owndata == False

-    orig_policy = get_module.set_secret_data_policy()
-    secret_policy_name = np.core.multiarray.get_handler_name()
-    b = np.arange(10).view(MyArr).reshape((2, 5)) # b doesn't own its own data
-    assert np.core.multiarray.get_handler_name(b) == secret_policy_name
-    get_module.set_old_policy(orig_policy)
+    assert np.core.multiarray.get_handler_name(a.base) == None
+    assert a.base.flags.owndata == False

+    assert np.core.multiarray.get_handler_name(a.base.base) == orig_policy_name
+    assert a.base.base.flags.owndata == True

 async def concurrent_context1(get_module, orig_policy_name, event):
     if orig_policy_name == 'default_allocator':
2 changes: 2 additions & 0 deletions numpy/core/tests/test_nditer.py
@@ -3149,6 +3149,7 @@ def test_partial_iteration_cleanup(in_dtype, buf_dtype, steps):
     # Note that resetting does not free references
     del it
     break_cycles()
+    break_cycles()
     assert count == sys.getrefcount(value)

     # Repeat the test with `iternext`
@@ -3159,6 +3160,7 @@ def test_partial_iteration_cleanup(in_dtype, buf_dtype, steps):

     del it # should ensure cleanup
     break_cycles()
+    break_cycles()
     assert count == sys.getrefcount(value)
