ENH: Configurable allocator by mattip · Pull Request #17582 · numpy/numpy · GitHub

ENH: Configurable allocator #17582


Merged
merged 83 commits into from
Oct 25, 2021
Commits
55f2f6c
ENH: add and use global configurable memory routines
mattip Oct 11, 2020
23da73e
ENH: add tests and a way to compile c-extensions from tests
mattip Oct 16, 2020
94b9f25
fix allocation/free exposed by tests
mattip Oct 18, 2020
81b45fd
DOC: document the new APIs (and some old ones too)
mattip Oct 18, 2020
fc32c2f
BUG: return void from FREE, also some cleanup
mattip Oct 19, 2020
de22327
MAINT: changes from review
mattip Oct 19, 2020
38274a4
fixes from linter
mattip Mar 11, 2021
59b520a
setting ndarray->descr on 0d or scalars mess with FREE
mattip Apr 16, 2021
5264019
make scalar allocation more consistent wrt np_alloc_cache
mattip Apr 16, 2021
7c396d7
change formatting for sphinx
mattip Apr 17, 2021
de9001c
remove memcpy variants
mattip Apr 19, 2021
9e7c3ed
update to match NEP 49
mattip May 3, 2021
953cc88
ENH: add a python-level get_handler_name
mattip May 6, 2021
ad9329b
ENH: add core.multiarray.get_handler_name
mattip May 6, 2021
6d10fdb
Allow closure-like definition of the data mem routines
eliaskoromilas Jun 30, 2021
18bea05
Fix incompatible pointer warnings
eliaskoromilas Jun 30, 2021
4368023
Note PyDataMemAllocator and PyMemAllocatorEx differentiation
eliaskoromilas Jul 1, 2021
d243313
Redefine default allocator handling
eliaskoromilas Jul 1, 2021
c9b6854
Always allocate new arrays using the current_handler
eliaskoromilas Jul 5, 2021
a8fd378
Search for the mem_handler name of the data owner
eliaskoromilas Jul 5, 2021
a17565b
Sub-comparisons don't need a local mem_handler
eliaskoromilas Jul 5, 2021
2ec5912
Make the default_handler a valid PyDataMem_Handler
eliaskoromilas Jul 14, 2021
227c4b8
Fix PyDataMem_SetHandler description (NEP discussion)
eliaskoromilas Jul 14, 2021
fb8135d
Pass the allocators by reference
eliaskoromilas Jul 14, 2021
7291484
Implement allocator context-locality
eliaskoromilas Aug 8, 2021
c7a9c22
Fix documentation, make PyDataMem_GetHandler return const
eliaskoromilas Aug 8, 2021
99f8250
remove import of setuptools==49.1.3, doesn't work on python3.10
mattip Aug 9, 2021
b43c1fe
Fix refcount leaks
eliaskoromilas Aug 9, 2021
a7a5435
fix function signatures in test
mattip Aug 9, 2021
e3723df
Return early on PyDataMem_GetHandler error (VOID_compare)
eliaskoromilas Aug 9, 2021
144acc6
Add context/thread-locality tests, allow testing custom policies
eliaskoromilas Aug 9, 2021
8539f5f
Merge branch 'configurable_allocator' into configurable_allocator
eliaskoromilas Aug 9, 2021
6ab00d0
ENH: add and use global configurable memory routines
mattip Oct 11, 2020
e7e8754
ENH: add tests and a way to compile c-extensions from tests
mattip Oct 16, 2020
5f08532
fix allocation/free exposed by tests
mattip Oct 18, 2020
7266029
DOC: document the new APIs (and some old ones too)
mattip Oct 18, 2020
5d547ff
BUG: return void from FREE, also some cleanup
mattip Oct 19, 2020
4617c50
MAINT: changes from review
mattip Oct 19, 2020
8f739c4
fixes from linter
mattip Mar 11, 2021
90205b6
setting ndarray->descr on 0d or scalars mess with FREE
mattip Apr 16, 2021
5c0d3f9
make scalar allocation more consistent wrt np_alloc_cache
mattip Apr 16, 2021
3b385d9
change formatting for sphinx
mattip Apr 17, 2021
e6e12a3
remove memcpy variants
mattip Apr 19, 2021
048552d
update to match NEP 49
mattip May 3, 2021
ad6f8ad
ENH: add a python-level get_handler_name
mattip May 6, 2021
50f8b93
ENH: add core.multiarray.get_handler_name
mattip May 6, 2021
c7438f5
Allow closure-like definition of the data mem routines
eliaskoromilas Jun 30, 2021
f823ba4
Fix incompatible pointer warnings
eliaskoromilas Jun 30, 2021
ad13161
Note PyDataMemAllocator and PyMemAllocatorEx differentiation
eliaskoromilas Jul 1, 2021
0a08acd
Redefine default allocator handling
eliaskoromilas Jul 1, 2021
1f0301d
Always allocate new arrays using the current_handler
eliaskoromilas Jul 5, 2021
3d56aa0
Search for the mem_handler name of the data owner
eliaskoromilas Jul 5, 2021
8ea6818
Sub-comparisons don't need a local mem_handler
eliaskoromilas Jul 5, 2021
fb2af4d
Make the default_handler a valid PyDataMem_Handler
eliaskoromilas Jul 14, 2021
f05a1c6
Fix PyDataMem_SetHandler description (NEP discussion)
eliaskoromilas Jul 14, 2021
660e0a4
Pass the allocators by reference
eliaskoromilas Jul 14, 2021
a4f8d71
remove import of setuptools==49.1.3, doesn't work on python3.10
mattip Aug 9, 2021
d7b1a1d
fix function signatures in test
mattip Aug 9, 2021
ab1a0eb
try to fix cygwin extension building
mattip Aug 9, 2021
b92e36c
YAPF mem_policy test
eliaskoromilas Aug 9, 2021
76cda3a
Merge branch 'configurable_allocator' into configurable_allocator
eliaskoromilas Aug 9, 2021
3eadf2f
Less empty lines, more comments (tests)
eliaskoromilas Aug 10, 2021
dbe9d73
Apply suggestions from code review (set an exception and)
eliaskoromilas Aug 10, 2021
1bfb870
Merge pull request #57 from eliaskoromilas/configurable_allocator
mattip Aug 11, 2021
0511820
skip test on cygwin
mattip Aug 11, 2021
23c4bc0
update API hash for changed signature
mattip Aug 13, 2021
ed8649b
TST: add gc.collect to make sure cycles are broken
mattip Aug 13, 2021
9aacefa
Implement thread-locality for PyPy
eliaskoromilas Aug 12, 2021
79712fa
Update numpy/core/tests/test_mem_policy.py
mattip Aug 25, 2021
a2ae4c0
fixes from review
mattip Aug 25, 2021
09b9c0d
update circleci config
mattip Aug 25, 2021
efb3c77
fix test
mattip Aug 25, 2021
2945c64
make the connection between OWNDATA and having a allocator handle mor…
mattip Aug 25, 2021
3a97d9a
improve docstring, fix flake8 for tests
mattip Aug 26, 2021
1df805c
update PyDataMem_GetHandler() from review
mattip Aug 26, 2021
ef607bd
Implement allocator lifetime management
eliaskoromilas Aug 27, 2021
8bdc9a1
Merge pull request #59 from eliaskoromilas/configurable_allocator
mattip Aug 29, 2021
a3256e5
update NEP and add best-effort handling of error in PyDataMem_UserFREE
mattip Aug 29, 2021
4d6ea65
merge main into branch
mattip Aug 30, 2021
5941d7c
merge main into branch
mattip Oct 17, 2021
522c368
Merge branch 'main' into configurable_allocator
mattip Oct 25, 2021
442b0e1
ENH: fix and test for blindly taking ownership of data
mattip Oct 25, 2021
8ca8b54
Update doc/neps/nep-0049.rst
seberg Oct 25, 2021
Add context/thread-locality tests, allow testing custom policies
eliaskoromilas committed Aug 9, 2021
commit 144acc6d1a9ee9f8f08e9aca9fbe24ff0de6f0d6
2 changes: 1 addition & 1 deletion doc/source/reference/c-api/data_memory.rst
@@ -88,7 +88,7 @@ reallocate or free the data memory of the instance.
.. c:function:: const PyDataMem_Handler * PyDataMem_GetHandler(PyArrayObject *obj)

Return the `PyDataMem_Handler` used by the ``PyArrayObject``. If ``NULL``,
-return the current global policy that ill be used to allocate data for the
+return the current global policy that will be used to allocate data for the
next ``PyArrayObject``. On failure, set an exception and return ``NULL``.

For an example of setting up and using the PyDataMem_Handler, see the test in
2 changes: 1 addition & 1 deletion numpy/core/src/multiarray/alloc.c
@@ -514,7 +514,7 @@ PyDataMem_SetHandler(PyDataMem_Handler *handler)

/*NUMPY_API
* Return the PyDataMem_Handler used by the PyArrayObject. If NULL, return
-* the current global policy that ill be used to allocate data
+* the current global policy that will be used to allocate data
* for the next PyArrayObject. On failure, set an exception and return NULL.
*/
NPY_NO_EXPORT const PyDataMem_Handler *
204 changes: 151 additions & 53 deletions numpy/core/tests/test_mem_policy.py
@@ -1,5 +1,7 @@
import asyncio
import pytest
import numpy as np
import threading
from numpy.testing import extbuild


@@ -11,48 +13,51 @@ def get_module(tmp_path):
free/calloc go via the functions here.
"""
functions = [(
"test_prefix", "METH_O",
"""
if (!PyArray_Check(args)) {
PyErr_SetString(PyExc_ValueError,
"must be called with a numpy scalar or ndarray");
}
return PyUnicode_FromString(
PyDataMem_GetHandler((PyArrayObject*)args)->name);
"""
),
("set_new_policy", "METH_NOARGS",
"set_secret_data_policy", "METH_NOARGS",
"""
const PyDataMem_Handler *old = PyDataMem_SetHandler(&new_handler);
return PyUnicode_FromString(old->name);
PyDataMem_Handler *old = (PyDataMem_Handler *) PyDataMem_SetHandler(&secret_data_handler);
return PyCapsule_New(old, NULL, NULL);
"""),
("set_old_policy", "METH_NOARGS",
("set_old_policy", "METH_O",
"""
const PyDataMem_Handler *old = PyDataMem_SetHandler(NULL);
return PyUnicode_FromString(old->name);
PyDataMem_Handler *old = NULL;
if (args != NULL && PyCapsule_CheckExact(args)) {
old = (PyDataMem_Handler *) PyCapsule_GetPointer(args, NULL);
}
PyDataMem_SetHandler(old);
Py_RETURN_NONE;
"""),
]
prologue = '''
#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
#include <numpy/arrayobject.h>
/*
* This struct allows the dynamic configuration of the allocator funcs
* of the `secret_data_allocator`. It is provided here for
* demonstration purposes, as a valid `ctx` use-case scenario.
*/
typedef struct {
void *(*malloc)(size_t);
void *(*calloc)(size_t, size_t);
void *(*realloc)(void *, size_t);
void (*free)(void *);
} Allocator;
} SecretDataAllocatorFuncs;
NPY_NO_EXPORT void *
shift_alloc(Allocator *ctx, size_t sz) {
char *real = (char *)ctx->malloc(sz + 64);
shift_alloc(void *ctx, size_t sz) {
SecretDataAllocatorFuncs *funcs = (SecretDataAllocatorFuncs *) ctx;

char *real = (char *)funcs->malloc(sz + 64);
if (real == NULL) {
return NULL;
}
snprintf(real, 64, "originally allocated %ld", (unsigned long)sz);
return (void *)(real + 64);
}
NPY_NO_EXPORT void *
shift_zero(Allocator *ctx, size_t sz, size_t cnt) {
char *real = (char *)ctx->calloc(sz + 64, cnt);
shift_zero(void *ctx, size_t sz, size_t cnt) {
SecretDataAllocatorFuncs *funcs = (SecretDataAllocatorFuncs *) ctx;

char *real = (char *)funcs->calloc(sz + 64, cnt);
if (real == NULL) {
return NULL;
}
@@ -61,7 +66,9 @@ def get_module(tmp_path):
return (void *)(real + 64);
}
NPY_NO_EXPORT void
shift_free(Allocator *ctx, void * p, npy_uintp sz) {
shift_free(void *ctx, void * p, npy_uintp sz) {
SecretDataAllocatorFuncs *funcs = (SecretDataAllocatorFuncs *) ctx;

if (p == NULL) {
return ;
}
@@ -70,34 +77,36 @@ def get_module(tmp_path):
fprintf(stdout, "uh-oh, unmatched shift_free, "
"no appropriate prefix\\n");
/* Make C runtime crash by calling free on the wrong address */
ctx->free((char *)p + 10);
/* ctx->free(real); */
funcs->free((char *)p + 10);
/* funcs->free(real); */
}
else {
npy_uintp i = (npy_uintp)atoi(real +20);
if (i != sz) {
fprintf(stderr, "uh-oh, unmatched shift_free"
"(ptr, %ld) but allocated %ld\\n", sz, i);
/* This happens in some places, only print */
ctx->free(real);
funcs->free(real);
}
else {
ctx->free(real);
funcs->free(real);
}
}
}
NPY_NO_EXPORT void *
shift_realloc(Allocator *ctx, void * p, npy_uintp sz) {
shift_realloc(void *ctx, void * p, npy_uintp sz) {
SecretDataAllocatorFuncs *funcs = (SecretDataAllocatorFuncs *) ctx;

if (p != NULL) {
char *real = (char *)p - 64;
if (strncmp(real, "originally allocated", 20) != 0) {
fprintf(stdout, "uh-oh, unmatched shift_realloc\\n");
return realloc(p, sz);
}
return (void *)((char *)ctx->realloc(real, sz + 64) + 64);
return (void *)((char *)funcs->realloc(real, sz + 64) + 64);
}
else {
char *real = (char *)ctx->realloc(p, sz + 64);
char *real = (char *)funcs->realloc(p, sz + 64);
if (real == NULL) {
return NULL;
}
@@ -106,20 +115,21 @@ def get_module(tmp_path):
return (void *)(real + 64);
}
}
static Allocator new_handler_ctx = {
/* As an example, we use the standard {m|c|re}alloc/free funcs. */
static SecretDataAllocatorFuncs secret_data_handler_ctx = {
malloc,
calloc,
realloc,
free
};
static PyDataMem_Handler new_handler = {
static PyDataMem_Handler secret_data_handler = {
"secret_data_allocator",
{
&new_handler_ctx,
shift_alloc, /* malloc */
shift_zero, /* calloc */
shift_realloc, /* realloc */
shift_free /* free */
&secret_data_handler_ctx, /* ctx */
shift_alloc, /* malloc */
shift_zero, /* calloc */
shift_realloc, /* realloc */
shift_free /* free */
}
};
'''
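
The C prologue above over-allocates each buffer by 64 bytes, stamps the hidden prefix with "originally allocated <size>", and hands back a pointer just past that prefix so that shift_free can later check both the stamp and the size. A minimal pure-Python sketch of that bookkeeping follows, assuming a bytearray stands in for the C heap. HEADER_SIZE, MAGIC and the returned status strings are illustrative names, not part of the PR.

```python
HEADER_SIZE = 64                      # bytes reserved in front of the user data
MAGIC = b"originally allocated"       # 20-byte stamp, as in the C prologue


def shift_alloc(size: int) -> memoryview:
    """Allocate size + HEADER_SIZE bytes, stamp the header, return the tail."""
    real = bytearray(size + HEADER_SIZE)
    stamp = MAGIC + b" %d" % size
    real[:len(stamp)] = stamp
    # The caller only ever sees the region after the header.
    return memoryview(real)[HEADER_SIZE:]


def shift_free(data: memoryview, size: int) -> str:
    """Check the hidden header of a buffer before it would be released."""
    real = data.obj                   # the full underlying buffer, header included
    if bytes(real[:len(MAGIC)]) != MAGIC:
        return "no prefix"            # memory did not come from shift_alloc
    field = bytes(real[len(MAGIC):HEADER_SIZE]).rstrip(b"\x00")
    recorded = int(field.decode("ascii"))
    if recorded != size:
        return "size mismatch"        # the C version only prints a warning here
    return "ok"


view = shift_alloc(10)
view[:] = b"x" * 10                   # writing user data never touches the header
foreign = memoryview(bytearray(80))[HEADER_SIZE:]   # buffer with no stamp
```

The point of the prefix is diagnostic: any buffer that reaches the free routine without the stamp was allocated by a different policy, which is the kind of mismatch the test module is built to expose.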
@@ -138,36 +148,124 @@ def get_module(tmp_path):


def test_set_policy(get_module):
a = np.arange(10)
orig_policy = get_module.test_prefix(a)
assert orig_policy == np.core.multiarray.get_handler_name()
assert orig_policy == np.core.multiarray.get_handler_name(a)
assert get_module.set_new_policy() == orig_policy
if orig_policy == 'default_allocator':
get_module.set_old_policy()
orig_policy_name = np.core.multiarray.get_handler_name()

a = np.arange(10).reshape((2, 5)) # a doesn't own its own data
assert np.core.multiarray.get_handler_name(a) == orig_policy_name

orig_policy = get_module.set_secret_data_policy()

b = np.arange(10).reshape((2, 5)) # b doesn't own its own data
assert np.core.multiarray.get_handler_name(b) == 'secret_data_allocator'

if orig_policy_name == 'default_allocator':
get_module.set_old_policy(None)

assert np.core.multiarray.get_handler_name() == 'default_allocator'
else:
get_module.set_old_policy(orig_policy)

assert np.core.multiarray.get_handler_name() == orig_policy_name
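
test_set_policy relies on a hand-off pattern: installing a policy returns the previously active one, and passing that value back (or None for the default) restores it. The sketch below models that pattern in pure Python under the assumption that the active policy can be represented by its name in a context variable. set_handler and use_handler are hypothetical helpers for illustration and are not NumPy functions.

```python
import contextvars
from contextlib import contextmanager

DEFAULT = "default_allocator"
# Hypothetical stand-in for the currently active allocation policy.
_current = contextvars.ContextVar("current_handler", default=DEFAULT)


def get_handler_name() -> str:
    return _current.get()


def set_handler(name):
    """Install a policy (None selects the default) and return the old one."""
    old = _current.get()
    _current.set(DEFAULT if name is None else name)
    return old


@contextmanager
def use_handler(name):
    """Install a policy for the duration of a with-block, then restore."""
    old = set_handler(name)
    try:
        yield old
    finally:
        set_handler(old)
```

Wrapping the restore step in try/finally means a failing assertion inside the block cannot leave the replacement policy installed for later tests.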


async def concurrent_context1(get_module, event):
get_module.set_secret_data_policy()

assert np.core.multiarray.get_handler_name() == 'secret_data_allocator'

event.set()


async def concurrent_context2(get_module, orig_policy_name, event):
await event.wait()

assert np.core.multiarray.get_handler_name() == orig_policy_name


async def secret_data_context(get_module):
assert np.core.multiarray.get_handler_name() == 'secret_data_allocator'

get_module.set_old_policy(None)


async def async_test_context_locality(get_module):
orig_policy_name = np.core.multiarray.get_handler_name()

event = asyncio.Event()
concurrent_task1 = asyncio.create_task(concurrent_context1(get_module, event))
concurrent_task2 = asyncio.create_task(concurrent_context2(get_module, orig_policy_name, event))
await concurrent_task1
await concurrent_task2

assert np.core.multiarray.get_handler_name() == orig_policy_name

orig_policy = get_module.set_secret_data_policy()

await asyncio.create_task(secret_data_context(get_module))

assert np.core.multiarray.get_handler_name() == 'secret_data_allocator'

get_module.set_old_policy(orig_policy)


def test_context_locality(get_module):
asyncio.run(async_test_context_locality(get_module))
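
The context-locality test above expects three things: a policy set inside one task is invisible to a sibling task, a child task inherits the policy of the task that created it, and a change made by the child does not propagate back. Python's contextvars module has exactly these semantics for asyncio tasks, so the expected behaviour can be reproduced without NumPy. This is a sketch under the assumption that the handler behaves like a context variable; current_handler here is an illustrative variable, not the NumPy object.

```python
import asyncio
import contextvars

current_handler = contextvars.ContextVar(
    "current_handler", default="default_allocator")


async def setter(event):
    # Runs in its own copy of the context: the change stays local.
    current_handler.set("secret_data_allocator")
    event.set()
    return current_handler.get()


async def observer(event):
    await event.wait()          # only look after the sibling has set its policy
    return current_handler.get()


async def child():
    inherited = current_handler.get()
    current_handler.set("default_allocator")   # must not leak to the parent
    return inherited


async def main():
    event = asyncio.Event()
    t1 = asyncio.create_task(setter(event))
    t2 = asyncio.create_task(observer(event))
    seen = {"setter": await t1, "observer": await t2,
            "parent": current_handler.get()}
    current_handler.set("secret_data_allocator")
    seen["child_inherited"] = await asyncio.create_task(child())
    seen["parent_after_child"] = current_handler.get()
    return seen


results = asyncio.run(main())
```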


def concurrent_thread1(get_module, event):
assert np.core.multiarray.get_handler_name() == 'default_allocator'

get_module.set_secret_data_policy()

assert np.core.multiarray.get_handler_name() == 'secret_data_allocator'

event.set()


def concurrent_thread2(get_module, event):
event.wait()

assert np.core.multiarray.get_handler_name() == 'default_allocator'


def test_thread_locality(get_module):
orig_policy_name = np.core.multiarray.get_handler_name()

event = threading.Event()
concurrent_task1 = threading.Thread(target=concurrent_thread1, args=(get_module, event))
concurrent_task2 = threading.Thread(target=concurrent_thread2, args=(get_module, event))
concurrent_task1.start()
concurrent_task2.start()
concurrent_task1.join()
concurrent_task2.join()

assert np.core.multiarray.get_handler_name() == orig_policy_name
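
test_thread_locality checks the same isolation across threads: one thread installs the secret-data policy, a second thread that waits for it still sees the default, and the main thread is unaffected. The following sketch reproduces that expectation with a plain context variable, assuming (as a model only) that the handler is stored that way. The names are illustrative and do not come from the PR.

```python
import contextvars
import threading

current_handler = contextvars.ContextVar(
    "current_handler", default="default_allocator")


def setter(event, seen):
    seen["setter_before"] = current_handler.get()
    current_handler.set("secret_data_allocator")   # local to this thread
    seen["setter_after"] = current_handler.get()
    event.set()


def observer(event, seen):
    event.wait()                # read only after the other thread has set
    seen["observer"] = current_handler.get()


def run_threads():
    seen = {}
    event = threading.Event()
    threads = [
        threading.Thread(target=setter, args=(event, seen)),
        threading.Thread(target=observer, args=(event, seen)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    seen["main"] = current_handler.get()
    return seen


seen = run_threads()
```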


@pytest.mark.slow
def test_new_policy(get_module):
a = np.arange(10)
orig_policy = get_module.test_prefix(a)
assert get_module.set_new_policy() == orig_policy
b = np.arange(10).reshape((2, 5))
assert get_module.test_prefix(b) == 'secret_data_allocator'
orig_policy_name = np.core.multiarray.get_handler_name(a)

orig_policy = get_module.set_secret_data_policy()

b = np.arange(10)
assert np.core.multiarray.get_handler_name(b) == 'secret_data_allocator'

# test array manipulation. This is slow
if orig_policy == 'default_allocator':
if orig_policy_name == 'default_allocator':
# when the np.core.test tests recurse into this test, the
# policy will be set so this "if" will be false, preventing
# infinite recursion
#
# if needed, debug this by
# - running tests with -- -s (to not capture stdout/stderr
# - setting extra_argv=['-vv'] here
np.core.test(verbose=2, extra_argv=['-vv'])
assert np.core.test('full', verbose=2, extra_argv=['-vv'])
# also try the ma tests, the pickling test is quite tricky
np.ma.test(verbose=2, extra_argv=['-vv'])
get_module.set_old_policy()
assert get_module.test_prefix(a) == orig_policy
assert np.ma.test('full', verbose=2, extra_argv=['-vv'])

get_module.set_old_policy(orig_policy)

c = np.arange(10)
assert get_module.test_prefix(c) == 'default_allocator'
assert np.core.multiarray.get_handler_name(c) == orig_policy_name
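
The comment in test_new_policy describes a re-entrancy guard: the full test run collects test_new_policy again, but by then the secret-data policy is active, so the inner invocation skips the full run and the recursion stops after one level. A small sketch of that control flow follows, with a dictionary standing in for the global policy. full_suite and policy_check are hypothetical stand-ins for np.core.test and the test function.

```python
def demo():
    state = {"policy": "default_allocator", "entries": []}

    def full_suite():
        # Stand-in for the full test run: it re-collects every test,
        # including the slow policy check itself.
        policy_check()
        return True

    def policy_check():
        orig = state["policy"]
        state["entries"].append(orig)          # record the policy seen on entry
        state["policy"] = "secret_data_allocator"
        try:
            # Only the outermost call sees the default policy, so only it
            # launches the full suite; the nested call falls through.
            if orig == "default_allocator":
                assert full_suite()
        finally:
            state["policy"] = orig             # restore whatever was active

    policy_check()
    return state


state = demo()
```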