gh-101659: Isolate "obmalloc" State to Each Interpreter by ericsnowcurrently · Pull Request #101660 · python/cpython · GitHub
gh-101659: Isolate "obmalloc" State to Each Interpreter #101660

Merged · 58 commits · Apr 24, 2023
Changes from 1 commit

Commits (58)
07a09d4
Pass PyInterpreterState to pymalloc_*().
ericsnowcurrently Oct 6, 2022
ca75048
Move the object arenas to the interpreter state.
ericsnowcurrently Oct 7, 2022
4ee199b
Drop an errant #define.
ericsnowcurrently Feb 7, 2023
2768fa4
Leave dump_debug_stats in the global state.
ericsnowcurrently Feb 7, 2023
bf9425f
Dynamically initialize obmalloc for subinterpreters.
ericsnowcurrently Feb 9, 2023
d5da34b
Merge branch 'main' into per-interpreter-alloc
ericsnowcurrently Mar 9, 2023
6c3111c
Pass around struct _obmalloc_state* instead of PyInterpeterState*.
ericsnowcurrently Mar 8, 2023
4dc087d
Add _PyInterpreterConfig.use_main_obmalloc.
ericsnowcurrently Mar 9, 2023
1ae33a0
Add a comment about why per-interpreter obmalloc requires multi-phase…
ericsnowcurrently Mar 9, 2023
5b54d63
Add a TODO comment.
ericsnowcurrently Mar 9, 2023
9f4f8f3
Optionally use the main interpreter's obmalloc state.
ericsnowcurrently Mar 9, 2023
aa10204
Pass use_main_obmalloc to run_in_subinterp() in test_import.
ericsnowcurrently Mar 9, 2023
69d9a2d
_Py_GetAllocatedBlocks() -> _Py_GetGlobalAllocatedBlocks().
ericsnowcurrently Mar 10, 2023
25378f8
Errors from _Py_NewInterpreterFromConfig() are no longer fatal.
ericsnowcurrently Mar 10, 2023
1c5b109
Chain the exceptions.
ericsnowcurrently Mar 13, 2023
f36426b
Swap out the failed tstate.
ericsnowcurrently Mar 10, 2023
54b9f09
Remaining static builtin types must be fixed.
ericsnowcurrently Mar 13, 2023
2358a42
Add PyInterpreterState.sysdict_copy.
ericsnowcurrently Mar 13, 2023
b6502e1
Set m_copy to None for sys and builtins.
ericsnowcurrently Mar 13, 2023
678e67b
Add _PyIO_InitTypes().
ericsnowcurrently Mar 13, 2023
69a5829
Fix test_capi.
ericsnowcurrently Mar 13, 2023
3feb408
Avoid allocation for shared exceptions.
ericsnowcurrently Mar 13, 2023
05806fc
Fix the ChannelID tp_name.
ericsnowcurrently Mar 13, 2023
b1cd7bb
Merge branch 'main' into per-interpreter-alloc
ericsnowcurrently Mar 29, 2023
4feb2b7
Do not include the total from interpreters sharing with main.
ericsnowcurrently Mar 29, 2023
136ad2f
Add _PyRuntime.obmalloc.interpreter_leaks.
ericsnowcurrently Mar 29, 2023
e19bb37
Track leaked blocks across init/fini cycles.
ericsnowcurrently Mar 29, 2023
6c51997
Clean up assumptions around runtime fini.
ericsnowcurrently Mar 29, 2023
f0fcaf6
Merge branch 'main' into per-interpreter-alloc
ericsnowcurrently Mar 29, 2023
0ff65ff
Add stubs for when WITH_PYMALLOC isn't defined.
ericsnowcurrently Mar 30, 2023
7db8d4a
Decref the key in the right interpreter in _extensions_cache_set().
ericsnowcurrently Mar 31, 2023
38bee89
Don't test against sys (for now).
ericsnowcurrently Mar 31, 2023
375a8f2
Clean up SubinterpImportTests.
ericsnowcurrently Mar 31, 2023
b0a9e11
Ensure we are testing against the right type of extension.
ericsnowcurrently Mar 31, 2023
5e5d5d5
Add a test that uses an isolated interpreter.
ericsnowcurrently Mar 31, 2023
25809ce
Fix is_core_module().
ericsnowcurrently Apr 4, 2023
616d3dd
Merge branch 'main' into per-interpreter-alloc
ericsnowcurrently Apr 4, 2023
43a836b
Ignore last_final_leaks.
ericsnowcurrently Apr 4, 2023
1841b55
Fix a typo.
ericsnowcurrently Apr 4, 2023
299527e
Merge branch 'main' into per-interpreter-alloc
ericsnowcurrently Apr 4, 2023
0091e48
Add a note about global state owned by the module.
ericsnowcurrently Apr 5, 2023
9f74f7b
Factor out GLOBAL_MALLOC() and GLOBAL_FREE().
ericsnowcurrently Apr 5, 2023
10c3589
Switch to the raw allocator.
ericsnowcurrently Apr 5, 2023
ff727ec
Merge branch 'channels-raw-allocator' into per-interpreter-alloc
ericsnowcurrently Apr 5, 2023
593430b
Use the raw allocator for _PyCrossInterpreterData_InitWithSize().
ericsnowcurrently Apr 5, 2023
f5ae710
atexit_callback -> atexit_py_callback.
ericsnowcurrently Apr 5, 2023
e6d4776
Add pycore_atexit.h.
ericsnowcurrently Apr 5, 2023
c719f02
Add _Py_AtExit().
ericsnowcurrently Apr 5, 2023
47c302d
Add a TODO comment.
ericsnowcurrently Apr 5, 2023
aaeaaa6
Move _Py_AtExit() to the public API.
ericsnowcurrently Apr 5, 2023
b5396e4
Test a constraint.
ericsnowcurrently Apr 5, 2023
448b48a
Add an atexit callback for _xxinterpchannels.
ericsnowcurrently Apr 5, 2023
c86f738
Implement the callback.
ericsnowcurrently Apr 5, 2023
1827feb
Drop the _PyCrossInterpreterData_Clear() call in _xxinterpchannels.
ericsnowcurrently Apr 5, 2023
82b395c
Drop the _PyCrossInterpreterData_Clear() call in _xxsubinterpreters.
ericsnowcurrently Apr 5, 2023
df77a64
Merge branch 'atexit-c-callback' into per-interpreter-alloc
ericsnowcurrently Apr 6, 2023
22758a3
Merge branch 'main' into per-interpreter-alloc
ericsnowcurrently Apr 6, 2023
0fd74a9
Merge branch 'main' into per-interpreter-alloc
ericsnowcurrently Apr 24, 2023
Move the object arenas to the interpreter state.
ericsnowcurrently committed Feb 7, 2023
commit ca75048cf44fa81004558a14e7d81e3aeb27e1f6
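This commit is the core of the PR: the struct _obmalloc_state instance moves out of the global _PyRuntime and into each PyInterpreterState, so every interpreter gets its own small-object arenas. A minimal sketch of the access-pattern change, using the names visible in the diffs below; get_obmalloc_state() is a made-up helper for illustration, not something added by the commit.

/* Before: one process-wide allocator state, reached through the runtime.
 *     struct _obmalloc_state *state = &_PyRuntime.obmalloc;
 *
 * After: allocator helpers are handed an interpreter and resolve the state
 * through it (internal headers: pycore_interp.h, pycore_obmalloc.h). */
static inline struct _obmalloc_state *
get_obmalloc_state(PyInterpreterState *interp)
{
    return &interp->obmalloc;
}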
5 changes: 4 additions & 1 deletion Include/internal/pycore_interp.h
@@ -21,8 +21,9 @@ extern "C" {
#include "pycore_function.h" // FUNC_MAX_WATCHERS
#include "pycore_genobject.h" // struct _Py_async_gen_state
#include "pycore_gc.h" // struct _gc_runtime_state
#include "pycore_list.h" // struct _Py_list_state
#include "pycore_global_objects.h" // struct _Py_interp_static_objects
#include "pycore_list.h" // struct _Py_list_state
#include "pycore_obmalloc.h" // struct obmalloc_state
#include "pycore_tuple.h" // struct _Py_tuple_state
#include "pycore_typeobject.h" // struct type_cache
#include "pycore_unicodeobject.h" // struct _Py_unicode_state
@@ -89,6 +90,8 @@ struct _is {
int _initialized;
int finalizing;

struct _obmalloc_state obmalloc;

struct _ceval_state ceval;
struct _gc_runtime_state gc;

2 changes: 0 additions & 2 deletions Include/internal/pycore_runtime.h
@@ -22,7 +22,6 @@ extern "C" {
#include "pycore_pymem.h" // struct _pymem_allocators
#include "pycore_pyhash.h" // struct pyhash_runtime_state
#include "pycore_pythread.h" // struct _pythread_runtime_state
#include "pycore_obmalloc.h" // struct obmalloc_state
#include "pycore_signal.h" // struct _signals_runtime_state
#include "pycore_time.h" // struct _time_runtime_state
#include "pycore_tracemalloc.h" // struct _tracemalloc_runtime_state
@@ -88,7 +87,6 @@ typedef struct pyruntimestate {
_Py_atomic_address _finalizing;

struct _pymem_allocators allocators;
struct _obmalloc_state obmalloc;
struct pyhash_runtime_state pyhash_state;
struct _time_runtime_state time;
struct _pythread_runtime_state threads;
6 changes: 3 additions & 3 deletions Include/internal/pycore_runtime_init.h
@@ -25,7 +25,6 @@ extern "C" {
_pymem_allocators_debug_INIT, \
_pymem_allocators_obj_arena_INIT, \
}, \
.obmalloc = _obmalloc_state_INIT(runtime.obmalloc), \
.pyhash_state = pyhash_state_INIT, \
.signals = _signals_RUNTIME_INIT, \
.interpreters = { \
@@ -94,7 +93,7 @@
}, \
}, \
}, \
._main_interpreter = _PyInterpreterState_INIT, \
._main_interpreter = _PyInterpreterState_INIT(runtime._main_interpreter), \
}

#ifdef HAVE_DLOPEN
@@ -110,10 +109,11 @@ extern "C" {
# define DLOPENFLAGS_INIT
#endif

#define _PyInterpreterState_INIT \
#define _PyInterpreterState_INIT(interp) \
{ \
.id_refcount = -1, \
DLOPENFLAGS_INIT \
.obmalloc = _obmalloc_state_INIT(interp.obmalloc), \
.ceval = { \
.recursion_limit = Py_DEFAULT_RECURSION_LIMIT, \
}, \
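Why _PyInterpreterState_INIT now takes the interpreter as a macro argument: _obmalloc_state_INIT has to embed pointers back into the very object being statically initialized (obmalloc's usedpools table points into itself), so the initializer must be told which instance it is filling in. A toy illustration of that pattern, with made-up names (struct node, NODE_INIT):

/* Toy example only: a static initializer that stores the address of the
 * object it initializes must receive that object by name. */
struct node {
    struct node *self;
};

#define NODE_INIT(node) \
    { .self = &(node) }

/* Mirrors ._main_interpreter = _PyInterpreterState_INIT(runtime._main_interpreter). */
static struct node main_node = NODE_INIT(main_node);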
73 changes: 40 additions & 33 deletions Objects/obmalloc.c
@@ -727,19 +727,22 @@ static int running_on_valgrind = -1;
#endif


#define allarenas (_PyRuntime.obmalloc.mgmt.arenas)
#define maxarenas (_PyRuntime.obmalloc.mgmt.maxarenas)
#define unused_arena_objects (_PyRuntime.obmalloc.mgmt.unused_arena_objects)
#define usable_arenas (_PyRuntime.obmalloc.mgmt.usable_arenas)
#define nfp2lasta (_PyRuntime.obmalloc.mgmt.nfp2lasta)
#define narenas_currently_allocated (_PyRuntime.obmalloc.mgmt.narenas_currently_allocated)
#define ntimes_arena_allocated (_PyRuntime.obmalloc.mgmt.ntimes_arena_allocated)
#define narenas_highwater (_PyRuntime.obmalloc.mgmt.narenas_highwater)
#define raw_allocated_blocks (_PyRuntime.obmalloc.mgmt.raw_allocated_blocks)
// These macros all rely on a local "interp" variable.
#define usedpools (interp->obmalloc.pools.used)
#define allarenas (interp->obmalloc.mgmt.arenas)
#define maxarenas (interp->obmalloc.mgmt.maxarenas)
#define unused_arena_objects (interp->obmalloc.mgmt.unused_arena_objects)
#define usable_arenas (interp->obmalloc.mgmt.usable_arenas)
#define nfp2lasta (interp->obmalloc.mgmt.nfp2lasta)
#define narenas_currently_allocated (interp->obmalloc.mgmt.narenas_currently_allocated)
#define ntimes_arena_allocated (interp->obmalloc.mgmt.ntimes_arena_allocated)
#define narenas_highwater (interp->obmalloc.mgmt.narenas_highwater)
#define raw_allocated_blocks (interp->obmalloc.mgmt.raw_allocated_blocks)

Py_ssize_t
_Py_GetAllocatedBlocks(void)
{
PyInterpreterState *interp = _PyInterpreterState_GET();
Py_ssize_t n = raw_allocated_blocks;
/* add up allocated blocks for used pools */
for (uint i = 0; i < maxarenas; ++i) {
@@ -764,16 +767,16 @@ _Py_GetAllocatedBlocks(void)
/*==========================================================================*/
/* radix tree for tracking arena usage. */

#define arena_map_root (_PyRuntime.obmalloc.usage.arena_map_root)
#define arena_map_root (interp->obmalloc.usage.arena_map_root)
#ifdef USE_INTERIOR_NODES
#define arena_map_mid_count (_PyRuntime.obmalloc.usage.arena_map_mid_count)
#define arena_map_bot_count (_PyRuntime.obmalloc.usage.arena_map_bot_count)
#define arena_map_mid_count (interp->obmalloc.usage.arena_map_mid_count)
#define arena_map_bot_count (interp->obmalloc.usage.arena_map_bot_count)
#endif

/* Return a pointer to a bottom tree node, return NULL if it doesn't exist or
* it cannot be created */
static Py_ALWAYS_INLINE arena_map_bot_t *
arena_map_get(pymem_block *p, int create)
arena_map_get(PyInterpreterState *interp, pymem_block *p, int create)
{
#ifdef USE_INTERIOR_NODES
/* sanity check that IGNORE_BITS is correct */
@@ -834,11 +837,13 @@

/* mark or unmark addresses covered by arena */
static int
arena_map_mark_used(uintptr_t arena_base, int is_used)
arena_map_mark_used(PyInterpreterState *interp,
uintptr_t arena_base, int is_used)
{
/* sanity check that IGNORE_BITS is correct */
assert(HIGH_BITS(arena_base) == HIGH_BITS(&arena_map_root));
arena_map_bot_t *n_hi = arena_map_get((pymem_block *)arena_base, is_used);
arena_map_bot_t *n_hi = arena_map_get(
interp, (pymem_block *)arena_base, is_used);
if (n_hi == NULL) {
assert(is_used); /* otherwise node should already exist */
return 0; /* failed to allocate space for node */
@@ -863,7 +868,8 @@ arena_map_mark_used(uintptr_t arena_base, int is_used)
* must overflow to 0. However, that would mean arena_base was
* "ideal" and we should not be in this case. */
assert(arena_base < arena_base_next);
arena_map_bot_t *n_lo = arena_map_get((pymem_block *)arena_base_next, is_used);
arena_map_bot_t *n_lo = arena_map_get(
interp, (pymem_block *)arena_base_next, is_used);
if (n_lo == NULL) {
assert(is_used); /* otherwise should already exist */
n_hi->arenas[i3].tail_hi = 0;
Expand All @@ -878,9 +884,9 @@ arena_map_mark_used(uintptr_t arena_base, int is_used)
/* Return true if 'p' is a pointer inside an obmalloc arena.
* _PyObject_Free() calls this so it needs to be very fast. */
static int
arena_map_is_used(pymem_block *p)
arena_map_is_used(PyInterpreterState *interp, pymem_block *p)
{
arena_map_bot_t *n = arena_map_get(p, 0);
arena_map_bot_t *n = arena_map_get(interp, p, 0);
if (n == NULL) {
return 0;
}
@@ -903,7 +909,7 @@ arena_map_is_used(pymem_block *p)
* `usable_arenas` to the return value.
*/
static struct arena_object*
new_arena(void)
new_arena(PyInterpreterState *interp)
{
struct arena_object* arenaobj;
uint excess; /* number of bytes above pool alignment */
@@ -969,7 +975,7 @@ new_arena(void)
address = _PyObject_Arena.alloc(_PyObject_Arena.ctx, ARENA_SIZE);
#if WITH_PYMALLOC_RADIX_TREE
if (address != NULL) {
if (!arena_map_mark_used((uintptr_t)address, 1)) {
if (!arena_map_mark_used(interp, (uintptr_t)address, 1)) {
/* marking arena in radix tree failed, abort */
_PyObject_Arena.free(_PyObject_Arena.ctx, address, ARENA_SIZE);
address = NULL;
@@ -1012,9 +1018,9 @@
pymalloc. When the radix tree is used, 'poolp' is unused.
*/
static bool
address_in_range(void *p, poolp Py_UNUSED(pool))
address_in_range(PyInterpreterState *interp, void *p, poolp Py_UNUSED(pool))
{
return arena_map_is_used(p);
return arena_map_is_used(interp, p);
}
#else
/*
@@ -1095,7 +1101,7 @@ extremely desirable that it be this fast.
static bool _Py_NO_SANITIZE_ADDRESS
_Py_NO_SANITIZE_THREAD
_Py_NO_SANITIZE_MEMORY
address_in_range(void *p, poolp pool)
address_in_range(PyInterpreterState *interp, void *p, poolp pool)
{
// Since address_in_range may be reading from memory which was not allocated
// by Python, it is important that pool->arenaindex is read only once, as
@@ -1139,7 +1145,7 @@ pymalloc_pool_extend(poolp pool, uint size)
* This function takes new pool and allocate a block from it.
*/
static void*
allocate_from_new_pool(uint size)
allocate_from_new_pool(PyInterpreterState *interp, uint size)
{
/* There isn't a pool of the right size class immediately
* available: use a free pool.
@@ -1151,7 +1157,7 @@ allocate_from_new_pool(uint size)
return NULL;
}
#endif
usable_arenas = new_arena();
usable_arenas = new_arena(interp);
if (usable_arenas == NULL) {
return NULL;
}
@@ -1315,7 +1321,7 @@ pymalloc_alloc(PyInterpreterState *interp, void *Py_UNUSED(ctx), size_t nbytes)
/* There isn't a pool of the right size class immediately
* available: use a free pool.
*/
bp = allocate_from_new_pool(size);
bp = allocate_from_new_pool(interp, size);
}

return (void *)bp;
@@ -1361,7 +1367,7 @@ _PyObject_Calloc(void *ctx, size_t nelem, size_t elsize)


static void
insert_to_usedpool(poolp pool)
insert_to_usedpool(PyInterpreterState *interp, poolp pool)
{
assert(pool->ref.count > 0); /* else the pool is empty */

@@ -1377,7 +1383,7 @@
}

static void
insert_to_freepool(poolp pool)
insert_to_freepool(PyInterpreterState *interp, poolp pool)
{
poolp next = pool->nextpool;
poolp prev = pool->prevpool;
@@ -1460,7 +1466,7 @@ insert_to_freepool(poolp pool)

#if WITH_PYMALLOC_RADIX_TREE
/* mark arena region as not under control of obmalloc */
arena_map_mark_used(ao->address, 0);
arena_map_mark_used(interp, ao->address, 0);
#endif

/* Free the entire arena. */
Expand Down Expand Up @@ -1558,7 +1564,7 @@ pymalloc_free(PyInterpreterState *interp, void *Py_UNUSED(ctx), void *p)
#endif

poolp pool = POOL_ADDR(p);
if (UNLIKELY(!address_in_range(p, pool))) {
if (UNLIKELY(!address_in_range(interp, p, pool))) {
return 0;
}
/* We allocated this address. */
@@ -1582,7 +1588,7 @@ pymalloc_free(PyInterpreterState *interp, void *Py_UNUSED(ctx), void *p)
* targets optimal filling when several pools contain
* blocks of the same size class.
*/
insert_to_usedpool(pool);
insert_to_usedpool(interp, pool);
return 1;
}

@@ -1599,7 +1605,7 @@ pymalloc_free(PyInterpreterState *interp, void *Py_UNUSED(ctx), void *p)
* previously freed pools will be allocated later
* (being not referenced, they are perhaps paged out).
*/
insert_to_freepool(pool);
insert_to_freepool(interp, pool);
return 1;
}

@@ -1648,7 +1654,7 @@ pymalloc_realloc(PyInterpreterState *interp, void *ctx,
#endif

pool = POOL_ADDR(p);
if (!address_in_range(p, pool)) {
if (!address_in_range(interp, p, pool)) {
/* pymalloc is not managing this block.

If nbytes <= SMALL_REQUEST_THRESHOLD, it's tempting to try to take
@@ -2295,6 +2301,7 @@ _PyObject_DebugMallocStats(FILE *out)
if (!_PyMem_PymallocEnabled()) {
return 0;
}
PyInterpreterState *interp = _PyInterpreterState_GET();

uint i;
const uint numclasses = SMALL_REQUEST_THRESHOLD >> ALIGNMENT_SHIFT;
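Most of the churn in Objects/obmalloc.c above follows one pattern: the file-local macros now expand against a local interp variable, internal helpers gain an explicit PyInterpreterState *interp parameter, and entry points without one (such as _Py_GetAllocatedBlocks() and _PyObject_DebugMallocStats()) look up the current interpreter themselves. A condensed sketch of that pattern, reusing names from the diff; count_blocks() and example_get_allocated_blocks() are invented for illustration.

/* Macros resolve state through a local `interp`, as in the diff above. */
#define raw_allocated_blocks (interp->obmalloc.mgmt.raw_allocated_blocks)

/* Internal helpers take the interpreter explicitly... */
static void
count_blocks(PyInterpreterState *interp, Py_ssize_t *total)
{
    *total += raw_allocated_blocks;   /* expands to interp->obmalloc.mgmt... */
}

/* ...while parameterless entry points fetch the current interpreter. */
Py_ssize_t
example_get_allocated_blocks(void)
{
    PyInterpreterState *interp = _PyInterpreterState_GET();
    Py_ssize_t total = 0;
    count_blocks(interp, &total);
    return total;
}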