8000 ENH: add locking around initializing the argparse cache by ngoldbaum · Pull Request #26430 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

ENH: add locking around initializing the argparse cache #26430

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
NOGIL: lock initializing the argparse cache
  • Loading branch information
ngoldbaum committed May 16, 2024
commit cc94991cc529296b8a624c24974e8c0da13e6bf6
31 changes: 23 additions & 8 deletions numpy/_core/src/common/npy_argparse.c
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,16 @@

#include "arrayfunction_override.h"

static PyThread_type_lock global_mutex;

int init_argparse_mutex() {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be: int init_argparse_mutex(void)?

global_mutex = PyThread_allocate_lock();
if (global_mutex == NULL) {
PyErr_NoMemory();
return -1;
}
return 0;
}

/**
* Small wrapper converting to array just like CPython does.
Expand Down Expand Up @@ -274,15 +284,20 @@ _npy_parse_arguments(const char *funcname,
/* ... is NULL, NULL, NULL terminated: name, converter, value */
...)
{
if (NPY_UNLIKELY(cache->npositional == -1)) {
va_list va;
va_start(va, kwnames);

int res = initialize_keywords(funcname, cache, va);
va_end(va);
if (res < 0) {
return -1;
if (NPY_UNLIKELY(!cache->initialized)) {
PyThread_acquire_lock(global_mutex, WAIT_LOCK);
if (!cache->initialized) {
va_list va;
va_start(va, kwnames);
int res = initialize_keywords(funcname, cache, va);
va_end(va);
if (res < 0) {
PyThread_release_lock(global_mutex);
return -1;
}
cache->initialized = 1;
}
PyThread_release_lock(global_mutex);
}

if (NPY_UNLIKELY(len_args > cache->npositional)) {
Expand Down
10 changes: 9 additions & 1 deletion numpy/_core/src/common/npy_argparse.h
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,13 @@
#include <Python.h>
#include "numpy/ndarraytypes.h"

#ifdef __STDC_NO_ATOMICS__
#define atomic_int volatile int
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caveat: this is "good enough" for x86 MSVC and also Window's "ARM64EC" platform target, but not other ARM64 targets on Windows.

If we need to support other Windows ARM64 targets, we'll need a bit more work. Hopefully, MSVC will support <stdatomic.h> soon, but I wouldn't hold my breath for it.

https://learn.microsoft.com/en-us/cpp/build/reference/volatile-volatile-keyword-interpretation

Copy link
Member Author
@ngoldbaum ngoldbaum May 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying! We don't test or build wheels for windows ARM64 right now, but I think we might need to support _M_ARM64 (see e.g. #22530). According to the MSVC docs, ARM64EC sets _M_X64.

I see pyatomic_msc.h has:

static inline uint64_t
_Py_atomic_load_uint64(const uint64_t *obj)
{
#if defined(_M_X64) || defined(_M_IX86)
    return *(volatile uint64_t *)obj;
#elif defined(_M_ARM64)
    return (uint64_t)__ldar64((unsigned __int64 volatile *)obj);
#else
#  error "no implementation of _Py_atomic_load_uint64"
#endif
}

And storing is implemented using e.g. _InterlockedCompareExchange64 from intrin.h.

That all seems reasonable to include in NumPy to support this.

Even with the experimental atomics support in MSVC 2022, they still set _ STDC_NO_ATOMICS _, so we'd need to go out of our way to detect and use it.

#else
#include <stdatomic.h>
#endif


/*
* This file defines macros to help with keyword argument parsing.
* This solves two issues as of now:
Expand All @@ -20,18 +27,19 @@
NPY_NO_EXPORT int
PyArray_PythonPyIntFromInt(PyObject *obj, int *value);


#define _NPY_MAX_KWARGS 15

typedef struct {
int npositional;
int nargs;
int npositional_only;
int nrequired;
atomic_int initialized;
/* Null terminated list of keyword argument name strings */
PyObject *kw_strings[_NPY_MAX_KWARGS+1];
} _NpyArgParserCache;

NPY_NO_EXPORT int init_argparse_mutex();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

init_argparse_mutex(void);


/*
* The sole purpose of this macro is to hide the argument parsing cache.
Expand Down
3 changes: 3 additions & 0 deletions numpy/_core/src/multiarray/_multiarray_tests.c.src
Original file line number Diff line number Diff line change
Expand Up @@ -2432,6 +2432,9 @@ PyMODINIT_FUNC PyInit__multiarray_tests(void)
return m;
}
import_array();
if (init_argparse_mutex() < 0) {
return NULL;
}
if (PyErr_Occurred()) {
PyErr_SetString(PyExc_RuntimeError,
"cannot load _multiarray_tests module.");
Expand Down
5 changes: 5 additions & 0 deletions numpy/_core/src/umath/umathmodule.c
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
#include "numpy/ufuncobject.h"
#include "numpy/npy_3kcompat.h"
#include "npy_pycompat.h"
#include "npy_argparse.h"
#include "abstract.h"

#include "numpy/npy_math.h"
Expand Down Expand Up @@ -346,5 +347,9 @@ int initumath(PyObject *m)
return -1;
}

if (init_argparse_mutex() < 0) {
return -1;
}

return 0;
}
0