Merge pull request #27233 from charris/backport-27223 · numpy/numpy@7443dcc · GitHub

Commit 7443dcc

Merge pull request #27233 from charris/backport-27223
DOC: add docs on thread safety in NumPy
2 parents c080180 + 395a81d commit 7443dcc

5 files changed

+70
-9
lines changed

doc/source/reference/c-api/array.rst

Lines changed: 7 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1264,6 +1264,13 @@ User-defined data types
12641264
registered (checked only by the address of the pointer), then
12651265
return the previously-assigned type-number.
12661266
1267+
The number of user DTypes known to numpy is stored in
1268+
``NPY_NUMUSERTYPES``, a static global variable that is public in the
1269+
C API. Accessing this symbol is inherently *not* thread-safe. If
1270+
for some reason you need to use this API in a multithreaded context,
1271+
you will need to add your own locking; NumPy does not ensure new
1272+
data types can be added in a thread-safe manner.
1273+
12671274
.. c:function:: int PyArray_RegisterCastFunc( \
12681275
PyArray_Descr* descr, int totype, PyArray_VectorUnaryFunc* castfunc)
12691276

doc/source/reference/global_state.rst

Lines changed: 8 additions & 9 deletions
@@ -1,14 +1,13 @@
11
.. _global_state:
22

3-
************
4-
Global state
5-
************
6-
7-
NumPy has a few import-time, compile-time, or runtime options
8-
which change the global behaviour.
9-
Most of these are related to performance or for debugging
10-
purposes and will not be interesting to the vast majority
11-
of users.
3+
****************************
4+
Global Configuration Options
5+
****************************
6+
7+
NumPy has a few import-time, compile-time, or runtime configuration
8+
options which change the global behaviour. Most of these are related to
9+
performance or debugging and will not be interesting to the
10+
vast majority of users.
1211

1312

1413
Performance-related options

doc/source/reference/index.rst

Lines changed: 1 addition & 0 deletions
@@ -58,6 +58,7 @@ Other topics
5858

5959
array_api
6060
simd/index
61+
thread_safety
6162
global_state
6263
security
6364
distutils_status_migration
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
1+
.. _thread_safety:
2+
3+
*************
4+
Thread Safety
5+
*************
6+
7+
NumPy supports use in a multithreaded context via the `threading` module in the
8+
standard library. Many NumPy operations release the GIL, so unlike many
9+
situations in Python, it is possible to improve parallel performance by
10+
exploiting multithreaded parallelism in Python.
11+
12+
The easiest performance gains happen when each worker thread owns its own array
13+
or set of array objects, with no data directly shared between threads. Because
14+
NumPy releases the GIL for many low-level operations, threads that spend most of
15+
the time in low-level code will run in parallel.
16+
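The pattern described above can be sketched as follows. This is a minimal illustration, not part of the NumPy documentation: the function name ``worker``, the array sizes, and the thread count are all assumptions chosen for the example. Each task creates and uses only its own arrays, so no locking is needed.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def worker(seed):
    # Hypothetical task: every call builds its own private array,
    # so nothing is shared between threads.
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((200, 200))
    # The matrix product runs in low-level code; many such NumPy
    # operations release the GIL while they execute.
    return float(np.trace(a @ a.T))


# Run eight independent tasks on four worker threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(worker, range(8)))
```

Whether this actually runs faster than a serial loop depends on how much of the time is spent in GIL-releasing code, so it is worth measuring for the workload at hand.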
17+
It is possible to share NumPy arrays between threads, but extreme care must be
18+
taken to avoid creating thread safety issues when mutating arrays that are
19+
shared between multiple threads. If two threads simultaneously read from and
20+
write to the same array, they will at best produce inconsistent, racy results that
21+
are not reproducible, let alone correct. It is also possible to crash the Python
22+
interpreter by, for example, resizing an array while another thread is reading
23+
from it to compute a ufunc operation.
24+
25+
In the future, we may add locking to ndarray to make writing multithreaded
26+
algorithms using NumPy arrays safer, but for now we suggest focusing on
27+
read-only access of arrays that are shared between threads, or adding your own
28+
locking if you need both mutation and multithreading.
29+
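As a rough sketch of "adding your own locking", the example below guards every read-modify-write of a shared array with a ``threading.Lock``. The names ``shared``, ``lock``, and ``add_ones`` are hypothetical and exist only for this illustration; NumPy itself provides no such lock.

```python
import threading

import numpy as np

shared = np.zeros(4, dtype=np.int64)
lock = threading.Lock()  # user-managed lock guarding `shared` (an assumption of this sketch)


def add_ones(n_iter):
    for _ in range(n_iter):
        # Hold the lock for the whole read-modify-write so that no other
        # thread can interleave its own update with this one.
        with lock:
            shared[:] = shared + 1


threads = [threading.Thread(target=add_ones, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock held around each update, every element ends at 4000 (four threads times 1000 increments). Without it, updates from different threads could overwrite each other.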
30+
Note that operations that *do not* release the GIL will see no performance gains
31+
from use of the `threading` module, and instead might be better served with
32+
`multiprocessing`. In particular, operations on arrays with ``dtype=object`` do
33+
not release the GIL.
34+
35+
Free-threaded Python
36+
--------------------
37+
38+
.. versionadded:: 2.1
39+
40+
Starting with NumPy 2.1 and CPython 3.13, NumPy also has experimental support
41+
for Python runtimes with the GIL disabled. See
42+
https://py-free-threading.github.io for more information about installing and
43+
using free-threaded Python, as well as information about supporting it in
44+
libraries that depend on NumPy.
45+
46+
Because free-threaded Python does not have a global interpreter lock to
47+
serialize access to Python objects, there are more opportunities for threads to
48+
mutate shared state and create thread safety issues. In addition to the
49+
limitations about locking of the ndarray object noted above, this also means
50+
that arrays with ``dtype=object`` are not protected by the GIL, creating data
51+
races for Python objects that are not possible outside free-threaded Python.
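
Code that needs to know whether these extra precautions apply can check at runtime whether the GIL is active. The sketch below assumes CPython 3.13 or later for ``sys._is_gil_enabled()`` and falls back to reporting the GIL as enabled on older interpreters; the helper name ``gil_enabled`` is hypothetical.

```python
import sys


def gil_enabled():
    # sys._is_gil_enabled() exists on CPython 3.13+; on older
    # interpreters the GIL is always present, so report True.
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else bool(check())


if gil_enabled():
    message = "GIL is enabled: object arrays are serialized by the GIL"
else:
    message = "GIL is disabled: guard shared object arrays with your own locks"
```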

doc/source/user/c-info.beyond-basics.rst

Lines changed: 3 additions & 0 deletions
@@ -268,6 +268,9 @@ specifies your data-type. This type number should be stored and made
268268
available by your module so that other modules can use it to recognize
269269
your data-type.
270270

271+
Note that this API is inherently thread-unsafe. See :ref:`thread_safety` for more
272+
details about thread safety in NumPy.
273+
271274

272275
Registering a casting function
273276
------------------------------
