-
-
Notifications
You must be signed in to change notification settings - Fork 11k
DOC: add docs on thread safety in NumPy #27223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
[skip azp][skip actions][skip cirrus]
8dddae1
to
908169c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly looks great, but wondering about the new legacy user dtype bit.
@@ -71,3 +70,10 @@ and set the ``ndarray.base``. | |||
|
|||
.. versionchanged:: 1.25.2 | |||
This variable is only checked on the first import. | |||
|
|||
Legacy User DTypes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not feel very logical here. Better at the end of thread safety? Also, isn't this only an issue if one defines legacy user types, which I'd think very few programs do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could combine the two pages maybe?
It's here because it's global state.
I agree it's extremely niche and it's a legacy feature that we aren't really planning to advocate for people to use going forward. I could also not mention it, no one has cared or noticed up until now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could put the info in PyArray_RegisterDataType
and link from the free-threaded section?
(Right now the user-dtype won't be compiled with free-threaded support anyway, but it is tempting of course... And you probably get away with it since most will be probably be added at import time when no threads are active.)
I don't think it needs to be here really. It is global state, but it modifies NumPy runtime behavior very explicitly (and ideally not at all unless you use that dtype).
(I.e. maybe the global state name is not great but not sure what is better. "Global config"?)
It is possible to share NumPy arrays between threads, but extreme care must be | ||
taken to avoid creating thread safety issues when mutating shared arrays. If | ||
two threads simultaneously read from and wri 8000 te to the same array, at best they | ||
will see inconsistent views of the same array data. It is also possible to crash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be worth paying special attention to the wording here, given the potential for confusion with array views (in the NumPy sense of the word "view"). For example, maybe "shared arrays" would be better expressed as "arrays that share the same underlying data", or something similar but less wordy. Similarly, the "inconsistent views of the same array data" might be confusing.
Overall certainly not a blocker and not worth bike-shedding over at this stage. I'd be in favor of getting this in and worrying about refining later!
[skip azp][skip actions][skip cirrus]
29908db
to
26cba75
Compare
Latest push responds to the review comments. Hopefully the wording is clearer now, I tried to emphasize that I was talking about arrays shared between threads. |
Thanks Nathan. |
Happy to add more content if anyone has specifics they'd like to see.