8000 PERF: add a fast path to ufunc type resolution by ngoldbaum · Pull Request #27859 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

Conversation

@ngoldbaum
Copy link
Member

Fixes #27786.

Our group is thinking about ways to add better benchmarking so we don't regress on issues like this, see Quansight-Labs/free-threaded-compatibility#117. For now if it's alright I'd just like to merge this fix.

With the test script in the issue and this PR applied, I get the following scaling:

mflops_array_length_1000

generating this made my laptop pretty hot!

@ngoldbaum ngoldbaum added 09 - Backport-Candidate PRs tagged should be backported 39 - free-threading PRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703) labels Nov 26, 2024
@ngoldbaum ngoldbaum added this to the 2.1.4 release milestone Nov 26, 2024
PyArray_DTypeMeta *op_dtypes[],
npy_bool legacy_promotion_is_possible)
{
PyObject *info = PyArrayIdentityHash_GetItem(ufunc->_dispatch_cache,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is thread-safe because we may read from the hash table while another thread is concurrently modifying it, including possibly resizing the table.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😓 That's a good point.

Is there a thread-safe way to do it without an RW lock?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make PyArrayIdentityHash a concurrent hash table. That's generally hard, but there are a few things that make this specific case easier:

  • We only care about concurrent read performance
  • Entries are never deleted or overwritten

But we'd still have to deal with hash table resizing, which makes it more complex.

I think a RW lock will be simpler as long as it provides good enough read performance.

@ngoldbaum
Copy link
Member Author

I'm going to try again using an RW-lock

@ngoldbaum ngoldbaum closed this Dec 2, 2024
@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label Dec 3, 2024
@charris charris removed this from the 2.1.4 release milestone Dec 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

39 - free-threading PRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG: poor multithreaded performance scaling for small arrays (nogil)

3 participants

0