8000 BUG: move reduction initialization to ufunc initialization by ngoldbaum · Pull Request #28123 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

BUG: move reduction initialization to ufunc initialization #28123

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jan 9, 2025

Conversation

ngoldbaum
Copy link
Member
@ngoldbaum ngoldbaum commented Jan 7, 2025

Fixes #28041.

Because the reduction initialization happens during ufunc initialization, this means that ufunc initialization now depends on the allocator context variables being initialized, so I moved those initialization steps up before ufunc initialization in the main _multiarray_umath module initialization function.

@ngoldbaum ngoldbaum added the 39 - free-threading PRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703) label Jan 7, 2025
@ngoldbaum
Copy link
Member Author

I'm also relying on the existing tests for reductions to cover this refactor - any suggestions for new tests?

@ngoldbaum
Copy link
Member Author

Refactored to call get_initial_from_ufunc during ufunc init, which indeed is a lot simpler and doesn't repeat as much code. There is a bit of extra boilerplate because I needed to heap-allocate a descrs array, but that's better than copy/pasting and is pretty straightforward.

Also added a test.

Copy link
Member
@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, nice to keep things together. I still think it would be good to agressively simplify the new code, unless I am missing something?

Nice that you have a threading test! (I am almost wondering if there is some pytest plugin that would run each tests N times in parallel, or if that could be a fun project... but, we certainly have a few bad test that might not survive that :), although a plugin could use a pytest.mark to run them only once)

}

PyArray_Descr **descrs = PyMem_Calloc(ufunc->nin + ufunc->nout,
sizeof(PyArray_Descr *));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of the places where the NPY_ALLOC_WORSPACE could also be used, but I don't much, this isn't performance critical.
(And it might be that at least with mimalloc all of this barely matters anyway.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But more importantly. We only ever use the initial value for reductions. So I think you might as well hard-code all of this (if you like up to the knowledge that get_initial_form_ufunc() only usees descrs[0] anyway).

So make descrs[3] or just &singleton... (of course only enter the branch if nin == 2 and nout == 1)

@ngoldbaum
Copy link
Member Author

I am almost wondering if there is some pytest plugin that would run each tests N times in parallel

There's pytest-run-parallel which SciPy is using heavily. Using it in NumPy may discover some things but I suspect it will mostly detect use of global state in tests and the signal-to-noise ratio may not be great. In particular the Python warnings module isn't thread-safe, so for SciPy the majority of the work ended up being skipping tests that use warnings filters - and there are a lot of tests for warnings!

For now I'm just adding one-off multithreaded tests for specific issues as I find them.

Thanks for the review!

ngoldbaum and others added 2 commits January 8, 2025 15:11
Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
@charris charris added the 09 - Backport-Candidate PRs tagged should be backported label Jan 9, 2025
Copy link
Member
@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM to me now, let's put it in.

@seberg seberg merged commit c412bed into numpy:main Jan 9, 2025
67 checks passed
@hawkinsp
Copy link
Contributor
hawkinsp commented Jan 9, 2025

I am almost wondering if there is some pytest plugin that would run each tests N times in parallel

There's pytest-run-parallel which SciPy is using heavily. Using it in NumPy may discover some things but I suspect it will mostly detect use of global state in tests and the signal-to-noise ratio may not be great. In particular the Python warnings module isn't thread-safe, so for SciPy the majority of the work ended up being skipping tests that use warnings filters - and there are a lot of tests for warnings!

For now I'm just adding one-off multithreaded tests for specific issues as I find them.

Thanks for the review!

There are ways to work around that (e.g. https://github.com/jax-ml/jax/pull/25626/files#diff-10c9477e1279e29626a5bbb8dccc529723495114f796523371741f472fc28e23) if you're sufficiently motivated. But it might be best just to wait for something like python/cpython#128300 to land upstream.

charris pushed a commit to charris/numpy that referenced this pull request Jan 9, 2025
* BUG: move reduction initialization to ufunc initialization

* MAINT: refactor to call get_initial_from_ufunc during init

* TST: add test for multithreaded reductions

* MAINT: fix linter

* Apply suggestions from code review

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* MAINT: simplify further

---------

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label Jan 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
39 - free-threading PRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: Race in computing initial value for reductions under free-threading
4 participants
0