
BUG: NumPy 2.0 mixed precision in-place operation performance regression #26183

Closed
pijyoi opened this issue Mar 31, 2024 · 4 comments

@pijyoi
Contributor
pijyoi commented Mar 31, 2024

Describe the issue:

When an in-place ndarray operation is performed with a higher-precision NumPy scalar value, it takes about 3x longer on NumPy 2.0rc1 than on 1.26.4.

I have read:
https://numpy.org/devdocs/numpy_2_0_migration_guide.html#changes-to-numpy-data-type-promotion

Setting

np._set_promotion_state("weak_and_warn")

does give the warning

UserWarning: result dtype changed due to the removal of value-based promotion from NumPy. Changed from float32 to float64.

The question I have is whether this is considered counter-intuitive behavior for an in-place operation. The slower runtime suggests that a temporary ndarray was created, which defeats the purpose of an in-place operation (i.e. avoiding memory allocation for temporaries).
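A minimal sketch of the promotion change (assuming NumPy >= 2.0 for the float64 result; under 1.x value-based promotion the first result would also be float32):

```python
import numpy as np

# Under NEP 50 (NumPy >= 2.0), a NumPy scalar keeps its own dtype,
# so float32 array - float64 scalar promotes the result to float64.
a = np.ones(3, dtype=np.float32)
print((a - np.float64(0)).dtype)  # float64 on NumPy >= 2.0

# A plain Python float is "weak" and adopts the array's dtype instead.
print((a - 0.0).dtype)  # float32
```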

Reproduce the code example:

import numpy as np
data = np.full((4000, 6000), 42, dtype=np.float32)
%timeit global data; data -= np.float64(0)

Error message:

(with NumPy 2.0rc1)
27.4 ms ± 207 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

(with NumPy 1.26.4)
9.5 ms ± 299 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Python and NumPy Versions:

2.0.0rc1
3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]

1.26.4
3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]

Runtime Environment:

[{'numpy_version': '2.0.0rc1',
  'python': '3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]',
  'uname': uname_result(system='Windows', node='LAPTOP-GP728CM2', release='10', version='10.0.22621', machine='AMD64')},
 {'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
                      'found': ['SSSE3', 'SSE41', 'POPCNT', 'SSE42', 'AVX', 'F16C', 'FMA3', 'AVX2'],
                      'not_found': ['AVX512F', 'AVX512CD', 'AVX512_SKX', 'AVX512_CLX', 'AVX512_CNL', 'AVX512_ICL']}}]

Context for the issue:

This performance regression was found in pyqtgraph
pyqtgraph/pyqtgraph#2974

@mhvk
Contributor
mhvk commented Mar 31, 2024

The difference arises because scalars are no longer treated differently from arrays, so you now get the speed you would previously have gotten for something like data -= np.array([0.]) (I verified this gives the slow result on older NumPy as well). Only for a Python float will the float32 loop be chosen. You can still force the choice with np.subtract(data, np.array([0.]), out=data, dtype=data.dtype).

A separate question is whether in general for an in-place operation one should effectively do dtype=in.dtype (if a suitable loop exists). It might well break other things, though...

p.s. Note that even now the operation is done buffered, not via a full temporary copy, so memory should not be a big worry.
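Both work-arounds above can be sketched together; this is illustrative (the np.float64(0) operand in option 2 stands in for any higher-precision scalar or 1-element array):

```python
import numpy as np

data = np.full((4000, 6000), 42, dtype=np.float32)

# Option 1: use a Python float, which is "weak" under NEP 50 and
# adopts the array's dtype, so the fast float32 loop is chosen.
data -= 0.0

# Option 2: force the loop explicitly with out= and dtype=,
# even when the other operand is float64.
np.subtract(data, np.float64(0), out=data, dtype=data.dtype)

print(data.dtype)  # float32 either way
```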

@rgommers
Member
rgommers commented Apr 1, 2024

A separate question is whether in general for an in-place operation one should effectively do dtype=in.dtype (if a suitable loop exists). It might well break other things, though...

That's probably a good idea - xref gh-25621 for this.

@mhvk
Contributor
mhvk commented Apr 1, 2024

Ah, yes, and I see I even read that issue at the time -- but obviously forgot about it. Your idea of an experiment seems good.

@mhvk
Contributor
mhvk commented Apr 1, 2024

@pijyoi - since there is already a larger issue open, and I gave what hopefully are useful work-arounds, I'll close this one. Please do feel free to reopen if there is more to add!

p.s. Thanks for reporting! Even though the change to no longer treat scalars differently from arrays is, I think, very much for the better, it is good to be aware of its more unexpected consequences as well.

@mhvk mhvk closed this as completed Apr 1, 2024