Comparisons between numpy scalars and python scalars are very slow #14415
Comments
It is a bit more complex than that, since numpy scalars behave like arrays in comparisons, i.e. you can compare them to a list and the list will be coerced. That said, there seem to be no fast paths for comparisons, meaning that everything is always coerced to an array first. Such fast paths could be added to narrow the performance gap.
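A small sketch of the array-like behaviour described above. The variable names are only illustrative, and the printed type names can differ between NumPy versions:

```python
import numpy as np

scalar = np.int32(2)

# Comparing with a Python int yields a single NumPy boolean.
single = scalar == 2
print(type(single).__name__, single)

# Comparing with a list coerces the list to an array and compares elementwise,
# so the result is an array rather than a single boolean.
elementwise = scalar == [1, 2, 3]
print(type(elementwise).__name__, elementwise.tolist())
```

This is why a scalar comparison cannot simply delegate to `int()`: the other operand may be something that has to be treated as an array.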
Hey @seberg, I can see that this line is where the array is initialized: numpy/numpy/core/src/multiarray/scalartypes.c.src, line 1040 in 4e61c89.
Do you think it would be worthwhile to add a check for whether we need to create an array for the comparison? I'm willing to try some edits and bring back performance data so we can make a call.
@ganesh-k13 there is a lot more to unravel there, since the important cases are actually handled here: numpy/numpy/core/src/umath/scalarmath.c.src, line 1299 in 5d934a3.
I had an older branch where I started to think about cleaning it up. I think it is this one: https://github.com/numpy/numpy/compare/master...seberg:scalar-binop-giveup-in-fallback?expand=1, which might also make this path faster. There is a lot going on there though (and the branch doesn't currently apply to master). I have given up on the branch for now, not because it's not worth cleaning up, but because I figured it is not a necessary cleanup for my dtypes work. If you want to look at this and get confused, I would be happy to help you out or do a video call.
Thanks a lot for the info @seberg, I'll explore that part more and get back 👍.
I am sure there is more to do here, but note that with NEP 50 activated, it is mostly fast now. Not quite as fast as it could be, of course, but faster. The reason is that the NumPy dtype is preserved, i.e. with:
np._set_promotion_state("weak")  # On current main/NumPy 1.24
The timings for me are (int64 would be native, not int32):
So if we do weak promotion for scalars, further improvements can be made, but the main issue here is solved.
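A hedged sketch of what preserving the NumPy dtype looks like in practice. It assumes the private `np._set_promotion_state` helper quoted above, which is not available in every NumPy release, so the lookup is done defensively; the variable names are illustrative only:

```python
import numpy as np

# Private helper quoted in the comment above. It does not exist in every
# NumPy release, so look it up defensively instead of calling it directly.
set_state = getattr(np, "_set_promotion_state", None)
if set_state is not None:
    set_state("weak")

value = np.uint8(1)

# Under weak promotion (NEP 50) the Python int adapts to the NumPy scalar's
# dtype; on releases that use legacy promotion the printed dtype may differ.
total = value + 2
print(total, total.dtype)

# The comparison result is the same under either promotion state.
is_equal = value == 1
print(is_equal)
```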
Comparisons between numpy scalars and python scalars are very slow. It's an order of magnitude faster to do int(npint) == pyint than npint == pyint. It seems to me that this should at worst be no slower, as numpy could just do int() for me?
The slowdown was apparent with all the types I tried, including uint8, int32, float32 and uint64, but not int64 and float64, which are fast.
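One way to measure the gap is sketched below. This is not the original reproduction from the issue: the helper `time_comparisons` and the repeat count are hypothetical choices, and the numbers will depend on the NumPy version and platform.

```python
import timeit

import numpy as np


def time_comparisons(np_value, py_value, number=20_000):
    """Return (direct, converted) timings in seconds for `number` comparisons."""
    # Direct comparison between the NumPy scalar and the Python scalar.
    direct = timeit.timeit(lambda: np_value == py_value, number=number)
    # Convert to a Python int first, then compare two Python ints.
    converted = timeit.timeit(lambda: int(np_value) == py_value, number=number)
    return direct, converted


# Integer types mentioned in the issue; int64 is included for contrast.
for dtype in (np.uint8, np.int32, np.int64, np.uint64):
    direct, converted = time_comparisons(dtype(5), 5)
    print(f"{dtype.__name__:>7}: direct={direct:.4f}s  via int()={converted:.4f}s")
```

For floating-point types the same pattern applies with `float()` in place of `int()`.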
Reproducing code example
Numpy/Python version information: