-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
ENH: minmax function #9836
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Can I try to implement this function as my first contribution? |
Tricky to make this a ufunc, because |
Sorry if I'm being a bit slow here, but wouldn't it be possible to just return an array containing two values? Admittedly I've not dug deep into ufuncs internal workings. So am probably missing something. |
The issue is that You could of course just implement this with |
So, did I understand right that |
Correct, but it would by far be preferable to adapt ufuncs to make it possible. You could implement it as a |
This is one of quite a number of requests to combine ufuncs in various ways (here, running two in parallel; often, two in series); I still like the idea of making it possible to programmatically make such combined ufuncs. It should actually not be that hard -- for some thoughts, see #8889 (comment). Now if only I had time to work on it... |
Since this can be done without creating a new type of ufuncs, would a proposal that added |
@jakirkham - rereading this, I think @eric-wieser is correct that it is in fact not easy to implement |
Why is it not easy to implement As to the usefulness of |
It's a little messy to implement It's harder to implement I might prefer the name |
It's true C is pretty rough when it comes to parameterizing functions over types. My impression is that most C programs leverage Certainly Cython helps as well. Though there is still some boilerplate required to glue everything together. This is mainly caused by Cython not supporting arbitrary dimensions for a memory view (e.g. could be 1-D or 3-D). One option might be to rip the C generated code out from Cython and use that as a base for NumPy. Would need some tidying though. While it is not common to use Cython in NumPy, some random number generation code does use it and it is always possible to ship the C. Perhaps the lower maintenance burden is already a reason for NumPy to start thinking about it. 😉 Have a WIP implementation of axes in PR ( jakirkham/cyminmax#31 ). Though have no idea how well it performs yet. The approach is at least similar to what things
8000
like So |
@jakirkham - my comment really wasn't that In this respect, in your example of normalizing an image, one also needs to subtract |
Sure. Fusing operations together would be very powerful and widely useful. Though that feels like a different topic to me. Is there already an issue open on this topic? Would be very interested in that discussion. :) Would add that the |
It certainly wasn't to me. This makes a good case for it needing to be a Do you mean it does something like this? max = min = data[0]
for i in range(1,n, 2):
try:
a, b = data[i:i+2]
except ValueError:
a = b = data[i]
if b > a:
a, b = b, a
if a > max:
max = a
if b < min
min = b |
I think that if we add one of these to numpy, we should add both:
|
Yes, correct. It's true. It might not have been obvious. Hence why I drew attention to it. ;) Here's the equivalent code in
Agreed. The elementwise variant looks nice as well. |
to get performance out of min/max on pcs you have to abandon your CS comparison count ideas :) tldr: write it explicitly in compiler intrinsics to get good code, see the min/max implementation in numpy in numpy/core/src/umath/simd.inc.src Some more hints, you can probably get axis reduction for free via np.lib._ureduce like median does, I also could have sworn we already had a PR for this lying around since years, but that might have also been for sumabs/sumsquared etc not minmax... |
Those ideas at least work for |
Thanks for the pointers on intrinsics, @juliantaylor. IIUC that only applies for integers (thinking of the packed integer intrinsics), correct? Or are there intrinsics for floats as well? |
cc @jni (as we were discussing this previously) |
Since it is related. I just wanted to add that it would also be great to get the according indices at the same time. At least by setting an according input flag. So maybe something like: vmin = np.min(A,axis=1)
vmin, idmin = np.min(A, ret_ids = True, axis=1)
(vmin, idmin), (vmax, idmax) = np.minmax(A, ret_ids = True, axis=1) |
Today one would probably use |
For what it's worth, I implemented a simple |
For what it's worth, I implemented a SIMD-optimized minmax function for numpy arrays here: https://github.com/nomonosound/numpy-minmax It's mainly optimized for float32, but is also partially optimized for int16 |
If minmax cannot be a reduction like min and max are, and at some point another function has to be implemented, the gain provided by this function would basically be due to the fact that the number of "iterative scanning" would be reduced to one. I think that function should not be called minmax unless it was a reduction, and Therefore I think a more complete signature in the API could be like : This function could reduce the cost (and the code length) of use cases where np.max, np.min, np.argmax, np.argmin are used sequentially, including their nan-ignoring equivalents. |
Wonder if someone could do very a mild tweak to the array before passing it to a (private) import numpy as np
a = np.random.random((10, 11))
av = np.lib.stride_tricks.as_strided(a,
shape=(2, *a.shape),
strides=(0, *a.strides),
writeable=False)
assert (av[0] == a).all()
assert (av[1] == a).all()
# a_min, a_max = np._minmax(av) |
It is worth noting that In [1]: import numpy as np
In [2]: np.divmod
Out[2]: <ufunc 'divmod'> FWIW |
Indeed, |
(i) I'm a definite yes |
For the implementation, clearly there is already the one provided by @WarrenWeckesser (see #9836 (comment)), so one could start there. But the trick will be to actually get it to good performance with vectorization, etc., because this will not be able to use much of the machinery in |
FYI & FWIW: The core calculations that I implemented are in https://github.com/WarrenWeckesser/ufunclab/blob/main/src/minmax/minmax_gufunc.h (which also includes calculations for My implementation is not optimized; it is not much more than the serial loop in C++ that you would expect. I have experimented with a SIMD implementation in C++ using xsimd in https://github.com/WarrenWeckesser/experiments/blob/main/c%2B%2B/xsimd/reductions/minmax.hpp (no NumPy, just C++). For NumPy, one would use the Google Highway library. (Adding SIMD implementations to ufunclab is on my aspirational to-do list, but not my immediate to-do list.) I'd be happy to work on adding |
Would be really handy to have a function that computes and returns both the minimum and maximum of an array. This should be substantially faster than calling
min
and thenmax
on the same array.FWIW here's an SO question from some time ago that has gotten a fair bit of attention.
The text was updated successfully, but these errors were encountered: