BUG: np.min does not always propagate NaNs #10370
Comments
I think another point here is that different code paths can trigger different behaviour (not really NumPy's fault, I think), e.g. due to using AVX or such. I have sometimes seen vectorized code fail to set the CPU error flags, and whether those paths are taken can appear a bit random. Not sure how much that has to do with it, but I suspect it does; NumPy normally does not do much about floating point errors beyond checking whether something happened. |
And as Chuck said, unfortunately a lot of this floating point error flag behaviour differs between systems. |
Ok, understood why the error only occurs on some installs and not others.
Given that this is one of the allowed use cases for np.min, from a usage point of view I wouldn't expect any warnings/errors to be emitted at all. (There are obviously other ways of achieving this, e.g. |
No warnings at all seems a plausible change, I guess. |
+1 for no warnings at all |
Hi, I stumbled upon a similar inconsistency: >>> import numpy
>>> import warnings
>>> warnings.simplefilter("always", category=RuntimeWarning)
>>> numpy.min(numpy.full((7,), numpy.nan, dtype=numpy.float64))
nan
>>> numpy.min(numpy.full((8,), numpy.nan, dtype=numpy.float64))
/home/user/env/lib/python3.6/site-packages/numpy/core/_methods.py:29: RuntimeWarning: invalid value encountered in reduce
return umr_minimum(a, axis, None, out, keepdims)
nan Why does a shape of 8 or more raise that RuntimeWarning? |
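The size threshold comes from the reduction only switching to the vectorized (SSE2) code path for larger arrays. A small probe sketch to map the boundary on a given install — hedged: the exact cutoff, and whether the warning fires at all, are platform- and build-dependent:

```python
import warnings
import numpy as np

# Probe successive sizes of an all-NaN array: on affected installs the
# RuntimeWarning only starts appearing once the reduction switches to
# the vectorized code path (around size 8 in the report above).
for n in range(1, 12):
    a = np.full((n,), np.nan, dtype=np.float64)
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always", category=RuntimeWarning)
        result = np.min(a)
    warned = any(issubclass(w.category, RuntimeWarning) for w in caught)
    print(f"size {n:2d}: min={result}, warned={warned}")
    assert np.isnan(result)  # an all-NaN input must reduce to NaN
```

Regardless of whether the warning appears, the reduction of an all-NaN array must be NaN; the assertion only guards that invariant, not the warning behaviour.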
@NotSqrt @seberg I'm seeing similar behaviour where min is not propagating NaNs correctly at all, when the size of the input array gets to size 8:
Shows this rather odd behaviour on OSX & Windows, but not Linux — fresh virtualenv using python 2.7.13 and numpy 1.14.2:
See the two "this should be nan as per docs: 19.0" lines in there? Also n.b. the warning does not appear on numpy 1.13.1 (which is where I first observed this behaviour.) |
@kippr Where did you get NumPy? |
I'm going to guess the eight element boundary may be related to AVX512 and the problem is some combination of compiler and cpu. What cpus/compiler are you seeing the problem on? |
@juliantaylor Thoughts? |
Alignment might also be having an effect. |
Hi @charris Thanks for taking a look at my post. To answer your questions: CPU: NumPy was installed through pip on my mac in a fresh virtualenv (python 2.7.13 installed through homebrew), so I guess it uses all the default compiler flags etc.? I just recreated the virtualenv and reran pip install; here is what looks pertinent to me, though a lot of messages come by on install (full log attached). Please let me know if there is other info I can fish out of the build directory, or if there are compile flags to try.
Thanks |
P.s. probably no surprise, but the same behaviour appears with max:
|
I am using the same numpy version (on Arch) with a CPU that also has AVX2 and SSE4.2 capabilities, and am not seeing it on Linux. Maybe it has something to do with mac/clang? EDIT/Amendment: (I tried provoking different alignments; I did not check the warning part, but I do not see the wrong-result part.) |
I'm guessing that there is a cpu flag that is not getting set correctly, probably compiler related. |
We should have a test that turns up this problem, if only to track it. @kippr What does
do on your installation? |
Hi @charris That seems ok:
But I tried a few more combos and found the following (see last one, 8 values and axis=1):
Quite mysterious.. not sure if that helps understand cause at all. @seberg - yes that's right, I observed this behaviour on my macs and also on windows, but linux was fine. Thanks |
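The pasted snippet was stripped by the issue tracker; based on the SSE2 analysis later in the thread, the failing "8 values and axis=1" case was presumably along these lines (a hypothetical reconstruction, not the original paste):

```python
import numpy as np

# Every row of np.diagflat([np.nan] * 8) holds exactly one NaN, so per
# the np.min docs each row minimum must be NaN. On the affected
# mac/clang builds some row minima reportedly came back as 0.0 instead.
m = np.diagflat([np.nan] * 8)
row_mins = np.min(m, axis=1)
print(row_mins)  # expected on a correct build: all entries NaN
```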
@kippr This is a really disturbing bug, as it affects a basic operation, and the problem probably doesn't originate with NumPy. We can try turning off vectorization on Mac and Windows and see if that helps. Do you only see this problem on Python 2.7? |
I cannot see anything in NumPy that would have changed this behavior for 1.14. Maybe OpenBLAS is fiddling with the controls ... |
@juliantaylor @VictorRodriguez Thoughts? |
Hi @charris I see this behaviour also in python 3 on my Mac:
N.b. I first noticed this behaviour in NumPy version 1.13 under python 2.7 on my Mac, so it was not a regression introduced in 1.14:
And reproduced it in cygwin / windows:
But don't see this problem in linux:
|
@charris I have seen on my journey through performance optimizations for numpy that OpenBLAS and compiler flags can affect the behavior of numpy/scipy (based on my experience). The first step we could take is to turn this into a C test (using the OpenBLAS libraries) so we can isolate the behavior and see if it can be replicated. Also, just to double check: this is only on Mac and Windows, right? I can't see this behavior on Linux. (FWIW the AVX patches land in 1.15, so they should not affect this.) @kippr how did you build your numpy on mac/windows? regards |
This was via pip install on both platforms, a fresh virtualenv in the Mac case. See above where I pasted the log for pip including compilation output for python 2.7 install. (Python 3 pip install seems to be a wheel.) Cheers Kris |
Hi folks Thanks |
On the mac (and maybe other platforms as well) the bug is occurring because the compiler (on mac it's clang / llvm not gcc by default) is reordering the SSE2 commands in a problematic way. The bug in: np.min(np.diagflat([np.nan]*8), axis=1) is happening because the code that is being run is the sse2_minimum_DOUBLE generated from: numpy/numpy/core/src/umath/simd.inc.src Line 1020 in d7d5cb3
Specifically let's look at this code (I've auto expanded the SSE2 form) inside the function in line 1041: }
c1 = _mm_min_pd(c1, c2);
if (npy_get_floatstatus() & NPY_FPE_INVALID) {
*op = NPY_NAN;
}
else {
npy_double tmp = sse2_horizontal_min___m128d(c1);
....
} The compiler does not understand that the c1 assignment and npy_get_floatstatus() are related, and thus reorders the code to: }
if (npy_get_floatstatus() & NPY_FPE_INVALID) {
*op = NPY_NAN;
}
else {
c1 = _mm_min_pd(c1, c2);
npy_double tmp = sse2_horizontal_min___m128d(c1);
....
} which is of course nonsense... I do not know the recommended way to prevent this reordering under optimizations. (I think other compilers have the STDC FENV_ACCESS pragma for this?) Since the changes in this code are 5 years old, maybe the bug is caused by a newer version of clang with different optimizations? |
Thanks @tzickel I've not written C since a high school class many moons ago, so I'm definitely not going to figure out the smart way to solve this, but I did try a few things to force the compiler to evaluate in the original order by changing the if statement of the above to:
And indeed, once I do this, the bug disappears. (I also tried a bunch of other less dumb forms to get the compiler to evaluate _mm_min_pd ahead of the if statement, for example by putting references to c1 on both sides of the if statement and after the if/else block, all to no avail. I suspect that it either reordered them anyway but always ran the c = _mm_min_pd assignment, or figured that my calls were actually no-ops of one form or another.) But in any case I can confirm you've identified the bug that I'm seeing; let's hope someone has a good way to hint to the compiler to leave the order of _mm_min_pd and npy_get_floatstatus alone. |
@tzickel Out of curiosity, at what level of optimization does the problem appear? We used to limit ourselves to |
Similar to this gcc bug, I think: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=6065 |
It seems like setting |
For all your questions, just go to the link here: https://godbolt.org/g/AJRdRQ (*it seems shared, so maybe copy the input to a new window) and try changing the compiler version, compiler flags and code, and see the result on the fly... This happens even with -O1 on both of the latest compilers. The #pragma does not seem to do anything.... |
That version is very complicated; there is too much assembler and it is hard to relate the assembler to the C code. The original version was much simpler and also demonstrated the problem. See also the PR |
hmm.... can anybody tell me on which OS / compiler was numpy-1.14.3-cp27-cp27mu-manylinux1_x86_64.whl (from the pypi archive) compiled on ? I think there might be another (related) issue hiding here as well. |
@tzickel I believe the default ubuntu trusty compiler is gcc 4.8, I can't see that anything more recent was installed. |
@charris the manylinux container runs Centos5 with gcc 4.2 installed. |
@ngoldbaum Thanks for the info. We will need to build wheels for Python 3.7 soon. Will that be as easy as adding an |
I think you'll need to wait until the manylinux tooling gets updated. That took a few weeks after python3.6 came out IIRC. @njsmith probably knows more. |
@ngoldbaum What drove the choice of |
I believe the goal was to enable compiling with a glibc that is old enough that binary compatibility issues will not be a problem in practice
I don't know. I also don't know how numpy manages building wheels. I'm using Matthew Brett's multibuild project for my projects and I'll need to wait until that's updated to build python3.7 wheels for my projects. |
Ok, not sure if it's the final related issue, but I think I've found another issue in that SSE code: In numpy 1.14.0 some code was added to throw the runtime warnings if there are FP errors: But if we look at the SSE implementation again: numpy/numpy/core/src/umath/simd.inc.src Line 1020 in d7d5cb3
We can see it passes only the aligned middle part of the array through the FPU; the header and trailer (which aren't SSEable because they aren't memory aligned) are checked for NaN manually and do not go through the FPU. Thus only the middle part will trigger a NaN warning, while the other parts (equivalent in input and output) won't. Is that ok? np.min([1, np.nan, 1, 1, 1, 1, 1, 1]) won't trigger a runtime warning |
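The asymmetry can be probed directly by placing a NaN in the head of a buffer versus in its middle. This is only a sketch: whether the two cases actually differ depends on the build and on how the buffer happens to be aligned, so the code prints what it sees rather than asserting a particular warning pattern.

```python
import warnings
import numpy as np

def min_warns(a):
    """Return (result_is_nan, runtime_warning_emitted) for np.min(a)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always", category=RuntimeWarning)
        r = np.min(a)
    return np.isnan(r), any(issubclass(w.category, RuntimeWarning)
                            for w in caught)

# NaN in the head (handled by the scalar isnan checks) vs. NaN in the
# middle (handled by the SSE loop, which goes through the FPU flags).
head = np.ones(64); head[0] = np.nan
mid = np.ones(64);  mid[32] = np.nan
print("head:", min_warns(head))   # result must be NaN; warning may be absent
print("mid: ", min_warns(mid))    # result must be NaN; warning may appear
```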
Yes, but my point is that #9020 was not covering all of the possible cases. One of them is this SSE code (which subverts this mechanism for some optimization). As for #11029, I'm trying to figure out why, both in posts here and there, besides the NaN propagation bug, sometimes warnings show up and sometimes they don't |
Another question: since NaN is the outcome of both min/max whenever it exists in the input, and we already check isnan / invalid, shouldn't it fast-exit upon discovering the first instance of NaN? |
@tzickel none of the reduction operations early-exit. They might in the future if they get refactored as gufuncs. |
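For illustration, the early-exit idea discussed above amounts to something like the following chunked scan (a Python sketch only; as noted, NumPy's actual reductions do not short-circuit):

```python
import numpy as np

def min_early_exit(a, chunk=4096):
    """Chunked minimum that bails out on the first NaN seen, since a
    NaN anywhere in the input determines the final answer."""
    a = np.asarray(a, dtype=np.float64).ravel()
    best = np.inf
    for start in range(0, a.size, chunk):
        part = a[start:start + chunk]
        if np.isnan(part).any():
            return np.nan          # early exit: result is already known
        best = min(best, part.min())
    return best
```

The trade-off is that the per-chunk NaN check costs an extra pass over each chunk, which is why such short-circuiting tends to be worthwhile only if NaNs are expected to be common or to appear early.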
The issue is that matplotlib#14131 made our test suite fail on any warnings (good!), however we do not test all versions of numpy and `np.min([np.nan])` seems to warn for 1.14.4 < version < 1.15.0 which is not a range we test on travis. However, the version of numpy that the wheels pin to for py37 in 1.14.6 which does warn. This is likely related to numpy/numpy#10370
This investigation arises from scipy CI starting to fail on appveyor in the last few days, i.e. after 1.14 being released.
On my home computer (macOS, conda python 3.6.2, conda numpy):
On my work computer (macOS, conda python 3.6.2, numpy installed via pip in a clean env):
With the first set of code examples, why do only the last, and not the first three, result in a warning?
With the second set of examples no warning is emitted at all.