-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
BUG: Invalid value in inplace log #17761
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Do you happen to know the dtype of the input? Is it always the same dtype? We did change many of the SIMD-enhanced ufuncs over the past couple of months, but if I recall correctly float_log is not one of them. |
It should be |
... and in the SciPy code it even protects against anything close to zero:
|
it may be that the error state is getting set somewhere else and only getting checked after the call to np.log |
In the SciPy and MNE code there are |
And actually it looks like this is causing failures in SciPy directly as well: https://travis-ci.org/github/scipy/scipy/jobs/743059506#L19563-L19579 That is a build for a PR that did not change the |
One option would be to open a SciPy PR that, in one of the NumPy-nightly Travis runs, reverts that PR to see if it fixes the |
(This would at least tell us if it's the culprit) |
@mattip we clear the floating point flags before each ufunc call, so I do not think the flags can be set anywhere else (unless there is casting involved, which would require strange input like integers, if it is possible at all). |
Sure, if it is not to much bother go ahead. I think we will need a compact reproducer anyway, so it would be best to reproduces lcoally and cut down the test to a minimum. If it is that PR, we need a avx512 machine. |
Agreed this makes sense. But perhaps it makes sense to revert gh-16247 if it does indeed fix the problem so that downstream libraries don't have to live with the bug. Then once a compact reproducer is made, gh-16247 can be revived and fixed?
FWIW my machine has AVX512 I think:
And I cannot reproduce. I compile with |
... and I cannot reproduce when installing that |
can you see what
|
(this needs to be improved for the release, it is part of PR gh-17737) |
Sorry for the noise. Could you try to save the EDIT: Well, I guess probably more noise, even if there was a 0 in there, the warning should be "divide by zero" not invalid value. |
One last bit of noise, but I think we should ping @r-devulap. I am a bit dubious by this code: numpy/numpy/core/src/umath/simd.inc.src Lines 3317 to 3318 in 6f0436d
since the
Of course I wasn't able to trigger that (e.g. with a short array or similar). |
Thank goodniss I could reproduce on CircleCI, where I can run an interactive SSH session to interact with it afterward. In theory others can, too. But in any case, the output there is:
I then shortened one of our tests, added
After some more whittling, even this does it:
If there is one element there is no error. If the I then went a step further and did |
... and I can confirm that a I am out of my depth on this, but in that commit message, this line in particular seems suspicious, due to the SIMD and in-place requirements for the failure:
|
OK, I am 99% sure I found the issue. The gcc loop uses reuses the previous values if they are close to 1 (where the fast algorithm misbehaves). The values that go to invalid values, are those. Now the So what is needed here is a To be honest, there are more bugs here, because also |
ping @seiko2plus |
@larsoner, my bad, this bug is unfairly caused by #16247 due to allowing vectorize aliasing(same pointer and stride), I will release a fix for it, as fast workaround you can disable export NPY_DISABLE_CPU_FEATURES="AVX512F AVX512_SKX" |
@larsoner thanks for narrowing down the commit id! |
Reproducing code example:
I cannot reproduce on my system (yet).
Error message:
Travis Linux hits a RuntimeWarning it did not previously (with no relevant code changes at our end):
Above the error is in a SciPy call. But it appears to happen anywhere we do a
np.log(x, out=x)
, here is another:If it's not clear from looking at the past few days of NumPy merges what the culprit might be, I can try to get more information out of Travis about what's actually in the offending arrays.
NumPy/Python version information:
This is on
numpy-1.20.0.dev0+0ffaaf8
/ 0ffaaf8 from Thursday (today) onpypi.anaconda.org/scipy-wheels-nightly
. Previous builds onnumpy-1.20.0.dev0+5f071c6
/ 5f071c6 from Monday didn't have this problem.The text was updated successfully, but these errors were encountered: