8000 MAINT: Fix floating point warning flag as much as possible by seberg · Pull Request #19476 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

MAINT: Fix floating point warning flag as much as possible #19476

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

seberg
Copy link
Member
@seberg seberg commented Jul 14, 2021

This PR avoids setting FPE flag flags rather than clearing them
where possible.

The main place where this is not easy is SSE (I could not yet find
the isless equivalent for basic SSE), and did not want to modify
the code too much.

This also updates the docs to copy the informational section to the
"Floating point error handling" landing page. Right now, the docs
assume that we (for now) keep the behaviour of NOT giving floating
point warnings for comparisons with NaN.

Hopefully, using these functions should not have any speed impact.
There is some chance that compilers will not honor the contract
and replace isless with instructions that do set floating point
error flags. If/where this happens, we may have to undo the changes.

This should have no impact on functionality (except if compilers
do not follow C99/IEEE correctly, which is entirely possible).

However, signalling NaNs WILL now warn more often (as per C99/IEEE),
since we do not clear the warnings as agressively.


A few notes:

  1. I am not sure that we can trust compilers. so there is some chance that relying on isless and friends won't work perfectly. There is also some chance of performance regressions forcing us back.
  2. Maybe @seiko2plus can have a look at the SIMD related changes? Otherwise, I could split them out to make things simpler.

To some degree getting warning flags right seems a bit "aspirational"...

@seberg
Copy link
Member Author
seberg commented Jul 14, 2021

Well, that attempt died quickly on CI :/, it had worked great locally. (I still have some hope I just missed some SIMD path though)

@seberg
Copy link
Member Author
seberg commented Jul 14, 2021

Aha, I managed to rewrite the AVX512F logic to avoid the spurious warning setting. Quick timing gives (on a machine with AVX512F*):

In [1]: arr1 = np.random.random(10000); arr2 = np.random.random(10000)

On main:

In [2]: %timeit np.minimum(arr1, arr2)
7.89 µs ± 83.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

On this branch:

In [3]: %timeit np.minimum(arr1, arr2)
8.39 µs ± 6.36 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

So it looks like <5% slowdown, probably we can live with that.


I think the AVX512 solution may be a big improvement for the SSE2 code. That code currently tests for an invalid flag set during the comparison. But as far as I can tell, both of the instructions I now use in AVX have equivalent ones in SSE2.

seberg added a commit to seberg/numpy that referenced this pull request Jul 14, 2021
This is technically even incorrect in IEEE, signalling NaNs should
pretty much always give a warning.  But overall, it seems unlikely
NumPy should bother about it (we never create them after all).

This is split off from numpygh-19476.  Mainly, to check whether s390x is
broken...
@seberg seberg force-pushed the use-no-fp-error-C99-functions branch from fd815bd to 7291c4b Compare July 14, 2021 19:39
@seberg
Copy link
Member Author
seberg commented Jul 14, 2021

I can reproduce the MacOS failures locally with clang, and they go away with -ffp-exception-behavior=strict, so gh-19049 (not working) strikes again...

So we will have to figure out gh-19049 before this has any chance :(. Marking as draft, I guess some of the SIMD stuff could be split out regardless of clang. EDIT: xref the issue gh-19427

@seberg seberg marked this pull request as draft July 14, 2021 20:00
@seberg seberg force-pushed the use-no-fp-error-C99-functions branch from 7291c4b to f137362 Compare July 18, 2021 20:26
@seberg seberg marked this pull request as ready for review July 18, 2021 20:26
@seberg seberg marked this pull request as draft July 18, 2021 23:14
@seberg
Copy link
Member Author
seberg commented Jul 19, 2021

Well, probably will have to just pull out the useful parts... Godbolt points out that clang <10.0 does bad optimization: https://godbolt.org/z/38hcaaP7o (note the cmpltsd xmm3, xmm1 instruction, where clang 10 correctly only uses ucomisd).

I am not actually sure why this fails CI, since that looks like a clang 11, unless it is Mac specific? But, I guess we support clang 9 anyway, so...

EDIT: Apparently the CI issue is that clang 11 does not support trapping-math, at least not for x86_64.

@seberg
Copy link
Member Author
seberg commented Nov 10, 2021

Closing reopening, because I am curious if bumping MacOS in CI from 10.14 to 10.15 makes a difference here (although I am not sure that would actually help with moving forward).

@seberg seberg closed this Nov 10, 2021
@seberg seberg reopened this Nov 10, 2021
This PR avoids setting FPE flag flags rather than clearing them
where possible.

The main place where this is not easy is SSE (I could not yet find
the `isless` equivalent for basic SSE), and did not want to modify
the code too much.

This also updates the docs to copy the informational section to the
"Floating point error handling" landing page.  Right now, the docs
assume that we (for now) keep the behaviour of NOT giving floating
point warnings for comparisons with NaN.

Hopefully, using these functions should not have any speed impact.
There is some chance that compilers will not honor the contract
and replace `isless` with instructions that do set floating point
error flags.  If/where this happens, we may have to undo the changes.

This should have no impact on functionality (except if compilers
do not follow C99/IEEE correctly, which is entirely possible).

However, signalling NaNs WILL now warn more often (as per C99/IEEE),
since we do not clear the warnings as agressively.
@seberg seberg force-pushed the use-no-fp-error-C99-functions branch from f137362 to 1110d7c Compare November 10, 2021 16:04
@seberg seberg added the 64 - Good Idea Inactive PR with a good start or idea. Consider studying it if you are working on a related issue. label Apr 11, 2022
@seberg
Copy link
Member Author
seberg commented Apr 11, 2022

Going to close this for now. There are a few things here that would be nice to have. The whole thing would be nice, but seems only viable for newer clang versions (which probably takes a bit more time).

@seberg seberg closed this Apr 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
03 - Maintenance 64 - Good Idea Inactive PR with a good start or idea. Consider studying it if you are working on a related issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant
0