8000 BUG: Don't produce undefined behavior for a << b if b >= bitsof(a) by eric-wieser · Pull Request #13739 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

BUG: Don't produce undefined behavior for a << b if b >= bitsof(a) #13739

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Sep 14, 2019

Conversation

eric-wieser
Copy link
Member
@eric-wieser eric-wieser commented Jun 8, 2019

This is a continuation of gh-7473. Fixes gh-2449

@eric-wieser
Copy link
Member Author
eric-wieser commented Jun 9, 2019

CI failure is due to this assert:

https://github.com/gcc-mirror/gcc/blob/9d0507742960aa9f2b99bc6e9584655ecc611792/gcc/expmed.c#L2349

numpy/core/src/umath/loops.c.src: In function ‘LONG_right_shift’:
numpy/core/src/umath/loops.c.src:830:18: internal compiler error: in expand_shift_1, at expmed.c:2349
             *out = in1 >> in2;
                  ^
numpy/core/src/umath/fast_loop_macros.h:180:9: note: in definition of macro ‘BASE_BINARY_LOOP_S_INP’
         op; \
         ^
numpy/core/src/umath/loops.c.src:828:5: note: in expansion of macro ‘BINARY_LOOP_FAST’
     BINARY_LOOP_FAST(@type@, @type@, {
     ^
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-5/README.Bugs> for instructions.
Running from numpy source directory.

Which can be reprod on godbolt here.

An extremely minimal repro is here. @juliantaylor, could you look at reporting this upstream?

Copy link
Member
@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I have not worked with those really to be honest, but can't we put a new npy_left_shift to numpy/core/src/npymath/npy_math_internal.h.src or similar to avoid duplication?

@eric-wieser
Copy link
Member Author
eric-wieser commented Jun 11, 2019

@seberg: We have no precedent in that file for functions templated over all integer types. The best we have are lcm and gcd which use the usual l and ll suffices typical of the C standard library. Unfortunately for shift we need to know the exact type, so we need overloads for short and char as well.

What names would you propose?

@seberg
Copy link
Member
seberg commented Jun 11, 2019

Oh,had not checked, but doesn't sound too bad, the normal type code should suffice, e.g. bhil (and q, if we need long long) and the capital ones for unsigned of course? Most of these names never have underscoves, but I somewhat like extra underscores npy_left_shift_@cc@ or npy_lshift@cc@, not sure I care too much (it should be private anyway).

@eric-wieser
Copy link
Member Author
eric-wieser commented Jun 12, 2019

Realized that there's precedent for h and hh in printf specifiers, so the full set of suffices is u?(h{0,2}|l{0,2}). Updated to use npy_math.

Unfortunately, I strongly suspect this is going to fail CI again (and if it doesn't, then I think we have a performance problem). I think the only fix here is to add a configtest file to check for this internal compiler error, and remove O3 from the function if it occurs.

@eric-wieser
Copy link
Member Author

Looks like this hit a problem on macOS as well. Somehow the FPU flag gets set, but only on computations with int and unsigned int.

#ifdef @TYPE@_left_shift_needs_clear_floatstatus
// For some reason, our macOS CI sets an "invalid" flag here, but only
// for some types.
npy_clear_floatstatus_barrier((char*)dimensions);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea why this happens. Is our macOS run x86, or something else? Do we have a resident macOS expert?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea, was thinking about putting it in godbold, but it is quite a lot of code around it (and I am not sure it is likely to show anything in any case).

@charris, I think I am good with this as is. But you had a look at it before and know npy_math stuff better. The floatstatus thing (and I guess the compile time check) are a bit unfortunate though...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be a clang thing.

@eric-wieser eric-wieser force-pushed the bit_shifts branch 3 times, most recently from e8fbd2f to 6d4eb24 Compare June 12, 2019 08:51
@eric-wieser
Copy link
Member Author
eric-wieser commented Jun 12, 2019

Coverage stats are just bad because we can't measure coverage for setup.py. ARM CI now passes, because setup.py now looks for this compiler bug and avoids requesting optimization for that particular loop if the bug is present.

It means 64-bit >> and << will be slower on ARM, but only on old compilers.

NPY_INPLACE npy_@u@@type@
npy_rshift@u@@c@(npy_@u@@type@ a, npy_@u@@type@ b)
{
if (NPY_LIKELY((size_t)b < sizeof(a) * CHAR_BIT)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to deal with the possibility that b is negative?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that currently the shift value is cast to unsigned and masked, at least on my hardware.

In [1]: a = array([1,2,3,4])                                                    

In [2]: a << -1                                                                 
Out[2]: 
array([-9223372036854775808,                    0, -9223372036854775808,
                          0])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there was a comment in the old/new code about that as well. Maybe we can push it off to another PR? That seems the easiest way forward? We can ignore the codecov failures.

Copy link
Member
@mattip mattip Jul 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xref gh-13898

@mattip
Copy link
Member
mattip commented Jul 3, 2019

Fixes #2249

@mattip
Copy link
Member
mattip commented Jul 3, 2019

With the caveat that negative shifts still need to be handled, I think this is good to go as is

@charris
Copy link
Member
charris commented Jul 3, 2019

I'm OK with this going in. Would be good to have a release note about changed behavior in case it affects somebody downstream.

@mattip mattip added the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Jul 14, 2019
@mattip
Copy link
Member
mattip commented Jul 14, 2019

Waiting on resolution of #13908 to know how to add a release note

@mattip
Copy link
Member
mattip commented Jul 17, 2019

Since #13908 is taking a while, perhaps just add a not to doc/release/1.18.0-notes.rst and we will refactor that file if/when #13908 lands

@seberg seberg self-requested a review July 17, 2019 16:01
@charris
Copy link
Member
charris commented Jul 23, 2019

@seberg Ping.

@seberg
Copy link
Member
seberg commented Jul 23, 2019

@charris thanks. Let me merge the towncrier related PRs (since nobody complained), and then I will add a short release note to this. Other than that I remember this as being good (although that compiler bug is a bit annoying).

@seberg seberg force-pushed the bit_shifts branch 4 times, most recently from 55cf805 to 9f258d8 Compare July 24, 2019 22:22
@seberg
Copy link
Member
seberg commented Jul 24, 2019

OK, I rebased this (it was a bit messy) and added a release note (we will have to modify the towncrier template a bit before the end, but it works).

Master seems to have added @isa@ (and the GCC flag) since that exists on master, but I am not sure what the state of this was before the rebase? I do not think it had the @isa@, but also not sure how that adds up with the place in code. Is AVX worth it here?

@seberg seberg removed the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Jul 24, 2019
@seberg seberg removed their assignment Sep 12, 2019
@mattip
Copy link
Member
mattip commented Sep 12, 2019

restarting tests

@mattip mattip closed this Sep 12, 2019
@mattip mattip reopened this Sep 12, 2019
This:
* Inlines the macros in loops.c.src
* Replaces 8 with `CHAR_BIT `. The `NPY_SIZEOF_*` macros are not used here because it's too much work to apply them to the signed types, and they expand to the same thing anyway.
* Removes the reduce loop specializations which likely no one cares about
* Uses pytest.mark.parametrize to shorten the test
This avoids duplication, as suggested in review.
The suffices `l`, `ull` etc are for consistency with `strtoull` and friends.
`h` and `uhh` are inspired from the C99 printf specifiers.
@mattip mattip merged commit 31ffdec into numpy:master Sep 14, 2019
@mattip
Copy link
Member
mattip commented Sep 14, 2019

Thanks @eric-wieser

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Shift operator cycles for int32 and int64 (Trac #1856)
5 participants
0