BUG: random: Generator.integers(2**32) always returned 0. #16076

WarrenWeckesser · 2020-04-25T16:48:52Z

When the input to Generator.integers was 2**32, the value 2**32-1
was being passed as the rng argument to the 32-bit Lemire method,
but that method requires rng be strictly less than 2**32-1.

The fix was to handle 2**32-1 by calling next_uint32 directly.
This also works for the legacy code without changing the stream
of random integers from randint.

Closes gh-16066.
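
For readers who want to see why the result collapses to 0, here is a minimal Python model of Lemire's bounded-integer method with 32-bit wraparound emulated by masking. It is a sketch of the published algorithm, not NumPy's C code; the name `lemire_uint32` and the use of `random.Random` as the bit source are assumptions made for illustration.

```python
import random

UINT32_MAX = 0xFFFFFFFF


def lemire_uint32(next_uint32, rng):
    """Draw from [0, rng] with Lemire's method, emulating uint32 wraparound."""
    rng_excl = (rng + 1) & UINT32_MAX  # wraps to 0 when rng == 2**32 - 1
    m = next_uint32() * rng_excl       # a 64-bit product in a C version
    leftover = m & UINT32_MAX
    if leftover < rng_excl:
        # Reject the few low values that would bias the result.
        threshold = (UINT32_MAX - rng) % rng_excl
        while leftover < threshold:
            m = next_uint32() * rng_excl
            leftover = m & UINT32_MAX
    return m >> 32


source = random.Random(12345)  # stand-in bit source, not a NumPy BitGenerator


def next_uint32():
    return source.getrandbits(32)


bad = [lemire_uint32(next_uint32, UINT32_MAX) for _ in range(5)]
good = [lemire_uint32(next_uint32, 9) for _ in range(5)]
print(bad)   # every entry is 0, because rng_excl wrapped to 0
print(good)  # entries fall in the range 0..9
```

With `rng == 2**32 - 1` the multiplier `rng_excl` is 0, so the product and its high 32 bits are always 0, which matches the symptom reported in gh-16066.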

seberg · 2020-04-25T17:01:10Z

Any chance of the same issue for 2**64 (not through integers maybe)?

EDIT: Oh, sorry, I see the 64 bit version has this path already. A test could make sense but I assume it already exists.

seberg · 2020-04-25T17:01:45Z

@bashtage would you have time to review this to make sure it's good?

seberg

LGTM, I think fine to put in as is. It seems to me the test should include also drawing a single sample to cover both paths, though?
Otherwise, I will leave it in case Kevin wants to have a look, but let's not hesitate to merge to get 1.18.4 out.

WarrenWeckesser · 2020-04-26T20:08:43Z

@seberg wrote

It seems to me the test should include also drawing a single sample to cover both paths, though?

The test in there now is a crude regression test. I'll replace that with a repeatability test for 2**32-1, 2**32 and 2**32+1.

seberg · 2020-04-26T20:14:50Z

Right, we just need to also cover the path where only a single number is drawn then. The approach is good in any case. I was wondering whether it makes sense to put an assert for this into the Lemire method. We could just do that, although I doubt it helps much with flushing these issues out.

When the input to Generator.integers was 2**32, the value 2**32-1 was being passed as the `rng` argument to the 32-bit Lemire method, but that method requires `rng` be strictly less than 2**32-1. The fix was to handle 2**32-1 by calling next_uint32 directly. This also works for the legacy code without changing the stream of random integers from `randint`. Closes gh-16066.

Assert that an invalid value (2**n-1 for n = 8, 16, 32, 64) has not been passed to the Lemire function.

WarrenWeckesser · 2020-04-27T15:27:20Z

It took a few tries, but the Travis-CI tests on the s390x platform finally ran successfully.

I've changed the unit test to a set of repeatability tests for the values 2**32-1, 2**32 and 2**32+1. For each value, a test is run with size=7 and size=None.

I also added assert calls in the C implementations of the Lemire method to check for the invalid value.
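
As a rough illustration of the two changes discussed in this thread (the direct `next_uint32` path for the full 32-bit range, plus an assert inside the Lemire routine), here is a Python model. It is not the C implementation; `bounded_uint32`, `lemire_uint32`, and the `random.Random` bit source are hypothetical names chosen for this sketch.

```python
import random

UINT32_MAX = 0xFFFFFFFF


def lemire_uint32(next_uint32, rng):
    """Lemire's method for [0, rng]; rng == 2**32 - 1 is not a valid input."""
    assert rng != UINT32_MAX, "full 32-bit range must be handled by the caller"
    rng_excl = rng + 1
    m = next_uint32() * rng_excl
    leftover = m & UINT32_MAX
    if leftover < rng_excl:
        threshold = (UINT32_MAX - rng) % rng_excl
        while leftover < threshold:
            m = next_uint32() * rng_excl
            leftover = m & UINT32_MAX
    return m >> 32


def bounded_uint32(next_uint32, off, rng):
    """Return off plus a draw from [0, rng], special-casing the full range."""
    if rng == UINT32_MAX:
        # A raw 32-bit output is already uniform on [0, 2**32 - 1].
        return off + next_uint32()
    return off + lemire_uint32(next_uint32, rng)


source = random.Random(2020)  # stand-in bit source for the sketch


def next_uint32():
    return source.getrandbits(32)


draws = [bounded_uint32(next_uint32, 0, UINT32_MAX) for _ in range(1000)]
print(min(draws), max(draws))
```

The assert only documents the precondition; the caller-side branch is what keeps the invalid value from reaching the Lemire routine.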

bashtage

LGTM, modulo checking that the stream in RandomState is not affected.

bashtage · 2020-04-27T15:31:38Z

numpy/random/src/distributions/distributions.c

      for (i = 0; i < cnt; i++) {
-        out[i] = off + buffered_bounded_masked_uint32(bitgen_state, rng, mask,
-                                                      &bcnt, &buf);
+        out[i] = off + (uint64_t) next_uint32(bitgen_state);


Is this code used in RandomState, and does this preserve the stream guarantee? RS uses only the masked method, so intercepting before it would have been called could cause an issue.

Would it be safest to add a test in RandomState too with the same end points?

This is tested by the test that was added in gh-14501:

numpy/numpy/random/tests/test_randomstate_regression.py

Line 178 in 20f1076

@pytest.mark.skipif(np.iinfo('l').max < 2**32,

bashtage · 2020-04-27T15:32:47Z

numpy/random/src/distributions/distributions.c

+       * call next_uint32 directly.  This also works when use_masked is True,
+       * so we handle both cases here.
+       */
+      return off + (uint64_t) next_uint32(bitgen_state);


I think the same would apply here as well.

Yes, this needs a test. When calling randint with scalar values of low and high, this path is not reached, even when size is None. It turns out that the function random_bounded_uint64 is called by the code that handles broadcasting for random.randint. I'll add a repeatability test for that.

Add repeatability tests for when the range of the integers is `2**32` (and `2**32 +/- 1` for good measure) with broadcasting. The underlying functions called by Generator.integers and random.randint when the inputs are broadcast are different than when the inputs are scalars.

bashtage

Extra test was all I noticed. It has been added.

seberg · 2020-04-27T17:37:57Z

@WarrenWeckesser cool, thanks for the nice tests. Unfortunately, due to 32-bit longs, the tests are failing on Windows (and presumably 32-bit Linux).

WarrenWeckesser · 2020-04-27T17:39:22Z

@seberg, yup, just noticed that. Looking into it.

WarrenWeckesser · 2020-04-27T19:05:50Z

For the new randint test, I used the same skipif decorator as in #14501. Tests are all green now.

seberg · 2020-04-27T19:11:16Z

Seems good to just skip. Thanks for tracking it down and fixing it quickly Warren!

charris · 2020-04-27T22:39:00Z

AFAIK, RandomState does not use the Lemire method, only the mask method. For repeatability tests, I used fairly long streams and hashed them, then checked the hash. Takes up a lot less room.
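
The hashing approach mentioned above can be sketched as follows. This is a stdlib-only illustration that uses `random.Random` in place of a NumPy generator; `stream_digest` and its parameters are hypothetical names, and a real test would compare the digest against a stored constant rather than against a second run.

```python
import hashlib
import random
import struct


def stream_digest(seed, n=100_000, bound=2**32):
    """Hash n bounded draws so a test only has to store one short digest."""
    rng = random.Random(seed)
    h = hashlib.sha256()
    for _ in range(n):
        # Pack each draw as a little-endian unsigned 64-bit integer.
        h.update(struct.pack("<Q", rng.randrange(bound)))
    return h.hexdigest()


first = stream_digest(1234)
second = stream_digest(1234)
print(first == second)  # the same seed reproduces the same stream
```

Storing a 64-character digest instead of the expected values keeps the test file small even when the stream is long.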

Repeatability is not guaranteed for the Generator methods. At some point we should have correctness checks, maybe as a separate set of tests.

WarrenWeckesser · 2020-04-27T23:01:03Z

@charris wrote

Repeatability is not guaranteed for the Generator methods.

My understanding is that we have repeatability tests for the Generator methods so we can detect a change, and create a release note about the change if it turns out that the change is unavoidable.

WarrenWeckesser · 2020-04-27T23:04:23Z

At some point we should have correctness checks, maybe as a separate set of tests.

Agreed. The issue gh-15911 that you created looks like a good place to discuss this.

iyanmv · 2021-11-18T11:06:44Z

Was this ever merged to 1.17.x? So, correct me if I'm wrong, but the first version of NumPy that can be used without issues gh-16066 and gh-14774 is 1.18.4, right?

bashtage · 2021-11-18T11:21:21Z

The last 1.17 release came out before this PR was merged, so the fix was never in the 1.17 branch. It was backported to 1.18 and is in 1.19.0.

iyanmv · 2021-11-18T14:04:08Z

Okay, so I guess a bump in requirements-min.txt is needed in my case.

WarrenWeckesser added 00 - Bug component: numpy.random labels Apr 25, 2020

WarrenWeckesser mentioned this pull request Apr 25, 2020

default_rng.integers(2**32) always return 0 #16066

Closed

seberg added the 09 - Backport-Candidate PRs tagged should be backported label Apr 25, 2020

WarrenWeckesser force-pushed the fix-gh-16066 branch from ea2edae to cc8b4cc Compare April 26, 2020 19:40

seberg approved these changes Apr 26, 2020


WarrenWeckesser force-pushed the fix-gh-16066 branch from cc8b4cc to 9864668 Compare April 26, 2020 20:57

MAINT: random: Add assert() statements.

1ded86d

Assert that an invalid value (2**n-1 for n = 8, 16, 32, 64) has not been passed to the Lemire function.

WarrenWeckesser force-pushed the fix-gh-16066 branch from 9864668 to 1ded86d Compare April 26, 2020 21:19

WarrenWeckesser closed this Apr 27, 2020

WarrenWeckesser reopened this Apr 27, 2020

WarrenWeckesser closed this Apr 27, 2020

WarrenWeckesser reopened this Apr 27, 2020

bashtage suggested changes Apr 27, 2020


seberg added this to the 1.18.4 release milestone Apr 27, 2020

bashtage approved these changes Apr 27, 2020


TST: random: Skip a test if integers are 32 bit.

621efc7

seberg merged commit 92b880f into numpy:master Apr 27, 2020

WarrenWeckesser deleted the fix-gh-16066 branch April 27, 2020 19:14

charris removed the 09 - Backport-Candidate PRs tagged should be backported label Apr 27, 2020

charris mentioned this pull request Apr 27, 2020

BUG: random: Generator.integers(2**32) always returned 0. #16090

Merged

charris removed this from the 1.18.4 release milestone Apr 27, 2020

WarrenWeckesser added this to the 1.19.0 release milestone May 16, 2020
