-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
BUG: random: Generator.integers(2**32) always returned 0. #16076
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Any chance of the same issue for EDIT: Oh, sorry, I see the 64 bit version has this path already. A test could make sense but I assume it already exists. |
@bashtage would you have time to review this to make sure its good? |
ea2edae
to
cc8b4cc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I think fine to put in as is. It seems to me the test should include also drawing a single sample to cover both paths, though?
Otherwise, will leave it in case Kevin wants to have a look, but lets not hesitate to merge to get 1.18.4 out.
@seberg wrote
The test in there now is a crude regression test. I'll replace that with a repeatability test for |
Right, just need to also cover the path where only a single number is drawn then. The approach is good in any case, I was wondering whether it makes sense to put an assert for this into the Lemirs method. We could just do that, although I doubt it helps much with flushing these issues out. |
When the input to Generator.integers was 2**32, the value 2**32-1 was being passed as the `rng` argument to the 32-bit Lemire method, but that method requires `rng` be strictly less then 2**32-1. The fix was to handle 2**32-1 by calling next_uint32 directly. This also works for the legacy code without changing the stream of random integers from `randint`. Closes numpygh-16066.
cc8b4cc
to
9864668
Compare
Assert that an invalid value (2**n-1 for n = 8, 16, 32, 64) has not been passed to the Lemire function.
9864668
to
1ded86d
Compare
It took a few tries, but the Travis-CI tests on the I've changed the unit test to a set of repeatability tests for the values I also added |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, modulo checking that the stream in RandomState
is not affected.
for (i = 0; i < cnt; i++) { | ||
out[i] = off + buffered_bounded_masked_uint32(bitgen_state, rng, mask, | ||
&bcnt, &buf); | ||
out[i] = off + (uint64_t) next_uint32(bitgen_state); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this code used in RandomState
, and does this preserve the stream guarantee? RS uses only masked, so intercepting before this might have been called could can an issue.
Would it be safest to added a test in RandomState too with the same end points?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is tested by the test that was added in gh-14501:
@pytest.mark.skipif(np.iinfo('l').max < 2**32, |
* call next_uint32 directly. This also works when use_masked is True, | ||
* so we handle both cases here. | ||
*/ | ||
return off + (uint64_t) next_uint32(bitgen_state); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the same would apply here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this needs a test. When calling randint
with scalar values of low
and high
, this path is not reached, even when size
is None
. It turns out that the function random_bounded_uint64
is called by the code that handles broadcasting for random.randint
. I'll add a repeatability test for that.
Add repeatability tests for when the range of the integers is `2**32` (and `2**32 +/- 1` for good measure) with broadcasting. The underlying functions called by Generator.integers and random.randint when the inputs are broadcast are different than when the inputs are scalars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra test was all I noticed. It has been added.
@WarrenWeckesser cool, thanks for the nice tests. Unfortunately due to 32bit longs, the tests are failing on windows (and presumably 32bit linux). |
@seberg, yup, just noticed that. Looking into it. |
For the new |
Seems good to just skip. Thanks for tracking it down and fixing it quickly Warren! |
AFAIK, RandomState does not use the Lemire method, only the mask method. For repeatability tests, I used fairly long streams and hashed them, then checked the hash. Takes up a lot less room. Repeatability is not guaranteed for the Generator methods. At some point we should have correctness checks, maybe as a separate set of tests. |
@charris wrote
My understanding is that we have repeatability tests for the Generator methods so we can detect a change, and create a release note about the change if turns out that the change is unavoidable. |
Agreed. The issue gh-15911 that you created looks like a good place to discuss this. |
The last 17 branch was released before this PR was merged, so it was never in the 17 branch. It was backported to 18 and in 19.0. |
Okay, so I guess a bump in requirements-min.txt is needed in my case. |
When the input to Generator.integers was
2**32
, the value2**32-1
was being passed as the
rng
argument to the 32-bit Lemire method,but that method requires
rng
be strictly less than2**32-1
.The fix was to handle
2**32-1
by calling next_uint32 directly.This also works for the legacy code without changing the stream
of random integers from
randint
.Closes gh-16066.