-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
BUG: Buttress handling of extreme values in randint #8846
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -157,22 +157,42 @@ def test_rng_zero_and_extremes(self): | |
for dt in self.itype: | ||
lbnd = 0 if dt is np.bool_ else np.iinfo(dt).min | ||
ubnd = 2 if dt is np.bool_ else np.iinfo(dt).max + 1 | ||
|
||
tgt = ubnd - 1 | ||
assert_equal(self.rfunc(tgt, tgt + 1, size=1000, dtype=dt), tgt) | ||
|
||
tgt = lbnd | ||
assert_equal(self.rfunc(tgt, tgt + 1, size=1000, dtype=dt), tgt) | ||
|
||
tgt = (lbnd + ubnd)//2 | ||
assert_equal(self.rfunc(tgt, tgt + 1, size=1000, dtype=dt), tgt) | ||
|
||
def test_full_range(self): | ||
# Test for ticket #1690 | ||
|
||
for dt in self.itype: | ||
lbnd = 0 if dt is np.bool_ else np.iinfo(dt).min | ||
ubnd = 2 if dt is np.bool_ else np.iinfo(dt).max + 1 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Perhaps, as it does not work for the moment: >>> np.iinfo(np.bool_)
...
ValueError: Invalid integer data type. |
||
|
||
try: | ||
self.rfunc(lbnd, ubnd, dtype=dt) | ||
except Exception as e: | ||
raise AssertionError("No error should have been raised, " | ||
"but one was with the following " | ||
"message:\n\n%s" % str(e)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a little weird - better to leave the raw exception so that the tests produce a useful stack trace, I think There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I did this because I wanted to know which element in the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe we need an There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think
In what way does this do a better job of doing that, given that |
||
|
||
def test_in_bounds_fuzz(self): | ||
# Don't use fixed seed | ||
np.random.seed() | ||
|
||
for dt in self.itype[1:]: | ||
for ubnd in [4, 8, 16]: | ||
vals = self.rfunc(2, ubnd, size=2**16, dtype=dt) | ||
assert_(vals.max() < ubnd) | ||
assert_(vals.min() >= 2) | ||
vals = self.rfunc(0, 2, size=2**16, dtype=np.bool) | ||
|
||
vals = self.rfunc(0, 2, size=2**16, dtype=np.bool_) | ||
|
||
assert_(vals.max() < 2) | ||
assert_(vals.min() >= 0) | ||
|
||
|
@@ -209,6 +229,29 @@ def test_repeatability(self): | |
res = hashlib.md5(val).hexdigest() | ||
assert_(tgt[np.dtype(np.bool).name] == res) | ||
|
||
def test_int64_uint64_corner_case(self): | ||
# When stored in Numpy arrays, `lbnd` is casted | ||
# as np.int64, and `ubnd` is casted as np.uint64. | ||
# Checking whether `lbnd` >= `ubnd` used to be | ||
# done solely via direct comparison, which is incorrect | ||
# because when Numpy tries to compare both numbers, | ||
# it casts both to np.float64 because there is | ||
# no integer superset of np.int64 and np.uint64. However, | ||
# `ubnd` is too large to be represented in np.float64, | ||
# causing it be round down to np.iinfo(np.int64).max, | ||
# leading to a ValueError because `lbnd` now equals | ||
# the new `ubnd`. | ||
|
||
dt = np.int64 | ||
tgt = np.iinfo(np.int64).max | ||
lbnd = np.int64(np.iinfo(np.int64).max) | ||
ubnd = np.uint64(np.iinfo(np.int64).max + 1) | ||
|
||
# None of these function calls should | ||
# generate a ValueError now. | ||
actual = np.random.randint(lbnd, ubnd, dtype=dt) | ||
assert_equal(actual, tgt) | ||
|
||
def test_respect_dtype_singleton(self): | ||
# See gh-7203 | ||
for dt in self.itype: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we claim that everything should work if comparisons get fixed, then this line doesn't need to change - right?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct. The following lines will not be needed:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me be more clear: I think we should continue to pass
low
andhigh
on this line - there is no reason to change toilow
andihigh
here.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not the case:
This PR cannot demonstrate any clearer just how fragile
numpy
's operators are when handling largeuint64
numbers.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Darn, you're right. In that case, the comment above saying "remove these lines" should also point out that we need them to work around subtraction as well. Or just a comment to that effect next to the subtraction
Alternatively,
np.subtract(high, 1, dtype=dtype)
would probably be safe here too.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. In fact, even your suggestion is not safe:
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you meant by
# correct but bad for randint
. You're in for a bad time if you ask for an upper bound that doesn't come close to fitting in the container type. What does passingdtype=None
into randint do normally?Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, looking at those examples again, I realize that those aren't valid in the context of
randint
. The only time that we could do this subtraction is ifdtype=np.uint64
, in which case the subtraction actually works correctly.So ignore everything I just said in the previous comment above. I still would rather prefer to use
high - 1
(as the current workaround) overnp.subtract
, as that will make it clear to us later that we need to patch this once #8851 is fixed.dtype=None
is not valid (per the docs) and will error as an invaliddtype
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.subtract
has the benefit of extending to broadcasting more naturally.I'm happy with the current workaround, provided there's a comment indicating that the subtraction too is a workaround
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment has been appended to the
TODO
I wrote earlier.