DOC, MAINT: Clarify error message of random.randint. #14333

maxwell-aladago · 2019-08-22T20:24:40Z

Improve the error raised by random.randint when low <= 0 and high is not given.

The previous behaviour was to raise ValueError: low >= high, which is correct but confusing unless you read the source code. I don't think adding the extra check early on will degrade performance.
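To illustrate why the old message is confusing, here is a minimal pure-Python sketch of the argument handling. The function `sketch_randint` is a hypothetical stand-in written for this illustration, not NumPy's actual code; it only assumes, as the discussion in this PR describes, that a missing `high` causes the single argument to be treated as the upper bound with `low` set to 0.

```python
def sketch_randint(low, high=None):
    """Hypothetical stand-in for the argument handling (not NumPy's code)."""
    if high is None:
        # Single-argument form: the value passed as `low` is really the upper bound.
        high = low
        low = 0
    if low >= high:
        # The check runs after the swap, so the message refers to the
        # internal low (0) and high (the caller's only argument).
        raise ValueError("low >= high")
    return (low, high)


try:
    sketch_randint(0)  # the caller never passed a `high`
except ValueError as exc:
    message = str(exc)  # mentions `high` even though the caller only gave `low`
```

From the caller's point of view only `low` was supplied, so a message that compares `low` with `high` gives no hint about what went wrong.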


mattip · 2019-08-24T17:53:34Z

LGTM. Any reason not to merge?

eric-wieser · 2019-08-24T18:01:45Z

numpy/random/generator.pyx

@@ -434,6 +434,8 @@ cdef class Generator:

        """
        if high is None:
+            if isinstance(low, (int, np.integer)) and low <= 0:


I'm not really a fan of this type of detection: it doesn't work on things that are coercible to integers, and it is generally fragile at actually preventing the original error from occurring.

Could you instead use something like the following?

    high_and_low_swapped = high is None
    if high_and_low_swapped:
        high = low
        low = 0
    try:
        # the chain of ifs below
        ...
    except ValueError as e:
        if high_and_low_swapped and 'low >= high' == str(e):
            raise ValueError(...) from None
        raise
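A runnable sketch of the pattern suggested above, under stated assumptions: `_draw` is a hypothetical placeholder for the chain of dtype-specific helpers, `integers_sketch` is an illustrative name, and the replacement message is the wording that appears later in this PR. It is not the code that was merged.

```python
def _draw(low, high):
    # Hypothetical stand-in for the dtype-specific helpers.
    if low >= high:
        raise ValueError("low >= high")
    return low  # placeholder instead of an actual random draw


def integers_sketch(low, high=None):
    high_and_low_swapped = high is None
    if high_and_low_swapped:
        high = low
        low = 0
    try:
        return _draw(low, high)
    except ValueError as e:
        # Only rewrite the message when the caller used the single-argument form.
        if high_and_low_swapped and str(e) == "low >= high":
            raise ValueError(
                "low must be greater than 0 when high is not given."
            ) from None
        raise  # any other ValueError propagates unchanged
```

Matching on the exact message string keeps unrelated `ValueError`s untouched, at the cost of coupling the wrapper to the wording used by the helper.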

Thanks @eric-wieser. Just pushed the changes.

numpy/random/tests/test_generator_mt19937.py

eric-wieser · 2019-08-24T22:33:41Z

numpy/random/generator.pyx

+                ret = _rand_bool(low, high, size, _masked, endpoint, &self._bitgen, self.lock)
+        except ValueError as e:
+            if high_and_low_swapped and 'low >= high' == str(e):
+                e.args = ("low must be greater than 0 when high is not given.", )


I'm not sure if this works correctly in displayed stack traces. What's the trade-off between doing this and raising a brand new exception from None?

Yes, the stack trace is good. As an example, the code

    import numpy as np
    rng = np.random.default_rng()
    rng.integers(0)

produces the stacktrace

    Traceback (most recent call last):
      File "/Users/maxwellaladago/Documents/pub/numpy-max/perms.py", line 4, in <module>
        rng.integers(0)
      File "generator.pyx", line 458, in numpy.random.generator.Generator.integers
      File "bounded_integers.pyx", line 1228, in numpy.random.bounded_integers._rand_int64
    ValueError: low must be greater than 0 when high is not given.

The problem with raising from None is that the displayed traceback contains the phrase raise ... from None, which can be confusing. Additionally, it doesn't capture the traceback of the original exception. For example,

    import numpy as np
    try:
        rng = np.random.default_rng()
        rng.integers(0)
    except ValueError as e:
        raise ValueError("just a demo") from None

will produce the stacktrace

    Traceback (most recent call last):
      File "/Users/maxwellaladago/Documents/pub/numpy-max/perms.py", line 6, in <module>
        raise ValueError("just a demo") from None
    ValueError: just a demo
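The trade-off between the two approaches can be checked without NumPy. The sketch below is illustrative only: `_inner`, `mutate_args`, `raise_from_none` and `frame_names` are hypothetical names, and `_inner` merely imitates a helper that raises `low >= high`. It compares which frames end up in the traceback when the message is edited in place and re-raised versus when a new exception is raised `from None`.

```python
import traceback

NEW_MESSAGE = "low must be greater than 0 when high is not given."


def _inner(low, high):
    # Hypothetical helper that imitates the original check.
    if low >= high:
        raise ValueError("low >= high")
    return low


def mutate_args(n):
    try:
        return _inner(0, n)
    except ValueError as e:
        e.args = (NEW_MESSAGE,)  # same exception object, new message
        raise


def raise_from_none(n):
    try:
        return _inner(0, n)
    except ValueError:
        raise ValueError(NEW_MESSAGE) from None  # new exception, context hidden


def frame_names(func):
    """Call func(0) and return (function names in the traceback, exception)."""
    try:
        func(0)
    except ValueError as e:
        names = [frame.name for frame in traceback.extract_tb(e.__traceback__)]
        return names, e


mutated_frames, mutated_exc = frame_names(mutate_args)
fresh_frames, fresh_exc = frame_names(raise_from_none)
```

In this sketch `mutated_frames` still ends in `_inner`, while `fresh_frames` stops at `raise_from_none`: the only thing lost is the innermost frame, which in the real case lies entirely inside the library.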

How about just preempting the whole thing and raising the error early on? That seems simpler to parse to me?

Sorry :(, I completely missed that high can be a very large array, making the check a bit expensive maybe.

Rolled back to the previous implementations @seberg. If there was a quicker way to check for violations in a potentially large 'high', preemptive checks would have been the ideal thing to do.
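For reference, a preemptive check could look roughly like the following. This is a pure-Python sketch with a hypothetical helper name (`preemptive_check`) that handles only scalars and flat iterables; it is not the code from any commit in this PR. It shows where the cost concern comes from: when the single argument is a large array, every element has to be inspected before any number is drawn.

```python
from collections.abc import Iterable


def preemptive_check(low, high=None):
    """Raise early if the single-argument form is given a non-positive bound."""
    if high is not None:
        return  # two-argument form: leave validation to the existing checks
    # Treat a scalar as a one-element sequence so both cases share one path.
    values = low if isinstance(low, Iterable) else (low,)
    # This is a full pass over the input, i.e. O(n) extra work per call.
    if any(value <= 0 for value in values):
        raise ValueError("low must be greater than 0 when high is not given.")
```

Whether that extra pass matters in practice would have to be timed against the cost of generating the random numbers themselves.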

I suppose we could time it, maybe it doesn't make a difference in either case since random number generation is much more expensive. I agree with Eric's point that editing the message seems a bit strange.
Your above example with the from None is outside the actual function, if you lose one level of the original backtrace (which is fully inside numpy), that does not actually seem to matter. Could you just check how it looks if you change the actual code? My expectation is that it is OK.

You are correct. Raising the error from None does work well. My guess is that the number of calls with high=None and low having one or more values less than 1 will be small relative to the number of calls overall.

bashtage · 2019-09-13T08:03:55Z

It seems strange to have one error check in this file (which is then duplicated).

Why not just expand the current error message to be "low >= high. If high was not provided, then low must be greater than 0.". This is really simple and avoids extra complexity. The end-user will get the correct message.

maxwell-aladago · 2019-09-13T16:15:45Z

> It seems strange to have one error check in this file (which is then duplicated).
>
> Why not just expand the current error message to be "low >= high. If high was not provided, then low must be greater than 0.". This is really simple and avoids extra complexity. The end-user will get the correct message.

That involves changing the error message in all the functions inside the try/except block, which may affect other parts of the codebase.

bashtage · 2019-09-13T16:21:02Z

@maxwell-aladago There is only 1 function. They are templated.

Probably 2, but I wrote them, and changing an exception message doesn't affect the code base.

bashtage · 2019-09-13T16:24:50Z

The lines are here:

https://github.com/numpy/numpy/blob/master/numpy/random/bounded_integers.pyx.in#L71

and

https://github.com/numpy/numpy/blob/master/numpy/random/bounded_integers.pyx.in#L145

just append the second part and all is fixed.

bashtage · 2019-09-14T23:32:04Z

Two more places,

https://github.com/numpy/numpy/blob/master/numpy/random/bounded_integers.pyx.in#L163
https://github.com/numpy/numpy/blob/master/numpy/random/bounded_integers.pyx.in#L289

Could easily refactor to a shared error message (before any of the functions, about here: https://github.com/numpy/numpy/blob/master/numpy/random/bounded_integers.pyx.in#L22),

low_high_error = 'low {comp} high. If high was not provided, then low must be greater than 0.'

and then the four ValueErrors become

raise ValueError(low_high_error.format(comp=comp))

Increasing complexity for an error message does not seem worth the cost to me.
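The shared-message idea can be sketched in plain Python as follows. The names `LOW_HIGH_ERROR` and `check_bounds`, the `closed` flag, and the mapping of `comp` to `">"` or `">="` are assumptions made for this illustration; the real checks live in the templated Cython file linked above and may differ.

```python
LOW_HIGH_ERROR = (
    "low {comp} high. If high was not provided, then low must be greater than 0."
)


def check_bounds(low, high, closed=False):
    """Hypothetical bounds check that reuses one message template."""
    # Assumption: with a closed interval, low == high is allowed.
    comp = ">" if closed else ">="
    invalid = low > high if closed else low >= high
    if invalid:
        # Note that the template variable is formatted, not a quoted string literal.
        raise ValueError(LOW_HIGH_ERROR.format(comp=comp))
```

With this layout, the wording is defined once and every check that raises it stays consistent.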

mattip · 2019-12-03T13:39:51Z

@maxwell-aladago thoughts? I think the general direction is to close this as "too complicated". Maybe there is a better way?

bashtage · 2020-10-12T10:57:36Z

@mattip Given randint is legacy only, it might make sense to leave it. If changes are needed, they should be clarified to apply to Generator.integers.


maxwell-aladago added 2 commits August 22, 2019 16:14

EHN: clarified error message of random.randint when low <= 0 when hig…

c7170ba

…h is not given

fixed errors

2424869

maxwell-aladago closed this Aug 22, 2019

maxwell-aladago reopened this Aug 22, 2019

charris changed the title ~~EHN: clarified error message of random.randint when low <= 0 when hig…~~ DOC, MAINT: Clarify error message of random.randint. Aug 23, 2019

charris added 03 - Maintenance 04 - Documentation component: numpy.random labels Aug 23, 2019

eric-wieser self-requested a review August 24, 2019 17:58

eric-wieser reviewed Aug 24, 2019


numpy/random/tests/test_generator_mt19937.py (outdated)

refactoring base on reviews: using try-catch

cbac580

eric-wieser reviewed Aug 24, 2019


maxwell-aladago added 3 commits September 12, 2019 13:50

adding preemptive errors base on reviews

3c2f31a

rolling back

75eb65a

raising error from none

f73762e

mattip added the 57 - Close? Issues which may be closable unless discussion continued label Oct 12, 2020

charris mentioned this pull request Dec 13, 2020

BUG: Enforce high >= low on uniform number generators #17921

Merged

Base automatically changed from master to main March 4, 2021 02:04

bashtage added a commit to bashtage/numpy that referenced this pull request Mar 17, 2021

ENH: Improve the exception for default low in Generator.integers

3b67a2a

Improve the exception when low is 0 in case the single input form was used. closes numpy#14333

bashtage mentioned this pull request Mar 17, 2021

ENH: Improve the exception for default low in Generator.integers #18635

Merged

mattip closed this in #18635 Mar 18, 2021
