ENH: Allow size=0 in numpy.random.choice #11383
Thanks! Since you solved the merge conflict, I'm not sure what else needs to be done?
@eric-wieser approved the PR that this replaces, but requested a review from @seberg. Edit: typo
numpy/random/tests/test_random.py
Outdated
assert_equal(np.random.randint(0,0,(3,0,4)).shape, (3,0,4))
assert_equal(np.random.randint(0,-10,0).shape, (0,))
assert_equal(np.random.choice(0,0).shape, (0,))
assert_equal(np.random.choice([],(0,)).shape, (0,))
nit: these tests would be clearer if the size/shape argument was passed by kwarg - too many zeros here
fixed
numpy/random/mtrand/mtrand.pyx
Outdated
if pop_size is 0:
raise ValueError("a must be non-empty")
if pop_size is 0 and np.prod(size) != 0:
raise ValueError("a cannot be empty unless no samples are taken")
If quotes are being added to all the `a`s, one is missed here
fixed
doc/release/1.16.0-notes.rst
Outdated
Even when no elements needed to be drawn, ``np.random.randint`` and
``np.random.choice`` raised an error when the arguments described an empty
distribution. This has been fixed so that e.g.
``np.random.choice([],0) == np.array([],dtype=float64)``.
nit: spaces after commas
fixed; also added the missing spaces after commas throughout these three files (found with grep ",[^ ]")
numpy/random/tests/test_random.py
Outdated
out1 = np.empty((len(self.seeds),) + sz)
out2 = np.empty((len(self.seeds),) + sz)
out1 = np.empty((len(self.seeds), ) + sz)
out2 = np.empty((len(self.seeds), ) + sz)
I guess `(x,)` is an exception to the space after comma rule. Let me clarify that to "space after comma, but not before a closing group like `]` or `)`"
Fixed.
Shows that I should check the standard before deciding matters of taste preference.
Ideally all of the docstring whitespace changes would just go in a separate
@@ -440,6 +440,14 @@ def test_choice_return_shape(self):
assert_equal(np.random.choice(6, s, replace=False, p=p).shape, s)
assert_equal(np.random.choice(np.arange(6), s, replace=True).shape, s)

# Check zero-size
assert_equal(np.random.randint(0, 0, size=(3, 0, 4)).shape, (3, 0, 4))
A test for `randint(10, 10, size=0)` might be good too - just to check that we didn't special case 0 somehow
done
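For reference, the zero-size behaviour discussed in this thread can be sketched as below. This assumes a NumPy release that already includes this change (the release notes edited here are for 1.16.0); the variable names are illustrative only and are not taken from the PR.

```python
import numpy as np

# Passing size by keyword keeps the zeros easy to tell apart.
empty_range = np.random.randint(0, 0, size=(3, 0, 4))
reversed_range = np.random.randint(0, -10, size=0)
nonzero_bounds = np.random.randint(10, 10, size=0)  # bounds other than 0
empty_population = np.random.choice([], size=(0,))

for result in (empty_range, reversed_range, nonzero_bounds, empty_population):
    print(result.shape)

# An empty population is still rejected once samples are requested.
try:
    np.random.choice([], size=10)
    rejected = False
except ValueError as exc:
    rejected = True
    print("rejected:", exc)
```

The `randint(10, 10, size=0)` case is the one requested above, to confirm that the value 0 itself is not special-cased.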
Seems OK to me, allowing empty probabilities might be nice.
if pop_size <= 0:
raise ValueError("a must be greater than 0")
raise ValueError("'a' must be 1-dimensional or an integer")
if pop_size <= 0 and np.prod(size) != 0:
a bit unintuitive that this works for `None`, but I guess OK, could also add `size is None` explicitly.
That also has the advantage of skipping `np.prod`, which probably adds a bit of overhead, but maybe negligible in any case.
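A minimal sketch of what an explicit `size is None` check could look like, written in plain Python rather than Cython. The helper names `takes_no_samples` and `check_population` are hypothetical and do not appear in the PR; `math.prod` stands in for `np.prod` here.

```python
import math

def takes_no_samples(size):
    """Return True only when `size` describes an empty output."""
    if size is None:
        # size=None means "draw a single scalar", so one sample is taken.
        return False
    if isinstance(size, int):
        return size == 0
    # Any zero-length axis makes the whole output empty.
    return math.prod(size) == 0

def check_population(pop_size, size=None):
    """Reject an empty population unless nothing is drawn from it."""
    if pop_size <= 0 and not takes_no_samples(size):
        raise ValueError(
            "'a' must be greater than 0 unless no samples are taken")

check_population(0, size=0)          # accepted
check_population(0, size=(3, 0, 4))  # accepted
```

Handling `None` first makes the intent visible and avoids computing a product for the common scalar case.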
raise ValueError("a must be greater than 0")
raise ValueError("'a' must be 1-dimensional or an integer")
if pop_size <= 0 and np.prod(size) != 0:
raise ValueError("'a' must be greater than 0 unless no samples are taken")
I somewhat thought it was backticks ;). This is good, I do not think we or python has serious guidelines for errors.
if p.size != pop_size:
raise ValueError("a and p must have same size")
raise ValueError("'a' and 'p' must have same size")
if np.logical_or.reduce(p < 0):
raise ValueError("probabilities are not non-negative")
if abs(kahan_sum(pix, d) - 1.) > atol:
I think if you test for an empty probabilities array, you will see that this check fails also, so might as well allow that too?
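To illustrate the point about empty probabilities: a compensated (Kahan) sum over no elements is 0, so a "sums to 1" check rejects an empty `p`. The function below is a plain-Python stand-in, not the `kahan_sum` from `mtrand.pyx`, and the tolerance value is an assumption chosen for the example.

```python
def kahan_sum(values):
    """Compensated summation; a stand-in for the Cython helper."""
    total = 0.0
    compensation = 0.0
    for value in values:
        adjusted = value - compensation
        new_total = total + adjusted
        # Recover the low-order bits lost when adding `adjusted`.
        compensation = (new_total - total) - adjusted
        total = new_total
    return total

ATOL = 1e-8  # assumed tolerance, for illustration only

def sums_to_one(p):
    return abs(kahan_sum(p) - 1.0) <= ATOL

print(sums_to_one([0.1] * 10))  # a valid distribution
print(sums_to_one([]))          # empty p: the sum is 0.0, so the check fails
```

Allowing an empty `p` would therefore need the same "unless no samples are taken" exception as the population check.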
raise ValueError("low >= high")

if ilow >= ihigh and np.prod(size) != 0:
raise ValueError("Range cannot be empty (low >= high) unless no samples are taken")
Was surprised for a bit here, but I guess we do it like a python `range` and allow strange ranges as empty ranges, seems fine to me.
OK, fine with me, thought it might have been an oversight.
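The analogy with Python's `range` can be sketched as follows. `check_bounds` is a hypothetical helper that mirrors the condition in the diff above, with `n_samples` standing in for `np.prod(size)`; it is not code from the PR.

```python
# Python's range treats low >= high as an empty range, not an error.
reversed_bounds = list(range(0, -10))
equal_bounds = list(range(10, 10))
print(reversed_bounds, equal_bounds)

def check_bounds(low, high, n_samples):
    """Hypothetical guard: n_samples is the total number of draws."""
    if low >= high and n_samples != 0:
        raise ValueError(
            "Range cannot be empty (low >= high) unless no samples are taken")

check_bounds(0, -10, n_samples=0)      # accepted: nothing is drawn
try:
    check_bounds(0, -10, n_samples=5)  # rejected: nothing to draw from
    bounds_rejected = False
except ValueError as exc:
    bounds_rejected = True
    print(exc)
```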
elif a.ndim != 1:
raise ValueError("a must be 1-dimensional")
raise ValueError("'a' must be 1-dimensional")
I don't think it is normal to escape argument names in error messages, see, e.g.,
https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/mtrand.pyx#L988
My guess is, you can find examples for everything both in numpy and the standard lib. Personally, I think quotes probably make it slightly more discoverable what `a` refers to, so sounds good to me. (e.g. something that says axis is invalid may not refer to an axis argument; something saying `axis` is invalid certainly does).
Anyway, I think the PR looks good and you guys can put it in if you like.
Since this issue and PR came from me originally: thanks to all who helped complete it!
Replaces lost PR #8717. Fixes #8311. Includes tests.
@MareinK if you wish to continue I can give you permissions to push to the source repo for the PR.