-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
Fixes for np.random.choice #2727
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The size argument to random.choice should work like it does for all other functions in random as well.
Random choice used np.unique to find new indices when replace was False and p given. This is wrong since unique will sort the indices. This solves the bug, but likely not ideal.
Looks reasonable, but needs tests like noted. You need to regenerate the .c file from the .pyx, that doesn't happen automatically. (Another reason for tests, I think this means the current patch may actually not be tested...?) |
Yes, had not really run the tests (at least not for all changes), and lots of small things I overlooked as I did not have the new things tested. Forgot to note in the commit, but thanks to @alan-isaac for pointing out the 0-d vs. scalar thing. The code to do this right is a bit awkward to be honest, maybe someone knows a better way, but the different branches need different special cases to give an array or scalar consistent here. I have added tests for this too. I have run the tests locally (I think the ternary operator in the cython file should not matter for 2.4 as its translated to C). I did not commit mtrand.c since I am currently using cython 1.6, so it would be nice if someone else could do that after looking over this (though if that is inconvenient I guess I could also install a newer version). Especially for the |
Thanks to @alan-isaac for pointing out the 0-d vs. scalar issue.
Sorry, did a rebase if anyone noticed. mtrand.c with a wrong version sneaked in there... Note that the Travis tests have to fail since |
Regenerated mtrand.c using cython 0.17.1. @certik just so its not forgotten, these changes are meant for 1.7. |
@josef-pkt: I know you keep an eye on a lot of the RNG stuff, want to review this? (Note @seberg's comment above: "Especially for the |
See also http://projects.scipy.org/numpy/ticket/2246, maybe you could take that along? |
@rgommers thats not hard, but I don't see much of a reason to try hard to squeeze it into 1.7. |
Agreed, that's what I commented on the ticket too. |
Closing here, since it was pushed into master, but I would feel better if someone had at least a look at the stuff and think its ok... I cannot see a flaw in the bug fix (without the sort the results are obviously wrong) and I looked at the results and they seem reasonable, but still... |
@njsmith, @josef-pkt maybe it is not a huge issue, but I just wondered if the function as is is actually 100% right. The thing is it uses things like Acutally only one case in there and that is with |
!! If a method on RandomState is calling into the shared global RandomState via np.random.foo, then that is a huge bug! Really we should have tests that scramble the global random seed, and then test that |
Backwards compatible by adding an `axis` keyword to the end of the signature for choice. Default is 0, which selects by row. The default keyword arguments make it functionally equivalent to random.choice from Python's standard library. Note, however, that a list of tuples will have the tuples returned as lists. Issue numpy#2724 was originally opened with that request, but numpy#2727 seems to have closed it accidentally.
As mentioned on the mailinglist this changes the default for the
size
argument toNone
and allows tuples as well, just as other random functions do. This part would need to be in the 1.7. released to avoid changing it lateron.The second commit fixes a bug that I noticed while doing the first thing. I know that tests are missing (and if someone feels like adding them please do so), but I think it would be better to raise an Exception then to keep the current behaviour (unless I am seeing things wrong). However
np.unique
returnsnew
sorted, but it must be kept exactly in the original order as far as I can see.It does not adress a possible
axis
argument (see also #2724) as it should not matter for 1.7.