-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
ENH: np.random.default_gen() #13840
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: np.random.default_gen() #13840
Conversation
The most common use will certainly be to pass in an integer seed, so I want to make sure that we talk in terms of "seeds" first and foremost. I don't want to overload people with new concepts and terminology right out of the gate.
I think this was left by mistake and should be removed.
What is the motivation for returning a singleton instance? To save the cost of sampling system entropy in creating a new Bitgenerator? I don't think we should add a singleton, since it can be abused in multi-threaded contexts. We can always add it later. |
In the cases in which you would get the global, it's fine to reuse it in different threads. It's only when you want to reproduce the results that sharing an instance across threads would be a problem. Since we don't allow reseeding in-place, this would not come up. But we can go ahead and delete it for now. We'll see who howls. Going through those examples suggests that |
@@ -167,7 +167,7 @@ | |||
|
|||
from . import mtrand | |||
from .mtrand import * | |||
from .generator import Generator | |||
from .generator import Generator, default_gen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line 9 Generator().function
-> default_gen().function
other than a couple of documentation nits, LGTM |
needs a rebase |
Thanks @rkern |
We achieved some consensus in the default
BitGenerator
thread that we want numpy to express its opinion about the defaultBitGenerator
through a function. Here's that function.Along the way, I renamed the
seed_seq
arguments toseed
and cleaned up the descriptions of that parameter. I think it's clearer this way. It puts the more familiar concepts ("a seed") up front before the newer concepts ("aSeedSequence
").I removed the argumentless
Generator()
construction in favor ofdefault_gen()
. I did notice thatnp.random.Generator().<method>()
was being referenced a lot in the docstrings, as a text-substitution fornp.random.<method>()
. I changed many of these to explicitly createrng = np.random.default_gen()
and then use therng.<method>()
.But then I noticed that we do have a default global
Generator
instance innp.random.generator._random_generator
with function-style aliases, e.g.np.random.generator.normal()
. These don't appear to be documented anywhere, and I don't recall discussing if it's a thing that we actually want in the 1.17 release (as it will commit us to maintaining them and the underlying global, and may encourage bad habits like re-seeding the global instance in the middle of a library). It was a reasonable choice forrandomgen
, and it might be for numpy 1.18, but I question if that's the right choice for numpy 1.17.That said, the
check_random_state()
idiom is a good one, and it relies on having a global default instance somewhere. It might be best to provide that innumpy.random
rather than spread out in all of the different libraries. This also raises the question what shoulddefault_gen()
anddefault_gen(None)
do? Currently, I implemented it to just create a freshPCG64
instance with OS entropy. But maybedefault_gen()
should be more likecheck_random_state()
:What do you think? I'm neutral on maintaining and documenting the global instance, and -1 on the method aliases.