ENH: use SeedSequence instead of seed() by mattip · Pull Request #13780 · numpy/numpy · GitHub
ENH: use SeedSequence instead of seed() #13780


Merged
merged 4 commits into from
Jun 26, 2019
11 changes: 11 additions & 0 deletions doc/source/reference/random/bit_generators/bitgenerators.rst
@@ -0,0 +1,11 @@
:orphan:

BitGenerator
------------

.. currentmodule:: numpy.random.bit_generator

.. autosummary::
:toctree: generated/

BitGenerator
35 changes: 30 additions & 5 deletions doc/source/reference/random/bit_generators/index.rst
@@ -1,24 +1,49 @@
.. _bit_generator:

.. currentmodule:: numpy.random

Bit Generators
--------------

.. currentmodule:: numpy.random

The random values produced by :class:`~Generator`
originate in a BitGenerator. The BitGenerators do not directly provide
random numbers and only contain methods used for seeding, getting or
setting the state, jumping or advancing the state, and for accessing
low-level wrappers for consumption by code that can efficiently
access the functions provided, e.g., `numba <https://numba.pydata.org>`_.

Stable RNGs
===========

.. toctree::
:maxdepth: 1

BitGenerator <bitgenerators>
MT19937 <mt19937>
PCG64 <pcg64>
Philox <philox>

Seeding and Entropy
-------------------

A BitGenerator provides a stream of random values. In order to generate
reproducible streams, BitGenerators support setting their initial state via a
seed. But how best to seed the BitGenerator? On first impulse one would like to
do something like ``[bg(i) for i in range(12)]`` to obtain 12 non-correlated,
independent BitGenerators. However, using a highly correlated set of seeds could
generate BitGenerators that are correlated or overlap within a few samples.

NumPy uses a `SeedSequence` class to mix the seed in a reproducible way that
introduces the necessary entropy to produce independent and largely non-
overlapping streams. Small seeds may still be unable to reach all possible
initialization states, which can cause biases among an ensemble of small-seed
runs. For many cases, that doesn't matter. If you just want to hold things in
place while you debug something, biases aren't a concern. For actual
simulations whose results you care about, let ``SeedSequence(None)`` do its
thing and then log/print the `SeedSequence.entropy` for repeatable
`BitGenerator` streams.
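
A minimal sketch of the alternative to ``[bg(i) for i in range(12)]`` that this section points towards, assuming NumPy 1.17 or later where `SeedSequence.spawn` is available. The seed value ``12345`` and the variable names are illustrative only and are not part of this PR:

```python
from numpy.random import Generator, PCG64, SeedSequence

# One parent SeedSequence; 12345 is an arbitrary illustrative seed.
parent = SeedSequence(12345)

# spawn() derives child SeedSequences rather than reusing consecutive
# integer seeds, which is the pattern the text above warns about.
children = parent.spawn(12)
generators = [Generator(PCG64(child)) for child in children]

# Each generator draws from its own stream.
first_draws = [g.random() for g in generators]
```

Keeping a single parent seed means the whole ensemble can be recreated from one logged value.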

.. autosummary::
:toctree: generated/

bit_generator.ISeedSequence
bit_generator.ISpawnableSeedSequence
SeedSequence
bit_generator.SeedlessSeedSequence
5 changes: 2 additions & 3 deletions doc/source/reference/random/bit_generators/mt19937.rst
@@ -8,13 +8,12 @@ Mersenne Twister (MT19937)
.. autoclass:: MT19937
:exclude-members:

Seeding and State
=================
State
=====

.. autosummary::
:toctree: generated/

~MT19937.seed
~MT19937.state

Parallel generation
5 changes: 2 additions & 3 deletions doc/source/reference/random/bit_generators/pcg64.rst
@@ -8,13 +8,12 @@ Parallel Congruent Generator (64-bit, PCG64)
.. autoclass:: PCG64
:exclude-members:

Seeding and State
=================
State
=====

.. autosummary::
:toctree: generated/

~PCG64.seed
~PCG64.state

Parallel generation
5 changes: 2 additions & 3 deletions doc/source/reference/random/bit_generators/philox.rst
@@ -8,13 +8,12 @@ Philox Counter-based RNG
.. autoclass:: Philox
:exclude-members:

Seeding and State
=================
State
=====

.. autosummary::
:toctree: generated/

~Philox.seed
~Philox.state

Parallel generation
67 changes: 39 additions & 28 deletions doc/source/reference/random/index.rst
@@ -9,6 +9,10 @@ Numpy's random number routines produce pseudo random numbers using
combinations of a `BitGenerator` to create sequences and a `Generator`
to use those sequences to sample from different statistical distributions:

* SeedSequence: Objects that provide entropy for the initial state of a
BitGenerator. A good SeedSequence will provide initializations across the
entire range of possible states for the BitGenerator, otherwise biases may
creep into the generated bit streams.
* BitGenerators: Objects that generate random numbers. These are typically
unsigned integer words filled with sequences of either 32 or 64 random bits.
* Generators: Objects that transform sequences of random bits from a
@@ -52,28 +56,37 @@ the random values are generated by `~PCG64`. The
rg.standard_normal()
rg.bit_generator


Seeds can be passed to any of the BitGenerators. Here `mt19937.MT19937` is used
and is the wrapped with a `~.Generator`.

Seeds can be passed to any of the BitGenerators. The provided value is mixed
via `~.SeedSequence` to spread a possible sequence of seeds across a wider
range of initialization states for the BitGenerator. Here `~.PCG64` is used and
is wrapped with a `~.Generator`.

.. code-block:: python

from numpy.random import Generator, MT19937
rg = Generator(MT19937(12345))
from numpy.random import Generator, PCG64
rg = Generator(PCG64(12345))
rg.standard_normal()
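
As a rough illustration of the seeding behaviour described in this hunk, the sketch below assumes NumPy 1.17 or later, where an integer seed is wrapped in a `SeedSequence` before it reaches the BitGenerator. The variable names are illustrative and do not come from this PR:

```python
import numpy as np
from numpy.random import Generator, PCG64, SeedSequence

# Two generators built from the same integer seed.
a = Generator(PCG64(12345)).standard_normal(4)
b = Generator(PCG64(12345)).standard_normal(4)

# The same seed passed explicitly through a SeedSequence.
c = Generator(PCG64(SeedSequence(12345))).standard_normal(4)

same_seed_matches = np.array_equal(a, b)
explicit_seedsequence_matches = np.array_equal(a, c)
```

If the assumption holds, both comparisons are true: the integer and the explicit `SeedSequence` lead to the same initial state.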


Introduction
------------
RandomGen takes a different approach to producing random numbers from the
`RandomState` object. Random number generation is separated into two
components, a bit generator and a random generator.
`RandomState` object. Random number generation is separated into three
components, a seed sequence, a bit generator and a random generator.

The bit generator has a limited set of responsibilities. It manages state
The `BitGenerator` has a limited set of responsibilities. It manages state
and provides functions to produce random doubles and random unsigned 32- and
64-bit values. The bit generator also handles all seeding which varies with
different bit generators.
64-bit values.

The `SeedSequence` takes a seed and provides the initial state for the
`BitGenerator`. Since consecutive seeds can cause bad effects when comparing
`BitGenerator` streams, the `SeedSequence` uses current best-practice methods
to spread the initial state out. However, small seeds may still be unable to
reach all possible initialization states, which can cause biases among an
ensemble of small-seed runs. For many cases, that doesn't matter. If you just
want to hold things in place while you debug something, biases aren't a
concern. For actual simulations whose results you care about, let
``SeedSequence(None)`` do its thing and then log/print the
`SeedSequence.entropy` for repeatable `BitGenerator` streams.
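
The log-and-replay workflow recommended above could look roughly like the following. This is a sketch assuming NumPy 1.17 or later; ``logged_entropy`` and the other variable names are hypothetical, and how the value is stored (printed, written to a log file) is left open:

```python
from numpy.random import Generator, PCG64, SeedSequence

# No seed given: SeedSequence gathers fresh entropy from the OS.
ss = SeedSequence(None)

# Record this value somewhere durable, e.g. print it or write it to a log.
logged_entropy = ss.entropy

rg = Generator(PCG64(ss))
sample = rg.random(3)

# Later: rebuild the same stream from the logged value alone.
rg_replay = Generator(PCG64(SeedSequence(logged_entropy)))
replayed = rg_replay.random(3)
```

The replayed generator should reproduce ``sample`` exactly, which is what makes an unseeded simulation run repeatable after the fact.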

The `random generator <Generator>` takes the
bit generator-provided stream and transforms them into more useful
@@ -86,15 +99,15 @@ The `Generator` is the user-facing object that is nearly identical to
the sole argument. Note that the BitGenerator must be instantiated.
.. code-block:: python

from numpy.random import Generator, MT19937
rg = Generator(MT19937())
from numpy.random import Generator, PCG64
rg = Generator(PCG64())
rg.random()

Seed information is directly passed to the bit generator.

.. code-block:: python

rg = Generator(MT19937(12345))
rg = Generator(PCG64(12345))
rg.random()

What's New or Different
@@ -150,9 +163,14 @@ Supported BitGenerators
-----------------------
The included BitGenerators are:

* MT19937 - The standard Python BitGenerator. Produces identical results to
Python using the same seed/state. Adds a `~mt19937.MT19937.jumped` function
that returns a new generator with state as-if ``2**128`` draws have been made.
* MT19937 - The standard Python BitGenerator. Adds a `~mt19937.MT19937.jumped`
function that returns a new generator with state as-if ``2**128`` draws have
been made.
* PCG-64 - Fast generator that supports many parallel streams and
can be advanced by an arbitrary amount. See the documentation for
:meth:`~.PCG64.advance`. PCG-64 has a period of
:math:`2^{128}`. See the `PCG author's page`_ for more details about
this class of PRNG.
* Xoshiro256** and Xoshiro512** - The most recently introduced XOR,
shift, and rotate generator. Supports ``jumped`` and so can be used in
parallel applications. See the documentation for
@@ -163,21 +181,14 @@ The included BitGenerators are:
.. _`PCG author's page`: http://www.pcg-random.org/
.. _`Random123`: https://www.deshawresearch.com/resources_random123.html
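
To make the ``advance`` and ``jumped`` entries in the list above more concrete, here is a small sketch using `PCG64`. It assumes NumPy 1.17 or later, and that each raw 64-bit draw corresponds to exactly one step of the underlying generator. The seed ``2019`` and the variable names are illustrative:

```python
from numpy.random import PCG64

bg_drawn = PCG64(2019)
bg_advanced = PCG64(2019)

# Draw five raw 64-bit values from one generator...
bg_drawn.random_raw(5)
# ...and move the other ahead by the same number of steps without drawing.
bg_advanced.advance(5)

# Under the one-draw-per-step assumption, both now sit at the same point.
next_drawn = bg_drawn.random_raw()
next_advanced = bg_advanced.random_raw()

# jumped() returns a new BitGenerator whose state is far ahead in the
# sequence; the original instance is left untouched.
far_ahead = PCG64(2019).jumped()
```

``advance`` suits cases where the exact offset matters, while ``jumped`` is the simpler way to hand widely separated starting points to parallel workers.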

Generator
---------
Concepts
--------
.. toctree::
:maxdepth: 1

generator
legacy mtrand <legacy>

BitGenerators
-------------

.. toctree::
:maxdepth: 1

BitGenerators <bit_generators/index>
BitGenerators, SeedSequences <bit_generators/index>

Features
--------
3 changes: 1 addition & 2 deletions doc/source/reference/random/new-or-different.rst
@@ -93,9 +93,8 @@ And in more detail:

.. ipython:: python

rg.bit_generator.seed(0)
rg = Generator(PCG64(0))
rg.random(3, dtype='d')
rg.bit_generator.seed(0)
rg.random(3, dtype='f')

* Optional ``out`` argument that allows existing arrays to be filled for
21 changes: 15 additions & 6 deletions numpy/random/__init__.py
@@ -16,7 +16,6 @@
bytes Uniformly distributed random bytes.
permutation Randomly permute a sequence / generate a random sequence.
shuffle Randomly permute a sequence in place.
seed Seed the random number generator.
choice Random sample from 1-D array.
==================== =========================================================

@@ -32,6 +31,7 @@
(deprecated, use ``integers(..., closed=True)`` instead)
random_sample Alias for `random_sample`
randint Uniformly distributed integers in a given range
seed Seed the legacy random number generator.
==================== =========================================================

==================== =========================================================
@@ -102,6 +102,12 @@
Philox
============================================= ===

============================================= ===
Getting entropy to initialize a BitGenerator
--------------------------------------------- ---
SeedSequence
============================================= ===

"""
from __future__ import division, absolute_import, print_function

@@ -161,22 +167,25 @@
from . import mtrand
from .mtrand import *
from .generator import Generator
from .bit_generator import SeedSequence
from .mt19937 import MT19937
from .pcg64 import PCG64
from .philox import Philox
from .mtrand import RandomState

__all__ += ['Generator', 'MT19937', 'Philox', 'PCG64', 'RandomState']
__all__ += ['Generator', 'RandomState', 'SeedSequence', 'MT19937',
'Philox', 'PCG64']


def __RandomState_ctor():
"""Return a RandomState instance.

This function exists solely to assist (un)pickling.

Note that the state of the RandomState returned here is irrelevant, as this function's
entire purpose is to return a newly allocated RandomState whose state pickle can set.
Consequently the RandomState returned by this function is a freshly allocated copy
with a seed=0.
Note that the state of the RandomState returned here is irrelevant, as this
function's entire purpose is to return a newly allocated RandomState whose
state pickle can set. Consequently the RandomState returned by this function
is a freshly allocated copy with a seed=0.

See https://github.com/numpy/numpy/issues/4763 for a detailed discussion

27 changes: 27 additions & 0 deletions numpy/random/bit_generator.pxd
@@ -0,0 +1,27 @@

from .common cimport bitgen_t
cimport numpy as np

cdef class BitGenerator():
cdef readonly object _seed_seq
cdef readonly object lock
cdef bitgen_t _bitgen
cdef readonly object _ctypes
cdef readonly object _cffi
cdef readonly object capsule


cdef class SeedSequence():
cdef readonly object entropy
cdef readonly object program_entropy
cdef readonly tuple spawn_key
cdef readonly int pool_size
cdef readonly object pool
cdef readonly int n_children_spawned

cdef mix_entropy(self, np.ndarray[np.npy_uint32, ndim=1] mixer,
np.ndarray[np.npy_uint32, ndim=1] entropy_array)
cdef get_assembled_entropy(self)

cdef class SeedlessSequence():
pass
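
For orientation, several of the read-only attributes declared on `SeedSequence` above can be inspected from Python. The sketch below assumes the behaviour of released NumPy (1.17 or later), which may differ in detail from this PR's intermediate state. The seed ``42`` and the variable names are illustrative:

```python
from numpy.random import SeedSequence

ss = SeedSequence(42)

# Snapshot of the attributes before any children are spawned.
attrs_before = (ss.entropy, ss.spawn_key, ss.pool_size, ss.n_children_spawned)

# Spawning children increments the counter and extends each child's spawn_key.
children = ss.spawn(3)
child_keys = [child.spawn_key for child in children]

# generate_state() produces the words used to initialise a BitGenerator.
words = ss.generate_state(4)
```

Each child's ``spawn_key`` records its position under the parent, which is how repeated spawning yields distinct yet reproducible initial states.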