numpy · charris · Jun 28, 2019 · Jun 27, 2019 · Jun 27, 2019 · Jun 27, 2019
diff --git a/doc/source/reference/random/bit_generators/index.rst b/doc/source/reference/random/bit_generators/index.rst
@@ -17,21 +17,23 @@ Supported BitGenerators
 
 The included BitGenerators are:
 
+* PCG-64 - The default. A fast generator that supports many parallel streams
+  and can be advanced by an arbitrary amount. See the documentation for
+  :meth:`~.PCG64.advance`. PCG-64 has a period of :math:`2^{128}`. See the `PCG
+  author's page`_ for more details about this class of PRNG.
 * MT19937 - The standard Python BitGenerator. Adds a `~mt19937.MT19937.jumped`
-  function that returns a new generator with state as-if ``2**128`` draws have
+  function that returns a new generator with state as if :math:`2^{128}` draws have
   been made.
-* PCG-64 - Fast generator that support many parallel streams and
-  can be advanced by an arbitrary amount. See the documentation for
-  :meth:`~.PCG64.advance`. PCG-64 has a period of
-  :math:`2^{128}`. See the `PCG author's page`_ for more details about
-  this class of PRNG.
-* Philox - a counter-based generator capable of being advanced an
+* Philox - A counter-based generator capable of being advanced an
   arbitrary number of steps or generating independent streams. See the
   `Random123`_ page for more details about this class of bit generators.
+* SFC64 - A fast generator based on random invertible mappings. Usually the
+  fastest generator of the four. See the `SFC author's page`_ for (a little)
+  more detail.
 
 .. _`PCG author's page`: http://www.pcg-random.org/
 .. _`Random123`: https://www.deshawresearch.com/resources_random123.html
-
+.. _`SFC author's page`: http://pracrand.sourceforge.net/RNG_engines.txt
 
 .. toctree::
    :maxdepth: 1
@@ -46,26 +48,65 @@ Seeding and Entropy
 -------------------
 
 A BitGenerator provides a stream of random values. In order to generate
-reproducableis streams, BitGenerators support setting their initial state via a
-seed. But how best to seed the BitGenerator? On first impulse one would like to
-do something like ``[bg(i) for i in range(12)]`` to obtain 12 non-correlated,
-independent BitGenerators. However using a highly correlated set of seeds could
-generate BitGenerators that are correlated or overlap within a few samples.
-
-NumPy uses a `SeedSequence` class to mix the seed in a reproducible way that
-introduces the necessary entropy to produce independent and largely non-
-overlapping streams. Small seeds are unable to fill the complete range of
-initializaiton states, and lead to biases among an ensemble of small-seed
-runs. For many cases, that doesn't matter. If you just want to hold things in
-place while you debug something, biases aren't a concern.  For actual
-simulations whose results you care about, let ``SeedSequence(None)`` do its
-thing and then log/print the `SeedSequence.entropy` for repeatable
-`BitGenerator` streams.
+reproducible streams, BitGenerators support setting their initial state via a
+seed. All of the provided BitGenerators accept an arbitrary-sized
+non-negative integer, or a list of such integers, as a seed. Those inputs
+must then be processed into a high-quality internal state for the
+BitGenerator. All of the BitGenerators in numpy delegate that task to
+`~SeedSequence`, which uses hashing techniques to ensure that even low-quality
+seeds generate high-quality initial states.
+
+.. code-block:: python
+
+  from numpy.random import PCG64
+
+  bg = PCG64(12345678903141592653589793)
+
+.. end_block
+
+`~SeedSequence` is designed to be convenient for implementing best practices.
+We recommend that a stochastic program defaults to using entropy from the OS so
+that each run is different. The program should print out or log that entropy.
+To reproduce a past run, the program should allow the user to provide that
+value through some mechanism (a command-line argument is common) so that the
+user can re-enter the entropy and reproduce the result.
+`~SeedSequence` can take care of everything except for communicating with the
+user, which is up to you.
+
+.. code-block:: python
+
+  from numpy.random import PCG64, SeedSequence
+
+  # Get the user's seed somehow, maybe through `argparse`.
+  # If the user did not provide a seed, it should return `None`.
+  seed = get_user_seed()
+  ss = SeedSequence(seed)
+  print('seed = {}'.format(ss.entropy))
+  bg = PCG64(ss)
+
+.. end_block
+
+By default, `~SeedSequence` draws a 128-bit integer from entropy gathered from
+the OS. This is a good amount of entropy to initialize all of the generators
+that we have in numpy. We do not recommend using small seeds below 32 bits for
+general use. Using just a small set of seeds to instantiate larger state spaces
+means that some initial states are impossible to reach. This creates some
+biases if everyone uses such values.
+
+There will not be anything *wrong* with the results, per se; even a seed of
+0 is perfectly fine thanks to the processing that `~SeedSequence` does. If you
+just need *some* fixed value for unit tests or debugging, feel free to use
+whatever seed you like. But if you want to make inferences from the results or
+publish them, drawing from a larger set of seeds is good practice.
+
+If you need to generate a good seed "offline", then ``SeedSequence().entropy``
+and ``secrets.randbits(128)`` from the standard library are both convenient
+ways to do so.
 
 .. autosummary::
    :toctree: generated/
 
+    SeedSequence
     bit_generator.ISeedSequence
     bit_generator.ISpawnableSeedSequence
-    SeedSequence
     bit_generator.SeedlessSeedSequence
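
Illustrative note (not part of the patch): the seeding workflow described in the hunk above can be sketched end to end as follows. The ``--seed`` flag and the ``make_bit_generator`` helper are hypothetical names chosen for this example; only ``SeedSequence``, ``PCG64`` and ``Generator`` come from ``numpy.random``.

```python
import argparse

from numpy.random import PCG64, Generator, SeedSequence


def make_bit_generator(argv):
    """Build a PCG64 from an optional --seed argument (hypothetical CLI)."""
    parser = argparse.ArgumentParser()
    # Assumption for illustration: a missing --seed means "use OS entropy".
    parser.add_argument("--seed", type=int, default=None)
    args = parser.parse_args(argv)

    ss = SeedSequence(args.seed)
    # Log the entropy so that this run can be reproduced later.
    print("seed = {}".format(ss.entropy))
    return PCG64(ss), ss.entropy


# First run: no seed given, so SeedSequence gathers entropy from the OS.
bg1, entropy = make_bit_generator([])
first = Generator(bg1).random(3)

# Later run: pass the logged entropy back in to reproduce the same stream.
bg2, _ = make_bit_generator(["--seed", str(entropy)])
second = Generator(bg2).random(3)
```

Both runs should draw identical values, because the second ``SeedSequence`` is built from the entropy that the first one reported.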
diff --git a/doc/source/reference/random/index.rst b/doc/source/reference/random/index.rst
@@ -1,18 +1,16 @@
 .. _numpyrandom:
 
+.. py:module:: numpy.random
+
 .. currentmodule:: numpy.random
 
-numpy.random
-============
+Random sampling (:mod:`numpy.random`)
+=====================================
 
 Numpy's random number routines produce pseudo random numbers using
 combinations of a `BitGenerator` to create sequences and a `Generator`
 to use those sequences to sample from different statistical distributions:
 
-* SeedSequence: Objects that provide entropy for the initial state of a
-  BitGenerator. A good SeedSequence will provide initializations across the
-  entire range of possible states for the BitGenerator, otherwise biases may
-  creep into the generated bit streams.
 * BitGenerators: Objects that generate random numbers. These are typically
   unsigned integer words filled with sequences of either 32 or 64 random bits.
 * Generators: Objects that transform sequences of random bits from a
@@ -24,28 +22,28 @@ Since Numpy version 1.17.0 the Generator can be initialized with a
 number of different BitGenerators. It exposes many different probability
 distributions. See `NEP 19 <https://www.numpy.org/neps/
 nep-0019-rng-policy.html>`_ for context on the updated random Numpy number
-routines. The legacy `RandomState` random number routines are still
+routines. The legacy `.RandomState` random number routines are still
 available, but limited to a single BitGenerator.
 
-For convenience and backward compatibility, a single `RandomState`
+For convenience and backward compatibility, a single `~.RandomState`
 instance's methods are imported into the numpy.random namespace, see
 :ref:`legacy` for the complete list.
 
 Quick Start
 -----------
 
-By default, `Generator` uses normals provided by `PCG64` which will be
-statistically more reliable than the legacy methods in `RandomState`
+By default, `~Generator` uses bits provided by `~pcg64.PCG64`, which will be
+statistically more reliable than the legacy methods in `~.RandomState`.
 
 .. code-block:: python
 
   # Uses the old numpy.random.RandomState
   from numpy import random
   random.standard_normal()
 
-`Generator` can be used as a direct replacement for `~RandomState`, although
-the random values are generated by `~PCG64`. The
-`Generator` holds an instance of a BitGenerator. It is accessible as
+`~Generator` can be used as a direct replacement for `~.RandomState`, although
+the random values are generated by `~.PCG64`. The
+`~Generator` holds an instance of a BitGenerator. It is accessible as
 ``gen.bit_generator``.
 
 .. code-block:: python
@@ -69,45 +67,37 @@ is wrapped with a `~.Generator`.
 
 Introduction
 ------------
-RandomGen takes a different approach to producing random numbers from the
-`RandomState` object.  Random number generation is separated into three
-components, a seed sequence, a bit generator and a random generator.
+The new infrastructure takes a different approach to producing random numbers
+than the `~.RandomState` object.  Random number generation is separated into
+two components, a bit generator and a random generator.
 
 The `BitGenerator` has a limited set of responsibilities. It manages state
 and provides functions to produce random doubles and random unsigned 32- and
 64-bit values.
 
-The `SeedSequence` takes a seed and provides the initial state for the
-`BitGenerator`. Since consecutive seeds can cause bad effects when comparing
-`BitGenerator` streams, the `SeedSequence` uses current best-practice methods
-to spread the initial state out. However small seeds may still be unable to
-reach all possible initialization states, which can cause biases among an
-ensemble of small-seed runs. For many cases, that doesn't matter. If you just
-want to hold things in place while you debug something, biases aren't a
-concern.  For actual simulations whose results you care about, let
-``SeedSequence(None)`` do its thing and then log/print the
-`SeedSequence.entropy` for repeatable `BitGenerator` streams.
-
 The `random generator <Generator>` takes the
 bit generator-provided stream and transforms them into more useful
 distributions, e.g., simulated normal random values. This structure allows
 alternative bit generators to be used with little code duplication.
 
 The `Generator` is the user-facing object that is nearly identical to
-`RandomState`. The canonical method to initialize a generator passes a
-`~mt19937.MT19937` bit generator, the underlying bit generator in Python -- as
-the sole argument. Note that the BitGenerator must be instantiated.
+`.RandomState`. The canonical way to initialize a generator is to call
+``default_gen``, which wraps a `~.PCG64` bit generator.
+
 .. code-block:: python
 
-  from numpy.random import Generator, PCG64
-  rg = Generator(PCG64())
+  from numpy.random import default_gen
+  rg = default_gen(12345)
   rg.random()
 
-Seed information is directly passed to the bit generator.
+One can also instantiate `Generator` directly with a `BitGenerator` instance.
+To use the older `~mt19937.MT19937` algorithm, one can instantiate it directly
+and pass it to `Generator`.
 
 .. code-block:: python
 
-  rg = Generator(PCG64(12345))
+  from numpy.random import Generator, MT19937
+  rg = Generator(MT19937(12345))
   rg.random()
 
 What's New or Different
@@ -117,9 +107,9 @@ What's New or Different
  The Box-Muller method used to produce NumPy's normals is no longer available
   in `Generator`.  It is not possible to reproduce the exact random
   values using Generator for the normal distribution or any other
-  distribution that relies on the normal such as the `numpy.random.gamma` or
-  `numpy.random.standard_t`. If you require bitwise backward compatible
-  streams, use `RandomState`.
+  distribution that relies on the normal such as the `.RandomState.gamma` or
+  `.RandomState.standard_t`. If you require bitwise backward compatible
+  streams, use `.RandomState`.
 
 * The Generator's normal, exponential and gamma functions use 256-step Ziggurat
   methods which are 2-10 times faster than NumPy's Box-Muller or inverse CDF
@@ -133,9 +123,8 @@ What's New or Different
   source of randomness that is used in cryptographic applications (e.g.,
   ``/dev/urandom`` on Unix).
 * All BitGenerators can produce doubles, uint64s and uint32s via CTypes
-  (`~PCG64.ctypes`) and CFFI
-  (:meth:`~PCG64.cffi`). This allows the bit generators to
-  be used in numba.
+  (`~.PCG64.ctypes`) and CFFI (`~.PCG64.cffi`). This allows the bit generators
+  to be used in numba.
 * The bit generators can be used in downstream projects via
   :ref:`Cython <randomgen_cython>`.
 * `~.Generator.integers` is now the canonical way to generate integer
@@ -144,8 +133,11 @@ What's New or Different
   The ``endpoint`` keyword can be used to specify open or closed intervals.
   This replaces both ``randint`` and the deprecated ``random_integers``.
 * `~.Generator.random` is now the canonical way to generate floating-point
-  random numbers, which replaces `random_sample`, `sample`, and `ranf`. This
-  is consistent with Python's `random.random`.
+  random numbers, which replaces `.RandomState.random_sample`,
+  `.RandomState.sample`, and `.RandomState.ranf`. This is consistent with
+  Python's `random.random`.
+* All BitGenerators in numpy use `~SeedSequence` to convert seeds into
+  initialized states.
 
 See :ref:`new-or-different` for a complete list of improvements and
 differences from the traditional ``Randomstate``.
@@ -154,10 +146,11 @@ Parallel Generation
 ~~~~~~~~~~~~~~~~~~~
 
 The included generators can be used in parallel, distributed applications in
-one of two ways:
+one of three ways:
 
+* :ref:`seedsequence-spawn`
 * :ref:`independent-streams`
-* :ref:`jump-and-advance`
+* :ref:`parallel-jumped`
 
 Concepts
 --------
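
Illustrative note (not part of the patch): a minimal sketch of the Generator and BitGenerator split described in this file, using only names that the diff itself mentions (``Generator``, ``PCG64``, ``MT19937``, ``SeedSequence``). The seed ``12345`` and the variable names are arbitrary, and the ``spawn`` call is shown as one plausible reading of the ``seedsequence-spawn`` entry in the parallel-generation list, not as the section that label points to.

```python
from numpy.random import Generator, MT19937, PCG64, SeedSequence

# The same Generator interface can sit on top of different BitGenerators.
rg_pcg = Generator(PCG64(12345))
rg_mt = Generator(MT19937(12345))

# The wrapped BitGenerator stays accessible on the Generator.
wrapped = rg_pcg.bit_generator

# `random` draws floats in [0, 1); `integers` with endpoint=True samples a
# closed interval, here a six-sided die.
x = rg_pcg.random()
die = rg_pcg.integers(1, 6, size=10, endpoint=True)

# One SeedSequence can spawn child sequences, one per parallel worker.
children = SeedSequence(12345).spawn(3)
streams = [Generator(PCG64(child)) for child in children]
draws = [g.random() for g in streams]
```

Each child sequence seeds its own ``PCG64``, so the three workers draw from separate streams while the whole setup stays reproducible from the single top-level seed.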

diff --git a/doc/source/reference/random/legacy.rst b/doc/source/reference/random/legacy.rst
@@ -1,3 +1,5 @@
+.. currentmodule:: numpy.random
+
 .. _legacy:
 
 Legacy Random Generation
@@ -8,46 +10,41 @@ no further improvements.  It is guaranteed to produce the same values
 as the final point release of NumPy v1.16. These all depend on Box-Muller
 normals or inverse CDF exponentials or gammas. This class should only be used
 if it is essential to have randoms that are identical to what
-would have been produced by NumPy.
+would have been produced by previous versions of NumPy.
 
 `~mtrand.RandomState` adds additional information
 to the state which is required when using Box-Muller normals since these
 are produced in pairs. It is important to use
 `~mtrand.RandomState.get_state`, and not the underlying bit generators
 `state`, when accessing the state so that these extra values are saved.
 
-.. warning::
-
-  :class:`~randomgen.legacy.LegacyGenerator` only contains functions
-  that have changed.  Since it does not contain other functions, it
-  is not directly possible to replace :class:`~numpy.random.RandomState`.
-  In order to full replace :class:`~numpy.random.RandomState`, it is
-  necessary to use both :class:`~randomgen.legacy.LegacyGenerator`
-  and :class:`~randomgen.generator.RandomGenerator` both driven
-  by the same basic RNG. Methods present in :class:`~randomgen.legacy.LegacyGenerator`
-  must be called from :class:`~randomgen.legacy.LegacyGenerator`.  Other Methods
-  should be called from :class:`~randomgen.generator.RandomGenerator`.
-
+Although we provide the `~mt19937.MT19937` BitGenerator for use independent of
+`~mtrand.RandomState`, note that its default seeding uses `~SeedSequence`
+rather than the legacy seeding algorithm. `~mtrand.RandomState` will use the
+legacy seeding algorithm. The methods to use the legacy seeding algorithm are
+currently private as the main reason to use them is just to implement
+`~mtrand.RandomState`. However, one can reset the state of `~mt19937.MT19937`
+using the state of the `~mtrand.RandomState`:
 
 .. code-block:: python
 
    from numpy.random import MT19937
    from numpy.random import RandomState
 
-   # Use same seed
    rs = RandomState(12345)
-   mt19937 = MT19937(12345)
-   lg = RandomState(mt19937)
+   mt19937 = MT19937()
+   mt19937.state = rs.get_state()
+   rs2 = RandomState(mt19937)
 
-   # Identical output
+   # Same output
    rs.standard_normal()
-   lg.standard_normal()
+   rs2.standard_normal()
 
    rs.random()
-   lg.random()
+   rs2.random()
 
    rs.standard_exponential()
-   lg.standard_exponential()
+   rs2.standard_exponential()
 
 
 .. currentmodule:: numpy.random.mtrand
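
Illustrative note (not part of the patch): the text above says that ``RandomState`` keeps extra state because legacy normals are produced in pairs. The sketch below shows why ``get_state`` matters; it assumes the default tuple form of ``get_state``, whose last two entries are the cached-normal flag and value.

```python
from numpy.random import RandomState

rs = RandomState(12345)
# The legacy routine produces normals in pairs: one is returned and the
# other is cached for the next call.
rs.standard_normal()

# The full state tuple records that a cached normal is pending.
state = rs.get_state()
has_gauss, cached_gaussian = state[3], state[4]

# Restoring the full state, including the cached value, reproduces the
# next draw exactly.
rs2 = RandomState()
rs2.set_state(state)
a = rs.standard_normal()
b = rs2.standard_normal()
```

Copying only the underlying bit generator's state would drop the cached value, so the next normal drawn from the copy could differ from the original.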

diff --git a/doc/source/reference/random/multithreading.rst b/doc/source/reference/random/multithreading.rst
@@ -1,9 +1,11 @@
 Multithreaded Generation
 ========================
 
-The four core distributions all allow existing arrays to be filled using the
-``out`` keyword argument.  Existing arrays need to be contiguous and
-well-behaved (writable and aligned).  Under normal circumstances, arrays
+The four core distributions (:meth:`~.Generator.random`,
+:meth:`~.Generator.standard_normal`, :meth:`~.Generator.standard_exponential`,
+and :meth:`~.Generator.standard_gamma`) all allow existing arrays to be filled
+using the ``out`` keyword argument. Existing arrays need to be contiguous and
+well-behaved (writable and aligned). Under normal circumstances, arrays
 created using the common constructors such as :meth:`numpy.empty` will satisfy
 these requirements.
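
Illustrative note (not part of the patch): a minimal sketch of filling one preallocated array from several threads through the ``out`` keyword. The thread count, chunk size and helper name ``fill`` are assumptions made for this example; giving each thread its own generator via ``SeedSequence.spawn`` is one reasonable choice, not necessarily the approach the rest of this page takes.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from numpy.random import Generator, PCG64, SeedSequence

n_threads, chunk = 4, 1000

# np.full returns a contiguous, aligned, writable array, as `out` requires.
values = np.full(n_threads * chunk, np.nan)

# One independent generator per thread, all derived from a single seed.
gens = [Generator(PCG64(s)) for s in SeedSequence(12345).spawn(n_threads)]


def fill(i):
    # Each thread writes into its own contiguous slice of the shared array.
    gens[i].standard_normal(out=values[i * chunk:(i + 1) * chunk])


with ThreadPoolExecutor(n_threads) as pool:
    list(pool.map(fill, range(n_threads)))
```

Because the slices do not overlap and every thread owns its generator, no locking is needed around the shared array.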