DOC: Add example for `np.random.default_rng().beta()` #25349

linus-md · 2023-12-08T18:49:10Z

Relates to #21351.

This reverts commit bc2b152.

rkern · 2023-12-08T18:55:00Z

Thank you for the contribution! I'd like to see it fleshed out a little more. See my comments on an earlier PR for the kinds of things that we'd like to see in examples for these distribution methods. Showing just one call of the method with no commentary is below the threshold (at least in my opinion).

linus-md · 2023-12-08T19:21:22Z

@rkern I hope the improved example is to your liking. Is there some guide/heuristic on how well document specific functions should be?

rkern · 2023-12-08T21:06:59Z

A good general rule is to look at other functions/methods that are similar, and see what their documentation looks like. Fortunately, for these distribution methods on Generator, there are lots to look at; c.f. normal() for a pretty canonical example.

Specifically for these methods, I find that plotting the histogram of a large sample is typically more informative than showing the raw values of a handful of samples. Plotting the histogram (potentially with the analytical PDF) is probably the first priority (we have more methods where that's the only thing in the Examples and few methods where we don't do this). Second might be to show the computations with the samples to confirm some interesting properties of the distribution (e.g. comparing the mean of the samples to the known mean derived from the parameters). Then you can think about maybe showing the effects of changing the parameters (e.g. showing the cases when a>b, a==b, and a<b as well as when both are low vs. both are high). If you introduce an application like the order statistics of uniform variates, like you mention here, it's probably good to demonstrate that relationship, but that can take up a lot of space, so it's lowest on my personal list of priorities.

As you look at the other methods' example, please note that you will see things like np.random.default_rng().normal(); this is the result of an automatic conversion of old docs and not something you should replicate. You are doing the correct style with rng = np.random.default_rng(); rng.beta(...).

If you do show the raw values in the outputs, or other computations based on the random output (e.g. computing means), please add the # may vary at the end of the outputs that may vary if you aren't using a seed. We have automated test runners that try to execute the example code, and need this notation to ignore the output.

numpy/random/_generator.pyx

Co-authored-by: Robert Kern <robert.kern@gmail.com>

linus-md · 2023-12-09T12:09:43Z

Thank you for taking so much time to explain everything clearly. Is it ok to use math.gamma I could only find the rng.gamma() distribution in numpy.

I also added the comparison to the order statistics but it might take up too much space. We could refactor the pdf into its own function to save space.

mattip · 2023-12-14T12:13:58Z

I am not sure we should have too many examples in the function docstring.

A couple of justifications:

limited documentation team bandwidth and expertise for reviewing and approving these PRs (correct me if I am wrong)
the documentation is built for every PR, inflating the documentation will make it slower
docstrings are shipped with NumPy. Inflating them makes the runtime heavier and slower to load (albeit very slightly)
By mixing theoretical information with the API documentation, we make the same documentation page serve two audiences: those looking for what the input parameters mean with those looking for deeper knowledge about the distributions. This is problematic as neither audience will be getting what they want.

I would be happier if we refer users to a wikipedia page or open a new section in the user guide around the theory and practice of the random module. We could then consider CI management to not build the user guide on every PR.

linus-md · 2023-12-14T16:17:23Z

@mattip ok. Would you prefer that I put in something like

rng = np.random.default_rng()
n, k = 100, 5
size = 1000
sample = rng.beta(k, n+1-k, size=size)

because right now there is no example at all or should I just close this?

mattip · 2023-12-14T18:47:19Z

@rkern: am I being too harsh? Was the ask to add examples to np.random motivated by end-user complaints?

rkern · 2023-12-14T19:26:21Z

Per #21351 from the Docs Team:

"Examples, more examples, and more detailed examples" is the recurrent theme in the feedback about the NumPy documentation we received via the 2020 and 2021 NumPy user surveys.

Given that, I'd say that overcomes the minor objections concerning things like the wheel size.

I do think that all of the distributions benefit from at least one plot of a typical distribution, to trigger visual memory. If it can be demonstrated that if we expanded the example for all of the distribution methods to include multiple plots, then the making of the plots and saving them balloon the CI times by noticeable amounts, then I'd be happy to have a soft-limit guideline of keeping things down to 1 plot per method.

I also think that most of the methods could use a simple demonstration of what each of the parameters mean. Particularly in cases like this where we have two that that are hard to distinguish. I think it's really good to have that final example that shows when a > b the peak moves to the lower end of the range, and when b > a the peak moves high. That said, this particular phenomenon could also be demonstrated without the plot by calculating the means of the samples.

While each distribution method should link to the appropriate Wikipedia article, and we shouldn't recapitulate every interesting fact about the distribution, we should have our own quick demonstrations of the effects of the method arguments. We often have nonstandard parameterizations that confuse people who use the Wikipedia article as a reference.

I do think this PR is a little on the high end of what we want (and thank you @linus-md for being patient with us while we whip you round-and-around on this; we all currently have different ideas of what we want). If I'd drop something, it would be the demonstration of the relationship to the explicit order statistics of the uniforms; it's a nice-to-have example, but if we're feeling pressure to slim this down, that would be the one; it's more about the distribution itself than using the method. But overall, I think this is in range.

mattip · 2023-12-14T20:26:33Z

Thanks @rkern, I am convinced :). And thanks @linus-md for your contributions.

linus-md · 2023-12-15T15:09:03Z

Hi @rkern @mattip, what do you say to the current version? I removed the example, the math import and am doing only one plot now. Do you think we can agree on something like the current version?

There might be some disagreement but I think its is nice to see the mirror symmetry visually.

linus-md added 5 commits December 6, 2023 16:17

DOC: examples for np.polynomial.laguerre funcs

bc2b152

Revert "DOC: examples for np.polynomial.laguerre funcs"

fa0ea96

This reverts commit bc2b152.

Merge branch 'main' of https://github.com/linus-md/numpy

2953b83

Merge branch 'numpy:main' into main

3c9458e

DOC: Add example of beta distribution

c968df3

github-actions bot added the 04 - Documentation label Dec 8, 2023

rkern added the component: numpy.random label Dec 8, 2023

linus-md added 2 commits December 8, 2023 20:16

Improve example

b746745

fix typo

747c03a

charris changed the title ~~DOC: Add example for np.random.default_rng().beta()~~ DOC: Add example for np.random.default_rng().beta() Dec 8, 2023

rkern reviewed Dec 8, 2023

View reviewed changes

numpy/random/_generator.pyx Outdated Show resolved Hide resolved

linus-md and others added 2 commits December 9, 2023 12:30

Update numpy/random/_generator.pyx

a014023

Co-authored-by: Robert Kern <robert.kern@gmail.com>

improve example

37e1cf1

linus-md added 3 commits December 9, 2023 13:14

fix typo

16908d2

a=b case

347a5b6

mirror symmetry

5126b5d

This was referenced Dec 9, 2023

DOC: Add example for np.random.default_rng().binomial() #25354

Merged

DOC: random number generators #25373

Closed

linus-md added 3 commits December 15, 2023 14:56

streamline example

807dd89

change order

bcefb0c

remove pmf

8b4aba6

Add reference

2fe9e55

Update _generator.pyx

55d8b8d

linus-md mentioned this pull request Dec 15, 2023

DOC: Add example for np.random.default_rng().geometric() #25357

Merged

Update _generator.pyx

237dd3e

linus-md marked this pull request as draft December 17, 2023 20:10

linus-md mentioned this pull request Dec 17, 2023

DOC: Add example to rng.beta(...) #25413

Merged

linus-md closed this Dec 17, 2023

linus-md deleted the random-doc branch December 17, 2023 23:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

DOC: Add example for `np.random.default_rng().beta()` #25349

DOC: Add example for `np.random.default_rng().beta()` #25349

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

DOC: Add example for np.random.default_rng().beta() #25349

DOC: Add example for np.random.default_rng().beta() #25349

Uh oh!

Conversation

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

DOC: Add example for `np.random.default_rng().beta()` #25349

DOC: Add example for `np.random.default_rng().beta()` #25349