DOC, API: improve the C-API/Cython documentation and interfaces for random #15007

mattip · 2019-11-29T09:14:26Z

- Clean up the random C documentation
  - Split the page into Cython and C-API sections
  - Document the bitgen_t struct
  - Function formatting should render cleanly, linking to non-standard C arguments
- Rename functions in distributions
- Refactor CFFI test, better document the example files

mattip · 2019-11-29T09:15:25Z

mattip · 2019-11-29T13:53:17Z

The reworked "extending" documentation can be seen here, and the Cython/C-API page is here. The functions on that last page could use some more documentation.

bashtage

I think this looks good and is an improvement.

rkern · 2019-11-29T18:22:11Z

doc/source/reference/random/c-api.rst

@@ -112,7 +104,7 @@ The functions are named with the following cconventions:

 .. c:function:: double random_f(bitgen_t *bitgen_state, double dfnum, double dfden)

-.. c:function:: double random_standard_cauchy(bitgen_t *bitgen_state)
+.. c:function:: double random_cauchy(bitgen_t *bitgen_state)


If it already matches the name at the Python level, please leave it as such.

Reverting the cauchy and random_t changes.

@mattip I think you forgot to push the revert commit here.

good catch. fixing.

bashtage · 2019-11-29T19:34:05Z

I think it would be best, but perhaps not possible, to rename at the python level. Would also sync with scipy.stats.cauchy

rkern · 2019-11-29T20:46:13Z

Making it harder to upgrade from RandomState to Generator is not the price that I would pay for cleaning this up. Besides, these are standard_ methods. They are more like standard_normal than they are normal, and these are distributions where I kind of expect to be able to specify the loc and scale unless otherwise marked, which is what standard_ does. That's not the case with, say, beta. I get that it bugs some people when looking at it from one direction, but the opposite situation would bug me!

If anything needs to be done to rationalize this, I'd add cauchy and t methods (the ship having sailed on student_t) that do take the loc and scale parameters (or whatever names are deemed appropriate).
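The relationship between a `standard_` method and a loc/scale variant can be sketched as follows. This is a minimal standard-library illustration, not NumPy's implementation: the function names `standard_cauchy` and `cauchy` below are hypothetical stand-ins, and the inverse-CDF draw is an assumption chosen for brevity.

```python
import math
import random


def standard_cauchy(rng: random.Random) -> float:
    """Draw one standard Cauchy variate (loc=0, scale=1) by inverse-CDF sampling."""
    u = rng.random()  # uniform on [0, 1)
    return math.tan(math.pi * (u - 0.5))


def cauchy(rng: random.Random, loc: float = 0.0, scale: float = 1.0) -> float:
    """Hypothetical loc/scale variant: shift and rescale a standard draw."""
    return loc + scale * standard_cauchy(rng)


if __name__ == "__main__":
    rng = random.Random(2019)
    print([round(cauchy(rng, loc=3.0, scale=2.0), 3) for _ in range(3)])
```

A loc/scale method of this kind adds no new sampling logic. It only wraps the standard draw, which is why the `standard_` prefix signals that no loc or scale is accepted.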

bashtage · 2019-11-30T07:47:43Z

> Making it harder to upgrade from RandomState to Generator is not the price that I would pay for cleaning this up. Besides, these are standard_ methods. They are more like standard_normal than they are normal, and these are distributions where I kind of expect to be able to specify the loc and scale unless otherwise marked, which is what standard_ does. That's not the case with, say, beta. I get that it bugs some people when looking at it from one direction, but the opposite situation would bug me!

I'm not actually arguing these must be changed, although I would argue that it was a mistake not to change them when Generator was introduced and other functions were renamed, e.g., randint -> integers.

I personally find standard_t to be a bit confusing since in my area there is a distribution known as the standardized t, which is a Student's t that is normalized to have unit variance for all choices of the degrees-of-freedom parameter (dof > 2). The t that adds two parameters to set the location and scale is known as a generalized t.
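The three variants mentioned here can be told apart with a short standard-library sketch. All function names below are hypothetical, and the construction (a normal draw divided by the square root of a scaled chi-square draw) is the textbook definition, not necessarily what NumPy's C code does.

```python
import math
import random


def standard_t(rng: random.Random, df: float) -> float:
    """Student's t draw: Z / sqrt(V / df) with Z ~ N(0, 1), V ~ chi-square(df)."""
    z = rng.gauss(0.0, 1.0)
    # chi-square(df) is a Gamma distribution with shape df/2 and scale 2
    v = rng.gammavariate(df / 2.0, 2.0)
    return z / math.sqrt(v / df)


def standardized_t(rng: random.Random, df: float) -> float:
    """Unit-variance t: Var(t) = df / (df - 2), so rescale by sqrt((df - 2) / df)."""
    if df <= 2:
        raise ValueError("df must be > 2 for the variance to exist")
    return standard_t(rng, df) * math.sqrt((df - 2.0) / df)


def generalized_t(rng: random.Random, df: float,
                  loc: float = 0.0, scale: float = 1.0) -> float:
    """Location/scale t: shift and rescale a standard Student's t draw."""
    return loc + scale * standard_t(rng, df)
```

The standardized form fixes the variance at one regardless of `df`, while the generalized form leaves the shape alone and exposes location and scale. That difference is the source of the naming confusion described above.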

mattip · 2019-11-30T08:53:37Z

Windows 32bit failures are unrelated. We have I think at least three choices to make with python standard_t/ C random_standard_t:

1. change the python level and C level to student_t, random_student_t
2. add an additional python level student_t, change the C to random_student_t
   - possibly deprecate standard_t
3. leave it alone.

I would tend to 2, with no deprecation but with a comment in the documentation.

rgommers · 2019-12-03T04:06:04Z

doc/source/reference/random/c-api.rst

-                          size_t num_colors, int64_t *colors,
-                          int64_t nsample,
-                          size_t num_variates, int64_t *variates)
+.. c:function:: int random_mvhg_count(bitgen_t *bitgen_state, npy_int64 total, size_t num_colors, npy_int64 *colors, npy_int64 nsample, size_t num_variates, npy_int64 *variates)


I still don't know if mvhg is a sensible name, there's no docs other than the function signature and the name is very non-descriptive. Adding a few lines of docs can be addressed after the 1.18.x split, that would be nice (now it's "read the source code" I think).

For 1.18.x though, can we have a decision on the name? The legacy code has random_hypergeometric, so that's taken. The new code could use hypergeom or multi_hypergeom or something similar perhaps.

As far as I can see now, random_mvhg_count and random_mvhg_marginals are the only two poor naming choices that are left, everything else looks clear to me.

@WarrenWeckesser any ideas on nicer names?

I agree that the names random_mvhg_count and random_mvhg_marginals are not great for a public-facing API. The verbose, completely explicit name for the marginals version would be

random_multivariate_hypergeometric_marginals

but I don't think anyone wants a name that long. Is

random_multivar_hypergeom_marginals

still too long?

@rgommers wrote:

> I still don't know if mvhg is a sensible name, there's no docs other than the function signature

The C functions random_hypergeometric, random_mvhg_count and random_mvhg_marginals have documentation in the comments of the C code. These comments explain the arguments, including assumptions about the input values (e.g. input values are expected to be nonnegative) that are not checked in the C code.

Is the plan to eventually expand the documentation in this file (doc/source/reference/random/c-api.rst) to include such information?
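For readers unfamiliar with the arguments in the signature above, the following standard-library sketch shows what `colors`, `nsample` and the returned variates mean for a multivariate hypergeometric draw. It is an illustration only: the function name is hypothetical, the urn-sampling approach is an assumption rather than a description of the C source, and the explicit checks stand in for the kind of input assumptions that the C comments are said to document but the C code does not enforce.

```python
import random
from typing import List, Sequence


def multivariate_hypergeometric(rng: random.Random,
                                colors: Sequence[int],
                                nsample: int) -> List[int]:
    """Draw nsample items without replacement from an urn holding colors[i]
    items of color i, and return how many of each color were drawn."""
    if any(c < 0 for c in colors):
        raise ValueError("every entry of colors must be nonnegative")
    total = sum(colors)
    if not 0 <= nsample <= total:
        raise ValueError("nsample must be between 0 and sum(colors)")
    # One urn entry per item, labelled with its color index.
    urn = [i for i, count in enumerate(colors) for _ in range(count)]
    variates = [0] * len(colors)
    for color_index in rng.sample(urn, nsample):
        variates[color_index] += 1
    return variates


if __name__ == "__main__":
    rng = random.Random(15007)
    print(multivariate_hypergeometric(rng, colors=[5, 3, 2], nsample=4))
```

Building the whole urn costs memory proportional to `sum(colors)`, so this sketch is only meant to make the argument names concrete, not to suggest an efficient method.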

rgommers · 2019-12-03T04:13:27Z

> 1. change the python level and C level to student_t, random_student_t
> 2. add an additional python level `student_t`, change the C to `random_student_t`
>    * possibly deprecate `standard_t`
> 3. leave it alone.
>
> I would tend to 2, with no deprecation but with a comment in the documentation.

I'm fine with "3. leave it alone". When I brought up the naming inconsistency I missed that those names were already present in the Python API. So I'd say perhaps not worth it; I'd be happy to just have the new additions be named as clearly as possible.

rgommers

Getting there, thanks @mattip. All requested changes are minor I think.

mattip · 2019-12-03T07:10:58Z

Fixed the niggles and mistakes, tests pass, now waiting for a decision about the names.

bashtage · 2019-12-03T13:40:21Z

I think the first is exceedingly clear. The second is also readable, but long or long plus 9 characters, it is still a long name.

rgommers · 2019-12-03T22:23:23Z

Changes look good, so merged. A couple of things left to do or decide in a follow-up PR:

- expanding the mvhg naming
- removing one instance of `random.examples` in the docs
- use of private names in the Cython API - I thought we agreed to add `__init__.pxd` to not have to do `cimport numpy.random._generator`?

> I think the first is exceedingly clear. The second is also readable, but long or long +9 is still a long name.

I'd be happy with either, with a light preference for the longer version (so agreeing with @bashtage)

DOC: improve the C-API/Cython documentation for random

16e8059

mattip added 04 - Documentation 30 - API labels Nov 29, 2019

DOC, TST: refactor CFFI test, add file names to documentation

bcd950f

mattip mentioned this pull request Nov 29, 2019

API: provide examples of use of numpy random API via user stories #14778

Open

8 tasks

API: rename functions in distributions.c,h

a2acfa6

mattip changed the title ~~DOC: improve the C-API/Cython documentation for random~~ DOC, API: improve the C-API/Cython documentation and interfaces for random Nov 29, 2019

mattip requested a review from rgommers November 29, 2019 13:04

DOC: sphinx does not like breaking function declarations over lines

aeaee5e

bashtage reviewed Nov 29, 2019

rkern reviewed Nov 29, 2019

API: revert changes to standard_t, cauchy

2b791a5

mattip mentioned this pull request Dec 3, 2019

REL: Prepare for 1.18 branch #15031

Merged

rgommers added this to the 1.18.0 release milestone Dec 3, 2019

rgommers reviewed Dec 3, 2019

rgommers reviewed Dec 3, 2019

rgommers reviewed Dec 3, 2019

rgommers requested changes Dec 3, 2019

DOC: fixes from review

b2f2700

rgommers merged commit fdd8395 into numpy:master Dec 3, 2019

mattip mentioned this pull request Dec 4, 2019

API, DOC: change names to multivariate_hypergeometric, improve docs #15046

Merged

charris mentioned this pull request Dec 5, 2019

API, DOC: change names to multivariate_hypergeometric, improve docs #15058

Merged

mattip deleted the random-c-api2 branch November 2, 2020 08:27
