Validate the statistics of the random distributions #15911
Comments
Is numpy participating in GSoC this year? |
No, I'm thinking of next year. |
@charris do you have any reference on this subject, any starting points? I would like to understand how this could be done, and how we would know that the work is done. |
@charris were you still interested in this being done? I'm assuming you meant to perform hypothesis tests in which the null hypothesis is that a sample from the RNG is drawn from a hypothesized distribution. A small p-value would be reason to take a closer look. Of course, you could get false positives (especially if many tests are to be performed), and a large p-value is inconclusive. SciPy has two tests that would work out of the box for any univariate distribution.
import numpy as np
from scipy import stats

# Draw a sample from the generator under test
rng = np.random.default_rng(35823458245)
n = 1000
shape = 1.
loc = 0.
scale = 1.
sample = rng.gamma(shape=shape, scale=scale, size=n)

# Hypothesized distribution with known parameters
dist = stats.gamma
known_params = dict(a=shape, loc=loc, scale=scale)

# Kolmogorov-Smirnov test against the hypothesized CDF
res = stats.ks_1samp(sample, dist(**known_params).cdf)
assert res.pvalue > 0.05

# Cramer-von Mises test against the hypothesized CDF
res = stats.cramervonmises(sample, dist(**known_params).cdf)
assert res.pvalue > 0.05

for statistic in ('ad', 'filliben'):  # also supports 'ks', 'cvm'
    # As written, this test is sort of circular, since it will use `dist.rvs` to
    # draw random samples, and `rvs` methods rely on NumPy.
    # It can be modified to perform a valid test, though.
    res = stats.goodness_of_fit(dist, sample, statistic=statistic,
                                known_params=known_params)
    assert res.pvalue > 0.05

Also, since the parameters of the underlying distribution are known, one could take the CDF of the sampled values and use any test developed to validate a uniform RNG (e.g. here). Perhaps @steppi has other ideas (e.g. Bayesian methods)? |
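A minimal sketch of that CDF-then-uniform-test idea, assuming the same gamma setup as above (the seed here is illustrative, not from the thread):

import numpy as np
from scipy import stats

rng = np.random.default_rng(12345)  # illustrative seed
sample = rng.gamma(shape=1., scale=1., size=1000)

# Probability integral transform: if the sampler is correct, these are U(0, 1)
u = stats.gamma(a=1., loc=0., scale=1.).cdf(sample)

# Any uniformity test now applies, e.g. a one-sample KS test against U(0, 1)
res = stats.kstest(u, stats.uniform.cdf)
assert res.pvalue > 0.05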
Long ago the legacy random distributions were tested to see if they produced the correct distributions. The usual method is to map the output values so that the result should be uniformly distributed and check that. @rkern, @bashtage would have a better idea whether it would be desirable to do that with the new code. |
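A hedged sketch of that map-to-uniform check, using a chi-squared test on equal-probability bins; the exponential distribution, bin count, and seed are arbitrary illustrative choices, not the legacy test suite's:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)          # arbitrary seed for illustration
sample = rng.exponential(scale=2.0, size=100_000)

# Map through the exact CDF; a correct sampler yields values uniform on [0, 1]
u = stats.expon(scale=2.0).cdf(sample)

# Chi-squared test on equal-probability bins of the mapped values
nbins = 50
observed, _ = np.histogram(u, bins=np.linspace(0, 1, nbins + 1))
expected = np.full(nbins, len(u) / nbins)
res = stats.chisquare(observed, expected)
assert res.pvalue > 0.05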
Would be a good project for someone. I think it would be particularly helpful to verify the distribution near various boundaries, especially when the variates are generated using a rejection sampler. This would lead to practical advice about the effective range of parameter values, e.g., a formal requirement in the distribution that x >= 0 might mean that in practice either x == 0 or x > 1e-8. |
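The kind of boundary probe that could feed such advice might look like the sketch below; the small shape parameter, sample size, and choice of gamma are illustrative assumptions, not results about NumPy's samplers.

import numpy as np

rng = np.random.default_rng(7)  # illustrative seed

# Probe a very small shape parameter, where a rejection/transform sampler may underflow
sample = rng.gamma(shape=1e-3, scale=1.0, size=1_000_000)

# Report how the sampler behaves near the x >= 0 boundary: exact zeros vs. the
# smallest strictly positive variate actually produced
n_zero = np.count_nonzero(sample == 0.0)
smallest_positive = sample[sample > 0].min()
print(f"exact zeros: {n_zero}, smallest positive variate: {smallest_positive:.3e}")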
Example: zipf with |
I'm interested to see what ideas are generated here. In JAX we use KS tests and chi-squared tests to validate distributions in our test suite. I'm sure there are better ideas out there, and I'd love to hear them! |
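As a rough sketch of that chi-squared style of check for a discrete distribution (the Poisson example, tail lumping, and seed are my assumptions, not JAX's actual test code):

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)          # illustrative seed
mu = 3.0
n = 100_000
sample = rng.poisson(lam=mu, size=n)

# Compare observed counts per outcome with the exact Poisson pmf; lump the
# tail into one bin so every expected count is reasonably large
kmax = 10
observed = np.bincount(np.minimum(sample, kmax), minlength=kmax + 1)
expected = stats.poisson(mu).pmf(np.arange(kmax)) * n
expected = np.append(expected, n - expected.sum())   # tail bin: n * P(X >= kmax)
res = stats.chisquare(observed, expected)
assert res.pvalue > 0.05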
There must be an extensive body of literature and prior work on this. Is anyone familiar with any fundamental literature or existing software for this? For the low-level RNG algorithms, there are the "diehard tests". For the reasons already mentioned, a single p-value from a single sample has limited value as a rigorous test. FWIW, when I added the |
I think |
Yes and no! On the no side, even the low-level PRNG tests are of the form "make a bunch of NHSTs and check the p-values; be suspicious of clusters of extreme values". And their concerns are largely distinct from ours. The low-level PRNG tests mostly spend their effort on testing for independence, since the uniform distribution properties are much easier for these algorithms to obtain. We have complementary concerns here; independence is relatively straightforward when relying on a quality low-level PRNG, while the fine details of the distribution, particularly in tail regions and at extreme parameters, are of more concern. On the yes side, testing whether a set of samples follows a given distribution is the bread and butter of NHSTing in general. The problem for us is that we are generally implementing both the numerical CDF/PPFs and the samplers ourselves. The NHSTs are okay for screening for problem areas, but once alerted to a potential problem area, then you have to go in and look at the generated values and compare them with the distributions to see how the issue actually presents itself. Because it's floating point, there are going to be issues somewhere; there is little chance of recovering the perfect uniformity of the
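One way the "make a bunch of NHSTs and check the p-values" approach could look in code, as a sketch under assumed choices of distribution, replicate count, and test: run the same goodness-of-fit test on many independent samples and check that the replicate p-values themselves look uniform, rather than acting on any single p-value.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2023)     # illustrative seed
n_reps, n = 200, 1000

# One KS p-value per independent sample from the generator under test
pvalues = [stats.ks_1samp(rng.standard_normal(n), stats.norm.cdf).pvalue
           for _ in range(n_reps)]

# Under the null, the p-values are themselves U(0, 1); test that, and look for
# clusters of extreme values instead of rejecting on any one small p-value
meta = stats.kstest(pvalues, stats.uniform.cdf)
print(f"meta p-value: {meta.pvalue:.3f}, smallest replicate p-value: {min(pvalues):.4f}")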
FWIW, PractRand is what I use. TestU01 is also well-regarded (if you hear about passing/failing "BigCrush", that's TestU01). Both are ... venerable, so I use lemire/testingRNG to get them built and runnable with the latest community patches. But neither is particularly relevant for this issue. They test low-level bit-spouting PRNG algorithms, not floating point continuous distributions. Reading about their test suite philosophy and interpretation of extreme p-values is mildly applicable, but not by much. |
Possible candidate for GSoC. The testing will be long-running, so it should be kept separate and run occasionally. Organization is open to discussion.
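One possible organization, sketched under the assumption that a pytest marker keeps these long-running checks out of the default run; the marker name, threshold, and test are hypothetical, not an existing NumPy convention:

import numpy as np
import pytest
from scipy import stats

@pytest.mark.slow  # hypothetical marker; register it in pytest config and deselect it by default
def test_gamma_matches_reference_cdf():
    rng = np.random.default_rng(123456)           # illustrative seed
    sample = rng.gamma(shape=2.0, scale=3.0, size=1_000_000)
    res = stats.ks_1samp(sample, stats.gamma(a=2.0, scale=3.0).cdf)
    assert res.pvalue > 1e-4  # loose threshold to keep false alarms rare

The occasional full run could then be requested explicitly, e.g. with `pytest -m slow`, while ordinary CI deselects the marker.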