BUG: Fix bug in Runs.runs_test for the case of a single run yielding incorrect zscore and pvalue #9524
Conversation
Thanks for pointing this out. What is the theoretical justification for the chosen p-value? I need to check this.
I don't choose a fixed p-value; I calculate the probability that a sample of size N is all the same, which is 2^(1-N). I then use this to calculate a z-score assuming a normal approximation. The one bit I am slightly unsure of is that, because we have two possible states (above vs. below the threshold used for calculating the run value), I think the probability might need to be doubled. In any case, the z-score is negative because the run is longer than expected by chance.
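For concreteness, here is a minimal sketch of the calculation described above. It is illustrative only, not the exact patch applied to Runs.runs_test, and the helper name single_run_stats is hypothetical:

```python
from scipy import stats

def single_run_stats(n_obs):
    """Illustrative p-value and z-score when all n_obs values form one run."""
    # Probability that every observation falls on the same side of the cutoff,
    # assuming the two states (above/below) are equally likely:
    # 2 * 0.5**n_obs == 2**(1 - n_obs).  This is the doubling discussed above.
    pvalue = 2.0 ** (1 - n_obs)
    # Map the two-sided p-value back onto the normal approximation.  The
    # z-score is negative because one run is fewer runs than expected by chance.
    zscore = stats.norm.ppf(pvalue / 2.0)
    return zscore, pvalue

# Example: ten identical residuals give pvalue ~= 0.002 and zscore ~= -3.1.
z, p = single_run_stats(10)
```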
I kind of understand the argument, and it looks reasonable. However, I'm a bit distracted these days (home renovation) and might need some time to figure out the details. But it looks intuitive that we should reject in large samples if all outcomes are the same, i.e. only one run is observed. (A note of caution: in the scipy.stats t-test we thought we had a justification for a specific non-NaN result when the variance is zero, but there are different ways of approaching a zero-variance limit, so scipy.stats switched back to returning NaNs.)
Aside: based on the comments and notes, the functions were largely based on SAS documentation. (AFAIR I had given up on "exact" distributions for runs as too complicated and not worth the effort. Some results are still in the sandbox module.) Another aside: it looks like I have not worked on this since 2013. This is another module to review for what can be moved out of the sandbox and what is unfinished experimental code. Update: it is likely that I mixed up the references when I wrote the docstrings.
There are a few test failures, but these seem entirely unrelated to the runs test, so I presume they reflect other possible issues in the wider package. I have therefore not reviewed them in any detail.

FAILED statsmodels/stats/tests/test_deltacov.py::TestDeltacovOLS::test_ttest
FAILED statsmodels/tsa/forecasting/tests/test_theta.py::test_auto - Assertion...
FAILED statsmodels/regression/tests/test_regression.py::test_summary_as_latex
Unit test failures are unrelated and can be ignored here.
Out of curiosity, how did you run into this problem with only one run?
I'm a very minor author on the CSAPS package (https://github.com/espdev/csaps), a smoothing cubic spline package. When doing cubic spline smoothing, the question naturally arises of what the correct level of smoothing is for arbitrary data, so that you can implement auto-smoothing. My solution is to maximise the p-value of the runs test on the residuals, which avoids the influence that outliers and similar features in the data might have on other tests. There are two boundaries for the smoothing parameter: effectively a linear fit to the data at one end (which will have some valid number of runs), or an unsmoothed cubic spline at the other end (which passes exactly through each data point). At that second boundary, the residuals are a string of zeroes, and the input to the runs test triggers exactly the issue I came upon here. For my optimiser to work, I need a valid value at the boundary, hence the need to correct the runs test implementation!
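A rough sketch of that idea, under stated assumptions: the auto_smooth helper is hypothetical, the csaps function and its smooth keyword are used as I understand the csaps docs, and runstest_1samp from statsmodels wraps the runs test being fixed here:

```python
from scipy.optimize import minimize_scalar
from statsmodels.sandbox.stats.runs import runstest_1samp
from csaps import csaps

def auto_smooth(x, y):
    """Pick the csaps smoothing parameter that maximises the runs-test
    p-value of the residuals, i.e. the smoothing for which the residuals
    look most like random noise around the fit."""
    def neg_runs_pvalue(smooth):
        fitted = csaps(x, y, x, smooth=smooth)
        residuals = y - fitted
        # runstest_1samp returns (z-score, p-value); dichotomise at zero.
        _, pvalue = runstest_1samp(residuals, cutoff=0, correction=False)
        return -pvalue

    # smooth=0 is roughly the linear-fit boundary; smooth=1 is the
    # interpolating spline whose residuals are all zero (the single-run case).
    result = minimize_scalar(neg_runs_pvalue, bounds=(0.0, 1.0), method="bounded")
    return result.x
```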
Merging; looks good AFAIR (before I got lost in runs test extensions).