8000 BUG: np.random.zipf hangs the interpreter on pathological input · Issue #9829 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

BUG: np.random.zipf hangs the interpreter on pathological input #9829

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
eric-wieser opened this issue Oct 6, 2017 · 8 comments · Fixed by #27048
Closed

BUG: np.random.zipf hangs the interpreter on pathological input #9829

eric-wieser opened this issue Oct 6, 2017 · 8 comments · Fixed by #27048

Comments

@eric-wieser
Copy link
Member
>>> np.random.zipf(np.nan) 
# hangs forever, doesn't respond to ctrl C
>>> np.random.zipf(1.0000000000001)
# hangs for a possibly finite but very long time, also not responding to Ctrl+C

The former should probably just throw an exception. The latter is sort of inevitable with a rejection algorithm, but if we called PyErr_CheckSignals() then at least it could be cancelled.

@ghost
Copy link
ghost commented Nov 20, 2017

Doesn't seem to be present anymore on the current master:

Python 3.6.3 (default, Nov 13 2017, 02:17:23) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.random.zipf(np.nan)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mtrand.pyx", line 4086, in mtrand.RandomState.zipf
ValueError: 'a' must be a valid float > 1.0
>>> 

@eric-wieser
Copy link
Member Author

It was fixed for Nan, but the problem remains for numbers close to 1

tylerjereddy added a commit to tylerjereddy/numpy that referenced this issue Apr 19, 2018
* Fixes Issue numpy#9829 by allowing a keyboard interruption
to break out of zipf() computations that run for very
long times because of pathological inputs

* Added a unit test that verifies handling of SIGINT
by zipf()
@toslunar
Copy link
Contributor

While the nan case has been fixed, the problem remains for large a.

>>> np.__version__
'1.16.0.dev0+99f7ad5'
>>> np.random.zipf(10000)
# hangs

This is probably because the following code overflows.

    am1 = a - 1.0;
    b = pow(2.0, am1);

@bashtage
Copy link
Contributor
>>> np.random.zipf(1.0000000000001)
# hangs for a possibly finite but very long time, also not responding to Ctrl+C

This doesn't hand forever since there is a positive probability of drawing a value of 0.0, so that 1.0-draw is 1 and the rejection sampler will always exit. It may take an extraordinarily long time (1 in 2**53), but not a bug.

@charris
Copy link
Member
charris commented Apr 11, 2019

I think it is a bug. I think we decided that it reasonable to raise an error for parameters excessively close to 1. Note that this problem can be fixed by using a truncated part of the distribution and rewriting the majorizing distribution so that it succeeds more often, easy speedups of 1000x or more follow. See discussion at #9824. @bashtage The new randomgen should be fixed to handle this better.

@mattip
Copy link
Member
mattip commented Apr 11, 2019

xref #13164 and #13163

@vks
Copy link
vks commented Aug 2, 2021

I think the biggest problem is that small a are very likely to result in an infinite X, which is always rejected. Returning RAND_INT_MAX instead would fix this issue at the cost of introducing a bias.

@charris
Copy link
Member
charris commented Aug 2, 2021

which is always rejected

Yes. I note that other versions allow setting the maximum via an argument, which is probably the best way to do it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants
0