multinomial casts input to np.float64 #8317
Comments
Hmm, I suspect the float32 sum is off due to rounding. Not by much, though. The question is, what is pvals[-1]? If it is large, I suspect another problem somewhere. How many values are in pvals?
pvals[-1] is typically also very small, along with most of the other 64 terms in pvals (one or two terms tend to dominate). Happy to upload a .npy if that'd help.
This definitely sounds like a roundoff problem in computing the sum, and maybe in normalizing also. Not sure what your best bet for a fix is; it probably depends on the details of your application.
I'm dealing with it application-side; I just thought it might be worth noting, as this is a bit of a surprising behaviour and it certainly took me a while to work out what was going on. Is there any reason a float32 array wouldn't be acceptable to np.random.multinomial?
No. But this is a classic setup for floating point roundoff errors: lots of small values and a couple of big ones. All the small values lose precision when they are added to the big ones. After this you won't be surprised, you will expect problems ;)
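A concrete illustration of that rounding, using the classic 0.1 example rather than the original poster's data: the float32 nearest to 0.1 sits slightly above 0.1, and the excess only becomes visible once the same values are summed in float64, which is the sum multinomial ends up checking after the cast this issue describes.

import numpy as np

# Ten copies of the float32 nearest to 0.1. Each element is exactly
# 0.100000001490116119384765625, so ten of them add up to a bit more than 1.
p32 = np.full(10, 0.1, dtype=np.float32)

print(p32.sum())                  # accumulated in float32
print(p32.sum(dtype=np.float64))  # about 1.0000000149, i.e. just above 1.0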
Many apologies for resurrecting an old thread here, but I noticed that PyMC3 runs into issues with numpy's multinomial function when working with float32s returned from the GPU. Is there a reason for explicitly casting to float64, rather than simply letting the floats be what they are?
@QCaudron may I ask, what's the fix that you've been working with to get around the multinomial probability rounding problems?
@ericmjl I'm afraid I don't remember the context here, and thus can't provide a workaround 😉 My intuition says I probably just cast to float64.
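For anyone else hitting this, a common workaround (only a sketch; the helper name here is made up and is not part of numpy or of the poster's code) is to promote the probabilities to float64 and renormalize them yourself before sampling, so the sum numpy sees after its own cast lands back within a few ulps of 1:

import numpy as np

def multinomial_from_float32(n, pvals32, size=None):
    # Hypothetical helper: promote float32 probabilities to float64 and
    # renormalize so the float64 sum is ~1 before calling multinomial.
    p = np.asarray(pvals32, dtype=np.float64)
    p /= p.sum()
    return np.random.multinomial(n, p, size=size)

# Example with a probability vector computed in single precision (GPU-style):
x = np.random.rand(100).astype(np.float32)
x /= x.sum()
draw = multinomial_from_float32(1, x)

In practice the renormalized float64 sum stays close enough to 1 that sum(pvals[:-1]) no longer trips the check.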
I happened to run into this issue today. I'm wondering whether there is any reason for casting the array to float64? This should probably be fixed in numpy itself.
Similar to leachim, I ran into this bug recently. Is there a way to get this fixed?
Could someone post an example of an array that reproduces this issue? I haven't been able to with random arrays.
The following snippet should break in a reasonable amount of time (< 3 mins); you can take the tricky case from there 😅. Please let me know if that helps.

import numpy as np

# Draw random float32 probability vectors until one of them makes
# multinomial raise ValueError.
while True:
    x = np.random.rand(100).astype(np.float32)
    x /= x.sum()
    neg_video_ind = np.random.multinomial(1, x)

I replicated the (buggy) behaviour on my machine.
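If it helps to keep hold of the offending vector rather than just hit the exception (for instance to share the .npy mentioned earlier), a small variation of the loop above catches the ValueError and saves the failing pvals; the file name is only an example:

import numpy as np

# Keep drawing float32 probability vectors until multinomial rejects one,
# then save that "tricky case" for inspection.
while True:
    x = np.random.rand(100).astype(np.float32)
    x /= x.sum()
    try:
        np.random.multinomial(1, x)
    except ValueError:
        np.save("tricky_pvals.npy", x)  # example file name
        print("float32 sum:", x.sum())
        print("float64 sum:", x.astype(np.float64).sum())
        break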
This allowed it to be easily replicated; I hadn't realized that I needed a relatively large pvals array.
This always triggers it.
Add an additional check when the original input is an array that does not have dtype double. Closes numpy#8317
Improve the error message when the sum of pvals is larger than 1 and the input data is an ndarray. Closes numpy#8317, xref numpy#16732
I don't really see how that fixes the problem.
np.random.multinomial seems to cast its second argument pvals to an array of dtype float64. Whilst this isn't an issue in and of itself, I've come across an interesting scenario where I have an array of dtype float32 whose sum is 0.99999994, and when it gets cast to float64, its sum is now 1.0000000222053895, which causes np.random.multinomial to raise ValueError: sum(pvals[:-1]) > 1.0.

For a little context, I have an array of dtype float32 because it's come off the GPU, where single precision calculations are a great deal faster.