multinomial casts input to np.float64 · Issue #8317 · numpy/numpy · GitHub

multinomial casts input to np.float64 #8317


Closed
QCaudron opened this issue Nov 25, 2016 · 15 comments · Fixed by #18482

Comments

@QCaudron

np.random.multinomial seems to cast its second argument pvals to an array of dtype float64. Whilst this isn't an issue in and of itself, I've come across an interesting scenario where I have an array of dtype float32 whose sum is 0.99999994, but when its elements are cast to float64 and re-summed, the total becomes 1.0000000222053895, which causes np.random.multinomial to raise ValueError: sum(pvals[:-1]) > 1.0.

For a little context, I have an array of dtype float32 because it's come off the GPU, where single precision calculations are a great deal faster.
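The reported behaviour can be reproduced with a small sketch (the values below are illustrative, not the reporter's actual data): a float32 pvals with a couple of dominant entries and several tiny ones sums to exactly 1.0 in float32, yet the float64 sum of the same elements creeps above 1.

```python
import numpy as np

# Hypothetical pvals in the shape the report describes: a couple of
# dominant terms plus many tiny ones, stored as float32.
pvals = np.array([0.5, 0.5] + [1e-9] * 8, dtype=np.float32)

# In float32 the tiny terms vanish during accumulation (each is far
# below half an ulp of 1.0), so the sum is exactly 1.0.
print(pvals.sum())  # 1.0

# Cast to float64, every term counts, and the sum exceeds 1.
print(pvals.astype(np.float64).sum())  # slightly greater than 1.0

# multinomial casts pvals to float64 internally, so it sees the larger
# sum and rejects the input.
try:
    np.random.multinomial(1, pvals)
except ValueError as e:
    print(e)  # exact wording varies by numpy version
```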

@charris
Member
charris commented Nov 25, 2016

Hmm, I suspect the float32 sum is off due to rounding. Not by much, though

In [18]: np.nextafter(np.float32(1), np.float32(0))
Out[18]: 0.99999994

The question is, what is pvals[-1]? If it is large I suspect another problem somewhere. How many values are in pvals?

@QCaudron
Author

pvals[-1] is typically also very small, along with most of the other 64 terms in pvals (one or two terms tend to dominate).

Happy to upload a .npy if that'd help.

@charris
Member
charris commented Nov 26, 2016

Happy to upload a .npy if that'd help.

This definitely sounds like a roundoff problem in computing the sum, and maybe in normalizing also. Not sure what your best bet for a fix is, probably depends on the details of your application.

@QCaudron
Author

I'm dealing with it application-side; I just thought it might be worth noting, as this is somewhat surprising behaviour - it certainly took me a while to work out what was going on. Is there any reason a float32 array wouldn't be acceptable to np.random.multinomial?

@charris
Member
charris commented Nov 26, 2016

Is there any reason a float32 array wouldn't be acceptable to np.random.multinomial?

No. But this is a classic setup for floating point roundoff errors: lots of small values and a couple of big ones. All the small values lose precision when they are added to the big ones. After this you won't be surprised, you will expect problems ;)
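That precision loss can be seen in a two-line sketch (the magnitudes are chosen for illustration; any small value below half an ulp of the large one behaves the same way):

```python
import numpy as np

# Near 1e8 the gap between adjacent float32 values is 8, so adding 1
# rounds straight back to the original value: the small term is lost.
big = np.float32(1e8)
small = np.float32(1.0)
print(big + small == big)  # True

# In float64 the same addition is exact and the sum changes.
print(np.float64(1e8) + np.float64(1.0) == np.float64(1e8))  # False
```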

@ericmjl
ericmjl commented Jul 28, 2017

Many apologies for resurrecting an old thread here, but I noticed that PyMC3 runs into issues with numpy's multinomial function when working with float32s returned from the GPU. Is there a reason for explicit casting to float64, rather than simply letting the floats be what they are?

@ericmjl
ericmjl commented Aug 2, 2017

@QCaudron may I ask, what's the fix that you've been working with to get around the multinomial probability rounding problems?

@QCaudron
Author
QCaudron commented Aug 2, 2017

@ericmjl I'm afraid I don't remember the context here, and thus can't provide a workaround 😉 My intuition says I probably cast pvals to float64, normalised the array to sum to one, and passed that in.
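For anyone landing here later, a minimal sketch of that workaround (the variable names are illustrative, and the float32 pvals here is randomly generated to stand in for GPU output):

```python
import numpy as np

# Hypothetical float32 probabilities, e.g. fresh off a GPU.
pvals32 = np.random.rand(64).astype(np.float32)
pvals32 /= pvals32.sum()

# Workaround: cast to float64 first, then renormalise in double
# precision so the sum multinomial sees is as close to 1 as possible.
pvals64 = pvals32.astype(np.float64)
pvals64 /= pvals64.sum()

draw = np.random.multinomial(1, pvals64)
print(draw.sum())  # 1
```

The key point is that the normalisation happens in the same precision multinomial uses internally, so the cast can no longer push the sum above 1.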

@leachim
leachim commented Nov 7, 2018

I happened to run into this issue today. Wondering whether there is any reason for casting the array to float64? This should probably be fixed in numpy itself.

@escorciav

Similar to leachim, I faced this bug recently. Is there a way to get this fixed?

@bashtage
Contributor

Could someone post an example of an array that reproduces this issue? I haven't been able to with random arrays.

@escorciav
escorciav commented Apr 11, 2019

The following snippet should break within a reasonable amount of time (< 3 mins). You can take the tricky case from there 😅 . Please lemme know if that helps.

import numpy as np

# Draw random normalized float32 pvals until one triggers the ValueError.
while True:
    x = np.random.rand(100).astype(np.float32)
    x /= x.sum()
    neg_video_ind = np.random.multinomial(1, x)

I replicated the (buggy) behavior on my machine (numpy=1.14.3) and on google-colab. Go for this if you only need a 100-element vector to replicate the bug.

@bashtage
Contributor

This allowed it to be easily replicated -- hadn't realized that I needed a relatively large pval array.

@bashtage
Contributor

This always triggers

import numpy as np

x = np.array([9.9e-01, 9.9e-01, 1.0e-09, 1.0e-09, 1.0e-09, 1.0e-09, 1.0e-09,
       1.0e-09, 1.0e-09, 1.0e-09], dtype=np.float32)
y = x / x.sum()
np.random.multinomial(1, y)
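A quick diagnostic (my annotation, not part of the original comment) of why that particular array always trips the check:

```python
import numpy as np

x = np.array([9.9e-01, 9.9e-01, 1.0e-09, 1.0e-09, 1.0e-09, 1.0e-09, 1.0e-09,
              1.0e-09, 1.0e-09, 1.0e-09], dtype=np.float32)
y = x / x.sum()

# The float32 sum rounds to exactly 1.0: the eight ~5e-10 entries are far
# below half an ulp of 1.0 (~6e-8) and vanish during accumulation.
print(y.sum())  # 1.0

# Summed in float64, those entries no longer vanish, so the total that
# multinomial's internal check sees exceeds 1.
print(y.astype(np.float64).sum())  # slightly greater than 1.0
```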

bashtage added a commit to bashtage/numpy that referenced this issue Feb 18, 2021
Add additional check when original input is an array that does not have dtype double

closes numpy#8317
bashtage added a commit to bashtage/numpy that referenced this issue Feb 26, 2021
Improve error message when the sum of pvals is larger than 1
when the input data is an ndarray

closes numpy#8317
xref numpy#16732
@olfMombach

I don't really see how that fixes the problem
