ENH: Support array-like `n` in random.random.multinomial #9710

ColCarroll · 2017-09-18T22:57:34Z

This changes the numpy.random.multinomial API to be more similar to the API for numpy.random.binomial. In particular, something like this:

>>> np.random.binomial([10, 20], 0.2)
array([2, 7])

>>> np.random.multinomial([10, 20], [0.2, 0.8])
array([[ 1,  9],
       [ 4, 16]])

I did some light benchmarking, and it seems like the current use case is equally fast, while this provides at least some speed up over np.array([np.random.multinomial(n, pvals) for n in nvals]).

charris · 2017-09-19T00:24:36Z

numpy/random/mtrand/mtrand.pyx

@@ -4549,7 +4549,7 @@ cdef class RandomState:

        Parameters
        ----------
-        n : int
+        n : int or array_like of ints


Needs .. versionchanged:: 1.14.0 with a note that allowing array_like here is new. You can find examples in the file.

eric-wieser · 2017-09-18T23:06:13Z

numpy/random/mtrand/mtrand.pyx

-                    mnix[i+d-1] = dn
-
-                i = i + d
+            for k from 0 <= k < m:


This is outdated syntax for Pyrex compatibility

eric-wieser · 2017-09-19T03:29:35Z

I'm wary of adding another (m)(n)->(m,n) function instead of trying to give normal broadcasting rules.

Can we make it work as:

>>> np.random.multinomial([[10, 20]], [0.2, 0.8])
array([[ 1,  9],
       [ 4, 16]])

I'm also wondering if accepting an ND p array would make sense, with out[i,j,k] = count_with(p[i,j,k]) (but broadcasting) - but that doesn't need to be dealt with now

ColCarroll · 2017-09-20T12:32:01Z

I removed what I think was the pyrex compatibility, but I'm not sure I understand your point about broadcasting? Would you guess the right approach is (in case n and pvals are both arrays) to use <broadcast>PyArray_MultiIterNew? Still learning some of this Cython...

ColCarroll · 2017-09-30T14:46:33Z

I tried a bit to use np.broadcasting, but it gets funny since the multinomial naturally wants to associate each integer n with a vector of p_vals. I'm happy to keep trying to update, but might need an example of a similar pattern (where the broadcasting was against all but the last dimension).

eric-wieser · 2017-10-02T05:53:33Z

it gets funny since the multinomial naturally wants to associate each integer n with a vector of p_vals

This maybe feels like a feature? Force the user to make the integers n and the p_vals to be along different axes?

Essentially, there are two different ways we could go with broadcasting. Here I'm using A<x, y> as a shorthand for an array A with shape x,y, and mn for `multinomial:

No broadcasting, concatenate dimensions
- mn1(N<a>, P<b>) -> M<a,b> where M[a,b] is the count of event b (probability P[b]) occured given N[a] samples
- mn1(N<a,b>, P<c,d>) -> M<a,b,c,d> where M[a,b,c,d] is the count of event c,d (probability P[c,d]) occured given N[a,b] samples
Broadcast across all dimensions
- mn2(N<a,1>, P<1,b>) -> M<a,b> where M[a,b] is the count of event b (probability P[0,b]) occured given N[a,0] samples
- mn2(N<1,a>, P<b,1>) -> M<b,a> where M[b,a] is the count of event b (probability P[b,0]) occured given N[0,a] samples
- mn2(N<1,b,1,d>, P<a,1,c,1>) -> M<a,b,c,d> where M[a,b,c,d] is the count of event (a, c) (probability P[a,0,c,0]) occured given N[0,b,0,d] samples.
- extension: mn2(N<a, 1>, P<a, b>, axis=1) -> M<a,b> where M[b,a] is the count of event b given N[a,0] samples from the set of events (a,b) | a

Version 2 is more complex, but it does preserve the meanings of dimensions, which is often handy

eric-wieser · 2017-10-02T05:55:08Z

PyArray_MultiIterNew is an outdated API - the replacement is NpyIter_MultiNew

eric-wieser · 2017-10-02T06:16:10Z

Looking at the nditer API, it seems my suggestion above might be totally unreasonable...

eric-wieser · 2017-10-02T07:53:14Z

Actually this is totally possible, there's just a quirk in the python-side interface to nditer ( #9808) that makes it non-obvious. Here's what I believe is a python implementation of what I'm describing.

Caveat: I have no idea what I'm doing with nditer, and would appreciate comments from those who do.

import numpy as np
from numpy.lib.stride_tricks import as_strided
from numpy.random import multinomial as m_old

def multinomial(n, p):
    it = np.nditer(
        (n, p, None),
        flags=[
            "zerosize_ok", 'multi_index', 'reduce_ok'  # no idea if these are needed
        ],
        op_flags=[
            ['readonly'],
            ['readonly'],
            ['writeonly', 'allocate']
        ]
    )
    n, p, out = it.operands

    p_shape = np.array(p.shape)
    n_shape = np.array(p.shape)
    out_shape = np.array(out.shape)

    # axes containing probabilities, counted from the end
    p_axs = p_shape != 1  # todo: let this be passed in
    p_axs, = p_axs.nonzero()
    p_axs -= p.ndim
    if not np.all(n_shape[p_axs] == 1):
        raise ValueError("Axes of n must be length 1 where probabilities are not")

    # don't iterate over these
    for ax_i in p_axs:
        it.remove_axis(it.ndim + ax_i)

    it.remove_multi_index()  # I think this just makes things faster?
    # it.enable_external_loop()  # TODO: maybe a speedup to be had by using this

    # convert to arrays so we can index these
    p_strides = np.array(p.strides)
    out_strides = np.array(out.strides)

    while not it.finished:
        ni, pi, outi = it.value
        # workaround for gh-9808
        pi = as_strided(pi, shape=p_shape[p_axs], strides=p_strides[p_axs])
        outi = as_strided(outi, shape=out_shape[p_axs], strides=out_strides[p_axs])

        outi[...] = m_old(ni, pi.ravel()).reshape(p.shape)
        it.iternext()

    return out

ColCarroll · 2018-01-21T16:14:08Z

Digging this up and asking if there is anything I could do to help this along?

Add broadcasting to multinomial xref numpy/numpy#9710

mattip · 2019-05-29T19:23:00Z

Closing, np.random.Generator().multinomial([10, 20], [0.2, 0.8]) works with the new API. Please reopen if needed.

ENH: Support array-like n in random.random.multinomial

994db8d

charris reviewed Sep 19, 2017

View reviewed changes

charris added 01 - Enhancement component: numpy.random labels Sep 19, 2017

DOC: Add versionchanged note to docstring

0713e6d

eric-wieser reviewed Sep 19, 2017

View reviewed changes

Remove pyrex syntax

8f71faa

ColCarroll mentioned this pull request Jan 21, 2018

Speed up random method for Multinominal pymc-devs/pymc#2811

Closed

bashtage added a commit to bashtage/randomgen that referenced this pull request Apr 10, 2019

ENH: Add broadcasting to multinomial

06cd3b4

Add broadcasting to multinomial xref numpy/numpy#9710

This was referenced Apr 10, 2019

ENH: Add broadcasting to multinomial bashtage/randomgen#72

Merged

Sync randomgen mattip/numpy#7

Closed

Sync randomgen mattip/numpy#8

Merged

ENH: tracking issue for merging randomgen into numpy #13164

Closed

mattip closed this May 29, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ENH: Support array-like `n` in random.random.multinomial #9710

ENH: Support array-like `n` in random.random.multinomial #9710

ENH: Support array-like n in random.random.multinomial #9710

ENH: Support array-like n in random.random.multinomial #9710

Conversation

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ENH: Support array-like `n` in random.random.multinomial #9710

ENH: Support array-like `n` in random.random.multinomial #9710