np.random.choice unusable for array dimensions > 1 · Issue #10835 · numpy/numpy · GitHub

Closed
pyformulas opened this issue Apr 1, 2018 · 10 comments
Labels
57 - Close? Issues which may be closable unless discussion continued

Comments

@pyformulas
pyformulas commented Apr 1, 2018

NumPy version 1.14.2

It's not possible to grab a random row from a 2d array using np.random.choice. Consider this array:

points = np.random.random((10,2))

Trying to get a random row this way fails

np.random.choice(points)

ValueError: a must be 1-dimensional

which might be reasonable. (A more reasonable behavior is perhaps for np.random.choice to take an optional axis= argument so that it can return a random slice along the axis, defaulting to a random element in the entire array.)

However, there is no way to directly grab a random row:

np.random.choice(list(points))

ValueError: a must be 1-dimensional
np.random.choice(tuple(points))

ValueError: a must be 1-dimensional
np.random.choice([row for row in points])

ValueError: a must be 1-dimensional

Whereas this can be done with random.sample:

import random
random.sample(list(points), 1)

[array([0.77376144, 0.64678796])]
@njsmith
Member
njsmith commented Apr 1, 2018

You can sample a random set of row indices:

row_i = np.random.choice(points.shape[0], ...)
points[row_i, :]

This also allows for sampling a random set of columns.
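A minimal sketch of the index-based approach described above. The sample sizes and the `replace=False` choice are assumptions made for illustration, not part of the original suggestion:

```python
import numpy as np

points = np.random.random((10, 2))

# Draw 3 distinct row indices, then use them to index the array.
row_i = np.random.choice(points.shape[0], size=3, replace=False)
rows = points[row_i, :]      # shape (3, 2)

# The same idea applied to the second axis samples columns.
col_i = np.random.choice(points.shape[1], size=1)
cols = points[:, col_i]      # shape (10, 1)
```

Because only integer indices are sampled, this works regardless of how many dimensions `points` has.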

mattip added the "57 - Close?" label (issues which may be closable unless discussion continued) on Apr 18, 2018
@mattip
Member
mattip commented Apr 18, 2018

Maybe needs a line in the documentation of random.choice?

@pyformulas
Author
pyformulas commented Apr 19, 2018

The problem with @njsmith's solution is that, say you have a list of arrays. The expected behavior for numpy.random.choice is for it to return a random object from the list. But attempting to do this can result in ValueError: a must be 1-dimensional. I think this is because numpy internally converts the list to an array, making it appear 2-dimensional and resulting in the above error.
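The conversion described above can be observed directly. This is a small sketch, assuming the legacy `np.random.choice` converts its argument with something equivalent to `np.asarray`; the variable names are illustrative:

```python
import numpy as np

points = np.random.random((10, 2))
rows = list(points)          # a Python list of ten 1-D arrays

# Converting the list back to an array restores the 2-D shape,
# so the "list of rows" is indistinguishable from the original array.
as_array = np.asarray(rows)
print(as_array.ndim)         # 2

# The legacy function therefore rejects the list as well.
try:
    np.random.choice(rows)
    raised = False
except ValueError:
    raised = True
```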

@mattip
Member
mattip commented Apr 30, 2018

It seems there is a solution proposed in PR #7810.

@stmax82
stmax82 commented Dec 18, 2019

I think I just ran into this problem with a list of arrays. I want to pick a random array from a list of arrays:

x = np.zeros((10,10))
y = np.ones((10,10))
z = np.random.choice([x,y])

This results in the unexpected error "ValueError: a must be 1-dimensional".
Well "a" clearly is 1-dimensional...

Note that if x and y have different shapes, random.choice works as expected.
It took me a while to figure that out because my arrays only occasionally end up with the same shapes... in which case the error occurs. :/
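One possible workaround with the legacy function, sketched here as an assumption rather than an officially recommended pattern, is to draw an index into the Python list instead of passing the list itself. The name `arrays` is illustrative:

```python
import numpy as np

x = np.zeros((10, 10))
y = np.ones((10, 10))
arrays = [x, y]

# Sample a position in the list, then look the array up by that position.
# This never converts the list to an ndarray, so the shapes do not matter.
i = np.random.choice(len(arrays))
z = arrays[i]
```

This behaves the same whether or not `x` and `y` happen to share a shape.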

@WarrenWeckesser
Member

The recommended method for making a random choice is now to use the choice method of the Generator class. You can get an instance of this class with, for example, np.random.default_rng().

For example, here's the example from the top of this issue:

In [63]: rng = np.random.default_rng()

In [64]: points = rng.random((10,2))

In [65]: points
Out[65]: 
array([[0.37279228, 0.21731135],
       [0.46409222, 0.8051227 ],
       [0.86021344, 0.13008027],
       [0.5291926 , 0.46637031],
       [0.57839398, 0.13400253],
       [0.61457234, 0.83176585],
       [0.02210913, 0.67098608],
       [0.79260095, 0.94899478],
       [0.44924031, 0.8708212 ],
       [0.36823098, 0.51461417]])

In [66]: rng.choice(points)
Out[66]: array([0.37279228, 0.21731135])

Here's the most recent example:


In [67]: x = np.zeros((10,10))

In [68]: y = np.ones((10,10))

In [69]: z = rng.choice([x,y])

In [70]: z
Out[70]: 
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

I don't think there are any plans to make improvements to the legacy random code, so I'm closing the issue, but we can reopen it if the demand is high.
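As a further sketch of the `Generator.choice` approach shown above, the `size`, `replace`, and `axis` parameters allow sampling several rows or a column. The seed and sample sizes here are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seed chosen only for illustration
points = rng.random((10, 2))

one_row = rng.choice(points)                             # shape (2,)
three_rows = rng.choice(points, size=3, replace=False)   # shape (3, 2)
one_col = rng.choice(points, axis=1)                     # shape (10,)
```

By default `axis=0`, which is why `rng.choice(points)` returns a whole row rather than a single element.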

@LostInBayes

I too would like to sample from a 2d numpy array, but I'm restricted to 1d arrays. :(

@jtrakk
jtrakk commented Apr 23, 2020

> I'm closing the issue, but we can reopen it if the demand is high.

FWIW the issue currently has 10 👍, which would rank it 12th of 1,864 among open issues on that metric. Perhaps it's time to consider reopening?

At least it would be nice to improve discovery for the new recommended method (which does solve the problem AFAICT). I only found out about it because of @WarrenWeckesser's comment.

@rgommers
Member

> FWIW the issue currently has 10 +1, which would rank it 12th of 1,864 among open issues on that metric. Perhaps it's time to consider reopening?

Those are from before adding Generator.choice, which solves the problem. We'd like to avoid changing the legacy functions at this point.

> At least it would be nice to improve discovery for the new recommended method (which does solve the problem AFAICT).

That is a good point. I opened PR gh-16075 to do that.

@ak2911
ak2911 commented Jul 1, 2023

Alternatively, use random to generate indices, then index into the array. This works for any number of dimensions:
n_sample = 10  # sample count
points[np.random.randint(0, points.shape[0], n_sample)]
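The same idea can be generalized to an arbitrary axis. The helper below is hypothetical (`sample_along_axis` is not a NumPy function) and, like `np.random.randint` itself, it samples with replacement:

```python
import numpy as np

def sample_along_axis(a, n_sample, axis=0):
    """Hypothetical helper: draw n_sample slices of `a` along `axis`,
    with replacement, using randomly generated integer indices."""
    idx = np.random.randint(0, a.shape[axis], n_sample)
    return np.take(a, idx, axis=axis)

volume = np.arange(24).reshape(2, 3, 4)
picked = sample_along_axis(volume, n_sample=5, axis=2)
print(picked.shape)   # (2, 3, 5)
```

If distinct slices are needed, the index generation would have to be swapped for something that samples without replacement.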
