Creating numpy array of nested list of differing sublengths should raise ValueError · Issue #11650 · numpy/numpy · GitHub

Creating numpy array of nested list of differing sublengths should raise ValueError #11650


Closed
jkunimune opened this issue Jul 31, 2018 · 1 comment

Comments

@jkunimune

When numpy.array() is passed a nested list of differing sublengths, it should raise a ValueError. Instead, it currently creates a 1d array of objects. It seems to me that the current behaviour would never be useful, and can only serve to confuse. There are two possible reasons a user would pass such an argument to numpy.array():

  1. If one wanted to make an array of lists, then relying on current behaviour would be very dangerous, as the desired effect only happens when the sublists are of different lengths. If something were to change in the code and the sublist lengths happened to coincide, then the result would suddenly be a 2d array rather than the 1d array they had before, which could easily lead to cryptic errors elsewhere in the code. Rather, the safest way to create a numpy array of lists, if that is the desired effect, would be to create an empty object array and then set the values individually, as this would work regardless of the lengths of the sublists.

  2. In the vast majority of use-cases, though, a 2d array is far preferable to a 1d array of lists. Most of the time, when numpy.array() is passed a nested list of differing sublengths, it is a mistake. Returning a 1d array of lists in that case is highly undesirable, as it does not raise an error that the user can see. Instead, it silently passes something completely different from what the user was expecting, to raise cryptic errors elsewhere in the code.
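A minimal sketch of the safer approach described in the first point: allocate an empty object array and fill it element by element. The helper name `array_of_lists` is hypothetical and not part of NumPy.

```python
import numpy as np


def array_of_lists(rows):
    """Hypothetical helper: return a 1-D object array holding one list per element.

    The result has shape (len(rows),) whether or not the sublists
    happen to share a length.
    """
    out = np.empty(len(rows), dtype=object)
    for i, row in enumerate(rows):
        # Integer indexing into an object array stores the list itself.
        out[i] = row
    return out


ragged = array_of_lists([[0, 0], [0, 0, 0]])
uniform = array_of_lists([[0, 0], [0, 0]])
print(ragged.shape, uniform.shape)
```

Both calls give a 1-D array of lists, so downstream code does not change behaviour when the sublist lengths coincide.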

Therefore, the expected behaviour of numpy.array() when passed a nested list of differing sublengths would be to raise a ValueError. This simultaneously prevents users of the first use-case from using dangerous instantiation methods that will likely break if the sublist lengths just happen to coincide, and gives users of the second use-case a coherent error message to highlight their mistake.
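Until something like this is built in, the requested behaviour can be approximated in user code. The following is only a sketch: `strict_array` is a hypothetical wrapper, not a NumPy function, and it only checks one level of nesting.

```python
import numpy as np


def strict_array(rows):
    """Hypothetical wrapper: raise ValueError if the sublists differ in length.

    Only the first level of nesting is checked; deeper ragged
    structures are not detected by this sketch.
    """
    lengths = {len(row) for row in rows}
    if len(lengths) > 1:
        raise ValueError(
            f"sublists have differing lengths: {sorted(lengths)}"
        )
    return np.array(rows)


ok = strict_array([[0, 0], [0, 0]])
print(ok.shape)

try:
    strict_array([[0, 0], [0, 0, 0]])
except ValueError as exc:
    print("rejected:", exc)
```

The length check runs before `np.array()` is called, so the mistake is reported at the point of construction rather than later in the code.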

Reproducing code example:

import numpy as np
a = np.array([[0,0], [0,0]]) # makes sense
print(a) # creates integer ndarray of shape (2, 2)
b = np.array([[0,0], [0,0,0]]) # almost certainly a mistake
print(b) # creates object array of shape (2,)

Numpy/Python version information:

1.14.3 3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]

@pv
Member
pv commented Jul 31, 2018

See gh-5303

@pv pv closed this as completed Jul 31, 2018