Creating numpy array of nested list of differing sublengths should raise ValueError #11650

jkunimune · 2018-07-31T20:28:31Z

When numpy.array() is passed a nested list of differing sublengths, it should raise a ValueError. Instead, it currently creates a 1d array of objects. It seems to me that the current behaviour would never be useful, and can only serve to confuse. There are two possible reasons a user would pass such an argument to numpy.array():

1. If one wanted to make an array of lists, then relying on current behaviour would be very dangerous, as the desired effect only happens when the sublists are of different lengths. If something were to change in the code and the sublist lengths happened to coincide, then the result would suddenly be a 2d array rather than the 1d array they had before, which could easily lead to cryptic errors elsewhere in the code. Rather, the safest way to create a numpy array of lists, if that is the desired effect, would be to create an empty object array and then set the values individually, as this would work regardless of the lengths of the sublists.
2. In the vast majority of use-cases, though, a 2d array is far preferable to a 1d array of lists. Most of the time, when numpy.array() is passed a nested list of differing sublengths, it is a mistake. Returning a 1d array of lists in that case is highly undesirable, as it does not raise an error that the user can see. Instead, it silently passes something completely different from what the user was expecting, to raise cryptic errors elsewhere in the code.
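For the first use-case, the "empty object array, then set the values individually" approach might look roughly like the sketch below. The helper name `array_of_lists` is made up for illustration and is not part of numpy.

```python
import numpy as np


def array_of_lists(lists):
    """Hypothetical helper: return a 1d object array whose elements are the given lists."""
    # Allocate the object array first so numpy never inspects the sublist lengths.
    out = np.empty(len(lists), dtype=object)
    for i, item in enumerate(lists):
        # Integer indexing on an object array stores the list itself as one element.
        out[i] = item
    return out


ragged = array_of_lists([[0, 0], [0, 0, 0]])
equal = array_of_lists([[0, 0], [0, 0]])
print(ragged.shape, equal.shape)
```

Both results should be 1d object arrays here, whether or not the sublist lengths happen to coincide.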

Therefore, the expected behaviour of numpy.array() when passed a nested list of differing sublengths would be to raise a ValueError. This simultaneously prevents users of the first use-case from using dangerous instantiation methods that will likely break if the sublist lengths just happen to coincide, and gives users of the second use-case a coherent error message to highlight their mistake.
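Until something like that exists, a user could approximate the requested behaviour with a small wrapper. This is only a sketch: `strict_array` is a hypothetical name, not a numpy function, and it checks a single level of nesting only.

```python
import numpy as np


def strict_array(rows):
    """Hypothetical wrapper: build an array from a nested list, rejecting ragged rows."""
    # Collect the distinct sublist lengths; more than one means the input is ragged.
    lengths = {len(row) for row in rows}
    if len(lengths) > 1:
        raise ValueError(f"sublists have differing lengths: {sorted(lengths)}")
    return np.array(rows)


a = strict_array([[0, 0], [0, 0]])
print(a.shape)

try:
    strict_array([[0, 0], [0, 0, 0]])
except ValueError as exc:
    print("ValueError:", exc)
```

The check runs before numpy.array() is called, so the error is raised at the point of the mistake rather than somewhere downstream.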

Reproducing code example:

import numpy as np
a = np.array([[0,0], [0,0]]) # makes sense
print(a) # creates integer ndarray of shape (2, 2)
b = np.array([[0,0], [0,0,0]]) # almost certainly a mistake
print(b) # creates object array of shape (2,)

Numpy/Python version information:

1.14.3 3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]

pv · 2018-07-31T20:48:16Z

See gh-5303

pv added the 50 - Duplicate label Jul 31, 2018

pv closed this as completed Jul 31, 2018
