BUG: `np.arange()` with int dtype can return imprecisely-sized arrays #20226

honno · 2021-10-29T09:32:11Z

Describe the issue:

When the distance stop - start is very much larger than step, np.arange() can produce an array x whose x.size is off by one from the expected size. My guess is that internally, the non-integer result of the distance divided by the step, (stop - start) / step, gets rounded to a floating-point value that is exactly an integer, and thus no longer indicates that one more array element should be generated.

Notably, np.arange() with float arguments has a great many related issues. The imprecision in those scenarios seems to be considered "acceptable behaviour" and has been noted in the docs as such. However, as in #18881, because the dtype is specified as an integer, people might have different thoughts. cc @asmeurer

I found this via a Hypothesis-powered test method I wrote for the Array API test suite.

Reproduce the code example:

>>> import numpy as np
>>> start, stop, step = 0, 108086391056891901, 1080863910568919
>>> x = np.arange(start, stop, step, dtype=np.uint64)
>>> x.size
100
>>> r = range(start, stop, step)
>>> len(r)
101
...
>>> x[-1]
107005527146322981
>>> x[-1] + np.uint64(step)
108086391056891900  # i.e. 1 less than stop
...
>>> 108086391056891901 == 100 * 1080863910568919 + 1
True
>>> n = 108086391056891901 / 1080863910568919
>>> n
100.0
>>> n.is_integer()
True
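A short sketch of why n comes out as exactly 100.0, using only Python's built-in integers and floats (the variable names are illustrative, not taken from NumPy). The exact quotient is 100 with a remainder of 1, but that remainder is far too small relative to the quotient to survive rounding to a double:

```python
start, stop, step = 0, 108086391056891901, 1080863910568919

# Python integers are arbitrary precision, so divmod keeps the remainder.
quotient, remainder = divmod(stop - start, step)
print(quotient, remainder)  # 100 1

# A double has a 53-bit significand; stop is odd and larger than 2**53,
# so it cannot be stored exactly as a float.
print(stop > 2**53)              # True
print(int(float(stop)) == stop)  # False

# The exact ratio is 100 + 1/step, which rounds to exactly 100.0 as a double,
# so the information that a 101st element is needed is lost.
ratio = (stop - start) / step
print(ratio == 100.0)  # True
```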

NumPy/Python version information:

0.3.0+27065.g0169af739 3.8.10 (default, Sep 28 2021, 16:10:42) 
[GCC 9.3.0]


honno · 2021-10-29T10:49:50Z

@pearu noted that the problem is indeed likely due to the use of true division in _calc_length:

numpy/numpy/core/src/multiarray/ctors.c, line 3054 in d5f6618:

    _calc_length(PyObject *start, PyObject *stop, PyObject *step, PyObject **next, int cmplx)
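As a rough Python model of the suspected failure mode, computing the length as the ceiling of a floating-point quotient reproduces the off-by-one from the report. This is an assumption about what the C code effectively does, not a transcription of _calc_length, and the helper name length_via_true_division is hypothetical:

```python
import math


def length_via_true_division(start, stop, step):
    """Hypothetical model: length = ceil((stop - start) / step) in floating point."""
    return max(0, math.ceil((stop - start) / step))


start, stop, step = 0, 108086391056891901, 1080863910568919

modelled = length_via_true_division(start, stop, step)
expected = len(range(start, stop, step))  # exact integer arithmetic

print(modelled, expected)  # 100 101
```

If this model is right, any fix has to keep the length computation in integer arithmetic whenever all three arguments are integers.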

asmeurer · 2021-10-29T20:03:18Z

This is very similar to #18881.

LeonaTaric · 2021-11-13T09:14:39Z

We can use integer (floor) division instead of true division:

>>> start, stop, step = 0, 108086391056891901, 1080863910568919
>>> (stop - start) / step
100.0
>>> (stop - start) // step
100
>>> (stop - start - 1) / step
100.0
>>> (stop - start - 1) // step
100
>>> (stop - start - 2) / step
100.0
>>> (stop - start - 2) // step
99

LeonaTaric · 2021-11-13T09:20:09Z

Here's the link to the code.

It uses true division (__truediv__); we can change it to floor division (__floordiv__).

LeonaTaric · 2021-11-13T09:27:54Z

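Old code (Python version):

# next = stop - start
val = next / step  # true division: the result is rounded to a float

New code (Python version):

# next = stop - start
val = 1 + (next - 1) // step  # ceiling division in integer arithmetic (valid for step > 0)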
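A sketch of how the proposed integer-only computation might be generalised and checked, assuming integer arguments. arange_length is a hypothetical helper, not a NumPy function. The formula above holds for a positive step, so the sketch mirrors negative steps onto the positive case and compares the result with len(range(...)):

```python
def arange_length(start, stop, step):
    """Hypothetical helper: exact element count for integer start/stop/step."""
    if step == 0:
        raise ValueError("step must not be zero")
    if step < 0:
        # Mirror a descending range onto the ascending case.
        start, stop, step = -start, -stop, -step
    distance = stop - start
    if distance <= 0:
        return 0
    # Ceiling division using only integer arithmetic.
    return 1 + (distance - 1) // step


cases = [
    (0, 108086391056891901, 1080863910568919),  # the case from this report
    (0, 10, 3),
    (10, 0, -3),
    (5, 5, 1),
    (0, -4, 2),
]
for start, stop, step in cases:
    assert arange_length(start, stop, step) == len(range(start, stop, step))
```

The explicit distance <= 0 check matters because the ceiling formula alone can return zero or a negative number for an empty range.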

honno added the 00 - Bug label Oct 29, 2021

honno mentioned this issue Apr 28, 2022

More improvements to test_linalg data-apis/array-api-tests#101

Merged

honno mentioned this issue May 24, 2022

BUG: Large distances between start and stop in np.linspace() result in non-finite values #21585

Closed

GWDx mentioned this issue Dec 27, 2024

BUG: np.arange give wrong result when given big integer #27985

Open
