BUG: `np.arange()` with int dtype can return inprecisely-sized arrays · Issue #20226 · numpy/numpy · GitHub

BUG: np.arange() with int dtype can return inprecisely-sized arrays #20226

Open
honno opened this issue Oct 29, 2021 · 5 comments

@honno
Contributor
honno commented Oct 29, 2021

Describe the issue:

When the distance stop - start is very large relative to step, np.arange() can produce an array x whose x.size is off by one from the expected size. My guess is that internally the non-integer result of (stop - start) / step is rounded to a floating-point value that happens to be a whole number, so it no longer indicates that one more array element should be generated.

Notably, np.arange() with float arguments has a great many related issues; the imprecision in those scenarios seems to be considered "acceptable behaviour" and has been noted in the docs as such. However, as in #18881, because the dtype is specified as an integer, people might have different thoughts. cc @asmeurer

I found this via a Hypothesis-powered test method I wrote for the Array API test suite.

Reproduce the code example:

>>> import numpy as np
>>> start, stop, step = 0, 108086391056891901, 1080863910568919
>>> x = np.arange(start, stop, step, dtype=np.uint64)
>>> x.size
100
>>> r = range(start, stop, step)
>>> len(r)
101
...
>>> x[-1]
107005527146322981
>>> x[-1] + np.uint64(step)
108086391056891900  # i.e. 1 less than stop
...
>>> 108086391056891901 == 100 * 1080863910568919 + 1
True
>>> n = 108086391056891901 / 1080863910568919
>>> n
100.0
>>> n.is_integer()
True

NumPy/Python version information:

0.3.0+27065.g0169af739 3.8.10 (default, Sep 28 2021, 16:10:42) 
[GCC 9.3.0]
honno added the 00 - Bug label Oct 29, 2021
@honno
Contributor Author
honno commented Oct 29, 2021

@pearu noted that the problem is indeed likely due to the use of true division in _calc_length:

_calc_length(PyObject *start, PyObject *stop, PyObject *step, PyObject **next, int cmplx)
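
A minimal sketch of why true division would lose the extra element, assuming the length is computed as the ceiling of a float quotient (an assumption based on the guess above, not a confirmed reading of the C code). It uses only plain Python ints and floats, with the values from the report:

```python
import math

start, stop, step = 0, 108086391056891901, 1080863910568919

# stop is larger than 2**53, so a float cannot represent it exactly:
# neighbouring integers collapse onto the same float value.
print(stop > 2**53)                    # True
print(float(stop) == float(stop - 1))  # True

# Length via true division (assumed to mirror the current behaviour):
# the exact quotient is 100 + 1/step, which rounds to exactly 100.0.
float_len = math.ceil((stop - start) / step)

# Length via exact integer arithmetic (ceiling division on Python ints).
exact_len = -(-(stop - start) // step)

print(float_len)                       # 100
print(exact_len)                       # 101
print(len(range(start, stop, step)))   # 101
```

The remainder of 1 is far below what a float near 100 can resolve, so the ceiling never sees it.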

@asmeurer
Member

This is very similar to #18881.

@LeonaTaric
Contributor
LeonaTaric commented Nov 13, 2021

We can use integer (floor) division instead of true division:

>>> start, stop, step = 0, 108086391056891901, 1080863910568919
>>> (stop - start) / step
100.0
>>> (stop - start) // step
100
>>> (stop - start - 1) / step
100.0
>>> (stop - start - 1) // step
100
>>> (stop - start - 2) / step
100.0
>>> (stop - start - 2) // step
99
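
Note that floor division alone still gives 100 for these values, while range() reports 101 elements. A small sketch of the missing step, using divmod on plain Python ints (the variable names here are only illustrative): a non-zero remainder means one more element is needed.

```python
start, stop, step = 0, 108086391056891901, 1080863910568919

# Exact quotient and remainder, computed without any float rounding.
quotient, remainder = divmod(stop - start, step)
print(quotient, remainder)  # 100 1

# For a positive step, any leftover distance adds one more element.
length = quotient + (1 if remainder else 0)
print(length)               # 101
```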

@LeonaTaric
Contributor
LeonaTaric commented Nov 13, 2021

Here's the link to the code.

It currently uses true division (__truediv__); we can change it to floor division (__floordiv__).

@LeonaTaric
Contributor

old code (python ver.)

# next = stop - start
val = next / step

new code (python ver.)

# next = stop - start
val = 1 + (next - 1) // step
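
As a rough sketch, the formula above could be wrapped in a standalone function like the one below. `arange_length` is a hypothetical name for illustration, not part of NumPy. As far as I can tell, `1 + (next - 1) // step` as written only holds for a positive step and a positive distance, so this version normalises the sign and clamps empty ranges to 0; that extra handling is my assumption, not something stated in the thread.

```python
def arange_length(start, stop, step):
    """Hypothetical helper: element count of an integer arange, ints only."""
    if step == 0:
        raise ValueError("step must not be zero")
    if step > 0:
        span = stop - start
    else:
        # Mirror the problem so the same formula works for negative steps.
        span, step = start - stop, -step
    if span <= 0:
        return 0  # empty range
    # Ceiling division done entirely in integer arithmetic.
    return 1 + (span - 1) // step


# Values from the report: agrees with Python's range().
print(arange_length(0, 108086391056891901, 1080863910568919))  # 101
```

Comparing against len(range(start, stop, step)) for a handful of positive, negative, and empty cases is a cheap way to sanity-check a change like this.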
