BUG: `np.nonzero` outputs too large indices for boolean matrices · Issue #23196 · numpy/numpy · GitHub
BUG: np.nonzero outputs too large indices for boolean matrices #23196


Closed
rank-and-files opened this issue Feb 10, 2023 · 11 comments

@rank-and-files
rank-and-files commented Feb 10, 2023

Describe the issue:

When calling np.nonzero on some boolean matrices, the returned indices are sometimes larger than should be possible
(for example 1152921504606852819, when they should be smaller than 12000).

The code example below reproduces this bug.
In the script below, note that one can set FORK to False and the bug still occurs, but it takes longer (for me it took at most around 17000 iterations for a similar script).
I tried this on two different machines, both running 20.04.1-Ubuntu with around 32 GB of RAM.
It happens with different numpy versions, including the latest one (1.24.2) and different ways of installing numpy (conda, pip and poetry).

With FORK = True in the script below, the error shows up for me after only 20 iterations (output 20 in stdout). If it does not, pressing Ctrl + C and trying again has proven effective.

Reproduce the code example:

import os

import numpy as np

FORK = True


def main():

    np.random.seed(4321)

    if FORK:
        pid = os.fork()
        np.random.seed(4321)
        if pid > 0:
            np.random.seed(1234)
            pid = os.fork()
            if pid > 0:
                np.random.seed(123)
                pid = os.fork()
                if pid > 0:
                    np.random.seed(321)
                    pid = os.fork()
                    if pid > 0:
                        np.random.seed(12)
                        pid = os.fork()
                        if pid > 0:
                            np.random.seed(21)
                            pid = os.fork()
                            if pid > 0:
                                np.random.seed(1)

    count = 0
    while True:
        count += 1
        if count % 10 == 0:
            print(count)
        random_num_one = np.random.randint(6000, 8000)
        random_num_two = np.random.randint(10000, 12000)

        self_offsets = np.zeros((random_num_one, random_num_two, 2))

        random_arr = np.random.random((random_num_one, random_num_two))
        mask = random_arr >= 0.5
        ys_rel, xs_rel = np.nonzero(mask)

        if np.max(xs_rel) > random_num_two:
            raise Exception(f"This should not happen: {np.max(xs_rel)} > {random_num_two}")

        if np.max(ys_rel) > random_num_one:
            raise Exception(f"This should not happen: {np.max(ys_rel)} > {random_num_one}")


if __name__ == "__main__":
    main()

Error message:

Traceback (most recent call last):
  File "scripts/minimal_new.py", line 52, in <module>
    main()
  File "scripts/minimal_new.py", line 45, in main
    raise Exception(f"numpy does not work: {np.max(xs)} > {random_num_two}")
Exception: This should not happen: 1152921504606852819 > 10326

Runtime information:

Output of import sys, numpy; print(numpy.__version__); print(sys.version)

1.24.2
3.8.10 (default, Nov 14 2022, 12:59:47) 
[GCC 9.4.0]

Output of print(numpy.show_runtime())

[{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
                      'found': ['SSSE3',
                                'SSE41',
                                'POPCNT',
                                'SSE42',
                                'AVX',
                                'F16C',
                                'FMA3',
                                'AVX2'],
                      'not_found': ['AVX512F',
                                    'AVX512CD',
                                    'AVX512_KNL',
                                    'AVX512_KNM',
                                    'AVX512_SKX',
                                    'AVX512_CLX',
                                    'AVX512_CNL',
                                    'AVX512_ICL']}},
 {'architecture': 'Haswell',
  'filepath': '/home/benr/.cache/pypoetry/virtualenvs/projectname-6Jmumlav-py3.8/lib/python3.8/site-packages/numpy.libs/libopenblas64_p-r0-15028c96.3.21.so',
  'internal_api': 'openblas',
  'num_threads': 16,
  'prefix': 'libopenblas',
  'threading_layer': 'pthreads',
  'user_api': 'blas',
  'version': '0.3.21'}]

Operating system

Linux 5.15.0-58-generic #64~20.04.1-Ubuntu

Context for the issue:

The usage is in the context of data loading for the analysis of images (semantic segmentation).
I cannot work without getting the indices where these kinds of matrices are nonzero, so I'm forced to use workarounds.

@rank-and-files rank-and-files changed the title BUG: <np.nonzero outputs too large indices for boolean matrices> BUG: np.nonzero outputs too large indices for boolean matrices Feb 10, 2023
@rkern
Member
rkern commented Feb 10, 2023

I cannot reproduce this. Try printing out the seed in each process each time you print out the count, so we can figure out which seed is producing the problematic array within 20 iterations and avoid all this forking.

    if FORK:
        pid = os.fork()
        seed = 4321
        np.random.seed(seed)
        if pid > 0:
            ...

    count = 0
    while True:
        count += 1
        if count % 10 == 0:
            print(f"{seed=} {count=}")
        ...

You can also try saving the data at the end so we can try to reproduce in the case that it's a problem with the actual values in the array.

        if np.max(xs_rel) > random_num_two:
            np.savez('error.npz', random_arr=random_arr, mask=mask, xs_rel=xs_rel, ys_rel=ys_rel)
            raise Exception(f"This should not happen: {np.max(xs_rel)} > {random_num_two}")

        if np.max(ys_rel) > random_num_one:
            np.savez('error.npz', random_arr=random_arr, mask=mask, xs_rel=xs_rel, ys_rel=ys_rel)
            raise Exception(f"This should not happen: {np.max(ys_rel)} > {random_num_one}")

You mention that it sometimes doesn't happen ("it has proven effective to Ctrl + C and try again."), so I do wonder how reproducible this is going to be even if we do narrow down the values in the array.

@rank-and-files
Author
rank-and-files commented Feb 13, 2023

I tested some more and found that the randomness is not necessary, and neither is the array being boolean.
Consider the following script.

import numpy as np


def main():

    count = 0
    while True:
        count += 1
        print(count)

        num_one = 8000
        num_two = 6000

        arr = np.ones((num_one, num_two))
        ys, xs = np.nonzero(arr)

        if np.max(xs) > num_two:
            raise Exception(f"This should not happen: {np.max(xs)} > {num_two}")

        if np.max(ys) > num_one:
            raise Exception(f"This should not happen: {np.max(ys)} > {num_one}")


if __name__ == "__main__":
    main()

This throws an error almost instantly on both machines I tested on.
Output machine one:

1
Traceback (most recent call last):
  File "scripts/minimal_new.py", line 25, in <module>
    main()
  File "scripts/minimal_new.py", line 18, in main
    raise Exception(f"This should not happen: {np.max(xs)} > {num_two}")
Exception: This should not happen: 1152921504606848384 > 6000

Output machine two:

1
2
3
Traceback (most recent call last):
  File "minimal_new.py", line 25, in <module>
    main()
  File "minimal_new.py", line 18, in main
    raise Exception(f"This should not happen: {np.max(xs)} > {num_two}")
Exception: This should not happen: 549755814222 > 6000

@seberg
Member
seberg commented Feb 13, 2023

Unfortunately, I also cannot reproduce this on either of my computers, even trying Python 3.8 (and running within valgrind). The hardware on one looks rather equivalent to yours (same instruction sets in show_runtime()).

Is there anything else noteworthy about your setup, for example use of a virtual machine?

Since you are on Ubuntu, maybe you can try running PYTHONMALLOC=malloc valgrind python3 test.py and see if it gives any output beyond Warning: set address range perms? (If it does, we will have something to work with.)
EDIT: You will probably have to install valgrind, but that should hopefully be very straightforward.

Additionally or alternatively, can you also check the result of np.count_nonzero, to narrow down whether the result calculation is wrong or whether counting the number of nonzero elements already fails?
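That suggested cross-check could be sketched like this (a minimal sketch, using a much smaller array than in the report):

```python
import numpy as np

# Minimal sketch of the suggested cross-check (smaller array than in the
# report): count_nonzero should agree with the lengths of the index arrays
# returned by nonzero, and every returned index must be in bounds.
arr = np.ones((800, 600))
ys, xs = np.nonzero(arr)

n = np.count_nonzero(arr)
assert len(xs) == n and len(ys) == n   # counting pass agrees
assert xs.max() < arr.shape[1]         # indices in bounds
assert ys.max() < arr.shape[0]
print("nonzero results consistent:", n)
```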

@rank-and-files
Author
rank-and-files commented Feb 13, 2023

valgrind
Thank you for your message. Using valgrind, the only other output is:

==326118== Conditional jump or move depends on uninitialised value(s)         
==326118==    at 0x58B40F: PyUnicode_Decode (in /usr/bin/python3.8)                             
==326118==    by 0x58B764: PyUnicode_FromEncodedObject (in /usr/bin/python3.8)                                                                                                                  
==326118==    by 0x577A7A: ??? (in /usr/bin/python3.8)                                                                                                                                          
==326118==    by 0x5F65F2: _PyObject_MakeTpCall (in /usr/bin/python3.8)                         
==326118==    by 0x570D33: _PyEval_EvalFrameDefault (in /usr/bin/python3.8)                     
==326118==    by 0x5F5EE5: _PyFunction_Vectorcall (in /usr/bin/python3.8)                      
==326118==    by 0x59C093: ??? (in /usr/bin/python3.8)                                          
==326118==    by 0x5F666E: _PyObject_MakeTpCall (in /usr/bin/python3.8)                         
==326118==    by 0x570D33: _PyEval_EvalFrameDefault (in /usr/bin/python3.8)                     
==326118==    by 0x569D89: _PyEval_EvalCodeWithName (in /usr/bin/python3.8)                     
==326118==    by 0x5F60C2: _PyFunction_Vectorcall (in /usr/bin/python3.8)                       
==326118==    by 0x570B81: _PyEval_EvalFrameDefault (in /usr/bin/python3.8)

This is output even before the first printed number, so it should be before the loop.

same problem in pytorch

I also note that I get the same problem using similar code in pytorch. I'm not sure if pytorch calls numpy functions under the hood, but if it doesn't, I think the issue is likely not related to numpy and we can close it.

setup

As far as I know there are no noteworthy things. I don't think I'm running on a VM, but I will check with my co-workers.

We will reinstall the OS and try again.

np.count_nonzero

np.count_nonzero returns the right number, and the shapes of the arrays xs and ys are also correct.
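Given that the counts and shapes come out right, a next step could be to locate exactly which positions in the returned index arrays hold out-of-range values. A minimal sketch (the small random mask here is only a stand-in for the failing array):

```python
import numpy as np

# Sketch for localizing corruption: since count_nonzero and the result
# shapes are correct, check which positions in xs/ys hold out-of-range
# values ('mask' here is a small stand-in for the failing boolean array).
rng = np.random.default_rng(0)
mask = rng.random((80, 60)) >= 0.5
ys, xs = np.nonzero(mask)

bad = np.flatnonzero((xs >= mask.shape[1]) | (ys >= mask.shape[0]))
print(f"{bad.size} out-of-range entries")  # 0 on a healthy machine
```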

@seberg
Member
seberg commented Feb 13, 2023

Yeah, that warning is probably something harmless happening at import time (a bit surprised that I don't see it right now, but warnings like these are pretty familiar at startup).

A similar error in pytorch seems surprising, but I also do not know if pytorch might use NumPy (I would be surprised, but no idea).

One other thing to try is disabling the SIMD extensions in use, starting with the most advanced one found, by setting the environment variable NPY_DISABLE_CPU_FEATURES=AVX2 and adding features until you are left with only the baseline (compare the np.show_runtime() output). See here: https://numpy.org/devdocs/user/troubleshooting-importerror.html#segfaults-or-crashes

Are the two machines you tried identical? Could you give the full CPU details (not sure it helps, but until someone can reproduce the issue, any information seems good)?
We had one weird case where someone updated the CPU microcode and a computer crash went away... In any case, I hope that NPY_DISABLE_CPU_FEATURES might give us an idea, although I am not 100% sure it is relevant for this exact code.

@WarrenWeckesser
Member

The bit patterns of the bad values are interesting (but maybe irrelevant):

Exception: This should not happen: 1152921504606852819 > 10326

In [58]: bin(1152921504606852819)
Out[58]: '0b1000000000000000000000000000000000000000000000001011011010011'

Exception: This should not happen: 1152921504606848384 > 6000

In [60]: bin(1152921504606848384)
Out[60]: '0b1000000000000000000000000000000000000000000000000010110000000'

Exception: This should not happen: 549755814222 > 6000

In [64]: bin(549755814222)
Out[64]: '0b1000000000000000000000000000000101001110'
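Extending that observation (my own arithmetic, not a conclusion from the thread): clearing the single stray top bit of each bad value leaves an index that is in range for the corresponding array, which fits a flipped bit better than an off-by-N software error:

```python
# Each reported "impossible" value is a single stray high bit plus an
# otherwise plausible index (my own arithmetic extending the bit-pattern
# observation above, not a conclusion from the thread).
bad_values = [
    (1152921504606852819, 10326),  # first report
    (1152921504606848384, 6000),   # machine one
    (549755814222, 6000),          # machine two
]
for value, limit in bad_values:
    top = 1 << (value.bit_length() - 1)  # the stray high bit (2**60 or 2**39)
    rest = value ^ top                   # clear just that one bit
    print(f"{value} = 2**{value.bit_length() - 1} + {rest}; in range: {rest < limit}")
```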

@rank-and-files
Author
rank-and-files commented Feb 13, 2023

Thank you all very much for your help, we reinstalled the OS and the error is not observable any more.
So I think the issue is probably not connected to numpy and I will close it.
Of course, if you feel further investigation is needed, reach out and I will provide more details.

@seberg
Member
seberg commented Feb 13, 2023

Thank you all very much for your help, we reinstalled the OS and the error is not observable any more.

If you have any information about the bad setup, that would be great, though I don't know what exactly we would look at (maybe the CPU), especially since a software update apparently helped... Just in case, so we can narrow it down for the next person stumbling into a hard-to-find issue.

Issues that were fixed by platform updates (in one case CPU microcode) have been reported in the recent past, which is a bit scary...

@rank-and-files
Author
rank-and-files commented Feb 15, 2023

After using the machine for some time, we saw that the problem is still there, but only for higher parameters num_one and num_two, e.g. 18000 and 16000 respectively.
I still don't think it is strictly related to numpy and will not reopen the issue (it occurred for similar methods in OpenCV and torch as well).
The CPU on both machines is Intel(R) Core(TM) i7-10700K.
We tested on other machines with different CPUs and could not reproduce the error.
We also tested on another machine with the same CPU, and the error occurred there as well.

@seberg
Member
seberg commented Feb 15, 2023

@bsen the only idea I have is the old one of disabling the use of some advanced SIMD instructions, e.g. starting with NPY_DISABLE_CPU_FEATURES=AVX2.
Updating the microcode might also be an option in principle; in gh-23082 it supposedly helped, but I admit this is fishing.

Anyway, thanks for following up!

@WarrenWeckesser
Member

Shot in the dark: try running a rigorous RAM check on the machines where the problem occurs. It seems unlikely that two or three machines would start having bad memory at the same time, but at least you could rule out that potential source of problems.
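A crude userspace spot-check could look like the sketch below. It is no substitute for a proper offline tool such as memtest86+, and the buffer size here is an arbitrary choice:

```python
import numpy as np

# Crude userspace spot-check (no substitute for a proper offline memory
# test): fill a buffer with a known pattern and verify it reads back
# unchanged. Increase N to cover more RAM.
N = 1 << 20
pattern = np.uint64(0xA5A5A5A5A5A5A5A5)
buf = np.full(N, pattern, dtype=np.uint64)

bad = np.flatnonzero(buf != pattern)
print(f"{bad.size} corrupted words out of {N}")  # expect 0
```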
