Description
Describe the issue:
When calling np.nonzero on some boolean matrices, the returned indices are sometimes larger than should be possible (for example 1152921504606852819, when they should be smaller than 12000).
The code example below reproduces this bug.
In the script below, note that one can set FORK to False and the bug still occurs, but it takes longer (for me it took at most around 17000 iterations with a similar script).
I tried this on two different machines, both running 20.04.1-Ubuntu with around 32 GB of RAM.
It happens with different numpy versions, including the latest one (1.24.2) and different ways of installing numpy (conda, pip and poetry).
When using FORK = True in the script below, the error already shows up for me after 20 iterations (output 20 in stdout). If it does not, it has proven effective to press Ctrl + C and try again.
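To make the expectation concrete, here is a minimal sketch of the invariant the report relies on: for a 2D boolean mask, every index returned by np.nonzero must be smaller than the corresponding dimension of the mask. The tiny mask below is made up purely for illustration and is not taken from the failing runs.

```python
import numpy as np

# Illustrative 2x3 mask (hypothetical data, not from the failing runs).
mask = np.array([[True, False, True],
                 [False, False, True]])

# np.nonzero returns one index array per dimension, in row-major order.
ys, xs = np.nonzero(mask)

# Every row index must be < number of rows, every column index < number of columns.
rows_ok = bool(ys.max() < mask.shape[0])
cols_ok = bool(xs.max() < mask.shape[1])
print(ys.tolist(), xs.tolist(), rows_ok, cols_ok)
```

In the failing runs described here, the equivalent of cols_ok would be False, because one of the returned column indices is far outside the mask.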
Reproduce the code example:
import os

import numpy as np

FORK = True


def main():
    np.random.seed(4321)
    if FORK:
        # Fork several processes, each with its own seed.
        pid = os.fork()
        np.random.seed(4321)
        if pid > 0:
            np.random.seed(1234)
            pid = os.fork()
            if pid > 0:
                np.random.seed(123)
                pid = os.fork()
                if pid > 0:
                    np.random.seed(321)
        if pid > 0:
            np.random.seed(12)
            pid = os.fork()
            if pid > 0:
                np.random.seed(21)
                pid = os.fork()
                if pid > 0:
                    np.random.seed(1)
    count = 0
    while True:
        count += 1
        if count % 10 == 0:
            print(count)
        random_num_one = np.random.randint(6000, 8000)
        random_num_two = np.random.randint(10000, 12000)
        # Allocated but not used afterwards.
        self_offsets = np.zeros((random_num_one, random_num_two, 2))
        random_arr = np.random.random((random_num_one, random_num_two))
        mask = random_arr >= 0.5
        ys_rel, xs_rel = np.nonzero(mask)
        # Indices must stay within the shape of the mask.
        if np.max(xs_rel) > random_num_two:
            raise Exception(f"This should not happen: {np.max(xs_rel)} > {random_num_two}")
        if np.max(ys_rel) > random_num_one:
            raise Exception(f"This should not happen: {np.max(ys_rel)} > {random_num_one}")


if __name__ == "__main__":
    main()
Error message:
Traceback (most recent call last):
  File "scripts/minimal_new.py", line 52, in <module>
    main()
  File "scripts/minimal_new.py", line 45, in main
    raise Exception(f"numpy does not work: {np.max(xs)} > {random_num_two}")
Exception: This should not happen: 1152921504606852819 > 10326
Runtime information:
Output of import sys, numpy; print(numpy.__version__); print(sys.version)
1.24.2
3.8.10 (default, Nov 14 2022, 12:59:47)
[GCC 9.4.0]
Output of print(numpy.show_runtime())
[{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
'found': ['SSSE3',
'SSE41',
'POPCNT',
'SSE42',
'AVX',
'F16C',
'FMA3',
'AVX2'],
'not_found': ['AVX512F',
'AVX512CD',
'AVX512_KNL',
'AVX512_KNM',
'AVX512_SKX',
'AVX512_CLX',
'AVX512_CNL',
'AVX512_ICL']}},
{'architecture': 'Haswell',
'filepath': '/home/benr/.cache/pypoetry/virtualenvs/projectname-6Jmumlav-py3.8/lib/python3.8/site-packages/numpy.libs/libopenblas64_p-r0-15028c96.3.21.so',
'internal_api': 'openblas',
'num_threads': 16,
'prefix': 'libopenblas',
'threading_layer': 'pthreads',
'user_api': 'blas',
'version': '0.3.21'}]
Operating system
Linux 5.15.0-58-generic #64~20.04.1-Ubuntu
Context for the issue:
The usage is in the context of data loading for the analysis of images (semantic segmentation).
I cannot work without getting the indices where matrices of this kind are nonzero, so I am forced to use workarounds.
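One possible shape for such a workaround is sketched below, under stated assumptions: checked_nonzero and its max_attempts parameter are hypothetical names introduced here and are not part of NumPy. The helper calls np.nonzero, checks every returned index array against the shape of the mask, and repeats the call if the check fails. Whether a repeated call actually returns correct indices depends on the root cause, which this report does not establish, so this only detects the problem rather than fixing it.

```python
import numpy as np


def checked_nonzero(mask, max_attempts=3):
    """Hypothetical helper: np.nonzero plus a bounds check on the result.

    Raises RuntimeError if no attempt yields indices inside mask.shape.
    """
    mask = np.asarray(mask)
    for _ in range(max_attempts):
        indices = np.nonzero(mask)
        # Each index array must lie in [0, dim) for its dimension.
        in_bounds = all(
            idx.size == 0 or (idx.min() >= 0 and idx.max() < dim)
            for idx, dim in zip(indices, mask.shape)
        )
        if in_bounds:
            return indices
    raise RuntimeError(
        f"np.nonzero returned out-of-bounds indices in {max_attempts} attempts"
    )


# Usage on a small, made-up mask.
ys_rel, xs_rel = checked_nonzero(np.array([[True, False], [False, True]]))
print(ys_rel.tolist(), xs_rel.tolist())
```

The extra min/max pass over the index arrays adds some cost per call, which may matter in a data-loading loop.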