BUG: np.nonzero outputs too large indices for boolean matrices #23196
Comments
I cannot reproduce this. Try printing out the seed in each process each time you print out the count so we can figure out which seed is producing the problematic array within 20 iterations so we don't have to do all this forking.

    if FORK:
        pid = os.fork()
        seed = 4321
        np.random.seed(seed)
        if pid > 0:
            ...
    count = 0
    while True:
        count += 1
        if count % 10 == 0:
            print(f"{seed=} {count=}")
        ...

You can also try saving the data at the end so we can try to reproduce in the case that it's a problem with the actual values in the array.

    if np.max(xs_rel) > random_num_two:
        np.savez('error.npz', random_arr=random_arr, mask=mask, xs_rel=xs_rel, ys_rel=ys_rel)
        raise Exception(f"This should not happen: {np.max(xs_rel)} > {random_num_two}")
    if np.max(ys_rel) > random_num_one:
        np.savez('error.npz', random_arr=random_arr, mask=mask, xs_rel=xs_rel, ys_rel=ys_rel)
        raise Exception(f"This should not happen: {np.max(ys_rel)} > {random_num_one}")

You mention that it sometimes doesn't happen ("it has proven effective to |
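Once an `error.npz` like the one suggested above exists, someone else could reload it and re-run `np.nonzero` on the saved mask. The following is only a rough sketch of that step: it assumes the `mask` entry is the array that was passed to `np.nonzero`, the helper name `recheck_saved_mask` is made up for illustration, and the small demo mask is a stand-in, not data from the report.

```python
import os
import tempfile

import numpy as np


def recheck_saved_mask(path):
    """Reload a saved mask, re-run np.nonzero on it, and check index bounds."""
    with np.load(path) as data:
        mask = data["mask"]  # assumption: this key holds the array given to np.nonzero
    indices = np.nonzero(mask)
    # Every index on an axis must lie in [0, size of that axis).
    in_bounds = all(
        idx.size == 0 or (idx.min() >= 0 and idx.max() < size)
        for idx, size in zip(indices, mask.shape)
    )
    return in_bounds, int(indices[0].size)


# Stand-in for error.npz: a small hypothetical mask with two True entries.
demo_mask = np.zeros((4, 5), dtype=bool)
demo_mask[1, 2] = True
demo_mask[3, 4] = True
demo_path = os.path.join(tempfile.mkdtemp(), "error_demo.npz")
np.savez(demo_path, mask=demo_mask)

ok, n_hits = recheck_saved_mask(demo_path)
print(ok, n_hits)
```

If the reloaded file passes this check on another machine, that would point away from the array contents and toward the original environment.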
I tested some more and found out that the randomness is not necessary. And also it's not necessary that the array is boolean.

    import numpy as np

    def main():
        count = 0
        while True:
            count += 1
            print(count)
            num_one = 8000
            num_two = 6000
            arr = np.ones((num_one, num_two))
            ys, xs = np.nonzero(arr)
            if np.max(xs) > num_two:
                raise Exception(f"This should not happen: {np.max(xs)} > {num_two}")
            if np.max(ys) > num_one:
                raise Exception(f"This should not happen: {np.max(ys)} > {num_one}")

    if __name__ == "__main__":
        main()

This throws an error almost instantly on both machines I tested on.
Output machine two:
|
Unfortunately, I also cannot reproduce on either of my computers, also trying Python 3.8 (and within valgrind). The hardware on one looks rather equivalent to yours (same instruction sets in
Is there anything else noteworthy about your setup, for example use of a virtual machine? Since you are on Ubuntu, maybe you can actually try running
Additionally/alternatively can you try to also check the result of |
valgrind
This is output even before the first printed number, so it should be before the loop.
same problem in pytorch
I also note that I get the same problem using similar code in pytorch. I'm not sure if pytorch calls numpy functions under the hood, but if it doesn't, I think the issue is likely not related to numpy and we can close it.
setup
As far as I know there are no noteworthy things. I don't think I'm running on a VM, but I will check with my co-workers. We will reinstall the OS and try again.
|
Yeah, that warning should be something harmless happening at import time probably (a bit surprised that I don't see it right now, but these are pretty familiar at startup). A similar error in pytorch seems surprising, but I also do not know if pytorch might use NumPy (I would be surprised, but no idea).
One other thing to try is disabling the SIMD extensions in use, starting with the lowest one you got there, by setting the environment variable
Are the two machines you tried identical machines? Could you give the full CPU details (not sure it helps, but until someone can reproduce the issue any information seems good)? |
The bit patterns of the bad values are interesting (but maybe irrelevant):
|
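For anyone who wants to look at such a bit pattern themselves, here is a small sketch using the one bad value quoted in the issue description (1152921504606852819). The values examined in the comment above are not shown in this thread, so this only illustrates the approach on that single number and is not a diagnosis.

```python
bad_index = 1152921504606852819  # example value quoted in the issue description

# Binary form, zero-padded to 64 bits, to see which bits are set.
print(f"{bad_index:064b}")

# Positions of the set bits, lowest first.
set_bits = [i for i in range(64) if (bad_index >> i) & 1]
print(set_bits)

# Clear the highest set bit and look at what remains.
highest = bad_index.bit_length() - 1
remainder = bad_index ^ (1 << highest)
print(highest, remainder)
```

For this particular value the highest set bit is bit 60 and the remainder is 5843, i.e. the number equals 2**60 + 5843, and 5843 on its own would be a valid index below the 12000 bound mentioned in the report.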
Thank you all very much for your help, we reinstalled the OS and the error is not observable any more. |
If you have any information about the bad setup, that would be great, but I don't know what exactly we would look at. Maybe the CPU, but since a software update apparently helped... Just in case we can narrow it down for the next person stumbling into a hard-to-find issue. Issues that were fixed by platform updates (in one case CPU microcode) have been reported in the recent past, which is a bit scary... |
Using the machine for some time, we saw that the problem still is there, but for higher parameters |
@bsen the only idea I would have is the old one of disabling the use of some advanced SIMD instructions, e.g. starting with Anyway, thanks for following up! |
Shot in the dark: try running a rigorous RAM memory check on the machines where the problem is occurring. It seems unlikely that two or three machines would start having bad memory at the same time, but at least you could rule out that potential source of problems. |
Describe the issue:
When calling np.nonzero on some boolean matrices, the returned indices are sometimes larger than should be possible (for example 1152921504606852819, when they should be smaller than 12000).
The code example below reproduces this bug.
In the script below, note that one could set FORK to False and the bug still occurs, but it takes longer (for me it took at most around 17000 iterations for a similar script).
I tried this on two different machines, both running 20.04.1-Ubuntu with around 32 GB of RAM.
It happens with different numpy versions, including the latest one (1.24.2) and different ways of installing numpy (conda, pip and poetry).
When using FORK = True in the script below, the error is shown for me after 20 iterations already (output 20 in stdout). If not, it has proven effective to press Ctrl + C and try again.
Reproduce the code example:
Error message:
Runtime information:
Output of
import sys, numpy; print(numpy.__version__); print(sys.version)
Output of
print(numpy.show_runtime())
Operating system
Context for the issue:
The usage is in the context of data loading for the analysis of images (semantic segmentation).
I cannot work without getting the indices where this kind of matrices are nonzero, I'm bound to use workarounds.
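One possible shape for such a workaround is a thin wrapper that validates the result of `np.nonzero` before the indices are used. This is a minimal sketch, not a fix for the underlying problem: the name `checked_nonzero` is hypothetical, it assumes an array with at least one dimension, and it can only detect bad output, not explain it.

```python
import numpy as np


def checked_nonzero(arr):
    """Return np.nonzero(arr), raising if the result is inconsistent with arr."""
    arr = np.asarray(arr)
    indices = np.nonzero(arr)
    # Each index array must stay within [0, size) of its axis.
    for axis, (idx, size) in enumerate(zip(indices, arr.shape)):
        if idx.size and (idx.min() < 0 or idx.max() >= size):
            raise RuntimeError(
                f"out-of-range index on axis {axis}: "
                f"min={idx.min()}, max={idx.max()}, axis size={size}"
            )
    # The number of returned positions should match an independent count.
    expected = int(np.count_nonzero(arr))
    if indices[0].size != expected:
        raise RuntimeError(
            f"np.nonzero returned {indices[0].size} positions, expected {expected}"
        )
    return indices


# Small usage example with a hypothetical mask.
mask = np.zeros((3, 4), dtype=bool)
mask[0, 1] = True
mask[2, 3] = True
ys, xs = checked_nonzero(mask)
print(ys.tolist(), xs.tolist())
```

A caller could catch the `RuntimeError` and retry or skip the sample, at the cost of an extra pass over the index arrays and over the mask.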