BUG: random segmentation fault in `np.unique` on MacOS (arm64) + numpy 2.0.1 · Issue #27037 · numpy/numpy · GitHub

Closed
neutrinoceros opened this issue Jul 25, 2024 · 15 comments · Fixed by #27070

@neutrinoceros
Contributor

Describe the issue:

We noticed a regression with numpy 2.0.1 in yt's test suite yt-project/yt#4953, which bisects to gh-26821

This crash (segmentation fault) pops up in a test that uses numpy's RNG (seeded), but it is still non-deterministic: it doesn't always happen, though it is frequent enough that I can reliably hit it locally if I run the test a dozen times.
As a shortcut to get an MWE, I saved the randomly generated data to a .npy file (uploaded to filetransfer.io, see link in the reprod script)

I suspect the important change in gh-26821 is that Highway is now used by default on macOS+arm64

may be linked to gh-27023

Reproduce the code example:

import numpy as np

# https://filetransfer.io/data-package/VZDHbBLq#link
arr = np.load("array.npy")
np.unique(arr)
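
Since the crash is flaky, one way to measure how often it hits is to run the reproducer above in a fresh interpreter several times and count the runs that die from a signal. This is only a sketch: `count_signal_deaths` is a hypothetical helper name, it assumes a POSIX system (where `subprocess` reports death by signal as a negative return code), and it assumes `array.npy` sits in the working directory. The `-X faulthandler` option is CPython's built-in way to print the Python-level traceback on a fatal signal.

```python
import subprocess
import sys

# The reproducer from this issue; assumes array.npy is in the current directory.
SNIPPET = """
import numpy as np
arr = np.load("array.npy")
np.unique(arr)
"""


def count_signal_deaths(snippet=SNIPPET, runs=12):
    """Run `snippet` in `runs` fresh interpreters and count those killed by a signal."""
    deaths = 0
    for _ in range(runs):
        proc = subprocess.run(
            # -X faulthandler makes the child dump its Python traceback on SIGSEGV.
            [sys.executable, "-X", "faulthandler", "-c", snippet],
            capture_output=True,
            text=True,
        )
        # On POSIX, a negative return code -N means the child was killed by signal N.
        if proc.returncode < 0:
            deaths += 1
    return deaths

# Usage (not executed here): print(count_signal_deaths(runs=12))
```

Running each attempt in its own process matters because a segfault takes the whole interpreter down, so a loop inside a single process would stop at the first crash.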

Error message:

I don't know how to get gdb on macOS arm64

Python and NumPy Versions:

2.1.0.dev0+git20240725.0819378
3.10.14 (main, Apr  8 2024, 15:51:39) [Clang 15.0.0 (clang-1500.3.9.4)]

Also seen in CI with Python 3.9, FWIW

Runtime Environment:

[{'numpy_version': '2.1.0.dev0+git20240725.0819378',
  'python': '3.10.14 (main, Apr  8 2024, 15:51:39) [Clang 15.0.0 '
            '(clang-1500.3.9.4)]',
  'uname': uname_result(system='Darwin', node='kwanzaabot.home', release='23.5.0', version='Darwin Kernel Version 23.5.0: Wed May  1 20:14:38 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6020', machine='arm64')},
 {'simd_extensions': {'baseline': ['NEON', 'NEON_FP16', 'NEON_VFPV4', 'ASIMD'],
                      'found': ['ASIMDHP'],
                      'not_found': ['ASIMDFHM']}}]

Context for the issue:

No response

@seberg
Member
seberg commented Jul 25, 2024

I don't know how to get gdb on macOS arm64

It's lldb on macOS, and otherwise it is pretty much the same: something like lldb python, then r to run (or lldb python -- <python arguments>). spin lldb also works in a dev environment, but it might need tweaking; only the -c option seems reliable right now for me.


The array is shaped (128, 128, 128), should be ravel'd into 128**3.

FWIW, the lldb result is:

Process 28068 launched: '/opt/homebrew/Caskroom/mambaforge/base/bin/python3.11' (arm64)
Process 28068 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x104000000)
    frame #0: 0x00000001010aace0 _multiarray_umath.cpython-311-darwin.so`bool hwy::N_NEON::detail::MaybePartitionTwoValue<hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double>(hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double*, unsigned long, decltype(Zero((hwy::N_NEON::Simd<double, 2ul, 0>)())), decltype(Zero((hwy::N_NEON::Simd<double, 2ul, 0>)())), decltype(Zero((hwy::N_NEON::Simd<double, 2ul, 0>)()))&, double*) [inlined] hwy::N_NEON::Vec128<double, 2ul> hwy::N_NEON::LoadU<hwy::N_NEON::Simd<double, 2ul, 0>, (void*)0, (void*)0>((null)=<unavailable>, unaligned=<unavailable>) at arm_neon-inl.h:3501:25 [opt]
   3498	template <class D, HWY_IF_V_SIZE_D(D, 16), HWY_IF_F64_D(D)>
   3499	HWY_API Vec128<double> LoadU(D /* tag */,
   3500	                             const double* HWY_RESTRICT unaligned) {
-> 3501	  return Vec128<double>(vld1q_f64(unaligned));
   3502	}
   3503	#endif  // HWY_HAVE_FLOAT64
   3504	
Target 0: (python3.11) stopped.

with the backtrace:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x104000000)
  * frame #0: 0x00000001010aace0 _multiarray_umath.cpython-311-darwin.so`bool hwy::N_NEON::detail::MaybePartitionTwoValue<hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double>(hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double*, unsigned long, decltype(Zero((hwy::N_NEON::Simd<double, 2ul, 0>)())), decltype(Zero((hwy::N_NEON::Simd<double, 2ul, 0>)())), decltype(Zero((hwy::N_NEON::Simd<double, 2ul, 0>)()))&, double*) [inlined] hwy::N_NEON::Vec128<double, 2ul> hwy::N_NEON::LoadU<hwy::N_NEON::Simd<double, 2ul, 0>, (void*)0, (void*)0>((null)=<unavailable>, unaligned=<unavailable>) at arm_neon-inl.h:3501:25 [opt]
    frame #1: 0x00000001010aace0 _multiarray_umath.cpython-311-darwin.so`bool hwy::N_NEON::detail::MaybePartitionTwoValue<hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double>(hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double*, unsigned long, decltype(Zero((hwy::N_NEON::Simd<double, 2ul, 0>)())), decltype(Zero((hwy::N_NEON::Simd<double, 2ul, 0>)())), decltype(Zero((hwy::N_NEON::Simd<double, 2ul, 0>)()))&, double*) [inlined] void hwy::N_NEON::BlendedStore<hwy::N_NEON::Simd<double, 2ul, 0>>(v=<unavailable>, m=<unavailable>, d=<unavailable>, p=<unavailable>) at arm_neon-inl.h:3921:65 [opt]
    frame #2: 0x00000001010aace0 _multiarray_umath.cpython-311-darwin.so`bool hwy::N_NEON::detail::MaybePartitionTwoValue<hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double>(d=<unavailable>, st=<unavailable>, keys=0x0000000103fe0000, num=16384, valueL=<unavailable>, valueR=<unavailable>, third=0x000000016fdfc110, buf=0x000000016fdfc5b0) at vqsort-inl.h:989:3 [opt]
    frame #3: 0x00000001010a9ce4 _multiarray_umath.cpython-311-darwin.so`void hwy::N_NEON::detail::Recurse<(hwy::N_NEON::detail::RecurseMode)0, hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double>(hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double*, unsigned long, double*, unsigned long long*, unsigned long, unsigned long) [inlined] bool hwy::N_NEON::detail::PartitionIfTwoKeys<hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double>(d=<unavailable>, st=<unavailable>, pivot=<unavailable>, keys=0x0000000103fc0000, num=32768, idx_second=<unavailable>, second=<unavailable>, third=0x000000016fdfc110, buf=0x000000016fdfc5b0) at vqsort-inl.h:1125:22 [opt]
    frame #4: 0x00000001010a9ca8 _multiarray_umath.cpython-311-darwin.so`void hwy::N_NEON::detail::Recurse<(hwy::N_NEON::detail::RecurseMode)0, hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double>(d=<unavailable>, st=<unavailable>, keys=0x0000000103fc0000, num=32768, buf=0x000000016fdfc5b0, state=0x000000012815ce40, remaining_levels=44, k=<unavailable>) at vqsort-inl.h:1785:9 [opt]
    frame #5: 0x00000001010a9b34 _multiarray_umath.cpython-311-darwin.so`void hwy::N_NEON::detail::Recurse<(hwy::N_NEON::detail::RecurseMode)0, hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double>(d=<unavailable>, st=<unavailable>, keys=0x0000000103f80000, num=65536, buf=0x000000016fdfc5b0, state=0x000000012815ce40, remaining_levels=45, k=<unavailable>) at vqsort-inl.h:1853:7 [opt]
    frame #6: 0x00000001010a9b34 _multiarray_umath.cpython-311-darwin.so`void hwy::N_NEON::detail::Recurse<(hwy::N_NEON::detail::RecurseMode)0, hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double>(d=<unavailable>, st=<unavailable>, keys=0x0000000103f20000, num=114688, buf=0x000000016fdfc5b0, state=0x000000012815ce40, remaining_levels=46, k=<unavailable>) at vqsort-inl.h:1853:7 [opt]
    frame #7: 0x00000001010a9b34 _multiarray_umath.cpython-311-darwin.so`void hwy::N_NEON::detail::Recurse<(hwy::N_NEON::detail::RecurseMode)0, hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double>(d=<unavailable>, st=<unavailable>, keys=0x0000000103ee0000, num=147456, buf=0x000000016fdfc5b0, state=0x000000012815ce40, remaining_levels=47, k=<unavailable>) at vqsort-inl.h:1853:7 [opt]
    frame #8: 0x00000001010a9b34 _multiarray_umath.cpython-311-darwin.so`void hwy::N_NEON::detail::Recurse<(hwy::N_NEON::detail::RecurseMode)0, hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double>(d=<unavailable>, st=<unavailable>, keys=0x0000000103b80000, num=589824, buf=0x000000016fdfc5b0, state=0x000000012815ce40, remaining_levels=48, k=<unavailable>) at vqsort-inl.h:1853:7 [opt]
    frame #9: 0x00000001010a9b34 _multiarray_umath.cpython-311-darwin.so`void hwy::N_NEON::detail::Recurse<(hwy::N_NEON::detail::RecurseMode)0, hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double>(d=<unavailable>, st=<unavailable>, keys=0x0000000103720000, num=1163264, buf=0x000000016fdfc5b0, state=0x000000012815ce40, remaining_levels=49, k=<unavailable>) at vqsort-inl.h:1853:7 [opt]
    frame #10: 0x00000001010a9b34 _multiarray_umath.cpython-311-darwin.so`void hwy::N_NEON::detail::Recurse<(hwy::N_NEON::detail::RecurseMode)0, hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double>(d=<unavailable>, st=<unavailable>, keys=0x0000000103000000, num=2097152, buf=0x000000016fdfc5b0, state=0x000000012815ce40, remaining_levels=50, k=<unavailable>) at vqsort-inl.h:1853:7 [opt]
    frame #11: 0x000000010109fd90 _multiarray_umath.cpython-311-darwin.so`void hwy::N_NEON::VQSortStatic<double>(double*, unsigned long, hwy::SortAscending) at vqsort-inl.h:1966:5 [opt]
    frame #12: 0x000000010109fbf0 _multiarray_umath.cpython-311-darwin.so`void hwy::N_NEON::VQSortStatic<double>(double*, unsigned long, hwy::SortAscending) [inlined] void hwy::N_NEON::Sort<hwy::N_NEON::Simd<double, 2ul, 0>, hwy::N_NEON::detail::SharedTraits<hwy::N_NEON::detail::TraitsLane<hwy::N_NEON::detail::OrderAscending<double>>>, double>(d=<unavailable>, st=<unavailable>, keys=<unavailable>, num=2097152) at vqsort-inl.h:2082:10 [opt]
    frame #13: 0x000000010109fbf0 _multiarray_umath.cpython-311-darwin.so`void hwy::N_NEON::VQSortStatic<double>(keys=0x0000000103000000, num=2097152, (null)=<unavailable>) at vqsort-inl.h:2164:3 [opt]
    frame #14: 0x0000000100fac850 _multiarray_umath.cpython-311-darwin.so`::quicksort_double(void *, npy_intp, void *) [inlined] bool quicksort_dispatch<double>(start=<unavailable>, num=<unavailable>) at quicksort.cpp:105:9 [opt]
    frame #15: 0x0000000100fac838 _multiarray_umath.cpython-311-darwin.so`quicksort_double(start=0x0000000103000000, n=2097152, __NPY_UNUSED_TAGGEDvarr=<unavailable>) at quicksort.cpp:814:9 [opt]
    frame #16: 0x0000000100f6f024 _multiarray_umath.cpython-311-darwin.so`_new_sortlike(op=0x00000001005f7150, axis=0, sort=(_multiarray_umath.cpython-311-darwin.so`::quicksort_double(void *, npy_intp, void *) at quicksort.cpp:813), part=0x0000000000000000, kth=0x0000000000000000, nkth=0) at item_selection.c:1265:19 [opt]

Ping @Mousius this sounds like it may be up your alley with the highway issue.

@seberg
Member
seberg commented Jul 25, 2024

I had a look at the data, and the interesting thing about it is that it has a lot of repeated values (along dimensions 0 and 2). The following seems to crash for me about half the time, so it doesn't quite seem like the right ingredient... The actual array has the property that (arr[:, :-1, :] == arr[:, 1:, :]).all(), which may or may not be relevant.

vals = np.linspace(0, 1, num=128)
data = np.broadcast_to(vals, (128, 128, 128)).transpose(0, 2, 1).copy()

# This also crashes:
np.unique(data)
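
To put a number on "a lot of repeated values", here is a small sketch that counts the distinct values in the synthetic array above without calling any sort routine (so it avoids the code path that crashes). The variable names `flat`, `n_distinct` and `copies_of_first` are illustrative only.

```python
import numpy as np

# Same construction as the synthetic reproducer above.
vals = np.linspace(0, 1, num=128)
data = np.broadcast_to(vals, (128, 128, 128)).transpose(0, 2, 1).copy()

flat = data.ravel()

# Count distinct values with a Python set instead of np.unique, so no sort is involved.
n_distinct = len(set(flat.tolist()))

# Each of the 128 values from linspace is repeated 128 * 128 times.
copies_of_first = int((flat == vals[0]).sum())

print(flat.size, n_distinct, copies_of_first)  # 2097152 128 16384
```

The flattened size, 2097152, is the same `num` that `VQSortStatic` receives in the backtrace above, and only 128 of those values are distinct.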

@seberg
Member
seberg commented Jul 25, 2024

And because I find it puzzling: this crashes for me in highway (expected), and the PR that @neutrinoceros bisected to was #26821, which is x86 sort and would seem unrelated to me at first sight?! (CC @rdevulap)

@neutrinoceros
Contributor Author
neutrinoceros commented Jul 25, 2024

However unlikely, it's possible that my bisection took a wrong turn at some point, because I marked any commit for which I couldn't reproduce the crash after 10 tries as "good", and maybe that wasn't enough.
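
As a rough way to judge whether 10 tries is enough, the sketch below computes the chance that a commit containing the bug still passes every run. It assumes each run crashes independently with the same probability, which may not hold here; `false_good_probability` and the crash rates in the loop are made up for illustration, not measured.

```python
def false_good_probability(crash_rate: float, tries: int) -> float:
    """Chance that a buggy commit survives `tries` independent runs without crashing."""
    if not 0.0 <= crash_rate <= 1.0:
        raise ValueError("crash_rate must be in [0, 1]")
    return (1.0 - crash_rate) ** tries


# Hypothetical per-run crash rates; the real rate on a given commit is unknown.
for rate in (0.5, 0.2, 0.05):
    p = false_good_probability(rate, 10)
    print(f"crash rate {rate:.2f}: P(marked good after 10 clean runs) = {p:.4f}")
```

Under these assumptions a 50% crash rate is almost never missed in 10 runs (about 0.1%), but a 5% crash rate would be wrongly marked "good" roughly 60% of the time.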

@seberg
Member
seberg commented Jul 25, 2024

Well, it makes it harder that this issue is flaky. It does run successfully sometimes for me.
FWIW, I tried to revert that commit (with the git submodule update) and I can still reproduce the crashes on main. So I guess if that did something, it is just by chance. Maybe the whole issue is even just harder to trigger on 2.0.0...

@neutrinoceros
Contributor Author

One thing I don't get is that gh-26821 is supposedly a backport of gh-26797, but the diff in the backport is richer. I should note I really bisected only the backport branch between v2.0.1 and v2.0.0. Would it help to also bisect on main ?

@seberg
Member
seberg commented Jul 25, 2024

Ahh, I didn't notice that one! https://github.com/numpy/numpy/pull/26821/files does indeed contain some changes that would be related to the mac/highway paths.

@r-devulap
Member

#26821 ported changes from #26273 which modifies meson.build to reenable highway on macOS/ARM64. @Mousius should know more.

@Mousius
Member
Mousius commented Jul 25, 2024

This is similar to #25464 - cc @jan-wassenberg, the expert on all things VQSort.

2.0.0 shipped with the code to disable Highway, and once the above issue was fixed it was re-enabled.

@r-devulap
Member

The highway commit hash in 2.0.x matches with main branch. Does it mean this failure happens on the main branch too? @neutrinoceros is it possible to verify that?

@seberg
Member
seberg commented Jul 25, 2024

@r-devulap yes, I tested only on main. It is reproducible there.

@Mousius
Member
Mousius commented Jul 26, 2024

I think google/highway#2282 should fix this, can someone else verify? It was pretty much the same issue of the Highway BlendedStore doing a LoadU and then a StoreU out of bounds 😿
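
For readers unfamiliar with the failure mode described here, the following is a toy Python model of a masked store that is emulated with a full-width load, a blend, and a full-width store. It is not Highway's code: the function names are invented, a Python list with explicit bounds checks stands in for raw memory, and `LANES = 2` is taken from the `Simd<double, 2ul, 0>` frames in the backtrace.

```python
LANES = 2  # two float64 lanes per 128-bit vector, as in the backtrace frames


def _check(buf, start):
    # Stand-in for the hardware fault: a full vector access must fit inside the buffer.
    if start < 0 or start + LANES > len(buf):
        raise IndexError(
            f"vector access [{start}, {start + LANES}) exceeds buffer of length {len(buf)}"
        )


def load_u(buf, start):
    """Unaligned full-width load of LANES elements."""
    _check(buf, start)
    return buf[start:start + LANES]


def store_u(buf, start, vec):
    """Unaligned full-width store of LANES elements."""
    _check(buf, start)
    buf[start:start + LANES] = vec


def blended_store_emulated(buf, start, vec, mask):
    """Masked store built from load + blend + store; touches all lanes regardless of mask."""
    old = load_u(buf, start)
    merged = [new if keep else prev for new, keep, prev in zip(vec, mask, old)]
    store_u(buf, start, merged)


buf = [1.0, 2.0, 3.0]
blended_store_emulated(buf, 1, [9.0, 9.0], [True, False])  # fits: buf becomes [1.0, 9.0, 3.0]
try:
    # Only lane 0 is enabled, yet the full-width load still reaches past the end.
    blended_store_emulated(buf, 2, [7.0, 7.0], [True, False])
except IndexError as exc:
    print("out of bounds:", exc)
```

In real memory the out-of-bounds access only faults when the extra lane falls on an unmapped page, which would be consistent with the crash being intermittent.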

@neutrinoceros
Contributor Author

thanks a lot @Mousius ! I just tried your branch locally and it seemed to run smoothly, but for reasons that are beyond me it looks like I cannot reprod the segfault at all today, so I don't think my validation is worth much.

@neutrinoceros
Contributor Author

Never mind, I just forgot that I used Python 3.10 yesterday and used my default (3.12) today. Switching back to 3.10 I can both reproduce the problem again and confirm that it goes away with your patch!

@jan-wassenberg
Contributor

Thank you @Mousius for fixing :)

  • it's in MaybePartitionTwoValue which is an infrequently used codepath for low-entropy (many duplicate) distributions;
  • it happens on NEON and not AVX-512 or even AVX2, because masking is not safe on NEON.
    The flakiness would be because it only crashes if the array is not padded, and especially if it lands on the end of a memory page.

Sorry about the breakage. I checked why our tests did not find this, and it's because they generate random 32-bit values which are very unlikely to have many duplicates. We can also run them with 8-bit values.
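
The point about 32-bit versus 8-bit test data can be checked with a quick standard-library sketch. This is not the Highway test suite: `distinct_fraction`, the sample size and the seed are arbitrary choices for illustration.

```python
import random


def distinct_fraction(bits: int, n: int = 100_000, seed: int = 0) -> float:
    """Fraction of distinct values among n uniformly random `bits`-bit integers."""
    rng = random.Random(seed)
    seen = {rng.getrandbits(bits) for _ in range(n)}
    return len(seen) / n


# 32-bit keys are almost all distinct, so duplicate-heavy code paths are rarely exercised.
print(f"32-bit: {distinct_fraction(32):.4f} of the keys are distinct")
# 8-bit keys can take at most 256 values, so nearly every key is a duplicate.
print(f" 8-bit: {distinct_fraction(8):.4f} of the keys are distinct")
```

With 100,000 samples, 8-bit keys can have at most 256 distinct values, so inputs like these are much more likely to reach a low-entropy path such as `MaybePartitionTwoValue`.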

Mousius added a commit to Mousius/numpy that referenced this issue Jul 29, 2024
charris pushed a commit to charris/numpy that referenced this issue Jul 29, 2024
* Bump Highway to latest master

Fixes numpy#27037

* Add reproducer
charris pushed a commit to charris/numpy that referenced this issue Jul 29, 2024
* Bump Highway to latest master

Fixes numpy#27037

* Add reproducer
ArvidJB pushed a commit to ArvidJB/numpy that referenced this issue Nov 1, 2024
* Bump Highway to latest master

Fixes numpy#27037

* Add reproducer