-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
BUG: Workaround segfault in Apple Accelerate framework SGEMV function #5237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Similar to: |
SGEMV in Accelerate framework will segfault on MacOS X version 10.9 (aka Mavericks) if arrays are not aligned to 32 byte boundaries and the CPU supports AVX instructions. This can produce segfaults in np.dot. This patch overshadows the symbols cblas_sgemv, sgemv_ and sgemv exported by Accelerate to produce the correct behavior. The MacOS X version and CPU specs are checked on module import. If Mavericks and AVX are detected the call to SGEMV is emulated with a call to SGEMM if the arrays are not 32 byte aligned. If the exported symbols cannot be overshadowed on module import, a fatal error is produced and the process aborts. All the fixes are in a self-contained C file and do not alter the multiarray C code. The patch is not applied unless NumPy is configured to link with Apple's Accelerate framework.
@charris: not sure there's any point in waiting for apple here - AFAIK
|
@njsmith Perhaps we can take the Apple approach in six months, Yosemite is still pretty new right now. Upgrading was the solution to the big file write problem... |
Personally I consider the Yosemite UI a downgrade, just like I did with iOS7. But if that is what it takes to make NumPy behave correctly it is a pain I am willing to endure. |
Maybe the question is best put - how small does the proportion of affected OSX 10.9 users have to be before we are prepared to leave them with segfaults from this bug? Here's one estimate of numbers of OSX versions: https://www.adium.im/sparkle/#osVersion I am guessing that a) there will be no fix for OSX 10.9 and b) 10.9 will be about 10 percent of users in 6 months time. |
Wow, that's amazingly fast uptake of 10.10.0. |
Yes, it was the same for 10.9. I think 10.8 was down to about 5 percent a year after 10.9 came out. |
It is the Apple religion, always using the latest software instead of an older an more stabile configuration. Quite the opposite of Linux... It also helps that AppStore more or less forces you to upgrade if you are to continue buing apps and receiving updates for installed ones. It is not so bad on a Mac, but on iOS a software upgrade also tends to render older hardware useless (and there is no warning of what might happen). |
I don't think those numbers are very meaningful for the enviromnents numpy is used for. Most are probably regular user macs used for not much else but apps and browsing so they are easy to upgrade. |
I agree it's difficult to be sure how the Adium numbers relate to numpy users. My experience outside my own local environment is of teaching Berkeley students and post-docs, and I would guess they are not far behind the Adium users in terms of uptake. |
I got a reply from the Apple engineers on my bug report:
|
Fantastic. Leave it in there and let random segfaults happen in multimedia applications. That is very responsible, Apple. |
@matthew-brett Master, you are wise in the way of the Apple. |
:) |
I think one is superceded by #5223 - now merged. @sturlamolden - can you confirm? |
That was only for the 1.9 release, that's why I added the needs decision tag to this one. Depends, I suppose, on the uptake of osx 1.10... |
Ah - sorry - I didn't realize that was only 1.9. Still 18% on 10.9 on https://www.adium.im/sparkle/#osVersion . Given this patch is pretty safe, I would have thought it should stay in until 10.9 < 5 % or even lower. |
The patch should be rather safe. It is used in NumPy 1.9 and SciPy 0.14 and 0.15 and has not caused any problems. Because dotblas is merged into multiarray in NumPy 1.10 there are differences in the build scripts. Hence the different PRs for 1.9 and 1.10. The patch itself is a C file and is unchanged. |
Not using this patch is also "safe". The result will be a segfault, not an incorrect result. Personally I think segfauts are bad, but incorrect results are worse. |
Looks like it needs to go in. @sturlamolden Could you rebase this? |
Im in a cabin in the middle of the forest and have a very slow connection. I need to drive somewhere to fetch the latest master. Perhaps tomorrow. |
Not a big rush. If it is inconvenient someone else can make the fix. |
@sturlamolden Out of the woods yet ;) |
@sturlamolden Can you rebase this now? |
Yes, I can try now. |
From "unknown repository"... WTF? This just proves my phenomenal Git skills... 😞 |
I am not sure I can do this without closing this PR and creating a new one. |
Rebased in #5831, so closing this. Thanks Sturla. The normal procedure to fix this would be
|
The problem sections will be marked with the usual
Both modified are the files that need fixing. |
rebase-gh-5237 - BUG: Workaround segfault in Apple Accelerate framework SGEMV
SGEMV in Accelerate framework will segfault on MacOS X version 10.9
(aka Mavericks) if arrays are not aligned to 32 byte boundaries
and the CPU supports AVX instructions. This can produce segfaults
in np.dot. This patch overshadows the symbols cblas_sgemv, sgemv_ and
sgemv exported by Accelerate to produce the correct behavior. The MacOS X
version and CPU specs are checked on module import. If Mavericks and
AVX are detected the call to SGEMV is emulated with a call to SGEMM
if the arrays are not 32 byte aligned. If the exported symbols cannot
be overshadowed on module import, a fatal error is produced and the
process aborts. All the fixes are in a self-contained C file
and do not alter the multiarray C code. The patch is not applied
unless NumPy is configured to link with Apple's Accelerate
framework.