8000 BUG: Workaround segfault in Apple Accelerate framework SGEMV function by sturlamolden · Pull Request #5237 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

BUG: Workaround segfault in Apple Accelerate framework SGEMV function #5237

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

BUG: Workaround segfault in Apple Accelerate framework SGEMV function #5237

wants to merge 1 commit into from

Conversation

sturlamolden
Copy link
Contributor

SGEMV in Accelerate framework will segfault on MacOS X version 10.9
(aka Mavericks) if arrays are not aligned to 32 byte boundaries
and the CPU supports AVX instructions. This can produce segfaults
in np.dot. This patch overshadows the symbols cblas_sgemv, sgemv_ and
sgemv exported by Accelerate to produce the correct behavior. The MacOS X
version and CPU specs are checked on module import. If Mavericks and
AVX are detected the call to SGEMV is emulated with a call to SGEMM
if the arrays are not 32 byte aligned. If the exported symbols cannot
be overshadowed on module import, a fatal error is produced and the
process aborts. All the fixes are in a self-contained C file
and do not alter the multiarray C code. The patch is not applied
unless NumPy is configured to link with Apple's Accelerate
framework.

@sturlamolden
Copy link
Contributor Author

Similar to:
#4007
#5223
scipy/scipy#4105

@sturlamolden sturlamolden changed the title WIP (BUG): Workaround sgefault in Apple Accelerate framework SGEMV function WIP (BUG): Workaround segfault in Apple Accelerate framework SGEMV function Oct 27, 2014
@sturlamolden sturlamolden changed the title WIP (BUG): Workaround segfault in Apple Accelerate framework SGEMV function BUG: Workaround segfault in Apple Accelerate framework SGEMV function Oct 27, 2014
SGEMV in Accelerate framework will segfault on MacOS X version 10.9
(aka Mavericks) if arrays are not aligned to 32 byte boundaries
and the CPU supports AVX instructions. This can produce segfaults
in np.dot. This patch overshadows the symbols cblas_sgemv, sgemv_ and
sgemv exported by Accelerate to produce the correct behavior. The MacOS X
version and CPU specs are checked on module import. If Mavericks and
AVX are detected the call to SGEMV is emulated with a call to SGEMM
if the arrays are not 32 byte aligned. If the exported symbols cannot
be overshadowed on module import, a fatal error is produced and the
process aborts. All the fixes are in a self-contained C file
and do not alter the multiarray C code. The patch is not applied
unless NumPy is configured to link with Apple's Accelerate
framework.
@njsmith
Copy link
Member
njsmith commented Oct 27, 2014

@charris: not sure there's any point in waiting for apple here - AFAIK
their solution is just, upgrade to Yosemite, bug closed as far as they're
concerned.
On 27 Oct 2014 07:38, "Sturla Molden" notifications@github.com wrote:

SGEMV in Accelerate framework will segfault on MacOS X version 10.9
(aka Mavericks) if arrays are not aligned to 32 byte boundaries
and the CPU supports AVX instructions. This can produce segfaults
in np.dot. This patch overshadows the symbols cblas_sgemv, sgemv_ and
sgemv exported by Accelerate to produce the correct behavior. The MacOS X
version and CPU specs are checked on module import. If Mavericks and
AVX are detected the call to SGEMV is emulated with a call to SGEMM
if the arrays are not 32 byte aligned. If the exported symbols cannot
be overshadowed on module import, a fatal error is produced and the
process aborts. All the fixes are in a self-contained C file
and do not alter the multiarray C code. The patch is not applied
unless NumPy is configured to link with Apple's Accelerate

framework.

You can merge this Pull Request by running

git pull https://github.com/sturlamolden/numpy sgemv-fix-1.10.x

Or view, comment on, or merge it at:

#5237
Commit Summary

  • BUG: Workaround sgefault in Apple Accelerate framework SGEMV
  • add sgemv-fix test case

File Changes

Patch Links:


Reply to this email directly or view it on GitHub
#5237.

@charris
Copy link
Member
charris commented Oct 27, 2014

@njsmith Perhaps we can take the Apple approach in six months, Yosemite is still pretty new right now. Upgrading was the solution to the big file write problem...

@sturlamolden
Copy link
Contributor Author

Personally I consider the Yosemite UI a downgrade, just like I did with iOS7. But if that is what it takes to make NumPy behave correctly it is a pain I am willing to endure.

@matthew-brett
Copy link
Contributor

Maybe the question is best put - how small does the proportion of affected OSX 10.9 users have to be before we are prepared to leave them with segfaults from this bug?

Here's one estimate of numbers of OSX versions: https://www.adium.im/sparkle/#osVersion

I am guessing that a) there will be no fix for OSX 10.9 and b) 10.9 will be about 10 percent of users in 6 months time.

@charris
Copy link
Member
charris commented Oct 27, 2014

Wow, that's amazingly fast uptake of 10.10.0.

@matthew-brett
Copy link
Contributor

Yes, it was the same for 10.9. I think 10.8 was down to about 5 percent a year after 10.9 came out.

@sturlamolden
Copy link
Contributor Author

It is the Apple religion, always using the latest software instead of an older an more stabile configuration. Quite the opposite of Linux... It also helps that AppStore more or less forces you to upgrade if you are to continue buing apps and receiving updates for installed ones. It is not so bad on a Mac, but on iOS a software upgrade also tends to render older hardware useless (and there is no warning of what might happen).

@juliantaylor
Copy link
Contributor

I don't think those numbers are very meaningful for the enviromnents numpy is used for. Most are probably regular user macs used for not much else but apps and browsing so they are easy to upgrade.
most people I know in the scientific environment using macs are using terribly old versions most of them around .7 or .8. Upgrading these is not something you do as it will likely break all of your scientific software that was a pain to install in the first place.
The good thing is these people are very unlikely to use 10.9 as that is the version that broke everything with its clang change and (initially) broken python installation.

@matthew-brett
Copy link
Contributor

I agree it's difficult to be sure how the Adium numbers relate to numpy users.

My experience outside my own local environment is of teaching Berkeley students and post-docs, and I would guess they are not far behind the Adium users in terms of uptake.

@matthew-brett
Copy link
Contributor

I got a reply from the Apple engineers on my bug report:

There are no plans to backport the fix to 10.9.

We are closing this bug report.

@sturlamolden
Copy link
Contributor Author

Fantastic. Leave it in there and let random segfaults happen in multimedia applications. That is very responsible, Apple.

@charris
Copy link
Member
charris commented Oct 29, 2014

@matthew-brett Master, you are wise in the way of the Apple.

@matthew-brett
Copy link
Contributor

:)

@matthew-brett
Copy link
Contributor

I think one is superceded by #5223 - now merged. @sturlamolden - can you confirm?

@charris
Copy link
Member
charris commented Apr 6, 2015

That was only for the 1.9 release, that's why I added the needs decision tag to this one. Depends, I suppose, on the uptake of osx 1.10...

@matthew-brett
Copy link
Contributor

Ah - sorry - I didn't realize that was only 1.9. Still 18% on 10.9 on https://www.adium.im/sparkle/#osVersion . Given this patch is pretty safe, I would have thought it should stay in until 10.9 < 5 % or even lower.

@sturlamolden
Copy link
Contributor Author

The patch should be rather safe. It is used in NumPy 1.9 and SciPy 0.14 and 0.15 and has not caused any problems. Because dotblas is merged into multiarray in NumPy 1.10 there are differences in the build scripts. Hence the different PRs for 1.9 and 1.10. The patch itself is a C file and is unchanged.

@sturlamolden
Copy link
Contributor Author

Not using this patch is also "safe". The result will be a segfault, not an incorrect result. Personally I think segfauts are bad, but incorrect results are worse.

@charris
Copy link
Member
charris commented Apr 7, 2015

Still 18% on 10.9

Looks like it needs to go in. @sturlamolden Could you rebase this?

@charris charris modified the milestones: 1.10 blockers, 1.10.0 release Apr 7, 2015
@sturlamolden
Copy link
Contributor Author

Im in a cabin in the middle of the forest and have a very slow connection. I need to drive somewhere to fetch the latest master. Perhaps tomorrow.

@charris
Copy link
Member
charris commented Apr 7, 2015

Not a big rush. If it is inconvenient someone else can make the fix.

@charris
Copy link
Member
charris commented Apr 22, 2015

@sturlamolden Out of the woods yet ;)

@charris
Copy link
Member
charris commented Apr 30, 2015

@sturlamolden Can you rebase this now?

@sturlamolden
Copy link
Contributor Author

Yes, I can try now.

@sturlamolden
Copy link
Contributor Author

From "unknown repository"... WTF?

This just proves my phenomenal Git skills... 😞

@sturlamolden
Copy link
Contributor Author

I am not sure I can do this without closing this PR and creating a new one.

@charris
Copy link
Member
charris commented May 3, 2015

Rebased in #5831, so closing this. Thanks Sturla.

The normal procedure to fix this would be

git rebase master # will error
git status # will let you see which files are in conflict
<edit troublesome files and fix conflicts>
git add <fixed files>
git rebase --continue
git push --force origin HEAD

@charris charris closed this May 3, 2015
@charris
Copy link
Member
charris commented May 3, 2015

The problem sections will be marked with the usual >>>>>', '=====', '<<<<< separators. You can give it a shot on your branch to see how it works. The problem file was numpy/core/setup.py. I used git am to pull the patches from here, so it isn't exactly like the rebase. The git status was

charris@localhost [numpy.git (gh-5237|AM 1/1)]$ git status
On branch gh-5237
You are in the middle of an am session.
  (fix conflicts and then run "git am --continue")
  (use "git am --skip" to skip this patch)
  (use "git am --abort" to restore the original branch)

Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

    new file:   numpy/build_utils/apple_accelerate.py
    new file:   numpy/build_utils/src/apple_sgemv_fix.c
    modified:   numpy/core/tests/test_multiarray.py

Unmerged paths:
  (use "git reset HEAD <file>..." to unstage)
  (use "git add <file>..." to mark resolution)

    both modified:   numpy/core/setup.py

Both modified are the files that need fixing.

charris added a commit that referenced this pull request May 3, 2015
rebase-gh-5237 - BUG: Workaround segfault in Apple Accelerate framework SGEMV
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants
0