BUG: Ugly fix for Apple's cblas_sgemv segfault #5223
Merged
SGEMV in Apple's Accelerate framework segfaults on Mac OS X 10.9
(Mavericks) if the arrays are not aligned to 32-byte boundaries
and the CPU supports AVX instructions. This can cause segfaults
in numpy.dot when the dtype is numpy.float32. This patch overshadows
the symbols cblas_sgemv, sgemv_ and sgemv exported by Accelerate
to produce the correct behavior. The Mac OS X version and CPU
capabilities are checked on module import. If Mavericks and AVX are
detected, a call to SGEMV is emulated with a call to SGEMM whenever
the arrays are not 32-byte aligned. If the exported symbols cannot be
overshadowed on module import, a fatal error is raised and the
process aborts. All the fixes live in a self-contained C file
and do not alter the _dotblas C code. The patch is applied only
when NumPy is configured to link against Apple's Accelerate
framework.
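
For illustration only, here is a minimal sketch of the two ideas described above: testing a pointer for 32-byte alignment, and expressing a matrix-vector product as a matrix-matrix product whose right-hand side has a single column. It is not the code from this patch. The names is_aligned_32, ref_sgemm, sgemv_via_sgemm and needs_workaround are hypothetical, ref_sgemm is a naive row-major stand-in for the real SGEMM so that the example builds without Accelerate, and checking all three arrays is an assumption about which pointers matter.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p lies on a 32-byte boundary. */
static int is_aligned_32(const void *p)
{
    return ((uintptr_t)p & 31u) == 0;
}

/* Naive stand-in for SGEMM (row-major, no transposes):
 * C = alpha * A * B + beta * C, with A m-by-k, B k-by-n, C m-by-n. */
static void ref_sgemm(int m, int n, int k, float alpha,
                      const float *A, const float *B,
                      float beta, float *C)
{
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            float acc = 0.0f;
            for (int p = 0; p < k; p++)
                acc += A[i * k + p] * B[p * n + j];
            C[i * n + j] = alpha * acc + beta * C[i * n + j];
        }
    }
}

/* y = alpha * A * x + beta * y, computed as a GEMM in which x and y
 * are treated as n-by-1 and m-by-1 matrices respectively. */
static void sgemv_via_sgemm(int m, int n, float alpha,
                            const float *A, const float *x,
                            float beta, float *y)
{
    ref_sgemm(m, 1, n, alpha, A, x, beta, y);
}

/* Assumed dispatch rule: take the SGEMM path only on an affected
 * platform (the caller passes the result of its own OS/CPU check)
 * and only when at least one array is not 32-byte aligned. */
static int needs_workaround(int affected_platform, const float *A,
                            const float *x, const float *y)
{
    if (!affected_platform)
        return 0;
    return !(is_aligned_32(A) && is_aligned_32(x) && is_aligned_32(y));
}
```

A wrapper around the real routine would call needs_workaround first and fall through to the original SGEMV when it returns 0, so aligned inputs and unaffected systems keep the fast path.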