-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
Using OpenBLAS for the released wheels #10105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
AFAIK we rely on the BLAS shipped with numpy so I don't think we have anything to do on the scikit-learn side. |
If I understand correctly, in the current setup appveyor installs the scipy/numpy wheels from http://www.lfd.uci.edu/~gohlke/pythonlibs/ which are built against MKL, scikit-learn/build_tools/appveyor/requirements.txt Lines 1 to 11 in 4d9a12d
so that would make scikit-learn wheels on windows also dependent on MKL? The latest versions of numpy / scipy do have windows wheels on PyPi (according to what you said, built with OpenBlas). However, that's not the case for the versions currently pinned in appveyor. I'm not sure how backward compatibility works here: would it make sense incrementing the numpy/scipy versions in appveyor for v0.20 to those that have an official windows wheel, or not? cc @ogrisel |
Note OpenBLAS in Windows wheels is only on numpy master numpy/numpy#9645.
Looking at https://pypi.python.org/pypi/numpy it seems that for the latest release of numpy i.e. 1.13:
|
You seem to be calling system info though. IIUC it won't find the numpy BLAS. |
Not sure what you mean by this. I have to admit I am not an expert on neither BLAS nor wheels and even less on Windows. @ogrisel may be the right person to pitch in. |
So this is what I'm talking about: Numpy will have a BLAS implementation, but you only can use that implementation through the numpy API. If you call the BLAS API directly, then you won't be using numpy's BLAS, you'll be using your own BLAS. When I say "you seem to be calling system info," what I mean is that your setup script is looking for a BLAS implementation on the system, and it's not finding one. This is clearly apparent in the Appveyor output. The solution to this problem is to copy the numpy appveyor script: https://github.com/numpy/numpy/blob/519991d0f78cc7d37e7dcdd7c0d69a9aa1338ebd/.appveyor.yml#L101-L130:
|
Also, MKL is available if you want it. They've updated their license to be compatible with OSS projects. |
@xoviat Thanks for the additional details! I haven't worked on wheel generation, but according to the mainter docs, scikit-learn wheels for releases are generated in https://github.com/MacPython/scikit-learn-wheels . The appveyor/travis setup there doesn't appear to be strictly identical to those in the main scikit-learn repo. For Linux, as far as I understand, wheels are built with OpenBLAS with the help of Also even if OpenBLAS is used in Windows wheel for the direct BLAS calls, it wouldn't it be preferable to wait for numpy to release official windows wheels with OpenBLAS? Otherwise, it would mean mixing different BLAS libraries for the direct calls and those made via numpy which might be a bit awkward. Just my 2c. Contributors of the |
This has no practical impact.
Would be good to reuse code. |
@xoviat We have discussed this in more detail, it is certainly worth doing, but it is uncertain who would have enough bandwidth to make it happen. Thanks for opening this issue. Long-term approach for this could be to migrate the explict BLAS calls to scipy Cython BLAS API which would allow removing the bundled BLAS, or the necessity to explicitly link against BLAS. This would require scipy 0.16.0 though, while currently, the minimal supported version is 0.13.3. |
I will have an openblas available as soon as some PRs are merged. |
Great, thanks for looking into it! https://github.com/MacPython/scikit-learn-wheels is the place to make PRs scikit-learn wise, I think. I'm currently working on #9485; nightly wheels might be a good place to start testing this. |
You can prepend this to the install script:
You will also need numpy/numpy#9942 to allow numpy to detect the library location automatically. |
Due to some recent debugging in a different issue, I've learned that the Determined this by extracting
We know NumPy wheel is linked against OpenBLAS from it's configuration information (below). Though if one wants to be very sure, this can also be verified by running $ python -c "import numpy; numpy.__config__.show()"
blas_mkl_info:
NOT AVAILABLE
blis_info:
NOT AVAILABLE
openblas_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
lapack_mkl_info:
NOT AVAILABLE
openblas_lapack_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
libraries = ['openblas', 'openblas']
library_dirs = ['/usr/local/lib']
language = c
define_macros = [('HAVE_CBLAS', None)] So why does this happen? My guess (though I haven't tested it) is the configuration information above. In particular, the |
Of course, there is always the chance I've messed something up above. If so, please let me know. |
Thanks for investigating @jakirkham . I'm not an expert on OpenBLAS packaging either, but your comments seem very plausible. From what I saw, both numpy and scipy wheels bundle openblas under I have opened an issue about it in MacPython/scikit-learn-wheels#6 @jeremiedbb would you be able to ask @ogrisel 's input about this ? :) While this is not release blocking, if true it means that pip installed scikit-learn on Linux might be sensibly slower that it needs to be. Tagging for v0.20 to raise awareness about this issue. |
FWIW similar inspection of the macOS wheels shows they are linked to Accelerate. Expect the issue shares the same cause as the Linux case. While not slow necessarily, Accelerate does have known issues with forking, which makes it problematic when using |
No, we don't include openblas in the released wheels. When no BLAS is available at build time, our setup.py are configured to fallback on unoptimized source-based ATLAS code. This is what is used in our linux and windows wheels at the moment. Ideally, our cython code that need BLAS could However, this is not yet possible as we currently support old versions of scipy that do not provide this feature. This should be possible to rewrite our cython code in scikit-learn 0.21 to use this if we bump up the dependency. In the meantime, we can ship openblas in the wheels instead of using the unoptimized source-based ATLAS fallback. However, that requires some work on the wheel building infrastructure. |
Note that the lack of BLAS in our wheels is only impacting functions that call into BLAS directly in Cython code. The parts of scikit-learn that uses numpy or scipy will use the BLAS implementations from those libs. |
While it would be nice to update our wheel system to build against openblas, I am not sure it's worth blocking the 0.20 release for that. |
Thanks for the explanations @ogrisel ! I got confused by the discussion on the numpy mailing list about this (in retrospective I'm not sure why) and though that only applied for Windows. Untaged from 0.20. You are right, I imagine one could always make another minor release 0.20.x if someone does the work on including OpenBLAS in scikit-learn's manylinux wheels.
Opened an issue about this in #11638 |
Hey, I just wanted to let you know that OpenBLAS is available for windows and that you should include it in the released wheels. See the numpy appveyor.yml for an example.
The text was updated successfully, but these errors were encountered: