-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
BUG: Ensure lstsq can handle RHS with all sizes. #9976
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This fixes a bug in the creation of workspace arrays for a call to `lapack_lite.zgelsd`, which led to segmentation faults when a RHS was passed in that had larger size than the size of the matrix.
rwork = zeros((lwork,), real_t) | ||
a_real = zeros((m, n), real_t) | ||
bstar_real = zeros((ldb, n_rhs,), real_t) | ||
results = lapack_lite.dgelsd(m, n, n_rhs, a_real, m, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must admit that I don't understand why this call for size information on the regular, non-complex routine was made. I suspect it was to work around a bug in zgelsd
, which may well have been fixed with the update to lapack-lite.
In any case, with my changes, the routine just uses the results of the first call, which should be correct.
Misclick... This is actually one of the motivations behind the lapacklite upgrade. I have a vectorised lstsq in my working tree from long ago that also fixes this, but we never agreed on the shape of the output (since right now it varies based on the matrix rank, which doesn't make sense for stacked matrices) |
It indeed looked like this could be done more easily; for now, it may be good to at least fix this obvious problem... |
IIRC, the fix for this was in lapack 3.2.2. I don't remember if this is already the minimum version of lapack that numpy supports. Relatedly, |
I cannot find what minimum version we are supposed to support... @charris, @rgommers: where should one look? I do note that scipy is moving to lapack >=3.3 in 1.1 (scipy/scipy#6051). |
@mhvk I don't think we have an official minimum. I think scipy was using 3.2 for Accelerate compatibility on the Mac and I don't know the status of Accelerate for this bug. I think that we are going to drop Accelerate at some point, but that needs discussion and preparation. @rgommers, @pv Thoughts? |
Apparently the lapack version used in accelerate can be found in the system header and was 3.2.1 back in around 2011. I think we should assume that it is at least 3.2.2 these days, although it would be nice if someone could check. Scipy currently handles this by falling back to |
I want to put this an. Anyone opposed? |
Do we want to add something like this, to guard against older versions? https://github.com/scipy/scipy/blob/master/scipy/linalg/basic.py#L1212-L1219 |
@eric-wieser That would probably be a good idea, although if we are going to require lapack >= 3.2.2 we could just raise an error rather than falling back on |
Raising an error is indeed what I was intending to suggest |
Here's the comment where I found the another bug that 3.2.2 makes fixing easy. |
This all sounds right, don't have much to add. Numpy should probably just follow Scipy in the Accelerate drop (which indeed still needs prep). |
All reactions
@eric-wieser, @charris, I just noticed that the work-around in scipy you point to is for real data, not complex; for complex, nothing special is done, which suggests my fix should be OK. For real data, indeed we should eventually follow up on the comment linked above and let |
Thanks Marten. |
After reverting to an old version of lapack_lite (3.0.x), I now get a segfault with this commit. Perhaps the workspace size should be initialized to Either that, or we add a release note saying that lapack >= 3.2.2 is now required, and add that to the docs too. |
In this comment: #10156 (comment)
Have we tested this on Mac? |
To be concrete, the exact version is 3.2.1. And the bug fixes for |
To be clear, is the segmentation fault confined to apple accelerate? |
The segfault mentioned above was tested on windows with an old lapack_lite (not an issue in release, which bundles a newer lapack_lite). I'm speculating that it will fail on accelerate, but have no way to find out. |
Somehow I thought we had actually gotten rid of Accelerate support... |
It has been discussed, maybe we should move forward on that for the next release. @rgommers Thoughts? |
For apple users, there are two options here:
1. Installing MKL/OpenBLAS with the conda package manager
2. Installing OpenBLAS with homebrew
Sure, before these package managers existed, sticking with accelerate
provided a significant benefit to OSX users: they never had to endure the
pain that so many Windows users experienced not having a BLAS builtin. But
I think these problems have been ameliorated enough over time to the point
that it's not too much to ask for apple users to type `brew install
OpenBLAS && brew link OpenBLAS --force` or `conda install openblas` or
`conda install mkl`.
|
Perhaps one future improvement to the situation would be detecting OpenBLAS
on OSX if homebrew did not link it...
|
Can we focus on determining whether we need to rollback this change in 1.13.4 and 1.14.0 before we discuss whether we should drop accelerate in 1.15.0? |
I don't think this PR was merged into the 1.13.x branch.
|
#10087 was, which is a backport |
That PR says that it was closed, not merged. Can you find that commit in the branch? |
I hadn't spotted that it was closed not merged. Still, something we need to worry about for 1.14.0 |
That's true. However, would reverting this PR on 1.14 re-introduce gh-9891 with the latest LAPACK? If so, I'm not in favor of reverting it. |
It would, but it would be possible to do a hybrid approach by checking the workspace sizes are not unset |
If it is an actual issue, can we drop accellerate for 1.14? We haven't actually released 1.14.0 yet. |
This fixes a bug in the creation of workspace arrays for a call to
lapack_lite.zgelsd
, which led to segmentation faults when a RHS was passed in that had larger size than the size of the matrix.fixes #9891