-
-
Notifications
You must be signed in to change notification settings - Fork 26k
TST workaround arpack-ng regression in [scipy-dev] #29432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TST workaround arpack-ng regression in [scipy-dev] #29432
Conversation
Nice thanks for this, I have edited your PR description to close all the related automatically opened issues. |
@@ -1028,9 +1028,9 @@ def check_array_api_input_and_values( | |||
def _check_estimator_sparse_container(name, estimator_orig, sparse_type): | |||
rng = np.random.RandomState(0) | |||
X = rng.uniform(size=(40, 3)) | |||
X[X < 0.8] = 0 | |||
X[X < 0.6] = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fixes the problem because with X<0.6
not all elements are set to zero, but with <0.8
that does happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The probability of getting an all-zero training set when subsampling in Halving*SearchCV is lower with the 0.6 threshold. Tests pass locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one question for my education. Otherwise looks good to me.
Should we make a comment in the code about the upstream issue that this helps with?
This particular line of code is not directly related to the problem. Since this regression should never be released, I don't think it's worth trying to explain this in the code. I think the info in this PR and the linked issues should be enough in the unlikely event some people need to investigate this commit for one reason or another. Hopefully, when scipy 1.15.0 is out, it might even be possible to revert this commit if we want, but it's not needed. This estimator check was never intended to yield all-zeros training data. It's doing so only incidentally. |
Thanks for the PR and the detailed explanation! I enabled auto-merge on this PR. Also the scipy-dev CI is green on this PR, see build log |
This should workaround the failure observed on our nightly
[scipy-dev]
CI: #29428 by avoiding running a PCA-based pipeline on zero data when running theHalving*SearchCV
estimator on it.See #29428 for details. The upstream regression is tracked in scipy/scipy#21137 and a fix should prospectively be included before the scipy 1.15.0 release.
However, it would be helpful not to trigger this problem in the scikit-learn common tests in the mean time to keep the scipy-dev CI green and avoid hiding potentially unrelated failures behind a failing nightly CI job.
Close #29422
Close #29423
Close #29424