🔒 🤖 CI Update lock files for scipy-dev CI build(s) 🔒 🤖 by scikit-learn-bot · Pull Request #29428 · scikit-learn/scikit-learn · GitHub
🔒 🤖 CI Update lock files for scipy-dev CI build(s) 🔒 🤖 #29428

Merged

scikit-learn-bot
Contributor

Update lock files.

Note

If the CI tasks fail, create a new branch based on this PR and add the required fixes to that branch.

github-actions bot commented Jul 8, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: edb741d. Link to the linter CI: here

@ogrisel
Member
ogrisel commented Jul 8, 2024

By the look of it, this does not seem to be a problem in scikit-learn itself but rather an unexpected change of behavior in scipy. I will try to reproduce locally and extract a minimal reproducer to confirm.

EDIT: the estimator checks fail only with HalvingRandomizedSearchCV and HalvingGridSearchCV, on sparse data with many zeros:

test_search_cv[HalvingGridSearchCV(cv=2,error_score='raise',estimator=Pipeline(steps=[('pca',PCA()),('ridge',Ridge())]),min_resources='smallest',param_grid={'ridge__alpha':[0.1,1.0]},random_state=0)-check_estimator_sparse_array] _

Halving*SearchCV estimators resample the training data into subsets with a very small number of rows. On a sparse matrix with 80% sparsity, such a subset has a high probability of containing only zeros.
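To give a rough sense of how likely this is, here is a back-of-the-envelope sketch. It assumes each entry is zeroed independently with probability 0.8, which may not match exactly how the estimator check generates its data. The (2, 3) shape is taken from the failing traceback, and `prob_all_zero` is a hypothetical helper name.

```python
def prob_all_zero(n_rows, n_cols, sparsity=0.8):
    """Probability that every entry of an (n_rows, n_cols) block is zero,
    assuming entries are zeroed independently with probability `sparsity`."""
    return sparsity ** (n_rows * n_cols)


# Shape of the resampled X seen in the failing traceback.
p = prob_all_zero(2, 3)
print(f"P(all-zero 2x3 block) = {p:.3f}")  # 0.8 ** 6 = 0.262
```

Under this assumption, roughly one in four 2x3 subsets would be entirely zero, so hitting the failure across several halving iterations and parameter candidates is not surprising.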

Here is the relevant info from the failing traceback:

../1/s/sklearn/decomposition/_pca.py:468: in fit_transform
    U, S, _, X, x_is_centered, xp = self._fit(X)
        X          = <Compressed Sparse Row sparse matrix of dtype 'float64'
	with 0 stored elements and shape (2, 3)>
        self       = PCA()
        y          = array([2, 2])

and subsequently:

E               scipy.sparse.linalg._eigen.arpack.arpack.ArpackError: ARPACK error -9: Starting vector is zero.

Notice that X has no non-zero entries.
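As an illustration only, one way to detect such an input before it reaches the solver is sketched below. The helper name `has_no_nonzero_entries` is hypothetical and is not part of scikit-learn or scipy. It uses `count_nonzero()` rather than `nnz` because a sparse matrix can store explicit zeros.

```python
import numpy as np
import scipy.sparse as sp


def has_no_nonzero_entries(X):
    """Return True if X (dense or sparse) contains only zeros."""
    if sp.issparse(X):
        # nnz counts stored elements, which may include explicit zeros;
        # count_nonzero() looks at the actual values.
        return X.count_nonzero() == 0
    return not np.any(X)


# Same situation as in the traceback: 0 stored elements, shape (2, 3).
X_empty = sp.csr_matrix((2, 3), dtype=np.float64)

# A matrix with one stored element whose value is an explicit zero.
X_explicit_zero = sp.csr_matrix(([0.0], ([0], [0])), shape=(2, 3))

print(has_no_nonzero_entries(X_empty))          # True
print(has_no_nonzero_entries(X_explicit_zero))  # True
print(has_no_nonzero_entries(sp.csr_matrix(np.eye(2))))  # False
```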

@ogrisel
Member
ogrisel commented Jul 8, 2024

Here is a minimal repro:

import numpy as np
import scipy
import scipy.sparse as sp

print(f"scipy version: {scipy.__version__}")

for out in sp.linalg.svds(np.zeros((2, 2)), k=1):
    print(out)

Here are the outputs for the latest stable vs dev versions of scipy:

scipy version: 1.14.0
[[1.]
 [0.]]
[0.]
[[-0.47655702 -0.87914356]]
scipy version: 1.15.0.dev0+1098.a2c613f
Traceback (most recent call last):
  File "/Users/ogrisel/tmp/repro_arpack.py", line 7, in <module>
    for out in sp.linalg.svds(np.zeros((2, 2)), k=1):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/miniforge3/envs/scipy-dev/lib/python3.12/site-packages/scipy/sparse/linalg/_eigen/_svds.py", line 515, in svds
    _, eigvec = eigsh(XH_X, k=k, tol=tol ** 2, maxiter=maxiter,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/miniforge3/envs/scipy-dev/lib/python3.12/site-packages/scipy/sparse/linalg/_eigen/arpack/arpack.py", line 1701, in eigsh
    params.iterate()
  File "/Users/ogrisel/miniforge3/envs/scipy-dev/lib/python3.12/site-packages/scipy/sparse/linalg/_eigen/arpack/arpack.py", line 574, in iterate
    raise ArpackError(self.info, infodict=self.iterate_infodict)
scipy.sparse.linalg._eigen.arpack.arpack.ArpackError: ARPACK error -9: Starting vector is zero.

I will report upstream.
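For comparing behavior across scipy versions without the script aborting, the repro above can be wrapped as in the following sketch. The function name `try_svds_on_zeros` is hypothetical; `ArpackError` is imported from `scipy.sparse.linalg`. The outcome depends on the installed scipy version, so no particular output is assumed here.

```python
import numpy as np
import scipy
from scipy.sparse.linalg import ArpackError, svds


def try_svds_on_zeros(shape=(2, 2), k=1):
    """Run svds on an all-zero matrix and report what happened.

    Returns ("ok", singular_values) if svds succeeds, or
    ("arpack_error", message) if ARPACK rejects the input.
    """
    try:
        _, s, _ = svds(np.zeros(shape), k=k)
    except ArpackError as exc:
        return "arpack_error", str(exc)
    return "ok", s


status, detail = try_svds_on_zeros()
print(f"scipy version: {scipy.__version__}")
print(status, detail)
```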

@ogrisel ogrisel merged commit 1e338bb into scikit-learn:main Jul 8, 2024
31 checks passed
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Sep 9, 2024
…29428)

Co-authored-by: Lock file bot <noreply@github.com>
Co-authored-by: Loïc Estève <loic.esteve@ymail.com>
glemaitre pushed a commit that referenced this pull request Sep 11, 2024
Co-authored-by: Lock file bot <noreply@github.com>
Co-authored-by: Loïc Estève <loic.esteve@ymail.com>