Description
As initially discussed in #18302 (comment), it might be interesting to add an extra constructor parameter to PLSSVD to select ARPACK or scikit-learn's randomized_svd solver instead of the default LAPACK solver (from scipy.linalg.svd).
However, the ARPACK and randomized_svd solvers are non-deterministic, so we would also need to add a random_state parameter.
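To make the comparison concrete, here is a minimal sketch of the three candidate solvers applied to a cross-covariance matrix of the kind PLSSVD decomposes. The data, shapes, and variable names are made up for illustration, and this is not the proposed implementation; it only shows the existing SciPy and scikit-learn calls and where a seed would come into play.

```python
import numpy as np
from scipy.linalg import svd
from scipy.sparse.linalg import svds
from sklearn.utils.extmath import randomized_svd

# Illustrative data only: shapes and seed are arbitrary assumptions.
rng = np.random.RandomState(0)
X = rng.randn(200, 30)
Y = rng.randn(200, 20)
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

C = X.T @ Y  # cross-covariance matrix, shape (30, 20)
k = 3        # number of components to extract

# LAPACK: full, deterministic decomposition; keep the top k values.
_, s_full, _ = svd(C, full_matrices=False)
s_lapack = s_full[:k]

# ARPACK: truncated and iterative. The starting vector v0 is where
# a random_state would be consumed.
v0 = rng.uniform(-1, 1, size=min(C.shape))
_, s_arpack, _ = svds(C, k=k, v0=v0)
s_arpack = np.sort(s_arpack)[::-1]  # sort descending to match LAPACK

# Randomized: approximate, seeded through random_state.
_, s_rand, _ = randomized_svd(C, n_components=k, random_state=0)
_, s_rand_again, _ = randomized_svd(C, n_components=k, random_state=0)

print("LAPACK    :", s_lapack)
print("ARPACK    :", s_arpack)
print("randomized:", s_rand)
```

On a matrix this small the full LAPACK decomposition is cheap, so the truncated solvers would only be expected to pay off on much larger problems, which is what the benchmarks would need to establish.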
Careful benchmarking to evaluate the speed vs numerical or statistical accuracy trade-off should be conducted to:
- help the user choose the value of this parameter (both in the docstring and the user guide)
- suggest an "auto" strategy to automatically select a good solver based on the shape of the data and the n_components parameter, similar to what is done in the PCA and TruncatedSVD estimators. This "auto" option should become the default after the usual deprecation period.
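As a starting point for discussion, an "auto" rule could look roughly like the sketch below. The helper name `choose_svd_solver`, its arguments, and both thresholds are hypothetical placeholders in the spirit of a size-based rule; none of the numbers come from benchmarks, and producing real ones is exactly what the benchmarking above would be for.

```python
def choose_svd_solver(n_features, n_targets, n_components,
                      small_threshold=500, rank_fraction=0.8):
    """Hypothetical "auto" rule for picking an SVD solver.

    The decomposed matrix is assumed to be the (n_features, n_targets)
    cross-covariance matrix. Thresholds are placeholders, not tuned values.
    """
    smaller, larger = sorted((n_features, n_targets))
    # Small problems: a full LAPACK SVD is cheap and exact.
    if larger <= small_threshold:
        return "lapack"
    # Large problems with few requested components: truncated solver.
    if 1 <= n_components < rank_fraction * smaller:
        return "randomized"
    # Nearly full rank requested: truncation saves little.
    return "lapack"


print(choose_svd_solver(30, 20, 2))
print(choose_svd_solver(5000, 1000, 10))
print(choose_svd_solver(5000, 1000, 900))
```

Whether ARPACK deserves its own branch in such a rule, or should only be available as an explicit choice, is left open here.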