RidgeCV with sample_weights and 'svd' gcv mode

When sample weights are provided, RidgeCV never uses an SVD decomposition of the
design matrix and always uses an eigendecomposition of the Gram matrix:

scikit-learn/sklearn/linear_model/ridge.py

Line 1036 in 7389dba

# FIXME non-uniform sample weights not yet supported

I'm not sure why this is the case. The strategy of multiplying X and y by the
square root of the sample weights, used when gcv_mode is 'eigen', should work in
the same way for the 'svd' solver. If I simply remove the lines that set gcv_mode
to 'eigen' when sample weights are provided, the tests still pass and the fitted
coefficients are the same whether we use 'eigen' or 'svd'. Does somebody know what
I am missing?
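To make the rescaling argument concrete, here is a minimal NumPy sketch. It is not the scikit-learn code: it ignores the intercept and the leave-one-out error computation, and the function names are made up for illustration. It only checks that, after multiplying X and y by the square root of the weights, an SVD of the design matrix and an eigendecomposition of the Gram matrix give the same weighted ridge coefficients.

```python
import numpy as np


def weighted_ridge_normal_eq(X, y, w, alpha):
    # Reference solution of min_b sum_i w_i (y_i - x_i @ b)^2 + alpha ||b||^2.
    A = X.T @ (w[:, None] * X) + alpha * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ (w * y))


def weighted_ridge_svd(X, y, w, alpha):
    # Rescale rows by sqrt(w), then use the thin SVD of the design matrix.
    sw = np.sqrt(w)
    Xs, ys = sw[:, None] * X, sw * y
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    return Vt.T @ ((s / (s ** 2 + alpha)) * (U.T @ ys))


def weighted_ridge_gram_eigen(X, y, w, alpha):
    # Same rescaling, but eigendecompose the (n_samples, n_samples) Gram matrix.
    sw = np.sqrt(w)
    Xs, ys = sw[:, None] * X, sw * y
    evals, Q = np.linalg.eigh(Xs @ Xs.T)
    dual_coef = Q @ ((Q.T @ ys) / (evals + alpha))
    return Xs.T @ dual_coef


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = rng.normal(size=50)
w = rng.uniform(0.5, 2.0, size=50)
alpha = 1.0

coef_ref = weighted_ridge_normal_eq(X, y, w, alpha)
coef_svd = weighted_ridge_svd(X, y, w, alpha)
coef_eig = weighted_ridge_gram_eigen(X, y, w, alpha)
print(np.allclose(coef_ref, coef_svd), np.allclose(coef_ref, coef_eig))
```

Note that the Gram route builds a 50 x 50 matrix here while the SVD route only needs factors of the 50 x 3 design matrix, which is where the cost difference comes from when there are many samples and few features.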

At the very least, a warning could be emitted when the Gram matrix is used
because of the sample weights, even though the number of samples is greater than
the number of features. Otherwise a user fitting a RidgeCV with many samples, few
features, sample weights, and the default parameters might be surprised to see
that it takes a long time and a lot of memory. At the moment, such a warning is
only emitted when the user explicitly asked for gcv_mode='svd':

scikit-learn/sklearn/linear_model/ridge.py

Line 1034 in 7389dba

gcv_mode = 'svd'

scikit-learn/sklearn/linear_model/ridge.py

Line 1037 in 7389dba

warnings.warn("non-uniform sample weights unsupported for svd, "

Also, this warning message could be a bit more explicit, explaining the
performance implications of using 'eigen' rather than 'svd' when
n_samples > n_features.
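As a rough illustration of what a more explicit message could look like, here is a sketch of a helper. `warn_if_gram_is_costly` is a hypothetical name, not something that exists in scikit-learn, and the wording and the float64 size estimate are only assumptions.

```python
import warnings


def warn_if_gram_is_costly(n_samples, n_features, has_sample_weight, itemsize=8):
    """Hypothetical helper: warn when 'eigen' is forced on a tall design matrix.

    Returns the warning message, or None when no warning is needed.
    """
    if not has_sample_weight or n_samples <= n_features:
        return None
    # The Gram matrix is (n_samples, n_samples); X is (n_samples, n_features).
    gram_mb = n_samples * n_samples * itemsize / 1e6
    design_mb = n_samples * n_features * itemsize / 1e6
    msg = (
        "non-uniform sample weights unsupported for svd, forcing usage of "
        "eigen. With n_samples=%d > n_features=%d this builds a Gram matrix "
        "of about %.1f MB (the design matrix is about %.1f MB), so fitting "
        "may be slow and memory hungry." % (n_samples, n_features, gram_mb, design_mb)
    )
    warnings.warn(msg, UserWarning)
    return msg


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    message = warn_if_gram_is_costly(10000, 10, has_sample_weight=True)
    silent = warn_if_gram_is_costly(10000, 10, has_sample_weight=False)
```

The first call records one warning that mentions the estimated Gram matrix size; the second returns None because no sample weights are involved.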
