RidgeCV with sample_weights and 'svd' gcv mode · Issue #13321 · scikit-learn/scikit-learn · GitHub
RidgeCV with sample_weights and 'svd' gcv mode #13321
Closed
@jeromedockes

Description

When sample weights are provided, RidgeCV never uses an SVD decomposition of the
design matrix and always uses an eigendecomposition of the Gram matrix:

# FIXME non-uniform sample weights not yet supported

I'm not sure why this is the case. The strategy of multiplying X and Y by the
square root of the sample weights, used when gcv_mode is 'eigen', should work in
the same way for the SVD solver. If I simply remove the lines that set gcv_mode
to 'eigen' when sample_weights are provided, the tests still pass and the fitted
coefficients are the same whether we use 'eigen' or 'svd'. Does somebody know
what I am missing?
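
To make the argument above concrete, here is a minimal NumPy sketch, not scikit-learn's implementation. It assumes no intercept, a single fixed `alpha`, and random data; all variable names are illustrative. It rescales the rows of X and y by the square root of the weights, then compares the ridge coefficients obtained from an SVD of the rescaled design matrix, from an eigendecomposition of its Gram matrix, and from the weighted normal equations.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, alpha = 50, 5, 1.0
X = rng.normal(size=(n_samples, n_features))
y = rng.normal(size=n_samples)
sw = rng.uniform(0.5, 2.0, size=n_samples)  # non-uniform sample weights

# Rescale each row by the square root of its weight.
sqrt_sw = np.sqrt(sw)
Xw = X * sqrt_sw[:, None]
yw = y * sqrt_sw

# Reference: weighted normal equations (X^T W X + alpha I) w = X^T W y.
coef_ref = np.linalg.solve(
    X.T @ (sw[:, None] * X) + alpha * np.eye(n_features),
    X.T @ (sw * y),
)

# SVD route: Xw = U diag(s) Vt, so w = V diag(s / (s^2 + alpha)) U^T yw.
U, s, Vt = np.linalg.svd(Xw, full_matrices=False)
coef_svd = Vt.T @ ((s / (s**2 + alpha)) * (U.T @ yw))

# Gram route: K = Xw Xw^T = Q diag(v) Q^T, dual = (K + alpha I)^-1 yw,
# and the primal coefficients are w = Xw^T dual.
K = Xw @ Xw.T
v, Q = np.linalg.eigh(K)
dual = Q @ ((Q.T @ yw) / (v + alpha))
coef_eigen = Xw.T @ dual

print(np.allclose(coef_svd, coef_ref), np.allclose(coef_eigen, coef_ref))
```

This only checks the fitted coefficients in the intercept-free case. It says nothing about how an intercept or the leave-one-out errors behave under sample weights, which may be where the two modes differ.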

At the very least, a warning could be emitted when the Gram matrix is used
because of the sample weights even though the number of samples is greater than
the number of features. Otherwise, a user fitting a RidgeCV with many samples,
few features, sample weights, and the default parameters might be surprised to
see that it takes a long time and a lot of memory. At the moment, such a warning
is only emitted when the user explicitly asked for gcv_mode='svd':

gcv_mode = 'svd'

warnings.warn("non-uniform sample weights unsupported for svd, "

Also, this warning message could be a bit more explicit, explaining the
performance implications of using 'eigen' rather than 'svd' when
n_samples > n_features.
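
As an illustration of what a more explicit warning could look like, here is a hypothetical sketch. `choose_gcv_mode` and `gram_matrix_bytes` are made-up helpers, not scikit-learn functions, and the message wording is only a suggestion. It assumes the behaviour described above, where sample weights force the 'eigen' mode, and a dense float64 Gram matrix of shape (n_samples, n_samples).

```python
import warnings


def gram_matrix_bytes(n_samples, itemsize=8):
    """Bytes for a dense n_samples x n_samples Gram matrix (float64 by default)."""
    return n_samples * n_samples * itemsize


def choose_gcv_mode(n_samples, n_features, has_sample_weight, gcv_mode="auto"):
    """Hypothetical helper: pick 'svd' or 'eigen' and warn on a costly fallback."""
    if gcv_mode == "auto":
        gcv_mode = "svd" if n_samples > n_features else "eigen"
    if gcv_mode == "svd" and has_sample_weight:
        # Assumed behaviour from the issue: weights force the Gram-matrix route.
        gigabytes = gram_matrix_bytes(n_samples) / 1e9
        warnings.warn(
            "non-uniform sample weights unsupported for svd, using eigen; "
            f"with n_samples={n_samples} > n_features={n_features} this builds "
            f"an n_samples x n_samples Gram matrix (about {gigabytes:.1f} GB)."
        )
        gcv_mode = "eigen"
    return gcv_mode


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    mode_weighted = choose_gcv_mode(100_000, 20, has_sample_weight=True)
    mode_plain = choose_gcv_mode(100_000, 20, has_sample_weight=False)

print(mode_weighted, mode_plain, len(caught))
```

With 100,000 samples the dense Gram matrix alone takes 100,000 × 100,000 × 8 bytes, i.e. 80 GB, whereas the design matrix with 20 features takes 16 MB. That gap is the kind of information the warning could surface.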
