When sample weights are provided, RidgeCV never uses an SVD decomposition of the
design matrix and always uses an eigendecomposition of the Gram matrix:
scikit-learn/sklearn/linear_model/ridge.py
Line 1036 in 7389dba

# FIXME non-uniform sample weights not yet supported
I'm not sure why this is the case. The strategy of multiplying X and y by the
square root of the sample weights, used when gcv_mode is 'eigen', should work in
the same way for the 'svd' solver. If I simply remove the lines that set gcv_mode
to 'eigen' when sample weights are provided, the tests still pass and the fitted
coefficients are the same whether we use 'eigen' or 'svd'. Does somebody know what
I am missing?
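
For context, the equivalence that the rescaling trick relies on can be checked with
the public Ridge estimator. This is a minimal sketch of that check, not the RidgeCV
internals, and it assumes fit_intercept=False to keep centering details out of the way:

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.randn(50, 5)
y = rng.randn(50)
w = rng.uniform(0.5, 2.0, size=50)

# Weighted fit: minimizes sum_i w_i * (y_i - x_i @ coef)**2 + alpha * ||coef||**2.
weighted = Ridge(alpha=1.0, fit_intercept=False).fit(X, y, sample_weight=w)

# Unweighted fit on X and y pre-multiplied by sqrt(w): same objective, so the
# rescaling does not depend on which decomposition is used afterwards.
sw = np.sqrt(w)
rescaled = Ridge(alpha=1.0, fit_intercept=False).fit(X * sw[:, None], y * sw)

np.testing.assert_allclose(weighted.coef_, rescaled.coef_)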
At the very least, a warning could be emitted when the Gram matrix is used because
of the sample weights even though the number of samples is greater than the number
of features. Otherwise, a user fitting a RidgeCV with many samples, few features,
sample weights, and the default parameters might be surprised that it takes a long
time and a lot of memory. At the moment, such a warning is only emitted when the
user explicitly asked for gcv_mode='svd':
scikit-learn/sklearn/linear_model/ridge.py
Line 1034 in 7389dba
scikit-learn/sklearn/linear_model/ridge.py
Line 1037 in 7389dba

warnings.warn("non-uniform sample weights unsupported for svd, "
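
To put a rough number on the performance point (my back-of-the-envelope figures
assuming float64 storage, not anything measured in scikit-learn): forcing the
'eigen' path means eigendecomposing the n_samples x n_samples Gram matrix X @ X.T,
while the 'svd' path only stores factors sized by n_features:

# Tall, skinny design matrix, as in the scenario above.
n_samples, n_features = 100_000, 10

gram_bytes = n_samples ** 2 * 8          # Gram matrix built when gcv_mode='eigen'
svd_bytes = n_samples * n_features * 8   # dominant factor (U) of a thin SVD of X
print(f"Gram matrix: {gram_bytes / 1e9:.0f} GB; thin SVD factors: {svd_bytes / 1e6:.0f} MB")
# Gram matrix: 80 GB; thin SVD factors: 8 MB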
Also, this warning message could be a bit more explicit, explaining the
performance implications of using 'eigen' rather than 'svd' when
n_samples > n_features.
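
As one possible wording (just a suggestion, not the message currently in ridge.py),
the warning could spell out both the fallback and its cost:

import warnings

warnings.warn(
    "non-uniform sample weights are not supported with gcv_mode='svd'; "
    "falling back to gcv_mode='eigen', which decomposes the "
    "n_samples x n_samples Gram matrix and can be much slower and use much "
    "more memory than 'svd' when n_samples > n_features")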