-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
MNT Add validation for parameter alphas
in RidgeCV
#21606
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some delay, here are some notes of our previous discussions and a suggestion.
I think we need to think about refactoring it for optimality with another maintainer.
sklearn/linear_model/_ridge.py
Outdated
if isinstance(self.alphas, (np.ndarray, list, tuple)): | ||
n_alphas = 1 if np.ndim(self.alphas) == 0 else len(self.alphas) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed with @ArturoAmorQ , this provides correct parameter validation to all sub-classes of _RidgeGCV
.
Yet, in order for the validation to be run once, we need, for each subclass:
- to move those validation at the beginning of
fit
- to move the core of
_fit
in a dedicated private method (_fit
) which does all but the validation - to move other validation steps as well (?)
- to check that the validation is only done once.
For us, this small refactoring has be done in another PR.
cc @glemaitre: we need your authority on the best way to proceed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is indeed what we are leaning towards as a solution. We already applied such a pattern with @jeremiedbb: validating in fit
that will call _fit
where the processing is done with no validation.
b95f434
to
a781866
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add an entry to the change log at doc/whats_new/v1.1.rst
. Like the other entries there, please reference this pull request with :pr:
and credit yourself (and other contributors if applicable) with :user:
.
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some last comments.
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
…earn into validations_ridgecv
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM given that the CI passes. Thank you, Arturo!
@jjerphan on the contrary, thank you! And also thank you @glemaitre for your time and feedback! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @ArturoAmorQ
Thanks @ArturoAmorQ |
MNT Add validation for parameter `alphas` in `RidgeCV` (scikit-learn#21606)
Reference Issues/PRs
Solves a part of the Issue #20724 as mentioned in this comment.
What does this implement/fix? Explain your changes.
Use the helper function
check_scalar
fromsklearn.utils
to validate the scalar parameteralphas
and make sure to get consistent error types and messages inlinear_model.RidgeCV
.In the case that
alphas
is a list, anp.ndarray
or a tuple, the check iterates through the elements and gives feedback on the position of the first parameter that fails the check.Any other comments?
This PR is a first approach to implement the validation for the scalar parameter
alphas
. Making the process more efficient whenfit
is called several times (large number of folds in the cv or large number of points in a grid search) will be addressed in a future PR.