8000 ENH Remove validation from `__init__` for `SGDOneClassSVM` by betatim · Pull Request #24433 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

ENH Remove validation from __init__ for SGDOneClassSVM #24433

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Sep 15, 2022

Conversation

betatim
Copy link
Member
@betatim betatim commented Sep 14, 2022

Co-authored-by: iofall 50991099+iofall@users.noreply.github.com
Co-authored-by: arisayosh 15692997+arisayosh@users.noreply.github.com

Continuation of #21944 which contributes to #21406

@betatim
Copy link
Member Author
betatim commented Sep 14, 2022

What exactly does

VALIDATE_ESTIMATOR_INIT = [
"SGDOneClassSVM",
]
VALIDATE_ESTIMATOR_INIT = set(VALIDATE_ESTIMATOR_INIT)
@pytest.mark.parametrize(
"Estimator",
[est for name, est in all_estimators() if name not in VALIDATE_ESTIMATOR_INIT],
)
def test_estimators_do_not_raise_errors_in_init_or_set_params(Estimator):
"""Check that init or set_param does not raise errors."""
# Remove parameters with **kwargs by filtering out Parameter.VAR_KEYWORD
# TODO: Remove in 1.2 when **kwargs is removed in RadiusNeighborsClassifier
params = [
name
for name, param in signature(Estimator).parameters.items()
if param.kind != Parameter.VAR_KEYWORD
]
smoke_test_values = [-1, 3.0, "helloworld", np.array([1.0, 4.0]), [1], {}, []]
for value in smoke_test_values:
new_params = {key: value for key in params}
# Does not raise
est = Estimator(**new_params)
# Also do does not raise
est.set_params(**new_params)
check for? And why is SGDOneClassSVM the only estimator that is being checked? I was expecting this list to contain all estimators, except for those that do things like validate parameters. So I am a bit puzzled what happens here.

However based on what the code looks like ... should we remove SGDOneClassSVM from the list and then remove the test because the list is empty?

iofall and others added 2 commits September 14, 2022 09:51
Co-authored-by: iofall <50991099+iofall@users.noreply.github.com>
Co-authored-by: arisayosh <15692997+arisayosh@users.noreply.github.com>
@betatim betatim force-pushed the Issue-21406-SGDOneClassSVM branch from 44e1d8b to 72c8061 Compare September 14, 2022 07:52
@jeremiedbb
Copy link
Member

And why is SGDOneClassSVM the only estimator that is being checked?

it's the opposite actually. See the parametrization
[est for name, est in all_estimators() if name not in VALIDATE_ESTIMATOR_INIT]

What exactly does check for?

It just checks that no error is raised (hence that no validation is done) when passing wrong params to init. All validations should happen in fit because meta estimators like GridSearchCV need to set params on already instanciated estimators

@betatim
Copy link
Member Author
betatim commented Sep 14, 2022

[est for name, est in all_estimators() if name not in VALIDATE_ESTIMATOR_INIT]

Need to get my 👓 checked, the name of the variable made me think that "this is the initial list to consider, now lets filter it" for this list comprehension ... didn't see the not :)

Removed SGDOneClass and put the parameter validation back in the if.

Copy link
Member
@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @betatim

@jeremiedbb jeremiedbb added this to the 1.2 milestone Sep 14, 2022
@jeremiedbb jeremiedbb added the Quick Review For PRs that are quick to review label Sep 14, 2022
Copy link
Member
@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked that alpha = self.nu / 2 is already present in the SGDOneClassSVM.fit amd .partial_fit methods in scikit-learn's main branch so indeed we can just remove the alpha from the constructor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module:linear_model Quick Review For PRs that are quick to review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0