8000 Remove validation from `__init__` and `set_params` · Issue #21406 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

Remove validation from __init__ and set_params #21406

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
14 tasks done
thomasjpfan opened this issue Oct 22, 2021 · 22 comments · Fixed by #21430 or #21880
Closed
14 tasks done

Remove validation from __init__ and set_params #21406

thomasjpfan opened this issue Oct 22, 2021 · 22 comments · Fixed by #21430 or #21880
Labels
Meta-issue General issue associated to an identified list of tasks Moderate Anything that requires some knowledge of conventions and best practices Sprint

Comments

@thomasjpfan
Copy link
Member
thomasjpfan commented Oct 22, 2021

This issue is reserved for the Data Umbrella Africa & Middle East Sprint on October 23, 2021 as a good first issue. For non-sprint contributors, please wait till October 30, 2021 to work on this issue.

Follow up to #16945

  1. Make sure you have the development dependencies and documentation dependencies installed.
  2. Pick an from the list below and leave a comment saying you are going to work on it. This way we can keep track of what everyone is working on.
  3. Remove the estimator(s) from:

VALIDATE_ESTIMATOR_INIT = [

  1. Run the test for checking for no validation, which should fail:
pytest sklearn/tests/test_common.py::test_estimators_do_not_raise_errors_in_init_or_set_params
  1. Update the codebase so the above test passes.
  2. Open a Pull Request with an opening message Addresses #21406. Note that each item should be submitted in a separate Pull Request.
  3. Include the function name in the title of the pull request. For example: "ENH Removes validation in __init__ for _______".

Estimators on the same line should be fixed together since they share a parent class.

@HayaAlmutairi
Copy link
Contributor
HayaAlmutairi commented Oct 23, 2021

Me and @krumeto will work on FactorAnalysis

@amueller
Copy link
Member

@HayaAlmutairi Can you please pick one of the estimators / check boxes above to start with, so other people can work on the other ones?

@DessyVV
Copy link
Contributor
DessyVV commented Oct 23, 2021

Me and @LucyJimenez will start working on KernelDensity.

@Haidar13
Copy link
Contributor

I will be working on NuSVC, NuSVR, SVC, SVR, OneClassSVM

@marenwestermann
Copy link
Member

@hhnnhh and I are working on FastICA.

@krumetoft
Copy link
Contributor

Working on LabelBinarizer.

@DessyVV
Copy link
Contributor
DessyVV commented Oct 24, 2021

@glemaitre I think this issue got closed by mistake because of my PR. Could it be reopened?

@chritter
Copy link
Contributor

@fortune-uwha and I will be working on RadiusNeighborsClassifier.

@chritter
Copy link
Contributor

@thomasjpfan We noticed that the Issue title mentions get_params but the test function in test_common.py tests the get_params function. Could you clarify this? Thanks!

@thomasjpfan thomasjpfan changed the title Remove validation from __init__ and get_params Remove validation from __init__ and set_params Oct 31, 2021
@thomasjpfan
Copy link
Member Author

We noticed that the Issue title mentions get_params but the test function in test_common.py tests the get_params function. Could you clarify this?

Thanks for catching this. The title was wrong.

@chritter
Copy link
Contributor
chritter commented Nov 4, 2021

@thomasjpfan I would suggest a slight modification of the input validation test when testing set_params due to the limitation of inspect.signature which returns kwargs as a parameter if defined as **kwargs in the function signature (example for RadiusNeighborsClassifier further below). To address this, I would suggest to call get_params() to retrieve the arguments first and pass those with the test values to set_params(**new_params) as follows:

def test_estimators_do_not_raise_errors_in_init_or_set_params(Estimator):
    """Check that init or set_param does not raise errors."""
    params = signature(Estimator).parameters

    smoke_test_values = [-1, 3.0, "helloworld", np.array([1.0, 4.0]), {}, []]
    for value in smoke_test_values:
        new_params = {key: value for key in params}

        # Does not raise
        est = Estimator(**new_params)

        new_params_set = {key: value for key in est.get_params()}
        # Also do does not raise
        est.set_params(**new_params_set)
from sklearn.neighbors import RadiusNeighborsClassifier
from inspect import signature

clf = RadiusNeighborsClassifier()

params = clf.get_params()
param_names = [param for param in params]

# signature returns **kwargs as param kwargs which is not a valid param
# available in param_names
param_names_sign = [param for param in signature(RadiusNeighborsClassifier
                                            ).parameters]

@MaggieChege
Copy link
Contributor

Working on KernelPCA

@MrinalTyagi
Copy link
Contributor

Hi. I would like to work in SGDOneClassSVM

@marenwestermann
Copy link
Member
marenwestermann commented Nov 6, 2021

Working on FeatureHasher with @hhnnhh

@VibhutiBansal-11
Copy link

Hi, I am working on GridSearchCV .

@marenwestermann
Copy link
Member

I'm working on TheilSenRegressor and TweedieRegressor.

@iofall
Copy link
Contributor
iofall commented Nov 28, 2021

Hey @MrinalTyagi are you still working on SGDOneClassSVM?

@MrinalTyagi
Copy link
Contributor

Hey @MrinalTyagi are you still working on SGDOneClassSVM?

sorry for not updating. Yes, I worked on this issue and will push it in a day.

@MrinalTyagi
Copy link
Contributor
MrinalTyagi commented Dec 4, 2021

Working on GridSearchCV and HalvingGridSearchCV

@arisayosh
Copy link
Contributor

Taking a look at ColumnTransformer, Pipeline, FeatureUnion with @iofall

@marenwestermann
Copy link
Member

ColumnTransformer, TheilSenRegressor, and TweedieRegressor can be removed from the list.

@ogrisel
Copy link
Member
ogrisel commented Sep 15, 2022

This was the last! Thank you everybody!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Meta-issue General issue associated to an identified list of tasks Moderate Anything that requires some knowledge of conventions and best practices Sprint
Projects
None yet
0