8000 MAINT Parameters validation for `fastica` by kianelbo · Pull Request #24924 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content
/ < 8000 a data-pjax="#repo-content-pjax-container" data-turbo-frame="repo-content-turbo-frame" href="/scikit-learn/scikit-learn">scikit-learn Public

MAINT Parameters validation for fastica #24924

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Dec 28, 2022

Conversation

kianelbo
Copy link
Contributor
@kianelbo kianelbo commented Nov 14, 2022

Reference Issues/PRs

Towards #24862

What does this implement/fix? Explain your changes.

Added param validation for sklearn.decomposition.fastica

Any other comments?

The function calls the FastICA class but the validations are not performed since _fit_transform is called. What about using the class validation?

Comment on lines 159 to 173
"X": ["array-like"],
"n_components": [Interval(Integral, 1, None, closed="left"), None],
"algorithm": [StrOptions({"parallel", "deflation"})],
"whiten": [
Hidden(StrOptions({"warn"})),
StrOptions({"arbitrary-variance", "unit-variance"}),
"boolean",
],
"fun": [StrOptions({"logcosh", "exp", "cube"}), callable],
"fun_args": [dict, None],
"max_iter": [Interval(Integral, 1, None, closed="left")],
"tol": [Interval(Real, 0.0, None, closed="left")],
"w_init": ["array-like", None],
"whiten_solver": [StrOptions({"eigh", "svd"})],
"random_state": ["random_state"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All those parameters will be checked by the class FastICA. So no need to check them. It will be enough to verify the 3 remaining parameters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what about X? Leaving out X, makes test_function_param_validation fail because of this I guess. Is it a good idea to give X a "no_validation"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be validated when calling fit or fit_transform, I guess.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that it will test every variable in the signature. I am wondering if we should make a partial check if possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is partial validation possible?
My two first initial solutions are either bypassing X validation with "no_validation" constraint (but I don't know if it is for this purpose) or giving X proper validation ("X": ["array-like"]).
What's your suggestion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean double-validate every parameter or just X?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every parameter indeed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is unfortunate that our validation framework does not allow for partial validation. Moving forward we would need to keep the constraints in sync between the function and the estimator. For this specific PR, the fastica function can call FastICA()._validate_params to validate the common parameters and fastica itself can validate the remaining ones: (X and compute_sources).

@jeremiedbb What do you think of adding a allow_partial_validation flag to validate_params?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I guess I should wait for the partial validations as it seems that the current tests does not allow it 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I guess I should wait for the partial validations as it seems that the current tests does not allow it

@kianelbo To clarify, this PR did not need to be updated, because our tests does not allow for partial validation. I am okay with reverting 6cc6c77 (#24924) and having the double validation for now.

I opened #25058 to track the discussion.

@glemaitre
Copy link
Member

I agree with you: we should use the class FastICA to validate the parameter.
However, we should avoid to compute the source if not requested (by calling whether fit or fit_transform.

@glemaitre
Copy link
Member

So we should have something like:

    est = FastICA(
        n_components=n_components,
        algorithm=algorithm,
        whiten=whiten,
        fun=fun,
        fun_args=fun_args,
        max_iter=max_iter,
        tol=tol,
        w_init=w_init,
        whiten_solver=whiten_solver,
        random_state=random_state,
    )
    if compute_sources:
        S = est.fit_transform(X)
    else:
        est.fit(X)
        S = None

@jeremiedbb jeremiedbb added the Validation related to input validation label Nov 18, 2022
@kianelbo kianelbo requested a review from glemaitre November 19, 2022 11:24
Copy link
Member
@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @kianelbo

@kianelbo kianelbo requested a review from thomasjpfan November 28, 2022 23:39
Copy link
Member
@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @kianelbo

@jeremiedbb jeremiedbb merged commit 3a30538 into scikit-learn:main Dec 28, 2022
@kianelbo kianelbo deleted the fastica-validate branch December 28, 2022 17:05
jjerphan pushed a commit to jjerphan/scikit-learn that referenced this pull request Jan 3, 2023
Co-authored-by: jeremiedbb <jeremiedbb@yahoo.fr>
jjerphan pushed a commit to jjerphan/scikit-learn that referenced this pull request Jan 20, 2023
Co-authored-by: jeremiedbb <jeremiedbb@yahoo.fr>
jjerphan pushed a commit to jjerphan/scikit-learn that referenced this pull request Jan 20, 2023
Co-authored-by: jeremiedbb <jeremiedbb@yahoo.fr>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0