8000 MAINT Parameters validation for chi2_kernel with gamma by rand0wn · Pull Request #26153 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

MAINT Parameters validation for chi2_kernel with gamma #26153

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jun 27, 2023

Conversation

rand0wn
Copy link
Contributor
@rand0wn rand0wn commented Apr 11, 2023

Reference Issues/PRs

Towards #24862

What does this implement/fix? Explain your changes.

Added automatic parameter validation for "sklearn.metrics.pairwise.chi2_kernel".

Any other comments?

The validation of "sklearn.metrics.pairwise.additive_chi2_kernel" doesn't include validation of gamma, so even through X and Y are revalidated, gamm 8000 a isn't, additive_chi2_kernel validation can be removed as it may not be needed.

@rand0wn rand0wn changed the title MAINT Parameters validation for chi2_kernel with consideration of sca… MAINT Parameters validation for chi2_kernel with gamma Apr 11, 2023
Abhishek added 2 commits April 12, 2023 10:26
…heck cannot be skipped as it also checks for sparse matrix
@@ -242,6 +242,7 @@ def _check_function_param_validation(
"sklearn.tree.export_text",
"sklearn.tree.plot_tree",
"sklearn.utils.gen_batches",
"sklearn.metrics.pairwise.chi2_kernel",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put this line in alphabetic order

{
"X": ["array-like", "sparse matrix"],
"Y": ["array-like", "sparse matrix", None],
"gamma": [Interval(Real, None, None, closed="neither")],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also surprised here that it does fail because we don't allow ndarray which would be required by Gaussian processes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, here we would trigger a regression:

In [2]: from sklearn.datasets import make_friedman2
   ...: from sklearn.gaussian_process import GaussianProcessRegressor
   ...: from sklearn.gaussian_process.kernels import DotProduct, WhiteKernel, PairwiseKernel
   ...: X, y = make_friedman2(n_samples=500, noise=0, random_state=0)
   ...: kernel = DotProduct() + WhiteKernel() + PairwiseKernel(gamma=1.0, metric="chi2")
   ...: gpr = GaussianProcessRegressor(kernel=kernel,
   ...:         random_state=0).fit(X, y)
InvalidParameterError: The 'gamma' parameter of chi2_kernel must be a float in the range (-inf, inf). Got array([1.]) instead.

So it would be useful to add in test_gpr.py this minimal example to be sure that we have a test triggering this issue.

Here, it means that we need:

Suggested change
"gamma": [Interval(Real, None, None, closed="neither")],
"gamma": [Interval(Real, None, None, closed="neither"), Hidden(np.ndarray)],

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rand0wn please also add a test with what @glemaitre points out. Also, why don't we want to document gamma as an array publicly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it is an array with unique values. We found out that the only case that this happens is internal to the GP models that provide such entry.

We therefore should still accept this type of entry to avoid regression but we should not document it since we don't support gamma being several values in an array.

@glemaitre glemaitre self-requested a review April 14, 2023 12:51
@rand0wn rand0wn closed this Apr 21, 2023
@rand0wn rand0wn reopened this Apr 21, 2023
@rand0wn
Copy link
Contributor Author
rand0wn commented Apr 21, 2023

Added changes for review again

@github-actions
Copy link

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 12ce33a. Link to the linter CI: here

Copy link
Member
@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @rand0wn

@jeremiedbb jeremiedbb enabled auto-merge (squash) June 27, 2023 17:01
@jeremiedbb jeremiedbb merged commit d2b9c80 into scikit-learn:main Jun 27, 2023
punndcoder28 pushed a commit to punndcoder28/scikit-learn that referenced this pull request Jul 29, 2023
Co-authored-by: jeremiedbb <jeremiedbb@yahoo.fr>
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
Co-authored-by: jeremiedbb <jeremiedbb@yahoo.fr>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0