-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
[MRG+1] explicit exception message for strict selectors #4206
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
FWIW, I can imagine cases in which |
LGTM |
In case some user code depends on the prior behaviour of But this is more reasonable given the |
@jnothman in which case does that raise an error that didn't raise an error before? |
|
But passing an empty data array as input to the next step in the pipeline is extremely likely to cause hard to diagnose errors. I think it's better to fail fast with an explicit error message at the point where the problem is fixable. I you find one legitimate case to transform data into an empty dataset I will reconsider this by ATM I don't see any. |
This is the kind of weird downstream errors that might be caused by passing empty datasets to models:
We should probably make our input validation check stricter for the case where |
The use case @jnothman mentioned is where you want to grid-search over disabling one part of a FeatureUnion basically. That is somewhat vaild, and I feel we might not want to break it. Two ways out:
(FYI @jnothman's example gives a zero division warning on master) |
Thanks for the clarification. I now get the
This is good independently of the rest of the discussion, I will work on a PR for that. |
Thanks @ogrisel :) Btw there are a couple more bug-fix PRs that could use your reviews ;) https://github.com/scikit-learn/scikit-learn/pulls?q=is%3Aopen+is%3Apr+milestone%3A0.16 |
99f0044
to
bc5b403
Compare
I updated this PR based on your feedback to issue a |
I know. I plan to dedicate this day to them (and maybe the BH t-SNE PR that deserves some love too). |
BTW @amueller feel free to join http://gitter.im/scikit-learn/scikit-learn to speak about the priority of the bug fix issues to review. |
Thanks and sorry for pestering ;) I'm travelling this week but will be more (re)active again next week. |
@amueller @jnothman would you prefer this to issue a specific new |
I'm not a huge fan of creating new warnings in most situations, -0 from me ;) |
@amueller I take your thanks as a +1 and marked this in the title to attract more reviewers, feel free to re-edit the title if you have more comments or don't think this is ready for merge. |
# rejects all the features | ||
X = rng.rand(40, 10) | ||
y = rng.randint(0, 4, size=40) | ||
fdr = SelectFdr(alpha=0.00001).fit(X, y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you should test all heuristics and also SelectKBest and SelectPercentile?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Regarding the And this LGTM. Merging. |
After saying LGTM, I've realised |
[MRG] explicit warning message for strict selectors Also fixes #4059
This is a fix for #4059 to raise a
ValueError
at transform time with an explicit error message instead of crashing when callingget_support()
with a cryptic error message.Note that for consistency, the behavior of
SelectKBest(k=0)
is also impacted by this change.