-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
FIX Improve error message when RepeatedStratifiedKFold.split is called without a y argument #29402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hi @lesteve, When possible, can you review this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a test using pytest.raises
that checks that the error is
TypeError
with missing 1 required positional argument: 'y'
in the error message?
Can you please add a test see other tests in https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/model_selection/tests/test_split.py. |
@lesteve All the changes are done, can you merge the PR |
@lesteve can you review this PR |
@Anurag-Varma maybe you can imagine that pinging me 3 times over 4 days can be perceived as a bit too pushy 😉. Generally speaking, I would say pinging after a week without answer is probably fine, 3 times over 4 days is certainly not. Have a look at this FAQ for more details. I will review this when I find the time, as you can imagine there is plenty of other stuff that I am involved in ... Also, since you seem to be in quite a hurry, note that in scikit-learn we need the approval of 2 maintainers before we can merge a PR. English is not my native language and may not be yours but even without the 3 pings in 4 days thing, I personally find this kind of language too pushy:
I would rather have something like this:
|
@lesteve Ohh sorry about that, I didn't know like the rules and faq. This is my first time contributing to scikit. I thought after my changes, I should inform a member so that they can verify them. I don't know about the 2 reviewers rule or that it takes time for the review. I was thinking like the review will be done immediatly and then it will be merged as I resolved the issues which were mentioned. From next time, I will keep this in mind. Thanks |
No worries this is part of unwritten etiquette and each project may work slightly differently. Since you see to be relatively new to open-source contributiony you maybe interested in is the Github open-source guide (the section "What happens after you submit your contribution" mentions pinging someone if you haven't received a response in over a week which sounds a reasonable amount of time in general). Also don't hesitate to have a look at the scikit-learn contributing guide. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, a few comments, mostly about improving the test.
Co-authored-by: Loïc Estève <loic.esteve@ymail.com>
Just a general question, do you have a reason to force-push as often as you do? I don't think this PR depends on any of the changes in the FYI, when there is a need for it, we generally ask people to favor merging rather than rebase + force-push. As a reviewer it makes it easier to see what has changed since I last looked at the PR. Also we use squash + merge when merging a PR so it gets turned into a single commit on the |
The sklearn main was getting updated and my branch was getting outdated. So i was rebasing the latest updates inti my branch and forch pushing it. Regarding the merging, i thought too many extra merge commits were there. Sorry that you are spending too much time on this pr reviewing it multiple times. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
As I mentioned earlier it needs a second reviewer for the PR to be merged.
In general this is not an issue, i.e. if there are new commits in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks! Just one nit.
Co-authored-by: Lucy Liu <jliu176@gmail.com>
Thanks for the review @lucyleeow, I have enabled auto-merge so this PR will be merged when CI is green. |
Reference Issues/PRs
Fixes #29369
What does this implement/fix? Explain your changes.
For sklearn.model_selection.RepeatedStratifiedKFold, it fixes the exception when y argument is not provided.
Now it returns:
TypeError: RepeatedStratifiedKFold.split() missing 1 required positional argument: 'y'
.This matches the similar format of sklearn.model_selection.StratifiedKFold when y argument is not provided.
Any other comments?
Created a new function in
class RepeatedStratifiedKFold
->def split(self, X, y, groups=None)
in which the argument 'y' doesn't have default value of None. So its mandatory to provide y argument in split function in RepeatedStratifiedKFold class or else, it gives the error.