-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[MRG+1] Rename CV params n_{folds,iter} to n_splits #7187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG+1] Rename CV params n_{folds,iter} to n_splits #7187
Conversation
@@ -134,7 +134,7 @@ def test_cross_validator_with_default_params(): | |||
n_unique_labels = 4 | |||
n_folds = 2 | |||
p = 2 | |||
n_iter = 10 # (the default value) | |||
n_splits= 10 # (the default value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space
One question is whether we should be doing any deprecation... I think not, in this case. |
Test failures. |
Update: Never mind, I was trapped 👋 . |
More test failures, though! |
Remaining failures are uses of |
@jnothman @MechCoder CI is smiling 😄 , I think it's ready to be reviewed! |
@@ -128,7 +128,7 @@ | |||
# To get nice curve, we need a large number of iterations to | |||
# reduce the variance | |||
grid = GridSearchCV(clf, refit=False, param_grid=param_grid, | |||
cv=ShuffleSplit(train_size=train_size, n_iter=250, | |||
cv=ShuffleSplit(train_size=train_size, n_splits=250, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP8
Multiple PEP8 line length violations... |
"train_size=None)") | ||
ps_repr = "PredefinedSplit(test_fold=array([1, 1, 2, 2]))" | ||
|
||
n_splits = [n_samples, comb(n_samples, p), n_folds, n_folds, | ||
n_unique_labels, comb(n_unique_labels, p), n_iter, 2] | ||
splits_cnts = [n_samples, comb(n_samples, p), n_splits, n_splits, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expected_n_splits
or n_splits_expected
might be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that cnt
, apart from being an abbreviation we should try to avoid for clarity, means the same as n
. So splits_cnt
means the same as n_splits
and so the naming tells us little of what it contains.
Another option is renaming the new n_splits
to n_splits_param
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the elaboration, it's useful for me in the future 😄
Otherwise LGTM |
Please draft a change to what's new, both in the |
Yes this should go into model selection enhancements section. Otherwise with a whatsnew and reviews addressed this LGTM too.... Let's merge this! Thanks for the PR @yenchenlin |
dd2ff8e
to
e216af5
Compare
@@ -62,6 +62,16 @@ Model Selection Enhancements and API Changes | |||
the corresponding parameter is not applicable. Additionally a list of all | |||
the parameter dicts are stored at ``results_['params']``. | |||
|
|||
- **Renaming of `n_folds` and `n_iter` to `n_splits`** | |||
|
|||
The number of folds (i.e. the number of partitions of a dataset) is not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this doesn't need so much motivation here, you just need to say something like "In the new model_selection
module, some parameter names have changed: the n_folds
parameter in x, y, z is now n_splits
; the n_iter
parameter in a, b, c is now n_splits
". Please use :class:
as appropriate.
Comments addressed, please have a look! |
LGTM |
@@ -62,6 +62,16 @@ Model Selection Enhancements and API Changes | |||
the corresponding parameter is not applicable. Additionally a list of all | |||
the parameter dicts are stored at ``results_['params']``. | |||
|
|||
- **Renaming of ``n_folds`` and ``n_iter`` to ``n_splits``** | |||
|
|||
Some parameter names have changed: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could remove this line I think. You already have a heading...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line is good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought after the below suggested change, this is not needed...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps...? I'm unconvinced that this series of comments markedly contributes to the quality of the text in terms of convention, clarity or information structure.
Feel free to add a +1 once the comments are addressed. Ping @jnothman for merge once done... |
@jnothman Would you like to merge it as such? LGTM... |
Lgtm @jnothman merge if you are happy |
(Y) |
aaaand my book broke ^^ glad it's not in print yet lol. I should do that release some time soom. |
hahahahaha. Glad you wrote it for the version where everything changes. |
Well that's why I knew I couldn't do it for 0.17. But It would be great to release 0.18 within the next month (which I think is doable). |
Shouldn't the default implementation of |
I think it is better explicitly implemented...? |
* Rename n_iter to n_splits * Fix bug * Fix examples * Add spaces * Rename n_folds to n_splits * Fix error * Fix doc * Fix doc * Fix example * Rename variables name * PEP8 * Fix error message * Add whats_new * Fix test * Fix doc * Fix doc * Make test clear
Since now all CV splitters have
get_n_splits()
, we can rename n_{folds,iter} to n_splits in order to avoid confusion.More detailed discussion can be found in #7169 .
n_iter
ton_splits
n_ folds
ton_splits