DOC Updated documentation for cv parameter (issue #4533) by christophebourguignat · Pull Request #5184 · scikit-learn/scikit-learn · GitHub

DOC Updated documentation for cv parameter (issue #4533) #5184


Merged
merged 1 commit into from
Aug 30, 2015


christophebourguignat

No description provided.

@jnothman
Member

Thanks. This is applicable everywhere there is a cv parameter and a classifier can be passed. I suspect @rvraghav93 is right that we should keep it brief and point to centralised documentation, perhaps just the documentation of check_cv (once clarified, if necessary).

@FedericoV
Contributor

This is a good improvement over what was there before. The other places where it shows up are:

def cross_val_predict(estimator, X, y=None, cv=None, n_jobs=1,
def cross_val_score(estimator, X, y=None, scoring=None, cv=None, n_jobs=1,
def permutation_test_score(estimator, X, y, cv=None,
in cross_validation.py

and

def validation_curve(estimator, X, y, param_name, param_range, cv=None,
in validation_curve.py

@GaelVaroquaux
Member

This is useful. Thanks! Merging.

@GaelVaroquaux added a commit that referenced this pull request on Aug 30, 2015:
DOC Updated documentation for cv parameter (issue #4533)
@GaelVaroquaux merged commit 869971f into scikit-learn:master on Aug 30, 2015
@jnothman
Member

Someone want to copy to those other destinations?

@christophebourguignat
Author

@jnothman I can do it

@christophebourguignat
Author

Just proposed a new PR.

A few questions:

  • In the function signatures the default for cv is None, but the documentation says the default is 3. The check_cv() code makes these equivalent (if cv is None: cv = 3), but shall we unify them in the doc (or the code)?
  • Shall we also update the cv doc in sklearn.calibration.CalibratedClassifierCV?

@amueller
Member
amueller commented Sep 8, 2015

@christophebourguignat which one was your PR?
The main reason for not doing cv = 3 in all the constructors is that we don't want to hard-code that in so many places iirc. Currently the number 3 is only written down in check_cv.
But we don't want our users to have to dig until they find check_cv to find out what that means.
We could change the default in the docs to None if the rest of the docstring is informative enough.
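
To illustrate the trade-off described above, here is a minimal sketch of the pattern, not the actual scikit-learn code. `DEFAULT_N_FOLDS`, `resolve_cv`, `cross_val_score_like` and `validation_curve_like` are hypothetical names made up for this example; only the `if cv is None: cv = 3` step is taken from the discussion.

```python
# Hypothetical constant: the single place where the number 3 is written down.
DEFAULT_N_FOLDS = 3


def resolve_cv(cv=None):
    """Hypothetical stand-in for the None-handling step quoted from check_cv."""
    if cv is None:
        cv = DEFAULT_N_FOLDS
    return cv


def cross_val_score_like(estimator, X, y=None, cv=None):
    """Hypothetical public function: the signature says cv=None, not cv=3."""
    # A real function would go on to split the data and score the estimator;
    # here we only return the resolved number of folds.
    return resolve_cv(cv)


def validation_curve_like(estimator, X, y, cv=None):
    """A second hypothetical caller sharing the same default via resolve_cv."""
    return resolve_cv(cv)


print(cross_val_score_like(None, [[0], [1]]))     # default resolved by the helper
print(validation_curve_like(None, [], [], cv=10)) # explicit value passed through
```

Changing the default then means editing one line, at the cost that a reader of any single signature sees only `cv=None` and needs the docstring to say what that means.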

@raghavrv
Member

Thanks for the PR; this will add more consistency to the cv param documentation.

But @jnothman @GaelVaroquaux @amueller didn't we agree on the cv doc as follows? -

    cv : int, cross-validation generator or an iterable, optional
        Determines the cross-validation splitting strategy.
        Possible inputs for cv are:
          - None, to use the default 3-fold cross-validation,
          - integer, to specify the number of folds.
          - An object to be used as a cross-validation generator.
          - An iterable yielding train/test splits.

        For integer/None inputs, if ``y`` is binary or multiclass,
        :class:`StratifiedKFold` is used. If classifier is False or if ``y`` is
        neither binary nor multiclass, :class:`KFold` is used.

        Refer to the :ref:`User Guide <cross_validation>` for the various
        cross-validation strategies that can be used here.

The current version is -

     cv : integer or cross-validation generator, default=3
        A cross-validation generator to use. If int, determines
        the number of folds in StratifiedKFold if estimator is a classifier
        and the target y is binary or multiclass, or the number
        of folds in KFold otherwise.
        Specific cross-validation objects can be passed, see
        sklearn.cross_validation module for the list of possible objects.
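
The selection rule that both docstring versions describe can be sketched roughly as follows. This is a simplified pure-Python illustration, not scikit-learn's check_cv: `Splitter`, `is_binary_or_multiclass` and `choose_cv` are hypothetical names, and the label check is a crude assumption standing in for the library's real target-type detection.

```python
from collections import namedtuple

# Hypothetical stand-in for the real splitter classes: it only records which
# strategy was chosen and with how many folds.
Splitter = namedtuple("Splitter", ["strategy", "n_folds"])


def is_binary_or_multiclass(y):
    """Crude assumption: y is a sequence of int/str labels with >= 2 classes."""
    if y is None:
        return False
    return all(isinstance(label, (int, str)) for label in y) and len(set(y)) >= 2


def choose_cv(cv=None, y=None, classifier=False):
    """Sketch of the rule in the docstring; not the library implementation."""
    if cv is None:
        cv = 3  # default number of folds, as stated in the thread
    if isinstance(cv, int):
        if classifier and is_binary_or_multiclass(y):
            return Splitter("StratifiedKFold", cv)
        return Splitter("KFold", cv)
    # A cross-validation generator or an iterable of train/test splits
    # is used as given.
    return cv


print(choose_cv(None, y=[0, 1, 0, 1, 0, 1], classifier=True))
print(choose_cv(5, y=[0.1, 2.3, 1.7], classifier=True))
print(choose_cv(4, y=[0, 1, 1, 0], classifier=False))
```

Anything that is neither None nor an integer, such as a precomputed list of (train, test) index pairs, falls through unchanged, which is why the proposed docstring lists those inputs separately.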

@raghavrv
Member

If you are okay with the first doc, let me know and I'll add a quick PR and merge the change to #4294 too!

@amueller
Member

It was a net improvement. The first one looks nicer.

@raghavrv
Member

#5238
