Cross-validation: evaluating estimator performance
===================================================

- .. currentmodule:: sklearn.cross_validation
+ .. currentmodule:: sklearn.model_selection

Learning the parameters of a prediction function and testing it on the
same data is a methodological mistake: a model that would just repeat
@@ -24,7 +24,7 @@ can be quickly computed with the :func:`train_test_split` helper function.
Let's load the iris data set to fit a linear support vector machine on it::

>>> import numpy as np
- >>> from sklearn import cross_validation
+ >>> from sklearn.model_selection import train_test_split
>>> from sklearn import datasets
>>> from sklearn import svm

@@ -35,7 +35,7 @@ Let's load the iris data set to fit a linear support vector machine on it::
We can now quickly sample a training set while holding out 40% of the
data for testing (evaluating) our classifier::

- >>> X_train, X_test, y_train, y_test = cross_validation.train_test_split(
+ >>> X_train, X_test, y_train, y_test = train_test_split(
... iris.data, iris.target, test_size=0.4, random_state=0)

>>> X_train.shape, y_train.shape
@@ -101,10 +101,9 @@ kernel support vector machine on the iris dataset by splitting the data, fitting
a model and computing the score 5 consecutive times (with different splits each
time)::

+ >>> from sklearn.model_selection import cross_val_score
>>> clf = svm.SVC(kernel='linear', C=1)
- >>> scores = cross_validation.cross_val_score(
- ... clf, iris.data, iris.target, cv=5)
- ...
+ >>> scores = cross_val_score(clf, iris.data, iris.target, cv=5)
>>> scores # doctest: +ELLIPSIS
array([ 0.96..., 1. ..., 0.96..., 0.96..., 1. ])

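The per-fold scores returned above are commonly summarised by their mean and twice their standard deviation; a minimal sketch, reusing the ``scores`` array from the snippet above (output not asserted here)::

>>> print("Accuracy: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2))  # doctest: +SKIP
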
@@ -119,8 +118,8 @@ method of the estimator. It is possible to change this by using the
scoring parameter::

>>> from sklearn import metrics
- >>> scores = cross_validation.cross_val_score(clf, iris.data, iris.target,
- ... cv=5, scoring='f1_weighted')
+ >>> scores = cross_val_score(
+ ... clf, iris.data, iris.target, cv=5, scoring='f1_weighted')
>>> scores # doctest: +ELLIPSIS
array([ 0.96..., 1. ..., 0.96..., 0.96..., 1. ])

@@ -136,11 +135,11 @@ being used if the estimator derives from :class:`ClassifierMixin
It is also possible to use other cross validation strategies by passing a cross
validation iterator instead, for instance::

- >>> n_samples = iris.data.shape[0]
- >>> cv = cross_validation.ShuffleSplit(n_samples, n_iter=3,
- ... test_size=0.3, random_state=0)
+ >>> from sklearn.model_selection import ShuffleSplit

- >>> cross_validation.cross_val_score(clf, iris.data, iris.target, cv=cv)
+ >>> n_samples = iris.data.shape[0]
+ >>> cv = ShuffleSplit(n_iter=3, test_size=0.3, random_state=0)
+ >>> cross_val_score(clf, iris.data, iris.target, cv=cv)
... # doctest: +ELLIPSIS
array([ 0.97..., 0.97..., 1. ])

@@ -153,7 +152,7 @@ validation iterator instead, for instance::
be learnt from a training set and applied to held-out data for prediction::

>>> from sklearn import preprocessing
- >>> X_train, X_test, y_train, y_test = cross_validation.train_test_split(
+ >>> X_train, X_test, y_train, y_test = train_test_split(
... iris.data, iris.target, test_size=0.4, random_state=0)
>>> scaler = preprocessing.StandardScaler().fit(X_train)
>>> X_train_transformed = scaler.transform(X_train)
@@ -167,7 +166,7 @@ validation iterator instead, for instance::

>>> from sklearn.pipeline import make_pipeline
>>> clf = make_pipeline(preprocessing.StandardScaler(), svm.SVC(C=1))
- >>> cross_validation.cross_val_score(clf, iris.data, iris.target, cv=cv)
+ >>> cross_val_score(clf, iris.data, iris.target, cv=cv)
... # doctest: +ELLIPSIS
array([ 0.97..., 0.93..., 0.95...])

@@ -184,8 +183,8 @@ can be used (otherwise, an exception is raised).

These predictions can then be used to evaluate the classifier::

- >>> predicted = cross_validation.cross_val_predict(clf, iris.data,
- ... iris.target, cv=10)
+ >>> from sklearn.model_selection import cross_val_predict
+ >>> predicted = cross_val_predict(clf, iris.data, iris.target, cv=10)
>>> metrics.accuracy_score(iris.target, predicted) # doctest: +ELLIPSIS
0.966...

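Since ``cross_val_predict`` returns one prediction per sample, any metric that compares predictions with the ground truth can be applied; a rough sketch reusing ``predicted`` from the example above (output omitted)::

>>> metrics.confusion_matrix(iris.target, predicted)  # doctest: +SKIP
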
@@ -223,10 +222,11 @@ learned using :math:`k - 1` folds, and the fold left out is used for test.
Example of 2-fold cross-validation on a dataset with 4 samples::

>>> import numpy as np
- >>> from sklearn.cross_validation import KFold
+ >>> from sklearn.model_selection import KFold

- >>> kf = KFold(4, n_folds=2)
- >>> for train, test in kf:
+ >>> X = np.ones(4)
+ >>> kf = KFold(n_folds=2)
+ >>> for train, test in kf.split(X):
... print("%s %s" % (train, test))
[2 3] [0 1]
[0 1] [2 3]
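
Each split yields two index arrays, so the corresponding training and test data can be built with NumPy fancy indexing; a minimal sketch, assuming the ``X`` array and the last ``train``/``test`` pair from the loop above::

>>> X_train, X_test = X[train], X[test]
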
@@ -250,11 +250,12 @@ target class as the complete set.
Example of stratified 3-fold cross-validation on a dataset with 10 samples from
two slightly unbalanced classes::

- >>> from sklearn.cross_validation import StratifiedKFold
+ >>> from sklearn.model_selection import StratifiedKFold

- >>> labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
- >>> skf = StratifiedKFold(labels, 3)
- >>> for train, test in skf:
+ >>> X = np.ones(10)
+ >>> y = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
+ >>> skf = StratifiedKFold(n_folds=3)
+ >>> for train, test in skf.split(X, y):
... print("%s %s" % (train, test))
[2 3 6 7 8 9] [0 1 4 5]
[0 1 3 4 5 8 9] [2 6 7]
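
The stratification can be checked by counting how many samples of each class end up in a training split; a minimal sketch reusing ``y`` and the last ``train`` array from the loop above (output not asserted)::

>>> np.bincount(np.asarray(y)[train])  # doctest: +SKIP
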
@@ -271,10 +272,12 @@ training sets and :math:`n` different tests set. This cross-validation
procedure does not waste much data as only one sample is removed from the
training set::

- >>> from sklearn.cross_validation import LeaveOneOut
+ >>> from sklearn.model_selection import LeaveOneOut

- >>> loo = LeaveOneOut(4)
- >>> for train, test in loo:
+ >>> n_samples = 4
+ >>> X = np.ones(n_samples)
+ >>> loo = LeaveOneOut()
+ >>> for train, test in loo.split(X):
... print("%s %s" % (train, test))
[1 2 3] [0]
[0 2 3] [1]
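
As stated above, LOO produces exactly as many splits as there are samples; a minimal sketch, reusing ``loo`` and ``X`` from the example::

>>> len(list(loo.split(X)))
4
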
@@ -329,10 +332,11 @@ overlap for :math:`p > 1`.

Example of Leave-2-Out on a dataset with 4 samples::

- >>> from sklearn.cross_validation import LeavePOut
+ >>> from sklearn.model_selection import LeavePOut

- >>> lpo = LeavePOut(4, p=2)
- >>> for train, test in lpo:
+ >>> X = np.ones(4)
+ >>> lpo = LeavePOut(p=2)
+ >>> for train, test in lpo.split(X):
... print("%s %s" % (train, test))
[2 3] [0 1]
[1 3] [0 2]
@@ -357,11 +361,13 @@ For example, in the cases of multiple experiments, *LOLO* can be used to
create a cross-validation based on the different experiments: we create
a training set using the samples of all the experiments except one::

- >>> from sklearn.cross_validation import LeaveOneLabelOut
+ >>> from sklearn.model_selection import LeaveOneLabelOut

+ >>> X = [1, 5, 10, 50]
+ >>> y = [0, 1, 1, 2]
>>> labels = [1, 1, 2, 2]
- >>> lolo = LeaveOneLabelOut(labels)
- >>> for train, test in lolo:
+ >>> lolo = LeaveOneLabelOut()
+ >>> for train, test in lolo.split(X, y, labels):
... print("%s %s" % (train, test))
[2 3] [0 1]
[0 1] [2 3]
@@ -389,11 +395,13 @@ samples related to :math:`P` labels for each training/test set.

Example of Leave-2-Label Out::

- >>> from sklearn.cross_validation import LeavePLabelOut
+ >>> from sklearn.model_selection import LeavePLabelOut

+ >>> X = np.arange(6)
+ >>> y = [1, 1, 1, 2, 2, 2]
>>> labels = [1, 1, 2, 2, 3, 3]
- >>> lplo = LeavePLabelOut(labels, p=2)
- >>> for train, test in lplo:
+ >>> lplo = LeavePLabelOut(n_labels=2)
+ >>> for train, test in lplo.split(X, y, labels):
... print("%s %s" % (train, test))
[4 5] [0 1 2 3]
[2 3] [0 1 4 5]
@@ -416,9 +424,9 @@ generator.

Here is a usage example::

- >>> ss = cross_validation.ShuffleSplit(5, n_iter=3, test_size=0.25,
- ... random_state=0)
- >>> for train_index, test_index in ss:
+ >>> X = np.arange(5)
+ >>> ss = ShuffleSplit(n_iter=3, test_size=0.25, random_state=0)
+ >>> for train_index, test_index in ss.split(X):
... print("%s %s" % (train_index, test_index))
...
[1 3 4] [2 0]
@@ -480,4 +488,4 @@ Cross validation and model selection

Cross validation iterators can also be used to directly perform model
selection using Grid Search for the optimal hyperparameters of the
- model. This is the topic if the next section: :ref:`grid_search`.
+ model. This is the topic of the next section: :ref:`grid_search`.
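
As a rough sketch of that idea (assuming :class:`GridSearchCV` is also exposed from ``sklearn.model_selection`` after this refactoring, and with an illustrative parameter grid only; ``svm`` and ``iris`` come from the examples above)::

>>> from sklearn.model_selection import GridSearchCV  # doctest: +SKIP
>>> param_grid = {'C': [0.1, 1, 10]}
>>> cv = ShuffleSplit(n_iter=3, test_size=0.3, random_state=0)
>>> search = GridSearchCV(svm.SVC(kernel='linear'), param_grid, cv=cv)  # doctest: +SKIP
>>> search.fit(iris.data, iris.target)  # doctest: +SKIP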