@@ -12,6 +12,57 @@ Version 0.18
12
12
Changelog
13
13
---------
14
14
15
+ .. _model_selection_changes:
16
+
17
+ Model Selection Enhancements and API Changes
18
+ --------------------------------------------
19
+
20
+ - **The ``model_selection`` module**
21
+
22
+ The new module :mod:`sklearn.model_selection`, which groups together the
23
+ functionality of the former :mod:`cross_validation`, :mod:`grid_search` and
24
+ :mod:`learning_curve` modules, introduces new possibilities such as nested
25
+ cross-validation and easier exploration of parameter search results with pandas.
26
+
27
+ Many things stay the same, but there are some key differences, which are
28
+ described below.
29
+
30
+ - **Data-independent CV splitters enabling nested cross-validation**
31
+
32
+ The new cross-validation splitters, defined in
33
+ :mod:`sklearn.model_selection`, are no longer initialized with any
34
+ data-dependent parameters such as ``y``. Instead they expose a
35
+ ``split`` method that takes in the data and returns a generator over the
36
+ different splits.
37
+
38
+ This change makes it possible to use the cross-validation splitters to
39
+ perform nested cross-validation, facilitated by
40
+ :class:`model_selection.GridSearchCV` and
41
+ :class:`model_selection.RandomizedSearchCV` utilities.
42
+
43
+ - **The enhanced ``results_`` attribute**
44
+
45
+ The new ``results_`` attribute (of :class:`model_selection.GridSearchCV`
46
+ and :class:`model_selection.RandomizedSearchCV`), introduced in lieu of the
47
+ ``grid_scores_`` attribute, is a dict of 1D arrays in which the elements of
48
+ each array correspond to the parameter settings (i.e. search candidates).
49
+
50
+ The ``results_`` dict can be easily imported into ``pandas`` as a
51
+ ``DataFrame`` for exploring the search results.
52
+
53
+ The ``results_`` arrays include scores for each cross-validation split
54
+ (with keys such as ``test_split0_score``), as well as their mean
55
+ (``test_mean_score``) and standard deviation (``test_std_score``).
56
+
57
+ The ranks of the search candidates (based on their mean
58
+ cross-validation score) are available at ``results_['test_rank_score']``.
59
+
60
+ The values of each parameter are stored separately as numpy
61
+ masked object arrays. The value for a given search candidate is masked if
62
+ the corresponding parameter is not applicable. Additionally, a list of all
63
+ the parameter dicts is stored at ``results_['params']``.
64
+
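To illustrate why data-independent splitters make nesting possible, here is a minimal pure-Python sketch. ``ToyKFold`` is a hypothetical stand-in written for this example, not scikit-learn's implementation; it only mirrors the pattern described above, where the constructor takes no data and ``split`` receives the data.

```python
class ToyKFold:
    """Hypothetical stand-in for a splitter; not scikit-learn's code."""

    def __init__(self, n_splits=3):
        # Only data-independent configuration is stored here.
        self.n_splits = n_splits

    def split(self, X):
        """Yield (train_indices, test_indices) pairs for the given data."""
        n_samples = len(X)
        indices = list(range(n_samples))
        # Spread the remainder over the first folds so sizes differ by at most 1.
        fold_sizes = [n_samples // self.n_splits] * self.n_splits
        for i in range(n_samples % self.n_splits):
            fold_sizes[i] += 1
        start = 0
        for size in fold_sizes:
            test = indices[start:start + size]
            train = indices[:start] + indices[start + size:]
            yield train, test
            start += size


X = list("abcdefghij")  # 10 toy samples
outer_cv = ToyKFold(n_splits=5)
inner_cv = ToyKFold(n_splits=2)

nested = []
for outer_train, outer_test in outer_cv.split(X):
    X_outer_train = [X[i] for i in outer_train]
    # The same inner splitter object is reused on every outer training subset,
    # which works because it was never bound to a particular dataset.
    inner_splits = list(inner_cv.split(X_outer_train))
    nested.append((outer_test, inner_splits))
```

A splitter constructed with the data (for example with ``y``) could not be reused this way, since each outer training subset has a different size and content.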
65
+
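The layout of the ``results_`` dict described above can be sketched with plain Python lists. Everything in this block is an illustrative assumption: the candidates and per-split scores are toy values, the ``param_<name>`` key pattern is assumed rather than taken from the text, ``None`` stands in for a masked entry, and the population standard deviation is used for the spread.

```python
from statistics import mean, pstdev

# Hypothetical search candidates and toy per-split scores (illustration only).
params = [
    {"kernel": "linear", "C": 1},
    {"kernel": "rbf", "C": 1, "gamma": 0.1},
    {"kernel": "rbf", "C": 10, "gamma": 0.1},
]
split_scores = [
    [0.80, 0.90],  # candidate 0: score on split 0, score on split 1
    [0.70, 0.90],
    [0.85, 0.95],
]

results = {"params": params}
# One 1D sequence per split; element i belongs to candidate i.
for s in range(2):
    results["test_split%d_score" % s] = [scores[s] for scores in split_scores]
results["test_mean_score"] = [mean(scores) for scores in split_scores]
results["test_std_score"] = [pstdev(scores) for scores in split_scores]

# Rank 1 goes to the candidate with the highest mean score.
order = sorted(range(len(params)), key=lambda i: -results["test_mean_score"][i])
ranks = [0] * len(params)
for rank, i in enumerate(order, start=1):
    ranks[i] = rank
results["test_rank_score"] = ranks

# Per-parameter columns; None marks a parameter that does not apply.
for name in ("kernel", "C", "gamma"):
    results["param_" + name] = [p.get(name) for p in params]

best = results["test_rank_score"].index(1)
```

Because every value is a sequence with one element per candidate, a dict of this shape can be passed to ``pandas.DataFrame`` to obtain one row per candidate. The key names follow the text above and may differ in a released version.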
15
66
New features
16
67
............
17
68
@@ -54,7 +105,7 @@ New features
54
105
- Added ``algorithm="elkan"`` to :class:`cluster.KMeans` implementing
55
106
Elkan's fast K-Means algorithm. By `Andreas Müller`_.
56
107
57
- - Generalization of :func: `model_selection._validation. cross_val_predict `.
108
+ - Generalization of :func:`model_selection.cross_val_predict`.
58
109
One can pass method names such as `predict_proba` to be used in the cross
59
110
validation framework instead of the default `predict`. By `Ori Ziv`_ and `Sears Merritt`_.
60
111
@@ -66,11 +117,10 @@ Enhancements
66
117
and `Devashish Deshpande`_.
67
118
68
119
- The cross-validation iterators are replaced by cross-validation splitters
69
- available from :mod: `model_selection `. These expose a ``split `` method
70
- that takes in the data and yields a generator for the different splits.
71
- This change makes it possible to do nested cross-validation with ease,
72
- facilitated by :class: `model_selection.GridSearchCV ` and similar
73
- utilities. (`#4294 <https://github.com/scikit-learn/scikit-learn/pull/4294 >`_) by `Raghav R V `_.
120
+ available from :mod:`sklearn.model_selection`.
121
+ See :ref:`model_selection_changes` for more information.
122
+ (`#4294 <https://github.com/scikit-learn/scikit-learn/pull/4294>`_) by
123
+ `Raghav R V`_.
74
124
75
125
- The random forest, extra trees and decision tree estimators now have a
76
126
method ``decision_path`` which returns the decision path of samples in
@@ -144,6 +194,14 @@ Enhancements
144
194
- The :func:`ignore_warnings` now accepts a category argument to ignore only
145
195
the warnings of a specified type. By `Thierry Guillemot`_.
146
196
197
+ - The new ``results_`` attribute of :class:`model_selection.GridSearchCV`
198
+ (and :class:`model_selection.RandomizedSearchCV`) can be easily imported
199
+ into pandas as a ``DataFrame``. See :ref:`model_selection_changes` for
200
+ more information.
201
+ (`#6697 <https://github.com/scikit-learn/scikit-learn/pull/6697>`_) by
202
+ `Raghav R V`_.
203
+
204
+
147
205
Bug fixes
148
206
.........
149
207
@@ -212,10 +270,12 @@ Bug fixes
212
270
API changes summary
213
271
-------------------
214
272
215
- - The :mod: `cross_validation `, :mod: `grid_search ` and :mod: `learning_curve `
216
- have been deprecated and the classes and functions have been reorganized into
217
- the :mod: `model_selection ` module.
218
- (`#4294 <https://github.com/scikit-learn/scikit-learn/pull/4294 >`_) by `Raghav R V `_.
273
+ - The :mod:`sklearn.cross_validation`, :mod:`sklearn.grid_search` and
274
+ :mod:`sklearn.learning_curve` modules have been deprecated and their classes
275
+ and functions have been reorganized into the :mod:`model_selection` module.
276
+ See :ref:`model_selection_changes` for more information.
277
+ (`#4294 <https://github.com/scikit-learn/scikit-learn/pull/4294>`_) by
278
+ `Raghav R V`_.
219
279
220
280
- ``residual_metric`` has been deprecated in :class:`linear_model.RANSACRegressor`.
221
281
Use ``loss`` instead. By `Manoj Kumar`_.
@@ -224,12 +284,20 @@ API changes summary
224
284
:class:`isotonic.IsotonicRegression`. By `Jonathan Arfa`_.
225
285
226
286
- The old :class:`GMM` is deprecated in favor of the new
227
- :class: `GaussianMixture `. The new class compute the Gaussian mixture
228
- faster than before and some of computationnal problems have been solved.
287
+ :class:`GaussianMixture`. The new class computes the Gaussian mixture
288
+ faster than before and some computational problems have been solved.
229
289
By `Wei Xue`_ and `Thierry Guillemot`_.
230
290
291
+ - The ``grid_scores_`` attribute of :class:`model_selection.GridSearchCV`
292
+ and :class:`model_selection.RandomizedSearchCV` is deprecated in favor of
293
+ the attribute ``results_``.
294
+ See :ref:`model_selection_changes` for more information.
295
+ (`#6697 <https://github.com/scikit-learn/scikit-learn/pull/6697>`_) by
296
+ `Raghav R V`_.
231
297
232
298
299
+ .. currentmodule:: sklearn
300
+
233
301
.. _changes_0_17_1:
234
302
235
303
Version 0.17.1
@@ -4088,7 +4156,7 @@ David Huard, Dave Morrill, Ed Schofield, Travis Oliphant, Pearu Peterson.
4088
4156
4089
4157
.. _Matteo Visconti di Oleggio Castello: http://www.mvdoc.me
4090
4158
4091
- .. _Raghav R V : https://github.com/rvraghav93
4159
+ .. _Raghav R V: https://github.com/raghavrv
4092
4160
4093
4161
.. _Trevor Stephens: http://trevorstephens.com/
4094
4162