@@ -252,7 +252,11 @@ the most samples (just like for continuous features). When predicting,
categories that were not seen during fit time will be treated as missing
values.

- **Split finding with categorical features**: The canonical way of considering
+ |details-start|
+ **Split finding with categorical features**:
+ |details-split|
+
+ The canonical way of considering
categorical splits in a tree is to consider
all of the :math:`2^{K - 1} - 1` partitions, where :math:`K` is the number of
categories. This can quickly become prohibitive when :math:`K` is large.
@@ -267,6 +271,8 @@ instead of :math:`2^{K - 1} - 1`. The initial sorting is a
:math:`\mathcal{O}(K \log(K))` operation, leading to a total complexity of
:math:`\mathcal{O}(K \log(K) + K)`, instead of :math:`\mathcal{O}(2^K)`.

+ |details-end|
+
.. topic:: Examples:

* :ref:`sphx_glr_auto_examples_ensemble_plot_gradient_boosting_categorical.py`
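
As an illustration of the strategy just described, here is a minimal NumPy
sketch of the sort-then-scan idea on synthetic data (the data, the
variance-reduction gain, and all names are illustrative assumptions, not
scikit-learn's actual implementation)::

    import numpy as np

    rng = np.random.default_rng(0)
    K = 6                                      # number of categories
    cats = rng.integers(0, K, size=300)        # categorical feature encoded as 0..K-1
    y = rng.normal(loc=0.5 * cats, scale=1.0)  # target loosely tied to the category

    # 1. Sort the categories by the mean of the target within each category.
    means = np.array([y[cats == k].mean() for k in range(K)])
    order = np.argsort(means)

    # 2. Scan only the K - 1 "continuous-like" split points along that ordering,
    #    instead of enumerating all 2**(K - 1) - 1 possible partitions.
    best_gain, best_split = -np.inf, None
    for i in range(1, K):
        left = np.isin(cats, order[:i])        # categories sent to the left child
        p_left = left.mean()
        gain = y.var() - (p_left * y[left].var() + (1 - p_left) * y[~left].var())
        if gain > best_gain:
            best_gain, best_split = gain, set(order[:i].tolist())
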
@@ -444,8 +450,9 @@ The usage and the parameters of :class:`GradientBoostingClassifier` and
:class:`GradientBoostingRegressor` are described below. The 2 most important
parameters of these estimators are `n_estimators` and `learning_rate`.

- Classification
- ^^^^^^^^^^^^^^^
+ |details-start|
+ **Classification**
+ |details-split|

:class:`GradientBoostingClassifier` supports both binary and multi-class
classification.
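
For orientation, a minimal multi-class fit might look as follows (a hedged
sketch; the dataset and the parameter values are arbitrary)::

    from sklearn.datasets import load_iris
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # n_estimators sets the number of boosting stages; learning_rate scales
    # the contribution of each tree.  The values below are illustrative.
    clf = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                     random_state=0)
    clf.fit(X_train, y_train)
    print(clf.score(X_test, y_test))
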
@@ -482,8 +489,11 @@ depth via ``max_depth`` or by setting the number of leaf nodes via
:class:`HistGradientBoostingClassifier` as an alternative to
:class:`GradientBoostingClassifier`.

- Regression
- ^^^^^^^^^^^
+ |details-end|
+
+ |details-start|
+ **Regression**
+ |details-split|

:class:`GradientBoostingRegressor` supports a number of
:ref:`different loss functions <gradient_boosting_loss>`
@@ -524,6 +534,8 @@ to determine the optimal number of trees (i.e. ``n_estimators``) by early stopping.
:align: center
:scale: 75

+ |details-end|
+
.. topic:: Examples:

* :ref:`sphx_glr_auto_examples_ensemble_plot_gradient_boosting_regression.py`
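
One hedged sketch of the early stopping mentioned above, assuming the
``n_iter_no_change`` and ``validation_fraction`` parameters of
:class:`GradientBoostingRegressor` (the dataset and values are illustrative)::

    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor

    X, y = make_regression(n_samples=1000, noise=10.0, random_state=0)

    # Hold out 10% of the training data internally and stop adding trees once
    # the validation score has not improved for 10 consecutive iterations.
    est = GradientBoostingRegressor(n_estimators=1000, n_iter_no_change=10,
                                    validation_fraction=0.1, random_state=0)
    est.fit(X, y)
    print(est.n_estimators_)  # trees actually fitted, typically far fewer than 1000
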
@@ -580,8 +592,9 @@ Mathematical formulation
We first present GBRT for regression, and then detail the classification
case.

- Regression
- ...........
+ |details-start|
+ **Regression**
+ |details-split|

GBRT regressors are additive models whose prediction :math:`\hat{y}_i` for a
given input :math:`x_i` is of the following form:
@@ -663,8 +676,11 @@ space.
update is loss-dependent: for the absolute error loss, the value of
a leaf is updated to the median of the samples in that leaf.

- Classification
- ..............
+ |details-end|
+
+ |details-start|
+ **Classification**
+ |details-split|

Gradient boosting for classification is very similar to the regression case.
However, the sum of the trees :math:`F_M(x_i) = \sum_m h_m(x_i)` is not
@@ -685,6 +701,8 @@ still a regressor, not a classifier. This is because the sub-estimators are
trained to predict (negative) *gradients*, which are always continuous
quantities.

+ |details-end|
+
.. _gradient_boosting_loss:

Loss Functions
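
To make the earlier point concrete, namely that the summed trees yield a raw
score which is only then mapped to a probability, the two can be compared
directly (a minimal sketch on an arbitrary dataset; the sigmoid identity shown
holds for the binary log-loss case)::

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier

    X, y = make_classification(random_state=0)
    clf = GradientBoostingClassifier(random_state=0).fit(X, y)

    raw = clf.decision_function(X[:3])       # F_M(x_i): a continuous, unbounded score
    proba = clf.predict_proba(X[:3])[:, 1]   # the same score mapped to a probability

    # For the binary log-loss, the mapping is the sigmoid of the raw score.
    print(np.allclose(proba, 1.0 / (1.0 + np.exp(-raw))))
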
@@ -693,7 +711,9 @@ Loss Functions
The following loss functions are supported and can be specified using
the parameter ``loss``:

- * Regression
+ |details-start|
+ **Regression**
+ |details-split|

* Squared error (``'squared_error'``): The natural choice for regression
due to its superior computational properties. The initial model is
@@ -710,7 +730,12 @@ the parameter ``loss``:
can be used to create prediction intervals
(see :ref:`sphx_glr_auto_examples_ensemble_plot_gradient_boosting_quantile.py`).

- * Classification
+ |details-end|
+
+
+ |details-start|
+ **Classification**
+ |details-split|

* Binary log-loss (``'log-loss'``): The binomial
negative log-likelihood loss function for binary classification. It provides
@@ -728,6 +753,8 @@ the parameter ``loss``:
examples than ``'log-loss'``; can only be used for binary
classification.

+ |details-end|
+
.. _gradient_boosting_shrinkage:

Shrinkage via learning rate
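
Returning briefly to the quantile loss above, here is a hedged sketch of the
prediction-interval use case (the quantile levels and the data are
illustrative; the linked example gives the full treatment)::

    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor

    X, y = make_regression(n_samples=500, noise=20.0, random_state=0)

    # One model per quantile: the 5th and 95th percentiles bound a 90% interval.
    lower = GradientBoostingRegressor(loss="quantile", alpha=0.05, random_state=0)
    upper = GradientBoostingRegressor(loss="quantile", alpha=0.95, random_state=0)
    lower.fit(X, y)
    upper.fit(X, y)

    intervals = list(zip(lower.predict(X[:5]), upper.predict(X[:5])))
    print(intervals)
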
@@ -1356,8 +1383,28 @@ Vector Machine, a Decision Tree, and a K-nearest neighbor classifier::
:align: center
:scale: 75%

- Using the `VotingClassifier` with `GridSearchCV`
- ------------------------------------------------
+ Usage
+ -----
+
+ In order to predict the class labels based on the predicted
+ class-probabilities (scikit-learn estimators in the VotingClassifier
+ must support the ``predict_proba`` method)::
+
+ >>> eclf = VotingClassifier(
+ ...     estimators=[('lr', clf1), ('rf', clf2), ('gnb', clf3)],
+ ...     voting='soft'
+ ... )
+
+ Optionally, weights can be provided for the individual classifiers::
+
+ >>> eclf = VotingClassifier(
+ ...     estimators=[('lr', clf1), ('rf', clf2), ('gnb', clf3)],
+ ...     voting='soft', weights=[2,5,1]
+ ... )
+
+ |details-start|
+ **Using the `VotingClassifier` with `GridSearchCV`**
+ |details-split|

The :class:`VotingClassifier` can also be used together with
:class:`~sklearn.model_selection.GridSearchCV` in order to tune the
@@ -1377,24 +1424,7 @@ hyperparameters of the individual estimators::
>>> grid = GridSearchCV(estimator=eclf, param_grid=params, cv=5)
>>> grid = grid.fit(iris.data, iris.target)

- Usage
- -----
-
- In order to predict the class labels based on the predicted
- class-probabilities (scikit-learn estimators in the VotingClassifier
- must support ``predict_proba`` method)::
-
- >>> eclf = VotingClassifier(
- ...     estimators=[('lr', clf1), ('rf', clf2), ('gnb', clf3)],
- ...     voting='soft'
- ... )
-
- Optionally, weights can be provided for the individual classifiers::
-
- >>> eclf = VotingClassifier(
- ...     estimators=[('lr', clf1), ('rf', clf2), ('gnb', clf3)],
- ...     voting='soft', weights=[2,5,1]
- ... )
+ |details-end|

.. _voting_regressor:
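
For completeness, here is a self-contained sketch of the parameter grid that
the ``GridSearchCV`` snippet above abbreviates, reusing the ``'lr'``, ``'rf'``
and ``'gnb'`` labels from the earlier examples; the double-underscore naming
routes each setting to the named sub-estimator, and the concrete values are
illustrative::

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV
    from sklearn.naive_bayes import GaussianNB

    iris = load_iris()
    clf1 = LogisticRegression(random_state=1)
    clf2 = RandomForestClassifier(random_state=1)
    clf3 = GaussianNB()
    eclf = VotingClassifier(estimators=[('lr', clf1), ('rf', clf2), ('gnb', clf3)],
                            voting='soft')

    # '<estimator label>__<parameter>' routes each entry to that sub-estimator.
    params = {'lr__C': [1.0, 100.0], 'rf__n_estimators': [20, 200]}
    grid = GridSearchCV(estimator=eclf, param_grid=params, cv=5)
    grid = grid.fit(iris.data, iris.target)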