@@ -1188,6 +1188,13 @@ class RandomForestClassifier(ForestClassifier):
For a comparison between tree-based ensemble models see the example
:ref:`sphx_glr_auto_examples_ensemble_plot_forest_hist_grad_boosting_comparison.py`.
+ This estimator has native support for missing values (NaNs). During training,
+ the tree grower learns at each split point whether samples with missing values
+ should go to the left or right child, based on the potential gain. When predicting,
+ samples with missing values are assigned to the left or right child accordingly.
+ If no missing values were encountered for a given feature during training, then
+ samples with missing values are mapped to whichever child has the most samples.
+
Read more in the :ref:`User Guide <forest>`.
Parameters
@@ -1572,6 +1579,13 @@ class RandomForestRegressor(ForestRegressor):
`bootstrap=True` (default), otherwise the whole dataset is used to build
each tree.
+ This estimator has native support for missing values (NaNs). During training,
+ the tree grower learns at each split point whether samples with missing values
+ should go to the left or right child, based on the potential gain. When predicting,
+ samples with missing values are assigned to the left or right child accordingly.
+ If no missing values were encountered for a given feature during training, then
+ samples with missing values are mapped to whichever child has the most samples.
+
For a comparison between tree-based ensemble models see the example
:ref:`sphx_glr_auto_examples_ensemble_plot_forest_hist_grad_boosting_comparison.py`.
@@ -1929,6 +1943,14 @@ class ExtraTreesClassifier(ForestClassifier):
of the dataset and uses averaging to improve the predictive accuracy
and control over-fitting.
+ This estimator has native support for missing values (NaNs) for
+ random splits. During training, a random threshold is chosen to split
+ the non-missing values: non-missing samples are sent to the left or
+ right child based on that threshold, while samples with missing values
+ are randomly sent to the left or right child. This is repeated for
+ every feature considered at each split, and the best split among
+ these is chosen.
+
Read more in the :ref:`User Guide <forest>`.
Parameters
@@ -2302,6 +2324,14 @@ class ExtraTreesRegressor(ForestRegressor):
of the dataset and uses averaging to improve the predictive accuracy
and control over-fitting.
+ This estimator has native support for missing values (NaNs) for
+ random splits. During training, a random threshold is chosen to split
+ the non-missing values: non-missing samples are sent to the left or
+ right child based on that threshold, while samples with missing values
+ are randomly sent to the left or right child. This is repeated for
+ every feature considered at each split, and the best split among
+ these is chosen.
+
Read more in the :ref:`User Guide <forest>`.
Parameters