@@ -260,11 +260,16 @@ respect to the predictability of the target variable. Features used at
the top of the tree contribute to the final prediction decision of a
larger fraction of the input samples. The **expected fraction of the
samples** they contribute to can thus be used as an estimate of the
- **relative importance of the features**.
+ **relative importance of the features**. In scikit-learn, the fraction of
+ samples a feature contributes to is combined with the decrease in impurity
+ from splitting them to create a normalized estimate of the predictive power
+ of that feature.

- By **averaging** those expected activity rates over several randomized
+ By **averaging** the estimates of predictive ability over several randomized
trees one can **reduce the variance** of such an estimate and use it
- for feature selection.
+ for feature selection. This is known as the mean decrease in impurity, or MDI.
+ Refer to [L2014]_ for more information on MDI and feature importance
+ evaluation with Random Forests.
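For illustration, a minimal sketch of reading these MDI-based importances
from a fitted forest through the ``feature_importances_`` attribute (one
value per feature, normalized to sum to one)::

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    # Synthetic data in which only a few features are informative.
    X, y = make_classification(n_samples=1000, n_features=10,
                               n_informative=3, random_state=0)

    forest = RandomForestClassifier(n_estimators=100, random_state=0)
    forest.fit(X, y)

    # MDI-based importances: larger values mean that splits on the feature
    # removed more impurity, on average, across the trees of the forest.
    print(forest.feature_importances_)

Thresholding these values, for instance keeping only features whose
importance exceeds the mean, is one simple way to use them for feature
selection.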
The following example shows a color-coded representation of the relative
importances of each individual pixel for a face recognition task using
@@ -288,6 +293,12 @@ to the prediction function.
.. _random_trees_embedding:

+ .. topic:: References
+
+ .. [L2014] G. Louppe,
+    "Understanding Random Forests: From Theory to Practice",
+    PhD Thesis, U. of Liege, 2014.
+
Totally Random Trees Embedding
------------------------------