Don't allow criterion='mae' for gradient boosting estimators · Issue #18263 · scikit-learn/scikit-learn · GitHub
Don't allow criterion='mae' for gradient boosting estimators #18263
Closed
@NicolasHug

Description


The MAE criterion for trees was introduced in #6667. This PR also started exposing the criterion parameter to GradientBoostingClassifier and GradientBoostingRegressor, thus allowing 'mae', 'mse', and 'friedman_mse'. Before that, the GBDTs were hardcoded to use 'friedman_mse'.

I think we should stop allowing criterion='mae' for GBDTs.

My understanding of Gradient Boosting is that the trees should be predicting gradients using a least squares criterion. If we want to minimize the absolute error, we should be using loss='lad', but the criterion used for splitting the tree nodes should still be a least-squares ('mse' or 'friedman_mse'). I think that splitting the gradients using mae isn't methodologically correct.
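To illustrate the configuration being argued for, here is a minimal sketch: the absolute error is minimized through the boosting loss, while the trees that fit the gradients keep a least-squares split criterion. Note that loss='lad' was later renamed to 'absolute_error' (scikit-learn 1.0), so the sketch tries both names; the dataset is synthetic and purely illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Minimize absolute error via the boosting loss; the trees fitting the
# gradients keep a least-squares split criterion ('friedman_mse').
try:
    gbdt = GradientBoostingRegressor(
        loss="absolute_error", criterion="friedman_mse", random_state=0
    )
    gbdt.fit(X, y)
except ValueError:  # older scikit-learn releases name this loss 'lad'
    gbdt = GradientBoostingRegressor(
        loss="lad", criterion="friedman_mse", random_state=0
    )
    gbdt.fit(X, y)

pred = gbdt.predict(X)
```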

In his original paper, Friedman does mention the possibility of fitting a tree to the residuals using an lad criterion. But he never suggests fitting the trees to the gradients using lad, which is what we currently allow.

I ran some benchmarks on the PMLB datasets (most of them are balanced, hence accuracy is a decent measure).

[Benchmark figure: accuracy of GBDTs across PMLB datasets for criterion='mae', 'mse', and 'friedman_mse']

We can see that using criterion=mae usually performs worse than using mse or friedman_mse, even when loss=lad. Also, criterion=mae is 60 times slower than the other criteria (see notebook for details).

Note: From the benchmarks, friedman_mse does seem to (marginally) outperform mse, so I guess keeping it as the default makes sense. CC @thomasjpfan @lorentzenchr
