[WIP] Add Huber loss criterion to DecisionTreeRegressor and RandomForestRegressor #27932
constructor
constructor and _huber_loss method
squared error loss and Huber loss on a dataset with outliers
DecisionTreeRegressor api doc
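For readers unfamiliar with the loss these commits refer to, here is a minimal sketch of the standard Huber loss in NumPy. It is the textbook definition, not the `_huber_loss` method from this PR; the function name `huber_loss` and the `delta` threshold parameter are assumptions chosen for illustration.

```python
import numpy as np


def huber_loss(y_true, y_pred, delta=1.0):
    """Mean Huber loss (textbook form; `delta` is a hypothetical parameter name).

    Residuals with |r| <= delta are penalised quadratically (0.5 * r**2);
    larger residuals are penalised linearly (delta * (|r| - 0.5 * delta)),
    which limits the influence of outliers.
    """
    residual = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    abs_residual = np.abs(residual)
    quadratic = 0.5 * residual**2
    linear = delta * (abs_residual - 0.5 * delta)
    return float(np.mean(np.where(abs_residual <= delta, quadratic, linear)))


# One small residual (0.5) and one large residual (3.0):
# the large one contributes 2.5 under Huber versus 4.5 under 0.5 * r**2.
loss = huber_loss([0.0, 0.0], [0.5, 3.0], delta=1.0)
```

The two branches meet at |r| = delta, so the loss is continuous and behaves like squared error for small residuals and like absolute error for large ones.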
I am wondering if this is actually something that we need since we have the
@glemaitre Good question. This came up from a couple of my colleagues. As I understand their issue, they have data with outliers. They were wondering whether RF with Huber loss would generate a better-performing model than with "squared_error". This attempts to illustrate the scenario. Test scenario:
At least for this example, Huber gives a lower MSE. Whether this is significant or not will probably depend on the situation. This shows the difference in the test data metric. The point you bring up re: "absolute_error" is a valid one. Let me extend the example and do a three-way comparison of "squared_error", "absolute_error" and "huber" to see how model performance is affected.
@glemaitre Hopefully this will answer your question, "Do we have a gain in terms of fitting performance (I mean the time to train)?" The answer is "Yes, depending on the Let me know if this answered your question. Hopefully this time reduction will be viewed as a benefit to the project.
Test procedure
Test Results
Reference Issues/PRs
Fixes #5368 (addresses the unfulfilled request for Huber loss)
What does this implement/fix? Explain your changes.
Adds Huber loss as a valid criterion to the cited estimators. Updated the relevant unit tests for the new criterion.
Any other comments?
Here is the execution of the relevant CI tests after the modifications.