Document the advantages of our new GBDT

I think it would be useful to document the advantages of our new GBDT (perhaps along with some benchmarks), so that users know when to use it.

Some insights from Nicolas:
- As Guillaume said, we have full control over how they interact with the scikit-learn ecosystem (cross-validation, grid search, etc.). This isn't the case for the other libraries, which may only support part of it. For instance, I'm not sure the LightGBM estimators pass our checks.
- scikit-learn is arguably more popular than LightGBM or XGBoost alone, so the estimators get more exposure by being included here.
- The APIs are significantly different. For example, I really doubt our API for categorical variables will be similar to LightGBM's.
- Not everybody has a GPU, and the CPU implementation is still an order of magnitude faster than the other GBDT estimators that we have.

Though personally I'm still not persuaded, e.g.:
- For the first reason, if we really care about interaction with scikit-learn, perhaps a better way is to collaborate with the existing GBDT libraries instead of writing a new one. Another possible way is to be more tolerant, e.g., flatten the predictions in voting.
- For the second reason, if we only consider GBDT, then I think XGBoost, LightGBM, and CatBoost are more famous than scikit-learn.
