Document the advantage of our new GBDT #15392
Comments
Just want to let you know that they actually do (except |
cc @NicolasHug Perhaps it's worthwhile to investigate the scikit-learn API of some famous libraries after the release (e.g., xgboost, lightgbm, catboost, keras) |
I don't think it's up to us to tell users why they should be using our implementation rather than some other. We're quite partial ;) For a user browsing the scikit-learn docs, the advantage is obvious: it's here. No need to install a new package. They can just use the scikit-learn implementation, and they know it will always be fully compatible with the rest of our API. By having our own version, we have control over the way we implement new features, to keep them in line with our current ecosystem of tools. LightGBM typically isn't fully compatible, see #13679 or #15127 (comment). But IMO none of these points are relevant for the user guide.
Let's leave it until after the release. I'll take some time to read the code ASAP.
I guess, for third-party packages, "fully compatible" means that they successfully pass |
Unfortunately our |
Ah, I see! Then I think it would be good to create a meta-issue with a list of all known non-covered by |
@NicolasHug I think we could probably close this issue?
I would agree but I'd leave the decision up to @qinhanmin2014 ;) |
Overall, I agree with #15392 (comment) and the previous two comments to close this issue. As noted in #15392 (comment), the advantage is that it comes packaged with scikit-learn. Any documentation with benchmarks comparing our implementation to others will become out of date, and the benchmarks will need to be re-run. This adds more maintenance on our side.
I think it will be useful to document the advantage of our new GBDT (perhaps along with some benchmarks), so that users know when to use it.
Some insights from Nicolas:
Though personally I'm still not persuaded, e.g.,