-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
ENH reuse parent histogram in HGBT #26189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
ENH reuse parent histogram in HGBT #26189
Conversation
Simple benchmarkOn the higgs dataset, this PR gives ~4% speed-up, which is about the expected amount (1 / 28 features = 3.5%). Running
on main gives
On this PR:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR! I left a first pass review.
9fb9006
to
b1efb34
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I do not think this change is too complex, it still adds more complexity to the codebase. Overall, do you thi
8000
nk the complexity is worth the 1-1/n_features
performance bump?
@lorentzenchr: should we continue and merge this PR, or not? |
Only if I get the commitment of 2 core devs willing to review and finally approve. |
I would be glad reviewing this PR, but I would rather finish what I started first. |
Reference Issues/PRs
None
What does this implement/fix? Explain your changes.
This PR reuses the parent's histogram for the feature that was split on. This saves a little time.
Any other comments?
The implementation is no beauty. If we want to include this improvement, suggestions or alternative PRs are welcome.