-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
MAINT Introduce BaseCriterion
as a base abstraction for Criterion
s
#24678
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MAINT Introduce BaseCriterion
as a base abstraction for Criterion
s
#24678
Conversation
I think this PR includes by mistake unrelated Cython files from the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to check for BaseCriterion
instead of Criterion
in:
scikit-learn/sklearn/tree/_classes.py
Line 324 in bcff21e
if not isinstance(criterion, Criterion):
to make it easier to subclass BaseDecisionTree
with a custom criterion although other parts of the fit
method assume a float64 dtyped y
array. The various steps of the fit method should probably be split in private submethod to make it easier to subclass. This can be addressed in subsequent PRs.
sklearn/metrics/_pairwise_distances_reduction/_radius_neighborhood.pxd
Outdated
Show resolved
Hide resolved
Cross-linking: #20648 |
Hi @jjerphan @ogrisel and @glemaitre just wanted to do a light ping to see if there's any feedback regarding this and if this is good to merge? This is probably the most simple change I proposed in the roadmap for improving trees. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this work and for the heads-up, @adam2392.
Is it possible to run the RandomForest ASV benchmarks to assess any potential performance regression w.r.t main
?
In the meantime, here are some comments and suggestions.
This comment was marked as outdated.
This comment was marked as outdated.
@jshinm has volunteered to try to address the comments you outlined in the code. Thanks! |
I re-ran the asv benchmark after closing all open apps on my Mac M1:
There are no performance regressions with including this refactoring. However, not sure why we saw an improvement in memory for sparse data. Might be chance? Saw similar results when I ran it yesterday tho (#24678 (comment)) cc: @jjerphan Our next steps are for @jshinm to re-run the asv benchmarks and to also address the code changes. |
Just wanted to follow up to get some idea what might be blocking this? We have several downstream PRs that fully flush out the modularity of criterion/splitter in the right directions w/o introducing performance degradation, so don't wanna continue working on those if something is drastically going to change here. Thanks! |
It has not got response nor attention mainly because of maintainers' lack of time, unfortunately. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Vincent-Maladiere and I are currently taking some time to get a better understanding of those implementations.
In the meantime, I think we better use a feature branch for such refactor (this will allow us to integrate this work more safely and confidently). In this regards, I have created refactor/cython-tree-class-hierarchy
from main
.
@adam2392: could you change the target branch of this PR to refactor/cython-tree-class-hierarchy
?
Done! |
The only CI issue is the CHANGELOG, but IIRC, that is fine. |
Here is an example of how such a refactoring is utilized: https://github.com/neurodata/scikit-tree/blob/main/sktree/tree/_unsup_criterion.pyx |
Good job! |
What happened here? Why was it closed? |
Sorry for the annoyance.< DBAF /p> @adrinjalali mentioned the I just have repushed the branch since I had a copy on my clone locally. @adam2392: I cannot reopen this PR, can you? |
Sure I can re-open this PR. May I ask what is the desire in doing so? Do we think this might move forward? I don't want to get my hopes up :p Do I target the now deleted branch, or do I target |
I can't guarantee this will move forward. We need to have people be involved on the Cython side of the project (I personally am contributing to scikit-learn on my free time solely).
|
Reference Issues/PRs
Fixes #24577
What does this implement/fix? Explain your changes.
Adds the
BaseCriterion
class as an abstract class for just the API relevant changes, while keepingCriterion
as the assumed "supervised learning" criterion.This change is backwards compatible.
Any other comments?