-
Notifications
You must be signed in to change notification settings - Fork 0
PWD Engine Refactor #7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of servic 8000 e and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @Micky774, very nice rework!
I focused on a similar approach on scikit-learn#25170, but I have some performances decrease and no workaround for them.
I think your approach is more likely to be successful, so I took the liberty to write some comments.
Initial tests show no significant change in performance between the current refactor approach and the performance on main. I have not run exhaustive tests, but on my 16 core machine here are the results I have currently (only on |
Hey @Micky774! We scratched our heads for a while with @jjerphan to try to make the scikit-learn#25170 work and we concluded that calling We think that you will encounter the same issues because you have specialized your What do you think? |
Yes that's where I had become blocked as well. @thomasjpfan and I explored several methods to try to solve this, mostly orienting around some form of efficient callback system. Ideally, context classes like We have an idea that may work which relies on tightly-monitored and reviewed callbacks that utilize a I'll have a version of that on this draft soon enough hopefully and will see what the performance looks like. |
A callback system has been implemented in #8 but the initial benchmark results don't look great. I'll publish more comprehensive results later tomorrow. Right now I'm struggling to figure out exactly what is causing the slowdown ( |
I fear that adding callbacks with support for arbitrary arguments will make those implementations harder to understand and maintain overtime. I would rather have duplication at the level of reduction specialization (e.g. |
Honestly I favor tight documentation and review to mediate the complexity of the callback system than the current duplication and messy hierarchy observed on It may still be a moot point if the callbacks result in an irreducible performance regression though. |
Since any of the callback implementations leads to a regression, I am okay with keeping the status quo. Currently, when we add a new reduction such as If we get to a point where we want to add another specialization, we can revisit the current design. (Adding another specialization, likely means we add it to ArgKmin, RadiusNeighbors, ArgKminLabels, and any future reductions). |
Reference Issues/PRs
N/A
What does this implement/fix? Explain your changes.
Refactors PWD backend code to disentangle core computation from problem context
Any other comments?
Draft view for refactor -- will make PR on main repo when more mature.