PWD Engine Refactor by Micky774 · Pull Request #7 · Micky774/scikit-learn · GitHub
PWD Engine Refactor #7


Closed · wants to merge 15 commits

Conversation

@Micky774 (Owner)

Reference Issues/PRs
N/A

What does this implement/fix? Explain your changes.
Refactors PWD backend code to disentangle core computation from problem context

Any other comments?
Draft view for refactor -- will make PR on main repo when more mature.

@Vincent-Maladiere left a comment

Hey @Micky774, very nice rework!

I worked on a similar approach in scikit-learn#25170, but I ran into some performance decreases and found no workaround for them.

I think your approach is more likely to succeed, so I took the liberty of writing some comments.

@Micky774 (Owner, Author) commented Jan 4, 2023

Initial tests show no significant difference in performance between the current refactor approach and main. I have not run exhaustive tests, but here are the results I currently have on my 16-core machine (only on ArgKmin with the EuclideanEngines):

Small `n_samples`

[benchmark plot]

Larger `n_samples`

[benchmark plot]

@Vincent-Maladiere commented Jan 11, 2023

Hey @Micky774! We scratched our heads for a while with @jjerphan trying to make scikit-learn#25170 work, and we concluded that calling surrogate_dist in a for-loop leads to the observed ~30% performance decrease. As a workaround, we tried to inline it so that surrogate_dist wouldn't introduce extra cost, but we failed because inlining is not feasible with inheritance: Cython doesn't know how to dispatch inline methods when using inheritance (more details here).
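
For illustration, a minimal Cython sketch of the pattern being described (names invented; this is not the actual scikit-learn code): an overridable `cdef` method cannot be declared `inline`, so every call in the hot loop goes through virtual dispatch.

```cython
cdef class BaseEngine:
    # Marking this `cdef inline` is a compile error: subclasses may
    # override it, so the concrete body is only known at runtime and
    # each call must be dispatched through the class's vtable.
    cdef double surrogate_dist(self, double d) noexcept nogil:
        return d

cdef class EuclideanEngine(BaseEngine):
    cdef double surrogate_dist(self, double d) noexcept nogil:
        return d * d

cdef void reduce_chunk(BaseEngine engine, double[::1] dists) noexcept nogil:
    cdef Py_ssize_t j
    for j in range(dists.shape[0]):
        # Every iteration pays for a virtual-method dispatch here; this
        # per-call overhead is what the ~30% slowdown was attributed to.
        dists[j] = engine.surrogate_dist(dists[j])
```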

We think that you will encounter the same issues because you have specialized your EuclideanEngine.surrogate_dist method for ArgKmin, with the call to heap_push, but you won't be able to implement RadiusNeighbors in this fashion. You'll also need to migrate the data structures for both ArgKmin and RadiusNeighbors to BaseEngine, creating massive coupling between these classes.

What do you think?

@Micky774 (Owner, Author)

> Hey @Micky774! We scratched our heads for a while with @jjerphan trying to make scikit-learn#25170 work, and we concluded that calling surrogate_dist in a for-loop leads to the observed ~30% performance decrease. As a workaround, we tried to inline it so that surrogate_dist wouldn't introduce extra cost, but we failed because inlining is not feasible with inheritance: Cython doesn't know how to dispatch inline methods when using inheritance.
>
> We think that you will encounter the same issues because you have specialized your EuclideanEngine.surrogate_dist method for ArgKmin, with the call to heap_push, but you won't be able to implement RadiusNeighbors in this fashion. You'll also need to migrate the data structures for both ArgKmin and RadiusNeighbors to BaseEngine, creating massive coupling between these classes.
>
> What do you think?

Yes, that's where I had become blocked as well. @thomasjpfan and I explored several approaches to solve this, mostly centered on some form of efficient callback system. Ideally, context classes like ArgKmin and RadiusNeighbors would provide a callback to the engine object such that it could be called efficiently within the loop; the difficulty is how to make that generic.

We have an idea that may work, relying on tightly monitored and reviewed callbacks that take a void * extra_params struct whose contract is mediated through documentation -- each context class would create its own callback in some _callbacks.pyx and define whatever extra parameters it needs in a custom-typed struct passed through the void * argument.
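
As a rough sketch of that idea (all names hypothetical; this illustrates the void * pattern, not code from the draft):

```cython
# Function-pointer type the engine would invoke in its inner loop; the
# context-specific state travels through the opaque `void *` argument.
ctypedef void (*engine_callback_t)(
    void* extra_params,
    double surrogate_dist,
    Py_ssize_t sample_index,
) noexcept nogil

# Payload an ArgKmin-like context might define in _callbacks.pyx; its
# layout is the documented contract between context and callback.
cdef struct ArgKminParams:
    double* heap_distances
    Py_ssize_t* heap_indices
    Py_ssize_t k

cdef void argkmin_callback(
    void* extra_params,
    double surrogate_dist,
    Py_ssize_t sample_index,
) noexcept nogil:
    # Cast the opaque pointer back to the documented struct type...
    cdef ArgKminParams* params = <ArgKminParams*> extra_params
    # ...then, e.g., push onto the k-nearest heap:
    # heap_push(params.heap_distances, params.heap_indices, params.k,
    #           surrogate_dist, sample_index)
```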

Hopefully I'll have a version of that on this draft soon, and I'll see what the performance looks like.

@Micky774 (Owner, Author)

A callback system has been implemented in #8, but the initial benchmark results don't look great. I'll publish more comprehensive results tomorrow. Right now I'm struggling to figure out exactly what is causing the slowdown (py-spy, even with native tracking, seems insufficient).

@jjerphan commented Jan 12, 2023

I fear that adding callbacks with support for arbitrary arguments will make those implementations harder to understand and maintain over time.

I would rather have duplication at the level of reduction specialization (e.g. Euclidean{ArgKmin,RadiusNeighbors}) than a complex chain of responsibilities between many composites and components. What do you think?

@Micky774 (Owner, Author)

> I fear that adding callbacks with support for arbitrary arguments will make those implementations harder to understand and maintain over time.
>
> I would rather have duplication at the level of reduction specialization (e.g. Euclidean{ArgKmin,RadiusNeighbors}) than a complex chain of responsibilities between many composites and components. What do you think?

Honestly, I favor tight documentation and review to mediate the complexity of the callback system over the current duplication and messy hierarchy on main. Of course, I understand this may be an unpopular opinion, in which case I'm fine with whatever the consensus among core devs is -- or at least among those most involved with Cython development 😉

It may still be a moot point if the callbacks result in an irreducible performance regression though.

@thomasjpfan

Since all of the callback implementations lead to a regression, I am okay with keeping the status quo. Currently, when we add a new reduction such as ArgKminLabels in scikit-learn#24076, there need to be two implementations: one for the normal case and another for the Euclidean specialization. For me, this is still maintainable.

If we get to a point where we want to add another specialization, we can revisit the current design. (Adding another specialization likely means adding it to ArgKmin, RadiusNeighbors, ArgKminLabels, and any future reductions.)
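
For concreteness, a hypothetical sketch of that "two implementations per reduction" pattern (ArgKminLabels is the name used in scikit-learn#24076; the pairing mirrors the existing ArgKmin/EuclideanArgKmin split):

```cython
cdef class ArgKminLabels(ArgKmin):
    # Generic implementation, working with any DistanceMetric.
    pass

cdef class EuclideanArgKminLabels(EuclideanArgKmin):
    # Specialization reusing the GEMM-based (squared) Euclidean
    # distance computation of EuclideanArgKmin.
    pass
```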

Micky774 closed this on Feb 4, 2023