-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
[MRG] Neighbors refactor #4217
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] Neighbors refactor #4217
Conversation
1b5e83e
to
7fb3b60
Compare
Cython knows about dicts and generates 80 fewer lines of C for this idiom.
Saves a few thousand lines of generated C code.
Including this file into binary_tree.pxi gets rid of >5000 lines of generated C code. Also trimmed it down a bit.
7fb3b60
to
0cec160
Compare
Saves ~40kLOC of generated C code.
8884333
to
f04ecb4
Compare
There; -42563 lines of code. |
Tests pass. @jakevdp @MechCoder @jnothman? |
@@ -1,5 +1,6 @@ | |||
#!python | |||
cimport numpy as np | |||
import numpy as np |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is a Python import valid in a pxd
file? I thought that it doesn't actually do anything.
I have one concern: when I first wrote this code, I did the literal inclusion junk because I found that due to the virtual table architecture of Cython classes, the compiled Cython couldn't inline methods from derived classes, and this led to a performance penalty. I see that in the refactor, you've undone this and made the utilities which are specific to I'd like to see some comprehensive before/after benchmarks before this is merged. You may be able to make use of some of the scripts in https://github.com/jakevdp/BinaryTree – that's the repository where I experimented for several months before opening the scikit-learn PR with this functionality. Somewhere in that history there's some code that looks a lot more like this refactor 😄 |
Given #4213, can we please rename |
@jakevdp Here's the graphs from After: I don't see massive differences. Note that I moved all of the recursive algorithms to toplevel (they were |
Great! Thanks for taking a look at that. It looks like the two "after" plots are duplicates: do you have the times for the dense data "after"? |
@jakevdp My bad, uploaded the right picture. Performance differences are still very hard to spot. |
Indeed! It looks really good. Definitely a much cleaner implementation. |
I just noticed that the benchmarks do nothing with the dual tree implementation. That was one of the things I recall being very slow with the derived class pattern. Would you mind checking that as well? |
Also it would be cool if we could have a more direct comparison. Comparing the current plots is really hard. Some pandas magic should do the trick to combine the two datasets ;) |
Any news on that? |
A small part of this PR was extracted in #15097, but generally continuing this would require significant work. Now that scikit-learn supports pre-computed sparse distances (including from other faster ANN libraries for high (>100) dimensional datasets) there is a question of how much effort we should spend on this binary tree implementation. Though of course if someone interested in contributing PR to improve the situation they would be very welcome. |
After trying to rebase this PR via #22745, reflecting on the last changes that are left on the rebase (mostly refactoring private methods into private functions) and taking into account the reported results of benchmarks, I do not think it is worth integrating the proposed changes. Here are, in my opinion, several reasons:
Still, I thinking some idea from this PR can be picked. I am thinking of:
What do you think? |
If there were no objection, I assume this can be closed. Thank you for exploring improvements some of which have been integrated individually, @larsmans! |
Big refactor of the nearest neighbors/space partitioning code.
Currently, KD-tree and ball tree include (textually) the source of their base class (
binary_tree.pxi
) so that the code in that module gets compiled twice. The code in this PR merges the modules and compiles the base class once.Also: removal of workaround for old Cython and NumPy versions, tiny optimizations (less Python C-API calling).
Preliminary to optimizing radius neighbor queries as promised in #4157.