-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
MNT Uses memoryviews in tree criterion #22921
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MNT Uses memoryviews in tree criterion #22921
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @thomasjpfan.
Relying on numpy's allocator is indeed more appropriate.
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice simplification.
This is the only place where the strides are still needed. When I benchmark by using the memorview directly, I get a runtime regression.
Was it significant ? do you still have the results for this ?
That's expected. memset natively works on blocks that have the size of the registers. The loop can't compete with that, even more when not all optimisation flags are enabled. |
I reran the benchmarks for entropy again with memoryviews (repeating it 30 times with different random seeds), and I got similiar results between From memory, my original benchmarks showed a ~1% runtime regression compared to main. Maybe something was running on my system during my original benchmarks. If you are interested in running the benchmark: python benchmark.py pr_results.json --config entropy will store the results in |
@jjerphan you might want to take another look at it ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thank you, @thomasjpfan.
sklearn/tree/_criterion.pyx
Outdated
@@ -319,8 +292,7 @@ cdef class ClassificationCriterion(Criterion): | |||
cdef SIZE_t offset = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be removed.
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Continues #22868
This PR refactors
sklearn/tree/_criterion.pxd
to use memoryviews. This PR simplifies the code because strides are handled by the memoryview and Python can handle the memory. I ran this benchmark with every criterion and different n_samples and noticed no change in performance. Here are the plots of the results and here is the raw results on main and raw results for this PR.Note that
memcpy
andmemset
are still used because they benchmark better when compared to their memoryview counter parts (mv[:] = 0.0
ormv[:] = other_mv
).