MNT Uses memoryviews in tree criterion by thomasjpfan · Pull Request #22921 · scikit-learn/scikit-learn · GitHub

MNT Uses memoryviews in tree criterion #22921

Merged
merged 5 commits into from
Mar 29, 2022

Conversation

thomasjpfan
Member

Continues #22868

This PR refactors sklearn/tree/_criterion.pxd to use memoryviews. This simplifies the code because the memoryview handles the strides and Python manages the memory. I ran this benchmark with every criterion and different values of n_samples and noticed no change in performance. Here are the plots of the results, and here are the raw results on main and the raw results for this PR.
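To illustrate the point about strides, here is a minimal sketch in plain Python. It is only an analogy: the PR itself uses Cython typed memoryviews, which differ from Python's built-in `memoryview`, and the array contents and helper names below are hypothetical. The sketch contrasts pointer-style access, where the caller tracks the row stride by hand, with a 2-D view that stores its own shape and strides.

```python
from array import array

# Hypothetical data: weighted class counts for 2 outputs x 3 classes,
# stored in one flat, C-contiguous buffer of doubles.
n_outputs, n_classes = 2, 3
flat = array("d", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])


def get_with_stride(buf, k, c, stride):
    """Pointer-style access: the caller supplies the row stride."""
    return buf[k * stride + c]


# A 2-D view over the same memory. memoryview.cast requires a byte format
# on one side of the cast, hence the two-step cast through "B".
view = memoryview(flat).cast("B").cast("d", shape=[n_outputs, n_classes])


def get_with_view(mv, k, c):
    """View-style access: the view carries its own shape and strides."""
    return mv[k, c]


manual = [
    get_with_stride(flat, k, c, n_classes)
    for k in range(n_outputs)
    for c in range(n_classes)
]
viewed = [
    get_with_view(view, k, c)
    for k in range(n_outputs)
    for c in range(n_classes)
]
```

Both lists hold the same values; the difference is that the second version never mentions the stride, which is the bookkeeping the refactoring removes.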

Note that memcpy and memset are still used because they benchmark better when compared to their memoryview counterparts (mv[:] = 0.0 or mv[:] = other_mv).

Member
@jjerphan jjerphan left a comment

Thank you, @thomasjpfan.

Relying on numpy's allocator is indeed more appropriate.

thomasjpfan and others added 3 commits March 25, 2022 10:51
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Member
@jeremiedbb jeremiedbb left a comment

Nice simplification.

This is the only place where the strides are still needed. When I benchmark by using the memoryview directly, I get a runtime regression.

Was it significant? Do you still have the results for this?

@jeremiedbb
Member

Note that memcpy and memset are still used because they benchmark better when compared to their memoryview counterparts (mv[:] = 0.0 or mv[:] = other_mv).

That's expected. memset natively works on blocks that have the size of the registers. The loop can't compete with that, even more so when not all optimisation flags are enabled.
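The contrast between a block-level fill and an element-by-element loop can be sketched in plain Python with `ctypes`. This is an illustration under assumptions, not the PR's Cython code: the buffer names are hypothetical, and `ctypes.memmove` stands in for `memcpy` because `ctypes` does not expose `memcpy` directly.

```python
import ctypes
from array import array


def zero_with_loop(buf):
    """Element-wise fill: one assignment per item."""
    for i in range(len(buf)):
        buf[i] = 0.0


def zero_with_memset(buf):
    """Block fill: a single call zeroes the whole buffer.

    All-zero bytes represent 0.0 for IEEE 754 doubles.
    """
    address, n_items = buf.buffer_info()
    ctypes.memset(address, 0, n_items * buf.itemsize)


def copy_block(dst, src):
    """Block copy: a single call copies every byte from src to dst."""
    if len(dst) != len(src) or dst.itemsize != src.itemsize:
        raise ValueError("buffers must have the same length and item size")
    dst_address, n_items = dst.buffer_info()
    src_address, _ = src.buffer_info()
    ctypes.memmove(dst_address, src_address, n_items * dst.itemsize)


# Hypothetical buffers standing in for a criterion's running sums.
sum_left = array("d", [1.5, 2.5, 3.5])
sum_total = array("d", [10.0, 20.0, 30.0])
sum_right = array("d", [0.0, 0.0, 0.0])

zero_with_memset(sum_left)        # analogous to memset(ptr, 0, n_bytes)
copy_block(sum_right, sum_total)  # analogous to memcpy(dst, src, n_bytes)
```

The loop and the block call give the same result; only the amount of work per element differs, which is why the relative timings depend on the compiler and its optimisation flags.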

@thomasjpfan
Member Author
thomasjpfan commented Mar 26, 2022

Was it significant? Do you still have the results for this?

I reran the benchmarks for entropy with memoryviews (repeating the run 30 times with different random seeds), and I got similar results for main, for memoryviews (pr_mv), and for pointers (pr_pointer):
[Figure: compare_entropy]

From memory, my original benchmarks showed a ~1% runtime regression compared to main. Maybe something was running on my system during my original benchmarks.

If you are interested in running the benchmark:

python benchmark.py pr_results.json --config entropy

will store the results in pr_results.json and output the mean/std for the 30 runs.
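For readers without access to the script, the general shape of such a benchmark (repeat a timed run with different seeds, then report the mean and standard deviation and save the results as JSON) might look like the sketch below. This is not the actual benchmark.py: the workload, function names, and JSON keys are all hypothetical stand-ins.

```python
import json
import random
import statistics
import time


def workload(seed, n_samples=2_000):
    """Hypothetical stand-in for fitting a tree: sort seeded random data."""
    rng = random.Random(seed)
    data = [rng.random() for _ in range(n_samples)]
    data.sort()
    return data


def run_benchmark(n_repeats=30):
    """Time the workload once per seed and summarise the durations."""
    durations = []
    for seed in range(n_repeats):
        start = time.perf_counter()
        workload(seed)
        durations.append(time.perf_counter() - start)
    return {
        "n_repeats": n_repeats,
        "durations": durations,
        "mean": statistics.mean(durations),
        "std": statistics.stdev(durations),
    }


def save_results(results, path):
    """Write the summary to a JSON file at the given path."""
    with open(path, "w") as f:
        json.dump(results, f, indent=2)
```

Calling `save_results(run_benchmark(), "pr_results.json")` would write one summary per invocation. Comparing the means of two such files is only meaningful when the difference is large relative to the reported standard deviation.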

@jeremiedbb
Member

@jjerphan you might want to take another look at it?

Member
@jjerphan jjerphan left a comment

LGTM. Thank you, @thomasjpfan.

@@ -319,8 +292,7 @@ cdef class ClassificationCriterion(Criterion):
cdef SIZE_t offset = 0
Member

This can be removed.
