8000 MAINT refactor heap routines into utils by jjerphan · Pull Request #21987 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

MAINT refactor heap routines into utils #21987

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Dec 21, 2021

Conversation

jjerphan
Copy link
Member
@jjerphan jjerphan commented Dec 15, 2021

Reference Issues/PRs

Part of breaking #21462 down in smaller PRs.

What does this implement/fix? Explain your changes.

A few fixtures have been refactored and/or introduced in #21462.

This PR encapsulates those changes.

This PR refactors the Cython heap routines currently in the neighbors module into sklearn.utils. It improves by making use of fused types. It also adds the Cython routine _openmp_thread_num to the openmp helpers.

Copy link
Member
@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for opening this PR @jjerphan !

I think this PR can be merged directly into main. It has no functional changes and is a refactor that makes the original code easier to maintain.

(It also has a side benefit of using fused types in simultaneous_sort.)

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
@jjerphan jjerphan changed the title Refactor fixtures prior to PairwiseDistancesReduction introduction MNT Refactor fixtures prior to PairwiseDistancesReduction introduction Dec 15, 2021
@jjerphan jjerphan changed the title MNT Refactor fixtures prior to PairwiseDistancesReduction introduction MAINT Refactor fixtures prior to PairwiseDistancesReduction introduction Dec 15, 2021
@jjerphan jjerphan changed the base branch from pairwise-distances-argkmin to main December 15, 2021 16:00
@jjerphan
Copy link
Member Author

Thank you, @thomasjpfan. I really appreciate your quick response and commitment. 🤝

I've changed the base to main as you suggested and treated your suggestions.

Copy link
Member
@lorentzenchr lorentzenchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8000

LGTM

Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
@jjerphan
Copy link
Member Author

@lorentzenchr: Let me know if e369008 makes the comments clearer.

Copy link
Member
@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as well!

Copy link
Member
@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestions to further improve the docstring:

Also fix typo.

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
jjerphan and others added 3 commits December 17, 2021 08:33
Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
@jjerphan jjerphan requested a review from thomasjpfan December 20, 2021 15:44
Copy link
Member
@lorentzenchr lorentzenchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
@lorentzenchr
Copy link
Member

Waiting for a green CI...⏳

@lorentzenchr
Copy link
Member

BTW, this is a nice example of a refactoring with no new code but +100 diff which mainly comes from better docstrings and comments.

@jjerphan
Copy link
Member Author

I don't understand why code coverage is not passing now. 🤔

As to see if there's a glitch on the CI codecov/patch.
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
@jjerphan
Copy link
Member Author
jjerphan commented Dec 21, 2021

This is a general observation based on several identical experience: we spend much time rerunning a full CI several time for tiny diffs. I think that this situation is not that efficient on several aspects: it breaks the programming experience forcing us to wait for a feedback which takes close to a half an hour eventually creating context switches for every one; it also uses much more energy than needed, rerunning all the test suite.

Hence some naive questions: Do you think it is worth improving this? Is there a solution which allows solely rerunning relevant CI jobs in a finer-grained fashion?


Edit: all 🟢, @lorentzenchr.

@lorentzenchr lorentzenchr changed the title MAINT Refactor fixtures prior to PairwiseDistancesReduction introduction MAINT refactor heap routines into utils Dec 21, 2021
@lorentzenchr lorentzenchr merged commit 1a7cdbf into scikit-learn:main Dec 21, 2021
venkyyuvy pushed a commit to venkyyuvy/scikit-learn that referenced this pull request Jan 1, 2022
@jjerphan jjerphan deleted the extract-fixtures branch February 2, 2022 07:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0