-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
MAINT Plug PairwiseDistancesArgKmin
as a back-end
#22288
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MAINT Plug PairwiseDistancesArgKmin
as a back-end
#22288
Conversation
The previous would have made the code more complex by introducing some boilerplate for the interface plugs. Having it this way actually simplifies the code. This also removes the haversine branch for test_pairwise_distances_argkmin Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
This was needed when 'fast_sqeuclidean' and 'fast_euclidean' were present to choose the best implementation based on the user specification. Those metric have been removed since then, making this attribute useless. Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
1e462a4
to
ba6c463
Compare
test_neighbors_metrics
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (assuming all tests now pass)!
|
Most likely need to merge with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com> Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com> Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I think the config_context discussion is not a blocker for this PR (But let's not forget it before merging the final PR in main).
The coverage seems a bit light according to codecov (assuming it worked and did not forget any coverage file). |
Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com> Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
#22371 has been opened to improve coverage. |
I think the freeze in some CI jobs comes from pytest 7. Merge the current |
7b90dfa
to
b8e9fb8
Compare
Finally all 🟢 🙂 |
…min` (feature branch) (#22134) * MAINT Introduce Pairwise Distances Reductions private submodule (#22064) * MAINT Introduce FastEuclideanPairwiseArgKmin (#22065) * fixup! Merge branch 'main' into pairwise-distances-argkmin Remove duplicated Bunch * MAINT Plug `PairwiseDistancesArgKmin` as a back-end (#22288) * Forward pairwise_dist_chunk_size in the configuration * Flip finalized results for PairwiseDistancesArgKmin The previous would have made the code more complex by introducing some boilerplate for the interface plugs. Having it this way actually simplifies the code. This also removes the haversine branch for test_pairwise_distances_argkmin * Plug PairwiseDistancesArgKmin as a back-end * Adapt test accordingly * Add whats_new entry * Change input validation order for kneighbors * Remove duplicated test_neighbors_distance_metric_deprecation * Adapt the documentation * Add mahalanobis case to test fixtures * Correct whats_new entry * CLN Remove unneeded private metric attribute This was needed when 'fast_sqeuclidean' and 'fast_euclidean' were present to choose the best implementation based on the user specification. Those metric have been removed since then, making this attribute useless. * TST Assert FutureWarning instead of DeprecationWarning in test_neighbors_metrics * MAINT Add use_pairwise_dist_activate to scikit-learn config * TST Add a test for the 'brute' backends' results' consistency Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com> * fixup! MAINT Add use_pairwise_dist_activate to scikit-learn config * fixup! fixup! MAINT Add use_pairwise_dist_activate to scikit-learn config * TST Filter FutureWarning for WMinkowskiDistance * MAINT pin numpydoc in arm for now (#22292) * fixup! TST Filter FutureWarning for WMinkowskiDistance * Revert keywords arguments removal for the GEMM trick for 'euclidean' * MAINT pin max numpydoc for now (#22286) * Add 'haversine' to CDIST_PAIRWISE_DISTANCES_REDUCTION_COMMON_METRICS * fixup! Add 'haversine' to CDIST_PAIRWISE_DISTANCES_REDUCTION_COMMON_METRICS * Apply suggestions from code review * MAINT Document some config parameters for maintenance Also rename one of them. * FIX Support and test one of 'sqeuclidean' specification Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> * FIX Various typos fix and correct haversine 'haversine' is not supported by cdist. * Directly use get_config * CLN Apply comments from review * Motivate swapped returned values * TST Remove mahalanobis from test fixtures * MNT Add comment regaduction functions' signatures * TST Complete test for `pairwise_distance_{argmin,argmin_min}` (#22371) * DOC Add sub-pull requests to the whats_new entry * DOC place comment inside functions * DOC move up whatsnew entry Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com> Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com> Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
…min` (feature branch) (scikit-learn#22134) * MAINT Introduce Pairwise Distances Reductions private submodule (scikit-learn#22064) * MAINT Introduce FastEuclideanPairwiseArgKmin (scikit-learn#22065) * fixup! Merge branch 'main' into pairwise-distances-argkmin Remove duplicated Bunch * MAINT Plug `PairwiseDistancesArgKmin` as a back-end (scikit-learn#22288) * Forward pairwise_dist_chunk_size in the configuration * Flip finalized results for PairwiseDistancesArgKmin The previous would have made the code more complex by introducing some boilerplate for the interface plugs. Having it this way actually simplifies the code. This also removes the haversine branch for test_pairwise_distances_argkmin * Plug PairwiseDistancesArgKmin as a back-end * Adapt test accordingly * Add whats_new entry * Change input validation order for kneighbors * Remove duplicated test_neighbors_distance_metric_deprecation * Adapt the documentation * Add mahalanobis case to test fixtures * Correct whats_new entry * CLN Remove unneeded private metric attribute This was needed when 'fast_sqeuclidean' and 'fast_euclidean' were present to choose the best implementation based on the user specification. Those metric have been removed since then, making this attribute useless. * TST Assert FutureWarning instead of DeprecationWarning in test_neighbors_metrics * MAINT Add use_pairwise_dist_activate to scikit-learn config * TST Add a test for the 'brute' backends' results' consistency Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com> * fixup! MAINT Add use_pairwise_dist_activate to scikit-learn config * fixup! fixup! MAINT Add use_pairwise_dist_activate to scikit-learn config * TST Filter FutureWarning for WMinkowskiDistance * MAINT pin numpydoc in arm for now (scikit-learn#22292) * fixup! TST Filter FutureWarning for WMinkowskiDistance * Revert keywords arguments removal for the GEMM trick for 'euclidean' * MAINT pin max numpydoc for now (scikit-learn#22286) * Add 'haversine' to CDIST_PAIRWISE_DISTANCES_REDUCTION_COMMON_METRICS * fixup! Add 'haversine' to CDIST_PAIRWISE_DISTANCES_REDUCTION_COMMON_METRICS * Apply suggestions from code review * MAINT Document some config parameters for maintenance Also rename one of them. * FIX Support and test one of 'sqeuclidean' specification Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> * FIX Various typos fix and correct haversine 'haversine' is not supported by cdist. * Directly use get_config * CLN Apply comments from review * Motivate swapped returned values * TST Remove mahalanobis from test fixtures * MNT Add comment regaduction functions' signatures * TST Complete test for `pairwise_distance_{argmin,argmin_min}` (scikit-learn#22371) * DOC Add sub-pull requests to the whats_new entry * DOC place comment inside functions * DOC move up whatsnew entry Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com> Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com> Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
Reference Issues/PRs
Part of #21462 (comment).
Supersedes #22214.
What does this implement/fix? Explain your changes.
This plugs
PairwiseDistancesArgKmin
as a back-end so that it is used for some API, namely:sklearn.metrics.pairwise_distances_argmin
sklearn.metrics.pairwise_distances_argmin_min
sklearn.cluster.AffinityPropagation
sklearn.cluster.Birch
sklearn.cluster.MeanShift
sklearn.cluster.OPTICS
sklearn.cluster.SpectralClustering
sklearn.feature_selection.mutual_info_regression
sklearn.neighbors.KNeighborsClassifier
sklearn.neighbors.KNeighborsRegressor
sklearn.neighbors.LocalOutlierFactor
sklearn.neighbors.NearestNeighbors
sklearn.manifold.Isomap
sklearn.manifold.LocallyLinearEmbedding
sklearn.manifold.TSNE
sklearn.manifold.trustworthiness
sklearn.semi_supervised.LabelPropagation
sklearn.semi_supervised.LabelSpreading
Existing tests are adapted and completed accordingly.