MAINT Adapt `PairwiseDistancesReduction` heuristic for `strategy="aut… · scikit-learn/scikit-learn@5e1ed8b · GitHub

Commit 5e1ed8b

jjerphan and ogrisel authored
MAINT Adapt PairwiseDistancesReduction heuristic for strategy="auto" (#24043)
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
1 parent c900ad3 commit 5e1ed8b

1 file changed: +12 -1 lines changed

sklearn/metrics/_pairwise_distances_reduction/_base.pyx.tp

Lines changed: 12 additions & 1 deletion
@@ -158,7 +158,18 @@ cdef class PairwiseDistancesReduction{{name_suffix}}:
         if strategy == 'auto':
             # This is a simple heuristic whose constant for the
             # comparison has been chosen based on experiments.
-            if 4 * self.chunk_size * self.effective_n_threads < self.n_samples_X:
+            # parallel_on_X has less synchronization overhead than
+            # parallel_on_Y and should therefore be used whenever
+            # n_samples_X is large enough to not starve any of the
+            # available hardware threads.
+            if self.n_samples_Y < self.n_samples_X:
+                # No point to even consider parallelizing on Y in this case. This
+                # is in particular important to do this on machines with a large
+                # number of hardware threads.
+                strategy = 'parallel_on_X'
+            elif 4 * self.chunk_size * self.effective_n_threads < self.n_samples_X:
+                # If Y is larger than X, but X is still large enough to allow for
+                # parallelism, we might still want to favor parallelizing on X.
                 strategy = 'parallel_on_X'
             else:
                 strategy = 'parallel_on_Y'
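
To make the effect of the change easier to see, the two versions of the heuristic can be sketched as plain Python functions. This is only an illustration, not the Cython code from the commit: the function names `choose_strategy_old` and `choose_strategy_new` are hypothetical, the arguments stand in for the `self.*` attributes in the diff, and the sample values (including `chunk_size=256` and 64 threads) are assumptions picked for the example.

```python
def choose_strategy_old(n_samples_X, n_samples_Y, chunk_size, effective_n_threads):
    """Heuristic before the commit (hypothetical standalone rewrite)."""
    # n_samples_Y is unused here; it is kept so both functions share a signature.
    if 4 * chunk_size * effective_n_threads < n_samples_X:
        return "parallel_on_X"
    return "parallel_on_Y"


def choose_strategy_new(n_samples_X, n_samples_Y, chunk_size, effective_n_threads):
    """Heuristic after the commit (hypothetical standalone rewrite)."""
    if n_samples_Y < n_samples_X:
        # Y is smaller than X: parallelizing on Y is not considered at all.
        return "parallel_on_X"
    elif 4 * chunk_size * effective_n_threads < n_samples_X:
        # Y is at least as large as X, but X still offers enough chunks
        # to keep every thread busy.
        return "parallel_on_X"
    return "parallel_on_Y"


if __name__ == "__main__":
    # Illustrative values only: a small Y, a moderately sized X, many threads.
    args = dict(n_samples_X=1000, n_samples_Y=100, chunk_size=256, effective_n_threads=64)
    print(choose_strategy_old(**args))  # parallel_on_Y
    print(choose_strategy_new(**args))  # parallel_on_X
```

With many threads, the threshold `4 * chunk_size * effective_n_threads` grows quickly (65536 in this example), so the old rule fell back to `parallel_on_Y` even when Y was much smaller than X. The new first branch covers that case.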
