-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
[MRG+2] convert to boolean arrays for boolean distances (ex: jaccard) #5460
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -1195,12 +1196,19 @@ def pairwise_distances(X, Y=None, metric="euclidean", n_jobs=1, **kwds): | |||
" support sparse matrices.") | |||
X, Y = check_pairwise_arrays(X, Y) | |||
if n_jobs == 1 and X is Y: | |||
if metric in PAIRWISE_BOOLEAN_FUNCTIONS and X.dtype != bool: | |||
warnings.warn("Metric %s: Implicit cast to boolean for X of " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DataConversionWarning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure!
Looks good. We'd want some regression tests that check the behavior for non-boolean inputs. |
Tests added, that fail on master |
This looks really good. Let's wait for tests to pass. In the meantime, this will need a CHANGELOG entry, especially because it might change results for certain use cases. Actually, we might even think about having some sort of deprecation cycle for this... what do you think? |
I consider this a bugfix, so no deprecation cycle. We do need a whatsnew entry, though |
whatsnew entry added |
From the diff, it looks like an unrelated section of whatsnew was accidentally deleted in the most recent commit. |
This entry was duplicated |
NN = neighbors.NearestNeighbors | ||
|
||
nn1 = NN(metric="jaccard", algorithm='brute').fit(X) | ||
nn2 = NN(metric="jaccard", algorithm='ball_tree').fit(X) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 looks neat
LGTM apart from the minor nitpick! Thx @TomDLT !! |
Yes a rebase and forcepush should trigger re-build of Circle with the fixed configuration. |
see #4523 (comment)
ping @amueller @jakevdp
Closes #4523