-
-
Notifications
You must be signed in to change notification settings - Fork 26k
[MRG] Adapts cdist jaccard to scipy 1.2.0 #12692
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
that's a backwards-incompatible change, right? So now we're changing jaccard for everyone to give zero, not nan. |
Yes it is not backwards compatible, since scipy's change is also not backwards compatible. Following the discussion: scipy/scipy#7373, it seems the old version of If we want to be strict about backwards compatibility, we can wait two versions to deprecate |
I'm fine with making the change, but we should note it prominently in the whatsnew. |
Seeing as nearest neighbors wasn't going to work well with NaNs in any case, I'm fine with this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM pending what's new entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @thomasjpfan
I think it's always acceptable to keep consistent with scipy
I'm OK with 0.20.2
doc/whats_new/v0.20.rst
Outdated
random sampling procedures. | ||
|
||
- :mod:`sklearn.neighbors` when ``metric=='jaccard'`` (bug fix) | ||
- :mod:`sklearn.metrics.pairwise` when ``metric=='jaccard'` and Scipy >= 1.2.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not related to changes in scikit-learn so remove?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR was updated to remove the second item regarding sklearn.metrics.pairwise
.
@@ -788,6 +788,8 @@ cdef class JaccardDistance(DistanceMetric): | |||
tf2 = x2[j] != 0 | |||
nnz += (tf1 or tf2) | |||
n_eq += (tf1 and tf2) | |||
if nnz == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add some explanations and point to the scipy issue
@@ -101,6 +103,8 @@ def check_pdist(metric, kwargs, D_true): | |||
def check_pdist_bool(metric, D_true): | |||
dm = DistanceMetric.get_metric(metric) | |||
D12 = dm.pairwise(X1_bool) | |||
if metric == 'jaccard' and LooseVersion(scipy_version) < '1.2.0': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add some explanations and point to the scipy issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @thomasjpfan
This reverts commit ab4238d.
This reverts commit ab4238d.
Reference Issues/PRs
Fixes #12685
What does this implement/fix? Explain your changes.
Adjusts
JaccardDistance.dist
to match with scipy 1.2.0 version.Any other comments?
The test
check_pdist_bool
had to be adjusted to be backwards compatible with older versions of scipy.