FIX Adapts cdist jaccard to scipy 1.2.0 (#12692) · scikit-learn/scikit-learn@7f5aa85
Commit 7f5aa85

thomasjpfan authored and qinhanmin2014 committed
FIX Adapts cdist jaccard to scipy 1.2.0 (#12692)
1 parent dbd28e7 commit 7f5aa85

3 files changed

+28 -0 lines changed
doc/whats_new/v0.20.rst

Lines changed: 16 additions & 0 deletions
@@ -12,6 +12,16 @@ Version 0.20.2
 This is a bug-fix release with some minor documentation improvements and
 enhancements to features released in 0.20.0.
 
+Changed models
+--------------
+
+The following estimators and functions, when fit with the same data and
+parameters, may produce different models from the previous version. This often
+occurs due to changes in the modelling logic (bug fixes or enhancements), or in
+random sampling procedures.
+
+- :mod:`sklearn.neighbors` when ``metric=='jaccard'`` (bug fix)
+
 Changelog
 ---------
 
@@ -22,6 +32,12 @@ Changelog
   parameter may not have been updated correctly when a step is set to ``None``
   or ``'passthrough'``. :user:`Thomas Fan <thomasjpfan>`.
 
+:mod:`sklearn.neighbors`
+........................
+
+- |Fix| Fixed :class:`sklearn.neighbors.DistanceMetric` jaccard distance
+  function to return 0 when two all-zero vectors are compared.
+  :issue:`12685` by :user:`Thomas Fan <thomasjpfan>`.
 
 .. _changes_0_20_1:
 

sklearn/neighbors/dist_metrics.pyx

Lines changed: 5 additions & 0 deletions
@@ -788,6 +788,11 @@ cdef class JaccardDistance(DistanceMetric):
             tf2 = x2[j] != 0
             nnz += (tf1 or tf2)
             n_eq += (tf1 and tf2)
+        # Based on https://github.com/scipy/scipy/pull/7373
+        # When comparing two all-zero vectors, scipy>=1.2.0 jaccard metric
+        # was changed to return 0, instead of nan.
+        if nnz == 0:
+            return 0
         return (nnz - n_eq) * 1.0 / nnz
 
 
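The guard added in this hunk can be illustrated outside Cython. The following is a minimal pure-Python sketch of the same counting logic, not the library's implementation; the function name `jaccard_distance` and the length check are assumptions introduced here for illustration.

```python
def jaccard_distance(x1, x2):
    """Jaccard dissimilarity of two equal-length numeric sequences.

    Hypothetical helper mirroring the counting logic in the hunk above:
    an entry counts as "set" when it is non-zero.
    """
    if len(x1) != len(x2):
        raise ValueError("x1 and x2 must have the same length")

    nnz = 0   # positions where at least one vector is non-zero
    n_eq = 0  # positions where both vectors are non-zero
    for a, b in zip(x1, x2):
        tf1 = a != 0
        tf2 = b != 0
        nnz += (tf1 or tf2)
        n_eq += (tf1 and tf2)

    # Two all-zero vectors: return 0 instead of dividing by zero,
    # matching the scipy>=1.2.0 behaviour described in the commit.
    if nnz == 0:
        return 0.0
    return (nnz - n_eq) / nnz
```

Without the `nnz == 0` branch, comparing two all-zero vectors would divide zero by zero, which is where the `nan` mentioned in the commit comment came from.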

sklearn/neighbors/tests/test_dist_metrics.py

Lines changed: 7 additions & 0 deletions
@@ -6,6 +6,8 @@
 
 import pytest
 
+from distutils.version import LooseVersion
+from scipy import __version__ as scipy_version
 from scipy.spatial.distance import cdist
 from sklearn.neighbors.dist_metrics import DistanceMetric
 from sklearn.neighbors import BallTree
@@ -101,6 +103,11 @@ def check_pdist(metric, kwargs, D_true):
 def check_pdist_bool(metric, D_true):
     dm = DistanceMetric.get_metric(metric)
     D12 = dm.pairwise(X1_bool)
+    # Based on https://github.com/scipy/scipy/pull/7373
+    # When comparing two all-zero vectors, scipy>=1.2.0 jaccard metric
+    # was changed to return 0, instead of nan.
+    if metric == 'jaccard' and LooseVersion(scipy_version) < '1.2.0':
+        D_true[np.isnan(D_true)] = 0
     assert_array_almost_equal(D12, D_true)
 
 
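The test change normalizes the reference distances only when the installed scipy predates 1.2.0, since older versions return `nan` for two all-zero vectors. Below is a minimal standard-library sketch of that version-gated normalization. The helpers `version_tuple` and `normalize_reference` are hypothetical names, plain nested lists stand in for the NumPy array, and the tuple-based parsing is a simplification rather than a replacement for `LooseVersion` (note that `distutils` was removed in Python 3.12).

```python
import math


def version_tuple(version):
    """Parse the leading numeric components of a version string.

    Hypothetical helper: '1.2.0' -> (1, 2, 0), '1.1.0rc1' -> (1, 1, 0).
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def normalize_reference(d_true, scipy_version):
    """Replace nan entries with 0 when the reference comes from scipy < 1.2.0.

    d_true is a list of rows of floats (an assumption for this sketch);
    the input is left unmodified and a new nested list is returned.
    """
    if version_tuple(scipy_version) < (1, 2, 0):
        return [[0.0 if math.isnan(v) else v for v in row] for row in d_true]
    return [list(row) for row in d_true]
```

With a reference computed by an older scipy, `normalize_reference([[0.0, float("nan")]], "1.1.0")` yields `[[0.0, 0.0]]`, so the expected values agree with the patched `JaccardDistance`.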
