8000 [WIP] Metrics testing by amueller · Pull Request #4522 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

[WIP] Metrics testing #4522

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed

Conversation

amueller
Copy link
Member
@amueller amueller commented Apr 5, 2015

There have recently popped up some issues about support for different metrics. See #4520 and #4452 for example.
I am trying to add more tests, but I am not sure I am familiar enough with the metrics / neighbors modules.
@jakevdp your help would be much appreciated.
Currently travis fails because the Jaccard distances in the trees seem to be very different from the scipy ones. I think this is because our trees cast everything to bool, while scipy uses floats.
This seems to be a pretty big issues, as the algorithm that is used might change automatically depending on the dataset!!!

@landscape-bot
Copy link

Code Health
Code quality remained the same when pulling b675fc8 on amueller:metrics_testing into 9206a78 on scikit-learn:master.

@amueller
Copy link
Member Author
amueller commented Apr 5, 2015

Opened #4523 with the hope of tracking that issue.

@jakevdp
Copy link
Member
jakevdp commented May 5, 2015

I've commented on both the issues you referenced. I think the problem is that Scipy plays fast-and-loose with the Jaccard distance definition. This came up on the scipy mailing list a couple years ago; see my explanation here: http://mail.scipy.org/pipermail/scipy-dev/2012-December/018129.html

@amueller
Copy link
Member Author
amueller commented May 5, 2015

The issues in #4520 and #4452 are not tested for, either, though.
If we switch backends automatically, we need to make sure the behavior is the same.

@lesteve
Copy link
Member
lesteve commented Jun 29, 2016

@TomDLT @amueller now that #6932 has been merged should we try to resuscitate this PR?

I tried rebasing this PR on master but there seems to be another error that I haven't investigated yet:

❯ nosetests sklearn/neighbors/tests/test_neighbors.py -m test_neighbors_metrics
/home/lesteve/dev/scikit-learn/sklearn/utils/validation.py:428: DataConversionWarning: Data with input dtype float64 was converted to bool by check_pairwise_arrays.
 
8000
 warnings.warn(msg, _DataConversionWarning)
/home/lesteve/dev/scikit-learn/sklearn/utils/validation.py:428: DataConversionWarning: Data with input dtype float64 was converted to bool by check_pairwise_arrays.
  warnings.warn(msg, _DataConversionWarning)
/home/lesteve/dev/scikit-learn/sklearn/utils/validation.py:428: DataConversionWarning: Data with input dtype float64 was converted to bool by check_pairwise_arrays.
  warnings.warn(msg, _DataConversionWarning)
E
======================================================================
ERROR: sklearn.neighbors.tests.test_neighbors.test_neighbors_metrics
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/lesteve/miniconda3/lib/python3.5/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/home/lesteve/dev/scikit-learn/sklearn/neighbors/tests/test_neighbors.py", line 971, in test_neighbors_metrics
    neigh.fit(X)
  File "/home/lesteve/dev/scikit-learn/sklearn/neighbors/base.py", line 798, in fit
    return self._fit(X)
  File "/home/lesteve/dev/scikit-learn/sklearn/neighbors/base.py", line 240, in _fit
    **self.effective_metric_params_)
  File "sklearn/neighbors/binary_tree.pxi", line 1059, in sklearn.neighbors.ball_tree.BinaryTree.__init__ (sklearn/neighbors/ball_tree.c:8906)
    self.dist_metric = DistanceMetric.get_metric(metric, **kwargs)
  File "sklearn/neighbors/dist_metrics.pyx", line 287, in sklearn.neighbors.dist_metrics.DistanceMetric.get_metric (sklearn/neighbors/dist_metrics.c:4484)
    return metric(**kwargs)
  File "sklearn/neighbors/dist_metrics.pyx", line 648, in sklearn.neighbors.dist_metrics.MahalanobisDistance.__init__ (sklearn/neighbors/dist_metrics.c:8241)
    raise ValueError("Must provide either V or VI "
ValueError: Must provide either V or VI for Mahalanobis distance

----------------------------------------------------------------------
Ran 1 test in 0.023s

FAILED (errors=1)

@amueller
Copy link
Member Author

We probably should. Related: #6915

Base automatically changed from master to main January 22, 2021 10:48
@adrinjalali
Copy link
Member

Good idea, but we probably need to start fresh here. Happy to have it reopened @amueller if you fancy continuing this PR.

@adrinjalali adrinjalali closed this Mar 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
0