-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
[MRG + 2 -.5] Listed valid metrics for neighbors algorithms #4525
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
6fa67a7
to
3926ed2
Compare
@@ -427,6 +482,29 @@ and the ``'effective_metric_'`` is in the ``'VALID_METRICS'`` list of | |||
same order as the number of training points, and that ``leaf_size`` is | |||
close to its default value of ``30``. | |||
|
|||
Valid Metrics for Nearest Neighbor Algorithms | |||
--------------------------------------------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add a doctests that shows the valid_metrics
class attributes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shows the attributes to the users, but also makes sure that a doctests fails in this place if someone modifies the valid metrics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure how to do that :\ (doctest showing valid_metrics
class attributes) and searching didn't help. Can you point me to an example?
similar to http://scikit-learn.org/dev/modules/model_evaluation.html#common-cases-predefined-values |
'seuclidean', 'sokalsneath', 'kulsinski', 'minkowski', | ||
'mahalanobis', 'p', 'l2', 'hamming', 'l1', 'wminkowski', 'pyfunc' | ||
======================== ================================================================= | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean like it has been done in the Usage examples under that table in model_selection? So I should do something like this here?
>>> from sklearn.neighbors import KDTree >>> print(KDTree.valid_metrics)
Sorry for being noob.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, and then list the output.
Ah! I just understood what you meant with:
I had done this in #4356, was confusing what you asked with checking the output of |
doc/modules/neighbors.rst
Outdated
|
||
>>> from sklearn.neighbors import KDTree | ||
>>> print(KDTree.valid_metrics) # doctest: +ELLIPSIS | ||
[...] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had to add ...
as print was giving different output everytime :| Not sure how it will fail if someone modifies the valid metrics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Diffferent output between different versions? How could it be very different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For every run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
print(sort(KDTree.valid_metrics))
?
22a8610
to
52e513e
Compare
@amueller - Should any more changes be added? |
LGTM |
why did you close? |
Oops, sorry :\ I was deleting unused branches, must have deleted it by mistake :| |
Is there any way I can undo this? |
Now rebase it!! :) |
Yeah, fetching from upstream. |
52e513e
to
c7247f2
Compare
@@ -252,6 +252,62 @@ the lower half of those faces. | |||
multi-output regression using nearest neighbors. | |||
|
|||
|
|||
Nearest Centroid Classifier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this now in the right place? There is no deletions...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
Yes, as per #4521 (comment) its in the right place above the details on the trees and neighbors implementations.
This LGTM as well. |
@raghavrv Yes, will do. |
Reordered algorithms Added doctest Sorted metrics Changed example wording
6ec5662
to
686a256
Compare
|
||
>>> from sklearn.neighbors import KDTree | ||
>>> import numpy as np | ||
>>> print(np.sort(KDTree.valid_metrics)) # doctest: +ELLIPSIS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@raghavrv I don't think we need numpy
for this doctest, a normal sorted
would work just fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed!!
@raghavrv Apart from my comment, I think this is ready to merge. |
Thanks @vinayak-mehta! I'll review this in a while... |
'russellrao', 'seuclidean', 'sokalmichener', | ||
'sokalsneath', 'sqeuclidean', 'yule', 'wminkowski' | ||
|
||
**K-D Tree** 'chebyshev', 'euclidean', 'cityblock', 'manhattan', 'infinity', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should note that modules/generated/sklearn.neighbors.DistanceMetric.html
lists the measures for K-D and Ball Trees, along with their arguments. Or else we should just be pointing to that reference here rather than listing measures, so as to avoid duplication.
@jnothman I might not have been aware of that docstring. I'd be fine with a very obvious pointer towards it. Having this listed in the user guide seems helpful but duplication is not great. |
(misclick sorry) |
Opened #19379 to take over this. |
#4521