[MRG + 2 -.5] Listed valid metrics for neighbors algorithms by vinayak-mehta · Pull Request #4525 · scikit-learn/scikit-learn
Closed
144 changes: 88 additions & 56 deletions doc/modules/neighbors.rst
@@ -252,6 +252,62 @@ the lower half of those faces.
multi-output regression using nearest neighbors.


Member: Is this now in the right place? There are no deletions...

Contributor Author: Fixed. Yes, as per #4521 (comment), it's in the right place, above the details on the trees and neighbors implementations.

Nearest Centroid Classifier
===========================

The :class:`NearestCentroid` classifier is a simple algorithm that represents
each class by the centroid of its members. In effect, this makes it
similar to the label updating phase of the :class:`sklearn.KMeans` algorithm.
It also has no parameters to choose, making it a good baseline classifier. It
does, however, suffer on non-convex classes, as well as when classes have
drastically different variances, as equal variance in all dimensions is
assumed. See Linear Discriminant Analysis (:class:`sklearn.lda.LDA`) and
Quadratic Discriminant Analysis (:class:`sklearn.qda.QDA`) for more complex
methods that do not make this assumption. Usage of the default
:class:`NearestCentroid` is simple:

>>> from sklearn.neighbors.nearest_centroid import NearestCentroid
>>> import numpy as np
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> clf = NearestCentroid()
>>> clf.fit(X, y)
NearestCentroid(metric='euclidean', shrink_threshold=None)
>>> print(clf.predict([[-0.8, -1]]))
[1]


Nearest Shrunken Centroid
-------------------------

The :class:`NearestCentroid` classifier has a ``shrink_threshold`` parameter,
which implements the nearest shrunken centroid classifier. In effect, the value
of each feature for each centroid is divided by the within-class variance of
that feature. The feature values are then reduced by ``shrink_threshold``. Most
notably, if a particular feature value crosses zero, it is set
to zero. In effect, this removes the feature from affecting the classification.
This is useful, for example, for removing noisy features.
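
Written out (a sketch; this notation is ours, not from the scikit-learn docs), the
shrinkage applied to each variance-scaled centroid component :math:`d` is a
soft-thresholding step:

.. math:: d' = \mathrm{sign}(d)\,\max(|d| - \Delta, 0)

where :math:`\Delta` is ``shrink_threshold``, so any component whose magnitude
falls below :math:`\Delta` is set exactly to zero.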

In the example below, using a small shrink threshold increases the accuracy of
the model from 0.81 to 0.82.
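
A minimal sketch of turning shrinkage on, reusing the toy data from the usage
snippet above (the value ``0.2`` is an arbitrary illustrative choice):

>>> from sklearn.neighbors.nearest_centroid import NearestCentroid
>>> import numpy as np
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> clf = NearestCentroid(shrink_threshold=0.2)
>>> clf.fit(X, y)
NearestCentroid(metric='euclidean', shrink_threshold=0.2)
>>> print(clf.predict([[-0.8, -1]]))
[1]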

.. |nearest_centroid_1| image:: ../auto_examples/neighbors/images/plot_nearest_centroid_001.png
:target: ../auto_examples/neighbors/plot_classification.html
:scale: 50

.. |nearest_centroid_2| image:: ../auto_examples/neighbors/images/plot_nearest_centroid_002.png
:target: ../auto_examples/neighbors/plot_classification.html
:scale: 50

.. centered:: |nearest_centroid_1| |nearest_centroid_2|

.. topic:: Examples:

* :ref:`example_neighbors_plot_nearest_centroid.py`: an example of
classification using nearest centroid with different shrink thresholds.

.. _approximate_nearest_neighbors:

Nearest Neighbor Algorithms
===========================

@@ -427,6 +483,38 @@ and the ``'effective_metric_'`` is in the ``'VALID_METRICS'`` list of
same order as the number of training points, and that ``leaf_size`` is
close to its default value of ``30``.
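
As a quick illustration (a sketch, not part of this diff), the metric actually in
use is exposed after fitting through the ``effective_metric_`` attribute mentioned
in the context above:

>>> from sklearn.neighbors import NearestNeighbors
>>> import numpy as np
>>> X = np.array([[0., 0.], [1., 1.], [2., 2.]])
>>> nn = NearestNeighbors(metric='manhattan').fit(X)
>>> nn.effective_metric_
'manhattan'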

Valid Metrics for Nearest Neighbor Algorithms
---------------------------------------------

Member: Can you also add a doctest that shows the ``valid_metrics`` class attributes? This shows the attributes to the users, but also makes sure that a doctest fails in this place if someone modifies the valid metrics.

Contributor Author: I am not sure how to do that :\ (doctest showing ``valid_metrics`` class attributes) and searching didn't help. Can you point me to an example?

========================  =================================================================
Algorithm                 Valid Metrics
========================  =================================================================
**Brute Force**           'euclidean', 'l2', 'l1', 'manhattan', 'cityblock',
                          'braycurtis', 'canberra', 'chebyshev', 'correlation',
                          'cosine', 'dice', 'hamming', 'jaccard', 'kulsinski',
                          'mahalanobis', 'matching', 'minkowski', 'rogerstanimoto',
                          'russellrao', 'seuclidean', 'sokalmichener',
                          'sokalsneath', 'sqeuclidean', 'yule', 'wminkowski'

**K-D Tree**              'chebyshev', 'euclidean', 'cityblock', 'manhattan', 'infinity',
                          'minkowski', 'p', 'l2', 'l1'

**Ball Tree**             'chebyshev', 'sokalmichener', 'canberra', 'haversine',
                          'rogerstanimoto', 'matching', 'dice', 'euclidean', 'braycurtis',
                          'russellrao', 'cityblock', 'manhattan', 'infinity', 'jaccard',
                          'seuclidean', 'sokalsneath', 'kulsinski', 'minkowski',
                          'mahalanobis', 'p', 'l2', 'hamming', 'l1', 'wminkowski', 'pyfunc'
========================  =================================================================

Member (on the **K-D Tree** row): We should note that modules/generated/sklearn.neighbors.DistanceMetric.html lists the measures for K-D and Ball Trees, along with their arguments. Or else we should just be pointing to that reference here rather than listing measures, so as to avoid duplication.
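
For instance (a sketch, not part of this diff), any metric listed for **Ball Tree**
above can be passed directly when building the tree:

>>> from sklearn.neighbors import BallTree
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X = rng.random_sample((10, 2))
>>> tree = BallTree(X, metric='manhattan')
>>> dist, ind = tree.query(X[:1], k=3)
>>> ind.shape
(1, 3)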

Contributor Author: You mean like it has been done in the Usage examples under that table in model_selection? So I should do something like this here?

>>> from sklearn.neighbors import KDTree
>>> print(KDTree.valid_metrics)

Sorry for being a noob.

Member: Yes, and then list the output.

A list of valid metrics for any of the above algorithms can be obtained by using their
``valid_metrics`` attribute. For example, valid metrics for ``KDTree`` can be generated by:

>>> from sklearn.neighbors import KDTree
>>> import numpy as np
>>> print(np.sort(KDTree.valid_metrics))  # doctest: +ELLIPSIS
['chebyshev' 'cityblock' 'euclidean' 'infinity' 'l1' 'l2' 'manhattan'
 'minkowski' 'p']

Contributor Author: @raghavrv I don't think we need numpy for this doctest; a normal ``sorted`` would work just fine?

Member: Indeed!!
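
A similar check works for the ball tree (a sketch, assuming ``BallTree`` exposes the
same class attribute, as the prose above suggests it does for all the algorithms):

>>> from sklearn.neighbors import BallTree
>>> print(sorted(BallTree.valid_metrics))  # doctest: +ELLIPSIS
['braycurtis', 'canberra', 'chebyshev', ...]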

Effect of ``leaf_size``
-----------------------
As noted above, for small sample sizes a brute force search can be more
@@ -458,62 +546,6 @@ leaf nodes. The level of this switch can be specified with the parameter

.. _nearest_centroid_classifier:

Nearest Centroid Classifier
===========================

The :class:`NearestCentroid` classifier is a simple algorithm that represents
each class by the centroid of its members. In effect, this makes it
similar to the label updating phase of the :class:`sklearn.KMeans` algorithm.
It also has no parameters to choose, making it a good baseline classifier. It
does, however, suffer on non-convex classes, as well as when classes have
drastically different variances, as equal variance in all dimensions is
assumed. See Linear Discriminant Analysis (:class:`sklearn.discriminant_analysis.LinearDiscriminantAnalysis`)
and Quadratic Discriminant Analysis (:class:`sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis`)
for more complex methods that do not make this assumption. Usage of the default
:class:`NearestCentroid` is simple:

>>> from sklearn.neighbors.nearest_centroid import NearestCentroid
>>> import numpy as np
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> clf = NearestCentroid()
>>> clf.fit(X, y)
NearestCentroid(metric='euclidean', shrink_threshold=None)
>>> print(clf.predict([[-0.8, -1]]))
[1]


Nearest Shrunken Centroid
-------------------------

The :class:`NearestCentroid` classifier has a ``shrink_threshold`` parameter,
which implements the nearest shrunken centroid classifier. In effect, the value
of each feature for each centroid is divided by the within-class variance of
that feature. The feature values are then reduced by ``shrink_threshold``. Most
notably, if a particular feature value crosses zero, it is set
to zero. In effect, this removes the feature from affecting the classification.
This is useful, for example, for removing noisy features.

In the example below, using a small shrink threshold increases the accuracy of
the model from 0.81 to 0.82.

.. |nearest_centroid_1| image:: ../auto_examples/neighbors/images/sphx_glr_plot_nearest_centroid_001.png
:target: ../auto_examples/neighbors/plot_nearest_centroid.html
:scale: 50

.. |nearest_centroid_2| image:: ../auto_examples/neighbors/images/sphx_glr_plot_nearest_centroid_002.png
:target: ../auto_examples/neighbors/plot_nearest_centroid.html
:scale: 50

.. centered:: |nearest_centroid_1| |nearest_centroid_2|

.. topic:: Examples:

* :ref:`sphx_glr_auto_examples_neighbors_plot_nearest_centroid.py`: an example of
classification using nearest centroid with different shrink thresholds.

.. _approximate_nearest_neighbors:

Approximate Nearest Neighbors
=============================
