scikit-learn
diff --git a/doc/modules/clustering.rst b/doc/modules/clustering.rst
Lines changed: 138 additions & 0 deletions
diff --git a/sklearn/metrics/__init__.py b/sklearn/metrics/__init__.py
Lines changed: 2 additions & 0 deletions
diff --git a/sklearn/metrics/cluster/__init__.py b/sklearn/metrics/cluster/__init__.py
Lines changed: 1 addition & 4 deletions
diff --git a/sklearn/metrics/cluster/tests/test_unsupervised.py b/sklearn/metrics/cluster/tests/test_unsupervised.py
Lines changed: 0 additions & 40 deletions
diff --git a/sklearn/metrics/cluster/unsupervised.py b/sklearn/metrics/cluster/unsupervised.py
Lines changed: 0 additions & 101 deletions
@@ -1339,6 +1339,75 @@ mean of homogeneity and completeness**:
    <http://www.cs.columbia.edu/~hila/hila-thesis-distributed.pdf>`_, Hila
    Becker, PhD Thesis.
 
+.. _fowlkes_mallows_scores:
+
+Fowlkes-Mallows scores
+----------------------
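+
+The Fowlkes-Mallows index (:func:`sklearn.metrics.fowlkes_mallows_score`) can
+be used when the ground truth class assignments of the samples are known. The
+score is the geometric mean of the pairwise precision and recall, so it
+ranges from 0 to 1 and a higher value indicates greater similarity between
+the two labelings::
+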
+  >>> from sklearn import metrics
+  >>> labels_true = [0, 0, 0, 1, 1, 1]
+  >>> labels_pred = [0, 0, 1, 1, 2, 2]
+
+  >>> metrics.fowlkes_mallows_score(labels_true, labels_pred)  # doctest: +ELLIPSIS
+  0.47140...
+
+One can permute 0 and 1 in the predicted labels, rename 2 to 3 and get
+the same score::
+
+  >>> labels_pred = [1, 1, 0, 0, 3, 3]
+
+  >>> metrics.fowlkes_mallows_score(labels_true, labels_pred)  # doctest: +ELLIPSIS
+  0.47140...
+
+Perfect labeling is scored 1.0::
+
+  >>> labels_pred = labels_true[:]
+  >>> metrics.fowlkes_mallows_score(labels_true, labels_pred)  # doctest: +ELLIPSIS
+  1.0
+
+Bad labelings (e.g. independent labelings) have zero scores::
+
+  >>> labels_true = [0, 1, 2, 0, 3, 4, 5, 1]
+  >>> labels_pred = [1, 1, 0, 0, 2, 2, 2, 2]
+  >>> metrics.fowlkes_mallows_score(labels_true, labels_pred)  # doctest: +ELLIPSIS
+  0.0
+
+Advantages
+~~~~~~~~~~
+
+- **Random (uniform) label assignments have an FMI score close to 0.0**
+  for any value of ``n_clusters`` and ``n_samples`` (which is not the
+  case for raw Mutual Information or the V-measure, for instance).
+
+- **Bounded range [0, 1]**: Values close to zero indicate two label
+  assignments that are largely independent, while values close to one
+  indicate significant agreement. Further, a value of exactly 0 indicates
+  **purely** independent label assignments and an FMI of exactly 1 indicates
+  that the two label assignments are equal (with or without permutation).
+
+- **No assumption is made on the cluster structure**: the score can be used
+  to compare clustering algorithms such as k-means, which assumes isotropic
+  blob shapes, with results of spectral clustering algorithms, which can
+  find clusters with "folded" shapes.
+
+
+Drawbacks
+~~~~~~~~~
+
+- Contrary to inertia, **FMI-based measures require the knowledge
+  of the ground truth classes**, which are almost never available in practice
+  or require manual assignment by human annotators (as in the supervised
+  learning setting).
+
+.. topic:: References
+
+  * E. B. Fowlkes and C. L. Mallows, 1983. "A method for comparing two
+    hierarchical clusterings". Journal of the American Statistical Association.
+    http://wildfire.stat.ucla.edu/pdflibrary/fowlkes.pdf
+
+  * `Wikipedia entry for the Fowlkes-Mallows Index
+    <https://en.wikipedia.org/wiki/Fowlkes-Mallows_index>`_
+
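Review note: the doctests above can be cross-checked by hand. The sketch below assumes the usual pair-counting definition, FMI = TP / sqrt((TP + FP) * (TP + FN)), where TP counts pairs of samples grouped together in both labelings, FP pairs grouped together only in the predicted labeling, and FN pairs grouped together only in the true labeling. The helper name `fowlkes_mallows_by_pairs` is hypothetical and is not part of this PR; it is a brute-force illustration, not the library implementation.

```python
from itertools import combinations
from math import sqrt


def fowlkes_mallows_by_pairs(labels_true, labels_pred):
    """Brute-force FMI over all sample pairs (hypothetical helper)."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1  # pair grouped together in both labelings
        elif same_pred:
            fp += 1  # grouped together only in the prediction
        elif same_true:
            fn += 1  # grouped together only in the ground truth
    if tp == 0:
        # No pair agrees; treat the score as 0 (assumption of this sketch).
        return 0.0
    return tp / sqrt((tp + fp) * (tp + fn))


labels_true = [0, 0, 0, 1, 1, 1]
score = fowlkes_mallows_by_pairs(labels_true, [0, 0, 1, 1, 2, 2])
# Permuting or renaming predicted labels leaves the pair counts unchanged.
permuted = fowlkes_mallows_by_pairs(labels_true, [1, 1, 0, 0, 3, 3])
perfect = fowlkes_mallows_by_pairs(labels_true, labels_true[:])
```

For the first example the counts are TP = 2, FP = 1 and FN = 4, which gives 2 / sqrt(18) and matches the `0.47140...` shown in the doctest. Only label equality within each labeling enters the counts, which is why renaming labels does not change the score.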
 .. _silhouette_coefficient:
 
 Silhouette Coefficient
@@ -1413,3 +1482,72 @@ Drawbacks
 
  * :ref:`example_cluster_plot_kmeans_silhouette_analysis.py` : In this example
    the silhouette analysis is used to choose an optimal value for n_clusters.
+
+.. _calinski_harabaz_index:
+
+Calinski-Harabaz Index
+----------------------
+
+If the ground truth labels are not known, the Calinski-Harabaz index
+(:func:`sklearn.metrics.calinski_harabaz_score`) can be used to evaluate the
+model, where a higher Calinski-Harabaz score relates to a model with better
+defined clusters.
+
+For :math:`k` clusters, the Calinski-Harabaz score :math:`ch` is given as the
+ratio of the mean between-cluster dispersion to the mean within-cluster
+dispersion:
+
+.. math::
+
+  ch(k) = \frac{\mathrm{trace}(B_k)}{\mathrm{trace}(W_k)} \times \frac{N - k}{k - 1} \\
+  W_k = \sum_{q=1}^k \sum_{x \in C_q} (x - c_q) (x - c_q)^T \\
+  B_k = \sum_{q=1}^k n_q (c_q - c) (c_q - c)^T
+
+where:
+
+- :math:`N` is the number of points in the data,
+- :math:`C_q` is the set of points in cluster :math:`q`,
+- :math:`c_q` is the center of cluster :math:`q`,
+- :math:`c` is the center of the whole data set :math:`E`,
+- :math:`n_q` is the number of points in cluster :math:`q`.
+
+
+  >>> from sklearn import metrics
+  >>> from sklearn.metrics import pairwise_distances
+  >>> from sklearn import datasets
+  >>> dataset = datasets.load_iris()
+  >>> X = dataset.data
+  >>> y = dataset.target
+
+In normal usage, the Calinski-Harabaz index is applied to the results of a
+cluster analysis.
+
+  >>> import numpy as np
+  >>> from sklearn.cluster import KMeans
+  >>> kmeans_model = KMeans(n_clusters=3, random_state=1).fit(X)
+  >>> labels = kmeans_model.labels_
+  >>> metrics.calinski_harabaz_score(X, labels)
+  ...                                                      # doctest: +ELLIPSIS
+  560.39...
+
+
+.. topic:: References
+
+  * Caliński, T., & Harabasz, J. (1974). "A dendrite method for cluster
+    analysis". Communications in Statistics-theory and Methods 3: 1-27.
+    `doi:10.1080/03610926.2011.560741 <http://dx.doi.org/10.1080/03610926.2011.560741>`_.
+
+
+Advantages
+~~~~~~~~~~
+
+- The score is higher when clusters are dense and well separated, which relates
+  to a standard concept of a cluster.
+
+- The score is fast to compute.
+
+
+Drawbacks
+~~~~~~~~~
+
+- The Calinski-Harabaz index is generally higher for convex clusters than for
+  other concepts of clusters, such as density-based clusters like those
+  obtained through DBSCAN.
+
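Review note: the formula in the new documentation section can be checked on a tiny data set without any dependencies. The sketch below relies on the fact that the trace of an outer product (x - c)(x - c)^T equals the squared Euclidean distance between x and c, so both traces reduce to sums of squared distances. The function name `calinski_harabaz_sketch` is hypothetical and is not the function touched by this PR; returning 1.0 when the within-cluster dispersion is zero mirrors the `return` statement visible in the `unsupervised.py` hunk.

```python
def calinski_harabaz_sketch(X, labels):
    """Pure-Python Calinski-Harabaz score (hypothetical helper)."""
    n_samples = len(X)
    n_features = len(X[0])

    # Group the points by cluster label.
    clusters = {}
    for point, label in zip(X, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if not 1 < k < n_samples:
        raise ValueError("need between 2 and n_samples - 1 clusters")

    def mean(points):
        return [sum(p[d] for p in points) / len(points)
                for d in range(n_features)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c = mean(X)      # center of the whole data set
    trace_b = 0.0    # between-cluster dispersion, trace(B_k)
    trace_w = 0.0    # within-cluster dispersion, trace(W_k)
    for points in clusters.values():
        c_q = mean(points)
        trace_b += len(points) * sq_dist(c_q, c)
        trace_w += sum(sq_dist(x, c_q) for x in points)

    if trace_w == 0.0:
        return 1.0
    return trace_b * (n_samples - k) / (trace_w * (k - 1))


X = [[0, 0], [2, 2], [5, 5], [6, 6]]
labels = [0, 0, 1, 1]
score = calinski_harabaz_sketch(X, labels)
```

On this data trace(B_k) = 40.5 and trace(W_k) = 5, so the score is 40.5 * (4 - 2) / (5 * (2 - 1)) = 16.2. The value is not bounded above by 1, which is consistent with the removal of "The score ranges from 0 to 1" from the docstring in this diff.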
@@ -39,8 +39,10 @@
 from .cluster import homogeneity_score
 from .cluster import mutual_info_score
 from .cluster import normalized_mutual_info_score
+from .cluster import fowlkes_mallows_score
 from .cluster import silhouette_samples
 from .cluster import silhouette_score
+from .cluster import calinski_harabaz_score
 from .cluster import v_measure_score
 
 from .pairwise import euclidean_distances
 
@@ -20,14 +20,11 @@
 from .unsupervised import silhouette_samples
 from .unsupervised import silhouette_score
 from .unsupervised import calinski_harabaz_score
-from .unsupervised import distortion_score
-from .unsupervised import inertia_score
 from .bicluster import consensus_score
 
 __all__ = ["adjusted_mutual_info_score", "normalized_mutual_info_score",
            "adjusted_rand_score", "completeness_score", "contingency_matrix",
            "expected_mutual_information", "homogeneity_completeness_v_measure",
            "homogeneity_score", "mutual_info_score", "v_measure_score",
            "fowlkes_mallows_score", "entropy", "silhouette_samples",
-           "silhouette_score", "calinski_harabaz_score", "distortion_score",
-           "inertia_score", "consensus_score"]
+           "silhouette_score", "calinski_harabaz_score", "consensus_score"]
@@ -9,8 +9,6 @@
 from sklearn.utils.testing import assert_raise_message
 from sklearn.metrics.cluster import silhouette_score
 from sklearn.metrics.cluster import calinski_harabaz_score
-from sklearn.metrics.cluster import distortion_score
-from sklearn.metrics.cluster import inertia_score
 from sklearn.metrics import pairwise_distances
 
 
@@ -119,41 +117,3 @@ def test_calinski_harabaz_score():
     labels = [0] * 10 + [1] * 10 + [2] * 10 + [3] * 10
     assert_almost_equal(calinski_harabaz_score(X, labels),
                         45 * (40 - 4) / (5 * (4 - 1)))
-
-
-def test_distortion_score():
-    rng = np.random.RandomState(seed=0)
-
-    # Assert message when there is only one label
-    assert_raise_message(ValueError, "Number of labels is",
-                         distortion_score,
-                         rng.rand(10, 2), np.zeros(10))
-
-    # Assert message when all point are in different clusters
-    assert_raise_message(ValueError, "Number of labels is",
-                         distortion_score,
-                         rng.rand(10, 2), np.arange(10))
-
-    X = np.array([[0, 0], [2, 2],
-                  [5, 5], [6, 6]])
-    labels = [0, 0, 1, 1]
-    assert_almost_equal(distortion_score(X, labels), 1.5 * np.sqrt(2.))
-
-
-def test_inertia_score():
-    rng = np.random.RandomState(seed=0)
-
-    # Assert message when there is only one label
-    assert_raise_message(ValueError, "Number of labels is",
-                         inertia_score,
-                         rng.rand(10, 2), np.zeros(10))
-
-    # Assert message when all point are in different clusters
-    assert_raise_message(ValueError, "Number of labels is",
-                         inertia_score,
-                         rng.rand(10, 2), np.arange(10))
-
-    X = np.array([[0, 0], [2, 2],
-                  [5, 5], [6, 6]])
-    labels = [0, 0, 1, 1]
-    assert_almost_equal(inertia_score(X, labels), 1.5 / np.sqrt(2.))
@@ -222,8 +222,6 @@ def calinski_harabaz_score(X, labels):
         B_K = \sum_k n_k (c_k - c) (c_k -c)^T
         W_K = \sum_k \sum_{x \in C_k} (x - c_k) (x - c_k)^T
 
-    The score ranges from 0 to 1.
-
     Parameter
     ---------
     X : array-like, shape (n_samples, n_features)
@@ -264,102 +262,3 @@ def calinski_harabaz_score(X, labels):
     return (1. if intra_disp == 0. else
             extra_disp * (n_samples - n_labels) /
             (intra_disp * (n_labels - 1.)))
-
-
-def distortion_score(X, labels, metric="euclidean", **kwds):
-    """Compute the distortion of a given dataset and their cluster assignment.
-
-    Parameters
-    ----------
-    X : array-like, shape (n_samples, n_features)
-        List of n_features-dimensional data points. Each row corresponds
-        to a single data point.
-
-    labels : array-like, shape (n_samples,)
-        Predicted labels for each sample.
-
-    metric : string, or callable
-        The metric to use when calculating distance between instances in a
-        feature array. If metric is a string, it must be one of the options
-        allowed by :func:`sklearn.metrics.pairwise.pairwise_distances`.
-
-    **kwds : optional keyword parameters
-        Any further parameters are passed directly to the distance function.
-        If using a scipy.spatial.distance metric, the parameters are still
-        metric dependent. See the scipy docs for usage examples.
-
-    Returns
-    -------
-    score: float
-        The resulting distortion value.
-    """
-    X, labels = check_X_y(X, labels)
-    le = LabelEncoder()
-    labels = le.fit_transform(labels)
-
-    n_samples, n_features = X.shape
-    n_labels = len(le.classes_)
-
-    check_number_of_labels(n_labels, n_samples)
-
-    dist = 0.
-    for k in range(n_labels):
-        cluster_k = X[labels == k]
-        mean_k = np.mean(cluster_k, axis=0)
-        dist += np.sum(pairwise_distances(cluster_k,
-                                          mean_k[:, np.newaxis].reshape(1, -1),
-                                          metric=metric, **kwds))
-
-    return dist / n_features
-
-
-def inertia_score(X, labels, metric="euclidean", **kwds):
-    """Compute the inertia of a given dataset and their cluster assignment.
-
-    The inertia is defined as the sum of distances of samples to their closest
-    cluster center.
-
-    Parameters
-    ----------
-    X : array-like, shape (n_samples, n_features)
-        List of n_features-dimensional data points. Each row corresponds
-        to a single data point.
-
-    labels : array-like, shape (n_samples,)
-        Predicted labels for each sample.
-
-    metric : string, or callable
-        The metric to use when calculating distance between instances in a
-        feature array. If metric is a string, it must be one of the options
-        allowed by :func:`sklearn.metrics.pairwise.pairwise_distances`.
-
-    **kwds : optional keyword parameters
-        Any further parameters are passed directly to the distance function.
-        If using a scipy.spatial.distance metric, the parameters are still
-        metric dependent. See the scipy docs for usage examples.
-
-    Returns
-    -------
-    score: float
-        The resulting inertia value.
-    """
-    X, labels = check_X_y(X, labels)
-    le = LabelEncoder()
-    labels = le.fit_transform(labels)
-
-    n_samples, n_features = X.shape
-    n_labels = len(le.classes_)
-
-    if not 1 < n_labels < n_samples:
-        raise ValueError("Number of labels is %d. Valid values are 2 "
-                         "to n_samples - 1 (inclusive)" % n_labels)
-
-    inertia = 0.
-    for k in range(n_labels):
-        cluster_k = X[labels == k]
-        mean_k = np.mean(cluster_k, axis=0)
-        inertia += (np.sum(pairwise_distances(
-            cluster_k, mean_k[:, np.newaxis].reshape(1, -1),
-            metric=metric, **kwds)) / (2 * len(cluster_k)))
-
-    return inertia