scikit-learn · amueller · Nov 15, 2017 · Oct 23, 2017 · Oct 24, 2017 · Oct 25, 2017
diff --git a/doc/modules/model_evaluation.rst b/doc/modules/model_evaluation.rst
@@ -464,20 +464,19 @@ given binary ``y_true`` and ``y_pred``:
     There is no clear consensus on the definition of a balanced accuracy for the
     multiclass setting. Here are some definitions that can be found in the literature:
 
-    * Normalized class-wise accuracy average as described in [Guyon2015]_: for multi-class
-      classification problem, each sample is assigned the class with maximum prediction value.
-      The predictions are then binarized to compute the accuracy of each class on a
-      one-vs-rest fashion. The balanced accuracy is obtained by averaging the individual
-      accuracies over all classes and then normalized by the expected value of balanced
-      accuracy for random predictions (:math:`0.5` for binary classification, :math:`1/C`
-      for C-class classification problem).
-    * Macro-average recall as described in [Mosley2013]_ and [Kelleher2015]_: the recall
-      for each class is computed independently and the average is taken over all classes.
+    * Macro-average recall as described in [Mosley2013]_, [Kelleher2015]_ and [Guyon2015]_:
+      the recall for each class is computed independently and the average is taken over all classes.
+      In [Guyon2015]_, the macro-average recall is then adjusted to ensure that random predictions
+      have a score of :math:`0` while perfect predictions have a score of :math:`1`.
+      The macro-average recall can be computed with :func:`recall_score` by setting ``average="macro"``.
+    * Class balanced accuracy as described in [Mosley2013]_: the minimum of the precision
+      and the recall is computed for each class. These values are then averaged over the total
+      number of classes to get the balanced accuracy.
+    * Balanced accuracy as described in [Urbanowicz2015]_: the average of sensitivity and selectivity
+      is computed for each class and then averaged over the total number of classes.
 
     Note that none of these different definitions are currently implemented within
-    the :func:`balanced_accuracy_score` function. However, the macro-averaged recall
-    is implemented in :func:`sklearn.metrics.recall_score`: set ``average`` parameter
-    to ``"macro"``.
+    the :func:`balanced_accuracy_score` function.
 
 .. topic:: References:
 
@@ -492,6 +491,8 @@ given binary ``y_true`` and ``y_pred``:
      Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples,
      and Case Studies <https://mitpress.mit.edu/books/fundamentals-machine-learning-predictive-data-analytics>`_,
      2015.
+  .. [Urbanowicz2015] Urbanowicz R.J., Moore J.H. `ExSTraCS 2.0: description and evaluation of a scalable learning
+     classifier system <https://doi.org/10.1007/s12065-015-0128-8>`_, Evol. Intel. (2015) 8: 89.
 
 .. _cohen_kappa: