DOC Add references for multiclass balanced-accuracy definitions (#9982) · jwjohnson314/scikit-learn@ef7d4cd

Commit ef7d4cd

maskani-moh authored and Jeremiah Johnson committed
DOC Add references for multiclass balanced-accuracy definitions (scikit-learn#9982)
1 parent e2cca97 commit ef7d4cd

1 file changed: +32 -0 lines changed

doc/modules/model_evaluation.rst

Lines changed: 32 additions & 0 deletions
@@ -461,6 +461,38 @@ given binary ``y_true`` and ``y_pred``:
 Currently this score function is only defined for binary classification problems, you
 may need to wrap it by yourself if you want to use it for multilabel problems.

+There is no clear consensus on the definition of a balanced accuracy for the
+multiclass setting. Here are some definitions that can be found in the literature:
+
+* Normalized class-wise accuracy average as described in [Guyon2015]_: for multi-class
+  classification problem, each sample is assigned the class with maximum prediction value.
+  The predictions are then binarized to compute the accuracy of each class on a
+  one-vs-rest fashion. The balanced accuracy is obtained by averaging the individual
+  accuracies over all classes and then normalized by the expected value of balanced
+  accuracy for random predictions (:math:`0.5` for binary classification, :math:`1/C`
+  for C-class classification problem).
+* Macro-average recall as described in [Mosley2013]_ and [Kelleher2015]_: the recall
+  for each class is computed independently and the average is taken over all classes.
+
+Note that none of these different definitions are currently implemented within
+the :func:`balanced_accuracy_score` function. However, the macro-averaged recall
+is implemented in :func:`sklearn.metrics.recall_score`: set ``average`` parameter
+to ``"macro"``.
+
+.. topic:: References:
+
+  .. [Guyon2015] I. Guyon, K. Bennett, G. Cawley, H.J. Escalante, S. Escalera, T.K. Ho, N. Macià,
+     B. Ray, M. Saeed, A.R. Statnikov, E. Viegas, `Design of the 2015 ChaLearn AutoML Challenge
+     <http://ieeexplore.ieee.org/document/7280767/>`_,
+     IJCNN 2015.
+  .. [Mosley2013] L. Mosley, `A balanced approach to the multi-class imbalance problem
+     <http://lib.dr.iastate.edu/etd/13537/>`_,
+     IJCV 2010.
+  .. [Kelleher2015] John. D. Kelleher, Brian Mac Namee, Aoife D'Arcy, `Fundamentals of
+     Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples,
+     and Case Studies <https://mitpress.mit.edu/books/fundamentals-machine-learning-predictive-data-analytics>`_,
+     2015.
+
 .. _cohen_kappa:

 Cohen's kappa
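
The two multiclass definitions cited in the added documentation can be illustrated with a short Python sketch. It is not part of the commit. The helper names ``macro_recall`` and ``chance_normalized`` and the toy labels are hypothetical. The rescaling ``(score - 1/C) / (1 - 1/C)`` is an assumption about how the normalization by the chance level in [Guyon2015]_ is meant to be applied, and the one-vs-rest class-wise accuracy is approximated here by per-class recall. Only ``confusion_matrix`` and ``recall_score`` are existing scikit-learn functions.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, recall_score


def macro_recall(y_true, y_pred):
    """Unweighted mean of the per-class recalls (hypothetical helper).

    Assumes every class in the confusion matrix occurs at least once in
    y_true; otherwise a row sum is zero and the division is undefined.
    """
    cm = confusion_matrix(y_true, y_pred)
    # Diagonal: correctly predicted samples per class.
    # Row sums: number of true samples per class.
    per_class_recall = np.diag(cm) / cm.sum(axis=1)
    return per_class_recall.mean()


def chance_normalized(score, n_classes):
    """Rescale so that chance level (1/C) maps to 0 and a perfect score to 1.

    Assumed form of the normalization; hypothetical helper.
    """
    chance = 1.0 / n_classes
    return (score - chance) / (1.0 - chance)


# Toy, imbalanced 3-class labels (illustrative data only).
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2]

manual = macro_recall(y_true, y_pred)  # (0.75 + 1 + 1) / 3
builtin = recall_score(y_true, y_pred, average="macro")
normalized = chance_normalized(manual, n_classes=3)
```

On this toy data the hand-computed macro recall and ``recall_score(..., average="macro")`` agree at 11/12, and the chance-normalized value is 0.875.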
