Merge pull request #3527 from jnothman/unnest-model-evaluation · scikit-learn/scikit-learn@22cafa6 · GitHub

Commit 22cafa6

Merge pull request #3527 from jnothman/unnest-model-evaluation

[MRG] DOC A less-nested coverage of model evaluation

2 parents 0a7bef6 + e65d907 · commit 22cafa6

File tree: 1 file changed, +54 −67 lines

doc/modules/model_evaluation.rst

Lines changed: 54 additions & 67 deletions
@@ -20,8 +20,10 @@ model:
   This is discussed on section :ref:`scoring_parameter`.
 
 * **Metric functions**: The :mod:`metrics` module implements functions
-  assessing prediction errors for specific purposes. This is discussed in
-  the section :ref:`prediction_error_metrics`.
+  assessing prediction error for specific purposes. These metrics are detailed
+  in sections on :ref:`classification_metrics`,
+  :ref:`multilabel_ranking_metrics`, :ref:`regression_metrics` and
+  :ref:`clustering_metrics`.
 
 Finally, :ref:`dummy_estimators` are useful to get a baseline
 value of those metrics for random predictions.
@@ -42,7 +44,7 @@ Model selection and evaluation using tools, such as
 controls what metric they apply to estimators evaluated.
 
 Common cases: predefined values
---------------------------------
+-------------------------------
 
 For the most common usecases, you can simply provide a string as the
 ``scoring`` parameter. Possible values are:
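For reference (editor's sketch, not part of this diff), passing one of these strings looks like the following, assuming ``'accuracy'`` is among the listed values; the ``cross_val_score`` import path is the one from this era of the tree (it later moved to ``sklearn.model_selection``), and the resulting scores are elided::

    >>> from sklearn.cross_validation import cross_val_score
    >>> from sklearn.svm import LinearSVC
    >>> from sklearn.datasets import load_iris
    >>> iris = load_iris()
    >>> cross_val_score(LinearSVC(), iris.data, iris.target, scoring='accuracy')
    array([...])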
@@ -91,22 +93,31 @@ predicted values. These are detailed below, in the next sections.
 
 .. _scoring:
 
-Defining your scoring strategy from score functions
+Defining your scoring strategy from metric functions
 -----------------------------------------------------
 
-The scoring parameter can be a callable that takes model predictions and
-ground truth.
+The module :mod:`sklearn.metric` also exposes a set of simple functions
+measuring a prediction error given ground truth and prediction:
 
-However, if you want to use a scoring function that takes additional parameters, such as
-:func:`fbeta_score`, you need to generate an appropriate scoring object. The
-simplest way to generate a callable object for scoring is by using
-:func:`make_scorer`.
-That function converts score functions (discussed below in :ref:`prediction_error_metrics`) into callables that can be
-used for model evaluation.
+- functions ending with ``_score`` return a value to
+  maximize (the higher the better).
 
-One typical use case is to wrap an existing scoring function from the library
-with non default value for its parameters such as the ``beta`` parameter for the
-:func:`fbeta_score` function::
+- functions ending with ``_error`` or ``_loss`` return a
+  value to minimize (the lower the better).
+
+Metrics available for various machine learning tasks are detailed in sections
+below.
+
+Many metrics are not given names to be used as ``scoring`` values,
+sometimes because they require additional parameters, such as
+:func:`fbeta_score`. In such cases, you need to generate an appropriate
+scoring object. The simplest way to generate a callable object for scoring
+is by using :func:`make_scorer`. That function converts metrics
+into callables that can be used for model evaluation.
+
+One typical use case is to wrap an existing metric function from the library
+with non default value for its parameters, such as the ``beta`` parameter for
+the :func:`fbeta_score` function::
 
     >>> from sklearn.metrics import fbeta_score, make_scorer
    &nbs 8000 p;>>> ftwo_scorer = make_scorer(fbeta_score, beta=2)
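For reference (not part of this diff's context): a scorer built this way is passed straight to model-selection tools; a typical continuation, using the era's ``sklearn.grid_search`` path (later ``sklearn.model_selection``)::

    >>> from sklearn.grid_search import GridSearchCV
    >>> from sklearn.svm import LinearSVC
    >>> grid = GridSearchCV(LinearSVC(), param_grid={'C': [1, 10]}, scoring=ftwo_scorer)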
@@ -138,6 +149,8 @@ from a simple python function::
 * any additional parameters, such as ``beta`` in an :func:`f1_score`.
 
 
+.. _diy_scoring:
+
 Implementing your own scoring object
 ------------------------------------
 You can generate even more flexible model scores by constructing your own
@@ -154,24 +167,10 @@ the following two rules:
   ``estimator``'s predictions on ``X`` which reference to ``y``.
   Again, higher numbers are better.
 
-.. _prediction_error_metrics:
-
-Function for prediction-error metrics
-======================================
-
-The module :mod:`sklearn.metric` also exposes a set of simple functions
-measuring a prediction error given ground truth and prediction:
-
-- functions ending with ``_score`` return a value to
-  maximize (the higher the better).
-
-- functions ending with ``_error`` or ``_loss`` return a
-  value to minimize (the lower the better).
-
 .. _classification_metrics:
 
 Classification metrics
------------------------
+=======================
 
 .. currentmodule:: sklearn.metrics
 
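For reference (editor's sketch, not part of this diff): per the two rules quoted at the top of the hunk above, a scoring object is just a callable taking ``(estimator, X, y)`` and returning a number where higher is better; ``my_custom_scorer`` is a hypothetical name::

    >>> import numpy as np
    >>> def my_custom_scorer(estimator, X, y):
    ...     # hypothetical scorer: higher must mean better,
    ...     # so negate the mean squared error of the predictions
    ...     return -np.mean((estimator.predict(X) - y) ** 2)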
@@ -228,7 +227,7 @@ And some work with binary and multilabel indicator format:
 In the following sub-sections, we will describe each of those functions.
 
 Accuracy score
-..............
+--------------
 
 The :func:`accuracy_score` function computes the
 `accuracy <http://en.wikipedia.org/wiki/Accuracy_and_precision>`_, the fraction
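For reference (not part of this diff), a minimal :func:`accuracy_score` sketch with hand-checkable values; two of four predictions match::

    >>> from sklearn.metrics import accuracy_score
    >>> y_pred = [0, 2, 1, 3]
    >>> y_true = [0, 1, 2, 3]
    >>> accuracy_score(y_true, y_pred)
    0.5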
@@ -271,7 +270,7 @@ In the multilabel case with binary label indicators: ::
 the dataset.
 
 Confusion matrix
-................
+----------------
 
 The :func:`confusion_matrix` function computes the `confusion matrix
 <http://en.wikipedia.org/wiki/Confusion_matrix>`_ to evaluate
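For reference (not part of this diff), a minimal sketch; entry ``(i, j)`` counts samples of true class ``i`` predicted as class ``j``::

    >>> from sklearn.metrics import confusion_matrix
    >>> y_true = [2, 0, 2, 2, 0, 1]
    >>> y_pred = [0, 0, 2, 2, 0, 2]
    >>> confusion_matrix(y_true, y_pred)
    array([[2, 0, 0],
           [0, 0, 1],
           [1, 0, 2]])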
@@ -313,7 +312,7 @@ from the :ref:`example_model_selection_plot_confusion_matrix.py` example):
 
 
 Classification report
-......................
+----------------------
 
 The :func:`classification_report` function builds a text report showing the
 main classification metrics. Here a small example with custom ``target_names``
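The example announced here falls outside the hunk's context; for reference (editor's sketch, not part of this diff), a minimal call, with the printed table omitted since its layout varies across releases::

    >>> from sklearn.metrics import classification_report
    >>> y_true = [0, 1, 2, 2, 0]
    >>> y_pred = [0, 0, 2, 2, 0]
    >>> print(classification_report(y_true, y_pred,
    ...                             target_names=['class 0', 'class 1', 'class 2']))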
@@ -348,7 +347,7 @@ and inferred labels::
 grid search with a nested cross-validation.
 
 Hamming loss
-.............
+-------------
 
 The :func:`hamming_loss` computes the average Hamming loss or `Hamming
 distance <http://en.wikipedia.org/wiki/Hamming_distance>`_ between two sets
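For reference (not part of this diff), a minimal sketch: one of four labels differs, so the loss is 0.25::

    >>> from sklearn.metrics import hamming_loss
    >>> y_pred = [1, 2, 3, 4]
    >>> y_true = [2, 2, 3, 4]
    >>> hamming_loss(y_true, y_pred)
    0.25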
@@ -395,7 +394,7 @@ In the multilabel case with binary label indicators: ::
 
 
 Jaccard similarity coefficient score
-.....................................
+-------------------------------------
 
 The :func:`jaccard_similarity_score` function computes the average (default)
 or sum of `Jaccard similarity coefficients
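For reference (not part of this diff), a minimal sketch; in the plain multiclass case the result coincides with accuracy (in much later releases this function was superseded by :func:`jaccard_score`)::

    >>> from sklearn.metrics import jaccard_similarity_score
    >>> y_pred = [0, 2, 1, 3]
    >>> y_true = [0, 1, 2, 3]
    >>> jaccard_similarity_score(y_true, y_pred)
    0.5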
@@ -432,7 +431,7 @@ In the multilabel case with binary label indicators: ::
 .. _precision_recall_f_measure_metrics:
 
 Precision, recall and F-measures
-.................................
+---------------------------------
 
 The `precision <http://en.wikipedia.org/wiki/Precision_and_recall#Precision>`_
 is intuitively the ability of the classifier not to label as
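For reference (editor's sketch, not part of this diff), a minimal binary-case run of the related score functions; with one true positive, no false positives and one false negative, precision is 1, recall is 0.5 and F1 is their harmonic mean::

    >>> from sklearn import metrics
    >>> y_pred = [0, 1, 0, 0]
    >>> y_true = [0, 1, 0, 1]
    >>> metrics.precision_score(y_true, y_pred)
    1.0
    >>> metrics.recall_score(y_true, y_pred)
    0.5
    >>> metrics.f1_score(y_true, y_pred)
    0.66...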
@@ -639,7 +638,7 @@ Then the metrics are defined as:
 
 
 Hinge loss
-...........
+-----------
 
 The :func:`hinge_loss` function computes the average
 `hinge loss function <http://en.wikipedia.org/wiki/Hinge_loss>`_. The hinge
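For reference (editor's sketch, not part of this diff), :func:`hinge_loss` on hand-made decision values rather than a fitted classifier: the margins are 2, 1.5 and 0.5, so the per-sample losses are 0, 0 and 0.5, and the mean is 1/6::

    >>> import numpy as np
    >>> from sklearn.metrics import hinge_loss
    >>> y_true = [-1, 1, 1]
    >>> pred_decision = np.array([-2.0, 1.5, 0.5])  # illustrative decision values
    >>> hinge_loss(y_true, pred_decision)
    0.16...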
@@ -673,7 +672,8 @@ with a svm classifier::
 
 
 Log loss
-........
+--------
+
 The log loss, also called logistic regression loss or cross-entropy loss,
 is a loss function defined on probability estimates.
 It is commonly used in (multinomial) logistic regression and neural networks,
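For reference (not part of this diff), a minimal sketch on binary probability estimates; the result is the mean negative log of the probability assigned to each true class::

    >>> from sklearn.metrics import log_loss
    >>> y_true = [0, 0, 1, 1]
    >>> y_pred = [[.9, .1], [.8, .2], [.3, .7], [.01, .99]]
    >>> log_loss(y_true, y_pred)
    0.1738...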
@@ -725,7 +725,7 @@ The log loss is non-negative.
 
 
 Matthews correlation coefficient
-.................................
+---------------------------------
 
 The :func:`matthews_corrcoef` function computes the Matthew's correlation
 coefficient (MCC) for binary classes (quoting the `Wikipedia article on the
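For reference (not part of this diff), a minimal sketch; with 2 true positives, 1 false positive, 1 false negative and no true negatives, the coefficient works out to -1/3::

    >>> from sklearn.metrics import matthews_corrcoef
    >>> y_true = [+1, +1, +1, -1]
    >>> y_pred = [+1, -1, +1, +1]
    >>> matthews_corrcoef(y_true, y_pred)
    -0.33...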
@@ -761,7 +761,7 @@ function:
 .. _roc_metrics:
 
 Receiver operating characteristic (ROC)
-.......................................
+---------------------------------------
 
 The function :func:`roc_curve` computes the `receiver operating characteristic
 curve, or ROC curve (quoting
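For reference (editor's sketch, not part of this diff); ``pos_label`` marks which class counts as positive, and since the exact number of returned points varies across releases, the outputs are left as comments::

    >>> import numpy as np
    >>> from sklearn.metrics import roc_curve
    >>> y = np.array([1, 1, 2, 2])
    >>> scores = np.array([0.1, 0.4, 0.35, 0.8])
    >>> fpr, tpr, thresholds = roc_curve(y, scores, pos_label=2)
    >>> # for these scores the curve passes through the (fpr, tpr) points
    >>> # (0, 0.5), (0.5, 0.5), (0.5, 1) and (1, 1)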
@@ -857,7 +857,7 @@ if predicted outputs have been binarized.
 .. _zero_one_loss:
 
 Zero one loss
-..............
+--------------
 
 The :func:`zero_one_loss` function computes the sum or the average of the 0-1
 classification loss (:math:`L_{0-1}`) over :math:`n_{\text{samples}}`. By
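For reference (not part of this diff), a minimal sketch mirroring the Hamming example above, one mismatch out of four; ``normalize=False`` returns the count instead of the fraction::

    >>> from sklearn.metrics import zero_one_loss
    >>> y_pred = [1, 2, 3, 4]
    >>> y_true = [2, 2, 3, 4]
    >>> zero_one_loss(y_true, y_pred)
    0.25
    >>> zero_one_loss(y_true, y_pred, normalize=False)
    1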
@@ -903,7 +903,7 @@ In the multilabel case with binary label indicators: ::
 .. _multilabel_ranking_metrics:
 
 Multilabel ranking metrics
---------------------------
+==========================
 
 .. currentmodule:: sklearn.metrics
 
@@ -912,7 +912,8 @@ associated with it. The goal is to give high scores and better rank to
 the ground truth labels.
 
 Label ranking average precision
-...............................
+-------------------------------
+
 The :func:`label_ranking_average_precision_score` function
 implements the label ranking average precision (LRAP). This metric is linked to
 the :func:`average_precision_score` function, but is based on the notion of
@@ -955,7 +956,7 @@ Here a small example of usage of this function::
 .. _regression_metrics:
 
 Regression metrics
--------------------
+===================
 
 .. currentmodule:: sklearn.metrics
 
@@ -966,7 +967,7 @@ to handle the multioutput case: :func:`mean_absolute_error`,
 
 
 Explained variance score
-.........................
+-------------------------
 
 The :func:`explained_variance_score` computes the `explained variance
 regression score <http://en.wikipedia.org/wiki/Explained_variation>`_.
@@ -991,7 +992,7 @@ function::
     0.957...
 
 Mean absolute error
-...................
+-------------------
 
 The :func:`mean_absolute_error` function computes the `mean absolute
 error <http://en.wikipedia.org/wiki/Mean_absolute_error>`_, which is a risk
@@ -1021,7 +1022,7 @@ Here a small example of usage of the :func:`mean_absolute_error` function::
 
 
 Mean squared error
-...................
+-------------------
 
 The :func:`mean_squared_error` function computes the `mean square
 error <http://en.wikipedia.org/wiki/Mean_squared_error>`_, which is a risk
@@ -1056,7 +1057,7 @@ function::
 evaluate gradient boosting regression.
 
 R² score, the coefficient of determination
-...........................................
+-------------------------------------------
 
 The :func:`r2_score` function computes R², the `coefficient of
 determination <http://en.wikipedia.org/wiki/Coefficient_of_determination>`_.
@@ -1092,31 +1093,17 @@ Here a small example of usage of the :func:`r2_score` function::
 for an example of R² score usage to
 evaluate Lasso and Elastic Net on sparse signals.
 
+.. _clustering_metrics:
+
 Clustering metrics
 ======================
 
-The :mod:`sklearn.metrics` implements several losses, scores and utility
-function for more information see the :ref:`clustering_evaluation`
-section.
-
-
-Biclustering metrics
-====================
-
-The :mod:`sklearn.metrics` module implements bicluster scoring
-metrics. For more information see the :ref:`biclustering_evaluation`
-section.
-
-
 .. currentmodule:: sklearn.metrics
 
-.. _clustering_metrics:
-
-Clustering metrics
--------------------
-
 The :mod:`sklearn.metrics` implements several losses, scores and utility
-functions. For more information see the :ref:`clustering_evaluation` section.
+functions. For more information see the :ref:`clustering_evaluation`
+section for instance clustering, and :ref:`biclustering_evaluation` for
+biclustering.
 
 
 .. _dummy_estimators:
