Confusion Matrix Notes

A confusion matrix is a table used to evaluate the performance of a classification model on test data where the true values are known. It displays the number of correct and incorrect predictions made by the model compared with the actual classifications. The example matrix shows a binary classifier that made 165 predictions, correctly identifying 100 true positives and 50 true negatives, while missing 5 actual positives and incorrectly predicting 10 negatives as positive. Key terms such as true/false positives and negatives are defined, and common metrics such as accuracy, misclassification rate, recall, and precision are explained.

A confusion matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known. The confusion matrix itself is relatively simple to understand, but the related terminology can be confusing.

Let's start with an example confusion matrix for a binary classifier (though it can easily be extended to the case of more than two classes):

What can we learn from this matrix?


• There are two possible predicted classes: "yes" and "no". If we were predicting the presence of a disease, for example, "yes" would mean they have the disease, and "no" would mean they don't have the disease.
• The classifier made a total of 165 predictions (e.g., 165 patients were being tested for the presence of that disease).
• Out of those 165 cases, the classifier predicted "yes" 110 times and "no" 55 times.
• In reality, 105 patients in the sample have the disease, and 60 patients do not.
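
If the true labels and the model's predictions were available as Python lists, the same matrix could be computed programmatically. The sketch below assumes scikit-learn is installed; the label lists are hypothetical and simply reproduce the counts above.

from sklearn.metrics import confusion_matrix

# Hypothetical label lists that reproduce the counts above:
# 105 patients actually have the disease, 60 do not.
y_true = ["yes"] * 105 + ["no"] * 60
# Of the 105 actual "yes": 100 predicted yes, 5 predicted no.
# Of the 60 actual "no": 10 predicted yes, 50 predicted no.
y_pred = ["yes"] * 100 + ["no"] * 5 + ["yes"] * 10 + ["no"] * 50

# Rows are actual classes, columns are predicted classes,
# in the order given by `labels`.
cm = confusion_matrix(y_true, y_pred, labels=["no", "yes"])
print(cm)
# [[ 50  10]
#  [  5 100]]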
Let's now define the most basic terms, which are whole numbers (not rates):
• true positives (TP): These are cases in which we predicted yes (they have the disease), and they do have the disease.
• true negatives (TN): We predicted no, and they don't have the disease.
• false positives (FP): We predicted yes, but they don't actually have the disease. (Also known as a "Type I error.")
• false negatives (FN): We predicted no, but they actually do have the disease. (Also known as a "Type II error.")
I've added these terms to the confusion matrix, and also added the row and column totals:
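n = 165          Predicted: NO    Predicted: YES    Total
Actual: NO           TN = 50          FP = 10          60
Actual: YES           FN = 5         TP = 100         105
Total                     55              110          165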

This is a list of rates that are often computed from a confusion matrix for a binary classifier:

• Accuracy: Overall, how often is the classifier correct?
  o (TP + TN) / total = (100 + 50) / 165 = 0.91
• Misclassification Rate: Overall, how often is it wrong?
  o (FP + FN) / total = (10 + 5) / 165 = 0.09
  o equivalent to 1 minus Accuracy
  o also known as "Error Rate"
• True Positive Rate (Recall/Sensitivity): When it's actually yes, how often does it predict yes?
  o TP / actual yes = 100 / 105 = 0.95
• Precision: When it predicts yes, how often is it correct?
  o TP / predicted yes = 100 / 110 = 0.91
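
As a quick check, here is a minimal Python sketch that plugs the counts from the example matrix into these formulas (the variable names are illustrative):

TP, TN, FP, FN = 100, 50, 10, 5
total = TP + TN + FP + FN                 # 165 predictions in total

accuracy = (TP + TN) / total              # 150 / 165 = 0.91
misclassification = (FP + FN) / total     # 15 / 165 = 0.09, i.e. 1 - accuracy
recall = TP / (TP + FN)                   # 100 / 105 = 0.95 (true positive rate)
precision = TP / (TP + FP)                # 100 / 110 = 0.91

print(f"Accuracy: {accuracy:.2f}")
print(f"Misclassification rate: {misclassification:.2f}")
print(f"Recall: {recall:.2f}")
print(f"Precision: {precision:.2f}")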
