📘 Unit 3 Part B: Evaluating Models – Notes with Examples
✅ 1. What is Model Evaluation?
Model Evaluation is the process of checking how well an AI model performs after training. It
helps in understanding:
● How accurate the model is
● If the model can be trusted
● If the model needs improvements
✅ 2. Why is Evaluation Important?
● To test model performance
● To compare different models
● To detect overfitting or underfitting
● To ensure the model works well on new data
✅ 3. Types of Errors
There are two main types of errors in AI model evaluation:
🔹 Type I Error (False Positive)
● The model predicts YES, but the actual answer is NO.
● Example: A spam filter marks a good email as spam.
🔹 Type II Error (False Negative)
● The model predicts NO, but the actual answer is YES.
● Example: A spam filter misses a spam email and delivers it to the inbox.
✅ 4. Confusion Matrix
A Confusion Matrix is a table used to describe the performance of a classification model.
|             | Predicted: YES      | Predicted: NO       |
|-------------|---------------------|---------------------|
| Actual: YES | True Positive (TP)  | False Negative (FN) |
| Actual: NO  | False Positive (FP) | True Negative (TN)  |
Example:
Suppose a medical AI model is used to detect cancer:
● True Positive (TP): Correctly predicts patient has cancer
● True Negative (TN): Correctly predicts patient does not have cancer
● False Positive (FP): Predicts cancer when there is none
● False Negative (FN): Predicts no cancer when patient has it
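As a minimal sketch of how these four counts could be tallied in plain Python (the label lists below are made up purely for illustration, with 1 = has cancer and 0 = no cancer):

```python
# A minimal sketch: tally TP, TN, FP, FN from example labels.
# The label lists are made up purely for illustration.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]  # 1 = has cancer, 0 = no cancer
predicted = [1, 0, 0, 1, 1, 0, 1, 0]  # the model's predictions

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)

print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")  # TP=3, TN=3, FP=1, FN=1
```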
✅ 5. Evaluation Metrics
🔹 1. Accuracy
● Measures how often the model is correct.
● Formula:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
🔹 2. Precision
● Out of all predicted YES, how many were actually YES?
● Formula:
$$\text{Precision} = \frac{TP}{TP + FP}$$
🔹 3. Recall
● Out of all actual YES cases, how many were predicted correctly?
● Formula:
$$\text{Recall} = \frac{TP}{TP + FN}$$
🔹 4. F1 Score
● Balance between precision and recall.
● Formula:
$$\text{F1 Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
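Continuing the confusion-matrix sketch above, all four metrics follow directly from these formulas (the counts TP=3, TN=3, FP=1, FN=1 are the illustrative values from that sketch):

```python
# Illustrative counts from the confusion-matrix sketch above.
tp, tn, fp, fn = 3, 3, 1, 1

accuracy  = (tp + tn) / (tp + tn + fp + fn)         # (3+3)/8 = 0.75
precision = tp / (tp + fp)                          # 3/4 = 0.75
recall    = tp / (tp + fn)                          # 3/4 = 0.75
f1 = 2 * precision * recall / (precision + recall)  # 0.75

print(f"Accuracy={accuracy:.2f}, Precision={precision:.2f}, "
      f"Recall={recall:.2f}, F1={f1:.2f}")
```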
✅ 6. Underfitting vs Overfitting
| Term | Meaning | Example |
|------|---------|---------|
| Underfitting | Model is too simple and does not learn well | Predicts all emails are not spam |
| Overfitting | Model learns the training data too closely and fails on new data | Remembers every training email but fails on new emails |
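One practical way to spot both problems is to compare accuracy on the training data with accuracy on held-out test data. Below is a minimal sketch using scikit-learn; the iris dataset, the decision tree model, and the depth values are illustrative assumptions, not part of the notes above:

```python
# Sketch: compare train vs. test accuracy to spot over/underfitting.
# The dataset, model, and depth values are illustrative choices.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for depth in (1, 3, None):  # very shallow, moderate, unlimited depth
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    model.fit(X_train, y_train)
    print(f"depth={depth}: train={model.score(X_train, y_train):.2f}, "
          f"test={model.score(X_test, y_test):.2f}")

# Low scores on both sets suggest underfitting;
# a big gap (train >> test) suggests overfitting.
```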
✅ 7. Real-life Example
Suppose you're building an AI model to detect fraudulent transactions in a bank.
● Accuracy: If the model reports 98% accuracy but most transactions are
non-fraudulent, the number can be misleading: a model that never flags anything
would score almost as high (see the sketch after this list).
● Precision: Ensures that the transactions flagged as fraud really are fraud.
● Recall: Ensures that most of the actual frauds are caught.
So Precision matters when false positives are costly, and Recall matters when
missing a positive case is dangerous (as with cancer or fraud).
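Here is a quick numeric sketch of that accuracy trap (the counts are made up): out of 1,000 transactions only 20 are fraudulent, and the model simply flags nothing.

```python
# Sketch of the "accuracy paradox" on imbalanced data (made-up counts).
# 1,000 transactions, 20 fraudulent; the model always predicts "not fraud".
tp, fn = 0, 20    # all 20 frauds are missed
fp, tn = 0, 980   # all legitimate transactions are correctly passed

accuracy = (tp + tn) / (tp + tn + fp + fn)  # 980/1000 = 0.98
recall   = tp / (tp + fn)                   # 0/20 = 0.0

print(f"Accuracy={accuracy:.0%}, Recall={recall:.0%}")  # 98% accuracy, 0% recall
```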
✅ 8. Cross Validation
● A technique that splits the data into several train/test parts and evaluates
the model on each split, giving a more reliable performance estimate (see the
sketch below).
● Checks that the model generalizes to new data rather than just memorizing the
training data.
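As a minimal sketch with scikit-learn's cross_val_score (the iris dataset, logistic regression model, and 5-fold split are illustrative assumptions):

```python
# Sketch: 5-fold cross-validation with scikit-learn (illustrative choices).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Train/test on 5 different splits; each fold serves as the test set once.
scores = cross_val_score(model, X, y, cv=5)
print("Fold accuracies:", scores.round(2))
print("Mean accuracy:", scores.mean().round(2))
```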