
Normalization and Calibration

Normalization (Unsupervised Scaling)

Normalization is used to adjust the scale of quantitative features without using target labels.

Purpose:

• Ensures all features contribute equally during model training.

• Useful when features are on different scales.

Methods of Normalization (a code sketch follows this list):

1. Z-score Normalization (Standardization)

• Best for normally distributed data.

• Formula:
$$z = \frac{x - \text{mean}}{\text{std deviation}}$$

2. Robust Normalization

• Best for non-normal data.

• Formula:
$$\frac{x - \text{median}}{\text{IQR}}$$
(IQR - interquartile range)

3. Min-Max Normalization

• Scales values to the [0, 1] range.

• Formula:
$$\frac{x - \text{min}}{\text{max} - \text{min}}$$

• Truncation (clipping values that fall outside the assumed range) may be used if the exact min/max aren't known.
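A minimal sketch of the three methods in Python with NumPy (the function names are illustrative, not part of the original notes):

```python
import numpy as np

def z_score(x):
    # (x - mean) / std deviation; best for normally distributed data
    return (x - x.mean()) / x.std()

def robust(x):
    # (x - median) / IQR; resistant to outliers in non-normal data
    q1, q3 = np.percentile(x, [25, 75])
    return (x - np.median(x)) / (q3 - q1)

def min_max(x):
    # (x - min) / (max - min); scales values into [0, 1]
    return (x - x.min()) / (x.max() - x.min())
```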

Calibration (Supervised Scaling)

Calibration adjusts feature scaling using target labels, often in binary classification.

Purpose:

• Adds meaningful class information to features.

• Helps models (e.g., linear classifiers) handle categorical/ordinal features more effectively.

How Calibration Works:

For a feature value $v = F(x)$, create a calibrated feature $F^c(x)$ that estimates:
$$F^c(x) = P(\text{positive class} \mid v), \qquad F^c: X \to [0, 1]$$
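
A minimal sketch of this mapping for a discrete feature with binary 0/1 labels (the name calibrate is illustrative, not from the notes):

```python
from collections import defaultdict

def calibrate(values, labels):
    # Estimate P(positive class | v) for each distinct feature value v.
    pos, total = defaultdict(int), defaultdict(int)
    for v, y in zip(values, labels):
        pos[v] += y      # y is 1 for the positive class, 0 otherwise
        total[v] += 1
    # The resulting map plays the role of Fc: X -> [0, 1].
    return {v: pos[v] / total[v] for v in total}
```

In practice a fallback is needed for feature values unseen at prediction time (e.g., the overall positive rate), and smoothing such as a Laplace correction is common for sparse counts.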

Benefits:
✔ Makes features suitable for models that depend on probability (e.g., Naive Bayes).
✔ No further preprocessing of the feature is needed after calibration.
✔ Helps the algorithm decide how to use the feature (numerical, categorical, or ordinal).

Examples

Normalization (Unsupervised)

Dataset with "Age" Feature:

Person   Age
A        20
B        25
C        30
D        35

Using Min-Max Normalization:

• Min (l) = 20

• Max (h) = 35

• Formula:
$$\frac{\text{Age} - l}{h - l} = \frac{\text{Age} - 20}{35 - 20}$$

Normalized Values:

Person   Age   Normalized Age
A        20    0.00
B        25    0.33
C        30    0.67
D        35    1.00
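
As a quick check, the table above can be reproduced in a couple of lines (a sketch assuming NumPy):

```python
import numpy as np

ages = np.array([20, 25, 30, 35])
normalized = (ages - ages.min()) / (ages.max() - ages.min())
print(normalized.round(2))  # [0.   0.33 0.67 1.  ]
```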

Calibration (Supervised)

Binary Classification Example - Product Purchase

Age Group   Bought Product (1 = Yes, 0 = No)
20–29       2 Yes, 8 No
30–39       7 Yes, 3 No

Probability Estimation for Calibration:

• For 20–29:
$$P(\text{Yes}) = \frac{2}{2+8} = 0.2$$

• For 30–39:
$$P(\text{Yes}) = \frac{7}{7+3} = 0.7$$

Calibrated Feature Table:

Person   Age Group   Calibrated Value (P(Yes))
A        20–29       0.2
B        30–39       0.7
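
The same calibrated values fall out of the counts directly (a small sketch in plain Python):

```python
# (Yes, No) purchase counts per age group, from the table above
counts = {"20-29": (2, 8), "30-39": (7, 3)}

calibrated = {group: yes / (yes + no) for group, (yes, no) in counts.items()}
print(calibrated)  # {'20-29': 0.2, '30-39': 0.7}
```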
