DL Lecture 09 Regularization

The document discusses regularization techniques to prevent overfitting in machine learning models, including LASSO (L1), Ridge (L2), and Elastic Net regularization. It also covers dropout regularization, data augmentation methods, and early stopping as strategies to improve model generalization. Each method aims to reduce generalization error while balancing model complexity and training efficiency.

Regularization

Lecture 09
22 January 2025
Regularization

• Any modification we make to the learning algorithm that is intended to reduce the generalization error, but not its training error.
• This technique discourages learning a more complex or flexible model, to avoid the risk of overfitting.

Regularization with Modified Loss Functions
• Augment Ordinary Least Squares with a regularization term:
• LASSO Regression: L1 regularization
• Ridge Regression: L2 regularization
• Elastic Net regularization: combined L1 and L2
Least Absolute Shrinkage & Selection Operator (LASSO): L1 Regularization

Minimize the cost function: 1) Ordinary Least Squares + 2) Regularization Term

$$\min_{w}\ \Big\{ \sum_{i=1}^{n}\Big(y_i - \sum_{j=1}^{p} x_{ij} w_j\Big)^{2} + \lambda \sum_{j=1}^{p} |w_j| \Big\}$$

forcing some of the weights $w_j$ all the way to zero.

• L1 penalizes regressors by shrinking their weights.
• Regressors that contribute little to error reduction are penalized more heavily.
• λ is the weighting factor on the regularization term, tuned to trade off overfitting ↔ underfitting.
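
A minimal sketch of LASSO in practice, assuming scikit-learn's Lasso estimator on synthetic data (the data, the alpha value, and the variable names are illustrative; alpha plays the role of λ above):

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data: only the first two regressors actually matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=200)

# alpha plays the role of the lambda weighting factor in the cost function.
lasso = Lasso(alpha=0.5)
lasso.fit(X, y)

# L1 shrinks the weights; regressors that contribute little are driven to exactly 0.
print(lasso.coef_)
```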
Ridge Regression: L2 Regularization

Minimize the cost function: 1) Ordinary Least Squares + 2) Regularization Term

$$\min_{w}\ \Big\{ \sum_{i=1}^{n}\Big(y_i - \sum_{j=1}^{p} x_{ij} w_j\Big)^{2} + \lambda \sum_{j=1}^{p} w_j^{2} \Big\}$$

forcing the weights $w_j$ to become small, but (unlike L1) not exactly zero.
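
A matching sketch with scikit-learn's Ridge estimator, again on illustrative synthetic data; note that the weights shrink but typically none become exactly zero:

```python
import numpy as np
from sklearn.linear_model import Ridge

# Same kind of synthetic data as in the LASSO sketch above.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=200)

# alpha again plays the role of the lambda weighting factor.
ridge = Ridge(alpha=0.5)
ridge.fit(X, y)

# L2 shrinks all the weights toward zero, but usually not exactly to zero.
print(ridge.coef_)
```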
Elastic Net Regularization
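
Elastic net combines the two penalties above. A standard form of the objective, written with separate weights λ₁ and λ₂ on the L1 and L2 terms to mirror the LASSO and Ridge formulas:

$$\min_{w}\ \Big\{ \sum_{i=1}^{n}\Big(y_i - \sum_{j=1}^{p} x_{ij} w_j\Big)^{2} + \lambda_1 \sum_{j=1}^{p} |w_j| + \lambda_2 \sum_{j=1}^{p} w_j^{2} \Big\}$$

This keeps the feature-selection behaviour of the L1 term while the L2 term stabilises the solution when regressors are correlated.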

Dropout Regularization

[Figure: a small feed-forward network with inputs x1, x2, x3, x4 and output ŷ, illustrating dropout of hidden units]
Dropout Regularization: Prevents Overfitting
This technique has also become popular recently. We drop out some of the hidden units for specific training examples. Different hidden units may be switched off for different examples, and in different iterations of the optimization different units may be dropped at random.

The dropout rates can also differ across layers. So, we can select specific layers that have a larger number of units and may be contributing more towards overfitting; these layers are suitable for higher dropout rates.

For some layers the dropout rate can be 0, which means no dropout for that layer (see the sketch below).
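
A minimal sketch of the mechanics, assuming the common inverted-dropout formulation; the function name, the activation matrix a, and keep_prob are illustrative, not from the lecture:

```python
import numpy as np

def inverted_dropout(a, keep_prob, rng):
    """Apply inverted dropout to one layer's activations during training.

    a: activations, shape (batch_size, num_units)
    keep_prob: probability of keeping each unit (dropout rate = 1 - keep_prob)
    """
    # Each unit is kept with probability keep_prob, independently per example.
    mask = rng.random(a.shape) < keep_prob
    # Zero out the dropped units and scale the survivors by 1/keep_prob,
    # so the expected activation is unchanged and nothing needs rescaling
    # at test time (where dropout is simply switched off).
    return (a * mask) / keep_prob

# Example: a batch of 4 examples through a layer of 5 units with 20% dropout.
rng = np.random.default_rng(0)
a = rng.normal(size=(4, 5))
a_train = inverted_dropout(a, keep_prob=0.8, rng=rng)
```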

Layer-wise Dropout

[Figure: a network with layer-wise dropout rates of 0, 0.2, 0.2, 0, 0, 0 across its layers]
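
A hedged sketch of layer-wise dropout in tf.keras; the layer sizes and the 0.2 rates are illustrative, with dropout applied only after the wider layers:

```python
import tensorflow as tf

# Dropout only after the wide hidden layers; the remaining layers get no
# dropout, i.e. an effective dropout rate of 0.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(512, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dropout(0.2),   # wide layer, more prone to overfitting
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(64, activation="relu"),   # no dropout here
    tf.keras.layers.Dense(10, activation="softmax"),
])
```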

Dropout
• Dropout also helps in spreading out the weights across all the layers, since the network becomes reluctant to put too much weight on any specific node. So it helps in shrinking the weights and has an adaptive effect on them.
• Dropout has a similar effect to L2 regularization for overfitting.
• We don't use dropout for test examples.
• We also need to bump up the values at the output of each layer to compensate for the dropped units, as in the inverted-dropout sketch above.

Data Augmentation
Getting more training data is one more solution for overfitting.

Since collecting additional data may be expensive or simply not possible, we can generate new examples from the data we already have:

• Flipping all the images can be one way to increase your data.

• Randomly zooming in and zooming out can be another way.

• Distorting some of the images, based on your application, may be another way to increase your data (see the sketch below).
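
A minimal sketch of these augmentations as a tf.keras preprocessing pipeline; the specific layers, factors, and image shapes are illustrative choices:

```python
import tensorflow as tf

# Random flips, zooms, and mild rotations applied on the fly during training.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),   # flip images
    tf.keras.layers.RandomZoom(0.2),            # randomly zoom in / out
    tf.keras.layers.RandomRotation(0.1),        # mild distortion
])

# Example: augment a batch of 32 RGB images of size 64x64.
images = tf.random.uniform((32, 64, 64, 3))
augmented = augment(images, training=True)
```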
Early stopping

[Figure: training error and dev set error plotted against the number of iterations; training error keeps decreasing while dev set error eventually starts rising]
Early Stopping
Sometimes the dev set error goes down and then starts going up. So, you may decide to stop training at the point where the curve starts taking a different turn.

By stopping halfway, we also reduce the number of iterations to train and the computation time.

Early stopping does not go well with orthogonalization, because it contradicts our original objective of optimizing (w, b) to the minimum possible cost function: we are stopping the optimization process in between to take care of overfitting, which is a different objective from optimization.
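
A minimal sketch of early stopping via the tf.keras EarlyStopping callback; the patience value, the monitored metric, and the commented fit call are illustrative:

```python
import tensorflow as tf

# Stop once the dev (validation) set error has not improved for 5 epochs,
# and roll back to the weights from the best epoch seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,
    restore_best_weights=True,
)

# Assuming a compiled model and training data (not defined here):
# model.fit(x_train, y_train, validation_split=0.2,
#           epochs=100, callbacks=[early_stop])
```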

Thank You
For more information, please visit the
following links:

gauravsingal789@gmail.com
gaurav.singal@nsut.ac.in
https://www.linkedin.com/in/gauravsingal789/
http://www.gauravsingal.in
