Neural Network
What is Deep Learning
[Figure: a labeled dataset, e.g. images with the labels "Cat" and "Dog"]
It is called deep learning when we use a neural network as the model in supervised learning.
What is Deep Learning
Neural network
A neural network can have millions of neurons that analyse a given dataset, learn its patterns (features), and memorize them.
NN Basics and Concepts
Neural Network components
A neural network is composed of 4 main components:
1) Layers
2) Input and output
3) Loss function
4) Optimizer
Layer
[Figure: a layer of nodes; a node is also known as a neuron (e.g. a layer with 4 neurons)]
A layer in which every node is connected to every node of the next layer is called a fully connected (dense) layer.
Input & Output
Input layer (features) → Output layer (classes/labels)
Input & Output
[Figure: an m×n image flattened into inputs x11 … xmn, feeding an output layer with Class 1 and Class 2]
The whole image is input to the first layer at once. E.g. if you have 100 images, each image is inserted into the NN one by one.
Input & Output
[Figure: Iris example with four features x1–x4 as inputs and the classes "Setosa" and "others" as outputs]
The four features of the first row are input to the first (blue) layer at once.
Loss Function
Understanding the loss function involves 4 main concepts:
1) Neuron
2) Weights and biases
3) Activation function
4) Feedforward
Neuron
[Figure: a single neuron with weight w1, bias b1, summation S and activation F]
An artificial neuron is also referred to as a perceptron.
Weight & Bias
[Figure: input layer → layer 1]
x : input
w : weight
b : bias
s : summation
F(s) : output
s = w*x + b
Activation function
[Figure: input layer → layer 1]
x : input
w : weight
b : bias
F(s) : output
s = w*x + b
F(s) = 1 / (1 + e^(-s))
Feed forward
[Figure: input layer → layer 1, traversed in the forward direction]
With x = 1, w = 0.3, b = -0.3:
s = w*x + b = 0.3*1 + (-0.3) = 0
F(s) = 1 / (1 + e^(-s)) = 1 / (1 + e^0) = 0.5
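As a quick check, this forward pass can be computed in a few lines of Python (a minimal sketch; the values follow the slide):

    import math

    x, w, b = 1.0, 0.3, -0.3    # input, weight, bias from the slide

    s = w * x + b               # summation: 0.3*1 + (-0.3) = 0
    F = 1 / (1 + math.exp(-s))  # sigmoid activation

    print(s, F)                 # 0.0 0.5  ->  F(s) = 0.5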
Loss
[Figure: forward direction to reach F(s), which is then compared with the target value (label)]
Loss = target value - F(s)
Loss is the difference between the target value and F(s). It is also called the error. A common way to measure it is the MSE (mean squared error).
Loss
[Figure: forward direction to reach F(s), with w = 0.3 and b = -0.3]
Let x = 1, 2, 3, 4:

x | s = w*x + b          | target value | loss = (target - s)^2
1 | 0.3*1 + (-0.3) = 0   | 0            | 0
2 | 0.3*2 + (-0.3) = 0.3 | -1           | 1.69
3 | 0.3*3 + (-0.3) = 0.6 | -2           | 6.76
4 | 0.3*4 + (-0.3) = 0.9 | -3           | 15.21
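A minimal sketch reproducing the table (assuming, as the numbers suggest, the squared error (target - s)^2 is used as the loss):

    w, b = 0.3, -0.3
    targets = {1: 0, 2: -1, 3: -2, 4: -3}

    for x, target in targets.items():
        s = w * x + b
        loss = (target - s) ** 2   # squared error, as in the table
        print(x, round(s, 2), target, round(loss, 2))
    # 1 0.0 0 0.0
    # 2 0.3 -1 1.69
    # 3 0.6 -2 6.76
    # 4 0.9 -3 15.21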
Optimizer
Understanding the optimizer involves 4 main concepts:
1) Backpropagation
2) Optimizer
3) Learning rate
4) Epoch & Accuracy
Back propagation
[Figure: backward direction through the neuron, from the loss back toward x]
We go backward in an effort to minimize the loss.
Loss = target value - F(s)
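The slides do not show the gradient formulas backpropagation uses. As a hedged sketch, for the squared loss L = (target - s)^2 of the single linear neuron (ignoring the activation, as the table above does), the chain rule gives dL/dw = -2(target - s)*x and dL/db = -2(target - s):

    x, w, b, target = 1.0, 0.3, -0.3, 0.0

    s = w * x + b                   # forward pass
    grad_w = -2 * (target - s) * x  # dL/dw for L = (target - s)**2
    grad_b = -2 * (target - s)      # dL/db

    print(grad_w, grad_b)           # gradients used by backpropagation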
Optimizer (Reducing the loss)
[Figure: backward direction; the values of w and b are changed]
We go backward in an effort to minimize the loss using the optimizer. The optimizer is a function that changes w and b so that the loss is zero.
What are w and b so that the loss is zero? w = -1, b = 1:

x | s = w*x + b   | target value | loss
1 | -1*1 + 1 = 0  | 0            | 0
2 | -1*2 + 1 = -1 | -1           | 0
3 | -1*3 + 1 = -2 | -2           | 0
4 | -1*4 + 1 = -3 | -3           | 0
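A quick sketch verifying the slide's solution w = -1, b = 1:

    w, b = -1, 1
    for x, target in [(1, 0), (2, -1), (3, -2), (4, -3)]:
        s = w * x + b
        print(x, s, target, (target - s) ** 2)  # loss is 0 for every sample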
Learning rate
[Figure: backward direction; the values of w and b are changed]
The optimizer updates the weights and biases toward zero loss. The learning rate is the rate at which the optimizer changes the weights and biases.
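A minimal sketch of one optimizer step. The slides name the concept but not the update rule; this assumes the standard gradient-descent rule w ← w - lr * dL/dw:

    lr = 0.1                        # learning rate (hyperparameter)
    w, b = 0.3, -0.3
    x, target = 2.0, -1.0

    s = w * x + b
    grad_w = -2 * (target - s) * x
    grad_b = -2 * (target - s)

    w -= lr * grad_w                # a small lr means small steps toward zero loss
    b -= lr * grad_b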
Epoch
[Figure: one forward direction from the dataset (inputs x1–x6) to reach the target value, then one backward direction through the optimizer]
One epoch consists of one forward pass followed by one backward pass, with the optimizer executed once for each sample in the dataset. Feeding all 6 inputs through the neuron in one cycle is one epoch. You need to run several epochs until the loss is zero.
Now let's add more layers (multilayer)
Adding more neurons
[Figure: inputs x1 and x2 connected to two neurons S1 and S2]
w1, w2, …, w4 are the weights (every path has a weight).
Every input must have a path to every neuron (node).
b1 and b2 are the biases (every node has a bias).
Every node has the activation function F(s) = 1 / (1 + e^(-s)).
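A sketch of this two-neuron layer using NumPy. The shapes follow the slide (2 inputs, 2 neurons, weights w1–w4, biases b1–b2); the numeric values are made up for illustration:

    import numpy as np

    def sigmoid(s):
        return 1 / (1 + np.exp(-s))

    x = np.array([0.5, -0.2])    # inputs x1, x2 (example values)
    W = np.array([[0.1, 0.3],    # weights into neuron 1
                  [0.2, 0.4]])   # weights into neuron 2
    b = np.array([0.1, -0.1])    # biases b1, b2

    s = W @ x + b                # summation for both neurons at once
    out = sigmoid(s)             # F(S1), F(S2)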
Adding more layers
[Figure: input layer → layer 1 → layer 2 → output layer, with weights w1–w16 and biases b1–b5]
2 inputs; the model has 2 hidden layers; layer 1 has 2 nodes; layer 2 has 3 nodes; the output layer has two labels (target value 1 and target value 2).

Every input has a path to every node.
Every node has a path to every node in the next layer.
Every path has a weight, and the values are different.
Every node has a different bias (b), except the output nodes.
Every node has the same activation function (F).
Every node has a different output F(s).
Every node has to do a summation (s).
F(s) is the activation function
[Figure: a node in layer 1 computing s and then F(s)]
The final value F(s) that comes out of a node is determined by the activation function. In this example we use the sigmoid function as the activation function:
F(s) = 1 / (1 + e^(-s))
Other common activation functions: sigmoid vs. tanh, ReLU, leaky ReLU.
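For reference, the four activations mentioned above as a short NumPy sketch:

    import numpy as np

    def sigmoid(s):
        return 1 / (1 + np.exp(-s))

    def tanh(s):
        return np.tanh(s)

    def relu(s):
        return np.maximum(0, s)

    def leaky_relu(s, alpha=0.01):
        return np.where(s > 0, s, alpha * s)  # small slope for negative s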
Feed forward
[Figure: from x, the forward direction through layer 1 and layer 2 reaches the outputs F1(S) and F2(S)]
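A hedged sketch of the full forward pass for the slide's 2-2-3-2 network. The weight values are random placeholders; the output nodes get no bias, following the slide's convention:

    import numpy as np

    def sigmoid(s):
        return 1 / (1 + np.exp(-s))

    x = np.array([1.0, 0.5])                           # 2 inputs
    W1 = np.random.rand(2, 2); b1 = np.random.rand(2)  # layer 1: 2 nodes
    W2 = np.random.rand(3, 2); b2 = np.random.rand(3)  # layer 2: 3 nodes
    W3 = np.random.rand(2, 3)                          # output: 2 nodes, no bias

    h1 = sigmoid(W1 @ x + b1)    # layer 1 outputs
    h2 = sigmoid(W2 @ h1 + b2)   # layer 2 outputs
    out = sigmoid(W3 @ h2)       # F1(S), F2(S)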
Loss
[Figure: forward direction to reach F(s) at the two output nodes, each compared with its target value (label)]
Loss = target value - F for every output. Every output has a loss; the network's loss is the average of the per-output losses.
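A small sketch of the averaged loss over the two outputs (assuming squared error per output, matching the earlier table; the values are illustrative):

    import numpy as np

    out = np.array([0.7, 0.2])       # F1(S), F2(S) from the forward pass
    targets = np.array([1.0, 0.0])   # target value 1, target value 2

    loss = np.mean((targets - out) ** 2)  # average of per-output losses (MSE)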
Back propagation
[Figure: from F(s), the backward direction through layer 2 and layer 1 back to x, covering weights w1–w16 and biases b1–b5]
We go backward in an effort to minimize the loss by changing the values of the weights and biases.
Optimizer
[Figure: the same multilayer network, traversed in the backward direction]
We go backward in an effort to minimize the loss. The optimizer is a function that changes the weights and biases so that the loss is zero.
Optimizer
[Figure: backward direction; values are changed along every path]
The optimizer works on every path.
Objective: Loss = 0
Types of optimizer
GradientDescentOptimizer
AdadeltaOptimizer
MomentumOptimizer
AdamOptimizer
FtrlOptimizer
RMSPropOptimizer
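These names correspond to TensorFlow's optimizer classes. As a sketch (not from the slides), in current TensorFlow/Keras an optimizer is selected like this:

    import tensorflow as tf

    optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
    # alternatives: SGD (gradient descent / momentum), Adadelta, RMSprop, Ftrl
    # e.g. tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)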
Learning rate
[Figure: backward direction; values are changed according to the learning rate]
The optimizer updates the weights and biases toward zero loss. The learning rate is the rate at which the optimizer changes the weights and biases.
Loss = target value - F
Learning rate: the rule the optimizer has to follow when changing w and b.
Epoch
[Figure: one forward direction from the dataset through the multilayer network to reach the target value, then one backward direction]
One epoch consists of one forward pass and then one backward pass, with the optimizer executed once for all samples in the dataset.
Loss = target value - F
Epoch, batch & iterations
Epoch: one pass of the whole dataset through the network.
Batch: a subset of the dataset processed before the optimizer updates the weights.
Iteration: one such update; the number of iterations per epoch equals the number of batches.
This approach is called "mini-batch gradient descent".
Epoch, batch & iterations
Dataset is 100 samples
Epoch = 40
Num_of_batch (iterations) = 5
Batch_size = 20

for i in range(Epoch):
    for j in range(Num_of_batch):
        compute the loss and optimize on one batch of Batch_size samples
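A runnable sketch of this loop for a toy linear model (NumPy; the data and model are made up, only the epoch/batch/iteration structure follows the slide):

    import numpy as np

    X = np.random.rand(100, 1)   # dataset: 100 samples
    y = 3 * X - 1                # toy targets

    epochs, num_of_batch, batch_size = 40, 5, 20
    w, b, lr = 0.0, 0.0, 0.1

    for i in range(epochs):                     # one epoch = whole dataset
        for j in range(num_of_batch):           # one iteration = one batch
            xb = X[j*batch_size:(j+1)*batch_size]
            yb = y[j*batch_size:(j+1)*batch_size]
            pred = w * xb + b
            grad_w = np.mean(-2 * (yb - pred) * xb)  # MSE gradients
            grad_b = np.mean(-2 * (yb - pred))
            w -= lr * grad_w                         # optimizer step per batch
            b -= lr * grad_b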
Epoch, batch & iterations
What is happening during an epoch?
Dataset is 1 sample
Epoch = 4
Num_of_batch = 1
Batch_size = 1

for i in range(Epoch):
    for j in range(Num_of_batch):
        compute the loss and optimize on one batch of Batch_size samples

After 4 epochs the optimizer achieves 0 error.
Parameters and hyperparameters
Parameter: any value that is changed by the computer. These are the weights and biases, automatically updated by the optimizer.
Hyperparameter: any value that is changed by a human. These are the learning rate, epochs, batch size, number of layers, number of nodes, and dropout rate.
Tutorial
How many parameters are in this model?
[Figure: the 2-input network with hidden layers of 2 and 3 nodes, an output layer of 2 nodes, weights w1–w16, and biases b1–b5]
How many layers?
How many nodes?
How many inputs?
How many activation functions?
How many classes?
How many weights?
How many biases?
How many optimizers?
How many parameters?
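A sketch that counts the parameters for the slide's architecture (2 inputs → 2 nodes → 3 nodes → 2 outputs), using the slide's convention that output nodes have no bias:

    layers = [2, 2, 3, 2]   # inputs, layer 1, layer 2, output

    weights = sum(layers[i] * layers[i+1] for i in range(len(layers) - 1))
    biases = sum(layers[1:-1])   # hidden nodes only (no bias on output nodes)

    print(weights, biases, weights + biases)  # 16 weights, 5 biases, 21 parameters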
Assessing performance
Assessing the performance
Train data (80%) | Test data (20%)
Dataset is 100 samples
Epoch = 40
Num_of_batch = 5
Batch_size = 20

for i in range(Epoch):
    for j in range(Num_of_batch):
        compute the loss and optimize on one batch of Batch_size samples

Validation phase: in each single epoch we run the train data; at the end of each single epoch we run the test data.
Accuracy is the percentage of right predictions over the number of samples in the test data. It is used during validation (at every epoch) or the testing phase (at the end of all epochs).
Loss is measured over each single epoch.
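A minimal sketch of the accuracy computation on the test data (the arrays are illustrative):

    import numpy as np

    predictions = np.array([0, 1, 1, 0, 1])  # model outputs on the test data
    labels = np.array([0, 1, 0, 0, 1])       # true labels

    accuracy = np.mean(predictions == labels) * 100  # % of right predictions
    print(accuracy)  # 80.0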
Assessing the performance
Overfitting is when the loss in the validation phase is much bigger than in the training phase.
Underfitting is when the loss is much bigger during the training phase.
In other words, overfitting is when training goes very well but the validation/testing results are noticeably worse.
Dropout
Randomly pick nodes and disable them. We give every node a probability of staying alive. E.g. if the probability is 0.5, every node will be 50% alive or 50% dead.
Dropout is used to overcome overfitting.
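A sketch of dropout applied to a layer's outputs, with keep probability 0.5 as in the example. This is the common "inverted" variant, which rescales the surviving activations (a detail not on the slide):

    import numpy as np

    def dropout(activations, keep_prob=0.5):
        # each node is kept with probability keep_prob, otherwise set to 0
        mask = np.random.rand(*activations.shape) < keep_prob
        return activations * mask / keep_prob  # rescale to keep the expected value

    h = np.array([0.7, 0.2, 0.9, 0.4])
    print(dropout(h))  # roughly half the nodes are disabled at random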
Thank you