CONVOLUTIONAL NEURAL NETWORK
What is a Convolutional Neural Network (CNN)?
A Convolutional Neural Network (CNN), also known as ConvNet, is a specialized type of deep learning
algorithm mainly designed for tasks that necessitate object recognition, including image classification,
detection, and segmentation. CNNs are employed in a variety of practical scenarios, such as autonomous
vehicles, security camera systems, and others.
Convolution Neural Network
A Convolutional Neural Network (CNN) is an extended version of the artificial neural network (ANN), predominantly used to extract features from grid-like matrix datasets such as images.
Layers used to build ConvNets
A complete convolutional neural network architecture is also known as a ConvNet. A ConvNet is a sequence of layers, and every layer transforms one volume into another through a differentiable function.
Types of layers:
Let's take an example by running a ConvNet on an image of dimension 32 x 32 x 3.
Input Layer: This is the layer in which we give input to our model. In a CNN, the input will generally be an image or a sequence of images. This layer holds the raw input image with width 32, height 32, and depth 3.
Convolutional Layers: This is the layer used to extract features from the input dataset. It applies a set of learnable filters, known as kernels, to the input images. The filters/kernels are small matrices, usually of shape 2×2, 3×3, or 5×5. Each kernel slides over the input image data and computes the dot product between the kernel weights and the corresponding input image patch. The output of this layer is referred to as feature maps. Suppose we use a total of 12 filters for this layer; we then get an output volume of dimension 32 x 32 x 12. Stride determines the step size at which the filter moves across the input, and padding adds additional border pixels to the input image to control the spatial dimensions of the output feature map.
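As a rough sketch of this step (assuming PyTorch; the random input tensor is purely illustrative), a convolution layer with 12 learnable 3×3 kernels, stride 1, and padding 1 maps a 32 x 32 x 3 image to a 32 x 32 x 12 output volume:

import torch
import torch.nn as nn

image = torch.randn(1, 3, 32, 32)  # one RGB image: (batch, channels, height, width)

# 12 learnable 3x3 kernels; stride 1 and padding 1 keep the spatial size at 32 x 32
conv = nn.Conv2d(in_channels=3, out_channels=12, kernel_size=3, stride=1, padding=1)

feature_maps = conv(image)
print(feature_maps.shape)  # torch.Size([1, 12, 32, 32]), i.e. a 32 x 32 x 12 volume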
Activation Layer: By applying an activation function to the output of the preceding layer, activation layers add nonlinearity to the network. The activation function is applied element-wise to the output of the convolution layer. Some common activation functions are ReLU: max(0, x), Tanh, and Leaky ReLU. The volume remains unchanged, so the output volume still has dimensions 32 x 32 x 12.
Pooling Layer: This layer is periodically inserted into the ConvNet. Its main function is to reduce the size of the volume, which makes computation faster, reduces memory usage, and also helps prevent overfitting. Two common types of pooling are max pooling and average pooling. If we use a max pool with 2 x 2 filters and stride 2, the resulting volume will have dimension 16 x 16 x 12.
Image source: cs231n.stanford.edu
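As a hedged continuation of the same PyTorch sketch (the feature-map tensor below is random and stands in for the convolution output), ReLU leaves the shape unchanged while a 2 x 2 max pool with stride 2 halves the spatial dimensions:

import torch
import torch.nn as nn

feature_maps = torch.randn(1, 12, 32, 32)   # stand-in for the 32 x 32 x 12 convolution output

activated = nn.ReLU()(feature_maps)          # element-wise max(0, x); shape is unchanged
pooled = nn.MaxPool2d(kernel_size=2, stride=2)(activated)

print(activated.shape)  # torch.Size([1, 12, 32, 32])
print(pooled.shape)     # torch.Size([1, 12, 16, 16]), i.e. a 16 x 16 x 12 volume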
Flattening: After the convolution and pooling layers, the resulting feature maps are flattened into a one-dimensional vector so they can be passed into a fully connected layer for classification or regression.
Fully Connected Layers: These layers take the input from the previous layer and compute the final classification or regression output.
Output Layer: The output from the fully connected layers is then fed into a logistic function appropriate for the classification task, such as sigmoid or softmax, which converts the output for each class into a probability score.
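A minimal sketch of the output side (again assuming PyTorch; the pooled tensor and the choice of 10 classes are illustrative): the 16 x 16 x 12 volume is flattened to a 3072-dimensional vector, passed through a fully connected layer, and turned into class probabilities with softmax:

import torch
import torch.nn as nn

pooled = torch.randn(1, 12, 16, 16)               # stand-in for the pooling-layer output

flattened = torch.flatten(pooled, start_dim=1)    # shape: (1, 12 * 16 * 16) = (1, 3072)
fc = nn.Linear(12 * 16 * 16, 10)                  # fully connected layer for 10 classes
logits = fc(flattened)

probabilities = torch.softmax(logits, dim=1)      # each row sums to 1
print(probabilities.shape)                        # torch.Size([1, 10])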
Overall Process:
1. Input image passes through the convolution layer, applying filters to extract features.
2. The resulting feature map undergoes an activation function (e.g., ReLU) to introduce non-
linearity.
3. Max-pooling layers downsample the feature map by selecting maximum values in local regions.
4. This process of convolution and pooling is typically repeated in a stack of layers to create a deep
CNN architecture.
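Putting the four steps together, a hedged end-to-end sketch (assuming PyTorch; the channel counts and the 10 output classes are illustrative choices, not fixed by the text) of a small stacked ConvNet could look like this:

import torch
import torch.nn as nn

# (conv -> ReLU -> max pool) repeated twice, then flatten -> fully connected
model = nn.Sequential(
    nn.Conv2d(3, 12, kernel_size=3, padding=1),   # 32 x 32 x 3  -> 32 x 32 x 12
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),        # 32 x 32 x 12 -> 16 x 16 x 12
    nn.Conv2d(12, 24, kernel_size=3, padding=1),  # 16 x 16 x 12 -> 16 x 16 x 24
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),        # 16 x 16 x 24 -> 8 x 8 x 24
    nn.Flatten(),
    nn.Linear(24 * 8 * 8, 10),                    # class scores for 10 illustrative classes
)

scores = model(torch.randn(1, 3, 32, 32))         # one random 32 x 32 RGB image
print(scores.shape)                               # torch.Size([1, 10])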
Spatial Convolution and Pooling
The POOL layer performs a downsampling operation along the spatial dimensions (width, height), resulting in a volume such as [16 x 16 x 12].
Dropout:
Dropout helps prevent overfitting by randomly nullifying the outputs of neurons during training. This encourages the network to learn redundant representations and hence increases the model's ability to generalize.
Dropout is a regularization technique used in deep neural networks to tackle overfitting: it involves randomly ignoring, or "dropping out", some layer outputs during training so that no units become overly codependent on one another.
Let's try to understand this with a given input x: {1, 2, 3, 4, 5} to a fully connected layer. Suppose we have a dropout layer with probability p = 0.2 (or keep probability = 0.8). During forward propagation in training, 20% of the nodes would be dropped, i.e. x could become {1, 0, 3, 4, 5} or {1, 2, 0, 4, 5}, and so on. Similarly, dropout is applied to the hidden layers.
For instance, if a hidden layer has 1000 neurons (nodes) and dropout is applied with drop probability = 0.5, then roughly 500 neurons would be randomly dropped in every iteration (batch).
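A minimal sketch of this behaviour (assuming PyTorch; the input values are taken from the example above): in training mode nn.Dropout zeroes roughly p of the elements and scales the survivors by 1/(1 − p), while in evaluation mode it passes the input through unchanged:

import torch
import torch.nn as nn

x = torch.tensor([1., 2., 3., 4., 5.])
dropout = nn.Dropout(p=0.2)   # drop probability 0.2, keep probability 0.8

dropout.train()
print(dropout(x))   # e.g. tensor([1.2500, 0.0000, 3.7500, 5.0000, 6.2500]); which entries are zeroed varies per call

dropout.eval()
print(dropout(x))   # tensor([1., 2., 3., 4., 5.]); dropout is disabled at inference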
Understanding Dropout Regularization
Dropout regularization leverages this mechanism during training in deep learning models to specifically address overfitting, which occurs when a model performs well on training data but poorly on new, unseen data.
During training, dropout randomly deactivates a chosen proportion of neurons (and their connections)
within a layer. This essentially temporarily removes them from the network.
The deactivated neurons are chosen at random for each training iteration. This randomness is crucial for
preventing overfitting.
To account for the deactivated neurons, the outputs of the remaining active neurons are scaled up by the inverse of the probability of keeping a neuron active (e.g., if 50% are dropped, the remaining ones are multiplied by 2).
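The scaling described above is what frameworks usually implement as inverted dropout. A hand-written sketch (the function below is illustrative, not a library API):

import torch

def inverted_dropout(x, p=0.5, training=True):
    # Zero each unit with probability p and scale survivors by 1 / (1 - p),
    # so the expected activation is the same with and without dropout.
    if not training or p == 0.0:
        return x
    keep_prob = 1.0 - p
    mask = (torch.rand_like(x) < keep_prob).float()   # 1 keeps a unit, 0 drops it
    return x * mask / keep_prob

activations = torch.ones(1000)
print(inverted_dropout(activations, p=0.5).mean())    # roughly 1.0 on average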
Advantages of Dropout Regularization in Deep Learning
Prevents Overfitting: By randomly disabling neurons, the network cannot overly rely on the specific
connections between them.
Ensemble Effect: Dropout acts like training an ensemble of smaller neural networks with varying
structures during each iteration. This ensemble effect improves the model’s ability to generalize to unseen
data.
Enhancing Data Representation: Dropout methods are used to enhance data representation by
introducing noise, generating additional training samples, and improving the effectiveness of the model
during training.
Drawbacks of Dropout Regularization and How to Mitigate Them
Despite its benefits, dropout regularization in deep learning is not without its drawbacks. Here are some
of the challenges related to dropout and methods to mitigate them:
Longer Training Times: Dropout increases training duration due to random dropout of units in hidden
layers. To address this, consider powerful computing resources or parallelize training where possible.
Optimization Complexity: It is not fully understood why dropout works, which can make optimization challenging. Experiment with dropout rates on a smaller scale before full implementation to fine-tune model performance.
Hyperparameter Tuning: Dropout adds hyperparameters such as the dropout rate, which interact with others like the learning rate and require careful tuning. Use techniques such as grid search or random search to systematically find optimal combinations.
Redundancy with Batch Normalization: Batch normalization can sometimes replace dropout effects.
Evaluate model performance with and without dropout when using batch normalization to determine its
necessity.
Model Complexity: Dropout layers add complexity. Simplify the model architecture where possible,
ensuring each dropout layer is justified by performance gains in validation.
What is Recurrent Neural Network (RNN)?
A Recurrent Neural Network (RNN) is a type of neural network where the output from the previous step is fed as input to the current step. In traditional neural networks, all the inputs and outputs are independent of each other. However, in cases where it is required to predict the next word of a sentence, the previous words are needed, and hence there is a need to remember them. Thus RNNs came into existence, solving this issue with the help of a hidden state. The main and most important feature of an RNN is its hidden state, which remembers some information about a sequence. The state is also referred to as the memory state, since it remembers the previous input to the network. An RNN uses the same parameters for each input, as it performs the same task on all inputs or hidden layers to produce the output. This reduces the number of parameters, unlike other neural networks.
How does an RNN differ from a Feedforward Neural Network?
Artificial neural networks that do not have looping nodes are called feed forward neural networks.
Because all information is only passed forward, this kind of neural network is also referred to as a multi-
layer neural network.
Information moves from the input layer to the output layer – if any hidden layers are present –
unidirectionally in a feedforward neural network. These networks are appropriate for image classification
tasks, for example, where input and output are independent. Nevertheless, their inability to retain previous
inputs automatically renders them less useful for sequential data analysis.
Recurrent Neuron and RNN Unfolding
The fundamental processing unit in a Recurrent Neural Network (RNN) is the recurrent unit (sometimes loosely called a recurrent neuron). This unit has the unique ability to maintain a hidden state, allowing the network to capture sequential dependencies by remembering previous inputs while processing. Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) variants improve the RNN's ability to handle long-term dependencies.
Types Of RNN
There are four types of RNNs based on the number of inputs and outputs in the network.
One to One
One to Many
Many to One
Many to Many
One to One
This type of RNN behaves the same as any simple neural network; it is also known as a Vanilla Neural Network. In this network, there is only one input and one output.
One To Many
In this type of RNN, there is one input and many outputs associated with it. One of the most common examples of this network is image captioning, where, given an image, we predict a sentence consisting of multiple words.
Many to One
In this type of network, many inputs are fed to the network at several time steps, generating only one output. This type of network is used in problems like sentiment analysis, where we give multiple words as input and predict only the sentiment of the sentence as output.
Many to Many
In this type of neural network, there are multiple inputs and multiple outputs corresponding to a problem. One example of this problem is language translation, where we provide multiple words from one language as input and predict multiple words in the second language as output.
Recurrent Neural Network Architecture
How does RNN work?
The Recurrent Neural Network consists of multiple fixed activation function units, one for each time step.
Each unit has an internal state which is called the hidden state of the unit. This hidden state signifies the
past knowledge that the network currently holds at a given time step. This hidden state is updated at every
time step to signify the change in the knowledge of the network about the past. The hidden state is
updated using the following recurrence relation:
The formula for calculating the current state:
ht=f(ht−1,xt)
where,
ht -> current state
ht-1 -> previous state
xt -> input state
Formula for applying the activation function (tanh):
ht = tanh(Whh·ht−1 + Wxh·xt)
where,
Whh -> weight at the recurrent neuron
Wxh -> weight at the input neuron
The formula for calculating the output:
yt = Why·ht
where,
yt -> output
Why -> weight at the output layer
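A hedged sketch of these two formulas (assuming PyTorch; the sizes are arbitrary and the weight matrices Wxh, Whh, Why are random stand-ins for learned parameters):

import torch

input_size, hidden_size, output_size = 4, 3, 2

Wxh = torch.randn(hidden_size, input_size)    # weight at the input neuron
Whh = torch.randn(hidden_size, hidden_size)   # weight at the recurrent neuron
Why = torch.randn(output_size, hidden_size)   # weight at the output layer

def rnn_step(x_t, h_prev):
    h_t = torch.tanh(Whh @ h_prev + Wxh @ x_t)  # ht = tanh(Whh·ht-1 + Wxh·xt)
    y_t = Why @ h_t                             # yt = Why·ht
    return h_t, y_t

h = torch.zeros(hidden_size)                    # initial hidden state
for x_t in torch.randn(5, input_size):          # a toy sequence of 5 time steps
    h, y = rnn_step(x_t, h)                     # the same weights are reused at every step
print(h, y)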
Issues of Standard RNNs
Vanishing Gradient: Text generation, machine translation, and stock market prediction are just a few examples of the time-dependent and sequential data problems that can be modelled with recurrent neural networks. However, as gradients are propagated back through many time steps they tend to shrink toward zero, which makes training RNNs difficult.
Exploding Gradient: An Exploding Gradient occurs when a neural network is being trained and the
slope tends to grow exponentially rather than decay. Large error gradients that build up during
training lead to very large updates to the neural network model weights, which is the source of this
issue.
These parameters are updated using backpropagation. However, since an RNN works on sequential data, we use an extended version of backpropagation known as Backpropagation Through Time (BPTT).
Vanishing Gradient
Vanishing gradients occur when, as backpropagation proceeds, the gradients get smaller and smaller, gradually approaching zero. This leaves the weights of the initial or lower layers nearly unchanged, so gradient descent never converges to the optimum.
Exploding gradients are the opposite of vanishing gradients: the gradients keep getting larger, which causes very large weight updates and makes gradient descent diverge. Exploding gradients arise from the weights in the neural network, not from the activation function.
The weights in the lower layers of the neural network are more likely to be affected by exploding gradients because their associated gradients are products of more terms. This makes the gradients of the lower layers more unstable, causing the algorithm to diverge.
How can we identify it?
Identifying the vanishing gradient problem typically involves monitoring the training dynamics of a deep
neural network.
One key indicator is observing model weights converging to 0 or stagnation in the improvement of the
model’s performance metrics over training epochs.
During training, if the loss function fails to decrease significantly, or if there is erratic behavior in the
learning curves, it suggests that the gradients may be vanishing.
Additionally, examining the gradients themselves during backpropagation can provide insights.
Visualization techniques, such as gradient histograms or norms, can aid in assessing the distribution of
gradients throughout the network.
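One hedged way to inspect this in practice (assuming PyTorch; the model and batch below are placeholders) is to print per-parameter gradient norms after a backward pass; norms drifting toward zero in the earlier layers suggest vanishing gradients, while very large norms suggest exploding gradients:

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.Tanh(), nn.Linear(32, 1))  # placeholder network
x, target = torch.randn(8, 10), torch.randn(8, 1)                      # placeholder batch

loss = nn.MSELoss()(model(x), target)
loss.backward()

for name, param in model.named_parameters():
    if param.grad is not None:
        print(name, param.grad.norm().item())  # tiny values -> vanishing, huge values -> exploding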
LSTM networks are the most commonly used variant of Recurrent Neural Networks (RNNs). The critical components of the LSTM are the memory cell and the gates (including the forget gate and the input gate); the inner contents of the memory cell are modulated by the input and forget gates.
The Logic Behind LSTM
These three parts of an LSTM unit are known as gates. They control the flow of information into and out of the memory cell, or LSTM cell. The first gate is called the Forget gate, the second gate is the Input gate, and the last one is the Output gate. An LSTM unit consisting of these three gates and a memory cell can be thought of as a layer of neurons in a traditional feedforward neural network, with each unit maintaining a hidden state and a cell state.
Just like a simple RNN, an LSTM also has a hidden state where H(t-1) represents the hidden state of the
previous timestamp and Ht is the hidden state of the current timestamp. In addition to that, LSTM also
has a cell state represented by C(t-1) and C(t) for the previous and current timestamps, respectively.
Here the hidden state is known as Short term memory, and the cell state is known as Long term memory.
Forget Gate
In a cell of the LSTM network, the first step is to decide whether we should keep the information from the previous time step or forget it. The equation for the forget gate is:
ft = sigmoid(Xt·Uf + Ht−1·Wf)
Let's try to understand the equation. Here,
Xt: input at the current timestamp
Uf: weight matrix associated with the input
Ht−1: the hidden state of the previous timestamp
Wf: the weight matrix associated with the hidden state
A sigmoid function is applied to this sum, which makes ft a number between 0 and 1. This ft is later multiplied with the cell state of the previous timestamp.
Input Gate
The input gate is used to quantify the importance of the new information carried by the input. The equation of the input gate is:
it = sigmoid(Xt·Ui + Ht−1·Wi)
Here,
Xt: input at the current timestamp t
Ui: weight matrix of the input
Ht−1: the hidden state at the previous timestamp
Wi: weight matrix of the input associated with the hidden state
Again, a sigmoid function is applied, so the value of it at timestamp t will be between 0 and 1.
New Information
Now, the new information that needs to be passed to the cell state is a function of the hidden state at the previous timestamp t−1 and the input x at timestamp t:
Nt = tanh(Xt·Uc + Ht−1·Wc)
The activation function here is tanh, so the value of the new information lies between −1 and 1. If the value of Nt is negative, the information is subtracted from the cell state, and if the value is positive, the information is added to the cell state at the current timestamp.
However, Nt is not added directly to the cell state. Instead, the cell state is updated as:
Ct = ft·Ct−1 + it·Nt
Here, Ct−1 is the cell state at the previous timestamp, and the others are the values we have calculated previously.
Output Gate
The output gate is again a sigmoid of the current input and the previous hidden state:
Ot = sigmoid(Xt·Uo + Ht−1·Wo)
Its value will also lie between 0 and 1 because of the sigmoid function. Now, to calculate the current hidden state, we use Ot and the tanh of the updated cell state:
Ht = Ot·tanh(Ct)
It turns out that the hidden state is a function of the long-term memory (Ct) and the current output. If you need the output of the current timestamp, just apply the softmax activation to the hidden state Ht.
Here the token with the maximum score in the output is the prediction.
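Pulling the gate equations together, here is a hedged single-step sketch (assuming PyTorch; all weight matrices are random stand-ins for learned parameters, and the candidate weights Uc and Wc follow the same naming pattern as the gates described above):

import torch

input_size, hidden_size = 4, 3

def rand(*shape):
    return torch.randn(*shape) * 0.1

# Stand-ins for learned parameters: U* act on x_t, W* act on h_{t-1}
Uf, Wf = rand(hidden_size, input_size), rand(hidden_size, hidden_size)  # forget gate
Ui, Wi = rand(hidden_size, input_size), rand(hidden_size, hidden_size)  # input gate
Uc, Wc = rand(hidden_size, input_size), rand(hidden_size, hidden_size)  # candidate (new information)
Uo, Wo = rand(hidden_size, input_size), rand(hidden_size, hidden_size)  # output gate

def lstm_step(x_t, h_prev, c_prev):
    f_t = torch.sigmoid(Uf @ x_t + Wf @ h_prev)   # how much of C_{t-1} to keep
    i_t = torch.sigmoid(Ui @ x_t + Wi @ h_prev)   # how much new information to admit
    n_t = torch.tanh(Uc @ x_t + Wc @ h_prev)      # candidate new information, in (-1, 1)
    c_t = f_t * c_prev + i_t * n_t                # updated cell state (long-term memory)
    o_t = torch.sigmoid(Uo @ x_t + Wo @ h_prev)   # output gate
    h_t = o_t * torch.tanh(c_t)                   # hidden state (short-term memory)
    return h_t, c_t

h, c = torch.zeros(hidden_size), torch.zeros(hidden_size)
for x_t in torch.randn(5, input_size):            # a toy sequence of 5 time steps
    h, c = lstm_step(x_t, h, c)
print(h, c)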
LSTM vs RNN
Architecture: LSTM (Long Short-Term Memory) is a type of RNN with additional memory cells; a standard RNN is the basic form.
Memory Retention: LSTM handles long-term dependencies and mitigates the vanishing gradient problem; a standard RNN struggles with long-term dependencies and the vanishing gradient problem.
Cell Structure: LSTM has a complex cell structure with input, output, and forget gates; a standard RNN has a simple cell structure with only a hidden state.
Handling Sequences: LSTM is suitable for processing sequential data; a standard RNN is also designed for sequential data, but with limited memory.
Training Efficiency: LSTM trains more slowly due to its increased complexity; a standard RNN trains faster due to its simpler architecture.
Performance on Long Sequences: LSTM performs better on long sequences; a standard RNN struggles to retain information over long sequences.
Usage: LSTM is best suited for tasks requiring long-term memory, such as language translation and sentiment analysis; a standard RNN is appropriate for simple sequential tasks, such as time series forecasting.
Vanishing Gradient Problem: LSTM addresses the vanishing gradient problem; a standard RNN is prone to it.
Tensorboard
TensorBoard is an open-source toolkit that enables us to understand training progress and improve model performance by tuning hyperparameters. The TensorBoard toolkit displays a dashboard where logs can be visualized as graphs, images, histograms, embeddings, text, etc. It also helps in tracking information like gradients, losses, metrics, and intermediate outputs. The arcgis.learn module integrates the TensorBoard toolkit into the model training process, which makes it possible for us to monitor the training process.
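As a generic sketch of the kind of logging TensorBoard consumes (assuming PyTorch's torch.utils.tensorboard here rather than the arcgis.learn integration; the log directory and metric values are illustrative placeholders):

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/experiment_1")   # illustrative log directory

for epoch in range(10):
    train_loss = 1.0 / (epoch + 1)                    # placeholder metric values
    valid_loss = 1.2 / (epoch + 1)
    writer.add_scalar("Loss/train", train_loss, epoch)
    writer.add_scalar("Loss/valid", valid_loss, epoch)

writer.close()
# The dashboard can then be launched with:  tensorboard --logdir runs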