
REPORT

ON
DEEP LEARNING

SUBMITTED BY:

Kanishka Jain
INTRODUCTION TO DEEP LEARNING
Deep learning is a branch of machine learning based on artificial neural networks. It is
capable of learning complex patterns and relationships within data, and it does not require
everything to be explicitly programmed. Deep learning has become increasingly popular in recent
years due to advances in processing power and the availability of large datasets. It is built on
artificial neural networks (ANNs), also known as deep neural networks (DNNs), which are
inspired by the structure and function of the human brain’s biological neurons and are designed
to learn from large amounts of data.

1. Deep Learning is a subfield of Machine Learning that involves the use of neural networks to
model and solve complex problems. Neural networks are modelled after the structure and
function of the human brain and consist of layers of interconnected nodes that process and
transform data.

2. The key characteristic of Deep Learning is the use of deep neural networks, which have
multiple layers of interconnected nodes. These networks can learn complex representations of
data by discovering hierarchical patterns and features in the data. Deep Learning algorithms
can automatically learn and improve from data without the need for manual feature engineering.

3. Deep Learning has achieved significant success in various fields, including image
recognition, natural language processing, speech recognition, and recommendation systems.
Some of the popular Deep Learning architectures include Convolutional Neural Networks
(CNNs), Recurrent Neural Networks (RNNs), and Deep Belief Networks (DBNs).

4. Training deep neural networks typically requires a large amount of data and computational
resources. However, the availability of cloud computing and the development of specialized
hardware, such as Graphics Processing Units (GPUs), have made it easier to train deep neural
networks.

In summary, Deep Learning is a subfield of Machine Learning that involves the use of deep
neural networks to model and solve complex problems. Deep Learning has achieved significant
success in various fields, and its use is expected to continue to grow as more data and more
powerful computing resources become available.
WHAT IS DEEP LEARNING?
Deep learning is the branch of machine learning that is based on artificial neural network
architecture. An artificial neural network (ANN) uses layers of interconnected nodes called
neurons that work together to process and learn from the input data. In a fully connected Deep
neural network, there is an input layer and one or more hidden layers connected one after the
other. Each neuron receives input from the previous layer neurons or the input layer. The output
of one neuron becomes the input to other neurons in the next layer of the network, and this
process continues until the final layer produces the output of the network. The layers of the
neural network transform the input data through a series of nonlinear transformations, allowing
the network to learn complex representations of the input data.
ARTIFICIAL NEURAL NETWORKS (ANN)

Artificial neural networks are built on the principles of the structure and operation of
human neurons; they are also known as neural networks or neural nets. The input layer, which is
the first layer of an artificial neural network, receives input from external sources and passes it
on to the hidden layer, which is the second layer. Each neuron in the hidden layer gets
information from the neurons in the previous layer, computes the weighted sum, and then
transfers it to the neurons in the next layer. These connections are weighted, which means that
the influence of each input from the preceding layer is scaled by a distinct weight. These weights
are then adjusted during the training process to improve the performance of the model.

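To make the weighted-sum computation concrete, here is a minimal sketch in Python with NumPy (an illustrative choice; the numbers are hypothetical, not taken from this report) of one hidden layer computing the weighted total of its inputs, adding a bias, and applying an activation function:

import numpy as np

# Hypothetical values: 3 inputs feeding a hidden layer of 2 neurons
x = np.array([0.5, 0.1, 0.9])            # outputs of the previous (input) layer
W = np.array([[0.2, -0.4, 0.7],
              [0.6,  0.3, -0.1]])        # one row of weights per hidden neuron
b = np.array([0.1, -0.2])                # one bias per hidden neuron

weighted_sum = W @ x + b                 # the weighted total described above
hidden_output = np.maximum(0.0, weighted_sum)   # a common activation choice (ReLU)
print(hidden_output)                     # values passed on to the next layer

During training, the entries of W and b are the weights that get adjusted to improve the model’s performance.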
How do Artificial Neural Networks learn?
Artificial neural networks are trained using a training set. For example, suppose you want to
teach an ANN to recognize a cat. It is shown thousands of different images of cats so that the
network can learn to identify a cat. Once the neural network has been trained enough on images
of cats, you need to check whether it can identify cat images correctly. This is done by having
the ANN classify new images, deciding whether each one is a cat image or not. The output
produced by the ANN is compared against a human-provided label saying whether the image is a
cat image or not. If the ANN classifies an image incorrectly, that error is used to adjust what it
has learned. Backpropagation does this by fine-tuning the weights of the connections between
ANN units based on the error obtained. This process continues until the artificial neural network
can correctly recognize a cat in an image with the minimal possible error rate.
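The loop described above (predict, compare with the correct label, adjust the weights, repeat) can be sketched in a few lines. The toy example below (Python/NumPy; the synthetic data and learning rate are invented for illustration) trains a single sigmoid neuron by nudging its weights against the error gradient, which is the essence of what backpropagation does for every layer of a deep network:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                  # 100 training examples, 2 features each
y = (X[:, 0] + X[:, 1] > 0).astype(float)      # label 1 when the features sum to a positive value

w = np.zeros(2)
b = 0.0
lr = 0.1                                       # learning rate (chosen arbitrarily)

for epoch in range(200):
    z = X @ w + b
    p = 1.0 / (1.0 + np.exp(-z))               # predicted probability of class 1
    error = p - y                              # difference between prediction and true label
    w -= lr * (X.T @ error) / len(y)           # adjust weights against the error gradient
    b -= lr * error.mean()

p = 1.0 / (1.0 + np.exp(-(X @ w + b)))         # predictions with the final weights
print("training accuracy:", ((p > 0.5) == y).mean())

In a deep network the same idea is applied layer by layer: backpropagation computes how much each connection weight contributed to the error, and every weight is adjusted in the direction that reduces it.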

What are the types of Artificial Neural Networks?

●​ Feedforward Neural Network : The feedforward neural network is one of the most
basic artificial neural networks. In this ANN, the data or the input provided travels in a
single direction. It enters into the ANN through the input layer and exits through the
output layer while hidden layers may or may not exist. So the feedforward neural
network has a front-propagated wave only and usually does not have
backpropagation.
●​ Convolutional Neural Network : A Convolutional neural network has some
similarities to the feed-forward neural network, where the connections between units
have weights that determine the influence of one unit on another unit. But a CNN has
one or more than one convolutional layer that uses a convolution operation on the
input and then passes the result obtained as output to the next layer. CNNs have
applications in speech and image processing and are particularly useful in
computer vision.
●​ Modular Neural Network: A Modular Neural Network contains a collection of
different neural networks that work independently towards obtaining the output with
no interaction between them. Each of the different neural networks performs a
different sub-task by obtaining unique inputs compared to other networks. The
advantage of this modular neural network is that it breaks down a large and complex
computational process into smaller components, thus decreasing its complexity while
still obtaining the required output.
●​ Radial Basis Function Neural Network: Radial basis functions are functions that
consider the distance of a point with respect to a center. An RBF network has two
layers: in the first layer, the input is mapped onto all the radial basis functions in the
hidden layer, and then the output layer computes the output in the next step. Radial
basis function networks are normally used to model data that represents an
underlying trend or function.
●​ Recurrent Neural Network: The Recurrent Neural Network saves the output of a
layer and feeds this output back to the input to better predict the outcome of the
layer. The first layer in the RNN is quite similar to the feed-forward neural network
and the recurrent neural network starts once the output of the first layer is computed.
After this layer, each unit will remember some information from the previous step so
that it can act as a memory cell in performing computations.

Applications of Artificial Neural Networks

●​ Social Media: Artificial Neural Networks are used heavily in Social Media. For
example, let’s take the ‘People you may know’ feature on Facebook that suggests
people that you might know in real life so that you can send them friend requests.
Well, this magical effect is achieved by using Artificial Neural Networks that analyze
your profile, your interests, your current friends, and also their friends and various
other factors to calculate the people you might potentially know.
●​ Healthcare: Artificial Neural Networks are used in oncology to train algorithms that
can identify cancerous tissue at the microscopic level with the same accuracy as
trained physicians. Various rare diseases that manifest in physical characteristics
can be identified in their early stages by applying facial analysis to patient photos.
●​ Personal Assistants: Personal assistants like Alexa and Siri use Natural Language
Processing to interact with users and formulate responses accordingly. Natural
Language Processing relies on artificial neural networks to handle the many tasks of
these assistants, such as managing language syntax and semantics, speech, and
the ongoing conversation.
APPLICATIONS OF DEEP LEARNING

The main applications of deep learning can be divided into computer vision, natural language
processing (NLP), and reinforcement learning.

Computer Vision

In computer vision, deep learning models enable machines to identify and understand visual
data and ultimately make decisions based on what they "see." Computer vision combines
processes such as image processing, pattern recognition, and machine learning: algorithms
analyze visual data, detect patterns, and make predictions. The aim of the technique is to allow
machines to automatically interpret and make decisions based on visual data.
Some of the main applications of deep learning in computer vision include:

• Object detection and recognition: Deep learning models can be used to identify and locate
objects within images and videos, enabling applications such as self-driving cars, surveillance,
and robotics.
• Image classification: Deep learning models can be used to classify images into categories
such as animals, plants, and buildings. This is used in applications such as medical imaging,
quality control, and image retrieval.
• Image segmentation: Deep learning models can be used to segment images into different
regions, making it possible to identify specific features within images.
Natural Language Processing

In NLP, deep learning models enable machines to understand and generate human
language.
Some of the main applications of deep learning in NLP include:

• Automatic text generation: Deep learning models can learn from a corpus of text, and new
text such as summaries or essays can be automatically generated using these trained models.
• Language translation: Deep learning models can translate text from one language to
another, making it possible to communicate with people from different linguistic backgrounds.
• Sentiment analysis: Deep learning models can analyze the sentiment of a piece of
text, making it possible to determine whether the text is positive, negative, or neutral. This is
used in applications such as customer service, social media monitoring, and political analysis.
• Speech recognition: Deep learning models can recognize and transcribe spoken words,
making it possible to perform tasks such as speech-to-text conversion, voice search, and
voice-controlled devices.

Reinforcement Learning

In reinforcement learning, deep learning is used to train agents to take actions in an
environment so as to maximize a reward. Reinforcement learning (RL) allows machines to learn
by interacting with an environment and receiving feedback based on their actions. This feedback
comes in the form of rewards or penalties.
Some of the main applications of deep learning in reinforcement learning include:

• Game playing: Deep reinforcement learning models have been able to beat human experts at
games such as Go, Chess, and Atari.
• Robotics: Deep reinforcement learning models can be used to train robots to perform complex
tasks such as grasping objects, navigation, and manipulation.
• Control systems: Deep reinforcement learning models can be used to control complex
systems such as power grids, traffic management, and supply chain optimization.
Types of deep learning models

Deep learning algorithms are incredibly complex, and there are different types of neural
networks to address specific problems or datasets. Each has its own advantages and
they are presented here roughly in the order of their development, with each successive
model adjusting to overcome a weakness in a previous model.
One potential weakness across them all is that deep learning models are often “black
boxes,” making it difficult to understand their inner workings and posing interpretability
challenges. But this can be balanced against the overall benefits of high accuracy and
scalability.

●​ Convolutional Neural Network (CNN)
●​ Recurrent Neural Networks (RNNs)
●​ Long Short-Term Memory Networks (LSTMs)
●​ Deep Belief Networks (DBNs)
●​ Generative Adversarial Networks (GANs)
●​ Autoencoders
●​ Transformers
CONVOLUTIONAL NEURAL NETWORKS (CNNs)

Deep learning has proved to be a very powerful tool because of its ability to handle large
amounts of data, and the use of many hidden layers has surpassed traditional techniques,
especially in pattern recognition. One of the most popular deep neural networks is the
Convolutional Neural Network (also known as CNN or ConvNet), especially when it comes to
computer vision applications.

What is Convolutional Neural Network?


A convolutional neural network is a feed-forward neural network that is generally used to
analyze visual images by processing data with a grid-like topology. It is also known as a ConvNet.
A convolutional neural network is used to detect and classify objects in an image.

Below is a neural network that identifies two types of flowers: Orchid and Rose
In CNN, every image is represented in the form of an array of pixel values.

The convolution operation forms the basis of any convolutional neural network. Let’s understand
the convolution operation using two one-dimensional arrays, a and b.

a = [5,3,7,5,9,7]

b = [1,2,3]

In the convolution operation, the filter b is slid along a; at each position the overlapping elements
are multiplied element-wise and the products are summed to create a new array, which represents a*b.

The first three elements of the matrix a are multiplied with the elements of matrix b. The product
is summed to get the result.
The next three elements from the matrix a are multiplied by the elements in matrix b, and the
product is summed up.

This process continues until the convolution operation is complete.
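The sliding-window computation described above can be written directly in code. A short sketch in plain Python, using only the arrays a and b given above:

a = [5, 3, 7, 5, 9, 7]
b = [1, 2, 3]

result = []
for start in range(len(a) - len(b) + 1):
    window = a[start:start + len(b)]                      # three elements of a
    result.append(sum(x * y for x, y in zip(window, b)))  # element-wise multiply, then sum

print(result)   # [32, 32, 44, 44] for the arrays above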

How Does CNN Recognize Images?

Consider a simple example image in which the colored boxes represent a pixel value of 1 and the
uncolored boxes a value of 0. The CNN operates directly on this grid of pixel values to recognize
what the image contains.
Layers in a Convolutional Neural Network

A convolutional neural network has multiple hidden layers that help in extracting information from
an image. The important layers in a CNN are:

1.​ Convolution layer
2.​ ReLU / activation layer
3.​ Pooling layer
4.​ Flattening
5.​ Fully connected layer
6.​ Output layer

●​ Convolution Layer

This is the first step in the process of extracting valuable features from an image. A convolution
layer has several filters that perform the convolution operation. Every image is considered as a
matrix of pixel values.

Consider the following 5x5 image whose pixel values are either 0 or 1. There’s also a filter
matrix with a dimension of 3x3. Slide the filter matrix over the image and compute the dot
product to get the convolved feature matrix.
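A small sketch of that sliding dot product (Python/NumPy; the particular 5x5 image and 3x3 filter below are made-up illustrative values, since the report's figure is not reproduced here):

import numpy as np

image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])           # 5x5 image of 0/1 pixel values

kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])                # 3x3 filter matrix

out = np.zeros((3, 3))                        # (5 - 3 + 1) valid positions in each direction
for i in range(3):
    for j in range(3):
        patch = image[i:i + 3, j:j + 3]       # region currently under the filter
        out[i, j] = np.sum(patch * kernel)    # dot product = one convolved feature value

print(out)

Each entry of out is one value of the convolved feature matrix; a convolution layer with several different filters produces several such feature maps.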
●​ ReLU layer

ReLU stands for the rectified linear unit. Once the feature maps are extracted, the next step is to
move them to a ReLU layer.

ReLU performs an element-wise operation and sets all the negative pixel values to 0. It introduces
non-linearity into the network, and the generated output is a rectified feature map. The graph of
the ReLU function is zero for all negative inputs and rises linearly for positive inputs.

The original image is scanned with multiple convolution and ReLU layers to locate the
features.
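The ReLU step itself is a single element-wise operation (Python/NumPy; the small feature map below is an invented example containing some negative values):

import numpy as np

feature_map = np.array([[ 4., -2.,  1.],
                        [ 0.,  3., -1.],
                        [-5.,  2.,  6.]])     # example feature map with negative entries

rectified = np.maximum(0.0, feature_map)      # every negative entry becomes 0
print(rectified)

Positive responses pass through unchanged while negative ones are set to zero, which is exactly the non-linearity described above.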
●​ Pooling Layer

Pooling is a down-sampling operation that reduces the dimensionality of the feature map. The
rectified feature map now goes through a pooling layer to generate a pooled feature map.

The pooling layer condenses each rectified feature map while keeping its strongest responses, so
the features picked out by the different filters (edges, corners, or parts such as a body, feathers,
eyes, and a beak) are preserved in a more compact form.
At this stage, the convolutional neural network consists of convolution, ReLU, and pooling layers stacked one after the other.

The next step in the process is called flattening. Flattening is used to convert all the resultant
2-Dimensional arrays from pooled feature maps into a single long continuous linear vector.

The flattened matrix is fed as input to the fully connected layer to classify the image.
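A minimal sketch of max pooling followed by flattening (Python/NumPy; the 4x4 feature map is an arbitrary example):

import numpy as np

feature_map = np.array([[1., 3., 2., 0.],
                        [4., 6., 1., 2.],
                        [0., 2., 5., 7.],
                        [1., 3., 8., 4.]])

# 2x2 max pooling: keep the largest value in each non-overlapping 2x2 block
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)        # [[6. 2.]
                     #  [3. 8.]]

# Flattening: turn the pooled 2-D map into one long vector
flat = pooled.flatten()
print(flat)          # [6. 2. 3. 8.]

The flattened vector is what the fully connected layer receives as input for classification.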
Here’s how exactly a CNN recognizes a bird:

●​ The pixels from the image are fed to the convolutional layer that performs the
convolution operation
●​ It results in a convolved map
●​ The convolved map is applied to a ReLU function to generate a rectified feature map
●​ The image is processed with multiple convolutions and ReLU layers for locating the
features
●​ Different pooling layers with various filters are used to identify specific parts of the
image
●​ The pooled feature map is flattened and fed to a fully connected layer to get the final
output
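Put together, the pipeline listed above maps onto a small model definition. The sketch below uses PyTorch as an illustrative framework choice; the layer sizes, the 64x64 input, and the two output classes (for example, bird / not bird) are assumptions, not values from the report:

import torch
from torch import nn

model = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3),  # convolution over the image pixels
    nn.ReLU(),                                                # rectified feature maps
    nn.MaxPool2d(kernel_size=2),                              # pooled feature maps
    nn.Flatten(),                                             # one long vector
    nn.Linear(8 * 31 * 31, 2),                                # fully connected layer for 2 classes
)

x = torch.randn(1, 3, 64, 64)    # a dummy 64x64 RGB image standing in for real input
logits = model(x)
print(logits.shape)              # torch.Size([1, 2])

Training such a model still follows the general recipe from earlier: compare the outputs with labels, and backpropagate the error to adjust the weights of every layer.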
●​ Activation Layer

The activation layer introduces nonlinearity into the network by applying an activation function to
the output of the previous layer. This is crucial for the network to learn complex patterns.
Common activation functions, such as ReLU, Tanh, and Leaky ReLU, transform the input while
keeping the output size unchanged.

●​ Flattening

After the convolution and pooling operations, the feature maps still exist in a multi-dimensional
format. Flattening converts these feature maps into a one-dimensional vector. This process is
essential because it prepares the data to be passed into fully connected layers for classification
or regression tasks.

●​ Output Layer

In the output layer, the final result from the fully connected layers is passed through a function
such as sigmoid or softmax. These functions convert the raw scores into probability
distributions, enabling the model to predict the most likely class label.
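For example, a softmax turns the raw scores coming out of the fully connected layer into a probability distribution over the classes (Python/NumPy; the three scores are arbitrary):

import numpy as np

scores = np.array([2.0, 1.0, 0.1])       # raw outputs ("logits") for three classes
exp = np.exp(scores - scores.max())      # subtract the max for numerical stability
probs = exp / exp.sum()
print(probs, probs.sum())                # approximately [0.659 0.242 0.099], sums to 1

The class with the highest probability is taken as the model's prediction.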
Applications of CNN

●​ Image Classification

CNNs excel at image classification, which involves sorting images into
predefined categories. They can effectively identify whether an image depicts a cat, dog,
car, or flower, making them indispensable for tasks that require sorting and labeling large
volumes of visual data.

●​ Object Detection

CNNs are particularly skilled in object detection, allowing them to identify and pinpoint
specific items within an image. Whether it's recognizing people, cars, or buildings, CNNs
can locate these objects and highlight their positions, which is crucial for applications
needing accurate object placement and identification.

●​ Image Segmentation

CNNs are highly effective for tasks that involve breaking down an image into distinct
parts. Image segmentation allows CNNs to distinguish and label different objects or
regions within an image. This capability is essential in fields like medical imaging, where
detailed analysis of structures is required, and in robotics, where intricate scenes need to
be understood.

●​ Video Analysis

CNNs are also adept at video analysis, where they can track objects and detect events
over time. This makes them valuable for applications like surveillance and traffic
monitoring, where continuously analyzing dynamic scenes helps in understanding and
managing real-time activities.
RECURRENT NEURAL NETWORKS (RNNs)

Recurrent Neural Networks (RNNs) are a type of artificial neural network designed to process
sequences of data. They work especially well for tasks involving sequences, such as time series
data, speech, and natural language.

RNN works on the principle of saving the output of a particular layer and feeding this back to the
input in order to predict the output of the layer.

Below is how you can convert a Feed-Forward Neural Network into a Recurrent Neural Network:

Fig: Simple Recurrent Neural Network

The nodes in the different layers of the feed-forward network are compressed to form a single
recurrent layer. A, B, and C are the parameters of the network.
Fig: Fully connected Recurrent Neural Network

Here, “x” is the input layer, “h” is the hidden layer, and “y” is the output layer. A, B, and C are the
network parameters used to improve the output of the model. At any given time t, the network
combines the current input x(t) with the state carried over from the previous input x(t-1). The
output at any given time is fed back into the network to improve future outputs.

Now that you understand what a recurrent neural network is, let’s look at why recurrent neural
networks are needed and how they work.

Why Recurrent Neural Networks?

RNNs were created because the feed-forward neural network has a few limitations:

●​ It cannot handle sequential data
●​ It considers only the current input
●​ It cannot memorize previous inputs

The solution to these issues is the RNN. An RNN can handle sequential data, accepting both the
current input and previously received inputs, and it can memorize previous inputs thanks to its
internal memory.

How Do Recurrent Neural Networks Work?

In Recurrent Neural networks, the information cycles through a loop to the middle hidden layer.

Fig: Working of Recurrent Neural Network


The input layer ‘x’ takes the input to the neural network, processes it, and passes it on to
the middle layer.

The middle layer ‘h’ can consist of multiple hidden layers, each with its own activation functions,
weights, and biases. In a plain feed-forward network, the parameters of the different hidden
layers are independent of one another and the network has no memory of previous inputs; a
recurrent neural network is used when that memory is needed.

The recurrent neural network standardizes the activation functions, weights, and biases so that
each hidden layer has the same parameters. Then, instead of creating multiple hidden layers, it
creates one and loops over it as many times as required.
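A minimal sketch of that loop (Python/NumPy; the sizes, weights, and input sequence are invented for illustration) shows one set of shared parameters being reused at every time step while the hidden state carries memory forward:

import numpy as np

rng = np.random.default_rng(1)
W_x = rng.normal(scale=0.5, size=(4, 3))   # input-to-hidden weights (shared across steps)
W_h = rng.normal(scale=0.5, size=(4, 4))   # hidden-to-hidden weights (the recurrent loop)
b = np.zeros(4)

sequence = [rng.normal(size=3) for _ in range(5)]   # 5 time steps, 3 features each
h = np.zeros(4)                                     # initial hidden state (the "memory")

for x_t in sequence:
    h = np.tanh(W_x @ x_t + W_h @ h + b)   # the same parameters are reused at every step
    print(h)

Because h is fed back in at every step, the state after the last input depends on the whole sequence, which is what lets an RNN remember previous inputs.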

Applications of Recurrent Neural Networks


RNNs are used in various applications where data is sequential or time-based:

●​ Time-series prediction: RNNs excel in forecasting tasks, such as stock market
prediction and weather forecasting.
●​ Natural Language Processing (NLP): RNNs are fundamental in NLP tasks like
language modeling, sentiment analysis, and machine translation.
●​ Speech recognition: RNNs capture temporal patterns in speech data, aiding
speech-to-text and other audio-related applications.
●​ Image and video processing: When combined with convolutional layers, RNNs
help analyze video sequences, facial expressions, and gestures.
Generative Adversarial Networks(GANs)

Generative Adversarial Networks (GANs) were introduced in 2014 by Ian J. Goodfellow and
co-authors. GANs perform unsupervised learning tasks in machine learning. A GAN consists of
two models that automatically discover and learn the patterns in input data.

The two models are known as the Generator and the Discriminator.

They compete with each other to scrutinize, capture, and replicate the variations within a
dataset. GANs can be used to generate new examples that plausibly could have been drawn
from the original dataset.

What is a Generator?

A Generator in a GAN is a neural network that creates the fake data on which the
discriminator is trained. It learns to generate plausible data, and the generated examples become
negative training examples for the discriminator. It takes a fixed-length random vector of noise as
input and generates a sample from it.

The main aim of the Generator is to make the discriminator classify its output as real. The part
of the GAN that trains the Generator includes:

●​ a noisy input vector
●​ the generator network, which transforms the random input into a data instance
●​ the discriminator network, which classifies the generated data
●​ the generator loss, which penalizes the Generator for failing to fool the discriminator

Backpropagation is used to adjust each weight in the right direction by calculating the weight's
impact on the output; the resulting gradients are then used to change the generator weights.
What is a Discriminator?

The Discriminator is a neural network that distinguishes real data from the fake data created
by the Generator. The discriminator's training data comes from two different sources:

●​ Real data instances, such as real pictures of birds, humans, or currency notes, are
used by the Discriminator as positive samples during training.
●​ Fake data instances created by the Generator are used as negative examples
during training.

While training the discriminator, two loss functions are involved, but the discriminator ignores
the generator loss and uses only the discriminator loss. During training, the discriminator
classifies both real data and fake data from the generator. The discriminator loss penalizes the
discriminator for misclassifying a real data instance as fake or a fake data instance as real, and
the discriminator updates its weights through backpropagation of this loss through the
discriminator network.
How does a GAN work?
GANs train by having the two networks, the Generator (G) and the Discriminator (D), compete
with and improve each other. Here's the step-by-step process:

1. Generator's First Move

The generator starts with a random noise vector, which is simply a set of random numbers. It
uses this noise as a starting point to create a fake data sample, such as a generated image. The
generator’s internal layers transform this noise into something that looks like real data.

2. Discriminator's Turn

The discriminator receives two types of data:

●​ Real samples from the actual training dataset.
●​ Fake samples created by the generator.

The discriminator's job is to analyze each input and determine whether it is real data or
something the generator produced. It outputs a probability score between 0 and 1: a score near
1 indicates the data is likely real, while a score near 0 suggests it is fake.

3. Adversarial Learning

●​ If the discriminator correctly classifies real and fake data, it gets better at its job.
●​ If the generator fools the discriminator by creating realistic fake data, the generator
receives a positive update and the discriminator is penalized for making a wrong decision.

4. Generator's Improvement

●​ Each time the discriminator mistakes fake data for real, the generator learns from
this success.
●​ Through many iterations, the generator improves and creates more convincing fake
samples.
5. Discriminator's Adaptation

●​ The discriminator also learns continuously by updating itself to better spot fake data.
●​ This constant back-and-forth makes both networks stronger over time.

6. Training Progression

●​ As training continues, the generator becomes highly proficient at producing realistic
data.
●​ Eventually the discriminator struggles to distinguish real data from fake, which shows
that the GAN has reached a well-trained state.
●​ At this point, the generator can produce high-quality synthetic data that can be used
for different applications.
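The alternating procedure above can be sketched as a training loop. The following is a minimal illustration in PyTorch (an assumed framework choice); the tiny network sizes are arbitrary, and random vectors stand in for a real dataset, which would normally be loaded separately:

import torch
from torch import nn

latent_dim, data_dim = 16, 2

G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 32), nn.LeakyReLU(0.2), nn.Linear(32, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)

for step in range(1000):
    # Discriminator's turn: real samples vs. fake samples from G
    real = torch.randn(64, data_dim) + 3.0           # placeholder "real" data (assumed distribution)
    fake = G(torch.randn(64, latent_dim)).detach()   # detach so G is not updated in this step
    d_loss = loss_fn(D(real), torch.ones(64, 1)) + loss_fn(D(fake), torch.zeros(64, 1))
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # Generator's turn: try to make D label its output as real
    g_loss = loss_fn(D(G(torch.randn(64, latent_dim))), torch.ones(64, 1))
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()

print(d_loss.item(), g_loss.item())

Each iteration first trains the discriminator on a mix of real and generated samples using the discriminator loss, then trains the generator against the generator loss, which is exactly the back-and-forth described above.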

Types of GANs
1.​ Vanilla GANs: Vanilla GANs have a min-max optimization formulation where the
Discriminator is a binary classifier and uses sigmoid cross-entropy loss during
optimization. The Generator and the Discriminator in Vanilla GANs are multi-layer
perceptrons. The algorithm tries to optimize the mathematical equation using stochastic
gradient descent.
2.​ Deep Convolutional GANs (DCGANs): DCGANs use convolutional neural networks
instead of plain multi-layer perceptrons for both the Discriminator and the Generator. They
are more stable and generate better-quality images. The Generator is a set of convolution layers
with fractional-strided convolutions or transpose convolutions, so it up-samples the input
image at every convolutional layer. The discriminator is a set of convolution layers with
strided convolutions, so it down-samples the input image at every convolution layer.
3.​ Conditional GANs: Vanilla GANs can be extended into Conditional models by using
extra-label information to generate better results. In CGAN, an additional parameter ‘y’ is
added to the Generator for generating the corresponding data. Labels are fed as input to
the Discriminator to help distinguish the real data from the fake generated data.
4.​ Super Resolution GANs: SRGANs use deep neural networks along with an adversarial
network to produce higher resolution images. SRGANs generate a photorealistic
high-resolution image when given a low-resolution image.

Applications of Generative Adversarial Networks (GANs)

1.​ Image Synthesis & Generation: GANs generate realistic images, avatars, and
high-resolution visuals by learning patterns from training data. They are used in art,
gaming, and AI-driven design.
2.​ Image-to-Image Translation: GANs can transform images between domains while
preserving key features. Examples include converting day images to night, turning
sketches into realistic images, or changing artistic styles.
3.​ Text-to-Image Synthesis: GANs create visuals from textual descriptions, which helps
applications in AI-generated art, automated design, and content creation.
4.​ Data Augmentation: GANs generate synthetic data to improve machine learning
models, helping make them more robust and generalizable in fields with limited
labeled data.
5.​ High-Resolution Image Enhancement: GANs upscale low-resolution images, improving
clarity for applications like medical imaging, satellite imagery, and video enhancement.
