Deep Learning
MLP (Feedforward Neural Network)
• Definition: A fully connected, feedforward neural
network with at least one hidden layer between the
input and output layers.
• Feedforward: Data flows in one direction (input →
hidden layers → output) with no cycles.
• Purpose: Learn non-linear relationships in data for
tasks like classification, regression, and pattern
recognition.
Architecture of an MLP
•Input Layer: Receives raw features (e.g., pixels in an
image, words in a document).
•Hidden Layers: Transform inputs using weights and
activation functions (e.g., ReLU, Sigmoid).
•Output Layer: Produces predictions (e.g., class
probabilities for classification).
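A minimal Keras sketch of this architecture (the layer sizes and the
784-feature input, e.g. flattened 28x28 images with 10 classes, are
illustrative assumptions, not a prescribed setup):

import tensorflow as tf
from tensorflow.keras import layers

# Input layer -> two ReLU hidden layers -> softmax output layer
mlp = tf.keras.Sequential([
    layers.Input(shape=(784,)),              # input layer: raw feature vector
    layers.Dense(128, activation="relu"),    # hidden layer 1
    layers.Dense(64, activation="relu"),     # hidden layer 2
    layers.Dense(10, activation="softmax"),  # output layer: class probabilities
])
mlp.compile(optimizer="adam",
            loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])
mlp.summary()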
Convolutional Neural Networks
(CNNs)
• Convolutional Neural Networks (CNNs) are used for image
classification (2D CNNs) and text classification (1D CNNs).
2D CNN for Image Classification
What:
CNNs use convolutional layers to extract spatial
hierarchies of features (edges → textures → objects)
from images.
Why:
• Translation invariance: Detects patterns regardless of
position.
• Parameter sharing: Fewer weights than dense layers.
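A small 2D CNN sketch in Keras (the 28x28x1 input shape, filter counts,
and 10 output classes are assumptions chosen for illustration):

import tensorflow as tf
from tensorflow.keras import layers

cnn2d = tf.keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, kernel_size=3, activation="relu"),  # low-level features (edges)
    layers.MaxPooling2D(pool_size=2),                     # downsample feature maps
    layers.Conv2D(64, kernel_size=3, activation="relu"),  # higher-level features (textures/parts)
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),               # class probabilities
])
cnn2d.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])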
1D CNN for Text Classification
What:
1D CNNs apply temporal convolutions to sequences
(e.g., word embeddings) to detect local patterns (n-
grams).
Why:
• Faster than RNNs: Parallel processing of sequences.
• Captures local context: Detects phrases or word
combinations.
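A hedged 1D CNN sketch for text classification (vocabulary size,
sequence length, and embedding size below are illustrative assumptions):

import tensorflow as tf
from tensorflow.keras import layers

cnn1d = tf.keras.Sequential([
    layers.Input(shape=(200,)),                            # 200 token ids per document
    layers.Embedding(input_dim=20000, output_dim=100),     # word embeddings
    layers.Conv1D(128, kernel_size=5, activation="relu"),  # detects ~5-gram patterns
    layers.GlobalMaxPooling1D(),                           # keep the strongest response per filter
    layers.Dense(1, activation="sigmoid"),                 # binary classification
])
cnn1d.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])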
Recurrent Neural Networks
(RNNs)
• Class of neural networks designed for sequential
data (e.g., time series, text, speech).
• Purpose: Process sequences by maintaining a hidden
state that captures temporal dependencies.
• Key Idea: Reuse weights across time steps, allowing
the network to "remember" past information.
• Use Cases:
• Time series forecasting.
• Text generation/sentiment analysis.
• Speech recognition.
• Machine translation.
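A minimal recurrent sketch for a sentiment-style task (all sizes are
assumptions for illustration; the hidden state is reused across time steps):

import tensorflow as tf
from tensorflow.keras import layers

rnn = tf.keras.Sequential([
    layers.Input(shape=(100,)),                        # 100 token ids per sequence
    layers.Embedding(input_dim=10000, output_dim=64),
    layers.SimpleRNN(64),    # hidden state is updated and carried forward at every step
    layers.Dense(1, activation="sigmoid"),
])
rnn.compile(optimizer="adam",
            loss="binary_crossentropy",
            metrics=["accuracy"])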
Challenges with Basic RNNs
• Vanishing/Exploding Gradients: Difficulty learning
long-range dependencies (e.g., in long sentences).
• Short-Term Memory: Basic RNNs struggle to retain
information over many steps.
Solutions:
• Long Short-Term Memory (LSTM):
Uses gates (input, forget, output) to control information
flow.
• Gated Recurrent Units (GRU): Simplified version of
LSTM with fewer gates.
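In Keras, LSTM and GRU layers are drop-in replacements for a basic
recurrent layer; a sketch with illustrative sizes:

import tensorflow as tf
from tensorflow.keras import layers

lstm_model = tf.keras.Sequential([
    layers.Input(shape=(100,)),
    layers.Embedding(input_dim=10000, output_dim=64),
    layers.LSTM(64),   # input/forget/output gates control what is kept or discarded
    layers.Dense(1, activation="sigmoid"),
])

gru_model = tf.keras.Sequential([
    layers.Input(shape=(100,)),
    layers.Embedding(input_dim=10000, output_dim=64),
    layers.GRU(64),    # fewer gates (update/reset) than LSTM, typically faster to train
    layers.Dense(1, activation="sigmoid"),
])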
What Are Word Embeddings?
•Definition: Dense vector representations of words in a
continuous space, where similar words are closer
geometrically.
•Purpose: Capture semantic meaning (e.g., king - man +
woman ≈ queen) and syntactic patterns (e.g., verb
tenses).
•Key Benefit: Overcome the sparsity and high dimensionality of
traditional methods like Bag-of-Words (BoW).
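The analogy above can be checked with pretrained vectors via gensim's
downloader (the model name is an assumption and the first call downloads
the vectors over the network):

import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")   # 50-dimensional GloVe vectors
# king - man + woman: "queen" is expected to rank near the top
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))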
Word2Vec
A framework introduced by Google in 2013 for learning word
embeddings using shallow neural networks. It has two
variants:
1.Continuous Bag-of-Words (CBOW)
2.Skip-Gram
Continuous Bag-of-Words
(CBOW)
• Goal: Predict a target word from its surrounding context
words.
• Input: Average of context word vectors (e.g., window of
2 words before/after).
• Output: Probability distribution over the vocabulary for
the target word.
Example:
•Context: ["The", "cat", "on", "the"] → Predict target
word "sat".
•Training: Adjust weights to maximize the probability
of "sat" given the context.
Skip-Gram
•Goal: Predict context words given a target word (inverse
of CBOW).
•Input: Target word vector.
•Output: Probability distribution over context words.
Example:
•Target word: "sat" → Predict context ["The", "cat", "on",
"the"].
•Training: Adjust weights to maximize the probability of
context words.
Gensim and Custom Embedding
Training
• Gensim: A Python library for natural language
processing, particularly known for its topic
modeling and document indexing capabilities.
• Word Embeddings: These are vector representations
of words, capturing semantic relationships between
them.
• Word2Vec: A popular algorithm within Gensim for
creating word embeddings.
Training a Custom Word2Vec Model:
•Input Data:
You'll need a corpus of text data. Gensim's Word2Vec expects
input as a sequence of sentences, where each sentence is a
list of words.
•Preprocessing:
You might need to preprocess your text data (e.g., lowercasing,
removing punctuation, handling special characters) before
feeding it to the model.
•Model Training:
Instantiate Gensim's Word2Vec model and train it on the preprocessed
corpus, as in the sketch below.
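A minimal custom training sketch (gensim >= 4.0 API; the toy corpus and
hyperparameters are illustrative assumptions):

from gensim.models import Word2Vec

sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
]

model = Word2Vec(
    sentences,
    vector_size=100,   # embedding dimensionality
    window=2,          # context window size
    min_count=1,       # keep every word in this tiny corpus
    sg=0,              # 0 = CBOW, 1 = Skip-Gram
    workers=1,
)

vector = model.wv["cat"]                    # learned embedding for "cat"
print(model.wv.most_similar("cat", topn=2)) # nearest neighbors in the embedding space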
Sequence Models
• Sequence models are a type of machine learning model
designed to process and predict sequential data, such
as text, time series, or audio, by leveraging the inherent
order and dependencies within the data.
• They are commonly used in tasks like machine
translation, speech recognition, and text generation.
• Sequence models can be categorized by their input/output
structures: 1-to-1, 1-to-Many, Many-to-1, and Many-to-Many.
1-to-1 (Vanilla Feedforward
Model)
• Structure: Single input → Single output (no sequential
dependency).
Use Case: Standard classification/regression (e.g.,
MNIST digit classification).
1-to-Many
Structure: Single input → Sequence of outputs.
Use Cases:
• Image Captioning: Generate a sentence from an
image.
• Music Generation: Create a melody from a seed note.
Many-to-1
Structure: Sequence of inputs → Single output.
Use Cases:
• Sentiment Analysis: Classify a sentence as
positive/negative.
• Time Series Forecasting: Predict stock price from
historical data.
Many-to-Many
Structure: Sequence of inputs → Sequence of outputs.
Subtypes:
1.Aligned (Same Length): Each input step maps to an
output step (e.g., POS tagging).
2.Non-Aligned (Different Length): Input and output
sequences differ in length (e.g., translation).
a. Aligned Many-to-Many Use Cases:
• POS Tagging: Assign grammatical tags to each word.
• Video Frame Prediction: Predict next frame in a
video.
b. Non-Aligned Many-to-Many (Encoder-Decoder) Use
Cases:
• Machine Translation: Translate English → French.
• Speech Recognition: Transcribe audio → text.
Encoder: Processes input sequence into a context
vector.
Decoder: Generates output sequence from the context
vector.
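A hedged sketch of an LSTM encoder-decoder for the non-aligned case
(vocabulary sizes and dimensions are assumptions for illustration):

import tensorflow as tf
from tensorflow.keras import layers

latent_dim = 256

# Encoder: compress the input sequence into a context (its final states).
enc_inputs = layers.Input(shape=(None,))
enc_emb = layers.Embedding(input_dim=10000, output_dim=128)(enc_inputs)
_, state_h, state_c = layers.LSTM(latent_dim, return_state=True)(enc_emb)

# Decoder: generate the output sequence, initialised with the encoder states.
dec_inputs = layers.Input(shape=(None,))
dec_emb = layers.Embedding(input_dim=12000, output_dim=128)(dec_inputs)
dec_outputs = layers.LSTM(latent_dim, return_sequences=True)(
    dec_emb, initial_state=[state_h, state_c])
dec_outputs = layers.Dense(12000, activation="softmax")(dec_outputs)

seq2seq = tf.keras.Model([enc_inputs, dec_inputs], dec_outputs)
seq2seq.compile(optimizer="adam", loss="sparse_categorical_crossentropy")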
Bi-Directional LSTM/RNN in
Sequence Models
• Bidirectional LSTMs (BiLSTMs) are recurrent neural
networks that process sequential data in both the forward
and backward directions. This allows the model to capture
context from both past and future inputs, which is
particularly useful in tasks like natural language
processing.
• Why: Traditional unidirectional RNNs only use past
context. Bidirectional models improve performance in
tasks where future context matters (e.g., text
understanding, speech recognition).
•Two Hidden Layers:
• Forward Layer: Processes the sequence from t = 1 to t = T.
• Backward Layer: Processes the sequence from t = T to t = 1.
•Output Concatenation: Combines the outputs from both layers at
each time step.
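In Keras, the Bidirectional wrapper runs one LSTM forward and one
backward and concatenates their outputs; a sketch with illustrative sizes:

import tensorflow as tf
from tensorflow.keras import layers

bilstm = tf.keras.Sequential([
    layers.Input(shape=(100,)),
    layers.Embedding(input_dim=10000, output_dim=64),
    layers.Bidirectional(layers.LSTM(64)),   # forward + backward passes, outputs concatenated
    layers.Dense(1, activation="sigmoid"),
])
bilstm.compile(optimizer="adam",
               loss="binary_crossentropy",
               metrics=["accuracy"])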
Use Cases
1.Sentiment Analysis: Understand how words influence each other in
both directions (e.g., "not good").
2.Machine Translation: Encoder in seq2seq models (replaced by
Transformers in modern systems).
3.Speech Recognition: Transcribe audio by leveraging past and future
frames.
Limitations
• Not Suitable for Real-Time Prediction: Future
context isn’t available in streaming tasks (e.g., live
captioning).
• Memory-Intensive: Requires storing states for both
directions.