SYSC4415
Introduction to Machine Learning
Lecture 11
Prof James Green
jrgreen@sce.Carleton.ca
Systems and Computer Engineering, Carleton University
Learning Objectives for Lecture 11
• Introduce recurrent neural networks (RNN) and Long Short-Term
Memory (LSTM) networks for analyzing sequential data
• Understand how a recurrent neural network (RNN) differs from an MLP
• Understand the function of each gating sub-unit within an LSTM
• Introduce at least one software framework for building, training, and
testing RNN/LSTM
Pre-Lecture Assignment
• Chapter 6.2.2 (pp. 72-75; 4 pages)
• https://www.youtube.com/watch?v=WCUNPb-5EYI
• (RNN and LSTM at a conceptual level; 26min, or 17min @ 1.5 speed)
Key terms
• Recurrent neural networks (RNNs), state, softmax function,
backpropagation through time, long short-term memory (LSTM),
Gated Recurrent Unit (GRU), minimal gated GRU, bi-directional RNNs,
attention, sequence-to-sequence RNN, recursive neural network.
In-Class Activities
• Review key concepts in the chapter through discussion,
PollEverywhere questions
• Tutorial: Review a Jupyter notebook that builds, trains, and tests an LSTM network using Keras (a minimal sketch follows below)
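Below is a minimal sketch (not the tutorial notebook itself), assuming TensorFlow/Keras is installed; the layer size (32 units), sequence length, and randomly generated toy data are illustrative choices only.

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

T, d, c = 50, 8, 3                                  # time steps, features per step, # classes
X = np.random.rand(200, T, d)                       # 200 toy input sequences
y = keras.utils.to_categorical(np.random.randint(c, size=200), c)

model = keras.Sequential([
    layers.Input(shape=(T, d)),
    layers.LSTM(32),                                # 32 LSTM units; returns the final hidden state
    layers.Dense(c, activation="softmax"),          # class probabilities
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2)   # train
model.evaluate(X, y)                                # test (here on the same toy data)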
RNN: used to label, classify, or generate sequences
• Different from a feedforward NN (FNN): an RNN contains loops. Each unit u of recurrent layer l has a state h_{l,t}^u, which acts as the memory of the RNN.
• Seq2seq models, specifically the LSTM, combine element-by-element addition with remember/forget gating (gate values between 0 and 1, e.g. 1, 0.5, 0).
• Parameters (w_{l,u}, u_{l,u}, b_{l,u}, arranged in matrices) are found using gradient descent with backpropagation through time.
Motivational Example
• Introduction to RNN:
• https://www.youtube.com/watch?v=LHXXI4-IEns (10 min)
• Note: the vanishing gradient problem causes us to move away from plain RNNs towards the LSTM.
Recurrent Neural Network Hidden Layers
• State update for recurrent layer l at time step t: h_{l,t} = g_1(W_l x_t + U_l h_{l,t-1} + b_l)
• Softmax function: softmax(z)_j = exp(z_j) / Σ_k exp(z_k), converting a score vector into class probabilities
• Set the dimensionality of parameter matrix V_j such that V_j h_{j,t} results in a vector of dimension c (# classes) before applying softmax
• Input sequence: X = [x_1, x_2, ..., x_T], one feature vector per time step
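A minimal NumPy sketch of these formulas for a single recurrent layer; the dimensions (d, units, c), random parameters, and toy input sequence are illustrative assumptions.

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())                 # subtract max for numerical stability
    return e / e.sum()

d, units, c = 4, 8, 3                       # input dim, recurrent units, # classes
rng = np.random.default_rng(0)
W = rng.normal(size=(units, d))             # input-to-state weights W_l
U = rng.normal(size=(units, units))         # state-to-state (recurrent) weights U_l
b = np.zeros(units)                         # bias b_l
V = rng.normal(size=(c, units))             # output weights: V h_t has dimension c

X = rng.normal(size=(10, d))                # input sequence of 10 time steps
h = np.zeros(units)                         # initial state (the layer's "memory")
for x_t in X:
    h = np.tanh(W @ x_t + U @ h + b)        # state update from current input and previous state
    y_t = softmax(V @ h)                    # class probabilities at time step t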
RNN Unfolding/unrolling
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
• Left: https://www.youtube.com/watch?v=S0XFd0VMFss (2-6 of 8min)
• Right: https://www.youtube.com/watch?v=_h66BW-xNgk&index=1&list=PLtBw6njQRU-rwp5__7C0oIVt26ZgjG9NI (~15 min mark)
RNN Unfolding
• [Figure: the same RNN cell, with shared parameters, unrolled/unfolded across time steps; source: colah.github.io]
LSTM and GRU
• Watch: https://www.youtube.com/watch?v=8HyCNIVRbSU (11min)
• [Figures: LSTM cell with inputs c_{t-1}, h_{t-1}, x_t and outputs c_t, h_t; GRU cell with inputs h_{t-1}, x_t and output h_t]
Minimal Gated Recurrent Unit
Each “cell” is made up of multiple units (the size of the layer). One unit shown here:
“Forget” or “Update” gate
Textbook: Minimal Gated GRU (gated recurrent unit)
1) New potential memory cell value h~_t, a function of the inputs and h_{t-1}; g_1 = tanh
2) Memory forget gate, a function of the inputs and h_{t-1}; 1 = forget = take the new value, 0 = keep = ignore the new value
   - g_2, the “gate function”, uses a sigmoid to produce the gate value
3) New memory cell value: either take the new h~_t value or keep h_{t-1}; a function of the gate value, h~_t, and h_{t-1}
4) Vector of new memory cell values, one per unit in this layer
5) Output vector; g_3 = softmax
Discussion of dimensions of signals in an LSTM: https://mmuratarat.github.io//2019-01-19/dimensions-of-lstm
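A minimal NumPy sketch of one time step of this minimal gated unit, assuming a layer of `units` cells; the parameter shapes and random values are illustrative, and the blend in step 3 follows the gate convention above (near 1 = take the new value, near 0 = keep the old one).

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

d, units, c = 4, 8, 3                                  # input dim, units in this layer, # classes
rng = np.random.default_rng(1)
W, U, b = rng.normal(size=(units, d)), rng.normal(size=(units, units)), np.zeros(units)      # candidate-memory parameters
Wg, Ug, bg = rng.normal(size=(units, d)), rng.normal(size=(units, units)), np.zeros(units)   # gate parameters
V = rng.normal(size=(c, units))                        # output parameters

x_t, h_prev = rng.normal(size=d), np.zeros(units)

h_new = np.tanh(W @ x_t + U @ h_prev + b)              # 1) new potential memory value (g1 = tanh)
gate = sigmoid(Wg @ x_t + Ug @ h_prev + bg)            # 2) forget gate in (0, 1): near 1 = take new, near 0 = keep old
h_t = gate * h_new + (1.0 - gate) * h_prev             # 3)-4) element-wise blend: one new memory value per unit
y_t = np.exp(V @ h_t) / np.exp(V @ h_t).sum()          # 5) output vector (g3 = softmax)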
Advanced RNN Architectures
• Other important extensions to RNNs include:
• A generalization of an RNN is a recursive neural network
• bi-directional RNNs
• RNNs with attention (see extended Ch6 material on course wiki)
• Attention-only networks = transformers…
• sequence-to-sequence RNN models.
• Frequently used to build neural machine translation models and other models for text to
text transformations.
• Will see this later in the textbook (section 7.7)…
• Combinations of CNN+LSTM
• Image Captioning
• Video:
• CNN on individual frames to extract feature vectors, LSTM for the time series (see the sketch below)
• Or look at 3D convolution with a fixed time window
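A hedged Keras sketch of the CNN+LSTM idea for video: a small CNN is applied to every frame via TimeDistributed to extract a per-frame feature vector, and an LSTM models the resulting sequence; the frame size, filter count, and number of classes are illustrative assumptions.

from tensorflow import keras
from tensorflow.keras import layers

frames, height, width, channels, c = 16, 64, 64, 3, 5  # clip length, frame size, # classes

model = keras.Sequential([
    layers.Input(shape=(frames, height, width, channels)),
    layers.TimeDistributed(layers.Conv2D(16, 3, activation="relu")),  # same small CNN applied to every frame
    layers.TimeDistributed(layers.MaxPooling2D()),
    layers.TimeDistributed(layers.Flatten()),                         # one feature vector per frame
    layers.LSTM(64),                                                  # temporal model over the frame features
    layers.Dense(c, activation="softmax"),
])
model.summary()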
Textbook Recommended Readings for RNN:
• An extended version of Chapter 6 with RNN unfolding, bidirectional RNN, and attention
• The Unreasonable Effectiveness of Recurrent Neural Networks by Andrej Karpathy (2015)
• Recurrent Neural Networks and LSTM by Niklas Donges (2018)
• Understanding LSTM Networks by Christopher Olah (2015)
• Introduction to RNNs by Denny Britz (2015)
• Implementing a RNN with Python, Numpy and Theano by Denny Britz (2015)
• Backpropagation Through Time and Vanishing Gradients by Denny Britz (2015)
• Implementing a GRU/LSTM RNN with Python and Theano by Denny Britz (2015)
• Simplified Minimal Gated Unit Variations for Recurrent Neural Networks by Joel Heck and
Fathi Salem (2017)
Transformers: “Attention is all you need”
Great 1-hour Transformers Tutorial
• Transformers (“Attention is all you need” 2017)
• Attention: watch ~11 mins from 10min mark: Transformers with Lucas Beyer
• “LSTM is Dead. Long Live Transformers” (2019)
• https://www.youtube.com/watch?v=S27pHKBEp30 (~45min)
• Warning: we won’t cover NLP for a few weeks…
• Code and pre-trained models: github.com/huggingface/transformers
• “The Illustrated Transformer”: http://jalammar.github.io/illustrated-transformer/
• Transformers replacing CNN for image analysis…
• “An Image Is Worth 16x16 Words: Transformers For Image Recognition At Scale”
https://arxiv.org/pdf/2010.11929.pdf (2021)
• ViT = Vision Transformer
• Break image into patches; flatten each patch into a vector; add positional information (where
did patch come from within image?); get ‘sequence’ of encoded patches; compute key, value,
query using linear layer; compute attention; MLP/FFNN; …
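A minimal NumPy sketch of the ViT front end described above: break the image into 16x16 patches, flatten each patch, project it linearly, and add positional information; the projection matrix, positional embeddings, and embedding dimension are random stand-ins for what a real ViT learns.

import numpy as np

H, W, C, P, D = 224, 224, 3, 16, 64           # image size, patch size, embedding dimension
rng = np.random.default_rng(2)
img = rng.random((H, W, C))                   # toy input image

# Break the image into non-overlapping PxP patches and flatten each into a vector
patches = img.reshape(H // P, P, W // P, P, C).transpose(0, 2, 1, 3, 4).reshape(-1, P * P * C)
# patches.shape == (196, 768): a 'sequence' of 196 patch vectors

E = rng.normal(size=(P * P * C, D))           # linear projection (learned in a real ViT)
pos = rng.normal(size=(patches.shape[0], D))  # positional embeddings (learned in a real ViT)
tokens = patches @ E + pos                    # encoded patch sequence passed to the transformer encoder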