Encoder Decoder

The document explains what an encoder-decoder model is. An encoder-decoder model takes an input sequence and encodes it into a fixed-length vector, then a decoder outputs a target sequence based on the encoded vector. It discusses how encoder-decoder models are used for tasks like image captioning, sentiment analysis, and language translation.

What is an encoder decoder model?

Encoder decoder is a widely used structure in deep learning. In this
article, we will look at its architecture.

Nechu BM
Published in Towards Data Science · 4 min read · Oct 7, 2020

Photo by Michael Dziedzic on Unsplash

In this post, we introduce the encoder decoder structure, in some cases
known as the Sequence to Sequence (Seq2Seq) model. To better understand
the structure of this model, previous knowledge of RNNs is helpful.

When do we use an encoder decoder model?

1-Image Captioning
Encoder decoder models allow for a process in which a machine learning
model generates a sentence describing an image. It receives the image as the
input and outputs a sequence of words. This also works with videos.
ML output: ‘Road surrounded by palm trees leading to a beach’, Photo by Milo Miloezger on Unsplash

2-Sentiment Analysis
These models take an input sentence, capture its meaning and emotion,
and output a sentiment score. The score usually ranges from -1 (negative) to 1
(positive), where 0 is neutral. It is used in call centers to analyse how
the client’s emotions evolve and how they react to certain keywords or company
discounts.

Image by the author

3-Translation
This model reads an input sentence, captures the message and the
concepts, then translates it into a second language. Google Translate is built
upon an encoder decoder structure; for more detail, see this paper.

Image by the author

What is an encoder decoder model?

The best way to understand the concept of an encoder-decoder model is by
playing Pictionary. The rules of the game are very simple: player 1 randomly
picks a word from a list and has to sketch its meaning in a drawing. The
role of the second player in the team is to analyse the drawing and identify
the word it describes. In this example we have three important
elements: player 1 (the person who converts the word into a drawing), the
drawing (a rabbit) and player 2 (the person who guesses the word the drawing
represents). This is all we need to understand an encoder decoder
model. Below we compare the Pictionary game with an
encoder decoder model for translating Spanish to English.

Pictionary Game, Image by the author

If we translate the above diagram into machine learning concepts, we get
the one below. In the following sections we will go through each
component.

Encoder Decoder Model, Image by the author

1-Encoder (Picturist)
Encoding means converting data into a required format. In the Pictionary
example we convert a word (text) into a drawing (image). In the machine
learning context, we convert a sequence of words in Spanish into a
fixed-length vector, also known as the hidden
state. The encoder is built by stacking recurrent neural network (RNN) layers. We
use this type of layer because its structure allows the model to capture the
context and temporal dependencies of the sequence. The output of the
encoder, the hidden state, is the state of the last RNN timestep.
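
The encoder described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the model from the article: it uses a single simple RNN layer with random, untrained weights, and the token ids, sizes and helper names (`make_matrix`, `rnn_step`, `encode`) are all hypothetical.

```python
import math
import random

def make_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these values."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def rnn_step(x, h, W_x, W_h, b):
    """One simple-RNN step: new_h = tanh(W_x x + W_h h + b)."""
    new_h = []
    for i in range(len(h)):
        total = b[i]
        total += sum(W_x[i][j] * x[j] for j in range(len(x)))
        total += sum(W_h[i][j] * h[j] for j in range(len(h)))
        new_h.append(math.tanh(total))
    return new_h

def encode(token_ids, embeddings, W_x, W_h, b):
    """Read the whole sequence and return the state of the last timestep."""
    h = [0.0] * len(b)  # initial hidden state
    for tok in token_ids:
        h = rnn_step(embeddings[tok], h, W_x, W_h, b)
    return h

rng = random.Random(0)
vocab_size, embed_dim, units = 10, 4, 8  # hypothetical sizes
embeddings = make_matrix(vocab_size, embed_dim, rng)
W_x = make_matrix(units, embed_dim, rng)
W_h = make_matrix(units, units, rng)
b = [0.0] * units

short_sentence = [1, 5, 2]               # hypothetical token ids
long_sentence = [3, 7, 7, 1, 4, 9, 2]

state_short = encode(short_sentence, embeddings, W_x, W_h, b)
state_long = encode(long_sentence, embeddings, W_x, W_h, b)
print(len(state_short), len(state_long))  # both equal `units`
```

Whatever the length of the input sentence, the encoder returns a vector with one value per RNN unit.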

Encoder, Image by the author

2-Hidden State (Sketch)

The hidden state is the output of the encoder: a fixed-length vector that
encapsulates the whole meaning of the input sequence. The length of the vector
depends on the number of cells in the RNN.
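
As a rough formalisation, assume a simple RNN cell (the article does not fix the cell type, and LSTM or GRU cells use more elaborate updates). The symbols below are introduced only for illustration: x_t is the embedding of the t-th input word, T is the number of input words, n is the number of cells, and W_x, W_h and b are learned parameters.

```latex
h_0 = \mathbf{0}, \qquad
h_t = \tanh\left(W_x x_t + W_h h_{t-1} + b\right), \quad t = 1, \dots, T,
\qquad
c = h_T \in \mathbb{R}^{n}
```

The vector c handed to the decoder always has n entries, however large T is.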

Encoder and hidden state, Image by the author

3-Decoder
To decode means to convert a coded message into intelligible language. The
second person in the team playing Pictionary converts the drawing into a
word. In the machine learning model, the role of the decoder is to
convert the hidden state vector into the output sequence, the English
sentence. It is also built with RNN layers, plus a dense layer that predicts
the English word at each step.
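
A minimal sketch of such a decoder, again in plain Python with random, untrained weights, might look like the following. Greedy decoding (always picking the most probable word), the start and end token ids, the sizes and the helper names are assumptions made for illustration; the article does not specify them.

```python
import math
import random

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def decode(state, embeddings, W_x, W_h, W_out, start_id, end_id, max_len):
    """Greedy decoding: start from the encoder state, emit one token per step."""
    h = state
    token = start_id
    output = []
    for _ in range(max_len):
        # RNN step: combine the previous token's embedding with the state
        x_part = matvec(W_x, embeddings[token])
        h_part = matvec(W_h, h)
        h = [math.tanh(a + c) for a, c in zip(x_part, h_part)]
        # Dense layer + softmax: one probability per vocabulary word
        probs = softmax(matvec(W_out, h))
        token = max(range(len(probs)), key=probs.__getitem__)
        if token == end_id:  # stop when the end marker is predicted
            break
        output.append(token)
    return output

rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

vocab_size, embed_dim, units = 6, 4, 8  # hypothetical sizes
START, END = 0, 1                       # hypothetical special token ids
embeddings = rand_matrix(vocab_size, embed_dim)
W_x = rand_matrix(units, embed_dim)
W_h = rand_matrix(units, units)
W_out = rand_matrix(vocab_size, units)

# Stand-in for the hidden state an encoder would produce
encoder_state = [rng.uniform(-1, 1) for _ in range(units)]
tokens = decode(encoder_state, embeddings, W_x, W_h, W_out, START, END, max_len=5)
print(tokens)
```

Because the weights are untrained, the predicted token ids are meaningless. The point is the loop: the decoder keeps producing words until it predicts the end marker or reaches `max_len`, so the output length does not depend on the input length.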

Decoder, Image by the author

Conclusion
One of the major advantages of this model is that the length of the input and
output sequences may differ. This opens the door for very interesting
applications such as video captioning or question answering.

The major limitation of this simple encoder decoder model is that all the
information needs to be summarized in a single fixed-length vector; for long
input sequences this can be extremely difficult to achieve. Having said that,
understanding encoder decoder models is key to the latest advances in NLP
because they are the seed for attention models and transformers. In the next
article, we will follow the process of building a translation model with an
encoder decoder structure. Then we will continue by exploring the attention
mechanism in order to achieve higher accuracy.

How to build an encoder decoder translation model using LSTM with Python and Keras.
Follow this step by step guide to build an encoder decoder model
and create your own translation model.
towardsdatascience.com
