Advanced Data Analytics: Simon Scheidegger - University of Lausanne, Department of Economics
Lecture 9
Keras API:
https://www.tensorflow.org/guide/keras/sequential_model
This notebook covers the basic functionality from a theoretical point of view.
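As a minimal illustration of the Sequential API from the guide above (layer sizes and the input dimension here are illustrative, not taken from the notebook):

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# A small fully connected network built layer by layer with the Sequential API.
model = keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(10,)),  # 10 input features (illustrative)
    layers.Dense(64, activation="relu"),
    layers.Dense(1),                                          # single regression output
])

model.compile(optimizer="adam", loss="mse")
model.summary()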
TensorBoard
On Nuvolos, the inline (in-cell) TensorBoard display does not work in JupyterLab.
Once you've run all the cells, go to the Launcher, click TensorBoard, and
you should be good to go.
Right after this, a new TensorBoard tab should show up containing the
expected output.
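A minimal sketch of how TensorBoard logs can be produced from a Keras training run (the toy model, data, and the ./logs directory name are assumptions; on Nuvolos, open the logs through the TensorBoard tile in the Launcher rather than inline):

import numpy as np
import tensorflow as tf

# Toy model and data, just to generate some TensorBoard logs (illustrative only).
x = np.random.rand(256, 8).astype("float32")
y = np.random.rand(256, 1).astype("float32")

model = tf.keras.Sequential([tf.keras.layers.Dense(16, activation="relu"),
                             tf.keras.layers.Dense(1)])
model.compile(optimizer="adam", loss="mse")

# Write metrics to ./logs; the TensorBoard launcher then picks them up.
tb = tf.keras.callbacks.TensorBoard(log_dir="./logs", histogram_freq=1)
model.fit(x, y, epochs=3, callbacks=[tb], verbose=0)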
ACTION REQUIRED
Focus on the example with the Kaggle data set.
Glorot initialization.
Examples:
Time-series comparisons, such as estimating how closely related two documents or two stock tickers are.
Sequence-to-sequence learning, such as decoding an English sentence into French.
Sentiment analysis, such as classifying the sentiment of tweets or movie reviews as positive or negative.
Time-series forecasting, such as predicting the future weather at a certain location, given recent weather data.
EXAMPLE
Activation Functions
e.g. sigmoid function
→ Bias term allows you to shift your activation function to the left or the right
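As a reminder (standard textbook form, not taken from the slide), the sigmoid activation with a bias term reads:

\sigma(wx + b) = \frac{1}{1 + e^{-(wx + b)}}

Its midpoint sits at x = -b/w, so changing the bias b shifts the whole curve to the left or right without changing its shape.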
FEED-FORWARD NETS REVISITED
RECURRENT NEURAL NETS
To model sequences, we need to
Handle variable-length sequences.
Track long-term dependencies.
Maintain information about the order.
Share parameters across the sequence.
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
RNN
RNNs are a family of neural networks for processing sequential data.
An RNN is a neural network specialized for
processing a sequence of values x(1), …, x(τ).
Unfold the computational graph of a dynamical system:
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
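In standard notation (following the usual textbook formulation, not reproduced from the slide figure), the dynamical system and its unrolled form read:

h_t = f(h_{t-1}, x_t; \theta)
h_3 = f(f(f(h_0, x_1; \theta), x_2; \theta), x_3; \theta)

The same parameters \theta are reused at every time step, which is exactly the parameter sharing required above.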
FEED-FORWARD NETS REVISITED
HANDLING INDIVIDUAL TIME STEPS
[Figure: at a single time step, a feed-forward net maps an input vector to an output vector]
NEURONS WITH RECURRENCE
RNNs have a cell state that is updated at each time step as a sequence is processed.
RNN INTUITION
[Figure: the same RNN cell is applied recurrently, mapping the input vector at each time step to an output vector while updating the hidden state]
RNN STATE UPDATE AND OUTPUT
[Figure: RNN cell taking the input vector x_t, updating the hidden state, and producing the output vector]
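The figure corresponds to the standard (vanilla) RNN cell; written out in common textbook notation (the weight names follow convention and are not copied from the slide):

h_t = \tanh(W_{hh} h_{t-1} + W_{xh} x_t + b_h)        (update hidden state)
\hat{y}_t = W_{hy} h_t + b_y                           (output vector)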
RNN – IN ONE SLIDE
An RNN models a dynamical system in which the hidden (cell) state h_t depends not only on the current
observation x_t but also on the previous hidden state h_{t-1}.
BACK-PROPAGATION THROUGH TIME
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
RNN FROM SCRATCH & TENSORFLOW
[Figure: an RNN recurrent cell mapping an input vector to an output vector]
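A minimal from-scratch sketch of a single recurrent cell in NumPy, next to the equivalent Keras layer (dimensions and names are illustrative, not the notebook's code):

import numpy as np
import tensorflow as tf

# --- RNN cell "from scratch": h_t = tanh(W_hh h_{t-1} + W_xh x_t) ---
rng = np.random.default_rng(0)
n_in, n_hidden, T = 4, 8, 10                      # illustrative sizes
W_xh = rng.normal(size=(n_hidden, n_in)) * 0.1
W_hh = rng.normal(size=(n_hidden, n_hidden)) * 0.1
h = np.zeros(n_hidden)

xs = rng.normal(size=(T, n_in))                   # one input sequence of length T
for x_t in xs:
    h = np.tanh(W_hh @ h + W_xh @ x_t)            # hidden-state update at each time step

# --- The same idea with TensorFlow/Keras ---
rnn = tf.keras.layers.SimpleRNN(n_hidden)         # returns the final hidden state
h_tf = rnn(tf.constant(xs[np.newaxis, ...], dtype=tf.float32))  # shape (1, n_hidden)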
RNN INTUITION
One-to-one: ordinary DNN, e.g., classification
Many-to-one: e.g., sentiment classification
One-to-many: e.g., text generation
Many-to-many: e.g., translation & forecasting
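In Keras, the difference between these patterns largely comes down to the return_sequences flag and how inputs and outputs are shaped; a hedged sketch (layer sizes illustrative):

import tensorflow as tf
from tensorflow.keras import layers

# Many-to-one (e.g., sentiment classification): only the final hidden state is used.
many_to_one = tf.keras.Sequential([
    layers.SimpleRNN(32, input_shape=(None, 16)),        # variable-length sequences of 16 features
    layers.Dense(1, activation="sigmoid"),
])

# Many-to-many (e.g., forecasting): one output per time step via return_sequences=True.
many_to_many = tf.keras.Sequential([
    layers.SimpleRNN(32, return_sequences=True, input_shape=(None, 16)),
    layers.Dense(1),
])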
SEQUENCE MODELING – DESIGN CRITERIA
Handle variable-length sequences, track long-term dependencies, maintain information about the order, and share parameters across the sequence.
→ Recurrent Neural Networks (RNNs) meet these sequence modeling design criteria.
HANDLE VARIABLE SEQUENCE LENGTHS
[Figure: the same RNN cell applied to input sequences of different lengths]
BACKPROPAGATION THROUGH TIME
Computing the gradient w.r.t. h_0 involves many factors of W_hh and repeated gradient computation.
Many values > 1: exploding gradients.
Many values < 1: vanishing gradients.
RECALL: RNN HARD TO TRAIN
Recurrent blocks suffer from two problems:
1. Long-term dependencies do not work well: it is difficult to connect two distant parts of the input.
2. The magnitude of the signal can get amplified at each recurrent connection: at every time step, the gradient can either vanish or explode.
→ This makes plain RNNs very hard to train.
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
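One common (though only partial) remedy for exploding gradients is gradient clipping; a hedged Keras sketch (the threshold value is arbitrary):

import tensorflow as tf

# Clip the global gradient norm to 1.0 before each update step.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0)
# model.compile(optimizer=optimizer, loss="mse")   # assuming a compiled Keras model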
LONG SHORT-TERM MEMORY (LSTM)
http://www.bioinf.jku.at/publications/older/2604.pdf
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
LONG SHORT-TERM MEMORY (LSTM)
The core of the LSTM is a memory unit (or cell) c_t
which encodes the information of the inputs
that have been observed up to that step.
The memory cell c_t has the same inputs
(h_{t-1} and x_t) and outputs h_t as a normal
recurrent network, but has additional gating units
which control the information flow.
The input gate and output gate respectively
control the information flowing into the memory
unit and the information output from the unit.
More specifically, the output h_t of the LSTM cell can be shut off via the output gate.
1. Forget
2. Store
3. Update
4. Output
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
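Written out in the usual notation of the colah post cited above (this is the standard LSTM formulation, not a verbatim copy of the slides):

f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)                                        (1. forget gate)
i_t = \sigma(W_i [h_{t-1}, x_t] + b_i),  \tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)   (2. store)
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t                               (3. update cell state)
o_t = \sigma(W_o [h_{t-1}, x_t] + b_o),  h_t = o_t \odot \tanh(c_t)           (4. output)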
LSTM OUTPUT
http://www.bioinf.jku.at/publications/older/2604.pdf
1. Forget
2. Store
3. Update
4. Output
Based on the new cell state and the input,
the layer can produce a result: this is the output h_t.
The same value is also passed on to the next iteration.
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
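A minimal sketch of an LSTM layer in Keras (window length, feature count, and units are illustrative):

import tensorflow as tf
from tensorflow.keras import layers

# Sequence-to-one LSTM regressor: 14 features per time step, window length 120 (illustrative).
model = tf.keras.Sequential([
    layers.LSTM(32, input_shape=(120, 14)),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mae")
model.summary()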
ACTION REQUIRED
There is a weather data set from the Max Planck Institute for Biogeochemistry:
https://www.bgc-jena.mpg.de/wetter/
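A hedged sketch of how such a weather time series could be windowed for an RNN/LSTM; the file name jena_climate.csv and the column name are assumptions, adapt them to the actual download from the page above:

import pandas as pd
import tensorflow as tf

# Assumed local CSV export of the Jena weather data (hypothetical filename/column).
df = pd.read_csv("jena_climate.csv")
temperature = df["T (degC)"].to_numpy()          # column name assumed

# Build (input window, next value) pairs for forecasting.
ds = tf.keras.utils.timeseries_dataset_from_array(
    data=temperature[:-1],
    targets=temperature[120:],                   # target = value right after each 120-step window
    sequence_length=120,
    batch_size=64,
)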