Lecture 09 - Generative Models
Fall – 2024
Muhammad Naseer Bajwa
Assistant Professor,
Department of Computing, SEECS
Co-Principal Investigator,
Deep Learning Lab, NCAI
NUST, Islamabad
naseer.bajwa@seecs.edu.pk
Overview of this week’s lectures
Generative Models
- Autoencoders
- Variational Autoencoders
- Generative Adversarial Networks
- Diffusion Models
High dimensional data can often be represented by a low dimensional code
- PCA: efficient, but limited to linear structure.
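The idea above can be made concrete with a minimal NumPy sketch (the data, its dimensions, and the noise level are made up for illustration): points in 5-D that really live near a 1-D line are compressed to a single number per point via PCA and reconstructed with almost no error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy high-dimensional data that actually lives near a 1-D line:
# 100 points in 5-D, generated from a single hidden factor plus small noise.
t = rng.normal(size=(100, 1))                       # hidden 1-D code
direction = rng.normal(size=(1, 5))                 # fixed direction in 5-D
X = t @ direction + 0.01 * rng.normal(size=(100, 5))

# PCA via SVD of the mean-centred data.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Project onto the top principal component (the low-dimensional code)
# and reconstruct back into the original 5-D space.
code = Xc @ Vt[:1].T                                # shape (100, 1)
X_rec = code @ Vt[:1] + X.mean(axis=0)

# Because the data is essentially 1-D, one component explains almost
# all of the variance and the reconstruction error is tiny.
explained = S[0] ** 2 / np.sum(S ** 2)
err = np.mean((X - X_rec) ** 2)
```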
Autoencoders are unsupervised algorithms for representation learning
- Unlike PCA, they can also handle a curved manifold in the input space.
- Architecture: Input Vector → Hidden Activations → Code → Hidden Activations → Output Vector
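A minimal sketch of the encoder–decoder idea in plain NumPy (a linear autoencoder on toy data; the dimensions, learning rate, and step count are made up): the network is trained to reproduce its input through a 1-D bottleneck code.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 200 points in 4-D lying near a 1-D manifold.
t = rng.normal(size=(200, 1))
X = t @ np.array([[0.5, -0.5, 0.5, 0.5]]) + 0.05 * rng.normal(size=(200, 4))

# A tiny linear autoencoder: encoder W_e (4 -> 1), decoder W_d (1 -> 4).
W_e = 0.1 * rng.normal(size=(4, 1))
W_d = 0.1 * rng.normal(size=(1, 4))

def forward(X):
    code = X @ W_e           # low-dimensional code (the bottleneck)
    X_hat = code @ W_d       # reconstruction of the input
    return code, X_hat

lr = 0.1
losses = []
for step in range(500):
    code, X_hat = forward(X)
    err = X_hat - X                        # reconstruction error
    losses.append(np.mean(err ** 2))
    # Gradient descent on the mean-squared reconstruction loss.
    grad_Wd = code.T @ err / len(X)
    grad_We = X.T @ (err @ W_d.T) / len(X)
    W_d -= lr * grad_Wd
    W_e -= lr * grad_We
```

Stacking nonlinear hidden layers before and after the code (as on the next slide) is what lets autoencoders go beyond PCA and fit curved manifolds.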
Deep Autoencoders Architecture
https://architecturewithexample.blogspot.com/2021/10/convolutional-autoencoder-architecture.html
https://starship-knowledge.com/tag/autoencoder-vs-restricted-boltzmann-machine
Autoencoders have multiple applications
- Compression
- Generation
- Denoising
Denoising Autoencoders denoise input

Vincent, Pascal, et al. "Extracting and composing robust features with denoising autoencoders." Proceedings of the 25th International Conference on Machine Learning. 2008.
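The key detail of the denoising setup can be sketched in a few lines of NumPy (the batch shape and noise level are made up): the model's input is a corrupted sample, but the reconstruction target is the clean one.

```python
import numpy as np

rng = np.random.default_rng(2)

# A batch of "clean" inputs (stand-ins for flattened 28x28 images).
x_clean = rng.uniform(0.0, 1.0, size=(32, 784))

def corrupt(x, noise_std=0.3, rng=rng):
    """Corrupt inputs with additive Gaussian noise, clipped to [0, 1]."""
    return np.clip(x + noise_std * rng.normal(size=x.shape), 0.0, 1.0)

# The denoising autoencoder sees the corrupted version as input ...
x_noisy = corrupt(x_clean)
# ... but is trained to reconstruct the *clean* version:
#   loss = || decoder(encoder(x_noisy)) - x_clean ||^2
# (the encoder/decoder network itself is omitted here; any autoencoder
# from the previous slides can be trained on these (x_noisy, x_clean) pairs)
reconstruction_target = x_clean   # not x_noisy: this is the key difference
```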
Variational Autoencoders are purpose-built for generating new data
- DAEs are good for many tasks but not for generation.
- Instead of mapping an input sample to a single point in the latent space, a VAE maps each sample to a multivariate probability distribution in the latent space.

ℒ(𝜙, 𝜃, 𝑥) = ‖𝑥 − 𝑥̂‖² + D_KL(𝑞𝜙(𝑧|𝑥) ‖ 𝑝(𝑧))

Kingma, Diederik P., and Max Welling. "Auto-Encoding Variational Bayes." arXiv preprint arXiv:1312.6114 (2013).
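The two VAE-specific ingredients, sampling from the encoder's distribution via the reparameterisation trick and the KL regulariser against the prior, can be sketched in NumPy (batch size and latent dimension are made up; the reconstruction network is omitted):

```python
import numpy as np

rng = np.random.default_rng(3)

# Encoder outputs for one batch: each sample is mapped to a Gaussian
# N(mu, diag(sigma^2)) in the latent space, not to a single point.
batch, latent_dim = 16, 8
mu = rng.normal(size=(batch, latent_dim))
log_var = rng.normal(scale=0.1, size=(batch, latent_dim))

# Reparameterisation trick: z = mu + sigma * eps with eps ~ N(0, I),
# which keeps the sampling step differentiable w.r.t. mu and log_var.
eps = rng.normal(size=(batch, latent_dim))
z = mu + np.exp(0.5 * log_var) * eps

# Closed-form KL divergence D_KL(q_phi(z|x) || p(z)) with prior p(z) = N(0, I):
kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=1)

# The VAE loss is the reconstruction error plus this regulariser:
#   L(phi, theta, x) = ||x - x_hat||^2 + D_KL(q_phi(z|x) || p(z))
# Sanity check: the KL term is zero exactly when q equals the prior.
kl_at_prior = 0.5 * np.sum(np.exp(0.0) + 0.0 - 1.0 - 0.0)
```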
What properties do we want to achieve from regularisation?
Generative Adversarial Networks
- Every time a sub-model loses, it updates its weights and starts another round.
- Training stops when fake samples are no longer distinguishable from real samples.

Goodfellow, Ian, et al. "Generative Adversarial Networks." Communications of the ACM 63.11 (2020): 139-144.
How to train a GAN?
- Train 𝐷 to maximise the probability of assigning the correct label to both samples from 𝑝_data(𝑥) and fake samples from 𝐺.
- Train 𝐺 to fool 𝐷, i.e. to minimise log(1 − 𝐷(𝐺(𝑧))).

https://www.youtube.com/watch?v=Gib_kiXgnvA
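The alternating game can be sketched on toy 1-D data in NumPy (everything here is illustrative: the N(2, 0.5²) target, the affine generator, the logistic-regression discriminator, and the learning rate are all made up). The generator uses the common non-saturating variant, maximising log D(G(z)) instead of minimising log(1 − D(G(z))).

```python
import numpy as np

rng = np.random.default_rng(4)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Real data: 1-D samples from N(2, 0.5^2).
def sample_real(n):
    return 2.0 + 0.5 * rng.normal(size=n)

w, b = 1.0, 0.0   # generator G(z) = w*z + b
a, c = 0.0, 0.0   # discriminator D(x) = sigmoid(a*x + c)

lr = 0.05
for step in range(2000):
    n = 64
    z = rng.normal(size=n)
    x_real, x_fake = sample_real(n), w * z + b

    # --- Discriminator step: ascend on log D(real) + log(1 - D(fake)) ---
    d_real, d_fake = sigmoid(a * x_real + c), sigmoid(a * x_fake + c)
    a += lr * np.mean((1 - d_real) * x_real - d_fake * x_fake)
    c += lr * np.mean((1 - d_real) - d_fake)

    # --- Generator step: ascend on log D(fake) (non-saturating loss) ---
    z = rng.normal(size=n)
    x_fake = w * z + b
    d_fake = sigmoid(a * x_fake + c)
    g_grad = (1 - d_fake) * a          # chain rule through D's logit
    w += lr * np.mean(g_grad * z)
    b += lr * np.mean(g_grad)

# After training, the generator's output distribution should have drifted
# towards the real data's mean of 2.
fake_mean = np.mean(w * rng.normal(size=10_000) + b)
```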
GANs are distribution transformers
- The generator maps a simple noise distribution 𝒁 (e.g. Gaussian noise) to a learned data distribution 𝑿: 𝒛 → 𝐺(𝒛) = 𝒙̂.

Goodfellow, Ian, et al. "Generative Adversarial Networks." Communications of the ACM 63.11 (2020): 139-144.
Conditional GANs control the generated output
- The generator receives a condition 𝒄 together with the noise: 𝒙̂ = 𝐺(𝒛, 𝒄); the discriminator 𝐷(𝒙) still outputs 0/1.
- Examples (input → output): black and white to colour, day to night.

Isola, Phillip, et al. "Image-to-image translation with conditional adversarial networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
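One common way to feed the condition 𝒄 into both networks is simple concatenation, sketched below in NumPy (the class count, noise dimension, and image size are made up; the networks themselves are stubbed out):

```python
import numpy as np

rng = np.random.default_rng(5)

n_classes, noise_dim, batch = 10, 100, 8

# Condition c: one-hot class labels (e.g. "generate a 7").
labels = rng.integers(0, n_classes, size=batch)
c = np.eye(n_classes)[labels]                   # shape (8, 10)

# Both sub-models receive the condition alongside their usual input:
z = rng.normal(size=(batch, noise_dim))
g_input = np.concatenate([z, c], axis=1)        # generator sees [z, c]

x_fake = rng.normal(size=(batch, 784))          # placeholder for G(g_input)
d_input = np.concatenate([x_fake, c], axis=1)   # discriminator sees [x, c]
```

Because the discriminator also sees 𝒄, it can penalise samples that are realistic but do not match the condition, which is what gives cGANs their control over the output.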
Difference between VAE and GANs
- Reason?

[Figure: sample outputs from a VAE vs. GANs]
Applications of GANs
- Morphing audio/videos/images
- Image enhancement
Diffusion Models break the transformation into small steps
- GANs transform noise into data in one step: 𝒛 → 𝐺(𝒛) = 𝒙̂, judged by 𝐷(𝒙) → 0/1.
- Diffusion models split the path between data and noise into many small steps: 𝒙₀ → 𝒙₁ → 𝒙₂ → 𝒙₃ → … → 𝒛.

Ho, Jonathan, Ajay Jain, and Pieter Abbeel. "Denoising diffusion probabilistic models." Advances in Neural Information Processing Systems 33 (2020): 6840-6851.
Diffusion Models gradually add Gaussian noise and then remove it
- Add different noise levels to the training data and train the model to denoise it.
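The forward (noising) process from the DDPM paper can be sketched in NumPy. A useful closed form is that x_t can be sampled directly from x_0 without stepping through every intermediate level (the schedule endpoints follow the DDPM paper; the toy "image" shape is made up):

```python
import numpy as np

rng = np.random.default_rng(6)

# Variance schedule beta_t and its cumulative products:
#   x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)    # strictly decreasing towards ~0

def noisy_sample(x0, t, rng=rng):
    """Jump straight to noise level t of the forward process."""
    eps = rng.normal(size=x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps   # the model is trained to predict eps from (xt, t)

x0 = rng.normal(size=(4, 32 * 32))       # stand-in for a batch of images
x_early, _ = noisy_sample(x0, t=10)      # still close to the data
x_late, _ = noisy_sample(x0, t=T - 1)    # nearly pure Gaussian noise
```

Training then amounts to drawing a random t per example and regressing the model's output on eps; at the final step alpha_bar is essentially zero, so x_T is indistinguishable from pure noise.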
A trained diffusion model can generate new data
- The model learns to take images at different noise levels and turn them back into the original image.
- Once trained, the model can take an arbitrary noise vector and turn it into a synthetic image.
- Just like a trained sculptor can chisel a face out of a block of stone.

[Figure: the Apollo sculpture]
Diffusion models can generate interesting images from prompts
- An elephant scuba diving underwater
- A banana riding a horse on the moon
- A bunny reading his email on a laptop
- A panda bear eating pasta upside down
- A red boat flying in rain
- An astronaut walking a crocodile in a park
- A crocodile fishing on a boat while reading a paper
- A tree with all kinds of fruits
- Bichon Maltese and a black bunny playing backgammon
- A diffusion model generating an image
Do you have any questions?

EOP