DL Midterm Report Topic Id 41

This midterm report discusses the application of a Variational Autoencoder-Generative Adversarial Network (VAE-GAN) model for generating brain tumor MRI scans to enhance medical imaging. The project outlines the model architecture, experimentation setup, and results indicating significant improvements in image quality over training epochs. The findings suggest that while the model shows promise for medical research, further fine-tuning is necessary to improve diagnostic accuracy and patient outcomes.

Uploaded by

Bình Nguyễn Thái

DEEP LEARNING

MIDTERM REPORT
Variational autoencoder GAN for medical image generation.

Group members

Nguyễn Thái Bình - 22BI13059

Nguyễn Minh Đức - 22BI13092

Vũ Tuấn Hải - 22BI13149

Cấn Minh Hiển - 22BI13154

Nguyễn Quang Hưng - 22BI13184

Nguyễn Thế Khải - 22BI13201

Table of Contents
1. Introduction
2. Deep Learning Model: VAE – GAN
2.1. Variational Autoencoders (VAEs)
2.2. Generative Adversarial Networks (GANs)
2.3. VAE – GAN Hybrid
3. Experimentation
3.1. Dataset
3.2. Model Components
3.3. Training Setup
3.4. Devices
4. Result and Analysis
4.1. Result
4.2. Analysis
5. Conclusion

1. Introduction
Medical imaging provides essential insights into the body's internal structures, but it is
constrained by limited high-quality data, privacy concerns, and the need for advanced diagnostic
tools. Variational Autoencoders (VAEs) are effective at learning compact representations of
complex data, while Generative Adversarial Networks (GANs) excel at generating realistic
synthetic images. Merging the two methods leverages the strengths of both for medical image
generation and analysis.
In this project, we apply a VAE-GAN model to generate brain tumor MRI scans, with the goal of
supporting diagnostic tools that identify brain conditions such as cancer, cerebral infarction,
and encephalocele.

2. Deep Learning Model: VAE – GAN

2.1. Variational Autoencoders (VAEs)
A Variational Autoencoder (VAE) is a neural network architecture that learns a probability
distribution over the input data and generates new data points, similar to the inputs, by
sampling from it.

• The Encoder maps the input data x to a distribution over latent variables and outputs its
parameters; a latent vector z is sampled from this distribution.
• The Latent space is both the output of the encoder network and the input of the decoder
network. It is a compressed, lower-dimensional embedding of the input data.
• The Decoder uses the latent vector z to reconstruct the original input, essentially
reversing the encoder.

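Loss function: the VAE is trained by maximizing the following objective (equivalently, by
minimizing its negative):

$$\mathcal{L}(\theta, \varphi; x, z) = \mathbb{E}_{q_\varphi(z|x)}\left[\log p_\theta(x|z)\right] - D_{KL}\left(q_\varphi(z|x) \,\|\, p(z)\right)$$

Reconstruction term (first term): the expected log-likelihood of the input under the decoder,
which measures how close the decoder output is to the original input.

Kullback-Leibler divergence (second term): measures the difference between two probability
distributions, here pushing the latent distribution q_φ(z|x) to stay close to the standard
normal prior N(0, 1).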
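As a rough illustration of the two terms above, the sketch below evaluates the negative of the objective for toy values. It assumes a diagonal Gaussian latent distribution, for which the KL divergence to N(0, 1) has a closed form, and uses squared error as a stand-in for the reconstruction term. The function names and numbers are illustrative and are not taken from the project code.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, exp(log_var)) || N(0, 1) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))

def squared_error(x, x_hat):
    """Stand-in for -log p(x|z) (assumption: Gaussian likelihood, constants dropped)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def negative_elbo(x, x_hat, mu, log_var):
    """Quantity to minimize: reconstruction error plus KL penalty."""
    return squared_error(x, x_hat) + kl_to_standard_normal(mu, log_var)

# The penalty vanishes when the latent distribution equals the prior ...
kl_zero = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])
# ... and grows as the mean moves away from zero.
kl_shifted = kl_to_standard_normal([1.0, 0.0], [0.0, 0.0])
loss = negative_elbo([1.0, 0.0], [0.5, 0.0], [1.0, 0.0], [0.0, 0.0])
```

Working with the log-variance rather than the variance is a common choice because it keeps the variance positive without any constraint on the encoder output.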

2.2. Generative Adversarial Networks (GANs)

Generative Adversarial Networks are a class of deep neural networks that learn from training
data and generate new data with the same characteristics. A generative adversarial network is
made up of two neural networks that are trained simultaneously, with the generator trying to
fool the discriminator and the discriminator trying to classify real and fake samples
accurately.

• The Generator takes random noise as input and produces data from it. Its goal is to
generate data that is as realistic as possible.
• The Discriminator takes real data and the data generated by the Generator as input and
attempts to distinguish between the two. It outputs the probability that the given data is
real.

Loss function:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]$$
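To make the value function above concrete, the following sketch estimates V(D, G) from a handful of discriminator outputs. The probabilities are made-up numbers chosen for illustration, and `gan_value` is a hypothetical helper rather than part of the project code.

```python
import math

def gan_value(d_real, d_fake):
    """Monte Carlo estimate of V(D, G).

    d_real: discriminator outputs D(x) on real samples
    d_fake: discriminator outputs D(G(z)) on generated samples
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A discriminator that separates real from fake well yields a high value ...
confident = gan_value([0.9, 0.8], [0.1, 0.2])
# ... while one that outputs 0.5 everywhere (fully fooled) yields -2 * log(2).
fooled = gan_value([0.5, 0.5], [0.5, 0.5])
```

The discriminator pushes this value up and the generator pushes it down, which is the min-max structure in the formula.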

2.3. VAE – GAN Hybrid

A VAE is combined with a GAN by collapsing the decoder and the generator into one.

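[Figure: VAE-GAN training procedure]

Loss function:

$$\mathcal{L} = \mathcal{L}_{prior} + \mathcal{L}_{llike}^{Dis_l} + \mathcal{L}_{GAN}$$

with

$$\mathcal{L}_{prior} = D_{KL}\left(q(z|x) \,\|\, p(z)\right)$$

$$\mathcal{L}_{llike}^{Dis_l} = -\mathbb{E}_{q(z|x)}\left[\log p(Dis_l(x) \mid z)\right]$$

$$\mathcal{L}_{GAN} = \log(Dis(x)) + \log(1 - Dis(Gen(z)))$$

Here Dis_l(x) denotes the hidden representation of x at the l-th layer of the discriminator,
so the reconstruction error is measured in the discriminator's feature space rather than in
pixel space.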
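The sketch below adds up the three terms for a single toy example. It assumes a unit-variance Gaussian for p(Dis_l(x)|z), so the likelihood term reduces to half the squared distance between discriminator features of the input and of its reconstruction. The feature vectors, discriminator outputs, and function names are invented for illustration. In the original VAE-GAN paper the three networks are not all updated with the same sum; that routing is left out here.

```python
import math

def feature_nll(feat_real, feat_recon):
    """-log p(Dis_l(x) | z) up to a constant, assuming a unit-variance Gaussian
    centred on the discriminator features of the reconstruction."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(feat_real, feat_recon))

def gan_term(d_real, d_fake):
    """log Dis(x) + log(1 - Dis(Gen(z))) for one real and one generated sample."""
    return math.log(d_real) + math.log(1.0 - d_fake)

def vae_gan_total(kl, feat_real, feat_recon, d_real, d_fake):
    """Sum of the prior, feature-likelihood, and GAN terms as written above."""
    return kl + feature_nll(feat_real, feat_recon) + gan_term(d_real, d_fake)

# Toy values: KL of 0.5, features differing in one coordinate, undecided discriminator.
total = vae_gan_total(kl=0.5, feat_real=[1.0, 2.0], feat_recon=[1.0, 1.0],
                      d_real=0.5, d_fake=0.5)
```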

3. Experimentation
The model was implemented in PyTorch, using NVIDIA CUDA for acceleration and efficient
training on high-dimensional images.

3.1. Dataset
This project uses a dataset containing various brain tumor MRI images. A brain tumor is a
collection, or mass, of abnormal cells in the brain. When brain tumors grow, they can cause
brain damage and even be life-threatening. The dataset includes over 5000 images categorized
into the following classes:

• Glioma: A glioma is a type of primary tumor that starts in the glial cells of the brain or
spinal cord.

• Meningioma: A meningioma is typically a slow-growing tumor that arises from the meninges,
the membranous layers surrounding the brain and spinal cord.

• Notumor: Scans of a brain in normal condition, in which no tumor is present.

• Pituitary: A pituitary tumor forms in the pituitary gland, a small, pea-sized endocrine
gland located at the base of the brain below the hypothalamus.

3.2. Model Components

Encoder                           | Decoder                              | Discriminator
5x5 64 conv. ↓, BNorm, ReLU       | 8x8x256 fully-connected, BNorm, ReLU | 5x5 32 conv. ↓, BNorm, ReLU
5x5 128 conv. ↓, BNorm, ReLU      | 6x6 256 conv. ↑, BNorm, ReLU         | 5x5 128 conv. ↓, BNorm, ReLU
5x5 256 conv. ↓, BNorm, ReLU      | 6x6 128 conv. ↑, BNorm, ReLU         | 5x5 256 conv. ↓, BNorm, ReLU
2048 fully-connected, BNorm, ReLU | 6x6 32 conv. ↑, BNorm, ReLU          | 5x5 256 conv. ↓, BNorm, ReLU
                                  | 5x5 1 conv., tanh                    | 512 fully-connected, BNorm, ReLU
                                  |                                      | 1 fully-connected, sigmoid

Table: Architectures for the three networks (Encoder, Decoder, Discriminator) that comprise the
VAE-GAN. ↓ and ↑ represent down- and upsampling respectively. BNorm denotes batch normalization.
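The spatial sizes implied by the table can be checked with the standard convolution output formula. The sketch below assumes that each ↓ layer is a 5x5 convolution with stride 2 and padding 2 applied to a 64x64 input. The stride and padding are assumptions, since the report does not state them. Under these assumptions the encoder's three downsampling layers end at 8x8 with 256 channels, which matches the 8x8x256 fully-connected layer at the start of the decoder.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial size after a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

def trace(size, n_layers, kernel=5, stride=2, padding=2):
    """Spatial sizes after each of n_layers identical downsampling convolutions."""
    sizes = [size]
    for _ in range(n_layers):
        size = conv_out(size, kernel, stride, padding)
        sizes.append(size)
    return sizes

encoder_sizes = trace(64, n_layers=3)        # three ↓ layers in the encoder
discriminator_sizes = trace(64, n_layers=4)  # four ↓ layers in the discriminator
encoder_flat = encoder_sizes[-1] ** 2 * 256  # features entering the 2048-unit layer
```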

3.3. Training Setup

• Latent space: The Encoder outputs a 128-dimensional latent vector.
• Optimizers: RMSProp was used with a learning rate of 3e-4 for the encoder and the
decoder, and 3e-5 for the discriminator.
• Number of epochs: 25.
• Batch size: 64.
• Preprocessing: each image is resized to 64x64 pixels, converted to a PyTorch tensor, and
its pixel values are normalized to the range [-1, 1].
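The normalization step can be sketched in plain Python as follows. It assumes 8-bit input intensities (0 to 255) and the usual two-step mapping: scale to [0, 1], then subtract 0.5 and divide by 0.5. The helper names are hypothetical, and the image count of 5000 is a placeholder, since the report only says the dataset has over 5000 images.

```python
import math

def normalize_pixel(value):
    """Map an 8-bit intensity (0..255) to [-1, 1]."""
    scaled = value / 255.0          # to [0, 1]
    return (scaled - 0.5) / 0.5     # to [-1, 1]

def denormalize_pixel(value):
    """Inverse mapping, e.g. to view tanh outputs as 8-bit intensities."""
    return round((value * 0.5 + 0.5) * 255.0)

def batches_per_epoch(n_images, batch_size=64):
    """Number of batches per epoch, counting a final partial batch."""
    return math.ceil(n_images / batch_size)

darkest = normalize_pixel(0)
brightest = normalize_pixel(255)
steps = batches_per_epoch(5000)  # placeholder image count
```

The [-1, 1] range matches the tanh activation on the decoder's last layer, so generated and real images share the same value range when they reach the discriminator.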

3.4. Devices
• The model was trained on an NVIDIA RTX 2060; training completed in 2 hours and 18
minutes. The trained model was then saved to “vae_gan_model.pth”, so it can be loaded on
other devices, including lower-end hardware, and used to generate images without retraining.

4. Result and Analysis

4.1. Result
• The generated images have a general structural consistency, suggesting that the VAE-GAN
model has learned key spatial features from the dataset.
• There is variety in the intensity and texture across the images, indicating that the
model has captured some of the variability in the training data.

[Figure: Images generated by the VAE-GAN model]

4.2. Analysis

[Figure: Image 1, generated after 1 epoch; Image 2, generated after 25 epochs]

• Image 1: The generated images are highly blurry and lack distinct features. This means the
model has just started training and has not yet learned any meaningful patterns from the
data.
• Image 2: The generated images show significant improvement. The images now have
clearer structures, with recognizable patterns and features that resemble real images.
While there is still some noise and the images are not perfect, they are far more detailed
than those generated after 1 epoch.

The VAE-GAN model shows noticeable progression in image generation from epoch 1 to
epoch 25. Initially, the generated images are very blurry with no clear features, which means
the model has not yet learned the patterns of the data. By epoch 25, the generated images show
significant improvement, capturing the overall structure and texture of the brain MRI scans.
However, noise is still present, so further adjustments can improve the image details.

5. Conclusion
Integrating Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs)
for brain tumor MRI generation is a promising direction for medical imaging AI. By combining
the structured latent space of VAEs with the realistic image generation capabilities of GANs,
this kind of model can support large-scale research and data sharing while preserving patient
privacy and fostering medical innovation.

In summary, we have demonstrated unsupervised learning using encoder-decoder models and a
learned similarity measure. Our results indicate that the visual fidelity of our method is
promising for medical researchers. However, the model requires further fine-tuning before it
can help enhance diagnostic accuracy, reduce risk, and improve patient outcomes in brain tumor
detection and treatment.

6. References
- A. B. L. Larsen, S. K. Sønderby, H. Larochelle, and O. Winther, "Autoencoding beyond pixels using a learned similarity metric," arXiv, 2015. [Online]. Available: https://arxiv.org/pdf/1512.09300
- S. H. Tsang, "Review: VAE-GAN - Autoencoding beyond pixels using a learned similarity metric," Medium, Oct. 10, 2019. [Online]. Available: https://sh-tsang.medium.com/review-vae-gan-autoencoding-beyond-pixels-using-a-learned-similarity-metric-dc0f8cb74435
- D. Bergmann and C. Stryker, "Variational autoencoder," IBM, June 12, 2024. [Online]. Available: https://www.ibm.com/think/topics/variational-autoencoder
- P. D. Khanh, "GAN: An overview," phamdinhkhanh.github.io, July 13, 2020. [Online]. Available: https://phamdinhkhanh.github.io/2020/07/13/GAN.html
- M. Del Pra, "Generative adversarial networks," Medium, Oct. 30, 2023. [Online]. Available: https://medium.com/@marcodelpra/generative-adversarial-networks-dba10e1b4424
