Deep Learning - Image Synthesis

Applying artificial intelligence in image synthesis

JURY ASSIGNMENT

SUBMITTED BY:
GOPIKA P V (BFT/22/1080)
MEERA K L (BFT/22/202)
FASHION IMAGE SYNTHESIS
INTRODUCTION
o Fashion is a major global business with designers playing a crucial
role in creating new styles.
o AI and machine learning have enhanced various industries but
fashion has seen less exploration in data analytics.
o AI has potential in fashion for tasks like classification, forecasting,
and recommendation systems.
o The approach is generative modelling with GANs, including a comparison of GAN models.
o The aim is to innovate fashion design by generating new images, comparing two
advanced GAN models for this purpose.
IMAGE SYNTHESIS METHODOLOGY
1. Industry Understanding
2. Data Understanding
3. Data Preparation
4. Data Modeling
5. Evaluation
1. UNDERSTANDING THE INDUSTRY

• The first and foremost step is to study the fashion industry and how
design-related images can be generated in line with current market trends.
RESEARCH OBJECTIVES
• Generated images of new fashion items can assist fashion designers, with the
model acting as a virtual assistant in the industry.
2. Data Understanding
• This stage begins with data acquisition, followed by understanding the data and
finding similarities within it.
• The dataset contains 70,000 greyscale images of 28x28 pixels in 10 classes:
Ankle Boot, Bag, Coat, T-shirt/Top, Dress, Shirt, Trousers, Sandal, Pullover and
Sneaker.
• Each row of the dataset holds one image of 28x28 pixels.
• Each pixel value represents how light or dark that pixel is, with higher numbers
representing darker shades.
• Pixel values range from 0 to 255, where 0 means white and 255 means black.
• The dataset has 785 columns in total: 784 columns hold the pixel values of the
28x28 image (one pixel value per cell), and one column at the start of each row
holds the class label.
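
The row layout described above (1 label column followed by 784 pixel columns) can be sketched in plain Python. The function name `split_row` and the example row are hypothetical and only illustrate the layout; they are not taken from the project code.

```python
# Side length of each square image; 28 comes from the dataset description.
IMAGE_SIDE = 28

def split_row(row):
    """Split one 785-value row into (label, 28x28 grid of pixel values)."""
    expected = 1 + IMAGE_SIDE * IMAGE_SIDE  # 1 label column + 784 pixel columns
    if len(row) != expected:
        raise ValueError(f"expected {expected} values, got {len(row)}")
    label, pixels = row[0], row[1:]
    # Cut the flat list of 784 pixels into 28 rows of 28 values each.
    grid = [pixels[r * IMAGE_SIDE:(r + 1) * IMAGE_SIDE] for r in range(IMAGE_SIDE)]
    return label, grid

# Hypothetical row: label 9 followed by 784 made-up pixel values in 0..255.
example_row = [9] + [(i * 7) % 256 for i in range(784)]
label, grid = split_row(example_row)
print(label, len(grid), len(grid[0]))
```

The length check makes a malformed row fail early instead of producing a misshapen image.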
3. Data Preparation

• Data acquisition is the most essential stage.
• The dataset should be finalized only after a relevant study of the domain,
because a dataset may turn out not to be suitable for the project.
• The pre-processing of the data follows the steps below.

STEPS
• Reading the dataset
• Checking the dataset shape
• Analysis of train and test data
• Normalization of data and reshaping
3.1 Reading the Dataset
• Data acquisition is the most important step.
• The data should be finalized after studying the domain.
• The pre-processing of the data is done in Python using TensorFlow (an
open-source platform for machine learning).
3.2 CHECKING ON DATASET SHAPE
• The dataset shape follows from the shape of an image, as each cell holds one
pixel value.
• The total number of rows represents the number of images.
• The total number of pixel values represents the number of columns.
3.3 ANALYSIS OF TRAIN AND TEST DATA
• After analyzing the dataset, there are 6,000 images per category in the
training set and 1,000 images per category in the testing set (60,000 training
and 10,000 testing images across the 10 classes, 70,000 in total).
[Figures: TRAIN DATA, TEST DATA]
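
A per-category count like the one in this analysis can be obtained by tallying the label column. The sketch below uses a made-up label list and a hypothetical helper name, `images_per_class`; it only shows the counting idea, not the project's actual analysis code.

```python
from collections import Counter

def images_per_class(labels):
    """Return a dict mapping each class label to the number of images with that label."""
    return dict(Counter(labels))

# Made-up label column for a tiny dataset with three classes.
toy_labels = [0, 1, 1, 2, 0, 1]
counts = images_per_class(toy_labels)
print(counts)

# A split is balanced when every class has the same number of images.
is_balanced = len(set(counts.values())) == 1
print(is_balanced)
```

Running the same check on the train and test label columns separately shows whether each split is balanced across the categories.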


3.5 Normalization of data and Reshaping
• Pixel Value Range: before normalization, pixel values in the images range
from 0 to 255.
• Normalization: To optimize model performance and reduce training time, the
pixel values are normalized to a range of 0 to 1. This is achieved by dividing
each pixel value by 255.
• Reshaping: After normalization, the data is reshaped to fit the input
requirements of the model. The input shape for both DCGAN and Caps GAN is set to
28x28x1, where 1 represents the number of channels.
• Data Shape: After pre-processing, the shape of both training and
testing data is adjusted accordingly.

• Readiness for Model Input: Following exploration and pre-processing, the
dataset is now prepared to be fed into the model as the expected input for
training.
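
The normalization and reshaping steps can be sketched without any library, assuming one image arrives as a flat list of 784 values in 0..255. The function name `normalize_and_reshape` and the toy image are hypothetical.

```python
def normalize_and_reshape(flat_pixels, side=28):
    """Scale 0..255 pixel values to 0..1 and reshape to side x side x 1."""
    if len(flat_pixels) != side * side:
        raise ValueError("unexpected number of pixel values")
    scaled = [value / 255 for value in flat_pixels]
    # Each pixel becomes a one-element list: the single greyscale channel.
    return [[[scaled[r * side + c]] for c in range(side)] for r in range(side)]

# Made-up image: every pixel value is 255 except the first, which is 0.
toy_image = [0] + [255] * 783
tensor = normalize_and_reshape(toy_image)
shape = (len(tensor), len(tensor[0]), len(tensor[0][0]))
print(shape, tensor[0][0][0], tensor[0][1][0])
```

In practice the same two steps are usually done on whole arrays at once (for example with NumPy's array division and `reshape`); the nested lists here only make the 28x28x1 layout explicit.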

[Figure: training data and testing data before normalization]


4. DATA MODELING

• GAN models implemented: two advanced GAN models are applied to the prepared
dataset.
• Capsule Network based Generative Adversarial Network (Caps GAN)
• Deep Convolutional Generative Adversarial Network (DCGAN)
GANS: GENERATIVE ADVERSARIAL NETWORKS
• Introduced by Ian Goodfellow.
• Consist of two neural networks: a generator and a discriminator.
• The generator creates synthetic data.
• The discriminator evaluates the authenticity of the generated data against
real data.
• The generator improves its ability to create realistic data over time.
The basic GAN consists of two neural networks

• Generator and Discriminator


• The Generator network starts with random noise and creates data, like
images.
• Its goal is to make this generated data look as real as possible
• The Discriminator network takes both real data and generated data and
tries to tell them apart.
• It outputs a probability indicating whether the data is real or fake.
• The Generator uses noise to create new sample images.
• The Discriminator's job is to distinguish between real and fake images with
a binary output.
• Both the Generator and Discriminator use Convolutional Neural Networks
(CNNs).
• Noise is just a random data sample used to start the generation process.
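
The interplay described above can be made concrete with the standard binary cross-entropy losses used for GANs. This is a minimal sketch, not the project's training code: the noise size of 100, the probabilities and the function names are assumptions, and the generator loss shown is the common non-saturating form.

```python
import math
import random

def sample_noise(dim, rng):
    """Draw one noise vector of standard normal values to seed the generator."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def discriminator_loss(p_real, p_fake):
    """Binary cross-entropy: real images should score 1, generated images 0."""
    real_term = -sum(math.log(p) for p in p_real) / len(p_real)
    fake_term = -sum(math.log(1.0 - p) for p in p_fake) / len(p_fake)
    return real_term + fake_term

def generator_loss(p_fake):
    """The generator is rewarded when the discriminator scores its images as real."""
    return -sum(math.log(p) for p in p_fake) / len(p_fake)

noise = sample_noise(100, random.Random(0))  # 100 is an assumed noise size

# Made-up discriminator outputs (probability that each image is real).
p_real = [0.9, 0.8]
p_fake = [0.2, 0.1]
print(len(noise), discriminator_loss(p_real, p_fake), generator_loss(p_fake))
```

With these made-up numbers the discriminator is doing well (low loss) and the generator poorly (high loss); training alternates updates so that each network pushes against the other.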
CONVOLUTIONAL NEURAL NETWORKS
• A CNN is a class of deep learning model.
• CNNs are heavily used in computer vision.
• A CNN is similar to a basic neural network.
• Like a basic neural network, a CNN has learnable parameters.
• Convolutional neural networks (ConvNets or CNNs) are one of the main
categories of model used for:
IMAGE RECOGNITION, IMAGE CLASSIFICATION, OBJECT DETECTION
3 BASIC COMPONENTS TO DEFINE CNN

• The Convolution Layer

• The Pooling Layer

• The Output Layer or Fully connected layer
Working of CNN

• The CNN model has 2 convolutional layers and pooling layers, followed by
2 fully connected layers.

• Batch normalization is applied in the 2nd and 3rd layers, with Leaky ReLU
activation for all layers and a sigmoid function in the final layer.

HYPERPARAMETER K: number of epochs applied to the model
• Batch normalization: Batch Norm is a technique that normalizes data between
neural network layers, using mini-batches instead of the full dataset. It
speeds up training.

$z^{N} = \frac{z - m_z}{s_z}$

where $z$ is the neuron's output, $m_z$ is the mean of the neuron's output and
$s_z$ is the standard deviation of the neuron's output
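
As a rough illustration, batch normalization subtracts the mini-batch mean from each output and divides by the mini-batch standard deviation. The sketch below assumes a single neuron and a made-up batch; the small `eps` term and the omission of the learnable scale and shift parameters are simplifications of what deep learning libraries do.

```python
import math

def batch_normalize(outputs, eps=1e-5):
    """Normalize one neuron's outputs over a mini-batch to zero mean, unit variance."""
    mean = sum(outputs) / len(outputs)
    variance = sum((z - mean) ** 2 for z in outputs) / len(outputs)
    std = math.sqrt(variance + eps)  # eps guards against division by zero
    return [(z - mean) / std for z in outputs]

# Made-up outputs of a single neuron for a mini-batch of four samples.
batch = [2.0, 4.0, 6.0, 8.0]
normalized = batch_normalize(batch)
print(normalized)
```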


Leaky ReLU

• Leaky ReLU Activation Function (Leaky Rectified Linear Unit): instead of
defining the activation as 0 for negative input values (x), as ReLU does, it
defines it as an extremely small linear component of x.

• This function returns x for any positive input, but for any negative value
of x it returns a very small value, 0.01 times x. Thus it still gives a
non-zero output for negative values.
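
A one-line sketch of this activation, using the 0.01 slope quoted above as the default (the function name and the test inputs are illustrative only):

```python
def leaky_relu(x, slope=0.01):
    """Return x for positive inputs and slope * x otherwise."""
    return x if x > 0 else slope * x

print([leaky_relu(v) for v in (-2.0, 0.0, 3.0)])
```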
Sigmoid Function
• Sigmoid Function – It is used as a neural network activation function.
• When the activation function of a neuron is a sigmoid function, the output of
this unit is guaranteed to always lie between 0 and 1.

EQUATION: $\sigma(x) = \frac{1}{1 + e^{-x}}$
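
The sigmoid function can be sketched in a few lines. The two branches are a common numerical-stability choice (an assumption of this sketch, not something the slides specify) so that very large negative inputs do not overflow.

```python
import math

def sigmoid(x):
    """Logistic sigmoid 1 / (1 + e^-x), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    exp_x = math.exp(x)
    return exp_x / (1.0 + exp_x)

print(sigmoid(0.0), sigmoid(4.0), sigmoid(-4.0))
```

This bounded output is why the final layer of the discriminator can be read as a probability that an image is real.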
Deep Convolutional Generative Adversarial Network (DCGAN)
• It is a type of Generative Adversarial Network that uses deep convolutional
neural networks to generate high-quality images.

BENEFITS
• High quality image generation
• Improved training stability

APPLICATIONS
• Image synthesis
• Super resolution
Working Method
• Use batch normalization.
• Apply Leaky ReLU activation function.
• Use convolutional layers instead of pooling layers.
• The discriminator is a CNN with 2 convolutional layers, 2 fully
connected layers, batch normalization, Leaky ReLU and sigmoid
activation.
• The generator also has 2 convolutional layers.
• Batch normalization is used in each generator layer except the last
one.
• ReLU activation is used in the first 3 generator layers and sigmoid
activation in the last layer.
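
When convolutional layers replace pooling, downsampling is typically done by giving the convolutions a stride greater than 1. The sketch below applies the standard output-size formula to a 28x28 input; the kernel size, stride and padding are assumed values chosen for illustration and are not stated in the slides.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

# Assumed settings (not from the slides): 5x5 kernels, stride 2, padding 2.
size = 28
sizes = [size]
for _ in range(2):  # the discriminator described above has 2 convolutional layers
    size = conv_output_size(size, kernel=5, stride=2, padding=2)
    sizes.append(size)
print(sizes)
```

Under these assumptions each strided convolution halves the feature map, which is the role a pooling layer would otherwise play.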
Capsule Network based Generative Adversarial Network (Caps GAN)

• Caps GAN integrates two concepts, capsule networks and GANs, to enhance
generative models' ability to understand and reproduce complex structures and
hierarchies in data.

BENEFITS
• Improved Data Generation
• Dynamic Routing

APPLICATIONS
• Image Synthesis
• Medical Imaging
CAPSULE NETWORK BASED GENERATIVE ADVERSARIAL
NETWORK
• The generator network is the same as in DCGAN.
• The discriminator network is a Capsule network instead of a CNN.
• The first convolutional layer has a 3x3 kernel, a stride of 1, and 256
filters.
• It consists of two Capsule net layers, Primary-Caps and Digit-Caps, along
with Leaky ReLU activation, batch normalization, flattening and a Dense layer
in Keras.
• It ends with a sigmoid function.
• The Digit Capsule layer outputs 16D vectors containing object instantiation
parameters.
[Figure: Caps GAN discriminator architecture. Blocks shown: Input layers;
Conv2d; Leaky ReLU; Batch Normalization; Primary caps layer (Conv2d, Reshape,
Squash); Digit caps layer (Dense, Activation); Flatten function; multiply;
Leaky ReLU; Sigmoid function]
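
The squash step named in the Primary caps layer is the capsule-network nonlinearity that keeps a vector's direction while mapping its length into the range 0 to 1. The sketch below follows the usual formula from the capsule network literature; the function names, the `eps` term and the test vectors are assumptions of this example.

```python
import math

def squash(vector, eps=1e-9):
    """Capsule squash: keep the direction, map the length into [0, 1)."""
    squared_norm = sum(v * v for v in vector)
    norm = math.sqrt(squared_norm)
    scale = squared_norm / (1.0 + squared_norm) / (norm + eps)
    return [scale * v for v in vector]

def length(vector):
    """Euclidean length of a vector."""
    return math.sqrt(sum(v * v for v in vector))

# Made-up capsule outputs: one very short vector and one long vector.
short_vec = squash([0.1, 0.0])
long_vec = squash([30.0, 40.0])
print(length(short_vec), length(long_vec))
```

Short vectors are shrunk towards zero and long vectors approach length 1, so the length of a capsule's output can be read as a confidence value.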
• Primary Caps layer – the lowest capsule layer.
• Flatten function – acts as a bridge between layers.
• Generator: the generators of Caps GAN and DCGAN are the same, both using a
deconvolutional neural network.
• Keras Dense Layer – a simple layer of neurons in which each neuron receives
input from all the neurons of the previous layer, hence the name dense.
• The dense layer is used to classify images based on the output from the
convolutional layers.
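
What "each neuron receives input from all the neurons of the previous layer" means can be shown with a hand-written dense layer. This is an illustrative sketch with made-up weights, not the Keras implementation, and it leaves out the activation function.

```python
def dense(inputs, weights, biases):
    """Fully connected layer: every output neuron sees every input value."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]

# Made-up numbers: 3 inputs feeding 2 output neurons.
inputs = [1.0, 2.0, 3.0]
weights = [[0.5, 0.0, -0.5], [1.0, 1.0, 1.0]]
biases = [0.0, -1.0]
outputs = dense(inputs, weights, biases)
print(outputs)
```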
5. EVALUATION
Qualitative Evaluation
Qualitative evaluation has been conducted on the basis of visual analysis of
the images generated by the GANs.
FUTURE SCOPE

• New and advanced GANs are emerging regularly.


• Further studies in the fashion industry can improve
results.
• Future research could use complex datasets, such
as models in shops, fashion shows, and events.
• GANs can also be applied to smaller datasets for
research.
CHALLENGES
• Choosing proper data
• Not enough datasets in specific fields
• Neural networks take much more time to execute
• Requires high computation power to run

THANK YOU
