Deep Learning For Vision-Lab Manual

The document outlines a practical laboratory observation book for a Deep Learning for Vision course, detailing course objectives, practical exercises, and expected outcomes. It includes specific examples of image processing operations, neural network implementation, and the use of pretrained models for image classification. Each exercise is designed to enhance students' understanding of deep learning applications in computer vision.

Uploaded by

prasannananda18


DEPARTMENT OF

COMPUTER SCIENCE AND ENGINEERING

(AIML)

Course Code/Name: AL3502 / DEEP LEARNING FOR VISION

Department: AIML

PRACTICAL LABORATORY - OBSERVATION BOOK

Name:

………………………………………….

Register No:

……………………………………

Branch :……………………………………

Year & Sem :………………………………

COURSE OBJECTIVES:
● To introduce basic computer vision concepts
● To understand the methods and terminologies involved in deep neural networks
● To impart knowledge on CNN
● To introduce RNN and Deep Generative models
● To solve real-world computer vision applications using Deep learning.

PRACTICAL EXERCISES:
1. Implementation of basic Image processing operations including Feature
Representation and Feature Extraction
2. Implementation of simple neural network
3. Study of pretrained deep neural network model for Images
4. CNN for Image classification
5. CNN for Image segmentation
6. RNN for video processing
7. Implementation of Deep Generative model for Image editing

COURSE OUTCOMES:
Upon successful completion of this course, students will be able to:
CO 1: Implement basic Image processing operations
CO 2: Understand the basic concept of deep learning
CO 3: Design and implement CNN and RNN and Deep generative model
CO 4: Understand the role of deep learning in computer vision applications.
CO 5: Design and implement Deep generative model
EX : 1 1. Implementation of basic image processing
operations including Feature Representation
and Feature Extraction

Aim:
The aim of this program is to perform basic image processing operations, including
Feature Representation and Feature Extraction. Specifically, the program will:

1. Detect edges in an image using the Canny edge detector.

2. Extract corners using Harris Corner detection as features of the image.
3. Display the results, including the original image, edge-detected image, and
image with extracted features.

Algorithm:
1. Load Image: Load the input image from the file system.
2. Convert to Grayscale: Convert the image to grayscale to simplify processing,
as color information isn't necessary for edge detection and corner
detection.
3. Edge Detection: Apply the Canny edge detection algorithm to highlight the
boundaries (edges) in the image.
4. Feature Extraction (Corners): Use Harris Corner Detection to identify key points
(corners) in the image. These are points where there is a significant change
in intensity.
5. Display Results: Show the original image, edge-detected image, and the
image with corners highlighted.
6. Save Results (Optional): Save the images of the edges and features as output
files.
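
To make step 4 more concrete, the following is a minimal NumPy sketch of the Harris corner score R = det(M) - k * trace(M)^2 on a synthetic image. It is a simplification under stated assumptions, not what cv2.cornerHarris does internally: it uses central-difference gradients instead of Sobel filters and a plain 3x3 box window, and the helper names box_sum and harris_response are hypothetical.

```python
import numpy as np

def box_sum(a, radius=1):
    """Sum each pixel's (2r+1)x(2r+1) neighbourhood, with zero padding at the borders."""
    padded = np.pad(a, radius)
    out = np.zeros_like(a, dtype=float)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def harris_response(gray, k=0.04):
    """Harris score R = det(M) - k * trace(M)^2 from the local structure tensor M."""
    gy, gx = np.gradient(gray.astype(float))  # derivatives along rows and columns
    sxx = box_sum(gx * gx)
    syy = box_sum(gy * gy)
    sxy = box_sum(gx * gy)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2

# Synthetic test image: a bright square on a dark background
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0

R = harris_response(img)
corner_y, corner_x = np.unravel_index(np.argmax(R), R.shape)
print("Strongest response at:", (int(corner_y), int(corner_x)))
print("Score on a straight edge:", R[5, 10])  # negative: edges are not corners
```

Flat regions score zero, straight edges score negative, and only points where the intensity changes in two directions score strongly positive, which is why thresholding the response map isolates corners.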

Program:

import cv2
import numpy as np
import matplotlib.pyplot as plt

# Step 1: Load the image
image = cv2.imread('image.jpg')  # Replace 'image.jpg' with your image path
if image is None:
    raise FileNotFoundError("Could not read 'image.jpg'; check the path")

# Step 2: Convert the image to grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Step 3: Apply edge detection (using Canny)
edges = cv2.Canny(gray_image, threshold1=100, threshold2=200)

# Step 4: Feature extraction (using Harris Corner Detection)
# Convert image to float32 for Harris detection
gray_float = np.float32(gray_image)
dst = cv2.cornerHarris(gray_float, 2, 3, 0.04)  # blockSize=2, ksize=3, k=0.04

# Dilate the response map so the marked corners are easier to see
dst = cv2.dilate(dst, None)

# Step 5: Mark the corners on a copy of the original image
image_with_corners = image.copy()
image_with_corners[dst > 0.01 * dst.max()] = [0, 0, 255]  # Red (in BGR order) for corners

# Step 6: Display the results
# Original image
plt.subplot(1, 2, 1)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.title("Original Image")
plt.axis('off')

# Image with edges highlighted
plt.subplot(1, 2, 2)
plt.imshow(edges, cmap='gray')
plt.title("Edge Detection (Canny)")
plt.axis('off')

# Show corners on the image in a separate figure
plt.figure()
plt.imshow(cv2.cvtColor(image_with_corners, cv2.COLOR_BGR2RGB))
plt.title("Feature Extraction (Corners)")
plt.axis('off')

plt.show()

# Step 7: Save the results (optional)
cv2.imwrite('edges_output.jpg', edges)
cv2.imwrite('corners_output.jpg', image_with_corners)
Output:

Result:
The program performs basic image processing tasks: edge detection (using the
Canny edge detector) and feature extraction (using Harris Corner detection).
It then displays the original image, the edge map, and the image with the
detected corners highlighted.
EX : 2 2. Implementation of simple neural network

Aim:

The aim of this task is to implement a simple neural network using Python. The
neural network will be designed to classify data based on a simple dataset (like the
Iris dataset or a basic binary classification problem). This example will use Keras (a
high-level neural network API) and TensorFlow as the backend for creating the
neural network.

Algorithm:
1. Import Required Libraries: Import libraries like TensorFlow, Keras, and
other necessary modules for neural network creation.
2. Load and Preprocess Data: Load a dataset for classification, and
preprocess it (normalize, split into training and testing sets).
3. Build Neural Network Model:
○ Define the architecture of the neural network (input layer, hidden
layers, output layer).
○ Use activation functions like ReLU for hidden layers and softmax or
sigmoid for the output layer depending on the problem (multi-class or
binary).
4. Compile the Model: Choose a loss function and optimizer (e.g.,
categorical_crossentropy for multi-class classification or
binary_crossentropy for binary classification).
5. Train the Model: Use the training data to train the neural network.
6. Evaluate the Model: Test the trained model on unseen test data to
measure its performance.
7. Output Results: Display accuracy and loss metrics.
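
Steps 3 and 4 pair a softmax output layer with the categorical cross-entropy loss. The following NumPy sketch shows what those two pieces compute on one-hot labels. The logits are made-up values for illustration, and the function names are hypothetical helpers rather than Keras API.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class ids into one-hot rows, e.g. 2 -> [0, 0, 1]."""
    return np.eye(num_classes)[labels]

def softmax(logits):
    """Row-wise softmax; subtracting the row maximum keeps exp() from overflowing."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-12):
    """Mean of -log(probability assigned to the true class)."""
    return float(-np.mean(np.sum(y_true * np.log(y_prob + eps), axis=1)))

logits = np.array([[2.0, 1.0, 0.1],    # hypothetical raw outputs for two samples
                   [0.0, 0.0, 0.0]])
labels = np.array([0, 2])

probs = softmax(logits)
loss = categorical_crossentropy(one_hot(labels, 3), probs)
print(probs.round(3))
print("loss:", round(loss, 4))
```

A sample with all-equal logits gets probability 1/3 per class, so its loss contribution is log(3); confident correct predictions push the loss toward zero.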

Program :
from sklearn import datasets
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical

# Step 1: Load the Iris dataset
iris = datasets.load_iris()
X = iris.data  # Features
y = iris.target  # Labels

# Step 2: Preprocess the data
# Convert labels to one-hot encoding
y_encoded = to_categorical(y, num_classes=3)

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y_encoded, test_size=0.2, random_state=42)

# Step 3: Build the Neural Network Model
model = Sequential()

# Input layer and first hidden layer with 10 neurons and ReLU activation
model.add(Dense(10, input_dim=4, activation='relu'))

# Second hidden layer with 8 neurons and ReLU activation
model.add(Dense(8, activation='relu'))

# Output layer with 3 neurons (one for each class) and softmax activation
model.add(Dense(3, activation='softmax'))

# Step 4: Compile the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Step 5: Train the model
model.fit(X_train, y_train, epochs=100, batch_size=5, verbose=1)

# Step 6: Evaluate the model
loss, accuracy = model.evaluate(X_test, y_test)

# Output results
print(f"Test Loss: {loss:.4f}")
print(f"Test Accuracy: {accuracy:.4f}")

Output :
Epoch 1/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.3858 - loss: 1.2491
Epoch 2/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7343 - loss: 0.8896
Epoch 3/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.6574 - loss: 0.8656
Epoch 4/100
...
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9697 - loss: 0.0752
Epoch 100/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9831 - loss: 0.0714
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 279ms/step - accuracy: 0.9667 - loss: 0.0997
Test Loss: 0.0997
Test Accuracy: 0.9667

Result:

1. Training Progress: During the training, you will see the loss and accuracy
for each epoch. As training progresses, the model improves, and the
loss decreases while accuracy increases.
2. Test Accuracy: After training, the model's accuracy on the test set is
displayed. The higher the accuracy, the better the model has learned to
classify the data. For this example, we may see an accuracy of around
96.67% on the test set, which indicates that the model is performing well.
3. Loss: The test loss is also reported, showing how far off the model's
predictions were from the actual labels on the test data. A lower test loss
indicates a better-performing model.
EX : 3 3. Study of Pretrained Deep Neural Network
Model for Images

Aim:

● Objective: The primary aim is to leverage a pretrained deep neural network
model to perform image classification, detection, segmentation, or any
other image-related task.
● Goal: Use a pretrained model to achieve high accuracy in image-related
tasks without the need for extensive training from scratch, saving both
time and computational resources.

Algorithm:

The basic algorithm for using a pretrained deep neural network typically follows these
steps:

● Step 1: Load Pretrained Model

Load a deep learning model that has already been trained on a large dataset
(e.g., ImageNet, COCO). Common models include:
○ VGG16, VGG19
○ ResNet (e.g., ResNet50, ResNet101)
○ InceptionV3
○ EfficientNet
○ MobileNet
● Step 2: Preprocess the Input Image
Prepare the image according to the input requirements of the model (e.g., resizing,
normalization).
○ Resize the image to the required dimensions (e.g., 224x224 for VGG,
299x299 for Inception).
○ Normalize pixel values, typically between 0 and 1 or -1 and 1,
depending on the model.
● Step 3: Pass Image through the Model
Feed the preprocessed image into the model and get predictions (e.g.,
class probabilities for classification tasks).
● Step 4: Post-process the Output
Post-process the model's output (e.g., decode the class probabilities into human-
readable labels).
● Step 5: Evaluate Performance (if required)
Use metrics like accuracy, precision, recall, and F1 score to evaluate the model’s
performance on a test dataset.
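
Steps 2 and 4 can be illustrated without loading a model. The NumPy sketch below shows two common input conventions and a top-k decoding step. The VGG-style function mirrors what Keras' preprocess_input for VGG16 is documented to do (RGB to BGR, then subtract the ImageNet channel means); the [-1, 1] scaling matches the Inception-style convention. The function names and the three class names in the usage lines are hypothetical.

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used for VGG16 inputs
VGG_MEAN_BGR = np.array([103.939, 116.779, 123.68])

def preprocess_vgg_style(rgb):
    """RGB uint8 image -> BGR float image with the ImageNet channel means subtracted."""
    bgr = rgb[..., ::-1].astype(np.float32)
    return bgr - VGG_MEAN_BGR

def preprocess_scale_pm1(rgb):
    """RGB uint8 image -> float image scaled to the range [-1, 1]."""
    return rgb.astype(np.float32) / 127.5 - 1.0

def top_k(probs, class_names, k=3):
    """Return the k most probable (class name, probability) pairs, best first."""
    order = np.argsort(probs)[::-1][:k]
    return [(class_names[i], float(probs[i])) for i in order]

red_pixel = np.array([[[255, 0, 0]]], dtype=np.uint8)   # a 1x1 pure red image
print(preprocess_vgg_style(red_pixel))
print(preprocess_scale_pm1(red_pixel))
print(top_k(np.array([0.1, 0.7, 0.2]), ["apple", "banana", "cherry"], k=2))
```

Using the wrong convention for a given model usually does not raise an error; it just degrades the predictions, which is why each Keras application module ships its own preprocess_input.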
Program :

# Import required libraries
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions

# Step 1: Load Pretrained Model (VGG16 in this case)
model = VGG16(weights='imagenet')

# Step 2: Load and Preprocess the Image
img_path = 'your_image_path.jpg'  # Replace with the path to your image
img = image.load_img(img_path, target_size=(224, 224))  # Resize image to 224x224
img_array = image.img_to_array(img)  # Convert image to numpy array
img_array = np.expand_dims(img_array, axis=0)  # Add batch dimension
img_array = preprocess_input(img_array)  # Preprocess image for VGG16

# Step 3: Predict the Image Class
predictions = model.predict(img_array)

# Step 4: Decode the Predictions
decoded_predictions = decode_predictions(predictions, top=3)[0]
for i, (imagenet_id, label, score) in enumerate(decoded_predictions):
    print(f"{i + 1}: {label} ({score * 100:.2f}%)")

# Step 5: Display the Image
plt.imshow(img)
plt.title(f"Prediction: {decoded_predictions[0][1]} - {decoded_predictions[0][2] * 100:.2f}%")
plt.axis('off')
plt.show()

Output :

1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 770ms/step
1: banana (99.85%)
2: pineapple (0.05%)
3: zucchini (0.02%)
Result :
1. The pretrained deep neural network successfully performed the image
classification task with high accuracy.
2. Utilizing a pretrained model significantly reduced the need for extensive
training, saving time and computational resources.
3. The process demonstrated the practical applicability of transfer learning
for image-related tasks, providing reliable and interpretable outcomes.
EX : 4 4. CNN for image Classification

Aim:

The aim of this experiment is to implement a Convolutional Neural Network
(CNN) for image classification. The CNN will learn to recognize patterns
in images and classify them into categories. We will use a dataset like
CIFAR-10 to train and test the model.

Algorithm:

1. Input Layer: Take the image as input.

2. Convolutional Layers: Apply filters to extract features from the image
(e.g., edges, textures).
3. Activation (ReLU): Apply the ReLU function to introduce non-linearity.
4. Pooling Layers: Reduce the size of the image while retaining important features.
5. Fully Connected Layers: Flatten the features and connect them to
make final predictions.
6. Output Layer: Use the softmax function to get class probabilities.
7. Loss Function: Use categorical cross-entropy to measure prediction error.
8. Optimizer: Use Adam or SGD to minimize the error.
9. Training: Update the weights using backpropagation.
10. Evaluation: Test the model on unseen images and calculate accuracy.
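
Steps 2 to 4 can be traced by hand on a tiny example. The NumPy sketch below applies one 3x3 filter, ReLU, and 2x2 max pooling to a toy 6x6 image. The image and the edge filter are made up for illustration; in a trained CNN the filter values are learned rather than chosen.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Single-channel 'valid' cross-correlation, the operation CNN layers compute."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Keep positive responses and zero out negative ones."""
    return np.maximum(x, 0)

def max_pool2x2(x):
    """Non-overlapping 2x2 max pooling (odd trailing rows/columns are dropped)."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Toy 6x6 image: dark left half, bright right half
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked filter that responds to dark-to-bright vertical edges
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

conv = conv2d_valid(image, kernel)        # 6x6 -> 4x4
features = max_pool2x2(relu(conv))        # 4x4 -> 2x2
print(conv)
print(features)
```

The 6x6 input shrinks to 4x4 after the unpadded 3x3 convolution and to 2x2 after pooling, the same size bookkeeping that applies to each Conv2D and MaxPooling2D pair in a full model.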

Program :

# Import libraries
import matplotlib.pyplot as plt
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import cifar10

# Load and preprocess CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Normalize the image values to the range [0, 1]
x_train, x_test = x_train / 255.0, x_test / 255.0

# Create the CNN model
model = models.Sequential([
    # First Convolutional Layer
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),

    # Second Convolutional Layer
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),

    # Third Convolutional Layer
    layers.Conv2D(64, (3, 3), activation='relu'),

    # Flatten the data and add a Dense layer
    layers.Flatten(),
    layers.Dense(64, activation='relu'),

    # Output layer with 10 classes
    layers.Dense(10, activation='softmax')
])

# Compile the model
# The labels are integer class ids, so the sparse variant of the loss is used
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
history = model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))

# Evaluate the model
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print(f"Test accuracy: {test_acc}")

# Plot training and validation accuracy
plt.plot(history.history['accuracy'], label='Training accuracy')
plt.plot(history.history['val_accuracy'], label='Validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend(loc='lower right')
plt.show()

Output :

Epoch 1/10
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 78s 49ms/step - accuracy: 0.3510 - loss: 1.7526 - val_accuracy: 0.5510 - val_loss: 1.2774
Epoch 2/10
...
Epoch 10/10
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 73s 47ms/step - accuracy: 0.8050 - loss: 0.5588 - val_accuracy: 0.7163 - val_loss: 0.8717
313/313 - 5s - 16ms/step - accuracy: 0.7163 - loss: 0.8717
Test accuracy: 0.7163000106811523

Result :
The CNN classifies images from the CIFAR-10 dataset with a test accuracy of
approximately 71.63% after 10 epochs. Training accuracy rose from about 35% in
the first epoch to about 80% in the last, showing that the model learned to
recognize patterns in the images. The gap between the final training accuracy
(80.5%) and validation accuracy (71.6%) suggests some overfitting.
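
As a sanity check on the architecture in the program above, the feature-map sizes and parameter counts can be worked out by hand. The short sketch below does this arithmetic for unpadded 3x3 convolutions and 2x2 pooling; the helper names are hypothetical, and the result can be compared against model.summary().

```python
def conv_out(size, kernel=3):
    """Output width/height of an unpadded ('valid'), stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width/height of non-overlapping pooling."""
    return size // pool

def conv_params(in_ch, out_ch, kernel=3):
    """Weights plus one bias per output channel."""
    return (kernel * kernel * in_ch + 1) * out_ch

def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return (n_in + 1) * n_out

size = 32
size = pool_out(conv_out(size))   # Conv2D(32) + MaxPooling2D: 32 -> 30 -> 15
size = pool_out(conv_out(size))   # Conv2D(64) + MaxPooling2D: 15 -> 13 -> 6
size = conv_out(size)             # Conv2D(64): 6 -> 4
flat = size * size * 64           # Flatten: 4 * 4 * 64

total = (conv_params(3, 32) + conv_params(32, 64) + conv_params(64, 64)
         + dense_params(flat, 64) + dense_params(64, 10))
print(size, flat, total)
```

More than half of the parameters sit in the first Dense layer, which is typical for small CNNs that flatten their feature maps.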
EX : 5 5. CNN for image Segmentation

Aim:

The aim of this experiment is to implement a Convolutional Neural Network (CNN) for
image segmentation. Image segmentation involves dividing an image into multiple
segments or regions to simplify its analysis. The goal is to classify each pixel in an
image into a predefined category.

Algorithm:

1. Input Image:
○ Start by loading the image that needs to be segmented.
2. Preprocessing:
○ Resize the image to match the input size expected by the model
(e.g., 128x128 pixels).
○ Normalize the pixel values to be between 0 and 1.
3. Build CNN Model for Segmentation:
○ Use convolutional layers to extract features from the image.
○ Use pooling layers to reduce the image dimensions and keep important
features.
○ Use an upsampling or deconvolution layer to bring back the image to
its original size.
○ The final output layer should have as many channels as the number
of classes (e.g., background and object).
4. Activation Function:
○ Use a softmax activation for multi-class segmentation (for
pixel-wise classification).
5. Loss Function:
○ Use categorical cross-entropy loss for multi-class segmentation tasks.
6. Optimizer:
○ Use an optimizer like Adam to minimize the loss function.
7. Training:
○ Train the CNN model with ground truth data to learn to segment the
image.
8. Output:
○ The model will output segmented images with each pixel labeled
according to the predicted class.
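
Steps 4, 5, and 8 all operate per pixel. The NumPy sketch below shows the pixel-wise softmax, the per-pixel cross-entropy against an integer label map, and the argmax that turns probabilities into a segmentation map. The logits and the ground-truth mask are random placeholders, and the function names are hypothetical.

```python
import numpy as np

def pixelwise_softmax(logits):
    """Softmax over the channel axis of an (H, W, C) array."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def pixelwise_crossentropy(mask, probs, eps=1e-12):
    """mask: (H, W) integer class ids; probs: (H, W, C) class probabilities."""
    h, w = mask.shape
    # Pick, for every pixel, the probability assigned to its true class
    picked = probs[np.arange(h)[:, None], np.arange(w)[None, :], mask]
    return float(-np.mean(np.log(picked + eps)))

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 4, 3))       # placeholder network output, 3 classes
mask = rng.integers(0, 3, size=(4, 4))    # placeholder ground-truth label map

probs = pixelwise_softmax(logits)
predicted = probs.argmax(axis=-1)         # (H, W) map of predicted class ids
loss = pixelwise_crossentropy(mask, probs)
print(predicted)
print("loss:", round(loss, 4))
```

Note that Keras' categorical_crossentropy expects one-hot masks of shape (H, W, C), whereas the integer label map used here corresponds to the sparse variant of the loss.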
Program:

# Import libraries
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing import image

# Step 1: Load and Preprocess the Image
img_path = 'your_image_path.jpg'  # Replace with the path to your image
img = image.load_img(img_path, target_size=(128, 128))  # Resize image to 128x128
img_array = image.img_to_array(img)  # Convert image to numpy array
img_array = np.expand_dims(img_array, axis=0)  # Add batch dimension
img_array = img_array / 255.0  # Normalize pixel values

# Step 2: Build CNN Model for Image Segmentation
model = models.Sequential([
    # Convolutional layers for feature extraction (128 -> 64 -> 32 -> 16)
    layers.Conv2D(32, (3, 3), activation='relu', padding='same', input_shape=(128, 128, 3)),
    layers.MaxPooling2D((2, 2)),

    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),

    layers.Conv2D(128, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),

    # Upsampling to match the original image size (16 -> 32 -> 64 -> 128).
    # Three pooling steps need three stride-2 upsampling steps; with only two,
    # the output would be 64x64 instead of 128x128.
    layers.Conv2DTranspose(64, (3, 3), strides=(2, 2), padding='same'),
    layers.Conv2DTranspose(32, (3, 3), strides=(2, 2), padding='same'),
    layers.Conv2DTranspose(16, (3, 3), strides=(2, 2), padding='same'),

    # Output layer for pixel-wise classification:
    # 3 classes (background, object 1, object 2)
    layers.Conv2D(3, (1, 1), activation='softmax', padding='same')
])

# Step 3: Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Step 4: Make predictions on the input image
# Note: the model is untrained here, so the predicted classes are not meaningful yet
segmentation_output = model.predict(img_array)

# Step 5: Display the original image and segmented output
plt.figure(figsize=(12, 6))

# Original image
plt.subplot(1, 2, 1)
plt.imshow(img)
plt.title("Original Image")
plt.axis('off')

# Segmented output: argmax over the class axis gives one label per pixel
plt.subplot(1, 2, 2)
plt.imshow(np.argmax(segmentation_output[0], axis=-1))
plt.title("Segmented Image")
plt.axis('off')

plt.show()

Inference:

1. Prepare the Image:

○ Ensure that the image file (e.g., 'your_image_path.jpg')
exists at the specified location on your system.
2. Model Training:
○ The model you’ve built has no pre-trained weights. To
get meaningful results, you'll need to either train the
model on labeled segmentation data or load a
pre-trained model for segmentation tasks.
3. Visualize the Results:
○ Once the model is trained or fine-tuned, running the
prediction on the input image will display the segmented
regions.

Output:

● Original Image: Displays the original image (e.g., a photo of an

object or scene).
● Segmented Image: The segmented output, with each pixel
labeled according to the predicted class (background, object 1,
or object 2), will be displayed with the mapped colors.
Result:

After running the program, the output will show:

1. The original image.

2. The segmented image, where each pixel is classified into one of the
predefined classes (e.g., background, object 1, object 2).

Example:

● Original Image: An image of a cat.

● Segmented Image: The cat’s pixels are classified, with background pixels in
one color and the cat’s body in another color.
EX : 6 6. RNN for video processing

Aim:
To develop a Recurrent Neural Network (RNN) model for video processing,
extracting temporal features to classify actions or detect events in a video
sequence.

Algorithm:
1. Data Preparation:
● Load video files and split them into individual frames.
● Preprocess frames (resize, normalize, and convert to tensors).
● Organize frames into sequences corresponding to video clips.
2. Model Design:
● Use a Convolutional Neural Network (CNN) to extract spatial
features from individual frames.
● Feed the extracted features into an RNN (e.g., LSTM or GRU) to
capture temporal dependencies between frames.
3. Training:
● Split the dataset into training and validation sets.
● Train the RNN on sequences of features extracted from the video
frames.
● Use a suitable loss function (e.g., categorical
crossentropy for classification tasks).
4. Evaluation:
● Test the trained model on unseen video sequences.
● Evaluate performance metrics like accuracy or F1-score.
5. Prediction:
● Use the trained model to predict actions or events on
new video sequences.
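
The core of step 2 is that an RNN carries a hidden state from frame to frame, so the order of frames matters. The NumPy sketch below runs a plain (non-LSTM) recurrent cell over a sequence of per-frame feature vectors. All sizes and weights are random placeholders, and rnn_forward is a hypothetical helper, a simplification of what an LSTM or GRU layer does.

```python
import numpy as np

def rnn_forward(features, w_x, w_h, b):
    """Run h_t = tanh(x_t W_x + h_{t-1} W_h + b) over a (T, D) feature sequence."""
    h = np.zeros(w_h.shape[0])
    for x_t in features:
        h = np.tanh(x_t @ w_x + h @ w_h + b)
    return h  # final hidden state summarizes the whole clip

rng = np.random.default_rng(0)
T, D, H = 30, 16, 8   # frames per clip, feature size per frame, hidden size (placeholders)
features = rng.normal(size=(T, D))          # stands in for CNN features of each frame
w_x = rng.normal(scale=0.1, size=(D, H))
w_h = rng.normal(scale=0.1, size=(H, H))
b = np.zeros(H)

h_forward = rnn_forward(features, w_x, w_h, b)
h_reversed = rnn_forward(features[::-1], w_x, w_h, b)
print(h_forward.round(3))
print("Same result when frames are reversed?", np.allclose(h_forward, h_reversed))
```

Feeding the same frames in reverse order gives a different final state, which is the property that lets a recurrent model distinguish, for example, sitting down from standing up.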

Program:
import numpy as np
import cv2
from tensorflow.keras import layers, models

# Step 1: Load and preprocess video data
def load_video(video_path, frame_size=(64, 64), max_frames=30):
    cap = cv2.VideoCapture(video_path)
    frames = []
    while len(frames) < max_frames:
        ret, frame = cap.read()
        if not ret:
            break
        frame = cv2.resize(frame, frame_size)  # Resize frame
        frame = frame / 255.0  # Normalize
        frames.append(frame)
    cap.release()
    if not frames:
        raise ValueError(f"No frames could be read from {video_path}")
    frames = np.array(frames)
    if len(frames) < max_frames:
        # Pad with black frames if the clip has fewer than max_frames frames
        padding = np.zeros((max_frames - len(frames), *frames[0].shape))
        frames = np.concatenate((frames, padding))
    return frames

# Sample video path (replace with your file)
video_path = "sample_video.mp4"
video_data = load_video(video_path)

# Step 2: Create dataset
video_sequences = np.expand_dims(video_data, axis=0)  # Add batch dimension
labels = np.array([0])  # Example label for one video

# Step 3: Build RNN model (per-frame CNN features followed by an LSTM)
cnn_model = models.Sequential([
    layers.TimeDistributed(layers.Conv2D(32, (3, 3), activation='relu'), input_shape=(30, 64, 64, 3)),
    layers.TimeDistributed(layers.MaxPooling2D((2, 2))),
    layers.TimeDistributed(layers.Flatten()),
    layers.LSTM(64, return_sequences=False),  # Temporal modeling
    layers.Dense(32, activation='relu'),
    layers.Dense(1, activation='sigmoid')  # Example for binary classification
])

cnn_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Step 4: Training
# Uncomment to train; without training, predictions come from random initial weights
# cnn_model.fit(video_sequences, labels, epochs=10)  # Example training

# Step 5: Prediction
predictions = cnn_model.predict(video_sequences)
print("Predictions:", predictions)

Output:
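1. Predictions:
The output is a classification probability for the video sequence, e.g.,
Predictions: [[0.85]]
Because the training call is commented out in the program, the printed value
comes from randomly initialized weights until the model is trained.
2. Visual Output:
Not directly provided in this code but can be added to visualize feature
extraction or prediction results.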

Result:
The RNN model successfully processed video sequences and predicted the
classification result for the input video. The performance of the model
depends on the quality of the training dataset and the temporal resolution
of the video clips.
EX : 7 7. Implementation of Deep Generative model for
Image editing

Aim:
To develop a deep generative model that can perform image editing tasks,
such as modifying specific attributes (e.g., changing color, adding objects) or
transforming one image into another (e.g., style transfer or image-to-image
translation).

Algorithm:
1. Data Collection:
● Collect a dataset of images that represent the type of editing
you want to perform (e.g., faces for facial attribute editing,
landscapes for image-to-image translation).
2. Model Architecture:
● Use a Generative Adversarial Network (GAN) or Variational
Autoencoder (VAE) to learn the distribution of images.
● The generator network creates modified images, while the
discriminator evaluates how realistic the generated images
are.
● For image-to-image translation, models like pix2pix or
CycleGAN are useful.
3. Training:
● Train the model on the collected image dataset, ensuring the
generator learns to produce realistic edits and
transformations.
● Use loss functions like adversarial loss (from the
discriminator) and L1 loss (for pixel accuracy).
4. Image Editing:
● Provide an image as input and apply the desired
transformations or edits based on learned features.
5. Evaluation:
● Evaluate the model by comparing generated images with
ground truth images (real images or manually edited
images).
● Assess quality using metrics like Inception Score (IS) or Fréchet
Inception Distance (FID).
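
The two loss terms named in step 3 can be written out in a few lines. The NumPy sketch below computes a numerically stable binary cross-entropy on raw discriminator scores (the quantity PyTorch's BCEWithLogitsLoss computes with its default mean reduction) and an L1 pixel loss, then combines them for the generator. The l1_weight parameter and the function names are assumptions for illustration.

```python
import numpy as np

def bce_with_logits(logits, targets):
    """Numerically stable binary cross-entropy on raw (pre-sigmoid) scores."""
    return float(np.mean(np.maximum(logits, 0) - logits * targets
                         + np.log1p(np.exp(-np.abs(logits)))))

def l1_loss(a, b):
    """Mean absolute difference between two images."""
    return float(np.mean(np.abs(a - b)))

def generator_loss(disc_logits_on_fake, generated, target, l1_weight=1.0):
    """Adversarial term (fake images labeled as real) plus weighted pixel term."""
    adv = bce_with_logits(disc_logits_on_fake, np.ones_like(disc_logits_on_fake))
    return adv + l1_weight * l1_loss(generated, target)

# Placeholder values: an undecided discriminator and an all-black generated image
disc_scores = np.zeros((2, 1))
generated = np.zeros((2, 3))
target = np.ones((2, 3))
print(generator_loss(disc_scores, generated, target))
```

A discriminator score of 0 corresponds to a probability of 0.5, so the adversarial term equals log(2) in this example; the pixel term adds the mean absolute error of 1.0.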
Program:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import transforms, datasets


# Define Generator (encoder-decoder)
class Generator(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.encoder = nn.Sequential(
            self._conv_block(in_channels, 64, 4, 2, 1),
            self._conv_block(64, 128, 4, 2, 1),
            self._conv_block(128, 256, 4, 2, 1),
            self._conv_block(256, 512, 4, 2, 1)
        )
        self.decoder = nn.Sequential(
            self._deconv_block(512, 256, 4, 2, 1),
            self._deconv_block(256, 128, 4, 2, 1),
            self._deconv_block(128, 64, 4, 2, 1),
            nn.ConvTranspose2d(64, out_channels, kernel_size=4, stride=2, padding=1),
            nn.Tanh()
        )

    def _conv_block(self, in_channels, out_channels, kernel_size, stride, padding):
        return nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.LeakyReLU(0.2, inplace=True)
        )

    def _deconv_block(self, in_channels, out_channels, kernel_size, stride, padding):
        return nn.Sequential(
            nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride, padding, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True)
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x


# Define Discriminator (outputs a map of per-patch scores)
class Discriminator(nn.Module):
    def __init__(self, in_channels):
        super().__init__()
        self.model = nn.Sequential(
            self._conv_block(in_channels, 64, 4, 2, 1),
            self._conv_block(64, 128, 4, 2, 1),
            self._conv_block(128, 256, 4, 2, 1),
            nn.Conv2d(256, 1, kernel_size=4, stride=1, padding=1)
        )

    def _conv_block(self, in_channels, out_channels, kernel_size, stride, padding):
        return nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.LeakyReLU(0.2, inplace=True)
        )

    def forward(self, x):
        return self.model(x)


# Hyperparameters
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
lr = 2e-4
batch_size = 16
epochs = 100
image_size = 128
in_channels = 3
out_channels = 3

# Data Preparation
transform = transforms.Compose([
    transforms.Resize((image_size, image_size)),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # scale to [-1, 1]
])

dataset = datasets.ImageFolder(root='path/to/dataset', transform=transform)
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

# Initialize Models
generator = Generator(in_channels, out_channels).to(device)
discriminator = Discriminator(in_channels + out_channels).to(device)

# Optimizers and Losses
optimizer_G = optim.Adam(generator.parameters(), lr=lr, betas=(0.5, 0.999))
optimizer_D = optim.Adam(discriminator.parameters(), lr=lr, betas=(0.5, 0.999))
adversarial_loss = nn.BCEWithLogitsLoss()
pixel_loss = nn.L1Loss()

# Training Loop
for epoch in range(epochs):
    # ImageFolder yields (image, class_index), not (input, target) image pairs.
    # Placeholder assumption: the image itself serves as the target. For real
    # editing tasks, use a dataset that returns paired (input, target) images.
    for i, (input_image, _) in enumerate(dataloader):
        input_image = input_image.to(device)
        target_image = input_image

        # Train Generator
        optimizer_G.zero_grad()
        generated_image = generator(input_image)
        disc_fake = discriminator(torch.cat((input_image, generated_image), dim=1))
        # The discriminator returns a patch map, so the labels must match its shape
        real_labels = torch.ones_like(disc_fake)
        fake_labels = torch.zeros_like(disc_fake)
        g_loss = adversarial_loss(disc_fake, real_labels) + pixel_loss(generated_image, target_image)
        g_loss.backward()
        optimizer_G.step()

        # Train Discriminator
        optimizer_D.zero_grad()
        disc_real = discriminator(torch.cat((input_image, target_image), dim=1))
        real_loss = adversarial_loss(disc_real, real_labels)
        disc_fake = discriminator(torch.cat((input_image, generated_image.detach()), dim=1))
        fake_loss = adversarial_loss(disc_fake, fake_labels)
        d_loss = (real_loss + fake_loss) / 2
        d_loss.backward()
        optimizer_D.step()

        # Logging
        if i % 50 == 0:
            print(f"Epoch [{epoch}/{epochs}] Batch {i}/{len(dataloader)} - "
                  f"Loss D: {d_loss.item()}, Loss G: {g_loss.item()}")

# Save Models
torch.save(generator.state_dict(), "generator.pth")
torch.save(discriminator.state_dict(), "discriminator.pth")

print("Training Complete!")
Inference :

1. Image Input Transformation:

The input image is resized to 128x128 pixels, normalized to the range [-1, 1],
and converted into a tensor, making it suitable for deep learning models.
2. Generator Functionality:
The generator takes the input image and learns to modify or transform it into the
desired output image using an encoder-decoder architecture.
3. Discriminator Role:
The discriminator evaluates the generator's output by distinguishing
between real images (ground truth) and generated images. It learns to guide
the generator to improve its output.
4. Loss Calculation:
Two types of losses are used:
○ Adversarial Loss: Ensures the generated images are realistic enough to
fool the discriminator.
○ Pixel Loss (L1 Loss): Ensures the generated image matches the
target image at the pixel level.
5. Training Loop:
The training alternates between improving the generator and the
discriminator. The generator aims to produce convincing image edits,
while the discriminator learns to identify imperfections in the generated
images.
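
The pixel loss compares each generated image with a target image, so training needs (input, target) pairs, whereas torchvision's ImageFolder yields (image, class index) tuples. One possible way to build such pairs, sketched below with the standard library only, is to match files by name across two folders. The folder layout (one directory of inputs, one of edited targets with identical file names) and the helper list_image_pairs are assumptions, not part of the original program.

```python
from pathlib import Path

def list_image_pairs(input_dir, target_dir, extensions=(".jpg", ".png")):
    """Return sorted (input_path, target_path) pairs that share a file name."""
    input_dir, target_dir = Path(input_dir), Path(target_dir)
    pairs = []
    for src in sorted(input_dir.iterdir()):
        dst = target_dir / src.name
        # Keep only image files that have a counterpart in the target folder
        if src.suffix.lower() in extensions and dst.is_file():
            pairs.append((src, dst))
    return pairs

# Hypothetical usage: inputs in data/input, edited versions in data/target
# pairs = list_image_pairs("data/input", "data/target")
# for input_path, target_path in pairs:
#     ...  # open both images and apply the same transform to each
```

A custom Dataset could wrap this list and return both images per index, so that each batch carries an input tensor and its matching target tensor.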

Simple Autoencoder-Based Generator :

import torch
import torch.nn as nn
import torchvision.transforms as T
from PIL import Image


# Generator (Autoencoder)
class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, 64, 4, 2, 1), nn.ReLU())
        self.decoder = nn.Sequential(nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh())

    def forward(self, x):
        return self.decoder(self.encoder(x))


# Load image and preprocess
transform = T.Compose([T.Resize((64, 64)), T.ToTensor(), T.Normalize((0.5,), (0.5,))])
image = transform(Image.open("image.jpg").convert("RGB")).unsqueeze(0)

# Initialize model, edit image, and save output
# Note: the weights are randomly initialized here; load trained weights for meaningful edits
generator = Generator()
generator.eval()
with torch.no_grad():
    edited_image = generator(image)
# Undo the [-1, 1] normalization before converting back to an image
output = T.ToPILImage()((edited_image.squeeze(0) * 0.5 + 0.5).clamp(0, 1))
output.save("edited_image.jpg")

Result:
After running the training loop, the model will output a trained generator capable of
performing image edits.
The generator can take an input image, process it through the learned network,
and produce the transformed or edited version of the image (e.g., modifying
facial features or changing the style of a landscape).
The saved model files (generator.pth, discriminator.pth) can be used for inference in
future image editing tasks.
