Medical Image Computing
Assignment 4
Submission:
Submit all of your code and results in a single zip file named
FirstName_RollNumber_04.zip
• Submit a single zip file containing
(a) code (b) report (c) saved models (d) Readme.txt
• There should be a Report.pdf detailing your experience and highlighting any
interesting results. Do not explain your code in the report; explain only the
results. The report should include your comments on the results of all the
steps, with images, for example what happened when you changed the
learning rate.
• Readme.txt should explain how to run your code; preferably, the code should
accept command-line arguments, e.g. the dataset path used for training the
model.
• Root directory should be named as FirstName_RollNumber_04
• Email the instructor or TA if there are any questions. You may not look at or
use anyone else's code; however, you may discuss the assignment with each other.
Plagiarism will lead to a straight zero, with additional consequences as well.
Due Date: 11:59 PM on Tuesday, 12th Nov 2024
Task 1: Medical Image Classification with PathMNIST Dataset
Objective:
In this task, you will implement a convolutional neural network (CNN) in PyTorch to classify medical images
using the PathMNIST dataset. You will learn how to load medical image data, preprocess it, define a CNN,
train the model, evaluate its performance, and visualize the results.
Dataset: PathMNIST
PathMNIST is a medical imaging dataset from the MedMNIST collection. It contains pathology images of
size 28x28 pixels, with each image classified into one of nine categories:
● Classes:
["tumor", "stroma", "complex", "lympho", "debris", "mucosa",
"adipose", "empty", "background"]
Each class represents a different type of tissue or background. Your task is to correctly classify these images
using a CNN.
Expected Output
You will train a CNN to classify images into these nine categories, evaluate its performance, and visualize the
results, including sample predictions and a confusion matrix to show classification performance.
1. Environment Setup
Before starting, ensure you have the necessary packages installed:
pip install torch torchvision medmnist matplotlib
2. Data Loading
Task: Load the PathMNIST Dataset
Use the MedMNIST library to load the PathMNIST dataset.
1. Download and Load the Dataset:
○ Load the training, validation, and test sets with the PathMNIST class provided by
medmnist.
○ Preprocess the images to normalize them and convert them into tensors.
2. Implement DataLoader:
○ Use PyTorch’s DataLoader to batch the data and enable shuffling for the training set.
Explanation:
● transforms.ToTensor(): Converts images to PyTorch tensors.
● transforms.Normalize(): Scales the data to the range [-1, 1] for better model convergence.
● DataLoader: Wraps the dataset in batches and allows for efficient data loading during training.
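A minimal sketch of this loading step, assuming the documented medmnist API (the PathMNIST class with split, transform, and download arguments); the batch size and the 0.5 normalization constants are illustrative choices, not requirements:

import torch
from torch.utils.data import DataLoader
from torchvision import transforms
from medmnist import PathMNIST

# Scale images to roughly [-1, 1]; PathMNIST images are RGB, hence three channels.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])

# download=True fetches the dataset on first use.
train_set = PathMNIST(split="train", transform=transform, download=True)
val_set = PathMNIST(split="val", transform=transform, download=True)
test_set = PathMNIST(split="test", transform=transform, download=True)

# Shuffle only the training set; validation and test order does not matter.
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
val_loader = DataLoader(val_set, batch_size=128, shuffle=False)
test_loader = DataLoader(test_set, batch_size=128, shuffle=False)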
3. Visualize Sample Data
Before implementing the model, visualize some images with their labels to understand the dataset better.
Task: Display a Few Images
● Use matplotlib to plot a 3x3 grid of sample images.
● Display each image along with its label to familiarize yourself with the dataset.
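One possible way to plot the 3x3 grid, reusing the train_set defined in the loading sketch; the un-normalization uses the same 0.5 constants chosen there:

import matplotlib.pyplot as plt

class_names = ["tumor", "stroma", "complex", "lympho", "debris",
               "mucosa", "adipose", "empty", "background"]

fig, axes = plt.subplots(3, 3, figsize=(6, 6))
for idx, ax in enumerate(axes.flat):
    img, label = train_set[idx]          # img: tensor [3, 28, 28], label: array of shape [1]
    img = img * 0.5 + 0.5                # undo the Normalize transform for display
    ax.imshow(img.permute(1, 2, 0))      # CHW -> HWC for matplotlib
    ax.set_title(class_names[int(label[0])])
    ax.axis("off")
plt.tight_layout()
plt.show()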
4. Define the CNN Model
Task: Implement a CNN for Classification
Define a simple CNN model with PyTorch to classify the images. Use three convolutional layers, each
followed by max-pooling, and end with a fully connected layer for classification.
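A minimal sketch of such a CNN; the channel widths are illustrative, and the input is assumed to be 3x28x28 as in PathMNIST:

import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, num_classes=9):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                             # 28 -> 14
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                             # 14 -> 7
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                             # 7 -> 3
        )
        self.classifier = nn.Linear(64 * 3 * 3, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = x.flatten(1)                 # keep the batch dimension, flatten the rest
        return self.classifier(x)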
5. Train the Model
Task: Train the CNN
1. Initialize the model, define the loss function as CrossEntropyLoss, and set up the optimizer
using Adam.
2. Create a training loop that:
○ Runs for several epochs.
○ Computes the loss and updates weights.
3. Validate the model on the validation set after each epoch.
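A condensed sketch of this training loop, assuming the SimpleCNN and data loaders sketched above; the learning rate and epoch count are placeholders. Note that medmnist returns labels of shape (batch, 1), hence the squeeze before CrossEntropyLoss:

import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SimpleCNN(num_classes=9).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(10):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.squeeze(1).long().to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

    # Validate after each epoch.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in val_loader:
            images, labels = images.to(device), labels.squeeze(1).long().to(device)
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    print(f"epoch {epoch + 1}: validation accuracy = {correct / total:.3f}")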
6. Evaluate the Model
Task: Test and Evaluate Classification Performance
1. Testing:
○ Use the test set to evaluate the model after training and report the overall accuracy.
2. Visualization:
○ Plot a confusion matrix of predicted versus true labels on the test set (see the sketch below).
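A sketch of the test-set evaluation, collecting predictions for the confusion matrix; it reuses the trained model and test_loader from the earlier steps and assumes scikit-learn is available (not in the pip line above, so install it separately if you use it):

import torch
from sklearn.metrics import confusion_matrix, classification_report

model.eval()
all_preds, all_labels = [], []
with torch.no_grad():
    for images, labels in test_loader:
        preds = model(images.to(device)).argmax(dim=1).cpu()
        all_preds.extend(preds.tolist())
        all_labels.extend(labels.squeeze(1).tolist())

accuracy = sum(p == t for p, t in zip(all_preds, all_labels)) / len(all_labels)
print(f"test accuracy: {accuracy:.3f}")
print(confusion_matrix(all_labels, all_preds))
print(classification_report(all_labels, all_preds, target_names=class_names))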
7. Visualize Predictions
Task: Display Sample Predictions
● Choose a few test images and display them with their true and predicted labels.
● This will give you a sense of how well your model is performing visually.
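One possible sketch of this visualization, reusing the trained model, test_loader, class_names, and un-normalization constants from the previous steps:

import matplotlib.pyplot as plt
import torch

model.eval()
images, labels = next(iter(test_loader))
with torch.no_grad():
    preds = model(images.to(device)).argmax(dim=1).cpu()

fig, axes = plt.subplots(2, 4, figsize=(10, 5))
for i, ax in enumerate(axes.flat):
    img = images[i] * 0.5 + 0.5                          # undo Normalize for display
    ax.imshow(img.permute(1, 2, 0))
    ax.set_title(f"true: {class_names[int(labels[i])]}\n"
                 f"pred: {class_names[int(preds[i])]}", fontsize=8)
    ax.axis("off")
plt.tight_layout()
plt.show()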
Task 2: Image Segmentation with UNet using Pascal VOC 2012
Objective
In this assignment, you will implement a UNet model in PyTorch to perform object segmentation using the
Pascal VOC 2012 dataset. This will cover data loading, model training, and evaluation, with visualizations of
segmentation masks.
Dataset: Pascal VOC 2012
The Pascal VOC 2012 dataset provides images and segmentation masks for various everyday objects. We’ll
use it to train a UNet model to segment objects in each image.
1. Dataset Link: Available through torchvision.datasets.
2. Classes: Background and 20 object categories (e.g., person, car, dog).
Expected Output
The trained model will output segmentation masks, highlighting objects in each image. Evaluation will
include visualizations of predicted masks alongside the true masks.
Assignment Steps
1. Environment Setup
Ensure the necessary packages are installed:
pip install torch torchvision matplotlib
2. Data Loading
Task: Load the Pascal VOC Dataset
1. Download and Load the Dataset:
○ Use torchvision.datasets.VOCSegmentation to load the Pascal VOC dataset
with the segmentation split.
2. Data Preprocessing:
○ Resize images and masks to a smaller size (e.g., 128x128) for quicker training.
○ Normalize the images and convert them to tensors.
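A sketch of this loading and preprocessing step; the 128x128 size, the ImageNet-style normalization constants, and the batch size are illustrative choices. Masks are resized with nearest-neighbour interpolation so that class indices are not blended:

import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision.datasets import VOCSegmentation

SIZE = (128, 128)

image_transform = transforms.Compose([
    transforms.Resize(SIZE),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def mask_transform(mask):
    # Nearest-neighbour resize keeps the class indices intact; masks become LongTensors.
    mask = mask.resize(SIZE, Image.NEAREST)
    return torch.as_tensor(np.array(mask), dtype=torch.long)

train_set = VOCSegmentation(root="data", year="2012", image_set="train", download=True,
                            transform=image_transform, target_transform=mask_transform)
val_set = VOCSegmentation(root="data", year="2012", image_set="val", download=True,
                          transform=image_transform, target_transform=mask_transform)

train_loader = DataLoader(train_set, batch_size=8, shuffle=True)
val_loader = DataLoader(val_set, batch_size=8, shuffle=False)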
3. Visualize Sample Data
● Before building the model, visualize some images and their corresponding masks.
● Display each image and its mask to understand the data better.
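A quick sketch for inspecting one sample, assuming the datasets defined in the loading sketch; the un-normalization constants match the ones used in the image transform:

import matplotlib.pyplot as plt
import torch

img, mask = train_set[0]
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))
ax1.imshow((img * std + mean).permute(1, 2, 0).clamp(0, 1))
ax1.set_title("image")
ax1.axis("off")
ax2.imshow(mask, cmap="tab20")           # class indices rendered with a categorical colormap
ax2.set_title("mask")
ax2.axis("off")
plt.show()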
4. Define the UNet Model
Task: Implement a Simple UNet Model
Define a UNet model in PyTorch with an encoder, a bottleneck, and a decoder with skip connections.
The UNet model has three main parts: an encoder to capture context, a bottleneck to process feature
representations, and a decoder to upsample the features and produce the segmentation map.
Explanation of Each Part:
● Encoder: The encoder captures contextual information from the input image. It consists of
convolutional layers followed by max-pooling to reduce the spatial dimensions while increasing
feature depth.
● Bottleneck: The bottleneck represents the deepest layer of the network. It processes and refines the
features from the encoder.
● Decoder: The decoder uses upsampling (or transposed convolutions) to reconstruct the spatial
dimensions. It combines upsampled features with skip connections from the encoder, which helps in
retaining details lost during encoding.
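A compact sketch of such a UNet; the depth (two encoder stages plus a bottleneck) and channel widths are scaled down for illustration, and the 21 output channels correspond to the background plus the 20 VOC classes:

import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions with ReLU: the basic UNet building block.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class UNet(nn.Module):
    def __init__(self, num_classes=21):
        super().__init__()
        self.enc1 = conv_block(3, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)
        self.dec2 = conv_block(128, 64)          # 64 upsampled + 64 from the skip connection
        self.up1 = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)
        self.dec1 = conv_block(64, 32)           # 32 upsampled + 32 from the skip connection
        self.head = nn.Conv2d(32, num_classes, kernel_size=1)

    def forward(self, x):
        e1 = self.enc1(x)                        # encoder stage 1
        e2 = self.enc2(self.pool(e1))            # encoder stage 2
        b = self.bottleneck(self.pool(e2))       # deepest features
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))   # skip connection from e2
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection from e1
        return self.head(d1)                     # per-pixel class scores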
5. Train the Model
Task: Train the UNet
1. Define Loss Function and Optimizer:
○ Use Cross-Entropy Loss for multi-class segmentation.
○ Set up the optimizer (e.g., Adam).
2. Training Loop:
○ Create a loop with forward pass, loss computation, backward pass, and optimization steps.
○ After each epoch, evaluate on the validation set.
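A sketch of this training loop, assuming the UNet and data loaders sketched above; ignore_index=255 skips the white boundary pixels in the standard VOC mask encoding, and the learning rate and epoch count are placeholders:

import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
unet = UNet(num_classes=21).to(device)
criterion = nn.CrossEntropyLoss(ignore_index=255)    # VOC marks object boundaries with 255
optimizer = torch.optim.Adam(unet.parameters(), lr=1e-3)

for epoch in range(20):
    unet.train()
    running_loss = 0.0
    for images, masks in train_loader:
        images, masks = images.to(device), masks.to(device)
        optimizer.zero_grad()
        loss = criterion(unet(images), masks)        # logits: [B, 21, H, W], masks: [B, H, W]
        loss.backward()
        optimizer.step()
        running_loss += loss.item()

    # Evaluate on the validation set after each epoch.
    unet.eval()
    val_loss = 0.0
    with torch.no_grad():
        for images, masks in val_loader:
            val_loss += criterion(unet(images.to(device)), masks.to(device)).item()
    print(f"epoch {epoch + 1}: train loss {running_loss / len(train_loader):.3f}, "
          f"val loss {val_loss / len(val_loader):.3f}")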
6. Evaluate the Model
Task: Test and Visualize Segmentation
1. Visualization:
○ Display a few test images, predicted masks, and ground truth masks to analyze the model’s
performance.
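One possible sketch of this visualization, comparing predicted and ground-truth masks on a validation batch (the VOC 2012 test split has no public labels, so the validation set stands in for testing here):

import matplotlib.pyplot as plt
import torch

unet.eval()
images, masks = next(iter(val_loader))
with torch.no_grad():
    preds = unet(images.to(device)).argmax(dim=1).cpu()  # per-pixel class predictions

mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

fig, axes = plt.subplots(3, 3, figsize=(9, 9))
for row in range(3):
    axes[row, 0].imshow((images[row] * std + mean).permute(1, 2, 0).clamp(0, 1))
    axes[row, 0].set_title("image")
    axes[row, 1].imshow(preds[row], cmap="tab20", vmin=0, vmax=20)
    axes[row, 1].set_title("predicted mask")
    axes[row, 2].imshow(masks[row], cmap="tab20", vmin=0, vmax=20)
    axes[row, 2].set_title("ground truth")
    for col in range(3):
        axes[row, col].axis("off")
plt.tight_layout()
plt.show()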
Report
In the report you will describe the critical decisions you made, important things you learned,
and any choices that shaped how you wrote your algorithm in a particular way. You are required to
report the accuracy you achieved. For each experiment, you are required to provide an analysis
of the various hyperparameters.
Plot loss and accuracy curves for both training and testing data, with and without mean image
subtraction, and report the difference in their accuracy and loss curves (a sketch of the two
preprocessing variants follows).
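One way to set up the two preprocessing variants for this comparison (a sketch; the per-channel mean values are placeholders and should be computed from your own training split):

from torchvision import transforms

# Variant A: no mean subtraction, only scaling to [0, 1].
transform_raw = transforms.ToTensor()

# Variant B: subtract the per-channel training mean (placeholder values shown);
# std of 1.0 leaves the scale unchanged so only the mean is removed.
train_mean = [0.5, 0.5, 0.5]
transform_mean_sub = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=train_mean, std=[1.0, 1.0, 1.0]),
])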
Task 1:
Deliverables
1. Code Notebook with each step implemented.
2. Plots of training/validation loss and accuracy.
3. Confusion Matrix and Classification Report for test set performance.
4. Sample Prediction Visualizations with true and predicted labels.
Grading Criteria
● Correct Implementation: Each step should be implemented as described.
● Accuracy: Higher accuracy on the validation and test set is rewarded.
● Analysis and Explanation: Insightful explanation of results and observations on model
performance in report.
● Visualization: Clear and informative visualizations of sample predictions and model
performance metrics.
Task 2:
Deliverables
1. Code Notebook with each step implemented.
2. Plots of training and validation loss.
3. Sample Segmentation Visualizations with true and predicted masks.
Grading Criteria
● Correct Implementation: Each step should be implemented as described.
● Segmentation Accuracy: High overlap between predicted and true masks.
● Analysis and Explanation: Insightful explanation of results and observations on model
performance.
● Visualization: Clear and informative visualizations of segmentation results.