Auto Encoder
1. Introduction
An autoencoder is a type of neural network used for unsupervised learning. It learns to
compress data (encode) and then reconstruct the original input (decode). Autoencoders are
widely used in image compression, denoising, and feature extraction.
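The encode-then-decode idea can be sketched with a deliberately tiny model. This is an illustration only, not the convolutional model built later in this document: the class name TinyAutoencoder and the sizes 16 and 4 are hypothetical choices.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # reproducible random weights and inputs


class TinyAutoencoder(nn.Module):
    """Hypothetical minimal autoencoder: 16 features -> 4 -> 16."""

    def __init__(self, in_features=16, latent_features=4):
        super().__init__()
        self.encode = nn.Linear(in_features, latent_features)  # compress
        self.decode = nn.Linear(latent_features, in_features)  # reconstruct

    def forward(self, x):
        z = self.encode(x)
        return z, self.decode(z)


model = TinyAutoencoder()
x = torch.rand(8, 16)            # batch of 8 samples, 16 features each
z, x_hat = model(x)

latent_shape = tuple(z.shape)    # (8, 4): the compressed code
recon_shape = tuple(x_hat.shape) # (8, 16): same shape as the input
print(latent_shape, recon_shape)
```

No labels appear anywhere: the training target is the input itself, which is why the method counts as unsupervised.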
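2. Setup and Data Loading
Purpose: Imports the libraries, sets the hyperparameters, and loads the training images.
Code: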
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.transforms as transforms
import torchvision.datasets as datasets
from torch.utils.data import DataLoader
# Hyperparameters
batch_size = 32
learning_rate = 0.001
num_epochs = 5
image_size = 128
# Data Transformations
transform = transforms.Compose([
    transforms.Resize((image_size, image_size)),  # Resize images to 128x128
    transforms.ToTensor(),  # Convert images to tensors in the 0-1 range
])
# Load Dataset
train_dataset = datasets.ImageFolder(root='Data/train', transform=transform)
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
3. Encoder Module
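Purpose: Downsamples the image into a latent feature map with a smaller spatial size. Each stride-2 convolution halves the height and width, so a 128x128 input becomes a 16x16 map with 256 channels.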
Code:
class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Three stride-2 convolutions; each halves the height and width
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 256, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.encoder(x)
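The spatial size after each encoder layer follows from the standard convolution output-size formula. The sketch below works through it for the settings used above (kernel 3, stride 2, padding 1); the helper name conv2d_out is hypothetical.

```python
def conv2d_out(size, kernel_size=3, stride=2, padding=1):
    """Output length along one spatial axis for a Conv2d layer (dilation 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


size = 128
sizes = [size]
for _ in range(3):  # three stride-2 convolutions
    size = conv2d_out(size)
    sizes.append(size)

input_elements = 3 * 128 * 128                  # values in one RGB input
latent_elements = 256 * sizes[-1] * sizes[-1]   # values in one latent map
print(sizes)                             # [128, 64, 32, 16]
print(input_elements, latent_elements)   # 49152 65536
```

With these channel counts the latent map holds more values than the input (65536 versus 49152), so the reduction is spatial only. Fewer output channels in the last layer would be needed for the latent representation to be smaller in total size.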
4. Decoder Module
Purpose: Reconstructs the original image from the encoded representation.
Code:
class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Three stride-2 transposed convolutions; each doubles the height and width
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(256, 128, kernel_size=3, stride=2, padding=1,
                               output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, kernel_size=3, stride=2, padding=1,
                               output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.Sigmoid(),  # Keeps outputs in the 0-1 range, matching ToTensor()
        )

    def forward(self, x):
        return self.decoder(x)
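The role of output_padding=1 can be checked with the transposed-convolution output-size formula. This is a small arithmetic sketch; the helper name conv_transpose2d_out is hypothetical.

```python
def conv_transpose2d_out(size, kernel_size=3, stride=2, padding=1, output_padding=1):
    """Output length along one spatial axis for ConvTranspose2d (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel_size + output_padding


size = 16
sizes = [size]
for _ in range(3):  # three stride-2 transposed convolutions
    size = conv_transpose2d_out(size)
    sizes.append(size)

# Same first layer without output_padding: the result is one pixel short
without_padding = conv_transpose2d_out(16, output_padding=0)
print(sizes)            # [16, 32, 64, 128]
print(without_padding)  # 31
```

Without output_padding, each layer would produce an odd size (31 instead of 32), and the reconstruction would no longer match the 128x128 input that the loss compares it against.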
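5. Autoencoder Module
Purpose: Chains the encoder and decoder into a single model that returns both the latent representation and the reconstruction.
Code:
class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = Encoder()
        self.decoder = Decoder()

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return encoded, decoded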
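A quick shape check of the full encode-decode path is sketched below. To stay self-contained it rebuilds the same layer stacks with plain nn.Sequential rather than the classes above, and the random batch is a stand-in for real images.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Same layer settings as the encoder and decoder in this document
encoder = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(128, 256, kernel_size=3, stride=2, padding=1), nn.ReLU(),
)
decoder = nn.Sequential(
    nn.ConvTranspose2d(256, 128, kernel_size=3, stride=2, padding=1, output_padding=1), nn.ReLU(),
    nn.ConvTranspose2d(128, 64, kernel_size=3, stride=2, padding=1, output_padding=1), nn.ReLU(),
    nn.ConvTranspose2d(64, 3, kernel_size=3, stride=2, padding=1, output_padding=1),
    nn.Sigmoid(),
)

images = torch.rand(2, 3, 128, 128)  # random stand-in batch of 2 RGB images
with torch.no_grad():
    encoded = encoder(images)
    decoded = decoder(encoded)

encoded_shape = tuple(encoded.shape)
decoded_shape = tuple(decoded.shape)
print(encoded_shape, decoded_shape)
```

The reconstruction has the same shape as the input, which is what allows an element-wise loss between the two.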
6. Training Module
Purpose: Trains the autoencoder using MSE loss.
Code:
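# Setup: device, model, loss, and optimizer
# Adam is assumed here; any torch.optim optimizer can be substituted
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
autoencoder = Autoencoder().to(device)
criterion = nn.MSELoss()
optimizer = optim.Adam(autoencoder.parameters(), lr=learning_rate)

print("Training Autoencoder...")
for epoch in range(num_epochs):
    total_loss = 0.0
    for images, _ in train_loader:  # Class labels are ignored
        images = images.to(device)
        optimizer.zero_grad()
        encoded, decoded = autoencoder(images)
        loss = criterion(decoded, images)  # Reconstruction error against the input itself
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    # Average of the per-batch losses for this epoch
    print(f"Epoch [{epoch + 1}/{num_epochs}], Loss: {total_loss / len(train_loader):.4f}")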
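The MSE loss used for training is the mean of the squared differences between the reconstruction and the input. A worked example in plain Python, with made-up pixel values, shows the arithmetic; the helper name mse is hypothetical.

```python
def mse(prediction, target):
    """Mean squared error over two equal-length sequences of numbers."""
    assert len(prediction) == len(target)
    squared_errors = [(p - t) ** 2 for p, t in zip(prediction, target)]
    return sum(squared_errors) / len(squared_errors)


# Made-up pixel values in the 0-1 range
reconstruction = [0.5, 0.0, 1.0, 0.25]
original = [0.0, 0.0, 1.0, 0.75]

loss = mse(reconstruction, original)
print(loss)  # (0.25 + 0 + 0 + 0.25) / 4 = 0.125
```

nn.MSELoss with its default reduction ("mean") computes the same quantity over every element of the batch, so the reported epoch loss is an average of per-batch means.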
7. GUI Interface (Gradio)
Purpose: Provides a web interface for users to upload images and see reconstructions.
Code:
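import gradio as gr
from PIL import Image

def reconstruct_image(img):
    # img arrives as a PIL image; convert it to a 1x3x128x128 batch in the 0-1 range
    img = img.convert("RGB")
    img = transform(img).unsqueeze(0).to(device)
    with torch.no_grad():
        encoded, reconstructed = autoencoder(img)
    # Drop the batch dimension and move channels last (HWC) for display
    original_np = img.squeeze(0).permute(1, 2, 0).cpu().numpy()
    reconstructed_np = reconstructed.squeeze(0).permute(1, 2, 0).cpu().numpy()
    return original_np, reconstructed_np

# Assumed interface definition: one PIL image in, two images out
interface = gr.Interface(
    fn=reconstruct_image,
    inputs=gr.Image(type="pil"),
    outputs=[gr.Image(label="Original (resized)"), gr.Image(label="Reconstructed")],
)
interface.launch(share=True)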
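The squeeze and permute calls in reconstruct_image convert PyTorch's batch-first, channels-first layout into the height x width x channel array that image viewers expect. A small sketch with a counting tensor, using hypothetical sizes, makes the reordering visible.

```python
import torch

# Batch of 1 image, 3 channels, height 2, width 4, filled with 0..23
batch = torch.arange(1 * 3 * 2 * 4, dtype=torch.float32).reshape(1, 3, 2, 4)

# (1, C, H, W) -> (C, H, W) -> (H, W, C)
hwc = batch.squeeze(0).permute(1, 2, 0).cpu().numpy()

hwc_shape = hwc.shape
top_left_pixel = hwc[0, 0].tolist()  # one value from each of the 3 channels
print(hwc_shape, top_left_pixel)     # (2, 4, 3) [0.0, 8.0, 16.0]
```

The values at one pixel now sit next to each other along the last axis, which is the layout NumPy-based image tools use.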