Auto Encoder

An autoencoder is a neural network for unsupervised learning that compresses and reconstructs data, commonly used in image processing. The document outlines the components of an autoencoder, including data preparation, encoder, decoder, training, and a GUI interface using Gradio. It provides code snippets for each module, detailing the architecture and training process for image reconstruction.

Uploaded by

kebawa4655

Autoencoder

1. Introduction
An autoencoder is a type of neural network used for unsupervised learning. It learns to
compress data (encode) and then reconstruct the original input (decode). Autoencoders are
widely used in image compression, denoising, and feature extraction.

2. Data Preparation Module


Purpose: Loads and preprocesses images for training.

Code:

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.transforms as transforms
import torchvision.datasets as datasets
from torch.utils.data import DataLoader

# Hyperparameters
batch_size = 32
learning_rate = 0.001
num_epochs = 5
image_size = 128

# Data Transformations
transform = transforms.Compose([
    transforms.Resize((image_size, image_size)),  # Resize images to 128x128
    transforms.ToTensor(),                        # Convert images to tensor (0-1 range)
])

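# Load Dataset
# ImageFolder expects one subdirectory per class under Data/train;
# the class labels it returns are ignored during training.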
train_dataset = datasets.ImageFolder(root='Data/train', transform=transform)
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
3. Encoder Module
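Purpose: Encodes the image into a spatially smaller latent feature map. Each stride-2 convolution halves the height and width, so a 128x128 input becomes 16x16 with 256 channels.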

Code:

class Encoder(nn.Module):
    def __init__(self):
        super(Encoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 256, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.encoder(x)
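
The spatial size after each encoder layer follows the standard convolution output-size formula, floor((n + 2p - k) / s) + 1. The sketch below traces a 128x128 input through the three layers in plain Python; `conv_out` is a hypothetical helper name, not part of the original code.

```python
def conv_out(n, kernel_size=3, stride=2, padding=1):
    """Output length of one spatial dimension after a Conv2d layer (dilation 1)."""
    return (n + 2 * padding - kernel_size) // stride + 1

size = 128
sizes = [size]
for _ in range(3):  # the Encoder stacks three stride-2 convolutions
    size = conv_out(size)
    sizes.append(size)

latent_values = 256 * size * size  # channels x height x width of the latent map
input_values = 3 * 128 * 128       # channels x height x width of the input

print(sizes)                        # [128, 64, 32, 16]
print(latent_values, input_values)  # 65536 49152
```

With these layer widths the latent map is spatially smaller but holds more values than the input. If a true reduction in size is the goal, one option would be fewer output channels in the last layer or an additional downsampling step.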

4. Decoder Module
Purpose: Reconstructs the original image from the encoded representation.

Code:

class Decoder(nn.Module):
    def __init__(self):
        super(Decoder, self).__init__()
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(256, 128, kernel_size=3, stride=2, padding=1,
                               output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, kernel_size=3, stride=2, padding=1,
                               output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, kernel_size=3, stride=2, padding=1,
                               output_padding=1),
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.decoder(x)
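
The role of output_padding=1 can be checked with the transposed-convolution size formula, (n - 1) * s - 2p + k + output_padding. The sketch below is plain Python; `conv_transpose_out` is a hypothetical helper name, not part of the original code.

```python
def conv_transpose_out(n, kernel_size=3, stride=2, padding=1, output_padding=1):
    """Output length of one spatial dimension after ConvTranspose2d (dilation 1)."""
    return (n - 1) * stride - 2 * padding + kernel_size + output_padding

size = 16
decoder_sizes = [size]
for _ in range(3):  # the Decoder stacks three stride-2 transposed convolutions
    size = conv_transpose_out(size)
    decoder_sizes.append(size)

print(decoder_sizes)                                # [16, 32, 64, 128]
print(conv_transpose_out(16, output_padding=0))     # 31, one pixel short of 32
```

Without output_padding each layer would produce 2n - 1 pixels, so the reconstruction would not match the 128x128 input and the MSE loss could not be computed element-wise.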
5. Autoencoder Module
Purpose: Combines the encoder and decoder into a single model.

Code:

class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        self.encoder = Encoder()
        self.decoder = Decoder()

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return encoded, decoded
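
A quick shape check on random data can confirm that the encoder and decoder fit together before any training. The sketch below restates the same layers as two nn.Sequential stacks so that it runs on its own; the random batch `x` is an assumption standing in for real images.

```python
import torch
import torch.nn as nn

# Same layer settings as the Encoder and Decoder classes, written compactly
encoder = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(128, 256, kernel_size=3, stride=2, padding=1), nn.ReLU(),
)
decoder = nn.Sequential(
    nn.ConvTranspose2d(256, 128, kernel_size=3, stride=2, padding=1, output_padding=1), nn.ReLU(),
    nn.ConvTranspose2d(128, 64, kernel_size=3, stride=2, padding=1, output_padding=1), nn.ReLU(),
    nn.ConvTranspose2d(64, 3, kernel_size=3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),
)

x = torch.rand(2, 3, 128, 128)  # hypothetical batch of two RGB images in [0, 1)
with torch.no_grad():
    encoded = encoder(x)
    decoded = decoder(encoded)

print(encoded.shape)  # torch.Size([2, 256, 16, 16])
print(decoded.shape)  # torch.Size([2, 3, 128, 128])
```

The final Sigmoid keeps every reconstructed pixel between 0 and 1, matching the range produced by ToTensor.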

6. Training Module
Purpose: Trains the autoencoder with the Adam optimizer, minimizing the mean squared error (MSE) between each input image and its reconstruction.

Code:

device = 'cuda' if torch.cuda.is_available() else 'cpu'

autoencoder = Autoencoder().to(device)
criterion = nn.MSELoss()
optimizer = optim.Adam(autoencoder.parameters(), lr=learning_rate)

print("Training Autoencoder...")
for epoch in range(num_epochs):
    total_loss = 0
    for images, _ in train_loader:
        images = images.to(device)
        optimizer.zero_grad()
        encoded, decoded = autoencoder(images)
        loss = criterion(decoded, images)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f"Epoch [{epoch + 1}/{num_epochs}], Loss: {total_loss / len(train_loader):.4f}")
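
To make the loss concrete, the sketch below compares nn.MSELoss with the same quantity computed by hand on a tiny example. The `target` and `output` tensors are made-up values, not data from the training set.

```python
import torch
import torch.nn as nn

# Hypothetical 2x2 "image" and its reconstruction
target = torch.tensor([[0.0, 0.5], [1.0, 0.25]])
output = torch.tensor([[0.5, 0.5], [0.0, 0.25]])

criterion = nn.MSELoss()  # default reduction is the mean over all elements
loss = criterion(output, target)

# Same value by hand: average of the squared element-wise differences
manual = ((output - target) ** 2).mean()

print(loss.item())    # 0.3125
print(manual.item())  # 0.3125
```

Because the loss is averaged over every pixel and channel, the per-epoch figure printed by the training loop is the mean of these per-batch averages.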
7. GUI Interface (Gradio)
Purpose: Provides a web interface for users to upload images and see reconstructions.

Code:

import gradio as gr
from PIL import Image

def reconstruct_image(img):
    img = img.convert("RGB")
    img = transform(img).unsqueeze(0).to(device)
    with torch.no_grad():
        encoded, reconstructed = autoencoder(img)
    original_np = img.squeeze(0).permute(1, 2, 0).cpu().numpy()
    reconstructed_np = reconstructed.squeeze(0).permute(1, 2, 0).cpu().numpy()
    return original_np, reconstructed_np

# Create Gradio Interface
interface = gr.Interface(
    fn=reconstruct_image,
    inputs=gr.Image(type="pil"),
    outputs=[gr.Image(type="numpy", label="Original Image"),
             gr.Image(type="numpy", label="Reconstructed Image")],
    title="Autoencoder Image Reconstruction",
    description="Upload an image to see how the autoencoder reconstructs it."
)

interface.launch(share=True)
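
The function above returns float arrays in the 0-1 range. If explicit 8-bit images are preferred for display or saving, one option is to convert them first. This is a minimal sketch; `to_uint8_image` is a hypothetical helper, not part of the original code.

```python
import numpy as np

def to_uint8_image(arr):
    """Map a float HxWxC array in [0, 1] to uint8 in [0, 255]."""
    clipped = np.clip(arr, 0.0, 1.0)  # guard against values outside 0-1
    return (clipped * 255.0).round().astype(np.uint8)

# Hypothetical 1x1 RGB pixel with one out-of-range channel
example = np.array([[[0.0, 1.0, 1.5]]], dtype=np.float32)
converted = to_uint8_image(example)

print(converted.dtype)     # uint8
print(converted.tolist())  # [[[0, 255, 255]]]
```

The helper could be applied to original_np and reconstructed_np just before the return statement in reconstruct_image.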
