
PyTorch Tutorial for two-layer NN for classification


Here, we import and check the version of torch .

import torch
import torchvision

import numpy as np
from tqdm.notebook import tqdm
print(torch.__version__)

2.6.0+cu124

import matplotlib.pyplot as plt


import math
%matplotlib inline

from torchvision import datasets, transforms  # load the dataset

mnist_train = datasets.MNIST('data', train=True, download=True,
                             transform=transforms.ToTensor())

mnist_test = datasets.MNIST('../data', train=False, download=True,
                            transform=transforms.ToTensor())

100%|██████████| 9.91M/9.91M [00:00<00:00, 56.0MB/s]


100%|██████████| 28.9k/28.9k [00:00<00:00, 1.66MB/s]
100%|██████████| 1.65M/1.65M [00:00<00:00, 12.3MB/s]
100%|██████████| 4.54k/4.54k [00:00<00:00, 7.08MB/s]
100%|██████████| 9.91M/9.91M [00:00<00:00, 57.0MB/s]
100%|██████████| 28.9k/28.9k [00:00<00:00, 1.70MB/s]
100%|██████████| 1.65M/1.65M [00:00<00:00, 12.8MB/s]
100%|██████████| 4.54k/4.54k [00:00<00:00, 4.20MB/s]

print(mnist_train) # print info of the data

Dataset MNIST
Number of datapoints: 60000
Root location: data
Split: Train
StandardTransform
Transform: ToTensor()

We can easily visualize the images and their corresponding label as below. See how index 0 for a given sample corresponds to the image,
and index 1 is the label.

indices = [1, 12000, 344]

fig = plt.figure(figsize=(len(indices) * 4, 4))

for i, index in enumerate(indices):
    ax = fig.add_subplot(1, len(indices), i + 1)
    example = mnist_train[index]
    ax.imshow(example[0].reshape(28, 28), cmap=plt.cm.gray)
    ax.set_title("Label: {}".format(example[1]))


PyTorch's DataLoader is responsible for managing batches. You can create a DataLoader from any Dataset. A DataLoader makes it easier to iterate over batches (it can shuffle the data and give you the next mini-batch).

A drawback of wrapping a Dataset in a DataLoader is that the DataLoader does not allow indexing. That is why, if we want to get a batch from a DataLoader without iterating over it in a loop, we have to convert it to an Iterator and then call next() on it.

from torch.utils.data import DataLoader


train_dl = DataLoader(mnist_train, batch_size=100, shuffle=False)
# we set shuffle=False here for reproducible visualization; in practice, when training via SGD, set shuffle=True

dataiter = iter(train_dl)
images, labels = next(dataiter)
viz = torchvision.utils.make_grid(images, nrow=10, padding = 2).numpy()
fig, ax = plt.subplots(figsize= (8,8))
ax.imshow(np.transpose(viz, (1,2,0)))
ax.set_xticks([])
ax.set_yticks([])
plt.show()


Thanks to PyTorch's ability to calculate gradients automatically, we can define the model and let torch do all the gradient updates!
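As a quick illustration, here is a minimal autograd sketch (the tensors are made up for illustration): calling .backward() on a scalar loss fills in the .grad attribute of every tensor created with requires_grad=True, which is exactly what the optimizer will read later.

w = torch.tensor([1.0, 2.0], requires_grad=True)   # parameters we want gradients for
x = torch.tensor([3.0, 4.0])                       # fixed input
loss = ((w * x).sum() - 1.0) ** 2                  # a scalar loss built from w
loss.backward()                                    # autograd computes d(loss)/dw
print(w.grad)                                      # tensor([60., 80.]), since d(loss)/dw = 2*(w·x - 1)*x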

Some helper functions:


def accuracy(out, yb): # the accuracy evaluation
    preds = torch.argmax(out, dim=1)
    return (preds == yb).float().mean()

def get_test_stat(model, dl, device): # return the test loss and test accuracy
    model.eval() # set model to eval mode; only matters if we have dropout / normalization layers
    cum_loss, cum_acc = 0.0, 0.0
    total_samples = 0

    for i, (xb, yb) in enumerate(dl):
        xb = xb.to(device)
        yb = yb.to(device)

        xb = xb.view(xb.size(0), -1)
        f_pred = model(xb)
        loss = loss_fn(f_pred, yb)
        acc = accuracy(f_pred, yb)
        cum_loss += loss.item() * len(yb)
        cum_acc += acc.item() * len(yb)
        total_samples += len(yb)

    cum_loss /= total_samples
    cum_acc /= total_samples
    model.train() # set model back to train mode
    return cum_loss, cum_acc

Then, we build a neural network with one hidden layer by extending the torch.nn.Module class. This keeps the code modularized, and it is how larger and more complicated models (e.g. ConvNets, self-attention in LLMs) are also built in PyTorch.

The torch.nn.Module is the base class for all neural network models in PyTorch. It provides the infrastructure for:

• Defining layers (e.g., nn.Linear, nn.Conv2d)

• Registering parameters so optimizers can update them

• Saving/loading model state (state_dict)

To build a custom network, subclass nn.Module and: 1. define the layers in __init__(); 2. implement the forward pass in forward().

class Parent:
def __init__(self):
print("Parent init")

class Child(Parent):
def __init__(self):
print("Child init")
print("--------")

class Child2(Parent):
def __init__(self):
super().__init__() # will also call Parent.__init__() by using super().
print("Child init")
print("--------")

eg1 = Child()

eg2 = Child2()

Child init
--------
Parent init
Child init
--------

import torch.nn.functional as F

class LR(torch.nn.Module): # defines a simple one-layer NN for classification; it is equivalent to logistic regression
    def __init__(self, input_dim, output_dim):
        super(LR, self).__init__()
        # define the parameters here
        self.fc = torch.nn.Linear(input_dim, output_dim) # in linear regression output_dim = 1; in logistic regression, output_dim = K

    def forward(self, x): # defines the forward pass (overriding the default method)
        out = self.fc(x) # pass input through the first layer
        return out

class Net(torch.nn.Module): # defines a simple two-layer NN for classification

    def __init__(self, input_dim, hidden_dim, output_dim):
        super(Net, self).__init__()
        # define the parameters here
        self.fc = torch.nn.Linear(input_dim, hidden_dim) # first layer; FC means fully connected. You can also try more layers
        self.out_layer = torch.nn.Linear(hidden_dim, output_dim) # output layer, i.e. the last layer

    def forward(self, x): # defines the forward pass (overriding the default method)
        out = self.fc(x) # pass input through the first layer
        out = F.relu(out) # apply ReLU activation; you can change it to other activations such as tanh
        out = self.out_layer(out) # pass through the output layer
        return out
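Because LR and Net subclass nn.Module, their parameters are registered automatically and the model state can be saved and reloaded through state_dict, as mentioned above. A minimal sketch (the file name net.pt is just a placeholder):

net = Net(784, 32, 10)
print(sum(p.numel() for p in net.parameters()))    # registered parameters: 784*32 + 32 + 32*10 + 10 = 25450
torch.save(net.state_dict(), "net.pt")             # save the weights only (not the class definition)
net2 = Net(784, 32, 10)
net2.load_state_dict(torch.load("net.pt"))         # reload into a model with the same architecture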

We can now train the network. Note that instead of manually updating the weights ourselves, we use a built-in PyTorch optimizer, torch.optim.SGD. Many other optimizers are available too (https://pytorch.org/docs/stable/optim.html).
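Swapping in a different optimizer only changes one line; for example, a sketch with illustrative hyperparameters (the Adam learning rate 1e-3 is a common default, not tuned for this problem):

tmp_model = Net(784, 32, 10)
opt_sgd_momentum = torch.optim.SGD(tmp_model.parameters(), lr=1e-2, momentum=0.9)  # SGD with momentum
opt_adam = torch.optim.Adam(tmp_model.parameters(), lr=1e-3)                       # adaptive per-parameter learning rates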

The output of our defined neural net is $f_{\text{pred}}$, and the predicted probability is

$$\hat{P}(y = k) = \frac{e^{f_{\text{pred},k}}}{\sum_{j=0}^{9} e^{f_{\text{pred},j}}}.$$

The loss on a single data point is given by

$$\ell(y, f_{\text{pred}}) = -\sum_{k=0}^{9} \mathbb{I}\{y = k\} \cdot f_{\text{pred},k} + \log\Big(\sum_{j=0}^{9} e^{f_{\text{pred},j}}\Big).$$

This loss is nothing but the negative log-likelihood, usually called the cross-entropy loss in the ML convention.
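To see that torch.nn.CrossEntropyLoss computes exactly this quantity from the raw scores, here is a small numerical check (a sketch on a made-up batch of 3 samples; the tensors are illustrative only):

logits = torch.randn(3, 10)                        # f_pred for a batch of 3 samples, 10 classes
y = torch.tensor([0, 3, 7])                        # true labels
loss_builtin = torch.nn.CrossEntropyLoss()(logits, y)
# manual version of the formula above: -f_pred,y + log(sum_j exp(f_pred,j)), averaged over the batch
loss_manual = (-logits[torch.arange(3), y] + torch.logsumexp(logits, dim=1)).mean()
print(loss_builtin.item(), loss_manual.item())     # the two numbers agree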

learning_rate = 1e-2
epochs = 10
bs = 128 # mini-batch size
dim_x = 784 # dimension of the input features 784= 28 * 28
dim_out = 10 # dim_out here is set to be 10, as we output 10 scores of 10 classes

# instantiate the model


model_LR = LR(dim_x, dim_out)

optimizer = torch.optim.SGD(model_LR.parameters(), lr=learning_rate)

# create datasets and data loaders

mnist_train = datasets.MNIST('data', train=True, download=True,
                             transform=transforms.ToTensor())

mnist_test = datasets.MNIST('../data', train=False, download=True,
                            transform=transforms.ToTensor())

train_dl = DataLoader(mnist_train, batch_size=bs, shuffle=True)
# DataLoader handles the random shuffling and splitting into mini-batches needed for stochastic mini-batch gradient descent

test_dl = DataLoader(mnist_test, batch_size=100)

# Using GPUs in PyTorch is pretty straightforward


if torch.cuda.is_available():
print("Using cuda")
use_cuda = True
device = torch.device("cuda")
else:
device = "cpu"

model_LR.to(device)
loss_fn = torch.nn.CrossEntropyLoss()


# set the model to training mode


model_LR.train()

train_stats_LR = {
'epoch': [],
'loss': [],
'acc': []
}
test_stats_LR = {
'epoch': [],
'loss': [],
'acc': []
}

pbar = tqdm(range(epochs))
for epoch in pbar: # during one epoch, the model processes every sample in the training set exactly once
    pbar.set_description(f"Epoch {epoch + 1} / 10") # print the training progress
    train_loss = 0.0
    train_acc = 0.0
    for i, (xb, yb) in enumerate(train_dl): # each iteration here is one mini-batch gradient descent update
        xb = xb.to(device) # move the data from storage to the device used for computing gradients
        yb = yb.to(device)
        xb = xb.view(xb.size(0), -1) # the batch is a tensor of shape [batch_size, 1, 28, 28]; reshape it to [batch_size, 784]

        # Forward pass
        f_pred = model_LR(xb) # f_pred holds the scores/logits for each class, not the predicted label
        loss = loss_fn(f_pred, yb) # the loss defined above
        acc = accuracy(f_pred, yb) # accuracy was defined in the helper-function cell
        # Backward pass
        model_LR.zero_grad() # zero out the previous gradient computation, see Tut 3 for why
        loss.backward() # compute the gradient on the current mini-batch
        optimizer.step() # use the gradient information to update the parameters
        train_stats_LR['epoch'].append(epoch + i / len(train_dl)) # the current mini-batch index (in fractional epochs)
        train_stats_LR['loss'].append(loss.item()) # the current mini-batch training loss
        train_stats_LR['acc'].append(acc.item()) # the current mini-batch training accuracy

    test_loss_LR, test_acc_LR = get_test_stat(model_LR, test_dl, device) # the test loss and test accuracy
    test_stats_LR['epoch'].append(epoch + 1)
    test_stats_LR['loss'].append(test_loss_LR)
    test_stats_LR['acc'].append(test_acc_LR)

Epoch 10 / 10: 100% 10/10 [01:43<00:00, 10.35s/it]

learning_rate = 1e-2
epochs = 10
bs = 128 # mini-batch size
dim_x = 784 # dimension of the input features 784=28 * 28
dim_h = 32 # hidden layer dimension 32
dim_out = 10 # dim_out here is set to be 10, as we output 10 scores of 10 classes

# instantiate the model


model = Net(dim_x, dim_h, dim_out)

optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

# create datasets and data loaders

mnist_train = datasets.MNIST('data', train=True, download=True,
                             transform=transforms.ToTensor())

mnist_test = datasets.MNIST('../data', train=False, download=True,
                            transform=transforms.ToTensor())

train_dl = DataLoader(mnist_train, batch_size=bs, shuffle=True)
# DataLoader handles the random shuffling and splitting into mini-batches needed for stochastic mini-batch gradient descent

test_dl = DataLoader(mnist_test, batch_size=100)

# Using GPUs in PyTorch is pretty straightforward


if torch.cuda.is_available():
print("Using cuda")
use_cuda = True
device = torch.device("cuda")
else:
device = "cpu"

model.to(device)
loss_fn = torch.nn.CrossEntropyLoss()

# set the model to training mode


model.train()

train_stats = {
'epoch': [],
'loss': [],
'acc': []
}
test_stats = {
'epoch': [],
'loss': [],
'acc': []
}

pbar = tqdm(range(epochs))
for epoch in pbar: # during one epoch, the model processes every sample in the training set exactly once
    pbar.set_description(f"Epoch {epoch + 1} / 10") # print the training progress
    train_loss = 0.0
    train_acc = 0.0
    for i, (xb, yb) in enumerate(train_dl): # each iteration here is one mini-batch gradient descent update
        xb = xb.to(device) # move the data from storage to the device used for computing gradients
        yb = yb.to(device)
        xb = xb.view(xb.size(0), -1) # the batch is a tensor of shape [batch_size, 1, 28, 28]; reshape it to [batch_size, 784]

        # Forward pass
        f_pred = model(xb) # f_pred holds the scores/logits for each class, not the predicted label
        loss = loss_fn(f_pred, yb) # the loss defined above
        acc = accuracy(f_pred, yb) # accuracy was defined in the helper-function cell
        # Backward pass
        model.zero_grad() # zero out the previous gradient computation, see Tut 3 for why
        loss.backward() # compute the gradient on the current mini-batch
        optimizer.step() # use the gradient information to update the parameters
        train_stats['epoch'].append(epoch + i / len(train_dl)) # the current mini-batch index (in fractional epochs)
        train_stats['loss'].append(loss.item()) # the current mini-batch training loss
        train_stats['acc'].append(acc.item()) # the current mini-batch training accuracy

    test_loss, test_acc = get_test_stat(model, test_dl, device) # the test loss and test accuracy
    test_stats['epoch'].append(epoch + 1)
    test_stats['loss'].append(test_loss)
    test_stats['acc'].append(test_acc)

Epoch 10 / 10: 100% 10/10 [02:01<00:00, 12.21s/it]

Plot training and test loss & accuracy curves.

fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(6, 6))

axes[0].plot(train_stats_LR['epoch'], train_stats_LR['loss'], label='train_LR')
axes[0].plot(test_stats_LR['epoch'], test_stats_LR['loss'], label='test_LR')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Loss')

axes[1].plot(train_stats_LR['epoch'], train_stats_LR['acc'], label='train_LR')
axes[1].plot(test_stats_LR['epoch'], test_stats_LR['acc'], label='test_LR')
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Accuracy')

plt.legend()
plt.show()


fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(6, 6))

axes[0].plot(train_stats['epoch'], train_stats['loss'], label='train')


axes[0].plot(test_stats['epoch'], test_stats['loss'], label='test')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Loss')

axes[1].plot(train_stats['epoch'], train_stats['acc'], label='train')


axes[1].plot(test_stats['epoch'], test_stats['acc'], label='test')
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Accuracy')

plt.legend()
plt.show()

(train_stats['loss'][-1]), (train_stats_LR['loss'][-1])

(0.37917956709861755, 0.4087868630886078)


Weight visualization

We visualize the learned weights in the first layer of the network as images. Compared to the linear model before, this model has 32 hidden units with ReLU activation, enabling it to make use of a more diverse set of features.

nrows = 4
ncols = 8
first_layer_weights = model.fc.weight.detach().cpu().numpy()
fig, axes = plt.subplots(nrows=nrows, ncols=ncols, figsize=(6, 6))

for i in range(nrows):
for j in range(ncols):
axes[i, j].imshow(first_layer_weights[i * ncols + j].reshape((28, 28)), cmap='gray')
axes[i, j].set_xticks([])
axes[i, j].set_yticks([])

plt.tight_layout(pad=0.1)
plt.show()

nrows = 1
ncols = 10
first_layer_weights_LR = model_LR.fc.weight.detach().cpu().numpy()
fig, axes = plt.subplots(nrows=1, ncols=ncols, figsize=(12, 2))

for i in range(ncols):
axes[i].imshow(first_layer_weights_LR[i].reshape((28, 28)), cmap='gray')
axes[i].set_xticks([])
axes[i].set_yticks([])

plt.tight_layout(pad=0.1)
plt.show()


More advanced techniques in DL


Residual Connection:
Denote by $h^{(i)}$ the input to layer $i+1$ from the previous layer. The function defined in layer $i+1$ is $F^{(i+1)}(\cdot)$. Then the residual connection is defined as

$$\mathrm{Res}^{(i+1)} = F^{(i+1)}(h^{(i)}) + h^{(i)}.$$

When using a residual connection, the input to layer $i+2$ is $\mathrm{Res}^{(i+1)}$ instead of $h^{(i+1)} = F^{(i+1)}(h^{(i)})$.

AdamW:
An advanced mini-batch gradient method with momentum and weight decay.

Momentum means that at each mini-batch, the update $\Delta w$ is not purely the gradient on the current mini-batch, but a weighted average of the gradient on the current mini-batch and the gradients from previous mini-batches.

The weight decay implicitly introduces $\ell_2$-style regularization on the parameters to prevent overfitting.
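The resulting update rule can be sketched in a few lines (a simplified version with a single momentum buffer; full AdamW also keeps a second-moment estimate, and beta, lr, wd below are illustrative values):

beta, lr, wd = 0.9, 1e-2, 1e-3
w = torch.randn(5, requires_grad=True)             # a toy parameter vector
m = torch.zeros_like(w)                            # momentum buffer
for _ in range(3):                                 # a few toy update steps
    loss = (w ** 2).sum()
    loss.backward()
    with torch.no_grad():
        m = beta * m + (1 - beta) * w.grad         # weighted average of current and past gradients
        w -= lr * m                                # momentum step
        w -= lr * wd * w                           # decoupled weight decay (the "W" in AdamW)
    w.grad.zero_()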

Dropout:
Dropout is a regularization technique that randomly sets a fraction of neurons’ outputs to zero during training to prevent overfitting and
enhance robustness.
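In PyTorch, dropout is available as torch.nn.Dropout. A minimal sketch of how it could be added to the hidden layer of our Net (NetDropout is a hypothetical variant; p = 0.5 is an illustrative rate, and model.eval() disables dropout at test time):

class NetDropout(torch.nn.Module): # hypothetical variant of Net with dropout after the hidden layer
    def __init__(self, input_dim, hidden_dim, output_dim, p=0.5):
        super().__init__()
        self.fc = torch.nn.Linear(input_dim, hidden_dim)
        self.drop = torch.nn.Dropout(p)            # zero out each hidden unit with probability p during training
        self.out_layer = torch.nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        out = F.relu(self.fc(x))
        out = self.drop(out)                       # active in .train() mode, identity in .eval() mode
        return self.out_layer(out)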

class ResNet(torch.nn.Module): # defines a simple 3-layer NN with a residual connection for classification

    def __init__(self, input_dim, hidden_dim, output_dim):
        super(ResNet, self).__init__()
        # define the parameters here
        self.fc1 = torch.nn.Linear(input_dim, hidden_dim) # first layer; FC means fully connected. You can also try more layers
        self.fc2 = torch.nn.Linear(hidden_dim, hidden_dim)
        self.out_layer = torch.nn.Linear(hidden_dim, output_dim) # output layer, i.e. the last layer

    def forward(self, x): # defines the forward pass (overriding the default method)
        out = self.fc1(x) # pass input through the first layer
        out = F.relu(out)
        out = self.fc2(out) + out # residual connection: add the layer input to its output
        out = self.out_layer(out) # pass through the output layer
        return out

learning_rate = 1e-2
epochs = 10
bs = 128 # mini-batch size
dim_x = 784 # dimension of the input features 784=28 * 28
dim_h = 32 # hidden layer dimension 32
dim_out = 10 # dim_out here is set to be 10, as we output 10 scores of 10 classes

# instantiate the model


model_RN = ResNet(dim_x, dim_h, dim_out)

optimizer = torch.optim.AdamW(
model_RN.parameters(),
lr=learning_rate, # same learning rate variable
weight_decay=1e-3 # AdamW usually uses some weight decay (equiv to L2 regularization)
)

# create datasets and data loaders

mnist_train = datasets.MNIST('data', train=True, download=True,
                             transform=transforms.ToTensor())

mnist_test = datasets.MNIST('../data', train=False, download=True,
                            transform=transforms.ToTensor())

train_dl = DataLoader(mnist_train, batch_size=bs, shuffle=True)
# DataLoader handles the random shuffling and splitting into mini-batches needed for stochastic mini-batch gradient descent

test_dl = DataLoader(mnist_test, batch_size=100)

# Using GPUs in PyTorch is pretty straightforward


if torch.cuda.is_available():
print("Using cuda")
use_cuda = True
device = torch.device("cuda")
else:
device = "cpu"

model_RN.to(device)
loss_fn = torch.nn.CrossEntropyLoss()

# set the model to training mode


model_RN.train()

train_stats_RN = {
'epoch': [],
'loss': [],
'acc': []
}
test_stats_RN = {
'epoch': [],
'loss': [],
'acc': []
}

pbar = tqdm(range(epochs))
for epoch in pbar: # during one epoch, the model processes every sample in the training set exactly once
    pbar.set_description(f"Epoch {epoch + 1} / 10") # print the training progress
    train_loss_RN = 0.0
    train_acc_RN = 0.0
    for i, (xb, yb) in enumerate(train_dl): # each iteration here is one mini-batch gradient descent update
        xb = xb.to(device) # move the data from storage to the device used for computing gradients
        yb = yb.to(device)
        xb = xb.view(xb.size(0), -1) # the batch is a tensor of shape [batch_size, 1, 28, 28]; reshape it to [batch_size, 784]

        # Forward pass
        f_pred = model_RN(xb) # f_pred holds the scores/logits for each class, not the predicted label
        loss = loss_fn(f_pred, yb) # the loss defined above
        acc = accuracy(f_pred, yb) # accuracy was defined in the helper-function cell
        # Backward pass
        model_RN.zero_grad() # zero out the previous gradient computation, see Tut 3 for why
        loss.backward() # compute the gradient on the current mini-batch
        optimizer.step() # use the gradient information to update the parameters
        train_stats_RN['epoch'].append(epoch + i / len(train_dl)) # the current mini-batch index (in fractional epochs)
        train_stats_RN['loss'].append(loss.item()) # the current mini-batch training loss
        train_stats_RN['acc'].append(acc.item()) # the current mini-batch training accuracy

    test_loss, test_acc = get_test_stat(model_RN, test_dl, device) # the test loss and test accuracy
    test_stats_RN['epoch'].append(epoch + 1)
    test_stats_RN['loss'].append(test_loss)
    test_stats_RN['acc'].append(test_acc)

Epoch 10 / 10: 100% 10/10 [01:54<00:00, 11.32s/it]

fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(6, 6))

axes[0].plot(train_stats_RN['epoch'], train_stats_RN['loss'], label='train')


axes[0].plot(test_stats_RN['epoch'], test_stats_RN['loss'], label='test')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Loss')

axes[1].plot(train_stats_RN['epoch'], train_stats_RN['acc'], label='train')


axes[1].plot(test_stats_RN['epoch'], test_stats_RN['acc'], label='test')
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Accuracy')

plt.legend()
plt.show()

