Neural Networks From Scratch
In this notebook, we will focus on the following 4-layer neural network with fully connected layers. Ideally, you can modify the layers in PyTorch and TensorFlow to use convolutions and filters instead.
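As a rough illustration of that swap (this snippet is an addition, and the layer sizes here are assumptions rather than part of the walkthrough):

import torch.nn as nn

# A fully connected layer over flattened 28x28 images, as used below
fc = nn.Linear(784, 128)

# A convolutional alternative: 1 input channel (grayscale MNIST),
# 32 filters with 3x3 kernels; inputs must then keep their 2D shape
conv = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3)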
Choosing a Dataset
For this walkthrough, we will focus on importing the MNIST dataset and using it as the input to our deep neural networks. Note that this is purely a demonstration of how to build a neural network from scratch; it is NOT the recommended architecture for solving MNIST. We will reuse some code from another article, Activation Functions Explained.
In [ ]: from sklearn.datasets import fetch_openml
from keras.utils import to_categorical
import numpy as np
from sklearn.model_selection import train_test_split
import time
x, y = fetch_openml('mnist_784', version=1, return_X_y=True)
x = (x/255).astype('float32')
y = to_categorical(y)
x_train, x_val, y_train, y_val = train_test_split(x, y, test_size=0.15, random_state=42)  # the random_state value is an assumption; this line was truncated in the source
/usr/local/lib/python3.10/dist-packages/sklearn/datasets/_openml.py:968: FutureWarning: The default value of `parser` will change from `'liac-arff'` to `'auto'` in 1.4. You can set `parser='auto'` to silence this warning. Therefore, an `ImportError` will be raised from 1.4 if the dataset is dense and pandas is not installed. Note that the pandas parser may return different data types. See the Notes Section in fetch_openml's API doc for details.
  warn(
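The warning above is harmless here. Per its own suggestion, passing parser='auto' silences it, though note the pandas-based parser may return different data types (e.g. a DataFrame instead of an ndarray), so this variant is a sketch rather than a drop-in guarantee:

from sklearn.datasets import fetch_openml

# parser='auto' opts into the future default and silences the FutureWarning
x, y = fetch_openml('mnist_784', version=1, return_X_y=True, parser='auto')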
PyTorch
PyTorch is another popular framework for numerical computation, and it is widely used in the research community. In PyTorch, you still have to do a fair amount of work yourself: aligning the dimensions of the data and specifying the layers and the forward pass exactly. This is not a bad thing, though, since it makes it much easier to customize the components of a neural network.
Loading MNIST with PyTorch
Loading data with PyTorch is immediately more involved than what we saw previously. This was one of the things that threw me off at first, but it becomes straightforward once you understand what the DataLoader returns and how to access the data in the objects it yields.
In [ ]: import torch
from torchvision import datasets, transforms

# Normalize with the commonly used MNIST mean and standard deviation
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# No batch_size is specified, so each DataLoader yields one sample at a time
train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('data', train=True, download=True, transform=transform))
test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('data', train=False, transform=transform))
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to data/MNIST/raw/train-images-idx3-ubyte.gz
100%|██████████| 9912422/9912422 [00:00<00:00, 84765527.67it/s]
Extracting data/MNIST/raw/train-images-idx3-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to data/MNIST/raw/train-labels-idx1-ubyte.gz
100%|██████████| 28881/28881 [00:00<00:00, 91079469.04it/s]
Extracting data/MNIST/raw/train-labels-idx1-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz
100%|██████████| 1648877/1648877 [00:00<00:00, 56135482.12it/s]
Extracting data/MNIST/raw/t10k-images-idx3-ubyte.gz to data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to data/MNIST/raw/t10k-labels-idx1-ubyte.gz
100%|██████████| 4542/4542 [00:00<00:00, 2136667.65it/s]
Extracting data/MNIST/raw/t10k-labels-idx1-ubyte.gz to data/MNIST/raw
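To demystify what the DataLoader returns, here is a quick inspection (this cell is an addition, not part of the original notebook): each iteration yields an (image, label) pair of tensors, and with no batch_size set the batch dimension is 1.

# Peek at the first batch from the train_loader defined above
x, y = next(iter(train_loader))
print(x.shape)  # torch.Size([1, 1, 28, 28]) -> (batch, channel, height, width)
print(y.shape)  # torch.Size([1]) -> one integer class label per sample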
4-layer Neural Network With PyTorch
In [ ]: import time
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, epochs=10):
        super(Net, self).__init__()

        # Three fully connected layers: 784 -> 128 -> 64 -> 10
        self.linear1 = nn.Linear(784, 128)
        self.linear2 = nn.Linear(128, 64)
        self.linear3 = nn.Linear(64, 10)
        self.epochs = epochs

    def forward_pass(self, x):
        x = self.linear1(x)
        x = torch.sigmoid(x)
        x = self.linear2(x)
        x = torch.sigmoid(x)
        x = self.linear3(x)
        # dim=0 because we feed a single flattened sample, so the
        # output is a 1D tensor of 10 class scores
        x = torch.softmax(x, dim=0)
        return x

    def one_hot_encode(self, y):
        # Turn the integer label into a one-hot vector of length 10;
        # float32 matches the model's output dtype (the original used
        # float64, which newer PyTorch versions may reject here)
        encoded = torch.zeros([10], dtype=torch.float32)
        encoded[y[0]] = 1.
        return encoded

    # Note: this deliberately overrides nn.Module.train()
    def train(self, train_loader, optimizer, criterion):
        start_time = time.time()
        loss = None

        for iteration in range(self.epochs):
            for x, y in train_loader:
                y = self.one_hot_encode(y)

                optimizer.zero_grad()
                output = self.forward_pass(torch.flatten(x))
                loss = criterion(output, y)
                loss.backward()
                optimizer.step()

            print('Epoch: {0}, Time Spent: {1:.2f}s, Loss: {2}'.format(
                iteration+1, time.time() - start_time, loss
            ))
In [ ]: model = Net()
optimizer = optim.SGD(model.parameters(), lr=0.001)
criterion = nn.BCEWithLogitsLoss()
model.train(train_loader, optimizer, criterion)
Epoch: 1, Time Spent: 83.66s, Loss: 0.7329011663794518
Epoch: 2, Time Spent: 167.53s, Loss: 0.73334516659379
Epoch: 3, Time Spent: 250.46s, Loss: 0.7337711714208126
Epoch: 4, Time Spent: 333.27s, Loss: 0.7341624341905117
Epoch: 5, Time Spent: 415.96s, Loss: 0.7345092415809631
Epoch: 6, Time Spent: 497.98s, Loss: 0.7348068140447139
Epoch: 7, Time Spent: 580.77s, Loss: 0.7350574970245362
Epoch: 8, Time Spent: 662.58s, Loss: 0.7352710545063019
Epoch: 9, Time Spent: 744.86s, Loss: 0.7354604937136173
Epoch: 10, Time Spent: 826.54s, Loss: 0.7356384605169296
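The loss barely improves, likely because the learning rate is small and the softmax output is passed through BCEWithLogitsLoss, which applies its own sigmoid on top. To check what the model actually learned, here is an evaluation sketch over the test_loader (an addition, not from the original notebook):

# Count correct predictions over the test set (batch size 1)
correct = 0
total = 0
with torch.no_grad():
    for x, y in test_loader:
        output = model.forward_pass(torch.flatten(x))
        if torch.argmax(output).item() == y[0].item():
            correct += 1
        total += 1
print('Test accuracy: {:.2%}'.format(correct / total))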
TensorFlow 2.0 with Keras
Now that we know just how much code lies behind a simple neural network in NumPy and PyTorch, let's look at how easily we can construct the same network in TensorFlow (with Keras).

With TensorFlow and Keras, we don't have to think as much about activation functions, optimizers, etc., since they are already implemented. On top of this, we will see huge improvements in the time it takes to execute and train a neural network, since these frameworks are heavily optimized compared to plain NumPy.

The following approach is a pure Keras solution, without a custom training function or anything else that is TensorFlow-specific. Go to the end of my TensorFlow 2.0 tutorial to see what a custom training function looks like; a rough sketch also follows below.
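For reference, a minimal sketch of such a custom training step with tf.GradientTape (illustrative only; the model, loss_fn, and optimizer here are assumed to be defined elsewhere):

import tensorflow as tf

@tf.function
def train_step(model, optimizer, loss_fn, x_batch, y_batch):
    # Record operations for automatic differentiation
    with tf.GradientTape() as tape:
        predictions = model(x_batch, training=True)
        loss = loss_fn(y_batch, predictions)
    # Compute gradients and apply one optimizer update
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return loss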
In [ ]: !pip install --upgrade tensorflow-gpu
In [ ]: import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.layers import Flatten, Dense
from tensorflow.keras.losses import BinaryCrossentropy
In [ ]: (x_train, y_train), (x_val, y_val) = mnist.load_data()
x_train = x_train.astype('float32') / 255
y_train = to_categorical(y_train)
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 [==============================] - 2s 0us/step
In [ ]: model = tf.keras.Sequential([
    Flatten(input_shape=(28, 28)),
    Dense(128, activation='sigmoid'),
    Dense(64, activation='sigmoid'),
    Dense(10)  # no activation, so this layer outputs raw logits
])

# Note: BinaryCrossentropy defaults to from_logits=False, yet the final
# layer emits logits; this mismatch helps explain the unstable training below
model.compile(optimizer='SGD',
              loss=BinaryCrossentropy(),
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10)
Epoch 1/10
1875/1875 [==============================] - 11s 3ms/step - loss: 1.5230 - accuracy: 0.3088
Epoch 2/10
1875/1875 [==============================] - 5s 3ms/step - loss: 0.6463 - accuracy: 0.5299
Epoch 3/10
1875/1875 [==============================] - 6s 3ms/step - loss: 2.5284 - accuracy: 0.1874
Epoch 4/10
1875/1875 [==============================] - 5s 3ms/step - loss: 3.4365 - accuracy: 0.1396
Epoch 5/10
1875/1875 [==============================] - 6s 3ms/step - loss: 5.0578 - accuracy: 0.2379
Epoch 6/10
1875/1875 [==============================] - 5s 3ms/step - loss: 5.9086 - accuracy: 0.1849
Epoch 7/10
1875/1875 [==============================] - 5s 3ms/step - loss: 7.6598 - accuracy: 0.0974
Epoch 8/10
1875/1875 [==============================] - 6s 3ms/step - loss: 7.6598 - accuracy: 0.0974
Epoch 9/10
1875/1875 [==============================] - 5s 3ms/step - loss: 7.6598 - accuracy: 0.0974
Epoch 10/10
1875/1875 [==============================] - 6s 3ms/step - loss: 7.6598 - accuracy: 0.0974
Out[ ]: <keras.src.callbacks.History at 0x7ed9adea0e80>
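Training clearly diverges with this loss setup. As a follow-up sketch (not in the original notebook), the held-out split from mnist.load_data() can be preprocessed the same way as the training data and scored with model.evaluate:

# Apply the same preprocessing to the validation split, then evaluate
x_val = x_val.astype('float32') / 255
y_val = to_categorical(y_val)
val_loss, val_acc = model.evaluate(x_val, y_val)
print('Validation accuracy: {:.2%}'.format(val_acc))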