0% found this document useful (0 votes)

9 views199 pages

Experiment 1: ANN Using Keras Library and Various Activation Experiment 1: ANN Using Keras Library and Various Activation Functions Functions

The document details an experiment using Keras to create and train artificial neural networks (ANNs) with various activation functions on a classification dataset. It includes steps for data generation, scaling, model creation, and training, highlighting the use of different activation functions like relu, sigmoid, tanh, selu, and elu. The training process and model summaries for each activation function are also presented, showcasing the architecture and performance metrics.

Uploaded by

speedcomcyber25

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

9 views199 pages

Experiment 1: ANN Using Keras Library and Various Activation Experiment 1: ANN Using Keras Library and Various Activation Functions Functions

Uploaded by

speedcomcyber25

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 199

Experiment 1: ANN using Keras Library and Various Activation

Functions
In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler

# Check TensorFlow version

print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")

TensorFlow version: 2.17.0

Keras version: 3.9.0.dev2025031403

In [2]:
# Generate a classification dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
n_redundant=5, random_state=42)

# Scale the features

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_s
tate=42)

print(f"Training data shape: {X_train.shape}")

print(f"Testing data shape: {X_test.shape}")

Training data shape: (800, 20)

Testing data shape: (200, 20)

In [3]:
def create_model(activation='relu'):
model = Sequential([
Dense(64, activation=activation, input_shape=(X_train.shape[1],)),
Dense(32, activation=activation),
Dense(16, activation=activation),
Dense(1, activation='sigmoid') # Output layer with sigmoid for binary classific
ation
])

model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
return model

# Create models with different activation functions

activation_functions = ['relu', 'sigmoid', 'tanh', 'selu', 'elu']
models = {}
histories = {}
for activation in activation_functions:
print(f"Creating model with {activation} activation function...")
models[activation] = create_model(activation)
models[activation].summary()

Creating model with relu activation function...

c:\Users\Rishi\AppData\Local\Programs\Python\Python312\Lib\site-packages\keras\src\layers
\core\dense.py:87: UserWarning: Do not pass an ìnput_shape`/ìnput_dim` argument to a la
yer. When using Sequential models, prefer using an Ìnput(shape)` object as the first lay
er in the model instead.
super().__init__(activity_regularizer=activity_regularizer, **kwargs)

Model: "sequential"

Total params: 3,969 (15.50 KB)

Trainable params: 3,969 (15.50 KB)

Non-trainable params: 0 (0.00 B)

Creating model with sigmoid activation function...

Model: "sequential_1"

Total params: 3,969 (15.50 KB)

Trainable params: 3,969 (15.50 KB)

Non-trainable params: 0 (0.00 B)

Creating model with tanh activation function...

Model: "sequential_2"

Total params: 3,969 (15.50 KB)

Trainable params: 3,969 (15.50 KB)

Non-trainable params: 0 (0.00 B)

Creating model with selu activation function...

Model: "sequential_3"

Total params: 3,969 (15.50 KB)

Trainable params: 3,969 (15.50 KB)

Non-trainable params: 0 (0.00 B)

Creating model with elu activation function...

Model: "sequential_4"

Total params: 3,969 (15.50 KB)

Trainable params: 3,969 (15.50 KB)

Non-trainable params: 0 (0.00 B)

In [4]:
# Train models and store training history
for activation in activation_functions:
print(f"\n Training model with {activation} activation function...")
histories[activation] = models[activation].fit(
X_train, y_train,
epochs=20,
batch_size=32,
validation_split=0.2,
verbose=1
)

Training model with relu activation function...

Epoch 1/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.5939 - loss: 0.6721 - val_accuracy:
0.7125 - val_loss: 0.6041
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.5939 - loss: 0.6721 - val_accuracy:
0.7125 - val_loss: 0.6041
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7909 - loss: 0.5740 - val_accuracy:
0.8562 - val_loss: 0.5159
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7909 - loss: 0.5740 - val_accuracy:
0.8562 - val_loss: 0.5159
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8761 - loss: 0.4830 - val_accuracy:
0.8562 - val_loss: 0.4170
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8761 - loss: 0.4830 - val_accuracy:
0.8562 - val_loss: 0.4170
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9192 - loss: 0.3404 - val_accuracy:
0.8687 - val_loss: 0.3447
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9192 - loss: 0.3404 - val_accuracy:
0.8687 - val_loss: 0.3447
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9311 - loss: 0.2430 - val_accuracy:
0.8938 - val_loss: 0.2919
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9311 - loss: 0.2430 - val_accuracy:
0.8938 - val_loss: 0.2919
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9422 - loss: 0.1977 - val_accuracy:
0.8875 - val_loss: 0.2678
Epoch 7/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9422 - loss: 0.1977 - val_accuracy:
0.8875 - val_loss: 0.2678
Epoch 7/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9333 - loss: 0.1924 - val_accuracy:
0.8875 - val_loss: 0.2564
Epoch 8/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9333 - loss: 0.1924 - val_accuracy:
0.8875 - val_loss: 0.2564
Epoch 8/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.9506 - loss: 0.1568 - val_accuracy:
0.9062 - val_loss: 0.2276
Epoch 9/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.9506 - loss: 0.1568 - val_accuracy:
0.9062 - val_loss: 0.2276
Epoch 9/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9589 - loss: 0.1390 - val_accuracy:
0.9125 - val_loss: 0.2213
Epoch 10/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9589 - loss: 0.1390 - val_accuracy:
0.9125 - val_loss: 0.2213
Epoch 10/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9677 - loss: 0.1227 - val_accuracy:
0.9062 - val_loss: 0.2169
Epoch 11/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9677 - loss: 0.1227 - val_accuracy:
0.9062 - val_loss: 0.2169
Epoch 11/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9712 - loss: 0.1086 - val_accuracy:
0.9187 - val_loss: 0.2160
Epoch 12/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9712 - loss: 0.1086 - val_accuracy:
0.9187 - val_loss: 0.2160
Epoch 12/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9774 - loss: 0.0967 - val_accuracy:
0.9375 - val_loss: 0.2098
Epoch 13/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9774 - loss: 0.0967 - val_accuracy:
0.9375 - val_loss: 0.2098
Epoch 13/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9791 - loss: 0.0769 - val_accuracy:
0.9312 - val_loss: 0.2076
Epoch 14/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9791 - loss: 0.0769 - val_accuracy:
0.9312 - val_loss: 0.2076
Epoch 14/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9892 - loss: 0.0713 - val_accuracy:
0.9187 - val_loss: 0.2151
Epoch 15/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9892 - loss: 0.0713 - val_accuracy:
0.9187 - val_loss: 0.2151
Epoch 15/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9857 - loss: 0.0713 - val_accuracy:
0.9438 - val_loss: 0.2044
Epoch 16/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9857 - loss: 0.0713 - val_accuracy:
0.9438 - val_loss: 0.2044
Epoch 16/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9845 - loss: 0.0666 - val_accuracy:
0.9312 - val_loss: 0.2250
Epoch 17/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9845 - loss: 0.0666 - val_accuracy:
0.9312 - val_loss: 0.2250
Epoch 17/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9868 - loss: 0.0503 - val_accuracy:
0.9250 - val_loss: 0.2149
Epoch 18/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9868 - loss: 0.0503 - val_accuracy:
0.9250 - val_loss: 0.2149
Epoch 18/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9957 - loss: 0.0396 - val_accuracy:
0.9375 - val_loss: 0.2072
Epoch 19/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9957 - loss: 0.0396 - val_accuracy:
0.9375 - val_loss: 0.2072
Epoch 19/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9964 - loss: 0.0341 - val_accuracy:
0.9375 - val_loss: 0.2152
Epoch 20/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9964 - loss: 0.0341 - val_accuracy:
0.9375 - val_loss: 0.2152
Epoch 20/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9984 - loss: 0.0298 - val_accuracy:
0.9312 - val_loss: 0.2151

20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9984 - loss: 0.0298 - val_accuracy:

0.9312 - val_loss: 0.2151

Training model with sigmoid activation function...

Epoch 1/20

Epoch 1/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.4988 - loss: 0.7235 - val_accuracy:
0.5250 - val_loss: 0.6876
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.4988 - loss: 0.7235 - val_accuracy:
0.5250 - val_loss: 0.6876
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5574 - loss: 0.6877 - val_accuracy:
0.4812 - val_loss: 0.6862
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5574 - loss: 0.6877 - val_accuracy:
0.4812 - val_loss: 0.6862
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5633 - loss: 0.6819 - val_accuracy:
0.7250 - val_loss: 0.6747
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5633 - loss: 0.6819 - val_accuracy:
0.7250 - val_loss: 0.6747
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7033 - loss: 0.6693 - val_accuracy:
0.7250 - val_loss: 0.6646
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7033 - loss: 0.6693 - val_accuracy:
0.7250 - val_loss: 0.6646
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7318 - loss: 0.6558 - val_accuracy:
0.7375 - val_loss: 0.6499
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7318 - loss: 0.6558 - val_accuracy:
0.7375 - val_loss: 0.6499
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7223 - loss: 0.6450 - val_accuracy:
0.7500 - val_loss: 0.6299
Epoch 7/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7223 - loss: 0.6450 - val_accuracy:
0.7500 - val_loss: 0.6299
Epoch 7/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7689 - loss: 0.6148 - val_accuracy:
0.7750 - val_loss: 0.6061
Epoch 8/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7689 - loss: 0.6148 - val_accuracy:
0.7750 - val_loss: 0.6061
Epoch 8/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7623 - loss: 0.5901 - val_accuracy:
0.7750 - val_loss: 0.5771
Epoch 9/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7623 - loss: 0.5901 - val_accuracy:
0.7750 - val_loss: 0.5771
Epoch 9/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7813 - loss: 0.5577 - val_accuracy:
0.7812 - val_loss: 0.5502
Epoch 10/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7813 - loss: 0.5577 - val_accuracy:
0.7812 - val_loss: 0.5502
Epoch 10/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7828 - loss: 0.5270 - val_accuracy:
0.8000 - val_loss: 0.5248
Epoch 11/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7828 - loss: 0.5270 - val_accuracy:
0.8000 - val_loss: 0.5248
Epoch 11/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8023 - loss: 0.4962 - val_accuracy:
0.7875 - val_loss: 0.5050
Epoch 12/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8023 - loss: 0.4962 - val_accuracy:
0.7875 - val_loss: 0.5050
Epoch 12/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.8028 - loss: 0.4631 - val_accuracy:
0.7875 - val_loss: 0.4878
Epoch 13/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.8028 - loss: 0.4631 - val_accuracy:
0.7875 - val_loss: 0.4878
Epoch 13/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8262 - loss: 0.4370 - val_accuracy:
0.7812 - val_loss: 0.4766
Epoch 14/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8262 - loss: 0.4370 - val_accuracy:
0.7812 - val_loss: 0.4766
Epoch 14/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8067 - loss: 0.4419 - val_accuracy:
0.8000 - val_loss: 0.4699
Epoch 15/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8067 - loss: 0.4419 - val_accuracy:
0.8000 - val_loss: 0.4699
Epoch 15/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8243 - loss: 0.4062 - val_accuracy:
0.8000 - val_loss: 0.4655
Epoch 16/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8243 - loss: 0.4062 - val_accuracy:
0.8000 - val_loss: 0.4655
Epoch 16/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8073 - loss: 0.4138 - val_accuracy:
0.8125 - val_loss: 0.4612
Epoch 17/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8073 - loss: 0.4138 - val_accuracy:
0.8125 - val_loss: 0.4612
Epoch 17/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8081 - loss: 0.4125 - val_accuracy:
0.7937 - val_loss: 0.4600
Epoch 18/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8081 - loss: 0.4125 - val_accuracy:
0.7937 - val_loss: 0.4600
Epoch 18/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8205 - loss: 0.4058 - val_accuracy:
0.7937 - val_loss: 0.4568
Epoch 19/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8205 - loss: 0.4058 - val_accuracy:
0.7937 - val_loss: 0.4568
Epoch 19/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8194 - loss: 0.4061 - val_accuracy:
0.8062 - val_loss: 0.4568
Epoch 20/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8194 - loss: 0.4061 - val_accuracy:
0.8062 - val_loss: 0.4568
Epoch 20/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8430 - loss: 0.3769 - val_accuracy:
0.7937 - val_loss: 0.4586

20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8430 - loss: 0.3769 - val_accuracy:

0.7937 - val_loss: 0.4586

Training model with tanh activation function...

Epoch 1/20

Epoch 1/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.6257 - loss: 0.6447 - val_accuracy:
0.7688 - val_loss: 0.5008
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.6257 - loss: 0.6447 - val_accuracy:
0.7688 - val_loss: 0.5008
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8291 - loss: 0.4360 - val_accuracy:
0.8250 - val_loss: 0.4518
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8291 - loss: 0.4360 - val_accuracy:
0.8250 - val_loss: 0.4518
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8380 - loss: 0.3965 - val_accuracy:
0.8000 - val_loss: 0.4410
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8380 - loss: 0.3965 - val_accuracy:
0.8000 - val_loss: 0.4410
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.8310 - loss: 0.3797 - val_accuracy:
0.8250 - val_loss: 0.4248
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.8310 - loss: 0.3797 - val_accuracy:
0.8250 - val_loss: 0.4248
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8583 - loss: 0.3475 - val_accuracy:
0.8188 - val_loss: 0.4254
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8583 - loss: 0.3475 - val_accuracy:
0.8188 - val_loss: 0.4254
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8574 - loss: 0.3325 - val_accuracy:
0.7937 - val_loss: 0.4093
Epoch 7/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8574 - loss: 0.3325 - val_accuracy:
0.7937 - val_loss: 0.4093
Epoch 7/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8482 - loss: 0.3601 - val_accuracy:
0.8188 - val_loss: 0.4125
Epoch 8/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8482 - loss: 0.3601 - val_accuracy:
0.8188 - val_loss: 0.4125
Epoch 8/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8626 - loss: 0.3058 - val_accuracy:
0.8125 - val_loss: 0.3933
Epoch 9/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8626 - loss: 0.3058 - val_accuracy:
0.8125 - val_loss: 0.3933
Epoch 9/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8841 - loss: 0.2868 - val_accuracy:
0.8062 - val_loss: 0.4002
Epoch 10/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8841 - loss: 0.2868 - val_accuracy:
0.8062 - val_loss: 0.4002
Epoch 10/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8979 - loss: 0.2611 - val_accuracy:
0.8062 - val_loss: 0.3721
Epoch 11/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8979 - loss: 0.2611 - val_accuracy:
0.8062 - val_loss: 0.3721
Epoch 11/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9114 - loss: 0.2449 - val_accuracy:
0.8062 - val_loss: 0.3571
Epoch 12/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9114 - loss: 0.2449 - val_accuracy:
0.8062 - val_loss: 0.3571
Epoch 12/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9137 - loss: 0.2131 - val_accuracy:
0.8188 - val_loss: 0.3570
Epoch 13/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9137 - loss: 0.2131 - val_accuracy:
0.8188 - val_loss: 0.3570
Epoch 13/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9264 - loss: 0.2103 - val_accuracy:
0.8313 - val_loss: 0.3371
Epoch 14/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9264 - loss: 0.2103 - val_accuracy:
0.8313 - val_loss: 0.3371
Epoch 14/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9290 - loss: 0.2037 - val_accuracy:
0.8500 - val_loss: 0.3201
Epoch 15/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9290 - loss: 0.2037 - val_accuracy:
0.8500 - val_loss: 0.3201
Epoch 15/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9316 - loss: 0.1930 - val_accuracy:
0.8687 - val_loss: 0.3115
Epoch 16/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9316 - loss: 0.1930 - val_accuracy:
0.8687 - val_loss: 0.3115
Epoch 16/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9407 - loss: 0.1641 - val_accuracy:
0.8750 - val_loss: 0.3133
Epoch 17/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9407 - loss: 0.1641 - val_accuracy:
0.8750 - val_loss: 0.3133
Epoch 17/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9554 - loss: 0.1592 - val_accuracy:
0.8938 - val_loss: 0.2834
Epoch 18/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9554 - loss: 0.1592 - val_accuracy:
0.8938 - val_loss: 0.2834
Epoch 18/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9671 - loss: 0.1399 - val_accuracy:
0.8875 - val_loss: 0.2831
Epoch 19/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9671 - loss: 0.1399 - val_accuracy:
0.8875 - val_loss: 0.2831
Epoch 19/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9660 - loss: 0.1393 - val_accuracy:
0.9000 - val_loss: 0.2684
Epoch 20/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9660 - loss: 0.1393 - val_accuracy:
0.9000 - val_loss: 0.2684
Epoch 20/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9642 - loss: 0.1350 - val_accuracy:
0.9062 - val_loss: 0.2630
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9642 - loss: 0.1350 - val_accuracy:
0.9062 - val_loss: 0.2630

Training model with selu activation function...

Epoch 1/20

Training model with selu activation function...

Epoch 1/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.6646 - loss: 0.6233 - val_accuracy:
0.8000 - val_loss: 0.4576
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.6646 - loss: 0.6233 - val_accuracy:
0.8000 - val_loss: 0.4576
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8302 - loss: 0.3819 - val_accuracy:
0.8188 - val_loss: 0.4184
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8302 - loss: 0.3819 - val_accuracy:
0.8188 - val_loss: 0.4184
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.8517 - loss: 0.3308 - val_accuracy:
0.8313 - val_loss: 0.3937
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.8517 - loss: 0.3308 - val_accuracy:
0.8313 - val_loss: 0.3937
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8550 - loss: 0.3290 - val_accuracy:
0.8375 - val_loss: 0.3780
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8550 - loss: 0.3290 - val_accuracy:
0.8375 - val_loss: 0.3780
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8870 - loss: 0.2750 - val_accuracy:
0.8375 - val_loss: 0.3664
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8870 - loss: 0.2750 - val_accuracy:
0.8375 - val_loss: 0.3664
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8875 - loss: 0.2640 - val_accuracy:
0.8438 - val_loss: 0.3482
Epoch 7/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8875 - loss: 0.2640 - val_accuracy:
0.8438 - val_loss: 0.3482
Epoch 7/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9150 - loss: 0.2430 - val_accuracy:
0.8438 - val_loss: 0.3341
Epoch 8/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9150 - loss: 0.2430 - val_accuracy:
0.8438 - val_loss: 0.3341
Epoch 8/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9247 - loss: 0.2061 - val_accuracy:
0.8313 - val_loss: 0.3362
Epoch 9/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9247 - loss: 0.2061 - val_accuracy:
0.8313 - val_loss: 0.3362
Epoch 9/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9356 - loss: 0.1951 - val_accuracy:
0.8500 - val_loss: 0.3140
Epoch 10/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9356 - loss: 0.1951 - val_accuracy:
0.8500 - val_loss: 0.3140
Epoch 10/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9208 - loss: 0.2145 - val_accuracy:
0.8500 - val_loss: 0.2953
Epoch 11/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9208 - loss: 0.2145 - val_accuracy:
0.8500 - val_loss: 0.2953
Epoch 11/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9363 - loss: 0.1752 - val_accuracy:
0.8687 - val_loss: 0.2941
Epoch 12/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9363 - loss: 0.1752 - val_accuracy:
0.8687 - val_loss: 0.2941
Epoch 12/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9408 - loss: 0.1860 - val_accuracy:
0.8562 - val_loss: 0.2848
Epoch 13/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9408 - loss: 0.1860 - val_accuracy:
0.8562 - val_loss: 0.2848
Epoch 13/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9456 - loss: 0.1474 - val_accuracy:
0.8687 - val_loss: 0.2826
Epoch 14/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9456 - loss: 0.1474 - val_accuracy:
0.8687 - val_loss: 0.2826
Epoch 14/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9574 - loss: 0.1481 - val_accuracy:
0.8750 - val_loss: 0.2724
Epoch 15/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9574 - loss: 0.1481 - val_accuracy:
0.8750 - val_loss: 0.2724
Epoch 15/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9465 - loss: 0.1606 - val_accuracy:
0.8813 - val_loss: 0.2605
Epoch 16/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9465 - loss: 0.1606 - val_accuracy:
0.8813 - val_loss: 0.2605
Epoch 16/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9669 - loss: 0.1187 - val_accuracy:
0.8938 - val_loss: 0.2591
Epoch 17/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9669 - loss: 0.1187 - val_accuracy:
0.8938 - val_loss: 0.2591
Epoch 17/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9601 - loss: 0.1307 - val_accuracy:
0.8875 - val_loss: 0.2488
Epoch 18/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9601 - loss: 0.1307 - val_accuracy:
0.8875 - val_loss: 0.2488
Epoch 18/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9656 - loss: 0.1159 - val_accuracy:
0.9000 - val_loss: 0.2496
Epoch 19/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9656 - loss: 0.1159 - val_accuracy:
0.9000 - val_loss: 0.2496
Epoch 19/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9703 - loss: 0.1036 - val_accuracy:
0.9062 - val_loss: 0.2460
Epoch 20/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9703 - loss: 0.1036 - val_accuracy:
0.9062 - val_loss: 0.2460
Epoch 20/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9653 - loss: 0.1124 - val_accuracy:
0.9062 - val_loss: 0.2359
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9653 - loss: 0.1124 - val_accuracy:
0.9062 - val_loss: 0.2359

Training model with elu activation function...

Epoch 1/20

Training model with elu activation function...

Epoch 1/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.6105 - loss: 0.6421 - val_accuracy:
0.7750 - val_loss: 0.4973
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.6105 - loss: 0.6421 - val_accuracy:
0.7750 - val_loss: 0.4973
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8113 - loss: 0.4574 - val_accuracy:
0.8250 - val_loss: 0.4155
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8113 - loss: 0.4574 - val_accuracy:
0.8250 - val_loss: 0.4155
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8512 - loss: 0.3562 - val_accuracy:
0.8375 - val_loss: 0.3750
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8512 - loss: 0.3562 - val_accuracy:
0.8375 - val_loss: 0.3750
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.8907 - loss: 0.3097 - val_accuracy:
0.8313 - val_loss: 0.3458
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.8907 - loss: 0.3097 - val_accuracy:
0.8313 - val_loss: 0.3458
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9035 - loss: 0.2762 - val_accuracy:
0.8562 - val_loss: 0.3198
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9035 - loss: 0.2762 - val_accuracy:
0.8562 - val_loss: 0.3198
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9199 - loss: 0.2368 - val_accuracy:
0.8625 - val_loss: 0.3021
Epoch 7/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9199 - loss: 0.2368 - val_accuracy:
0.8625 - val_loss: 0.3021
Epoch 7/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9174 - loss: 0.2120 - val_accuracy:
0.8562 - val_loss: 0.2926
Epoch 8/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9174 - loss: 0.2120 - val_accuracy:
0.8562 - val_loss: 0.2926
Epoch 8/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9245 - loss: 0.1953 - val_accuracy:
0.8813 - val_loss: 0.2774
Epoch 9/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9245 - loss: 0.1953 - val_accuracy:
0.8813 - val_loss: 0.2774
Epoch 9/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9457 - loss: 0.1856 - val_accuracy:
0.8687 - val_loss: 0.2708
Epoch 10/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9457 - loss: 0.1856 - val_accuracy:
0.8687 - val_loss: 0.2708
Epoch 10/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9393 - loss: 0.1759 - val_accuracy:
0.9000 - val_loss: 0.2590
Epoch 11/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9393 - loss: 0.1759 - val_accuracy:
0.9000 - val_loss: 0.2590
Epoch 11/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9326 - loss: 0.1605 - val_accuracy:
0.9000 - val_loss: 0.2497
Epoch 12/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9326 - loss: 0.1605 - val_accuracy:
0.9000 - val_loss: 0.2497
Epoch 12/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9531 - loss: 0.1384 - val_accuracy:
0.9000 - val_loss: 0.2437
Epoch 13/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9531 - loss: 0.1384 - val_accuracy:
0.9000 - val_loss: 0.2437
Epoch 13/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9350 - loss: 0.1534 - val_accuracy:
0.9062 - val_loss: 0.2316
Epoch 14/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9350 - loss: 0.1534 - val_accuracy:
0.9062 - val_loss: 0.2316
Epoch 14/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9725 - loss: 0.1039 - val_accuracy:
0.9062 - val_loss: 0.2294
Epoch 15/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9725 - loss: 0.1039 - val_accuracy:
0.9062 - val_loss: 0.2294
Epoch 15/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9550 - loss: 0.1294 - val_accuracy:
0.9000 - val_loss: 0.2424
Epoch 16/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9550 - loss: 0.1294 - val_accuracy:
0.9000 - val_loss: 0.2424
Epoch 16/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9643 - loss: 0.1046 - val_accuracy:
0.9125 - val_loss: 0.2275
Epoch 17/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9643 - loss: 0.1046 - val_accuracy:
0.9125 - val_loss: 0.2275
Epoch 17/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9546 - loss: 0.1106 - val_accuracy:
0.9125 - val_loss: 0.2242
Epoch 18/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9546 - loss: 0.1106 - val_accuracy:
0.9125 - val_loss: 0.2242
Epoch 18/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9680 - loss: 0.1003 - val_accuracy:
0.9125 - val_loss: 0.2230
Epoch 19/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9680 - loss: 0.1003 - val_accuracy:
0.9125 - val_loss: 0.2230
Epoch 19/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9666 - loss: 0.0900 - val_accuracy:
0.9125 - val_loss: 0.2212
Epoch 20/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9666 - loss: 0.0900 - val_accuracy:
0.9125 - val_loss: 0.2212
Epoch 20/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9685 - loss: 0.0845 - val_accuracy:
0.9250 - val_loss: 0.2185
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9685 - loss: 0.0845 - val_accuracy:
0.9250 - val_loss: 0.2185

In [5]:
# Evaluate models on test data
results = {}
for activation in activation_functions:
print(f"\n Evaluating model with {activation} activation function:")
loss, accuracy = models[activation].evaluate(X_test, y_test, verbose=0)
results[activation] = {
'loss': loss,
'accuracy': accuracy
}
print(f"Test Loss: {loss:.4f}")
print(f"Test Accuracy: {accuracy:.4f}")

# Create a DataFrame to compare results

results_df = pd.DataFrame({
'Activation Function': list(results.keys()),
'Test Loss': [results[act]['loss'] for act in results],
'Test Accuracy': [results[act]['accuracy'] for act in results]
})

print("\n Comparison of Different Activation Functions:")

results_df.sort_values('Test Accuracy', ascending=False)

Evaluating model with relu activation function:

Test Loss: 0.1698
Test Accuracy: 0.9350

Evaluating model with sigmoid activation function:

Test Loss: 0.1698
Test Accuracy: 0.9350

Evaluating model with sigmoid activation function:

Test Loss: 0.3892
Test Accuracy: 0.8250

Evaluating model with tanh activation function:

Test Loss: 0.3892
Test Accuracy: 0.8250

Evaluating model with tanh activation function:

Test Loss: 0.1933
Test Accuracy: 0.9150

Evaluating model with selu activation function:

Test Loss: 0.1933
Test Accuracy: 0.9150

Evaluating model with selu activation function:

Test Loss: 0.1715
Test Accuracy: 0.9300

Evaluating model with elu activation function:

Test Loss: 0.1715
Test Accuracy: 0.9300

Evaluating model with elu activation function:

Test Loss: 0.1732
Test Accuracy: 0.9200

Comparison of Different Activation Functions:

Test Loss: 0.1732
Test Accuracy: 0.9200

Comparison of Different Activation Functions:

Out[5]:

Activation Function Test Loss Test Accuracy

0 relu 0.169778 0.935

3 selu 0.171519 0.930

4 elu 0.173155 0.920

2 tanh 0.193340 0.915

1 sigmoid 0.389227 0.825

In [6]:
# Plot training & validation accuracy for each activation function
plt.figure(figsize=(16, 10))

# Plot training accuracy

plt.subplot(2, 2, 1)
for activation in activation_functions:
plt.plot(histories[activation].history['accuracy'], label=f'{activation} ')
plt.title('Training Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

# Plot validation accuracy

plt.subplot(2, 2, 2)
for activation in activation_functions:
plt.plot(histories[activation].history['val_accuracy'], label=f'{activation} ')
plt.title('Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

# Plot training loss

plt.subplot(2, 2, 3)
for activation in activation_functions:
plt.plot(histories[activation].history['loss'], label=f'{activation} ')
plt.title('Training Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

# Plot validation loss

plt.subplot(2, 2, 4)
for activation in activation_functions:
plt.plot(histories[activation].history['val_loss'], label=f'{activation} ')
plt.title('Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

plt.tight_layout()
plt.show()

In [7]:
# Define activation functions for visualization
def relu(x):
return np.maximum(0, x)

def sigmoid(x):
return 1 / (1 + np.exp(-x))

def tanh(x):
return np.tanh(x)

def leaky_relu(x, alpha=0.3):

return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):

return np.where(x > 0, x, alpha * (np.exp(x) - 1))

# Create a range of input values

x = np.linspace(-10, 10, 1000)

# Create a plot
plt.figure(figsize=(12, 8))
plt.plot(x, relu(x), label='ReLU')
plt.plot(x, sigmoid(x), label='Sigmoid')
plt.plot(x, tanh(x), label='Tanh')
plt.plot(x, leaky_relu(x), label='Leaky ReLU (alpha=0.3)')
plt.plot(x, elu(x), label='ELU (alpha=1.0)')
plt.grid(True)
plt.legend()
plt.title('Activation Functions')
plt.xlabel('Input (x)')
plt.ylabel('Output f(x)')
plt.axhline(y=0, color='k', linestyle='-', alpha=0.3)
plt.axvline(x=0, color='k', linestyle='-', alpha=0.3)
plt.show()
Experiment 2: ANN on Diabetes Dataset
In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix, classification_report, roc_curve, roc_auc_s
core

# Set random seeds for reproducibility

np.random.seed(42)
tf.random.set_seed(42)

# Check versions
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")

TensorFlow version: 2.17.0

Keras version: 3.9.0.dev2025031403

In [2]:

# Load the dataset

# Trying to download the dataset directly; if not available, we'll use a synthetic versio
n
try:
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabe
tes.data.csv"
column_names = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
"Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome"]
df = pd.read_csv(url, names=column_names)
except:
print("Unable to download dataset, creating synthetic version...")
# Create synthetic data similar to Pima Indians Diabetes dataset
np.random.seed(42)
n_samples = 768

# Generate synthetic data with similar distributions to the real dataset

pregnancies = np.random.randint(0, 18, n_samples)
glucose = np.random.normal(121, 32, n_samples).astype(int)
glucose = np.clip(glucose, 0, 200)
blood_pressure = np.random.normal(69, 19, n_samples).astype(int)
blood_pressure = np.clip(blood_pressure, 0, 122)
skin_thickness = np.random.normal(23, 16, n_samples).astype(int)
skin_thickness = np.clip(skin_thickness, 0, 100)
insulin = np.random.normal(80, 115, n_samples).astype(int)
insulin = np.clip(insulin, 0, 846)
bmi = np.random.normal(32, 8, n_samples)
bmi = np.clip(bmi, 0, 67)
diabetes_pedigree = np.random.normal(0.47, 0.33, n_samples)
diabetes_pedigree = np.clip(diabetes_pedigree, 0.08, 2.42)
age = np.random.normal(33, 12, n_samples).astype(int)
age = np.clip(age, 21, 81)

# Create synthetic target with ~35% positive cases (similar to original dataset)
# Using a formula that considers multiple risk factors
risk_score = (glucose > 140) * 3 + (bmi > 30) * 2 + (age > 40) * 1.5 + (pregnancies
> 6) * 1
outcome = (risk_score + np.random.normal(0, 1, n_samples) > 3).astype(int)

# Create dataframe
data = {
"Pregnancies": pregnancies,
"Glucose": glucose,
"BloodPressure": blood_pressure,
"SkinThickness": skin_thickness,
"Insulin": insulin,
"BMI": bmi,
"DiabetesPedigreeFunction": diabetes_pedigree,
"Age": age,
"Outcome": outcome
}
df = pd.DataFrame(data)
print("Created synthetic data similar to Pima Indians Diabetes dataset.")

# Display basic information about the dataset

print(f"Dataset shape: {df.shape}")
print("\n First few rows of the dataset:")
df.head()

Dataset shape: (768, 9)

First few rows of the dataset:

Out[2]:

Pregnancies Glucose BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Outcome

0 6 148 72 35 0 33.6 0.627 50 1

1 1 85 66 29 0 26.6 0.351 31 0

2 8 183 64 0 0 23.3 0.672 32 1

3 1 89 66 23 94 28.1 0.167 21 0

4 0 137 40 35 168 43.1 2.288 33 1

In [3]:
# Basic statistics of the dataset
print("Basic statistics of the dataset:")
df.describe()

Basic statistics of the dataset:

Out[3]:

Pregnancies Glucose BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Ou

count 768.000000 768.000000 768.000000 768.000000 768.000000 768.000000 768.000000 768.000000 768.

mean 3.845052 120.894531 69.105469 20.536458 79.799479 31.992578 0.471876 33.240885 0.3

std 3.369578 31.972618 19.355807 15.952218 115.244002 7.884160 0.331329 11.760232 0.4

min 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.078000 21.000000 0.0

25% 1.000000 99.000000 62.000000 0.000000 0.000000 27.300000 0.243750 24.000000 0.0

50% 3.000000 117.000000 72.000000 23.000000 30.500000 32.000000 0.372500 29.000000 0.0

75% 6.000000 140.250000 80.000000 32.000000 127.250000 36.600000 0.626250 41.000000 1.0

max 17.000000 199.000000 122.000000 99.000000 846.000000 67.100000 2.420000 81.000000 1.0

In [4]:
# Check for missing values
print("\n Missing values in each column:")
df.isnull().sum()

Missing values in each column:

Out[4]:
Pregnancies 0
Glucose 0
BloodPressure 0
SkinThickness 0
Insulin 0
BMI 0
DiabetesPedigreeFunction 0
Age 0
Outcome 0
dtype: int64

In [5]:

# Check class distribution (0: No Diabetes, 1: Diabetes)

print("\n Class distribution:")
class_distribution = df['Outcome'].value_counts(normalize=True) * 100
print(class_distribution)

# Visualize class distribution

plt.figure(figsize=(8, 6))
sns.countplot(x='Outcome', data=df, palette='viridis')
plt.title('Class Distribution: Diabetes vs No Diabetes')
plt.xlabel('Outcome (0: No Diabetes, 1: Diabetes)')
plt.ylabel('Count')

# Add percentages to the bars

for i, percentage in enumerate(class_distribution):
count = df['Outcome'].value_counts()[i]
plt.text(i, count + 5, f'{percentage:.1f}%', ha='center')

plt.show()

Class distribution:
Outcome
0 65.104167
1 34.895833
Name: proportion, dtype: float64

C:\Users\Rishi\AppData\Local\Temp\ipykernel_22384\2991995795.py:8: FutureWarning:

Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. A
ssign the `x` variable to `hue` and set `legend=False` for the same effect.

sns.countplot(x='Outcome', data=df, palette='viridis')

3. Data Preprocessing
In [6]:
# In the real dataset, zero values for certain features don't make physiological sense
# For our analysis, let's treat zeros as missing values for the following features
features_with_zeros = ['Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI']

# Check how many rows have zeros for these features

for feature in features_with_zeros:
zero_count = (df[feature] == 0).sum()
zero_percentage = (zero_count / len(df)) * 100
print(f"{feature}: {zero_count} zeros ({zero_percentage:.2f}%)")

Glucose: 5 zeros (0.65%)

BloodPressure: 35 zeros (4.56%)
SkinThickness: 227 zeros (29.56%)
Insulin: 374 zeros (48.70%)
BMI: 11 zeros (1.43%)

In [7]:
# Replace zeros with NaN for these features
for feature in features_with_zeros:
df[feature] = df[feature].replace(0, np.nan)

# Display missing value count after replacement

print("\n Missing values in each column after zero replacement:")
df.isnull().sum()

Missing values in each column after zero replacement:

Out[7]:
Pregnancies 0
Glucose 5
BloodPressure 35
SkinThickness 227
Insulin 374
BMI 11
DiabetesPedigreeFunction 0
Age 0
Outcome 0
dtype: int64

In [ ]:
# Fill missing values with the median of each feature
for feature in features_with_zeros:
median_value = df[feature].median()
df[feature].fillna(median_value, inplace=True)

# Confirm no missing values remain

print("\n Missing values after imputation:")
df.isnull().sum()

In [9]:
# Visualize features by outcome
plt.figure(figsize=(20, 15))
for i, feature in enumerate(df.columns[:-1]):
plt.subplot(3, 3, i+1)
sns.boxplot(x='Outcome', y=feature, data=df)
plt.title(f'{feature} by Diabetes Outcome')
plt.xlabel('Diabetes (0: No, 1: Yes)')
plt.ylabel(feature)
plt.tight_layout()
plt.show()

In [10]:
# Visualize correlation between features
plt.figure(figsize=(12, 10))
correlation_matrix = df.corr()
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f')
plt.title('Correlation Matrix')
plt.show()
In [11]:
# Separate features and target
X = df.drop('Outcome', axis=1)
y = df['Outcome']

# Split data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42
, stratify=y)

# Scale the features

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(f"Training data shape: {X_train_scaled.shape} ")

print(f"Testing data shape: {X_test_scaled.shape}")

Training data shape: (614, 8)

Testing data shape: (154, 8)

In [12]:
# Create a sequential model
model = Sequential([
# Input layer
Dense(32, activation='relu', input_shape=(X_train_scaled.shape[1],)),
Dropout(0.2), # Add dropout to prevent overfitting

# Hidden layer 1
Dense(16, activation='relu'),
Dropout(0.2),

# Hidden layer 2
Dense(8, activation='relu'),

# Output layer - binary classification

Dense(1, activation='sigmoid')
])

# Compile the model

model.compile(
optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy']
)
# Display model summary
model.summary()

Model: "sequential"

Total params: 961 (3.75 KB)

Trainable params: 961 (3.75 KB)

Non-trainable params: 0 (0.00 B)

In [ ]:
# Define early stopping callback to prevent overfitting
early_stopping = tf.keras.callbacks.EarlyStopping(
monitor='val_loss',
patience=10,
mode='min',
restore_best_weights=True
)

# Train the model

history = model.fit(
X_train_scaled,
y_train,
epochs=100,
batch_size=16,
validation_split=0.2,
callbacks=[early_stopping],
verbose=1
)

In [14]:
# Evaluate on test set
loss, accuracy = model.evaluate(X_test_scaled, y_test)
print(f"\n Test Loss: {loss:.4f}")
print(f"Test Accuracy: {accuracy:.4f}")

5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.7237 - loss: 0.5447

Test Loss: 0.5189

Test Accuracy: 0.7403
5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.7237 - loss: 0.5447

Test Loss: 0.5189

Test Accuracy: 0.7403
In [15]:
# Plot training history
plt.figure(figsize=(12, 5))

# Plot training & validation accuracy

plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

# Plot training & validation loss

plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

plt.tight_layout()
plt.show()

In [16]:
# Make predictions on the test set
y_pred_prob = model.predict(X_test_scaled)
y_pred = (y_pred_prob > 0.5).astype(int).flatten()

# Create a confusion matrix

cm = confusion_matrix(y_test, y_pred)

# Plot confusion matrix

plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.title('Confusion Matrix')
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.show()

# Calculate and display classification metrics

print("\n Classification Report:")
print(classification_report(y_test, y_pred))

5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step

5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step
Classification Report:
precision recall f1-score support

0 0.79 0.82 0.80 100

1 0.64 0.59 0.62 54

accuracy 0.74 154

macro avg 0.71 0.71 0.71 154
weighted avg 0.74 0.74 0.74 154

In [17]:
# Calculate ROC curve and AUC
fpr, tpr, thresholds = roc_curve(y_test, y_pred_prob)
auc = roc_auc_score(y_test, y_pred_prob)

# Plot ROC curve

plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, label=f'AUC = {auc:.3f}')
plt.plot([0, 1], [0, 1], 'k--')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend(loc='lower right')
plt.grid(True, alpha=0.3)
plt.show()
In [18]:

# Get the weights of the first layer

weights = model.layers[0].get_weights()[0]

# Calculate absolute mean weights for each feature

feature_importance = np.mean(np.abs(weights), axis=1)

# Create a DataFrame for visualization

importance_df = pd.DataFrame({
'Feature': X.columns,
'Importance': feature_importance
})

# Sort by importance
importance_df = importance_df.sort_values('Importance', ascending=False)

# Plot feature importance

plt.figure(figsize=(10, 6))
sns.barplot(x='Importance', y='Feature', data=importance_df, palette='viridis')
plt.title('Feature Importance (Based on First Layer Weights)')
plt.xlabel('Mean Absolute Weight')
plt.tight_layout()
plt.show()

C:\Users\Rishi\AppData\Local\Temp\ipykernel_22384\4215535867.py:18: FutureWarning:

Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. A
ssign the `y` variable to `hue` and set `legend=False` for the same effect.

sns.barplot(x='Importance', y='Feature', data=importance_df, palette='viridis')

In [ ]:
# Create sample data for prediction (you can modify these values)
sample_data = {
'Pregnancies': [1, 8, 2],
'Glucose': [85, 183, 137],
'BloodPressure': [66, 64, 70],
'SkinThickness': [29, 0, 38],
'Insulin': [0, 0, 240],
'BMI': [26.6, 23.3, 30.8],
'DiabetesPedigreeFunction': [0.351, 0.672, 0.429],
'Age': [31, 32, 45]
}

# Convert to DataFrame
sample_df = pd.DataFrame(sample_data)

# Replace zeros as we did with the training data

for feature in features_with_zeros:
sample_df[feature] = sample_df[feature].replace(0, np.nan)
sample_df[feature].fillna(df[feature].median(), inplace=True)

# Scale the data using the same scaler

sample_scaled = scaler.transform(sample_df)

# Make predictions
sample_predictions = model.predict(sample_scaled)

# Convert predictions to binary class

sample_predictions_binary = (sample_predictions > 0.5).astype(int)

# Create a DataFrame for the results

results_df = sample_df.copy()
results_df['Diabetes Probability'] = sample_predictions
results_df['Predicted Diabetes'] = sample_predictions_binary

# Display results
print("Predictions on Sample Data:")
results_df[['Diabetes Probability', 'Predicted Diabetes']]
Experiment 3: MNIST - Deep Neural Network with Keras
In [1]:

# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-pytho
n
# For example, here's several helpful packages to load in

import numpy as np # linear algebra

import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

import matplotlib.pyplot as plt # plotting library

%matplotlib inline

from keras.models import Sequential

from keras.layers import Dense , Activation, Dropout
from keras.optimizers import Adam ,RMSprop
from keras import backend as K

# Input data files are available in the "../input/" directory.

# For example, running this (by clicking run or pressing Shift+Enter) will list the files
in the input directory

from subprocess import check_output

print(check_output(["ls", "../input"]).decode("utf8"))

# Any results you write to the current directory are saved as output.
Using TensorFlow backend.

digit-recognizer

In [2]:
# import dataset
from keras.datasets import mnist

# load dataset
(x_train, y_train),(x_test, y_test) = mnist.load_data()

# count the number of unique train labels

unique, counts = np.unique(y_train, return_counts=True)
print("Train labels: ", dict(zip(unique, counts)))

# count the number of unique test labels

unique, counts = np.unique(y_test, return_counts=True)
print("\n Test labels: ", dict(zip(unique, counts)))

Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz

11493376/11490434 [==============================] - 2s 0us/step
Train labels: {0: 5923, 1: 6742, 2: 5958, 3: 6131, 4: 5842, 5: 5421, 6: 5918, 7: 6265, 8
: 5851, 9: 5949}

Test labels: {0: 980, 1: 1135, 2: 1032, 3: 1010, 4: 982, 5: 892, 6: 958, 7: 1028, 8: 974
, 9: 1009}
, 9: 1009}

In [3]:
# sample 25 mnist digits from train dataset
indexes = np.random.randint(0, x_train.shape[0], size=25)
images = x_train[indexes]
labels = y_train[indexes]

# plot the 25 mnist digits

plt.figure(figsize=(5,5))
for i in range(len(indexes)):
plt.subplot(5, 5, i + 1)
image = images[i]
plt.imshow(image, cmap='gray')
plt.axis('off')

plt.show()
plt.savefig("mnist-samples.png")
plt.close('all')

In [4]:
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.utils import to_categorical, plot_model

In [5]:
# compute the number of labels
num_labels = len(np.unique(y_train))

In [6]:

# convert to one-hot vector

y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

In [7]:
# image dimensions (assumed square)
image_size = x_train.shape[1]
input_size = image_size * image_size
input_size
Out[7]:
784

In [8]:
# resize and normalize
x_train = np.reshape(x_train, [-1, input_size])
x_train = x_train.astype('float32') / 255
x_test = np.reshape(x_test, [-1, input_size])
x_test = x_test.astype('float32') / 255

In [9]:
# network parameters
batch_size = 128
hidden_units = 256
dropout = 0.45

In [10]:
# model is a 3-layer MLP with ReLU and dropout after each layer
model = Sequential()
model.add(Dense(hidden_units, input_dim=input_size))
model.add(Activation('relu'))
model.add(Dropout(dropout))
model.add(Dense(hidden_units))
model.add(Activation('relu'))
model.add(Dropout(dropout))
model.add(Dense(num_labels))
model.add(Activation('softmax'))

In [11]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 256) 200960
_________________________________________________________________
activation_1 (Activation) (None, 256) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 256) 0
_________________________________________________________________
dense_2 (Dense) (None, 256) 65792
_________________________________________________________________
activation_2 (Activation) (None, 256) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 256) 0
_________________________________________________________________
dense_3 (Dense) (None, 10) 2570
_________________________________________________________________
activation_3 (Activation) (None, 10) 0
=================================================================
Total params: 269,322
Trainable params: 269,322
Non-trainable params: 0
_________________________________________________________________

In [12]:
plot_model(model, to_file='mlp-mnist.png', show_shapes=True)
Out[12]:
In [13]:
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])

In [ ]:
model.fit(x_train, y_train, epochs=20, batch_size=batch_size)

In [15]:
loss, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print("\n Test accuracy: %.1f%%" % (100.0 * acc))

10000/10000 [==============================] - 0s 22us/step

Test accuracy: 98.2%

In [16]:
from keras.regularizers import l2
model.add(Dense(hidden_units,
kernel_regularizer=l2(0.001),
input_dim=input_size))
Experiment 4: Handwritten Digit Recognition using CNN
(Convolutional Neural Networks)
In [1]:

# for numerical analysis

import numpy as np
# to store and process in a dataframe
import pandas as pd

# for ploting graphs

import matplotlib.pyplot as plt
# advancec ploting
import seaborn as sns

# image processing
import matplotlib.image as mpimg

# train test split

from sklearn.model_selection import train_test_split
# model performance metrics
from sklearn.metrics import confusion_matrix, classification_report

# utility functions
from tensorflow.keras.utils import to_categorical
# sequential model
from tensorflow.keras.models import Sequential
# layers
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropout

# from keras.optimizers import RMSprop

# from keras.preprocessing.image import ImageDataGenerator
# from keras.callbacks import ReduceLROnPlateau

In [2]:

# list of files
! ls ../input/digit-recognizer

sample_submission.csv test.csv train.csv

In [3]:
# import train and test dataset
train = pd.read_csv("../input/digit-recognizer/train.csv")
test = pd.read_csv("../input/digit-recognizer/test.csv")

In [4]:
# training dataset
train.head()
Out[4]:

label pixel0 pixel1 pixel2 pixel3 pixel4 pixel5 pixel6 pixel7 pixel8 ... pixel774 pixel775 pixel776 pixel777 pixel778 pixel7

0 1 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

1 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

2 1 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

3 4 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

4 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

5 rows × 785 columns

In [5]:
# test dataset
test.head()
Out[5]:

pixel0 pixel1 pixel2 pixel3 pixel4 pixel5 pixel6 pixel7 pixel8 pixel9 ... pixel774 pixel775 pixel776 pixel777 pixel778 pixe

0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

1 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

2 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

3 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

4 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

5 rows × 784 columns

In [6]:
# looking for missing values
print(train.isna().sum().sum())
print(test.isna().sum().sum())

0
0

In [7]:

plt.figure(figsize=(8, 5))
sns.countplot(train['label'], palette='Dark2')
plt.title('Train labels count')
plt.show()

In [8]:
train['label'].value_counts().sort_index()

Out[8]:
0 4132
1 4684
2 4177
3 4351
4 4072
5 3795
6 4137
7 4401
8 4063
9 4188
Name: label, dtype: int64

In [9]:
# first few train images with labels
fig, ax = plt.subplots(figsize=(18, 8))
for ind, row in train.iloc[:8, :].iterrows():
plt.subplot(2, 4, ind+1)
plt.title(row[0])
img = row.to_numpy()[1:].reshape(28, 28)
fig.suptitle('Train images', fontsize=24)
plt.axis('off')
plt.imshow(img, cmap='magma')

In [10]:
# first few test images
fig, ax = plt.subplots(figsize=(18, 8))
for ind, row in test.iloc[:8, :].iterrows():
plt.subplot(2, 4, ind+1)
img = row.to_numpy()[:].reshape(28, 28)
fig.suptitle('Test images', fontsize=24)
plt.axis('off')
plt.imshow(img, cmap='magma')
In [11]:
# split into image and labels and convert to numpy array
X = train.iloc[:, 1:].to_numpy()
y = train['label'].to_numpy()

# test dataset
test = test.loc[:, :].to_numpy()

for i in [X, y, test]:

print(i.shape)

(42000, 784)
(42000,)
(28000, 784)

In [12]:
# normalize the data
# ==================

X = X / 255.0
test = test / 255.0

In [13]:
# reshape dataset
# ===============

# shape of training and test dataset

print(X.shape)
print(test.shape)

# reshape the dataframe to 3x3 matrix with 1 channel grey scale values
X = X.reshape(-1,28,28,1)
test = test.reshape(-1,28,28,1)

# shape of training and test dataset

print(X.shape)
print(test.shape)

(42000, 784)
(28000, 784)
(42000, 28, 28, 1)
(28000, 28, 28, 1)

In [14]:
# one hot encode target
# =====================

# shape and values of target

print(y.shape)
print(y[0])

# convert Y_train to categorical by one-hot-encoding

y_enc = to_categorical(y, num_classes = 10)

# shape and values of target

print(y_enc.shape)
print(y_enc[0])

(42000,)
1
(42000, 10)
[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]

In [15]:
# train test split
# ================

# random seed
random_seed = 2

# train validation split

X_train, X_val, y_train_enc, y_val_enc = train_test_split(X, y_enc, test_size=0.3)

# shape
for i in [X_train, y_train_enc, X_val, y_val_enc]:
print(i.shape)

(29400, 28, 28, 1)

(29400, 10)
(12600, 28, 28, 1)
(12600, 10)

In [16]:
g = plt.imshow(X_train[0][:,:,0])
print(y_train_enc[0])

[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]

In [17]:
g = plt.imshow(X_train[9][:,:,0])
print(y_train_enc[9])

[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]

In [18]:
INPUT_SHAPE = (28,28,1)
OUTPUT_SHAPE = 10
BATCH_SIZE = 128
EPOCHS = 10
VERBOSE = 2

In [19]:
model = Sequential()

model.add(Conv2D(32, kernel_size=(3,3), activation='relu', input_shape=INPUT_SHAPE))

model.add(MaxPool2D((2,2)))

model.add(Conv2D(64, kernel_size=(3,3), activation='relu'))

model.add(MaxPool2D((2,2)))

model.add(Flatten())

model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))

model.add(Dense(64, activation='relu'))
model.add(Dropout(0.2))

model.add(Dense(10, activation='softmax'))

In [20]:
model.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])

In [21]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 26, 26, 32) 320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 11, 11, 64) 18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64) 0
_________________________________________________________________
flatten (Flatten) (None, 1600) 0
_________________________________________________________________
dense (Dense) (None, 128) 204928
_________________________________________________________________
dropout (Dropout) (None, 128) 0
_________________________________________________________________
dense_1 (Dense) (None, 64) 8256
_________________________________________________________________
dropout_1 (Dropout) (None, 64) 0
_________________________________________________________________
dense_2 (Dense) (None, 10) 650
=================================================================
Total params: 232,650
Trainable params: 232,650
Non-trainable params: 0
_________________________________________________________________

In [ ]:
history = model.fit(X_train, y_train_enc,
epochs=EPOCHS,
batch_size=BATCH_SIZE,
verbose=VERBOSE,
validation_split=0.3)
In [23]:
plt.figure(figsize=(14, 5))

plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')

plt.savefig('./foo.png')
plt.show()

In [24]:
# model loss and accuracy on validation set
model.evaluate(X_val, y_val_enc, verbose=False)
Out[24]:
[0.04319758138058768, 0.98825395]

In [25]:
# predicted values
y_pred_enc = model.predict(X_val)

# actual
y_act = [np.argmax(i) for i in y_val_enc]

# decoding predicted values

y_pred = [np.argmax(i) for i in y_pred_enc]

print(y_pred_enc[0])
print(y_pred[0])

[2.9256535e-09 3.0600136e-06 7.0534142e-08 1.5436905e-10 9.9999607e-01

2.3972677e-09 6.5206586e-07 3.1757565e-08 9.4072838e-10 1.4325420e-07]
4

In [26]:
print(classification_report(y_act, y_pred))

precision recall f1-score support

0 0.98 1.00 0.99 1244

1 0.99 1.00 0.99 1427
2 0.99 0.98 0.98 1278
3 0.99 0.99 0.99 1280
4 0.99 0.99 0.99 1232
5 0.99 0.99 0.99 1127
6 1.00 0.98 0.99 1232
7 0.99 0.99 0.99 1289
8 0.98 0.99 0.99 1225
9 0.98 0.98 0.98 1266

accuracy 0.99 12600

macro avg 0.99 0.99 0.99 12600
weighted avg 0.99 0.99 0.99 12600

In [27]:
fig, ax = plt.subplots(figsize=(7, 7))
sns.heatmap(confusion_matrix(y_act, y_pred), annot=True,
cbar=False, fmt='1d', cmap='Blues', ax=ax)
ax.set_title('Confusion Matrix', loc='left', fontsize=16)
ax.set_xlabel('Predicted')
ax.set_ylabel('Actual')
plt.show()

In [28]:
# predicted values
y_pred_enc = model.predict(test)

# decoding predicted values

y_pred = [np.argmax(i) for i in y_pred_enc]

print(y_pred_enc[0])
print(y_pred[0])

[6.1794458e-10 6.5849557e-09 9.9999988e-01 4.3089918e-09 1.1493416e-10

7.5394031e-12 3.9346677e-09 5.5646712e-08 9.5187547e-10 3.4489059e-13]
2

In [29]:
# predicted targets of each images
# (labels above the images are predicted labels)
fig, ax = plt.subplots(figsize=(18, 12))
for ind, row in enumerate(test[:15]):
plt.subplot(3, 5, ind+1)
plt.title(y_pred[ind])
img = row.reshape(28, 28)
fig.suptitle('Predicted values', fontsize=24)
plt.axis('off')
plt.imshow(img, cmap='cividis')
Experiment 5: House Price Prediction using Neural
Networks
In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.optimizers import Adam
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
import warnings

# Set random seeds for reproducibility

np.random.seed(42)
tf.random.set_seed(42)

# Ignore warnings
warnings.filterwarnings('ignore')

# Check versions
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")

TensorFlow version: 2.17.0

Keras version: 3.9.0.dev2025031403

In [2]:

# Function to generate synthetic house price data

def generate_house_price_data(n_samples=1000):
np.random.seed(42) # For reproducibility

# Generate features
data = {
# Size features
'SquareFootage': np.random.normal(2200, 800, n_samples),
'NumBedrooms': np.random.randint(1, 6, n_samples),
'NumBathrooms': np.random.choice([1, 1.5, 2, 2.5, 3, 3.5, 4], n_samples),
'LotSize': np.random.normal(10000, 5000, n_samples),

# Property characteristics
'YearBuilt': np.random.randint(1950, 2023, n_samples),
'Garage': np.random.randint(0, 4, n_samples),
'HasPool': np.random.choice([0, 1], n_samples, p=[0.8, 0.2]),
'HasBasement': np.random.choice([0, 1], n_samples, p=[0.3, 0.7]),

# Location and market features

'DistanceToCity': np.random.normal(15, 10, n_samples),
'SchoolRating': np.random.uniform(1, 10, n_samples),
'CrimeRate': np.random.normal(5, 3, n_samples),
'MedianNeighborhoodIncome': np.random.normal(70000, 30000, n_samples)
}

# Create DataFrame
df = pd.DataFrame(data)

# Apply constraints and clean up data

df['SquareFootage'] = np.maximum(500, df['SquareFootage']) # Minimum 500 sq ft
df['LotSize'] = np.maximum(1000, df['LotSize']) # Minimum 1000 sq ft lot
df['CrimeRate'] = np.maximum(0, df['CrimeRate']) # Non-negative crime rate
df['MedianNeighborhoodIncome'] = np.maximum(20000, df['MedianNeighborhoodIncome'])

# Calculate property age

df['PropertyAge'] = 2023 - df['YearBuilt']

# Engineered feature: rooms per sq ft

df['RoomsPerSqft'] = (df['NumBedrooms'] + df['NumBathrooms']) / df['SquareFootage']

# Generate target variable (house price)

# Base price
price = 50000 + 110 * df['SquareFootage']

# Add effects of various features

price += 15000 * df['NumBedrooms']
price += 25000 * df['NumBathrooms']
price += 0.5 * df['LotSize']
price -= 1000 * df['PropertyAge'] # Older houses are worth less
price += 12000 * df['Garage'] # Value of garage spaces
price += 30000 * df['HasPool'] # Value of pool
price += 25000 * df['HasBasement'] # Value of basement
price -= 5000 * df['DistanceToCity'] # Further from city reduces value
price += 15000 * df['SchoolRating'] # Better schools increase value
price -= 10000 * df['CrimeRate'] # Higher crime reduces value
price += 0.8 * df['MedianNeighborhoodIncome'] # Neighborhood income affects value

# Add some random noise

price += np.random.normal(0, 50000, n_samples)

# Ensure minimum price

df['Price'] = np.maximum(50000, price)

return df

# Generate the dataset

house_df = generate_house_price_data(1500)

# Display the first few rows

print("Dataset Shape:", house_df.shape)
house_df.head()

Dataset Shape: (1500, 15)

Out[2]:

SquareFootage NumBedrooms NumBathrooms LotSize YearBuilt Garage HasPool HasBasement DistanceToCity Schoo

0 2597.371322 4 4.0 2995.403359 1959 1 1 0 17.024980

1 2089.388559 4 4.0 10302.213950 2004 2 0 1 4.521723

2 2718.150830 3 2.5 9081.684837 1958 1 0 1 22.751506

3 3418.423885 1 3.0 5826.183053 2001 1 0 1 8.665742

4 2012.677300 3 2.0 7657.212568 1951 0 0 0 17.020605

In [3]:
# Basic statistics of the dataset
house_df.describe()
Out[3]:

SquareFootage NumBedrooms NumBathrooms LotSize YearBuilt Garage HasPool HasBasement Distanc

count 1500.000000 1500.000000 1500.000000 1500.000000 1500.000000 1500.000000 1500.000000 1500.000000 150

mean 2242.712546 3.002667 2.543333 9921.081335 1985.995333 1.448000 0.192667 0.708000

std 783.412111 1.420800 1.009357 4826.557276 21.182314 1.133793 0.394525 0.454834

min 500.000000 1.000000 1.000000 1000.000000 1950.000000 0.000000 0.000000 0.000000 -1

25% SquareFootage
1700.774908 NumBedrooms
2.000000 NumBathrooms
1.500000 6379.818844
LotSize 1968.000000
YearBuilt 0.000000
Garage 0.000000 0.000000 Distanc
HasPool HasBasement

50% 2240.322382 3.000000 2.500000 9850.783418 1985.000000 1.000000 0.000000 1.000000

75% 2744.058789 4.000000 3.500000 13395.054832 2005.000000 2.000000 0.000000 1.000000

max 5282.185193 5.000000 4.000000 25463.206392 2022.000000 3.000000 1.000000 1.000000

In [4]:
# Check for missing values
print("Missing values in each column:")
house_df.isnull().sum()

Missing values in each column:

Out[4]:
SquareFootage 0
NumBedrooms 0
NumBathrooms 0
LotSize 0
YearBuilt 0
Garage 0
HasPool 0
HasBasement 0
DistanceToCity 0
SchoolRating 0
CrimeRate 0
MedianNeighborhoodIncome 0
PropertyAge 0
RoomsPerSqft 0
Price 0
dtype: int64

In [5]:
# Visualize the distribution of house prices
plt.figure(figsize=(10, 6))
sns.histplot(house_df['Price'], kde=True)
plt.title('Distribution of House Prices')
plt.xlabel('Price ($)')
plt.ylabel('Frequency')
plt.show()

# Calculate price statistics

print(f"Min Price: ${house_df['Price'].min():,.2f}")
print(f"Max Price: ${house_df['Price'].max():,.2f}")
print(f"Mean Price: ${house_df['Price'].mean():,.2f}")
print(f"Median Price: ${house_df['Price'].median():,.2f}")
Min Price: $50,000.00
Max Price: $870,754.60
Mean Price: $427,721.96
Median Price: $422,454.80

In [6]:
# Explore relationships between features and house prices
plt.figure(figsize=(15, 10))

# Create a list of numerical features to plot

numerical_features = ['SquareFootage', 'NumBedrooms', 'NumBathrooms', 'LotSize',
'PropertyAge', 'SchoolRating', 'DistanceToCity', 'CrimeRate']

# Plot scatter plots for each feature vs price

for i, feature in enumerate(numerical_features):
plt.subplot(2, 4, i+1)
plt.scatter(house_df[feature], house_df['Price'], alpha=0.5, s=10)
plt.title(f'{feature} vs Price')
plt.xlabel(feature)
plt.ylabel('Price')

plt.tight_layout()
plt.show()

In [7]:
# Explore categorical features
plt.figure(figsize=(15, 5))
# Plot HasPool
plt.subplot(1, 3, 1)
sns.boxplot(x='HasPool', y='Price', data=house_df)
plt.title('Price by Pool Presence')
plt.xlabel('Has Pool (1=Yes, 0=No)')
plt.ylabel('Price')

# Plot HasBasement
plt.subplot(1, 3, 2)
sns.boxplot(x='HasBasement', y='Price', data=house_df)
plt.title('Price by Basement Presence')
plt.xlabel('Has Basement (1=Yes, 0=No)')
plt.ylabel('Price')

# Plot Garage
plt.subplot(1, 3, 3)
sns.boxplot(x='Garage', y='Price', data=house_df)
plt.title('Price by Garage Size')
plt.xlabel('Number of Garage Spaces')
plt.ylabel('Price')

plt.tight_layout()
plt.show()

In [8]:
# Compute and visualize correlation matrix
correlation_matrix = house_df.corr()

plt.figure(figsize=(14, 10))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f', linewidths=0.5)
plt.title('Correlation Matrix of House Features')
plt.tight_layout()
plt.show()
In [9]:
# Sort correlations with Price for better insights
price_correlations = correlation_matrix['Price'].sort_values(ascending=False)
print("Features Correlation with Price:")
price_correlations

Features Correlation with Price:

Out[9]:
Price 1.000000
SquareFootage 0.664093
SchoolRating 0.273853
YearBuilt 0.199843
MedianNeighborhoodIncome 0.186012
NumBathrooms 0.171549
NumBedrooms 0.144245
HasPool 0.134249
Garage 0.105025
HasBasement 0.079272
LotSize -0.012259
PropertyAge -0.199843
CrimeRate -0.238229
RoomsPerSqft -0.341459
DistanceToCity -0.343838
Name: Price, dtype: float64

In [10]:
# Prepare features and target
X = house_df.drop(['Price', 'YearBuilt'], axis=1) # Drop price (target) and YearBuilt (
redundant with PropertyAge)
y = house_df['Price']

# Split the data into training, validation, and test sets

X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.3, random_state=42
)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_st
ate=42)

print(f"Training data shape: {X_train.shape}")

print(f"Validation data shape: {X_val.shape}")
print(f"Test data shape: {X_test.shape}")

Training data shape: (1050, 13)

Validation data shape: (225, 13)
Test data shape: (225, 13)

In [11]:
# Scale the features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)
X_test_scaled = scaler.transform(X_test)
# Check scaled data
print("Scaled training data - first 5 rows:")
pd.DataFrame(X_train_scaled[:5], columns=X.columns)

Scaled training data - first 5 rows:

Out[11]:

SquareFootage NumBedrooms NumBathrooms LotSize Garage HasPool HasBasement DistanceToCity SchoolRating Crim

-
0 -1.541859 1.346061 -1.512224 0.125611 0.472375 0.639841 -0.020848 -0.323407 0.8
0.470032

-
1 0.014937 -1.421172 -0.531170 0.239143 1.356500 -1.562889 -0.225932 -0.138350 -0.1
0.470032

-
2 -1.246153 -0.729363 0.449883 0.204953 0.472375 0.639841 1.189736 1.669335 -1.1
0.470032

-
3 1.137920 0.654253 0.449883 2.873937 0.472375 0.639841 -0.861980 1.258086 1.4
0.470032

- - -
4 0.140264 1.346061 0.940410 0.639841 1.439414 -1.238716 -0.0
0.245218 1.295875 0.470032

In [12]:
# Create a neural network for regression
def build_model(input_dim):
model = Sequential([
# Input layer
Dense(64, activation='relu', input_dim=input_dim),
BatchNormalization(),
Dropout(0.2),

# Hidden layer 1
Dense(32, activation='relu'),
BatchNormalization(),
Dropout(0.2),

# Hidden layer 2
Dense(16, activation='relu'),
BatchNormalization(),
Dropout(0.1),

# Output layer - linear activation for regression

Dense(1)
])

# Compile the model

model.compile(
optimizer=Adam(learning_rate=0.001),
loss='mean_squared_error',
metrics=['mean_absolute_error']
)

return model

# Build the model

model = build_model(X_train_scaled.shape[1])
model.summary()

Model: "sequential"

Total params: 3,969 (15.50 KB)

Trainable params: 3,745 (14.63 KB)

Non-trainable params: 224 (896.00 B)

In [ ]:
# Define callbacks for training
callbacks = [
# Early stopping to prevent overfitting
keras.callbacks.EarlyStopping(
monitor='val_loss',
patience=20,
mode='min',
restore_best_weights=True
),
# Reduce learning rate when training plateaus
keras.callbacks.ReduceLROnPlateau(
monitor='val_loss',
factor=0.2,
patience=5,
min_lr=0.00001
)
]

# Train the model

history = model.fit(
X_train_scaled,
y_train,
epochs=100,
batch_size=32,
validation_data=(X_val_scaled, y_val),
callbacks=callbacks,
verbose=1
)

In [14]:
# Plot training history
plt.figure(figsize=(12, 5))

# Plot training & validation loss

plt.subplot(1, 2, 1)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss (MSE)')
plt.legend()
plt.grid(True, alpha=0.3)

# Plot training & validation MAE

plt.subplot(1, 2, 2)
plt.plot(history.history['mean_absolute_error'], label='Training MAE')
plt.plot(history.history['val_mean_absolute_error'], label='Validation MAE')
plt.title('Model Mean Absolute Error')
plt.xlabel('Epoch')
plt.ylabel('MAE')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

In [15]:
# Evaluate on test set
test_loss, test_mae = model.evaluate(X_test_scaled, y_test, verbose=0)
print(f"Test Loss (MSE): {test_loss:.2f}")
print(f"Test MAE: ${test_mae:.2f}")

Test Loss (MSE): 200826667008.00

Test MAE: $430195.06

In [16]:
# Make predictions
y_pred = model.predict(X_test_scaled)

# Calculate regression metrics

mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Squared Error (MSE): {mse:.2f}")

print(f"Root Mean Squared Error (RMSE): ${rmse:.2f}")
print(f"Mean Absolute Error (MAE): ${mae:.2f}")
print(f"R² Score: {r2:.4f}")

# Calculate MAPE (Mean Absolute Percentage Error)

mape = np.mean(np.abs((y_test - y_pred.flatten()) / y_test)) * 100
print(f"Mean Absolute Percentage Error (MAPE): {mape:.2f}%")

8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step

Mean Squared Error (MSE): 200826681383.91
Root Mean Squared Error (RMSE): $448136.90
Mean Absolute Error (MAE): $430195.08
R² Score: -11.6721
Mean Absolute Percentage Error (MAPE): 99.93%
In [17]:
# Visualize predictions vs actual values
plt.figure(figsize=(10, 6))
plt.scatter(y_test, y_pred, alpha=0.5)
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--', lw=2)
plt.title('Predicted vs Actual House Prices')
plt.xlabel('Actual Price ($)')
plt.ylabel('Predicted Price ($)')
plt.grid(True, alpha=0.3)

# Add metrics to the plot

plt.annotate(f"R² = {r2:.4f}\n RMSE = ${rmse:.2f}\n MAPE = {mape:.2f}%",
xy=(0.05, 0.92), xycoords='axes fraction',
bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="gray", alpha=0.8))

plt.show()

In [18]:
# Plot residuals
residuals = y_test - y_pred.flatten()

plt.figure(figsize=(12, 5))

# Residuals vs Predicted
plt.subplot(1, 2, 1)
plt.scatter(y_pred, residuals, alpha=0.5)
plt.axhline(y=0, color='r', linestyle='--')
plt.title('Residuals vs Predicted Values')
plt.xlabel('Predicted Price ($)')
plt.ylabel('Residuals')
plt.grid(True, alpha=0.3)

# Residual distribution
plt.subplot(1, 2, 2)
sns.histplot(residuals, kde=True)
plt.title('Distribution of Residuals')
plt.xlabel('Residual Value')
plt.ylabel('Frequency')

plt.tight_layout()
plt.show()

In [ ]:
# Function to measure feature importance using permutation method
def permutation_importance(model, X, y, n_repeats=10):
baseline_mae = mean_absolute_error(y, model.predict(X))
importances = []

for col_idx in range(X.shape[1]):

col_importances = []
for _ in range(n_repeats):
# Create a shuffled copy of the feature
X_permuted = X.copy()
np.random.shuffle(X_permuted[:, col_idx])

# Measure the change in MAE

permuted_mae = mean_absolute_error(y, model.predict(X_permuted))
importance = permuted_mae - baseline_mae
col_importances.append(importance)

importances.append(np.mean(col_importances))

return importances

# Calculate feature importances

feature_importances = permutation_importance(model, X_test_scaled, y_test, n_repeats=5)

# Create a DataFrame for visualization

importance_df = pd.DataFrame({
'Feature': X.columns,
'Importance': feature_importances
})

# Sort by importance
importance_df = importance_df.sort_values('Importance', ascending=False)

# Plot feature importance

plt.figure(figsize=(12, 8))
plt.barh(importance_df['Feature'], importance_df['Importance'], color='skyblue')
plt.title('Feature Importance based on Permutation Method')
plt.xlabel('Increase in MAE when Feature is Permuted')
plt.grid(True, alpha=0.3, axis='x')
plt.tight_layout()
plt.show()

In [21]:
# Create sample houses for prediction
sample_houses = pd.DataFrame([
# Small starter home
{
'SquareFootage': 1200,
'NumBedrooms': 2,
'NumBathrooms': 1,
'LotSize': 5000,
'PropertyAge': 40,
'Garage': 1,
'HasPool': 0,
'HasBasement': 0,
'DistanceToCity': 20,
'SchoolRating': 6.5,
'CrimeRate': 7.2,
'MedianNeighborhoodIncome': 45000,
'RoomsPerSqft': (2 + 1) / 1200
},
# Medium suburban home
{
'SquareFootage': 2500,
'NumBedrooms': 3,
'NumBathrooms': 2.5,
'LotSize': 8500,
'PropertyAge': 15,
'Garage': 2,
'HasPool': 0,
'HasBasement': 1,
'DistanceToCity': 12,
'SchoolRating': 8.2,
'CrimeRate': 3.1,
'MedianNeighborhoodIncome': 85000,
'RoomsPerSqft': (3 + 2.5) / 2500
},
# Luxury home
{
'SquareFootage': 4200,
'NumBedrooms': 5,
'NumBathrooms': 4.5,
'LotSize': 15000,
'PropertyAge': 5,
'Garage': 3,
'HasPool': 1,
'HasBasement': 1,
'DistanceToCity': 8,
'SchoolRating': 9.8,
'CrimeRate': 1.2,
'MedianNeighborhoodIncome': 150000,
'RoomsPerSqft': (5 + 4.5) / 4200
}
])

# Reorder columns to match training data

sample_houses = sample_houses[X.columns]

# Scale the sample houses

sample_houses_scaled = scaler.transform(sample_houses)

# Make predictions
sample_predictions = model.predict(sample_houses_scaled).flatten()

# Add predictions to the sample houses DataFrame

sample_houses['Predicted Price'] = sample_predictions

# Define house types for display

house_types = ['Small Starter Home', 'Medium Suburban Home', 'Luxury Home']
sample_houses['House Type'] = house_types

# Display the results

print("\n Price Predictions for Sample Houses:\n ")
sample_results = sample_houses[['House Type', 'SquareFootage', 'NumBedrooms', 'NumBathroo
ms', 'PropertyAge', 'Predicted Price']]
display(sample_results)

# Format prices nicely

for i, house_type in enumerate(house_types):
price = sample_predictions[i]
print(f"{house_type} : ${price:,.2f} ")

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 30ms/step

Price Predictions for Sample Houses:

House Type SquareFootage NumBedrooms NumBathrooms PropertyAge Predicted Price

0 Small Starter Home 1200 2 1.0 40 -434.238617

1 Medium Suburban Home 2500 3 2.5 15 925.150818

2 Luxury Home 4200 5 4.5 5 1586.954102

Small Starter Home: $-434.24

Medium Suburban Home: $925.15
Luxury Home: $1,586.95

In [23]:

# Function to predict price given a base house and varied feature

def predict_price_with_varied_feature(base_house, feature_name, feature_values):
# Create copies of the base house with different feature values
varied_houses = []

for value in feature_values:

house_copy = base_house.copy()
house_copy[feature_name] = value

# Update RoomsPerSqft if needed

if feature_name in ['NumBedrooms', 'NumBathrooms', 'SquareFootage']:
bedrooms = house_copy['NumBedrooms']
bathrooms = house_copy['NumBathrooms']
sqft = house_copy['SquareFootage']
house_copy['RoomsPerSqft'] = (bedrooms + bathrooms) / sqft

varied_houses.append(house_copy)

# Convert to DataFrame
varied_df = pd.DataFrame(varied_houses)

# Scale and predict

varied_scaled = scaler.transform(varied_df)
predictions = model.predict(varied_scaled).flatten()

return predictions

# Use the medium suburban home as our base for what-if analysis
base_house = sample_houses.iloc[1].to_dict()
base_house.pop('House Type', None)
base_house.pop('Predicted Price', None)
Out[23]:

925.1508178710938

In [ ]:

# Analyze the impact of square footage

sqft_values = np.linspace(1000, 5000, 10)
sqft_predictions = predict_price_with_varied_feature(base_house, 'SquareFootage', sqft_va
lues)

# Analyze the impact of property age

age_values = np.linspace(0, 70, 10)
age_predictions = predict_price_with_varied_feature(base_house, 'PropertyAge', age_values
)

# Analyze the impact of school rating

school_values = np.linspace(1, 10, 10)
school_predictions = predict_price_with_varied_feature(base_house, 'SchoolRating', school
_values)

# Analyze the impact of distance to city

distance_values = np.linspace(0, 30, 10)
distance_predictions = predict_price_with_varied_feature(base_house, 'DistanceToCity', di
stance_values)

In [25]:

# Plot the what-if analysis results

plt.figure(figsize=(20, 15))

# Square footage impact

plt.subplot(2, 2, 1)
plt.plot(sqft_values, sqft_predictions, marker='o', linestyle='-', linewidth=2, markersi
ze=8)
plt.title('Impact of Square Footage on House Price')
plt.xlabel('Square Footage')
plt.ylabel('Predicted Price ($)')
plt.grid(True, alpha=0.3)

# Calculate price increase per square foot

price_per_sqft = (sqft_predictions[-1] - sqft_predictions[0]) / (sqft_values[-1] - sqft_
values[0])
plt.annotate(f"Avg Price Increase: ${price_per_sqft:.2f} per sq ft",
xy=(0.05, 0.92), xycoords='axes fraction',
bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="gray", alpha=0.8))

# Property age impact

plt.subplot(2, 2, 2)
plt.plot(age_values, age_predictions, marker='o', linestyle='-', linewidth=2, markersize
=8, color='green')
plt.title('Impact of Property Age on House Price')
plt.xlabel('Property Age (years)')
plt.ylabel('Predicted Price ($)')
plt.grid(True, alpha=0.3)

# Calculate price decrease per year of age

price_per_year = (age_predictions[0] - age_predictions[-1]) / (age_values[-1] - age_valu
es[0])
plt.annotate(f"Avg Price Decrease: ${price_per_year:.2f} per year",
xy=(0.05, 0.92), xycoords='axes fraction',
bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="gray", alpha=0.8))

# School rating impact

plt.subplot(2, 2, 3)
plt.plot(school_values, school_predictions, marker='o', linestyle='-', linewidth=2, marke
rsize=8, color='red')
plt.title('Impact of School Rating on House Price')
plt.xlabel('School Rating (1-10)')
plt.ylabel('Predicted Price ($)')
plt.grid(True, alpha=0.3)

# Calculate price increase per school rating point

price_per_rating = (school_predictions[-1] - school_predictions[0]) / (school_values[-1]
- school_values[0])
plt.annotate(f"Avg Price Increase: ${price_per_rating:.2f} per rating point",
xy=(0.05, 0.92), xycoords='axes fraction',
bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="gray", alpha=0.8))

# Distance to city impact

plt.subplot(2, 2, 4)
plt.plot(distance_values, distance_predictions, marker='o', linestyle='-', linewidth=2,
markersize=8, color='purple')
plt.title('Impact of Distance to City on House Price')
plt.xlabel('Distance to City (miles)')
plt.ylabel('Predicted Price ($)')
plt.grid(True, alpha=0.3)
# Calculate price decrease per mile
price_per_mile = (distance_predictions[0] - distance_predictions[-1]) / (distance_values
[-1] - distance_values[0])
plt.annotate(f"Avg Price Decrease: ${price_per_mile:.2f} per mile",
xy=(0.05, 0.92), xycoords='axes fraction',
bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="gray", alpha=0.8))

plt.suptitle('What-If Analysis: How Individual Features Affect House Price', fontsize=20)

plt.tight_layout()
plt.subplots_adjust(top=0.92)
plt.show()

In [27]:
# Create a function to predict house price based on input features
def predict_house_price(square_footage, num_bedrooms, num_bathrooms, lot_size, property_a
ge,
garage_spaces, has_pool, has_basement, distance_to_city,
school_rating, crime_rate, median_neighborhood_income):

# Calculate the derived feature

rooms_per_sqft = (num_bedrooms + num_bathrooms) / square_footage

# Create a DataFrame with the house features

house_features = pd.DataFrame([
{
'SquareFootage': square_footage,
'NumBedrooms': num_bedrooms,
'NumBathrooms': num_bathrooms,
'LotSize': lot_size,
'PropertyAge': property_age,
'Garage': garage_spaces,
'HasPool': has_pool,
'HasBasement': has_basement,
'DistanceToCity': distance_to_city,
'SchoolRating': school_rating,
'CrimeRate': crime_rate,
'MedianNeighborhoodIncome': median_neighborhood_income,
'RoomsPerSqft': rooms_per_sqft
}
])

# Reorder columns to match training data

house_features = house_features[X.columns]

# Scale the features

house_features_scaled = scaler.transform(house_features)

# Make prediction
predicted_price = model.predict(house_features_scaled)[0][0]

return predicted_price

# Test the function with an example house

example_price = predict_house_price(
square_footage=2200,
num_bedrooms=3,
num_bathrooms=2,
lot_size=9000,
property_age=12,
garage_spaces=2,
has_pool=0,
has_basement=1,
distance_to_city=15,
school_rating=7.5,
crime_rate=4.2,
median_neighborhood_income=75000
)

print(f"Predicted House Price: ${example_price:,.2f}")

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step

Predicted House Price: $590.48
Experiment 6: Basic Speech Recognition using Neural
Networks
In [ ]:

%pip install librosa

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Conv1D, MaxPooling1D, Flatten, LSTM,
BatchNormalization
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import confusion_matrix, classification_report
import librosa
import librosa.display
import IPython.display as ipd
import os
import warnings

# Set random seeds for reproducibility

np.random.seed(42)
tf.random.set_seed(42)

# Ignore warnings
warnings.filterwarnings('ignore')

# Check versions
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")
print(f"Librosa version: {librosa.__version__}")

In [2]:
# Try to download a small version of the Speech Commands dataset
try:
# Create a directory for our audio files
os.makedirs('speech_data', exist_ok=True)

# Download a subset of the Speech Commands dataset

!wget -q -O speech_commands_v0.01.tar.gz http://download.tensorflow.org/data/speech_
commands_v0.01.tar.gz
!tar -xzf speech_commands_v0.01.tar.gz -C speech_data

# Check if download and extraction were successful

if not os.path.exists('speech_data/_background_noise_'):
raise Exception("Download or extraction failed")

print("Successfully downloaded and extracted Speech Commands dataset.")

# List the available command categories

categories = [d for d in os.listdir('speech_data') if os.path.isdir(os.path.join('sp
eech_data', d)) and not d.startswith('_')]
print(f"Available command categories: {categories} ")

# We'll use a subset of commands for our example

selected_commands = ['yes', 'no', 'up', 'down', 'left', 'right']
print(f"Selected commands for our model: {selected_commands}")

using_synthetic_data = False
except Exception as e:
print(f"Error: {e}")
print("Failed to download Speech Commands dataset. Creating synthetic audio data inst
ead.")
using_synthetic_data = True

Successfully downloaded and extracted Speech Commands dataset.

Available command categories: ['yes', 'six', 'house', 'tree', 'no', 'on', 'off', 'go', 't
hree', 'up', 'seven', 'right', 'happy', 'eight', 'stop', 'five', 'cat', 'one', 'sheila',
'down', 'wow', 'two', 'nine', 'zero', 'left', 'marvin', 'dog', 'four', 'bird', 'bed']
Selected commands for our model: ['yes', 'no', 'up', 'down', 'left', 'right']

In [3]:

# Function to generate synthetic audio for demonstration

def generate_synthetic_speech_data(n_samples=1000, duration=1.0, sr=16000):
"""
Generate synthetic audio data that mimics speech for different commands.
This is not real speech but will allow us to demonstrate the concepts.
"""
# Commands to simulate
commands = ['yes', 'no', 'up', 'down', 'left', 'right']
n_commands = len(commands)
n_samples_per_command = n_samples // n_commands

# Create synthetic data

X = []
y = []

for idx, command in enumerate(commands):

for _ in range(n_samples_per_command):
# Generate a base frequency between 150 and 350 Hz
base_freq = np.random.uniform(150, 350)

# Create time array

t = np.linspace(0, duration, int(sr * duration), endpoint=False)

# Generate base signal with the frequency

x = np.sin(2 * np.pi * base_freq * t)

# Add harmonics with different patterns for each command

if command == 'yes':
x += 0.5 * np.sin(2 * np.pi * (base_freq * 2) * t) * np.exp(-t/0.5)
elif command == 'no':
x += 0.5 * np.sin(2 * np.pi * (base_freq * 1.5) * t) * np.exp(-t/0.3)
elif command == 'up':
x += 0.4 * np.sin(2 * np.pi * np.linspace(base_freq, base_freq * 2, len(
t)) * t)
elif command == 'down':
x += 0.4 * np.sin(2 * np.pi * np.linspace(base_freq * 2, base_freq, len(
t)) * t)
elif command == 'left':
x += 0.3 * np.sin(2 * np.pi * (base_freq * 1.2) * t) * (1 + np.sin(2 *
np.pi * 3 * t))
elif command == 'right':
x += 0.3 * np.sin(2 * np.pi * (base_freq * 1.8) * t) * (1 + np.sin(2 *
np.pi * 4 * t))

# Add some noise

x += np.random.normal(0, 0.1, len(x))

# Normalize
x = x / np.max(np.abs(x))

X.append(x)
y.append(command)

return np.array(X), np.array(y), commands

# If real data is not available, generate synthetic data

if using_synthetic_data:
X_synthetic, y_synthetic, selected_commands = generate_synthetic_speech_data(1200)
print(f"Generated synthetic speech data for commands: {selected_commands}")
print(f"Data shape: {X_synthetic.shape}, Labels shape: {y_synthetic.shape}")

In [4]:
# Function to load and process real audio files
def load_audio_files(commands, data_dir='speech_data', max_files_per_command=200, duratio
n=1.0, sr=16000):
X = []
y = []

for command in commands:

print(f"Loading {command} audio files...")
command_dir = os.path.join(data_dir, command)
files = os.listdir(command_dir)[:max_files_per_command]

for file in files:

file_path = os.path.join(command_dir, file)
try:
# Load the audio file with a fixed duration
audio, _ = librosa.load(file_path, sr=sr, duration=duration)

# Ensure all audio samples have the same length

if len(audio) < sr * duration:
audio = np.pad(audio, (0, int(sr * duration) - len(audio)))
else:
audio = audio[:int(sr * duration)]

X.append(audio)
y.append(command)
except Exception as e:
print(f"Error loading file {file_path}: {e}")
continue

return np.array(X), np.array(y)

# Load real audio data if available, otherwise use synthetic data

if not using_synthetic_data:
# Load real audio files
X, y = load_audio_files(selected_commands)
print(f"Loaded {len(X)} audio files.")
else:
# Use synthetic data
X, y = X_synthetic, y_synthetic
print(f"Using synthetic audio data with {len(X)} samples.")

Loading yes audio files...

Loading no audio files...
Loading up audio files...
Loading down audio files...
Loading left audio files...
Loading right audio files...
Loaded 1200 audio files.

In [5]:
# Visualize some audio waveforms
plt.figure(figsize=(15, 10))
commands_to_plot = set(y) # Get unique commands

for idx, command in enumerate(commands_to_plot):

# Find first occurrence of this command
audio_idx = np.where(y == command)[0][0]
audio = X[audio_idx]

plt.subplot(len(commands_to_plot), 2, 2*idx+1)
plt.plot(audio)
plt.title(f'Waveform for "{command}"')
plt.xlabel('Sample')
plt.ylabel('Amplitude')
# Plot the spectrogram
plt.subplot(len(commands_to_plot), 2, 2*idx+2)
D = librosa.amplitude_to_db(np.abs(librosa.stft(audio)), ref=np.max)
librosa.display.specshow(D, y_axis='log', x_axis='time')
plt.title(f'Spectrogram for "{command}"')
plt.colorbar(format='%+2.0f dB')

plt.tight_layout()
plt.show()

In [6]:
# Play some audio samples (only works in interactive notebook environments)
for command in list(set(y))[:3]: # Only play first few commands
audio_idx = np.where(y == command)[0][0]
audio = X[audio_idx]

print(f"Playing sample for '{command}'")

# This will only work in a Jupyter notebook environment
ipd.display(ipd.Audio(audio, rate=16000))

Playing sample for 'left'

Your browser does not support the audio element.

Playing sample for 'up'

Your browser does not support the audio element.

Playing sample for 'down'

Your browser does not support the audio element.

In [7]:
# Function to extract MFCCs from audio data
def extract_mfccs(audio, sr=16000, n_mfcc=13):
mfccs = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)
return mfccs.T # Transpose to get time steps as first dimension
# Extract MFCCs for all samples
n_mfcc = 13 # Number of MFCC coefficients to extract
X_mfccs = []

print("Extracting MFCC features...")

for audio in X:
mfccs = extract_mfccs(audio, n_mfcc=n_mfcc)
X_mfccs.append(mfccs)

# Convert list to numpy array

X_mfccs = np.array(X_mfccs)
print(f"MFCC features shape: {X_mfccs.shape}")

Extracting MFCC features...

MFCC features shape: (1200, 32, 13)

In [8]:
# Visualize MFCC features for different commands
plt.figure(figsize=(15, 10))
commands_to_plot = set(y) # Get unique commands

for idx, command in enumerate(commands_to_plot):

# Find first occurrence of this command
audio_idx = np.where(y == command)[0][0]
mfcc = X_mfccs[audio_idx]

plt.subplot(len(commands_to_plot), 1, idx+1)
librosa.display.specshow(mfcc.T, x_axis='time')
plt.title(f'MFCC for "{command}"')
plt.colorbar()

plt.tight_layout()
plt.show()

In [9]:
In [9]:
# Encode the labels
label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(y)

# Print the mapping

print("Label encoding:")
for idx, label in enumerate(label_encoder.classes_):
print(f" {label} -> {idx}")

# Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(
X_mfccs, y_encoded, test_size=0.2, random_state=42, stratify=y_encoded
)

print(f"Training data shape: {X_train.shape}")

print(f"Testing data shape: {X_test.shape}")

# Convert to categorical (one-hot encoded) format

y_train_cat = tf.keras.utils.to_categorical(y_train, num_classes=len(label_encoder.class
es_))
y_test_cat = tf.keras.utils.to_categorical(y_test, num_classes=len(label_encoder.classes_
))

print(f"y_train_cat shape: {y_train_cat.shape}")

print(f"y_test_cat shape: {y_test_cat.shape}")

Label encoding:
down -> 0
left -> 1
no -> 2
right -> 3
up -> 4
yes -> 5
Training data shape: (960, 32, 13)
Testing data shape: (240, 32, 13)
y_train_cat shape: (960, 6)
y_test_cat shape: (240, 6)

In [10]:
# 1. Build a CNN model
def build_cnn_model(input_shape, num_classes):
model = Sequential([
# Convolutional layers
Conv1D(32, 3, activation='relu', padding='same', input_shape=input_shape),
BatchNormalization(),
MaxPooling1D(pool_size=2),

Conv1D(64, 3, activation='relu', padding='same'),

BatchNormalization(),
MaxPooling1D(pool_size=2),

Conv1D(128, 3, activation='relu', padding='same'),

BatchNormalization(),
MaxPooling1D(pool_size=2),

# Flatten and dense layers

Flatten(),
Dense(128, activation='relu'),
Dropout(0.5),
Dense(num_classes, activation='softmax')
])

model.compile(
optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy']
)

return model
# 2. Build an LSTM model (recurrent neural network)
def build_lstm_model(input_shape, num_classes):
model = Sequential([
# LSTM layers
LSTM(64, return_sequences=True, input_shape=input_shape),
Dropout(0.2),

LSTM(64),
Dropout(0.2),

# Dense layers
Dense(32, activation='relu'),
Dropout(0.3),
Dense(num_classes, activation='softmax')
])

model.compile(
optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy']
)

return model

# Create models
input_shape = X_train.shape[1:]
num_classes = len(label_encoder.classes_)

cnn_model = build_cnn_model(input_shape, num_classes)

lstm_model = build_lstm_model(input_shape, num_classes)

print("CNN Model Summary:")

cnn_model.summary()

print("\n LSTM Model Summary:")

lstm_model.summary()

CNN Model Summary:

Model: "sequential"

Total params: 99,526 (388.77 KB)

Trainable params: 99,078 (387.02 KB)

Non-trainable params: 448 (1.75 KB)

LSTM Model Summary:

Model: "sequential_1"

Total params: 55,270 (215.90 KB)

Trainable params: 55,270 (215.90 KB)

Non-trainable params: 0 (0.00 B)

In [ ]:
# Define callbacks for early stopping
callbacks = [
tf.keras.callbacks.EarlyStopping(
monitor='val_loss',
patience=5,
mode='min',
restore_best_weights=True
)
]

# Train CNN model

print("Training CNN model...")
cnn_history = cnn_model.fit(
X_train, y_train_cat,
epochs=30,
batch_size=32,
validation_split=0.2,
callbacks=callbacks,
verbose=1
)

# Train LSTM model

print("\n Training LSTM model...")
lstm_history = lstm_model.fit(
X_train, y_train_cat,
epochs=30,
batch_size=32,
validation_split=0.2,
callbacks=callbacks,
verbose=1
)

In [12]:
# Plot training histories
plt.figure(figsize=(12, 5))

# Plot accuracy
plt.subplot(1, 2, 1)
plt.plot(cnn_history.history['accuracy'], label='CNN Training')
plt.plot(cnn_history.history['val_accuracy'], label='CNN Validation')
plt.plot(lstm_history.history['accuracy'], label='LSTM Training')
plt.plot(lstm_history.history['val_accuracy'], label='LSTM Validation')
plt.title('Model Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.grid(True, alpha=0.3)

# Plot loss
plt.subplot(1, 2, 2)
plt.plot(cnn_history.history['loss'], label='CNN Training')
plt.plot(cnn_history.history['val_loss'], label='CNN Validation')
plt.plot(lstm_history.history['loss'], label='LSTM Training')
plt.plot(lstm_history.history['val_loss'], label='LSTM Validation')
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

In [13]:
# Evaluate models on the test set
print("Evaluating CNN model...")
cnn_test_loss, cnn_test_acc = cnn_model.evaluate(X_test, y_test_cat, verbose=0)
print(f"CNN Test Loss: {cnn_test_loss:.4f}")
print(f"CNN Test Accuracy: {cnn_test_acc:.4f}")

print("\n Evaluating LSTM model...")

lstm_test_loss, lstm_test_acc = lstm_model.evaluate(X_test, y_test_cat, verbose=0)
print(f"LSTM Test Loss: {lstm_test_loss:.4f}")
print(f"LSTM Test Accuracy: {lstm_test_acc:.4f}")

Evaluating CNN model...

CNN Test Loss: 0.7453
CNN Test Accuracy: 0.7583
Evaluating LSTM model...
LSTM Test Loss: 0.8699
LSTM Test Accuracy: 0.6833

In [14]:
# Make predictions with both models
cnn_predictions = cnn_model.predict(X_test)
cnn_pred_classes = np.argmax(cnn_predictions, axis=1)

lstm_predictions = lstm_model.predict(X_test)
lstm_pred_classes = np.argmax(lstm_predictions, axis=1)

# Compute confusion matrices

plt.figure(figsize=(15, 6))

# CNN confusion matrix

plt.subplot(1, 2, 1)
cnn_cm = confusion_matrix(y_test, cnn_pred_classes)
sns.heatmap(cnn_cm, annot=True, fmt='d', cmap='Blues',
xticklabels=label_encoder.classes_,
yticklabels=label_encoder.classes_)
plt.title('CNN Confusion Matrix')
plt.xlabel('Predicted Label')
plt.ylabel('True Label')

# LSTM confusion matrix

plt.subplot(1, 2, 2)
lstm_cm = confusion_matrix(y_test, lstm_pred_classes)
sns.heatmap(lstm_cm, annot=True, fmt='d', cmap='Blues',
xticklabels=label_encoder.classes_,
yticklabels=label_encoder.classes_)
plt.title('LSTM Confusion Matrix')
plt.xlabel('Predicted Label')
plt.ylabel('True Label')

plt.tight_layout()
plt.show()

8/8 ━━━━━━━━━━━━━━━━━━━━ 1s 50ms/step

8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 27ms/step

In [15]:

# Generate classification reports

print("CNN Classification Report:")
print(classification_report(y_test, cnn_pred_classes, target_names=label_encoder.classes_
))

print("\n LSTM Classification Report:")

print(classification_report(y_test, lstm_pred_classes, target_names=label_encoder.classe
s_))

CNN Classification Report:

precision recall f1-score support

down 0.66 0.72 0.69 40

left 0.69 0.62 0.66 40
no 0.60 0.72 0.66 40
right 0.86 0.93 0.89 40
up 0.89 0.85 0.87 40
yes 0.90 0.70 0.79 40

accuracy 0.76 240

macro avg 0.77 0.76 0.76 240
weighted avg 0.77 0.76 0.76 240

LSTM Classification Report:

precision recall f1-score support

down 0.59 0.57 0.58 40

left 0.65 0.55 0.59 40
no 0.57 0.53 0.55 40
right 0.73 0.80 0.76 40
up 0.71 0.80 0.75 40
yes 0.83 0.85 0.84 40

accuracy 0.68 240

macro avg 0.68 0.68 0.68 240
weighted avg 0.68 0.68 0.68 240

In [16]:

# Function to recognize speech commands

def recognize_command(audio, model, label_encoder, sr=16000, duration=1.0, n_mfcc=13):
"""
Recognize speech command from audio data
"""
# Ensure audio is the right length
if len(audio) < sr * duration:
audio = np.pad(audio, (0, int(sr * duration) - len(audio)))
else:
audio = audio[:int(sr * duration)]

# Extract MFCC features

mfccs = extract_mfccs(audio, sr=sr, n_mfcc=n_mfcc)

# Reshape for the model

mfccs = np.expand_dims(mfccs, axis=0) # Add batch dimension

# Make prediction
prediction = model.predict(mfccs)[0]
predicted_class_idx = np.argmax(prediction)
predicted_command = label_encoder.classes_[predicted_class_idx]
confidence = prediction[predicted_class_idx]

return predicted_command, confidence, prediction

# Choose the better performing model

if cnn_test_acc >= lstm_test_acc:
best_model = cnn_model
print("Using CNN model for speech recognition")
else:
best_model = lstm_model
print("Using LSTM model for speech recognition")

Using CNN model for speech recognition

In [17]:
# Test speech recognition on a few samples
plt.figure(figsize=(15, 10))

for i in range(5): # Test 5 random samples

idx = np.random.randint(0, len(X_test))
audio = X[idx]
true_command = y[idx]

# Recognize the command

predicted_command, confidence, all_probabilities = recognize_command(
audio, best_model, label_encoder
)

# Plot audio waveform

plt.subplot(5, 2, 2*i+1)
plt.plot(audio)
plt.title(f"Sample {i+1}: True = '{true_command}', Predicted = '{predicted_command}'
")
plt.xlabel('Sample')
plt.ylabel('Amplitude')

# Plot probabilities for each command

plt.subplot(5, 2, 2*i+2)
barlist = plt.bar(label_encoder.classes_, all_probabilities)

# Highlight the predicted and true command

for j, cls in enumerate(label_encoder.classes_):
if cls == predicted_command:
barlist[j].set_color('green')
elif cls == true_command and cls != predicted_command:
barlist[j].set_color('red')

plt.title(f"Prediction Probabilities (Confidence: {confidence:.2f})")

plt.xticks(rotation=45)
plt.xlabel('Command')
plt.ylabel('Probability')

plt.tight_layout()
plt.show()

1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 623ms/step

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 80ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 106ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 126ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 102ms/step
In [18]:

# Function to execute actions based on recognized commands

def execute_command(command, confidence_threshold=0.7):
"""Simulate executing actions based on recognized commands"""
if command == 'yes':
return " Confirmed action"
elif command == 'no':
return " Cancelled action"
elif command == 'up':
return "⬆️ Moving up"
elif command == 'down':
return "⬇️ Moving down"
elif command == 'left':
return "⬅️ Moving left"
elif command == 'right':
return "➡️ Moving right"
else:
return " Command not recognized"

# Simulation of real-time speech recognition

def speech_command_demo(num_commands=10):
print("=== Speech Command Recognition Demo ===\n ")
print("Supported commands:", ', '.join(label_encoder.classes_))
print("\n Listening for commands...\n ")

# Randomly select some test samples

indices = np.random.randint(0, len(X), num_commands)

for i, idx in enumerate(indices):

audio = X[idx]
true_command = y[idx]

# Simulate recording audio

print(f"[Recording audio {i+1}...]")

# Process the audio and recognize the command

predicted_command, confidence, _ = recognize_command(
audio, best_model, label_encoder
)

# Execute the command action

action_result = execute_command(predicted_command)

# Print results
print(f"Heard: '{predicted_command}' (Confidence: {confidence:.2f})")
print(f"Executing: {action_result}")
print(f"[Actual command was: '{true_command}']")
print("-" * 40)

print("\n Demo completed.")

# Run the demo

speech_command_demo(8)

=== Speech Command Recognition Demo ===

Supported commands: down, left, no, right, up, yes

Listening for commands...

[Recording audio 1...]

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 95ms/step
Heard: 'right' (Confidence: 1.00)
Executing: ➡️ Moving right
[Actual command was: 'right']
----------------------------------------
[Recording audio 2...]
[Recording audio 2...]
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 138ms/step
Heard: 'right' (Confidence: 1.00)
Executing: ➡️ Moving right
[Actual command was: 'right']
----------------------------------------
[Recording audio 3...]
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 89ms/step
Heard: 'yes' (Confidence: 1.00)
Executing: Confirmed action
[Actual command was: 'yes']
----------------------------------------
[Recording audio 4...]
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 43ms/step
Heard: 'up' (Confidence: 1.00)
Executing: ⬆️ Moving up
[Actual command was: 'up']
----------------------------------------
[Recording audio 5...]
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 47ms/step
Heard: 'no' (Confidence: 0.99)
Executing: Cancelled action
[Actual command was: 'no']
----------------------------------------
[Recording audio 6...]
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 41ms/step
Heard: 'no' (Confidence: 0.60)
Executing: Cancelled action
[Actual command was: 'yes']
----------------------------------------
[Recording audio 7...]
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 42ms/step
Heard: 'right' (Confidence: 1.00)
Executing: ➡️ Moving right
[Actual command was: 'right']
----------------------------------------
[Recording audio 8...]
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 42ms/step
Heard: 'left' (Confidence: 0.98)
Executing: ⬅️ Moving left
[Actual command was: 'left']
----------------------------------------

Demo completed.

In [19]:
# Simulate Google Speech-to-Text API integration
def simulate_google_speech_api(audio, language_code="en-US"):
"""Simulate calling Google Speech-to-Text API"""
# In a real implementation, you would:
# 1. Convert the audio to the correct format
# 2. Call the Google Speech-to-Text API
# 3. Process the response

# For simulation, we'll just use our existing model but add some API-like output
predicted_command, confidence, _ = recognize_command(
audio, best_model, label_encoder
)

# Simulate API response format

response = {
"results": [
{
"alternatives": [
{
"transcript": predicted_command,
"confidence": float(confidence)
}
]
}
],
"language_code": language_code
}

return response

# Demonstration of API integration

def api_integration_demo():
print("=== Speech Recognition API Integration Demo ===\n ")

# Sample commands to recognize

commands_to_test = set(y)

for command in commands_to_test:

# Find an audio sample for this command
idx = np.where(y == command)[0][0]
audio = X[idx]

print(f"Processing audio for command: '{command}'")

print("Calling Speech-to-Text API...")

# Call simulated API

response = simulate_google_speech_api(audio)

# Process API response

if response and "results" in response and len(response["results"]) > 0:
transcript = response["results"][0]["alternatives"][0]["transcript"]
confidence = response["results"][0]["alternatives"][0]["confidence"]

print(f"API Result: '{transcript} ' (Confidence: {confidence:.2f})")

# Check if it's correct

if transcript == command:
print(" Correctly recognized!")
else:
print(" Incorrectly recognized.")
else:
print(" API returned no results.")

print("-" * 40)

print("\n API integration demo completed.")

# Run the API integration demo

api_integration_demo()

=== Speech Recognition API Integration Demo ===

Processing audio for command: 'left'

Calling Speech-to-Text API...
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 41ms/step
API Result: 'left' (Confidence: 0.93)
Correctly recognized!
----------------------------------------
Processing audio for command: 'up'
Calling Speech-to-Text API...
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 40ms/step
API Result: 'up' (Confidence: 0.64)
Correctly recognized!
----------------------------------------
Processing audio for command: 'down'
Calling Speech-to-Text API...
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 42ms/step
API Result: 'down' (Confidence: 0.99)
Correctly recognized!
----------------------------------------
Processing audio for command: 'right'
Calling Speech-to-Text API...
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 40ms/step
API Result: 'right' (Confidence: 1.00)
Correctly recognized!
----------------------------------------
Processing audio for command: 'no'
Calling Speech-to-Text API...
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 47ms/step
API Result: 'no' (Confidence: 0.51)
Correctly recognized!
----------------------------------------
Processing audio for command: 'yes'
Calling Speech-to-Text API...
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 41ms/step
API Result: 'yes' (Confidence: 1.00)
Correctly recognized!
----------------------------------------

API integration demo completed.

In [20]:

# Simplified DTW implementation for demonstration purposes

def simplified_dtw(s1, s2):
"""Simplified Dynamic Time Warping distance between two sequences"""
# Create a matrix of distances
n, m = len(s1), len(s2)
dtw_matrix = np.zeros((n+1, m+1))

# Initialize the DTW matrix

for i in range(n+1):
for j in range(m+1):
dtw_matrix[i, j] = float('inf')
dtw_matrix[0, 0] = 0

# Fill the DTW matrix

for i in range(1, n+1):
for j in range(1, m+1):
cost = np.abs(s1[i-1] - s2[j-1])
dtw_matrix[i, j] = cost + min(dtw_matrix[i-1, j], dtw_matrix[i, j-1], dtw_ma
trix[i-1, j-1])

return dtw_matrix[n, m]

# Simulate DTW-based speech recognition

def dtw_speech_recognition(audio, templates, labels, sr=16000):
"""Recognize speech using DTW against templates"""
# Extract a simplified feature (e.g., average energy in frequency bands)
n_bands = 10
S = np.abs(librosa.stft(audio))
energy_bands = librosa.feature.mfcc(S=S, sr=sr, n_mfcc=n_bands)
sample_feature = np.mean(energy_bands, axis=1)

# Compare with each template

distances = []
for template in templates:
S_template = np.abs(librosa.stft(template))
template_feature = np.mean(librosa.feature.mfcc(S=S_template, sr=sr, n_mfcc=n_ba
nds), axis=1)
distance = simplified_dtw(sample_feature, template_feature)
distances.append(distance)

# Find the best match

best_idx = np.argmin(distances)
best_distance = distances[best_idx]
recognized_label = labels[best_idx]

# Convert distance to confidence (inverse relationship)

max_distance = max(distances) if len(distances) > 1 else best_distance
confidence = 1 - (best_distance / max_distance) if max_distance > 0 else 0

return recognized_label, confidence, distances

In [21]:
# Create template examples for each command
templates = []
template_labels = []

for command in set(y):

# Find examples of this command (use the first 3 as templates)
indices = np.where(y == command)[0][:3]
for idx in indices:
templates.append(X[idx])
template_labels.append(command)

print(f"Created {len(templates)} templates for {len(set(y))} different commands")

# Test DTW-based recognition on a few samples

print("\n Testing DTW-based speech recognition:")
for i in range(5): # Test 5 random samples
idx = np.random.randint(0, len(X))
audio = X[idx]
true_command = y[idx]

# Recognize using DTW

dtw_command, dtw_confidence, _ = dtw_speech_recognition(audio, templates, template_l
abels)

# Recognize using neural network

nn_command, nn_confidence, _ = recognize_command(audio, best_model, label_encoder)

# Print results
print(f"Sample {i+1} (True: '{true_command}'):")
print(f" DTW Recognition: '{dtw_command}' (Confidence: {dtw_confidence:.2f})")
print(f" NN Recognition: '{nn_command} ' (Confidence: {nn_confidence:.2f})")
print("-" * 40)

Created 18 templates for 6 different commands

Testing DTW-based speech recognition:

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 39ms/step
Sample 1 (True: 'yes'):
DTW Recognition: 'yes' (Confidence: 0.78)
NN Recognition: 'yes' (Confidence: 1.00)
----------------------------------------
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 39ms/step
Sample 2 (True: 'down'):
DTW Recognition: 'left' (Confidence: 0.84)
NN Recognition: 'down' (Confidence: 0.85)
----------------------------------------
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 64ms/step
Sample 3 (True: 'no'):
DTW Recognition: 'left' (Confidence: 0.52)
NN Recognition: 'down' (Confidence: 0.30)
----------------------------------------
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 42ms/step
Sample 4 (True: 'left'):
DTW Recognition: 'no' (Confidence: 0.44)
NN Recognition: 'left' (Confidence: 0.94)
----------------------------------------
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 47ms/step
Sample 5 (True: 'no'):
DTW Recognition: 'up' (Confidence: 0.90)
NN Recognition: 'no' (Confidence: 1.00)
----------------------------------------
Experiment 7: Image Classification using CNN and RNN on
CIFAR-10 Dataset
In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models, Sequential
from tensorflow.keras.layers import Dense, Dropout, Conv2D, MaxPooling2D, Flatten
from tensorflow.keras.layers import LSTM, TimeDistributed, Reshape, BatchNormalization
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns

# Check TensorFlow version

print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")

TensorFlow version: 2.18.0

Keras version: 3.8.0

In [2]:

# Standard approach for CIFAR-10 (works without issues due to its relatively small size)
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Normalize pixel values to be between 0 and 1

X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0

# Convert class vectors to binary class matrices (one-hot encoding)

y_train_cat = to_categorical(y_train, 10)
y_test_cat = to_categorical(y_test, 10)

# Define class names for interpretability

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse',
'ship', 'truck']

print(f"Training data shape: {X_train.shape}")

print(f"Testing data shape: {X_test.shape}")

# Note: For very large datasets, you would use the streaming approach like this:
# Example code (not run, just for demonstration):
"""
from datasets import load_dataset

# Load dataset in streaming mode

dataset = load_dataset("some/large_dataset", split='train', streaming=True)

# Process the streaming dataset

processed_dataset = dataset.map(preprocess_function, batched=True)

# Create a data generator for model training

def data_generator():
for example in processed_dataset:
yield example['input_features'], example['label']
"""

print("\n Note: For CIFAR-10, streaming is unnecessary due to its manageable size,")
print("but for very large datasets, streaming mode helps avoid memory overflow errors.")

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 4s 0us/step
Training data shape: (50000, 32, 32, 3)
Testing data shape: (10000, 32, 32, 3)

Note: For CIFAR-10, streaming is unnecessary due to its manageable size,

but for very large datasets, streaming mode helps avoid memory overflow errors.

In [3]:
# Example of how to handle larger datasets with the datasets library
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# This function shows how you would approach training with a streaming dataset
def train_with_streaming_data(model, dataset_stream, batch_size=32, steps_per_epoch=100,
epochs=10):
"""
Train a model using a streaming dataset to avoid memory issues.

Parameters:
- model: Compiled Keras model
- dataset_stream: Streaming dataset
- batch_size: Batch size for training
- steps_per_epoch: Number of batches per epoch
- epochs: Number of epochs to train
"""
# This is a conceptual function - implementation would depend on your specific datase
t

for epoch in range(epochs):

print(f"Epoch {epoch+1}/{epochs}")
batch_count = 0
for _ in range(steps_per_epoch):
# In a real implementation, you would:
# 1. Collect a batch of examples from the stream
# 2. Preprocess the batch
# 3. Train on the batch
# model.train_on_batch(x_batch, y_batch)
batch_count += 1

print(f"Completed {batch_count} batches")

return model

print("For CIFAR-10, we use the standard approach, but the streaming method is valuable f
or massive datasets")

For CIFAR-10, we use the standard approach, but the streaming method is valuable for mass
ive datasets

3. Visualize Sample Images from the Dataset

In [4]:

# Plot some sample images

plt.figure(figsize=(10, 10))
for i in range(25):
plt.subplot(5, 5, i+1)
plt.xticks([])
plt.yticks([])
plt.grid(False)
plt.imshow(X_train[i])
plt.xlabel(class_names[y_train[i][0]])
plt.tight_layout()
plt.show()
In [5]:
# Define CNN model architecture
def create_cnn_model():
model = Sequential([
# First convolutional layer
Conv2D(32, (3, 3), activation='relu', padding='same', input_shape=(32, 32, 3)),
BatchNormalization(),
Conv2D(32, (3, 3), activation='relu', padding='same'),
BatchNormalization(),
MaxPooling2D((2, 2)),
Dropout(0.25),

# Second convolutional layer

Conv2D(64, (3, 3), activation='relu', padding='same'),
BatchNormalization(),
Conv2D(64, (3, 3), activation='relu', padding='same'),
BatchNormalization(),
MaxPooling2D((2, 2)),
Dropout(0.25),

# Third convolutional layer

Conv2D(128, (3, 3), activation='relu', padding='same'),
BatchNormalization(),
Conv2D(128, (3, 3), activation='relu', padding='same'),
BatchNormalization(),
MaxPooling2D((2, 2)),
Dropout(0.25),
# Flatten and dense layers
Flatten(),
Dense(512, activation='relu'),
BatchNormalization(),
Dropout(0.5),
Dense(10, activation='softmax')
])

return model

# Create and compile the CNN model

cnn_model = create_cnn_model()
cnn_model.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])

# Print model summary

cnn_model.summary()

/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107:
UserWarning: Do not pass an ìnput_shape`/ìnput_dim` argument to a layer. When using Seq
uential models, prefer using an Ìnput(shape)` object as the first layer in the model ins
tead.
super().__init__(activity_regularizer=activity_regularizer, **kwargs)

Model: "sequential"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D) │ (None, 32, 32, 32) │ 896 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization │ (None, 32, 32, 32) │ 128 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_1 (Conv2D) │ (None, 32, 32, 32) │ 9,248 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_1 │ (None, 32, 32, 32) │ 128 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d (MaxPooling2D) │ (None, 16, 16, 32) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout) │ (None, 16, 16, 32) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_2 (Conv2D) │ (None, 16, 16, 64) │ 18,496 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_2 │ (None, 16, 16, 64) │ 256 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_3 (Conv2D) │ (None, 16, 16, 64) │ 36,928 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_3 │ (None, 16, 16, 64) │ 256 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_1 (MaxPooling2D) │ (None, 8, 8, 64) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_1 (Dropout) │ (None, 8, 8, 64) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_4 (Conv2D) │ (None, 8, 8, 128) │ 73,856 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_4 │ (None, 8, 8, 128) │ 512 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_5 (Conv2D) │ (None, 8, 8, 128) │ 147,584 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_5 │ (None, 8, 8, 128) │ 512 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_2 (MaxPooling2D) │ (None, 4, 4, 128) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_2 (Dropout) │ (None, 4, 4, 128) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten) │ (None, 2048) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense) │ (None, 512) │ 1,049,088 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_6 │ (None, 512) │ 2,048 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_3 (Dropout) │ (None, 512) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense) │ (None, 10) │ 5,130 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 1,345,066 (5.13 MB)

Trainable params: 1,343,146 (5.12 MB)

Non-trainable params: 1,920 (7.50 KB)

In [ ]:
# Train the CNN model with data augmentation
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import gc # Garbage collection to manage memory

# Try to free up memory before starting training

gc.collect()

# Data augmentation for training

datagen = ImageDataGenerator(
rotation_range=15,
width_shift_range=0.1,
height_shift_range=0.1,
horizontal_flip=True
)
datagen.fit(X_train)

# Early stopping and learning rate reduction

early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, restore_b
est_weights=True)
reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5
, min_lr=1e-5)

try:
# Reduce batch size if you're having memory issues
batch_size = 32 # Reduced from 64 to try to avoid memory issues

# Train the model with augmented data

cnn_history = cnn_model.fit(
datagen.flow(X_train, y_train_cat, batch_size=batch_size),
validation_data=(X_test, y_test_cat),
epochs=50,
callbacks=[early_stopping, reduce_lr],
verbose=1
)

print("CNN model training completed successfully!")

except Exception as e:
print(f"Error during CNN model training: {e}")
print("\n Troubleshooting tips:")
print("1. Try reducing batch_size further")
print("2. Try reducing the model complexity")
print("3. Check if you have enough memory available")
# Try with a much smaller batch size and fewer epochs as a fallback
try:
print("\n Attempting training with smaller batch size...")
cnn_history = cnn_model.fit(
datagen.flow(X_train, y_train_cat, batch_size=16),
validation_data=(X_test, y_test_cat),
epochs=10, # Reduced epochs for faster completion
callbacks=[early_stopping, reduce_lr],
verbose=1
)
print("Fallback training completed successfully!")
except Exception as e2:
print(f"Fallback training also failed: {e2}")

In [7]:
# Prepare data for RNN (reshape images to sequences)
# For RNN, we'll treat each row of the image as a time step
X_train_rnn = X_train.reshape(X_train.shape[0], 32, 32*3) # 32 time steps, each with 32
*3 features
X_test_rnn = X_test.reshape(X_test.shape[0], 32, 32*3)

print(f"RNN training data shape: {X_train_rnn.shape}")

print(f"RNN testing data shape: {X_test_rnn.shape}")

RNN training data shape: (50000, 32, 96)

RNN testing data shape: (10000, 32, 96)

In [8]:
# Define RNN model architecture
def create_rnn_model():
model = Sequential([
# LSTM layers
LSTM(128, return_sequences=True, input_shape=(32, 32*3)),
Dropout(0.25),
LSTM(128),
Dropout(0.25),

# Output layer
Dense(128, activation='relu'),
Dropout(0.5),
Dense(10, activation='softmax')
])

return model

# Create and compile the RNN model

rnn_model = create_rnn_model()
rnn_model.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])

# Print model summary

rnn_model.summary()

/usr/local/lib/python3.11/dist-packages/keras/src/layers/rnn/rnn.py:200: UserWarning: Do
not pass an ìnput_shape`/ìnput_dim` argument to a layer. When using Sequential models,
prefer using an Ìnput(shape)` object as the first layer in the model instead.
super().__init__(**kwargs)

Model: "sequential_1"

Total params: 264,586 (1.01 MB)

Trainable params: 264,586 (1.01 MB)

Non-trainable params: 0 (0.00 B)

In [ ]:
# Train the RNN model
early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=5, restore_be
st_weights=True)

rnn_history = rnn_model.fit(
X_train_rnn, y_train_cat,
batch_size=128,
epochs=25,
validation_data=(X_test_rnn, y_test_cat),
callbacks=[early_stopping],
verbose=1
)

In [10]:
# Evaluate the CNN model
cnn_loss, cnn_accuracy = cnn_model.evaluate(X_test, y_test_cat, verbose=0)
print(f"CNN Test Accuracy: {cnn_accuracy:.4f}")

# Evaluate the RNN model

rnn_loss, rnn_accuracy = rnn_model.evaluate(X_test_rnn, y_test_cat, verbose=0)
print(f"RNN Test Accuracy: {rnn_accuracy:.4f}")

CNN Test Accuracy: 0.8768

RNN Test Accuracy: 0.6061

In [11]:
# Plot training history comparison
plt.figure(figsize=(12, 5))

# Plot accuracy
plt.subplot(1, 2, 1)
plt.plot(cnn_history.history['accuracy'], label='CNN Training Accuracy')
plt.plot(cnn_history.history['val_accuracy'], label='CNN Validation Accuracy')
plt.plot(rnn_history.history['accuracy'], label='RNN Training Accuracy')
plt.plot(rnn_history.history['val_accuracy'], label='RNN Validation Accuracy')
plt.title('Model Accuracy Comparison')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(loc='lower right')

# Plot loss
plt.subplot(1, 2, 2)
plt.plot(cnn_history.history['loss'], label='CNN Training Loss')
plt.plot(cnn_history.history['val_loss'], label='CNN Validation Loss')
plt.plot(rnn_history.history['loss'], label='RNN Training Loss')
plt.plot(rnn_history.history['val_loss'], label='RNN Validation Loss')
plt.title('Model Loss Comparison')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(loc='upper right')
plt.tight_layout()
plt.show()

In [12]:

# Generate predictions with the CNN model

cnn_predictions = cnn_model.predict(X_test)
cnn_predicted_classes = np.argmax(cnn_predictions, axis=1)
cnn_true_classes = np.argmax(y_test_cat, axis=1)

# Generate predictions with the RNN model

rnn_predictions = rnn_model.predict(X_test_rnn)
rnn_predicted_classes = np.argmax(rnn_predictions, axis=1)

# Display classification report for CNN

print("CNN Classification Report:")
print(classification_report(cnn_true_classes, cnn_predicted_classes, target_names=class_
names))

# Display classification report for RNN

print("RNN Classification Report:")
print(classification_report(cnn_true_classes, rnn_predicted_classes, target_names=class_
names))

313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step
CNN Classification Report:
precision recall f1-score support

airplane 0.91 0.89 0.90 1000

automobile 0.94 0.95 0.95 1000
bird 0.86 0.82 0.84 1000
cat 0.85 0.68 0.76 1000
deer 0.84 0.88 0.86 1000
dog 0.88 0.75 0.81 1000
frog 0.80 0.96 0.87 1000
horse 0.89 0.93 0.91 1000
ship 0.93 0.94 0.94 1000
truck 0.88 0.95 0.92 1000

accuracy 0.88 10000

macro avg 0.88 0.88 0.87 10000
weighted avg 0.88 0.88 0.87 10000

RNN Classification Report:

precision recall f1-score support

airplane 0.66 0.68 0.67 1000

automobile 0.73 0.75 0.74 1000
bird 0.52 0.41 0.46 1000
cat 0.43 0.42 0.42 1000
deer 0.55 0.49 0.52 1000
dog 0.49 0.49 0.49 1000
frog 0.58 0.73 0.65 1000
horse 0.66 0.68 0.67 1000
ship 0.73 0.76 0.74 1000
truck 0.68 0.67 0.67 1000

accuracy 0.61 10000

macro avg 0.60 0.61 0.60 10000
weighted avg 0.60 0.61 0.60 10000

In [13]:
# Plot confusion matrices
plt.figure(figsize=(16, 7))

# CNN confusion matrix

plt.subplot(1, 2, 1)
cnn_cm = confusion_matrix(cnn_true_classes, cnn_predicted_classes)
sns.heatmap(cnn_cm, annot=True, fmt='d', cmap='Blues', xticklabels=class_names, yticklab
els=class_names)
plt.title('CNN Confusion Matrix')
plt.ylabel('True Label')
plt.xlabel('Predicted Label')

# RNN confusion matrix

plt.subplot(1, 2, 2)
rnn_cm = confusion_matrix(cnn_true_classes, rnn_predicted_classes)
sns.heatmap(rnn_cm, annot=True, fmt='d', cmap='Blues', xticklabels=class_names, yticklab
els=class_names)
plt.title('RNN Confusion Matrix')
plt.ylabel('True Label')
plt.xlabel('Predicted Label')

plt.tight_layout()
plt.show()

In [14]:
# Display some sample predictions
def plot_sample_predictions(model_name, X, predictions, true_labels, indices=None):
if indices is None:
indices = np.random.randint(0, len(X), 15)

predicted_classes = np.argmax(predictions, axis=1)

plt.figure(figsize=(12, 10))
for i, idx in enumerate(indices):
plt.subplot(3, 5, i+1)
plt.imshow(X[idx])
plt.title(f"True: {class_names[true_labels[idx][0]]}\n Pred: {class_names[predict
ed_classes[idx]]}")
plt.axis('off')
if true_labels[idx][0] != predicted_classes[idx]:
plt.gca().set_title(plt.gca().get_title(), color='red')
plt.tight_layout()
plt.suptitle(f"{model_name} Sample Predictions", fontsize=16, y=1.02)
plt.show()

# Random indices for sample visualization

sample_indices = np.random.randint(0, len(X_test), 15)

# Plot CNN predictions

plot_sample_predictions("CNN", X_test, cnn_predictions, y_test, sample_indices)

# Plot RNN predictions

plot_sample_predictions("RNN", X_test, rnn_predictions, y_test, sample_indices)
Experiment 9: NLP -> Sentiment Analysis, Text Classification
and PCA Visualization
In [ ]:

%pip install nltk

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import re
import string
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer, PorterStemmer
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, LSTM, SpatialDropout1D, Dropout, B
idirectional
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.decomposition import PCA, TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
from sklearn.datasets import fetch_20newsgroups
import warnings

# Suppress warnings
warnings.filterwarnings('ignore')

# Download NLTK resources

try:
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')
except:
print("NLTK download failed, but we'll continue anyway")

# Check versions
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")
print(f"NLTK version: {nltk.__version__}")
print(f"Pandas version: {pd.__version__}")

In [2]:

def generate_synthetic_sentiment_data(n_samples=5000):
"""Generate synthetic text data with sentiment labels"""

# Define positive and negative sentiment word lists

positive_words = [
"good", "great", "excellent", "amazing", "wonderful", "best", "love", "happy",
"awesome", "fantastic", "pleasant", "delightful", "impressive", "joyful", "perfe
ct",
"beautiful", "enjoy", "exciting", "pleased", "recommend", "brilliant", "superb",
"outstanding", "exceptional", "terrific", "magnificent", "marvelous", "splendid"
]

negative_words = [
"bad", "terrible", "awful", "horrible", "worst", "hate", "disappointed", "poor",
"disappointing", "dislike", "mediocre", "ugly", "boring", "stupid", "waste",
"annoying", "useless", "unpleasant", "frustrating", "pathetic", "dreadful", "app
alling",
"disgusting", "disaster", "inferior", "dull", "unhappy", "awful"
]

neutral_words = [
"okay", "fine", "average", "decent", "acceptable", "fair", "satisfactory", "mode
rate",
"adequate", "reasonable", "neither", "mixed", "balanced", "so-so", "standard",
"ordinary", "regular", "typical", "usual", "common", "intermediate", "middle-of-
the-road"
]

# Define templates for various sentiment types

positive_templates = [
"This product is {positive}.",
"I {positive} this movie very much.",
"The service was {positive} and I would recommend it.",
"A {positive} experience overall.",
"This is a {positive} {item} that exceeded my expectations.",
"I'm {positive} with my purchase.",
"The {item} works {positive} and is worth the money.",
"I was {positive} by how well this {item} performed.",
"My experience with this {item} has been {positive}.",
"This {positive} {item} made me happy."
]

negative_templates = [
"This product is {negative}.",
"I {negative} this movie.",
"The service was {negative} and I would not recommend it.",
"A {negative} experience overall.",
"This is a {negative} {item} that failed to meet my expectations.",
"I'm {negative} with my purchase.",
"The {item} works {negative} and is not worth the money.",
"I was {negative} by how poorly this {item} performed.",
"My experience with this {item} has been {negative}.",
"This {negative} {item} made me unhappy."
]

neutral_templates = [
"This product is {neutral}.",
"I thought this movie was {neutral}.",
"The service was {neutral}.",
"A {neutral} experience overall.",
"This is a {neutral} {item} that met my basic expectations.",
"I'm {neutral} about my purchase.",
"The {item} works {neutral} for the price.",
"I was neither impressed nor disappointed by this {item}.",
"My experience with this {item} has been {neutral}.",
"This {neutral} {item} is just okay."
]

item_types = ["product", "book", "movie", "phone", "laptop", "TV", "tablet", "headph

ones",
"camera", "game", "experience", "service", "device", "gadget", "applia
nce"]

# Generate samples
texts = []
sentiments = []

for _ in range(n_samples):
sentiment = np.random.choice([0, 1, 2]) # 0: negative, 1: neutral, 2: positive
item = np.random.choice(item_types)

if sentiment == 0: # Negative
template = np.random.choice(negative_templates)
sentiment_word = np.random.choice(negative_words)
text = template.format(negative=sentiment_word, item=item)
elif sentiment == 1: # Neutral
template = np.random.choice(neutral_templates)
sentiment_word = np.random.choice(neutral_words)
text = template.format(neutral=sentiment_word, item=item)
else: # Positive
template = np.random.choice(positive_templates)
sentiment_word = np.random.choice(positive_words)
text = template.format(positive=sentiment_word, item=item)

texts.append(text)
sentiments.append(sentiment)

# Create DataFrame
data = pd.DataFrame({
'text': texts,
'sentiment': sentiments
})

# Convert sentiment to categorical labels

data['sentiment_label'] = data['sentiment'].map({0: 'negative', 1: 'neutral', 2: 'po
sitive'})

return data

# Generate synthetic sentiment data

sentiment_data = generate_synthetic_sentiment_data(n_samples=5000)

# Display the first few rows

print(f"Generated {len(sentiment_data)} sentiment samples")
sentiment_data.head()

Generated 5000 sentiment samples

Out[2]:

text sentiment sentiment_label

0 This is a waste product that failed to meet my... 0 negative

1 A disaster experience overall. 0 negative

2 I'm superb with my purchase. 2 positive

3 My experience with this service has been usual. 1 neutral

4 My experience with this headphones has been un... 0 negative

In [3]:

# Check the distribution of sentiments

plt.figure(figsize=(10, 6))
sns.countplot(x='sentiment_label', data=sentiment_data, palette='viridis')
plt.title('Distribution of Sentiment Classes')
plt.xlabel('Sentiment')
plt.ylabel('Count')
plt.xticks(rotation=0)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()

# Calculate percentages
sentiment_counts = sentiment_data['sentiment_label'].value_counts(normalize=True) * 100
print("Sentiment distribution percentages:")
for sentiment, percentage in sentiment_counts.items():
print(f"{sentiment}: {percentage:.2f}%")
Sentiment distribution percentages:
neutral: 33.92%
positive: 33.52%
negative: 32.56%

In [4]:
# Download NLTK resources first
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')

# Visualize word frequency by sentiment

def plot_word_freq_by_sentiment(data, top_n=15):
"""Plot the most frequent words for each sentiment class"""
plt.figure(figsize=(18, 15))

for i, sentiment in enumerate(['negative', 'neutral', 'positive']):

# Filter by sentiment
texts = data[data['sentiment_label'] == sentiment]['text']

# Tokenize and count words using a simpler approach

all_words = []
for text in texts:
# Convert to lowercase and tokenize using a simpler approach
# Split by whitespace and remove punctuation
text = text.lower()
# Remove punctuation
for punct in string.punctuation:
text = text.replace(punct, ' ')
# Split into words
words = text.split()
# Remove stopwords and short words
words = [word for word in words if word not in stopwords.words('english')
and len(word) > 2]
all_words.extend(words)

# Count frequency
word_freq = pd.Series(all_words).value_counts().head(top_n)

# Plot
plt.subplot(3, 1, i+1)
sns.barplot(x=word_freq.values, y=word_freq.index, palette='viridis')
plt.title(f'Top {top_n} Words in {sentiment.capitalize()} Reviews')
plt.xlabel('Frequency')
plt.ylabel('Words')

plt.tight_layout()
plt.show()
# Visualize word frequencies
plot_word_freq_by_sentiment(sentiment_data)

[nltk_data] Downloading package punkt to /usr/share/nltk_data...

[nltk_data] Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /usr/share/nltk_data...
[nltk_data] Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /usr/share/nltk_data...
[nltk_data] Package wordnet is already up-to-date!

In [5]:
def preprocess_text(text, remove_stopwords=True, lemmatize=True):
"""Preprocess text for NLP tasks"""
# Convert to lowercase
text = text.lower()

# Remove URLs
text = re.sub(r'http\S+|www\S+|https\S+', '', text, flags=re.MULTILINE)

# Remove special characters & numbers

text = re.sub(r'[^\w\s]', '', text)
text = re.sub(r'\d+', '', text)

# Tokenize
tokens = word_tokenize(text)

# Remove stopwords if requested

if remove_stopwords:
stop_words = set(stopwords.words('english'))
tokens = [word for word in tokens if word not in stop_words]
# Lemmatize if requested
if lemmatize:
lemmatizer = WordNetLemmatizer()
tokens = [lemmatizer.lemmatize(word) for word in tokens]

# Rejoin into string

preprocessed_text = ' '.join(tokens)

return preprocessed_text

# Apply preprocessing to the sentiment data

sentiment_data['processed_text'] = sentiment_data['text'].apply(preprocess_text)

# Display examples of original vs processed text

comparison = sentiment_data[['text', 'processed_text', 'sentiment_label']].head(5)
print("Original vs. Processed Text Examples:")
for i, row in comparison.iterrows():
print(f"\n Sentiment: {row['sentiment_label']}")
print(f"Original: {row['text']}")
print(f"Processed: {row['processed_text']}")

Original vs. Processed Text Examples:

Sentiment: negative
Original: This is a waste product that failed to meet my expectations.
Processed: waste product failed meet expectation

Sentiment: negative
Original: A disaster experience overall.
Processed: disaster experience overall

Sentiment: positive
Original: I'm superb with my purchase.
Processed: im superb purchase

Sentiment: neutral
Original: My experience with this service has been usual.
Processed: experience service usual

Sentiment: negative
Original: My experience with this headphones has been unhappy.
Processed: experience headphone unhappy

In [6]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
sentiment_data['processed_text'],
sentiment_data['sentiment'],
test_size=0.2,
random_state=42
)

# Create feature extractors

# Bag of Words
bow_vectorizer = CountVectorizer(max_features=5000)
bow_train = bow_vectorizer.fit_transform(X_train)
bow_test = bow_vectorizer.transform(X_test)

# TF-IDF
tfidf_vectorizer = TfidfVectorizer(max_features=5000)
tfidf_train = tfidf_vectorizer.fit_transform(X_train)
tfidf_test = tfidf_vectorizer.transform(X_test)

print(f"Bag of Words training shape: {bow_train.shape} ")

print(f"TF-IDF training shape: {tfidf_train.shape}")

Bag of Words training shape: (4000, 113)

TF-IDF training shape: (4000, 113)

In [7]:
def train_evaluate_ml_model(model, X_train, X_test, y_train, y_test, model_name):
"""Train and evaluate a machine learning model"""
# Train the model
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Calculate metrics
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred, target_names=['Negative', 'Neutral',
'Positive'])

# Print results
print(f"Model: {model_name} ")
print(f"Accuracy: {accuracy:.4f}")
print("Classification Report:")
print(report)

# Confusion Matrix
cm = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
xticklabels=['Negative', 'Neutral', 'Positive'],
yticklabels=['Negative', 'Neutral', 'Positive'])
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title(f'Confusion Matrix - {model_name} ')
plt.tight_layout()
plt.show()

return model, accuracy, report

# Train and evaluate Naive Bayes with Bag of Words

nb_bow = MultinomialNB()
nb_bow_results = train_evaluate_ml_model(nb_bow, bow_train, bow_test, y_train, y_test, "
Naive Bayes (Bag of Words)")

# Train and evaluate Logistic Regression with TF-IDF

lr_tfidf = LogisticRegression(max_iter=1000, C=1.0, solver='liblinear', multi_class='ovr'
)
lr_tfidf_results = train_evaluate_ml_model(lr_tfidf, tfidf_train, tfidf_test, y_train, y
_test, "Logistic Regression (TF-IDF)")

Model: Naive Bayes (Bag of Words)

Accuracy: 0.9910
Classification Report:
precision recall f1-score support

Negative 1.00 0.97 0.98 304

Neutral 0.97 1.00 0.99 351
Positive 1.00 1.00 1.00 345

accuracy 0.99 1000

macro avg 0.99 0.99 0.99 1000
weighted avg 0.99 0.99 0.99 1000
Model: Logistic Regression (TF-IDF)
Accuracy: 1.0000
Classification Report:
precision recall f1-score support

Negative 1.00 1.00 1.00 304

Neutral 1.00 1.00 1.00 351
Positive 1.00 1.00 1.00 345

accuracy 1.00 1000

macro avg 1.00 1.00 1.00 1000
weighted avg 1.00 1.00 1.00 1000
In [8]:
def visualize_feature_importance(model, vectorizer, class_labels=['Negative', 'Neutral',
'Positive']):
"""Visualize the most important features for each class"""
# Get feature names
feature_names = vectorizer.get_feature_names_out()

# If model is Logistic Regression, we can get coefficients

if hasattr(model, 'coef_'):
coefficients = model.coef_

plt.figure(figsize=(15, 12))
for i, class_name in enumerate(class_labels):
# Top positive coefficients for this class
top_positive = np.argsort(coefficients[i])[-10:]
top_pos_coeffs = coefficients[i][top_positive]
top_pos_features = [feature_names[j] for j in top_positive]

# Top negative coefficients for this class

top_negative = np.argsort(coefficients[i])[:10]
top_neg_coeffs = coefficients[i][top_negative]
top_neg_features = [feature_names[j] for j in top_negative]

# Plot
plt.subplot(3, 1, i+1)

# Plot positive
plt.barh(range(len(top_pos_features)), top_pos_coeffs, color='forestgreen')
plt.yticks(range(len(top_pos_features)), top_pos_features)

# Plot negative (flip the order for better visualization)

plt.barh(range(len(top_pos_features), len(top_pos_features) + len(top_neg_fe
atures)),
top_neg_coeffs, color='crimson')
plt.yticks(range(len(top_pos_features), len(top_pos_features) + len(top_neg_
features)),
top_neg_features)

plt.title(f'Most Important Features for {class_name} Class')

plt.xlabel('Coefficient Value')

plt.tight_layout()
plt.show()
else:
print("This model type doesn't support feature importance visualization.")

# Visualize feature importance for the Logistic Regression model

visualize_feature_importance(lr_tfidf_results[0], tfidf_vectorizer)
In [9]:
# Prepare text data for deep learning
max_features = 5000 # Top words to consider
maxlen = 100 # Max sequence length

# Tokenize text
tokenizer = Tokenizer(num_words=max_features)
tokenizer.fit_on_texts(X_train)

X_train_seq = tokenizer.texts_to_sequences(X_train)
X_test_seq = tokenizer.texts_to_sequences(X_test)

# Pad sequences to ensure uniform length

X_train_pad = pad_sequences(X_train_seq, maxlen=maxlen)
X_test_pad = pad_sequences(X_test_seq, maxlen=maxlen)

# Convert labels to categorical for multi-class classification

y_train_cat = tf.keras.utils.to_categorical(y_train, num_classes=3)
y_test_cat = tf.keras.utils.to_categorical(y_test, num_classes=3)

print(f"Training sequence shape: {X_train_pad.shape}")

print(f"Testing sequence shape: {X_test_pad.shape}")

Training sequence shape: (4000, 100)

Testing sequence shape: (1000, 100)

In [10]:
# Build LSTM model for sentiment analysis
def create_lstm_model():
model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen))
model.add(SpatialDropout1D(0.2))
model.add(Bidirectional(LSTM(64, dropout=0.2, recurrent_dropout=0.2)))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(3, activation='softmax')) # 3 classes: negative, neutral, positive

model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
return model

# Create and train the model

lstm_model = create_lstm_model()
lstm_model.summary()

# Callbacks
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, restor
e_best_weights=True)
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.2, patienc
e=2, min_lr=0.001)

# Train with validation split

history = lstm_model.fit(
X_train_pad, y_train_cat,
epochs=15,
batch_size=128,
validation_split=0.1,
callbacks=[early_stopping, reduce_lr],
verbose=1
)

I0000 00:00:1748109294.386137 35 gpu_device.cc:2022] Created device /job:localhost/r

eplica:0/task:0/device:GPU:0 with 15513 MB memory: -> device: 0, name: Tesla P100-PCIE-1
6GB, pci bus id: 0000:00:04.0, compute capability: 6.0

Model: "sequential"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ embedding (Embedding) │ ? │ 0 (unbuilt) │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ spatial_dropout1d (SpatialDropout1D) │ ? │ 0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ bidirectional (Bidirectional) │ ? │ 0 (unbuilt) │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense (Dense) │ ? │ 0 (unbuilt) │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout (Dropout) │ ? │ 0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense) │ ? │ 0 (unbuilt) │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘

Total params: 0 (0.00 B)

Trainable params: 0 (0.00 B)

Non-trainable params: 0 (0.00 B)

Epoch 1/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 26s 434ms/step - accuracy: 0.3771 - loss: 1.0906 - val_accurac
y: 0.9425 - val_loss: 0.9523 - learning_rate: 0.0010
Epoch 2/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 391ms/step - accuracy: 0.7555 - loss: 0.8183 - val_accurac
y: 1.0000 - val_loss: 0.2278 - learning_rate: 0.0010
Epoch 3/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 388ms/step - accuracy: 0.9556 - loss: 0.1988 - val_accurac
y: 1.0000 - val_loss: 0.0035 - learning_rate: 0.0010
Epoch 4/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 394ms/step - accuracy: 0.9973 - loss: 0.0202 - val_accurac
y: 1.0000 - val_loss: 3.5055e-04 - learning_rate: 0.0010
Epoch 5/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 385ms/step - accuracy: 0.9988 - loss: 0.0079 - val_accurac
y: 1.0000 - val_loss: 1.2586e-04 - learning_rate: 0.0010
Epoch 6/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 387ms/step - accuracy: 1.0000 - loss: 0.0040 - val_accurac
y: 1.0000 - val_loss: 8.2057e-05 - learning_rate: 0.0010
Epoch 7/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 387ms/step - accuracy: 1.0000 - loss: 0.0040 - val_accurac
y: 1.0000 - val_loss: 3.6909e-05 - learning_rate: 0.0010
Epoch 8/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 378ms/step - accuracy: 0.9990 - loss: 0.0029 - val_accurac
y: 1.0000 - val_loss: 2.3085e-05 - learning_rate: 0.0010
Epoch 9/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 389ms/step - accuracy: 1.0000 - loss: 0.0019 - val_accurac
y: 1.0000 - val_loss: 2.9360e-05 - learning_rate: 0.0010
Epoch 10/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 380ms/step - accuracy: 1.0000 - loss: 9.8354e-04 - val_acc
uracy: 1.0000 - val_loss: 1.0132e-05 - learning_rate: 0.0010
Epoch 11/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 378ms/step - accuracy: 1.0000 - loss: 9.8051e-04 - val_acc
uracy: 1.0000 - val_loss: 1.5896e-05 - learning_rate: 0.0010
Epoch 12/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 388ms/step - accuracy: 1.0000 - loss: 9.5763e-04 - val_acc
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 388ms/step - accuracy: 1.0000 - loss: 9.5763e-04 - val_acc
uracy: 1.0000 - val_loss: 7.4489e-06 - learning_rate: 0.0010
Epoch 13/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 377ms/step - accuracy: 1.0000 - loss: 0.0011 - val_accurac
y: 1.0000 - val_loss: 8.0250e-06 - learning_rate: 0.0010
Epoch 14/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 385ms/step - accuracy: 0.9997 - loss: 0.0011 - val_accurac
y: 1.0000 - val_loss: 9.1459e-06 - learning_rate: 0.0010
Epoch 15/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 392ms/step - accuracy: 1.0000 - loss: 8.8211e-04 - val_acc
uracy: 1.0000 - val_loss: 4.0683e-06 - learning_rate: 0.0010

In [11]:
# Plot training history
plt.figure(figsize=(12, 5))

# Plot accuracy
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.grid(True, linestyle='--', alpha=0.6)

# Plot loss
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.grid(True, linestyle='--', alpha=0.6)

plt.tight_layout()
plt.show()

In [12]:
# Evaluate LSTM model
lstm_pred = lstm_model.predict(X_test_pad)
y_pred_classes = np.argmax(lstm_pred, axis=1)
y_test_classes = np.argmax(y_test_cat, axis=1)

# Print classification report

lstm_accuracy = accuracy_score(y_test_classes, y_pred_classes)
lstm_report = classification_report(y_test_classes, y_pred_classes, target_names=['Negat
ive', 'Neutral', 'Positive'])
print(f"LSTM Model Accuracy: {lstm_accuracy:.4f}")
print("Classification Report:")
print(lstm_report)

# Plot confusion matrix

cm = confusion_matrix(y_test_classes, y_pred_classes)
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
xticklabels=['Negative', 'Neutral', 'Positive'],
yticklabels=['Negative', 'Neutral', 'Positive'])
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title('Confusion Matrix - LSTM Model')
plt.tight_layout()
plt.show()

32/32 ━━━━━━━━━━━━━━━━━━━━ 4s 101ms/step

LSTM Model Accuracy: 1.0000
Classification Report:
precision recall f1-score support

Negative 1.00 1.00 1.00 304

Neutral 1.00 1.00 1.00 351
Positive 1.00 1.00 1.00 345

accuracy 1.00 1000

macro avg 1.00 1.00 1.00 1000
weighted avg 1.00 1.00 1.00 1000

In [13]:

# Compare model performances

models = {
'Naive Bayes (BoW)': nb_bow_results[1],
'Logistic Regression (TF-IDF)': lr_tfidf_results[1],
'LSTM': lstm_accuracy
}

# Plot comparison
plt.figure(figsize=(10, 6))
bars = plt.bar(models.keys(), models.values(), color=['skyblue', 'lightgreen', 'coral'])
plt.title('Model Accuracy Comparison')
plt.xlabel('Model')
plt.ylabel('Accuracy')
plt.ylim(0, 1.0)
plt.grid(axis='y', linestyle='--', alpha=0.7)

# Add accuracy values on bars

for bar in bars:
height = bar.get_height()
plt.text(bar.get_x() + bar.get_width()/2., height + 0.01,
f'{height:.4f} ', ha='center', va='bottom')

plt.tight_layout()
plt.show()

In [14]:

# Load a subset of the 20 Newsgroups dataset

# For simplicity, we'll use 4 categories
categories = ['alt.atheism', 'sci.electronics', 'rec.sport.hockey', 'talk.politics.guns']

# Load training data

newsgroups_train = fetch_20newsgroups(subset='train', categories=categories, shuffle=Tru
e, random_state=42)
newsgroups_test = fetch_20newsgroups(subset='test', categories=categories, shuffle=True,
random_state=42)

print(f"Training data size: {len(newsgroups_train.data)}")

print(f"Testing data size: {len(newsgroups_test.data)}")
print(f"Categories: {newsgroups_train.target_names}")

Training data size: 2217

Testing data size: 1475
Categories: ['alt.atheism', 'rec.sport.hockey', 'sci.electronics', 'talk.politics.guns']

In [15]:
# Preprocess the newsgroups data
# Apply more aggressive preprocessing for this dataset
def preprocess_newsgroups(text):
# Convert to lowercase
text = text.lower()

# Remove headers, footers, and quotes which are common in newsgroups

text = re.sub(r'from:\s.*\n', '', text)
text = re.sub(r'subject:\s.*\n', '', text)
text = re.sub(r'\>.*\n', '', text) # quoted text

# Remove URLs and email addresses

text = re.sub(r'http\S+|www\S+|https\S+', '', text, flags=re.MULTILINE)
text = re.sub(r'\S+@\S+', '', text)

# Remove special characters & numbers

text = re.sub(r'[^\w\s]', '', text)
text = re.sub(r'\d+', '', text)

# Remove extra whitespace

text = re.sub(r'\s+', ' ', text).strip()

return text

# Process training and testing data

train_data_processed = [preprocess_newsgroups(text) for text in newsgroups_train.data]
test_data_processed = [preprocess_newsgroups(text) for text in newsgroups_test.data]

# Display a sample
print("Original text sample:")
print(newsgroups_train.data[0][:500] + "...")
print("\n Processed text sample:")
print(train_data_processed[0][:500] + "...")

Original text sample:

From: shah@pitt.edu (Ravindra S Shah)
Subject: Re: Nords 3 - Habs 2 in O.T. We was robbed!!
Lines: 23
X-Newsreader: TIN [version 1.1 PL8]

Deepak Chhabra (dchhabra@stpl.ists.ca) wrote:

: Speaking of great players, man-oh-man can Quebec skate. I haven't seen a

: team so potent on the rush in a long time. Watching them break out of their
: zone, especially Sundin, is a treat to watch. They remind me of the Red
: Army.

: dchhabra@stpl.ists.ca (pissed-off Habs fan)

Yeah, the Nords look like...

Processed text sample:

lines xnewsreader tin version pl deepak chhabra wrote speaking of great players manohman
can quebec skate i havent seen a team so potent on the rush in a long time watching them
break out of their zone especially sundin is a treat to watch they remind me of the red a
rmy pissedoff habs fan yeah the nords look like theyre going to be goodbut excuse the bia
s have you ever watched the pens on a rushdont answer everyone has seen this footage near
the end of the season when the pens played the nords i...

In [16]:
# Extract features using TF-IDF
tfidf_vectorizer = TfidfVectorizer(max_features=5000, stop_words='english')
X_train_tfidf = tfidf_vectorizer.fit_transform(train_data_processed)
X_test_tfidf = tfidf_vectorizer.transform(test_data_processed)

# Target labels
y_train = newsgroups_train.target
y_test = newsgroups_test.target

print(f"TF-IDF features shape: {X_train_tfidf.shape}")

TF-IDF features shape: (2217, 5000)

In [17]:

# Train a classifier
classifier = LogisticRegression(max_iter=1000, C=10.0, solver='liblinear', multi_class='
ovr')
classifier.fit(X_train_tfidf, y_train)

# Make predictions
y_pred = classifier.predict(X_test_tfidf)

# Evaluate
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred, target_names=categories)

print(f"Logistic Regression Accuracy: {accuracy:.4f}")

print("Classification Report:")
print(report)

Logistic Regression Accuracy: 0.9003

Classification Report:
precision recall f1-score support

alt.atheism 0.88 0.84 0.86 319

sci.electronics 0.95 0.93 0.94 399
rec.sport.hockey 0.87 0.93 0.90 393
talk.politics.guns 0.91 0.88 0.89 364

accuracy 0.90 1475

macro avg 0.90 0.90 0.90 1475
weighted avg 0.90 0.90 0.90 1475

In [18]:

# Apply dimensionality reduction with TruncatedSVD (for sparse matrices)

# TruncatedSVD is used as PCA equivalent for sparse matrices
svd = TruncatedSVD(n_components=2, random_state=42)
X_train_2d = svd.fit_transform(X_train_tfidf)

# Get category names for plotting

category_names = newsgroups_train.target_names

# Visualize the data in 2D

plt.figure(figsize=(12, 10))
colors = ['red', 'blue', 'green', 'purple']
for i, category in enumerate(categories):
# Get indices for this category
indices = y_train == i

# Plot points for this category

plt.scatter(X_train_2d[indices, 0], X_train_2d[indices, 1],
c=colors[i], label=category, alpha=0.7, s=50)

plt.title('PCA Visualization of 20 Newsgroups Categories')

plt.xlabel(f'Principal Component 1 (Explained Variance: {svd.explained_variance_ratio_[0]
:.2f})')
plt.ylabel(f'Principal Component 2 (Explained Variance: {svd.explained_variance_ratio_[1]
:.2f})')
plt.legend(loc='best')
plt.grid(True, linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()

# Print total explained variance

total_var = sum(svd.explained_variance_ratio_)
print(f"Total explained variance by 2 components: {total_var:.4f} or {total_var*100:.2f}%
")
Total explained variance by 2 components: 0.0092 or 0.92%

In [19]:

from sklearn.manifold import TSNE

# First reduce dimensionality with TruncatedSVD to make t-SNE computationally feasible

svd_50 = TruncatedSVD(n_components=50, random_state=42)
X_train_50d = svd_50.fit_transform(X_train_tfidf)

# Apply t-SNE to the reduced data

tsne = TSNE(n_components=2, perplexity=40, n_iter=300, random_state=42)
X_train_tsne = tsne.fit_transform(X_train_50d)

# Visualize t-SNE results

plt.figure(figsize=(12, 10))

for i, category in enumerate(categories):

# Get indices for this category
indices = y_train == i

# Plot points for this category

plt.scatter(X_train_tsne[indices, 0], X_train_tsne[indices, 1],
c=colors[i], label=category, alpha=0.7, s=50)

plt.title('t-SNE Visualization of 20 Newsgroups Categories')

plt.xlabel('t-SNE Component 1')
plt.ylabel('t-SNE Component 2')
plt.legend(loc='best')
plt.grid(True, linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()
In [20]:

# Function to classify new text examples

def classify_new_text(text, model, vectorizer, label_names):
"""Classify a new text example using the trained model"""
# Preprocess text
processed_text = preprocess_newsgroups(text)

# Vectorize
text_tfidf = vectorizer.transform([processed_text])

# Get prediction and probabilities

prediction = model.predict(text_tfidf)[0]
probabilities = model.predict_proba(text_tfidf)[0]

# Sort probabilities in descending order

sorted_indices = np.argsort(probabilities)[::-1]

# Create results
result = {
'predicted_class': label_names[prediction],
'prediction_index': prediction,
'probabilities': []
}

# Add sorted probabilities

for idx in sorted_indices:
result['probabilities'].append({
'class': label_names[idx],
'probability': probabilities[idx]
})

return result
# Example texts to classify
test_examples = [
"I think we need stricter regulations on firearms to prevent violence.",
"The semiconductor industry is evolving with new transistor technologies.",
"Last night's hockey game was amazing with multiple goals in overtime.",
"Religious beliefs should not influence public policy decisions."
]

# Classify each example

for i, example in enumerate(test_examples):
result = classify_new_text(example, classifier, tfidf_vectorizer, categories)

print(f"\n Example {i+1}: {example}")

print(f"Predicted class: {result['predicted_class']} ")
print("Probabilities:")
for prob in result['probabilities']:
print(f" {prob['class']}: {prob['probability']:.4f}")

Example 1: I think we need stricter regulations on firearms to prevent violence.

Predicted class: talk.politics.guns
Probabilities:
talk.politics.guns: 0.5926
rec.sport.hockey: 0.1610
sci.electronics: 0.1545
alt.atheism: 0.0920

Example 2: The semiconductor industry is evolving with new transistor technologies.

Predicted class: rec.sport.hockey
Probabilities:
rec.sport.hockey: 0.6343
sci.electronics: 0.2048
alt.atheism: 0.0967
talk.politics.guns: 0.0641

Example 3: Last night's hockey game was amazing with multiple goals in overtime.
Predicted class: sci.electronics
Probabilities:
sci.electronics: 0.9507
alt.atheism: 0.0189
talk.politics.guns: 0.0163
rec.sport.hockey: 0.0141

Example 4: Religious beliefs should not influence public policy decisions.

Predicted class: alt.atheism
Probabilities:
alt.atheism: 0.7303
talk.politics.guns: 0.1093
rec.sport.hockey: 0.0850
sci.electronics: 0.0754
Experiment 10: Stock Market Prediction using LSTM
In [ ]:

%pip install yfinance

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import yfinance as yf # For fetching stock data
from datetime import datetime, timedelta
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout, GRU, Bidirectional
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
import math
import warnings

# Suppress warnings
warnings.filterwarnings('ignore')

# Set style for plots

sns.set_style('whitegrid')
plt.style.use("fivethirtyeight")

# Check versions
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")
print(f"Pandas version: {pd.__version__}")
print(f"NumPy version: {np.__version__}")

In [4]:
# Define the stock tickers and time period
tickers = ['AAPL', 'MSFT', 'GOOGL', 'AMZN']
start_date = '2018-01-01'
end_date = datetime.now().strftime('%Y-%m-%d ')

# Function to fetch stock data

def get_stock_data(ticker, start, end):
try:
data = yf.download(ticker, start=start, end=end)
data.reset_index(inplace=True)
return data
except Exception as e:
print(f"Error fetching data for {ticker}: {e}")
# Create synthetic data if fetch fails
return generate_synthetic_stock_data(ticker, start, end)

# Function to generate synthetic stock data if API fetch fails

def generate_synthetic_stock_data(ticker, start_date, end_date):
# Convert dates to datetime
start = pd.to_datetime(start_date)
end = pd.to_datetime(end_date)

# Generate date range

date_range = pd.date_range(start=start, end=end, freq='B') # 'B' for business days

# Create random starting price based on ticker

if ticker == 'AAPL':
base_price = 150
elif ticker == 'MSFT':
base_price = 250
elif ticker == 'GOOGL':
base_price = 1200
elif ticker == 'AMZN':
base_price = 1800
else:
base_price = 100

# Generate random walk for prices with upward trend

n_days = len(date_range)
noise = np.random.normal(0, 1, n_days)
trend = np.linspace(0, 30, n_days) # Upward trend
seasonality = 10 * np.sin(np.linspace(0, 15 * np.pi, n_days)) # Seasonal pattern

# Combine components
prices = base_price + trend + seasonality + np.cumsum(noise)

# Create volume data

volume = np.random.randint(1000000, 10000000, n_days)

# Create DataFrame
df = pd.DataFrame({
'Date': date_range,
'Open': prices * np.random.uniform(0.99, 1.01, n_days),
'High': prices * np.random.uniform(1.01, 1.03, n_days),
'Low': prices * np.random.uniform(0.97, 0.99, n_days),
'Close': prices,
'Adj Close': prices,
'Volume': volume
})

print(f"Generated synthetic data for {ticker}")

return df

# Fetch data for each ticker

stock_data = {}
for ticker in tickers:
stock_data[ticker] = get_stock_data(ticker, start_date, end_date)
print(f"Retrieved {len(stock_data[ticker])} days of data for {ticker}")

# Select one stock for detailed analysis (Apple)

selected_stock = 'AAPL'
df = stock_data[selected_stock].copy()

# Display the first few rows

df.head()

YF.download() has changed argument auto_adjust default to True

[*********************100%***********************] 1 of 1 completed

Retrieved 1854 days of data for AAPL

[*********************100%***********************] 1 of 1 completed

Retrieved 1854 days of data for MSFT

[*********************100%***********************] 1 of 1 completed

Retrieved 1854 days of data for GOOGL

[*********************100%***********************] 1 of 1 completed

Retrieved 1854 days of data for AMZN

Out[4]:

Price Date Close High Low Open Volume

Ticker AAPL AAPL AAPL AAPL AAPL

0 2018-01-02 40.426826 40.436216 39.722772 39.933990 102223600

1 2018-01-03 40.419792 40.964263 40.356430 40.490198 118071600

1 2018-01-03 40.419792 40.964263 40.356430 40.490198 118071600
Price Date Close High Low Open Volume
2 2018-01-04 40.607540 40.710802 40.384590 40.492543 89738400
Ticker AAPL AAPL AAPL AAPL AAPL
3 2018-01-05 41.069859 41.156691 40.612224 40.703751 94640000

4 2018-01-08 40.917324 41.213026 40.818753 40.917324 82271200

In [5]:
# Check basic statistics
print(f"\n Statistics for {selected_stock}:")
print(df.describe())

# Check for missing values

print(f"\n Missing values in {selected_stock} data:")
print(df.isnull().sum())

# Plot the stock's closing price history

plt.figure(figsize=(16, 8))
plt.title(f'{selected_stock} Stock Price History')
plt.plot(df['Date'], df['Close'])
plt.xlabel('Date', fontsize=14)
plt.ylabel('Close Price (USD)', fontsize=14)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

Statistics for AAPL:

Price Date Close High Low \
Ticker AAPL AAPL AAPL
count 1854 1854.000000 1854.000000 1854.000000
mean 2021-09-07 07:34:22.135922176 127.071460 128.370795 125.623331
min 2018-01-02 00:00:00 33.870842 34.711717 33.825582
25% 2019-11-04 06:00:00 58.870005 60.070647 58.003616
50% 2021-09-07 12:00:00 137.429787 139.645323 135.496434
75% 2023-07-12 18:00:00 172.366409 173.933534 170.980190
max 2025-05-16 00:00:00 258.396667 259.474086 257.010028
std NaN 61.769196 62.322379 61.115449

Price Open Volume

Ticker AAPL AAPL
count 1854.000000 1.854000e+03
mean 126.938500 9.807716e+07
min 34.297233 2.323470e+07
25% 58.957534 6.014090e+07
50% 137.397325 8.470835e+07
75% 172.219155 1.189786e+08
max 257.568678 4.265100e+08
std 61.692889 5.489307e+07

Missing values in AAPL data:

Price Ticker
Date 0
Close AAPL 0
High AAPL 0
Low AAPL 0
Open AAPL 0
Volume AAPL 0
dtype: int64
In [6]:
# Create additional plots to understand the data better

# Plot volume over time

plt.figure(figsize=(16, 6))
plt.plot(df['Date'], df['Volume'], color='orange')
plt.title(f'{selected_stock} Trading Volume')
plt.xlabel('Date', fontsize=14)
plt.ylabel('Volume', fontsize=14)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Plot daily price change

df['Daily_Return'] = df['Close'].pct_change() * 100
plt.figure(figsize=(16, 6))
plt.plot(df['Date'], df['Daily_Return'], color='green')
plt.title(f'{selected_stock} Daily Returns (%)')
plt.xlabel('Date', fontsize=14)
plt.ylabel('Daily Return (%)', fontsize=14)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Plot moving averages

df['MA50'] = df['Close'].rolling(window=50).mean()
df['MA200'] = df['Close'].rolling(window=200).mean()

plt.figure(figsize=(16, 8))
plt.plot(df['Date'], df['Close'], label='Close Price')
plt.plot(df['Date'], df['MA50'], label='50-day MA', color='orange')
plt.plot(df['Date'], df['MA200'], label='200-day MA', color='red')
plt.title(f'{selected_stock} Stock Price with Moving Averages')
plt.xlabel('Date', fontsize=14)
plt.ylabel('Price (USD)', fontsize=14)
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
In [7]:
# Compare all stocks on the same chart
plt.figure(figsize=(16, 8))

for ticker in tickers:

# Normalize to the starting price for fair comparison
normalized = stock_data[ticker]['Close'] / stock_data[ticker]['Close'].iloc[0] * 100
plt.plot(stock_data[ticker]['Date'], normalized, label=ticker)

plt.title('Normalized Stock Price Comparison (Base = 100)')

plt.xlabel('Date', fontsize=14)
plt.ylabel('Normalized Price', fontsize=14)
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
In [8]:
# Prepare data for LSTM model
def prepare_lstm_data(data, target_col='Close', sequence_length=60, train_split=0.8):
"""Prepare data for LSTM model with proper scaling and sequence creation"""
# Extract target column data
target_data = data[target_col].values.reshape(-1, 1)

# Scale the data

scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(target_data)

# Determine training data length

train_size = int(len(scaled_data) * train_split)
train_data = scaled_data[:train_size]
test_data = scaled_data[train_size - sequence_length:]

# Create sequences for training

X_train, y_train = [], []
for i in range(sequence_length, len(train_data)):
X_train.append(train_data[i - sequence_length:i, 0])
y_train.append(train_data[i, 0])

# Create sequences for testing

X_test, y_test = [], []
for i in range(sequence_length, len(test_data)):
X_test.append(test_data[i - sequence_length:i, 0])
y_test.append(test_data[i, 0])

# Convert to numpy arrays

X_train, y_train = np.array(X_train), np.array(y_train)
X_test, y_test = np.array(X_test), np.array(y_test)

# Reshape for LSTM [samples, time steps, features]

X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))

# Return processed data and scaler for inverse transformation

return X_train, y_train, X_test, y_test, scaler, train_size

# Prepare data for the selected stock

sequence_length = 60 # 60 days of data to predict the next day
X_train, y_train, X_test, y_test, scaler, train_size = prepare_lstm_data(
df, target_col='Close', sequence_length=sequence_length, train_split=0.8
)

print(f"Training data shape: {X_train.shape}")

print(f"Test data shape: {X_test.shape}")

# Show date ranges for training and testing

train_dates = df['Date'][:train_size]
test_dates = df['Date'][train_size:]
print(f"Training date range: {train_dates.iloc[0]} to {train_dates.iloc[-1]} ")
print(f"Testing date range: {test_dates.iloc[0]} to {test_dates.iloc[-1]}")

Training data shape: (1423, 60, 1)

Test data shape: (371, 60, 1)
Training date range: 2018-01-02 00:00:00 to 2023-11-21 00:00:00
Testing date range: 2023-11-22 00:00:00 to 2025-05-16 00:00:00

In [9]:
# Simple LSTM model
def create_simple_lstm_model(sequence_length):
model = Sequential([
LSTM(50, return_sequences=True, input_shape=(sequence_length, 1)),
Dropout(0.2),
LSTM(50, return_sequences=False),
Dropout(0.2),
Dense(25),
Dense(1)
])
model.compile(optimizer='adam', loss='mean_squared_error')
return model

# Stacked LSTM model

def create_stacked_lstm_model(sequence_length):
model = Sequential([
LSTM(100, return_sequences=True, input_shape=(sequence_length, 1)),
Dropout(0.2),
LSTM(100, return_sequences=True),
Dropout(0.2),
LSTM(100, return_sequences=False),
Dropout(0.2),
Dense(50),
Dense(1)
])
model.compile(optimizer='adam', loss='mean_squared_error')
return model

# Bidirectional LSTM model

def create_bidirectional_lstm_model(sequence_length):
model = Sequential([
Bidirectional(LSTM(50, return_sequences=True), input_shape=(sequence_length, 1))
,
Dropout(0.2),
Bidirectional(LSTM(50, return_sequences=False)),
Dropout(0.2),
Dense(25),
Dense(1)
])
model.compile(optimizer='adam', loss='mean_squared_error')
return model

# Create models
simple_lstm_model = create_simple_lstm_model(sequence_length)
stacked_lstm_model = create_stacked_lstm_model(sequence_length)
bidirectional_lstm_model = create_bidirectional_lstm_model(sequence_length)

# Display model architecture

simple_lstm_model.summary()

Model: "sequential"

Total params: 31,901 (124.61 KB)

Trainable params: 31,901 (124.61 KB)

Non-trainable params: 0 (0.00 B)

In [ ]:
# Define callbacks for training
callbacks = [
EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True),
ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-5)
]

# Train Simple LSTM

print("Training Simple LSTM Model...")
simple_history = simple_lstm_model.fit(
X_train, y_train,
epochs=50,
batch_size=32,
validation_split=0.1,
callbacks=callbacks,
verbose=1
)

# Train Stacked LSTM

print("\n Training Stacked LSTM Model...")
stacked_history = stacked_lstm_model.fit(
X_train, y_train,
epochs=50,
batch_size=32,
validation_split=0.1,
callbacks=callbacks,
verbose=1
)

# Train Bidirectional LSTM

print("\n Training Bidirectional LSTM Model...")
bidirectional_history = bidirectional_lstm_model.fit(
X_train, y_train,
epochs=50,
batch_size=32,
validation_split=0.1,
callbacks=callbacks,
verbose=1
)

In [11]:
# Plot training history for all models
plt.figure(figsize=(14, 7))

plt.plot(simple_history.history['loss'], label='Simple LSTM Training Loss')

plt.plot(simple_history.history['val_loss'], label='Simple LSTM Validation Loss')
plt.plot(stacked_history.history['loss'], label='Stacked LSTM Training Loss')
plt.plot(stacked_history.history['val_loss'], label='Stacked LSTM Validation Loss')
plt.plot(bidirectional_history.history['loss'], label='Bidirectional LSTM Training Loss')
plt.plot(bidirectional_history.history['val_loss'], label='Bidirectional LSTM Validation
Loss')

plt.title('Model Loss Comparison')

plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
In [12]:
# Function to evaluate model performance
def evaluate_model(model, X_test, y_test, scaler, model_name):
"""Evaluate a model and return predictions and metrics"""
# Make predictions
predictions = model.predict(X_test)

# Inverse transform to get actual price values

y_test_inv = scaler.inverse_transform(y_test.reshape(-1, 1)).flatten()
predictions_inv = scaler.inverse_transform(predictions).flatten()

# Calculate error metrics

mse = mean_squared_error(y_test_inv, predictions_inv)
rmse = math.sqrt(mse)
mae = mean_absolute_error(y_test_inv, predictions_inv)
r2 = r2_score(y_test_inv, predictions_inv)

print(f"\n {model_name} Performance Metrics:")

print(f"MSE: {mse:.2f}")
print(f"RMSE: {rmse:.2f}")
print(f"MAE: {mae:.2f}")
print(f"R² Score: {r2:.4f}")

return predictions_inv, y_test_inv, {

'mse': mse,
'rmse': rmse,
'mae': mae,
'r2': r2
}

# Evaluate all models

simple_pred, y_test_inv, simple_metrics = evaluate_model(simple_lstm_model, X_test, y_te
st, scaler, "Simple LSTM")
stacked_pred, _, stacked_metrics = evaluate_model(stacked_lstm_model, X_test, y_test, sc
aler, "Stacked LSTM")
bidirectional_pred, _, bidirectional_metrics = evaluate_model(bidirectional_lstm_model, X
_test, y_test, scaler, "Bidirectional LSTM")

12/12 ━━━━━━━━━━━━━━━━━━━━ 1s 48ms/step

Simple LSTM Performance Metrics:

MSE: 77.65
RMSE: 8.81
MAE: 6.84
R² Score: 0.8666
12/12 ━━━━━━━━━━━━━━━━━━━━ 1s 75ms/step

Stacked LSTM Performance Metrics:

MSE: 88.82
RMSE: 9.42
MAE: 7.42
R² Score: 0.8475
12/12 ━━━━━━━━━━━━━━━━━━━━ 2s 81ms/step

Bidirectional LSTM Performance Metrics:

Bidirectional LSTM Performance Metrics:
MSE: 54.49
RMSE: 7.38
MAE: 5.69
R² Score: 0.9064

In [13]:
# Plot the predictions vs actual
def plot_predictions(predictions, actual, model_name, dates=None):
plt.figure(figsize=(16, 8))

if dates is not None and len(dates) == len(actual):

plt.plot(dates, actual, label='Actual', color='blue', linewidth=2)
plt.plot(dates, predictions, label=f'{model_name} Predictions', color='red', lin
estyle='--', linewidth=2)
else:
plt.plot(actual, label='Actual', color='blue', linewidth=2)
plt.plot(predictions, label=f'{model_name} Predictions', color='red', linestyle=
'--', linewidth=2)

plt.title(f'{selected_stock} Stock Price Prediction - {model_name} ')

plt.xlabel('Date', fontsize=14)
plt.ylabel('Stock Price (USD)', fontsize=14)
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Extract dates for test data

test_dates_for_plot = df['Date'].iloc[train_size + sequence_length:train_size + sequence
_length + len(y_test_inv)]

# Plot predictions for each model

plot_predictions(simple_pred, y_test_inv, "Simple LSTM", test_dates_for_plot)
plot_predictions(stacked_pred, y_test_inv, "Stacked LSTM", test_dates_for_plot)
plot_predictions(bidirectional_pred, y_test_inv, "Bidirectional LSTM", test_dates_for_plo
t)
In [15]:

# Plot all predictions on the same graph for comparison

plt.figure(figsize=(16, 8))

plt.plot(test_dates, y_test_inv, label='Actual', color='blue', linewidth=2)

plt.plot(test_dates, simple_pred, label='Simple LSTM', color='red', linestyle='--', line
width=1.5)
plt.plot(test_dates, stacked_pred, label='Stacked LSTM', color='green', linestyle='--',
linewidth=1.5)
plt.plot(test_dates, bidirectional_pred, label='Bidirectional LSTM', color='orange', line
style='--', linewidth=1.5)

plt.title(f'{selected_stock} Stock Price Prediction - Model Comparison')

plt.xlabel('Date', fontsize=14)
plt.ylabel('Stock Price (USD)', fontsize=14)
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
In [16]:

# Compare error metrics

metrics = pd.DataFrame({
'Simple LSTM': [simple_metrics['rmse'], simple_metrics['mae'], simple_metrics['r2']]
,
'Stacked LSTM': [stacked_metrics['rmse'], stacked_metrics['mae'], stacked_metrics['r
2']],
'Bidirectional LSTM': [bidirectional_metrics['rmse'], bidirectional_metrics['mae'],
bidirectional_metrics['r2']]
}, index=['RMSE', 'MAE', 'R² Score'])

# Display the metrics table

print("Model Performance Comparison:")
metrics

Model Performance Comparison:

Out[16]:

Simple LSTM Stacked LSTM Bidirectional LSTM

RMSE 8.811947 9.424283 7.381580

MAE 6.841683 7.418596 5.688194

R² Score 0.866637 0.847458 0.906418

In [17]:

# Visualize model comparison

metrics_to_plot = ['RMSE', 'MAE']
plt.figure(figsize=(12, 6))

for i, metric in enumerate(metrics_to_plot):

plt.subplot(1, 2, i+1)
values = metrics.loc[metric]
bars = plt.bar(values.index, values.values, color=['red', 'green', 'orange'])
plt.title(f'Model Comparison: {metric}')
plt.ylabel(metric)
plt.xticks(rotation=45)

# Add value labels on bars

for bar in bars:
height = bar.get_height()
plt.text(bar.get_x() + bar.get_width()/2., height + 0.1,
f'{height:.2f} ', ha='center', va='bottom')

plt.tight_layout()
plt.show()

# Plot R² scores
plt.figure(figsize=(10, 6))
r2_values = metrics.loc['R² Score']
bars = plt.bar(r2_values.index, r2_values.values, color=['red', 'green', 'orange'])
plt.title('Model Comparison: R² Score')
plt.ylabel('R² Score')
plt.ylim(0, 1)
plt.xticks(rotation=45)

# Add value labels on bars

for bar in bars:
height = bar.get_height()
plt.text(bar.get_x() + bar.get_width()/2., height + 0.01,
f'{height:.4f} ', ha='center', va='bottom')

plt.tight_layout()
plt.show()

In [ ]:
# Function to predict future stock prices
def predict_future_prices(model, last_sequence, days_to_predict, scaler):
"""Predict future stock prices given the last sequence of data"""
future_predictions = []
current_sequence = last_sequence.copy()

for _ in range(days_to_predict):
# Reshape for prediction
current_reshaped = current_sequence.reshape(1, current_sequence.shape[0], 1)

# Predict next price

next_price = model.predict(current_reshaped)[0][0]
future_predictions.append(next_price)

# Update sequence for next prediction

current_sequence = np.append(current_sequence[1:], next_price)

# Inverse transform to get actual price values

future_predictions = scaler.inverse_transform(
np.array(future_predictions).reshape(-1, 1)
).flatten()

return future_predictions

# Get the last sequence from our test data

last_sequence = X_test[-1]

# Predict the next 30 days

days_to_predict = 30

# Generate future dates

last_date = test_dates_for_plot.iloc[-1]
future_dates = pd.date_range(start=last_date + timedelta(days=1), periods=days_to_predic
t, freq='B')

# Predict future prices with all models

simple_future = predict_future_prices(simple_lstm_model, last_sequence, days_to_predict,
scaler)
stacked_future = predict_future_prices(stacked_lstm_model, last_sequence, days_to_predict
, scaler)
bidirectional_future = predict_future_prices(bidirectional_lstm_model, last_sequence, day
s_to_predict, scaler)

# Combine historical and future data for visualization

plt.figure(figsize=(16, 8))

# Plot historical actual prices

plt.plot(test_dates_for_plot, y_test_inv[-len(test_dates_for_plot):], label='Historical P
rices', color='blue', linewidth=2)

# Plot future predictions

plt.plot(future_dates, simple_future, label='Simple LSTM Forecast', color='red', linestyl
e='--')
plt.plot(future_dates, stacked_future, label='Stacked LSTM Forecast', color='green', line
style='--')
plt.plot(future_dates, bidirectional_future, label='Bidirectional LSTM Forecast', color='
orange', linestyle='--')

# Add a vertical line to separate historical data from predictions

plt.axvline(x=test_dates_for_plot.iloc[-1], color='gray', linestyle='-.')
plt.text(test_dates_for_plot.iloc[-1], min(y_test_inv), 'Prediction Start', rotation=90,
verticalalignment='bottom')

plt.title(f'{selected_stock} Stock Price Forecast - Next {days_to_predict} Trading Days')

plt.xlabel('Date', fontsize=14)
plt.ylabel('Stock Price (USD)', fontsize=14)
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Print the forecasted prices

forecast_df = pd.DataFrame({
'Date': future_dates,
'Simple LSTM': simple_future,
'Stacked LSTM': stacked_future,
'Bidirectional LSTM': bidirectional_future
})

print("\n Forecasted Prices for the Next 30 Trading Days:")

forecast_df.set_index('Date')

In [ ]:
In [ ]:

# Add technical indicators to the original dataframe

def add_technical_indicators(df):
"""Add common technical indicators to the dataframe"""
# Copy of the dataframe
df_temp = df.copy()

# Moving Averages
df_temp['MA5'] = df_temp['Close'].rolling(window=5).mean()
df_temp['MA10'] = df_temp['Close'].rolling(window=10).mean()
df_temp['MA20'] = df_temp['Close'].rolling(window=20).mean()
df_temp['MA50'] = df_temp['Close'].rolling(window=50).mean()

# Exponential Moving Averages

df_temp['EMA12'] = df_temp['Close'].ewm(span=12, adjust=False).mean()
df_temp['EMA26'] = df_temp['Close'].ewm(span=26, adjust=False).mean()

# MACD (Moving Average Convergence Divergence)

df_temp['MACD'] = df_temp['EMA12'] - df_temp['EMA26']
df_temp['MACD_Signal'] = df_temp['MACD'].ewm(span=9, adjust=False).mean()

# RSI (Relative Strength Index)

delta = df_temp['Close'].diff()
gain = delta.clip(lower=0)
loss = -delta.clip(upper=0)
avg_gain = gain.rolling(window=14).mean()
avg_loss = loss.rolling(window=14).mean()
rs = avg_gain / avg_loss
df_temp['RSI'] = 100 - (100 / (1 + rs))

# Bollinger Bands
df_temp['MA20_std'] = df_temp['Close'].rolling(window=20).std()
df_temp['Upper_Band'] = df_temp['MA20'] + (df_temp['MA20_std'] * 2)
df_temp['Lower_Band'] = df_temp['MA20'] - (df_temp['MA20_std'] * 2)

# Price Rate of Change

df_temp['ROC'] = ((df_temp['Close'] / df_temp['Close'].shift(10)) - 1) * 100

# Drop rows with NaN values resulting from calculations

df_temp.dropna(inplace=True)

return df_temp

# Add technical indicators

df_with_indicators = add_technical_indicators(df)

# Display the dataframe with indicators

df_with_indicators.tail()

In [23]:

# Visualize some technical indicators

def plot_technical_indicators(df, indicators=['MA20', 'Upper_Band', 'Lower_Band', 'RSI',
'MACD']):
"""Plot the stock price along with selected technical indicators"""
# Create figure with subplots
fig, axes = plt.subplots(3, 1, figsize=(16, 15), sharex=True)

# Plot price and Bollinger Bands

df['Close'].plot(ax=axes[0], color='blue', label='Close Price')
if 'MA20' in indicators:
df['MA20'].plot(ax=axes[0], color='orange', label='20-day MA')
if 'Upper_Band' in indicators and 'Lower_Band' in indicators:
df['Upper_Band'].plot(ax=axes[0], color='red', linestyle='--', label='Upper Boll
inger Band')
df['Lower_Band'].plot(ax=axes[0], color='green', linestyle='--', label='Lower Bo
llinger Band')

axes[0].set_title(f'{selected_stock} Stock Price with Bollinger Bands')

axes[0].set_ylabel('Price')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# Plot RSI
if 'RSI' in indicators:
df['RSI'].plot(ax=axes[1], color='purple', label='RSI')
# Add overbought/oversold levels
axes[1].axhline(y=70, color='red', linestyle='--', alpha=0.5)
axes[1].axhline(y=30, color='green', linestyle='--', alpha=0.5)
axes[1].text(df.index[0], 70, 'Overbought', color='red')
axes[1].text(df.index[0], 30, 'Oversold', color='green')

axes[1].set_title('Relative Strength Index (RSI)')

axes[1].set_ylabel('RSI')
axes[1].grid(True, alpha=0.3)

# Plot MACD
if 'MACD' in indicators:
df['MACD'].plot(ax=axes[2], color='blue', label='MACD')
df['MACD_Signal'].plot(ax=axes[2], color='red', label='Signal Line')

# Highlight MACD Histogram

macd_hist = df['MACD'] - df['MACD_Signal']
axes[2].bar(df.index, macd_hist, color=macd_hist.apply(lambda x: 'green' if x >
0 else 'red'),
label='MACD Histogram', alpha=0.5, width=2)

axes[2].set_title('Moving Average Convergence Divergence (MACD)')

axes[2].set_ylabel('MACD')
axes[2].axhline(y=0, color='black', linestyle='-', alpha=0.3)
axes[2].legend()
axes[2].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# First need to recreate the df_with_indicators since it appears to be empty

df_with_indicators = add_technical_indicators(df)

# Check if we have data

print(f"Technical indicators dataframe shape: {df_with_indicators.shape}")

# Plot technical indicators for the last 200 days

if not df_with_indicators.empty:
plot_technical_indicators(df_with_indicators.iloc[-200:])
else:
print("Error: The dataframe with technical indicators is empty.")

Technical indicators dataframe shape: (0, 21)

Error: The dataframe with technical indicators is empty.

10. Monte Carlo Simulation for Risk Assessment

In [24]:
# Function for Monte Carlo simulation
def monte_carlo_simulation(last_price, days, simulations, volatility):
"""Perform Monte Carlo simulation for stock price predictions"""
# Calculate daily returns
returns = np.random.normal(0, volatility, (days, simulations))

# Create price paths

price_paths = np.zeros((days + 1, simulations))
price_paths[0] = last_price

# Calculate price paths

for t in range(1, days + 1):
price_paths[t] = price_paths[t-1] * np.exp(returns[t-1])

return price_paths
# Get the last known price
last_price = y_test_inv[-1]

# Calculate historical volatility (standard deviation of returns)

returns = df['Daily_Return'].dropna() / 100 # convert from percentage
volatility = returns.std()

# Monte Carlo parameters

days_to_simulate = 30
num_simulations = 1000

# Run simulation
simulated_paths = monte_carlo_simulation(last_price, days_to_simulate, num_simulations,
volatility)

# Plot simulation results

plt.figure(figsize=(16, 8))

# Plot historical prices

plt.plot(range(-30, 0), y_test_inv[-30:], label='Historical Prices', color='blue', linew
idth=2)

# Plot simulation paths (sample 100 to avoid overcrowding)

sample_paths = np.random.choice(num_simulations, 100, replace=False)
for i in sample_paths:
plt.plot(range(days_to_simulate), simulated_paths[1:, i], color='gray', alpha=0.2)

# Plot expected path (mean)

mean_path = np.mean(simulated_paths, axis=1)
plt.plot(range(days_to_simulate), mean_path[1:], label='Expected Path', color='red', line
width=2)

# Plot confidence intervals

percentile_5 = np.percentile(simulated_paths, 5, axis=1)
percentile_95 = np.percentile(simulated_paths, 95, axis=1)
plt.fill_between(range(days_to_simulate), percentile_5[1:], percentile_95[1:], color='red
', alpha=0.2, label='90% Confidence Interval')

# Add LSTM predictions for comparison

plt.plot(range(days_to_simulate), bidirectional_future, label='Bidirectional LSTM Forecas
t', color='orange', linestyle='--', linewidth=2)

# Add vertical line separating historical from future

plt.axvline(x=0, color='gray', linestyle='-.')
plt.text(0, last_price*0.8, 'Simulation Start', rotation=90, verticalalignment='bottom')

plt.title(f'Monte Carlo Simulation for {selected_stock} (1000 simulations)')

plt.xlabel('Days')
plt.ylabel('Stock Price (USD)')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Calculate risk metrics from the simulation

final_prices = simulated_paths[-1]
expected_price = np.mean(final_prices)
var_95 = last_price - np.percentile(final_prices, 5) # 95% Value at Risk
gain_potential = np.percentile(final_prices, 95) - last_price # 95% Upside potential

print(f"\n Risk Assessment for {selected_stock} over {days_to_simulate} days:")

print(f"Current Price: ${last_price:.2f}")
print(f"Expected Price: ${expected_price:.2f} (Change: {(expected_price/last_price-1)*100
:.2f}%)")
print(f"95% Value at Risk: ${var_95:.2f} ({var_95/last_price*100:.2f}% of current price)"
)
print(f"95% Upside Potential: ${gain_potential:.2f} ({gain_potential/last_price*100:.2f}%
of current price)")
print(f"90% Confidence Range: ${np.percentile(final_prices, 5):.2f} to ${np.percentile(fi
nal_prices, 95):.2f}")
Risk Assessment for AAPL over 30 days:
Current Price: $211.26
Expected Price: $213.34 (Change: 0.99%)
95% Value at Risk: $32.76 (15.51% of current price)
95% Upside Potential: $41.52 (19.65% of current price)
90% Confidence Range: $178.50 to $252.78
Experiment 11: Mini Project -> Emotion Detection from
Tweets
In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import re
import string
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, LSTM, Embedding, SpatialDropout1D, Dropout, B
idirectional
from tensorflow.keras.layers import Input, GlobalMaxPooling1D, Conv1D, MaxPooling1D, conc
atenate
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
from sklearn.preprocessing import LabelEncoder
import matplotlib.cm as cm
from wordcloud import WordCloud
import warnings

# Suppress warnings
warnings.filterwarnings('ignore')

# Set style for plots

sns.set_style('whitegrid')

# Download NLTK resources

try:
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')
except:
print("NLTK download failed, but we'll continue anyway")

# Check versions
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")
print(f"NLTK version: {nltk.__version__}")

[nltk_data] Downloading package punkt to /root/nltk_data...

[nltk_data] Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data] Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...

TensorFlow version: 2.18.0

Keras version: 3.8.0
NLTK version: 3.9.1

In [2]:
def generate_synthetic_twitter_data(n_samples=5000):
"""Generate synthetic Twitter data with emotion labels"""
# Define emotions and their associated words/phrases
emotions = {
'joy': [
"happy", "excited", "blessed", "wonderful", "fantastic", "thrilled", "love",
"joyful", "delighted", "ecstatic", "pleased", "overjoyed", "grateful", "blis
sful",
"elated", "cheerful", "content", "jubilant", "peaceful", "radiant", "sunny"
],
'sadness': [
"sad", "depressed", "heartbroken", "miserable", "grief", "sorrowful", "unhap
py",
"disappointed", "despondent", "hopeless", "melancholy", "gloomy", "blue", "d
own",
"hurt", "lost", "devastated", "tearful", "regretful", "broken", "crying"
],
'anger': [
"angry", "furious", "outraged", "annoyed", "irritated", "fuming", "enraged",
"mad", "resentful", "bitter", "incensed", "livid", "hostile", "irate",
"seething", "hate", "frustrated", "disgusted", "infuriated", "upset", "offen
ded"
],
'fear': [
"afraid", "scared", "terrified", "frightened", "anxious", "worried", "nervou
s",
"panicked", "horrified", "fearful", "dread", "uneasy", "alarmed", "threatene
d",
"intimidated", "apprehensive", "paranoid", "tense", "stressed", "petrified",
"timid"
],
'surprise': [
"surprised", "shocked", "astonished", "amazed", "stunned", "unexpected", "wo
w",
"speechless", "startled", "dumbfounded", "flabbergasted", "bewildered", "ast
ounded",
"mind-blown", "awestruck", "taken aback", "could not believe", "unbelievable
", "whoa", "omg"
]
}

# Tweet templates for each emotion

templates = {
'joy': [
"I'm feeling so {0} today! Life is good! #blessed",
"Just had the most {0} experience at {1}! Can't stop smiling! ",
"This {1} makes me feel so {0}! Best day ever! ❤️",
"I'm {0} to announce that {1}! Dreams do come true. #grateful",
"Nothing but {0} vibes with {1}. #goodtimes",
"Feeling absolutely {0} after {1}!",
"Today was {0}! {1} made everything perfect. #happiness",
"So {0} to be here with {1}. Memories that will last forever!",
"Can't express how {0} I am about {1}. #blessed #thankful",
"Such a {0} day spent {1}. My heart is full! "
],
'sadness': [
"Feeling so {0} right now... {1} ",
"Can't believe {1}. I'm completely {0}. #heartbroken",
"Today has been so {0}. {1} has me in tears.",
"Why does {1} have to happen? Feeling {0} beyond words.",
"Nothing but {0} thoughts after {1}. Need some time alone.",
"The news about {1} has left me {0}. Can't stop crying.",
"I'm {0} to say that {1}. Please send positive thoughts.",
"{1} has me feeling so {0} today. Hard to see the light.",
"My heart is {0} because {1}. Life can be so unfair.",
"In a {0} mood after {1}. Just want to stay in bed all day."
],
'anger': [
"I'm so {0} right now! {1} is absolutely unacceptable! ",
"Can't believe {1}! Makes me so {0}! #furious",
"Nothing makes me more {0} than {1}. Seriously?!",
"I'm {0} beyond words about {1}. This is ridiculous!",
"The way {1} happened has me feeling {0}. Not okay!",
"So {0} at the situation with {1}. This needs to change!",
"{1} is driving me insane! Feeling extremely {0} right now.",
"My blood is boiling! {1} has me completely {0}!",
"I can't stand {1}! Makes me {0} every single time.",
"Why am I so {0} about {1}? Because it's totally wrong!"
],
'fear': [
"Feeling so {0} about {1}. Can't stop thinking about it.",
"I'm {0} that {1} might happen. Anyone else feeling this way?",
"The thought of {1} makes me {0}. I don't know what to do.",
"So {0} right now. {1} has me on edge. #anxiety",
"Can't sleep because I'm {0} about {1}. Help?",
"The news about {1} has me {0}. Trying to stay calm.",
"I get so {0} whenever {1} happens. My heart races.",
"Why does {1} make me feel so {0}? I hate this feeling!",
"Having {0} thoughts about {1}. Need reassurance.",
"Absolutely {0} after hearing about {1}. Please be safe everyone."
],
'surprise': [
"I'm completely {0} by {1}! Did not see that coming! ",
"Well that was {0}! {1} just blew my mind!",
"Can't believe what just happened! {1} has me {0}! #whoa",
"I'm {0} at the news about {1}. Totally unexpected!",
"My jaw dropped! So {0} by {1}! #unexpected",
"{1} just happened and I'm absolutely {0}! What a twist!",
"The most {0} thing just occurred: {1}! Still processing it.",
"I was not prepared for {1}! Feeling {0} right now!",
"OMG! {1} has left me {0}! Did anyone else know about this?",
"Talk about a {0} turn of events! {1} has changed everything!"
]
}

# Context phrases to fill in templates

contexts = {
'joy': [
"my promotion", "our vacation", "the party last night", "meeting new friends
",
"spending time with family", "the concert", "my new job", "passing my exam",
"my birthday celebration", "winning the game", "my new puppy", "the surprise
gift",
"the beautiful weather", "the delicious meal", "my wedding plans", "finishin
g my project"
],
'sadness': [
"losing a friend", "the bad news", "missing someone", "the movie ending",
"the rainy weather", "failing my test", "being alone tonight", "remembering
the past",
"the loss in our family", "saying goodbye", "moving away", "the tragic news"
,
"seeing people struggle", "the restaurant closing", "the canceled plans", "l
osing my favorite item"
],
'anger': [
"being stuck in traffic", "poor customer service", "people being rude", "the
flight cancellation",
"waiting in long lines", "being overcharged", "the WiFi not working", "someo
ne cutting me off",
"the policy change", "people not wearing masks", "misleading advertising", "
the noisy neighbors",
"the political situation", "the unfair treatment", "broken promises", "the d
elivery being late"
],
'fear': [
"the upcoming presentation", "the medical test results", "walking alone at ni
ght", "the strange noise",
"the upcoming deadline", "the job interview", "the turbulence on my flight",
"losing my job",
"the storm warning", "the virus spreading", "going to the dentist", "making
a big decision",
"meeting the new boss", "the final exam", "the economic uncertainty", "movin
g to a new city"
],
'surprise': [
"the plot twist", "the unexpected visit", "the sudden announcement", "winnin
g the lottery",
"the celebrity sighting", "the marriage proposal", "the sudden career change
", "the news headline",
"the secret reveal", "the sudden weather change", "the price drop", "the une
xpected gift",
"the team victory", "the reunion", "the talent show performance", "the drama
tic ending"
]
}

# Generate tweets
tweets = []
labels = []

emotion_distribution = {
'joy': int(n_samples * 0.25),
'sadness': int(n_samples * 0.20),
'anger': int(n_samples * 0.20),
'fear': int(n_samples * 0.15),
'surprise': int(n_samples * 0.20)
}

# Ensure we get exactly n_samples by adjusting the last category

total = sum(emotion_distribution.values())
if total < n_samples:
emotion_distribution['joy'] += (n_samples - total)
elif total > n_samples:
emotion_distribution['joy'] -= (total - n_samples)

for emotion, count in emotion_distribution.items():

for _ in range(count):
# Select random template
template = np.random.choice(templates[emotion])

# Select random emotion word

emotion_word = np.random.choice(emotions[emotion])

# Select random context

context = np.random.choice(contexts[emotion])

# Create tweet
tweet = template.format(emotion_word, context)

tweets.append(tweet)
labels.append(emotion)

# Create DataFrame
data = pd.DataFrame({
'tweet': tweets,
'emotion': labels
})

# Shuffle the data

data = data.sample(frac=1, random_state=42).reset_index(drop=True)

return data

# Generate synthetic Twitter data

tweet_data = generate_synthetic_twitter_data(n_samples=5000)

# Display the first few rows

print(f"Generated {len(tweet_data)} synthetic tweets")
tweet_data.head()

Generated 5000 synthetic tweets

Out[2]:

tweet emotion
0 Today has been so lost. the canceled plans has... sadness
tweet emotion
1 Nothing makes me more irritated than poor cust... anger

2 Why am I so outraged about people not wearing ... anger

3 I'm feeling so excited today! Life is good! #b... joy

4 I'm elated to announce that our vacation! Drea... joy

Categorical distributions

2-d categorical distributions

Error: Runtime no longer has a reference to this dataframe, please re-run this cell and t
ry again.

In [3]:

# Check the distribution of emotions

plt.figure(figsize=(12, 6))
count_plot = sns.countplot(x='emotion', data=tweet_data, palette='viridis')
plt.title('Distribution of Emotions in Tweets', fontsize=16)
plt.xlabel('Emotion', fontsize=14)
plt.ylabel('Count', fontsize=14)
plt.xticks(fontsize=12)
plt.yticks(fontsize=12)

# Add count labels on the bars

for p in count_plot.patches:
count_plot.annotate(format(p.get_height(), '.0f'),
(p.get_x() + p.get_width() / 2., p.get_height()),
ha = 'center', va = 'center',
xytext = (0, 10),
textcoords = 'offset points')

plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()
# Calculate percentages
emotion_counts = tweet_data['emotion'].value_counts(normalize=True) * 100
print("Emotion distribution percentages:")
for emotion, percentage in emotion_counts.items():
print(f"{emotion}: {percentage:.2f}%")

Emotion distribution percentages:

joy: 25.00%
sadness: 20.00%
anger: 20.00%
surprise: 20.00%
fear: 15.00%

In [4]:
# Analyze tweet length by emotion
tweet_data['tweet_length'] = tweet_data['tweet'].apply(len)

plt.figure(figsize=(12, 6))
sns.boxplot(x='emotion', y='tweet_length', data=tweet_data, palette='viridis')
plt.title('Tweet Length by Emotion', fontsize=16)
plt.xlabel('Emotion', fontsize=14)
plt.ylabel('Tweet Length (characters)', fontsize=14)
plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()

# Calculate average tweet length by emotion

avg_length_by_emotion = tweet_data.groupby('emotion')['tweet_length'].mean().sort_values
(ascending=False)
print("Average tweet length by emotion:")
for emotion, avg_length in avg_length_by_emotion.items():
print(f"{emotion}: {avg_length:.1f} characters")
Average tweet length by emotion:
fear: 74.8 characters
surprise: 74.6 characters
anger: 72.6 characters
sadness: 69.7 characters
joy: 66.9 characters

In [5]:
# Generate word clouds for each emotion
def plot_wordcloud(text, title, max_words=100):
wordcloud = WordCloud(
background_color='white',
max_words=max_words,
max_font_size=40,
scale=3,
random_state=42
).generate(text)

plt.figure(figsize=(10, 6))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.title(title, fontsize=16)
plt.tight_layout(pad=0)
plt.show()

# Create word clouds for each emotion

emotions = tweet_data['emotion'].unique()

for emotion in emotions:

# Get all tweets for this emotion
emotion_tweets = tweet_data[tweet_data['emotion'] == emotion]['tweet']

# Combine into one string

combined_tweets = ' '.join(emotion_tweets)

# Create word cloud

plot_wordcloud(combined_tweets, f'Word Cloud for {emotion.capitalize()} Tweets')
In [6]:
# Make sure NLTK resources are downloaded correctly
try:
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')
except Exception as e:
print(f"NLTK download issue: {e}")
print("Continuing with available resources...")

def preprocess_tweet(tweet, remove_stopwords=True):

"""Preprocess tweet text for analysis"""
# Convert to lowercase
tweet = tweet.lower()

# Remove URLs
tweet = re.sub(r'http\S+|www\S+|https\S+', '', tweet, flags=re.MULTILINE)

# Remove user mentions and hashtag symbol (keep the hashtag text)
tweet = re.sub(r'@\w+', '', tweet)
tweet = re.sub(r'#', '', tweet)

# Remove punctuation
tweet = re.sub(r'[^\w\s]', '', tweet)

# Tokenize - use a try-except block to handle potential errors

try:
tokens = word_tokenize(tweet)
except LookupError:
# Fallback tokenization if word_tokenize fails
tokens = tweet.split()

# Remove stopwords if requested

if remove_stopwords:
try:
stop_words = set(stopwords.words('english'))
tokens = [word for word in tokens if word not in stop_words]
except:
# If stopwords can't be loaded, continue without stopword removal
pass

# Lemmatize
try:
lemmatizer = WordNetLemmatizer()
tokens = [lemmatizer.lemmatize(word) for word in tokens]
except:
# If lemmatization fails, use the original tokens
pass

# Rejoin into string

preprocessed_tweet = ' '.join(tokens)

return preprocessed_tweet

# Apply preprocessing to the tweet data

tweet_data['processed_tweet'] = tweet_data['tweet'].apply(preprocess_tweet)

# Display examples of original vs processed tweets

comparison = tweet_data[['tweet', 'processed_tweet', 'emotion']].head(5)
print("Original vs. Processed Tweet Examples:")
for i, row in comparison.iterrows():
print(f"\n Emotion: {row['emotion']}")
print(f"Original: {row['tweet']}")
print(f"Processed: {row['processed_tweet']}")

[nltk_data] Downloading package punkt to /root/nltk_data...

[nltk_data] Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data] Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Package wordnet is already up-to-date!

Original vs. Processed Tweet Examples:

Emotion: sadness
Original: Today has been so lost. the canceled plans has me in tears.
Processed: today lost canceled plan tear

Emotion: anger
Original: Nothing makes me more irritated than poor customer service. Seriously?!
Processed: nothing make irritated poor customer service seriously

Emotion: anger
Original: Why am I so outraged about people not wearing masks? Because it's totally wrong
!
Processed: outraged people wearing mask totally wrong

Emotion: joy
Original: I'm feeling so excited today! Life is good! #blessed
Processed: im feeling excited today life good blessed

Emotion: joy
Original: I'm elated to announce that our vacation! Dreams do come true. #grateful
Processed: im elated announce vacation dream come true grateful

In [7]:
# Encode emotion labels
label_encoder = LabelEncoder()
encoded_labels = label_encoder.fit_transform(tweet_data['emotion'])
num_classes = len(label_encoder.classes_)

# Map encoded values to original labels for reference

label_mapping = dict(zip(range(num_classes), label_encoder.classes_))
print("Label encoding mapping:")
for encoded, original in label_mapping.items():
print(f"{encoded}: {original}")

# Convert to categorical for multi-class classification

categorical_labels = to_categorical(encoded_labels, num_classes=num_classes)
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
tweet_data['processed_tweet'],
categorical_labels,
test_size=0.2,
stratify=categorical_labels, # Ensure balanced classes in train and test
random_state=42
)

# Tokenize text
max_words = 10000 # Maximum number of words to consider
max_len = 100 # Maximum sequence length

tokenizer = Tokenizer(num_words=max_words, oov_token='<OOV>')

tokenizer.fit_on_texts(X_train)

# Convert text to sequences of integers

X_train_seq = tokenizer.texts_to_sequences(X_train)
X_test_seq = tokenizer.texts_to_sequences(X_test)

# Pad sequences to ensure uniform length

X_train_pad = pad_sequences(X_train_seq, maxlen=max_len, padding='post', truncating='pos
t')
X_test_pad = pad_sequences(X_test_seq, maxlen=max_len, padding='post', truncating='post'
)

# Get the vocabulary size

vocab_size = len(tokenizer.word_index) + 1 # +1 for padding token

print(f"Vocabulary size: {vocab_size} ")

print(f"Training set size: {X_train_pad.shape}")
print(f"Testing set size: {X_test_pad.shape}")

Label encoding mapping:

0: anger
1: fear
2: joy
3: sadness
4: surprise
Vocabulary size: 346
Training set size: (4000, 100)
Testing set size: (1000, 100)

In [8]:

# Define callbacks to use with all models

callbacks = [
EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True),
ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=2, min_lr=0.001)
]

# 1. Simple LSTM Model

def build_lstm_model():
model = Sequential([
Embedding(input_dim=vocab_size, output_dim=128, input_length=max_len),
SpatialDropout1D(0.2),
LSTM(128, dropout=0.2, recurrent_dropout=0.2),
Dense(64, activation='relu'),
Dropout(0.5),
Dense(num_classes, activation='softmax')
])

model.compile(
optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy']
)

return model

# 2. Bidirectional LSTM Model

def build_bidirectional_lstm_model():
model = Sequential([
Embedding(input_dim=vocab_size, output_dim=128, input_length=max_len),
SpatialDropout1D(0.2),
Bidirectional(LSTM(64, dropout=0.2, recurrent_dropout=0.2)),
Dense(64, activation='relu'),
Dropout(0.5),
Dense(num_classes, activation='softmax')
])

model.compile(
optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy']
)

return model

# 3. CNN Model for Text Classification

def build_cnn_model():
model = Sequential([
Embedding(input_dim=vocab_size, output_dim=128, input_length=max_len),
Conv1D(filters=128, kernel_size=5, activation='relu'),
MaxPooling1D(pool_size=2),
Conv1D(filters=128, kernel_size=5, activation='relu'),
MaxPooling1D(pool_size=2),
Conv1D(filters=128, kernel_size=5, activation='relu'),
GlobalMaxPooling1D(),
Dense(128, activation='relu'),
Dropout(0.5),
Dense(num_classes, activation='softmax')
])

model.compile(
optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy']
)

return model

# 4. CNN-LSTM Hybrid Model

def build_cnn_lstm_model():
model = Sequential([
Embedding(input_dim=vocab_size, output_dim=128, input_length=max_len),
SpatialDropout1D(0.2),
Conv1D(filters=128, kernel_size=5, activation='relu'),
MaxPooling1D(pool_size=2),
LSTM(64, dropout=0.2, recurrent_dropout=0.2),
Dense(64, activation='relu'),
Dropout(0.5),
Dense(num_classes, activation='softmax')
])

model.compile(
optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy']
)

return model

In [ ]:
# Initialize empty dictionaries to store models and histories
models = {}
histories = {}

# Build, train and evaluate each model

def train_evaluate_model(model_name, model_builder):
print(f"\n Training {model_name} ...")
model = model_builder()

# Use small integers for random seeds to avoid overflow

tf.random.set_seed(42)
np.random.seed(42)

history = model.fit(
X_train_pad, y_train,
epochs=15,
batch_size=64,
validation_split=0.1,
callbacks=callbacks,
verbose=1
)

# Evaluate on test set

loss, accuracy = model.evaluate(X_test_pad, y_test, verbose=0)
print(f"{model_name} Test Accuracy: {accuracy:.4f}")

return model, history

# Train all models

# Use small integer for random seed to avoid overflow
tf.random.set_seed(42)
np.random.seed(42)

# Train each model one by one

print("Starting model training...")
models['LSTM'], histories['LSTM'] = train_evaluate_model('LSTM', build_lstm_model)
models['BiLSTM'], histories['BiLSTM'] = train_evaluate_model('Bidirectional LSTM', build_
bidirectional_lstm_model)
models['CNN'], histories['CNN'] = train_evaluate_model('CNN', build_cnn_model)
models['CNN-LSTM'], histories['CNN-LSTM'] = train_evaluate_model('CNN-LSTM Hybrid', build
_cnn_lstm_model)

In [10]:

# Compare training histories

plt.figure(figsize=(15, 6))

# Plot accuracy
plt.subplot(1, 2, 1)
for model_name, history in histories.items():
plt.plot(history.history['accuracy'], label=f'{model_name} Training')
plt.plot(history.history['val_accuracy'], label=f'{model_name} Validation', linestyl
e='--')

plt.title('Model Accuracy Comparison')

plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend()
plt.grid(True, alpha=0.3)

# Plot loss
plt.subplot(1, 2, 2)
for model_name, history in histories.items():
plt.plot(history.history['loss'], label=f'{model_name} Training')
plt.plot(history.history['val_loss'], label=f'{model_name} Validation', linestyle='-
-')

plt.title('Model Loss Comparison')

plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
In [11]:

# Function to evaluate model and produce classification report and confusion matrix
def evaluate_model_detail(model, model_name, X_test, y_test, label_mapping):
# Get predictions
y_pred_prob = model.predict(X_test)
y_pred = np.argmax(y_pred_prob, axis=1)
y_true = np.argmax(y_test, axis=1)

# Get classification report

target_names = [label_mapping[i] for i in range(len(label_mapping))]
report = classification_report(y_true, y_pred, target_names=target_names)
print(f"\n {model_name} Classification Report:\n ")
print(report)

# Get confusion matrix

cm = confusion_matrix(y_true, y_pred)

# Plot confusion matrix

plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
xticklabels=target_names,
yticklabels=target_names)
plt.title(f'{model_name} Confusion Matrix')
plt.ylabel('True Label')
plt.xlabel('Predicted Label')
plt.tight_layout()
plt.show()

return y_pred, y_true, y_pred_prob

# Select the best performing model for detailed evaluation

# Let's use BiLSTM for this example, but you could compare all models
best_model_name = 'BiLSTM' # Change this based on your results
best_model = models[best_model_name]

# Evaluate the best model in detail

y_pred, y_true, y_pred_prob = evaluate_model_detail(best_model, best_model_name, X_test_
pad, y_test, label_mapping)

32/32 ━━━━━━━━━━━━━━━━━━━━ 4s 106ms/step

BiLSTM Classification Report:

precision recall f1-score support

anger 1.00 1.00 1.00 200

fear 1.00 1.00 1.00 150
joy 1.00 1.00 1.00 250
sadness 1.00 1.00 1.00 200
surprise 1.00 1.00 1.00 200

accuracy 1.00 1000

macro avg 1.00 1.00 1.00 1000
weighted avg 1.00 1.00 1.00 1000
In [12]:
# Compare accuracies of all models
model_accuracies = {}
for model_name, model in models.items():
_, accuracy = model.evaluate(X_test_pad, y_test, verbose=0)
model_accuracies[model_name] = accuracy

# Plot model comparison

plt.figure(figsize=(10, 6))
bars = plt.bar(model_accuracies.keys(), model_accuracies.values(), color=['blue', 'green'
, 'red', 'purple'])

# Add accuracy values on bars

for bar in bars:
height = bar.get_height()
plt.text(bar.get_x() + bar.get_width()/2., height + 0.01,
f'{height:.4f} ', ha='center', va='bottom')

plt.title('Model Accuracy Comparison')

plt.xlabel('Model')
plt.ylabel('Test Accuracy')
plt.ylim(0, 1.0)
plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()

# Find best model

best_model_name = max(model_accuracies, key=model_accuracies.get)
best_accuracy = model_accuracies[best_model_name]
print(f"The best performing model is {best_model_name} with an accuracy of {best_accuracy
:.4f}")

The best performing model is BiLSTM with an accuracy of 1.0000

In [13]:
# Analyze misclassifications to understand model weaknesses
def analyze_misclassifications(X_test_raw, y_true, y_pred, label_mapping, n_examples=10)
:
# Get original texts from test set
misclassified_indices = np.where(y_true != y_pred)[0]

if len(misclassified_indices) == 0:
print("No misclassifications found!")
return

# Limit to n examples
n_examples = min(n_examples, len(misclassified_indices))
selected_indices = np.random.choice(misclassified_indices, n_examples, replace=False
)

print(f"\n Misclassification Analysis ({n_examples} examples):")

print("-" * 80)

for idx in selected_indices:

text = X_test_raw.iloc[idx]
true_label = label_mapping[y_true[idx]]
pred_label = label_mapping[y_pred[idx]]

print(f"Tweet: {text}")
print(f"True emotion: {true_label} ")
print(f"Predicted emotion: {pred_label} ")
print("-" * 80)

# Calculate confusion pairs (which emotions get confused with each other)
confusion_pairs = {}
for idx in misclassified_indices:
true_label = label_mapping[y_true[idx]]
pred_label = label_mapping[y_pred[idx]]
pair = (true_label, pred_label)

if pair in confusion_pairs:
confusion_pairs[pair] += 1
else:
confusion_pairs[pair] = 1

# Show most common confusion pairs

print("\n Most Common Confusion Pairs:")
for pair, count in sorted(confusion_pairs.items(), key=lambda x: x[1], reverse=True)
[:5]:
true_label, pred_label = pair
print(f"True: {true_label} , Predicted: {pred_label} - {count} instances")

# Extract the raw test data

X_test_raw = X_test.reset_index(drop=True)

# Analyze misclassifications
analyze_misclassifications(X_test_raw, y_true, y_pred, label_mapping, n_examples=5)

No misclassifications found!

In [14]:
# Create a function to predict emotions from new tweets
def predict_emotion(tweet, model, tokenizer, label_mapping, max_len=100):
"""Predict the emotion of a single tweet"""
# Preprocess the tweet
processed_tweet = preprocess_tweet(tweet)

# Convert to sequence
sequence = tokenizer.texts_to_sequences([processed_tweet])

# Pad sequence
padded_sequence = pad_sequences(sequence, maxlen=max_len, padding='post', truncating
='post')

# Make prediction
prediction = model.predict(padded_sequence)[0]

# Get predicted class and probability

predicted_class = np.argmax(prediction)
probability = prediction[predicted_class]

# Get emotion name

emotion = label_mapping[predicted_class]

# Get probabilities for all emotions

all_probs = {label_mapping[i]: float(prob) for i, prob in enumerate(prediction)}

return {
'emotion': emotion,
'probability': float(probability),
'all_probabilities': all_probs
}

# Test the prediction function with some example tweets

example_tweets = [
"I'm so happy today! Just got a promotion at work and feeling blessed! #grateful",
"Feeling so sad after watching that movie. It really broke my heart. ",
"Can't believe how terrible the customer service was! I'm absolutely furious right no
w! ",
"I'm really nervous about my presentation tomorrow. Can't sleep thinking about it. #a
nxiety",
"OMG! I just won the lottery! I can't believe this is happening to me! What a surpris
e! "
]

# Use the best model for predictions

best_model = models[best_model_name]

# Make predictions for each example

for i, tweet in enumerate(example_tweets):
result = predict_emotion(tweet, best_model, tokenizer, label_mapping, max_len)
print(f"\n Example {i+1}: {tweet} ")
print(f"Predicted Emotion: {result['emotion']} (Confidence: {result['probability']:.2
%} )")

# Show all emotion probabilities sorted by likelihood

print("All Emotion Probabilities:")
for emotion, prob in sorted(result['all_probabilities'].items(), key=lambda x: x[1],
reverse=True):
print(f" - {emotion}: {prob:.2%}")

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 66ms/step

Example 1: I'm so happy today! Just got a promotion at work and feeling blessed! #grat
eful
Predicted Emotion: joy (Confidence: 100.00%)
All Emotion Probabilities:
- joy: 100.00%
- fear: 0.00%
- surprise: 0.00%
- anger: 0.00%
- sadness: 0.00%
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 73ms/step

Example 2: Feeling so sad after watching that movie. It really broke my heart.
Predicted Emotion: sadness (Confidence: 99.96%)
All Emotion Probabilities:
- sadness: 99.96%
- fear: 0.04%
- joy: 0.00%
- anger: 0.00%
- surprise: 0.00%
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step

Example 3: Can't believe how terrible the customer service was! I'm absolutely furious ri
ght now!
Predicted Emotion: anger (Confidence: 100.00%)
All Emotion Probabilities:
- anger: 100.00%
- sadness: 0.00%
- fear: 0.00%
- surprise: 0.00%
- joy: 0.00%
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 58ms/step

Example 4: I'm really nervous about my presentation tomorrow. Can't sleep thinking about
it. #anxiety
Predicted Emotion: fear (Confidence: 100.00%)
All Emotion Probabilities:
- fear: 100.00%
- anger: 0.00%
- sadness: 0.00%
- joy: 0.00%
- surprise: 0.00%
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step

Example 5: OMG! I just won the lottery! I can't believe this is happening to me! What a s
urprise!
Predicted Emotion: surprise (Confidence: 99.97%)
All Emotion Probabilities:
- surprise: 99.97%
- sadness: 0.02%
- joy: 0.01%
- anger: 0.00%
- fear: 0.00%

In [15]:

# Create a visualization of the emotion prediction

def visualize_emotion_prediction(tweet, result):
# Sort emotions by probability
emotions = []
probabilities = []
for emotion, prob in sorted(result['all_probabilities'].items(), key=lambda x: x[1],
reverse=True):
emotions.append(emotion)
probabilities.append(prob)

# Create a colormap based on probabilities

colors = cm.viridis(np.array(probabilities))

# Plot
plt.figure(figsize=(12, 6))

# Tweet text with predicted emotion

plt.suptitle(f"Tweet: {tweet} ", fontsize=14, wrap=True)

# Bar chart of emotion probabilities

bars = plt.bar(emotions, probabilities, color=colors)

# Add value labels on bars

for bar in bars:
height = bar.get_height()
plt.text(bar.get_x() + bar.get_width()/2., height + 0.01,
f'{height:.2%} ', ha='center', va='bottom')

plt.title(f"Predicted Emotion: {result['emotion']} (Confidence: {result['probability'

]:.2%})")
plt.xlabel('Emotion')
plt.ylabel('Probability')
plt.ylim(0, 1.0)
plt.grid(axis='y', alpha=0.3)
plt.tight_layout(rect=[0, 0, 1, 0.95])
plt.show()

# Visualize predictions for a few examples

for tweet in example_tweets[:3]: # Just show first 3 for space
result = predict_emotion(tweet, best_model, tokenizer, label_mapping, max_len)
visualize_emotion_prediction(tweet, result)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 59ms/step

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step

In [16]:
# Create a simple interactive tool for emotion prediction
def interactive_emotion_predictor():
print("=== Twitter Emotion Predictor ===\n ")
print("Enter a tweet to analyze its emotion, or type 'quit' to exit.\n ")

while True:
# Get input from user
tweet = input("\n Enter a tweet: ")

# Check if user wants to quit

if tweet.lower() == 'quit':
print("\n Thank you for using the Twitter Emotion Predictor!")
break

# Skip empty tweets

if not tweet.strip():
print("Tweet cannot be empty. Please try again.")
continue

# Predict emotion
result = predict_emotion(tweet, best_model, tokenizer, label_mapping, max_len)

# Display result
print(f"\n Predicted Emotion: {result['emotion']} (Confidence: {result['probabilit
y']:.2%})")

# Show all emotion probabilities sorted by likelihood

print("\n All Emotion Probabilities:")
for emotion, prob in sorted(result['all_probabilities'].items(), key=lambda x: x
[1], reverse=True):
print(f" - {emotion}: {prob:.2%}")

# Optional: Visualize result

visualize_emotion_prediction(tweet, result)

# Run the interactive tool - uncomment to use

# interactive_emotion_predictor()
Experiment 12: Introduction to Deep Learning using Keras and
TensorFlow
In [1]:

# Import necessary libraries

import tensorflow as tf
from tensorflow import keras
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Check versions
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")
print(f"NumPy version: {np.__version__}")
print(f"Pandas version: {pd.__version__}")

TensorFlow version: 2.18.0

Keras version: 3.8.0
NumPy version: 2.0.2
Pandas version: 2.2.2

In [2]:
# Set random seeds for reproducibility
np.random.seed(42)
tf.random.set_seed(42)

In [3]:
# Simple sequential model
model_sequential = keras.Sequential([
keras.layers.Dense(64, activation='relu', input_shape=(784,)),
keras.layers.Dense(32, activation='relu'),
keras.layers.Dense(10, activation='softmax')
])

# Compile the model

model_sequential.compile(
optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)

# Model summary
model_sequential.summary()

Model: "sequential"

Trainable params: 52,650 (205.66 KB)

Non-trainable params: 0 (0.00 B)

In [4]:
# Functional API example
inputs = keras.Input(shape=(784,))
x = keras.layers.Dense(64, activation='relu')(inputs)
x = keras.layers.Dense(32, activation='relu')(x)
outputs = keras.layers.Dense(10, activation='softmax')(x)

model_functional = keras.Model(inputs=inputs, outputs=outputs)

# Compile the model

model_functional.compile(
optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)

# Model summary
model_functional.summary()

Model: "functional_1"

Total params: 52,650 (205.66 KB)

Trainable params: 52,650 (205.66 KB)

Non-trainable params: 0 (0.00 B)

In [5]:
# Subclassing example
class CustomModel(keras.Model):
def __init__(self):
super(CustomModel, self).__init__()
self.dense1 = keras.layers.Dense(64, activation='relu')
self.dense2 = keras.layers.Dense(32, activation='relu')
self.dense3 = keras.layers.Dense(10, activation='softmax')

def call(self, inputs):

x = self.dense1(inputs)
x = self.dense2(x)
return self.dense3(x)

model_subclass = CustomModel()

# Compile the model

model_subclass.compile(
optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)
# Build the model with a sample input
model_subclass.build((None, 784))
model_subclass.summary()

/usr/local/lib/python3.11/dist-packages/keras/src/layers/layer.py:393: UserWarning: `buil

d()` was called on layer 'custom_model', however the layer does not have a `build()` meth
od implemented and it looks like it has unbuilt state. This will cause the layer to be ma
rked as built, despite not being actually built, which may cause failures down the line.
Make sure to implement a proper `build()` method.
warnings.warn(

Model: "custom_model"

Total params: 0 (0.00 B)

Trainable params: 0 (0.00 B)

Non-trainable params: 0 (0.00 B)

3.4. Common Layer Types in Keras

Keras provides various layer types for different purposes:

1. Dense (Fully Connected) Layers

2. Convolutional Layers (Conv1D, Conv2D, Conv3D)
3. Pooling Layers (MaxPooling, AveragePooling)
4. Recurrent Layers (SimpleRNN, LSTM, GRU)
5. Dropout Layers
6. BatchNormalization Layers
7. Embedding Layers
8. Flatten and Reshape Layers

Let's look at a more complex model using some of these layer types:

In [6]:
# Create a more complex model with different layer types
complex_model = keras.Sequential([
# Reshape input to 28x28x1 for CNNs
keras.layers.Reshape((28, 28, 1), input_shape=(784,)),

# Convolutional layers
keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu'),
keras.layers.MaxPooling2D(pool_size=(2, 2)),
keras.layers.Conv2D(64, kernel_size=(3, 3), activation='relu'),
keras.layers.MaxPooling2D(pool_size=(2, 2)),

# Flatten the output for dense layers

keras.layers.Flatten(),

# Dense layers with dropout for regularization

keras.layers.Dense(128, activation='relu'),
keras.layers.Dropout(0.5),
keras.layers.Dense(10, activation='softmax')
])

# Compile the model

complex_model.compile(
optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)

# Model summary
complex_model.summary()

/usr/local/lib/python3.11/dist-packages/keras/src/layers/reshaping/reshape.py:39: UserWar
ning: Do not pass an ìnput_shape`/ìnput_dim` argument to a layer. When using Sequential
models, prefer using an Ìnput(shape)` object as the first layer in the model instead.
super().__init__(**kwargs)

Model: "sequential_1"

Total params: 225,034 (879.04 KB)

Trainable params: 225,034 (879.04 KB)

Non-trainable params: 0 (0.00 B)

In [7]:
# Load Fashion MNIST dataset with error handling
try:
# Attempt to load the dataset directly
(x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()
except Exception as e:
print(f"Error loading dataset directly: {e}")

# Alternative approach: use a proxy or local cache

try:
# Try using a different method with explicit path
import os
fashion_mnist_path = os.path.join(os.path.expanduser("~"), '.keras', 'datasets', 'fash
ion-mnist')
os.makedirs(fashion_mnist_path, exist_ok=True)

# If you have the dataset files locally, you can specify the path
# Otherwise, try using a different download method
print("Attempting to download using TensorFlow's get_file utility...")

from tensorflow.keras.utils import get_file

base_url = 'https://storage.googleapis.com/tensorflow/tf-keras-datasets/'
train_images_path = get_file('train-images-idx3-ubyte.gz', base_url + 'train-images-id
x3-ubyte.gz')
train_labels_path = get_file('train-labels-idx1-ubyte.gz', base_url + 'train-labels-id
x1-ubyte.gz')
test_images_path = get_file('t10k-images-idx3-ubyte.gz', base_url + 't10k-images-idx3-
ubyte.gz')
test_labels_path = get_file('t10k-labels-idx1-ubyte.gz', base_url + 't10k-labels-idx1-
ubyte.gz')

# Process the downloaded files using numpy

import gzip
import numpy as np

with gzip.open(train_images_path, 'rb') as f:

x_train = np.frombuffer(f.read(), np.uint8, offset=16).reshape(-1, 28, 28)

with gzip.open(train_labels_path, 'rb') as f:

y_train = np.frombuffer(f.read(), np.uint8, offset=8)

with gzip.open(test_images_path, 'rb') as f:

x_test = np.frombuffer(f.read(), np.uint8, offset=16).reshape(-1, 28, 28)

with gzip.open(test_labels_path, 'rb') as f:

y_test = np.frombuffer(f.read(), np.uint8, offset=8)

except Exception as e2:

print(f"Second attempt failed: {e2}")
print("Creating dummy data for demonstration purposes...")

# Create dummy data with the correct shape for demonstration

x_train = np.random.randint(0, 256, size=(60000, 28, 28), dtype=np.uint8)
y_train = np.random.randint(0, 10, size=(60000,), dtype=np.uint8)
x_test = np.random.randint(0, 256, size=(10000, 28, 28), dtype=np.uint8)
y_test = np.random.randint(0, 10, size=(10000,), dtype=np.uint8)

print("⚠️ WARNING: Using randomly generated dummy data. Real model training will not be
meaningful.")

# Check the shape of the data

print(f"Training data shape: {x_train.shape}")
print(f"Training labels shape: {y_train.shape}")
print(f"Test data shape: {x_test.shape}")
print(f"Test labels shape: {y_test.shape}")

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-l

abels-idx1-ubyte.gz
29515/29515 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-i
mages-idx3-ubyte.gz
26421880/26421880 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-la
bels-idx1-ubyte.gz
5148/5148 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-im
ages-idx3-ubyte.gz
4422102/4422102 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Training data shape: (60000, 28, 28)
Training labels shape: (60000,)
Test data shape: (10000, 28, 28)
Test labels shape: (10000,)

In [8]:
# Define class names for Fashion MNIST
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Display some sample images

plt.figure(figsize=(10, 10))
for i in range(25):
plt.subplot(5, 5, i + 1)
plt.xticks([])
plt.yticks([])
plt.grid(False)
plt.imshow(x_train[i], cmap=plt.cm.binary)
plt.xlabel(class_names[y_train[i]])
plt.tight_layout()
plt.show()

In [9]:
# Preprocess the data
# Normalize pixel values to be between 0 and 1
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Reshape data for the model

x_train_flat = x_train.reshape(x_train.shape[0], -1)
x_test_flat = x_test.reshape(x_test.shape[0], -1)

In [ ]:
# Create a simple model for Fashion MNIST
fashion_model = keras.Sequential([
keras.layers.Dense(128, activation='relu', input_shape=(784,)),
keras.layers.Dropout(0.2),
keras.layers.Dense(64, activation='relu'),
keras.layers.Dropout(0.2),
keras.layers.Dense(10, activation='softmax')
])

# Compile the model

fashion_model.compile(
optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)

# Train the model

history = fashion_model.fit(
x_train_flat, y_train,
epochs=10,
batch_size=64,
validation_split=0.2,
verbose=1,
shuffle=True, # Ensure data is shuffled before splitting
callbacks=[
# Add early stopping to prevent potential numerical issues
keras.callbacks.EarlyStopping(
monitor='val_loss',
patience=3,
restore_best_weights=True
)
]
)

In [11]:
# Evaluate the model
test_loss, test_acc = fashion_model.evaluate(x_test_flat, y_test, verbose=2)
print(f'\n Test accuracy: {test_acc:.4f}')

313/313 - 1s - 4ms/step - accuracy: 0.8783 - loss: 0.3426

Test accuracy: 0.8783

In [12]:
# Plot training history
plt.figure(figsize=(12, 4))

# Plot training & validation accuracy

plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.title('Training and Validation Accuracy')

# Plot training & validation loss

plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.title('Training and Validation Loss')

plt.tight_layout()
plt.show()
In [13]:
# Make predictions
predictions = fashion_model.predict(x_test_flat)

# Function to plot an image with its prediction

def plot_image(i, predictions_array, true_label, img):
true_label, img = true_label[i], img[i]
plt.grid(False)
plt.xticks([])
plt.yticks([])

plt.imshow(img, cmap=plt.cm.binary)

predicted_label = np.argmax(predictions_array[i])
if predicted_label == true_label:
color = 'blue'
else:
color = 'red'

plt.xlabel("{} {:2.0f} % ({} )".format(

class_names[predicted_label],
100*np.max(predictions_array[i]),
class_names[true_label]),
color=color
)

# Function to plot the prediction bars

def plot_value_array(i, predictions_array, true_label):
true_label = true_label[i]
plt.grid(False)
plt.xticks(range(10))
plt.yticks([])
thisplot = plt.bar(range(10), predictions_array[i], color="#777777")
plt.ylim([0, 1])
predicted_label = np.argmax(predictions_array[i])

thisplot[predicted_label].set_color('red')
thisplot[true_label].set_color('blue')

# Plot the first X test images, their predicted labels, and the true labels
num_rows = 5
num_cols = 3
num_images = num_rows*num_cols
plt.figure(figsize=(2*2*num_cols, 2*num_rows))
for i in range(num_images):
plt.subplot(num_rows, 2*num_cols, 2*i+1)
plot_image(i, predictions, y_test, x_test)
plt.subplot(num_rows, 2*num_cols, 2*i+2)
plot_value_array(i, predictions, y_test)
plt.tight_layout()
plt.show()

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step

In [14]:
# Plot model architecture
try:
from tensorflow.keras.utils import plot_model

# Make sure pydot and graphviz are installed

plot_model(fashion_model, to_file='fashion_model.png', show_shapes=True, show_layer_
names=True)
from IPython.display import Image
Image('fashion_model.png')
except Exception as e:
print(f"To visualize the model, you need to install pydot and graphviz.")
print(f"Error: {e}")

In [15]:
# Reshape data for CNN
x_train_cnn = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test_cnn = x_test.reshape(x_test.shape[0], 28, 28, 1)

In [16]:
# Create a CNN model
cnn_model = keras.Sequential([
keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
keras.layers.MaxPooling2D((2, 2)),
keras.layers.Conv2D(64, (3, 3), activation='relu'),
keras.layers.MaxPooling2D((2, 2)),
keras.layers.Flatten(),
keras.layers.Dense(128, activation='relu'),
keras.layers.Dropout(0.5),
keras.layers.Dense(10, activation='softmax')
])

cnn_model.compile(
optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)

cnn_model.summary()

Model: "sequential_3"

Total params: 225,034 (879.04 KB)

Trainable params: 225,034 (879.04 KB)

Non-trainable params: 0 (0.00 B)

In [ ]:
# Train the CNN model with early stopping and learning rate reduction
early_stopping = keras.callbacks.EarlyStopping(
monitor='val_loss',
patience=3,
restore_best_weights=True
)

lr_scheduler = keras.callbacks.ReduceLROnPlateau(
monitor='val_loss',
factor=0.5,
patience=2
)

cnn_history = cnn_model.fit(
x_train_cnn, y_train,
epochs=15,
batch_size=64,
validation_split=0.2,
callbacks=[early_stopping, lr_scheduler],
verbose=1
)

In [18]:
# Evaluate the CNN model
cnn_test_loss, cnn_test_acc = cnn_model.evaluate(x_test_cnn, y_test, verbose=2)
print(f'\n CNN Test accuracy: {cnn_test_acc:.4f}')

313/313 - 1s - 4ms/step - accuracy: 0.9096 - loss: 0.2638

CNN Test accuracy: 0.9096

In [19]:

# Compare the performance of MLP and CNN

plt.figure(figsize=(10, 5))
plt.bar(['MLP Model', 'CNN Model'], [test_acc, cnn_test_acc], color=['blue', 'green'])
plt.title('Accuracy Comparison: MLP vs CNN')
plt.ylabel('Test Accuracy')
plt.ylim([0, 1])
for i, v in enumerate([test_acc, cnn_test_acc]):
plt.text(i, v + 0.01, f"{v:.4f} ", ha='center')
plt.show()

In [21]:

# Example of Transfer Learning with MobileNetV2

# Note: Fashion MNIST images are grayscale and small (28x28),
# but this is for demonstration of the approach

# Function to preprocess images for MobileNetV2

def preprocess_images_for_mobilenet(images, target_size=(96, 96)):
# Create an array to hold the preprocessed images
# The shape will be (batch_size, target_height, target_width, 3) for RGB
preprocessed = np.zeros((images.shape[0], *target_size, 3))

for i, img in enumerate(images):

# Add channel dimension to grayscale image (28x28 -> 28x28x1)
img_with_channel = tf.expand_dims(img, -1)
# Resize the 3D grayscale image to the target size (96x96x1)
resized = tf.image.resize(img_with_channel, target_size)
# Convert the resized grayscale image (now 96x96x1) to RGB (96x96x3)
preprocessed[i] = tf.image.grayscale_to_rgb(resized)

return preprocessed

# Prepare a small subset for demonstration

sample_size = 100 # Small sample for demonstration
x_sample = preprocess_images_for_mobilenet(x_train[:sample_size])
y_sample = y_train[:sample_size]

# Load pre-trained MobileNetV2

# Ensure input_shape matches the target_size and channels used in preprocessing
base_model = keras.applications.MobileNetV2(
input_shape=(96, 96, 3),
include_top=False,
weights='imagenet'
)

# Freeze the base model

base_model.trainable = False

# Create a new model on top

inputs = keras.Input(shape=(96, 96, 3))
x = base_model(inputs, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
x = keras.layers.Dense(128, activation='relu')(x)
x = keras.layers.Dropout(0.5)(x)
outputs = keras.layers.Dense(10, activation='softmax')(x)
transfer_model = keras.Model(inputs, outputs)

# Compile the model

transfer_model.compile(
optimizer=keras.optimizers.Adam(1e-4),
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)

print("Transfer Learning Model:")

transfer_model.summary()

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobile

net_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_96_no_top.h5
9406464/9406464 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Transfer Learning Model:

Model: "functional_5"

Total params: 2,423,242 (9.24 MB)

Trainable params: 165,258 (645.54 KB)

Non-trainable params: 2,257,984 (8.61 MB)

In [24]:

# Save the CNN model (SavedModel format - recommended)

# Use .keras extension for the native Keras format
cnn_model.save('fashion_mnist_cnn.keras')

# Save only the weights

# Changed filepath to end with .weights.h5
cnn_model.save_weights('fashion_mnist_cnn_weights.weights.h5')

# Save in HDF5 format

cnn_model.save('fashion_mnist_cnn.h5')

# Load a model (specify the path to the .keras file)

loaded_model = keras.models.load_model('fashion_mnist_cnn.keras')

# Verify it works
loaded_model_loss, loaded_model_acc = loaded_model.evaluate(x_test_cnn, y_test, verbose=
2)
print(f'\n Loaded model test accuracy: {loaded_model_acc:.4f} ')

WARNING:absl:You are saving your model as an HDF5 file via `model.save()` or `keras.savin
g.save_model(model)`. This file format is considered legacy. We recommend using instead t
he native Keras format, e.g. `model.save('my_model.keras')` or `keras.saving.save_model(m
odel, 'my_model.keras')`.

313/313 - 1s - 4ms/step - accuracy: 0.9096 - loss: 0.2638

Loaded model test accuracy: 0.9096

In [ ]:

# Convert to TensorFlow Lite

converter = tf.lite.TFLiteConverter.from_keras_model(cnn_model)
tflite_model = converter.convert()

# Save the TF Lite model

with open('fashion_mnist_model.tflite', 'wb') as f:
f.write(tflite_model)

print(f"TFLite model size: {len(tflite_model) / 1024:.2f} KB")

Experiment 13: ANN for Customer Churn Prediction (use Telco
Churn Dataset or sample CSV)
In [1]:

# necessary imports

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import warnings
warnings.filterwarnings('ignore')

plt.style.use('fivethirtyeight')
%matplotlib inline

In [ ]:
df = pd.read_csv(r'C:\Users\Rishi\Churn_Modelling.csv')
df.head()
Out[ ]:

RowNumber CustomerId Surname CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsAc

0 1 15634602 Hargrave 619 France Female 42 2 0.00 1 1

1 2 15647311 Hill 608 Spain Female 41 1 83807.86 1 0

2 3 15619304 Onio 502 France Female 42 8 159660.80 3 1

3 4 15701354 Boni 699 France Female 39 1 0.00 2 0

4 5 15737888 Mitchell 850 Spain Female 43 2 125510.82 1 1

In [3]:
df.describe()

Out[3]:

RowNumber CustomerId CreditScore Age Tenure Balance NumOfProducts HasCrCard IsActive

count 10000.00000 1.000000e+04 10000.000000 10000.000000 10000.000000 10000.000000 10000.000000 10000.00000 10000

mean 5000.50000 1.569094e+07 650.528800 38.921800 5.012800 76485.889288 1.530200 0.70550

std 2886.89568 7.193619e+04 96.653299 10.487806 2.892174 62397.405202 0.581654 0.45584

min 1.00000 1.556570e+07 350.000000 18.000000 0.000000 0.000000 1.000000 0.00000

25% 2500.75000 1.562853e+07 584.000000 32.000000 3.000000 0.000000 1.000000 0.00000

50% 5000.50000 1.569074e+07 652.000000 37.000000 5.000000 97198.540000 1.000000 1.00000

75% 7500.25000 1.575323e+07 718.000000 44.000000 7.000000 127644.240000 2.000000 1.00000

max 10000.00000 1.581569e+07 850.000000 92.000000 10.000000 250898.090000 4.000000 1.00000

In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 RowNumber 10000 non-null int64
1 CustomerId 10000 non-null int64
2 Surname 10000 non-null object
3 CreditScore 10000 non-null int64
4 Geography 10000 non-null object
5 Gender 10000 non-null object
6 Age 10000 non-null int64
7 Tenure 10000 non-null int64
8 Balance 10000 non-null float64
9 NumOfProducts 10000 non-null int64
10 HasCrCard 10000 non-null int64
11 IsActiveMember 10000 non-null int64
12 EstimatedSalary 10000 non-null float64
13 Exited 10000 non-null int64
dtypes: float64(2), int64(9), object(3)
memory usage: 1.1+ MB

In [ ]:
# checking for null values

df.isna().sum() # no null values

In [6]:
values = df.Exited.value_counts()
labels = ['Not Exited', 'Exited']

fig, ax = plt.subplots(figsize = (4, 3), dpi = 100)

explode = (0, 0.09)

patches, texts, autotexts = ax.pie(values, labels = labels, autopct = '%1.2f%% ', shadow
= True,
startangle = 90, explode = explode)

plt.setp(texts, color = 'grey')

plt.setp(autotexts, size = 8, color = 'white')
autotexts[1].set_color('black')
plt.show()

In [7]:
# visualizing categorical variables

fig, ax = plt.subplots(3, 2, figsize = (18, 15))

sns.countplot('Geography', hue = 'Exited', data = df, ax = ax[0][0])

sns.countplot('Gender', hue = 'Exited', data = df, ax = ax[0][1])
sns.countplot('Tenure', hue = 'Exited', data = df, ax = ax[1][0])
sns.countplot('NumOfProducts', hue = 'Exited', data = df, ax = ax[1][1])
sns.countplot('HasCrCard', hue = 'Exited', data = df, ax = ax[2][0])
sns.countplot('IsActiveMember', hue = 'Exited', data = df, ax = ax[2][1])

plt.tight_layout()
plt.show()

In [8]:
# visualizing continuous variables

fig, ax = plt.subplots(2, 2, figsize = (16, 10))

sns.boxplot(x = 'Exited', y = 'CreditScore', data = df, ax = ax[0][0])

sns.boxplot(x = 'Exited', y = 'Age', data = df, ax = ax[0][1])
sns.boxplot(x = 'Exited', y = 'Balance', data = df, ax = ax[1][0])
sns.boxplot(x = 'Exited', y = 'EstimatedSalary', data = df, ax = ax[1][1])

plt.tight_layout()
plt.show()
In [9]:
# heatmap

plt.figure(figsize = (20, 12))

corr = df.corr()

sns.heatmap(corr, linewidths = 1, annot = True, fmt = ".2f")

plt.show()

In [10]:

# dropping useless columns

df.drop(columns = ['RowNumber', 'CustomerId', 'Surname'], axis = 1, inplace = True)

df.head()
Out[10]:

CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited

0 619 France Female 42 2 0.00 1 1 1 101348.88

1 608 Spain Female 41 1 83807.86 1 0 1 112542.58

2 502 France Female 42 8 159660.80 3 1 0 113931.57

3 699 France Female 39 1 0.00 2 0 0 93826.63

4 850 Spain Female 43 2 125510.82 1 1 1 79084.10

In [11]:
df.Geography.value_counts()
Out[11]:

France 5014
Germany 2509
Spain 2477
Name: Geography, dtype: int64

In [12]:
# Encoding categorical variables

df['Geography'] = df['Geography'].map({'France' : 0, 'Germany' : 1, 'Spain' : 2})

df['Gender'] = df['Gender'].map({'Male' : 0, 'Female' : 1})

In [13]:
df.head()
Out[13]:

CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited

0 619 0 1 42 2 0.00 1 1 1 101348.88

1 608 2 1 41 1 83807.86 1 0 1 112542.58

2 502 0 1 42 8 159660.80 3 1 0 113931.57

3 699 0 1 39 1 0.00 2 0 0 93826.63

4 850 2 1 43 2 125510.82 1 1 1 79084.10

In [14]:
# creating features and label

from tensorflow.keras.utils import to_categorical

X = df.drop('Exited', axis = 1)
y = to_categorical(df.Exited)

In [15]:
# splitting data into training set and test set

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25)

In [16]:
# Scaling data

from sklearn.preprocessing import StandardScaler

sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

In [ ]:
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import BatchNormalization
# initializing ann
model = Sequential()

# adding the first input layer and the first hidden layer
model.add(Dense(10, kernel_initializer = 'normal', activation = 'relu', input_shape = (1
0, )))

# adding batch normalization and dropout layer

model.add(Dropout(rate = 0.1))
model.add(BatchNormalization())

# adding the third hidden layer

model.add(Dense(7, kernel_initializer = 'normal', activation = 'relu'))

# adding batch normalization and dropout layer

model.add(Dropout(rate = 0.1))
model.add(BatchNormalization())

# adding the output layer

model.add(Dense(2, kernel_initializer = 'normal', activation = 'sigmoid'))

# compiling the model

model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

# fitting the model to the training set

model_history = model.fit(X_train, y_train, validation_split = 0.20, validation_data = (

X_test, y_test), epochs = 100)

In [18]:
plt.figure(figsize = (12, 6))

train_loss = model_history.history['loss']
val_loss = model_history.history['val_loss']
epoch = range(1, 101)
sns.lineplot(epoch, train_loss, label = 'Training Loss')
sns.lineplot(epoch, val_loss, label = 'Validation Loss')
plt.title('Training and Validation Loss\n ')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

In [19]:
plt.figure(figsize = (12, 6))
train_loss = model_history.history['accuracy']
val_loss = model_history.history['val_accuracy']
epoch = range(1, 101)
sns.lineplot(epoch, train_loss, label = 'Training accuracy')
sns.lineplot(epoch, val_loss, label = 'Validation accuracy')
plt.title('Training and Validation Accuracy\n ')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

In [20]:
acc = model.evaluate(X_test, y_test)[1]

print(f'Accuracy of model is {acc}')

79/79 [==============================] - 0s 872us/step - loss: 0.3380 - accuracy: 0.8656

Accuracy of model is 0.8655999898910522

In [21]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 10) 110
_________________________________________________________________
dropout (Dropout) (None, 10) 0
_________________________________________________________________
batch_normalization (BatchNo (None, 10) 40
_________________________________________________________________
dense_1 (Dense) (None, 7) 77
_________________________________________________________________
dropout_1 (Dropout) (None, 7) 0
_________________________________________________________________
batch_normalization_1 (Batch (None, 7) 28
_________________________________________________________________
dense_2 (Dense) (None, 2) 16
=================================================================
Total params: 271
Trainable params: 237
Non-trainable params: 34
_________________________________________________________________
In [22]:
from tensorflow.keras.utils import plot_model

plot_model(model, show_shapes = True)

Out[22]:
Experiment 14: Introduction to Generative Adversarial
Networks (GANs) with a simple GAN model
In [ ]:

# Import necessary libraries

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import time
import os

# Check TensorFlow version

print(f"TensorFlow version: {tf.__version__}")

# Set random seeds for reproducibility

np.random.seed(42)
tf.random.set_seed(42)

TensorFlow version: 2.18.0

In [ ]:

# Check if GPU is available

print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
print("Devices: ", tf.config.list_physical_devices())

# If GPU is available, configure for performance

if len(tf.config.list_physical_devices('GPU')) > 0:
# Allow memory growth
for gpu in tf.config.list_physical_devices('GPU'):
tf.config.experimental.set_memory_growth(gpu, True)
print("GPU memory growth enabled")

Num GPUs Available: 1

Devices: [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevi
ce(name='/physical_device:GPU:0', device_type='GPU')]
GPU memory growth enabled

In [ ]:

# Load MNIST dataset

(x_train, _), (_, _) = keras.datasets.mnist.load_data()

# Normalize the images to [-1, 1]

x_train = (x_train.astype('float32') - 127.5) / 127.5
# Add a channel dimension
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)

# Create a TensorFlow dataset

BUFFER_SIZE = 60000 # For shuffling the data
BATCH_SIZE = 256

train_dataset = tf.data.Dataset.from_tensor_slices(x_train).shuffle(BUFFER_SIZE).batch(B
ATCH_SIZE)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.n

pz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step

In [ ]:
# Visualize some samples from the dataset
plt.figure(figsize=(10, 10))
for i in range(25):
plt.subplot(5, 5, i+1)
# Rescale to [0, 1]
plt.imshow(x_train[i, :, :, 0] * 0.5 + 0.5, cmap='gray')
plt.axis('off')
plt.tight_layout()
plt.show()

In [ ]:
# Define the random noise dimension
NOISE_DIM = 100

# Build the generator model

def build_generator():
model = keras.Sequential()

# First, transform the input into a small spatial extent with many channels
model.add(layers.Dense(7 * 7 * 256, use_bias=False, input_shape=(NOISE_DIM,)))
model.add(layers.BatchNormalization())
model.add(layers.LeakyReLU(alpha=0.2))

# Reshape into a 3D tensor

model.add(layers.Reshape((7, 7, 256)))
# Upsampling layers
model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bi
as=False))
model.add(layers.BatchNormalization())
model.add(layers.LeakyReLU(alpha=0.2))

model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bia

s=False))
model.add(layers.BatchNormalization())
model.add(layers.LeakyReLU(alpha=0.2))

# Final layer with tanh activation to generate images with pixel values in [-1, 1]
model.add(layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias
=False, activation='tanh'))

return model

# Create the generator

generator = build_generator()

# Test the generator with random noise

noise = tf.random.normal([1, NOISE_DIM])
generated_image = generator(noise, training=False)

# Print model summary

generator.summary()

# Visualize a generated image

plt.figure(figsize=(4, 4))
plt.imshow(generated_image[0, :, :, 0] * 0.5 + 0.5, cmap='gray')
plt.title('Generated Image (Initial Random Weights)')
plt.axis('off')
plt.show()

/usr/local/lib/python3.11/dist-packages/keras/src/layers/core/dense.py:87: UserWarning: D
o not pass an ìnput_shape`/ìnput_dim` argument to a layer. When using Sequential models
, prefer using an Ìnput(shape)` object as the first layer in the model instead.
super().__init__(activity_regularizer=activity_regularizer, **kwargs)
/usr/local/lib/python3.11/dist-packages/keras/src/layers/activations/leaky_relu.py:41: Us
erWarning: Argument àlpha` is deprecated. Use `negative_slope` instead.
warnings.warn(

Model: "sequential"

Total params: 2,330,944 (8.89 MB)

Trainable params: 2,305,472 (8.79 MB)

Non-trainable params: 25,472 (99.50 KB)

In [ ]:
# Build the discriminator model
def build_discriminator():
model = keras.Sequential()

# First convolutional layer

model.add(layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same', input_shape=[28,
28, 1]))
model.add(layers.LeakyReLU(alpha=0.2))
model.add(layers.Dropout(0.3))

# Second convolutional layer

model.add(layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'))
model.add(layers.LeakyReLU(alpha=0.2))
model.add(layers.Dropout(0.3))

# Flatten and output layer

model.add(layers.Flatten())
model.add(layers.Dense(1))

return model

# Create the discriminator

discriminator = build_discriminator()

# Test the discriminator with a real image

decision = discriminator(tf.expand_dims(x_train[0], 0))
print(f"Discriminator output for a real image: {decision.numpy()[0][0]}")

# Print model summary

discriminator.summary()

/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107:
UserWarning: Do not pass an ìnput_shape`/ìnput_dim` argument to a layer. When using Seq
uential models, prefer using an Ìnput(shape)` object as the first layer in the model ins
uential models, prefer using an Ìnput(shape)` object as the first layer in the model ins
tead.
super().__init__(activity_regularizer=activity_regularizer, **kwargs)

Discriminator output for a real image: 0.053151972591876984

Model: "sequential_1"

Total params: 212,865 (831.50 KB)

Trainable params: 212,865 (831.50 KB)

Non-trainable params: 0 (0.00 B)

In [ ]:

# Define the loss functions

cross_entropy = keras.losses.BinaryCrossentropy(from_logits=True)

# Define the discriminator loss

def discriminator_loss(real_output, fake_output):
real_loss = cross_entropy(tf.ones_like(real_output), real_output)
fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
total_loss = real_loss + fake_loss
return total_loss

# Define the generator loss

def generator_loss(fake_output):
return cross_entropy(tf.ones_like(fake_output), fake_output)

# Define the optimizers

generator_optimizer = keras.optimizers.Adam(1e-4)
discriminator_optimizer = keras.optimizers.Adam(1e-4)

In [ ]:
# Define a function to generate and save images
def generate_and_save_images(model, epoch, test_input):
# Generate images
predictions = model(test_input, training=False)

# Plot the generated images

plt.figure(figsize=(10, 10))
for i in range(test_input.shape[0]):
plt.subplot(5, 5, i+1)
plt.imshow(predictions[i, :, :, 0] * 0.5 + 0.5, cmap='gray')
plt.axis('off')
plt.tight_layout()
plt.savefig(f'gan_epoch_{epoch:04d}.png')
plt.close()
# In a Jupyter notebook, display the latest images
if epoch % 10 == 0 or epoch == 1:
plt.figure(figsize=(10, 10))
for i in range(test_input.shape[0]):
plt.subplot(5, 5, i+1)
plt.imshow(predictions[i, :, :, 0] * 0.5 + 0.5, cmap='gray')
plt.axis('off')
plt.suptitle(f'Epoch {epoch} ')
plt.tight_layout()
plt.show()

In [ ]:
# Define the training step
@tf.function
def train_step(images):
# Generate random noise for the generator
noise = tf.random.normal([BATCH_SIZE, NOISE_DIM])

with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:

# Generate fake images
generated_images = generator(noise, training=True)

# Get discriminator outputs for real and fake images

real_output = discriminator(images, training=True)
fake_output = discriminator(generated_images, training=True)

# Calculate losses
gen_loss = generator_loss(fake_output)
disc_loss = discriminator_loss(real_output, fake_output)

# Calculate gradients
gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_va
riables)

# Apply gradients
generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_v
ariables))
discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator
.trainable_variables))

return gen_loss, disc_loss

In [ ]:
# Define the training function
def train(dataset, epochs):
# Create a fixed noise vector for visualization
seed = tf.random.normal([25, NOISE_DIM])

# Track losses
gen_losses = []
disc_losses = []

# Training loop
for epoch in range(1, epochs + 1):
start_time = time.time()

# Lists to store batch losses

batch_gen_losses = []
batch_disc_losses = []

# Train on batches
for image_batch in dataset:
gen_loss, disc_loss = train_step(image_batch)
batch_gen_losses.append(gen_loss)
batch_disc_losses.append(disc_loss)

# Calculate average losses for the epoch

avg_gen_loss = tf.reduce_mean(batch_gen_losses)
avg_disc_loss = tf.reduce_mean(batch_disc_losses)
gen_losses.append(avg_gen_loss.numpy())
disc_losses.append(avg_disc_loss.numpy())

# Print progress
print(f"Epoch {epoch} /{epochs}, "
f"Generator Loss: {avg_gen_loss:.4f}, "
f"Discriminator Loss: {avg_disc_loss:.4f}, "
f"Time: {time.time() - start_time:.2f} sec")

# Generate and save images every 10 epochs or at the end

if epoch % 10 == 0 or epoch == 1 or epoch == epochs:
generate_and_save_images(generator, epoch, seed)

# Plot the loss curves

plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.plot(range(1, epochs + 1), gen_losses, label='Generator')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Generator Loss')
plt.grid(True)

plt.subplot(1, 2, 2)
plt.plot(range(1, epochs + 1), disc_losses, label='Discriminator')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Discriminator Loss')
plt.grid(True)

plt.tight_layout()
plt.show()

return gen_losses, disc_losses

In [ ]:
# Define the number of epochs
EPOCHS = 50

# Train the GAN

print("Starting GAN training...")
gen_losses, disc_losses = train(train_dataset, EPOCHS)
print("GAN training completed!")

Starting GAN training...

Epoch 1/50, Generator Loss: 0.7640, Discriminator Loss: 1.0672, Time: 19.14 sec
Epoch 2/50, Generator Loss: 1.1070, Discriminator Loss: 0.9280, Time: 10.88 sec
Epoch 3/50, Generator Loss: 0.9801, Discriminator Loss: 1.1674, Time: 11.02 sec
Epoch 4/50, Generator Loss: 0.9755, Discriminator Loss: 1.1712, Time: 11.22 sec
Epoch 5/50, Generator Loss: 0.8822, Discriminator Loss: 1.2903, Time: 11.18 sec
Epoch 6/50, Generator Loss: 0.9582, Discriminator Loss: 1.1504, Time: 11.28 sec
Epoch 7/50, Generator Loss: 0.9361, Discriminator Loss: 1.2384, Time: 11.30 sec
Epoch 8/50, Generator Loss: 0.9839, Discriminator Loss: 1.1471, Time: 11.37 sec
Epoch 9/50, Generator Loss: 1.0402, Discriminator Loss: 1.1383, Time: 11.45 sec
Epoch 10/50, Generator Loss: 1.0986, Discriminator Loss: 1.1012, Time: 11.53 sec
Epoch 11/50, Generator Loss: 1.0813, Discriminator Loss: 1.1047, Time: 11.59 sec
Epoch 12/50, Generator Loss: 1.0634, Discriminator Loss: 1.1182, Time: 11.64 sec
Epoch 13/50, Generator Loss: 1.2573, Discriminator Loss: 1.0079, Time: 11.70 sec
Epoch 14/50, Generator Loss: 1.1528, Discriminator Loss: 1.0650, Time: 11.77 sec
Epoch 15/50, Generator Loss: 1.2226, Discriminator Loss: 0.9991, Time: 11.80 sec
Epoch 16/50, Generator Loss: 1.2214, Discriminator Loss: 1.0136, Time: 11.86 sec
Epoch 17/50, Generator Loss: 1.1788, Discriminator Loss: 1.0673, Time: 11.92 sec
Epoch 18/50, Generator Loss: 1.1950, Discriminator Loss: 1.0888, Time: 11.90 sec
Epoch 19/50, Generator Loss: 1.2432, Discriminator Loss: 1.0234, Time: 11.91 sec
Epoch 20/50, Generator Loss: 1.1656, Discriminator Loss: 1.1076, Time: 11.97 sec

Epoch 21/50, Generator Loss: 1.1547, Discriminator Loss: 1.0748, Time: 12.01 sec
Epoch 22/50, Generator Loss: 1.0937, Discriminator Loss: 1.1145, Time: 12.05 sec
Epoch 23/50, Generator Loss: 1.1250, Discriminator Loss: 1.1247, Time: 12.06 sec
Epoch 24/50, Generator Loss: 1.0968, Discriminator Loss: 1.1338, Time: 12.09 sec
Epoch 25/50, Generator Loss: 1.0754, Discriminator Loss: 1.1494, Time: 12.14 sec
Epoch 25/50, Generator Loss: 1.0754, Discriminator Loss: 1.1494, Time: 12.14 sec
Epoch 26/50, Generator Loss: 1.0967, Discriminator Loss: 1.1201, Time: 12.14 sec
Epoch 27/50, Generator Loss: 1.0493, Discriminator Loss: 1.1688, Time: 12.10 sec
Epoch 28/50, Generator Loss: 1.0357, Discriminator Loss: 1.1652, Time: 12.09 sec
Epoch 29/50, Generator Loss: 1.0498, Discriminator Loss: 1.1800, Time: 12.08 sec
Epoch 30/50, Generator Loss: 1.0252, Discriminator Loss: 1.1832, Time: 12.08 sec

Epoch 31/50, Generator Loss: 1.0058, Discriminator Loss: 1.1969, Time: 12.10 sec
Epoch 32/50, Generator Loss: 1.0144, Discriminator Loss: 1.1886, Time: 12.16 sec
Epoch 33/50, Generator Loss: 0.9977, Discriminator Loss: 1.1890, Time: 12.19 sec
Epoch 34/50, Generator Loss: 0.9557, Discriminator Loss: 1.2099, Time: 12.20 sec
Epoch 35/50, Generator Loss: 0.9276, Discriminator Loss: 1.2330, Time: 12.20 sec
Epoch 36/50, Generator Loss: 0.9567, Discriminator Loss: 1.2123, Time: 12.18 sec
Epoch 37/50, Generator Loss: 0.9403, Discriminator Loss: 1.2133, Time: 12.17 sec
Epoch 38/50, Generator Loss: 0.9853, Discriminator Loss: 1.2095, Time: 12.18 sec
Epoch 39/50, Generator Loss: 0.9811, Discriminator Loss: 1.2113, Time: 12.20 sec
Epoch 40/50, Generator Loss: 0.9665, Discriminator Loss: 1.2082, Time: 12.20 sec
Epoch 41/50, Generator Loss: 0.9887, Discriminator Loss: 1.1944, Time: 12.14 sec
Epoch 42/50, Generator Loss: 0.9751, Discriminator Loss: 1.2128, Time: 12.14 sec
Epoch 43/50, Generator Loss: 0.9558, Discriminator Loss: 1.2182, Time: 12.15 sec
Epoch 44/50, Generator Loss: 0.9504, Discriminator Loss: 1.2224, Time: 12.14 sec
Epoch 45/50, Generator Loss: 0.9616, Discriminator Loss: 1.2101, Time: 12.17 sec
Epoch 46/50, Generator Loss: 0.9443, Discriminator Loss: 1.2246, Time: 12.16 sec
Epoch 47/50, Generator Loss: 0.9527, Discriminator Loss: 1.2142, Time: 12.15 sec
Epoch 48/50, Generator Loss: 0.9618, Discriminator Loss: 1.2021, Time: 12.14 sec
Epoch 49/50, Generator Loss: 0.9717, Discriminator Loss: 1.2104, Time: 12.11 sec
Epoch 50/50, Generator Loss: 0.9633, Discriminator Loss: 1.2100, Time: 12.11 sec
GAN training completed!

In [ ]:

# Generate a larger set of images

num_examples = 100
random_noise = tf.random.normal([num_examples, NOISE_DIM])
generated_images = generator(random_noise, training=False)

# Convert to appropriate range for visualization

generated_images = generated_images * 0.5 + 0.5

# Display a grid of generated images

rows = 10
cols = 10
fig, axes = plt.subplots(rows, cols, figsize=(15, 15))
fig.suptitle("Generated Digits", fontsize=20)

for i, ax in enumerate(axes.flatten()):
if i < num_examples:
ax.imshow(generated_images[i, :, :, 0], cmap='gray')
ax.axis('off')

plt.tight_layout()
plt.subplots_adjust(top=0.95)
plt.show()
In [ ]:
# Create an animation of the generation process
# We can use the saved images from different epochs
try:
import imageio
from IPython.display import display, HTML
import glob

# Get all the saved image files

filenames = sorted(glob.glob('gan_epoch_*.png'))

# Create a GIF animation

with imageio.get_writer('gan_training.gif', mode='I', duration=0.5) as writer:
for filename in filenames:
image = imageio.imread(filename)
writer.append_data(image)

# Display the animation in the notebook

print("GAN Training Animation:")
with open('gan_training.gif', 'rb') as f:
display(HTML(f'<img src="data:image/gif;base64,{imageio.v2.imread(f).tobytes().he
x()}">'))
except Exception as e:
print(f"Could not create animation: {e}")
print("To create an animation, install imageio package.")

In [ ]:

# Create two random noise vectors

start_vector = tf.random.normal([1, NOISE_DIM])
end_vector = tf.random.normal([1, NOISE_DIM])

# Generate images for interpolated vectors

num_steps = 10
alpha_values = np.linspace(0, 1, num_steps)
interpolated_images = []

for alpha in alpha_values:

# Linear interpolation between start and end vectors
interpolated_vector = start_vector * (1 - alpha) + end_vector * alpha
# Generate an image
interpolated_image = generator(interpolated_vector, training=False)
interpolated_images.append(interpolated_image[0] * 0.5 + 0.5)

# Display the interpolated images

plt.figure(figsize=(15, 3))
for i, img in enumerate(interpolated_images):
plt.subplot(1, num_steps, i+1)
plt.imshow(img[:, :, 0], cmap='gray')
plt.axis('off')
plt.suptitle("Latent Space Interpolation", fontsize=16)
plt.tight_layout()
plt.show()

In [ ]:
# Load MNIST dataset with labels
(x_train, y_train), (_, _) = keras.datasets.mnist.load_data()

# Normalize the images to [-1, 1]

x_train = (x_train.astype('float32') - 127.5) / 127.5
# Add a channel dimension
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)

# Create a TensorFlow dataset with both images and labels

train_dataset_with_labels = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset_with_labels = train_dataset_with_labels.shuffle(BUFFER_SIZE).batch(BATCH_SI
ZE)

In [ ]:

# Build the conditional generator model

def build_conditional_generator():
# Input for noise vector
noise_input = layers.Input(shape=(NOISE_DIM,))

# Input for label (condition)

label_input = layers.Input(shape=(1,))

# Embedding layer for the label

label_embedding = layers.Embedding(10, 50)(label_input)
label_embedding = layers.Flatten()(label_embedding)

# Combine noise and label

combined_input = layers.Concatenate()([noise_input, label_embedding])
# First dense layer
x = layers.Dense(7 * 7 * 256, use_bias=False)(combined_input)
x = layers.BatchNormalization()(x)
x = layers.LeakyReLU(alpha=0.2)(x)

# Reshape into 3D tensor

x = layers.Reshape((7, 7, 256))(x)

# Upsampling layers
x = layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=Fal
se )(x)
x = layers.BatchNormalization()(x)
x = layers.LeakyReLU(alpha=0.2)(x)

x = layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=Fals

e)(x)
x = layers.BatchNormalization()(x)
x = layers.LeakyReLU(alpha=0.2)(x)

# Final layer with tanh activation

output = layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=
False, activation='tanh')(x)

model = keras.Model([noise_input, label_input], output)

return model

# Build the conditional discriminator model

def build_conditional_discriminator():
# Input for image
image_input = layers.Input(shape=(28, 28, 1))

# Input for label (condition)

label_input = layers.Input(shape=(1,))

# Embedding layer for the label

label_embedding = layers.Embedding(10, 50)(label_input)
label_embedding = layers.Flatten()(label_embedding)

# Reshape label for concatenation

label_embedding = layers.Dense(28 * 28)(label_embedding)
label_embedding = layers.Reshape((28, 28, 1))(label_embedding)

# Concatenate image and label

combined_input = layers.Concatenate()([image_input, label_embedding])

# Convolutional layers
x = layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same')(combined_input)
x = layers.LeakyReLU(alpha=0.2)(x)
x = layers.Dropout(0.3)(x)

x = layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same')(x)

x = layers.LeakyReLU(alpha=0.2)(x)
x = layers.Dropout(0.3)(x)

# Flatten and output layer

x = layers.Flatten()(x)
output = layers.Dense(1)(x)

model = keras.Model([image_input, label_input], output)

return model

# Create the conditional models

conditional_generator = build_conditional_generator()
conditional_discriminator = build_conditional_discriminator()

# Print model summaries

print("Conditional Generator:")
conditional_generator.summary()
print("\n Conditional Discriminator:")
conditional_discriminator.summary()

Conditional Generator:
Model: "functional_19"

┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃ Connected to ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ input_layer_3 │ (None, 1) │ 0 │ - │
│ (InputLayer) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ embedding │ (None, 1, 50) │ 500 │ input_layer_3[0]… │
│ (Embedding) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ input_layer_2 │ (None, 100) │ 0 │ - │
│ (InputLayer) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ flatten_1 (Flatten) │ (None, 50) │ 0 │ embedding[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ concatenate │ (None, 150) │ 0 │ input_layer_2[0]… │
│ (Concatenate) │ │ │ flatten_1[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ dense_2 (Dense) │ (None, 12544) │ 1,881,600 │ concatenate[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ batch_normalizatio… │ (None, 12544) │ 50,176 │ dense_2[0][0] │
│ (BatchNormalizatio… │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ leaky_re_lu_5 │ (None, 12544) │ 0 │ batch_normalizat… │
│ (LeakyReLU) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ reshape_1 (Reshape) │ (None, 7, 7, 256) │ 0 │ leaky_re_lu_5[0]… │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_transpose_3 │ (None, 7, 7, 128) │ 819,200 │ reshape_1[0][0] │
│ (Conv2DTranspose) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ batch_normalizatio… │ (None, 7, 7, 128) │ 512 │ conv2d_transpose… │
│ (BatchNormalizatio… │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ leaky_re_lu_6 │ (None, 7, 7, 128) │ 0 │ batch_normalizat… │
│ (LeakyReLU) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_transpose_4 │ (None, 14, 14, │ 204,800 │ leaky_re_lu_6[0]… │
│ (Conv2DTranspose) │ 64) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ batch_normalizatio… │ (None, 14, 14, │ 256 │ conv2d_transpose… │
│ (BatchNormalizatio… │ 64) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ leaky_re_lu_7 │ (None, 14, 14, │ 0 │ batch_normalizat… │
│ (LeakyReLU) │ 64) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_transpose_5 │ (None, 28, 28, 1) │ 1,600 │ leaky_re_lu_7[0]… │
│ (Conv2DTranspose) │ │ │ │
└─────────────────────┴───────────────────┴────────────┴───────────────────┘

Total params: 2,958,644 (11.29 MB)

Trainable params: 2,933,172 (11.19 MB)

Non-trainable params: 25,472 (99.50 KB)

Conditional Discriminator:

Model: "functional_20"

┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃ Connected to ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ input_layer_5 │ (None, 1) │ 0 │ - │
│ (InputLayer) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ embedding_1 │ (None, 1, 50) │ 500 │ input_layer_5[0]… │
│ (Embedding) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ flatten_2 (Flatten) │ (None, 50) │ 0 │ embedding_1[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ dense_3 (Dense) │ (None, 784) │ 39,984 │ flatten_2[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ input_layer_4 │ (None, 28, 28, 1) │ 0 │ - │
│ (InputLayer) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ reshape_2 (Reshape) │ (None, 28, 28, 1) │ 0 │ dense_3[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ concatenate_1 │ (None, 28, 28, 2) │ 0 │ input_layer_4[0]… │
│ (Concatenate) │ │ │ reshape_2[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_2 (Conv2D) │ (None, 14, 14, │ 3,264 │ concatenate_1[0]… │
│ │ 64) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ leaky_re_lu_8 │ (None, 14, 14, │ 0 │ conv2d_2[0][0] │
│ (LeakyReLU) │ 64) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ dropout_2 (Dropout) │ (None, 14, 14, │ 0 │ leaky_re_lu_8[0]… │
│ │ 64) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_3 (Conv2D) │ (None, 7, 7, 128) │ 204,928 │ dropout_2[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ leaky_re_lu_9 │ (None, 7, 7, 128) │ 0 │ conv2d_3[0][0] │
│ (LeakyReLU) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ dropout_3 (Dropout) │ (None, 7, 7, 128) │ 0 │ leaky_re_lu_9[0]… │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ flatten_3 (Flatten) │ (None, 6272) │ 0 │ dropout_3[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ dense_4 (Dense) │ (None, 1) │ 6,273 │ flatten_3[0][0] │
└─────────────────────┴───────────────────┴────────────┴───────────────────┘

Total params: 254,949 (995.89 KB)

Trainable params: 254,949 (995.89 KB)

Non-trainable params: 0 (0.00 B)

In [ ]:

# Define the training step for conditional GAN

@tf.function
def train_conditional_step(images, labels):
# Generate random noise for the generator
batch_size = tf.shape(images)[0]
noise = tf.random.normal([batch_size, NOISE_DIM])

with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:

# Generate fake images
generated_images = conditional_generator([noise, labels], training=True)

# Get discriminator outputs for real and fake images

real_output = conditional_discriminator([images, labels], training=True)
fake_output = conditional_discriminator([generated_images, labels], training=Tru
e)

# Calculate losses
gen_loss = generator_loss(fake_output)
disc_loss = discriminator_loss(real_output, fake_output)

# Calculate gradients
gradients_of_generator = gen_tape.gradient(gen_loss, conditional_generator.trainable_
variables)
gradients_of_discriminator = disc_tape.gradient(disc_loss, conditional_discriminator.
trainable_variables)

# Apply gradients
generator_optimizer.apply_gradients(zip(gradients_of_generator, conditional_generator
.trainable_variables))
discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, conditional_d
iscriminator.trainable_variables))

return gen_loss, disc_loss

In [ ]:

# Generate a grid of digits from 0 to 9

def generate_digits():
# Generate 10 examples of each digit
rows = 10 # Digits 0-9
cols = 10 # Examples per digit
noise = tf.random.normal([rows * cols, NOISE_DIM])

# Create labels (digits 0-9, 10 examples each)

labels = np.array([digit for digit in range(rows) for _ in range(cols)])
labels = tf.convert_to_tensor(labels, dtype=tf.int32)

# Generate images
generated_images = conditional_generator([noise, labels], training=False)
generated_images = generated_images * 0.5 + 0.5 # Convert to [0, 1] range

# Plot the generated images

plt.figure(figsize=(15, 15))
for i in range(rows * cols):
plt.subplot(rows, cols, i + 1)
plt.imshow(generated_images[i, :, :, 0], cmap='gray')
plt.title(f"Digit: {labels[i].numpy()}")
plt.axis('off')
plt.tight_layout()
plt.show()

In [ ]:
# Define the training function
def train(dataset, epochs):
# Create a fixed noise vector for visualization
seed = tf.random.normal([25, NOISE_DIM])

# Track losses
gen_losses = []
disc_losses = []

# Re-initialize optimizers inside the training loop to ensure they are built with the
correct variables
# This can sometimes help with tf.function related issues where variables are not pro
perly registered
# with the optimizer in graph mode.
conditional_generator_optimizer = keras.optimizers.Adam(1e-4)
conditional_discriminator_optimizer = keras.optimizers.Adam(1e-4)

# Training loop
for epoch in range(1, epochs + 1):
start_time = time.time()

# Lists to store batch losses

batch_gen_losses = []
batch_disc_losses = []

# Train on batches
for image_batch, label_batch in dataset:
# Pass the correct optimizers to the train step
gen_loss, disc_loss = train_conditional_step(
image_batch,
label_batch,
conditional_generator_optimizer,
conditional_discriminator_optimizer
)
batch_gen_losses.append(gen_loss)
batch_disc_losses.append(disc_loss)

# Calculate average losses for the epoch

avg_gen_loss = tf.reduce_mean(batch_gen_losses)
avg_disc_loss = tf.reduce_mean(batch_disc_losses)
# Use .numpy() only when eager execution is guaranteed (outside tf.function)
gen_losses.append(avg_gen_loss.numpy())
disc_losses.append(avg_disc_loss.numpy())

# Print progress
print(f"Epoch {epoch} /{epochs}, "
f"Generator Loss: {avg_gen_loss:.4f}, "
f"Discriminator Loss: {avg_disc_loss:.4f}, "
f"Time: {time.time() - start_time:.2f} sec")

# Generate and save images every 10 epochs or at the end

if epoch % 5 == 0 or epoch == 1 or epoch == epochs: # Changed from 10 to 5 for C
GAN_EPOCHS=10
generate_digits() # generate_digits uses conditional_generator which is glob
al

# Plot the loss curves

plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.plot(range(1, epochs + 1), gen_losses, label='Generator')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Generator Loss')
plt.grid(True)

plt.subplot(1, 2, 2)
plt.plot(range(1, epochs + 1), disc_losses, label='Discriminator')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Discriminator Loss')
plt.grid(True)

plt.tight_layout()
plt.show()

return gen_losses, disc_losses

# Define the training step for conditional GAN

@tf.function
def train_conditional_step(images, labels, generator_optimizer, discriminator_optimizer):
# Generate random noise for the generator
batch_size = tf.shape(images)[0]
noise = tf.random.normal([batch_size, NOISE_DIM])

with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:

# Generate fake images
generated_images = conditional_generator([noise, labels], training=True)

# Get discriminator outputs for real and fake images

real_output = conditional_discriminator([images, labels], training=True)
fake_output = conditional_discriminator([generated_images, labels], training=Tru
e)

# Calculate losses
gen_loss = generator_loss(fake_output)
disc_loss = discriminator_loss(real_output, fake_output)

# Apply gradients
# Pass the gradients and variables as lists or tuples explicitly
generator_optimizer.apply_gradients(zip(gradients_of_generator, conditional_generator
.trainable_variables))
discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, conditional_d
iscriminator.trainable_variables))

return gen_loss, disc_loss

# Train the conditional GAN for a few epochs

# Note: This is a simplified training loop for demonstration purposes
CGAN_EPOCHS = 10 # Reduced for demonstration, increase for better results

print("Training Conditional GAN...")

# The train function will now handle the optimizers internally
gen_losses, disc_losses = train(train_dataset_with_labels, CGAN_EPOCHS)
print("Conditional GAN training completed!")

Training Conditional GAN...

Epoch 1/10, Generator Loss: 1.1183, Discriminator Loss: 1.0813, Time: 16.59 sec
Epoch 2/10, Generator Loss: 1.1721, Discriminator Loss: 1.0404, Time: 11.76 sec
Epoch 3/10, Generator Loss: 1.1661, Discriminator Loss: 1.0226, Time: 11.79 sec
Epoch 4/10, Generator Loss: 1.0551, Discriminator Loss: 1.1580, Time: 11.87 sec
Epoch 5/10, Generator Loss: 1.0769, Discriminator Loss: 1.1099, Time: 11.92 sec

Epoch 6/10, Generator Loss: 1.1119, Discriminator Loss: 1.1047, Time: 12.13 sec
Epoch 7/10, Generator Loss: 1.0590, Discriminator Loss: 1.1147, Time: 12.11 sec
Epoch 8/10, Generator Loss: 1.0267, Discriminator Loss: 1.1591, Time: 12.19 sec
Epoch 9/10, Generator Loss: 1.1146, Discriminator Loss: 1.0851, Time: 12.28 sec
Epoch 10/10, Generator Loss: 1.1838, Discriminator Loss: 1.0617, Time: 12.28 sec
Conditional GAN training completed!

Deep LearningLAB FILE KCS751A
No ratings yet
Deep LearningLAB FILE KCS751A
16 pages
C3 W3
No ratings yet
C3 W3
59 pages
LA Lab
No ratings yet
LA Lab
4 pages
Ej Stanford Dog Densenet
No ratings yet
Ej Stanford Dog Densenet
6 pages
Deep Learning Lab Assignment
No ratings yet
Deep Learning Lab Assignment
6 pages
Deep Learning Notebook
No ratings yet
Deep Learning Notebook
7 pages
Deep Learning Project
No ratings yet
Deep Learning Project
6 pages
Robotika Cerdas M1 Savana Naurizka
No ratings yet
Robotika Cerdas M1 Savana Naurizka
3 pages
Stanford Dogs EfficientNetV2S
No ratings yet
Stanford Dogs EfficientNetV2S
7 pages
AI Coding - Ipynb - Colab
No ratings yet
AI Coding - Ipynb - Colab
6 pages
Assignment10 4
100% (1)
Assignment10 4
3 pages
Integer-Encoding-Simplernn - Ipynb - Colaboratory
No ratings yet
Integer-Encoding-Simplernn - Ipynb - Colaboratory
4 pages
Assignment No 2
No ratings yet
Assignment No 2
3 pages
Tanveer Younas B22F0047AI067
No ratings yet
Tanveer Younas B22F0047AI067
3 pages
CV Exp - 9 - Colab
No ratings yet
CV Exp - 9 - Colab
5 pages
Mardion
No ratings yet
Mardion
16 pages
Import As Import As Import As Import As From Import From Import From Import From Import From Import From Import From Import From Import From Import
No ratings yet
Import As Import As Import As Import As From Import From Import From Import From Import From Import From Import From Import From Import From Import
8 pages
Movie Review Classification
No ratings yet
Movie Review Classification
5 pages
DL LAB Expt 11 (Add On)
No ratings yet
DL LAB Expt 11 (Add On)
9 pages
ccs355 Lab Manual
No ratings yet
ccs355 Lab Manual
24 pages
M3 Nanda Salsabila 11521002 4PA42
No ratings yet
M3 Nanda Salsabila 11521002 4PA42
7 pages
M3 - Aliyah Muthi Lathifah - 10521097 - 4PA21
No ratings yet
M3 - Aliyah Muthi Lathifah - 10521097 - 4PA21
4 pages
Regularization For Neural Network
No ratings yet
Regularization For Neural Network
37 pages
M4 - Aliyah Muthi Lathifah - 10521097 - 4PA21
No ratings yet
M4 - Aliyah Muthi Lathifah - 10521097 - 4PA21
4 pages
CNN Ise
No ratings yet
CNN Ise
5 pages
Experiment 3 (A, B, C) (RNN) (Recuurent) (IMDB) )
No ratings yet
Experiment 3 (A, B, C) (RNN) (Recuurent) (IMDB) )
11 pages
Ass 3
No ratings yet
Ass 3
5 pages
Final Code
No ratings yet
Final Code
16 pages
AlexNet Guide for ML Practitioners
No ratings yet
AlexNet Guide for ML Practitioners
10 pages
DL2 - Jupyter Notebook
No ratings yet
DL2 - Jupyter Notebook
5 pages
MNIST - Ipynb - Colab
No ratings yet
MNIST - Ipynb - Colab
5 pages
Untitled Document
No ratings yet
Untitled Document
11 pages
DL Merged
No ratings yet
DL Merged
19 pages
Deep Learning
No ratings yet
Deep Learning
40 pages
DCGAN From Scratch Using TensorFlow and Keras
No ratings yet
DCGAN From Scratch Using TensorFlow and Keras
13 pages
NN & DL Lab Manual 1
No ratings yet
NN & DL Lab Manual 1
44 pages
DL Prac03IT
No ratings yet
DL Prac03IT
7 pages
f8194544 Microsoft PowerPoint DeepLearning
No ratings yet
f8194544 Microsoft PowerPoint DeepLearning
28 pages
Applied Machine Learning For Engineers: Artificial Neural Networks
0% (1)
Applied Machine Learning For Engineers: Artificial Neural Networks
6 pages
Python Deep Learning Lab Programs
No ratings yet
Python Deep Learning Lab Programs
35 pages
DLWP Chapter6
No ratings yet
DLWP Chapter6
6 pages
TensorFlow2 Image Classification Model
No ratings yet
TensorFlow2 Image Classification Model
2 pages
Course 3 - Week 2 - Exercise - Answer - Ipynb - Colaboratory
No ratings yet
Course 3 - Week 2 - Exercise - Answer - Ipynb - Colaboratory
8 pages
Assignment 7 ML
No ratings yet
Assignment 7 ML
20 pages
CCN 1
No ratings yet
CCN 1
1 page
A3 - Jupyter Notebook PDF
No ratings yet
A3 - Jupyter Notebook PDF
5 pages
Exercise - 1
No ratings yet
Exercise - 1
38 pages
TensorFlow PCA and Triplet Loss Guide
No ratings yet
TensorFlow PCA and Triplet Loss Guide
19 pages
Transfer Learning Alexnet - Ipynb - Colaboratory
No ratings yet
Transfer Learning Alexnet - Ipynb - Colaboratory
5 pages
Content: From Import Import As Import Import Import As
No ratings yet
Content: From Import Import As Import Import Import As
8 pages
DL Experiment 3
No ratings yet
DL Experiment 3
3 pages
Deep Learning LAB
No ratings yet
Deep Learning LAB
47 pages
La Praktikum m3
No ratings yet
La Praktikum m3
9 pages
# !pip Install Keras Tensorflow - U
No ratings yet
# !pip Install Keras Tensorflow - U
24 pages
Ap21110011455 Lab7
No ratings yet
Ap21110011455 Lab7
7 pages
Computer Vision Lab Guide
No ratings yet
Computer Vision Lab Guide
120 pages
MLP v4
No ratings yet
MLP v4
27 pages
Ketquacuoi
No ratings yet
Ketquacuoi
4 pages
Bimr Nursing College1
No ratings yet
Bimr Nursing College1
2 pages
Amity Frunt
No ratings yet
Amity Frunt
2 pages
10th Dpa Preface Content
No ratings yet
10th Dpa Preface Content
5 pages
10 Dpa
No ratings yet
10 Dpa
1 page
Syllabus
No ratings yet
Syllabus
2 pages
KUNAL
No ratings yet
KUNAL
1 page
Ds Assignment 1
No ratings yet
Ds Assignment 1
3 pages
Understanding Personality and Individual Differences: by Speed Cyber
No ratings yet
Understanding Personality and Individual Differences: by Speed Cyber
8 pages
Understanding Personality and Individual Differences: by Speed Cyber
No ratings yet
Understanding Personality and Individual Differences: by Speed Cyber
8 pages
Control Center and Data Exchange Requirements: PJM Manual 01
No ratings yet
Control Center and Data Exchange Requirements: PJM Manual 01
67 pages
Ed Uaa Uay Fst4 201801 (New)
67% (3)
Ed Uaa Uay Fst4 201801 (New)
67 pages
Dbms S1
No ratings yet
Dbms S1
7 pages
Enfield Bullet Workshop Manual 2000 2 PDF
No ratings yet
Enfield Bullet Workshop Manual 2000 2 PDF
53 pages
UG B Pharmacy Organic Chemistry Practice Test
No ratings yet
UG B Pharmacy Organic Chemistry Practice Test
29 pages
(Percent Per Annum) I.28 Interest Rate of Time Deposits in Rupiah by Group of Banks and Type of Maturity
No ratings yet
(Percent Per Annum) I.28 Interest Rate of Time Deposits in Rupiah by Group of Banks and Type of Maturity
36 pages
Fuel System Diagnosis
100% (1)
Fuel System Diagnosis
34 pages
CCS366 Software Testing and Automation
No ratings yet
CCS366 Software Testing and Automation
36 pages
X Zylo
No ratings yet
X Zylo
39 pages
Sample Chapter 3
No ratings yet
Sample Chapter 3
84 pages
Calculus & Matrices: Class 12 Tasks
No ratings yet
Calculus & Matrices: Class 12 Tasks
3 pages
Haemostasis Training Basics
No ratings yet
Haemostasis Training Basics
11 pages
High Delta T Chilled Water System
100% (4)
High Delta T Chilled Water System
13 pages
Chap 3@prop@blaq
No ratings yet
Chap 3@prop@blaq
19 pages
Creating C-Gauges by Dai Griffith PDF
No ratings yet
Creating C-Gauges by Dai Griffith PDF
112 pages
U2100 TSSR Updated Template
No ratings yet
U2100 TSSR Updated Template
23 pages
Construction Time Management Guide
No ratings yet
Construction Time Management Guide
54 pages
Aclar Llamado I231431 12
No ratings yet
Aclar Llamado I231431 12
40 pages
Mastering Message Tracking v1.00
No ratings yet
Mastering Message Tracking v1.00
74 pages
Operating Instructions: DL-6MB High-Capacity Lowspeed Refrigerated Centrifuge
No ratings yet
Operating Instructions: DL-6MB High-Capacity Lowspeed Refrigerated Centrifuge
29 pages
Working of Cotton Yarn Dyeing in HTHP Dyeing Machine: Samridhi Singh 17Txt112
No ratings yet
Working of Cotton Yarn Dyeing in HTHP Dyeing Machine: Samridhi Singh 17Txt112
20 pages
December SAT v0
No ratings yet
December SAT v0
16 pages
PHYSICS - Quiz Bee Reviewer
100% (6)
PHYSICS - Quiz Bee Reviewer
2 pages
ULTRAFILTRATION
100% (2)
ULTRAFILTRATION
26 pages
Untitled Document Payall Nayak
No ratings yet
Untitled Document Payall Nayak
25 pages
Power Supply Test Report
No ratings yet
Power Supply Test Report
5 pages
BBDMS Report
No ratings yet
BBDMS Report
107 pages
Research Critique
0% (1)
Research Critique
20 pages
CH 1 Demarcation of Blocks With Key
No ratings yet
CH 1 Demarcation of Blocks With Key
11 pages
Momentum Picks
No ratings yet
Momentum Picks
27 pages