The document details an experiment using Keras to create and train artificial neural networks (ANNs) with various activation functions on a classification dataset. It includes steps for data generation, scaling, model creation, and training, highlighting the use of different activation functions like relu, sigmoid, tanh, selu, and elu. The training process and model summaries for each activation function are also presented, showcasing the architecture and performance metrics.

Uploaded by speedcomcyber25

Experiment 1: ANN using Keras Library and Various Activation Functions

In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler

# Check TensorFlow version
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")

TensorFlow version: 2.17.0
Keras version: 3.9.0.dev2025031403

In [2]:
# Generate a classification dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                           n_redundant=5, random_state=42)

# Scale the features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)

print(f"Training data shape: {X_train.shape}")
print(f"Testing data shape: {X_test.shape}")

Training data shape: (800, 20)
Testing data shape: (200, 20)
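
Note that the cell above fits the scaler on all 1,000 samples before splitting, so the test rows contribute to the scaling statistics. A minimal sketch of the alternative order (split first, then fit on the training rows only) is shown below using plain NumPy. The array `X_demo` and the other names are hypothetical stand-ins, not the notebook's data, and rerunning the notebook this way would likely shift the reported metrics slightly.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for a 1000 x 20 feature matrix (NOT the notebook's dataset)
X_demo = rng.normal(loc=3.0, scale=2.0, size=(1000, 20))

# Split first: 800 training rows and 200 test rows (same 80/20 ratio as above)
idx = rng.permutation(len(X_demo))
X_tr, X_te = X_demo[idx[:800]], X_demo[idx[800:]]

# Compute the scaling statistics from the training rows only
mu = X_tr.mean(axis=0)
sigma = X_tr.std(axis=0)  # population std (ddof=0), the convention StandardScaler uses

# Apply the same training statistics to both splits
X_tr_scaled = (X_tr - mu) / sigma
X_te_scaled = (X_te - mu) / sigma

print(X_tr_scaled.shape, X_te_scaled.shape)
```

With scikit-learn the same idea is usually written as `scaler.fit_transform(X_train)` followed by `scaler.transform(X_test)`.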

In [3]:
def create_model(activation='relu'):
    model = Sequential([
        Dense(64, activation=activation, input_shape=(X_train.shape[1],)),
        Dense(32, activation=activation),
        Dense(16, activation=activation),
        Dense(1, activation='sigmoid')  # Output layer with sigmoid for binary classification
    ])

    model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

# Create models with different activation functions
activation_functions = ['relu', 'sigmoid', 'tanh', 'selu', 'elu']
models = {}
histories = {}
for activation in activation_functions:
    print(f"Creating model with {activation} activation function...")
    models[activation] = create_model(activation)
    models[activation].summary()

Creating model with relu activation function...

c:\Users\Rishi\AppData\Local\Programs\Python\Python312\Lib\site-packages\keras\src\layers\core\dense.py:87: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
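
The warning above suggests declaring the input with an `Input(shape)` object instead of passing `input_shape` to the first `Dense` layer. A minimal sketch of that variant follows. The function name `create_model_with_input` and the `n_features` parameter are hypothetical, and the architecture is assumed to be otherwise identical to `create_model`.

```python
from tensorflow import keras
from tensorflow.keras import layers


def create_model_with_input(activation="relu", n_features=20):
    """Same layer stack as create_model, with an explicit Input object first."""
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),          # replaces input_shape=... on Dense
        layers.Dense(64, activation=activation),
        layers.Dense(32, activation=activation),
        layers.Dense(16, activation=activation),
        layers.Dense(1, activation="sigmoid"),     # binary classification output
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


demo_model = create_model_with_input("relu")
print(demo_model.count_params())
```

The parameter count should match the 3,969 reported in the model summaries, since only the way the input is declared changes.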

Model: "sequential"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 64)             │         1,344 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 32)             │         2,080 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 16)             │           528 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_3 (Dense)                 │ (None, 1)              │            17 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 3,969 (15.50 KB)

Trainable params: 3,969 (15.50 KB)

Non-trainable params: 0 (0.00 B)
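
The parameter counts in the summary can be checked by hand: a `Dense` layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The short sketch below reproduces the numbers. The helper `dense_params` is a hypothetical name, and the size estimate assumes 4-byte float32 weights (the Keras default).

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


# 20 input features, then the 64-32-16-1 stack used by create_model
layer_sizes = [20, 64, 32, 16, 1]
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)

# Assumes float32 storage: 4 bytes per parameter
size_kb = total * 4 / 1024

print(counts)             # [1344, 2080, 528, 17]
print(total)              # 3969
print(f"{size_kb:.2f}")   # 15.50
```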

Creating model with sigmoid activation function...

Model: "sequential_1"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense_4 (Dense)                 │ (None, 64)             │         1,344 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_5 (Dense)                 │ (None, 32)             │         2,080 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_6 (Dense)                 │ (None, 16)             │           528 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_7 (Dense)                 │ (None, 1)              │            17 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 3,969 (15.50 KB)

Trainable params: 3,969 (15.50 KB)

Non-trainable params: 0 (0.00 B)

Creating model with tanh activation function...

Model: "sequential_2"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense_8 (Dense)                 │ (None, 64)             │         1,344 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_9 (Dense)                 │ (None, 32)             │         2,080 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_10 (Dense)                │ (None, 16)             │           528 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_11 (Dense)                │ (None, 1)              │            17 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 3,969 (15.50 KB)

Trainable params: 3,969 (15.50 KB)

Non-trainable params: 0 (0.00 B)

Creating model with selu activation function...

Model: "sequential_3"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense_12 (Dense)                │ (None, 64)             │         1,344 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_13 (Dense)                │ (None, 32)             │         2,080 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_14 (Dense)                │ (None, 16)             │           528 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_15 (Dense)                │ (None, 1)              │            17 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 3,969 (15.50 KB)

Trainable params: 3,969 (15.50 KB)

Non-trainable params: 0 (0.00 B)

Creating model with elu activation function...

Model: "sequential_4"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense_16 (Dense)                │ (None, 64)             │         1,344 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_17 (Dense)                │ (None, 32)             │         2,080 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_18 (Dense)                │ (None, 16)             │           528 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_19 (Dense)                │ (None, 1)              │            17 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 3,969 (15.50 KB)

Trainable params: 3,969 (15.50 KB)

Non-trainable params: 0 (0.00 B)
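
All five models share the same layer shapes and differ only in the hidden-layer activation. For reference, the five functions can be sketched in NumPy as below. These are the standard textbook definitions (with the usual published SELU constants), written here for illustration and not taken from the Keras source code.

```python
import numpy as np

# Standard SELU constants from the self-normalizing networks formulation
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def relu(x):
    return np.maximum(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def tanh(x):
    return np.tanh(x)


def elu(x, alpha=1.0):
    # exp is evaluated only on the non-positive part to avoid overflow
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))


def selu(x):
    # SELU is a scaled ELU with fixed alpha and scale
    return SELU_SCALE * elu(x, alpha=SELU_ALPHA)


x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
for fn in (relu, sigmoid, tanh, selu, elu):
    print(f"{fn.__name__:>8}: {np.round(fn(x), 4)}")
```

ReLU zeroes negative inputs, sigmoid and tanh saturate at both ends, and ELU/SELU keep a smooth negative branch that levels off at `-alpha` (times the scale for SELU).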

In [4]:
# Train models and store training history
for activation in activation_functions:
    print(f"\n Training model with {activation} activation function...")
    histories[activation] = models[activation].fit(
        X_train, y_train,
        epochs=20,
        batch_size=32,
        validation_split=0.2,
        verbose=1
    )

Training model with relu activation function...
Epoch 1/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.5939 - loss: 0.6721 - val_accuracy: 0.7125 - val_loss: 0.6041
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7909 - loss: 0.5740 - val_accuracy: 0.8562 - val_loss: 0.5159
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8761 - loss: 0.4830 - val_accuracy: 0.8562 - val_loss: 0.4170
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9192 - loss: 0.3404 - val_accuracy: 0.8687 - val_loss: 0.3447
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9311 - loss: 0.2430 - val_accuracy: 0.8938 - val_loss: 0.2919
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9422 - loss: 0.1977 - val_accuracy: 0.8875 - val_loss: 0.2678
Epoch 7/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9333 - loss: 0.1924 - val_accuracy: 0.8875 - val_loss: 0.2564
Epoch 8/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.9506 - loss: 0.1568 - val_accuracy: 0.9062 - val_loss: 0.2276
Epoch 9/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9589 - loss: 0.1390 - val_accuracy: 0.9125 - val_loss: 0.2213
Epoch 10/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9677 - loss: 0.1227 - val_accuracy: 0.9062 - val_loss: 0.2169
Epoch 11/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9712 - loss: 0.1086 - val_accuracy: 0.9187 - val_loss: 0.2160
Epoch 12/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9774 - loss: 0.0967 - val_accuracy: 0.9375 - val_loss: 0.2098
Epoch 13/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9791 - loss: 0.0769 - val_accuracy: 0.9312 - val_loss: 0.2076
Epoch 14/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9892 - loss: 0.0713 - val_accuracy: 0.9187 - val_loss: 0.2151
Epoch 15/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9857 - loss: 0.0713 - val_accuracy: 0.9438 - val_loss: 0.2044
Epoch 16/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9845 - loss: 0.0666 - val_accuracy: 0.9312 - val_loss: 0.2250
Epoch 17/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9868 - loss: 0.0503 - val_accuracy: 0.9250 - val_loss: 0.2149
Epoch 18/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9957 - loss: 0.0396 - val_accuracy: 0.9375 - val_loss: 0.2072
Epoch 19/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9964 - loss: 0.0341 - val_accuracy: 0.9375 - val_loss: 0.2152
Epoch 20/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9984 - loss: 0.0298 - val_accuracy: 0.9312 - val_loss: 0.2151
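
In the ReLU log, training accuracy climbs to 0.9984 while validation loss stops improving after epoch 15 (0.2044), which hints at mild overfitting in the later epochs. The sketch below replays the logged `val_loss` values through a simple patience rule to show where early stopping might have halted training. The function `simulate_early_stopping` is a hypothetical helper that only mimics the general idea and is not the Keras callback itself.

```python
# val_loss values copied from the ReLU training log (epochs 1 to 20)
relu_val_loss = [
    0.6041, 0.5159, 0.4170, 0.3447, 0.2919, 0.2678, 0.2564, 0.2276, 0.2213, 0.2169,
    0.2160, 0.2098, 0.2076, 0.2151, 0.2044, 0.2250, 0.2149, 0.2072, 0.2152, 0.2151,
]


def simulate_early_stopping(val_losses, patience=3):
    """Return (best_epoch, stop_epoch), both 1-indexed, for a simple patience rule."""
    best_loss = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_losses)


best_epoch, stop_epoch = simulate_early_stopping(relu_val_loss, patience=3)
print(best_epoch, stop_epoch)  # 15 18
```

In Keras the same idea is usually expressed with `keras.callbacks.EarlyStopping(monitor="val_loss", patience=3, restore_best_weights=True)` passed to `fit` through the `callbacks` argument.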

Training model with sigmoid activation function...
Epoch 1/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.4988 - loss: 0.7235 - val_accuracy: 0.5250 - val_loss: 0.6876
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5574 - loss: 0.6877 - val_accuracy: 0.4812 - val_loss: 0.6862
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.5633 - loss: 0.6819 - val_accuracy: 0.7250 - val_loss: 0.6747
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7033 - loss: 0.6693 - val_accuracy: 0.7250 - val_loss: 0.6646
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7318 - loss: 0.6558 - val_accuracy: 0.7375 - val_loss: 0.6499
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7223 - loss: 0.6450 - val_accuracy: 0.7500 - val_loss: 0.6299
Epoch 7/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7689 - loss: 0.6148 - val_accuracy: 0.7750 - val_loss: 0.6061
Epoch 8/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7623 - loss: 0.5901 - val_accuracy: 0.7750 - val_loss: 0.5771
Epoch 9/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7813 - loss: 0.5577 - val_accuracy: 0.7812 - val_loss: 0.5502
Epoch 10/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.7828 - loss: 0.5270 - val_accuracy: 0.8000 - val_loss: 0.5248
Epoch 11/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8023 - loss: 0.4962 - val_accuracy: 0.7875 - val_loss: 0.5050
Epoch 12/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.8028 - loss: 0.4631 - val_accuracy: 0.7875 - val_loss: 0.4878
Epoch 13/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8262 - loss: 0.4370 - val_accuracy: 0.7812 - val_loss: 0.4766
Epoch 14/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8067 - loss: 0.4419 - val_accuracy: 0.8000 - val_loss: 0.4699
Epoch 15/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8243 - loss: 0.4062 - val_accuracy: 0.8000 - val_loss: 0.4655
Epoch 16/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8073 - loss: 0.4138 - val_accuracy: 0.8125 - val_loss: 0.4612
Epoch 17/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8081 - loss: 0.4125 - val_accuracy: 0.7937 - val_loss: 0.4600
Epoch 18/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8205 - loss: 0.4058 - val_accuracy: 0.7937 - val_loss: 0.4568
Epoch 19/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8194 - loss: 0.4061 - val_accuracy: 0.8062 - val_loss: 0.4568
Epoch 20/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8430 - loss: 0.3769 - val_accuracy: 0.7937 - val_loss: 0.4586
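
The sigmoid model starts near chance (accuracy 0.4988) and its loss stays close to ln 2 ≈ 0.693 for the first few epochs before it begins to fall. One commonly cited explanation, which the notebook itself does not test, is that the sigmoid derivative never exceeds 0.25, so the gradient signal can shrink as it passes back through stacked sigmoid layers. The arithmetic below sketches that bound for the three hidden layers used here. It covers only the activation-derivative factor and ignores the weight matrices, which also scale the gradient.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)


# The derivative peaks at z = 0
max_grad = sigmoid_grad(0.0)

# Upper bound on the product of activation derivatives across 3 hidden layers
bound_three_layers = max_grad ** 3

# Binary cross-entropy of a model that always predicts 0.5
chance_loss = math.log(2)

print(max_grad)                 # 0.25
print(bound_three_layers)       # 0.015625
print(round(chance_loss, 4))    # 0.6931
```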

Training model with tanh activation function...
Epoch 1/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.6257 - loss: 0.6447 - val_accuracy: 0.7688 - val_loss: 0.5008
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8291 - loss: 0.4360 - val_accuracy: 0.8250 - val_loss: 0.4518
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8380 - loss: 0.3965 - val_accuracy: 0.8000 - val_loss: 0.4410
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.8310 - loss: 0.3797 - val_accuracy: 0.8250 - val_loss: 0.4248
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8583 - loss: 0.3475 - val_accuracy: 0.8188 - val_loss: 0.4254
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8574 - loss: 0.3325 - val_accuracy: 0.7937 - val_loss: 0.4093
Epoch 7/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8482 - loss: 0.3601 - val_accuracy: 0.8188 - val_loss: 0.4125
Epoch 8/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8626 - loss: 0.3058 - val_accuracy: 0.8125 - val_loss: 0.3933
Epoch 9/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8841 - loss: 0.2868 - val_accuracy: 0.8062 - val_loss: 0.4002
Epoch 10/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8979 - loss: 0.2611 - val_accuracy: 0.8062 - val_loss: 0.3721
Epoch 11/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9114 - loss: 0.2449 - val_accuracy: 0.8062 - val_loss: 0.3571
Epoch 12/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9137 - loss: 0.2131 - val_accuracy: 0.8188 - val_loss: 0.3570
Epoch 13/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9264 - loss: 0.2103 - val_accuracy: 0.8313 - val_loss: 0.3371
Epoch 14/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9290 - loss: 0.2037 - val_accuracy: 0.8500 - val_loss: 0.3201
Epoch 15/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9316 - loss: 0.1930 - val_accuracy: 0.8687 - val_loss: 0.3115
Epoch 16/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9407 - loss: 0.1641 - val_accuracy: 0.8750 - val_loss: 0.3133
Epoch 17/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9554 - loss: 0.1592 - val_accuracy: 0.8938 - val_loss: 0.2834
Epoch 18/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9671 - loss: 0.1399 - val_accuracy: 0.8875 - val_loss: 0.2831
Epoch 19/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9660 - loss: 0.1393 - val_accuracy: 0.9000 - val_loss: 0.2684
Epoch 20/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9642 - loss: 0.1350 - val_accuracy: 0.9062 - val_loss: 0.2630
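
Every progress line in these logs reads `20/20`, which follows from the `fit` arguments: `validation_split=0.2` holds out part of the 800 training rows (Keras takes them from the end of the arrays, before shuffling), and the remainder is processed in batches of 32. A small arithmetic sketch, using only the numbers that appear in the notebook:

```python
import math

n_train = 800            # rows in X_train, from the printed shape (800, 20)
validation_split = 0.2   # fraction held out by fit()
batch_size = 32

n_val = int(round(n_train * validation_split))   # rows used for validation
n_fit = n_train - n_val                          # rows actually used for weight updates
steps_per_epoch = math.ceil(n_fit / batch_size)  # batches per epoch

print(n_val, n_fit, steps_per_epoch)  # 160 640 20

# With 160 validation rows, val_accuracy can only move in steps of 1/160
print(1 / n_val)    # 0.00625
print(149 / n_val)  # 0.93125, which the log shows rounded as 0.9312
```

The 200-row test set created earlier is separate from this 160-row validation subset and is not touched during `fit`.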

Training model with selu activation function...
Epoch 1/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.6646 - loss: 0.6233 - val_accuracy: 0.8000 - val_loss: 0.4576
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8302 - loss: 0.3819 - val_accuracy: 0.8188 - val_loss: 0.4184
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.8517 - loss: 0.3308 - val_accuracy: 0.8313 - val_loss: 0.3937
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8550 - loss: 0.3290 - val_accuracy: 0.8375 - val_loss: 0.3780
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8870 - loss: 0.2750 - val_accuracy: 0.8375 - val_loss: 0.3664
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8875 - loss: 0.2640 - val_accuracy: 0.8438 - val_loss: 0.3482
Epoch 7/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9150 - loss: 0.2430 - val_accuracy: 0.8438 - val_loss: 0.3341
Epoch 8/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9247 - loss: 0.2061 - val_accuracy: 0.8313 - val_loss: 0.3362
Epoch 9/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9356 - loss: 0.1951 - val_accuracy: 0.8500 - val_loss: 0.3140
Epoch 10/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9208 - loss: 0.2145 - val_accuracy: 0.8500 - val_loss: 0.2953
Epoch 11/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9363 - loss: 0.1752 - val_accuracy: 0.8687 - val_loss: 0.2941
Epoch 12/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9408 - loss: 0.1860 - val_accuracy: 0.8562 - val_loss: 0.2848
Epoch 13/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9456 - loss: 0.1474 - val_accuracy: 0.8687 - val_loss: 0.2826
Epoch 14/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9574 - loss: 0.1481 - val_accuracy: 0.8750 - val_loss: 0.2724
Epoch 15/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9465 - loss: 0.1606 - val_accuracy: 0.8813 - val_loss: 0.2605
Epoch 16/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9669 - loss: 0.1187 - val_accuracy: 0.8938 - val_loss: 0.2591
Epoch 17/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9601 - loss: 0.1307 - val_accuracy: 0.8875 - val_loss: 0.2488
Epoch 18/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9656 - loss: 0.1159 - val_accuracy: 0.9000 - val_loss: 0.2496
Epoch 19/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9703 - loss: 0.1036 - val_accuracy: 0.9062 - val_loss: 0.2460
Epoch 20/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9653 - loss: 0.1124 - val_accuracy: 0.9062 - val_loss: 0.2359
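
SELU is designed to be "self-normalizing": when inputs are standardized and weights are drawn with variance 1/fan_in (LeCun-normal style), activations tend to stay near zero mean and unit variance from layer to layer. The NumPy sketch below illustrates that property on random data. It is an illustration under those assumptions only: the notebook's `Dense` layers use the Keras default initializer rather than LeCun-normal, and the width, seed, and sample count are arbitrary choices.

```python
import numpy as np

SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x):
    # exp is evaluated only on the non-positive part to avoid overflow
    negative = SELU_ALPHA * (np.exp(np.minimum(x, 0.0)) - 1.0)
    return SELU_SCALE * np.where(x > 0, x, negative)


rng = np.random.default_rng(0)
width = 256                                # arbitrary layer width for the demo
h = rng.standard_normal((2000, width))     # standardized random inputs

for layer in range(1, 4):
    # Weights with variance 1/fan_in (LeCun-normal style, an assumption of this demo)
    W = rng.standard_normal((width, width)) / np.sqrt(width)
    h = selu(h @ W)
    print(f"layer {layer}: mean={h.mean():+.3f}, std={h.std():.3f}")
```

The printed means should stay close to 0 and the standard deviations close to 1 across the three layers.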

Training model with elu activation function...


Epoch 1/20

Training model with elu activation function...


Epoch 1/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.6105 - loss: 0.6421 - val_accuracy:
0.7750 - val_loss: 0.4973
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - accuracy: 0.6105 - loss: 0.6421 - val_accuracy:
0.7750 - val_loss: 0.4973
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8113 - loss: 0.4574 - val_accuracy:
0.8250 - val_loss: 0.4155
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8113 - loss: 0.4574 - val_accuracy:
0.8250 - val_loss: 0.4155
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8512 - loss: 0.3562 - val_accuracy:
0.8375 - val_loss: 0.3750
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8512 - loss: 0.3562 - val_accuracy:
0.8375 - val_loss: 0.3750
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.8907 - loss: 0.3097 - val_accuracy:
0.8313 - val_loss: 0.3458
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.8907 - loss: 0.3097 - val_accuracy:
0.8313 - val_loss: 0.3458
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9035 - loss: 0.2762 - val_accuracy:
0.8562 - val_loss: 0.3198
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9035 - loss: 0.2762 - val_accuracy:
0.8562 - val_loss: 0.3198
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9199 - loss: 0.2368 - val_accuracy:
0.8625 - val_loss: 0.3021
Epoch 7/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9199 - loss: 0.2368 - val_accuracy:
0.8625 - val_loss: 0.3021
Epoch 7/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9174 - loss: 0.2120 - val_accuracy:
0.8562 - val_loss: 0.2926
Epoch 8/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9174 - loss: 0.2120 - val_accuracy:
0.8562 - val_loss: 0.2926
Epoch 8/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9245 - loss: 0.1953 - val_accuracy:
0.8813 - val_loss: 0.2774
Epoch 9/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9245 - loss: 0.1953 - val_accuracy:
0.8813 - val_loss: 0.2774
Epoch 9/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9457 - loss: 0.1856 - val_accuracy:
0.8687 - val_loss: 0.2708
Epoch 10/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9457 - loss: 0.1856 - val_accuracy:
0.8687 - val_loss: 0.2708
Epoch 10/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9393 - loss: 0.1759 - val_accuracy:
0.9000 - val_loss: 0.2590
Epoch 11/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9393 - loss: 0.1759 - val_accuracy:
0.9000 - val_loss: 0.2590
Epoch 11/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9326 - loss: 0.1605 - val_accuracy:
0.9000 - val_loss: 0.2497
Epoch 12/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9531 - loss: 0.1384 - val_accuracy:
0.9000 - val_loss: 0.2437
Epoch 13/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.9350 - loss: 0.1534 - val_accuracy:
0.9062 - val_loss: 0.2316
Epoch 14/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9725 - loss: 0.1039 - val_accuracy:
0.9062 - val_loss: 0.2294
Epoch 15/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9550 - loss: 0.1294 - val_accuracy:
0.9000 - val_loss: 0.2424
Epoch 16/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9643 - loss: 0.1046 - val_accuracy:
0.9125 - val_loss: 0.2275
Epoch 17/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9546 - loss: 0.1106 - val_accuracy:
0.9125 - val_loss: 0.2242
Epoch 18/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9680 - loss: 0.1003 - val_accuracy:
0.9125 - val_loss: 0.2230
Epoch 19/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9666 - loss: 0.0900 - val_accuracy:
0.9125 - val_loss: 0.2212
Epoch 20/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9685 - loss: 0.0845 - val_accuracy:
0.9250 - val_loss: 0.2185

In [5]:
# Evaluate models on test data
results = {}
for activation in activation_functions:
    print(f"\nEvaluating model with {activation} activation function:")
    loss, accuracy = models[activation].evaluate(X_test, y_test, verbose=0)
    results[activation] = {
        'loss': loss,
        'accuracy': accuracy
    }
    print(f"Test Loss: {loss:.4f}")
    print(f"Test Accuracy: {accuracy:.4f}")

# Create a DataFrame to compare results
results_df = pd.DataFrame({
    'Activation Function': list(results.keys()),
    'Test Loss': [results[act]['loss'] for act in results],
    'Test Accuracy': [results[act]['accuracy'] for act in results]
})

print("\nComparison of Different Activation Functions:")
results_df.sort_values('Test Accuracy', ascending=False)

Evaluating model with relu activation function:
Test Loss: 0.1698
Test Accuracy: 0.9350

Evaluating model with sigmoid activation function:
Test Loss: 0.3892
Test Accuracy: 0.8250

Evaluating model with tanh activation function:
Test Loss: 0.1933
Test Accuracy: 0.9150

Evaluating model with selu activation function:
Test Loss: 0.1715
Test Accuracy: 0.9300

Evaluating model with elu activation function:
Test Loss: 0.1732
Test Accuracy: 0.9200

Comparison of Different Activation Functions:

Out[5]:

   Activation Function  Test Loss  Test Accuracy
0                 relu   0.169778          0.935
3                 selu   0.171519          0.930
4                  elu   0.173155          0.920
2                 tanh   0.193340          0.915
1              sigmoid   0.389227          0.825

In [6]:
# Plot training & validation accuracy for each activation function
plt.figure(figsize=(16, 10))

# Plot training accuracy
plt.subplot(2, 2, 1)
for activation in activation_functions:
    plt.plot(histories[activation].history['accuracy'], label=activation)
plt.title('Training Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

# Plot validation accuracy
plt.subplot(2, 2, 2)
for activation in activation_functions:
    plt.plot(histories[activation].history['val_accuracy'], label=activation)
plt.title('Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

# Plot training loss
plt.subplot(2, 2, 3)
for activation in activation_functions:
    plt.plot(histories[activation].history['loss'], label=activation)
plt.title('Training Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

# Plot validation loss
plt.subplot(2, 2, 4)
for activation in activation_functions:
    plt.plot(histories[activation].history['val_loss'], label=activation)
plt.title('Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

plt.tight_layout()
plt.show()

In [7]:
# Define activation functions for visualization
def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def leaky_relu(x, alpha=0.3):
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))

# Create a range of input values
x = np.linspace(-10, 10, 1000)

# Create a plot
plt.figure(figsize=(12, 8))
plt.plot(x, relu(x), label='ReLU')
plt.plot(x, sigmoid(x), label='Sigmoid')
plt.plot(x, tanh(x), label='Tanh')
plt.plot(x, leaky_relu(x), label='Leaky ReLU (alpha=0.3)')
plt.plot(x, elu(x), label='ELU (alpha=1.0)')
plt.grid(True)
plt.legend()
plt.title('Activation Functions')
plt.xlabel('Input (x)')
plt.ylabel('Output f(x)')
plt.axhline(y=0, color='k', linestyle='-', alpha=0.3)
plt.axvline(x=0, color='k', linestyle='-', alpha=0.3)
plt.show()
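
The experiment trains a `selu` model, but the visualization cell above does not define SELU. A minimal NumPy sketch follows, assuming the commonly published SELU constants; the names `selu`, `SELU_ALPHA` and `SELU_SCALE` are introduced here for illustration and are not part of the original notebook.

```python
import numpy as np

# Commonly published SELU constants (an assumption of this sketch).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805

def selu(x):
    """Scaled ELU: scale * x for x > 0, scale * alpha * (exp(x) - 1) otherwise."""
    x = np.asarray(x, dtype=float)
    # Clamp at 0 before exp() so large positive inputs cannot overflow.
    negative_branch = SELU_ALPHA * (np.exp(np.minimum(x, 0.0)) - 1.0)
    return SELU_SCALE * np.where(x > 0, x, negative_branch)
```

It could be added to the plot with `plt.plot(x, selu(x), label='SELU')` alongside the other curves.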
Experiment 2: ANN on Diabetes Dataset
In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix, classification_report, roc_curve, roc_auc_score

# Set random seeds for reproducibility


np.random.seed(42)
tf.random.set_seed(42)

# Check versions
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")

TensorFlow version: 2.17.0


Keras version: 3.9.0.dev2025031403

In [2]:

# Load the dataset
# Try to download the dataset directly; if it is not available, fall back to a synthetic version
try:
    url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv"
    column_names = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
                    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome"]
    df = pd.read_csv(url, names=column_names)
except Exception:
    print("Unable to download dataset, creating synthetic version...")
    # Create synthetic data similar to Pima Indians Diabetes dataset
    np.random.seed(42)
    n_samples = 768

    # Generate synthetic data with similar distributions to the real dataset
    pregnancies = np.random.randint(0, 18, n_samples)
    glucose = np.random.normal(121, 32, n_samples).astype(int)
    glucose = np.clip(glucose, 0, 200)
    blood_pressure = np.random.normal(69, 19, n_samples).astype(int)
    blood_pressure = np.clip(blood_pressure, 0, 122)
    skin_thickness = np.random.normal(23, 16, n_samples).astype(int)
    skin_thickness = np.clip(skin_thickness, 0, 100)
    insulin = np.random.normal(80, 115, n_samples).astype(int)
    insulin = np.clip(insulin, 0, 846)
    bmi = np.random.normal(32, 8, n_samples)
    bmi = np.clip(bmi, 0, 67)
    diabetes_pedigree = np.random.normal(0.47, 0.33, n_samples)
    diabetes_pedigree = np.clip(diabetes_pedigree, 0.08, 2.42)
    age = np.random.normal(33, 12, n_samples).astype(int)
    age = np.clip(age, 21, 81)

    # Create synthetic target with ~35% positive cases (similar to original dataset)
    # Using a formula that considers multiple risk factors
    risk_score = ((glucose > 140) * 3 + (bmi > 30) * 2 + (age > 40) * 1.5
                  + (pregnancies > 6) * 1)
    outcome = (risk_score + np.random.normal(0, 1, n_samples) > 3).astype(int)

    # Create dataframe
    data = {
        "Pregnancies": pregnancies,
        "Glucose": glucose,
        "BloodPressure": blood_pressure,
        "SkinThickness": skin_thickness,
        "Insulin": insulin,
        "BMI": bmi,
        "DiabetesPedigreeFunction": diabetes_pedigree,
        "Age": age,
        "Outcome": outcome
    }
    df = pd.DataFrame(data)
    print("Created synthetic data similar to Pima Indians Diabetes dataset.")

# Display basic information about the dataset
print(f"Dataset shape: {df.shape}")
print("\nFirst few rows of the dataset:")
df.head()

Dataset shape: (768, 9)

First few rows of the dataset:


Out[2]:

   Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin   BMI  DiabetesPedigreeFunction  Age  Outcome
0            6      148             72             35        0  33.6                     0.627   50        1
1            1       85             66             29        0  26.6                     0.351   31        0
2            8      183             64              0        0  23.3                     0.672   32        1
3            1       89             66             23       94  28.1                     0.167   21        0
4            0      137             40             35      168  43.1                     2.288   33        1

In [3]:
# Basic statistics of the dataset
print("Basic statistics of the dataset:")
df.describe()

Basic statistics of the dataset:


Out[3]:

Pregnancies Glucose BloodPressure SkinThickness Insulin BMI DiabetesPedigreeFunction Age Ou

count 768.000000 768.000000 768.000000 768.000000 768.000000 768.000000 768.000000 768.000000 768.

mean 3.845052 120.894531 69.105469 20.536458 79.799479 31.992578 0.471876 33.240885 0.3

std 3.369578 31.972618 19.355807 15.952218 115.244002 7.884160 0.331329 11.760232 0.4

min 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.078000 21.000000 0.0

25% 1.000000 99.000000 62.000000 0.000000 0.000000 27.300000 0.243750 24.000000 0.0

50% 3.000000 117.000000 72.000000 23.000000 30.500000 32.000000 0.372500 29.000000 0.0

75% 6.000000 140.250000 80.000000 32.000000 127.250000 36.600000 0.626250 41.000000 1.0

max 17.000000 199.000000 122.000000 99.000000 846.000000 67.100000 2.420000 81.000000 1.0

In [4]:
# Check for missing values
print("\n Missing values in each column:")
df.isnull().sum()

Missing values in each column:


Out[4]:
Pregnancies 0
Glucose 0
BloodPressure 0
SkinThickness 0
Insulin 0
BMI 0
DiabetesPedigreeFunction 0
Age 0
Outcome 0
dtype: int64

In [5]:

# Check class distribution (0: No Diabetes, 1: Diabetes)
print("\nClass distribution:")
class_distribution = df['Outcome'].value_counts(normalize=True) * 100
print(class_distribution)

# Visualize class distribution
plt.figure(figsize=(8, 6))
sns.countplot(x='Outcome', data=df, palette='viridis')
plt.title('Class Distribution: Diabetes vs No Diabetes')
plt.xlabel('Outcome (0: No Diabetes, 1: Diabetes)')
plt.ylabel('Count')

# Add percentages to the bars
for i, percentage in enumerate(class_distribution):
    count = df['Outcome'].value_counts()[i]
    plt.text(i, count + 5, f'{percentage:.1f}%', ha='center')

plt.show()

Class distribution:
Outcome
0 65.104167
1 34.895833
Name: proportion, dtype: float64

C:\Users\Rishi\AppData\Local\Temp\ipykernel_22384\2991995795.py:8: FutureWarning:

Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. Assign the `x` variable to `hue` and set `legend=False` for the same effect.

sns.countplot(x='Outcome', data=df, palette='viridis')


3. Data Preprocessing
In [6]:
# In the real dataset, zero values for certain features don't make physiological sense
# For our analysis, let's treat zeros as missing values for the following features
features_with_zeros = ['Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI']

# Check how many rows have zeros for these features
for feature in features_with_zeros:
    zero_count = (df[feature] == 0).sum()
    zero_percentage = (zero_count / len(df)) * 100
    print(f"{feature}: {zero_count} zeros ({zero_percentage:.2f}%)")

Glucose: 5 zeros (0.65%)


BloodPressure: 35 zeros (4.56%)
SkinThickness: 227 zeros (29.56%)
Insulin: 374 zeros (48.70%)
BMI: 11 zeros (1.43%)

In [7]:
# Replace zeros with NaN for these features
for feature in features_with_zeros:
    df[feature] = df[feature].replace(0, np.nan)

# Display missing value count after replacement
print("\nMissing values in each column after zero replacement:")
df.isnull().sum()

Missing values in each column after zero replacement:


Out[7]:
Pregnancies 0
Glucose 5
BloodPressure 35
SkinThickness 227
Insulin 374
BMI 11
DiabetesPedigreeFunction 0
Age 0
Outcome 0
dtype: int64

In [ ]:
# Fill missing values with the median of each feature
# (assign the result back; in-place fillna on a selected column is deprecated in recent pandas)
for feature in features_with_zeros:
    median_value = df[feature].median()
    df[feature] = df[feature].fillna(median_value)

# Confirm no missing values remain
print("\nMissing values after imputation:")
df.isnull().sum()
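
The zero-to-median imputation can be checked on a tiny example. The sketch below uses a made-up toy frame (the values and the name `toy` are illustrative, not taken from the dataset) and fills all listed columns in one step by assigning the result of `fillna`.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the diabetes data (values are made up for illustration).
toy = pd.DataFrame({
    "Glucose": [148.0, 85.0, 0.0, 89.0, 137.0],
    "Insulin": [0.0, 0.0, 94.0, 168.0, 88.0],
})
cols = ["Glucose", "Insulin"]

# Treat zeros as missing, then fill each column with its own median.
toy[cols] = toy[cols].replace(0, np.nan)
medians = toy[cols].median()            # NaN values are skipped by default
toy[cols] = toy[cols].fillna(medians)   # a Series maps column name -> fill value

print(toy)
```

Note that the notebook computes medians on the full dataset before the train/test split; computing them on the training rows only would be a stricter way to avoid leaking test information, at the cost of slightly different fill values.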

In [9]:
# Visualize features by outcome
plt.figure(figsize=(20, 15))
for i, feature in enumerate(df.columns[:-1]):
    plt.subplot(3, 3, i+1)
    sns.boxplot(x='Outcome', y=feature, data=df)
    plt.title(f'{feature} by Diabetes Outcome')
    plt.xlabel('Diabetes (0: No, 1: Yes)')
    plt.ylabel(feature)
plt.tight_layout()
plt.show()

In [10]:
# Visualize correlation between features
plt.figure(figsize=(12, 10))
correlation_matrix = df.corr()
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f')
plt.title('Correlation Matrix')
plt.show()
In [11]:
# Separate features and target
X = df.drop('Outcome', axis=1)
y = df['Outcome']

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Scale the features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(f"Training data shape: {X_train_scaled.shape}")
print(f"Testing data shape: {X_test_scaled.shape}")

Training data shape: (614, 8)


Testing data shape: (154, 8)
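
What `fit_transform` on the training set followed by `transform` on the test set amounts to can be sketched in plain NumPy. The random arrays below are placeholders for the real features; the point is that the mean and standard deviation come from the training data only and are reused for the test data.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(loc=120.0, scale=30.0, size=(100, 3))  # placeholder features
test = rng.normal(loc=120.0, scale=30.0, size=(20, 3))

# Statistics are estimated on the training split only.
mean = train.mean(axis=0)
std = train.std(axis=0)  # population std (ddof=0), as StandardScaler uses

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # same mean/std, so test columns are only roughly standardized
```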

In [12]:
# Create a sequential model
model = Sequential([
    # First hidden layer (receives the 8 input features)
    Dense(32, activation='relu', input_shape=(X_train_scaled.shape[1],)),
    Dropout(0.2),  # Add dropout to reduce overfitting

    # Second hidden layer
    Dense(16, activation='relu'),
    Dropout(0.2),

    # Third hidden layer
    Dense(8, activation='relu'),

    # Output layer - binary classification
    Dense(1, activation='sigmoid')
])

# Compile the model
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

# Display model summary
model.summary()

c:\Users\Rishi\AppData\Local\Programs\Python\Python312\Lib\site-packages\keras\src\layers\core\dense.py:87: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)

Model: "sequential"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense) │ (None, 32) │ 288 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout) │ (None, 32) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense) │ (None, 16) │ 528 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_1 (Dropout) │ (None, 16) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense) │ (None, 8) │ 136 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_3 (Dense) │ (None, 1) │ 9 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 961 (3.75 KB)

Trainable params: 961 (3.75 KB)

Non-trainable params: 0 (0.00 B)
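
The parameter counts in the summary can be reproduced by hand: a Dense layer has one weight per input-output pair plus one bias per output unit. A short sketch (the helper name `dense_params` is ours):

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

# 8 input features, then the units of each Dense layer in the model above.
layer_sizes = [8, 32, 16, 8, 1]
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(counts)       # [288, 528, 136, 9]
print(sum(counts))  # 961
```

Dropout layers contribute no parameters, which is why they show 0 in the table.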

In [ ]:
# Define early stopping callback to prevent overfitting
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',
    patience=10,
    mode='min',
    restore_best_weights=True
)

# Train the model
history = model.fit(
    X_train_scaled,
    y_train,
    epochs=100,
    batch_size=16,
    validation_split=0.2,
    callbacks=[early_stopping],
    verbose=1
)
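
The patience rule behind early stopping can be sketched in a few lines of plain Python. This is a simplified illustration of the idea (stop once the monitored loss has not improved for `patience` consecutive epochs, and remember the best epoch), not the Keras implementation; the function name is hypothetical.

```python
def early_stopping_epoch(val_losses, patience=10):
    """Return (stop_epoch, best_epoch), both 0-based, for a list of per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch  # training would stop here
    return len(val_losses) - 1, best_epoch  # ran through all epochs

print(early_stopping_epoch([0.5, 0.4, 0.41, 0.42, 0.43], patience=3))  # (4, 1)
```

With `restore_best_weights=True`, the model used afterwards corresponds to `best_epoch` rather than the last epoch trained.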

In [14]:
# Evaluate on test set
loss, accuracy = model.evaluate(X_test_scaled, y_test)
print(f"\n Test Loss: {loss:.4f}")
print(f"Test Accuracy: {accuracy:.4f}")

5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.7237 - loss: 0.5447

Test Loss: 0.5189


Test Accuracy: 0.7403
In [15]:
# Plot training history
plt.figure(figsize=(12, 5))

# Plot training & validation accuracy


plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

# Plot training & validation loss


plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

plt.tight_layout()
plt.show()

In [16]:
# Make predictions on the test set
y_pred_prob = model.predict(X_test_scaled)
y_pred = (y_pred_prob > 0.5).astype(int).flatten()

# Create a confusion matrix


cm = confusion_matrix(y_test, y_pred)

# Plot confusion matrix


plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.title('Confusion Matrix')
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.show()

# Calculate and display classification metrics


print("\n Classification Report:")
print(classification_report(y_test, y_pred))

5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step


Classification Report:
              precision    recall  f1-score   support

           0       0.79      0.82      0.80       100
           1       0.64      0.59      0.62        54

    accuracy                           0.74       154
   macro avg       0.71      0.71      0.71       154
weighted avg       0.74      0.74      0.74       154
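
The report values follow directly from the confusion-matrix counts. The counts below are inferred from the rounded report and the supports (100 and 54); they are an assumption consistent with those numbers, since the plotted matrix itself is not reproduced in this text.

```python
# Counts inferred from the rounded report above (an assumption, not read from the plot).
tn, fp, fn, tp = 82, 18, 22, 32

precision_1 = tp / (tp + fp)              # of predicted positives, how many are correct
recall_1 = tp / (tp + fn)                 # of actual positives, how many are found
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)
accuracy = (tp + tn) / (tp + tn + fp + fn)

print(f"{precision_1:.2f} {recall_1:.2f} {f1_1:.2f} {accuracy:.4f}")  # 0.64 0.59 0.62 0.7403
```

The lower recall on class 1 means roughly four in ten diabetic cases in the test set are missed at the 0.5 threshold.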

In [17]:
# Calculate ROC curve and AUC
fpr, tpr, thresholds = roc_curve(y_test, y_pred_prob)
auc = roc_auc_score(y_test, y_pred_prob)

# Plot ROC curve


plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, label=f'AUC = {auc:.3f}')
plt.plot([0, 1], [0, 1], 'k--')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend(loc='lower right')
plt.grid(True, alpha=0.3)
plt.show()
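
AUC has a direct interpretation: the probability that a randomly chosen positive sample gets a higher score than a randomly chosen negative one. The sketch below computes it by brute force over all pairs on a four-sample toy example; the function name `auc_from_scores` is ours, and the approach is only practical for small arrays.

```python
import numpy as np

def auc_from_scores(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float).ravel()
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]  # every positive score minus every negative score
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))

print(auc_from_scores([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```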
In [18]:

# Get the weights of the first layer


weights = model.layers[0].get_weights()[0]

# Calculate absolute mean weights for each feature


feature_importance = np.mean(np.abs(weights), axis=1)

# Create a DataFrame for visualization


importance_df = pd.DataFrame({
'Feature': X.columns,
'Importance': feature_importance
})

# Sort by importance
importance_df = importance_df.sort_values('Importance', ascending=False)

# Plot feature importance


plt.figure(figsize=(10, 6))
sns.barplot(x='Importance', y='Feature', data=importance_df, palette='viridis')
plt.title('Feature Importance (Based on First Layer Weights)')
plt.xlabel('Mean Absolute Weight')
plt.tight_layout()
plt.show()

C:\Users\Rishi\AppData\Local\Temp\ipykernel_22384\4215535867.py:18: FutureWarning:

Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. Assign the `y` variable to `hue` and set `legend=False` for the same effect.

sns.barplot(x='Importance', y='Feature', data=importance_df, palette='viridis')


In [ ]:
# Create sample data for prediction (you can modify these values)
sample_data = {
    'Pregnancies': [1, 8, 2],
    'Glucose': [85, 183, 137],
    'BloodPressure': [66, 64, 70],
    'SkinThickness': [29, 0, 38],
    'Insulin': [0, 0, 240],
    'BMI': [26.6, 23.3, 30.8],
    'DiabetesPedigreeFunction': [0.351, 0.672, 0.429],
    'Age': [31, 32, 45]
}

# Convert to DataFrame
sample_df = pd.DataFrame(sample_data)

# Replace zeros as we did with the training data
for feature in features_with_zeros:
    sample_df[feature] = sample_df[feature].replace(0, np.nan)
    sample_df[feature] = sample_df[feature].fillna(df[feature].median())

# Scale the data using the same scaler
sample_scaled = scaler.transform(sample_df)

# Make predictions
sample_predictions = model.predict(sample_scaled)

# Convert predictions to binary class
sample_predictions_binary = (sample_predictions > 0.5).astype(int)

# Create a DataFrame for the results (flatten the (n, 1) prediction arrays to 1-D columns)
results_df = sample_df.copy()
results_df['Diabetes Probability'] = sample_predictions.flatten()
results_df['Predicted Diabetes'] = sample_predictions_binary.flatten()

# Display results
print("Predictions on Sample Data:")
results_df[['Diabetes Probability', 'Predicted Diabetes']]
Experiment 3: MNIST - Deep Neural Network with Keras
In [1]:

# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt # plotting library
%matplotlib inline

from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.optimizers import Adam, RMSprop
from keras import backend as K

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory

from subprocess import check_output
print(check_output(["ls", "../input"]).decode("utf8"))

# Any results you write to the current directory are saved as output.
Using TensorFlow backend.

digit-recognizer

In [2]:
# import dataset
from keras.datasets import mnist

# load dataset
(x_train, y_train),(x_test, y_test) = mnist.load_data()

# count the number of unique train labels


unique, counts = np.unique(y_train, return_counts=True)
print("Train labels: ", dict(zip(unique, counts)))

# count the number of unique test labels


unique, counts = np.unique(y_test, return_counts=True)
print("\n Test labels: ", dict(zip(unique, counts)))

Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz


11493376/11490434 [==============================] - 2s 0us/step
Train labels:  {0: 5923, 1: 6742, 2: 5958, 3: 6131, 4: 5842, 5: 5421, 6: 5918, 7: 6265, 8: 5851, 9: 5949}

Test labels:  {0: 980, 1: 1135, 2: 1032, 3: 1010, 4: 982, 5: 892, 6: 958, 7: 1028, 8: 974, 9: 1009}

In [3]:
# sample 25 mnist digits from train dataset
indexes = np.random.randint(0, x_train.shape[0], size=25)
images = x_train[indexes]
labels = y_train[indexes]

# plot the 25 mnist digits
plt.figure(figsize=(5,5))
for i in range(len(indexes)):
    plt.subplot(5, 5, i + 1)
    image = images[i]
    plt.imshow(image, cmap='gray')
    plt.axis('off')

# save before show(); once the figure has been shown, savefig may write an empty image
plt.savefig("mnist-samples.png")
plt.show()
plt.close('all')

In [4]:
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.utils import to_categorical, plot_model

In [5]:
# compute the number of labels
num_labels = len(np.unique(y_train))

In [6]:

# convert to one-hot vector


y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
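
One-hot encoding turns each integer label into a vector with a single 1 at the label's index. A minimal NumPy sketch of the same idea (the helper name `one_hot` is ours; it is not how `to_categorical` is implemented):

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a float32 array with a single 1.0 per row, at the label's index."""
    labels = np.asarray(labels, dtype=int)
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(num_classes, dtype="float32")[labels]

encoded = one_hot([1, 0, 4], num_classes=10)
print(encoded.shape)  # (3, 10)
print(encoded[0])     # 1.0 at index 1, zeros elsewhere
```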

In [7]:
# image dimensions (assumed square)
image_size = x_train.shape[1]
input_size = image_size * image_size
input_size
Out[7]:
784

In [8]:
# resize and normalize
x_train = np.reshape(x_train, [-1, input_size])
x_train = x_train.astype('float32') / 255
x_test = np.reshape(x_test, [-1, input_size])
x_test = x_test.astype('float32') / 255

In [9]:
# network parameters
batch_size = 128
hidden_units = 256
dropout = 0.45

In [10]:
# model is a 3-layer MLP with ReLU and dropout after each layer
model = Sequential()
model.add(Dense(hidden_units, input_dim=input_size))
model.add(Activation('relu'))
model.add(Dropout(dropout))
model.add(Dense(hidden_units))
model.add(Activation('relu'))
model.add(Dropout(dropout))
model.add(Dense(num_labels))
model.add(Activation('softmax'))

In [11]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 256) 200960
_________________________________________________________________
activation_1 (Activation) (None, 256) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 256) 0
_________________________________________________________________
dense_2 (Dense) (None, 256) 65792
_________________________________________________________________
activation_2 (Activation) (None, 256) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 256) 0
_________________________________________________________________
dense_3 (Dense) (None, 10) 2570
_________________________________________________________________
activation_3 (Activation) (None, 10) 0
=================================================================
Total params: 269,322
Trainable params: 269,322
Non-trainable params: 0
_________________________________________________________________

In [12]:
plot_model(model, to_file='mlp-mnist.png', show_shapes=True)
Out[12]:
In [13]:
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])

In [ ]:
model.fit(x_train, y_train, epochs=20, batch_size=batch_size)

In [15]:
loss, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print("\n Test accuracy: %.1f%%" % (100.0 * acc))

10000/10000 [==============================] - 0s 22us/step

Test accuracy: 98.2%

In [16]:
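# Example of a Dense layer with L2 weight regularization.
# Note: run as-is, this appends the layer after the softmax output of the model above;
# to use it, rebuild the model with this as its first layer.
from keras.regularizers import l2
model.add(Dense(hidden_units,
                kernel_regularizer=l2(0.001),
                input_dim=input_size))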
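
An L2 kernel regularizer with factor 0.001 adds 0.001 times the sum of squared weights of that layer to the training loss. A small NumPy sketch of the penalty term (the function name `l2_penalty` and the example weights are illustrative):

```python
import numpy as np

def l2_penalty(weights, factor=0.001):
    """Penalty added to the loss: factor * sum of squared weights."""
    return factor * np.sum(np.square(weights))

w = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print(l2_penalty(w))  # 0.001 * (1 + 4 + 9 + 16), about 0.03
```

Because the penalty grows with the square of each weight, large weights are discouraged much more strongly than small ones.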
Experiment 4: Handwritten Digit Recognition using CNN
(Convolutional Neural Networks)
In [1]:

# for numerical analysis
import numpy as np
# to store and process in a dataframe
import pandas as pd

# for plotting graphs
import matplotlib.pyplot as plt
# advanced plotting
import seaborn as sns

# image processing
import matplotlib.image as mpimg

# train test split
from sklearn.model_selection import train_test_split
# model performance metrics
from sklearn.metrics import confusion_matrix, classification_report

# utility functions
from tensorflow.keras.utils import to_categorical
# sequential model
from tensorflow.keras.models import Sequential
# layers
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropout

# from keras.optimizers import RMSprop
# from keras.preprocessing.image import ImageDataGenerator
# from keras.callbacks import ReduceLROnPlateau

In [2]:

# list of files
! ls ../input/digit-recognizer

sample_submission.csv test.csv train.csv

In [3]:
# import train and test dataset
train = pd.read_csv("../input/digit-recognizer/train.csv")
test = pd.read_csv("../input/digit-recognizer/test.csv")

In [4]:
# training dataset
train.head()
Out[4]:

label pixel0 pixel1 pixel2 pixel3 pixel4 pixel5 pixel6 pixel7 pixel8 ... pixel774 pixel775 pixel776 pixel777 pixel778 pixel7

0 1 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

1 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

2 1 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

3 4 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

4 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

5 rows × 785 columns



In [5]:
# test dataset
test.head()
Out[5]:

pixel0 pixel1 pixel2 pixel3 pixel4 pixel5 pixel6 pixel7 pixel8 pixel9 ... pixel774 pixel775 pixel776 pixel777 pixel778 pixe

0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

1 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

2 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

3 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

4 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0

5 rows × 784 columns

In [6]:
# looking for missing values
print(train.isna().sum().sum())
print(test.isna().sum().sum())

0
0

In [7]:

plt.figure(figsize=(8, 5))
sns.countplot(x=train['label'], palette='Dark2')
plt.title('Train labels count')
plt.show()

In [8]:
train['label'].value_counts().sort_index()

Out[8]:
0 4132
1 4684
2 4177
3 4351
4 4072
5 3795
6 4137
7 4401
8 4063
9 4188
Name: label, dtype: int64

In [9]:
# first few train images with labels
fig, ax = plt.subplots(figsize=(18, 8))
for ind, row in train.iloc[:8, :].iterrows():
    plt.subplot(2, 4, ind+1)
    plt.title(row['label'])
    img = row.to_numpy()[1:].reshape(28, 28)
    fig.suptitle('Train images', fontsize=24)
    plt.axis('off')
    plt.imshow(img, cmap='magma')

In [10]:
# first few test images
fig, ax = plt.subplots(figsize=(18, 8))
for ind, row in test.iloc[:8, :].iterrows():
    plt.subplot(2, 4, ind+1)
    img = row.to_numpy()[:].reshape(28, 28)
    fig.suptitle('Test images', fontsize=24)
    plt.axis('off')
    plt.imshow(img, cmap='magma')
In [11]:
# split into image and labels and convert to numpy array
X = train.iloc[:, 1:].to_numpy()
y = train['label'].to_numpy()

# test dataset
test = test.loc[:, :].to_numpy()

for i in [X, y, test]:
    print(i.shape)

(42000, 784)
(42000,)
(28000, 784)

In [12]:
# normalize the data
# ==================

X = X / 255.0
test = test / 255.0

In [13]:
# reshape dataset
# ===============

# shape of training and test dataset


print(X.shape)
print(test.shape)

# reshape each flattened row into a 28x28 image with 1 grayscale channel
X = X.reshape(-1,28,28,1)
test = test.reshape(-1,28,28,1)

# shape of training and test dataset


print(X.shape)
print(test.shape)

(42000, 784)
(28000, 784)
(42000, 28, 28, 1)
(28000, 28, 28, 1)

In [14]:
# one hot encode target
# =====================

# shape and values of target


print(y.shape)
print(y[0])

# convert Y_train to categorical by one-hot-encoding


y_enc = to_categorical(y, num_classes = 10)

# shape and values of target


print(y_enc.shape)
print(y_enc[0])

(42000,)
1
(42000, 10)
[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]

In [15]:
# train test split
# ================

# random seed (defined here but not passed to train_test_split below,
# so the split differs from run to run)
random_seed = 2

# train validation split
X_train, X_val, y_train_enc, y_val_enc = train_test_split(X, y_enc, test_size=0.3)

# shape
for i in [X_train, y_train_enc, X_val, y_val_enc]:
    print(i.shape)

(29400, 28, 28, 1)


(29400, 10)
(12600, 28, 28, 1)
(12600, 10)

In [16]:
g = plt.imshow(X_train[0][:,:,0])
print(y_train_enc[0])

[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]

In [17]:
g = plt.imshow(X_train[9][:,:,0])
print(y_train_enc[9])

[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]

In [18]:
INPUT_SHAPE = (28,28,1)
OUTPUT_SHAPE = 10
BATCH_SIZE = 128
EPOCHS = 10
VERBOSE = 2

In [19]:
model = Sequential()

model.add(Conv2D(32, kernel_size=(3,3), activation='relu', input_shape=INPUT_SHAPE))


model.add(MaxPool2D((2,2)))

model.add(Conv2D(64, kernel_size=(3,3), activation='relu'))


model.add(MaxPool2D((2,2)))

model.add(Flatten())

model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))

model.add(Dense(64, activation='relu'))
model.add(Dropout(0.2))

model.add(Dense(10, activation='softmax'))

In [20]:
model.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])

In [21]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 26, 26, 32) 320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 11, 11, 64) 18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64) 0
_________________________________________________________________
flatten (Flatten) (None, 1600) 0
_________________________________________________________________
dense (Dense) (None, 128) 204928
_________________________________________________________________
dropout (Dropout) (None, 128) 0
_________________________________________________________________
dense_1 (Dense) (None, 64) 8256
_________________________________________________________________
dropout_1 (Dropout) (None, 64) 0
_________________________________________________________________
dense_2 (Dense) (None, 10) 650
=================================================================
Total params: 232,650
Trainable params: 232,650
Non-trainable params: 0
_________________________________________________________________
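The output shapes and parameter counts in this summary can be checked by hand. The sketch below assumes what the layer definitions above imply: 3×3 convolutions with no padding and stride 1, and non-overlapping 2×2 max pooling. The helper names are made up for illustration and are not Keras functions.

```python
def conv_out(size, kernel):
    # 'valid' padding, stride 1: the kernel must fit entirely inside the image
    return size - kernel + 1

def pool_out(size, pool):
    # non-overlapping pooling; floor division drops an incomplete last window
    return size // pool

def conv_params(kernel, in_channels, out_channels):
    # one kernel*kernel*in_channels filter plus one bias per output channel
    return (kernel * kernel * in_channels + 1) * out_channels

def dense_params(n_in, n_out):
    # one weight per input plus one bias, for every output unit
    return (n_in + 1) * n_out

side = conv_out(28, 3)     # 26
side = pool_out(side, 2)   # 13
side = conv_out(side, 3)   # 11
side = pool_out(side, 2)   # 5
flat = side * side * 64    # length of the flattened feature vector

params = [
    conv_params(3, 1, 32),
    conv_params(3, 32, 64),
    dense_params(flat, 128),
    dense_params(128, 64),
    dense_params(64, 10),
]
total = sum(params)
print(flat, params, total)
```

The numbers agree with the summary: 1600 flattened features and 232,650 parameters, most of them in the first dense layer.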

In [ ]:
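# validation_split holds out the last 30% of X_train for per-epoch monitoring;
# X_val stays untouched for the final evaluation below
history = model.fit(X_train, y_train_enc,
                    epochs=EPOCHS,
                    batch_size=BATCH_SIZE,
                    verbose=VERBOSE,
                    validation_split=0.3)
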
In [23]:
plt.figure(figsize=(14, 5))

plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')

plt.savefig('./foo.png')
plt.show()

In [24]:
# model loss and accuracy on validation set
model.evaluate(X_val, y_val_enc, verbose=0)

Out[24]:
[0.04319758138058768, 0.98825395]

In [25]:
# predicted values
y_pred_enc = model.predict(X_val)

# actual
y_act = [np.argmax(i) for i in y_val_enc]

# decoding predicted values
y_pred = [np.argmax(i) for i in y_pred_enc]

print(y_pred_enc[0])
print(y_pred[0])

[2.9256535e-09 3.0600136e-06 7.0534142e-08 1.5436905e-10 9.9999607e-01
 2.3972677e-09 6.5206586e-07 3.1757565e-08 9.4072838e-10 1.4325420e-07]
4
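Decoding a softmax row means taking the index of its largest probability. The list comprehension above does this one row at a time; a vectorized form with `axis=1` gives the same labels. In this sketch the first row is the prediction printed above and the second row is invented for illustration.

```python
import numpy as np

probs = np.array([
    # first validation prediction, copied from the output above
    [2.9256535e-09, 3.0600136e-06, 7.0534142e-08, 1.5436905e-10, 9.9999607e-01,
     2.3972677e-09, 6.5206586e-07, 3.1757565e-08, 9.4072838e-10, 1.4325420e-07],
    # made-up, less confident prediction
    [0.05, 0.05, 0.60, 0.05, 0.05, 0.05, 0.05, 0.04, 0.03, 0.03],
])

labels = np.argmax(probs, axis=1)   # predicted class per row
confidence = probs.max(axis=1)      # probability assigned to that class
print(labels, confidence)
```

Keeping `confidence` next to `labels` makes it easy to single out low-confidence predictions for inspection.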

In [26]:
print(classification_report(y_act, y_pred))

              precision    recall  f1-score   support

           0       0.98      1.00      0.99      1244
           1       0.99      1.00      0.99      1427
           2       0.99      0.98      0.98      1278
           3       0.99      0.99      0.99      1280
           4       0.99      0.99      0.99      1232
           5       0.99      0.99      0.99      1127
           6       1.00      0.98      0.99      1232
           7       0.99      0.99      0.99      1289
           8       0.98      0.99      0.99      1225
           9       0.98      0.98      0.98      1266

    accuracy                           0.99     12600
   macro avg       0.99      0.99      0.99     12600
weighted avg       0.99      0.99      0.99     12600
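The columns of this report follow directly from a confusion matrix. As a minimal sketch (the 3-class matrix is invented and `per_class_metrics` is a hypothetical helper, not a scikit-learn function), precision divides each diagonal entry by its column sum and recall divides it by its row sum:

```python
import numpy as np

def per_class_metrics(cm):
    """cm[i, j] = number of samples with true class i predicted as class j."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    precision = tp / cm.sum(axis=0)   # column sums: everything predicted as the class
    recall = tp / cm.sum(axis=1)      # row sums: everything that truly is the class
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return precision, recall, f1, accuracy

# Toy matrix, invented for illustration
cm = [[8, 1, 1],
      [0, 9, 1],
      [2, 0, 8]]
precision, recall, f1, accuracy = per_class_metrics(cm)
print(precision, recall, f1, accuracy)
```

The sketch does not guard against a class that is never predicted, which would produce a division by zero in the precision column.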

In [27]:
fig, ax = plt.subplots(figsize=(7, 7))
sns.heatmap(confusion_matrix(y_act, y_pred), annot=True,
            cbar=False, fmt='1d', cmap='Blues', ax=ax)
ax.set_title('Confusion Matrix', loc='left', fontsize=16)
ax.set_xlabel('Predicted')
ax.set_ylabel('Actual')
plt.show()

In [28]:
# predicted values
y_pred_enc = model.predict(test)

# decoding predicted values
y_pred = [np.argmax(i) for i in y_pred_enc]

print(y_pred_enc[0])
print(y_pred[0])

[6.1794458e-10 6.5849557e-09 9.9999988e-01 4.3089918e-09 1.1493416e-10
 7.5394031e-12 3.9346677e-09 5.5646712e-08 9.5187547e-10 3.4489059e-13]
2

In [29]:
# predicted targets of each image
# (labels above the images are predicted labels)
fig = plt.figure(figsize=(18, 12))
fig.suptitle('Predicted values', fontsize=24)
for ind, row in enumerate(test[:15]):
    plt.subplot(3, 5, ind+1)
    plt.title(y_pred[ind])
    img = row.reshape(28, 28)
    plt.axis('off')
    plt.imshow(img, cmap='cividis')

Experiment 5: House Price Prediction using Neural Networks
In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.optimizers import Adam
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
import warnings

# Set random seeds for reproducibility


np.random.seed(42)
tf.random.set_seed(42)

# Ignore warnings
warnings.filterwarnings('ignore')

# Check versions
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")

TensorFlow version: 2.17.0
Keras version: 3.9.0.dev2025031403

In [2]:

# Function to generate synthetic house price data
def generate_house_price_data(n_samples=1000):
    np.random.seed(42)  # For reproducibility

    # Generate features
    data = {
        # Size features
        'SquareFootage': np.random.normal(2200, 800, n_samples),
        'NumBedrooms': np.random.randint(1, 6, n_samples),
        'NumBathrooms': np.random.choice([1, 1.5, 2, 2.5, 3, 3.5, 4], n_samples),
        'LotSize': np.random.normal(10000, 5000, n_samples),

        # Property characteristics
        'YearBuilt': np.random.randint(1950, 2023, n_samples),
        'Garage': np.random.randint(0, 4, n_samples),
        'HasPool': np.random.choice([0, 1], n_samples, p=[0.8, 0.2]),
        'HasBasement': np.random.choice([0, 1], n_samples, p=[0.3, 0.7]),

        # Location and market features
        'DistanceToCity': np.random.normal(15, 10, n_samples),
        'SchoolRating': np.random.uniform(1, 10, n_samples),
        'CrimeRate': np.random.normal(5, 3, n_samples),
        'MedianNeighborhoodIncome': np.random.normal(70000, 30000, n_samples)
    }

    # Create DataFrame
    df = pd.DataFrame(data)

    # Apply constraints and clean up data
    df['SquareFootage'] = np.maximum(500, df['SquareFootage'])  # Minimum 500 sq ft
    df['LotSize'] = np.maximum(1000, df['LotSize'])  # Minimum 1000 sq ft lot
    df['CrimeRate'] = np.maximum(0, df['CrimeRate'])  # Non-negative crime rate
    df['MedianNeighborhoodIncome'] = np.maximum(20000, df['MedianNeighborhoodIncome'])

    # Calculate property age
    df['PropertyAge'] = 2023 - df['YearBuilt']

    # Engineered feature: rooms per sq ft
    df['RoomsPerSqft'] = (df['NumBedrooms'] + df['NumBathrooms']) / df['SquareFootage']

    # Generate target variable (house price)
    # Base price
    price = 50000 + 110 * df['SquareFootage']

    # Add effects of various features
    price += 15000 * df['NumBedrooms']
    price += 25000 * df['NumBathrooms']
    price += 0.5 * df['LotSize']
    price -= 1000 * df['PropertyAge']  # Older houses are worth less
    price += 12000 * df['Garage']  # Value of garage spaces
    price += 30000 * df['HasPool']  # Value of pool
    price += 25000 * df['HasBasement']  # Value of basement
    price -= 5000 * df['DistanceToCity']  # Further from city reduces value
    price += 15000 * df['SchoolRating']  # Better schools increase value
    price -= 10000 * df['CrimeRate']  # Higher crime reduces value
    price += 0.8 * df['MedianNeighborhoodIncome']  # Neighborhood income affects value

    # Add some random noise
    price += np.random.normal(0, 50000, n_samples)

    # Ensure minimum price
    df['Price'] = np.maximum(50000, price)

    return df

# Generate the dataset
house_df = generate_house_price_data(1500)

# Display the first few rows
print("Dataset Shape:", house_df.shape)
house_df.head()

Dataset Shape: (1500, 15)

Out[2]:
   SquareFootage  NumBedrooms  NumBathrooms       LotSize  YearBuilt  Garage  HasPool  HasBasement  DistanceToCity  Schoo
0    2597.371322            4           4.0   2995.403359       1959       1        1            0       17.024980
1    2089.388559            4           4.0  10302.213950       2004       2        0            1        4.521723
2    2718.150830            3           2.5   9081.684837       1958       1        0            1       22.751506
3    3418.423885            1           3.0   5826.183053       2001       1        0            1        8.665742
4    2012.677300            3           2.0   7657.212568       1951       0        0            0       17.020605

In [3]:
# Basic statistics of the dataset
house_df.describe()

Out[3]:
       SquareFootage  NumBedrooms  NumBathrooms       LotSize    YearBuilt       Garage      HasPool  HasBasement  Distanc
count    1500.000000  1500.000000   1500.000000   1500.000000  1500.000000  1500.000000  1500.000000  1500.000000  150
mean     2242.712546     3.002667      2.543333   9921.081335  1985.995333     1.448000     0.192667     0.708000
std       783.412111     1.420800      1.009357   4826.557276    21.182314     1.133793     0.394525     0.454834
min       500.000000     1.000000      1.000000   1000.000000  1950.000000     0.000000     0.000000     0.000000  -1
25%      1700.774908     2.000000      1.500000   6379.818844  1968.000000     0.000000     0.000000     0.000000
50%      2240.322382     3.000000      2.500000   9850.783418  1985.000000     1.000000     0.000000     1.000000
75%      2744.058789     4.000000      3.500000  13395.054832  2005.000000     2.000000     0.000000     1.000000
max      5282.185193     5.000000      4.000000  25463.206392  2022.000000     3.000000     1.000000     1.000000

In [4]:
# Check for missing values
print("Missing values in each column:")
house_df.isnull().sum()

Missing values in each column:


Out[4]:
SquareFootage 0
NumBedrooms 0
NumBathrooms 0
LotSize 0
YearBuilt 0
Garage 0
HasPool 0
HasBasement 0
DistanceToCity 0
SchoolRating 0
CrimeRate 0
MedianNeighborhoodIncome 0
PropertyAge 0
RoomsPerSqft 0
Price 0
dtype: int64

In [5]:
# Visualize the distribution of house prices
plt.figure(figsize=(10, 6))
sns.histplot(house_df['Price'], kde=True)
plt.title('Distribution of House Prices')
plt.xlabel('Price ($)')
plt.ylabel('Frequency')
plt.show()

# Calculate price statistics
print(f"Min Price: ${house_df['Price'].min():,.2f}")
print(f"Max Price: ${house_df['Price'].max():,.2f}")
print(f"Mean Price: ${house_df['Price'].mean():,.2f}")
print(f"Median Price: ${house_df['Price'].median():,.2f}")

Min Price: $50,000.00
Max Price: $870,754.60
Mean Price: $427,721.96
Median Price: $422,454.80

In [6]:
# Explore relationships between features and house prices
plt.figure(figsize=(15, 10))

# Create a list of numerical features to plot
numerical_features = ['SquareFootage', 'NumBedrooms', 'NumBathrooms', 'LotSize',
                      'PropertyAge', 'SchoolRating', 'DistanceToCity', 'CrimeRate']

# Plot scatter plots for each feature vs price
for i, feature in enumerate(numerical_features):
    plt.subplot(2, 4, i+1)
    plt.scatter(house_df[feature], house_df['Price'], alpha=0.5, s=10)
    plt.title(f'{feature} vs Price')
    plt.xlabel(feature)
    plt.ylabel('Price')

plt.tight_layout()
plt.show()

In [7]:
# Explore categorical features
plt.figure(figsize=(15, 5))
# Plot HasPool
plt.subplot(1, 3, 1)
sns.boxplot(x='HasPool', y='Price', data=house_df)
plt.title('Price by Pool Presence')
plt.xlabel('Has Pool (1=Yes, 0=No)')
plt.ylabel('Price')

# Plot HasBasement
plt.subplot(1, 3, 2)
sns.boxplot(x='HasBasement', y='Price', data=house_df)
plt.title('Price by Basement Presence')
plt.xlabel('Has Basement (1=Yes, 0=No)')
plt.ylabel('Price')

# Plot Garage
plt.subplot(1, 3, 3)
sns.boxplot(x='Garage', y='Price', data=house_df)
plt.title('Price by Garage Size')
plt.xlabel('Number of Garage Spaces')
plt.ylabel('Price')

plt.tight_layout()
plt.show()

In [8]:
# Compute and visualize correlation matrix
correlation_matrix = house_df.corr()

plt.figure(figsize=(14, 10))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f', linewidths=0.5)
plt.title('Correlation Matrix of House Features')
plt.tight_layout()
plt.show()
In [9]:
# Sort correlations with Price for better insights
price_correlations = correlation_matrix['Price'].sort_values(ascending=False)
print("Features Correlation with Price:")
price_correlations

Features Correlation with Price:


Out[9]:
Price 1.000000
SquareFootage 0.664093
SchoolRating 0.273853
YearBuilt 0.199843
MedianNeighborhoodIncome 0.186012
NumBathrooms 0.171549
NumBedrooms 0.144245
HasPool 0.134249
Garage 0.105025
HasBasement 0.079272
LotSize -0.012259
PropertyAge -0.199843
CrimeRate -0.238229
RoomsPerSqft -0.341459
DistanceToCity -0.343838
Name: Price, dtype: float64
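`YearBuilt` (0.199843) and `PropertyAge` (-0.199843) show the same correlation with opposite sign. This follows from `PropertyAge = 2023 - YearBuilt`: the two columns are perfectly negatively correlated, so one of them carries no extra information. A small sketch with invented data (the price relation here is made up and is not the generator above):

```python
import numpy as np

rng = np.random.default_rng(0)
year_built = rng.integers(1950, 2023, size=500)
property_age = 2023 - year_built

# Illustrative target only: newer houses cost more, plus noise
price = 1000.0 * year_built + rng.normal(0, 20000, size=500)

r_year = np.corrcoef(year_built, price)[0, 1]
r_age = np.corrcoef(property_age, price)[0, 1]
r_between = np.corrcoef(year_built, property_age)[0, 1]
print(r_year, r_age, r_between)
```

Here `r_age` equals `-r_year` up to rounding and `r_between` is -1.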

In [10]:
# Prepare features and target
# Drop Price (target) and YearBuilt (redundant with PropertyAge)
X = house_df.drop(['Price', 'YearBuilt'], axis=1)
y = house_df['Price']

# Split the data into training, validation, and test sets
X_train, X_temp, y_train, y_temp = train_test_split(
    X, y, test_size=0.3, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_temp, y_temp, test_size=0.5, random_state=42)

print(f"Training data shape: {X_train.shape}")
print(f"Validation data shape: {X_val.shape}")
print(f"Test data shape: {X_test.shape}")

Training data shape: (1050, 13)
Validation data shape: (225, 13)
Test data shape: (225, 13)

In [11]:
# Scale the features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)
X_test_scaled = scaler.transform(X_test)
# Check scaled data
print("Scaled training data - first 5 rows:")
pd.DataFrame(X_train_scaled[:5], columns=X.columns)

Scaled training data - first 5 rows:


Out[11]:
   SquareFootage  NumBedrooms  NumBathrooms    LotSize     Garage    HasPool  HasBasement  DistanceToCity  SchoolRating  Crim
0      -1.541859     1.346061     -1.512224   0.125611   0.472375  -0.470032     0.639841       -0.020848     -0.323407  0.8
1       0.014937    -1.421172     -0.531170   0.239143   1.356500  -0.470032    -1.562889       -0.225932     -0.138350  -0.1
2      -1.246153    -0.729363      0.449883   0.204953   0.472375  -0.470032     0.639841        1.189736      1.669335  -1.1
3       1.137920     0.654253      0.449883   2.873937   0.472375  -0.470032     0.639841       -0.861980      1.258086  1.4
4       0.140264     1.346061      0.940410  -0.245218  -1.295875  -0.470032     0.639841        1.439414     -1.238716  -0.0
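Standardization subtracts the training mean and divides by the training standard deviation; the validation and test sets reuse those same two numbers, which is why the cell above calls `fit_transform` only on `X_train`. A minimal NumPy sketch of the arithmetic, using made-up garage counts:

```python
import numpy as np

garage_train = np.array([0, 1, 2, 3, 2, 1, 2, 0], dtype=float)  # made-up values
garage_val = np.array([2, 0, 3], dtype=float)                   # made-up values

mean = garage_train.mean()
std = garage_train.std()   # population std (ddof=0), the convention StandardScaler uses

train_scaled = (garage_train - mean) / std
val_scaled = (garage_val - mean) / std   # training statistics, not the validation set's own
print(train_scaled, val_scaled)
```

Because the transform is affine, integer-valued features stay evenly spaced after scaling. In the table above, the two `Garage` values 0.472375 and 1.356500 are one garage space apart, so that step of 0.884125 is 1/std for the column.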

In [12]:
# Create a neural network for regression
def build_model(input_dim):
    model = Sequential([
        # Explicit input; Keras 3 warns when input_dim is passed to a layer
        keras.Input(shape=(input_dim,)),

        # Hidden layer 1
        Dense(64, activation='relu'),
        BatchNormalization(),
        Dropout(0.2),

        # Hidden layer 2
        Dense(32, activation='relu'),
        BatchNormalization(),
        Dropout(0.2),

        # Hidden layer 3
        Dense(16, activation='relu'),
        BatchNormalization(),
        Dropout(0.1),

        # Output layer - linear activation for regression
        Dense(1)
    ])

    # Compile the model
    model.compile(
        optimizer=Adam(learning_rate=0.001),
        loss='mean_squared_error',
        metrics=['mean_absolute_error']
    )

    return model

# Build the model
model = build_model(X_train_scaled.shape[1])
model.summary()

Model: "sequential"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense) │ (None, 64) │ 896 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization │ (None, 64) │ 256 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout) │ (None, 64) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense) │ (None, 32) │ 2,080 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_1 │ (None, 32) │ 128 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_1 (Dropout) │ (None, 32) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense) │ (None, 16) │ 528 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_2 │ (None, 16) │ 64 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_2 (Dropout) │ (None, 16) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_3 (Dense) │ (None, 1) │ 17 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 3,969 (15.50 KB)

Trainable params: 3,745 (14.63 KB)

Non-trainable params: 224 (896.00 B)
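The split between trainable and non-trainable parameters comes from the batch normalization layers: each keeps a learned scale and shift per unit (trainable) plus a moving mean and moving variance per unit (updated during training but not by gradient descent). The sketch below recomputes the totals; the helper functions are illustrative only.

```python
def dense_params(n_in, n_out):
    # weights plus one bias per output unit
    return (n_in + 1) * n_out

def batchnorm_params(n_units):
    trainable = 2 * n_units       # gamma (scale) and beta (shift)
    non_trainable = 2 * n_units   # moving mean and moving variance
    return trainable, non_trainable

widths = [13, 64, 32, 16]         # 13 input features, then the three hidden layers
trainable = 0
non_trainable = 0
for n_in, n_out in zip(widths[:-1], widths[1:]):
    trainable += dense_params(n_in, n_out)
    bn_trainable, bn_frozen = batchnorm_params(n_out)
    trainable += bn_trainable
    non_trainable += bn_frozen
trainable += dense_params(16, 1)  # output layer

print(trainable, non_trainable, trainable + non_trainable)
```

This reproduces the 3,745 trainable, 224 non-trainable and 3,969 total parameters in the summary.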

In [ ]:
# Define callbacks for training
callbacks = [
    # Early stopping to prevent overfitting
    keras.callbacks.EarlyStopping(
        monitor='val_loss',
        patience=20,
        mode='min',
        restore_best_weights=True
    ),
    # Reduce learning rate when training plateaus
    keras.callbacks.ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.2,
        patience=5,
        min_lr=0.00001
    )
]

# Train the model
history = model.fit(
    X_train_scaled,
    y_train,
    epochs=100,
    batch_size=32,
    validation_data=(X_val_scaled, y_val),
    callbacks=callbacks,
    verbose=1
)
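The `patience` argument of early stopping is easiest to understand by stepping through a loss curve. The function below is a simplified sketch of the idea, not Keras's implementation: it assumes the monitored value should decrease and that any improvement, however small, resets the counter. The loss values are invented.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 0-based.

    stop_epoch is None if training would run to the end.
    """
    best = float('inf')
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # improvement: remember it and reset the counter
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

losses = [5.0, 4.0, 3.5, 3.6, 3.7, 3.4, 3.5, 3.6, 3.7]  # made-up validation losses
stop, best_epoch = early_stopping_epoch(losses, patience=3)
print(stop, best_epoch)
```

With `patience=3` the run survives the temporary rise after epoch 2 and stops at epoch 8, with epoch 5 as the best one; `restore_best_weights=True` corresponds to going back to the weights from that best epoch. With `patience=2` the same curve would stop at epoch 4 and miss the later improvement.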

In [14]:
# Plot training history
plt.figure(figsize=(12, 5))

# Plot training & validation loss


plt.subplot(1, 2, 1)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss (MSE)')
plt.legend()
plt.grid(True, alpha=0.3)

# Plot training & validation MAE


plt.subplot(1, 2, 2)
plt.plot(history.history['mean_absolute_error'], label='Training MAE')
plt.plot(history.history['val_mean_absolute_error'], label='Validation MAE')
plt.title('Model Mean Absolute Error')
plt.xlabel('Epoch')
plt.ylabel('MAE')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

In [15]:
# Evaluate on test set
test_loss, test_mae = model.evaluate(X_test_scaled, y_test, verbose=0)
print(f"Test Loss (MSE): {test_loss:.2f}")
print(f"Test MAE: ${test_mae:.2f}")

Test Loss (MSE): 200826667008.00
Test MAE: $430195.06
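A test MAE of about $430,000 is nearly the mean price of the dataset ($427,721.96), which suggests the network's outputs are still close to zero. The inputs are standardized but the targets are in the hundreds of thousands. One common remedy is to standardize the target as well, train on the scaled values, and invert the transform on the predictions. This is offered as an assumption; it was not run in this notebook. The helper names below are hypothetical and the prices are made up.

```python
import numpy as np

def fit_target_scaler(y_train):
    # statistics come from the training targets only
    y_train = np.asarray(y_train, dtype=float)
    return y_train.mean(), y_train.std()

def scale_target(y, mean, std):
    return (np.asarray(y, dtype=float) - mean) / std

def unscale_target(y_scaled, mean, std):
    return np.asarray(y_scaled, dtype=float) * std + mean

y_train_demo = np.array([250_000.0, 400_000.0, 430_000.0, 610_000.0])  # made-up prices
y_mean, y_std = fit_target_scaler(y_train_demo)
y_train_scaled = scale_target(y_train_demo, y_mean, y_std)

# A model trained on y_train_scaled would emit values on this scale;
# they must be mapped back to dollars before computing MAE or MAPE.
restored = unscale_target(y_train_scaled, y_mean, y_std)
print(y_train_scaled, restored)
```

With the notebook's variables this would mean fitting the statistics on `y_train`, passing the scaled target to `model.fit`, and applying `unscale_target` to the output of `model.predict`. Whether that alone is enough to fix the fit here is untested.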

In [16]:
# Make predictions
y_pred = model.predict(X_test_scaled)

# Calculate regression metrics


mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Squared Error (MSE): {mse:.2f}")


print(f"Root Mean Squared Error (RMSE): ${rmse:.2f}")
print(f"Mean Absolute Error (MAE): ${mae:.2f}")
print(f"R² Score: {r2:.4f}")

# Calculate MAPE (Mean Absolute Percentage Error)


mape = np.mean(np.abs((y_test - y_pred.flatten()) / y_test)) * 100
print(f"Mean Absolute Percentage Error (MAPE): {mape:.2f}%")

8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step
Mean Squared Error (MSE): 200826681383.91
Root Mean Squared Error (RMSE): $448136.90
Mean Absolute Error (MAE): $430195.08
R² Score: -11.6721
Mean Absolute Percentage Error (MAPE): 99.93%

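These figures have a recognizable signature. A MAPE just under 100% together with a strongly negative R² is what a model produces when its predictions are tiny compared with the targets. The sketch below implements the same metrics in NumPy and compares two trivial predictors on invented prices; `regression_metrics` is an illustrative helper, not a scikit-learn function.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    mae = np.mean(np.abs(err))
    # R² compares the squared error with that of always predicting the mean
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    mape = np.mean(np.abs(err / y_true)) * 100
    return {'mse': mse, 'rmse': np.sqrt(mse), 'mae': mae, 'r2': r2, 'mape': mape}

y_true = np.array([300_000.0, 400_000.0, 500_000.0, 600_000.0])      # made-up prices

near_zero = regression_metrics(y_true, np.full(4, 500.0))            # predicts about $500
mean_only = regression_metrics(y_true, np.full(4, y_true.mean()))    # predicts the mean

print(near_zero)
print(mean_only)
```

Predicting the mean gives R² = 0 by construction, so any negative R² means the model does worse than a constant. The near-zero predictor lands at a MAPE of roughly 99.9% with an MAE close to the average price, the same pattern as the results above.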
In [17]:
# Visualize predictions vs actual values
plt.figure(figsize=(10, 6))
plt.scatter(y_test, y_pred, alpha=0.5)
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--', lw=2)
plt.title('Predicted vs Actual House Prices')
plt.xlabel('Actual Price ($)')
plt.ylabel('Predicted Price ($)')
plt.grid(True, alpha=0.3)

# Add metrics to the plot
plt.annotate(f"R² = {r2:.4f}\nRMSE = ${rmse:.2f}\nMAPE = {mape:.2f}%",
             xy=(0.05, 0.92), xycoords='axes fraction',
             bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="gray", alpha=0.8))

plt.show()

In [18]:
# Plot residuals
residuals = y_test - y_pred.flatten()

plt.figure(figsize=(12, 5))

# Residuals vs Predicted
plt.subplot(1, 2, 1)
plt.scatter(y_pred, residuals, alpha=0.5)
plt.axhline(y=0, color='r', linestyle='--')
plt.title('Residuals vs Predicted Values')
plt.xlabel('Predicted Price ($)')
plt.ylabel('Residuals')
plt.grid(True, alpha=0.3)

# Residual distribution
plt.subplot(1, 2, 2)
sns.histplot(residuals, kde=True)
plt.title('Distribution of Residuals')
plt.xlabel('Residual Value')
plt.ylabel('Frequency')

plt.tight_layout()
plt.show()

In [ ]:
# Function to measure feature importance using permutation method
def permutation_importance(model, X, y, n_repeats=10):
    baseline_mae = mean_absolute_error(y, model.predict(X))
    importances = []

    for col_idx in range(X.shape[1]):
        col_importances = []
        for _ in range(n_repeats):
            # Create a shuffled copy of the feature
            X_permuted = X.copy()
            np.random.shuffle(X_permuted[:, col_idx])

            # Measure the change in MAE
            permuted_mae = mean_absolute_error(y, model.predict(X_permuted))
            importance = permuted_mae - baseline_mae
            col_importances.append(importance)

        importances.append(np.mean(col_importances))

    return importances

# Calculate feature importances
feature_importances = permutation_importance(model, X_test_scaled, y_test, n_repeats=5)

# Create a DataFrame for visualization
importance_df = pd.DataFrame({
    'Feature': X.columns,
    'Importance': feature_importances
})

# Sort by importance
importance_df = importance_df.sort_values('Importance', ascending=False)

# Plot feature importance


plt.figure(figsize=(12, 8))
plt.barh(importance_df['Feature'], importance_df['Importance'], color='skyblue')
plt.title('Feature Importance based on Permutation Method')
plt.xlabel('Increase in MAE when Feature is Permuted')
plt.grid(True, alpha=0.3, axis='x')
plt.tight_layout()
plt.show()
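Permutation importance can be checked on a case where the answer is known. In the sketch below the "model" is a fixed linear function invented for illustration: it weights the first column ten times more than the second and ignores the third. The function takes a plain `predict` callable instead of a Keras model so that it runs without TensorFlow.

```python
import numpy as np

def permutation_importance_mae(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in MAE when each column is shuffled in turn."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(np.abs(y - predict(X)))
    importances = []
    for col in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, col] = rng.permutation(X_perm[:, col])  # break the link to y
            increases.append(np.mean(np.abs(y - predict(X_perm))) - baseline)
        importances.append(float(np.mean(increases)))
    return importances

def toy_predict(X):
    # stand-in model: third column is ignored on purpose
    return 10.0 * X[:, 0] + 1.0 * X[:, 1]

rng = np.random.default_rng(42)
X_demo = rng.normal(size=(500, 3))
y_demo = toy_predict(X_demo)   # targets the stand-in model fits exactly

imp = permutation_importance_mae(toy_predict, X_demo, y_demo)
print(imp)
```

The ignored column scores exactly zero and the heavily weighted column scores highest. Since the scores are differences in MAE, they are only as informative as the model itself; with the test metrics reported above, the importances from this notebook's network should be read with caution.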

In [21]:
# Create sample houses for prediction
sample_houses = pd.DataFrame([
    # Small starter home
    {
        'SquareFootage': 1200,
        'NumBedrooms': 2,
        'NumBathrooms': 1,
        'LotSize': 5000,
        'PropertyAge': 40,
        'Garage': 1,
        'HasPool': 0,
        'HasBasement': 0,
        'DistanceToCity': 20,
        'SchoolRating': 6.5,
        'CrimeRate': 7.2,
        'MedianNeighborhoodIncome': 45000,
        'RoomsPerSqft': (2 + 1) / 1200
    },
    # Medium suburban home
    {
        'SquareFootage': 2500,
        'NumBedrooms': 3,
        'NumBathrooms': 2.5,
        'LotSize': 8500,
        'PropertyAge': 15,
        'Garage': 2,
        'HasPool': 0,
        'HasBasement': 1,
        'DistanceToCity': 12,
        'SchoolRating': 8.2,
        'CrimeRate': 3.1,
        'MedianNeighborhoodIncome': 85000,
        'RoomsPerSqft': (3 + 2.5) / 2500
    },
    # Luxury home
    {
        'SquareFootage': 4200,
        'NumBedrooms': 5,
        'NumBathrooms': 4.5,
        'LotSize': 15000,
        'PropertyAge': 5,
        'Garage': 3,
        'HasPool': 1,
        'HasBasement': 1,
        'DistanceToCity': 8,
        'SchoolRating': 9.8,
        'CrimeRate': 1.2,
        'MedianNeighborhoodIncome': 150000,
        'RoomsPerSqft': (5 + 4.5) / 4200
    }
])

# Reorder columns to match training data
sample_houses = sample_houses[X.columns]

# Scale the sample houses
sample_houses_scaled = scaler.transform(sample_houses)

# Make predictions
sample_predictions = model.predict(sample_houses_scaled).flatten()

# Add predictions to the sample houses DataFrame
sample_houses['Predicted Price'] = sample_predictions

# Define house types for display
house_types = ['Small Starter Home', 'Medium Suburban Home', 'Luxury Home']
sample_houses['House Type'] = house_types

# Display the results
print("\nPrice Predictions for Sample Houses:\n")
sample_results = sample_houses[['House Type', 'SquareFootage', 'NumBedrooms',
                                'NumBathrooms', 'PropertyAge', 'Predicted Price']]
display(sample_results)

# Format prices nicely
for i, house_type in enumerate(house_types):
    price = sample_predictions[i]
    print(f"{house_type}: ${price:,.2f}")

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 30ms/step

Price Predictions for Sample Houses:

             House Type  SquareFootage  NumBedrooms  NumBathrooms  PropertyAge  Predicted Price
0    Small Starter Home           1200            2           1.0           40      -434.238617
1  Medium Suburban Home           2500            3           2.5           15       925.150818
2           Luxury Home           4200            5           4.5            5      1586.954102

Small Starter Home: $-434.24
Medium Suburban Home: $925.15
Luxury Home: $1,586.95
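Because the dataset is synthetic, the generator's own formula gives a reference point for these predictions. The sketch below re-applies that formula, without the random noise term, to the three sample houses. `noise_free_price` is a helper introduced here for illustration; the coefficients are copied from `generate_house_price_data`.

```python
def noise_free_price(h):
    """Price from the generator's formula, leaving out the random noise."""
    price = 50000 + 110 * h['SquareFootage']
    price += 15000 * h['NumBedrooms']
    price += 25000 * h['NumBathrooms']
    price += 0.5 * h['LotSize']
    price -= 1000 * h['PropertyAge']
    price += 12000 * h['Garage']
    price += 30000 * h['HasPool']
    price += 25000 * h['HasBasement']
    price -= 5000 * h['DistanceToCity']
    price += 15000 * h['SchoolRating']
    price -= 10000 * h['CrimeRate']
    price += 0.8 * h['MedianNeighborhoodIncome']
    return max(50000, price)   # same price floor as the generator

samples = {
    'Small Starter Home': dict(
        SquareFootage=1200, NumBedrooms=2, NumBathrooms=1, LotSize=5000,
        PropertyAge=40, Garage=1, HasPool=0, HasBasement=0, DistanceToCity=20,
        SchoolRating=6.5, CrimeRate=7.2, MedianNeighborhoodIncome=45000),
    'Medium Suburban Home': dict(
        SquareFootage=2500, NumBedrooms=3, NumBathrooms=2.5, LotSize=8500,
        PropertyAge=15, Garage=2, HasPool=0, HasBasement=1, DistanceToCity=12,
        SchoolRating=8.2, CrimeRate=3.1, MedianNeighborhoodIncome=85000),
    'Luxury Home': dict(
        SquareFootage=4200, NumBedrooms=5, NumBathrooms=4.5, LotSize=15000,
        PropertyAge=5, Garage=3, HasPool=1, HasBasement=1, DistanceToCity=8,
        SchoolRating=9.8, CrimeRate=1.2, MedianNeighborhoodIncome=150000),
}

expected = {name: noise_free_price(h) for name, h in samples.items()}
for name, price in expected.items():
    print(f"{name}: ${price:,.2f}")
```

The formula puts these houses at about $173,000, $570,750 and $1,008,000. Predictions of a few hundred dollars, one of them negative, therefore indicate that the network has not learned the scale of the target.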

In [23]:

# Function to predict price given a base house and varied feature
def predict_price_with_varied_feature(base_house, feature_name, feature_values):
    # Create copies of the base house with different feature values
    varied_houses = []

    for value in feature_values:
        house_copy = base_house.copy()
        house_copy[feature_name] = value

        # Update RoomsPerSqft if needed
        if feature_name in ['NumBedrooms', 'NumBathrooms', 'SquareFootage']:
            bedrooms = house_copy['NumBedrooms']
            bathrooms = house_copy['NumBathrooms']
            sqft = house_copy['SquareFootage']
            house_copy['RoomsPerSqft'] = (bedrooms + bathrooms) / sqft

        varied_houses.append(house_copy)

    # Convert to DataFrame, keeping the column order used for training
    varied_df = pd.DataFrame(varied_houses)[X.columns]

    # Scale and predict
    varied_scaled = scaler.transform(varied_df)
    predictions = model.predict(varied_scaled).flatten()

    return predictions

# Use the medium suburban home as our base for what-if analysis
base_house = sample_houses.iloc[1].to_dict()
base_house.pop('House Type', None)
base_house.pop('Predicted Price', None)

Out[23]:
925.1508178710938

In [ ]:

# Analyze the impact of square footage
sqft_values = np.linspace(1000, 5000, 10)
sqft_predictions = predict_price_with_varied_feature(base_house, 'SquareFootage', sqft_values)

# Analyze the impact of property age
age_values = np.linspace(0, 70, 10)
age_predictions = predict_price_with_varied_feature(base_house, 'PropertyAge', age_values)

# Analyze the impact of school rating
school_values = np.linspace(1, 10, 10)
school_predictions = predict_price_with_varied_feature(base_house, 'SchoolRating', school_values)

# Analyze the impact of distance to city
distance_values = np.linspace(0, 30, 10)
distance_predictions = predict_price_with_varied_feature(base_house, 'DistanceToCity', distance_values)

In [25]:

# Plot the what-if analysis results
plt.figure(figsize=(20, 15))

# Square footage impact
plt.subplot(2, 2, 1)
plt.plot(sqft_values, sqft_predictions, marker='o', linestyle='-', linewidth=2, markersize=8)
plt.title('Impact of Square Footage on House Price')
plt.xlabel('Square Footage')
plt.ylabel('Predicted Price ($)')
plt.grid(True, alpha=0.3)

# Calculate price increase per square foot
price_per_sqft = (sqft_predictions[-1] - sqft_predictions[0]) / (sqft_values[-1] - sqft_values[0])
plt.annotate(f"Avg Price Increase: ${price_per_sqft:.2f} per sq ft",
             xy=(0.05, 0.92), xycoords='axes fraction',
             bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="gray", alpha=0.8))

# Property age impact
plt.subplot(2, 2, 2)
plt.plot(age_values, age_predictions, marker='o', linestyle='-', linewidth=2, markersize=8, color='green')
plt.title('Impact of Property Age on House Price')
plt.xlabel('Property Age (years)')
plt.ylabel('Predicted Price ($)')
plt.grid(True, alpha=0.3)

# Calculate price decrease per year of age
price_per_year = (age_predictions[0] - age_predictions[-1]) / (age_values[-1] - age_values[0])
plt.annotate(f"Avg Price Decrease: ${price_per_year:.2f} per year",
             xy=(0.05, 0.92), xycoords='axes fraction',
             bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="gray", alpha=0.8))

# School rating impact
plt.subplot(2, 2, 3)
plt.plot(school_values, school_predictions, marker='o', linestyle='-', linewidth=2, markersize=8, color='red')
plt.title('Impact of School Rating on House Price')
plt.xlabel('School Rating (1-10)')
plt.ylabel('Predicted Price ($)')
plt.grid(True, alpha=0.3)

# Calculate price increase per school rating point
price_per_rating = (school_predictions[-1] - school_predictions[0]) / (school_values[-1] - school_values[0])
plt.annotate(f"Avg Price Increase: ${price_per_rating:.2f} per rating point",
             xy=(0.05, 0.92), xycoords='axes fraction',
             bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="gray", alpha=0.8))

# Distance to city impact
plt.subplot(2, 2, 4)
plt.plot(distance_values, distance_predictions, marker='o', linestyle='-', linewidth=2, markersize=8, color='purple')
plt.title('Impact of Distance to City on House Price')
plt.xlabel('Distance to City (miles)')
plt.ylabel('Predicted Price ($)')
plt.grid(True, alpha=0.3)

# Calculate price decrease per mile
price_per_mile = (distance_predictions[0] - distance_predictions[-1]) / (distance_values[-1] - distance_values[0])
plt.annotate(f"Avg Price Decrease: ${price_per_mile:.2f} per mile",
             xy=(0.05, 0.92), xycoords='axes fraction',
             bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="gray", alpha=0.8))

plt.suptitle('What-If Analysis: How Individual Features Affect House Price', fontsize=20)
plt.tight_layout()
plt.subplots_adjust(top=0.92)
plt.show()

In [27]:
# Create a function to predict house price based on input features
def predict_house_price(square_footage, num_bedrooms, num_bathrooms, lot_size, property_age,
                        garage_spaces, has_pool, has_basement, distance_to_city,
                        school_rating, crime_rate, median_neighborhood_income):

    # Calculate the derived feature
    rooms_per_sqft = (num_bedrooms + num_bathrooms) / square_footage

    # Create a DataFrame with the house features
    house_features = pd.DataFrame([
        {
            'SquareFootage': square_footage,
            'NumBedrooms': num_bedrooms,
            'NumBathrooms': num_bathrooms,
            'LotSize': lot_size,
            'PropertyAge': property_age,
            'Garage': garage_spaces,
            'HasPool': has_pool,
            'HasBasement': has_basement,
            'DistanceToCity': distance_to_city,
            'SchoolRating': school_rating,
            'CrimeRate': crime_rate,
            'MedianNeighborhoodIncome': median_neighborhood_income,
            'RoomsPerSqft': rooms_per_sqft
        }
    ])

    # Reorder columns to match training data
    house_features = house_features[X.columns]

    # Scale the features
    house_features_scaled = scaler.transform(house_features)

    # Make prediction
    predicted_price = model.predict(house_features_scaled)[0][0]

    return predicted_price

# Test the function with an example house
example_price = predict_house_price(
    square_footage=2200,
    num_bedrooms=3,
    num_bathrooms=2,
    lot_size=9000,
    property_age=12,
    garage_spaces=2,
    has_pool=0,
    has_basement=1,
    distance_to_city=15,
    school_rating=7.5,
    crime_rate=4.2,
    median_neighborhood_income=75000
)

print(f"Predicted House Price: ${example_price:,.2f}")

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step
Predicted House Price: $590.48

Experiment 6: Basic Speech Recognition using Neural Networks
In [ ]:

%pip install librosa

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Dense, Dropout, Conv1D, MaxPooling1D, Flatten, LSTM,
                                     BatchNormalization)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import confusion_matrix, classification_report
import librosa
import librosa.display
import IPython.display as ipd
import os
import warnings

# Set random seeds for reproducibility


np.random.seed(42)
tf.random.set_seed(42)

# Ignore warnings
warnings.filterwarnings('ignore')

# Check versions
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")
print(f"Librosa version: {librosa.__version__}")

In [2]:
# Try to download a small version of the Speech Commands dataset
try:
    # Create a directory for our audio files
    os.makedirs('speech_data', exist_ok=True)

    # Download a subset of the Speech Commands dataset
    !wget -q -O speech_commands_v0.01.tar.gz http://download.tensorflow.org/data/speech_commands_v0.01.tar.gz
    !tar -xzf speech_commands_v0.01.tar.gz -C speech_data

    # Check if download and extraction were successful
    if not os.path.exists('speech_data/_background_noise_'):
        raise Exception("Download or extraction failed")

    print("Successfully downloaded and extracted Speech Commands dataset.")

    # List the available command categories
    categories = [d for d in os.listdir('speech_data')
                  if os.path.isdir(os.path.join('speech_data', d)) and not d.startswith('_')]
    print(f"Available command categories: {categories}")

    # We'll use a subset of commands for our example
    selected_commands = ['yes', 'no', 'up', 'down', 'left', 'right']
    print(f"Selected commands for our model: {selected_commands}")

    using_synthetic_data = False
except Exception as e:
    print(f"Error: {e}")
    print("Failed to download Speech Commands dataset. Creating synthetic audio data instead.")
    using_synthetic_data = True

Successfully downloaded and extracted Speech Commands dataset.
Available command categories: ['yes', 'six', 'house', 'tree', 'no', 'on', 'off', 'go',
'three', 'up', 'seven', 'right', 'happy', 'eight', 'stop', 'five', 'cat', 'one', 'sheila',
'down', 'wow', 'two', 'nine', 'zero', 'left', 'marvin', 'dog', 'four', 'bird', 'bed']
Selected commands for our model: ['yes', 'no', 'up', 'down', 'left', 'right']

In [3]:

# Function to generate synthetic audio for demonstration
def generate_synthetic_speech_data(n_samples=1000, duration=1.0, sr=16000):
    """
    Generate synthetic audio data that mimics speech for different commands.
    This is not real speech but will allow us to demonstrate the concepts.
    """
    # Commands to simulate
    commands = ['yes', 'no', 'up', 'down', 'left', 'right']
    n_commands = len(commands)
    n_samples_per_command = n_samples // n_commands

    # Create synthetic data
    X = []
    y = []

    for idx, command in enumerate(commands):
        for _ in range(n_samples_per_command):
            # Generate a base frequency between 150 and 350 Hz
            base_freq = np.random.uniform(150, 350)

            # Create time array
            t = np.linspace(0, duration, int(sr * duration), endpoint=False)

            # Generate base signal with the frequency
            x = np.sin(2 * np.pi * base_freq * t)

            # Add harmonics with different patterns for each command
            if command == 'yes':
                x += 0.5 * np.sin(2 * np.pi * (base_freq * 2) * t) * np.exp(-t/0.5)
            elif command == 'no':
                x += 0.5 * np.sin(2 * np.pi * (base_freq * 1.5) * t) * np.exp(-t/0.3)
            elif command == 'up':
                x += 0.4 * np.sin(2 * np.pi * np.linspace(base_freq, base_freq * 2, len(t)) * t)
            elif command == 'down':
                x += 0.4 * np.sin(2 * np.pi * np.linspace(base_freq * 2, base_freq, len(t)) * t)
            elif command == 'left':
                x += 0.3 * np.sin(2 * np.pi * (base_freq * 1.2) * t) * (1 + np.sin(2 * np.pi * 3 * t))
            elif command == 'right':
                x += 0.3 * np.sin(2 * np.pi * (base_freq * 1.8) * t) * (1 + np.sin(2 * np.pi * 4 * t))

            # Add some noise
            x += np.random.normal(0, 0.1, len(x))

            # Normalize
            x = x / np.max(np.abs(x))

            X.append(x)
            y.append(command)

    return np.array(X), np.array(y), commands

# If real data is not available, generate synthetic data
if using_synthetic_data:
    X_synthetic, y_synthetic, selected_commands = generate_synthetic_speech_data(1200)
    print(f"Generated synthetic speech data for commands: {selected_commands}")
    print(f"Data shape: {X_synthetic.shape}, Labels shape: {y_synthetic.shape}")
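The 'up' and 'down' branches above are meant to sound like rising and falling tones. Because instantaneous frequency is the derivative of the phase, multiplying a linearly varying frequency by `t` makes the pitch change about twice as fast as the `np.linspace` endpoints suggest. The sketch below shows one way to build a sweep that follows its endpoints exactly by accumulating phase. `linear_chirp` is a hypothetical helper for illustration and is not part of the notebook.

```python
import numpy as np

def linear_chirp(f_start, f_end, duration=1.0, sr=16000):
    """Sine sweep whose instantaneous frequency goes linearly from f_start to f_end."""
    n = int(sr * duration)
    t = np.arange(n) / sr
    freq = np.linspace(f_start, f_end, n)      # target frequency at each sample (Hz)
    phase = 2 * np.pi * np.cumsum(freq) / sr   # phase = running integral of frequency
    return t, np.sin(phase), phase

t, signal, phase = linear_chirp(200.0, 400.0)

# Recover the instantaneous frequency from the phase increments
inst_freq = np.diff(phase) * 16000 / (2 * np.pi)
print(inst_freq[0], inst_freq[-1])
```

For synthetic classes that only need to be distinguishable, the original construction still works; the accumulated-phase form matters when the sweep range itself should be accurate.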

In [4]:
# Function to load and process real audio files
def load_audio_files(commands, data_dir='speech_data', max_files_per_command=200, duration=1.0, sr=16000):
    X = []
    y = []
    target_len = int(sr * duration)  # Fixed number of samples per clip

    for command in commands:
        print(f"Loading {command} audio files...")
        command_dir = os.path.join(data_dir, command)
        files = os.listdir(command_dir)[:max_files_per_command]

        for file in files:
            file_path = os.path.join(command_dir, file)
            try:
                # Load the audio file with a fixed duration
                audio, _ = librosa.load(file_path, sr=sr, duration=duration)

                # Ensure all audio samples have the same length
                if len(audio) < target_len:
                    audio = np.pad(audio, (0, target_len - len(audio)))
                else:
                    audio = audio[:target_len]

                X.append(audio)
                y.append(command)
            except Exception as e:
                print(f"Error loading file {file_path}: {e}")
                continue

    return np.array(X), np.array(y)

# Load real audio data if available, otherwise use synthetic data
if not using_synthetic_data:
    # Load real audio files
    X, y = load_audio_files(selected_commands)
    print(f"Loaded {len(X)} audio files.")
else:
    # Use synthetic data
    X, y = X_synthetic, y_synthetic
    print(f"Using synthetic audio data with {len(X)} samples.")

Loading yes audio files...


Loading no audio files...
Loading up audio files...
Loading down audio files...
Loading left audio files...
Loading right audio files...
Loaded 1200 audio files.

In [5]:
# Visualize some audio waveforms
plt.figure(figsize=(15, 10))
commands_to_plot = set(y)  # Get unique commands

for idx, command in enumerate(commands_to_plot):
    # Find first occurrence of this command
    audio_idx = np.where(y == command)[0][0]
    audio = X[audio_idx]

    # Plot the waveform
    plt.subplot(len(commands_to_plot), 2, 2*idx+1)
    plt.plot(audio)
    plt.title(f'Waveform for "{command}"')
    plt.xlabel('Sample')
    plt.ylabel('Amplitude')

    # Plot the spectrogram
    plt.subplot(len(commands_to_plot), 2, 2*idx+2)
    D = librosa.amplitude_to_db(np.abs(librosa.stft(audio)), ref=np.max)
    librosa.display.specshow(D, y_axis='log', x_axis='time')
    plt.title(f'Spectrogram for "{command}"')
    plt.colorbar(format='%+2.0f dB')

plt.tight_layout()
plt.show()

In [6]:
# Play some audio samples (only works in interactive notebook environments)
for command in list(set(y))[:3]:  # Only play first few commands
    audio_idx = np.where(y == command)[0][0]
    audio = X[audio_idx]

    print(f"Playing sample for '{command}'")

    # This will only work in a Jupyter notebook environment
    ipd.display(ipd.Audio(audio, rate=16000))

Playing sample for 'left'

Playing sample for 'up'

Playing sample for 'down'


In [7]:
# Function to extract MFCCs from audio data
def extract_mfccs(audio, sr=16000, n_mfcc=13):
    mfccs = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)
    return mfccs.T  # Transpose to get time steps as first dimension

# Extract MFCCs for all samples
n_mfcc = 13  # Number of MFCC coefficients to extract
X_mfccs = []

print("Extracting MFCC features...")
for audio in X:
    mfccs = extract_mfccs(audio, n_mfcc=n_mfcc)
    X_mfccs.append(mfccs)

# Convert list to numpy array
X_mfccs = np.array(X_mfccs)
print(f"MFCC features shape: {X_mfccs.shape}")

Extracting MFCC features...


MFCC features shape: (1200, 32, 13)
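The 32 time steps in the shape above follow from how the audio is framed. A minimal sketch of the arithmetic, assuming `librosa.feature.mfcc` ran with its default `hop_length=512` and centered frames (the `extract_mfccs` cell does not override either); `n_frames_centered` is a hypothetical helper:

```python
def n_frames_centered(n_samples, hop_length=512):
    """Frame count for a centered STFT: one frame per hop, plus the frame at sample 0."""
    return 1 + n_samples // hop_length

sr = 16000        # sampling rate used throughout the notebook
duration = 1.0    # clip length in seconds
frames = n_frames_centered(int(sr * duration))
print(f"Frames per clip: {frames}, coefficients per frame: 13")
```

Each one-second clip therefore becomes a 32 x 13 matrix, which is the per-sample input shape both models receive.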

In [8]:
# Visualize MFCC features for different commands
plt.figure(figsize=(15, 10))
commands_to_plot = set(y)  # Get unique commands

for idx, command in enumerate(commands_to_plot):
    # Find first occurrence of this command
    audio_idx = np.where(y == command)[0][0]
    mfcc = X_mfccs[audio_idx]

    plt.subplot(len(commands_to_plot), 1, idx+1)
    librosa.display.specshow(mfcc.T, x_axis='time')
    plt.title(f'MFCC for "{command}"')
    plt.colorbar()

plt.tight_layout()
plt.show()

In [9]:
# Encode the labels
label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(y)

# Print the mapping
print("Label encoding:")
for idx, label in enumerate(label_encoder.classes_):
    print(f" {label} -> {idx}")

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X_mfccs, y_encoded, test_size=0.2, random_state=42, stratify=y_encoded
)

print(f"Training data shape: {X_train.shape}")
print(f"Testing data shape: {X_test.shape}")

# Convert to categorical (one-hot encoded) format
y_train_cat = tf.keras.utils.to_categorical(y_train, num_classes=len(label_encoder.classes_))
y_test_cat = tf.keras.utils.to_categorical(y_test, num_classes=len(label_encoder.classes_))

print(f"y_train_cat shape: {y_train_cat.shape}")
print(f"y_test_cat shape: {y_test_cat.shape}")

Label encoding:
down -> 0
left -> 1
no -> 2
right -> 3
up -> 4
yes -> 5
Training data shape: (960, 32, 13)
Testing data shape: (240, 32, 13)
y_train_cat shape: (960, 6)
y_test_cat shape: (240, 6)
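The one-hot step turns each integer label into a row with a single 1, which is why the label arrays gain a second dimension of size 6. A NumPy-only sketch of the same idea; `one_hot` is a hypothetical stand-in for illustration, not the Keras implementation:

```python
import numpy as np

def one_hot(labels, num_classes):
    """Map integer class ids to rows of an identity matrix."""
    labels = np.asarray(labels)
    return np.eye(num_classes, dtype="float32")[labels]

# Three example labels using the encoding printed above: down=0, yes=5, no=2
encoded = one_hot([0, 5, 2], num_classes=6)
print(encoded)

# argmax undoes the encoding, which is how predictions are decoded later
decoded = np.argmax(encoded, axis=1)
print(decoded)
```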

In [10]:
# 1. Build a CNN model
def build_cnn_model(input_shape, num_classes):
    model = Sequential([
        # Convolutional layers
        Conv1D(32, 3, activation='relu', padding='same', input_shape=input_shape),
        BatchNormalization(),
        MaxPooling1D(pool_size=2),

        Conv1D(64, 3, activation='relu', padding='same'),
        BatchNormalization(),
        MaxPooling1D(pool_size=2),

        Conv1D(128, 3, activation='relu', padding='same'),
        BatchNormalization(),
        MaxPooling1D(pool_size=2),

        # Flatten and dense layers
        Flatten(),
        Dense(128, activation='relu'),
        Dropout(0.5),
        Dense(num_classes, activation='softmax')
    ])

    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )

    return model

# 2. Build an LSTM model (recurrent neural network)
def build_lstm_model(input_shape, num_classes):
    model = Sequential([
        # LSTM layers
        LSTM(64, return_sequences=True, input_shape=input_shape),
        Dropout(0.2),

        LSTM(64),
        Dropout(0.2),

        # Dense layers
        Dense(32, activation='relu'),
        Dropout(0.3),
        Dense(num_classes, activation='softmax')
    ])

    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )

    return model

# Create models
input_shape = X_train.shape[1:]
num_classes = len(label_encoder.classes_)

cnn_model = build_cnn_model(input_shape, num_classes)
lstm_model = build_lstm_model(input_shape, num_classes)

print("CNN Model Summary:")
cnn_model.summary()

print("\nLSTM Model Summary:")
lstm_model.summary()

CNN Model Summary:

Model: "sequential"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv1d (Conv1D) │ (None, 32, 32) │ 1,280 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization │ (None, 32, 32) │ 128 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling1d (MaxPooling1D) │ (None, 16, 32) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv1d_1 (Conv1D) │ (None, 16, 64) │ 6,208 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_1 │ (None, 16, 64) │ 256 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling1d_1 (MaxPooling1D) │ (None, 8, 64) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv1d_2 (Conv1D) │ (None, 8, 128) │ 24,704 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_2 │ (None, 8, 128) │ 512 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling1d_2 (MaxPooling1D) │ (None, 4, 128) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten) │ (None, 512) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense) │ (None, 128) │ 65,664 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout) │ (None, 128) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense) │ (None, 6) │ 774 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 99,526 (388.77 KB)

Trainable params: 99,078 (387.02 KB)

Non-trainable params: 448 (1.75 KB)
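The parameter counts in the CNN summary can be reproduced by hand, which is a useful check that the layer shapes are what was intended. The sketch below uses the standard per-layer formulas; the helper names are hypothetical.

```python
def conv1d_params(kernel_size, in_channels, filters):
    # One kernel of size kernel_size x in_channels plus one bias per filter
    return (kernel_size * in_channels + 1) * filters

def batchnorm_params(channels):
    # gamma and beta (trainable) plus moving mean and variance (non-trainable)
    return 4 * channels

def dense_params(inputs, units):
    # One weight per input plus one bias, per unit
    return (inputs + 1) * units

counts = [
    conv1d_params(3, 13, 32),    # conv1d: 13 MFCC coefficients in, 32 filters
    batchnorm_params(32),
    conv1d_params(3, 32, 64),    # conv1d_1
    batchnorm_params(64),
    conv1d_params(3, 64, 128),   # conv1d_2
    batchnorm_params(128),
    dense_params(4 * 128, 128),  # dense: flattened 4 time steps x 128 channels
    dense_params(128, 6),        # dense_1: six command classes
]
total = sum(counts)
non_trainable = 2 * (32 + 64 + 128)  # moving statistics of the three BatchNorm layers
print(counts, total, total - non_trainable, non_trainable)
```

The totals agree with the summary: 99,526 parameters, of which 448 are the non-trainable batch-normalization statistics.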

LSTM Model Summary:

Model: "sequential_1"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ lstm (LSTM) │ (None, 32, 64) │ 19,968 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_1 (Dropout) │ (None, 32, 64) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ lstm_1 (LSTM) │ (None, 64) │ 33,024 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_2 (Dropout) │ (None, 64) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense) │ (None, 32) │ 2,080 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_3 (Dropout) │ (None, 32) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_3 (Dense) │ (None, 6) │ 198 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 55,270 (215.90 KB)

Trainable params: 55,270 (215.90 KB)

Non-trainable params: 0 (0.00 B)
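The LSTM counts can be checked the same way. Each LSTM layer has four gates, and each gate has input weights, recurrent weights, and a bias. A short sketch with hypothetical helper names:

```python
def lstm_params(input_dim, units):
    # 4 gates x (input weights + recurrent weights + bias)
    return 4 * (units * (input_dim + units) + units)

def dense_params(inputs, units):
    return (inputs + 1) * units

lstm_counts = [
    lstm_params(13, 64),   # lstm: 13 MFCC coefficients per time step
    lstm_params(64, 64),   # lstm_1: fed by the 64 outputs of the first LSTM
    dense_params(64, 32),  # dense_2
    dense_params(32, 6),   # dense_3
]
print(lstm_counts, sum(lstm_counts))
```

The sum is 55,270, matching the summary. The LSTM model has roughly half the parameters of the CNN.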

In [ ]:
# Define callbacks for early stopping
callbacks = [
tf.keras.callbacks.EarlyStopping(
monitor='val_loss',
patience=5,
mode='min',
restore_best_weights=True
)
]

# Train CNN model


print("Training CNN model...")
cnn_history = cnn_model.fit(
X_train, y_train_cat,
epochs=30,
batch_size=32,
validation_split=0.2,
callbacks=callbacks,
verbose=1
)

# Train LSTM model


print("\nTraining LSTM model...")
lstm_history = lstm_model.fit(
X_train, y_train_cat,
epochs=30,
batch_size=32,
validation_split=0.2,
callbacks=callbacks,
verbose=1
)
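
The `EarlyStopping` callback above stops training once `val_loss` has failed to improve for `patience` consecutive epochs, and `restore_best_weights=True` rolls the model back to the best epoch. A simplified sketch of that bookkeeping in plain Python (it ignores options such as `min_delta`, and the loss values are made up for illustration):

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return (stop_epoch, best_epoch) as 0-based indices for a list of validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

# Hypothetical validation losses: the best value is at epoch 2
losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.75, 0.74, 0.73, 0.9, 0.6]
stop, best = early_stopping_epoch(losses, patience=5)
print(f"Stopped after epoch {stop}, restoring weights from epoch {best}")
```

In this example the later drop to 0.6 is never reached, which is the trade-off a small `patience` makes.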

In [12]:
# Plot training histories
plt.figure(figsize=(12, 5))

# Plot accuracy
plt.subplot(1, 2, 1)
plt.plot(cnn_history.history['accuracy'], label='CNN Training')
plt.plot(cnn_history.history['val_accuracy'], label='CNN Validation')
plt.plot(lstm_history.history['accuracy'], label='LSTM Training')
plt.plot(lstm_history.history['val_accuracy'], label='LSTM Validation')
plt.title('Model Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.grid(True, alpha=0.3)

# Plot loss
plt.subplot(1, 2, 2)
plt.plot(cnn_history.history['loss'], label='CNN Training')
plt.plot(cnn_history.history['val_loss'], label='CNN Validation')
plt.plot(lstm_history.history['loss'], label='LSTM Training')
plt.plot(lstm_history.history['val_loss'], label='LSTM Validation')
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

In [13]:
# Evaluate models on the test set
print("Evaluating CNN model...")
cnn_test_loss, cnn_test_acc = cnn_model.evaluate(X_test, y_test_cat, verbose=0)
print(f"CNN Test Loss: {cnn_test_loss:.4f}")
print(f"CNN Test Accuracy: {cnn_test_acc:.4f}")

print("\nEvaluating LSTM model...")


lstm_test_loss, lstm_test_acc = lstm_model.evaluate(X_test, y_test_cat, verbose=0)
print(f"LSTM Test Loss: {lstm_test_loss:.4f}")
print(f"LSTM Test Accuracy: {lstm_test_acc:.4f}")

Evaluating CNN model...


CNN Test Loss: 0.7453
CNN Test Accuracy: 0.7583
Evaluating LSTM model...
LSTM Test Loss: 0.8699
LSTM Test Accuracy: 0.6833

In [14]:
# Make predictions with both models
cnn_predictions = cnn_model.predict(X_test)
cnn_pred_classes = np.argmax(cnn_predictions, axis=1)

lstm_predictions = lstm_model.predict(X_test)
lstm_pred_classes = np.argmax(lstm_predictions, axis=1)

# Compute confusion matrices


plt.figure(figsize=(15, 6))

# CNN confusion matrix


plt.subplot(1, 2, 1)
cnn_cm = confusion_matrix(y_test, cnn_pred_classes)
sns.heatmap(cnn_cm, annot=True, fmt='d', cmap='Blues',
xticklabels=label_encoder.classes_,
yticklabels=label_encoder.classes_)
plt.title('CNN Confusion Matrix')
plt.xlabel('Predicted Label')
plt.ylabel('True Label')

# LSTM confusion matrix


plt.subplot(1, 2, 2)
lstm_cm = confusion_matrix(y_test, lstm_pred_classes)
sns.heatmap(lstm_cm, annot=True, fmt='d', cmap='Blues',
xticklabels=label_encoder.classes_,
yticklabels=label_encoder.classes_)
plt.title('LSTM Confusion Matrix')
plt.xlabel('Predicted Label')
plt.ylabel('True Label')

plt.tight_layout()
plt.show()

8/8 ━━━━━━━━━━━━━━━━━━━━ 1s 50ms/step


8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 27ms/step

In [15]:

# Generate classification reports
print("CNN Classification Report:")
print(classification_report(y_test, cnn_pred_classes, target_names=label_encoder.classes_))

print("\nLSTM Classification Report:")
print(classification_report(y_test, lstm_pred_classes, target_names=label_encoder.classes_))

CNN Classification Report:
              precision    recall  f1-score   support

        down       0.66      0.72      0.69        40
        left       0.69      0.62      0.66        40
          no       0.60      0.72      0.66        40
       right       0.86      0.93      0.89        40
          up       0.89      0.85      0.87        40
         yes       0.90      0.70      0.79        40

    accuracy                           0.76       240
   macro avg       0.77      0.76      0.76       240
weighted avg       0.77      0.76      0.76       240

LSTM Classification Report:
              precision    recall  f1-score   support

        down       0.59      0.57      0.58        40
        left       0.65      0.55      0.59        40
          no       0.57      0.53      0.55        40
       right       0.73      0.80      0.76        40
          up       0.71      0.80      0.75        40
         yes       0.83      0.85      0.84        40

    accuracy                           0.68       240
   macro avg       0.68      0.68      0.68       240
weighted avg       0.68      0.68      0.68       240
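
The precision, recall, and F1 columns in these reports all derive from the confusion matrix computed earlier. A NumPy sketch of that relationship, using a small made-up 3-class matrix rather than the notebook's results; `per_class_metrics` is a hypothetical helper:

```python
import numpy as np

def per_class_metrics(cm):
    """Precision, recall, and F1 per class from a confusion matrix (rows = true, columns = predicted)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    predicted = cm.sum(axis=0)  # column sums: how often each class was predicted
    actual = cm.sum(axis=1)     # row sums: how often each class actually occurred
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return precision, recall, f1

# Illustrative matrix only, not taken from the models above
cm = [[8, 2, 0],
      [1, 7, 2],
      [1, 1, 8]]
precision, recall, f1 = per_class_metrics(cm)
print(precision, recall, f1)
```

Reading the reports this way, a class with high precision but lower recall (such as 'yes' for the CNN) is one the model rarely predicts wrongly but often misses.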

In [16]:

# Function to recognize speech commands
def recognize_command(audio, model, label_encoder, sr=16000, duration=1.0, n_mfcc=13):
    """
    Recognize speech command from audio data
    """
    # Ensure audio is the right length
    if len(audio) < sr * duration:
        audio = np.pad(audio, (0, int(sr * duration) - len(audio)))
    else:
        audio = audio[:int(sr * duration)]

    # Extract MFCC features
    mfccs = extract_mfccs(audio, sr=sr, n_mfcc=n_mfcc)

    # Reshape for the model
    mfccs = np.expand_dims(mfccs, axis=0)  # Add batch dimension

    # Make prediction
    prediction = model.predict(mfccs)[0]
    predicted_class_idx = np.argmax(prediction)
    predicted_command = label_encoder.classes_[predicted_class_idx]
    confidence = prediction[predicted_class_idx]

    return predicted_command, confidence, prediction

# Choose the better performing model
if cnn_test_acc >= lstm_test_acc:
    best_model = cnn_model
    print("Using CNN model for speech recognition")
else:
    best_model = lstm_model
    print("Using LSTM model for speech recognition")

Using CNN model for speech recognition

In [17]:
# Test speech recognition on a few samples
plt.figure(figsize=(15, 10))

# X_test holds MFCC features, not raw audio, so recover the raw-audio indices
# of the held-out split by repeating the earlier split on the sample indices
_, test_indices = train_test_split(
    np.arange(len(X)), test_size=0.2, random_state=42, stratify=y_encoded
)

for i in range(5):  # Test 5 random samples from the test split
    idx = np.random.choice(test_indices)
    audio = X[idx]
    true_command = y[idx]

    # Recognize the command
    predicted_command, confidence, all_probabilities = recognize_command(
        audio, best_model, label_encoder
    )

    # Plot audio waveform
    plt.subplot(5, 2, 2*i+1)
    plt.plot(audio)
    plt.title(f"Sample {i+1}: True = '{true_command}', Predicted = '{predicted_command}'")
    plt.xlabel('Sample')
    plt.ylabel('Amplitude')

    # Plot probabilities for each command
    plt.subplot(5, 2, 2*i+2)
    barlist = plt.bar(label_encoder.classes_, all_probabilities)

    # Highlight the predicted and true command
    for j, cls in enumerate(label_encoder.classes_):
        if cls == predicted_command:
            barlist[j].set_color('green')
        elif cls == true_command and cls != predicted_command:
            barlist[j].set_color('red')

    plt.title(f"Prediction Probabilities (Confidence: {confidence:.2f})")
    plt.xticks(rotation=45)
    plt.xlabel('Command')
    plt.ylabel('Probability')

plt.tight_layout()
plt.show()

1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 623ms/step


1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 80ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 106ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 126ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 102ms/step
In [18]:

# Function to execute actions based on recognized commands
def execute_command(command, confidence_threshold=0.7):
    """Simulate executing actions based on recognized commands"""
    # Note: confidence_threshold is currently unused
    if command == 'yes':
        return "Confirmed action"
    elif command == 'no':
        return "Cancelled action"
    elif command == 'up':
        return "⬆️ Moving up"
    elif command == 'down':
        return "⬇️ Moving down"
    elif command == 'left':
        return "⬅️ Moving left"
    elif command == 'right':
        return "➡️ Moving right"
    else:
        return "Command not recognized"

# Simulation of real-time speech recognition
def speech_command_demo(num_commands=10):
    print("=== Speech Command Recognition Demo ===\n")
    print("Supported commands:", ', '.join(label_encoder.classes_))
    print("\nListening for commands...\n")

    # Randomly select samples from the full dataset. This includes clips the
    # model was trained on, so the accuracy seen here is optimistic.
    indices = np.random.randint(0, len(X), num_commands)

    for i, idx in enumerate(indices):
        audio = X[idx]
        true_command = y[idx]

        # Simulate recording audio
        print(f"[Recording audio {i+1}...]")

        # Process the audio and recognize the command
        predicted_command, confidence, _ = recognize_command(
            audio, best_model, label_encoder
        )

        # Execute the command action
        action_result = execute_command(predicted_command)

        # Print results
        print(f"Heard: '{predicted_command}' (Confidence: {confidence:.2f})")
        print(f"Executing: {action_result}")
        print(f"[Actual command was: '{true_command}']")
        print("-" * 40)

    print("\nDemo completed.")

# Run the demo
speech_command_demo(8)

=== Speech Command Recognition Demo ===

Supported commands: down, left, no, right, up, yes

Listening for commands...

[Recording audio 1...]


1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 95ms/step
Heard: 'right' (Confidence: 1.00)
Executing: ➡️ Moving right
[Actual command was: 'right']
----------------------------------------
[Recording audio 2...]
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 138ms/step
Heard: 'right' (Confidence: 1.00)
Executing: ➡️ Moving right
[Actual command was: 'right']
----------------------------------------
[Recording audio 3...]
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 89ms/step
Heard: 'yes' (Confidence: 1.00)
Executing: Confirmed action
[Actual command was: 'yes']
----------------------------------------
[Recording audio 4...]
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 43ms/step
Heard: 'up' (Confidence: 1.00)
Executing: ⬆️ Moving up
[Actual command was: 'up']
----------------------------------------
[Recording audio 5...]
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 47ms/step
Heard: 'no' (Confidence: 0.99)
Executing: Cancelled action
[Actual command was: 'no']
----------------------------------------
[Recording audio 6...]
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 41ms/step
Heard: 'no' (Confidence: 0.60)
Executing: Cancelled action
[Actual command was: 'yes']
----------------------------------------
[Recording audio 7...]
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 42ms/step
Heard: 'right' (Confidence: 1.00)
Executing: ➡️ Moving right
[Actual command was: 'right']
----------------------------------------
[Recording audio 8...]
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 42ms/step
Heard: 'left' (Confidence: 0.98)
Executing: ⬅️ Moving left
[Actual command was: 'left']
----------------------------------------

Demo completed.

In [19]:
# Simulate Google Speech-to-Text API integration
def simulate_google_speech_api(audio, language_code="en-US"):
    """Simulate calling Google Speech-to-Text API"""
    # In a real implementation, you would:
    # 1. Convert the audio to the correct format
    # 2. Call the Google Speech-to-Text API
    # 3. Process the response

    # For simulation, we'll just use our existing model but add some API-like output
    predicted_command, confidence, _ = recognize_command(
        audio, best_model, label_encoder
    )

    # Simulate API response format
    response = {
        "results": [
            {
                "alternatives": [
                    {
                        "transcript": predicted_command,
                        "confidence": float(confidence)
                    }
                ]
            }
        ],
        "language_code": language_code
    }

    return response

# Demonstration of API integration
def api_integration_demo():
    print("=== Speech Recognition API Integration Demo ===\n")

    # Sample commands to recognize
    commands_to_test = set(y)

    for command in commands_to_test:
        # Find an audio sample for this command
        idx = np.where(y == command)[0][0]
        audio = X[idx]

        print(f"Processing audio for command: '{command}'")
        print("Calling Speech-to-Text API...")

        # Call simulated API
        response = simulate_google_speech_api(audio)

        # Process API response
        if response and "results" in response and len(response["results"]) > 0:
            transcript = response["results"][0]["alternatives"][0]["transcript"]
            confidence = response["results"][0]["alternatives"][0]["confidence"]

            print(f"API Result: '{transcript}' (Confidence: {confidence:.2f})")

            # Check if it's correct
            if transcript == command:
                print("Correctly recognized!")
            else:
                print("Incorrectly recognized.")
        else:
            print("API returned no results.")

        print("-" * 40)

    print("\nAPI integration demo completed.")

# Run the API integration demo
api_integration_demo()

=== Speech Recognition API Integration Demo ===

Processing audio for command: 'left'


Calling Speech-to-Text API...
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 41ms/step
API Result: 'left' (Confidence: 0.93)
Correctly recognized!
----------------------------------------
Processing audio for command: 'up'
Calling Speech-to-Text API...
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 40ms/step
API Result: 'up' (Confidence: 0.64)
Correctly recognized!
----------------------------------------
Processing audio for command: 'down'
Calling Speech-to-Text API...
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 42ms/step
API Result: 'down' (Confidence: 0.99)
Correctly recognized!
----------------------------------------
Processing audio for command: 'right'
Calling Speech-to-Text API...
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 40ms/step
API Result: 'right' (Confidence: 1.00)
Correctly recognized!
----------------------------------------
Processing audio for command: 'no'
Calling Speech-to-Text API...
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 47ms/step
API Result: 'no' (Confidence: 0.51)
Correctly recognized!
----------------------------------------
Processing audio for command: 'yes'
Calling Speech-to-Text API...
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 41ms/step
API Result: 'yes' (Confidence: 1.00)
Correctly recognized!
----------------------------------------

API integration demo completed.

In [20]:

# Simplified DTW implementation for demonstration purposes
def simplified_dtw(s1, s2):
    """Simplified Dynamic Time Warping distance between two sequences"""
    n, m = len(s1), len(s2)

    # Initialize the DTW matrix with infinity, except for the origin
    dtw_matrix = np.full((n+1, m+1), np.inf)
    dtw_matrix[0, 0] = 0

    # Fill the DTW matrix
    for i in range(1, n+1):
        for j in range(1, m+1):
            cost = np.abs(s1[i-1] - s2[j-1])
            dtw_matrix[i, j] = cost + min(dtw_matrix[i-1, j], dtw_matrix[i, j-1], dtw_matrix[i-1, j-1])

    return dtw_matrix[n, m]

# Simulate DTW-based speech recognition
def dtw_speech_recognition(audio, templates, labels, sr=16000):
    """Recognize speech using DTW against templates"""
    # Extract a simplified feature: the first n_bands DCT coefficients of the
    # STFT magnitude, averaged over time. Because the time axis is averaged
    # away, DTW below aligns along the coefficient axis rather than along time.
    n_bands = 10
    S = np.abs(librosa.stft(audio))
    energy_bands = librosa.feature.mfcc(S=S, sr=sr, n_mfcc=n_bands)
    sample_feature = np.mean(energy_bands, axis=1)

    # Compare with each template
    distances = []
    for template in templates:
        S_template = np.abs(librosa.stft(template))
        template_feature = np.mean(librosa.feature.mfcc(S=S_template, sr=sr, n_mfcc=n_bands), axis=1)
        distance = simplified_dtw(sample_feature, template_feature)
        distances.append(distance)

    # Find the best match
    best_idx = np.argmin(distances)
    best_distance = distances[best_idx]
    recognized_label = labels[best_idx]

    # Convert distance to confidence (inverse relationship)
    max_distance = max(distances) if len(distances) > 1 else best_distance
    confidence = 1 - (best_distance / max_distance) if max_distance > 0 else 0

    return recognized_label, confidence, distances

In [21]:
# Create template examples for each command
templates = []
template_labels = []

for command in set(y):
    # Find examples of this command (use the first 3 as templates)
    indices = np.where(y == command)[0][:3]
    for idx in indices:
        templates.append(X[idx])
        template_labels.append(command)

print(f"Created {len(templates)} templates for {len(set(y))} different commands")

# Test DTW-based recognition on a few samples
print("\nTesting DTW-based speech recognition:")
for i in range(5):  # Test 5 random samples
    idx = np.random.randint(0, len(X))
    audio = X[idx]
    true_command = y[idx]

    # Recognize using DTW
    dtw_command, dtw_confidence, _ = dtw_speech_recognition(audio, templates, template_labels)

    # Recognize using neural network
    nn_command, nn_confidence, _ = recognize_command(audio, best_model, label_encoder)

    # Print results
    print(f"Sample {i+1} (True: '{true_command}'):")
    print(f" DTW Recognition: '{dtw_command}' (Confidence: {dtw_confidence:.2f})")
    print(f" NN Recognition: '{nn_command}' (Confidence: {nn_confidence:.2f})")
    print("-" * 40)

Created 18 templates for 6 different commands

Testing DTW-based speech recognition:


1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 39ms/step
Sample 1 (True: 'yes'):
DTW Recognition: 'yes' (Confidence: 0.78)
NN Recognition: 'yes' (Confidence: 1.00)
----------------------------------------
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 39ms/step
Sample 2 (True: 'down'):
DTW Recognition: 'left' (Confidence: 0.84)
NN Recognition: 'down' (Confidence: 0.85)
----------------------------------------
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 64ms/step
Sample 3 (True: 'no'):
DTW Recognition: 'left' (Confidence: 0.52)
NN Recognition: 'down' (Confidence: 0.30)
----------------------------------------
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 42ms/step
Sample 4 (True: 'left'):
DTW Recognition: 'no' (Confidence: 0.44)
NN Recognition: 'left' (Confidence: 0.94)
----------------------------------------
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 47ms/step
Sample 5 (True: 'no'):
DTW Recognition: 'up' (Confidence: 0.90)
NN Recognition: 'no' (Confidence: 1.00)
----------------------------------------
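
The DTW comparison above runs on features that were averaged over time, so it does not exercise what DTW is designed for: aligning two sequences that unfold at different speeds. The sketch below applies the same recurrence to frame sequences of shape `(T, d)`, such as the per-frame MFCC matrices produced earlier. `dtw_distance` is a hypothetical variant for illustration, with a Euclidean per-frame cost chosen as an assumption.

```python
import numpy as np

def dtw_distance(a, b):
    """DTW distance between two sequences of frames, using Euclidean distance per frame pair."""
    a = np.asarray(a, dtype=float).reshape(len(a), -1)  # (T_a, d); 1-D input becomes d = 1
    b = np.asarray(b, dtype=float).reshape(len(b), -1)  # (T_b, d)
    n, m = len(a), len(b)
    acc = np.full((n + 1, m + 1), np.inf)  # accumulated cost matrix
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]

fast = [1, 2, 3]
slow = [1, 2, 2, 3]   # same contour, one step held longer
other = [0, 2]

print(dtw_distance(fast, slow))
print(dtw_distance([0, 1, 2], other))
```

The first pair has zero distance because warping absorbs the repeated step, which a fixed frame-by-frame comparison would not allow.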
Experiment 7: Image Classification using CNN and RNN on
CIFAR-10 Dataset
In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models, Sequential
from tensorflow.keras.layers import Dense, Dropout, Conv2D, MaxPooling2D, Flatten
from tensorflow.keras.layers import LSTM, TimeDistributed, Reshape, BatchNormalization
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns

# Check TensorFlow version


print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")

TensorFlow version: 2.18.0


Keras version: 3.8.0

In [2]:

# Standard approach for CIFAR-10 (works without issues due to its relatively small size)
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Normalize pixel values to be between 0 and 1


X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0

# Convert class vectors to binary class matrices (one-hot encoding)


y_train_cat = to_categorical(y_train, 10)
y_test_cat = to_categorical(y_test, 10)

# Define class names for interpretability


class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse',
'ship', 'truck']

print(f"Training data shape: {X_train.shape}")


print(f"Testing data shape: {X_test.shape}")

# Note: For very large datasets, you would use the streaming approach like this:
# Example code (not run, just for demonstration):
"""
from datasets import load_dataset

# Load dataset in streaming mode
dataset = load_dataset("some/large_dataset", split='train', streaming=True)

# Process the streaming dataset
processed_dataset = dataset.map(preprocess_function, batched=True)

# Create a data generator for model training
def data_generator():
    for example in processed_dataset:
        yield example['input_features'], example['label']
"""

print("\nNote: For CIFAR-10, streaming is unnecessary due to its manageable size,")
print("but for very large datasets, streaming mode helps avoid memory overflow errors.")

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 4s 0us/step
Training data shape: (50000, 32, 32, 3)
Testing data shape: (10000, 32, 32, 3)

Note: For CIFAR-10, streaming is unnecessary due to its manageable size,


but for very large datasets, streaming mode helps avoid memory overflow errors.
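
A streaming generator like the `data_generator` sketched in the cell above yields one example at a time, while training consumes batches. One way to bridge the two is a small batching wrapper. The sketch below uses a toy in-memory stream as a stand-in for a real streaming dataset; `batched` and `toy_stream` are hypothetical names.

```python
from itertools import islice
import numpy as np

def batched(example_stream, batch_size):
    """Group (features, label) pairs from any iterator into NumPy batches."""
    iterator = iter(example_stream)
    while True:
        chunk = list(islice(iterator, batch_size))  # pull at most batch_size examples
        if not chunk:
            return
        features, labels = zip(*chunk)
        yield np.stack(features), np.array(labels)

def toy_stream(n):
    # Stand-in for a streaming dataset: yields one example at a time
    for i in range(n):
        yield np.full((2,), i, dtype="float32"), i % 3

batches = list(batched(toy_stream(10), batch_size=4))
for x_batch, y_batch in batches:
    print(x_batch.shape, y_batch)
```

Only one batch is held in memory at a time, and the final batch may be smaller than `batch_size`.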

In [3]:
# Example of how to handle larger datasets with the datasets library
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# This function shows how you would approach training with a streaming dataset
def train_with_streaming_data(model, dataset_stream, batch_size=32, steps_per_epoch=100,
                              epochs=10):
    """
    Train a model using a streaming dataset to avoid memory issues.

    Parameters:
    - model: Compiled Keras model
    - dataset_stream: Streaming dataset
    - batch_size: Batch size for training
    - steps_per_epoch: Number of batches per epoch
    - epochs: Number of epochs to train
    """
    # This is a conceptual function - implementation would depend on your specific dataset

    for epoch in range(epochs):
        print(f"Epoch {epoch+1}/{epochs}")
        batch_count = 0
        for _ in range(steps_per_epoch):
            # In a real implementation, you would:
            # 1. Collect a batch of examples from the stream
            # 2. Preprocess the batch
            # 3. Train on the batch
            # model.train_on_batch(x_batch, y_batch)
            batch_count += 1

        print(f"Completed {batch_count} batches")

    return model

print("For CIFAR-10, we use the standard approach, but the streaming method is valuable for massive datasets")

For CIFAR-10, we use the standard approach, but the streaming method is valuable for massive datasets

3. Visualize Sample Images from the Dataset


In [4]:

# Plot some sample images
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(X_train[i])
    plt.xlabel(class_names[y_train[i][0]])
plt.tight_layout()
plt.show()
In [5]:
# Define CNN model architecture
def create_cnn_model():
    model = Sequential([
        # First convolutional block
        Conv2D(32, (3, 3), activation='relu', padding='same', input_shape=(32, 32, 3)),
        BatchNormalization(),
        Conv2D(32, (3, 3), activation='relu', padding='same'),
        BatchNormalization(),
        MaxPooling2D((2, 2)),
        Dropout(0.25),

        # Second convolutional block
        Conv2D(64, (3, 3), activation='relu', padding='same'),
        BatchNormalization(),
        Conv2D(64, (3, 3), activation='relu', padding='same'),
        BatchNormalization(),
        MaxPooling2D((2, 2)),
        Dropout(0.25),

        # Third convolutional block
        Conv2D(128, (3, 3), activation='relu', padding='same'),
        BatchNormalization(),
        Conv2D(128, (3, 3), activation='relu', padding='same'),
        BatchNormalization(),
        MaxPooling2D((2, 2)),
        Dropout(0.25),

        # Flatten and dense layers
        Flatten(),
        Dense(512, activation='relu'),
        BatchNormalization(),
        Dropout(0.5),
        Dense(10, activation='softmax')
    ])

    return model

# Create and compile the CNN model
cnn_model = create_cnn_model()
cnn_model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

# Print model summary
cnn_model.summary()

/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)

Model: "sequential"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D) │ (None, 32, 32, 32) │ 896 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization │ (None, 32, 32, 32) │ 128 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_1 (Conv2D) │ (None, 32, 32, 32) │ 9,248 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_1 │ (None, 32, 32, 32) │ 128 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d (MaxPooling2D) │ (None, 16, 16, 32) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout) │ (None, 16, 16, 32) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_2 (Conv2D) │ (None, 16, 16, 64) │ 18,496 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_2 │ (None, 16, 16, 64) │ 256 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_3 (Conv2D) │ (None, 16, 16, 64) │ 36,928 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_3 │ (None, 16, 16, 64) │ 256 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_1 (MaxPooling2D) │ (None, 8, 8, 64) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_1 (Dropout) │ (None, 8, 8, 64) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_4 (Conv2D) │ (None, 8, 8, 128) │ 73,856 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_4 │ (None, 8, 8, 128) │ 512 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_5 (Conv2D) │ (None, 8, 8, 128) │ 147,584 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_5 │ (None, 8, 8, 128) │ 512 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_2 (MaxPooling2D) │ (None, 4, 4, 128) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_2 (Dropout) │ (None, 4, 4, 128) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten) │ (None, 2048) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense) │ (None, 512) │ 1,049,088 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_6 │ (None, 512) │ 2,048 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_3 (Dropout) │ (None, 512) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense) │ (None, 10) │ 5,130 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 1,345,066 (5.13 MB)

Trainable params: 1,343,146 (5.12 MB)

Non-trainable params: 1,920 (7.50 KB)
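
The parameter counts in the summary above can be checked by hand. The following is a plain-Python sketch, not part of the original notebook; the helper names `conv2d_params`, `dense_params` and `batchnorm_params` are hypothetical and only encode the usual counting rules for these layer types.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # Each filter has kernel_h * kernel_w * in_channels weights plus one bias
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    # One weight per (input, unit) pair plus one bias per unit
    return (inputs + 1) * units

def batchnorm_params(channels):
    # gamma and beta are trainable; moving mean and variance are not
    return 4 * channels

print(conv2d_params(3, 3, 3, 32))      # conv2d: 3 input channels, 32 filters
print(conv2d_params(3, 3, 32, 32))     # conv2d_1
print(dense_params(4 * 4 * 128, 512))  # dense, fed by the flattened 4x4x128 map
print(batchnorm_params(32))            # batch_normalization

# Half of every BatchNormalization layer's parameters are non-trainable
bn_channels = [32, 32, 64, 64, 128, 128, 512]
non_trainable = sum(batchnorm_params(c) // 2 for c in bn_channels)
print(non_trainable)
```

The values agree with the table: 896, 9,248, 1,049,088 and 128 for the individual layers, and 1,920 non-trainable parameters in total.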

In [ ]:
# Train the CNN model with data augmentation
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import gc  # Garbage collection to manage memory

# Try to free up memory before starting training
gc.collect()

# Data augmentation for training
datagen = ImageDataGenerator(
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True
)
datagen.fit(X_train)

# Early stopping and learning rate reduction
early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=1e-5)

try:
    # Reduce batch size if you're having memory issues
    batch_size = 32  # Reduced from 64 to try to avoid memory issues

    # Train the model with augmented data
    cnn_history = cnn_model.fit(
        datagen.flow(X_train, y_train_cat, batch_size=batch_size),
        validation_data=(X_test, y_test_cat),
        epochs=50,
        callbacks=[early_stopping, reduce_lr],
        verbose=1
    )

    print("CNN model training completed successfully!")

except Exception as e:
    print(f"Error during CNN model training: {e}")
    print("\nTroubleshooting tips:")
    print("1. Try reducing batch_size further")
    print("2. Try reducing the model complexity")
    print("3. Check if you have enough memory available")
    # Try with a much smaller batch size and fewer epochs as a fallback
    try:
        print("\nAttempting training with smaller batch size...")
        cnn_history = cnn_model.fit(
            datagen.flow(X_train, y_train_cat, batch_size=16),
            validation_data=(X_test, y_test_cat),
            epochs=10,  # Reduced epochs for faster completion
            callbacks=[early_stopping, reduce_lr],
            verbose=1
        )
        print("Fallback training completed successfully!")
    except Exception as e2:
        print(f"Fallback training also failed: {e2}")

In [7]:
# Prepare data for RNN (reshape images to sequences)
# For RNN, we'll treat each row of the image as a time step
X_train_rnn = X_train.reshape(X_train.shape[0], 32, 32*3)  # 32 time steps, each with 32*3 features
X_test_rnn = X_test.reshape(X_test.shape[0], 32, 32*3)

print(f"RNN training data shape: {X_train_rnn.shape}")
print(f"RNN testing data shape: {X_test_rnn.shape}")

RNN training data shape: (50000, 32, 96)


RNN testing data shape: (10000, 32, 96)
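
To make the row-as-time-step idea concrete, here is a small pure-Python sketch on a made-up 2x2 RGB image. The function name `rows_to_sequence` and the toy pixel values are assumptions for illustration; the notebook itself does this with a single NumPy `reshape` call.

```python
# A tiny 2x2 RGB "image": image[row][col] = [R, G, B]
image = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]

def rows_to_sequence(img):
    # Each image row becomes one time step; the channels of all pixels
    # in that row are concatenated into one feature vector
    return [[value for pixel in row for value in pixel] for row in img]

sequence = rows_to_sequence(image)
print(len(sequence), len(sequence[0]))  # time steps, features per step
print(sequence[0])                      # features of the first time step
```

For a 32x32x3 CIFAR-10 image the same flattening gives 32 time steps of 96 features each, which matches the shapes printed above.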

In [8]:
# Define RNN model architecture
def create_rnn_model():
    model = Sequential([
        # LSTM layers
        LSTM(128, return_sequences=True, input_shape=(32, 32*3)),
        Dropout(0.25),
        LSTM(128),
        Dropout(0.25),

        # Output layer
        Dense(128, activation='relu'),
        Dropout(0.5),
        Dense(10, activation='softmax')
    ])

    return model

# Create and compile the RNN model
rnn_model = create_rnn_model()
rnn_model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

# Print model summary
rnn_model.summary()

/usr/local/lib/python3.11/dist-packages/keras/src/layers/rnn/rnn.py:200: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(**kwargs)

Model: "sequential_1"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ lstm (LSTM) │ (None, 32, 128) │ 115,200 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_4 (Dropout) │ (None, 32, 128) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ lstm_1 (LSTM) │ (None, 128) │ 131,584 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_5 (Dropout) │ (None, 128) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense) │ (None, 128) │ 16,512 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_6 (Dropout) │ (None, 128) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_3 (Dense) │ (None, 10) │ 1,290 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 264,586 (1.01 MB)

Trainable params: 264,586 (1.01 MB)

Non-trainable params: 0 (0.00 B)
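
The LSTM parameter counts in this summary follow from the standard four-gate layout. The sketch below is an illustrative check, not code from the notebook; `lstm_params` and `dense_params` are hypothetical helper names.

```python
def lstm_params(input_dim, units):
    # Four gates (input, forget, cell candidate, output), each with an
    # input kernel, a recurrent kernel and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(inputs, units):
    # One weight per (input, unit) pair plus one bias per unit
    return (inputs + 1) * units

first_lstm = lstm_params(32 * 3, 128)   # input is one image row: 96 features
second_lstm = lstm_params(128, 128)     # input is the first LSTM's output
total = first_lstm + second_lstm + dense_params(128, 128) + dense_params(128, 10)

print(first_lstm, second_lstm, total)
```

This reproduces 115,200 and 131,584 for the two LSTM layers and 264,586 for the whole model, roughly a fifth of the CNN's parameter count.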

In [ ]:
# Train the RNN model
early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

rnn_history = rnn_model.fit(
    X_train_rnn, y_train_cat,
    batch_size=128,
    epochs=25,
    validation_data=(X_test_rnn, y_test_cat),
    callbacks=[early_stopping],
    verbose=1
)

In [10]:
# Evaluate the CNN model
cnn_loss, cnn_accuracy = cnn_model.evaluate(X_test, y_test_cat, verbose=0)
print(f"CNN Test Accuracy: {cnn_accuracy:.4f}")

# Evaluate the RNN model


rnn_loss, rnn_accuracy = rnn_model.evaluate(X_test_rnn, y_test_cat, verbose=0)
print(f"RNN Test Accuracy: {rnn_accuracy:.4f}")

CNN Test Accuracy: 0.8768


RNN Test Accuracy: 0.6061

In [11]:
# Plot training history comparison
plt.figure(figsize=(12, 5))

# Plot accuracy
plt.subplot(1, 2, 1)
plt.plot(cnn_history.history['accuracy'], label='CNN Training Accuracy')
plt.plot(cnn_history.history['val_accuracy'], label='CNN Validation Accuracy')
plt.plot(rnn_history.history['accuracy'], label='RNN Training Accuracy')
plt.plot(rnn_history.history['val_accuracy'], label='RNN Validation Accuracy')
plt.title('Model Accuracy Comparison')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(loc='lower right')

# Plot loss
plt.subplot(1, 2, 2)
plt.plot(cnn_history.history['loss'], label='CNN Training Loss')
plt.plot(cnn_history.history['val_loss'], label='CNN Validation Loss')
plt.plot(rnn_history.history['loss'], label='RNN Training Loss')
plt.plot(rnn_history.history['val_loss'], label='RNN Validation Loss')
plt.title('Model Loss Comparison')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(loc='upper right')
plt.tight_layout()
plt.show()

In [12]:

# Generate predictions with the CNN model


cnn_predictions = cnn_model.predict(X_test)
cnn_predicted_classes = np.argmax(cnn_predictions, axis=1)
cnn_true_classes = np.argmax(y_test_cat, axis=1)

# Generate predictions with the RNN model


rnn_predictions = rnn_model.predict(X_test_rnn)
rnn_predicted_classes = np.argmax(rnn_predictions, axis=1)

# Display classification report for CNN
print("CNN Classification Report:")
print(classification_report(cnn_true_classes, cnn_predicted_classes, target_names=class_names))

# Display classification report for RNN (the true labels are the same test labels used for the CNN)
print("RNN Classification Report:")
print(classification_report(cnn_true_classes, rnn_predicted_classes, target_names=class_names))

313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step
313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step
CNN Classification Report:
              precision    recall  f1-score   support

    airplane       0.91      0.89      0.90      1000
  automobile       0.94      0.95      0.95      1000
        bird       0.86      0.82      0.84      1000
         cat       0.85      0.68      0.76      1000
        deer       0.84      0.88      0.86      1000
         dog       0.88      0.75      0.81      1000
        frog       0.80      0.96      0.87      1000
       horse       0.89      0.93      0.91      1000
        ship       0.93      0.94      0.94      1000
       truck       0.88      0.95      0.92      1000

    accuracy                           0.88     10000
   macro avg       0.88      0.88      0.87     10000
weighted avg       0.88      0.88      0.87     10000

RNN Classification Report:
              precision    recall  f1-score   support

    airplane       0.66      0.68      0.67      1000
  automobile       0.73      0.75      0.74      1000
        bird       0.52      0.41      0.46      1000
         cat       0.43      0.42      0.42      1000
        deer       0.55      0.49      0.52      1000
         dog       0.49      0.49      0.49      1000
        frog       0.58      0.73      0.65      1000
       horse       0.66      0.68      0.67      1000
        ship       0.73      0.76      0.74      1000
       truck       0.68      0.67      0.67      1000

    accuracy                           0.61     10000
   macro avg       0.60      0.61      0.60     10000
weighted avg       0.60      0.61      0.60     10000
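
The f1-score column is the harmonic mean of precision and recall. As a rough check, the sketch below recomputes it for two CNN rows from the rounded values printed in the report, so small rounding differences are possible; the function name `f1_score_from` is a hypothetical helper, not a scikit-learn API.

```python
def f1_score_from(precision, recall):
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

cat_f1 = f1_score_from(0.85, 0.68)   # CNN "cat" row
dog_f1 = f1_score_from(0.88, 0.75)   # CNN "dog" row

print(round(cat_f1, 2), round(dog_f1, 2))
```

Both values round to the figures in the report (0.76 and 0.81). The low recall on "cat" is what pulls its F1 below its precision.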

In [13]:
# Plot confusion matrices
plt.figure(figsize=(16, 7))

# CNN confusion matrix


plt.subplot(1, 2, 1)
cnn_cm = confusion_matrix(cnn_true_classes, cnn_predicted_classes)
sns.heatmap(cnn_cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=class_names, yticklabels=class_names)
plt.title('CNN Confusion Matrix')
plt.ylabel('True Label')
plt.xlabel('Predicted Label')

# RNN confusion matrix


plt.subplot(1, 2, 2)
rnn_cm = confusion_matrix(cnn_true_classes, rnn_predicted_classes)
sns.heatmap(rnn_cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=class_names, yticklabels=class_names)
plt.title('RNN Confusion Matrix')
plt.ylabel('True Label')
plt.xlabel('Predicted Label')

plt.tight_layout()
plt.show()

In [14]:
# Display some sample predictions
def plot_sample_predictions(model_name, X, predictions, true_labels, indices=None):
    if indices is None:
        indices = np.random.randint(0, len(X), 15)

    predicted_classes = np.argmax(predictions, axis=1)

    plt.figure(figsize=(12, 10))
    for i, idx in enumerate(indices):
        plt.subplot(3, 5, i+1)
        plt.imshow(X[idx])
        plt.title(f"True: {class_names[true_labels[idx][0]]}\nPred: {class_names[predicted_classes[idx]]}")
        plt.axis('off')
        # Highlight misclassified samples in red
        if true_labels[idx][0] != predicted_classes[idx]:
            plt.gca().set_title(plt.gca().get_title(), color='red')
    plt.tight_layout()
    plt.suptitle(f"{model_name} Sample Predictions", fontsize=16, y=1.02)
    plt.show()

# Random indices for sample visualization
sample_indices = np.random.randint(0, len(X_test), 15)

# Plot CNN predictions
plot_sample_predictions("CNN", X_test, cnn_predictions, y_test, sample_indices)

# Plot RNN predictions
plot_sample_predictions("RNN", X_test, rnn_predictions, y_test, sample_indices)

Experiment 9: NLP -> Sentiment Analysis, Text Classification and PCA Visualization
In [ ]:

%pip install nltk

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import re
import string
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer, PorterStemmer
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, LSTM, SpatialDropout1D, Dropout, Bidirectional
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.decomposition import PCA, TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
from sklearn.datasets import fetch_20newsgroups
import warnings

# Suppress warnings
warnings.filterwarnings('ignore')

# Download NLTK resources
try:
    nltk.download('punkt')
    nltk.download('stopwords')
    nltk.download('wordnet')
except Exception:
    # Catch Exception rather than using a bare except, so Ctrl+C still interrupts the cell
    print("NLTK download failed, but we'll continue anyway")

# Check versions
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")
print(f"NLTK version: {nltk.__version__}")
print(f"Pandas version: {pd.__version__}")

In [2]:

def generate_synthetic_sentiment_data(n_samples=5000):
    """Generate synthetic text data with sentiment labels"""

    # Define positive, negative and neutral sentiment word lists
    positive_words = [
        "good", "great", "excellent", "amazing", "wonderful", "best", "love", "happy",
        "awesome", "fantastic", "pleasant", "delightful", "impressive", "joyful", "perfect",
        "beautiful", "enjoy", "exciting", "pleased", "recommend", "brilliant", "superb",
        "outstanding", "exceptional", "terrific", "magnificent", "marvelous", "splendid"
    ]

    negative_words = [
        "bad", "terrible", "awful", "horrible", "worst", "hate", "disappointed", "poor",
        "disappointing", "dislike", "mediocre", "ugly", "boring", "stupid", "waste",
        "annoying", "useless", "unpleasant", "frustrating", "pathetic", "dreadful", "appalling",
        "disgusting", "disaster", "inferior", "dull", "unhappy", "awful"
    ]

    neutral_words = [
        "okay", "fine", "average", "decent", "acceptable", "fair", "satisfactory", "moderate",
        "adequate", "reasonable", "neither", "mixed", "balanced", "so-so", "standard",
        "ordinary", "regular", "typical", "usual", "common", "intermediate", "middle-of-the-road"
    ]

    # Define templates for various sentiment types
    positive_templates = [
        "This product is {positive}.",
        "I {positive} this movie very much.",
        "The service was {positive} and I would recommend it.",
        "A {positive} experience overall.",
        "This is a {positive} {item} that exceeded my expectations.",
        "I'm {positive} with my purchase.",
        "The {item} works {positive} and is worth the money.",
        "I was {positive} by how well this {item} performed.",
        "My experience with this {item} has been {positive}.",
        "This {positive} {item} made me happy."
    ]

    negative_templates = [
        "This product is {negative}.",
        "I {negative} this movie.",
        "The service was {negative} and I would not recommend it.",
        "A {negative} experience overall.",
        "This is a {negative} {item} that failed to meet my expectations.",
        "I'm {negative} with my purchase.",
        "The {item} works {negative} and is not worth the money.",
        "I was {negative} by how poorly this {item} performed.",
        "My experience with this {item} has been {negative}.",
        "This {negative} {item} made me unhappy."
    ]

    neutral_templates = [
        "This product is {neutral}.",
        "I thought this movie was {neutral}.",
        "The service was {neutral}.",
        "A {neutral} experience overall.",
        "This is a {neutral} {item} that met my basic expectations.",
        "I'm {neutral} about my purchase.",
        "The {item} works {neutral} for the price.",
        "I was neither impressed nor disappointed by this {item}.",
        "My experience with this {item} has been {neutral}.",
        "This {neutral} {item} is just okay."
    ]

    item_types = ["product", "book", "movie", "phone", "laptop", "TV", "tablet", "headphones",
                  "camera", "game", "experience", "service", "device", "gadget", "appliance"]

    # Generate samples
    texts = []
    sentiments = []

    for _ in range(n_samples):
        sentiment = np.random.choice([0, 1, 2])  # 0: negative, 1: neutral, 2: positive
        item = np.random.choice(item_types)

        if sentiment == 0:  # Negative
            template = np.random.choice(negative_templates)
            sentiment_word = np.random.choice(negative_words)
            text = template.format(negative=sentiment_word, item=item)
        elif sentiment == 1:  # Neutral
            template = np.random.choice(neutral_templates)
            sentiment_word = np.random.choice(neutral_words)
            text = template.format(neutral=sentiment_word, item=item)
        else:  # Positive
            template = np.random.choice(positive_templates)
            sentiment_word = np.random.choice(positive_words)
            text = template.format(positive=sentiment_word, item=item)

        texts.append(text)
        sentiments.append(sentiment)

    # Create DataFrame
    data = pd.DataFrame({
        'text': texts,
        'sentiment': sentiments
    })

    # Convert sentiment to categorical labels
    data['sentiment_label'] = data['sentiment'].map({0: 'negative', 1: 'neutral', 2: 'positive'})

    return data

# Generate synthetic sentiment data
sentiment_data = generate_synthetic_sentiment_data(n_samples=5000)

# Display the first few rows
print(f"Generated {len(sentiment_data)} sentiment samples")
sentiment_data.head()

Generated 5000 sentiment samples


Out[2]:

   text                                                 sentiment  sentiment_label
0  This is a waste product that failed to meet my...    0          negative
1  A disaster experience overall.                       0          negative
2  I'm superb with my purchase.                         2          positive
3  My experience with this service has been usual.      1          neutral
4  My experience with this headphones has been un...    0          negative

In [3]:

# Check the distribution of sentiments


plt.figure(figsize=(10, 6))
sns.countplot(x='sentiment_label', data=sentiment_data, palette='viridis')
plt.title('Distribution of Sentiment Classes')
plt.xlabel('Sentiment')
plt.ylabel('Count')
plt.xticks(rotation=0)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()

# Calculate percentages
sentiment_counts = sentiment_data['sentiment_label'].value_counts(normalize=True) * 100
print("Sentiment distribution percentages:")
for sentiment, percentage in sentiment_counts.items():
    print(f"{sentiment}: {percentage:.2f}%")

Sentiment distribution percentages:
neutral: 33.92%
positive: 33.52%
negative: 32.56%

In [4]:
# Download NLTK resources first
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')

# Visualize word frequency by sentiment
def plot_word_freq_by_sentiment(data, top_n=15):
    """Plot the most frequent words for each sentiment class"""
    plt.figure(figsize=(18, 15))

    # Build the stopword set once; calling stopwords.words() for every word is very slow
    stop_words = set(stopwords.words('english'))

    for i, sentiment in enumerate(['negative', 'neutral', 'positive']):
        # Filter by sentiment
        texts = data[data['sentiment_label'] == sentiment]['text']

        # Tokenize and count words using a simple whitespace split
        all_words = []
        for text in texts:
            # Convert to lowercase
            text = text.lower()
            # Replace punctuation with spaces
            for punct in string.punctuation:
                text = text.replace(punct, ' ')
            # Split into words
            words = text.split()
            # Remove stopwords and short words
            words = [word for word in words if word not in stop_words and len(word) > 2]
            all_words.extend(words)

        # Count frequency
        word_freq = pd.Series(all_words).value_counts().head(top_n)

        # Plot
        plt.subplot(3, 1, i+1)
        sns.barplot(x=word_freq.values, y=word_freq.index, palette='viridis')
        plt.title(f'Top {top_n} Words in {sentiment.capitalize()} Reviews')
        plt.xlabel('Frequency')
        plt.ylabel('Words')

    plt.tight_layout()
    plt.show()

# Visualize word frequencies
plot_word_freq_by_sentiment(sentiment_data)

[nltk_data] Downloading package punkt to /usr/share/nltk_data...


[nltk_data] Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /usr/share/nltk_data...
[nltk_data] Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /usr/share/nltk_data...
[nltk_data] Package wordnet is already up-to-date!

In [5]:
def preprocess_text(text, remove_stopwords=True, lemmatize=True):
    """Preprocess text for NLP tasks"""
    # Convert to lowercase
    text = text.lower()

    # Remove URLs
    text = re.sub(r'http\S+|www\S+|https\S+', '', text, flags=re.MULTILINE)

    # Remove special characters & numbers
    text = re.sub(r'[^\w\s]', '', text)
    text = re.sub(r'\d+', '', text)

    # Tokenize
    tokens = word_tokenize(text)

    # Remove stopwords if requested
    if remove_stopwords:
        stop_words = set(stopwords.words('english'))
        tokens = [word for word in tokens if word not in stop_words]

    # Lemmatize if requested
    if lemmatize:
        lemmatizer = WordNetLemmatizer()
        tokens = [lemmatizer.lemmatize(word) for word in tokens]

    # Rejoin into string
    preprocessed_text = ' '.join(tokens)

    return preprocessed_text

# Apply preprocessing to the sentiment data
sentiment_data['processed_text'] = sentiment_data['text'].apply(preprocess_text)

# Display examples of original vs processed text
comparison = sentiment_data[['text', 'processed_text', 'sentiment_label']].head(5)
print("Original vs. Processed Text Examples:")
for i, row in comparison.iterrows():
    print(f"\nSentiment: {row['sentiment_label']}")
    print(f"Original: {row['text']}")
    print(f"Processed: {row['processed_text']}")

Original vs. Processed Text Examples:

Sentiment: negative
Original: This is a waste product that failed to meet my expectations.
Processed: waste product failed meet expectation

Sentiment: negative
Original: A disaster experience overall.
Processed: disaster experience overall

Sentiment: positive
Original: I'm superb with my purchase.
Processed: im superb purchase

Sentiment: neutral
Original: My experience with this service has been usual.
Processed: experience service usual

Sentiment: negative
Original: My experience with this headphones has been unhappy.
Processed: experience headphone unhappy
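
The "im" token in the third example comes from the punctuation step: the pattern `[^\w\s]` deletes the apostrophe in "I'm" before tokenization. The sketch below isolates the regex steps using only the standard library. It is a simplified stand-in, with `basic_clean` as a hypothetical name and a whitespace split assumed in place of `word_tokenize`; stopword removal and lemmatization are left out.

```python
import re

def basic_clean(text):
    text = text.lower()
    text = re.sub(r'http\S+|www\S+|https\S+', '', text)  # remove URLs
    text = re.sub(r'[^\w\s]', '', text)  # drop punctuation, including apostrophes
    text = re.sub(r'\d+', '', text)      # drop digits
    return text.split()                  # whitespace split (assumed tokenizer)

print(basic_clean("I'm superb with my purchase."))
```

If contractions matter for a task, one option is to expand them (for example "I'm" to "I am") before the punctuation step, so the pieces can be matched against the stopword list.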

In [6]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
sentiment_data['processed_text'],
sentiment_data['sentiment'],
test_size=0.2,
random_state=42
)

# Create feature extractors


# Bag of Words
bow_vectorizer = CountVectorizer(max_features=5000)
bow_train = bow_vectorizer.fit_transform(X_train)
bow_test = bow_vectorizer.transform(X_test)

# TF-IDF
tfidf_vectorizer = TfidfVectorizer(max_features=5000)
tfidf_train = tfidf_vectorizer.fit_transform(X_train)
tfidf_test = tfidf_vectorizer.transform(X_test)

print(f"Bag of Words training shape: {bow_train.shape} ")


print(f"TF-IDF training shape: {tfidf_train.shape}")

Bag of Words training shape: (4000, 113)


TF-IDF training shape: (4000, 113)
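
Both matrices have 113 columns rather than 5000 because the synthetic corpus only contains 113 distinct terms after preprocessing; `max_features` is an upper bound. To show how the two representations differ, here is a pure-Python sketch on a made-up three-document corpus. It assumes the smoothed IDF and L2 row normalization that scikit-learn documents as the `TfidfVectorizer` defaults, and it is an illustration rather than the library's implementation.

```python
import math
from collections import Counter

docs = ["waste product failed", "superb product", "superb superb purchase"]
vocab = sorted({word for doc in docs for word in doc.split()})

# Bag of Words: raw term counts per document
bow = [[Counter(doc.split())[word] for word in vocab] for doc in docs]

# Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1
n_docs = len(docs)
doc_freq = [sum(1 for row in bow if row[j] > 0) for j in range(len(vocab))]
idf = [math.log((1 + n_docs) / (1 + df)) + 1 for df in doc_freq]

def l2_normalize(row):
    # Scale the row to unit Euclidean length
    norm = math.sqrt(sum(v * v for v in row))
    return [v / norm for v in row]

# TF-IDF: counts weighted by IDF, then each document vector normalized
tfidf = [l2_normalize([count * w for count, w in zip(row, idf)]) for row in bow]

print(vocab)
print(bow[2])
print([round(v, 3) for v in tfidf[0]])
```

In the first document, "product" receives a lower weight than "waste" and "failed" because it also appears in another document, while the Bag of Words row treats all three terms equally.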

In [7]:
def train_evaluate_ml_model(model, X_train, X_test, y_train, y_test, model_name):
    """Train and evaluate a machine learning model"""
    # Train the model
    model.fit(X_train, y_train)

    # Make predictions
    y_pred = model.predict(X_test)

    # Calculate metrics
    accuracy = accuracy_score(y_test, y_pred)
    report = classification_report(y_test, y_pred, target_names=['Negative', 'Neutral', 'Positive'])

    # Print results
    print(f"Model: {model_name}")
    print(f"Accuracy: {accuracy:.4f}")
    print("Classification Report:")
    print(report)

    # Confusion Matrix
    cm = confusion_matrix(y_test, y_pred)
    plt.figure(figsize=(8, 6))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
                xticklabels=['Negative', 'Neutral', 'Positive'],
                yticklabels=['Negative', 'Neutral', 'Positive'])
    plt.xlabel('Predicted')
    plt.ylabel('True')
    plt.title(f'Confusion Matrix - {model_name}')
    plt.tight_layout()
    plt.show()

    return model, accuracy, report

# Train and evaluate Naive Bayes with Bag of Words
nb_bow = MultinomialNB()
nb_bow_results = train_evaluate_ml_model(nb_bow, bow_train, bow_test, y_train, y_test,
                                         "Naive Bayes (Bag of Words)")

# Train and evaluate Logistic Regression with TF-IDF
lr_tfidf = LogisticRegression(max_iter=1000, C=1.0, solver='liblinear', multi_class='ovr')
lr_tfidf_results = train_evaluate_ml_model(lr_tfidf, tfidf_train, tfidf_test, y_train, y_test,
                                           "Logistic Regression (TF-IDF)")

Model: Naive Bayes (Bag of Words)
Accuracy: 0.9910
Classification Report:
              precision    recall  f1-score   support

    Negative       1.00      0.97      0.98       304
     Neutral       0.97      1.00      0.99       351
    Positive       1.00      1.00      1.00       345

    accuracy                           0.99      1000
   macro avg       0.99      0.99      0.99      1000
weighted avg       0.99      0.99      0.99      1000

Model: Logistic Regression (TF-IDF)
Accuracy: 1.0000
Classification Report:
              precision    recall  f1-score   support

    Negative       1.00      1.00      1.00       304
     Neutral       1.00      1.00      1.00       351
    Positive       1.00      1.00      1.00       345

    accuracy                           1.00      1000
   macro avg       1.00      1.00      1.00      1000
weighted avg       1.00      1.00      1.00      1000

In [8]:
def visualize_feature_importance(model, vectorizer, class_labels=['Negative', 'Neutral', 'Positive']):
    """Visualize the most important features for each class"""
    # Get feature names
    feature_names = vectorizer.get_feature_names_out()

    # If model is Logistic Regression, we can get coefficients
    if hasattr(model, 'coef_'):
        coefficients = model.coef_

        plt.figure(figsize=(15, 12))
        for i, class_name in enumerate(class_labels):
            # Top positive coefficients for this class
            top_positive = np.argsort(coefficients[i])[-10:]
            top_pos_coeffs = coefficients[i][top_positive]
            top_pos_features = [feature_names[j] for j in top_positive]

            # Top negative coefficients for this class
            top_negative = np.argsort(coefficients[i])[:10]
            top_neg_coeffs = coefficients[i][top_negative]
            top_neg_features = [feature_names[j] for j in top_negative]

            # Plot
            plt.subplot(3, 1, i+1)
            n_pos = len(top_pos_features)
            n_neg = len(top_neg_features)

            # Plot positive coefficients
            plt.barh(range(n_pos), top_pos_coeffs, color='forestgreen')

            # Plot negative coefficients above the positive ones
            plt.barh(range(n_pos, n_pos + n_neg), top_neg_coeffs, color='crimson')

            # Label all bars in a single call; a second plt.yticks call
            # would replace the labels set by the first one
            plt.yticks(range(n_pos + n_neg), top_pos_features + top_neg_features)

            plt.title(f'Most Important Features for {class_name} Class')
            plt.xlabel('Coefficient Value')

        plt.tight_layout()
        plt.show()
    else:
        print("This model type doesn't support feature importance visualization.")

# Visualize feature importance for the Logistic Regression model
visualize_feature_importance(lr_tfidf_results[0], tfidf_vectorizer)

In [9]:
# Prepare text data for deep learning
max_features = 5000 # Top words to consider
maxlen = 100 # Max sequence length

# Tokenize text
tokenizer = Tokenizer(num_words=max_features)
tokenizer.fit_on_texts(X_train)

X_train_seq = tokenizer.texts_to_sequences(X_train)
X_test_seq = tokenizer.texts_to_sequences(X_test)

# Pad sequences to ensure uniform length


X_train_pad = pad_sequences(X_train_seq, maxlen=maxlen)
X_test_pad = pad_sequences(X_test_seq, maxlen=maxlen)

# Convert labels to categorical for multi-class classification


y_train_cat = tf.keras.utils.to_categorical(y_train, num_classes=3)
y_test_cat = tf.keras.utils.to_categorical(y_test, num_classes=3)

print(f"Training sequence shape: {X_train_pad.shape}")


print(f"Testing sequence shape: {X_test_pad.shape}")

Training sequence shape: (4000, 100)


Testing sequence shape: (1000, 100)
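
Every row has length 100 because `pad_sequences` pads short sequences and truncates long ones. With its default settings both operations happen at the front of the sequence. The sketch below imitates that behaviour in plain Python; `pad_pre` is a hypothetical helper written for illustration, not the Keras function.

```python
def pad_pre(sequences, maxlen, value=0):
    padded = []
    for seq in sequences:
        seq = seq[-maxlen:]  # truncate from the front: keep the last maxlen tokens
        padded.append([value] * (maxlen - len(seq)) + seq)  # pad at the front
    return padded

print(pad_pre([[5, 3], [1, 2, 3, 4, 5, 6]], maxlen=4))
```

The processed reviews here contain only a handful of tokens, so with `maxlen = 100` most of each row is padding; a smaller `maxlen` would likely train faster on this dataset.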

In [10]:
# Build LSTM model for sentiment analysis
def create_lstm_model():
    model = Sequential()
    model.add(Embedding(max_features, 128, input_length=maxlen))
    model.add(SpatialDropout1D(0.2))
    model.add(Bidirectional(LSTM(64, dropout=0.2, recurrent_dropout=0.2)))
    model.add(Dense(64, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(3, activation='softmax'))  # 3 classes: negative, neutral, positive

    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    return model

# Create and train the model
lstm_model = create_lstm_model()
lstm_model.summary()

# Callbacks
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
# Note: min_lr=0.001 equals Adam's default learning rate, so this callback
# cannot lower the rate; the training log shows it staying at 0.0010
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=2, min_lr=0.001)

# Train with validation split
history = lstm_model.fit(
    X_train_pad, y_train_cat,
    epochs=15,
    batch_size=128,
    validation_split=0.1,
    callbacks=[early_stopping, reduce_lr],
    verbose=1
)

I0000 00:00:1748109294.386137 35 gpu_device.cc:2022] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 15513 MB memory: -> device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0

Model: "sequential"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ embedding (Embedding) │ ? │ 0 (unbuilt) │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ spatial_dropout1d (SpatialDropout1D) │ ? │ 0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ bidirectional (Bidirectional) │ ? │ 0 (unbuilt) │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense (Dense) │ ? │ 0 (unbuilt) │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dropout (Dropout) │ ? │ 0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ dense_1 (Dense) │ ? │ 0 (unbuilt) │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘

Total params: 0 (0.00 B)

Trainable params: 0 (0.00 B)

Non-trainable params: 0 (0.00 B)

Epoch 1/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 26s 434ms/step - accuracy: 0.3771 - loss: 1.0906 - val_accuracy: 0.9425 - val_loss: 0.9523 - learning_rate: 0.0010
Epoch 2/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 391ms/step - accuracy: 0.7555 - loss: 0.8183 - val_accuracy: 1.0000 - val_loss: 0.2278 - learning_rate: 0.0010
Epoch 3/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 388ms/step - accuracy: 0.9556 - loss: 0.1988 - val_accuracy: 1.0000 - val_loss: 0.0035 - learning_rate: 0.0010
Epoch 4/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 394ms/step - accuracy: 0.9973 - loss: 0.0202 - val_accuracy: 1.0000 - val_loss: 3.5055e-04 - learning_rate: 0.0010
Epoch 5/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 385ms/step - accuracy: 0.9988 - loss: 0.0079 - val_accuracy: 1.0000 - val_loss: 1.2586e-04 - learning_rate: 0.0010
Epoch 6/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 387ms/step - accuracy: 1.0000 - loss: 0.0040 - val_accuracy: 1.0000 - val_loss: 8.2057e-05 - learning_rate: 0.0010
Epoch 7/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 387ms/step - accuracy: 1.0000 - loss: 0.0040 - val_accuracy: 1.0000 - val_loss: 3.6909e-05 - learning_rate: 0.0010
Epoch 8/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 378ms/step - accuracy: 0.9990 - loss: 0.0029 - val_accuracy: 1.0000 - val_loss: 2.3085e-05 - learning_rate: 0.0010
Epoch 9/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 389ms/step - accuracy: 1.0000 - loss: 0.0019 - val_accuracy: 1.0000 - val_loss: 2.9360e-05 - learning_rate: 0.0010
Epoch 10/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 380ms/step - accuracy: 1.0000 - loss: 9.8354e-04 - val_accuracy: 1.0000 - val_loss: 1.0132e-05 - learning_rate: 0.0010
Epoch 11/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 378ms/step - accuracy: 1.0000 - loss: 9.8051e-04 - val_accuracy: 1.0000 - val_loss: 1.5896e-05 - learning_rate: 0.0010
Epoch 12/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 388ms/step - accuracy: 1.0000 - loss: 9.5763e-04 - val_accuracy: 1.0000 - val_loss: 7.4489e-06 - learning_rate: 0.0010
Epoch 13/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 377ms/step - accuracy: 1.0000 - loss: 0.0011 - val_accuracy: 1.0000 - val_loss: 8.0250e-06 - learning_rate: 0.0010
Epoch 14/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 385ms/step - accuracy: 0.9997 - loss: 0.0011 - val_accuracy: 1.0000 - val_loss: 9.1459e-06 - learning_rate: 0.0010
Epoch 15/15
29/29 ━━━━━━━━━━━━━━━━━━━━ 11s 392ms/step - accuracy: 1.0000 - loss: 8.8211e-04 - val_accuracy: 1.0000 - val_loss: 4.0683e-06 - learning_rate: 0.0010

In [11]:
# Plot training history
plt.figure(figsize=(12, 5))

# Plot accuracy
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.grid(True, linestyle='--', alpha=0.6)

# Plot loss
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.grid(True, linestyle='--', alpha=0.6)

plt.tight_layout()
plt.show()

In [12]:
# Evaluate LSTM model
lstm_pred = lstm_model.predict(X_test_pad)
y_pred_classes = np.argmax(lstm_pred, axis=1)
y_test_classes = np.argmax(y_test_cat, axis=1)

# Print classification report
lstm_accuracy = accuracy_score(y_test_classes, y_pred_classes)
lstm_report = classification_report(y_test_classes, y_pred_classes,
                                    target_names=['Negative', 'Neutral', 'Positive'])
print(f"LSTM Model Accuracy: {lstm_accuracy:.4f}")
print("Classification Report:")
print(lstm_report)

# Plot confusion matrix
cm = confusion_matrix(y_test_classes, y_pred_classes)
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Negative', 'Neutral', 'Positive'],
            yticklabels=['Negative', 'Neutral', 'Positive'])
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title('Confusion Matrix - LSTM Model')
plt.tight_layout()
plt.show()

32/32 ━━━━━━━━━━━━━━━━━━━━ 4s 101ms/step


LSTM Model Accuracy: 1.0000
Classification Report:
              precision    recall  f1-score   support

    Negative       1.00      1.00      1.00       304
     Neutral       1.00      1.00      1.00       351
    Positive       1.00      1.00      1.00       345

    accuracy                           1.00      1000
   macro avg       1.00      1.00      1.00      1000
weighted avg       1.00      1.00      1.00      1000

In [13]:

# Compare model performances
models = {
    'Naive Bayes (BoW)': nb_bow_results[1],
    'Logistic Regression (TF-IDF)': lr_tfidf_results[1],
    'LSTM': lstm_accuracy
}

# Plot comparison
plt.figure(figsize=(10, 6))
bars = plt.bar(models.keys(), models.values(), color=['skyblue', 'lightgreen', 'coral'])
plt.title('Model Accuracy Comparison')
plt.xlabel('Model')
plt.ylabel('Accuracy')
plt.ylim(0, 1.0)
plt.grid(axis='y', linestyle='--', alpha=0.7)

# Add accuracy values on bars
for bar in bars:
    height = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2., height + 0.01,
             f'{height:.4f}', ha='center', va='bottom')

plt.tight_layout()
plt.show()

In [14]:

# Load a subset of the 20 Newsgroups dataset
# For simplicity, we'll use 4 categories
categories = ['alt.atheism', 'sci.electronics', 'rec.sport.hockey', 'talk.politics.guns']

# Load training and testing data
newsgroups_train = fetch_20newsgroups(subset='train', categories=categories,
                                      shuffle=True, random_state=42)
newsgroups_test = fetch_20newsgroups(subset='test', categories=categories,
                                     shuffle=True, random_state=42)

print(f"Training data size: {len(newsgroups_train.data)}")
print(f"Testing data size: {len(newsgroups_test.data)}")
print(f"Categories: {newsgroups_train.target_names}")

Training data size: 2217


Testing data size: 1475
Categories: ['alt.atheism', 'rec.sport.hockey', 'sci.electronics', 'talk.politics.guns']

In [15]:
# Preprocess the newsgroups data
# Apply more aggressive preprocessing for this dataset
def preprocess_newsgroups(text):
    # Convert to lowercase
    text = text.lower()

    # Remove headers, footers, and quotes which are common in newsgroups
    text = re.sub(r'from:\s.*\n', '', text)
    text = re.sub(r'subject:\s.*\n', '', text)
    text = re.sub(r'\>.*\n', '', text)  # quoted text

    # Remove URLs and email addresses
    text = re.sub(r'http\S+|www\S+|https\S+', '', text, flags=re.MULTILINE)
    text = re.sub(r'\S+@\S+', '', text)

    # Remove special characters & numbers
    text = re.sub(r'[^\w\s]', '', text)
    text = re.sub(r'\d+', '', text)

    # Remove extra whitespace
    text = re.sub(r'\s+', ' ', text).strip()

    return text

# Process training and testing data
train_data_processed = [preprocess_newsgroups(text) for text in newsgroups_train.data]
test_data_processed = [preprocess_newsgroups(text) for text in newsgroups_test.data]

# Display a sample
print("Original text sample:")
print(newsgroups_train.data[0][:500] + "...")
print("\nProcessed text sample:")
print(train_data_processed[0][:500] + "...")

Original text sample:


From: shah@pitt.edu (Ravindra S Shah)
Subject: Re: Nords 3 - Habs 2 in O.T. We was robbed!!
Lines: 23
X-Newsreader: TIN [version 1.1 PL8]

Deepak Chhabra (dchhabra@stpl.ists.ca) wrote:

: Speaking of great players, man-oh-man can Quebec skate. I haven't seen a


: team so potent on the rush in a long time. Watching them break out of their
: zone, especially Sundin, is a treat to watch. They remind me of the Red
: Army.

: dchhabra@stpl.ists.ca (pissed-off Habs fan)

Yeah, the Nords look like...

Processed text sample:


lines xnewsreader tin version pl deepak chhabra wrote speaking of great players manohman
can quebec skate i havent seen a team so potent on the rush in a long time watching them
break out of their zone especially sundin is a treat to watch they remind me of the red
army pissedoff habs fan yeah the nords look like theyre going to be goodbut excuse the
bias have you ever watched the pens on a rushdont answer everyone has seen this footage
near the end of the season when the pens played the nords i...
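The quoted-text pattern in `preprocess_newsgroups` is not anchored to the start of a line, so it also deletes the tail of any line that merely contains a `>`. The sketch below contrasts it with a line-anchored variant; the anchored pattern and the sample string are assumptions made for illustration and are not part of the notebook.

```python
import re

# Made-up sample: one line containing '>' mid-sentence, one genuinely quoted line
sample = "a > b\n> quoted reply\nkeep this\n"

# Pattern used in the notebook: removes everything from any '>' to the end of that line
unanchored = re.sub(r'\>.*\n', '', sample)

# Hypothetical alternative: only drop lines that start with '>'
anchored = re.sub(r'^>.*\n', '', sample, flags=re.MULTILINE)

print(repr(unanchored))
print(repr(anchored))
```

Whether the stricter pattern is preferable depends on the corpus; in the sample shown above the quoted lines start with `:`, which neither pattern removes.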

In [16]:
# Extract features using TF-IDF
tfidf_vectorizer = TfidfVectorizer(max_features=5000, stop_words='english')
X_train_tfidf = tfidf_vectorizer.fit_transform(train_data_processed)
X_test_tfidf = tfidf_vectorizer.transform(test_data_processed)

# Target labels
y_train = newsgroups_train.target
y_test = newsgroups_test.target

print(f"TF-IDF features shape: {X_train_tfidf.shape}")


TF-IDF features shape: (2217, 5000)

In [17]:

# Train a classifier
classifier = LogisticRegression(max_iter=1000, C=10.0, solver='liblinear', multi_class='ovr')
classifier.fit(X_train_tfidf, y_train)

# Make predictions
y_pred = classifier.predict(X_test_tfidf)

# Evaluate
# Label names must come from the dataset's target_names: the integer labels follow
# that (alphabetical) order, not the order of the `categories` list.
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred, target_names=newsgroups_test.target_names)

print(f"Logistic Regression Accuracy: {accuracy:.4f}")
print("Classification Report:")
print(report)

Logistic Regression Accuracy: 0.9003
Classification Report:
                    precision    recall  f1-score   support

       alt.atheism       0.88      0.84      0.86       319
  rec.sport.hockey       0.95      0.93      0.94       399
   sci.electronics       0.87      0.93      0.90       393
talk.politics.guns       0.91      0.88      0.89       364

          accuracy                           0.90      1475
         macro avg       0.90      0.90      0.90      1475
      weighted avg       0.90      0.90      0.90      1475
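The integer labels in `y_train` and `y_test` index into the loader's `target_names`, which the printed `Categories:` line shows in alphabetical order rather than in the order the categories were requested. The sketch below, which assumes that alphabetical ordering and needs no download, lists where the two orders disagree.

```python
# Order in which the categories were requested (as in the notebook)
categories = ['alt.atheism', 'sci.electronics', 'rec.sport.hockey', 'talk.politics.guns']

# Order reported by target_names (assumed alphabetical, matching the printed output)
target_names = sorted(categories)

for idx in range(len(categories)):
    marker = "" if categories[idx] == target_names[idx] else "  <-- mismatch"
    print(f"{idx}: requested={categories[idx]:<20} actual={target_names[idx]}{marker}")

# Indices whose display name would be wrong if `categories` were used for lookup
mismatched = [i for i in range(len(categories)) if categories[i] != target_names[i]]
print(mismatched)
```

Any report, legend, or prediction lookup that maps a label index to a name should therefore use `target_names`.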

In [18]:

# Apply dimensionality reduction with TruncatedSVD (for sparse matrices)
# TruncatedSVD is used as a PCA equivalent for sparse matrices
svd = TruncatedSVD(n_components=2, random_state=42)
X_train_2d = svd.fit_transform(X_train_tfidf)

# Get category names for plotting (same order as the integer labels in y_train)
category_names = newsgroups_train.target_names

# Visualize the data in 2D
plt.figure(figsize=(12, 10))
colors = ['red', 'blue', 'green', 'purple']
for i, category in enumerate(category_names):
    # Get indices for this category
    indices = y_train == i

    # Plot points for this category
    plt.scatter(X_train_2d[indices, 0], X_train_2d[indices, 1],
                c=colors[i], label=category, alpha=0.7, s=50)

plt.title('PCA Visualization of 20 Newsgroups Categories')
plt.xlabel(f'Principal Component 1 (Explained Variance: {svd.explained_variance_ratio_[0]:.2f})')
plt.ylabel(f'Principal Component 2 (Explained Variance: {svd.explained_variance_ratio_[1]:.2f})')
plt.legend(loc='best')
plt.grid(True, linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()

# Print total explained variance
total_var = sum(svd.explained_variance_ratio_)
print(f"Total explained variance by 2 components: {total_var:.4f} or {total_var*100:.2f}%")

Total explained variance by 2 components: 0.0092 or 0.92%

In [19]:

from sklearn.manifold import TSNE

# First reduce dimensionality with TruncatedSVD to make t-SNE computationally feasible
svd_50 = TruncatedSVD(n_components=50, random_state=42)
X_train_50d = svd_50.fit_transform(X_train_tfidf)

# Apply t-SNE to the reduced data
# Note: recent scikit-learn releases rename n_iter to max_iter
tsne = TSNE(n_components=2, perplexity=40, n_iter=300, random_state=42)
X_train_tsne = tsne.fit_transform(X_train_50d)

# Visualize t-SNE results
plt.figure(figsize=(12, 10))

# Iterate over target_names so each legend entry matches its integer label
for i, category in enumerate(newsgroups_train.target_names):
    # Get indices for this category
    indices = y_train == i

    # Plot points for this category
    plt.scatter(X_train_tsne[indices, 0], X_train_tsne[indices, 1],
                c=colors[i], label=category, alpha=0.7, s=50)

plt.title('t-SNE Visualization of 20 Newsgroups Categories')
plt.xlabel('t-SNE Component 1')
plt.ylabel('t-SNE Component 2')
plt.legend(loc='best')
plt.grid(True, linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()
In [20]:

# Function to classify new text examples
def classify_new_text(text, model, vectorizer, label_names):
    """Classify a new text example using the trained model"""
    # Preprocess text
    processed_text = preprocess_newsgroups(text)

    # Vectorize
    text_tfidf = vectorizer.transform([processed_text])

    # Get prediction and probabilities
    prediction = model.predict(text_tfidf)[0]
    probabilities = model.predict_proba(text_tfidf)[0]

    # Sort probabilities in descending order
    sorted_indices = np.argsort(probabilities)[::-1]

    # Create results
    result = {
        'predicted_class': label_names[prediction],
        'prediction_index': prediction,
        'probabilities': []
    }

    # Add sorted probabilities
    for idx in sorted_indices:
        result['probabilities'].append({
            'class': label_names[idx],
            'probability': probabilities[idx]
        })

    return result

# Example texts to classify
test_examples = [
    "I think we need stricter regulations on firearms to prevent violence.",
    "The semiconductor industry is evolving with new transistor technologies.",
    "Last night's hockey game was amazing with multiple goals in overtime.",
    "Religious beliefs should not influence public policy decisions."
]

# Classify each example
# label_names must follow the integer label order, so pass target_names
# rather than the `categories` list (whose order differs).
for i, example in enumerate(test_examples):
    result = classify_new_text(example, classifier, tfidf_vectorizer,
                               newsgroups_train.target_names)

    print(f"\nExample {i+1}: {example}")
    print(f"Predicted class: {result['predicted_class']}")
    print("Probabilities:")
    for prob in result['probabilities']:
        print(f"  {prob['class']}: {prob['probability']:.4f}")

Example 1: I think we need stricter regulations on firearms to prevent violence.
Predicted class: talk.politics.guns
Probabilities:
  talk.politics.guns: 0.5926
  sci.electronics: 0.1610
  rec.sport.hockey: 0.1545
  alt.atheism: 0.0920

Example 2: The semiconductor industry is evolving with new transistor technologies.
Predicted class: sci.electronics
Probabilities:
  sci.electronics: 0.6343
  rec.sport.hockey: 0.2048
  alt.atheism: 0.0967
  talk.politics.guns: 0.0641

Example 3: Last night's hockey game was amazing with multiple goals in overtime.
Predicted class: rec.sport.hockey
Probabilities:
  rec.sport.hockey: 0.9507
  alt.atheism: 0.0189
  talk.politics.guns: 0.0163
  sci.electronics: 0.0141

Example 4: Religious beliefs should not influence public policy decisions.
Predicted class: alt.atheism
Probabilities:
  alt.atheism: 0.7303
  talk.politics.guns: 0.1093
  sci.electronics: 0.0850
  rec.sport.hockey: 0.0754
Experiment 10: Stock Market Prediction using LSTM
In [ ]:

%pip install yfinance

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import yfinance as yf # For fetching stock data
from datetime import datetime, timedelta
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout, GRU, Bidirectional
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
import math
import warnings

# Suppress warnings
warnings.filterwarnings('ignore')

# Set style for plots


sns.set_style('whitegrid')
plt.style.use("fivethirtyeight")

# Check versions
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")
print(f"Pandas version: {pd.__version__}")
print(f"NumPy version: {np.__version__}")

In [4]:
# Define the stock tickers and time period
tickers = ['AAPL', 'MSFT', 'GOOGL', 'AMZN']
start_date = '2018-01-01'
end_date = datetime.now().strftime('%Y-%m-%d')

# Function to fetch stock data
def get_stock_data(ticker, start, end):
    try:
        data = yf.download(ticker, start=start, end=end)
        # A failed download may come back as an empty frame rather than an exception
        if data.empty:
            raise ValueError("no rows returned")
        data.reset_index(inplace=True)
        return data
    except Exception as e:
        print(f"Error fetching data for {ticker}: {e}")
        # Create synthetic data if fetch fails
        return generate_synthetic_stock_data(ticker, start, end)

# Function to generate synthetic stock data if API fetch fails
def generate_synthetic_stock_data(ticker, start_date, end_date):
    # Convert dates to datetime
    start = pd.to_datetime(start_date)
    end = pd.to_datetime(end_date)

    # Generate date range
    date_range = pd.date_range(start=start, end=end, freq='B')  # 'B' for business days

    # Create random starting price based on ticker
    if ticker == 'AAPL':
        base_price = 150
    elif ticker == 'MSFT':
        base_price = 250
    elif ticker == 'GOOGL':
        base_price = 1200
    elif ticker == 'AMZN':
        base_price = 1800
    else:
        base_price = 100

    # Generate random walk for prices with upward trend
    n_days = len(date_range)
    noise = np.random.normal(0, 1, n_days)
    trend = np.linspace(0, 30, n_days)  # Upward trend
    seasonality = 10 * np.sin(np.linspace(0, 15 * np.pi, n_days))  # Seasonal pattern

    # Combine components
    prices = base_price + trend + seasonality + np.cumsum(noise)

    # Create volume data
    volume = np.random.randint(1000000, 10000000, n_days)

    # Create DataFrame
    df = pd.DataFrame({
        'Date': date_range,
        'Open': prices * np.random.uniform(0.99, 1.01, n_days),
        'High': prices * np.random.uniform(1.01, 1.03, n_days),
        'Low': prices * np.random.uniform(0.97, 0.99, n_days),
        'Close': prices,
        'Adj Close': prices,
        'Volume': volume
    })

    print(f"Generated synthetic data for {ticker}")
    return df

# Fetch data for each ticker
stock_data = {}
for ticker in tickers:
    stock_data[ticker] = get_stock_data(ticker, start_date, end_date)
    print(f"Retrieved {len(stock_data[ticker])} days of data for {ticker}")

# Select one stock for detailed analysis (Apple)
selected_stock = 'AAPL'
df = stock_data[selected_stock].copy()

# Display the first few rows
df.head()

YF.download() has changed argument auto_adjust default to True

[*********************100%***********************] 1 of 1 completed

Retrieved 1854 days of data for AAPL

[*********************100%***********************] 1 of 1 completed

Retrieved 1854 days of data for MSFT

[*********************100%***********************] 1 of 1 completed

Retrieved 1854 days of data for GOOGL

[*********************100%***********************] 1 of 1 completed

Retrieved 1854 days of data for AMZN

Out[4]:

Price        Date      Close       High        Low       Open     Volume
Ticker                  AAPL       AAPL       AAPL       AAPL       AAPL
0      2018-01-02  40.426826  40.436216  39.722772  39.933990  102223600
1      2018-01-03  40.419792  40.964263  40.356430  40.490198  118071600
2      2018-01-04  40.607540  40.710802  40.384590  40.492543   89738400
3      2018-01-05  41.069859  41.156691  40.612224  40.703751   94640000
4      2018-01-08  40.917324  41.213026  40.818753  40.917324   82271200

In [5]:
# Check basic statistics
print(f"\n Statistics for {selected_stock}:")
print(df.describe())

# Check for missing values


print(f"\n Missing values in {selected_stock} data:")
print(df.isnull().sum())

# Plot the stock's closing price history


plt.figure(figsize=(16, 8))
plt.title(f'{selected_stock} Stock Price History')
plt.plot(df['Date'], df['Close'])
plt.xlabel('Date', fontsize=14)
plt.ylabel('Close Price (USD)', fontsize=14)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

Statistics for AAPL:


Price Date Close High Low \
Ticker AAPL AAPL AAPL
count 1854 1854.000000 1854.000000 1854.000000
mean 2021-09-07 07:34:22.135922176 127.071460 128.370795 125.623331
min 2018-01-02 00:00:00 33.870842 34.711717 33.825582
25% 2019-11-04 06:00:00 58.870005 60.070647 58.003616
50% 2021-09-07 12:00:00 137.429787 139.645323 135.496434
75% 2023-07-12 18:00:00 172.366409 173.933534 170.980190
max 2025-05-16 00:00:00 258.396667 259.474086 257.010028
std NaN 61.769196 62.322379 61.115449

Price Open Volume


Ticker AAPL AAPL
count 1854.000000 1.854000e+03
mean 126.938500 9.807716e+07
min 34.297233 2.323470e+07
25% 58.957534 6.014090e+07
50% 137.397325 8.470835e+07
75% 172.219155 1.189786e+08
max 257.568678 4.265100e+08
std 61.692889 5.489307e+07

Missing values in AAPL data:


Price Ticker
Date 0
Close AAPL 0
High AAPL 0
Low AAPL 0
Open AAPL 0
Volume AAPL 0
dtype: int64
In [6]:
# Create additional plots to understand the data better

# Plot volume over time


plt.figure(figsize=(16, 6))
plt.plot(df['Date'], df['Volume'], color='orange')
plt.title(f'{selected_stock} Trading Volume')
plt.xlabel('Date', fontsize=14)
plt.ylabel('Volume', fontsize=14)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Plot daily price change


df['Daily_Return'] = df['Close'].pct_change() * 100
plt.figure(figsize=(16, 6))
plt.plot(df['Date'], df['Daily_Return'], color='green')
plt.title(f'{selected_stock} Daily Returns (%)')
plt.xlabel('Date', fontsize=14)
plt.ylabel('Daily Return (%)', fontsize=14)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Plot moving averages


df['MA50'] = df['Close'].rolling(window=50).mean()
df['MA200'] = df['Close'].rolling(window=200).mean()

plt.figure(figsize=(16, 8))
plt.plot(df['Date'], df['Close'], label='Close Price')
plt.plot(df['Date'], df['MA50'], label='50-day MA', color='orange')
plt.plot(df['Date'], df['MA200'], label='200-day MA', color='red')
plt.title(f'{selected_stock} Stock Price with Moving Averages')
plt.xlabel('Date', fontsize=14)
plt.ylabel('Price (USD)', fontsize=14)
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
In [7]:
# Compare all stocks on the same chart
plt.figure(figsize=(16, 8))

for ticker in tickers:
    # Normalize to the starting price for fair comparison
    normalized = stock_data[ticker]['Close'] / stock_data[ticker]['Close'].iloc[0] * 100
    plt.plot(stock_data[ticker]['Date'], normalized, label=ticker)

plt.title('Normalized Stock Price Comparison (Base = 100)')
plt.xlabel('Date', fontsize=14)
plt.ylabel('Normalized Price', fontsize=14)
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
In [8]:
# Prepare data for LSTM model
def prepare_lstm_data(data, target_col='Close', sequence_length=60, train_split=0.8):
    """Prepare data for LSTM model with proper scaling and sequence creation"""
    # Extract target column data
    target_data = data[target_col].values.reshape(-1, 1)

    # Scale the data
    scaler = MinMaxScaler(feature_range=(0, 1))
    scaled_data = scaler.fit_transform(target_data)

    # Determine training data length
    train_size = int(len(scaled_data) * train_split)
    train_data = scaled_data[:train_size]
    test_data = scaled_data[train_size - sequence_length:]

    # Create sequences for training
    X_train, y_train = [], []
    for i in range(sequence_length, len(train_data)):
        X_train.append(train_data[i - sequence_length:i, 0])
        y_train.append(train_data[i, 0])

    # Create sequences for testing
    X_test, y_test = [], []
    for i in range(sequence_length, len(test_data)):
        X_test.append(test_data[i - sequence_length:i, 0])
        y_test.append(test_data[i, 0])

    # Convert to numpy arrays
    X_train, y_train = np.array(X_train), np.array(y_train)
    X_test, y_test = np.array(X_test), np.array(y_test)

    # Reshape for LSTM [samples, time steps, features]
    X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
    X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))

    # Return processed data and scaler for inverse transformation
    return X_train, y_train, X_test, y_test, scaler, train_size

# Prepare data for the selected stock
sequence_length = 60  # 60 days of data to predict the next day
X_train, y_train, X_test, y_test, scaler, train_size = prepare_lstm_data(
    df, target_col='Close', sequence_length=sequence_length, train_split=0.8
)

print(f"Training data shape: {X_train.shape}")
print(f"Test data shape: {X_test.shape}")

# Show date ranges for training and testing
train_dates = df['Date'][:train_size]
test_dates = df['Date'][train_size:]
print(f"Training date range: {train_dates.iloc[0]} to {train_dates.iloc[-1]}")
print(f"Testing date range: {test_dates.iloc[0]} to {test_dates.iloc[-1]}")

Training data shape: (1423, 60, 1)


Test data shape: (371, 60, 1)
Training date range: 2018-01-02 00:00:00 to 2023-11-21 00:00:00
Testing date range: 2023-11-22 00:00:00 to 2025-05-16 00:00:00
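`prepare_lstm_data` fits the `MinMaxScaler` on the full series before splitting, so the test period's minimum and maximum influence how the training data is scaled. The sketch below shows one possible variation, which is not what the notebook does: learn the bounds from the training slice only, then build the same sliding windows. It uses plain lists and made-up prices so it runs without dependencies; the helper names are hypothetical.

```python
def minmax_fit(values):
    """Return (min, max) learned from the given values only."""
    return min(values), max(values)

def minmax_apply(values, lo, hi):
    """Scale with previously learned bounds; unseen values may fall outside [0, 1]."""
    return [(v - lo) / (hi - lo) for v in values]

def make_windows(series, sequence_length):
    """Turn a 1-D series into (window, next value) pairs."""
    X, y = [], []
    for i in range(sequence_length, len(series)):
        X.append(series[i - sequence_length:i])
        y.append(series[i])
    return X, y

prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 20.0]  # toy data
sequence_length = 3
train_size = int(len(prices) * 0.8)

# Bounds come from the training part only
lo, hi = minmax_fit(prices[:train_size])
scaled = minmax_apply(prices, lo, hi)

X_train, y_train = make_windows(scaled[:train_size], sequence_length)
# Test windows reuse the last `sequence_length` training values as look-back context
X_test, y_test = make_windows(scaled[train_size - sequence_length:], sequence_length)

print(len(X_train), len(X_test), max(scaled))
```

With train-only bounds, test values can exceed 1.0 when prices reach new highs. That is expected, and it is one reason the two approaches can report different errors.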

In [9]:
# Simple LSTM model
def create_simple_lstm_model(sequence_length):
    model = Sequential([
        LSTM(50, return_sequences=True, input_shape=(sequence_length, 1)),
        Dropout(0.2),
        LSTM(50, return_sequences=False),
        Dropout(0.2),
        Dense(25),
        Dense(1)
    ])
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

# Stacked LSTM model
def create_stacked_lstm_model(sequence_length):
    model = Sequential([
        LSTM(100, return_sequences=True, input_shape=(sequence_length, 1)),
        Dropout(0.2),
        LSTM(100, return_sequences=True),
        Dropout(0.2),
        LSTM(100, return_sequences=False),
        Dropout(0.2),
        Dense(50),
        Dense(1)
    ])
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

# Bidirectional LSTM model
def create_bidirectional_lstm_model(sequence_length):
    model = Sequential([
        Bidirectional(LSTM(50, return_sequences=True), input_shape=(sequence_length, 1)),
        Dropout(0.2),
        Bidirectional(LSTM(50, return_sequences=False)),
        Dropout(0.2),
        Dense(25),
        Dense(1)
    ])
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

# Create models
simple_lstm_model = create_simple_lstm_model(sequence_length)
stacked_lstm_model = create_stacked_lstm_model(sequence_length)
bidirectional_lstm_model = create_bidirectional_lstm_model(sequence_length)

# Display model architecture
simple_lstm_model.summary()

Model: "sequential"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ lstm (LSTM) │ (None, 60, 50) │ 10,400 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout) │ (None, 60, 50) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ lstm_1 (LSTM) │ (None, 50) │ 20,200 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_1 (Dropout) │ (None, 50) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense) │ (None, 25) │ 1,275 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense) │ (None, 1) │ 26 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 31,901 (124.61 KB)

Trainable params: 31,901 (124.61 KB)


Non-trainable params: 0 (0.00 B)
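The parameter counts in the summary can be checked by hand. An LSTM layer has four gates, each with input weights, recurrent weights, and a bias vector; a Dense layer has one weight per input-output pair plus one bias per unit. A minimal sketch, assuming the default Keras layer configuration (biases enabled, 32-bit floats); the helper names are hypothetical:

```python
def lstm_params(units, input_dim):
    # 4 gates x (input weights + recurrent weights + bias)
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    # weight matrix plus one bias per output unit
    return units * input_dim + units

layers = {
    'lstm': lstm_params(50, 1),       # first LSTM sees 1 feature per time step
    'lstm_1': lstm_params(50, 50),    # second LSTM sees the 50 outputs of the first
    'dense': dense_params(25, 50),
    'dense_1': dense_params(1, 25),
}
total = sum(layers.values())
size_kb = total * 4 / 1024            # 4 bytes per float32 parameter

print(layers)
print(total, round(size_kb, 2))
```

The totals agree with the summary table: 10,400 + 20,200 + 1,275 + 26 = 31,901 parameters, about 124.61 KB.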

In [ ]:
# Define callbacks for training
callbacks = [
    EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True),
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-5)
]

# Train Simple LSTM
print("Training Simple LSTM Model...")
simple_history = simple_lstm_model.fit(
    X_train, y_train,
    epochs=50,
    batch_size=32,
    validation_split=0.1,
    callbacks=callbacks,
    verbose=1
)

# Train Stacked LSTM
print("\nTraining Stacked LSTM Model...")
stacked_history = stacked_lstm_model.fit(
    X_train, y_train,
    epochs=50,
    batch_size=32,
    validation_split=0.1,
    callbacks=callbacks,
    verbose=1
)

# Train Bidirectional LSTM
print("\nTraining Bidirectional LSTM Model...")
bidirectional_history = bidirectional_lstm_model.fit(
    X_train, y_train,
    epochs=50,
    batch_size=32,
    validation_split=0.1,
    callbacks=callbacks,
    verbose=1
)

In [11]:
# Plot training history for all models
plt.figure(figsize=(14, 7))

plt.plot(simple_history.history['loss'], label='Simple LSTM Training Loss')
plt.plot(simple_history.history['val_loss'], label='Simple LSTM Validation Loss')
plt.plot(stacked_history.history['loss'], label='Stacked LSTM Training Loss')
plt.plot(stacked_history.history['val_loss'], label='Stacked LSTM Validation Loss')
plt.plot(bidirectional_history.history['loss'], label='Bidirectional LSTM Training Loss')
plt.plot(bidirectional_history.history['val_loss'], label='Bidirectional LSTM Validation Loss')

plt.title('Model Loss Comparison')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
In [12]:
# Function to evaluate model performance
def evaluate_model(model, X_test, y_test, scaler, model_name):
    """Evaluate a model and return predictions and metrics"""
    # Make predictions
    predictions = model.predict(X_test)

    # Inverse transform to get actual price values
    y_test_inv = scaler.inverse_transform(y_test.reshape(-1, 1)).flatten()
    predictions_inv = scaler.inverse_transform(predictions).flatten()

    # Calculate error metrics
    mse = mean_squared_error(y_test_inv, predictions_inv)
    rmse = math.sqrt(mse)
    mae = mean_absolute_error(y_test_inv, predictions_inv)
    r2 = r2_score(y_test_inv, predictions_inv)

    print(f"\n{model_name} Performance Metrics:")
    print(f"MSE: {mse:.2f}")
    print(f"RMSE: {rmse:.2f}")
    print(f"MAE: {mae:.2f}")
    print(f"R² Score: {r2:.4f}")

    return predictions_inv, y_test_inv, {
        'mse': mse,
        'rmse': rmse,
        'mae': mae,
        'r2': r2
    }

# Evaluate all models
simple_pred, y_test_inv, simple_metrics = evaluate_model(
    simple_lstm_model, X_test, y_test, scaler, "Simple LSTM")
stacked_pred, _, stacked_metrics = evaluate_model(
    stacked_lstm_model, X_test, y_test, scaler, "Stacked LSTM")
bidirectional_pred, _, bidirectional_metrics = evaluate_model(
    bidirectional_lstm_model, X_test, y_test, scaler, "Bidirectional LSTM")

12/12 ━━━━━━━━━━━━━━━━━━━━ 1s 48ms/step

Simple LSTM Performance Metrics:


MSE: 77.65
RMSE: 8.81
MAE: 6.84
R² Score: 0.8666
12/12 ━━━━━━━━━━━━━━━━━━━━ 1s 75ms/step

Stacked LSTM Performance Metrics:


MSE: 88.82
RMSE: 9.42
MAE: 7.42
R² Score: 0.8475
12/12 ━━━━━━━━━━━━━━━━━━━━ 2s 81ms/step

Bidirectional LSTM Performance Metrics:


MSE: 54.49
RMSE: 7.38
MAE: 5.69
R² Score: 0.9064
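The four reported metrics are related: RMSE is the square root of MSE (for the bidirectional model, the square root of 54.49 is about 7.38), and R² compares the squared error against the variance of the actual prices. The sketch below recomputes them without scikit-learn on a tiny made-up example; the function name `regression_metrics` is hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE and R² from two equal-length lists."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {'mse': mse, 'rmse': math.sqrt(mse), 'mae': mae, 'r2': r2}

m = regression_metrics([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
print(m)
```

Because the errors are computed after inverse scaling, RMSE and MAE are in the same units as the prices (USD), while R² is unitless.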

In [13]:
# Plot the predictions vs actual
def plot_predictions(predictions, actual, model_name, dates=None):
    plt.figure(figsize=(16, 8))

    if dates is not None and len(dates) == len(actual):
        plt.plot(dates, actual, label='Actual', color='blue', linewidth=2)
        plt.plot(dates, predictions, label=f'{model_name} Predictions', color='red',
                 linestyle='--', linewidth=2)
    else:
        plt.plot(actual, label='Actual', color='blue', linewidth=2)
        plt.plot(predictions, label=f'{model_name} Predictions', color='red',
                 linestyle='--', linewidth=2)

    plt.title(f'{selected_stock} Stock Price Prediction - {model_name}')
    plt.xlabel('Date', fontsize=14)
    plt.ylabel('Stock Price (USD)', fontsize=14)
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()

# Extract dates for test data.
# The first test target is row `train_size` of df (its 60-day look-back comes from the
# end of the training slice), so the dates start at train_size with no extra offset.
test_dates_for_plot = df['Date'].iloc[train_size:train_size + len(y_test_inv)]

# Plot predictions for each model
plot_predictions(simple_pred, y_test_inv, "Simple LSTM", test_dates_for_plot)
plot_predictions(stacked_pred, y_test_inv, "Stacked LSTM", test_dates_for_plot)
plot_predictions(bidirectional_pred, y_test_inv, "Bidirectional LSTM", test_dates_for_plot)
In [15]:

# Plot all predictions on the same graph for comparison
plt.figure(figsize=(16, 8))

plt.plot(test_dates, y_test_inv, label='Actual', color='blue', linewidth=2)
plt.plot(test_dates, simple_pred, label='Simple LSTM', color='red',
         linestyle='--', linewidth=1.5)
plt.plot(test_dates, stacked_pred, label='Stacked LSTM', color='green',
         linestyle='--', linewidth=1.5)
plt.plot(test_dates, bidirectional_pred, label='Bidirectional LSTM', color='orange',
         linestyle='--', linewidth=1.5)

plt.title(f'{selected_stock} Stock Price Prediction - Model Comparison')
plt.xlabel('Date', fontsize=14)
plt.ylabel('Stock Price (USD)', fontsize=14)
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
In [16]:

# Compare error metrics
metrics = pd.DataFrame({
    'Simple LSTM': [simple_metrics['rmse'], simple_metrics['mae'], simple_metrics['r2']],
    'Stacked LSTM': [stacked_metrics['rmse'], stacked_metrics['mae'], stacked_metrics['r2']],
    'Bidirectional LSTM': [bidirectional_metrics['rmse'], bidirectional_metrics['mae'],
                           bidirectional_metrics['r2']]
}, index=['RMSE', 'MAE', 'R² Score'])

# Display the metrics table
print("Model Performance Comparison:")
metrics

Model Performance Comparison:

Out[16]:

Simple LSTM Stacked LSTM Bidirectional LSTM

RMSE 8.811947 9.424283 7.381580

MAE 6.841683 7.418596 5.688194

R² Score 0.866637 0.847458 0.906418

In [17]:

# Visualize model comparison
metrics_to_plot = ['RMSE', 'MAE']
plt.figure(figsize=(12, 6))

for i, metric in enumerate(metrics_to_plot):
    plt.subplot(1, 2, i+1)
    values = metrics.loc[metric]
    bars = plt.bar(values.index, values.values, color=['red', 'green', 'orange'])
    plt.title(f'Model Comparison: {metric}')
    plt.ylabel(metric)
    plt.xticks(rotation=45)

    # Add value labels on bars
    for bar in bars:
        height = bar.get_height()
        plt.text(bar.get_x() + bar.get_width()/2., height + 0.1,
                 f'{height:.2f}', ha='center', va='bottom')

plt.tight_layout()
plt.show()

# Plot R² scores
plt.figure(figsize=(10, 6))
r2_values = metrics.loc['R² Score']
bars = plt.bar(r2_values.index, r2_values.values, color=['red', 'green', 'orange'])
plt.title('Model Comparison: R² Score')
plt.ylabel('R² Score')
plt.ylim(0, 1)
plt.xticks(rotation=45)

# Add value labels on bars
for bar in bars:
    height = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2., height + 0.01,
             f'{height:.4f}', ha='center', va='bottom')

plt.tight_layout()
plt.show()

In [ ]:
# Function to predict future stock prices
def predict_future_prices(model, last_sequence, days_to_predict, scaler):
    """Predict future stock prices given the last sequence of data"""
    future_predictions = []
    current_sequence = last_sequence.copy()

    for _ in range(days_to_predict):
        # Reshape for prediction
        current_reshaped = current_sequence.reshape(1, current_sequence.shape[0], 1)

        # Predict next price
        next_price = model.predict(current_reshaped)[0][0]
        future_predictions.append(next_price)

        # Update sequence for next prediction
        current_sequence = np.append(current_sequence[1:], next_price)

    # Inverse transform to get actual price values
    future_predictions = scaler.inverse_transform(
        np.array(future_predictions).reshape(-1, 1)
    ).flatten()

    return future_predictions

# Get the most recent window of scaled closing prices.
# X_test[-1] ends one day before the last observation (it is the input used to predict
# the final test day), so take the last `sequence_length` closes directly instead.
last_sequence = scaler.transform(df['Close'].values[-sequence_length:].reshape(-1, 1))

# Predict the next 30 days
days_to_predict = 30

# Generate future dates
last_date = test_dates_for_plot.iloc[-1]
future_dates = pd.date_range(start=last_date + timedelta(days=1),
                             periods=days_to_predict, freq='B')

# Predict future prices with all models
simple_future = predict_future_prices(
    simple_lstm_model, last_sequence, days_to_predict, scaler)
stacked_future = predict_future_prices(
    stacked_lstm_model, last_sequence, days_to_predict, scaler)
bidirectional_future = predict_future_prices(
    bidirectional_lstm_model, last_sequence, days_to_predict, scaler)

# Combine historical and future data for visualization


plt.figure(figsize=(16, 8))

# Plot historical actual prices


plt.plot(test_dates_for_plot, y_test_inv[-len(test_dates_for_plot):], label='Historical P
rices', color='blue', linewidth=2)

# Plot future predictions


plt.plot(future_dates, simple_future, label='Simple LSTM Forecast', color='red', linestyl
e='--')
plt.plot(future_dates, stacked_future, label='Stacked LSTM Forecast', color='green', line
style='--')
plt.plot(future_dates, bidirectional_future, label='Bidirectional LSTM Forecast', color='
orange', linestyle='--')

# Add a vertical line to separate historical data from predictions


plt.axvline(x=test_dates_for_plot.iloc[-1], color='gray', linestyle='-.')
plt.text(test_dates_for_plot.iloc[-1], min(y_test_inv), 'Prediction Start', rotation=90,
verticalalignment='bottom')

plt.title(f'{selected_stock} Stock Price Forecast - Next {days_to_predict} Trading Days')


plt.xlabel('Date', fontsize=14)
plt.ylabel('Stock Price (USD)', fontsize=14)
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Print the forecasted prices


forecast_df = pd.DataFrame({
'Date': future_dates,
'Simple LSTM': simple_future,
'Stacked LSTM': stacked_future,
'Bidirectional LSTM': bidirectional_future
})

print("\n Forecasted Prices for the Next 30 Trading Days:")


forecast_df.set_index('Date')
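
The recursive loop in `predict_future_prices` can be illustrated without a trained network. The sketch below is not the notebook's model: `mean_predictor` is a hypothetical stand-in that returns the mean of the window, and `recursive_forecast` is an illustrative name. It only shows how each prediction is fed back into the input window. Because every step consumes earlier predictions, errors tend to compound as the horizon grows.

```python
import numpy as np

def recursive_forecast(predict_next, last_sequence, steps):
    """Roll a one-step predictor forward `steps` times.

    `predict_next` is any callable mapping a 1-D window to a scalar;
    here it stands in for `model.predict` (an assumption for illustration).
    """
    window = np.asarray(last_sequence, dtype=float).copy()
    forecasts = []
    for _ in range(steps):
        next_value = float(predict_next(window))
        forecasts.append(next_value)
        # Slide the window: drop the oldest value, append the new prediction
        window = np.append(window[1:], next_value)
    return np.array(forecasts)

def mean_predictor(window):
    """Hypothetical stand-in model: predicts the mean of the window."""
    return window.mean()

forecast = recursive_forecast(mean_predictor, [1.0, 2.0, 3.0], steps=3)
print(forecast)
```

With a real LSTM the window would hold scaled prices, and the forecasts would still need `scaler.inverse_transform` as in the cell above.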

In [ ]:

# Add technical indicators to the original dataframe
def add_technical_indicators(df):
    """Add common technical indicators to the dataframe"""
    # Work on a copy so the original dataframe is left untouched
    df_temp = df.copy()

    # Moving Averages
    df_temp['MA5'] = df_temp['Close'].rolling(window=5).mean()
    df_temp['MA10'] = df_temp['Close'].rolling(window=10).mean()
    df_temp['MA20'] = df_temp['Close'].rolling(window=20).mean()
    df_temp['MA50'] = df_temp['Close'].rolling(window=50).mean()

    # Exponential Moving Averages
    df_temp['EMA12'] = df_temp['Close'].ewm(span=12, adjust=False).mean()
    df_temp['EMA26'] = df_temp['Close'].ewm(span=26, adjust=False).mean()

    # MACD (Moving Average Convergence Divergence)
    df_temp['MACD'] = df_temp['EMA12'] - df_temp['EMA26']
    df_temp['MACD_Signal'] = df_temp['MACD'].ewm(span=9, adjust=False).mean()

    # RSI (Relative Strength Index), using simple 14-day averages of gains and losses
    delta = df_temp['Close'].diff()
    gain = delta.clip(lower=0)
    loss = -delta.clip(upper=0)
    avg_gain = gain.rolling(window=14).mean()
    avg_loss = loss.rolling(window=14).mean()
    rs = avg_gain / avg_loss
    df_temp['RSI'] = 100 - (100 / (1 + rs))

    # Bollinger Bands (20-day MA plus/minus 2 standard deviations)
    df_temp['MA20_std'] = df_temp['Close'].rolling(window=20).std()
    df_temp['Upper_Band'] = df_temp['MA20'] + (df_temp['MA20_std'] * 2)
    df_temp['Lower_Band'] = df_temp['MA20'] - (df_temp['MA20_std'] * 2)

    # Price Rate of Change over 10 days, in percent
    df_temp['ROC'] = ((df_temp['Close'] / df_temp['Close'].shift(10)) - 1) * 100

    # Drop rows with NaN values resulting from the rolling calculations.
    # Note: this drops a row if ANY column is NaN, including columns already in df.
    df_temp.dropna(inplace=True)

    return df_temp

# Add technical indicators
df_with_indicators = add_technical_indicators(df)

# Display the dataframe with indicators
df_with_indicators.tail()
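
The RSI formula used above, RSI = 100 - 100 / (1 + RS) with RS = average gain / average loss, is easier to follow on a series small enough to check by hand. The sketch below repeats the same calculation with a shorter window; `simple_rsi` and the two toy series are hypothetical names and values chosen for illustration. When gains and losses balance, RS = 1 and the RSI is 50; when there are no losses, RS is infinite and the RSI saturates at 100.

```python
import pandas as pd

def simple_rsi(close, window=14):
    """RSI using simple rolling means of gains and losses."""
    delta = close.diff()
    gain = delta.clip(lower=0)
    loss = -delta.clip(upper=0)
    rs = gain.rolling(window=window).mean() / loss.rolling(window=window).mean()
    return 100 - (100 / (1 + rs))

# Hypothetical toy series, chosen so the result is easy to verify
alternating = pd.Series([10.0, 11.0, 10.0, 11.0, 10.0, 11.0, 10.0])
rising = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])

rsi_alternating = simple_rsi(alternating, window=2)
rsi_rising = simple_rsi(rising, window=2)
print(rsi_alternating.iloc[-1], rsi_rising.iloc[-1])
```

The first `window` values are NaN because the rolling mean has no full window yet, which is why the notebook drops NaN rows after computing the indicators.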

In [23]:

# Visualize some technical indicators
def plot_technical_indicators(df, indicators=('MA20', 'Upper_Band', 'Lower_Band', 'RSI', 'MACD')):
    """Plot the stock price along with selected technical indicators"""
    # Create figure with subplots
    fig, axes = plt.subplots(3, 1, figsize=(16, 15), sharex=True)

    # Plot price and Bollinger Bands
    df['Close'].plot(ax=axes[0], color='blue', label='Close Price')
    if 'MA20' in indicators:
        df['MA20'].plot(ax=axes[0], color='orange', label='20-day MA')
    if 'Upper_Band' in indicators and 'Lower_Band' in indicators:
        df['Upper_Band'].plot(ax=axes[0], color='red', linestyle='--',
                              label='Upper Bollinger Band')
        df['Lower_Band'].plot(ax=axes[0], color='green', linestyle='--',
                              label='Lower Bollinger Band')

    axes[0].set_title(f'{selected_stock} Stock Price with Bollinger Bands')
    axes[0].set_ylabel('Price')
    axes[0].legend()
    axes[0].grid(True, alpha=0.3)

    # Plot RSI
    if 'RSI' in indicators:
        df['RSI'].plot(ax=axes[1], color='purple', label='RSI')
        # Add overbought/oversold levels
        axes[1].axhline(y=70, color='red', linestyle='--', alpha=0.5)
        axes[1].axhline(y=30, color='green', linestyle='--', alpha=0.5)
        axes[1].text(df.index[0], 70, 'Overbought', color='red')
        axes[1].text(df.index[0], 30, 'Oversold', color='green')

    axes[1].set_title('Relative Strength Index (RSI)')
    axes[1].set_ylabel('RSI')
    axes[1].grid(True, alpha=0.3)

    # Plot MACD
    if 'MACD' in indicators:
        df['MACD'].plot(ax=axes[2], color='blue', label='MACD')
        df['MACD_Signal'].plot(ax=axes[2], color='red', label='Signal Line')

        # Highlight MACD Histogram (green above zero, red otherwise)
        macd_hist = df['MACD'] - df['MACD_Signal']
        axes[2].bar(df.index, macd_hist,
                    color=macd_hist.apply(lambda x: 'green' if x > 0 else 'red'),
                    label='MACD Histogram', alpha=0.5, width=2)

    axes[2].set_title('Moving Average Convergence Divergence (MACD)')
    axes[2].set_ylabel('MACD')
    axes[2].axhline(y=0, color='black', linestyle='-', alpha=0.3)
    axes[2].legend()
    axes[2].grid(True, alpha=0.3)

    plt.tight_layout()
    plt.show()

# Recreate df_with_indicators, since it appeared to be empty
df_with_indicators = add_technical_indicators(df)

# Check if we have data
print(f"Technical indicators dataframe shape: {df_with_indicators.shape}")

# Plot technical indicators for the last 200 days
if not df_with_indicators.empty:
    plot_technical_indicators(df_with_indicators.iloc[-200:])
else:
    print("Error: The dataframe with technical indicators is empty.")

Technical indicators dataframe shape: (0, 21)


Error: The dataframe with technical indicators is empty.
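
A shape of `(0, 21)` means `dropna` removed every row. One possible cause, stated here as an assumption because `df` itself is not shown in this section, is that at least one column of the frame is entirely NaN, so no row survives an unrestricted `dropna`. The sketch below uses a hypothetical toy frame (the column name `Dividends` is invented) to show how to locate such a column and how `dropna(subset=...)` restricts the check to chosen columns.

```python
import numpy as np
import pandas as pd

def nan_report(frame):
    """Return the fraction of NaN values per column, largest first."""
    return frame.isna().mean().sort_values(ascending=False)

# Hypothetical toy frame: 'Dividends' is entirely NaN,
# 'MA5' only has the usual warm-up NaNs of a rolling window.
toy = pd.DataFrame({
    'Close': np.arange(10, dtype=float),
    'Dividends': np.nan,
})
toy['MA5'] = toy['Close'].rolling(window=5).mean()

report = nan_report(toy)
all_dropped = toy.dropna()             # every row has a NaN in 'Dividends'
kept = toy.dropna(subset=['MA5'])      # only the warm-up rows are removed
print(report)
print(all_dropped.shape, kept.shape)
```

Running `nan_report(df)` on the real price frame before adding indicators would show whether this explanation applies; if it does not, the cause lies elsewhere.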

10. Monte Carlo Simulation for Risk Assessment


In [24]:
# Function for Monte Carlo simulation
def monte_carlo_simulation(last_price, days, simulations, volatility):
    """Perform Monte Carlo simulation for stock price predictions"""
    # Sample daily log returns from a zero-mean normal distribution
    returns = np.random.normal(0, volatility, (days, simulations))

    # Create price paths; row 0 holds the starting price
    price_paths = np.zeros((days + 1, simulations))
    price_paths[0] = last_price

    # Compound the returns day by day
    for t in range(1, days + 1):
        price_paths[t] = price_paths[t-1] * np.exp(returns[t-1])

    return price_paths

# Get the last known price
last_price = y_test_inv[-1]

# Calculate historical volatility (standard deviation of returns)
returns = df['Daily_Return'].dropna() / 100  # convert from percentage
volatility = returns.std()

# Monte Carlo parameters
days_to_simulate = 30
num_simulations = 1000

# Run simulation
simulated_paths = monte_carlo_simulation(last_price, days_to_simulate,
                                         num_simulations, volatility)

# Plot simulation results
plt.figure(figsize=(16, 8))

# Plot historical prices
plt.plot(range(-30, 0), y_test_inv[-30:], label='Historical Prices',
         color='blue', linewidth=2)

# Plot simulation paths (sample 100 to avoid overcrowding)
sample_paths = np.random.choice(num_simulations, 100, replace=False)
for i in sample_paths:
    plt.plot(range(days_to_simulate), simulated_paths[1:, i], color='gray', alpha=0.2)

# Plot expected path (mean)
mean_path = np.mean(simulated_paths, axis=1)
plt.plot(range(days_to_simulate), mean_path[1:], label='Expected Path',
         color='red', linewidth=2)

# Plot confidence intervals (5th to 95th percentile of the simulated paths)
percentile_5 = np.percentile(simulated_paths, 5, axis=1)
percentile_95 = np.percentile(simulated_paths, 95, axis=1)
plt.fill_between(range(days_to_simulate), percentile_5[1:], percentile_95[1:],
                 color='red', alpha=0.2, label='90% Confidence Interval')

# Add LSTM predictions for comparison
plt.plot(range(days_to_simulate), bidirectional_future,
         label='Bidirectional LSTM Forecast', color='orange', linestyle='--', linewidth=2)

# Add vertical line separating historical from future
plt.axvline(x=0, color='gray', linestyle='-.')
plt.text(0, last_price*0.8, 'Simulation Start', rotation=90, verticalalignment='bottom')

plt.title(f'Monte Carlo Simulation for {selected_stock} ({num_simulations} simulations)')
plt.xlabel('Days')
plt.ylabel('Stock Price (USD)')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Calculate risk metrics from the simulation
final_prices = simulated_paths[-1]
expected_price = np.mean(final_prices)
var_95 = last_price - np.percentile(final_prices, 5)  # 95% Value at Risk
gain_potential = np.percentile(final_prices, 95) - last_price  # 95% Upside potential

print(f"\nRisk Assessment for {selected_stock} over {days_to_simulate} days:")
print(f"Current Price: ${last_price:.2f}")
print(f"Expected Price: ${expected_price:.2f} "
      f"(Change: {(expected_price/last_price-1)*100:.2f}%)")
print(f"95% Value at Risk: ${var_95:.2f} ({var_95/last_price*100:.2f}% of current price)")
print(f"95% Upside Potential: ${gain_potential:.2f} "
      f"({gain_potential/last_price*100:.2f}% of current price)")
print(f"90% Confidence Range: ${np.percentile(final_prices, 5):.2f} "
      f"to ${np.percentile(final_prices, 95):.2f}")

Risk Assessment for AAPL over 30 days:
Current Price: $211.26
Expected Price: $213.34 (Change: 0.99%)
95% Value at Risk: $32.76 (15.51% of current price)
95% Upside Potential: $41.52 (19.65% of current price)
90% Confidence Range: $178.50 to $252.78
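
The expected price in the output above is slightly higher than the current price even though the simulated log returns have zero mean. This is consistent with a property of the log-normal distribution: for independent N(0, σ²) daily log returns over T days, E[S_T] = S_0 · exp(σ²T / 2). The sketch below checks this numerically with hypothetical parameters (S_0 = 100, σ = 0.02, 200,000 paths), which are not the notebook's values; `simulate_final_prices` is an illustrative helper name.

```python
import numpy as np

def simulate_final_prices(last_price, days, simulations, volatility, seed=0):
    """Final prices after `days` zero-mean normal daily log returns."""
    rng = np.random.default_rng(seed)
    log_returns = rng.normal(0.0, volatility, size=(days, simulations))
    # Summing log returns and exponentiating equals compounding day by day
    return last_price * np.exp(log_returns.sum(axis=0))

# Hypothetical parameters for illustration only
last_price, days, sigma = 100.0, 30, 0.02
final_prices = simulate_final_prices(last_price, days, 200_000, sigma)

simulated_ratio = final_prices.mean() / last_price
theoretical_ratio = np.exp(0.5 * sigma**2 * days)   # E[S_T] / S_0

# 95% Value at Risk, defined as in the notebook cell above
var_95 = last_price - np.percentile(final_prices, 5)
print(simulated_ratio, theoretical_ratio, var_95)
```

If a drift-free expected price is wanted, one common adjustment is to sample log returns with mean -σ²/2 instead of 0; whether that is appropriate depends on the modelling assumptions.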
Experiment 11: Mini Project -> Emotion Detection from Tweets
In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import re
import string
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import (Dense, LSTM, Embedding, SpatialDropout1D,
                                     Dropout, Bidirectional)
from tensorflow.keras.layers import (Input, GlobalMaxPooling1D, Conv1D,
                                     MaxPooling1D, concatenate)
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
from sklearn.preprocessing import LabelEncoder
import matplotlib.cm as cm
from wordcloud import WordCloud
import warnings

# Suppress warnings
warnings.filterwarnings('ignore')

# Set style for plots
sns.set_style('whitegrid')

# Download NLTK resources
try:
    nltk.download('punkt')
    nltk.download('stopwords')
    nltk.download('wordnet')
except Exception:
    print("NLTK download failed, but we'll continue anyway")

# Check versions
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")
print(f"NLTK version: {nltk.__version__}")

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data] Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...

TensorFlow version: 2.18.0
Keras version: 3.8.0
NLTK version: 3.9.1

In [2]:
def generate_synthetic_twitter_data(n_samples=5000):
    """Generate synthetic Twitter data with emotion labels"""
    # Define emotions and their associated words/phrases
    emotions = {
        'joy': [
            "happy", "excited", "blessed", "wonderful", "fantastic", "thrilled", "love",
            "joyful", "delighted", "ecstatic", "pleased", "overjoyed", "grateful", "blissful",
            "elated", "cheerful", "content", "jubilant", "peaceful", "radiant", "sunny"
        ],
        'sadness': [
            "sad", "depressed", "heartbroken", "miserable", "grief", "sorrowful", "unhappy",
            "disappointed", "despondent", "hopeless", "melancholy", "gloomy", "blue", "down",
            "hurt", "lost", "devastated", "tearful", "regretful", "broken", "crying"
        ],
        'anger': [
            "angry", "furious", "outraged", "annoyed", "irritated", "fuming", "enraged",
            "mad", "resentful", "bitter", "incensed", "livid", "hostile", "irate",
            "seething", "hate", "frustrated", "disgusted", "infuriated", "upset", "offended"
        ],
        'fear': [
            "afraid", "scared", "terrified", "frightened", "anxious", "worried", "nervous",
            "panicked", "horrified", "fearful", "dread", "uneasy", "alarmed", "threatened",
            "intimidated", "apprehensive", "paranoid", "tense", "stressed", "petrified",
            "timid"
        ],
        'surprise': [
            "surprised", "shocked", "astonished", "amazed", "stunned", "unexpected", "wow",
            "speechless", "startled", "dumbfounded", "flabbergasted", "bewildered", "astounded",
            "mind-blown", "awestruck", "taken aback", "could not believe", "unbelievable",
            "whoa", "omg"
        ]
    }

    # Tweet templates for each emotion
    # ({0} is filled with an emotion word, {1} with a context phrase)
    templates = {
        'joy': [
            "I'm feeling so {0} today! Life is good! #blessed",
            "Just had the most {0} experience at {1}! Can't stop smiling! ",
            "This {1} makes me feel so {0}! Best day ever! ❤️",
            "I'm {0} to announce that {1}! Dreams do come true. #grateful",
            "Nothing but {0} vibes with {1}. #goodtimes",
            "Feeling absolutely {0} after {1}!",
            "Today was {0}! {1} made everything perfect. #happiness",
            "So {0} to be here with {1}. Memories that will last forever!",
            "Can't express how {0} I am about {1}. #blessed #thankful",
            "Such a {0} day spent {1}. My heart is full! "
        ],
        'sadness': [
            "Feeling so {0} right now... {1} ",
            "Can't believe {1}. I'm completely {0}. #heartbroken",
            "Today has been so {0}. {1} has me in tears.",
            "Why does {1} have to happen? Feeling {0} beyond words.",
            "Nothing but {0} thoughts after {1}. Need some time alone.",
            "The news about {1} has left me {0}. Can't stop crying.",
            "I'm {0} to say that {1}. Please send positive thoughts.",
            "{1} has me feeling so {0} today. Hard to see the light.",
            "My heart is {0} because {1}. Life can be so unfair.",
            "In a {0} mood after {1}. Just want to stay in bed all day."
        ],
        'anger': [
            "I'm so {0} right now! {1} is absolutely unacceptable! ",
            "Can't believe {1}! Makes me so {0}! #furious",
            "Nothing makes me more {0} than {1}. Seriously?!",
            "I'm {0} beyond words about {1}. This is ridiculous!",
            "The way {1} happened has me feeling {0}. Not okay!",
            "So {0} at the situation with {1}. This needs to change!",
            "{1} is driving me insane! Feeling extremely {0} right now.",
            "My blood is boiling! {1} has me completely {0}!",
            "I can't stand {1}! Makes me {0} every single time.",
            "Why am I so {0} about {1}? Because it's totally wrong!"
        ],
        'fear': [
            "Feeling so {0} about {1}. Can't stop thinking about it.",
            "I'm {0} that {1} might happen. Anyone else feeling this way?",
            "The thought of {1} makes me {0}. I don't know what to do.",
            "So {0} right now. {1} has me on edge. #anxiety",
            "Can't sleep because I'm {0} about {1}. Help?",
            "The news about {1} has me {0}. Trying to stay calm.",
            "I get so {0} whenever {1} happens. My heart races.",
            "Why does {1} make me feel so {0}? I hate this feeling!",
            "Having {0} thoughts about {1}. Need reassurance.",
            "Absolutely {0} after hearing about {1}. Please be safe everyone."
        ],
        'surprise': [
            "I'm completely {0} by {1}! Did not see that coming! ",
            "Well that was {0}! {1} just blew my mind!",
            "Can't believe what just happened! {1} has me {0}! #whoa",
            "I'm {0} at the news about {1}. Totally unexpected!",
            "My jaw dropped! So {0} by {1}! #unexpected",
            "{1} just happened and I'm absolutely {0}! What a twist!",
            "The most {0} thing just occurred: {1}! Still processing it.",
            "I was not prepared for {1}! Feeling {0} right now!",
            "OMG! {1} has left me {0}! Did anyone else know about this?",
            "Talk about a {0} turn of events! {1} has changed everything!"
        ]
    }

    # Context phrases to fill in templates
    contexts = {
        'joy': [
            "my promotion", "our vacation", "the party last night", "meeting new friends",
            "spending time with family", "the concert", "my new job", "passing my exam",
            "my birthday celebration", "winning the game", "my new puppy", "the surprise gift",
            "the beautiful weather", "the delicious meal", "my wedding plans",
            "finishing my project"
        ],
        'sadness': [
            "losing a friend", "the bad news", "missing someone", "the movie ending",
            "the rainy weather", "failing my test", "being alone tonight",
            "remembering the past",
            "the loss in our family", "saying goodbye", "moving away", "the tragic news",
            "seeing people struggle", "the restaurant closing", "the canceled plans",
            "losing my favorite item"
        ],
        'anger': [
            "being stuck in traffic", "poor customer service", "people being rude",
            "the flight cancellation",
            "waiting in long lines", "being overcharged", "the WiFi not working",
            "someone cutting me off",
            "the policy change", "people not wearing masks", "misleading advertising",
            "the noisy neighbors",
            "the political situation", "the unfair treatment", "broken promises",
            "the delivery being late"
        ],
        'fear': [
            "the upcoming presentation", "the medical test results", "walking alone at night",
            "the strange noise",
            "the upcoming deadline", "the job interview", "the turbulence on my flight",
            "losing my job",
            "the storm warning", "the virus spreading", "going to the dentist",
            "making a big decision",
            "meeting the new boss", "the final exam", "the economic uncertainty",
            "moving to a new city"
        ],
        'surprise': [
            "the plot twist", "the unexpected visit", "the sudden announcement",
            "winning the lottery",
            "the celebrity sighting", "the marriage proposal", "the sudden career change",
            "the news headline",
            "the secret reveal", "the sudden weather change", "the price drop",
            "the unexpected gift",
            "the team victory", "the reunion", "the talent show performance",
            "the dramatic ending"
        ]
    }

    # Generate tweets
    tweets = []
    labels = []

    emotion_distribution = {
        'joy': int(n_samples * 0.25),
        'sadness': int(n_samples * 0.20),
        'anger': int(n_samples * 0.20),
        'fear': int(n_samples * 0.15),
        'surprise': int(n_samples * 0.20)
    }

    # Ensure we get exactly n_samples by adjusting the 'joy' category
    total = sum(emotion_distribution.values())
    if total < n_samples:
        emotion_distribution['joy'] += (n_samples - total)
    elif total > n_samples:
        emotion_distribution['joy'] -= (total - n_samples)

    for emotion, count in emotion_distribution.items():
        for _ in range(count):
            # Select random template
            template = np.random.choice(templates[emotion])

            # Select random emotion word
            emotion_word = np.random.choice(emotions[emotion])

            # Select random context
            context = np.random.choice(contexts[emotion])

            # Create tweet
            tweet = template.format(emotion_word, context)

            tweets.append(tweet)
            labels.append(emotion)

    # Create DataFrame
    data = pd.DataFrame({
        'tweet': tweets,
        'emotion': labels
    })

    # Shuffle the data
    data = data.sample(frac=1, random_state=42).reset_index(drop=True)

    return data

# Generate synthetic Twitter data
tweet_data = generate_synthetic_twitter_data(n_samples=5000)

# Display the first few rows
print(f"Generated {len(tweet_data)} synthetic tweets")
tweet_data.head()

Generated 5000 synthetic tweets


Out[2]:

   tweet                                                emotion
0  Today has been so lost. the canceled plans has...   sadness
1  Nothing makes me more irritated than poor cust...   anger
2  Why am I so outraged about people not wearing ...   anger
3  I'm feeling so excited today! Life is good! #b...   joy
4  I'm elated to announce that our vacation! Drea...   joy


In [3]:

# Check the distribution of emotions
plt.figure(figsize=(12, 6))
count_plot = sns.countplot(x='emotion', data=tweet_data, palette='viridis')
plt.title('Distribution of Emotions in Tweets', fontsize=16)
plt.xlabel('Emotion', fontsize=14)
plt.ylabel('Count', fontsize=14)
plt.xticks(fontsize=12)
plt.yticks(fontsize=12)

# Add count labels on the bars
for p in count_plot.patches:
    count_plot.annotate(format(p.get_height(), '.0f'),
                        (p.get_x() + p.get_width() / 2., p.get_height()),
                        ha='center', va='center',
                        xytext=(0, 10),
                        textcoords='offset points')

plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()

# Calculate percentages
emotion_counts = tweet_data['emotion'].value_counts(normalize=True) * 100
print("Emotion distribution percentages:")
for emotion, percentage in emotion_counts.items():
    print(f"{emotion}: {percentage:.2f}%")

Emotion distribution percentages:


joy: 25.00%
sadness: 20.00%
anger: 20.00%
surprise: 20.00%
fear: 15.00%

In [4]:
# Analyze tweet length by emotion
tweet_data['tweet_length'] = tweet_data['tweet'].apply(len)

plt.figure(figsize=(12, 6))
sns.boxplot(x='emotion', y='tweet_length', data=tweet_data, palette='viridis')
plt.title('Tweet Length by Emotion', fontsize=16)
plt.xlabel('Emotion', fontsize=14)
plt.ylabel('Tweet Length (characters)', fontsize=14)
plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()

# Calculate average tweet length by emotion
avg_length_by_emotion = (tweet_data.groupby('emotion')['tweet_length']
                         .mean().sort_values(ascending=False))
print("Average tweet length by emotion:")
for emotion, avg_length in avg_length_by_emotion.items():
    print(f"{emotion}: {avg_length:.1f} characters")

Average tweet length by emotion:
fear: 74.8 characters
surprise: 74.6 characters
anger: 72.6 characters
sadness: 69.7 characters
joy: 66.9 characters

In [5]:
# Generate word clouds for each emotion
def plot_wordcloud(text, title, max_words=100):
    wordcloud = WordCloud(
        background_color='white',
        max_words=max_words,
        max_font_size=40,
        scale=3,
        random_state=42
    ).generate(text)

    plt.figure(figsize=(10, 6))
    plt.imshow(wordcloud, interpolation='bilinear')
    plt.axis('off')
    plt.title(title, fontsize=16)
    plt.tight_layout(pad=0)
    plt.show()

# Create word clouds for each emotion
emotions = tweet_data['emotion'].unique()

for emotion in emotions:
    # Get all tweets for this emotion
    emotion_tweets = tweet_data[tweet_data['emotion'] == emotion]['tweet']

    # Combine into one string
    combined_tweets = ' '.join(emotion_tweets)

    # Create word cloud
    plot_wordcloud(combined_tweets, f'Word Cloud for {emotion.capitalize()} Tweets')

In [6]:
# Make sure NLTK resources are downloaded correctly
# Note: NLTK 3.9+ loads the tokenizer from the 'punkt_tab' resource; if
# word_tokenize raises LookupError, also run nltk.download('punkt_tab').
try:
    nltk.download('punkt')
    nltk.download('stopwords')
    nltk.download('wordnet')
except Exception as e:
    print(f"NLTK download issue: {e}")
    print("Continuing with available resources...")

def preprocess_tweet(tweet, remove_stopwords=True):
    """Preprocess tweet text for analysis"""
    # Convert to lowercase
    tweet = tweet.lower()

    # Remove URLs
    tweet = re.sub(r'http\S+|www\S+|https\S+', '', tweet, flags=re.MULTILINE)

    # Remove user mentions and hashtag symbol (keep the hashtag text)
    tweet = re.sub(r'@\w+', '', tweet)
    tweet = re.sub(r'#', '', tweet)

    # Remove punctuation (this also strips apostrophes, so "i'm" becomes "im")
    tweet = re.sub(r'[^\w\s]', '', tweet)

    # Tokenize, falling back to whitespace splitting if the tokenizer data is missing
    try:
        tokens = word_tokenize(tweet)
    except LookupError:
        tokens = tweet.split()

    # Remove stopwords if requested
    if remove_stopwords:
        try:
            stop_words = set(stopwords.words('english'))
            tokens = [word for word in tokens if word not in stop_words]
        except LookupError:
            # If stopwords can't be loaded, continue without stopword removal
            pass

    # Lemmatize
    try:
        lemmatizer = WordNetLemmatizer()
        tokens = [lemmatizer.lemmatize(word) for word in tokens]
    except LookupError:
        # If lemmatization fails, use the original tokens
        pass

    # Rejoin into string
    preprocessed_tweet = ' '.join(tokens)

    return preprocessed_tweet

# Apply preprocessing to the tweet data
tweet_data['processed_tweet'] = tweet_data['tweet'].apply(preprocess_tweet)

# Display examples of original vs processed tweets
comparison = tweet_data[['tweet', 'processed_tweet', 'emotion']].head(5)
print("Original vs. Processed Tweet Examples:")
for i, row in comparison.iterrows():
    print(f"\nEmotion: {row['emotion']}")
    print(f"Original: {row['tweet']}")
    print(f"Processed: {row['processed_tweet']}")

[nltk_data] Downloading package punkt to /root/nltk_data...


[nltk_data] Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data] Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Package wordnet is already up-to-date!

Original vs. Processed Tweet Examples:

Emotion: sadness
Original: Today has been so lost. the canceled plans has me in tears.
Processed: today lost canceled plan tear

Emotion: anger
Original: Nothing makes me more irritated than poor customer service. Seriously?!
Processed: nothing make irritated poor customer service seriously

Emotion: anger
Original: Why am I so outraged about people not wearing masks? Because it's totally wrong!
Processed: outraged people wearing mask totally wrong

Emotion: joy
Original: I'm feeling so excited today! Life is good! #blessed
Processed: im feeling excited today life good blessed

Emotion: joy
Original: I'm elated to announce that our vacation! Dreams do come true. #grateful
Processed: im elated announce vacation dream come true grateful
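
The regex stage of the preprocessing can be tried on its own, without any NLTK data. The sketch below covers only lowercasing, URL and mention removal, `#` stripping and punctuation removal; it skips stopword removal and lemmatization, and `basic_clean` is a hypothetical helper name. It shows why "I'm" appears as "im" in the processed examples above: the apostrophe is removed before the text is split into tokens.

```python
import re

def basic_clean(tweet):
    """Lowercase, strip URLs, mentions, '#' symbols and punctuation, then split."""
    tweet = tweet.lower()
    tweet = re.sub(r'http\S+|www\S+', '', tweet)   # URLs
    tweet = re.sub(r'@\w+', '', tweet)             # user mentions
    tweet = tweet.replace('#', '')                 # keep hashtag text, drop the symbol
    tweet = re.sub(r'[^\w\s]', '', tweet)          # punctuation, including apostrophes
    return tweet.split()

tokens = basic_clean("I'm feeling so excited today! Life is good! #blessed")
link_tokens = basic_clean("@user check https://example.com now!")
print(tokens)
print(link_tokens)
```

If contractions matter for the task, one option is to expand them (for example "i'm" to "i am") before the punctuation step.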

In [7]:
# Encode emotion labels
label_encoder = LabelEncoder()
encoded_labels = label_encoder.fit_transform(tweet_data['emotion'])
num_classes = len(label_encoder.classes_)

# Map encoded values to original labels for reference
label_mapping = dict(zip(range(num_classes), label_encoder.classes_))
print("Label encoding mapping:")
for encoded, original in label_mapping.items():
    print(f"{encoded}: {original}")

# Convert to categorical (one-hot) for multi-class classification
categorical_labels = to_categorical(encoded_labels, num_classes=num_classes)

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    tweet_data['processed_tweet'],
    categorical_labels,
    test_size=0.2,
    stratify=categorical_labels,  # Keep class proportions equal in train and test
    random_state=42
)

# Tokenize text
max_words = 10000  # Maximum number of words to consider
max_len = 100      # Maximum sequence length

tokenizer = Tokenizer(num_words=max_words, oov_token='<OOV>')
tokenizer.fit_on_texts(X_train)

# Convert text to sequences of integers
X_train_seq = tokenizer.texts_to_sequences(X_train)
X_test_seq = tokenizer.texts_to_sequences(X_test)

# Pad sequences to ensure uniform length
X_train_pad = pad_sequences(X_train_seq, maxlen=max_len, padding='post', truncating='post')
X_test_pad = pad_sequences(X_test_seq, maxlen=max_len, padding='post', truncating='post')

# Get the vocabulary size
vocab_size = len(tokenizer.word_index) + 1  # +1 for padding token

print(f"Vocabulary size: {vocab_size}")
print(f"Training set size: {X_train_pad.shape}")
print(f"Testing set size: {X_test_pad.shape}")

Label encoding mapping:


0: anger
1: fear
2: joy
3: sadness
4: surprise
Vocabulary size: 346
Training set size: (4000, 100)
Testing set size: (1000, 100)
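
What the tokenizer and padding steps produce can be sketched in plain Python. This is a simplified imitation, not the Keras implementation: `fit_word_index` and `encode_and_pad` are hypothetical helpers that assign ids by descending word frequency, reserve id 1 for the out-of-vocabulary token and id 0 for padding, and pad or truncate at the end of the sequence.

```python
from collections import Counter

def fit_word_index(texts, oov_token='<OOV>'):
    """Map words to integer ids by descending frequency; id 1 is the OOV token."""
    counts = Counter(word for text in texts for word in text.split())
    word_index = {oov_token: 1}
    for rank, (word, _) in enumerate(counts.most_common(), start=2):
        word_index[word] = rank
    return word_index

def encode_and_pad(text, word_index, max_len, oov_token='<OOV>'):
    """Encode words as ids, then truncate and zero-pad at the end."""
    ids = [word_index.get(word, word_index[oov_token]) for word in text.split()]
    ids = ids[:max_len]                          # like truncating='post'
    return ids + [0] * (max_len - len(ids))      # like padding='post'

train_texts = ["feeling excited today", "feeling lost today"]
word_index = fit_word_index(train_texts)

# "furious" was never seen during fitting, so it maps to the OOV id
encoded = encode_and_pad("feeling furious today", word_index, max_len=5)
vocab_size = len(word_index) + 1   # +1 for the padding id 0
print(word_index)
print(encoded, vocab_size)
```

Fitting the index on the training texts only, as the notebook does, keeps test-set vocabulary from leaking into the model's input encoding.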

In [8]:

# Define callbacks to use with all models
# min_lr is set below Adam's default learning rate (0.001); with min_lr=0.001
# the learning rate could never actually be reduced.
callbacks = [
    EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True),
    ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=2, min_lr=1e-5)
]

# 1. Simple LSTM Model
def build_lstm_model():
    model = Sequential([
        Embedding(input_dim=vocab_size, output_dim=128, input_length=max_len),
        SpatialDropout1D(0.2),
        LSTM(128, dropout=0.2, recurrent_dropout=0.2),
        Dense(64, activation='relu'),
        Dropout(0.5),
        Dense(num_classes, activation='softmax')
    ])

    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )

    return model

# 2. Bidirectional LSTM Model
def build_bidirectional_lstm_model():
    model = Sequential([
        Embedding(input_dim=vocab_size, output_dim=128, input_length=max_len),
        SpatialDropout1D(0.2),
        Bidirectional(LSTM(64, dropout=0.2, recurrent_dropout=0.2)),
        Dense(64, activation='relu'),
        Dropout(0.5),
        Dense(num_classes, activation='softmax')
    ])

    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )

    return model

# 3. CNN Model for Text Classification
def build_cnn_model():
    model = Sequential([
        Embedding(input_dim=vocab_size, output_dim=128, input_length=max_len),
        Conv1D(filters=128, kernel_size=5, activation='relu'),
        MaxPooling1D(pool_size=2),
        Conv1D(filters=128, kernel_size=5, activation='relu'),
        MaxPooling1D(pool_size=2),
        Conv1D(filters=128, kernel_size=5, activation='relu'),
        GlobalMaxPooling1D(),
        Dense(128, activation='relu'),
        Dropout(0.5),
        Dense(num_classes, activation='softmax')
    ])

    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )

    return model

# 4. CNN-LSTM Hybrid Model
def build_cnn_lstm_model():
    model = Sequential([
        Embedding(input_dim=vocab_size, output_dim=128, input_length=max_len),
        SpatialDropout1D(0.2),
        Conv1D(filters=128, kernel_size=5, activation='relu'),
        MaxPooling1D(pool_size=2),
        LSTM(64, dropout=0.2, recurrent_dropout=0.2),
        Dense(64, activation='relu'),
        Dropout(0.5),
        Dense(num_classes, activation='softmax')
    ])

    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )

    return model
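
`ReduceLROnPlateau` multiplies the learning rate by `factor` after `patience` epochs without improvement, but never lets it fall below `min_lr`. The sketch below reproduces only that clipping arithmetic in plain Python (it is not the Keras implementation, and `reduced_lr` is a hypothetical helper) to show why `min_lr` should sit below the optimizer's starting rate. The value 0.001 is Adam's default learning rate in Keras.

```python
def reduced_lr(current_lr, factor, min_lr):
    """New learning rate after one plateau: scale by `factor`, floor at `min_lr`."""
    return max(current_lr * factor, min_lr)

start_lr = 0.001  # Adam's default learning rate in Keras

# Floor equal to the starting rate: the reduction has no effect
stuck = reduced_lr(start_lr, factor=0.1, min_lr=0.001)

# Floor below the starting rate: the rate can actually drop
lowered = reduced_lr(start_lr, factor=0.1, min_lr=1e-5)

print(stuck, lowered)
```

With `factor=0.1`, each plateau cuts the rate tenfold, so a floor of `1e-5` allows two reductions from the default before the floor is reached.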

In [ ]:
# Initialize empty dictionaries to store models and histories
models = {}
histories = {}

# Build, train and evaluate each model
def train_evaluate_model(model_name, model_builder):
    print(f"\nTraining {model_name}...")

    # Seed Python, NumPy and the backend before building the model,
    # so that weight initialization is seeded as well
    keras.utils.set_random_seed(42)

    model = model_builder()

    history = model.fit(
        X_train_pad, y_train,
        epochs=15,
        batch_size=64,
        validation_split=0.1,
        callbacks=callbacks,
        verbose=1
    )

    # Evaluate on test set
    loss, accuracy = model.evaluate(X_test_pad, y_test, verbose=0)
    print(f"{model_name} Test Accuracy: {accuracy:.4f}")

    return model, history

# Train each model one by one
print("Starting model training...")
models['LSTM'], histories['LSTM'] = train_evaluate_model(
    'LSTM', build_lstm_model)
models['BiLSTM'], histories['BiLSTM'] = train_evaluate_model(
    'Bidirectional LSTM', build_bidirectional_lstm_model)
models['CNN'], histories['CNN'] = train_evaluate_model(
    'CNN', build_cnn_model)
models['CNN-LSTM'], histories['CNN-LSTM'] = train_evaluate_model(
    'CNN-LSTM Hybrid', build_cnn_lstm_model)

In [10]:

# Compare training histories
plt.figure(figsize=(15, 6))

# Plot accuracy
plt.subplot(1, 2, 1)
for model_name, history in histories.items():
    plt.plot(history.history['accuracy'], label=f'{model_name} Training')
    plt.plot(history.history['val_accuracy'], label=f'{model_name} Validation',
             linestyle='--')

plt.title('Model Accuracy Comparison')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend()
plt.grid(True, alpha=0.3)

# Plot loss
plt.subplot(1, 2, 2)
for model_name, history in histories.items():
    plt.plot(history.history['loss'], label=f'{model_name} Training')
    plt.plot(history.history['val_loss'], label=f'{model_name} Validation',
             linestyle='--')

plt.title('Model Loss Comparison')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

In [11]:

# Function to evaluate model and produce classification report and confusion matrix
def evaluate_model_detail(model, model_name, X_test, y_test, label_mapping):
    # Get predictions
    y_pred_prob = model.predict(X_test)
    y_pred = np.argmax(y_pred_prob, axis=1)
    y_true = np.argmax(y_test, axis=1)

    # Get classification report
    target_names = [label_mapping[i] for i in range(len(label_mapping))]
    report = classification_report(y_true, y_pred, target_names=target_names)
    print(f"\n{model_name} Classification Report:\n")
    print(report)

    # Get confusion matrix
    cm = confusion_matrix(y_true, y_pred)

    # Plot confusion matrix
    plt.figure(figsize=(10, 8))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
                xticklabels=target_names,
                yticklabels=target_names)
    plt.title(f'{model_name} Confusion Matrix')
    plt.ylabel('True Label')
    plt.xlabel('Predicted Label')
    plt.tight_layout()
    plt.show()

    return y_pred, y_true, y_pred_prob

# Select the best performing model for detailed evaluation
# Let's use BiLSTM for this example, but you could compare all models
best_model_name = 'BiLSTM'  # Change this based on your results
best_model = models[best_model_name]

# Evaluate the best model in detail
y_pred, y_true, y_pred_prob = evaluate_model_detail(
    best_model, best_model_name, X_test_pad, y_test, label_mapping)

32/32 ━━━━━━━━━━━━━━━━━━━━ 4s 106ms/step

BiLSTM Classification Report:

              precision    recall  f1-score   support

       anger       1.00      1.00      1.00       200
        fear       1.00      1.00      1.00       150
         joy       1.00      1.00      1.00       250
     sadness       1.00      1.00      1.00       200
    surprise       1.00      1.00      1.00       200

    accuracy                           1.00      1000
   macro avg       1.00      1.00      1.00      1000
weighted avg       1.00      1.00      1.00      1000
In [12]:
# Compare accuracies of all models
model_accuracies = {}
for model_name, model in models.items():
    _, accuracy = model.evaluate(X_test_pad, y_test, verbose=0)
    model_accuracies[model_name] = accuracy

# Plot model comparison
plt.figure(figsize=(10, 6))
bars = plt.bar(model_accuracies.keys(), model_accuracies.values(), color=['blue', 'green', 'red', 'purple'])

# Add accuracy values on bars
for bar in bars:
    height = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2., height + 0.01,
             f'{height:.4f}', ha='center', va='bottom')

plt.title('Model Accuracy Comparison')
plt.xlabel('Model')
plt.ylabel('Test Accuracy')
plt.ylim(0, 1.0)
plt.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()

# Find best model
best_model_name = max(model_accuracies, key=model_accuracies.get)
best_accuracy = model_accuracies[best_model_name]
print(f"The best performing model is {best_model_name} with an accuracy of {best_accuracy:.4f}")

The best performing model is BiLSTM with an accuracy of 1.0000

In [13]:
# Analyze misclassifications to understand model weaknesses
def analyze_misclassifications(X_test_raw, y_true, y_pred, label_mapping, n_examples=10):
    # Indices of test samples where the prediction differs from the true label
    misclassified_indices = np.where(y_true != y_pred)[0]

    if len(misclassified_indices) == 0:
        print("No misclassifications found!")
        return

    # Limit to n examples
    n_examples = min(n_examples, len(misclassified_indices))
    selected_indices = np.random.choice(misclassified_indices, n_examples, replace=False)

    print(f"\nMisclassification Analysis ({n_examples} examples):")
    print("-" * 80)

    for idx in selected_indices:
        text = X_test_raw.iloc[idx]
        true_label = label_mapping[y_true[idx]]
        pred_label = label_mapping[y_pred[idx]]

        print(f"Tweet: {text}")
        print(f"True emotion: {true_label}")
        print(f"Predicted emotion: {pred_label}")
        print("-" * 80)

    # Calculate confusion pairs (which emotions get confused with each other)
    confusion_pairs = {}
    for idx in misclassified_indices:
        true_label = label_mapping[y_true[idx]]
        pred_label = label_mapping[y_pred[idx]]
        pair = (true_label, pred_label)

        if pair in confusion_pairs:
            confusion_pairs[pair] += 1
        else:
            confusion_pairs[pair] = 1

    # Show most common confusion pairs
    print("\nMost Common Confusion Pairs:")
    for pair, count in sorted(confusion_pairs.items(), key=lambda x: x[1], reverse=True)[:5]:
        true_label, pred_label = pair
        print(f"True: {true_label}, Predicted: {pred_label} - {count} instances")

# Extract the raw test data
X_test_raw = X_test.reset_index(drop=True)

# Analyze misclassifications
analyze_misclassifications(X_test_raw, y_true, y_pred, label_mapping, n_examples=5)

No misclassifications found!
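
Because the BiLSTM run above produced no misclassifications, the confusion-pair branch of the function never executed. The following is a minimal sketch of that counting step on toy labels, written with `collections.Counter` instead of the manual dictionary. The helper name `count_confusion_pairs` and the toy arrays are hypothetical and not part of the notebook.

```python
from collections import Counter

import numpy as np


def count_confusion_pairs(y_true, y_pred, label_mapping):
    """Count (true, predicted) label pairs over misclassified samples only."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    wrong = np.where(y_true != y_pred)[0]
    return Counter(
        (label_mapping[int(y_true[i])], label_mapping[int(y_pred[i])]) for i in wrong
    )


# Toy labels (hypothetical, not taken from the notebook's dataset)
toy_mapping = {0: 'anger', 1: 'fear', 2: 'joy'}
toy_true = [0, 1, 2, 1, 1, 0]
toy_pred = [0, 2, 2, 2, 0, 0]

pairs = count_confusion_pairs(toy_true, toy_pred, toy_mapping)
for (true_label, pred_label), count in pairs.most_common(5):
    print(f"True: {true_label}, Predicted: {pred_label} - {count} instances")
```

`Counter.most_common(5)` replaces the explicit sort-and-slice, so the result should match the original loop wherever the counts are distinct.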

In [14]:
# Create a function to predict emotions from new tweets
def predict_emotion(tweet, model, tokenizer, label_mapping, max_len=100):
    """Predict the emotion of a single tweet"""
    # Preprocess the tweet
    processed_tweet = preprocess_tweet(tweet)

    # Convert to sequence
    sequence = tokenizer.texts_to_sequences([processed_tweet])

    # Pad sequence
    padded_sequence = pad_sequences(sequence, maxlen=max_len, padding='post', truncating='post')

    # Make prediction
    prediction = model.predict(padded_sequence)[0]

    # Get predicted class and probability
    predicted_class = np.argmax(prediction)
    probability = prediction[predicted_class]

    # Get emotion name
    emotion = label_mapping[predicted_class]

    # Get probabilities for all emotions
    all_probs = {label_mapping[i]: float(prob) for i, prob in enumerate(prediction)}

    return {
        'emotion': emotion,
        'probability': float(probability),
        'all_probabilities': all_probs
    }

# Test the prediction function with some example tweets
example_tweets = [
    "I'm so happy today! Just got a promotion at work and feeling blessed! #grateful",
    "Feeling so sad after watching that movie. It really broke my heart.",
    "Can't believe how terrible the customer service was! I'm absolutely furious right now!",
    "I'm really nervous about my presentation tomorrow. Can't sleep thinking about it. #anxiety",
    "OMG! I just won the lottery! I can't believe this is happening to me! What a surprise!"
]

# Use the best model for predictions
best_model = models[best_model_name]

# Make predictions for each example
for i, tweet in enumerate(example_tweets):
    result = predict_emotion(tweet, best_model, tokenizer, label_mapping, max_len)
    print(f"\nExample {i+1}: {tweet}")
    print(f"Predicted Emotion: {result['emotion']} (Confidence: {result['probability']:.2%})")

    # Show all emotion probabilities sorted by likelihood
    print("All Emotion Probabilities:")
    for emotion, prob in sorted(result['all_probabilities'].items(), key=lambda x: x[1], reverse=True):
        print(f"  - {emotion}: {prob:.2%}")

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 66ms/step

Example 1: I'm so happy today! Just got a promotion at work and feeling blessed! #grateful
Predicted Emotion: joy (Confidence: 100.00%)
All Emotion Probabilities:
  - joy: 100.00%
  - fear: 0.00%
  - surprise: 0.00%
  - anger: 0.00%
  - sadness: 0.00%
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 73ms/step

Example 2: Feeling so sad after watching that movie. It really broke my heart.
Predicted Emotion: sadness (Confidence: 99.96%)
All Emotion Probabilities:
  - sadness: 99.96%
  - fear: 0.04%
  - joy: 0.00%
  - anger: 0.00%
  - surprise: 0.00%
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step

Example 3: Can't believe how terrible the customer service was! I'm absolutely furious right now!
Predicted Emotion: anger (Confidence: 100.00%)
All Emotion Probabilities:
  - anger: 100.00%
  - sadness: 0.00%
  - fear: 0.00%
  - surprise: 0.00%
  - joy: 0.00%
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 58ms/step

Example 4: I'm really nervous about my presentation tomorrow. Can't sleep thinking about it. #anxiety
Predicted Emotion: fear (Confidence: 100.00%)
All Emotion Probabilities:
  - fear: 100.00%
  - anger: 0.00%
  - sadness: 0.00%
  - joy: 0.00%
  - surprise: 0.00%
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step

Example 5: OMG! I just won the lottery! I can't believe this is happening to me! What a surprise!
Predicted Emotion: surprise (Confidence: 99.97%)
All Emotion Probabilities:
  - surprise: 99.97%
  - sadness: 0.02%
  - joy: 0.01%
  - anger: 0.00%
  - fear: 0.00%
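
The padding step inside `predict_emotion` decides what the model actually sees, so it may help to look at it in isolation. The sketch below reimplements the effect of `padding='post', truncating='post'` in plain NumPy, assuming integer token ids and a pad value of 0. The helper `pad_post` and the toy sequences are hypothetical; the notebook itself uses Keras `pad_sequences`.

```python
import numpy as np


def pad_post(sequences, maxlen, value=0):
    """Right-pad (and right-truncate) integer sequences to a fixed length."""
    out = np.full((len(sequences), maxlen), value, dtype=np.int32)
    for row, seq in enumerate(sequences):
        kept = list(seq)[:maxlen]      # truncating='post': drop tokens from the end
        out[row, :len(kept)] = kept    # padding='post': filler goes after the tokens
    return out


# Hypothetical token ids, standing in for tokenizer.texts_to_sequences output
toy_sequences = [[5, 12, 7], [3, 9, 4, 8, 2, 6]]
padded = pad_post(toy_sequences, maxlen=4)
print(padded)
```

Whatever settings are used at prediction time should match those used when the training sequences were padded; otherwise the model receives inputs laid out differently from what it was trained on.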

In [15]:

# Create a visualization of the emotion prediction
def visualize_emotion_prediction(tweet, result):
    # Sort emotions by probability
    emotions = []
    probabilities = []
    for emotion, prob in sorted(result['all_probabilities'].items(), key=lambda x: x[1], reverse=True):
        emotions.append(emotion)
        probabilities.append(prob)

    # Create a colormap based on probabilities
    # (plt.cm is used so the cell does not depend on a separate `cm` import)
    colors = plt.cm.viridis(np.array(probabilities))

    # Plot
    plt.figure(figsize=(12, 6))

    # Tweet text with predicted emotion
    plt.suptitle(f"Tweet: {tweet}", fontsize=14, wrap=True)

    # Bar chart of emotion probabilities
    bars = plt.bar(emotions, probabilities, color=colors)

    # Add value labels on bars
    for bar in bars:
        height = bar.get_height()
        plt.text(bar.get_x() + bar.get_width()/2., height + 0.01,
                 f'{height:.2%}', ha='center', va='bottom')

    plt.title(f"Predicted Emotion: {result['emotion']} (Confidence: {result['probability']:.2%})")
    plt.xlabel('Emotion')
    plt.ylabel('Probability')
    plt.ylim(0, 1.0)
    plt.grid(axis='y', alpha=0.3)
    plt.tight_layout(rect=[0, 0, 1, 0.95])
    plt.show()

# Visualize predictions for a few examples
for tweet in example_tweets[:3]:  # Just show first 3 for space
    result = predict_emotion(tweet, best_model, tokenizer, label_mapping, max_len)
    visualize_emotion_prediction(tweet, result)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 59ms/step


1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step

In [16]:
# Create a simple interactive tool for emotion prediction
def interactive_emotion_predictor():
    print("=== Twitter Emotion Predictor ===\n")
    print("Enter a tweet to analyze its emotion, or type 'quit' to exit.\n")

    while True:
        # Get input from user
        tweet = input("\nEnter a tweet: ")

        # Check if user wants to quit
        if tweet.lower() == 'quit':
            print("\nThank you for using the Twitter Emotion Predictor!")
            break

        # Skip empty tweets
        if not tweet.strip():
            print("Tweet cannot be empty. Please try again.")
            continue

        # Predict emotion
        result = predict_emotion(tweet, best_model, tokenizer, label_mapping, max_len)

        # Display result
        print(f"\nPredicted Emotion: {result['emotion']} (Confidence: {result['probability']:.2%})")

        # Show all emotion probabilities sorted by likelihood
        print("\nAll Emotion Probabilities:")
        for emotion, prob in sorted(result['all_probabilities'].items(), key=lambda x: x[1], reverse=True):
            print(f"  - {emotion}: {prob:.2%}")

        # Optional: Visualize result
        visualize_emotion_prediction(tweet, result)

# Run the interactive tool - uncomment to use
# interactive_emotion_predictor()

Experiment 12: Introduction to Deep Learning using Keras and TensorFlow
In [1]:

# Import necessary libraries


import tensorflow as tf
from tensorflow import keras
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Check versions
print(f"TensorFlow version: {tf.__version__}")
print(f"Keras version: {keras.__version__}")
print(f"NumPy version: {np.__version__}")
print(f"Pandas version: {pd.__version__}")

TensorFlow version: 2.18.0


Keras version: 3.8.0
NumPy version: 2.0.2
Pandas version: 2.2.2

In [2]:
# Set random seeds for reproducibility
np.random.seed(42)
tf.random.set_seed(42)

In [3]:
# Simple sequential model
model_sequential = keras.Sequential([
    keras.layers.Dense(64, activation='relu', input_shape=(784,)),
    keras.layers.Dense(32, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])

# Compile the model
model_sequential.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Model summary
model_sequential.summary()

/usr/local/lib/python3.11/dist-packages/keras/src/layers/core/dense.py:87: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)

Model: "sequential"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense) │ (None, 64) │ 50,240 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense) │ (None, 32) │ 2,080 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense) │ (None, 10) │ 330 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 52,650 (205.66 KB)

Trainable params: 52,650 (205.66 KB)

Non-trainable params: 0 (0.00 B)
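
The parameter counts in the summary above can be reproduced by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The following is a small sketch of that arithmetic; the helper `dense_params` is a hypothetical name, and the size estimate assumes 4-byte float32 weights.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


layer_sizes = [784, 64, 32, 10]  # input width followed by the three Dense widths
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)                     # [50240, 2080, 330]
print(total)                         # 52650
print(f"{total * 4 / 1024:.2f} KB")  # 205.66 KB
```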

In [4]:
# Functional API example
inputs = keras.Input(shape=(784,))
x = keras.layers.Dense(64, activation='relu')(inputs)
x = keras.layers.Dense(32, activation='relu')(x)
outputs = keras.layers.Dense(10, activation='softmax')(x)

model_functional = keras.Model(inputs=inputs, outputs=outputs)

# Compile the model
model_functional.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Model summary
model_functional.summary()

Model: "functional_1"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ input_layer_1 (InputLayer) │ (None, 784) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_3 (Dense) │ (None, 64) │ 50,240 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_4 (Dense) │ (None, 32) │ 2,080 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_5 (Dense) │ (None, 10) │ 330 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 52,650 (205.66 KB)

Trainable params: 52,650 (205.66 KB)

Non-trainable params: 0 (0.00 B)

In [5]:
# Subclassing example
class CustomModel(keras.Model):
    def __init__(self):
        super(CustomModel, self).__init__()
        self.dense1 = keras.layers.Dense(64, activation='relu')
        self.dense2 = keras.layers.Dense(32, activation='relu')
        self.dense3 = keras.layers.Dense(10, activation='softmax')

    def call(self, inputs):
        x = self.dense1(inputs)
        x = self.dense2(x)
        return self.dense3(x)

model_subclass = CustomModel()

# Compile the model
model_subclass.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Build the model from an input shape.
# Note: without a custom build() method, this marks the model as built but leaves
# the Dense layers unbuilt, which is why the summary below reports 0 parameters.
# Calling the model once on a sample batch, e.g. model_subclass(np.zeros((1, 784))),
# would create the weights before summary() is called.
model_subclass.build((None, 784))
model_subclass.summary()

/usr/local/lib/python3.11/dist-packages/keras/src/layers/layer.py:393: UserWarning: `build()` was called on layer 'custom_model', however the layer does not have a `build()` method implemented and it looks like it has unbuilt state. This will cause the layer to be marked as built, despite not being actually built, which may cause failures down the line. Make sure to implement a proper `build()` method.
  warnings.warn(

Model: "custom_model"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense_6 (Dense) │ ? │ 0 (unbuilt) │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_7 (Dense) │ ? │ 0 (unbuilt) │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_8 (Dense) │ ? │ 0 (unbuilt) │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 0 (0.00 B)

Trainable params: 0 (0.00 B)

Non-trainable params: 0 (0.00 B)

3.4. Common Layer Types in Keras

Keras provides various layer types for different purposes:

1. Dense (Fully Connected) Layers
2. Convolutional Layers (Conv1D, Conv2D, Conv3D)
3. Pooling Layers (MaxPooling, AveragePooling)
4. Recurrent Layers (SimpleRNN, LSTM, GRU)
5. Dropout Layers
6. BatchNormalization Layers
7. Embedding Layers
8. Flatten and Reshape Layers

Let's look at a more complex model using some of these layer types:

In [6]:
# Create a more complex model with different layer types
complex_model = keras.Sequential([
    # Reshape input to 28x28x1 for CNNs
    keras.layers.Reshape((28, 28, 1), input_shape=(784,)),

    # Convolutional layers
    keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu'),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Conv2D(64, kernel_size=(3, 3), activation='relu'),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),

    # Flatten the output for dense layers
    keras.layers.Flatten(),

    # Dense layers with dropout for regularization
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation='softmax')
])

# Compile the model
complex_model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Model summary
complex_model.summary()

/usr/local/lib/python3.11/dist-packages/keras/src/layers/reshaping/reshape.py:39: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(**kwargs)

Model: "sequential_1"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ reshape (Reshape) │ (None, 28, 28, 1) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d (Conv2D) │ (None, 26, 26, 32) │ 320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d (MaxPooling2D) │ (None, 13, 13, 32) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_1 (Conv2D) │ (None, 11, 11, 64) │ 18,496 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_1 (MaxPooling2D) │ (None, 5, 5, 64) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten) │ (None, 1600) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_9 (Dense) │ (None, 128) │ 204,928 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout) │ (None, 128) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_10 (Dense) │ (None, 10) │ 1,290 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 225,034 (879.04 KB)

Trainable params: 225,034 (879.04 KB)

Non-trainable params: 0 (0.00 B)
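
The output shapes and parameter counts in this summary follow from simple arithmetic. The sketch below traces them, assuming the Keras defaults used in the cell above: stride 1 and `'valid'` padding for `Conv2D`, and a pooling stride equal to the pool size for `MaxPooling2D`. The helper names are hypothetical.

```python
def conv2d_valid(size, kernel):
    """Spatial size after a stride-1 convolution with no padding."""
    return size - kernel + 1


def max_pool(size, pool):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool


def conv2d_params(kernel, in_channels, filters):
    """One kernel x kernel x in_channels weight block per filter, plus a bias each."""
    return kernel * kernel * in_channels * filters + filters


size = 28
size = conv2d_valid(size, 3)   # 26
size = max_pool(size, 2)       # 13
size = conv2d_valid(size, 3)   # 11
size = max_pool(size, 2)       # 5
flat = size * size * 64        # 1600 features entering the first Dense layer

params = [
    conv2d_params(3, 1, 32),   # 320
    conv2d_params(3, 32, 64),  # 18496
    flat * 128 + 128,          # 204928
    128 * 10 + 10,             # 1290
]
print(flat, params, sum(params))
```

Most of the 225,034 parameters sit in the first Dense layer after `Flatten`, not in the convolutional layers.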

In [7]:
# Load Fashion MNIST dataset with error handling
try:
    # Attempt to load the dataset directly
    (x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()
except Exception as e:
    print(f"Error loading dataset directly: {e}")

    # Alternative approach: use a proxy or local cache
    try:
        # Try using a different method with explicit path
        import os
        fashion_mnist_path = os.path.join(os.path.expanduser("~"), '.keras', 'datasets', 'fashion-mnist')
        os.makedirs(fashion_mnist_path, exist_ok=True)

        # If you have the dataset files locally, you can specify the path
        # Otherwise, try using a different download method
        print("Attempting to download using TensorFlow's get_file utility...")

        from tensorflow.keras.utils import get_file

        base_url = 'https://storage.googleapis.com/tensorflow/tf-keras-datasets/'
        train_images_path = get_file('train-images-idx3-ubyte.gz', base_url + 'train-images-idx3-ubyte.gz')
        train_labels_path = get_file('train-labels-idx1-ubyte.gz', base_url + 'train-labels-idx1-ubyte.gz')
        test_images_path = get_file('t10k-images-idx3-ubyte.gz', base_url + 't10k-images-idx3-ubyte.gz')
        test_labels_path = get_file('t10k-labels-idx1-ubyte.gz', base_url + 't10k-labels-idx1-ubyte.gz')

        # Process the downloaded files using numpy
        # (image files have a 16-byte header, label files an 8-byte header)
        import gzip

        with gzip.open(train_images_path, 'rb') as f:
            x_train = np.frombuffer(f.read(), np.uint8, offset=16).reshape(-1, 28, 28)

        with gzip.open(train_labels_path, 'rb') as f:
            y_train = np.frombuffer(f.read(), np.uint8, offset=8)

        with gzip.open(test_images_path, 'rb') as f:
            x_test = np.frombuffer(f.read(), np.uint8, offset=16).reshape(-1, 28, 28)

        with gzip.open(test_labels_path, 'rb') as f:
            y_test = np.frombuffer(f.read(), np.uint8, offset=8)

    except Exception as e2:
        print(f"Second attempt failed: {e2}")
        print("Creating dummy data for demonstration purposes...")

        # Create dummy data with the correct shape for demonstration
        x_train = np.random.randint(0, 256, size=(60000, 28, 28), dtype=np.uint8)
        y_train = np.random.randint(0, 10, size=(60000,), dtype=np.uint8)
        x_test = np.random.randint(0, 256, size=(10000, 28, 28), dtype=np.uint8)
        y_test = np.random.randint(0, 10, size=(10000,), dtype=np.uint8)

        print("⚠️ WARNING: Using randomly generated dummy data. Real model training will not be meaningful.")

# Check the shape of the data
print(f"Training data shape: {x_train.shape}")
print(f"Training labels shape: {y_train.shape}")
print(f"Test data shape: {x_test.shape}")
print(f"Test labels shape: {y_test.shape}")

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
29515/29515 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
26421880/26421880 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
5148/5148 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
4422102/4422102 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Training data shape: (60000, 28, 28)
Training labels shape: (60000,)
Test data shape: (10000, 28, 28)
Test labels shape: (10000,)

In [8]:
# Define class names for Fashion MNIST
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Display some sample images
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(x_train[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[y_train[i]])
plt.tight_layout()
plt.show()

In [9]:
# Preprocess the data
# Normalize pixel values to be between 0 and 1
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Reshape data for the model


x_train_flat = x_train.reshape(x_train.shape[0], -1)
x_test_flat = x_test.reshape(x_test.shape[0], -1)

In [ ]:
# Create a simple model for Fashion MNIST
fashion_model = keras.Sequential([
    keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax')
])

# Compile the model
fashion_model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Train the model
history = fashion_model.fit(
    x_train_flat, y_train,
    epochs=10,
    batch_size=64,
    validation_split=0.2,  # Holds out the last 20% of the samples, taken before any shuffling
    verbose=1,
    shuffle=True,  # Shuffles the remaining training samples at the start of each epoch
    callbacks=[
        # Stop once val_loss has not improved for 3 epochs and keep the best weights
        keras.callbacks.EarlyStopping(
            monitor='val_loss',
            patience=3,
            restore_best_weights=True
        )
    ]
)
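
As documented for Keras `fit`, `validation_split` takes the validation samples from the end of the arrays before any shuffling, so `shuffle=True` only affects the order of the remaining training samples. The sketch below mimics that slicing in NumPy to show the resulting sizes; `split_like_validation_split` is a hypothetical helper and the arrays are dummy zeros with the Fashion MNIST training shapes.

```python
import math

import numpy as np


def split_like_validation_split(x, y, validation_split):
    """Hold out the trailing fraction of the samples, without shuffling first."""
    split_at = int(math.floor(len(x) * (1.0 - validation_split)))
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])


# Placeholder arrays with the Fashion MNIST training shapes (values are dummy zeros)
x_demo = np.zeros((60000, 784), dtype=np.float32)
y_demo = np.zeros((60000,), dtype=np.uint8)

(train_x, train_y), (val_x, val_y) = split_like_validation_split(x_demo, y_demo, 0.2)
steps_per_epoch = math.ceil(len(train_x) / 64)

print(train_x.shape, val_x.shape)  # (48000, 784) (12000, 784)
print(steps_per_epoch)             # 750
```

If the source arrays were sorted by class, the held-out tail would not be representative, which is one reason to shuffle the data explicitly before calling `fit` with `validation_split`.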

In [11]:
# Evaluate the model
test_loss, test_acc = fashion_model.evaluate(x_test_flat, y_test, verbose=2)
print(f'\n Test accuracy: {test_acc:.4f}')

313/313 - 1s - 4ms/step - accuracy: 0.8783 - loss: 0.3426

Test accuracy: 0.8783

In [12]:
# Plot training history
plt.figure(figsize=(12, 4))

# Plot training & validation accuracy


plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.title('Training and Validation Accuracy')

# Plot training & validation loss


plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.title('Training and Validation Loss')

plt.tight_layout()
plt.show()
In [13]:
# Make predictions
predictions = fashion_model.predict(x_test_flat)

# Function to plot an image with its prediction
def plot_image(i, predictions_array, true_label, img):
    true_label, img = true_label[i], img[i]
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])

    plt.imshow(img, cmap=plt.cm.binary)

    # Blue label for a correct prediction, red for an incorrect one
    predicted_label = np.argmax(predictions_array[i])
    if predicted_label == true_label:
        color = 'blue'
    else:
        color = 'red'

    plt.xlabel("{} {:2.0f}% ({})".format(
        class_names[predicted_label],
        100*np.max(predictions_array[i]),
        class_names[true_label]),
        color=color
    )

# Function to plot the prediction bars
def plot_value_array(i, predictions_array, true_label):
    true_label = true_label[i]
    plt.grid(False)
    plt.xticks(range(10))
    plt.yticks([])
    thisplot = plt.bar(range(10), predictions_array[i], color="#777777")
    plt.ylim([0, 1])
    predicted_label = np.argmax(predictions_array[i])

    thisplot[predicted_label].set_color('red')
    thisplot[true_label].set_color('blue')

# Plot the first X test images, their predicted labels, and the true labels
num_rows = 5
num_cols = 3
num_images = num_rows*num_cols
plt.figure(figsize=(2*2*num_cols, 2*num_rows))
for i in range(num_images):
    plt.subplot(num_rows, 2*num_cols, 2*i+1)
    plot_image(i, predictions, y_test, x_test)
    plt.subplot(num_rows, 2*num_cols, 2*i+2)
    plot_value_array(i, predictions, y_test)
plt.tight_layout()
plt.show()

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step


In [14]:
# Plot model architecture
try:
    from tensorflow.keras.utils import plot_model
    from IPython.display import Image, display

    # Requires the pydot package and the Graphviz binaries to be installed
    plot_model(fashion_model, to_file='fashion_model.png', show_shapes=True, show_layer_names=True)
    # display() is needed because a bare expression inside a try block is not rendered
    display(Image('fashion_model.png'))
except Exception as e:
    print("To visualize the model, you need to install pydot and graphviz.")
    print(f"Error: {e}")

In [15]:
# Reshape data for CNN
x_train_cnn = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test_cnn = x_test.reshape(x_test.shape[0], 28, 28, 1)

In [16]:
# Create a CNN model
cnn_model = keras.Sequential([
    keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(64, (3, 3), activation='relu'),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation='softmax')
])

cnn_model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

cnn_model.summary()

/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)

Model: "sequential_3"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d_2 (Conv2D) │ (None, 26, 26, 32) │ 320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_2 (MaxPooling2D) │ (None, 13, 13, 32) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_3 (Conv2D) │ (None, 11, 11, 64) │ 18,496 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_3 (MaxPooling2D) │ (None, 5, 5, 64) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten_1 (Flatten) │ (None, 1600) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_14 (Dense) │ (None, 128) │ 204,928 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_3 (Dropout) │ (None, 128) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_15 (Dense) │ (None, 10) │ 1,290 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 225,034 (879.04 KB)

Trainable params: 225,034 (879.04 KB)

Non-trainable params: 0 (0.00 B)

In [ ]:
# Train the CNN model with early stopping and learning rate reduction
early_stopping = keras.callbacks.EarlyStopping(
    monitor='val_loss',
    patience=3,
    restore_best_weights=True
)

# Halve the learning rate when val_loss has not improved for 2 epochs
lr_scheduler = keras.callbacks.ReduceLROnPlateau(
    monitor='val_loss',
    factor=0.5,
    patience=2
)

cnn_history = cnn_model.fit(
    x_train_cnn, y_train,
    epochs=15,
    batch_size=64,
    validation_split=0.2,
    callbacks=[early_stopping, lr_scheduler],
    verbose=1
)

In [18]:
# Evaluate the CNN model
cnn_test_loss, cnn_test_acc = cnn_model.evaluate(x_test_cnn, y_test, verbose=2)
print(f'\n CNN Test accuracy: {cnn_test_acc:.4f}')

313/313 - 1s - 4ms/step - accuracy: 0.9096 - loss: 0.2638


CNN Test accuracy: 0.9096

In [19]:

# Compare the performance of MLP and CNN


plt.figure(figsize=(10, 5))
plt.bar(['MLP Model', 'CNN Model'], [test_acc, cnn_test_acc], color=['blue', 'green'])
plt.title('Accuracy Comparison: MLP vs CNN')
plt.ylabel('Test Accuracy')
plt.ylim([0, 1])
for i, v in enumerate([test_acc, cnn_test_acc]):
plt.text(i, v + 0.01, f"{v:.4f} ", ha='center')
plt.show()

In [21]:

# Example of Transfer Learning with MobileNetV2
# Note: Fashion MNIST images are grayscale and small (28x28),
# but this is for demonstration of the approach

# Function to preprocess images for MobileNetV2
def preprocess_images_for_mobilenet(images, target_size=(96, 96)):
    # Create an array to hold the preprocessed images
    # The shape will be (batch_size, target_height, target_width, 3) for RGB
    preprocessed = np.zeros((images.shape[0], *target_size, 3))

    for i, img in enumerate(images):
        # Add channel dimension to grayscale image (28x28 -> 28x28x1)
        img_with_channel = tf.expand_dims(img, -1)
        # Resize the 3D grayscale image to the target size (96x96x1)
        resized = tf.image.resize(img_with_channel, target_size)
        # Convert the resized grayscale image (now 96x96x1) to RGB (96x96x3)
        preprocessed[i] = tf.image.grayscale_to_rgb(resized)

    # Note: x_train was scaled to [0, 1] earlier, whereas the ImageNet weights of
    # MobileNetV2 expect inputs in [-1, 1] (see keras.applications.mobilenet_v2.preprocess_input).
    # Rescale accordingly before actually training this model.
    return preprocessed

# Prepare a small subset for demonstration
sample_size = 100  # Small sample for demonstration
x_sample = preprocess_images_for_mobilenet(x_train[:sample_size])
y_sample = y_train[:sample_size]

# Load pre-trained MobileNetV2
# Ensure input_shape matches the target_size and channels used in preprocessing
base_model = keras.applications.MobileNetV2(
    input_shape=(96, 96, 3),
    include_top=False,
    weights='imagenet'
)

# Freeze the base model
base_model.trainable = False

# Create a new model on top
inputs = keras.Input(shape=(96, 96, 3))
x = base_model(inputs, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
x = keras.layers.Dense(128, activation='relu')(x)
x = keras.layers.Dropout(0.5)(x)
outputs = keras.layers.Dense(10, activation='softmax')(x)
transfer_model = keras.Model(inputs, outputs)

# Compile the model
transfer_model.compile(
    optimizer=keras.optimizers.Adam(1e-4),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

print("Transfer Learning Model:")
transfer_model.summary()

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_96_no_top.h5
9406464/9406464 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Transfer Learning Model:

Model: "functional_5"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ input_layer_6 (InputLayer) │ (None, 96, 96, 3) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ mobilenetv2_1.00_96 │ (None, 3, 3, 1280) │ 2,257,984 │
│ (Functional) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ global_average_pooling2d │ (None, 1280) │ 0 │
│ (GlobalAveragePooling2D) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_16 (Dense) │ (None, 128) │ 163,968 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_4 (Dropout) │ (None, 128) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_17 (Dense) │ (None, 10) │ 1,290 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 2,423,242 (9.24 MB)

Trainable params: 165,258 (645.54 KB)

Non-trainable params: 2,257,984 (8.61 MB)

In [24]:

# Save the full CNN model in the native Keras format (.keras extension, recommended)
cnn_model.save('fashion_mnist_cnn.keras')

# Save only the weights (the filename must end with .weights.h5)
cnn_model.save_weights('fashion_mnist_cnn_weights.weights.h5')

# Save in the legacy HDF5 format (this is what triggers the warning in the output)
cnn_model.save('fashion_mnist_cnn.h5')

# Load a model (specify the path to the .keras file)
loaded_model = keras.models.load_model('fashion_mnist_cnn.keras')

# Verify it works: the loaded model should reproduce the original test accuracy
loaded_model_loss, loaded_model_acc = loaded_model.evaluate(x_test_cnn, y_test, verbose=2)
print(f'\nLoaded model test accuracy: {loaded_model_acc:.4f}')

WARNING:absl:You are saving your model as an HDF5 file via `model.save()` or
`keras.saving.save_model(model)`. This file format is considered legacy. We recommend
using instead the native Keras format, e.g. `model.save('my_model.keras')` or
`keras.saving.save_model(model, 'my_model.keras')`.

313/313 - 1s - 4ms/step - accuracy: 0.9096 - loss: 0.2638

Loaded model test accuracy: 0.9096

In [ ]:

# Convert to TensorFlow Lite
converter = tf.lite.TFLiteConverter.from_keras_model(cnn_model)
tflite_model = converter.convert()

# Save the TF Lite model
with open('fashion_mnist_model.tflite', 'wb') as f:
    f.write(tflite_model)

print(f"TFLite model size: {len(tflite_model) / 1024:.2f} KB")


Experiment 13: ANN for Customer Churn Prediction (use Telco
Churn Dataset or sample CSV)
In [1]:

# necessary imports

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import warnings
warnings.filterwarnings('ignore')

plt.style.use('fivethirtyeight')
%matplotlib inline

In [ ]:
df = pd.read_csv(r'C:\Users\Rishi\Churn_Modelling.csv')
df.head()
Out[ ]:

RowNumber CustomerId Surname CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsAc

0 1 15634602 Hargrave 619 France Female 42 2 0.00 1 1

1 2 15647311 Hill 608 Spain Female 41 1 83807.86 1 0

2 3 15619304 Onio 502 France Female 42 8 159660.80 3 1

3 4 15701354 Boni 699 France Female 39 1 0.00 2 0

4 5 15737888 Mitchell 850 Spain Female 43 2 125510.82 1 1

In [3]:
df.describe()

Out[3]:

RowNumber CustomerId CreditScore Age Tenure Balance NumOfProducts HasCrCard IsActive

count 10000.00000 1.000000e+04 10000.000000 10000.000000 10000.000000 10000.000000 10000.000000 10000.00000 10000

mean 5000.50000 1.569094e+07 650.528800 38.921800 5.012800 76485.889288 1.530200 0.70550

std 2886.89568 7.193619e+04 96.653299 10.487806 2.892174 62397.405202 0.581654 0.45584

min 1.00000 1.556570e+07 350.000000 18.000000 0.000000 0.000000 1.000000 0.00000

25% 2500.75000 1.562853e+07 584.000000 32.000000 3.000000 0.000000 1.000000 0.00000

50% 5000.50000 1.569074e+07 652.000000 37.000000 5.000000 97198.540000 1.000000 1.00000

75% 7500.25000 1.575323e+07 718.000000 44.000000 7.000000 127644.240000 2.000000 1.00000

max 10000.00000 1.581569e+07 850.000000 92.000000 10.000000 250898.090000 4.000000 1.00000

In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 RowNumber 10000 non-null int64
1 CustomerId 10000 non-null int64
2 Surname 10000 non-null object
3 CreditScore 10000 non-null int64
4 Geography 10000 non-null object
5 Gender 10000 non-null object
6 Age 10000 non-null int64
7 Tenure 10000 non-null int64
8 Balance 10000 non-null float64
9 NumOfProducts 10000 non-null int64
10 HasCrCard 10000 non-null int64
11 IsActiveMember 10000 non-null int64
12 EstimatedSalary 10000 non-null float64
13 Exited 10000 non-null int64
dtypes: float64(2), int64(9), object(3)
memory usage: 1.1+ MB

In [ ]:
# checking for null values

df.isna().sum() # no null values

In [6]:
values = df.Exited.value_counts()
labels = ['Not Exited', 'Exited']

fig, ax = plt.subplots(figsize = (4, 3), dpi = 100)

explode = (0, 0.09)

patches, texts, autotexts = ax.pie(values, labels = labels, autopct = '%1.2f%%', shadow = True,
                                   startangle = 90, explode = explode)

plt.setp(texts, color = 'grey')
plt.setp(autotexts, size = 8, color = 'white')
autotexts[1].set_color('black')
plt.show()

In [7]:
# visualizing categorical variables

fig, ax = plt.subplots(3, 2, figsize = (18, 15))

# recent seaborn versions require the column to be passed by keyword (x = ...)
sns.countplot(x = 'Geography', hue = 'Exited', data = df, ax = ax[0][0])
sns.countplot(x = 'Gender', hue = 'Exited', data = df, ax = ax[0][1])
sns.countplot(x = 'Tenure', hue = 'Exited', data = df, ax = ax[1][0])
sns.countplot(x = 'NumOfProducts', hue = 'Exited', data = df, ax = ax[1][1])
sns.countplot(x = 'HasCrCard', hue = 'Exited', data = df, ax = ax[2][0])
sns.countplot(x = 'IsActiveMember', hue = 'Exited', data = df, ax = ax[2][1])

plt.tight_layout()
plt.show()

In [8]:
# visualizing continuous variables

fig, ax = plt.subplots(2, 2, figsize = (16, 10))

sns.boxplot(x = 'Exited', y = 'CreditScore', data = df, ax = ax[0][0])


sns.boxplot(x = 'Exited', y = 'Age', data = df, ax = ax[0][1])
sns.boxplot(x = 'Exited', y = 'Balance', data = df, ax = ax[1][0])
sns.boxplot(x = 'Exited', y = 'EstimatedSalary', data = df, ax = ax[1][1])

plt.tight_layout()
plt.show()
In [9]:
# heatmap

plt.figure(figsize = (20, 12))

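# Surname, Geography and Gender are still strings here, so restrict to numeric columns
corr = df.corr(numeric_only = True)

sns.heatmap(corr, linewidths = 1, annot = True, fmt = ".2f")
plt.show()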

In [10]:

# dropping identifier columns that carry no predictive information

df.drop(columns = ['RowNumber', 'CustomerId', 'Surname'], inplace = True)
df.head()
Out[10]:

CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited

0 619 France Female 42 2 0.00 1 1 1 101348.88

1 608 Spain Female 41 1 83807.86 1 0 1 112542.58

2 502 France Female 42 8 159660.80 3 1 0 113931.57

3 699 France Female 39 1 0.00 2 0 0 93826.63

4 850 Spain Female 43 2 125510.82 1 1 1 79084.10


In [11]:
df.Geography.value_counts()
Out[11]:

France 5014
Germany 2509
Spain 2477
Name: Geography, dtype: int64

In [12]:
# Encoding categorical variables

df['Geography'] = df['Geography'].map({'France' : 0, 'Germany' : 1, 'Spain' : 2})


df['Gender'] = df['Gender'].map({'Male' : 0, 'Female' : 1})

In [13]:
df.head()
Out[13]:

CreditScore Geography Gender Age Tenure Balance NumOfProducts HasCrCard IsActiveMember EstimatedSalary Exited

0 619 0 1 42 2 0.00 1 1 1 101348.88

1 608 2 1 41 1 83807.86 1 0 1 112542.58

2 502 0 1 42 8 159660.80 3 1 0 113931.57

3 699 0 1 39 1 0.00 2 0 0 93826.63

4 850 2 1 43 2 125510.82 1 1 1 79084.10

In [14]:
# creating features and label

from tensorflow.keras.utils import to_categorical

X = df.drop('Exited', axis = 1)
y = to_categorical(df.Exited)
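`to_categorical` turns each 0/1 label into a two-element one-hot vector, which is why the network further down ends in a 2-unit output layer. A minimal plain-Python sketch of that mapping follows; `one_hot` is a hypothetical stand-in for illustration, and the three labels are made-up values, not rows from the dataset.

```python
def one_hot(labels, num_classes):
    """Return one row per label with a 1.0 in the position of the class index."""
    return [[1.0 if k == y else 0.0 for k in range(num_classes)] for y in labels]


# Made-up labels: 1 = exited, 0 = stayed
encoded = one_hot([1, 0, 1], num_classes=2)
for row in encoded:
    print(row)
```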

In [15]:
# splitting data into training set and test set

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25)

In [16]:
# Scaling data

from sklearn.preprocessing import StandardScaler

sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
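What `fit_transform` and `transform` do per feature can be sketched without scikit-learn: the mean and standard deviation are estimated on the training data only, then reused unchanged for the test data. The sketch below uses the five `CreditScore` values printed by `df.head()` as a toy training set; the test value 650 and the helper `standardize` are assumptions made for illustration.

```python
from statistics import fmean, pstdev

# CreditScore values from the first five rows shown by df.head()
train_values = [619, 608, 502, 699, 850]

mu = fmean(train_values)
sigma = pstdev(train_values)  # population std (ddof = 0), as StandardScaler uses


def standardize(x):
    """Scale a raw value with the statistics learned from the training data."""
    return (x - mu) / sigma


scaled_train = [standardize(v) for v in train_values]

# A hypothetical unseen test value is scaled with the training mean and std
scaled_test = standardize(650)

print(f"mean = {mu:.1f}, std = {sigma:.2f}")
print([round(v, 3) for v in scaled_train], round(scaled_test, 3))
```

Fitting the scaler on the training split only keeps information about the test split out of the preprocessing step.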

In [ ]:
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import BatchNormalization

# initializing ann
model = Sequential()

# adding the input layer and the first hidden layer
model.add(Dense(10, kernel_initializer = 'normal', activation = 'relu', input_shape = (10, )))

# adding dropout and batch normalization layers
model.add(Dropout(rate = 0.1))
model.add(BatchNormalization())

# adding the second hidden layer
model.add(Dense(7, kernel_initializer = 'normal', activation = 'relu'))

# adding dropout and batch normalization layers
model.add(Dropout(rate = 0.1))
model.add(BatchNormalization())

# adding the output layer (two units because y is one-hot encoded)
model.add(Dense(2, kernel_initializer = 'normal', activation = 'sigmoid'))

# compiling the model
model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

# fitting the model to the training set; validation_data overrides validation_split
# in Keras, so only validation_data is passed
model_history = model.fit(X_train, y_train, validation_data = (X_test, y_test), epochs = 100)

In [18]:
plt.figure(figsize = (12, 6))

train_loss = model_history.history['loss']
val_loss = model_history.history['val_loss']
epoch = list(range(1, 101))
# recent seaborn versions require x and y to be passed by keyword
sns.lineplot(x = epoch, y = train_loss, label = 'Training Loss')
sns.lineplot(x = epoch, y = val_loss, label = 'Validation Loss')
plt.title('Training and Validation Loss\n')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

In [19]:
plt.figure(figsize = (12, 6))
train_acc = model_history.history['accuracy']
val_acc = model_history.history['val_accuracy']
epoch = list(range(1, 101))
# recent seaborn versions require x and y to be passed by keyword
sns.lineplot(x = epoch, y = train_acc, label = 'Training accuracy')
sns.lineplot(x = epoch, y = val_acc, label = 'Validation accuracy')
plt.title('Training and Validation Accuracy\n')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

In [20]:
acc = model.evaluate(X_test, y_test)[1]

print(f'Accuracy of model is {acc}')

79/79 [==============================] - 0s 872us/step - loss: 0.3380 - accuracy: 0.8656


Accuracy of model is 0.8655999898910522
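The `79/79` in the progress bar is the number of batches, not the number of samples. A small sketch of that arithmetic, assuming the Keras default `batch_size=32` for `evaluate` and the 25% test split of the 10,000 rows used above; `steps_per_pass` is a hypothetical helper name.

```python
import math


def steps_per_pass(n_samples, batch_size=32):
    """Number of batches needed to cover n_samples once (last batch may be partial)."""
    return math.ceil(n_samples / batch_size)


n_test = int(10000 * 0.25)              # rows held out by train_test_split
print(steps_per_pass(n_test))           # churn test set

# Assumption: a 10,000-sample test set, as in the earlier "313/313" evaluation
print(steps_per_pass(10000))
```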

In [21]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 10) 110
_________________________________________________________________
dropout (Dropout) (None, 10) 0
_________________________________________________________________
batch_normalization (BatchNo (None, 10) 40
_________________________________________________________________
dense_1 (Dense) (None, 7) 77
_________________________________________________________________
dropout_1 (Dropout) (None, 7) 0
_________________________________________________________________
batch_normalization_1 (Batch (None, 7) 28
_________________________________________________________________
dense_2 (Dense) (None, 2) 16
=================================================================
Total params: 271
Trainable params: 237
Non-trainable params: 34
_________________________________________________________________
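The split between trainable and non-trainable parameters comes from the two BatchNormalization layers: each keeps four values per feature, of which the scale and shift are learned and the moving mean and variance are only updated as running statistics. A minimal sketch that reproduces the totals, with helper names made up for illustration:

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


def batchnorm_params(n_features):
    """Return (trainable, non_trainable) counts for a BatchNormalization layer."""
    trainable = 2 * n_features       # gamma (scale) and beta (shift)
    non_trainable = 2 * n_features   # moving mean and moving variance
    return trainable, non_trainable


# (layer name, trainable params, non-trainable params); Dropout has no parameters
layer_counts = [
    ("dense", dense_params(10, 10), 0),
    ("batch_normalization", *batchnorm_params(10)),
    ("dense_1", dense_params(10, 7), 0),
    ("batch_normalization_1", *batchnorm_params(7)),
    ("dense_2", dense_params(7, 2), 0),
]

trainable = sum(t for _, t, _ in layer_counts)
non_trainable = sum(n for _, _, n in layer_counts)
print(trainable + non_trainable, trainable, non_trainable)
```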
In [22]:
from tensorflow.keras.utils import plot_model

plot_model(model, show_shapes = True)


Out[22]:
Experiment 14: Introduction to Generative Adversarial
Networks (GANs) with a simple GAN model
In [ ]:

# Import necessary libraries


import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import time
import os

# Check TensorFlow version


print(f"TensorFlow version: {tf.__version__}")

# Set random seeds for reproducibility


np.random.seed(42)
tf.random.set_seed(42)

TensorFlow version: 2.18.0

In [ ]:

# Check if GPU is available


print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
print("Devices: ", tf.config.list_physical_devices())

# If GPU is available, configure for performance
if len(tf.config.list_physical_devices('GPU')) > 0:
    # Allow memory growth
    for gpu in tf.config.list_physical_devices('GPU'):
        tf.config.experimental.set_memory_growth(gpu, True)
    print("GPU memory growth enabled")

Num GPUs Available: 1
Devices: [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
GPU memory growth enabled

In [ ]:

# Load MNIST dataset


(x_train, _), (_, _) = keras.datasets.mnist.load_data()

# Normalize the images to [-1, 1]


x_train = (x_train.astype('float32') - 127.5) / 127.5
# Add a channel dimension
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)

# Create a TensorFlow dataset
BUFFER_SIZE = 60000 # For shuffling the data
BATCH_SIZE = 256

train_dataset = tf.data.Dataset.from_tensor_slices(x_train).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step
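The preprocessing above maps 8-bit pixels to [-1, 1] so that real images share the range of a `tanh` output, and the plots undo it with `* 0.5 + 0.5`. A short plain-Python sketch of both mappings and of the resulting number of batches per epoch; the function names are illustrative only.

```python
import math


def to_tanh_range(pixel):
    """Map an 8-bit pixel value (0..255) to [-1, 1]."""
    return (pixel - 127.5) / 127.5


def to_display_range(value):
    """Map a value in [-1, 1] back to [0, 1] for plotting."""
    return value * 0.5 + 0.5


for pixel in (0, 255):
    scaled = to_tanh_range(pixel)
    print(pixel, scaled, to_display_range(scaled))

# 60,000 training images in batches of 256
full_batches, remainder = divmod(60000, 256)
batches_per_epoch = math.ceil(60000 / 256)
print(batches_per_epoch, remainder)
```

The last batch of each epoch is therefore smaller than `BATCH_SIZE`.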

In [ ]:
# Visualize some samples from the dataset
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i+1)
    # Rescale to [0, 1]
    plt.imshow(x_train[i, :, :, 0] * 0.5 + 0.5, cmap='gray')
    plt.axis('off')
plt.tight_layout()
plt.show()

In [ ]:
# Define the random noise dimension
NOISE_DIM = 100

# Build the generator model
def build_generator():
    model = keras.Sequential()

    # First, transform the input into a small spatial extent with many channels
    model.add(layers.Dense(7 * 7 * 256, use_bias=False, input_shape=(NOISE_DIM,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU(alpha=0.2))

    # Reshape into a 3D tensor
    model.add(layers.Reshape((7, 7, 256)))

    # Upsampling layers
    model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU(alpha=0.2))

    model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU(alpha=0.2))

    # Final layer with tanh activation to generate images with pixel values in [-1, 1]
    model.add(layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False,
                                     activation='tanh'))

    return model

# Create the generator


generator = build_generator()

# Test the generator with random noise


noise = tf.random.normal([1, NOISE_DIM])
generated_image = generator(noise, training=False)

# Print model summary


generator.summary()

# Visualize a generated image


plt.figure(figsize=(4, 4))
plt.imshow(generated_image[0, :, :, 0] * 0.5 + 0.5, cmap='gray')
plt.title('Generated Image (Initial Random Weights)')
plt.axis('off')
plt.show()

/usr/local/lib/python3.11/dist-packages/keras/src/layers/core/dense.py:87: UserWarning:
Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models,
prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
/usr/local/lib/python3.11/dist-packages/keras/src/layers/activations/leaky_relu.py:41:
UserWarning: Argument `alpha` is deprecated. Use `negative_slope` instead.
  warnings.warn(

Model: "sequential"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense) │ (None, 12544) │ 1,254,400 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization │ (None, 12544) │ 50,176 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu (LeakyReLU) │ (None, 12544) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ reshape (Reshape) │ (None, 7, 7, 256) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_transpose │ (None, 7, 7, 128) │ 819,200 │
│ (Conv2DTranspose) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_1 │ (None, 7, 7, 128) │ 512 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu_1 (LeakyReLU) │ (None, 7, 7, 128) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_transpose_1 │ (None, 14, 14, 64) │ 204,800 │
│ (Conv2DTranspose) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization_2 │ (None, 14, 14, 64) │ 256 │
│ (BatchNormalization) │ │ │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu_2 (LeakyReLU) │ (None, 14, 14, 64) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_transpose_2 │ (None, 28, 28, 1) │ 1,600 │
│ (Conv2DTranspose) │ │ │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 2,330,944 (8.89 MB)

Trainable params: 2,305,472 (8.79 MB)

Non-trainable params: 25,472 (99.50 KB)

In [ ]:
# Build the discriminator model
def build_discriminator():
    model = keras.Sequential()

    # First convolutional layer
    model.add(layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same', input_shape=[28, 28, 1]))
    model.add(layers.LeakyReLU(alpha=0.2))
    model.add(layers.Dropout(0.3))

    # Second convolutional layer
    model.add(layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'))
    model.add(layers.LeakyReLU(alpha=0.2))
    model.add(layers.Dropout(0.3))

    # Flatten and output layer (a single raw logit, no activation)
    model.add(layers.Flatten())
    model.add(layers.Dense(1))

    return model

# Create the discriminator


discriminator = build_discriminator()

# Test the discriminator with a real image


decision = discriminator(tf.expand_dims(x_train[0], 0))
print(f"Discriminator output for a real image: {decision.numpy()[0][0]}")

# Print model summary


discriminator.summary()

/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107:
UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential
models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)

Discriminator output for a real image: 0.053151972591876984

Model: "sequential_1"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D) │ (None, 14, 14, 64) │ 1,664 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu_3 (LeakyReLU) │ (None, 14, 14, 64) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout) │ (None, 14, 14, 64) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_1 (Conv2D) │ (None, 7, 7, 128) │ 204,928 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ leaky_re_lu_4 (LeakyReLU) │ (None, 7, 7, 128) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_1 (Dropout) │ (None, 7, 7, 128) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten) │ (None, 6272) │ 0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense) │ (None, 1) │ 6,273 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

Total params: 212,865 (831.50 KB)

Trainable params: 212,865 (831.50 KB)

Non-trainable params: 0 (0.00 B)
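The output shapes in the generator and discriminator summaries follow from the strides alone, because every layer uses `padding='same'`. A minimal sketch of the two rules, assuming no explicit `output_padding` on the transposed convolutions; the helper names are made up for illustration.

```python
import math


def conv_same(size, stride):
    """Spatial size after Conv2D with padding='same'."""
    return math.ceil(size / stride)


def conv_transpose_same(size, stride):
    """Spatial size after Conv2DTranspose with padding='same'."""
    return size * stride


# Generator: strides 1, 2, 2 applied to the 7x7 reshaped tensor
generator_sizes = [7]
for stride in (1, 2, 2):
    generator_sizes.append(conv_transpose_same(generator_sizes[-1], stride))

# Discriminator: strides 2, 2 applied to the 28x28 input image
discriminator_sizes = [28]
for stride in (2, 2):
    discriminator_sizes.append(conv_same(discriminator_sizes[-1], stride))

# Flatten the last 7x7 feature map with 128 channels, then Dense(1)
flat_features = discriminator_sizes[-1] ** 2 * 128
final_dense_params = flat_features * 1 + 1

print(generator_sizes, discriminator_sizes, flat_features, final_dense_params)
```

The generator upsamples 7 to 28 in exactly the reverse order in which the discriminator downsamples 28 to 7.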

In [ ]:

# Define the loss functions (the discriminator outputs raw logits)
cross_entropy = keras.losses.BinaryCrossentropy(from_logits=True)

# Define the discriminator loss
def discriminator_loss(real_output, fake_output):
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    total_loss = real_loss + fake_loss
    return total_loss

# Define the generator loss
def generator_loss(fake_output):
    return cross_entropy(tf.ones_like(fake_output), fake_output)

# Define the optimizers


generator_optimizer = keras.optimizers.Adam(1e-4)
discriminator_optimizer = keras.optimizers.Adam(1e-4)
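To make the two losses concrete, here is a per-example sketch in plain Python that mirrors the definitions above without TensorFlow. It assumes scalar logits instead of batches, and `bce_from_logit` is a hypothetical helper that uses the standard numerically stable form of binary cross-entropy on logits.

```python
import math


def bce_from_logit(label, logit):
    """Binary cross-entropy for one example, computed from a raw logit.

    Equal to -[y*log(sigmoid(z)) + (1-y)*log(1-sigmoid(z))], in a stable form.
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def discriminator_loss(real_logit, fake_logit):
    # Real images should be labelled 1, generated images 0
    return bce_from_logit(1.0, real_logit) + bce_from_logit(0.0, fake_logit)


def generator_loss(fake_logit):
    # The generator wants its images to be labelled 1
    return bce_from_logit(1.0, fake_logit)


# An undecided discriminator outputs logit 0 (probability 0.5) for every image
print(f"generator loss:     {generator_loss(0.0):.4f}")
print(f"discriminator loss: {discriminator_loss(0.0, 0.0):.4f}")
```

With an undecided discriminator the generator loss is ln 2 (about 0.693) and the discriminator loss is 2 ln 2 (about 1.386). These values are a useful reference point when reading GAN loss curves, since neither loss is expected to go to zero.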

In [ ]:
# Define a function to generate and save images
def generate_and_save_images(model, epoch, test_input):
    # Generate images
    predictions = model(test_input, training=False)

    # Plot the generated images and save them to disk
    plt.figure(figsize=(10, 10))
    for i in range(test_input.shape[0]):
        plt.subplot(5, 5, i+1)
        plt.imshow(predictions[i, :, :, 0] * 0.5 + 0.5, cmap='gray')
        plt.axis('off')
    plt.tight_layout()
    plt.savefig(f'gan_epoch_{epoch:04d}.png')
    plt.close()

    # In a Jupyter notebook, display the latest images
    if epoch % 10 == 0 or epoch == 1:
        plt.figure(figsize=(10, 10))
        for i in range(test_input.shape[0]):
            plt.subplot(5, 5, i+1)
            plt.imshow(predictions[i, :, :, 0] * 0.5 + 0.5, cmap='gray')
            plt.axis('off')
        plt.suptitle(f'Epoch {epoch}')
        plt.tight_layout()
        plt.show()

In [ ]:
# Define the training step
@tf.function
def train_step(images):
    # Generate random noise for the generator
    noise = tf.random.normal([BATCH_SIZE, NOISE_DIM])

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        # Generate fake images
        generated_images = generator(noise, training=True)

        # Get discriminator outputs for real and fake images
        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        # Calculate losses
        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    # Calculate gradients
    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    # Apply gradients
    generator_optimizer.apply_gradients(
        zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(
        zip(gradients_of_discriminator, discriminator.trainable_variables))

    return gen_loss, disc_loss

In [ ]:
# Define the training function
def train(dataset, epochs):
    # Create a fixed noise vector for visualization
    seed = tf.random.normal([25, NOISE_DIM])

    # Track losses
    gen_losses = []
    disc_losses = []

    # Training loop
    for epoch in range(1, epochs + 1):
        start_time = time.time()

        # Lists to store batch losses
        batch_gen_losses = []
        batch_disc_losses = []

        # Train on batches
        for image_batch in dataset:
            gen_loss, disc_loss = train_step(image_batch)
            batch_gen_losses.append(gen_loss)
            batch_disc_losses.append(disc_loss)

        # Calculate average losses for the epoch
        avg_gen_loss = tf.reduce_mean(batch_gen_losses)
        avg_disc_loss = tf.reduce_mean(batch_disc_losses)
        gen_losses.append(avg_gen_loss.numpy())
        disc_losses.append(avg_disc_loss.numpy())

        # Print progress
        print(f"Epoch {epoch}/{epochs}, "
              f"Generator Loss: {avg_gen_loss:.4f}, "
              f"Discriminator Loss: {avg_disc_loss:.4f}, "
              f"Time: {time.time() - start_time:.2f} sec")

        # Generate and save images at the first epoch, every 10 epochs, and at the end
        if epoch % 10 == 0 or epoch == 1 or epoch == epochs:
            generate_and_save_images(generator, epoch, seed)

    # Plot the loss curves
    plt.figure(figsize=(12, 5))
    plt.subplot(1, 2, 1)
    plt.plot(range(1, epochs + 1), gen_losses, label='Generator')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.title('Generator Loss')
    plt.grid(True)

    plt.subplot(1, 2, 2)
    plt.plot(range(1, epochs + 1), disc_losses, label='Discriminator')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.title('Discriminator Loss')
    plt.grid(True)

    plt.tight_layout()
    plt.show()

    return gen_losses, disc_losses

In [ ]:
# Define the number of epochs
EPOCHS = 50

# Train the GAN


print("Starting GAN training...")
gen_losses, disc_losses = train(train_dataset, EPOCHS)
print("GAN training completed!")

Starting GAN training...


Epoch 1/50, Generator Loss: 0.7640, Discriminator Loss: 1.0672, Time: 19.14 sec
Epoch 2/50, Generator Loss: 1.1070, Discriminator Loss: 0.9280, Time: 10.88 sec
Epoch 3/50, Generator Loss: 0.9801, Discriminator Loss: 1.1674, Time: 11.02 sec
Epoch 4/50, Generator Loss: 0.9755, Discriminator Loss: 1.1712, Time: 11.22 sec
Epoch 5/50, Generator Loss: 0.8822, Discriminator Loss: 1.2903, Time: 11.18 sec
Epoch 6/50, Generator Loss: 0.9582, Discriminator Loss: 1.1504, Time: 11.28 sec
Epoch 7/50, Generator Loss: 0.9361, Discriminator Loss: 1.2384, Time: 11.30 sec
Epoch 8/50, Generator Loss: 0.9839, Discriminator Loss: 1.1471, Time: 11.37 sec
Epoch 9/50, Generator Loss: 1.0402, Discriminator Loss: 1.1383, Time: 11.45 sec
Epoch 10/50, Generator Loss: 1.0986, Discriminator Loss: 1.1012, Time: 11.53 sec
Epoch 11/50, Generator Loss: 1.0813, Discriminator Loss: 1.1047, Time: 11.59 sec
Epoch 12/50, Generator Loss: 1.0634, Discriminator Loss: 1.1182, Time: 11.64 sec
Epoch 13/50, Generator Loss: 1.2573, Discriminator Loss: 1.0079, Time: 11.70 sec
Epoch 14/50, Generator Loss: 1.1528, Discriminator Loss: 1.0650, Time: 11.77 sec
Epoch 15/50, Generator Loss: 1.2226, Discriminator Loss: 0.9991, Time: 11.80 sec
Epoch 16/50, Generator Loss: 1.2214, Discriminator Loss: 1.0136, Time: 11.86 sec
Epoch 17/50, Generator Loss: 1.1788, Discriminator Loss: 1.0673, Time: 11.92 sec
Epoch 18/50, Generator Loss: 1.1950, Discriminator Loss: 1.0888, Time: 11.90 sec
Epoch 19/50, Generator Loss: 1.2432, Discriminator Loss: 1.0234, Time: 11.91 sec
Epoch 20/50, Generator Loss: 1.1656, Discriminator Loss: 1.1076, Time: 11.97 sec

Epoch 21/50, Generator Loss: 1.1547, Discriminator Loss: 1.0748, Time: 12.01 sec
Epoch 22/50, Generator Loss: 1.0937, Discriminator Loss: 1.1145, Time: 12.05 sec
Epoch 23/50, Generator Loss: 1.1250, Discriminator Loss: 1.1247, Time: 12.06 sec
Epoch 24/50, Generator Loss: 1.0968, Discriminator Loss: 1.1338, Time: 12.09 sec
Epoch 25/50, Generator Loss: 1.0754, Discriminator Loss: 1.1494, Time: 12.14 sec
Epoch 26/50, Generator Loss: 1.0967, Discriminator Loss: 1.1201, Time: 12.14 sec
Epoch 27/50, Generator Loss: 1.0493, Discriminator Loss: 1.1688, Time: 12.10 sec
Epoch 28/50, Generator Loss: 1.0357, Discriminator Loss: 1.1652, Time: 12.09 sec
Epoch 29/50, Generator Loss: 1.0498, Discriminator Loss: 1.1800, Time: 12.08 sec
Epoch 30/50, Generator Loss: 1.0252, Discriminator Loss: 1.1832, Time: 12.08 sec

Epoch 31/50, Generator Loss: 1.0058, Discriminator Loss: 1.1969, Time: 12.10 sec
Epoch 32/50, Generator Loss: 1.0144, Discriminator Loss: 1.1886, Time: 12.16 sec
Epoch 33/50, Generator Loss: 0.9977, Discriminator Loss: 1.1890, Time: 12.19 sec
Epoch 34/50, Generator Loss: 0.9557, Discriminator Loss: 1.2099, Time: 12.20 sec
Epoch 35/50, Generator Loss: 0.9276, Discriminator Loss: 1.2330, Time: 12.20 sec
Epoch 36/50, Generator Loss: 0.9567, Discriminator Loss: 1.2123, Time: 12.18 sec
Epoch 37/50, Generator Loss: 0.9403, Discriminator Loss: 1.2133, Time: 12.17 sec
Epoch 38/50, Generator Loss: 0.9853, Discriminator Loss: 1.2095, Time: 12.18 sec
Epoch 39/50, Generator Loss: 0.9811, Discriminator Loss: 1.2113, Time: 12.20 sec
Epoch 40/50, Generator Loss: 0.9665, Discriminator Loss: 1.2082, Time: 12.20 sec
Epoch 41/50, Generator Loss: 0.9887, Discriminator Loss: 1.1944, Time: 12.14 sec
Epoch 42/50, Generator Loss: 0.9751, Discriminator Loss: 1.2128, Time: 12.14 sec
Epoch 43/50, Generator Loss: 0.9558, Discriminator Loss: 1.2182, Time: 12.15 sec
Epoch 44/50, Generator Loss: 0.9504, Discriminator Loss: 1.2224, Time: 12.14 sec
Epoch 45/50, Generator Loss: 0.9616, Discriminator Loss: 1.2101, Time: 12.17 sec
Epoch 46/50, Generator Loss: 0.9443, Discriminator Loss: 1.2246, Time: 12.16 sec
Epoch 47/50, Generator Loss: 0.9527, Discriminator Loss: 1.2142, Time: 12.15 sec
Epoch 48/50, Generator Loss: 0.9618, Discriminator Loss: 1.2021, Time: 12.14 sec
Epoch 49/50, Generator Loss: 0.9717, Discriminator Loss: 1.2104, Time: 12.11 sec
Epoch 50/50, Generator Loss: 0.9633, Discriminator Loss: 1.2100, Time: 12.11 sec
GAN training completed!

In [ ]:

# Generate a larger set of images


num_examples = 100
random_noise = tf.random.normal([num_examples, NOISE_DIM])
generated_images = generator(random_noise, training=False)

# Convert to appropriate range for visualization


generated_images = generated_images * 0.5 + 0.5

# Display a grid of generated images


rows = 10
cols = 10
fig, axes = plt.subplots(rows, cols, figsize=(15, 15))
fig.suptitle("Generated Digits", fontsize=20)

for i, ax in enumerate(axes.flatten()):
    if i < num_examples:
        ax.imshow(generated_images[i, :, :, 0], cmap='gray')
    ax.axis('off')

plt.tight_layout()
plt.subplots_adjust(top=0.95)
plt.show()
In [ ]:
# Create an animation of the generation process
# We can use the saved images from different epochs
try:
    import base64
    import glob

    import imageio.v2 as imageio
    from IPython.display import display, HTML

    # Get all the saved image files
    filenames = sorted(glob.glob('gan_epoch_*.png'))

    # Create a GIF animation
    with imageio.get_writer('gan_training.gif', mode='I', duration=0.5) as writer:
        for filename in filenames:
            image = imageio.imread(filename)
            writer.append_data(image)

    # Display the animation in the notebook by embedding the GIF bytes as base64
    print("GAN Training Animation:")
    with open('gan_training.gif', 'rb') as f:
        gif_b64 = base64.b64encode(f.read()).decode('ascii')
    display(HTML(f'<img src="data:image/gif;base64,{gif_b64}">'))
except Exception as e:
    print(f"Could not create animation: {e}")
    print("To create an animation, install the imageio package.")

In [ ]:

# Create two random noise vectors


start_vector = tf.random.normal([1, NOISE_DIM])
end_vector = tf.random.normal([1, NOISE_DIM])

# Generate images for interpolated vectors


num_steps = 10
alpha_values = np.linspace(0, 1, num_steps)
interpolated_images = []

for alpha in alpha_values:
    # Linear interpolation between start and end vectors
    interpolated_vector = start_vector * (1 - alpha) + end_vector * alpha
    # Generate an image
    interpolated_image = generator(interpolated_vector, training=False)
    interpolated_images.append(interpolated_image[0] * 0.5 + 0.5)

# Display the interpolated images


plt.figure(figsize=(15, 3))
for i, img in enumerate(interpolated_images):
plt.subplot(1, num_steps, i+1)
plt.imshow(img[:, :, 0], cmap='gray')
plt.axis('off')
plt.suptitle("Latent Space Interpolation", fontsize=16)
plt.tight_layout()
plt.show()
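The interpolation step itself is independent of the generator. The following plain-Python sketch shows the same linear blend on two short made-up vectors instead of 100-dimensional noise; `lerp` is a hypothetical helper, and the `alphas` list approximates what `np.linspace(0, 1, num_steps)` produces.

```python
def lerp(start, end, alpha):
    """Linear interpolation: alpha = 0 returns start, alpha = 1 returns end."""
    return [s * (1 - alpha) + e * alpha for s, e in zip(start, end)]


# Made-up 2-D stand-ins for the latent vectors
start_vector = [0.0, 2.0]
end_vector = [1.0, 4.0]

num_steps = 10
alphas = [i / (num_steps - 1) for i in range(num_steps)]

path = [lerp(start_vector, end_vector, a) for a in alphas]
print(path[0], path[-1])
print(lerp(start_vector, end_vector, 0.5))
```

Each point on the path would be fed to the generator in turn, so neighbouring images differ only by a small step in latent space.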

In [ ]:
# Load MNIST dataset with labels
(x_train, y_train), (_, _) = keras.datasets.mnist.load_data()

# Normalize the images to [-1, 1]


x_train = (x_train.astype('float32') - 127.5) / 127.5
# Add a channel dimension
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)

# Create a TensorFlow dataset with both images and labels
train_dataset_with_labels = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset_with_labels = train_dataset_with_labels.shuffle(BUFFER_SIZE).batch(BATCH_SIZE)

In [ ]:

# Build the conditional generator model


def build_conditional_generator():
    # Input for noise vector
    noise_input = layers.Input(shape=(NOISE_DIM,))

    # Input for label (condition)
    label_input = layers.Input(shape=(1,))

    # Embedding layer for the label
    label_embedding = layers.Embedding(10, 50)(label_input)
    label_embedding = layers.Flatten()(label_embedding)

    # Combine noise and label
    combined_input = layers.Concatenate()([noise_input, label_embedding])

    # First dense layer
    x = layers.Dense(7 * 7 * 256, use_bias=False)(combined_input)
    x = layers.BatchNormalization()(x)
    x = layers.LeakyReLU(alpha=0.2)(x)

    # Reshape into 3D tensor
    x = layers.Reshape((7, 7, 256))(x)

    # Upsampling layers
    x = layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.LeakyReLU(alpha=0.2)(x)

    x = layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.LeakyReLU(alpha=0.2)(x)

    # Final layer with tanh activation
    output = layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False,
                                    activation='tanh')(x)

    model = keras.Model([noise_input, label_input], output)
    return model

# Build the conditional discriminator model


def build_conditional_discriminator():
    # Input for image
    image_input = layers.Input(shape=(28, 28, 1))

    # Input for label (condition)
    label_input = layers.Input(shape=(1,))

    # Embedding layer for the label
    label_embedding = layers.Embedding(10, 50)(label_input)
    label_embedding = layers.Flatten()(label_embedding)

    # Project and reshape the label to an image-sized channel for concatenation
    label_embedding = layers.Dense(28 * 28)(label_embedding)
    label_embedding = layers.Reshape((28, 28, 1))(label_embedding)

    # Concatenate image and label along the channel axis
    combined_input = layers.Concatenate()([image_input, label_embedding])

    # Convolutional layers
    x = layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same')(combined_input)
    x = layers.LeakyReLU(alpha=0.2)(x)
    x = layers.Dropout(0.3)(x)

    x = layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same')(x)
    x = layers.LeakyReLU(alpha=0.2)(x)
    x = layers.Dropout(0.3)(x)

    # Flatten and output layer
    x = layers.Flatten()(x)
    output = layers.Dense(1)(x)

    model = keras.Model([image_input, label_input], output)
    return model

# Create the conditional models
conditional_generator = build_conditional_generator()
conditional_discriminator = build_conditional_discriminator()

# Print model summaries
print("Conditional Generator:")
conditional_generator.summary()
print("\nConditional Discriminator:")
conditional_discriminator.summary()

Conditional Generator:
Model: "functional_19"

┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃ Connected to ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ input_layer_3 │ (None, 1) │ 0 │ - │
│ (InputLayer) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ embedding │ (None, 1, 50) │ 500 │ input_layer_3[0]… │
│ (Embedding) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ input_layer_2 │ (None, 100) │ 0 │ - │
│ (InputLayer) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ flatten_1 (Flatten) │ (None, 50) │ 0 │ embedding[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ concatenate │ (None, 150) │ 0 │ input_layer_2[0]… │
│ (Concatenate) │ │ │ flatten_1[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ dense_2 (Dense) │ (None, 12544) │ 1,881,600 │ concatenate[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ batch_normalizatio… │ (None, 12544) │ 50,176 │ dense_2[0][0] │
│ (BatchNormalizatio… │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ leaky_re_lu_5 │ (None, 12544) │ 0 │ batch_normalizat… │
│ (LeakyReLU) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ reshape_1 (Reshape) │ (None, 7, 7, 256) │ 0 │ leaky_re_lu_5[0]… │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_transpose_3 │ (None, 7, 7, 128) │ 819,200 │ reshape_1[0][0] │
│ (Conv2DTranspose) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ batch_normalizatio… │ (None, 7, 7, 128) │ 512 │ conv2d_transpose… │
│ (BatchNormalizatio… │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ leaky_re_lu_6 │ (None, 7, 7, 128) │ 0 │ batch_normalizat… │
│ (LeakyReLU) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_transpose_4 │ (None, 14, 14, │ 204,800 │ leaky_re_lu_6[0]… │
│ (Conv2DTranspose) │ 64) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ batch_normalizatio… │ (None, 14, 14, │ 256 │ conv2d_transpose… │
│ (BatchNormalizatio… │ 64) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ leaky_re_lu_7 │ (None, 14, 14, │ 0 │ batch_normalizat… │
│ (LeakyReLU) │ 64) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_transpose_5 │ (None, 28, 28, 1) │ 1,600 │ leaky_re_lu_7[0]… │
│ (Conv2DTranspose) │ │ │ │
└─────────────────────┴───────────────────┴────────────┴───────────────────┘

Total params: 2,958,644 (11.29 MB)

Trainable params: 2,933,172 (11.19 MB)

Non-trainable params: 25,472 (99.50 KB)

Conditional Discriminator:

Model: "functional_20"

┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ Layer (type) ┃ Output Shape ┃ Param # ┃ Connected to ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ input_layer_5 │ (None, 1) │ 0 │ - │
│ (InputLayer) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ embedding_1 │ (None, 1, 50) │ 500 │ input_layer_5[0]… │
│ (Embedding) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ flatten_2 (Flatten) │ (None, 50) │ 0 │ embedding_1[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ dense_3 (Dense) │ (None, 784) │ 39,984 │ flatten_2[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ input_layer_4 │ (None, 28, 28, 1) │ 0 │ - │
│ (InputLayer) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ reshape_2 (Reshape) │ (None, 28, 28, 1) │ 0 │ dense_3[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ concatenate_1 │ (None, 28, 28, 2) │ 0 │ input_layer_4[0]… │
│ (Concatenate) │ │ │ reshape_2[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_2 (Conv2D) │ (None, 14, 14, │ 3,264 │ concatenate_1[0]… │
│ │ 64) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ leaky_re_lu_8 │ (None, 14, 14, │ 0 │ conv2d_2[0][0] │
│ (LeakyReLU) │ 64) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ dropout_2 (Dropout) │ (None, 14, 14, │ 0 │ leaky_re_lu_8[0]… │
│ │ 64) │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ conv2d_3 (Conv2D) │ (None, 7, 7, 128) │ 204,928 │ dropout_2[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ leaky_re_lu_9 │ (None, 7, 7, 128) │ 0 │ conv2d_3[0][0] │
│ (LeakyReLU) │ │ │ │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ dropout_3 (Dropout) │ (None, 7, 7, 128) │ 0 │ leaky_re_lu_9[0]… │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ flatten_3 (Flatten) │ (None, 6272) │ 0 │ dropout_3[0][0] │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ dense_4 (Dense) │ (None, 1) │ 6,273 │ flatten_3[0][0] │
└─────────────────────┴───────────────────┴────────────┴───────────────────┘

Total params: 254,949 (995.89 KB)

Trainable params: 254,949 (995.89 KB)

Non-trainable params: 0 (0.00 B)
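
The parameter counts in the two summaries above can be reproduced by hand from the layer definitions. The following is a plain-Python arithmetic sketch, not Keras output. The helper and dictionary key names are illustrative, and it assumes a 100-dimensional noise input, as the `input_layer_2` row of the generator summary indicates.

```python
# Hand-computed parameter counts for the two models summarized above.
# A Dense or convolutional layer has (inputs per unit) * units weights, plus
# one bias per unit unless use_bias=False. BatchNormalization stores 4 values
# per feature: gamma and beta (trainable), moving mean and moving variance
# (non-trainable).

def dense(n_in, n_out, bias=True):
    return n_in * n_out + (n_out if bias else 0)

def conv(kernel, c_in, c_out, bias=True):
    # Same formula for Conv2D and Conv2DTranspose with a square kernel.
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)

def batch_norm(features):
    return 4 * features

generator = {
    "embedding": 10 * 50,                                   # 10 classes, 50 dims
    "dense": dense(100 + 50, 7 * 7 * 256, bias=False),      # noise + label embedding
    "batch_norm_a": batch_norm(7 * 7 * 256),
    "conv_transpose_a": conv(5, 256, 128, bias=False),
    "batch_norm_b": batch_norm(128),
    "conv_transpose_b": conv(5, 128, 64, bias=False),
    "batch_norm_c": batch_norm(64),
    "conv_transpose_c": conv(5, 64, 1, bias=False),
}
gen_total = sum(generator.values())
# Half of every BatchNormalization layer's values are moving statistics.
gen_non_trainable = sum(
    count // 2 for name, count in generator.items() if name.startswith("batch_norm")
)

discriminator = {
    "embedding": 10 * 50,
    "dense_label": dense(50, 28 * 28),      # label embedding -> 28x28 map
    "conv_a": conv(5, 2, 64),               # 2 channels: image + label map
    "conv_b": conv(5, 64, 128),
    "dense_out": dense(7 * 7 * 128, 1),
}
disc_total = sum(discriminator.values())

print(gen_total, gen_total - gen_non_trainable, gen_non_trainable)
print(disc_total)
```

The totals agree with the summaries: 2,958,644 generator parameters, of which 25,472 are non-trainable batch-normalization statistics, and 254,949 discriminator parameters.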

In [ ]:

# Define the training step for conditional GAN
@tf.function
def train_conditional_step(images, labels):
    # Generate random noise for the generator
    batch_size = tf.shape(images)[0]
    noise = tf.random.normal([batch_size, NOISE_DIM])

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        # Generate fake images
        generated_images = conditional_generator([noise, labels], training=True)

        # Get discriminator outputs for real and fake images
        real_output = conditional_discriminator([images, labels], training=True)
        fake_output = conditional_discriminator([generated_images, labels], training=True)

        # Calculate losses
        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    # Calculate gradients
    gradients_of_generator = gen_tape.gradient(gen_loss, conditional_generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, conditional_discriminator.trainable_variables)

    # Apply gradients
    generator_optimizer.apply_gradients(zip(gradients_of_generator, conditional_generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, conditional_discriminator.trainable_variables))

    return gen_loss, disc_loss
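
The training step calls `generator_loss` and `discriminator_loss`, which are defined outside this section. Because the discriminator ends in `Dense(1)` with no activation, its outputs are logits, so one plausible definition is the standard binary cross-entropy GAN loss computed from logits. The pure-Python sketch below illustrates that assumption only. The names `softplus`, `bce_from_logits`, `generator_loss_sketch`, and `discriminator_loss_sketch` are hypothetical, and the notebook's actual loss functions may differ.

```python
import math

def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def bce_from_logits(logits, target):
    """Mean binary cross-entropy of a batch of logits against one target (0 or 1)."""
    # For target 1 the per-sample loss is -log(sigmoid(z)) = softplus(-z);
    # for target 0 it is -log(1 - sigmoid(z)) = softplus(z).
    per_sample = [softplus(-z) if target == 1 else softplus(z) for z in logits]
    return sum(per_sample) / len(per_sample)

def discriminator_loss_sketch(real_logits, fake_logits):
    # Real images should be classified as 1, generated images as 0.
    return bce_from_logits(real_logits, 1) + bce_from_logits(fake_logits, 0)

def generator_loss_sketch(fake_logits):
    # The generator is rewarded when the discriminator labels its images as 1.
    return bce_from_logits(fake_logits, 1)

# A discriminator that cannot tell real from fake outputs a logit of 0.
undecided = [0.0, 0.0, 0.0]
print(generator_loss_sketch(undecided))                 # equals log(2)
print(discriminator_loss_sketch(undecided, undecided))  # equals 2 * log(2)
```

Under this assumption, a generator loss near log(2) and a discriminator loss near 2·log(2) correspond to a discriminator that is guessing at chance.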

In [ ]:

# Generate a grid of digits from 0 to 9
def generate_digits():
    # Generate 10 examples of each digit
    rows = 10  # Digits 0-9
    cols = 10  # Examples per digit
    noise = tf.random.normal([rows * cols, NOISE_DIM])

    # Create labels (digits 0-9, 10 examples each)
    labels = np.array([digit for digit in range(rows) for _ in range(cols)])
    labels = tf.convert_to_tensor(labels, dtype=tf.int32)

    # Generate images
    generated_images = conditional_generator([noise, labels], training=False)
    generated_images = generated_images * 0.5 + 0.5  # Convert to [0, 1] range

    # Plot the generated images
    plt.figure(figsize=(15, 15))
    for i in range(rows * cols):
        plt.subplot(rows, cols, i + 1)
        plt.imshow(generated_images[i, :, :, 0], cmap='gray')
        plt.title(f"Digit: {labels[i].numpy()}")
        plt.axis('off')
    plt.tight_layout()
    plt.show()
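
The label array in `generate_digits()` is laid out so that each row of the plotted grid shows a single digit. The plain-Python sketch below checks that layout and the tanh-to-display rescaling without needing the trained model. The names `grid_rows` and `to_unit_range` are illustrative and do not appear in the notebook.

```python
rows, cols = 10, 10

# Same comprehension as in generate_digits(): the digit index is the outer
# loop, so each digit is repeated `cols` times before moving to the next.
labels = [digit for digit in range(rows) for _ in range(cols)]

# plt.subplot(rows, cols, i + 1) fills the grid row by row, so image i lands
# in grid row i // cols. With this label order the row index equals the digit.
grid_rows = [i // cols for i in range(rows * cols)]
assert grid_rows == labels

def to_unit_range(x):
    """Map a tanh output in [-1, 1] to the [0, 1] range used for display."""
    return x * 0.5 + 0.5

print(labels[:12])   # [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
print(to_unit_range(-1.0), to_unit_range(0.0), to_unit_range(1.0))   # 0.0 0.5 1.0
```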

In [ ]:
# Define the training function
def train(dataset, epochs):
    # Fixed noise vector intended for visualization
    # (currently unused: generate_digits() draws its own noise)
    seed = tf.random.normal([25, NOISE_DIM])

    # Track losses
    gen_losses = []
    disc_losses = []

    # Create fresh optimizers inside the training function and pass them to the
    # training step explicitly. This can sometimes help with tf.function-related
    # issues where variables are not properly registered with the optimizer in
    # graph mode.
    conditional_generator_optimizer = keras.optimizers.Adam(1e-4)
    conditional_discriminator_optimizer = keras.optimizers.Adam(1e-4)

    # Training loop
    for epoch in range(1, epochs + 1):
        start_time = time.time()

        # Lists to store batch losses
        batch_gen_losses = []
        batch_disc_losses = []

        # Train on batches
        for image_batch, label_batch in dataset:
            # Pass the optimizers created above to the train step
            gen_loss, disc_loss = train_conditional_step(
                image_batch,
                label_batch,
                conditional_generator_optimizer,
                conditional_discriminator_optimizer
            )
            batch_gen_losses.append(gen_loss)
            batch_disc_losses.append(disc_loss)

        # Calculate average losses for the epoch
        avg_gen_loss = tf.reduce_mean(batch_gen_losses)
        avg_disc_loss = tf.reduce_mean(batch_disc_losses)
        # Use .numpy() only when eager execution is guaranteed (outside tf.function)
        gen_losses.append(avg_gen_loss.numpy())
        disc_losses.append(avg_disc_loss.numpy())

        # Print progress
        print(f"Epoch {epoch}/{epochs}, "
              f"Generator Loss: {avg_gen_loss:.4f}, "
              f"Discriminator Loss: {avg_disc_loss:.4f}, "
              f"Time: {time.time() - start_time:.2f} sec")

        # Show generated digits after the first epoch, every 5 epochs, and at
        # the end (interval reduced from 10 to 5 because CGAN_EPOCHS = 10)
        if epoch % 5 == 0 or epoch == 1 or epoch == epochs:
            generate_digits()  # uses the global conditional_generator

    # Plot the loss curves
    plt.figure(figsize=(12, 5))
    plt.subplot(1, 2, 1)
    plt.plot(range(1, epochs + 1), gen_losses, label='Generator')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.title('Generator Loss')
    plt.grid(True)

    plt.subplot(1, 2, 2)
    plt.plot(range(1, epochs + 1), disc_losses, label='Discriminator')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.title('Discriminator Loss')
    plt.grid(True)

    plt.tight_layout()
    plt.show()

    return gen_losses, disc_losses

# Define the training step for conditional GAN
# (this version takes the optimizers as arguments)
@tf.function
def train_conditional_step(images, labels, generator_optimizer, discriminator_optimizer):
    # Generate random noise for the generator
    batch_size = tf.shape(images)[0]
    noise = tf.random.normal([batch_size, NOISE_DIM])

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        # Generate fake images
        generated_images = conditional_generator([noise, labels], training=True)

        # Get discriminator outputs for real and fake images
        real_output = conditional_discriminator([images, labels], training=True)
        fake_output = conditional_discriminator([generated_images, labels], training=True)

        # Calculate losses
        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    # Calculate gradients
    gradients_of_generator = gen_tape.gradient(gen_loss, conditional_generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, conditional_discriminator.trainable_variables)

    # Apply gradients using the optimizers supplied by the caller
    generator_optimizer.apply_gradients(zip(gradients_of_generator, conditional_generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, conditional_discriminator.trainable_variables))

    return gen_loss, disc_loss

# Train the conditional GAN for a few epochs
# Note: This is a simplified training loop for demonstration purposes
CGAN_EPOCHS = 10  # Reduced for demonstration, increase for better results

print("Training Conditional GAN...")
# The train function creates and manages the optimizers internally
gen_losses, disc_losses = train(train_dataset_with_labels, CGAN_EPOCHS)
print("Conditional GAN training completed!")

Training Conditional GAN...


Epoch 1/10, Generator Loss: 1.1183, Discriminator Loss: 1.0813, Time: 16.59 sec
Epoch 2/10, Generator Loss: 1.1721, Discriminator Loss: 1.0404, Time: 11.76 sec
Epoch 3/10, Generator Loss: 1.1661, Discriminator Loss: 1.0226, Time: 11.79 sec
Epoch 4/10, Generator Loss: 1.0551, Discriminator Loss: 1.1580, Time: 11.87 sec
Epoch 5/10, Generator Loss: 1.0769, Discriminator Loss: 1.1099, Time: 11.92 sec

Epoch 6/10, Generator Loss: 1.1119, Discriminator Loss: 1.1047, Time: 12.13 sec
Epoch 7/10, Generator Loss: 1.0590, Discriminator Loss: 1.1147, Time: 12.11 sec
Epoch 8/10, Generator Loss: 1.0267, Discriminator Loss: 1.1591, Time: 12.19 sec
Epoch 9/10, Generator Loss: 1.1146, Discriminator Loss: 1.0851, Time: 12.28 sec
Epoch 10/10, Generator Loss: 1.1838, Discriminator Loss: 1.0617, Time: 12.28 sec
Conditional GAN training completed!
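
The logged losses stay in a narrow band over the ten epochs. As a rough reading aid, the sketch below averages the values printed above and compares them with the values a binary cross-entropy GAN would give if the discriminator output a probability of 0.5 for every image: ln 2 for the generator and 2·ln 2 for the discriminator. This assumes the notebook's loss functions are the usual cross-entropy losses, which this section does not show. The variable names are illustrative.

```python
import math

# Per-epoch averages copied from the training log above.
gen_log = [1.1183, 1.1721, 1.1661, 1.0551, 1.0769,
           1.1119, 1.0590, 1.0267, 1.1146, 1.1838]
disc_log = [1.0813, 1.0404, 1.0226, 1.1580, 1.1099,
            1.1047, 1.1147, 1.1591, 1.0851, 1.0617]

mean_gen = sum(gen_log) / len(gen_log)
mean_disc = sum(disc_log) / len(disc_log)

# Reference values for a discriminator that guesses at chance
# (valid only under the cross-entropy assumption stated above).
chance_gen = math.log(2)
chance_disc = 2 * math.log(2)

print(f"generator:     mean {mean_gen:.4f}, chance level {chance_gen:.4f}")
print(f"discriminator: mean {mean_disc:.4f}, chance level {chance_disc:.4f}")
print(f"generator range:     {min(gen_log):.4f} to {max(gen_log):.4f}")
print(f"discriminator range: {min(disc_log):.4f} to {max(disc_log):.4f}")
```

If that assumption holds, a generator mean above ln 2 together with a discriminator mean below 2·ln 2 suggests the discriminator is somewhat ahead of the generator, with neither network collapsing within these ten epochs.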
