r22 ML Lab Manual Final

The document is a learning manual for the Machine Learning Lab at Malla Reddy University for B.Tech II Year II Semester students. It includes general laboratory instructions, a list of experiments with corresponding pages, and detailed Python code examples for various machine learning tasks such as linear regression, logistic regression, and data preprocessing. The manual emphasizes preparation, discipline, and proper usage of laboratory resources.

Uploaded by

ABHI TECH STUDIO

Machine Learning Lab

(MR22-1CS0204)
Learning Manual

B.Tech: II Year II Semester (CSE-AI&ML)
(2023-24)
MALLA REDDY UNIVERSITY II YEAR II SEM, CSE-AIML

GENERAL LABORATORY INSTRUCTIONS

1. Students are advised to arrive at the laboratory at least 5 minutes before the starting time; those who arrive more than 5 minutes late will not be allowed into the lab.

2. Plan your task well before the session begins, and come to the lab prepared with the synopsis / program / experiment details.

3. Students should enter the laboratory with: laboratory observation notes with all the details (Problem statement, Aim, Algorithm, Procedure, Program, Expected Output, etc.) filled in for the lab session.

4. Laboratory Record updated up to the last session's experiments, and any other materials needed in the lab.

5. Proper dress code and identity card.

6. Sign in the laboratory login register, write the TIME-IN, and occupy the computer system allotted to you by the faculty.

7. Execute your task in the laboratory, record the results / output in the lab observation notebook, and get them certified by the concerned faculty.

8. All students should be polite and cooperative with the laboratory staff and must maintain discipline and decency in the laboratory.

9. Computer labs are equipped with sophisticated, high-end branded systems, which should be used properly.

10. Students / Faculty must keep their mobile phones SWITCHED OFF during the lab sessions. Misuse of the equipment, misbehaviour with the staff or systems, etc., will attract severe punishment.

11. Students must take the permission of the faculty in case of any urgency to go out; anybody found loitering outside the lab / class without permission during working hours will be treated seriously and punished appropriately.

12. Students should LOG OFF / SHUT DOWN the computer system before leaving the lab after completing the task (experiment) in all aspects, and must ensure the system / seat is left in proper order.
AI & ML DEPARTMENT (II YEAR II SEMESTER)

MACHINE LEARNING LABORATORY (MR22-1CS0204)


INDEX

S.No.  Name of the Experiment
1      Implementation of Linear algebra, Statistics & Data Preprocessing
2      Implementation of Linear regression
3      Implementation of Logistic regression
4      Implementation of Decision trees
5      Implementation of Support vector machines
6      Implementation of Neural networks
7      Implementation of K-means clustering
8      Implementation of Principal component analysis
9      Implementation of Hierarchical clustering
10     Implementation of Ensemble learning Bagging Algorithms
11     Implementation of Random forest Algorithms
12     Implementation of Model Evaluation


MALLA REDDY UNIVERSITY AI & ML DEPARTMENT

1. Linear algebra, Statistics & Data Preprocessing


Exercise 1.1: Implement a program to calculate the dot product of two vectors.

Python Code:

def dot_product(vector1, vector2):
    # The dot product is only defined for vectors of equal length
    if len(vector1) != len(vector2):
        raise ValueError("Vectors must have the same length for dot product calculation.")
    # Sum of the element-wise products
    result = sum(x * y for x, y in zip(vector1, vector2))
    return result

# Example data
vector_a = [2, 3, 4]
vector_b = [5, 6, 7]

# Calculate dot product
result_dot_product = dot_product(vector_a, vector_b)

# Display the result
print(f"The dot product of {vector_a} and {vector_b} is: {result_dot_product}")

Output:
The dot product of [2, 3, 4] and [5, 6, 7] is: 56


Exercise 1.2: Implement a program to generate a random variable from a given probability
distribution.

Python Code:

import numpy as np
import matplotlib.pyplot as plt

# Example data: mean and standard deviation
mean_value = 0
std_deviation = 1

# Number of random variables to generate
num_samples = 1000

# Generate random variables from a normal distribution
random_variable = np.random.normal(loc=mean_value, scale=std_deviation, size=num_samples)

# Plot a histogram of the generated random variables
plt.hist(random_variable, bins=30, density=True, alpha=0.7, color='blue')

# Plot the probability density function (PDF) of the normal distribution
xmin, xmax = plt.xlim()
x = np.linspace(xmin, xmax, 100)
p = np.exp(-0.5 * ((x - mean_value) / std_deviation) ** 2) / (std_deviation * np.sqrt(2 * np.pi))
plt.plot(x, p, 'k', linewidth=2)

plt.title('Random Variable from Normal Distribution')
plt.xlabel('Value')
plt.ylabel('Probability Density')
plt.show()

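Output: a histogram of the generated samples with the normal probability density curve drawn over it (figure not reproduced here).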
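The exercise asks for a sample from a given distribution, and the program above covers the normal case. For a discrete distribution given as explicit probabilities, inverse-transform sampling on the cumulative distribution is one possible approach. The following is only a sketch using the standard library; the function name sample_discrete, the outcome labels and the probabilities are illustrative assumptions and not part of the manual.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def sample_discrete(values, probabilities, rng):
    """Draw one value by inverse-transform sampling (hypothetical helper)."""
    # Cumulative distribution, e.g. [0.2, 0.7, 1.0]
    cumulative = list(accumulate(probabilities))
    # Uniform draw scaled to the total probability mass
    u = rng.random() * cumulative[-1]
    # First interval of the cumulative distribution that contains u
    index = min(bisect_right(cumulative, u), len(values) - 1)
    return values[index]

# Assumed example distribution: P(A)=0.2, P(B)=0.5, P(C)=0.3
rng = random.Random(0)
values = ["A", "B", "C"]
probabilities = [0.2, 0.5, 0.3]

samples = [sample_discrete(values, probabilities, rng) for _ in range(10000)]
frequencies = {v: samples.count(v) / len(samples) for v in values}
print(frequencies)
```

With enough samples the observed frequencies should settle near the specified probabilities, which is a quick sanity check on any sampler.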


Exercise 1.3: Implement a program to calculate the derivative of a function.

Python code:

import sympy as sp

def calculate_derivative():
    # Define a symbolic variable and the function
    x = sp.symbols('x')
    function = x**2 + 3*x + 5

    # Calculate the derivative of the function
    derivative = sp.diff(function, x)
    return function, derivative

# Example data
original_function, derivative_function = calculate_derivative()

# Display the results
print(f"Original function: {original_function}")
print(f"Derivative function: {derivative_function}")

Output:
Original function: x**2 + 3*x + 5
Derivative function: 2*x + 3
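The symbolic result can be cross-checked numerically. The sketch below approximates the derivative with a central difference and compares it with 2*x + 3 at a few points; the helper name numerical_derivative, the step size h and the test points are assumptions chosen for illustration.

```python
def f(x):
    # Same function as in the exercise
    return x**2 + 3*x + 5

def numerical_derivative(func, x, h=1e-5):
    """Central-difference approximation of func'(x) (hypothetical helper)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Compare the numerical estimate with the symbolic derivative 2*x + 3
test_points = [-2.0, 0.0, 1.5, 10.0]
errors = [abs(numerical_derivative(f, x) - (2 * x + 3)) for x in test_points]
for x, err in zip(test_points, errors):
    print(f"x = {x}: numerical = {numerical_derivative(f, x):.6f}, error = {err:.2e}")
```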


Exercise 1.4: Implement a program to find the minimum of a function using gradient descent.

Python Code:

import numpy as np
import matplotlib.pyplot as plt

def quadratic_function(x):
    return x**2 + 4*x + 4  # Example quadratic function

def gradient_quadratic_function(x):
    return 2*x + 4  # Gradient of the quadratic function

def gradient_descent(initial_guess, learning_rate, num_iterations):
    x_values = []
    y_values = []
    x = initial_guess
    for _ in range(num_iterations):
        # Record the current point before taking the step
        x_values.append(x)
        y_values.append(quadratic_function(x))
        # Update x using the gradient descent formula
        x = x - learning_rate * gradient_quadratic_function(x)
    return x_values, y_values

# Example data
initial_guess = -5
learning_rate = 0.1
num_iterations = 20

# Run gradient descent
x_values, y_values = gradient_descent(initial_guess, learning_rate, num_iterations)

# Plot the function and the gradient descent path
x_range = np.linspace(-8, 2, 100)
plt.plot(x_range, quadratic_function(x_range), label='Quadratic Function')
plt.scatter(x_values, y_values, color='red', label='Gradient Descent Path')
plt.title('Gradient Descent to Minimize a Quadratic Function')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()

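Output: a plot of the quadratic function with the gradient descent path marked as red points (figure not reproduced here).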
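Since the plot is not shown, the convergence can also be checked numerically. For this quadratic, each update multiplies the distance to the minimum at x = -2 by (1 - 2 * learning_rate), so the iterate after k steps has a closed form. The sketch below compares the two using the same settings as the exercise; the function name minimize is an assumption.

```python
def gradient(x):
    # Gradient of x**2 + 4*x + 4
    return 2 * x + 4

def minimize(x0, learning_rate, num_iterations):
    """Run plain gradient descent and return the final iterate (hypothetical helper)."""
    x = x0
    for _ in range(num_iterations):
        x = x - learning_rate * gradient(x)
    return x

x0, learning_rate, num_iterations = -5, 0.1, 20
x_final = minimize(x0, learning_rate, num_iterations)

# Closed form: x_k + 2 = (x0 + 2) * (1 - 2 * learning_rate) ** k
x_closed_form = -2 + (x0 + 2) * (1 - 2 * learning_rate) ** num_iterations

print(f"Gradient descent result: {x_final:.6f}")
print(f"Closed-form prediction:  {x_closed_form:.6f}")
```

Both values should lie close to the true minimizer x = -2; a learning rate above 1.0 would make the factor exceed 1 in magnitude and the iterates would diverge.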


Exercise 1.5: Implement a program to clean a dataset of missing values.

Python Code:

import pandas as pd
from sklearn.impute import SimpleImputer

def generate_example_dataset():
    # Create an example dataset with missing values
    data = {
        'PassengerId': [1, 2, 3, 4, 5],
        'Name': ['John', 'Jane', 'Bob', 'Alice', 'Charlie'],
        'Age': [22, None, 25, None, 30],
        'Fare': [7.25, 71.28, None, 8.05, 10.5],
        'Survived': [0, 1, 1, 0, 1]
    }
    return pd.DataFrame(data)

def clean_dataset(df):
    # Display the original dataset
    print("Original Dataset:")
    print(df)

    # Keep only the numeric columns (the non-numeric 'Name' column is dropped)
    numeric_df = df.select_dtypes(include='number')

    # Handling missing values using SimpleImputer (mean strategy)
    imputer = SimpleImputer(strategy='mean')
    df_cleaned = pd.DataFrame(imputer.fit_transform(numeric_df),
                              columns=numeric_df.columns)

    # Display the cleaned dataset
    print("\nCleaned Dataset:")
    print(df_cleaned)

if __name__ == "__main__":
    # Generate an example dataset with missing values
    example_dataset = generate_example_dataset()
    # Call the clean_dataset function
    clean_dataset(example_dataset)

Output:
Original Dataset:
   PassengerId     Name   Age   Fare  Survived
0            1     John  22.0   7.25         0
1            2     Jane   NaN  71.28         1
2            3      Bob  25.0    NaN         1
3            4    Alice   NaN   8.05         0
4            5  Charlie  30.0  10.50         1

Cleaned Dataset:
   PassengerId        Age   Fare  Survived
0          1.0  22.000000   7.25       0.0
1          2.0  25.666667  71.28       1.0
2          3.0  25.000000  24.27       1.0
3          4.0  25.666667   8.05       0.0
4          5.0  30.000000  10.50       1.0
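The imputed values in the cleaned dataset (25.666667 for Age and 24.27 for Fare) are simply the means of the observed entries in each column. The sketch below reproduces them with the standard library only; the helper name impute_with_mean is an assumption, and the lists are the Age and Fare columns of the example dataset.

```python
from statistics import mean

def impute_with_mean(column):
    """Replace None entries with the mean of the observed values (hypothetical helper)."""
    observed = [v for v in column if v is not None]
    column_mean = mean(observed)
    return [column_mean if v is None else v for v in column]

# Age and Fare columns from the example dataset
age = [22, None, 25, None, 30]
fare = [7.25, 71.28, None, 8.05, 10.5]

age_filled = impute_with_mean(age)
fare_filled = impute_with_mean(fare)

print([round(v, 6) for v in age_filled])
print([round(v, 2) for v in fare_filled])
```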


2. Linear Regression
Exercise 2.1:
Implement a program to fit a linear regression model to a dataset.

Python code:

import matplotlib.pyplot as mtp
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Load the dataset: every column except the last is a feature, column 1 is the target
data_set = pd.read_csv('Salary_Data.csv')
x = data_set.iloc[:, :-1].values
y = data_set.iloc[:, 1].values

# Splitting the dataset into training and test set
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=1/3, random_state=0)

# Fitting the Simple Linear Regression model to the training dataset
regressor = LinearRegression()
regressor.fit(x_train, y_train)

# Prediction of test and training set results
y_pred = regressor.predict(x_test)
x_pred = regressor.predict(x_train)  # predictions for the training inputs

# Visualizing the training set results
mtp.scatter(x_train, y_train, color="green")
mtp.plot(x_train, x_pred, color="red")
mtp.title("Salary vs Experience (Training Dataset)")
mtp.xlabel("Years of Experience")
mtp.ylabel("Salary(In Rupees)")
mtp.show()

# Visualizing the test set results
# (the red line is the same regression line that was fitted on the training set)
mtp.scatter(x_test, y_test, color="blue")
mtp.plot(x_train, x_pred, color="red")
mtp.title("Salary vs Experience (Test Dataset)")
mtp.xlabel("Years of Experience")
mtp.ylabel("Salary(In Rupees)")
mtp.show()
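Output: two scatter plots, one for the training set and one for the test set, each with the fitted regression line in red (figures not reproduced here).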
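For a single feature, the line that LinearRegression fits has a closed-form least-squares solution: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the means. The sketch below works this out on a small made-up dataset (the numbers are assumptions, not taken from Salary_Data.csv), and the function name fit_simple_linear_regression is hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Closed-form least-squares fit of y = intercept + slope * x (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_simple_linear_regression(xs, ys)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")

# Predict for a new input using the fitted line
prediction = intercept + slope * 6
print(f"prediction for x = 6: {prediction:.2f}")
```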


Exercise 2.2:
Implement a program to calculate the coefficient of determination for a linear regression model.

Python code:

First, import the necessary packages:

# %matplotlib inline   (notebook-only magic; uncomment when running in Jupyter)
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score

Next, load the diabetes dataset:

diabetes = datasets.load_diabetes()

As this is simple linear regression (SLR), use only one feature:

X = diabetes.data[:, np.newaxis, 2]

Next, split the data into training and testing sets (the last 30 samples are held out for testing):

X_train = X[:-30]
X_test = X[-30:]

Split the target into training and testing sets in the same way:

y_train = diabetes.target[:-30]
y_test = diabetes.target[-30:]

Now create the linear regression object:

regr = linear_model.LinearRegression()

Train the model using the training set:

regr.fit(X_train, y_train)

Make predictions using the testing set:

y_pred = regr.predict(X_test)

Print the fitted coefficient, the mean squared error, and the coefficient of determination (R squared, computed by r2_score and labelled "Variance score" in the output):

print('Coefficients: \n', regr.coef_)
print("Mean squared error: %.2f" % mean_squared_error(y_test, y_pred))
print('Variance score: %.2f' % r2_score(y_test, y_pred))

Finally, plot the outputs:

plt.scatter(X_test, y_test, color='blue')
plt.plot(X_test, y_pred, color='red', linewidth=3)
plt.xticks(())
plt.yticks(())
plt.show()

Output:

Coefficients:
[941.43097333]
Mean squared error: 3035.06
Variance score: 0.41
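The value printed as "Variance score" is the coefficient of determination, which r2_score computes as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below evaluates that formula directly on a small made-up example; the function name r_squared and the numbers are assumptions for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (hypothetical helper)."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Made-up targets and predictions for illustration only
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

r2 = r_squared(y_true, y_pred)
print(f"R squared: {r2:.3f}")

# Predicting the mean everywhere gives R squared = 0 by construction
baseline = r_squared(y_true, [sum(y_true) / len(y_true)] * len(y_true))
print(f"R squared of the mean predictor: {baseline:.3f}")
```

A score of 0.41, as in the output above, therefore means the model removes about 41% of the squared error that the mean predictor would make on the test set.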


3. Logistic Regression
Exercise 3.1:
Implement a program to fit a logistic regression model to the given x, y data:
x = np.arange(10).reshape(-1, 1)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

Python code:

import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix

x = np.arange(10).reshape(-1, 1)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

# Create and fit the model
# (equivalently in one line: LogisticRegression(solver='liblinear', random_state=0).fit(x, y))
model = LogisticRegression(solver='liblinear', random_state=0)
model.fit(x, y)

# Inspect the fitted model. In a notebook each expression displays its value;
# in a script, wrap it in print() to see it.
model.classes_
model.intercept_
model.coef_
model.predict_proba(x)
model.predict(x)
model.score(x, y)
confusion_matrix(y, model.predict(x))
print(classification_report(y, model.predict(x)))

# Improve the Model
# You can improve your model by setting different parameters. For example, let's work with
# C equal to 10.0 instead of the default value of 1.0. C is the inverse of the
# regularization strength, so a larger C means weaker regularization.
model = LogisticRegression(solver='liblinear', C=10.0, random_state=0)
model.fit(x, y)
model.classes_
model.intercept_
model.coef_
model.predict_proba(x)
model.predict(x)
model.score(x, y)
confusion_matrix(y, model.predict(x))
print(classification_report(y, model.predict(x)))


Output:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00         4
           1       1.00      1.00      1.00         6

    accuracy                           1.00        10
   macro avg       1.00      1.00      1.00        10
weighted avg       1.00      1.00      1.00        10
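What predict_proba and predict return can be reproduced from the fitted intercept and coefficient: the model passes the linear score through the sigmoid function and predicts class 1 when the probability reaches 0.5. The sketch below illustrates this for one feature; the intercept and coefficient values are hypothetical, not the ones fitted above.

```python
import math

def sigmoid(z):
    # Logistic function mapping any real score to (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(x, intercept, coefficient):
    """P(y = 1 | x) for a one-feature logistic model (hypothetical helper)."""
    return sigmoid(intercept + coefficient * x)

def predict_class(x, intercept, coefficient, threshold=0.5):
    return 1 if predict_probability(x, intercept, coefficient) >= threshold else 0

# Hypothetical parameters; the decision boundary is at x = -intercept / coefficient = 3
intercept, coefficient = -1.5, 0.5

for x in range(7):
    p = predict_probability(x, intercept, coefficient)
    print(f"x = {x}: P(y=1) = {p:.3f}, predicted class = {predict_class(x, intercept, coefficient)}")
```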


Exercise 3.2:
Implement a program to calculate the odds ratio for a logistic regression model.

Python Code:

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load or create your dataset
# For simplicity, let's create a sample dataset
data = {
    'Age': [25, 30, 35, 40, 45, 50, 55, 60, 65, 70],
    'Smoker': [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    'Outcome': [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
}
df = pd.DataFrame(data)

# Split the dataset into features (X) and target variable (y)
X = df[['Age', 'Smoker']]
y = df['Outcome']

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a logistic regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test)

# Calculate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')

# Calculate odds ratio for each feature (exponential of the fitted coefficient)
odds_ratio = np.exp(model.coef_)
print(f'Odds Ratio: {odds_ratio}')

Output:
Accuracy: 1.00
Odds Ratio: [[2.04559905 1.14215319]]
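The reason np.exp(model.coef_) is read as an odds ratio is that, in a logistic model, increasing a feature by one unit multiplies the odds p / (1 - p) by the exponential of that feature's coefficient. The sketch below checks this numerically for a one-feature model; the intercept and coefficient are hypothetical values chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    # Odds corresponding to probability p
    return p / (1.0 - p)

# Hypothetical one-feature logistic model
intercept, coefficient = -3.0, 0.7

# Probabilities at x = 4 and x = 5 (one unit apart)
p_at_4 = sigmoid(intercept + coefficient * 4)
p_at_5 = sigmoid(intercept + coefficient * 5)

# Ratio of the odds for a one-unit increase in x
observed_ratio = odds(p_at_5) / odds(p_at_4)
odds_ratio = math.exp(coefficient)

print(f"odds(x=5) / odds(x=4) = {observed_ratio:.6f}")
print(f"exp(coefficient)      = {odds_ratio:.6f}")
```

The two printed values agree, and the same ratio is obtained for any pair of inputs one unit apart.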


4. Decision trees
Exercise 4.1:

Implement a program to construct a decision tree from a dataset.

Python code:

# Python program to implement decision tree algorithm and plot the tree
# Importing the required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import metrics, tree
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Loading the dataset
iris = load_iris()

# Converting the data to a pandas dataframe
data = pd.DataFrame(data=iris.data, columns=iris.feature_names)

# Creating a separate column for the target variable of iris dataset
data['Species'] = iris.target

# Replacing the categories of target variable with the actual names of the species
target = np.unique(iris.target)
target_n = np.unique(iris.target_names)
target_dict = dict(zip(target, target_n))
data['Species'] = data['Species'].replace(target_dict)

# Separating the independent and dependent variables of the dataset
x = data.drop(columns="Species")
y = data["Species"]
names_features = x.columns
target_labels = y.unique()

# Splitting the dataset into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=93)

# Creating an instance of the classifier class
dtc = DecisionTreeClassifier(max_depth=3, random_state=93)

# Fitting the training dataset to the model
dtc.fit(x_train, y_train)

# Plotting the Decision Tree (names are passed as plain lists)
plt.figure(figsize=(30, 10), facecolor='b')
Tree = tree.plot_tree(dtc, feature_names=list(names_features), class_names=list(target_labels),
                      rounded=True, filled=True, fontsize=14)
plt.show()

y_pred = dtc.predict(x_test)

# Finding the confusion matrix
confusion_matrix = metrics.confusion_matrix(y_test, y_pred)
matrix = pd.DataFrame(confusion_matrix)

# Plotting heatmap (the figure and its axes are created together,
# so the heatmap is drawn on the figure that is shown)
sns.set(font_scale=1.3)
fig, axis = plt.subplots(figsize=(10, 7))
sns.heatmap(matrix, annot=True, fmt="g", ax=axis, cmap="magma")
axis.set_title('Confusion Matrix')
axis.set_xlabel("Predicted Values", fontsize=10)
axis.set_xticklabels(list(target_labels))
axis.set_ylabel("True Labels", fontsize=10)
axis.set_yticklabels(list(target_labels), rotation=0)
plt.show()
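Output: a plot of the fitted decision tree, followed by a heatmap of the confusion matrix (figures not reproduced here).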
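DecisionTreeClassifier chooses its splits by minimizing an impurity measure, which is the Gini impurity by default. The sketch below computes the Gini impurity of a set of labels and the weighted impurity of a candidate split; the helper names and the label counts are assumptions for illustration, not values taken from the fitted tree.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions (hypothetical helper)."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def weighted_split_impurity(left, right):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)

# Illustrative node with three equally frequent classes
node = ["setosa"] * 50 + ["versicolor"] * 50 + ["virginica"] * 50
print(f"Impurity before split: {gini_impurity(node):.4f}")

# A split that isolates one class completely
left = ["setosa"] * 50
right = ["versicolor"] * 50 + ["virginica"] * 50
print(f"Impurity after split:  {weighted_split_impurity(left, right):.4f}")
```

A split is useful when the weighted impurity of the children is lower than the impurity of the parent node.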


Exercise 4.2:

Implement a program to calculate the accuracy of a decision tree model.

Python code:

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load or create your dataset
# For simplicity, let's create a sample dataset
data = {
    'Feature1': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Feature2': [2, 4, 1, 3, 6, 8, 5, 7, 10, 9],
    'Target': [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
}
df = pd.DataFrame(data)

# Split the dataset into features (X) and target variable (y)
X = df[['Feature1', 'Feature2']]
y = df['Target']

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a decision tree model
model = DecisionTreeClassifier()
model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test)

# Calculate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')

# Check if the accuracy meets the desired threshold (97%)
desired_accuracy = 0.97
if accuracy >= desired_accuracy:
    print(f'Model accuracy meets the desired threshold of {desired_accuracy:.2%}')
else:
    print(f'Model accuracy does not meet the desired threshold of {desired_accuracy:.2%}')

Output:
Accuracy: 1.00
Model accuracy meets the desired threshold of 97.00%
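Accuracy is the fraction of test samples whose predicted label matches the true label. Note that with 10 rows and test_size=0.2 the test set above holds only 2 samples, so the reported 1.00 rests on two predictions. The sketch below computes accuracy by hand on made-up labels; the function name accuracy and the label lists are assumptions for illustration.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length.")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Made-up labels for illustration only: 4 of 5 predictions are correct
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
acc = accuracy(y_true, y_pred)
print(f"Accuracy: {acc:.2f}")

# Size of the test set in the exercise: 20% of 10 rows, rounded up
n_test = math.ceil(0.2 * 10)
print(f"Test samples in the exercise: {n_test}")
```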


5. Support Vector Machine (SVM)


Exercise 5.1: Implement a program to fit a support vector machine model to a dataset.

Python code:

import pandas as pd
data = pd.read_csv("apples_and_oranges.csv")
#Splitting the dataset into training and test samples
from sklearn.model_selection import train_test_split
training_set, test_set = train_test_split(data, test_size = 0.2, random_state = 1)
#Classifying the predictors and target
X_train = training_set.iloc[:,0:2].values
Y_train = training_set.iloc[:,2].values
X_test = test_set.iloc[:,0:2].values
Y_test = test_set.iloc[:,2].values
#Initializing Support Vector Machine and fitting the training data
from sklearn.svm import SVC
classifier = SVC(kernel='rbf', random_state = 1)
classifier.fit(X_train,Y_train)
#Predicting the classes for test set
Y_pred = classifier.predict(X_test)
#Attaching the predictions to test set for comparing
test_set["Predictions"] = Y_pred
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(Y_test,Y_pred)
accuracy = float(cm.diagonal().sum())/len(Y_test)
print("\nAccuracy of SVM for the Given Dataset : ", accuracy)
Output:
Accuracy of SVM for the Given Dataset: 0.375


Exercise 5.2:

Implement an SVM and evaluate its performance on the given bill_authentication.csv dataset.

Python code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
#matplotlib inline
bankdata = pd.read_csv("bill_authentication.csv")
bankdata.shape
bankdata.head()
X = bankdata.drop('Class', axis=1)
y = bankdata['Class']
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.20)
from sklearn.svm import SVC
svclassifier = SVC(kernel='linear')
svclassifier.fit(X_train, y_train)
y_pred = svclassifier.predict(X_test)
from sklearn.metrics import classification_report, confusion_matrix
print(confusion_matrix(y_test,y_pred))
print(classification_report(y_test,y_pred))


Output:
[[152   0]
 [  1 122]]
             precision    recall  f1-score   support

          0       0.99      1.00      1.00       152
          1       1.00      0.99      1.00       123

avg / total       1.00      1.00      1.00       275
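The figures in the classification report follow directly from the confusion matrix, whose rows are the true classes and whose columns are the predicted classes. The sketch below recomputes accuracy, precision and recall from the matrix printed above; the helper name precision_recall is an assumption.

```python
# Confusion matrix from the output above: rows = true class, columns = predicted class
cm = [[152, 0],
      [1, 122]]

def precision_recall(matrix, k):
    """Precision and recall of class k from a confusion matrix (hypothetical helper)."""
    true_positives = matrix[k][k]
    predicted_as_k = sum(row[k] for row in matrix)  # column total
    actually_k = sum(matrix[k])                     # row total
    return true_positives / predicted_as_k, true_positives / actually_k

# Accuracy: correctly classified samples (the diagonal) over all samples
total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(len(cm))) / total
print(f"Accuracy: {accuracy:.4f}")

for k in range(len(cm)):
    precision, recall = precision_recall(cm, k)
    print(f"Class {k}: precision = {precision:.2f}, recall = {recall:.2f}")
```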


6. Neural networks
Exercise 6.1: Implement a program to construct a neural network from a dataset.

Python code:

# Import necessary libraries


import numpy as np
import tensorflow as tf
from tensorflow import keras
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import make_classification
# Generate a synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Define the neural network architecture
model = keras.Sequential([
    keras.layers.Dense(32, activation='relu', input_shape=(X_train.shape[1],)),
    keras.layers.Dense(16, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')
])
# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Train the model
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2)
# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f'Test accuracy: {test_acc * 100:.2f}%')
# Make predictions on new data
new_data = np.random.randn(5, 20) # Replace with your own new data
new_data_standardized = scaler.transform(new_data)
predictions = model.predict(new_data_standardized)
print("Predictions:")
print(predictions)
Output:
Epoch 1/10
20/20 [==============================] - 3s 37ms/step - loss: 0.7252 - accuracy: 0.5641 - val_loss: 0.6267 - val_accuracy: 0.6500
Epoch 2/10
20/20 [==============================] - 0s 6ms/step - loss: 0.6146 - accuracy: 0.6484 - val_loss: 0.5604 - val_accuracy: 0.7250
Epoch 3/10
20/20 [==============================] - 0s 8ms/step - loss: 0.5520 - accuracy: 0.7344 - val_loss: 0.5102 - val_accuracy: 0.7688
Epoch 4/10
20/20 [==============================] - 0s 8ms/step - loss: 0.5009 - accuracy: 0.7875 - val_loss: 0.4652 - val_accuracy: 0.8062
Epoch 5/10
20/20 [==============================] - 0s 10ms/step - loss: 0.4571 - accuracy: 0.8219 - val_loss: 0.4248 - val_accuracy: 0.8313
Epoch 6/10
20/20 [==============================] - 0s 8ms/step - loss: 0.4175 - accuracy: 0.8422 - val_loss: 0.3871 - val_accuracy: 0.8250
Epoch 7/10
20/20 [==============================] - 0s 12ms/step - loss: 0.3873 - accuracy: 0.8531 - val_loss: 0.3550 - val_accuracy: 0.8438
Epoch 8/10
20/20 [==============================] - 0s 8ms/step - loss: 0.3607 - accuracy: 0.8641 - val_loss: 0.3335 - val_accuracy: 0.8562
Epoch 9/10
20/20 [==============================] - 0s 11ms/step - loss: 0.3411 - accuracy: 0.8734 - val_loss: 0.3190 - val_accuracy: 0.8562
Epoch 10/10
20/20 [==============================] - 0s 7ms/step - loss: 0.3254 - accuracy: 0.8813 - val_loss: 0.3034 - val_accuracy: 0.8625
7/7 [==============================] - 0s 9ms/step - loss: 0.3521 - accuracy: 0.8550
Test accuracy: 85.50%
1/1 [==============================] - 0s 283ms/step
Predictions:
[[0.813061  ]
 [0.5343085 ]
 [0.99716836]
 [0.67003953]
 [0.19199367]]
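Two numbers in this run can be worked out by hand: the number of trainable parameters in the network, and the step counts (20/20 and 7/7) shown in the log. The sketch below does both with plain arithmetic. It assumes that each Dense layer has one weight per input-unit pair plus one bias per unit, and that Keras holds out the validation_split fraction of the training data before batching; the helper names are hypothetical.

```python
import math

def dense_param_count(n_inputs, n_units):
    # One weight per (input, unit) pair plus one bias per unit
    return n_inputs * n_units + n_units

# Layer widths of the network above: 20 input features, then Dense(32), Dense(16), Dense(1)
layer_sizes = [20, 32, 16, 1]
per_layer = [dense_param_count(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)
print(f"Parameters per layer: {per_layer}, total: {total_params}")

def steps_per_epoch(n_samples, batch_size):
    # The last batch may be smaller, hence the ceiling
    return math.ceil(n_samples / batch_size)

n_train = 800                       # 1000 samples minus the 20% test split
n_fit = round(n_train * (1 - 0.2))  # samples left after validation_split=0.2
train_steps = steps_per_epoch(n_fit, 32)
test_steps = steps_per_epoch(200, 32)  # model.evaluate uses batches of 32 by default
print(f"Training steps per epoch: {train_steps}, evaluation steps: {test_steps}")
```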


Exercise 6.2: Implement a program to train a neural network model.

Python code:

# Import necessary libraries


import numpy as np
import tensorflow as tf
from tensorflow import keras
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target.reshape(-1, 1)
# One-hot encode the target variable
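# scikit-learn 1.2 renamed the 'sparse' argument to 'sparse_output';
# on older versions use OneHotEncoder(sparse=False) instead
encoder = OneHotEncoder(sparse_output=False)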
y_onehot = encoder.fit_transform(y)
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y_onehot, test_size=0.2, random_state=42)
# Define the neural network architecture
model = keras.Sequential([
    keras.layers.Dense(10, activation='relu', input_shape=(X_train.shape[1],)),
    keras.layers.Dense(3, activation='softmax')  # Output layer with 3 classes for Iris dataset
])
# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Train the model
model.fit(X_train, y_train, epochs=50, batch_size=10, validation_split=0.1)

# Evaluate the model on the test set


test_loss, test_acc = model.evaluate(X_test, y_test)
print(f'Test accuracy: {test_acc * 100:.2f}%')
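The output layer uses softmax and the model is trained with categorical cross-entropy on one-hot targets. The sketch below shows what these two pieces compute for a single sample, using only the standard library; the logits are hypothetical values, and the function names mirror the Keras names only for readability.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() numerically stable
    highest = max(logits)
    exps = [math.exp(z - highest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(y_onehot, probs):
    """Loss for one sample: minus the log-probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(y_onehot, probs))

# Hypothetical raw outputs of the last layer for one sample whose true class is 0
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
loss = categorical_crossentropy([1, 0, 0], probs)
print(f"Probabilities: {[round(p, 3) for p in probs]}, loss: {loss:.4f}")

# A model that guesses uniformly over 3 classes has loss ln(3)
uniform_loss = categorical_crossentropy([1, 0, 0], [1 / 3, 1 / 3, 1 / 3])
print(f"Loss of a uniform guess: {uniform_loss:.4f}")
```

The uniform-guess loss of about 1.0986 is a useful reference point when reading a training log for a three-class problem: a loss below it means the model is doing better than uninformed guessing on average.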


Output:
Epoch 1/50
11/11 [==============================] - 1s 40ms/step - loss: 3.0484 - accuracy: 0.3519 - val_loss: 3.5613 - val_accuracy: 0.1667
Epoch 2/50
11/11 [==============================] - 0s 11ms/step - loss: 2.7265 - accuracy: 0.3519 - val_loss: 3.1855 - val_accuracy: 0.1667
Epoch 3/50
11/11 [==============================] - 0s 22ms/step - loss: 2.4384 - accuracy: 0.3519 - val_loss: 2.8338 - val_accuracy: 0.1667
Epoch 4/50
11/11 [==============================] - 0s 13ms/step - loss: 2.1603 - accuracy: 0.3519 - val_loss: 2.5011 - val_accuracy: 0.1667
Epoch 5/50
11/11 [==============================] - 0s 13ms/step - loss: 1.8983 - accuracy: 0.3519 - val_loss: 2.1564 - val_accuracy: 0.1667
Epoch 6/50
11/11 [==============================] - 0s 13ms/step - loss: 1.6421 - accuracy: 0.3519 - val_loss: 1.8575 - val_accuracy: 0.1667
Epoch 7/50
11/11 [==============================] - 0s 12ms/step - loss: 1.4327 - accuracy: 0.3519 - val_loss: 1.6076 - val_accuracy: 0.1667
Epoch 8/50
11/11 [==============================] - 0s 9ms/step - loss: 1.2498 - accuracy: 0.3519 - val_loss: 1.4182 - val_accuracy: 0.1667
Epoch 9/50
11/11 [==============================] - 0s 18ms/step - loss: 1.1191 - accuracy: 0.3611 - val_loss: 1.2617 - val_accuracy: 0.2500
Epoch 10/50
11/11 [==============================] - 0s 8ms/step - loss: 1.0100 - accuracy: 0.4630 - val_loss: 1.1399 - val_accuracy: 0.4167
Epoch 11/50
11/11 [==============================] - 0s 9ms/step - loss: 0.9218 - accuracy: 0.6481 - val_loss: 1.0457 - val_accuracy: 0.5833
Epoch 12/50
11/11 [==============================] - 0s 19ms/step - loss: 0.8534 - accuracy: 0.6667 - val_loss: 0.9746 - val_accuracy: 0.5833
Epoch 13/50
11/11 [==============================] - 0s 5ms/step - loss: 0.8040 - accuracy: 0.6759 - val_loss: 0.9162 - val_accuracy: 0.5833
Epoch 14/50
11/11 [==============================] - 0s 5ms/step - loss: 0.7619 - accuracy: 0.6852 - val_loss: 0.8738 - val_accuracy: 0.5833
Epoch 15/50
11/11 [==============================] - 0s 6ms/step - loss: 0.7339 - accuracy: 0.6759 - val_loss: 0.8430 - val_accuracy: 0.5000
Epoch 16/50
11/11 [==============================] - 0s 5ms/step - loss: 0.7111 - accuracy: 0.6944 - val_loss: 0.8184 - val_accuracy: 0.5000
Epoch 17/50
11/11 [==============================] - 0s 5ms/step - loss: 0.6963 - accuracy: 0.6667 - val_loss: 0.7989 - val_accuracy: 0.3333
Epoch 18/50
11/11 [==============================] - 0s 5ms/step - loss: 0.6797 - accuracy: 0.6944 - val_loss: 0.7861 - val_accuracy: 0.3333
Epoch 19/50
11/11 [==============================] - 0s 6ms/step - loss: 0.6681 - accuracy: 0.7037 - val_loss: 0.7746 - val_accuracy: 0.3333
Epoch 20/50
11/11 [==============================] - 0s 5ms/step - loss: 0.6578 - accuracy: 0.6944 - val_loss: 0.7637 - val_accuracy: 0.5000
Epoch 21/50
11/11 [==============================] - 0s 5ms/step - loss: 0.6476 - accuracy: 0.7130 - val_loss: 0.7552 - val_accuracy: 0.5000
Epoch 22/50
11/11 [==============================] - 0s 6ms/step - loss: 0.6388 - accuracy: 0.7037 - val_loss: 0.7478 - val_accuracy: 0.5833
Epoch 23/50
11/11 [==============================] - 0s 5ms/step - loss: 0.6307 - accuracy: 0.7130 - val_loss: 0.7414 - val_accuracy: 0.5833
Epoch 24/50
11/11 [==============================] - 0s 6ms/step - loss: 0.6227 - accuracy: 0.7130 - val_loss: 0.7346 - val_accuracy: 0.5833
Epoch 25/50
11/11 [==============================] - 0s 5ms/step - loss: 0.6147 - accuracy: 0.7222 - val_loss: 0.7284 - val_accuracy: 0.5833
Epoch 26/50
11/11 [==============================] - 0s 6ms/step - loss: 0.6078 - accuracy: 0.7037 - val_loss: 0.7220 - val_accuracy: 0.5833
Epoch 27/50
11/11 [==============================] - 0s 7ms/step - loss: 0.6009 - accuracy: 0.6944 - val_loss: 0.7167 - val_accuracy: 0.5833
Epoch 28/50
11/11 [==============================] - 0s 6ms/step - loss: 0.5941 - accuracy: 0.6852 - val_loss: 0.7119 - val_accuracy: 0.5833
Epoch 29/50
11/11 [==============================] - 0s 5ms/step - loss: 0.5875 - accuracy: 0.6852 - val_loss: 0.7065 - val_accuracy: 0.5833
Epoch 30/50
11/11 [==============================] - 0s 7ms/step - loss: 0.5816 - accuracy: 0.6852 - val_loss: 0.7009 - val_accuracy: 0.5833
Epoch 31/50
11/11 [==============================] - 0s 7ms/step - loss: 0.5757 - accuracy: 0.6852 - val_loss: 0.6960 - val_accuracy: 0.5833
Epoch 32/50
11/11 [==============================] - 0s 5ms/step - loss: 0.5697 - accuracy: 0.6852 - val_loss: 0.6907 - val_accuracy: 0.5833
Epoch 33/50
11/11 [==============================] - 0s 7ms/step - loss: 0.5642 - accuracy: 0.6852 - val_loss: 0.6860 - val_accuracy: 0.5833
Epoch 34/50
11/11 [==============================] - 0s 5ms/step - loss: 0.5591 - accuracy: 0.6944 - val_loss: 0.6818 - val_accuracy: 0.5833
Epoch 35/50
11/11 [==============================] - 0s 7ms/step - loss: 0.5540 - accuracy: 0.7037 - val_loss: 0.6770 - val_accuracy: 0.5833
Epoch 36/50
11/11 [==============================] - 0s 7ms/step - loss: 0.5487 - accuracy: 0.7037 - val_loss: 0.6722
- val_accuracy: 0.5833
Epoch 37/50
11/11 [==============================] - 0s 5ms/step - loss: 0.5443 - accuracy: 0.6852 - val_loss: 0.6678
- val_accuracy: 0.5833
Epoch 38/50
11/11 [==============================] - 0s 7ms/step - loss: 0.5394 - accuracy: 0.7037 - val_loss: 0.6634
- val_accuracy: 0.5833
Epoch 39/50
11/11 [==============================] - 0s 5ms/step - loss: 0.5349 - accuracy: 0.7037 - val_loss: 0.6600
- val_accuracy: 0.5833
Epoch 40/50
11/11 [==============================] - 0s 5ms/step - loss: 0.5306 - accuracy: 0.7130 - val_loss: 0.6558
- val_accuracy: 0.5833
Epoch 41/50
11/11 [==============================] - 0s 5ms/step - loss: 0.5264 - accuracy: 0.7037 - val_loss: 0.6519
- val_accuracy: 0.5833
Epoch 42/50
11/11 [==============================] - 0s 7ms/step - loss: 0.5221 - accuracy: 0.7037 - val_loss: 0.6479
- val_accuracy: 0.5833
Epoch 43/50
11/11 [==============================] - 0s 6ms/step - loss: 0.5181 - accuracy: 0.7037 - val_loss: 0.6447
- val_accuracy: 0.5833
Epoch 44/50
11/11 [==============================] - 0s 6ms/step - loss: 0.5142 - accuracy: 0.7037 - val_loss: 0.6409
- val_accuracy: 0.5833
Epoch 45/50
11/11 [==============================] - 0s 7ms/step - loss: 0.5104 - accuracy: 0.7037 - val_loss: 0.6370
- val_accuracy: 0.5833
Epoch 46/50
11/11 [==============================] - 0s 7ms/step - loss: 0.5072 - accuracy: 0.7315 - val_loss: 0.6332
- val_accuracy: 0.5833
Epoch 47/50
11/11 [==============================] - 0s 6ms/step - loss: 0.5030 - accuracy: 0.7222 - val_loss: 0.6296
- val_accuracy: 0.5833
Epoch 48/50
11/11 [==============================] - 0s 5ms/step - loss: 0.4997 - accuracy: 0.7222 - val_loss: 0.6259
- val_accuracy: 0.5833
Epoch 49/50
11/11 [==============================] - 0s 5ms/step - loss: 0.4961 - accuracy: 0.7407 - val_loss: 0.6222
- val_accuracy: 0.5833
Epoch 50/50
11/11 [==============================] - 0s 5ms/step - loss: 0.4929 - accuracy: 0.7315 - val_loss: 0.6185
- val_accuracy: 0.5833
1/1 [==============================] - 0s 27ms/step - loss: 0.4907 - accuracy: 0.7667
Test accuracy: 76.67%


7. K-means clustering
Exercise 7.1: Implement a program to cluster a dataset using K-means clustering.

Python code:

import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

# Load the Iris dataset
iris = load_iris()
data = iris.data  # Features only; the target labels are not used

# Standardize the data (important for K-Means, which relies on Euclidean distances)
scaler = StandardScaler()
data_scaled = scaler.fit_transform(data)

# Apply K-Means clustering (n_init is set explicitly so the result
# does not depend on the library's default)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
kmeans.fit(data_scaled)

# Get cluster labels and centroids
labels = kmeans.labels_
centroids = kmeans.cluster_centers_

# Visualize the clustering result using the first two scaled features
plt.scatter(data_scaled[:, 0], data_scaled[:, 1], c=labels, cmap='viridis', edgecolors='k', s=50)
plt.scatter(centroids[:, 0], centroids[:, 1], marker='X', s=200, color='red', label='Centroids')
plt.title('K-Means Clustering on Iris Dataset')
plt.xlabel('Sepal Length (scaled)')
plt.ylabel('Sepal Width (scaled)')
plt.legend()
plt.show()

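Output: a scatter plot of scaled sepal length against scaled sepal width, with points colored by cluster and the three centroids marked with red X symbols (figure not reproduced here).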
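As a rough illustration of the assign-then-update loop behind K-means (Lloyd's algorithm), the sketch below clusters a small hand-made 2-D dataset using only the standard library. The function name `lloyd_kmeans`, the toy points, and the choice of the first k points as initial centroids are assumptions made for this example; scikit-learn's `KMeans` defaults to k-means++ initialization instead.

```python
import math

def lloyd_kmeans(points, k, max_iter=100):
    """Minimal Lloyd iteration; the first k points seed the centroids (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep the centroid of an empty cluster
        converged = new_centroids == centroids
        centroids, labels = new_centroids, new_labels
        if converged:
            break
    return labels, centroids

# Hypothetical toy data: two well-separated groups
toy_points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 1.5), (8.5, 9.0)]
toy_labels, toy_centroids = lloyd_kmeans(toy_points, k=2)
print(toy_labels)     # [0, 1, 0, 1, 0, 1]
print(toy_centroids)
```

On this toy data the loop stops after the second pass, because the centroids no longer move once each group has been assigned to its own cluster.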


Exercise 7.2: Implement a program that applies the elbow method to choose K for K-means clustering.

Python code:

import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Generate example data: 300 points scattered around 4 centers
X, _ = make_blobs(n_samples=300, centers=4, random_state=42, cluster_std=1.0)

# Standardize the data (important for K-Means)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Elbow method: fit K-Means for K = 1..max_k and record the inertia
distortions = []
max_k = 10
for k in range(1, max_k + 1):
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=42)
    kmeans.fit(X_scaled)
    # Inertia: sum of squared distances from each point to its closest centroid
    distortions.append(kmeans.inertia_)

# Plot the elbow method graph
plt.plot(range(1, max_k + 1), distortions, marker='o')
plt.title('Elbow Method for Optimal K')
plt.xlabel('Number of Clusters (K)')
plt.ylabel('Sum of Squared Distances (Inertia)')
plt.show()


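Output: a line plot of inertia against the number of clusters K, for K = 1 to 10 (figure not reproduced here).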
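To make the plotted quantity concrete, the following sketch computes the inertia by hand: the sum of squared distances from each point to the mean of its cluster. The helper name `inertia`, the toy points, and the two hand-picked labelings are assumptions for illustration. The result matches the K-means inertia only when each centroid is the mean of its cluster, which holds once the algorithm has converged.

```python
def inertia(points, labels):
    """Sum of squared distances from each point to the mean of its cluster."""
    total = 0.0
    for cluster in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == cluster]
        centroid = [sum(c) / len(members) for c in zip(*members)]
        total += sum((x - m) ** 2 for p in members for x, m in zip(p, centroid))
    return total

# Hypothetical toy data: two well-separated groups of three points
toy_points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

inertia_k1 = inertia(toy_points, [0, 0, 0, 0, 0, 0])  # everything in one cluster
inertia_k2 = inertia(toy_points, [0, 0, 0, 1, 1, 1])  # one cluster per group
print(f"K=1 inertia: {inertia_k1:.3f}")
print(f"K=2 inertia: {inertia_k2:.3f}")
```

The sharp drop from K=1 to K=2 is the kind of bend the elbow plot is meant to reveal; adding more clusters than there are natural groups reduces the inertia only slightly.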


8. Principal component analysis


Exercise 8.1: Implement a program to perform principal component analysis on a dataset.

Python code:

import numpy as np

# Step 1: Standardize each feature to zero mean and unit variance
def standardize(X):
    return (X - np.mean(X, axis=0)) / np.std(X, axis=0)

# Step 2: Covariance matrix of the features (columns of X)
def compute_covariance_matrix(X):
    return np.cov(X.T)

# Step 3: Eigenvalues and eigenvectors, sorted by decreasing eigenvalue
def find_eigenvectors_and_eigenvalues(X):
    cov_matrix = compute_covariance_matrix(X)
    # eigh is intended for symmetric matrices and returns real values
    eigenvalues, eigenvectors = np.linalg.eigh(cov_matrix)
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order]

# Step 4: Project the data onto the first k principal components
# (the eigenvectors must already be sorted, as returned by Step 3)
def project_data(X, eigenvectors, k):
    return np.dot(X, eigenvectors[:, :k])

# Step 5: Fraction of the total variance explained by the first k components
def get_variance_explained(eigenvalues, k):
    return np.sum(eigenvalues[:k]) / np.sum(eigenvalues)

# Example usage
X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
X_std = standardize(X)
eigenvalues, eigenvectors = find_eigenvectors_and_eigenvalues(X_std)
projected_data = project_data(X_std, eigenvectors, 2)
variance_explained = get_variance_explained(eigenvalues, 2)
print("Standardized data:")
print(X_std)
print("Covariance matrix:")
print(compute_covariance_matrix(X_std))
print("Eigenvalues:")
print(eigenvalues)
print("Eigenvectors:")
print(eigenvectors)
print("Projected data:")
print(projected_data)
print("Variance explained:")
print(variance_explained)

Output:


Exercise 8.2: Implement a program to calculate the covariance matrix for a dataset.

Python code:

import numpy as np

# Predefined dataset: 4 observations (rows) of 3 variables (columns)
dataset = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12]
])

# Calculate the covariance matrix (rowvar=False treats columns as variables)
covariance_matrix = np.cov(dataset, rowvar=False)

# Print the covariance matrix
print("Covariance Matrix:")
print(covariance_matrix)

Output:

Covariance Matrix:
[[15. 15. 15.]
 [15. 15. 15.]
 [15. 15. 15.]]
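The value 15 can be checked by hand. The sketch below applies the sample covariance formula, with divisor n - 1 as `np.cov` uses by default, to the same four rows using plain Python lists. The function name `covariance_matrix` is an assumption for this example.

```python
def covariance_matrix(rows):
    """Sample covariance (divisor n - 1) between the columns of `rows`."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    return [
        [
            sum((a - mean_a) * (b - mean_b) for a, b in zip(col_a, col_b)) / (n - 1)
            for col_b, mean_b in zip(columns, means)
        ]
        for col_a, mean_a in zip(columns, means)
    ]

dataset = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
for row in covariance_matrix(dataset):
    print(row)  # [15.0, 15.0, 15.0]
```

Every column has the same deviations from its mean (-4.5, -1.5, 1.5, 4.5), so each entry equals 45 / 3 = 15. The columns are perfectly correlated, which is why the whole matrix is constant.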


9. Hierarchical clustering
Exercise 9.1: Implement a program to perform hierarchical clustering on a dataset.

Python code:

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# Define example data: five 2-D points
X = np.array([[2, 5], [3, 3], [5, 8], [8, 5], [10, 6]])

# Perform hierarchical clustering with complete linkage and Euclidean distance
linkage_matrix = linkage(X, method='complete', metric='euclidean')

# Plot the dendrogram
dendrogram(linkage_matrix, labels=['A', 'B', 'C', 'D', 'E'])
plt.title('Hierarchical Clustering Dendrogram')
plt.xlabel('Data Points')
plt.ylabel('Euclidean Distance')
plt.show()

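Output: a dendrogram of the five points A to E, with the merge distance on the vertical axis (figure not reproduced here).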
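Complete linkage measures the distance between two clusters as the largest distance between any pair of their members. The sketch below evaluates that rule for the same five points using only the standard library; the helper name `complete_linkage` and the point dictionary are assumptions for illustration.

```python
import math

points = {"A": (2, 5), "B": (3, 3), "C": (5, 8), "D": (8, 5), "E": (10, 6)}

def complete_linkage(cluster_a, cluster_b):
    """Largest Euclidean distance between a member of one cluster and a member of the other."""
    return max(math.dist(points[p], points[q]) for p in cluster_a for q in cluster_b)

d_ab = complete_linkage(["A"], ["B"])
d_ab_c = complete_linkage(["A", "B"], ["C"])
d_ab_de = complete_linkage(["A", "B"], ["D", "E"])
print(f"A vs B:         {d_ab:.3f}")     # 2.236
print(f"{{A,B}} vs C:     {d_ab_c:.3f}")   # 5.385
print(f"{{A,B}} vs {{D,E}}: {d_ab_de:.3f}")  # 8.062
```

These values correspond to merge heights in the dendrogram: the closest pairs join at about 2.24, and the last two groups join at about 8.06, the distance between A and E.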


Exercise 9.2: Implement a program that applies the agglomerative clustering algorithm for hierarchical clustering.

Python code:

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.cluster import AgglomerativeClustering

# Define example data: five 2-D points
X = np.array([[2, 5], [3, 3], [5, 8], [8, 5], [10, 6]])

# Agglomerative clustering with complete linkage.
# With n_clusters=None and distance_threshold=0 the full merge tree is computed
# and the tree is cut at distance 0, so every point keeps its own label.
agg_cluster_complete = AgglomerativeClustering(
    n_clusters=None, linkage='complete', distance_threshold=0
)
agg_labels_complete = agg_cluster_complete.fit_predict(X)

# Plot dendrogram for complete linkage
linkage_matrix_complete = linkage(X, method='complete')
dendrogram(linkage_matrix_complete, labels=['A', 'B', 'C', 'D', 'E'])
plt.title('Agglomerative Clustering (Complete Linkage)')
plt.show()


Output: a dendrogram of the five points A to E built with complete linkage (figure not reproduced here).


10. Bagging
Exercise 10.1: Implement a program that applies bagging to a decision tree model.

Python code:

from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Generate synthetic data
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10, n_classes=2,
                           random_state=42)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the base decision tree model
base_model = DecisionTreeClassifier(random_state=42)

# Define the bagging classifier: 10 trees, each trained on a bootstrap sample
bagging_model = BaggingClassifier(base_model, n_estimators=10, random_state=42)

# Train the bagging model
bagging_model.fit(X_train, y_train)

# Make predictions on the test set
predictions = bagging_model.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy}")

Output:

Accuracy: 0.88
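The two ingredients of bagging, bootstrap sampling and vote aggregation, can be sketched with the standard library alone. The helper names `bootstrap_indices` and `majority_vote`, the sample size of 10, and the list of votes are hypothetical and not part of scikit-learn.

```python
import random
from collections import Counter

def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def majority_vote(votes):
    """Return the label predicted by the largest number of base models."""
    return Counter(votes).most_common(1)[0][0]

rng = random.Random(42)
sample = bootstrap_indices(10, rng)
print(sample)  # indices may repeat; rows that are never drawn are left out of this sample

# Hypothetical predictions of five trees for a single test row
tree_votes = [1, 0, 1, 1, 0]
print(majority_vote(tree_votes))  # 1
```

Each base tree would be trained on the rows selected by its own bootstrap sample, and the ensemble prediction for a row is the class that most trees vote for.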


Exercise 10.2: Implement a program to calculate the out-of-bag error for a bagging model.

Python code:

from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier

# Generate synthetic data
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10, n_classes=2,
                           random_state=42)

# Create a bagging model with a decision tree as the base estimator
base_model = DecisionTreeClassifier(random_state=42)
bagging_model = BaggingClassifier(base_model, n_estimators=10, oob_score=True,
                                  random_state=42)

# Train the bagging model
bagging_model.fit(X, y)

# Access the out-of-bag score (an accuracy; the out-of-bag error is 1 - oob_score_)
oob_score = bagging_model.oob_score_
print(f"Out-of-Bag Score: {oob_score}")

Output:

Out-of-Bag Score: 0.871
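The exercise asks for the out-of-bag error, which is one minus the reported score. The sketch below shows where out-of-bag rows come from and then performs that conversion. The helper name `out_of_bag_indices` and the seed are assumptions, and the sampling assumes bootstrap samples as large as the dataset, which is the `BaggingClassifier` default.

```python
import random

def out_of_bag_indices(n_samples, rng):
    """Return the row indices left out of one bootstrap sample of size n_samples."""
    drawn = {rng.randrange(n_samples) for _ in range(n_samples)}
    return [i for i in range(n_samples) if i not in drawn]

rng = random.Random(0)
n_samples = 1000
oob = out_of_bag_indices(n_samples, rng)
oob_fraction = len(oob) / n_samples
# In expectation about (1 - 1/n)^n, roughly 0.368, of the rows are out-of-bag
print(f"Fraction of rows out-of-bag: {oob_fraction:.3f}")

# Convert the score printed in the Output above into an error rate
oob_score = 0.871
oob_error = 1 - oob_score
print(f"Out-of-Bag Error: {oob_error:.3f}")  # Out-of-Bag Error: 0.129
```

Each row is evaluated only by the trees whose bootstrap sample did not contain it, so the out-of-bag estimate needs no separate test set.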


11. Random forest


Exercise 11.1: Implement a program that builds a random forest, an ensemble of decision trees.

Python code:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Generate synthetic data
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10, n_classes=2,
                           random_state=42)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the Random Forest classifier with 100 trees
random_forest_model = RandomForestClassifier(n_estimators=100, random_state=42)

# Train the Random Forest model
random_forest_model.fit(X_train, y_train)

# Make predictions on the test set
predictions = random_forest_model.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy}")

Output:

Accuracy: 0.92


Exercise 11.2: Implement a program to calculate the out-of-bag error for a random forest model.

Python code:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Generate synthetic data
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10, n_classes=2,
                           random_state=42)

# Create a Random Forest classifier with out-of-bag scoring enabled
random_forest_model = RandomForestClassifier(n_estimators=100, oob_score=True,
                                             random_state=42)

# Train the Random Forest model
random_forest_model.fit(X, y)

# Access the out-of-bag score (an accuracy; the out-of-bag error is 1 - oob_score_)
oob_score = random_forest_model.oob_score_
print(f"Out-of-Bag Score: {oob_score}")

Output:

Out-of-Bag Score: 0.912


12. Model Evaluation


Exercise 12.1: Implement a program to calculate the accuracy, precision, and recall of a model.

Python Code:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             classification_report, confusion_matrix)

def load_example_dataset():
    # Load the Iris dataset as an example
    iris = load_iris()
    X, y = iris.data, iris.target
    return X, y

def evaluate_model(X, y):
    # Split the dataset into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Create a Logistic Regression model (replace with your own model)
    model = LogisticRegression()

    # Train the model
    model.fit(X_train, y_train)

    # Make predictions on the test set
    y_pred = model.predict(X_test)

    # Calculate and display accuracy, precision, and recall
    # (average='weighted' averages the per-class scores, weighted by class support)
    accuracy = accuracy_score(y_test, y_pred)
    precision = precision_score(y_test, y_pred, average='weighted')
    recall = recall_score(y_test, y_pred, average='weighted')
    print(f"Accuracy: {accuracy}")
    print(f"Precision: {precision}")
    print(f"Recall: {recall}")

    # Display classification report and confusion matrix
    print("\nClassification Report:")
    print(classification_report(y_test, y_pred))
    print("\nConfusion Matrix:")
    cm = confusion_matrix(y_test, y_pred)
    print(cm)

if __name__ == "__main__":
    # Load an example dataset
    X, y = load_example_dataset()
    # Evaluate the model and calculate metrics
    evaluate_model(X, y)

Output:

Accuracy: 1.0
Precision: 1.0
Recall: 1.0

Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30

Confusion Matrix:
[[10  0  0]
 [ 0  9  0]
 [ 0  0 11]]
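The weighted precision and recall can be recovered directly from a confusion matrix, which may help in reading the report. The sketch below does this for the matrix shown above and for a second, hypothetical matrix containing a few mistakes. The function name `weighted_precision_recall` is an assumption; rows are taken as true classes and columns as predicted classes, following `sklearn.metrics.confusion_matrix`.

```python
def weighted_precision_recall(cm):
    """Support-weighted precision and recall from a square confusion matrix."""
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    precision = recall = 0.0
    for i in range(n_classes):
        true_positive = cm[i][i]
        predicted_i = sum(cm[r][i] for r in range(n_classes))  # column sum
        support_i = sum(cm[i])                                  # row sum
        weight = support_i / total
        precision += weight * (true_positive / predicted_i if predicted_i else 0.0)
        recall += weight * (true_positive / support_i if support_i else 0.0)
    return precision, recall

# Confusion matrix from the Output above: no errors, so both metrics are 1
reported = [[10, 0, 0], [0, 9, 0], [0, 0, 11]]
p_reported, r_reported = weighted_precision_recall(reported)
print(f"{p_reported:.4f} {r_reported:.4f}")  # 1.0000 1.0000

# Hypothetical matrix with three misclassified samples
hypothetical = [[10, 0, 0], [0, 8, 1], [0, 2, 9]]
p_hyp, r_hyp = weighted_precision_recall(hypothetical)
print(f"{p_hyp:.4f} {r_hyp:.4f}")  # 0.9033 0.9000
```

In the hypothetical case the weighted recall, 27/30 = 0.9, equals the accuracy. That is always true for support-weighted recall, since the weights cancel the per-class denominators.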


Exercise 12.2: Implement a program to calculate the ROC curve and AUC of a model.

Python Code:

import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc

def load_example_dataset():
    # Load the Breast Cancer dataset as an example
    breast_cancer = load_breast_cancer()
    X, y = breast_cancer.data, breast_cancer.target
    return X, y

def evaluate_model(X, y):
    # Split the dataset into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Create a Logistic Regression model (replace with your own model)
    model = LogisticRegression()

    # Train the model
    model.fit(X_train, y_train)

    # Predict the probability of the positive class on the test set
    y_prob = model.predict_proba(X_test)[:, 1]

    # Calculate ROC curve and AUC
    fpr, tpr, thresholds = roc_curve(y_test, y_prob)
    roc_auc = auc(fpr, tpr)

    # Display AUC score
    print(f"AUC Score: {roc_auc}")

    # Visualize ROC curve
    plt.figure(figsize=(8, 6))
    plt.plot(fpr, tpr, color='darkorange', lw=2, label=f'ROC Curve (AUC = {roc_auc:.2f})')
    plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--', label='Random')
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Receiver Operating Characteristic (ROC) Curve')
    plt.legend()
    plt.show()

if __name__ == "__main__":
    # Load an example dataset
    X, y = load_example_dataset()
    # Evaluate the model and calculate ROC curve and AUC
    evaluate_model(X, y)

Output:

AUC Score: 0.9970520799213888
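One way to interpret the AUC is as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it that way on a four-sample toy example. The function name `auc_by_ranking` and the labels and scores are hypothetical; this pairwise count gives the same value as the area under the ROC curve.

```python
def auc_by_ranking(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count one half."""
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical labels and predicted probabilities
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_by_ranking(y_true, y_score)
print(toy_auc)  # 0.75
```

Three of the four positive-negative pairs are ordered correctly (only 0.35 versus 0.4 is not), giving 0.75. An AUC near 1, as in the Output above, means almost every such pair is ranked correctly.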
