[go: up one dir, main page]

0% found this document useful (0 votes)
2K views24 pages

r22 1 9 ML Lab Manual r22 Regulations

The document is a lab manual for a Machine Learning course at Malla Reddy College of Engineering, detailing objectives, outcomes, and a series of experiments using Python. It covers various machine learning techniques, including data analysis, regression models, and the use of libraries such as Pandas, NumPy, and Matplotlib. Each experiment includes aims, procedures, and example code to implement the concepts discussed.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2K views24 pages

r22 1 9 ML Lab Manual r22 Regulations

The document is a lab manual for a Machine Learning course at Malla Reddy College of Engineering, detailing objectives, outcomes, and a series of experiments using Python. It covers various machine learning techniques, including data analysis, regression models, and the use of libraries such as Pandas, NumPy, and Matplotlib. Each experiment includes aims, procedures, and example code to implement the concepts discussed.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 24

lOMoARcPSD|29324964

R22 1-9 ML LAB Manual - R22 Regulations

Machine Learning (Jawaharlal Nehru Technological University, Hyderabad)

Scan to open on Studocu

Studocu is not sponsored or endorsed by any college or university


Downloaded by Priya priya (priyaa33@gmail.com)
lOMoARcPSD|29324964

MALLA REDDY COLLEGE OF ENGINEERING


R22 B.Tech. CSE (AI and ML) Syllabus JNTU Hyderabad

AM505PC: MACHINE LEARNING LAB MANUAL


B.Tech. III Year I Sem. LTPC
0021

Course Objective:
The objective of this lab is to get an overview of the various machine learning
techniques and
can demonstrate them using python.

Course Outcomes:
Understand modern notions in predictive data analysis
Select data, model selection, model complexity and identify the trends
Understand a range of machine learning algorithms along with their strengths
and
weaknesses
Build predictive models from data and analyze their performance

List of Experiments
1. Write a python program to compute Central Tendency Measures: Mean,
Median,
Mode Measure of Dispersion: Variance, Standard Deviation
2. Study of Python Basic Libraries such as Statistics, Math, Numpy and Scipy
3. Study of Python Libraries for ML application such as Pandas and Matplotlib
4. Write a Python program to implement Simple Linear Regression
5. Implementation of Multiple Linear Regression for House Price Prediction
using sklearn

Page 1 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

AIM: a python program to compute Central Tendency Measures: Mean, Median


PROCEDURE: to compute central tendency measures mean median mode
measure of dispersion variance, standard deviation
PROGRAM:
import statistics
def calculate_mean(data):
return sum(data) / len(data)
def calculate_median(data):
sorted_data = sorted(data)
n = len(sorted_data)
if n % 2 == 0:
middle1 = sorted_data[n // 2 - 1]
middle2 = sorted_data[n // 2]
return (middle1 + middle2) / 2
else:
return sorted_data[n // 2]
def calculate_mode(data):
return statistics.mode(data)
def calculate_variance(data):
mean_value = calculate_mean(data)
squared_diff_sum = sum((x - mean_value) ** 2 for x in data)
return squared_diff_sum / (len(data) - 1)
def calculate_standard_deviation(data):
variance_value = calculate_variance(data)
return variance_value ** 0.5
# Example dataset
dataset = [10, 20, 30, 40, 50]
mean_value = calculate_mean(dataset)
Page 2 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

median_value = calculate_median(dataset)
mode_value = calculate_mode(dataset)
variance_value = calculate_variance(dataset)
std_deviation_value = calculate_standard_deviation(dataset)
print(f"Dataset: {dataset}")
print(f"Mean: {mean_value:.2f}")
print(f"Median: {median_value:.2f}")
print(f"Mode: {mode_value}")
print(f"Variance: {variance_value:.2f}")
print(f"Standard Deviation: {std_deviation_value:.2f}")
output : Dataset: [10, 20, 30, 40, 50]
Mean: 30.00
Median: 30.00
Mode: 10
Variance: 250.00
Standard Deviation: 15.81
OUTPUT:
Dataset: [10, 20, 30, 40, 50]
Mean: 30.00
Median: 30.00
Mode: 10
Variance: 250.00
Standard Deviation: 15.81

RESULT: calculating the values of mean mode and variance and standard
deviation for sample data set

Page 3 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

2. Study of Python Basic Libraries such as Statistics, Math, Numpy and Scipy
AIM: Study of Python Basic Libraries such as Statistics, Math, Numpy and
Scipy
PROCEDURE: for python basic libraries by import command use the necessary
libraries to develop various designs which we can solve complex mathematical
problems.
PROGRAM:
Study of library such as statistics and math function
1. Math:
o The built-in math module provides various mathematical functions
and constants.
o It includes functions for trigonometry, logarithms, exponentiation, and
more.

import math
# Calculate the square root
sqrt_value = math.sqrt(25)
print(f"Square root of 25: {sqrt_value:.2f}")
# Compute the factorial
factorial_value = math.factorial(5)
print(f"Factorial of 5: {factorial_value}")
# Calculate the sine of an angle (in radians)
angle_radians = math.radians(30)
sine_value = math.sin(angle_radians)
print(f"Sine of 30 degrees: {sine_value:.2f}")

NumPy (Numerical Python):


• NumPy is a fundamental library for numerical computations in Python.
• It provides support for multi-dimensional arrays (ndarrays) and efficient
operations on these arrays.

Page 4 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

Python
import numpy as np
# Create a 1D array
arr1d = np.array([1, 2, 3])
print(f"1D Array: {arr1d}")
# Create a 2D array
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print(f"2D Array: {arr2d}")
Example usage:
Python
import numpy as np#
#Create a 1D array
arr1d = np.array([1, 2, 3])
print(f"1D Array: {arr1d}")
# Create a 2D array
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print(f"2D Array: {arr2d}")

SciPy (Scientific Python):


• SciPy builds upon NumPy and provides additional functionality for scientific
and engineering purposes.
• It includes modules for optimization, integration, linear algebra, signal
processing, and more.

Python
import scipy.optimize as opt
#Solve an optimization problem
def objective(x):
return x[0]**2 + x[1]**2
result = opt.minimize(objective, [1, 1])
print(f"Optimal solution: {result.x}")
Page 5 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

# Perform numerical Integration


from scipy.integrate import quad
def integrand(x):
return x**2
area, error = quad(integrand, 0, 2)
print(f"Integral result: {area:.2f}")

OUTPUT:
Calculate the square root
Square root of 25: 5.00
Factorial of 5: 120
Sine of 30 degrees: 0.50

Importing NumPy library


1D Array: [1 2 3]
2D Array: [[1 2 3]
[4 5 6]]
Importing SciPy library
Optimal solution: [-1.07505143e-08 -1.07505143e-08]
Integral result: 2.67

RESULT: Importing libraries from python and basic calculation and


implementation on sample values

Page 6 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

3. study of python libraries for ml application such as pandas and matplotlib

AIM: procedure for importing pandas and matplotlib features from python
PROCEDURE:
Pandas is a powerful library for data manipulation and analysis. It provides data
structures (such as DataFrames and Series) to handle structured data efficiently.
including Pandas and Matplotlib. These libraries play a crucial role in data
manipulation, visualization, and analysis within the ML ecosystem.
Key Features:
Data cleaning: Pandas allows you to clean, transform, and preprocess data easily.
Data exploration: You can explore datasets, filter rows, and perform
aggregations.
Missing data handling: Pandas provides tools to handle missing values.
Integration with other libraries: It seamlessly integrates with other ML libraries.

PROGRAM:
essential Python libraries for Machine Learning (ML) applications,
Example:
1. Pandas
import pandas as pd
# Create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 22]}
df = pd.DataFrame(data)
# Display the DataFrame
print(df)
2. Matplotlib:
Purpose: Matplotlib is a popular plotting library for creating visualizations in
Python.
Key Features: Line plots, scatter plots, bar charts, histograms, etc

Page 7 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

Customization: You can control colors, labels, and other plot properties.
Integration with Pandas: Matplotlib works seamlessly with Pandas DataFrames.
Example (creating a simple line plot):
Matplotlib libraries import from Python for drawing sample graph
import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4]
y = [10, 15, 7, 20]
# Create a line plot
plt.plot(x, y, marker='o', label='Sales')
plt.xlabel('Month')
plt.ylabel('Revenue')
plt.title('Monthly Sales')
plt.legend()
plt.show()

OUTPUT:
Importing pandas from python libraries for sample values

Name Age
0 Alice 25
1 Bob 30
2 Charlie 22

Page 8 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

Importing matplot from python libraries for sample values

RESULT: Importing libraries from python and basic calculation and plotting
graph on sample values

Page 9 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

4. write a python program to implement simple linear regression

AIM: procedure for drawing a decision line in python with linear regression
technique
PROCEDURE: in machine learning for predicting the output which samples
are trained to machine decision line is vital for take the output consideration for
acquiring results how decision line is used and implementation of libraries is as
follows.

PROGRAM:
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt

# Generate some sample data


X = np.array([[1], [2], [3], [4], [5]]) # Independent variable (feature)
y = np.array([2, 4, 5, 4, 6]) # Dependent variable (response)

# Create a linear regression model


reg = LinearRegression().fit(X, y)

# Get the coefficients and intercept


slope = reg.coef_[0]
intercept = reg.intercept_
print(f"Slope (coefficient): {slope:.2f}")
print(f"Intercept: {intercept:.2f}")

# Predict a new value


new_X = np.array([[6]])
predicted_y = reg.predict(new_X)
Page 10 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

print(f"Predicted value for X = 6: {predicted_y[0]:.2f}")

# Plot the data and regression line


plt.scatter(X, y, color='blue', label='Data points')
plt.plot(X, reg.predict(X), color='red', label='Regression line')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Simple Linear Regression')
plt.legend()
plt.show()

OUTPUT:

RESULT: Importing libraries from python and basic calculation and plotting
graph on sample values a python program to implement simple linear
regression

Page 11 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

5.implementation of multiple linear regression for house price prediction


using sklearn.

AIM: procedure for drawing a multiple linear regression for house price
prediction using sklearn.

PROCEDURE: in machine learning for predicting the output which samples


are trained to machine decision line is vital for take the output consideration for
multiple linear regression for house price prediction using sklearn.

PROGRAM:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error,
mean_absolute_error
from sklearn import preprocessing

# importing data
df = pd.read_csv('Real-estate1.csv')
df.drop('No', inplace=True, axis=1)
print(df.head())
print(df.columns)

# plotting a scatterplot
sns.scatterplot(x='X4 number of convenience stores',
y='Y house price of unit area', data=df)
# creating feature variables
X = df.drop('Y house price of unit area', axis=1)
Page 12 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

y = df['Y house price of unit area']


print(X)
print(y)
# creating train and test sets
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.3, random_state=101)
# creating a regression model
model = LinearRegression()
# fitting the model
model.fit(X_train, y_train)
# making predictions
predictions = model.predict(X_test)

# model evaluation
print('mean_squared_error : ', mean_squared_error(y_test,
predictions))
print('mean_absolute_error : ', mean_absolute_error(y_test,
predictions))

OUTPUT:

X1 transaction date X2 house age X3 distance to the nearest MRT stati


on \
0 2012.917 32.0 84.
87882
1 2012.917 19.5 306.
59470
2 2013.583 13.3 561.
98450
3 2013.500 13.3 561.
98450
4 2012.833 5.0 390.
56840

X4 number of convenience stores X5 latitude X6 longitude \

Page 13 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

0 10 24.98298 121.54024
1 9 24.98034 121.53951
2 5 24.98746 121.54391
3 5 24.98746 121.54391
4 5 24.97937 121.54245

Y house price of unit area


0 37.9
1 42.2
2 47.3
3 54.8
4 43.1
Index(['X1 transaction date', 'X2 house age',
'X3 distance to the nearest MRT station',
'X4 number of convenience stores', 'X5 latitude', 'X6 longitude'
,
'Y house price of unit area'],
dtype='object')
X1 transaction date X2 house age \
0 2012.917 32.0
1 2012.917 19.5
2 2013.583 13.3
3 2013.500 13.3
4 2012.833 5.0
.. ... ...
409 2013.000 13.7
410 2012.667 5.6
411 2013.250 18.8
412 2013.000 8.1
413 2013.500 6.5

X3 distance to the nearest MRT station X4 number of convenience s


tores \
0 84.87882
10
1 306.59470
9
2 561.98450
5
3 561.98450
5
4 390.56840
5
.. ...
...
409 4082.01500
0
410 90.45606
9
411 390.96960
7
412 104.81010
5
413 90.45606
9

X5 latitude X6 longitude
0 24.98298 121.54024

Page 14 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

1 24.98034 121.53951
2 24.98746 121.54391
3 24.98746 121.54391
4 24.97937 121.54245
.. ... ...
409 24.94155 121.50381
410 24.97433 121.54310
411 24.97923 121.53986
412 24.96674 121.54067
413 24.97433 121.54310

[414 rows x 6 columns]


0 37.9
1 42.2
2 47.3
3 54.8
4 43.1
...
409 15.4
410 50.0
411 40.6
412 52.5
413 63.9
Name: Y house price of unit area, Length: 414, dtype: float64
mean_squared_error : 46.211797834938665
mean_absolute_error : 5.39229368475692

RESULT: after importing libraries sample data is trained and imported from
csv file for predicted expected target data by extravagating math formulae.

Page 15 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

6. implementation of decision tree using sklearn and its parameters


tuning.
AIM: procedure for decision tree parameters and its tuning by sklearn libraries

PROCEDURE: in machine learning for predicting the output which samples


are trained to machine decision line is vital for take the output consideration for
Decision tree for house price prediction using sklearn.

PROGRAM:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import metrics
import seaborn as sns
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn import tree

# Loading the dataset


iris = load_iris()

#converting the data to a pandas dataframe


data = pd.DataFrame(data = iris.data, columns = iris.feature_names)

#creating a separate column for the target variable of iris dataset


data['Species'] = iris.target

#replacing the categories of target variable with the actual names of the species
target = np.unique(iris.target)
target_n = np.unique(iris.target_names)
target_dict = dict(zip(target, target_n))
data['Species'] = data['Species'].replace(target_dict)

# Separating the independent dependent variables of the dataset


x = data.drop(columns = "Species")
y = data["Species"]
names_features = x.columns
target_labels = y.unique()

# Splitting the dataset into training and testing datasets

Page 16 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.3,


random_state = 93)

# Importing the Decision Tree classifier class from sklearn


from sklearn.tree import DecisionTreeClassifier

# Creating an instance of the classifier class


dtc = DecisionTreeClassifier(max_depth = 3, random_state = 93)

# Fitting the training dataset to the model


dtc.fit(x_train, y_train)

# Plotting the Decision Tree


plt.figure(figsize = (30, 10), facecolor = 'b')
Tree = tree.plot_tree(dtc, feature_names = names_features, class_names =
target_labels, rounded = True, filled = True, fontsize = 14)
plt.show()
y_pred = dtc.predict(x_test)

# Finding the confusion matrix


confusion_matrix = metrics.confusion_matrix(y_test, y_pred)
matrix = pd.DataFrame(confusion_matrix)
axis = plt.axes()
sns.set(font_scale = 1.3)
plt.figure(figsize = (10,7))

# Plotting heatmap
sns.heatmap(matrix, annot = True, fmt = "g", ax = axis, cmap = "magma")
axis.set_title('Confusion Matrix')
axis.set_xlabel("Predicted Values", fontsize = 10)
axis.set_xticklabels([''] + target_labels)
axis.set_ylabel( "True Labels", fontsize = 10)
axis.set_yticklabels(list(target_labels), rotation = 0)
plt.show()

Expected Output:

Page 17 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

RESULT: after importing libraries sample data is trained from samples for
predicted expected parameter data by extravagating its tuning.

Page 18 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

7. implementation of knn using sklearn


AIM: procedure for K- Nearest Neighbour values detecting by sklearn libraries

PROCEDURE: in machine learning for predicting the output which samples


are trained to machine decision line is vital for take the output consideration for
K-NN for data csv format file using sklearn.

PROGRAM:

implementation of knn using sklearn

# Step 1: Importing the required Libraries


import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
import matplotlib.pyplot as plt
import seaborn as sns

# Step 2: Reading the Dataset

df = pd.read_csv('data.csv') # Replace 'data.csv' with your dataset e path


y = df['diagnosis']
X = df.drop(['diagnosis', 'Unnamed: 32', 'id'], axis=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Step 3: Training the model


K = []
training = []
test = []
scores = {}
for k in range(2, 21):
clf = KNeighborsClassifier(n_neighbors=k)
clf.fit(X_train, y_train)
training_score = clf.score(X_train, y_train)
test_score = clf.score(X_test, y_test)
K.append(k)
training.append(training_score)
test.append(test_score)
scores[k] = [training_score, test_score]

# Step 4: Evaluating the model

for k, values in scores.items():


print(f"{k}: Training Score = {values[0]:.2f}, Test Score = {values[1]:.2f}")

Page 19 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

# Step 5: Plotting the training and test scores


ax = sns.stripplot(K, training)
ax.set(xlabel='Values of k', ylabel='Training Score')
plt.show()

ax = sns.stripplot(K, test)
ax.set(xlabel='Values of k', ylabel='Test Score')
plt.show()

plt.scatter(K, training, color='k', label='Training Score')


plt.scatter(K, test, color='g', label='Test Score')
plt.legend()
plt.show()

output:
2: Training Score = 0.97, Test Score = 0.92
3: Training Score = 0.96, Test Score = 0.94
4: Training Score = 0.95, Test Score = 0.94
5: Training Score = 0.95, Test Score = 0.94
6: Training Score = 0.94, Test Score = 0.94
7: Training Score = 0.94, Test Score = 0.94
8: Training Score = 0.94, Test Score = 0.94
9: Training Score = 0.94, Test Score = 0.94
10: Training Score = 0.94, Test Score = 0.94
11: Training Score = 0.94, Test Score = 0.94
12: Training Score = 0.94, Test Score = 0.94
13: Training Score = 0.94, Test Score = 0.94
14: Training Score = 0.94, Test Score = 0.94
15: Training Score = 0.94, Test Score = 0.94
16: Training Score = 0.94, Test Score = 0.94
17: Training Score = 0.94, Test Score = 0.94
18: Training Score = 0.94, Test Score = 0.94
19: Training Score = 0.94, Test Score = 0.94
20: Training Score = 0.94, Test Score = 0.94

RESULT: after importing libraries sample data is trained from samples for
predicted expected parameter data by extravagating its tuning.

Page 20 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

8. implementation of logic regression using sklearn


AIM: procedure for logic regrerssion by sklearn libraries

PROCEDURE: in machine learning for predicting the output which samples


are trained to machine decision line is vital for take the output consideration for
logic regression for data csv format file using sklearn.

PROGRAM:

implement logic regression using sklearn

# Import necessary libraries


import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

# Load your dataset (replace 'data.csv' with your own data)


data = pd.read_csv('data.csv')
X = data.drop(columns=['target_column'])
y = data['target_column']

# Split data into training and test sets


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create a Logistic Regression model


model = LogisticRegression()
# Fit the model with training data
model.fit(X_train, y_train)

# Make predictions on test data


y_pred = model.predict(X_test)

# Evaluate model performance


print(f"Accuracy: {accuracy_score(y_test, y_pred):.2f}")
print(classification_report(y_test, y_pred))

Page 21 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

output:

RESULT: after importing libraries sample data is trained from samples for
predicted expected values towards sample csv file.

Page 22 of 23

Downloaded by Priya priya (priyaa33@gmail.com)


lOMoARcPSD|29324964

9. implementation of k means clustering


AIM: procedure for k means clustering

PROCEDURE: in machine learning for predicting the output which samples


are trained to machine decision line is vital for take the output consideration for k
means for samples under clustering method.

PROGRAM:
implementation of k means clustering
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Load the dataset (replace with your own data)


data = pd.read_csv('your_dataset.csv') # Replace 'your_dataset.csv' with your file path
X = data[['feature1', 'feature2']] # Select relevant features

# Create and fit the K-Means model


kmeans = KMeans(n_clusters=3)

# Choose the number of clusters


kmeans.fit(X)

# Get cluster centers and labels


cluster_centers = kmeans.cluster_centers_
labels = kmeans.labels_

# Visualize the clusters


plt.scatter(X['feature1'], X['feature2'], c=labels, cmap='rainbow')
plt.scatter(cluster_centers[:, 0], cluster_centers[:, 1], color='black', marker='x', s=100)
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('K-Means Clustering')
plt.show()

Output:

RESULT: after importing libraries sample data is trained from samples for
predicted expected values towards sample csv file.

Page 23 of 23

Downloaded by Priya priya (priyaa33@gmail.com)

You might also like