Machine Learning Lab Manual

The document is a lab manual for a Machine Learning lab course. It contains information about the course including the class, branch, faculty involved, and year. It provides a table of contents listing 10 experiments involving concepts like Bayes' theorem, k-nearest neighbors classification, linear regression, Naive Bayes classification, genetic algorithms, and backpropagation. It also includes vision, mission, and program outcomes for the Computer Science department and laboratory instructions for students.


Machine Learning Lab

Department of Computer Science and Engineering

CS604PC: MACHINE LEARNING LAB
LAB MANUAL

Class : III Year II Semester(CSE)

Branch : Computer Science and Engineering

Prepared By : Dr. Ravi Prasad, Dr. Pratap Singh and Mr. K. Lakshminarayana

Year : 2020-21

TABLE OF CONTENTS

S.No  Title

1.    The probability that it is Friday and that a student is absent is 3%. Since there are
      5 school days in a week, the probability that it is Friday is 20%. What is the
      probability that a student is absent given that today is Friday? Apply Bayes' rule
      in Python to get the result. (Ans: 15%)

2.    Extract the data from a database using Python.

3.    Implement k-nearest neighbours classification using Python.

4.    Given the following data, which specify classifications for nine combinations of
      VAR1 and VAR2, predict a classification for a case where VAR1=0.906 and
      VAR2=0.606, using the result of k-means clustering with 3 means (i.e., 3 centroids).

      VAR1    VAR2    CLASS
      1.713   1.586   0
      0.180   1.786   1
      0.353   1.240   1
      0.940   1.566   0
      1.486   0.759   1
      1.266   1.106   0
      1.540   0.419   1
      0.459   1.799   1
      0.773   0.186   1

5.    The following training examples map descriptions of individuals onto high,
      medium and low credit-worthiness.

      medium  skiing    design      single   twenties  no   -> highRisk
      high    golf      trading     married  forties   yes  -> lowRisk
      low     speedway  transport   married  thirties  yes  -> medRisk
      medium  football  banking     single   thirties  yes  -> lowRisk
      high    flying    media       married  fifties   yes  -> highRisk
      low     football  security    single   twenties  no   -> medRisk
      medium  golf      media       single   thirties  yes  -> medRisk
      medium  golf      transport   married  forties   yes  -> lowRisk
      high    skiing    banking     single   thirties  yes  -> highRisk
      low     golf      unemployed  married  forties   yes  -> highRisk

      Input attributes are (from left to right) income, recreation, job, status,
      age-group, home-owner. Find the unconditional probability of `golf' and the
      conditional probability of `single' given `medRisk' in the dataset.

6.    Implement linear regression using Python.

7.    Implement Naïve Bayes theorem to classify the English text.

8.    Implement an algorithm to demonstrate the significance of genetic algorithm.

9.    Implement the finite words classification system using Back-propagation algorithm.

10.   Additional Experiments: Find-S and Candidate Elimination Algorithms.

INSTITUTE VISION
To be an ideal academic institution by graduating talented engineers who are ethically strong and
competent with quality research and technologies.
INSTITUTE MISSION
• Utilize rigorous educational experiences to produce talented engineers.
• Create an atmosphere that facilitates the success of students.
• Programs that integrate global awareness, communication skills and Leadership qualities.
• Education and Research partnership with institutions and industries to prepare the students for
interdisciplinary research.

DEPARTMENT VISION
To empower the students to be technologically adept, innovative, self-motivated and responsible
global citizens possessing human values, and to contribute significantly towards high-quality
technical education in an ever-changing world.

DEPARTMENT MISSION

• To offer high-quality education in the computing fields by providing an environment where the
knowledge is gained and applied to participate in research, for both students and faculty.
• To develop the problem solving skills in the students to be ready to deal with cutting edge
technologies of the industry.

• To make the students and faculty excel in their professional fields by inculcating the
communication skills, leadership skills, team building skills with the organization of various co-
curricular and extra-curricular programmes.

• To provide the students with theoretical and applied knowledge, and adopt an education
approach that promotes lifelong learning and ethical growth.

Programme Educational Objectives (PEOs)

• Learn and Integrate: Graduates shall apply knowledge to solve computer science and allied engineering
problems with continuous learning.
• Think and Create: Graduates are inculcated with a passion towards higher education and research with
social responsibility.
• Communicate and Organize: Graduates shall pursue career in industry, empowered with professional
and interpersonal skills.

PROGRAM OUTCOMES (POs)

PO1 ENGINEERING KNOWLEDGE:

Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization
to the solution of complex engineering problems.
PO2 PROBLEM ANALYSIS:
Identify, formulate, research literature, and analyze complex engineering problems reaching substantiated
conclusions using first principles of mathematics, natural sciences, and engineering sciences.
PO3 DESIGN/DEVELOPMENT OF SOLUTIONS:
Design solutions for complex engineering problems and design system components or processes that meet
the specified needs with appropriate consideration for the public health and safety, and the cultural, societal,
and environmental considerations.
PO4 CONDUCT INVESTIGATIONS OF COMPLEX PROBLEMS:
Use research-based knowledge and research methods including design of experiments, analysis and
interpretation of data, and synthesis of the information to provide valid conclusions.
PO5 MODERN TOOL USAGE:
Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including
prediction and modeling to complex engineering activities with an understanding of the limitations.
PO6 THE ENGINEER AND SOCIETY:
Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural
issues and the consequent responsibilities relevant to the professional engineering practice.
PO7 ENVIRONMENT AND SUSTAINABILITY:
Understand the impact of the professional engineering solutions in societal and environmental contexts, and
demonstrate the knowledge of, and need for sustainable development.
PO8 ETHICS:
Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering
practice.
PO9 INDIVIDUAL AND TEAM WORK:
Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary
settings.
PO10 COMMUNICATION:
Communicate effectively on complex engineering activities with the engineering community and with
society at large, such as, being able to comprehend and write effective reports and design documentation,
make effective presentations, give and receive clear instructions.
PO11 PROJECT MANAGEMENT AND FINANCE:
Demonstrate knowledge and understanding of the engineering and management principles and apply these
to one’s own work, as a member and leader in a team, to manage projects and in multidisciplinary
environments.
PO12 LIFE-LONG LEARNING:
Recognize the need for, and have the preparation and ability to engage in independent and life-long learning
in the broadest context of technological change.
Programme Specific Outcomes (PSOs)
PSO1: Ability to use professional, managerial, inter-disciplinary skill set and domain specific tools in
development processes, identify the research gaps and provide innovative solutions.
PSO2: An ability to succeed in competitive exams like GATE, GRE, IES, etc.

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

GENERAL LABORATORY INSTRUCTIONS

1. Students are advised to come to the laboratory at least 5 minutes before the starting time; those who arrive
more than 5 minutes after the starting time will not be allowed into the lab.
2. Plan your task well before the commencement, and come prepared to the lab with the synopsis /
program / experiment details.
3. Students should enter the laboratory with:
a. Laboratory observation notes with all the details (Problem statement, Aim, Algorithm,
Procedure, Program, Expected Output, etc.) filled in for the lab session.
b. Laboratory Record updated up to the last session's experiments, and other materials (if any)
needed in the lab.
c. Proper dress code and identity card.
4. Sign in the laboratory login register, write the TIME-IN, and occupy the computer system allotted to you by
the faculty.
5. Execute your task in the laboratory, record the results / output in the lab observation notebook, and get it
certified by the concerned faculty.
6. All students should be polite and cooperative with the laboratory staff, and must maintain discipline and
decency in the laboratory.
7. Computer labs are established with sophisticated, high-end branded systems, which should be
utilized properly.
8. Students / Faculty must keep their mobile phones in SWITCHED OFF mode during the lab sessions. Misuse
of the equipment, misbehaviour with the staff and systems, etc., will attract severe punishment.
9. Students must take the permission of the faculty in case of any urgency to go out; anybody found loitering
outside the lab / class without permission during working hours will be treated seriously and
punished appropriately.
10. Students should LOG OFF / SHUT DOWN the computer system before leaving the lab after completing
the task (experiment) in all aspects. They must ensure the system / seat is left in proper order.

Course Outcomes:
After completion of this course the students will be able to

Course Outcome   Bloom's Taxonomy Level   Course Outcome Statement
C326.1           Understand               Understand the complexity of Machine Learning algorithms and their limitations.
C326.2           Understand               Understand modern notions in data analysis-oriented computing.
C326.3           Design                   Be capable of confidently applying common Machine Learning algorithms in practice and implementing their own.
C326.4           Apply                    Be capable of performing experiments in Machine Learning using real-world data.

Experiments mapping with course outcomes

Exp. No: 1
Program description: The probability that it is Friday and that a student is absent is 3%. Since there are 5 school days in a week, the probability that it is Friday is 20%. What is the probability that a student is absent given that today is Friday? Apply Bayes' rule in Python to get the result. (Ans: 15%)
PO: 1, 2, 3, 4, 9
PSO/PIs: 2 / 1.7.1
CO: 1
Justification: To find the conditional probability using Bayes' rule.

Exp. No: 2
Program description: Extract the data from a database using Python.
PO: 1, 2, 3, 4, 9
PSO/PIs: 2 / 1.7.1
CO: 1
Justification: To connect to a database and extract data.

Exp. No: 3
Program description: Implement k-nearest neighbours classification using Python.
PO: 1, 2, 3, 4, 9
PSO/PIs: 2 / 1.7.1
CO: 1
Justification: To implement k-nearest neighbours classification using Python.

Exp. No: 4
Program description: Given the following data, which specify classifications for nine combinations of VAR1 and VAR2, predict a classification for a case where VAR1=0.906 and VAR2=0.606, using the result of k-means clustering with 3 means (i.e., 3 centroids).

VAR1    VAR2    CLASS
1.713   1.586   0
0.180   1.786   1
0.353   1.240   1
0.940   1.566   0
1.486   0.759   1
1.266   1.106   0
1.540   0.419   1
0.459   1.799   1
0.773   0.186   1

PO: 1, 2, 3, 4, 9
PSO/PIs: 2 / 1.7.1
CO: 1, 2
Justification: Understand the concepts of classification and prediction.

Exp. No: 5
Program description: The following training examples map descriptions of individuals onto high, medium and low credit-worthiness.

medium  skiing    design      single   twenties  no   -> highRisk
high    golf      trading     married  forties   yes  -> lowRisk
low     speedway  transport   married  thirties  yes  -> medRisk
medium  football  banking     single   thirties  yes  -> lowRisk
high    flying    media       married  fifties   yes  -> highRisk
low     football  security    single   twenties  no   -> medRisk
medium  golf      media       single   thirties  yes  -> medRisk
medium  golf      transport   married  forties   yes  -> lowRisk
high    skiing    banking     single   thirties  yes  -> highRisk
low     golf      unemployed  married  forties   yes  -> highRisk

Input attributes are (from left to right) income, recreation, job, status, age-group, home-owner. Find the unconditional probability of `golf' and the conditional probability of `single' given `medRisk' in the dataset.
PO: 1, 2, 3, 4, 9
PSO/PIs: 2 / 1.7.1
CO: 1, 2
Justification: Understand the concepts of conditional probability.

Exp. No: 6
Program description: Implement linear regression using Python.
PO: 1, 2, 3, 4, 9
PSO/PIs: 2 / 1.7.1, 2.5.2, 1.6.1
CO: 4
Justification: Apply the regression methods in Python.

Exp. No: 7
Program description: Implement Naïve Bayes theorem to classify the English text.
PO: 1, 2, 3, 4, 9
PSO/PIs: 2 / 1.7.1, 1.2.1
CO: 5
Justification: Understand the concepts of searching operations.

Exp. No: 8
Program description: Implement an algorithm to demonstrate the significance of genetic algorithm.
PO: 1, 2, 3, 4, 9
PSO/PIs: 2 / 1.7.1
CO: 3
Justification: To implement an algorithm to demonstrate the significance of genetic algorithm.

Exp. No: 9
Program description: Implement the finite words classification system using Back-propagation algorithm.
PO: 1, 2, 3, 4, 9
PSO/PIs: 2 / 1.7.1
CO: 3
Justification: To implement the finite words classification system using Back-propagation algorithm.


Course Outcomes Justification

         PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12  PSO1  PSO2
C326.1    2    2    3    2    -    -    -    -    2    -     -     -     -     2
C326.2    2    1    3    2    -    -    -    -    2    -     -     -     2     2
C326.3    2    2    3    1    -    -    -    -    2    -     -     -     2     2
C326.4    2    3    2    2    -    -    -    -    2    -     -     -     -     2

C217.1: Bayes' Rule for Conditional Probability (Understand)

Justification
PO1 Students gain knowledge of various Machine Learning techniques (Level-2)
PO2 Students are able to identify the appropriate Machine Learning technique based on the real-world
problem (Level-2)
PO3 Students can design applications using Python (Level-3)
PO4 Students are able to investigate a complex problem and give solutions
(Level-2)
PO9 Function effectively as an individual to understand the concept. (Level-2)
PSO2 Model appropriate techniques to succeed in competitive exams like GATE, TOEFL and
GRE

C217.2: Extract the data from a database using Python (Understand)

Justification
PO1 Students gain knowledge of various Machine Learning techniques (Level-2)
PO2 Students are able to identify the appropriate Machine Learning technique based on the real-world
problem (Level-1)
PO3 Students can design applications using various Machine Learning techniques (Level-3)
PO4 Students are able to investigate a complex problem and give solutions
(Level-2)
PO9 Function effectively as an individual to understand the concept. (Level-2)
PSO1 Students are able to do research in data structures. (Level 2)
PSO2 Students can attempt GATE exams. (Level 2)


C217.3: Implement linear regression using Python (Design)

Justification
PO1 Students gain knowledge of various data structures (Level-2)
PO2 Students are able to identify the appropriate data structure based on the real-world
problem (Level-2)
PO3 Students can design applications using various data structures (Level-3)
PO4 Students are able to investigate a complex problem and give solutions
(Level-2)
PO9 Function effectively as an individual to understand the concept. (Level-2)
PSO1 Students are able to do research in data structures. (Level 2)
PSO2 Students can attempt GATE exams and competitive exams. (Level 2)

C217.4: Implement Naïve Bayes theorem to classify the English text (Apply)
Justification
PO1 Students gain knowledge of various Machine Learning techniques (Level-2)
PO2 Students are able to analyze a real-world problem and solve it by
using various techniques (Level-3)
PO3 Students can design applications using various optimized techniques (Level-2)
PO4 Students are able to investigate a complex problem and give solutions
(Level-2)
PO9 Function effectively as an individual to understand the concept of Sorting (Level-2)
PSO2 Students can attempt GATE exams and competitive exams. (Level 2)


Experiment :1

1. The probability that it is Friday and that a student is absent is 3%. Since there are 5 school days in a week, the
probability that it is Friday is 20%. What is the probability that a student is absent given that today is Friday?
Apply Bayes' rule in Python to get the result. (Ans: 15%)

ALGORITHM:

Step 1: Read the joint probability P(Friday and Absent) that it is Friday and that a student is absent.
Step 2: Read the probability P(Friday) that it is Friday.
Step 3: Compute the conditional probability P(Absent | Friday) = P(Friday and Absent) / P(Friday).
Step 4: Print the result.
Step 5: End.

PROGRAM:

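# Joint probability P(Friday and Absent)
PFIA = float(input("Enter probability that it is Friday and that a student is absent="))
# Probability P(Friday)
PF = float(input(" probability that it is Friday="))
# Conditional probability P(Absent | Friday) = P(Friday and Absent) / P(Friday)
PABF = PFIA / PF
print("probability that a student is absent given that today is Friday using conditional probabilities=", PABF)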

OUTPUT:

Enter probability that it is Friday and that a student is absent= 0.03

probability that it is Friday= 0.2
probability that a student is absent given that today is Friday using conditional probabilities= 0.15
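
The same calculation can be wrapped in a small reusable function. The sketch below is illustrative only: the function name `conditional_probability` and its input checks are assumptions and are not part of the manual's program.

```python
def conditional_probability(p_joint, p_condition):
    """Return P(A | B) = P(A and B) / P(B)."""
    # Hypothetical input checks: P(B) must be positive and P(A and B) cannot exceed P(B).
    if not 0 < p_condition <= 1:
        raise ValueError("P(B) must be in (0, 1]")
    if not 0 <= p_joint <= p_condition:
        raise ValueError("P(A and B) must be between 0 and P(B)")
    return p_joint / p_condition


# Values from the problem statement: P(Friday and Absent) = 3%, P(Friday) = 20%.
p_absent_given_friday = conditional_probability(0.03, 0.20)
print(f"P(Absent | Friday) = {p_absent_given_friday:.0%}")
```

Rejecting inconsistent inputs (for example a joint probability larger than P(Friday)) avoids reporting a "probability" greater than 1.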


Experiment:2

2. Extract the data from database using python

ALGORITHM:

Step 1: Connect to MySQL from Python

Step 2: Define a SQL SELECT Query
Step 3: Get Cursor Object from Connection
Step 4: Execute the SELECT query using execute() method
Step 5: Extract all rows from a result
Step 6: Iterate each row
Step 7: Close the cursor object and database connection object
Step 8: End.

PROCEDURE

CREATING A DATABASE IN MYSQL AS FOLLOWS:

CREATE DATABASE myDB;

SHOW DATABASES;
USE myDB;
CREATE TABLE MyGuests (id INT, name VARCHAR(20), email VARCHAR(20));
SHOW TABLES;
INSERT INTO MyGuests (id,name,email) VALUES(1,"sairam","xyz@abc.com");
…
SELECT * FROM MyGuests;

We need to install the MySQL connector package to connect Python with MySQL. You can use the command below to
install it on your system.

pip install mysql-connector-python

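PYTHON SOURCE CODE:

import mysql.connector

# Step 1: connect to the MySQL server
mydb = mysql.connector.connect(
    host="localhost",
    user="root",
    password="",
    database="myDB"
)

# Steps 2-4: get a cursor object and execute the SELECT query
mycursor = mydb.cursor()
mycursor.execute("SELECT * FROM MyGuests")

# Step 5: extract all rows from the result
myresult = mycursor.fetchall()

# Step 6: iterate over each row
for x in myresult:
    print(x)

# Step 7: close the cursor object and the database connection object
mycursor.close()
mydb.close()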
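
When no MySQL server is available, the same connect / cursor / execute / fetchall sequence can be tried with Python's built-in `sqlite3` module. This is a swapped-in substitute for practice, not the manual's MySQL setup; the in-memory database and the use of SQLite column types are assumptions.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Re-create the table and the sample row from the procedure above.
cur.execute("CREATE TABLE MyGuests (id INTEGER, name TEXT, email TEXT)")
cur.execute(
    "INSERT INTO MyGuests (id, name, email) VALUES (?, ?, ?)",
    (1, "sairam", "xyz@abc.com"),
)
conn.commit()

# Execute the SELECT query, fetch all rows and iterate over them.
cur.execute("SELECT * FROM MyGuests")
rows = cur.fetchall()
for row in rows:
    print(row)

# Close the cursor and the connection.
cur.close()
conn.close()
```

Each row comes back as a tuple, just as with the MySQL connector, so the loop body does not change between the two databases.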

OUTPUT:


Extracting data from Excel sheet using Python

Step 1: First convert the dataset present in Excel to a CSV file using online resources, then execute the following
program. Consider that the Excel dataset consists of 14 input columns and 3 output columns (C1, C2, C3), as follows:
Python Source Code:
import pandas as pd

dataset = pd.read_csv("Mul_Label_Dataset.csv", delimiter=',')
print(dataset)  # Prints the entire dataset

# Input columns (features)
X = dataset[['Send', 'call', 'DC', 'IFMSCV', 'MSCV', 'BA', 'MBZ', 'TxO',
             'RS', 'CA', 'AL', 'IFWL', 'WWL', 'FWL']].values
# Output columns (labels)
Y = dataset[['C1', 'C2', 'C3']].values
print(Y)  # Prints output values
print(X)  # Prints input values

X1 = dataset[['Send', 'call', 'DC', 'IFMSCV', 'MSCV']].values
print(X1)  # Prints first 5 columns of input values
print(X[0:5])  # Prints only first 5 rows of input values

OUTPUT SCREENS:
[Screenshots: the dataset in Excel format and in CSV format]


Experiment:3

3. Implement k-nearest neighbours classification using python

ALGORITHM:

Step 1: Load the data
Step 2: Initialize the value of k
Step 3: For getting the predicted class, iterate from 1 to the total number of training data points
i) Calculate the distance between the test data and each row of training data. Here we use Euclidean
distance as the distance metric since it is the most popular method. Other metrics that can be
used are Chebyshev, cosine, etc.
ii) Sort the calculated distances in ascending order based on distance values
iii) Get the top k rows from the sorted array
iv) Get the labels of the selected k entries
v) Return the predicted class:
• If regression, return the mean of the k labels
• If classification, return the mode of the k labels
Step 4: End.
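
The steps above, including the final majority vote, can be sketched in a few lines of plain Python. The function name `knn_predict` and the six toy points are hypothetical and chosen only for illustration; they are not taken from the manual.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point, sorted ascending.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Labels of the k closest points, then the most frequent label (the mode).
    k_labels = [label for _, label in distances[:k]]
    return Counter(k_labels).most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (1.1, 1.0), k=3))  # a
print(knn_predict(train_points, train_labels, (5.0, 4.8), k=3))  # b
```

For regression, the last line of the function would return the mean of the k labels instead of the mode.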

PROGRAM

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3d projection on older Matplotlib
from sklearn import datasets

# Load the Iris data set
iris = datasets.load_iris()
data = iris.data
labels = iris.target

# Inspect a few samples
for i in [0, 79, 99, 101]:
    print(f"index: {i:3}, features: {data[i]}, label: {labels[i]}")

# Shuffle the indices and hold out the last 12 samples as the test set
np.random.seed(42)
indices = np.random.permutation(len(data))
n_test_samples = 12
learn_data = data[indices[:-n_test_samples]]
learn_labels = labels[indices[:-n_test_samples]]
test_data = data[indices[-n_test_samples:]]
test_labels = labels[indices[-n_test_samples:]]

print("The first samples of our learn set:")
print(f"{'index':7s}{'data':20s}{'label':3s}")
for i in range(5):
    print(f"{i:4d} {learn_data[i]} {learn_labels[i]:3}")

print("The first samples of our test set:")
print(f"{'index':7s}{'data':20s}{'label':3s}")
for i in range(5):
    print(f"{i:4d} {test_data[i]} {test_labels[i]:3}")

# The following code is only necessary to visualize the data of our learn set.
# The third and fourth features are summed so that the data fits into three dimensions.
X = []
for iclass in range(3):
    X.append([[], [], []])
    for i in range(len(learn_data)):
        if learn_labels[i] == iclass:
            X[iclass][0].append(learn_data[i][0])
            X[iclass][1].append(learn_data[i][1])
            X[iclass][2].append(sum(learn_data[i][2:]))

colours = ("r", "g", "y")

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
for iclass in range(3):
    ax.scatter(X[iclass][0], X[iclass][1], X[iclass][2], c=colours[iclass])
plt.show()
#----------------------------------------------------


def distance(instance1, instance2):
    """ Calculates the Euclidean distance between two instances"""
    return np.linalg.norm(np.subtract(instance1, instance2))


def get_neighbors(training_set, labels, test_instance, k, distance):
    """
    get_neighbors calculates a list of the k nearest neighbors of an instance 'test_instance'.
    The function returns a list of k 3-tuples. Each 3-tuple consists of (instance, dist, label)
    """
    distances = []
    for index in range(len(training_set)):
        dist = distance(test_instance, training_set[index])
        distances.append((training_set[index], dist, labels[index]))
    distances.sort(key=lambda x: x[1])
    neighbors = distances[:k]
    return neighbors


# Show the 3 nearest neighbors of the first five test samples
for i in range(5):
    neighbors = get_neighbors(learn_data, learn_labels, test_data[i], 3, distance=distance)
    print("Index: ", i, '\n',
          "Testset Data: ", test_data[i], '\n',
          "Testset Label: ", test_labels[i], '\n',
          "Neighbors: ", neighbors, '\n')
OUTPUT:

Experiment 4
4. Implement linear regression using python

ALGORITHM:

Step 1: Create a data set for linear regression
Step 2: Find the hypothesis of linear regression
Step 3: Train a linear regression model
Step 4: Evaluate the model
Step 5: Scikit-learn implementation
Step 6: End

PROGRAM:
# Importing necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Generate a random data set
np.random.seed(0)
x = np.random.rand(100, 1)  # 2-D array with 100 rows, each row containing 1 random number
y = 2 + 3 * x + np.random.rand(100, 1)

regression_model = LinearRegression()  # Model initialization
regression_model.fit(x, y)  # Fit the data (train the model)
y_predicted = regression_model.predict(x)  # Predict

# Model evaluation
rmse = np.sqrt(mean_squared_error(y, y_predicted))  # square root of the mean squared error
r2 = r2_score(y, y_predicted)

# Printing values
print('Slope:', regression_model.coef_)
print('Intercept:', regression_model.intercept_)
print('Root mean squared error: ', rmse)
print('R2 score: ', r2)

# Plotting values: data points
plt.scatter(x, y, s=10)
plt.xlabel('x-Values from 0-1')
plt.ylabel('y-values from 2-5')
# Predicted values
plt.plot(x, y_predicted, color='r')
plt.show()
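
To see what the fitted slope and intercept mean, the closed-form least-squares solution for a single feature can be computed by hand. This is a minimal sketch, separate from the scikit-learn program above; the function name `fit_line` and the noise-free sample points are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical noise-free data generated from y = 2 + 3x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 + 3 * x for x in xs]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")  # slope=3.000, intercept=2.000
```

With noisy data, such as the random data set in the program above, the recovered values only approximate the generating slope and intercept.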

OUTPUT:


Experiment 5
5. Implement K-Means_Clustering using python

ALGORITHM:

Step 1: Read the given data sample into X
Step 2: Train on the dataset with K=5
Step 3: Find the optimal number of clusters (k) in the dataset using the Elbow method
Step 4: Train on the dataset with K=3 (optimal K value)
Step 5: Compare results
Step 6: End

PROGRAM:

# Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn import datasets

# Read the data set
df = datasets.load_iris()
x = df.data
y = df.target

print(x)
print(y)

# Let's try with k=5 initially
kmeans5 = KMeans(n_clusters=5)
y_kmeans5 = kmeans5.fit_predict(x)
print(y_kmeans5)

print(kmeans5.cluster_centers_)

# To find the optimal number of clusters (k) in a dataset,
# record the within-cluster sum of squares (inertia) for k = 1..10
Error = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i).fit(x)
    Error.append(kmeans.inertia_)

plt.plot(range(1, 11), Error)
plt.title('Elbow method')
plt.xlabel('No of clusters')
plt.ylabel('Error')
plt.show()

# Now try with k=3 finally
kmeans3 = KMeans(n_clusters=3)
y_kmeans3 = kmeans3.fit_predict(x)
print(y_kmeans3)

print(kmeans3.cluster_centers_)
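
The program above clusters the Iris data. The nine-point VAR1/VAR2 exercise from the experiment list can be approached in the same spirit; the following is a rough sketch under stated assumptions, not the manual's solution. It uses a plain-Python k-means (Lloyd's algorithm), takes the first three points as initial centroids so the run is repeatable, and labels each cluster with the majority class of its members. A different initialization can produce different clusters and therefore a different prediction.

```python
import math
from collections import Counter

# Data from the exercise: (VAR1, VAR2) and CLASS.
points = [(1.713, 1.586), (0.180, 1.786), (0.353, 1.240), (0.940, 1.566), (1.486, 0.759),
          (1.266, 1.106), (1.540, 0.419), (0.459, 1.799), (0.773, 0.186)]
classes = [0, 1, 1, 0, 1, 0, 1, 1, 1]


def nearest(centroids, p):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda j: math.dist(centroids[j], p))


def kmeans(points, k, iterations=100):
    # Assumption: deterministic initialization from the first k points.
    centroids = list(points[:k])
    for _ in range(iterations):
        assignment = [nearest(centroids, p) for p in points]
        new_centroids = []
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, [nearest(centroids, p) for p in points]


centroids, assignment = kmeans(points, k=3)

# Label every cluster with the majority class of the points assigned to it.
cluster_class = {}
for j in range(3):
    votes = Counter(c for c, a in zip(classes, assignment) if a == j)
    cluster_class[j] = votes.most_common(1)[0][0] if votes else None

query = (0.906, 0.606)
cluster = nearest(centroids, query)
print("cluster:", cluster, "predicted class:", cluster_class[cluster])
```

Mapping clusters to classes by majority vote is one common convention; ties within a cluster are broken here by whichever class appears first.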

OUTPUT:


Experiment 6
6. Implement Naive Bayes Theorem to Classify the English Text using python

The Naive Bayes algorithm

Naive Bayes classifiers are a collection of classification algorithms based on Bayes’ Theorem. It is not a
single algorithm but a family of algorithms that all share a common principle, i.e. every pair of
features being classified is independent of each other.
The dataset is divided into two parts, namely, the feature matrix and the response/target vector.
• The feature matrix (X) contains all the vectors (rows) of the dataset, in which each vector consists of
the values of the dependent features. The number of features is d, i.e. X = (x1, x2, ..., xd).
• The response/target vector (y) contains the value of the class/group variable for each row of the feature
matrix.

Now the “naïve” conditional independence assumptions come into play: assume that all features
in X are mutually independent, conditional on the category y:
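
In the notation above, the assumption and the resulting classification rule are usually written as follows. This is the standard textbook statement of naive Bayes, given here as a sketch with the symbols x1, ..., xd and y defined earlier; it is not copied from the manual.

```latex
% Conditional independence of the features given the class y
P(x_1, x_2, \dots, x_d \mid y) = \prod_{i=1}^{d} P(x_i \mid y)

% Bayes' theorem then gives the posterior up to a constant factor
P(y \mid x_1, \dots, x_d) \propto P(y) \prod_{i=1}^{d} P(x_i \mid y)

% Predicted class: the category with the largest posterior
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{d} P(x_i \mid y)
```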

Dealing with text data

The values 0,1,2, encode the frequency of a word that appeared in the initial text data.

E.g. The first transformed row is [0 1 1 1 0 0 1 0 1] and the unique vocabulary is [‘and’, ‘document’,
‘first’, ‘is’, ‘one’, ‘second’, ‘the’, ‘third’, ‘this’], thus this means that the words “document”, “first”, “is”,
“the” and “this” appeared 1 time each in the initial text string (i.e. ‘This is the first document.’).
In our example, we will convert the collection of text documents (train and test sets) into a matrix of token
counts.
To implement that text transformation we will use the make_pipeline function. This will internally transform
the text data and then the model will be fitted using the transformed data.
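
The token-count encoding described above can be reproduced with a short plain-Python sketch. Only the first string comes from the text; the other two documents are assumed here so that the vocabulary matches the nine words listed, and the tokenizer (lower-case words of two or more characters) is likewise an assumption.

```python
import re
from collections import Counter

# First document is from the text above; the other two are assumed for illustration.
corpus = [
    "This is the first document.",
    "This document is the second document.",
    "And this is the third one.",
]


def tokenize(text):
    """Lower-case the text and keep words of two or more characters."""
    return re.findall(r"\b\w\w+\b", text.lower())


# Unique vocabulary over the whole corpus, in alphabetical order.
vocabulary = sorted({token for doc in corpus for token in tokenize(doc)})


def to_counts(text):
    """Encode a text as a vector of word frequencies over the vocabulary."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


print(vocabulary)
print(to_counts(corpus[0]))  # [0, 1, 1, 1, 0, 0, 1, 0, 1]
```

Each position in the vector corresponds to one vocabulary word, so a 1 at index 1 means that "document" occurs once in the first string.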

Source Code
print("NAIVE BAYES ENGLISH TEXT CLASSIFICATION")

import numpy as np, pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.metrics import confusion_matrix, accuracy_score

sns.set()  # use seaborn plotting style

# Load the dataset
data = fetch_20newsgroups()
# Get the text categories
text_categories = data.target_names
# Define the training set
train_data = fetch_20newsgroups(subset="train", categories=text_categories)
# Define the test set
test_data = fetch_20newsgroups(subset="test", categories=text_categories)

print("We have {} unique classes".format(len(text_categories)))
print("We have {} training samples".format(len(train_data.data)))
print("We have {} test samples".format(len(test_data.data)))

# Let's have a look at some test data, e.g. the 5th sample
#print(test_data.data[5])

# Build the model
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
# Train the model using the training data
model.fit(train_data.data, train_data.target)
# Predict the categories of the test data
predicted_categories = model.predict(test_data.data)

print(np.array(test_data.target_names)[predicted_categories])

# Plot the confusion matrix
mat = confusion_matrix(test_data.target, predicted_categories)
sns.heatmap(mat.T, square=True, annot=True, fmt="d",
            xticklabels=train_data.target_names, yticklabels=train_data.target_names)
plt.xlabel("true labels")
plt.ylabel("predicted label")
plt.show()
print("The accuracy is {}".format(accuracy_score(test_data.target, predicted_categories)))

OUTPUT:


Experiment 7
7. Implement an algorithm to demonstrate the significance of Genetic Algorithm in python

ALGORITHM:

1. Individuals in the population compete for resources and mates.
2. Those individuals who are successful (fittest) then mate to create more offspring than others.
3. Genes from the “fittest” parents propagate throughout the generation; that is, sometimes parents create
offspring that are better than either parent.
4. Thus each successive generation is more suited to its environment.

Operators of Genetic Algorithms

Once the initial generation is created, the algorithm evolves it using the following operators:
1) Selection Operator: Give preference to individuals with good fitness scores and allow
them to pass their genes on to successive generations.
2) Crossover Operator: This represents mating between individuals. Two individuals are selected using the
selection operator and crossover sites are chosen randomly. The genes at these crossover sites are then
exchanged, creating a completely new individual (offspring).
3) Mutation Operator: Insert random genes into the offspring to maintain diversity in the
population and avoid premature convergence.
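
The three operators can be sketched in isolation as follows. This is only an illustration under stated assumptions, not the manual's implementation: the function names `select`, `crossover`, `mutate` and `count_zeros`, the tournament size `k`, and the mutation `rate` are all hypothetical choices, single-point crossover is used as one simple way of picking a crossover site, and a lower fitness score is treated as better.

```python
import random

def select(population, fitness, k=3):
    """Tournament selection: sample k individuals and return the fittest (lowest score)."""
    return min(random.sample(population, k), key=fitness)

def crossover(parent1, parent2):
    """Single-point crossover: splice the two parents at a random site."""
    site = random.randint(1, len(parent1) - 1)
    return parent1[:site] + parent2[site:]

def mutate(chromosome, genes, rate=0.1):
    """Replace each gene by a random one with probability `rate`."""
    return [random.choice(genes) if random.random() < rate else g
            for g in chromosome]

def count_zeros(chromosome):
    """Toy fitness: number of zeros left in the chromosome (0 is a perfect score)."""
    return chromosome.count(0)

# Toy usage: chromosomes are lists of 8 bits
population = [[random.randint(0, 1) for _ in range(8)] for _ in range(10)]
child = mutate(crossover(select(population, count_zeros),
                         select(population, count_zeros)), genes=[0, 1])
print(child)
```

Selection biases the parents toward good scores, crossover recombines them, and mutation occasionally injects a gene that neither parent carried.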

30
Machine Learning Lab

Given a target string, the goal is to produce the target string starting from a random string of the same length. In
the following implementation, the following analogies are made:
• Characters A-Z, a-z, 0-9 and other special symbols are considered genes.
• A string generated from these characters is considered a chromosome/solution/individual.

The fitness score is the number of positions at which the characters differ from those of the target string. An
individual with a lower fitness value is therefore given more preference.
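
As a small illustration of this fitness score, the sketch below counts mismatched positions between a candidate and a target. The function name `fitness` and the example strings are assumptions made for the illustration.

```python
def fitness(candidate, target):
    """Number of positions where candidate differs from target (0 = perfect match)."""
    return sum(1 for c, t in zip(candidate, target) if c != t)

print(fitness("hxllo", "hello"))  # 1 (one mismatch, at index 1)
print(fitness("hello", "hello"))  # 0 (identical strings)
```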

Source Code
# Python3 program to create target string, starting from
# random string using Genetic Algorithm

import random

# Number of individuals in each generation

POPULATION_SIZE = 100

# Valid genes
GENES = '''abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOP
QRSTUVWXYZ 1234567890, .-;:_!"#%&/()=?@${[]}'''

# Target string to be generated

TARGET = "I love GeeksforGeeks"

class Individual(object):
    '''
    Class representing an individual in the population
    '''
    def __init__(self, chromosome):
        self.chromosome = chromosome
        self.fitness = self.cal_fitness()

    @classmethod
    def mutated_genes(cls):
        '''
        Create a random gene for mutation
        '''
        return random.choice(GENES)

    @classmethod
    def create_gnome(cls):
        '''
        Create a chromosome (list of genes) as long as the target
        '''
        gnome_len = len(TARGET)
        return [cls.mutated_genes() for _ in range(gnome_len)]

    def mate(self, par2):
        '''
        Perform mating and produce a new offspring
        '''
        # chromosome for offspring
        child_chromosome = []
        for gp1, gp2 in zip(self.chromosome, par2.chromosome):
            # random probability
            prob = random.random()

            # if prob is less than 0.45, insert gene from parent 1
            if prob < 0.45:
                child_chromosome.append(gp1)
            # if prob is between 0.45 and 0.90, insert gene from parent 2
            elif prob < 0.90:
                child_chromosome.append(gp2)
            # otherwise insert a random gene (mutate) to maintain diversity
            else:
                child_chromosome.append(self.mutated_genes())

        # create a new Individual (offspring) from the generated chromosome
        return Individual(child_chromosome)

    def cal_fitness(self):
        '''
        Calculate the fitness score: the number of characters
        in the chromosome that differ from the target string.
        '''
        fitness = 0
        for gs, gt in zip(self.chromosome, TARGET):
            if gs != gt:
                fitness += 1
        return fitness

# Driver code
def main():
    # current generation
    generation = 1

    found = False
    population = []

    # create initial population
    for _ in range(POPULATION_SIZE):
        gnome = Individual.create_gnome()
        population.append(Individual(gnome))

    while not found:
        # sort the population in increasing order of fitness score
        population = sorted(population, key=lambda x: x.fitness)

        # if the best individual has fitness 0, we have reached
        # the target, so break out of the loop
        if population[0].fitness <= 0:
            found = True
            break

        # otherwise generate offspring for the new generation
        new_generation = []

        # perform elitism: the fittest 10% of the population
        # goes to the next generation unchanged
        s = int((10*POPULATION_SIZE)/100)
        new_generation.extend(population[:s])

        # individuals from the fittest 50% of the population
        # mate to produce the remaining 90% as offspring
        s = int((90*POPULATION_SIZE)/100)
        half = POPULATION_SIZE // 2
        for _ in range(s):
            parent1 = random.choice(population[:half])
            parent2 = random.choice(population[:half])
            child = parent1.mate(parent2)
            new_generation.append(child)

        population = new_generation

        print("Generation: {}\tString: {}\tFitness: {}".format(
            generation,
            "".join(population[0].chromosome),
            population[0].fitness))

        generation += 1

    print("Generation: {}\tString: {}\tFitness: {}".format(
        generation,
        "".join(population[0].chromosome),
        population[0].fitness))

if __name__ == '__main__':
    main()

OUTPUT:


Experiment 8
8. Implement an algorithm to demonstrate Back Propagation Algorithm in python

ALGORITHM:

Backpropagation is the most widely used algorithm for training artificial neural networks.
In the simplest scenario, the architecture of a neural network consists of some sequential layers, where the
layer numbered i is connected to the layer numbered i+1. The layers can be classified into 3 classes:
1. Input
2. Hidden
3. Output

Usually, each neuron in the hidden layer uses an activation function such as sigmoid or rectified linear unit
(ReLU). This helps the network capture non-linear relationships between the inputs and their outputs.
The neurons in the output layer also use activation functions, such as sigmoid (for outputs bounded between
0 and 1) or softmax (for multi-class classification).
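
The activation functions named above can be written in a few lines. The sketch below uses only the standard `math` module and plain Python numbers; the function names and sample inputs are assumptions made for the illustration.

```python
import math

def sigmoid(x):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Pass positive values through and clamp negative values to 0."""
    return max(0.0, x)

def softmax(xs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

print(sigmoid(0.0))            # 0.5
print(relu(-2.0), relu(3.0))   # 0.0 3.0
print(softmax([1.0, 1.0]))     # [0.5, 0.5]
```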
To train a neural network, there are 2 passes (phases):
• Forward
• Backward
The forward and backward phases are repeated for a number of epochs. In each epoch, the following occurs:
1. The inputs are propagated from the input layer to the output layer.
2. The network error is calculated.
3. The error is propagated back from the output layer to the input layer.
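
The three steps of an epoch can be sketched for a single sigmoid neuron with two inputs. The inputs, target, and learning rate match the values used in the source code of this experiment; the starting weights of 0.5 and the limit of three epochs are assumptions made only for this illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x1, x2, target = 0.1, 0.4, 0.7   # inputs and desired output
w1, w2 = 0.5, 0.5                # assumed starting weights (illustrative only)
learning_rate = 0.01

errors = []
for epoch in range(3):
    # 1. forward pass: propagate the inputs to the output
    z = w1 * x1 + w2 * x2
    predicted = sigmoid(z)
    # 2. compute the network error (squared error)
    errors.append((predicted - target) ** 2)
    # 3. backward pass: propagate the error back to the weights
    delta = 2 * (predicted - target) * predicted * (1 - predicted)
    w1 -= learning_rate * delta * x1
    w2 -= learning_rate * delta * x2

print(errors)  # the error shrinks slightly with every epoch
```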

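Knowing that there's an error, what should we do? We should minimize it. To minimize the network error, we
must change something in the network. Remember that the only parameters we can change are the weights
and biases. We can try different weights and biases, and then test our network; backpropagation makes this
search systematic by using the derivative of the error with respect to each parameter to decide how to adjust it.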
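
For a single sigmoid neuron with two inputs and a squared error, which is the setup used by the source code in this experiment, the adjustment of each weight follows from the chain rule. The symbols $s$ (sum of products), $\hat{y}$ (prediction), $t$ (target) and $\eta$ (learning rate) are notation introduced here for the sketch.

```latex
\begin{aligned}
s &= w_1 x_1 + w_2 x_2, \qquad \hat{y} = \sigma(s) = \frac{1}{1 + e^{-s}}, \qquad E = (\hat{y} - t)^2 \\
\frac{\partial E}{\partial w_i}
  &= \frac{\partial E}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial s} \cdot \frac{\partial s}{\partial w_i}
   = 2(\hat{y} - t) \cdot \sigma(s)\bigl(1 - \sigma(s)\bigr) \cdot x_i \\
w_i &\leftarrow w_i - \eta \, \frac{\partial E}{\partial w_i}
\end{aligned}
```

The three factors correspond to the helper functions `error_predicted_deriv`, `sigmoid_sop_deriv` and `sop_w_deriv` in the source code, and the last line corresponds to `update_w`.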


Source Code:
import numpy
import matplotlib.pyplot as plt

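def sigmoid(sop):
    # sop = sum of products (weighted sum of the inputs)
    return 1.0/(1+numpy.exp(-1*sop))

def error(predicted, target):
    # squared error
    return numpy.power(predicted-target, 2)

def error_predicted_deriv(predicted, target):
    # derivative of the squared error w.r.t. the prediction
    return 2*(predicted-target)

def sigmoid_sop_deriv(sop):
    # derivative of the sigmoid w.r.t. its input
    return sigmoid(sop)*(1.0-sigmoid(sop))

def sop_w_deriv(x):
    # derivative of the sum of products w.r.t. a weight is that weight's input
    return x

def update_w(w, grad, learning_rate):
    # gradient descent step
    return w - learning_rate*grad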

x1=0.1
x2=0.4

target = 0.7
learning_rate = 0.01

w1=numpy.random.rand()
w2=numpy.random.rand()

print("Initial W : ", w1, w2)

predicted_output = []
network_error = []

for k in range(80000):
    # Forward Pass
    y = w1*x1 + w2*x2
    predicted = sigmoid(y)
    err = error(predicted, target)

    predicted_output.append(predicted)
    network_error.append(err)

    # Backward Pass (chain rule)
    g1 = error_predicted_deriv(predicted, target)
    g2 = sigmoid_sop_deriv(y)

    g3w1 = sop_w_deriv(x1)
    g3w2 = sop_w_deriv(x2)

    gradw1 = g3w1*g2*g1
    gradw2 = g3w2*g2*g1

    w1 = update_w(w1, gradw1, learning_rate)
    w2 = update_w(w2, gradw2, learning_rate)

    #print(predicted)

plt.figure()
plt.plot(network_error)
plt.title("Iteration Number vs Error")
plt.xlabel("Iteration Number")
plt.ylabel("Error")
plt.show()

plt.figure()
plt.plot(predicted_output)
plt.title("Iteration Number vs Prediction")
plt.xlabel("Iteration Number")
plt.ylabel("Prediction")
plt.show()


OUTPUT:
Initial W : 0.08698924153243281 0.4532713230157145


Experiment 9
9. Implementing FIND-S algorithm using python

Training Database

Algorithm

1. Initialize h to the most specific hypothesis in H

2. For each positive training instance x
      For each attribute constraint a_i in h
         If the constraint a_i is satisfied by x
            Then do nothing
         Else replace a_i in h by the next more general constraint that is satisfied by x
3. Output hypothesis h
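
The steps above can be sketched on a small in-memory data set. The function name `find_s` and the three toy examples are hypothetical and are not the contents of the manual's enjoysport.csv; labels are assumed to be the strings 'TRUE' and 'FALSE', '0' stands for the most specific constraint and '?' for "any value".

```python
def find_s(examples):
    """Return the maximally specific hypothesis consistent with the positive examples.

    Each example is a list of attribute values followed by a 'TRUE'/'FALSE' label.
    """
    n = len(examples[0]) - 1
    h = ['0'] * n                      # '0' = most specific constraint (matches nothing)
    for *attrs, label in examples:
        if label != 'TRUE':            # negative examples are ignored
            continue
        for j, value in enumerate(attrs):
            if h[j] == '0':
                h[j] = value           # first positive example: copy its value
            elif h[j] != value:
                h[j] = '?'             # disagreement: generalise to "any value"
    return h

# Hypothetical toy data (not the manual's enjoysport.csv)
examples = [
    ['Sunny', 'Warm', 'Normal', 'TRUE'],
    ['Sunny', 'Warm', 'High',   'TRUE'],
    ['Rainy', 'Cold', 'High',   'FALSE'],
]
print(find_s(examples))  # ['Sunny', 'Warm', '?']
```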
-----------------------------------------------------------------------------------------------

Hypothesis Construction


Source Code:
import csv

a = []  # training examples read from the CSV file
with open('enjoysport.csv', 'r') as csvfile:
    for row in csv.reader(csvfile):
        a.append(row)
    print(a)
print("\n The total number of training instances are : ",len(a))
num_attribute = len(a[0])-1
print("\n The initial hypothesis is : ")
hypothesis = ['0']*num_attribute
print(hypothesis)
for i in range(0, len(a)):
    if a[i][num_attribute] == 'TRUE': #for each positive example only
        for j in range(0, num_attribute):
            if hypothesis[j] == '0' or hypothesis[j] == a[i][j]:
                hypothesis[j] = a[i][j]
            else:
                hypothesis[j] = '?'
    print("\n The hypothesis for the training instance {} is : \n".format(i+1),hypothesis)
print("\n The Maximally specific hypothesis for the training instance is ")
print(hypothesis)

OUTPUT:


Experiment 10
10. Implementing Candidate Elimination algorithm using python

Training Database

Algorithm


Source Code:
import csv

with open("enjoysport.csv") as f:
    csv_file=csv.reader(f)
    data=list(csv_file)

print(data)
print("--------------------")
s=data[1][:-1] #extracting one row or instance or record
g=[['?' for i in range(len(s))] for j in range(len(s))]

print(s)
print("--------------------")
print(g)
print("--------------------")

for step, i in enumerate(data, start=1):
    if i[-1]=="TRUE": # For each positive training record or instance
        for j in range(len(s)):
            if i[j]!=s[j]:
                s[j]='?'
                g[j][j]='?'

    elif i[-1]=="FALSE": # For each negative training record or example
        for j in range(len(s)):
            if i[j]!=s[j]:
                g[j][j]=s[j]
            else:
                g[j][j]="?"
    print("\nSteps of Candidate Elimination Algorithm",step)
    print(s)
    print(g)

gh=[] # keep only the general hypotheses that constrain at least one attribute
for i in g:
    for j in i:
        if j!='?':
            gh.append(i)
            break
print("\nFinal specific hypothesis:\n",s)
print("\nFinal general hypothesis:\n",gh)

OUTPUT:
