Updated ML Lab Manual
Lab Ethics
Always:
▪ Enter the lab on time and leave at the proper time.
▪ Wait for the previous class to leave before the next class enters.
▪ Keep the bag outside in the respective racks.
▪ Utilize lab hours for the corresponding lab work only.
▪ Turn off the machine before leaving the lab unless a member of lab staff has specifically told you not to do
so.
▪ Leave the labs at least as nice as you found them.
▪ If you notice a problem with a piece of equipment (e.g. a computer doesn't respond) or the room in general
(e.g. cooling, heating, lighting) please report it to lab staff immediately. Do not attempt to fix the problem
yourself.
Never:
▪ Don't abuse the equipment.
▪ Do not adjust the heat or air conditioners. If you feel the temperature is not properly set, inform lab staff; we
will attempt to maintain a balance that is healthy for people and machines.
▪ Do not attempt to reboot a computer. Report problems to lab staff.
▪ Do not remove or modify any software or file without permission.
▪ Do not remove printers and machines from the network without being explicitly told to do so by lab staff.
▪ Don't monopolize equipment. If you're going to be away from your machine for more than 10 or 15 minutes,
log out before leaving. This is both for the security of your account, and to ensure that others are able to use
the lab resources while you are not.
▪ Don't use the internet or internet chat of any kind during your regular lab schedule.
▪ Do not download or upload MP3, JPG, or MPEG files.
▪ No games are allowed in the lab sessions.
▪ No hardware including USB drives can be connected or disconnected in the labs without prior permission of
the lab in-charge.
▪ No food or drink is allowed in the lab or near any of the equipment. Aside from the fact that it leaves a mess
and attracts pests, spilling anything on a keyboard or other piece of computer equipment could cause
permanent, irreparable, and costly damage (and in fact has). If you need to eat or drink, take a break and do
so in the canteen.
▪ Don't bring any external material into the lab, except your lab record, copy, and books.
▪ Don't bring mobile phones into the lab. If necessary, keep them in silent mode.
▪ Please be considerate of those around you, especially in terms of noise level. While labs are a natural place
for conversations of all types, kindly keep the volume turned down.
If you are having problems or questions, please go to the faculty, the lab in-charge, or the lab supporting
staff. They will help you. We need your full support and cooperation for the smooth functioning of the lab.
Instructions
BEFORE ENTERING THE LAB
▪ All the students are supposed to prepare the theory for the next experiment/program.
▪ Students are supposed to bring their lab records as per their lab schedule.
▪ Previous experiment/program should be written in the lab record.
▪ If applicable, trace paper/graph paper must be pasted in the lab record with proper labeling.
▪ All the students must follow the instructions, failing which they may not be allowed in the lab.
Marking/Assessment System
Total Marks -100
Distribution of Marks - 60 (Sessional)
Attendance   File Work   Performance   Viva   Total
10           10          20            20     60
Lab Plan
Total number of experiments: 10
Total number of turns required: 10
Experiment Number Turns Scheduled Day
Exp. 1 1 Day 1
Exp. 2 1 Day 2
Exp. 3 1 Day 3
Exp. 4 1 Day 4
Exp. 5 1 Day 5
Exp. 6 1 Day 6
Exp. 7 1 Day 7
Exp. 8 1 Day 8
Exp. 9 1 Day 9
Exp. 10 1 Day 10
Exp. 11 1 Day 10
● Learning basic concepts of Python through illustrative examples and small exercises.
● To prepare students to become familiar with Python programming in an AI environment.
● To provide students with an academic environment aware of various AI algorithms.
● To train students in Python programming so that they can comprehend, analyze, design, and create AI
platforms and solutions for real-life problems.
Machine learning tasks are typically classified into two broad categories, depending on whether
there is a learning "signal" or "feedback" available to a learning system:
Supervised learning: The computer is presented with example inputs and their desired outputs,
given by a "teacher", and the goal is to learn a general rule that maps inputs to outputs. As
special cases, the input signal can be only partially available, or restricted to special feedback:
Semi-supervised learning: The computer is given only an incomplete training signal: a training
set with some (often many) of the target outputs missing.
Active learning: The computer can only obtain training labels for a limited set of instances (based
on a budget), and also has to optimize its choice of objects to acquire labels for. When used
interactively, these can be presented to the user for labeling.
Reinforcement learning: Training data (in the form of rewards and punishments) is given only as
feedback to the program's actions in a dynamic environment, such as driving a vehicle or
playing a game against an opponent.
Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own to
find structure in its input. Unsupervised learning can be a goal in itself (discovering hidden
patterns in data) or a means towards an end (feature learning).
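As a quick illustration of the supervised/unsupervised split (a minimal sketch; the toy feature matrix and the scikit-learn usage here are illustrative assumptions, not part of the lab programs):

from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans

X = [[1, 2], [1, 4], [8, 8], [9, 10]]   # toy inputs (made up for illustration)
y = [0, 0, 1, 1]                        # labels available -> supervised setting

clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)  # learns the input-output rule
print(clf.predict([[8, 9]]))            # -> [1]

km = KMeans(n_clusters=2, n_init=10).fit(X)  # no labels -> unsupervised setting
print(km.labels_)                       # structure discovered from inputs alone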
1. Implement and demonstrate the FIND-S algorithm for finding the most specific hypothesis
based on a given set of training data samples. Read the training data from a .CSV file.

import csv

# Read the training examples (file name assumed; last column is the class label)
with open('trainingdata.csv') as f:
    your_list = list(csv.reader(f))

# Start from the most specific hypothesis: every attribute maximally constrained
h = [['0' for _ in range(len(your_list[0]) - 1)]]

for i in your_list:
    print(i)
    if i[-1] == "True":           # FIND-S: only positive examples generalize h
        j = 0
        for x in i:
            if x != "True":
                if x != h[0][j] and h[0][j] == '0':
                    h[0][j] = x   # first positive example: copy the attribute value
                elif x != h[0][j] and h[0][j] != '0':
                    h[0][j] = '?' # conflicting values: generalize to '?'
                j = j + 1

print("Most specific hypothesis is")
print(h)
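For reference, a trainingdata.csv in the EnjoySport style that this program can consume might look as follows (the file name is the assumption made in the listing above; the rows mirror the data set used in Program 2):

sunny,warm,normal,strong,warm,same,True
sunny,warm,high,strong,warm,same,True
rainy,cold,high,strong,warm,change,False
sunny,warm,high,strong,cool,change,True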
Output
2. For a given set of training data examples stored in a .CSV file, implement and
demonstrate the Candidate-Elimination algorithm to output a description of the set of all
hypotheses consistent with the training examples.
class Holder:
    factors = {}      # dictionary mapping each attribute to its possible values
    attributes = ()   # tuple of attribute names

    '''
    Constructor of class Holder holding two parameters,
    self refers to the instance of the class
    '''
    def __init__(self, attr):
        self.attributes = attr
        for i in attr:
            self.factors[i] = []

    def add_values(self, factor, values):
        self.factors[factor] = values

class CandidateElimination:
    Positive = {}  # Initialize positive empty dictionary
    Negative = {}  # Initialize negative empty dictionary

    def __init__(self, data, fact):
        self.num_factors = len(data[0][0])
        self.factors = fact.factors
        self.attr = fact.attributes
        self.dataset = data

    def run_algorithm(self):
        # (Only the tail of this method survived in the source; the body
        #  below restores the standard Candidate-Elimination loop.)
        G = self.initializeG()
        S = self.initializeS()
        for trial_set in self.dataset:
            if self.is_positive(trial_set):  # if it is positive
                G = self.remove_inconsistent_G(G, trial_set[0])
                S_new = S[:]
                for s in S:
                    if not self.consistent(s, trial_set[0]):
                        S_new.remove(s)
                        generalization = self.generalize_inconsistent_S(s, trial_set[0])
                        generalization = self.get_general(generalization, G)
                        if generalization:
                            S_new.append(generalization)
                S = self.remove_more_general(S_new)
            else:  # if it is negative
                S = self.remove_inconsistent_S(S, trial_set[0])
                G_new = G[:]
                for g in G:
                    if self.consistent(g, trial_set[0]):
                        G_new.remove(g)
                        specializations = self.specialize_inconsistent_G(g, trial_set[0])
                        specializations = self.get_specific(specializations, S)
                        if specializations != []:
                            G_new += specializations
                G = self.remove_more_specific(G_new)
        print(S)
        print(G)
    def initializeS(self):
        ''' Initialize the specific boundary '''
        S = tuple(['-' for factor in range(self.num_factors)])  # 6 constraints in the vector
        return [S]

    def initializeG(self):
        ''' Initialize the general boundary '''
        G = tuple(['?' for factor in range(self.num_factors)])  # 6 constraints in the vector
        return [G]
    def is_positive(self, trial_set):
        ''' Check if a given training trial_set is positive '''
        if trial_set[1] == 'Y':
            return True
        elif trial_set[1] == 'N':
            return False
        else:
            raise TypeError("invalid target value")

    def match_factor(self, value1, value2):
        ''' Check whether the factor values match, necessary
        while checking the consistency of a training
        trial_set with the hypothesis '''
        if value1 == '?' or value2 == '?':
            return True
        elif value1 == value2:
            return True
        return False

    def consistent(self, hypothesis, instance):
        ''' Check whether the instance is part of the hypothesis '''
        for i, factor in enumerate(hypothesis):
            if not self.match_factor(factor, instance[i]):
                return False
        return True

    def remove_inconsistent_G(self, hypotheses, instance):
        ''' For a positive trial_set, remove the hypotheses in G that are
        inconsistent with it (def line restored; only the body survived
        in the source) '''
        G_new = hypotheses[:]
        for g in hypotheses:
            if not self.consistent(g, instance):
                G_new.remove(g)
        return G_new

    def remove_inconsistent_S(self, hypotheses, instance):
        ''' For a negative trial_set, remove the hypotheses in S that are
        consistent with it (assumed counterpart, called by run_algorithm) '''
        S_new = hypotheses[:]
        for s in hypotheses:
            if self.consistent(s, instance):
                S_new.remove(s)
        return S_new
    def remove_more_general(self, hypotheses):
        ''' After generalizing S for a positive trial_set, the hypotheses in S
        more general than others in S should be removed (inner loop restored
        by analogy with remove_more_specific below) '''
        S_new = hypotheses[:]
        for old in hypotheses:
            for new in S_new:
                if old != new and self.more_general(new, old):
                    S_new.remove(new)
        return S_new

    def remove_more_specific(self, hypotheses):
        ''' After specializing G for a negative trial_set, the hypotheses in G
        more specific than others in G should be removed '''
        G_new = hypotheses[:]
        for old in hypotheses:
            for new in G_new:
                if old != new and self.more_specific(new, old):
                    G_new.remove(new)
        return G_new
    def generalize_inconsistent_S(self, hypothesis, instance):
        ''' When an inconsistent hypothesis for a positive trial_set is seen in the
        specific boundary S, it should be generalized to be consistent with the
        trial_set ... we will get one hypothesis '''
        hypo = list(hypothesis)  # convert tuple to list for mutability
        for i, factor in enumerate(hypo):
            if factor == '-':
                hypo[i] = instance[i]
            elif not self.match_factor(factor, instance[i]):
                hypo[i] = '?'
        generalization = tuple(hypo)  # convert list back to tuple for immutability
        return generalization

    def specialize_inconsistent_G(self, hypothesis, instance):
        ''' When an inconsistent hypothesis for a negative trial_set is seen in the
        general boundary G, it should be specialized to be consistent with the
        trial_set ... we will get a set of hypotheses '''
        specializations = []
        hypo = list(hypothesis)  # convert tuple to list for mutability
        for i, factor in enumerate(hypo):
            if factor == '?':
                values = self.factors[self.attr[i]]
                for j in values:
                    if instance[i] != j:
                        hyp = hypo[:]
                        hyp[i] = j
                        hyp = tuple(hyp)  # convert list back to tuple for immutability
                        specializations.append(hyp)
        return specializations
    def get_general(self, generalization, G):
        ''' Checks if there is a more general hypothesis in G
        for a generalization of an inconsistent hypothesis in S
        in case of a positive trial_set, and returns the valid generalization '''
        for g in G:
            if self.more_general(g, generalization):
                return generalization
        return None

    def get_specific(self, specializations, S):
        ''' Checks if there is a more specific hypothesis in S for
        each hypothesis in the specializations of an
        inconsistent hypothesis in G in case of a negative trial_set,
        and returns the valid specializations '''
        valid_specializations = []
        for hypo in specializations:
            for s in S:
                if self.more_specific(s, hypo) or s == self.initializeS()[0]:
                    valid_specializations.append(hypo)
        return valid_specializations

    def exists_general(self, hypothesis, G):
        ''' Used to check if there exists a more general hypothesis in
        the general boundary of the version space '''
        for g in G:
            if self.more_general(g, hypothesis):
                return True
        return False

    def exists_specific(self, hypothesis, S):
        ''' Used to check if there exists a more specific hypothesis in
        the specific boundary of the version space '''
        for s in S:
            if self.more_specific(s, hypothesis):
                return True
        return False

    def more_general(self, hyp1, hyp2):
        ''' Check whether hyp1 is more general than hyp2 '''
        hyp = zip(hyp1, hyp2)
        for i, j in hyp:
            if i == '?':
                continue
            elif j == '?':
                if i != '?':
                    return False
            elif i != j:
                return False
            else:
                continue
        return True

    def more_specific(self, hyp1, hyp2):
        ''' hyp1 being more specific than hyp2 is equivalent to hyp2 being
        more general than hyp1 (assumed helper: it is called throughout
        the class but its definition did not survive in the source) '''
        return self.more_general(hyp2, hyp1)
dataset = [(('sunny','warm','normal','strong','warm','same'),'Y'),
           (('sunny','warm','high','strong','warm','same'),'Y'),
           (('rainy','cold','high','strong','warm','change'),'N'),
           (('sunny','warm','high','strong','cool','change'),'Y')]
attributes = ('Sky','Temp','Humidity','Wind','Water','Forecast')
f = Holder(attributes)
f.add_values('Sky',('sunny','rainy','cloudy'))  # Sky can be sunny, rainy or cloudy
f.add_values('Temp',('cold','warm'))            # Temp can be cold or warm
f.add_values('Humidity',('normal','high'))      # Humidity can be normal or high
f.add_values('Wind',('weak','strong'))          # Wind can be weak or strong
f.add_values('Water',('warm','cold'))           # Water can be warm or cold
f.add_values('Forecast',('same','change'))      # Forecast can be same or change
a = CandidateElimination(dataset, f)  # pass the dataset to the algorithm class
a.run_algorithm()                     # and call the run_algorithm method
Output
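For this EnjoySport data the boundaries should converge to the classic result from Mitchell's textbook: S = [('sunny', 'warm', '?', 'strong', '?', '?')] and G = [('sunny', '?', '?', '?', '?', '?'), ('?', 'warm', '?', '?', '?', '?')].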
3. Write a program to demonstrate the working of the decision tree based ID3 algorithm.
Use an appropriate data set for building the decision tree and apply this knowledge
to classify a new sample.
import numpy as np
import math
from data_loader import read_data

class Node:
    def __init__(self, attribute):
        self.attribute = attribute
        self.children = []
        self.answer = ""

    def __str__(self):
        return self.attribute

def subtables(data, col, delete):
    # (def line and the counting loop restored; only the second half of
    #  this function survived in the source)
    dict = {}
    items = np.unique(data[:, col])
    count = np.zeros((items.shape[0], 1), dtype=np.int32)
    for x in range(items.shape[0]):
        for y in range(data.shape[0]):
            if data[y, col] == items[x]:
                count[x] += 1
    for x in range(items.shape[0]):
        dict[items[x]] = np.empty((int(count[x]), data.shape[1]), dtype="|S32")
        pos = 0
        for y in range(data.shape[0]):
            if data[y, col] == items[x]:
                dict[items[x]][pos] = data[y]
                pos += 1
        if delete:
            dict[items[x]] = np.delete(dict[items[x]], col, 1)
    return items, dict

def entropy(S):
    items = np.unique(S)
    if items.size == 1:
        return 0
    counts = np.zeros((items.shape[0], 1))
    sums = 0
    for x in range(items.shape[0]):
        counts[x] = sum(S == items[x]) / (S.size * 1.0)
    for count in counts:
        sums += -1 * count * math.log(count, 2)
    return sums

def gain_ratio(data, col):
    items, dict = subtables(data, col, delete=False)
    total_size = data.shape[0]
    entropies = np.zeros((items.shape[0], 1))
    intrinsic = np.zeros((items.shape[0], 1))
    for x in range(items.shape[0]):
        ratio = dict[items[x]].shape[0] / (total_size * 1.0)
        entropies[x] = ratio * entropy(dict[items[x]][:, -1])
        intrinsic[x] = ratio * math.log(ratio, 2)
    total_entropy = entropy(data[:, -1])
    iv = -1 * sum(intrinsic)
    for x in range(entropies.shape[0]):
        total_entropy -= entropies[x]
    return total_entropy / iv

def create_node(data, metadata):
    # (base case and gains computation restored from the standard program)
    if (np.unique(data[:, -1])).shape[0] == 1:
        node = Node("")
        node.answer = np.unique(data[:, -1])[0]
        return node
    gains = np.zeros((data.shape[1] - 1, 1))
    for col in range(data.shape[1] - 1):
        gains[col] = gain_ratio(data, col)
    split = np.argmax(gains)
    node = Node(metadata[split])
    metadata = np.delete(metadata, split, 0)
    items, dict = subtables(data, split, delete=True)
    for x in range(items.shape[0]):
        child = create_node(dict[items[x]], metadata)
        node.children.append((items[x], child))
    return node

def empty(size):
    s = ""
    for x in range(size):
        s += " "
    return s

def print_tree(node, level):
    # (assumed driver, matching the tree-shaped output shown below)
    if node.answer != "":
        print(empty(level), node.answer)
        return
    print(empty(level), node.attribute)
    for value, n in node.children:
        print(empty(level + 1), value)
        print_tree(n, level + 2)

metadata, traindata = read_data("tennis.csv")
data = np.array(traindata)
node = create_node(data, metadata)
print_tree(node, 0)
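The two functions above implement ID3's split criterion in its gain-ratio form: Entropy(S) = -Σ_i p_i log2(p_i), Gain(S, A) = Entropy(S) - Σ_v (|S_v|/|S|) Entropy(S_v), and GainRatio(S, A) = Gain(S, A) / IV(A), where the intrinsic value IV(A) = -Σ_v (|S_v|/|S|) log2(|S_v|/|S|). These correspond directly to the entropies, intrinsic, and iv variables in gain_ratio.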
Data_loader.py

import csv

def read_data(filename):
    with open(filename, 'r') as csvfile:
        datareader = csv.reader(csvfile, delimiter=',')
        headers = next(datareader)
        metadata = []
        traindata = []
        for name in headers:
            metadata.append(name)
        for row in datareader:
            traindata.append(row)
    return (metadata, traindata)
Tennis.csv

outlook,temperature,humidity,wind,answer
sunny,hot,high,weak,no
sunny,hot,high,strong,no
overcast,hot,high,weak,yes
rain,mild,high,weak,yes
rain,cool,normal,weak,yes
rain,cool,normal,strong,no
overcast,cool,normal,strong,yes
sunny,mild,high,weak,no
sunny,cool,normal,weak,yes
rain,mild,normal,weak,yes
sunny,mild,normal,strong,yes
overcast,mild,high,strong,yes
overcast,hot,normal,weak,yes
rain,mild,high,strong,no
Output
outlook
  overcast
    b'yes'
  rain
    wind
      b'strong'
        b'no'
      b'weak'
        b'yes'
  sunny
    humidity
      b'high'
        b'no'
      b'normal'
        b'yes'
4. Build an Artificial Neural Network by implementing the Backpropagation algorithm
and test the same using appropriate data sets.
import numpy as np

X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)
X = X / np.amax(X, axis=0)  # maximum of X array longitudinally
y = y / 100

# Sigmoid Function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Derivative of sigmoid, applied to an activation value
# (restored: it is called below but its definition was missing)
def derivatives_sigmoid(x):
    return x * (1 - x)

# Variable initialization
epoch = 7000              # Setting training iterations
lr = 0.1                  # Setting learning rate
inputlayer_neurons = 2    # number of features in data set
hiddenlayer_neurons = 3   # number of hidden layer neurons
output_neurons = 1        # number of neurons at output layer

# weight and bias initialization
# draws a random range of numbers uniformly of dim x*y
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))

for i in range(epoch):
    # Forward Propagation
    hinp1 = np.dot(X, wh)
    hinp = hinp1 + bh
    hlayer_act = sigmoid(hinp)
    outinp1 = np.dot(hlayer_act, wout)
    outinp = outinp1 + bout
    output = sigmoid(outinp)

    # Backpropagation
    EO = y - output
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)
    hiddengrad = derivatives_sigmoid(hlayer_act)  # how much hidden layer wts contributed to error
    d_hiddenlayer = EH * hiddengrad
    wout += hlayer_act.T.dot(d_output) * lr  # dot product of next-layer error and current-layer output
    # bout += np.sum(d_output, axis=0, keepdims=True) * lr
    wh += X.T.dot(d_hiddenlayer) * lr
    # bh += np.sum(d_hiddenlayer, axis=0, keepdims=True) * lr

print("Input: \n" + str(X))
print("Actual Output: \n" + str(y))
print("Predicted Output: \n", output)
Output
Input:
[[ 0.66666667  1.        ]
 [ 0.33333333  0.55555556]
 [ 1.          0.66666667]]
Actual Output:
[[ 0.92]
 [ 0.86]
 [ 0.89]]
Predicted Output:
[[ 0.89559591]
 [ 0.88142069]
 [ 0.8928407 ]]
5. Write a program to implement the naïve Bayesian classifier for a sample training data set
stored as a .CSV file. Compute the accuracy of the classifier, considering a few test data sets.
import csv
import math

def loadCsv(filename):
    lines = csv.reader(open(filename, "r"))
    dataset = list(lines)
    for i in range(len(dataset)):
        # converting strings into numbers for processing
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

def mean(numbers):
    return sum(numbers) / float(len(numbers))

def stdev(numbers):
    avg = mean(numbers)
    variance = sum([pow(x - avg, 2) for x in numbers]) / float(len(numbers) - 1)
    return math.sqrt(variance)

def summarize(dataset):
    summaries = [(mean(attribute), stdev(attribute)) for attribute in zip(*dataset)]
    del summaries[-1]  # drop the statistics of the class column itself
    return summaries

def summarizeByClass(dataset):
    separated = separateByClass(dataset)
    summaries = {}
    for classValue, instances in separated.items():
        # summaries is a dict of tuples (mean, std) for each class value
        summaries[classValue] = summarize(instances)
    return summaries
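The split, prediction, and scoring helpers that main() relies on are not reproduced in the manual; a minimal sketch, assuming Gaussian per-class likelihoods consistent with the summaries built above:

import random

def separateByClass(dataset):
    separated = {}
    for row in dataset:
        separated.setdefault(row[-1], []).append(row)
    return separated

def splitDataset(dataset, splitRatio):
    trainSize = int(len(dataset) * splitRatio)
    copy = list(dataset)
    random.shuffle(copy)
    return copy[:trainSize], copy[trainSize:]

def calculateProbability(x, mean, stdev):
    # Gaussian density of x under N(mean, stdev^2)
    exponent = math.exp(-(pow(x - mean, 2) / (2 * pow(stdev, 2))))
    return (1 / (math.sqrt(2 * math.pi) * stdev)) * exponent

def predict(summaries, inputVector):
    bestLabel, bestProb = None, -1
    for classValue, classSummaries in summaries.items():
        probability = 1
        for i, (mean, stdev) in enumerate(classSummaries):
            probability *= calculateProbability(inputVector[i], mean, stdev)
        if probability > bestProb:
            bestLabel, bestProb = classValue, probability
    return bestLabel

def getPredictions(summaries, testSet):
    return [predict(summaries, row) for row in testSet]

def getAccuracy(testSet, predictions):
    correct = sum(1 for i in range(len(testSet)) if testSet[i][-1] == predictions[i])
    return (correct / float(len(testSet))) * 100.0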
def main():
    # (tail of main restored using the helpers sketched above)
    filename = '5data.csv'
    splitRatio = 0.67
    dataset = loadCsv(filename)
    trainingSet, testSet = splitDataset(dataset, splitRatio)
    summaries = summarizeByClass(trainingSet)
    predictions = getPredictions(summaries, testSet)
    accuracy = getAccuracy(testSet, predictions)
    print('Accuracy: ' + repr(accuracy) + '%')

main()
Output
confusion matrix is as follows
[[17 0 0]
[ 0 17 0]
[ 0 0 11]]
Accuracy metrics
precision recall f1-score support
0 1.00 1.00 1.00 17
1 1.00 1.00 1.00 17
2 1.00 1.00 1.00 11
6. Assuming a set of documents that need to be classified, use the naïve Bayesian
Classifier model to perform this task. Built-in Java classes/API can be used to write
the program. Calculate the accuracy, precision, and recall for your data set.
import pandas as pd
msg=pd.read_csv('naivetext1.csv',names=['message','label'])
print('The dimensions of the dataset',msg.shape)
msg['labelnum']=msg.label.map({'pos':1,'neg':0})
X=msg.message
y=msg.labelnum
print(X)
print(y)
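# (Assumed gap fill: the train/test split and vectorization steps below do not
#  appear in the listing but must run before the DataFrame is built.)
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer

xtrain, xtest, ytrain, ytest = train_test_split(X, y)
count_vect = CountVectorizer()
xtrain_dtm = count_vect.fit_transform(xtrain)  # document-term matrix for training
xtest_dtm = count_vect.transform(xtest)        # same vocabulary applied to test docs
print(count_vect.get_feature_names())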
df=pd.DataFrame(xtrain_dtm.toarray(),columns=count_vect.get_feature_names())
print(df)#tabular representation
print(xtrain_dtm) #sparse matrix representation
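The classifier and evaluation steps are likewise missing from the listing; a minimal sketch that would produce the accuracy, precision, and recall the experiment asks for (using scikit-learn's MultinomialNB on the document-term matrices built above):

from sklearn.naive_bayes import MultinomialNB
from sklearn import metrics

clf = MultinomialNB().fit(xtrain_dtm, ytrain)
predicted = clf.predict(xtest_dtm)
print('Accuracy:', metrics.accuracy_score(ytest, predicted))
print('Precision:', metrics.precision_score(ytest, predicted))
print('Recall:', metrics.recall_score(ytest, predicted))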
OUTPUT
['about', 'am', 'amazing', 'an', 'and', 'awesome', 'beers', 'best', 'boss', 'can', 'deal',
'do', 'enemy', 'feel', 'fun', 'good', 'have', 'horrible', 'house', 'is', 'like', 'love', 'my',
'not', 'of', 'place', 'restaurant', 'sandwich', 'sick', 'stuff', 'these', 'this', 'tired', 'to',
'today', 'tomorrow', 'very', 'view', 'we', 'went', 'what', 'will', 'with', 'work']

    about  am  amazing  an  and  awesome  beers  best  boss  can  ...  today
0       1   0        0   0    0        0      1     0     0    0  ...      0
1       0   0        0   0    0        0      0     1     0    0  ...      0
2       0   0        1   1    0        0      0     0     0    0  ...      0
3       0   0        0   0    0        0      0     0     0    0  ...      1
4       0   0        0   0    0        0      0     0     0    0  ...      0
5       0   1        0   0    1        0      0     0     0    0  ...      0
6       0   0        0   0    0        0      0     0     0    1  ...      0
7       0   0        0   0    0        0      0     0     0    0  ...      0
8       0   1        0   0    0        0      0     0     0    0  ...      0
9       0   0        0   1    0        1      0     0     0    0  ...      0
10      0   0        0   0    0        0      0     0     0    0  ...      0
11      0   0        0   0    0        0      0     0     1    0  ...      0
12      0   0        0   1    0        1      0     0     0    0  ...      0

1   0 0 0 0 0 0 0 0 1
2   0 0 0 0 0 0 0 0 0
3   0 0 0 0 1 0 0 0 0
4   0 0 0 0 0 0 0 0 0
5   0 0 0 0 0 0 0 0 0
6   0 0 0 0 0 0 0 1 0
7   1 0 0 1 0 0 1 0 0
8   0 0 0 0 0 0 0 0 0
7. Write a program to construct a Bayesian network considering medical data. Use
this model to demonstrate the diagnosis of heart patients using the standard Heart
Disease Data Set. You can use Java/Python ML library classes/API.
Bronchitis = ConditionalProbabilityTable(
    [['True', 'True', 0.92],
     ['True', 'False', 0.08],
     ['False', 'True', 0.03],
     ['False', 'False', 0.98]], [smoking])

Tuberculosis_or_cancer = ConditionalProbabilityTable(
    [['True', 'True', 'True', 1.0],
     ['True', 'True', 'False', 0.0],
     ['True', 'False', 'True', 1.0],
     ['True', 'False', 'False', 0.0],
     ['False', 'True', 'True', 1.0],
     ['False', 'True', 'False', 0.0],
     ['False', 'False', 'True', 1.0],
     ['False', 'False', 'False', 0.0]], [tuberculosis, lung])

Xray = ConditionalProbabilityTable(
    [['True', 'True', 0.885],
     ['True', 'False', 0.115],
     ['False', 'True', 0.04],
     ['False', 'False', 0.96]], [Tuberculosis_or_cancer])
     # last row and parent restored as assumptions: the complement of 0.04,
     # conditioned on the tuberculosis-or-cancer node as in the asia network

network = BayesianNetwork("asia")
network.add_nodes(s0, s1, s2)
network.add_edge(s0, s1)
network.add_edge(s1, s2)
network.bake()
print(network.predict_proba({'tuberculosis': 'True'}))
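The states s0, s1 and s2 added to the network are never defined in the listing. Under pomegranate, whose ConditionalProbabilityTable/BayesianNetwork API this code follows, they would be wrapped roughly as below; the prior on smoking and the state names are assumptions:

from pomegranate import DiscreteDistribution, State

smoking = DiscreteDistribution({'True': 0.5, 'False': 0.5})  # assumed prior
s0 = State(smoking, name='smoking')
s1 = State(Bronchitis, name='bronchitis')
s2 = State(Tuberculosis_or_cancer, name='tuberculosis_or_cancer')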
8. Apply the EM algorithm to cluster a set of data stored in a .CSV file. Use the same data
set for clustering using the k-Means algorithm. Compare the results of these two
algorithms and comment on the quality of clustering. You can add Java/Python ML
library classes/API in the program.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse
from sklearn.datasets import make_blobs  # the samples_generator module was removed in newer scikit-learn

X, y_true = make_blobs(n_samples=100, centers=4, cluster_std=0.60, random_state=0)
X = X[:, ::-1]

def draw_ellipse(position, covariance, ax=None, **kwargs):
    # (signature and branching restored; only the SVD branch survived in the source)
    ax = ax or plt.gca()
    if covariance.shape == (2, 2):
        # full covariance: orient the ellipse along the principal axes
        U, s, Vt = np.linalg.svd(covariance)
        angle = np.degrees(np.arctan2(U[1, 0], U[0, 0]))
        width, height = 2 * np.sqrt(s)
    else:
        # spherical/diagonal covariance: axis-aligned ellipse
        angle = 0
        width, height = 2 * np.sqrt(covariance)
    for nsig in range(1, 4):
        ax.add_patch(Ellipse(position, nsig * width, nsig * height, angle=angle, **kwargs))
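The fitting step that produces the responsibility matrix shown below does not appear in the fragment; a minimal sketch using scikit-learn's GaussianMixture (the current name of its EM-based mixture model; older versions imported it as GMM):

from sklearn.mixture import GaussianMixture
gmm = GaussianMixture(n_components=4).fit(X)
labels = gmm.predict(X)
probs = gmm.predict_proba(X)  # per-sample cluster responsibilities from EM
print(probs[:5].round(0))     # rounded, giving the 0/1-style matrix shown below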
Output
[[1 ,0, 0, 0]
[0 ,0, 1, 0]
[1 ,0, 0, 0]
[1 ,0, 0, 0]
[1 ,0, 0, 0]]
K-means

from sklearn.cluster import KMeans
#from sklearn import metrics
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

data = pd.read_csv("kmeansdata.csv")
df1 = pd.DataFrame(data)
print(df1)
f1 = df1['Distance_Feature'].values
f2 = df1['Speeding_Feature'].values
X = np.matrix(list(zip(f1, f2)))

plt.plot()
plt.xlim([0, 100])
plt.ylim([0, 50])
plt.title('Dataset')
plt.xlabel('Distance_Feature')
plt.ylabel('Speeding_Feature')
plt.scatter(f1, f2)
plt.show()

# create new plot and data
plt.plot()
colors = ['b', 'g', 'r']
markers = ['o', 'v', 's']

# KMeans algorithm
# K = 3
kmeans_model = KMeans(n_clusters=3).fit(X)
plt.plot()
for i, l in enumerate(kmeans_model.labels_):
    plt.plot(f1[i], f2[i], color=colors[l], marker=markers[l], ls='None')
plt.xlim([0, 100])
plt.ylim([0, 50])
plt.show()

kmeansdata.csv
Driver_ID,Distance_Feature,Speeding_Feature
3423311935,71.24,28
3423313212,52.53,25
3423313724,64.54,27
3423311373,55.69,22
3423310999,54.58,25
3423313857,41.91,10
3423312432,58.64,20
3423311434,52.02,8
3423311328,31.25,34
3423312488,44.31,19
3423311254,49.35,40
3423312943,58.07,45
3423312536,44.22,22
3423311542,55.73,19
3423312176,46.63,43
3423314176,52.97,32
3423314202,46.25,35
3423311346,51.55,27
3423310666,57.05,26
3423313527,58.45,30
3423312182,43.42,23
3423313590,55.68,37
3423312268,55.15,18
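On this driver data the two algorithms can be compared directly: k-means makes hard assignments and prefers compact, roughly spherical clusters of similar size, while the EM-fitted Gaussian mixture produces soft responsibilities and, with full covariance matrices, can follow elongated or overlapping groups. In general the mixture model describes unevenly shaped clusters better, at the cost of more computation per iteration and greater sensitivity to initialization.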
9. Write a program to implement the k-Nearest Neighbour algorithm to classify the iris
data set. Print both correct and wrong predictions. Java/Python ML library classes can
be used for this problem.
def getResponse(neighbors):
    classVotes = {}
    for x in range(len(neighbors)):
        response = neighbors[x][-1]
        if response in classVotes:
            classVotes[response] += 1
        else:
            classVotes[response] = 1
    # sort classes by vote count (dict.iteritems() replaced with the
    # Python 3 equivalent, and an explicit sort key added)
    sortedVotes = sorted(classVotes.items(), key=lambda kv: kv[1], reverse=True)
    return sortedVotes[0][0]

def main():
    # prepare data
    trainingSet = []
    testSet = []
    split = 0.67
    loadDataset('knndat.data', split, trainingSet, testSet)
    print('Train set: ' + repr(len(trainingSet)))
    print('Test set: ' + repr(len(testSet)))
    # generate predictions
    predictions = []
    k = 3
    for x in range(len(testSet)):
        neighbors = getNeighbors(trainingSet, testSet[x], k)
        result = getResponse(neighbors)
        predictions.append(result)
        print('> predicted=' + repr(result) + ', actual=' + repr(testSet[x][-1]))
    accuracy = getAccuracy(testSet, predictions)
    print('Accuracy: ' + repr(accuracy) + '%')

main()
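The listing above calls loadDataset, getNeighbors and getAccuracy, which are not reproduced in the manual; a minimal sketch of those helpers, assuming numeric features with the class label in the last column:

import csv
import math
import random

def loadDataset(filename, split, trainingSet, testSet):
    with open(filename, 'r') as csvfile:
        dataset = list(csv.reader(csvfile))
    for row in dataset:
        for i in range(len(row) - 1):
            row[i] = float(row[i])
        # randomly assign each row to the training or test set
        (trainingSet if random.random() < split else testSet).append(row)

def euclideanDistance(instance1, instance2, length):
    return math.sqrt(sum((instance1[i] - instance2[i]) ** 2 for i in range(length)))

def getNeighbors(trainingSet, testInstance, k):
    distances = [(row, euclideanDistance(testInstance, row, len(testInstance) - 1))
                 for row in trainingSet]
    distances.sort(key=lambda pair: pair[1])
    return [distances[i][0] for i in range(k)]

def getAccuracy(testSet, predictions):
    correct = sum(1 for i in range(len(testSet)) if testSet[i][-1] == predictions[i])
    return (correct / float(len(testSet))) * 100.0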
OUTPUT
Confusion matrix is as follows
[[11 0 0]
[0 9 1]
[0 1 8]]
Accuracy metrics 0
10. Implement the non-parametric Locally Weighted Regression algorithm in order to
fit data points. Select an appropriate data set for your experiment and draw graphs.

ypred = localWeightRegression(X, mtip, 2)
SortIndex = X[:, 1].argsort(0)
xsort = X[SortIndex][:, 0]
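Only the driver lines of this program survived; localWeightRegression and its helpers are not shown. A minimal sketch, assuming X is a design matrix (np.mat) with a bias column and mtip a 1×m response row-matrix (the variable name suggests the classic tips data set, but that is a guess):

import numpy as np

def kernel(point, xmat, k):
    # diagonal weight matrix: points near the query get weight close to 1
    m = np.shape(xmat)[0]
    weights = np.mat(np.eye(m))
    for j in range(m):
        diff = point - xmat[j]
        weights[j, j] = np.exp(diff * diff.T / (-2.0 * k ** 2))
    return weights

def localWeight(point, xmat, ymat, k):
    wei = kernel(point, xmat, k)
    # weighted least squares solved in closed form
    return (xmat.T * (wei * xmat)).I * (xmat.T * (wei * ymat.T))

def localWeightRegression(xmat, ymat, k):
    m = np.shape(xmat)[0]
    ypred = np.zeros(m)
    for i in range(m):
        ypred[i] = float(xmat[i] * localWeight(xmat[i], xmat, ymat, k))
    return ypred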
Output
Viva Questions
1. What is machine learning?
2. Define supervised learning.
3. Define unsupervised learning.
4. Define semi-supervised learning.
5. Define reinforcement learning.
6. What do you mean by hypotheses?
7. What is classification?
8. What is clustering?
9. Define precision, accuracy and recall.
10. Define entropy.
11. Define regression.
12. How is KNN different from k-means clustering?
13. What is concept learning?
14. Define specific boundary and general boundary.
15. Define target function.
16. Define decision tree.
17. What is ANN?
18. Explain gradient descent approximation.
19. State Bayes theorem.
20. Define Bayesian belief networks.
21. Differentiate hard and soft clustering.
22. Define variance.
23. What is inductive machine learning?
24. Why is the k-nearest neighbour algorithm called a lazy learning algorithm?
Q1. Implement linear regression on a dataset containing the features Age and Years of Experience. Find the
best fit of the algorithm.
Q2. Implement a Naïve Bayes Classifier to diagnose heart patients using the standard Heart Disease
Data Set.
Q3. Implement a decision tree classifier to predict whether a person will be able to go out to play or not,
using the Play Tennis Dataset.
1. A First Course in Machine Learning (Machine Learning & Pattern Recognition), 2nd
Edition, by Simon Rogers and Mark Girolami; CRC Press.
2. Learning From Data, by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin;
AMLBook.
3. Fundamentals of Machine Learning, John D. Miller; MIT Press.
4. An Introduction to Statistical Learning, Gareth James and Robert Tibshirani; Springer.
5. Machine Learning: An Algorithmic Perspective, 2nd Edition, Stephen Marsland; CRC Press.