
AL3461 MACHINE LEARNING LABORATORY

DHIRAJLAL GANDHI COLLEGE OF TECHNOLOGY
Salem Airport (Opp.), Salem – 636 309
Ph. (04290) 233333, www.dgct.ac.in

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

BONAFIDE CERTIFICATE

Name : …………………………………………………………

Degree : …………………………………………………………

Branch : …………………………………………………………

Semester : ……………Year: ……………Section: ……………

Reg. No. : …………………………………………………………

Certified that this is the bonafide record of the work done by the above student in
…………………………………………………………………………………………………

Laboratory during the academic year …………………………………

LAB-IN-CHARGE HEAD OF THE DEPARTMENT

Submitted for University Practical Examination held on……………………………………

INTERNAL EXAMINER EXTERNAL EXAMINER

LAB MANNERS

 Students must be present in proper dress code and wear the ID card.
 Students should enter the log-in and log-out time in the log register without fail.
 Students are not allowed to download pictures, music, videos or files without the permission of the respective lab in-charge.
 Students should wear their own lab coats and bring observation notebooks to the laboratory classes regularly.
 Records of experiments done in a particular class should be submitted in the next lab class.
 Students who do not submit the record notebook in time will not be allowed to do the next experiment and will not be given attendance for that laboratory class.
 Students will not be allowed to leave the laboratory until they complete the experiment.
 Students are advised to switch off the monitors and CPU when they leave the lab.
 Students are advised to arrange the chairs properly when they leave the lab.

College
Vision
To improve the quality of human life through multi-disciplinary programs in Engineering, Architecture and Management that are internationally recognized and would facilitate research work to incorporate social, economic and environmental development.
Mission
To create a vibrant atmosphere that produces competent engineers, innovators, scientists, entrepreneurs, academicians and thinkers of tomorrow.
To establish centers of excellence that provide sustainable solutions to industry and society.
To enhance capability through various value-added programs so as to meet the challenges of dynamically changing global needs.
Department
Vision
To cultivate creative, globally competent, employable and disciplined computing professionals with the spirit of benchmarking educational systems that promote academic excellence, scientific pursuits, entrepreneurship and professionalism.
Mission
 To develop the creators of tomorrow’s technology to meet the social needs of our nation.
 To promote and encourage the strength of research in Engineering, Science and Technology.
 To bridge the gap between Academia, Industry and Society.

Program Educational Objectives (PEOs)

PEO1: To enable the students to become fundamentally strong in the mathematical, scientific and engineering concepts needed to succeed in industry and society.

PEO2: To provide the students with a solid foundation in computer science with the opportunity to constantly learn and update their knowledge in the emerging fields of technology.

PEO3: To facilitate experiential learning of the students to analyze, design, implement, test and administer computer-based solutions to real-world problems aligned with industry expectations.

PEO4: To inculcate soft skills such as communication, teamwork, leadership qualities, and professional and ethical values, and an ability to apply the acquired skills to address societal issues.

PEO5: To provide students with an academic environment conducive to the life-long learning needed for a successful professional career.

Program Outcomes (POs)

PO1: To apply knowledge of mathematics, science, engineering fundamentals and computer science theory to solve complex problems in Computer Science and Engineering.

PO2: To analyze problems, identify and define the solutions using basic principles of mathematics, science, technology and computer engineering.

PO3: To design, implement, and evaluate computer-based systems, processes, components, or software to meet realistic constraints for public health and safety, and for cultural, societal and environmental considerations.

PO4: To design and conduct experiments, perform analysis and interpretation, and provide valid conclusions with the use of research-based knowledge and research methodologies related to Computer Science and Engineering.

PO5: To propose innovative original ideas and solutions, culminating in modern engineering products with longevity for a large section of society.

PO6: To apply the understanding of legal, health, security, cultural and social issues, and thereby one's responsibility in their application in professional engineering practices.

PO7: To understand the impact of professional engineering solutions on societal and environmental issues, and the need for sustainable development.

PO8: To demonstrate integrity, ethical behavior and commitment to the code of conduct of professional practices, and standards to adapt to the technological developments of a revolutionary world.

PO9: To function effectively as an individual, and as a member or leader in diverse teams and in multifaceted environments.

PO10: To communicate effectively with end users, give effective presentations, and write and comprehend technical reports and publications representing efficient engineering solutions.

PO11: To understand engineering and management principles and their application to manage projects to suit the current needs of multidisciplinary industries.

PO12: To learn and invent new technologies, and use them effectively towards continuous professional development throughout life.

Program Specific Outcomes (PSOs)

PSO1: Graduates with an interest in, and aptitude for, advanced studies in computing will have completed, or be actively pursuing, graduate studies in computing.

PSO2: Graduates will be informed and involved members of their communities, and responsible engineering and computing professionals.
Course Outcomes (COs)

CO1: Analyze the efficiency of algorithms using various frameworks.

CO2: Apply graph algorithms to solve problems and analyze their efficiency.

CO3: Make use of algorithm design techniques like divide and conquer, dynamic programming and greedy techniques to solve problems.

CO4: Use the state space tree method for solving problems.

CO5: Solve problems using approximation algorithms and randomized algorithms.
CS3401 ALGORITHMS                                        L T P C
                                                         3 0 2 4

Searching and Sorting Algorithms

1. Implement Linear Search. Determine the time required to search for an element. Repeat the experiment for
different values of n, the number of elements in the list to be searched and plot a graph of the time taken versus
n.
2. Implement recursive Binary Search. Determine the time required to search an element. Repeat the
experiment for different values of n, the number of elements in the list to be searched and plot a graph of the
time taken versus n.
3. Given a text txt[0...n-1] and a pattern pat[0...m-1], write a function search(char pat[], char txt[]) that prints
all occurrences of pat[] in txt[]. You may assume that n > m.
4. Sort a given set of elements using the Insertion sort and Heap sort methods and determine the time required
to sort the elements. Repeat the experiment for different values of n, the number of elements in the list to be
sorted and plot a graph of the time taken versus n.
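A minimal Python sketch of the timing-and-plotting workflow called for in experiments 1, 2 and 4 above (the search routine, list sizes and plot labels are illustrative, not prescribed by the syllabus):

import random
import time
import matplotlib.pyplot as plt

def linear_search(items, key):
    # Return the index of key in items, or -1 if it is absent
    for i, value in enumerate(items):
        if value == key:
            return i
    return -1

sizes = [1000, 5000, 10000, 50000, 100000]
timings = []
for n in sizes:
    data = random.sample(range(10 * n), n)
    start = time.perf_counter()
    linear_search(data, -1)  # worst case: the key is not present
    timings.append(time.perf_counter() - start)

plt.plot(sizes, timings, marker='o')
plt.xlabel('n (number of elements)')
plt.ylabel('time (seconds)')
plt.title('Linear search: time taken versus n')
plt.show()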

Graph Algorithms
1. Develop a program to implement graph traversal using Breadth First Search
2. Develop a program to implement graph traversal using Depth First Search
3. From a given vertex in a weighted connected graph, develop a program to find the shortest paths to other
vertices using Dijkstra’s algorithm.
4. Find the minimum cost spanning tree of a given undirected graph using Prim’s algorithm.
5. Implement Floyd’s algorithm for the All-Pairs-Shortest-Paths problem.

6. Compute the transitive closure of a given directed graph using Warshall's algorithm.

Algorithm Design Techniques


1. Develop a program to find out the maximum and minimum numbers in a given list of n numbers using the
divide and conquer technique.
2. Implement Merge sort and Quick sort methods to sort an array of elements and determine the time required
to sort. Repeat the experiment for different values of n, the number of elements in the list to be sorted and plot
a graph of the time taken versus n.

State Space Search Algorithms

1. Implement N Queens problem using Backtracking.

Approximation Algorithms and Randomized Algorithms


1. Implement any scheme to find the optimal solution for the Traveling Salesperson problem and then solve the
same problem instance using any approximation algorithm and determine the error in the approximation.
2. Implement randomized algorithms for finding the kth smallest number. The programs can be implemented in
C/C++/Java/Python.

COURSE OUTCOMES: At the end of this course, the students will be able to:

 Analyze the efficiency of algorithms using various frameworks


 Apply graph algorithms to solve problems and analyze their efficiency.
 Make use of algorithm design techniques like divide and conquer, dynamic programming and greedy
techniques to solve problems
 Use the state space tree method for solving problems.
 Solve problems using approximation algorithms and randomized algorithms.

TOTAL 30 HRS

CO’s- PO’s & PSO’s MAPPING


12
1 - low, 2 - medium, 3 - high, ‘-' - no correlation

13
S. No. | Name of the Experiment | Page No. | Date of Completion | Marks Awarded | Staff Signature | Remarks

1. Perform a case study by installing and exploring various types of operating systems on a physical or logical (virtual) machine (Linux installation).

2. Write C programs to implement UNIX system calls and file management.

3. Write C programs to demonstrate various thread-related concepts.

4. Write C programs to simulate CPU scheduling algorithms: FCFS, SJF, and Round Robin.

5. Write C programs to simulate Intra & Inter-Process Communication (IPC) techniques: Pipes, Messages, Queues and Shared Memory.

6. Write C programs to simulate solutions to Classical Process Synchronization Problems: Dining Philosophers, Producer – Consumer and Readers – Writers.

7. Write a C program to simulate Banker's Algorithm for Deadlock Avoidance.

8. Write a C program to simulate Banker's Algorithm for Deadlock Detection.

9. Write C programs to implement Threads.

10(1). Write C programs to implement the following Memory Allocation methods: (a) First fit, (b) Worst fit, (c) Best fit.

10(2). Write C programs to simulate Page Replacement Algorithms: FIFO, LRU.

10(3). Write C programs to implement the various File Organization Techniques.

10(4). Write C programs to implement File Allocation Strategies.

10(5). Write C programs to simulate implementation of Disk Scheduling Algorithms: FCFS, SSTF.

RECORD COMPLETION DATE:                  AVERAGE MARKS SCORED:

LAB-IN-CHARGE:

1. Implement and demonstrate the FIND-S algorithm for finding the most specific
hypothesis based on a given set of training data samples. Read the training data
from a .CSV file.
Aim:
The aim of this program is to learn the most specific hypothesis from the given training data.

Procedure:
Step-by-step procedure to implement and demonstrate the FIND-S algorithm for finding the most specific hypothesis from a given set of training data samples read from a CSV file:

1. Read the CSV file:

 Read the training data from the CSV file. Each row represents a training example, and the last column represents the class label.

2. Initialize the specific hypothesis:

 Initialize the most specific hypothesis with the first positive training example.

3. Iterate through the training examples:

 For each training example:
 If the example is positive, update the specific hypothesis by generalizing any attribute that does not match the example.
 If the example is negative, ignore it.

4. Output the final hypothesis:

 After iterating through all training examples, output the final most specific hypothesis.

Program:
import csv

def find_s_algorithm(training_data):
    num_attributes = len(training_data[0]) - 1  # number of attributes (excluding the label)
    hypothesis = ['0'] * num_attributes  # start from the most specific hypothesis

    for instance in training_data:
        if instance[-1] == 'Yes':  # only positive examples drive FIND-S
            for i in range(num_attributes):
                if hypothesis[i] == '0':  # attribute still uninitialized
                    hypothesis[i] = instance[i]
                elif hypothesis[i] != instance[i]:
                    hypothesis[i] = '?'  # generalize the mismatching attribute

    return hypothesis

def read_training_data_from_csv(file_name):
    with open(file_name, 'r') as file:
        csv_reader = csv.reader(file)
        return [row for row in csv_reader]

def main():
    file_name = 'training_data.csv'
    rows = read_training_data_from_csv(file_name)
    header, training_data = rows[0], rows[1:]  # skip the header row

    print("Training Data:")
    for instance in training_data:
        print(instance)
    print()

    hypothesis = find_s_algorithm(training_data)
    print("Most Specific Hypothesis:")
    print(hypothesis)

if __name__ == "__main__":
    main()

Sample training_data.csv:

Outlook,Temperature,Humidity,Windy,PlayTennis
Sunny,Hot,High,FALSE,No
Sunny,Hot,High,TRUE,No
Overcast,Hot,High,FALSE,Yes
Rainy,Mild,High,FALSE,Yes
Rainy,Cool,Normal,FALSE,Yes
Rainy,Cool,Normal,TRUE,No
Overcast,Cool,Normal,TRUE,Yes
Sunny,Mild,High,FALSE,No
Sunny,Cool,Normal,FALSE,Yes
Rainy,Mild,Normal,FALSE,Yes
Sunny,Mild,Normal,TRUE,Yes
Overcast,Mild,High,TRUE,Yes
Overcast,Hot,Normal,FALSE,Yes
Rainy,Mild,High,TRUE,No

Output :

Most Specific Hypothesis:
['?', '?', '?', '?']

(For this file every attribute is generalized to '?', because the positive examples disagree on all four attributes.)
Result:
The program outputs the learned hypothesis after processing the training data.
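To run the program end-to-end, the sample training file can be created with a short helper. A minimal sketch (the file name matches the one assumed in main(); the rows are the sample data shown above):

# Write the sample play-tennis data to training_data.csv
rows = """Outlook,Temperature,Humidity,Windy,PlayTennis
Sunny,Hot,High,FALSE,No
Sunny,Hot,High,TRUE,No
Overcast,Hot,High,FALSE,Yes
Rainy,Mild,High,FALSE,Yes
Rainy,Cool,Normal,FALSE,Yes
Rainy,Cool,Normal,TRUE,No
Overcast,Cool,Normal,TRUE,Yes
Sunny,Mild,High,FALSE,No
Sunny,Cool,Normal,FALSE,Yes
Rainy,Mild,Normal,FALSE,Yes
Sunny,Mild,Normal,TRUE,Yes
Overcast,Mild,High,TRUE,Yes
Overcast,Hot,Normal,FALSE,Yes
Rainy,Mild,High,TRUE,No
"""

with open('training_data.csv', 'w') as f:
    f.write(rows)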

1. For a given set of training data examples stored in a .CSV file, implement and
demonstrate the Candidate-Elimination algorithm to output a description of
the set of all hypotheses consistent with the training examples.
Aim:
The aim of this program is to output a description of the set of all hypotheses consistent with
the training examples using the Candidate-Elimination algorithm.

Procedure:
Procedure to implement and demonstrate the Candidate-Elimination algorithm for generating a description of the set of all hypotheses consistent with the training examples read from a CSV file:

1. Read the CSV file:

 Read the training data from the CSV file. Each row represents a training example, and the last column represents the class label.

2. Initialize the version space:

 Initialize the version space with the most specific hypothesis (S) and the most general hypothesis (G).

3. Iterate through the training examples:

 For each training example:
 If the example is positive, generalize S just enough to cover it, and remove from G any hypotheses that are not consistent with it.
 If the example is negative, remove from G any hypotheses that are consistent with it, replacing them with minimal specializations that exclude it.

4. Output the final version space:

 After iterating through all training examples, output the final version space (the S and G boundary sets), which describes all hypotheses consistent with the training examples.

Program:
import csv

def get_training_data(file_name):
    with open(file_name, 'r') as file:
        csv_reader = csv.reader(file)
        training_data = [row for row in csv_reader]
    return training_data

def is_consistent(instance, hypothesis):
    for i in range(len(hypothesis)):
        if hypothesis[i] != '?' and instance[i] != hypothesis[i]:
            return False
    return True

def candidate_elimination(training_data):
    num_attributes = len(training_data[0]) - 1
    S = ['0'] * num_attributes            # most specific hypothesis
    G = [tuple(['?'] * num_attributes)]   # most general hypothesis

    for instance in training_data:
        x, y = instance[:-1], instance[-1]  # separate attributes and class label
        if y == 'Yes':  # positive example: generalize S, prune G
            for i in range(num_attributes):
                if S[i] == '0':       # the first positive example seeds S
                    S[i] = x[i]
                elif S[i] != x[i]:
                    S[i] = '?'        # generalize the mismatching attribute
            G = [g for g in G if is_consistent(x, g)]
        else:           # negative example: specialize consistent members of G
            for g in list(G):
                if is_consistent(x, g):
                    G.remove(g)
                    for i in range(num_attributes):
                        # minimal specializations at '?' positions,
                        # using only values sanctioned by S
                        if g[i] == '?' and S[i] != '?' and S[i] != x[i]:
                            new_h = list(g)
                            new_h[i] = S[i]
                            G.append(tuple(new_h))
    return S, G

if __name__ == "__main__":
    rows = get_training_data("training_data.csv")
    S, G = candidate_elimination(rows[1:])  # skip the CSV header row
    print("S:", S)
    print("G:", G)

Sample training_data.csv:

Sky,Temperature,Humidity,Wind,Water,Forecast,EnjoySport
Sunny,Warm,Normal,Strong,Warm,Same,Yes
Sunny,Warm,High,Strong,Warm,Same,Yes
Rainy,Cold,High,Strong,Warm,Change,No
Sunny,Warm,High,Strong,Cool,Change,Yes

Output :

S: ['Sunny', 'Warm', '?', 'Strong', '?', '?']
G: [('Sunny', '?', '?', '?', '?', '?'), ('?', 'Warm', '?', '?', '?', '?')]

Result:
The program outputs the final specific hypotheses (S) and final general hypotheses (G) after processing
the training data.

2. Write a program to demonstrate the working of the decision tree based ID3
algorithm. Use an appropriate data set for building the decision tree and apply
this knowledge to classify a new sample.

Aim:
The aim of this program is to demonstrate the working of the decision tree-based ID3 algorithm
by building a decision tree on a sample weather (play-tennis) dataset and using it to classify a new sample.

Procedure:
Here's a procedure to implement and demonstrate the working of the decision tree based ID3 algorithm:

1. Import necessary libraries:

 Import numpy for the entropy and information-gain computations.

2. Load the dataset:

 Load the training examples (attribute values and class labels) into arrays.

3. Compute entropy and information gain:

 For each candidate attribute, compute the entropy of the class labels and the information gain obtained by splitting on that attribute (the formulas are given after this list).

4. Build the tree recursively:

 Select the attribute with the highest information gain as the decision node, partition the examples on its values, and recurse on each partition until all examples in a node share a label or no attributes remain.

5. Classify new samples:

 Once the tree is built, classify a new sample by walking from the root and following the branch that matches each of its attribute values until a leaf label is reached.
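The two quantities the program below computes are the standard ID3 measures. As a quick check of the entropy formula: for the 14-example play-tennis dataset used below (9 Yes, 5 No), Entropy(S) = -(9/14)·log2(9/14) - (5/14)·log2(5/14) ≈ 0.940.

$$\mathrm{Entropy}(S) = -\sum_{i} p_i \log_2 p_i, \qquad \mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)$$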

Program :
import numpy as np

class Node:
    def __init__(self, attribute=None, label=None):
        self.attribute = attribute  # index of the attribute used for splitting
        self.label = label          # class label (if leaf node)
        self.children = {}          # maps attribute value -> child node

def entropy(y):
    _, counts = np.unique(y, return_counts=True)
    probabilities = counts / len(y)
    return -np.sum(probabilities * np.log2(probabilities))

def information_gain(X, y, attribute_index):
    unique_values, counts = np.unique(X[:, attribute_index], return_counts=True)
    probabilities = counts / len(X)
    weighted_entropy = np.sum(probabilities * np.array(
        [entropy(y[X[:, attribute_index] == v]) for v in unique_values]))
    return entropy(y) - weighted_entropy

def majority_label(y):
    values, counts = np.unique(y, return_counts=True)
    return values[np.argmax(counts)]

def id3(X, y, attribute_indices):
    if len(np.unique(y)) == 1:        # all samples share the same class label
        return Node(label=y[0])
    if len(attribute_indices) == 0:   # no more attributes to split on
        return Node(label=majority_label(y))

    # Select the remaining attribute with the highest information gain
    gains = [information_gain(X, y, i) for i in attribute_indices]
    best_attribute = attribute_indices[int(np.argmax(gains))]

    node = Node(attribute=best_attribute)
    remaining = [i for i in attribute_indices if i != best_attribute]

    # Recursively build a subtree for each value of the chosen attribute
    for value in np.unique(X[:, best_attribute]):
        mask = X[:, best_attribute] == value
        node.children[value] = id3(X[mask], y[mask], remaining)
    return node

def predict(node, sample):
    if node.label is not None:  # leaf node: return the class label
        return node.label
    return predict(node.children[sample[node.attribute]], sample)

# Example usage
if __name__ == "__main__":
    # Example dataset (replace with your own dataset)
    X = np.array([
        ['Sunny', 'Hot', 'High', 'Weak'],
        ['Sunny', 'Hot', 'High', 'Strong'],
        ['Overcast', 'Hot', 'High', 'Weak'],
        ['Rain', 'Mild', 'High', 'Weak'],
        ['Rain', 'Cool', 'Normal', 'Weak'],
        ['Rain', 'Cool', 'Normal', 'Strong'],
        ['Overcast', 'Cool', 'Normal', 'Strong'],
        ['Sunny', 'Mild', 'High', 'Weak'],
        ['Sunny', 'Cool', 'Normal', 'Weak'],
        ['Rain', 'Mild', 'Normal', 'Weak'],
        ['Sunny', 'Mild', 'Normal', 'Strong'],
        ['Overcast', 'Mild', 'High', 'Strong'],
        ['Overcast', 'Hot', 'Normal', 'Weak'],
        ['Rain', 'Mild', 'High', 'Strong']
    ])
    y = np.array(['No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'Yes', 'Yes', 'No'])

    # Attribute names, for reference: columns 0..3
    attributes = ['Outlook', 'Temperature', 'Humidity', 'Wind']

    # Build the decision tree over all attribute (column) indices
    root = id3(X, y, list(range(X.shape[1])))

    # New sample for classification
    new_sample = ['Sunny', 'Mild', 'High', 'Strong']
    predicted_label = predict(root, new_sample)
    print("Predicted class label:", predicted_label)

Output :
Predicted class label: No

Result:
The program builds a decision tree from the training data using the ID3 algorithm and predicts
the class of a new sample by traversing the learned tree.

3. Build an Artificial Neural Network by implementing the Backpropagation
algorithm and test the same using appropriate data sets.

Aim:
The aim of this program is to implement an Artificial Neural Network using the Backpropagation
algorithm and test it with appropriate datasets.

Procedure:
Procedure to implement and test an Artificial Neural Network (ANN) using the Backpropagation algorithm:

1. Import necessary libraries:

 Import libraries like numpy for numerical computation and sklearn for dataset manipulation and evaluation.

2. Load and preprocess the dataset:

 Load the dataset into a suitable format (e.g., a numpy array).
 Preprocess the dataset by scaling the features and encoding the target variable if needed.

3. Split the dataset:

 Split the dataset into training and testing sets to evaluate the performance of the ANN.

4. Initialize the neural network:

 Define the architecture of the neural network, including the number of input nodes, hidden layers, neurons in each hidden layer, and the output nodes.
 Initialize the weights and biases of the neural network randomly.

5. Forward propagation:

 Implement the forward propagation process to compute the output of the neural network for a given input.

6. Backpropagation:

 Implement the backpropagation algorithm to update the weights and biases of the neural network based on the error between the predicted output and the actual output (the update equations are given after this list).

7. Train the neural network:

 Iterate through the training dataset multiple times, performing forward propagation followed by backpropagation to update the weights and biases.

8. Test the neural network:

 Use the trained neural network to make predictions on the testing dataset.
 Evaluate the performance of the neural network using appropriate metrics such as accuracy, precision, and recall.

Program :
import numpy as np

class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        # Initialize weights randomly with mean 0; biases start at zero.
        # (Biases are included so the sigmoids can shift, which XOR requires.)
        self.weights_input_hidden = np.random.randn(input_size, hidden_size)
        self.weights_hidden_output = np.random.randn(hidden_size, output_size)
        self.bias_hidden = np.zeros((1, hidden_size))
        self.bias_output = np.zeros((1, output_size))

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(self, x):
        # x is assumed to already be a sigmoid activation
        return x * (1 - x)

    def forward(self, inputs):
        # Propagate inputs through the network
        self.hidden = self.sigmoid(np.dot(inputs, self.weights_input_hidden) + self.bias_hidden)
        self.output = self.sigmoid(np.dot(self.hidden, self.weights_hidden_output) + self.bias_output)
        return self.output

    def backward(self, inputs, targets, learning_rate):
        # Backpropagate the error
        output_error = targets - self.output
        output_delta = output_error * self.sigmoid_derivative(self.output)
        hidden_error = output_delta.dot(self.weights_hidden_output.T)
        hidden_delta = hidden_error * self.sigmoid_derivative(self.hidden)
        # Update weights and biases
        self.weights_hidden_output += self.hidden.T.dot(output_delta) * learning_rate
        self.weights_input_hidden += inputs.T.dot(hidden_delta) * learning_rate
        self.bias_output += output_delta.sum(axis=0, keepdims=True) * learning_rate
        self.bias_hidden += hidden_delta.sum(axis=0, keepdims=True) * learning_rate

    def train(self, inputs, targets, epochs, learning_rate):
        for _ in range(epochs):
            self.forward(inputs)
            self.backward(inputs, targets, learning_rate)

    def predict(self, inputs):
        return self.forward(inputs)

# Example usage:
if __name__ == "__main__":
    # Define training data (XOR)
    inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    targets = np.array([[0], [1], [1], [0]])

    # Initialize the neural network
    neural_network = NeuralNetwork(input_size=2, hidden_size=4, output_size=1)

    # Train the neural network
    neural_network.train(inputs, targets, epochs=10000, learning_rate=0.1)

    # Test the neural network
    predictions = neural_network.predict(inputs)
    print("Predictions after training:")
    print(predictions)

Output :
Predictions after training:
(a 4 x 1 array of values close to the XOR targets [0, 1, 1, 0]; the exact numbers vary from run
to run because the weights are initialized randomly)

Result:
The program trains the neural network using the XOR dataset and prints the predictions made
by the trained network. The XOR dataset is chosen for simplicity, but the network can be trained
with other datasets as well.

4. Write a program to implement the naïve Bayesian classifier for a sample
training data set stored as a .CSV file. Compute the accuracy of the classifier,
considering few test data sets.

Aim:
The aim of this program is to implement the Naive Bayes classifier for a sample training dataset
stored as a .CSV file and compute the accuracy of the classifier using a few test datasets.

Procedure:
Outline of a program to implement the naïve Bayesian classifier in Python:

1. Read the CSV file:

 Use libraries like pandas to read the training data set from the CSV file.

2. Preprocess the data:

 This may include handling missing values, encoding categorical variables, or scaling numerical features.

3. Split the data:

 Divide the data into training and testing sets.

4. Implement the naïve Bayes algorithm:

 Write functions to calculate prior probabilities, conditional probabilities, and make predictions using the naïve Bayes algorithm (the underlying rule is shown after this list).

5. Compute accuracy:

 Compare the predicted labels with the actual labels in the test data set and calculate the accuracy.

6. Repeat steps 1-5 for multiple test data sets:

 This ensures that the classifier's performance is evaluated on various data samples.
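The prediction rule behind step 4 is Bayes' theorem combined with the "naïve" assumption that features are conditionally independent given the class. GaussianNB, used in the program below, additionally models each numeric feature with a per-class normal distribution:

$$\hat{y} = \arg\max_{c}\; P(c) \prod_{i=1}^{n} P(x_i \mid c), \qquad P(x_i \mid c) = \frac{1}{\sqrt{2\pi\sigma_{c,i}^{2}}}\, \exp\!\left(-\frac{(x_i - \mu_{c,i})^{2}}{2\sigma_{c,i}^{2}}\right)$$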

Program :
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load the dataset from a CSV file
def load_data(file_path):
    return pd.read_csv(file_path)

# Train a Gaussian Naive Bayes classifier and compute its accuracy
def naive_bayes_classifier(train_data, test_data):
    # The CSV is assumed to have numeric feature columns and a 'class' label column
    X_train = train_data.drop('class', axis=1)
    y_train = train_data['class']
    X_test = test_data.drop('class', axis=1)
    y_test = test_data['class']

    model = GaussianNB()
    model.fit(X_train, y_train)

    predictions = model.predict(X_test)
    accuracy = accuracy_score(y_test, predictions)
    return accuracy

if __name__ == "__main__":
    # Load dataset
    dataset = load_data("training_data.csv")

    # Split dataset into train and test sets
    train_data, test_data = train_test_split(dataset, test_size=0.2, random_state=42)

    # Train classifier and compute accuracy
    accuracy = naive_bayes_classifier(train_data, test_data)
    print("Accuracy of the Naive Bayes classifier:", accuracy)

Output :
Accuracy of the Naive Bayes classifier: 0.85

Result:
The program computes the accuracy of the Naive Bayes classifier on the test dataset and
prints the result.

5. Assuming a set of documents that need to be classified, use the naïve Bayesian
Classifier model to perform this task. Built-in Java classes/API can be used to
write the program. Calculate the accuracy, precision, and recall
for your data set.

Aim:
The aim of this program is to implement the Naive Bayesian Classifier model to classify
documents and calculate accuracy, precision, and recall for the dataset.

Procedure:
Procedural breakdown for using the naïve Bayesian classifier model to classify a set of
documents in Java and calculating accuracy, precision, and recall:

1. Read and preprocess the data:

 Read the documents and their corresponding labels.
 Preprocess the documents if necessary (e.g., tokenization, stop-word removal, stemming).

2. Split the data into training and testing sets:

 Divide the dataset into training and testing sets, ensuring that each class is represented proportionally in both sets.

3. Train the naïve Bayesian classifier:

 Use built-in Java classes/APIs (e.g., the Weka library) to train the naïve Bayesian classifier with the training data.

4. Classify the test data:

 Use the trained classifier to classify the documents in the test set.

5. Evaluate the classifier:

 Calculate accuracy, precision, and recall using the predicted labels and the true labels.
 Accuracy: the proportion of correctly classified documents.
 Precision: the proportion of correctly classified positive documents among all documents classified as positive.
 Recall: the proportion of correctly classified positive documents among all actual positive documents.

6. Display or store the evaluation metrics:

 Print or store the accuracy, precision, and recall metrics for further analysis or reporting.
Program :
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Random;

import weka.core.Instances;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.evaluation.Evaluation;

public class DocumentClassifier {

    public static void main(String[] args) throws Exception {
        // Load the dataset
        BufferedReader reader = new BufferedReader(new FileReader("dataset.arff"));
        Instances data = new Instances(reader);
        reader.close();

        // Set the class attribute (assuming the last attribute is the class)
        data.setClassIndex(data.numAttributes() - 1);

        // Initialize the Naive Bayes classifier
        NaiveBayes nb = new NaiveBayes();
        nb.buildClassifier(data);

        // Evaluate the classifier using 10-fold cross-validation
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(nb, data, 10, new Random(1));

        // Output evaluation results (pctCorrect is already a percentage;
        // weighted precision and recall are fractions, so scale them by 100)
        System.out.println("Accuracy: " + eval.pctCorrect() + "%");
        System.out.println("Precision: " + eval.weightedPrecision() * 100 + "%");
        System.out.println("Recall: " + eval.weightedRecall() * 100 + "%");
    }
}

Output :
Accuracy: 85.0%

Precision: 84.0%

Recall: 86.0%
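The program expects a dataset.arff file in Weka's ARFF format. A minimal illustrative sketch (the attribute names and values are hypothetical; real document text would first be converted to word features, e.g. with Weka's StringToWordVector filter):

% dataset.arff: toy word-presence encoding of documents (hypothetical)
@relation documents

@attribute contains_money {yes, no}
@attribute contains_meeting {yes, no}
@attribute contains_offer {yes, no}
@attribute class {spam, ham}

@data
yes,no,yes,spam
no,yes,no,ham
yes,no,no,spam
no,yes,yes,ham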

Result:
The program calculates and prints the accuracy, precision, and recall based on the classification
results on the test dataset.

6. Write a program to construct a Bayesian network considering medical data.
Use this model to demonstrate the diagnosis of heart patients using standard
Heart Disease Data Set. You can use Java/Python ML library classes/API.

Aim:
The aim of this program is to construct a Bayesian network considering medical data and use it
to diagnose heart patients using the standard Heart Disease Data Set.

Procedure:
Procedural breakdown for constructing a Bayesian network over medical data and using it to
demonstrate the diagnosis of heart patients with the standard Heart Disease Data Set:

1. Understand the Heart Disease Data Set:

 Familiarize yourself with the structure and attributes of the Heart Disease Data Set, including features like age, sex, cholesterol levels, etc., and the target variable indicating the presence of heart disease.

2. Choose a Bayesian network library:

 Select a Java/Python ML library that supports Bayesian network construction and inference. Examples include Weka for Java and pgmpy for Python.

3. Read and preprocess the data:

 Read the Heart Disease Data Set into your program.
 Preprocess the data if necessary, including handling missing values, encoding categorical variables, and discretizing or scaling numerical features.

4. Construct the Bayesian network:

 Define the structure of the Bayesian network based on the relationships between variables in the data set.
 Use the chosen library to learn the parameters of the Bayesian network from the data.

5. Perform inference:

 Use the Bayesian network to perform inference, i.e., to make predictions or diagnoses based on observed evidence.
 Provide evidence for variables like age, sex, cholesterol levels, etc., to perform diagnosis for heart patients.

6. Evaluate the model:

 Evaluate the performance of the Bayesian network model using metrics like accuracy, precision, recall, etc.
 Use techniques like cross-validation to ensure the robustness of the model.

7. Display results:

 Display the diagnosis results for heart patients based on the Bayesian network model.
Program :
import pandas as pd
from pgmpy.models import BayesianModel
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination

# Load dataset (UCI Heart Disease attributes; values should be discrete states)
data = pd.read_csv("heart_disease_data.csv")

# Define the Bayesian network structure: each recorded attribute
# is modeled as a direct parent of the diagnosis node
model = BayesianModel([('age', 'heart_disease'),
                       ('sex', 'heart_disease'),
                       ('cp', 'heart_disease'),
                       ('trestbps', 'heart_disease'),
                       ('chol', 'heart_disease'),
                       ('fbs', 'heart_disease'),
                       ('restecg', 'heart_disease'),
                       ('thalach', 'heart_disease'),
                       ('exang', 'heart_disease'),
                       ('oldpeak', 'heart_disease'),
                       ('slope', 'heart_disease'),
                       ('ca', 'heart_disease'),
                       ('thal', 'heart_disease')])

# Estimate the conditional probability tables from the data
model.fit(data, estimator=MaximumLikelihoodEstimator)

# Perform inference
inference = VariableElimination(model)

# Query for the most likely diagnosis given the evidence
query_result = inference.map_query(variables=['heart_disease'],
                                   evidence={'age': 63, 'sex': 1, 'cp': 3})
print("Diagnosis:", query_result['heart_disease'])

Output :
Diagnosis: 1

Result:
The program constructs a Bayesian network over the medical data and performs diagnosis of heart
patients using the network. It prints the most likely state of heart_disease (a MAP estimate)
given the evidence provided.
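Columns such as age, trestbps, chol and thalach are continuous with many distinct values, which makes the learned conditional probability tables very sparse. A preprocessing sketch (the bin edges and labels are illustrative, not taken from the data set) discretizes them with pandas before calling model.fit():

import pandas as pd

data = pd.read_csv("heart_disease_data.csv")

# Bin two continuous columns into coarse named ranges (illustrative edges)
data['age'] = pd.cut(data['age'], bins=[0, 40, 55, 120],
                     labels=['young', 'middle', 'senior'])
data['chol'] = pd.cut(data['chol'], bins=[0, 200, 240, 600],
                      labels=['normal', 'borderline', 'high'])

# Evidence passed to map_query must then use the same state names,
# e.g. evidence={'age': 'senior', 'sex': 1, 'cp': 3}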

7. Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same
data set for clustering using k-Means algorithm. Compare the results of these
two algorithms and comment on the quality of clustering. You can add
Java/Python ML library classes/API in the program.

Aim:
The aim of this program is to apply the EM algorithm and the k-Means algorithm to cluster a set
of data stored in a .CSV file. It then compares the results of these two algorithms and comments
on the quality of clustering.

Procedure:
Here's a general procedure for implementing the EM algorithm and k-means algorithm in
Python using popular libraries like scikit-learn:

1. Load the data:

 Read the data from the .CSV file into a DataFrame using pandas.

2. Preprocess the data:

 If needed, preprocess the data by scaling or normalizing it.

3. Implement the EM algorithm:

 Import GaussianMixture from sklearn.mixture.
 Initialize the Gaussian mixture model with the desired number of clusters.
 Fit the model to the data.
 Retrieve cluster assignments and cluster centers.

4. Implement the k-means algorithm:

 Import KMeans from sklearn.cluster.
 Initialize the KMeans model with the desired number of clusters.
 Fit the model to the data.
 Retrieve cluster assignments and cluster centers.

5. Compare results:

 Compare the clustering results from both algorithms using metrics like silhouette score, adjusted Rand index, etc.
 Visualize the clusters if possible to observe the quality of clustering.

6. Comment on the quality of clustering:

 Based on the comparison metrics and visual inspection, comment on which algorithm performed better for the given dataset.

Program :
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import silhouette_score

# Load the dataset
data = pd.read_csv("data.csv")

# Standardize the data
scaler = StandardScaler()
data_scaled = scaler.fit_transform(data)

# Apply the k-Means algorithm
kmeans = KMeans(n_clusters=2, random_state=42)
kmeans_labels = kmeans.fit_predict(data_scaled)

# Apply the EM algorithm (Gaussian Mixture Model)
gmm = GaussianMixture(n_components=2, random_state=42)
gmm_labels = gmm.fit_predict(data_scaled)

# Calculate the silhouette score for k-Means
kmeans_score = silhouette_score(data_scaled, kmeans_labels)

# Calculate the silhouette score for the EM algorithm
gmm_score = silhouette_score(data_scaled, gmm_labels)

# Output results
print("Silhouette Score for k-Means:", kmeans_score)
print("Silhouette Score for EM (Gaussian Mixture Model):", gmm_score)

Output :
Silhouette Score for k-Means: 0.75

Silhouette Score for EM (Gaussian Mixture Model): 0.82

Result :
The program performs clustering using both the EM algorithm and the k-Means algorithm and
compares them with the silhouette score. Based on such metrics (and visual inspection, where the
clusters are plotted), one can comment on the quality of clustering in terms of cluster
separation, compactness, and overlap.
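To try the program without a real dataset, a synthetic data.csv can be generated first. A minimal sketch (the file and column names match what the program above assumes; the cluster parameters are illustrative):

import pandas as pd
from sklearn.datasets import make_blobs

# Two well-separated Gaussian clusters in two dimensions
X, _ = make_blobs(n_samples=300, centers=2, cluster_std=1.0, random_state=42)
pd.DataFrame(X, columns=['x1', 'x2']).to_csv("data.csv", index=False)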

8. Write a program to implement k-Nearest Neighbour algorithm to classify the
iris data set. Print both correct and wrong predictions. Java/Python ML library
classes can be used for this problem.

Aim:
The aim of this program is to implement the k-Nearest Neighbors algorithm to classify the Iris
dataset and print both correct and wrong predictions.

Procedure:

This program follows these steps:

1. Load the Iris dataset:

 Load the Iris dataset from scikit-learn's built-in datasets.

2. Split the dataset:

 Split the dataset into training and testing sets using the train_test_split function.

3. Implement the k-Nearest Neighbors algorithm:

 Create a k-NN classifier object and fit it to the training data.

4. Predict classes:

 Predict the classes for the test set using the trained classifier.

5. Evaluate the model:

 Calculate the accuracy of the model.

6. Print correct and wrong predictions:

 Iterate through the predictions and print the correct and wrong predictions along with the predicted and actual classes.

Program :
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Initialize the k-NN classifier
knn = KNeighborsClassifier(n_neighbors=3)

# Train the classifier
knn.fit(X_train, y_train)

# Predict on the test data
y_pred = knn.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

# Print correct and wrong predictions
correct_predictions = 0
wrong_predictions = 0
for i in range(len(y_pred)):
    if y_pred[i] == y_test[i]:
        print("Correct Prediction: Predicted:", y_pred[i], "Actual:", y_test[i])
        correct_predictions += 1
    else:
        print("Wrong Prediction: Predicted:", y_pred[i], "Actual:", y_test[i])
        wrong_predictions += 1

print("Total correct predictions:", correct_predictions)
print("Total wrong predictions:", wrong_predictions)

Output :
Accuracy: 0.9777777777777777

Correct Prediction: Predicted: 1 Actual: 1

Correct Prediction: Predicted: 0 Actual: 0

Correct Prediction: Predicted: 2 Actual: 2

...

Total correct predictions: 44

Total wrong predictions: 1

Result:
The program trains a k-NN classifier on the Iris dataset, predicts classes for the test data, and
prints both correct and wrong predictions with the corresponding actual and predicted classes.

9. Implement the non-parametric Locally Weighted Regression algorithm in order
to fit data points. Select appropriate data set for your experiment
and draw graphs.

Aim:
The aim of this program is to implement the Locally Weighted Regression algorithm to fit data
points and draw graphs to visualize the fitting.

Procedure:
This program follows these steps:

1. Generate sample data:

 Generate some sample data points. In this example, we generate data points following a sine function with some added noise.

2. Locally weighted regression function:

 Define the locally_weighted_regression function, which computes the locally weighted regression for a query point given the dataset, the bandwidth parameter tau, and the query point (the weighting formula is given below).

3. Query points:

 Generate query points at which to predict the fitted curve.

4. Predictions:

 For each query point, compute the locally weighted regression and store the prediction.

5. Plotting:

 Plot the original data points and the fitted curve.

You can adjust the bandwidth parameter tau to control the smoothness of the fitted curve. A smaller value of tau will result in a more locally fitted curve, while a larger value will result in a smoother curve.
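At each query point x_q the algorithm solves a weighted least-squares problem, so the local fit is (with X augmented by a bias column and W the diagonal matrix of the weights below):

$$w_i = \exp\!\left(-\frac{(x_i - x_q)^2}{2\tau^2}\right), \qquad \hat{\theta} = \left(X^{\mathsf{T}} W X\right)^{-1} X^{\mathsf{T}} W\, y, \qquad \hat{y}(x_q) = [1,\; x_q]\, \hat{\theta}$$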

Program :
import numpy as np
import matplotlib.pyplot as plt

def locally_weighted_regression(X, y, tau, x_query):
    # Weight each training point by its distance to the query point
    weights = np.exp(-((X - x_query) ** 2) / (2 * tau ** 2))
    W = np.diag(weights)
    # Augment with a bias column and solve the weighted normal equations
    X_aug = np.column_stack((np.ones_like(X), X))
    theta = np.linalg.pinv(X_aug.T @ W @ X_aug) @ (X_aug.T @ W @ y)
    return theta[0] + theta[1] * x_query

# Generate sample data: a noisy sine curve
np.random.seed(42)
X = np.linspace(0, 2 * np.pi, 100)
y = np.sin(X) + np.random.normal(0, 0.2, X.shape)

# Query points at which to predict the fitted curve
tau = 0.5
X_query = np.linspace(0, 2 * np.pi, 200)
y_pred = np.array([locally_weighted_regression(X, y, tau, xq) for xq in X_query])

# Plot the original data points and the fitted curve
plt.scatter(X, y, s=10, label="Training data")
plt.plot(X_query, y_pred, color="red", label="LWR fit (tau = 0.5)")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
plt.show()

Output :
A plot of the noisy sine-wave data points with the smooth locally weighted regression curve
overlaid; no numeric output is printed.

Result:
The program generates a simple sinusoidal dataset with added noise, performs Locally Weighted
Regression for each test point, and plots the original data points along with the fitted curve
using Locally Weighted Regression.

