Program 7

Aim
Implement and demonstrate the working model of the k-Means clustering algorithm with the Expectation-Maximization concept.

Program
Apply the EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for clustering using the k-Means algorithm. Compare the results of these two algorithms and comment on the quality of clustering. You can add Python ML library classes/APIs in the program.
CONCEPT - Expectation-Maximization (EM)
• Clustering is a type of unsupervised learning wherein data points are grouped into
different sets based on their degree of similarity.
• The Expectation-Maximization (EM) algorithm is a popular method for finding
maximum likelihood estimates of parameters in statistical models, particularly when
the model depends on unobserved latent variables. In the context of clustering, the EM
algorithm is often used with Gaussian Mixture Models (GMMs).
• The goal of clustering is to partition data points into clusters such that points within
the same cluster are more similar to each other than to points in other clusters. GMMs
assume that the data points are generated from a mixture of several Gaussian
distributions, each representing a cluster.
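For reference, the standard mixture density that a GMM assumes (general notation, not specific to this program) is:

p(x) = Σ (k = 1 to K) π_k · N(x | μ_k, Σ_k)

where π_k are the mixing coefficients (non-negative and summing to 1), and μ_k and Σ_k are the mean and covariance of the k-th Gaussian component.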
Steps of the EM Algorithm
1. Initialization: Initialize the parameters of the Gaussian components, such as the means, covariances, and mixing coefficients. These can be initialized randomly or using some heuristic method.
2. Expectation (E-step): For each data point, compute the responsibility of each Gaussian component, i.e., the posterior probability that the component generated that point, given the current parameter estimates.
3. Maximization (M-step): Re-estimate the means, covariances, and mixing coefficients using the responsibilities computed in the E-step, so that the expected log-likelihood of the data is maximized.
4. Convergence check: Repeat the E-step and M-step until the parameters or the log-likelihood stop changing significantly (see the NumPy sketch below).
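The following is a minimal, self-contained NumPy sketch of these E/M updates for a one-dimensional, two-component GMM. The toy data, variable names, and iteration count are illustrative assumptions only; the lab program itself uses sklearn's GaussianMixture, which implements the same idea for the general case.

import numpy as np

rng = np.random.default_rng(0)
# Toy 1-D data drawn from two Gaussians (illustrative only)
x = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(5.0, 1.0, 100)])

K = 2
mu = np.array([x.min(), x.max()])   # heuristic initialization of the means
var = np.full(K, x.var())           # start both variances at the overall variance
pi = np.full(K, 1.0 / K)            # equal mixing coefficients

for _ in range(50):
    # E-step: responsibility of each component for each point
    dens = pi / np.sqrt(2 * np.pi * var) * np.exp(-(x[:, None] - mu) ** 2 / (2 * var))
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters from the responsibilities
    Nk = resp.sum(axis=0)
    mu = (resp * x[:, None]).sum(axis=0) / Nk
    var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
    pi = Nk / len(x)

print("means:", mu, "variances:", var, "weights:", pi)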
K-Means Clustering
The algorithm categorizes the items into k groups based on their similarity. Euclidean distance is used as the distance measure to quantify that similarity.
Algorithm
Input: Data points X = {x1, x2, ..., xn}, number of clusters K
Output: Cluster assignments for each data point
1. Initialize K cluster centroids randomly
2. Repeat until convergence or a maximum number of iterations:
a. Assign each data point xi to the nearest centroid
b. Recalculate the centroids as the mean of the data points assigned to each cluster
3. Return the cluster assignments
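The loop above can be written directly in NumPy. The sketch below is a minimal illustration of the same steps (function name, random initialization scheme, and iteration cap are illustrative assumptions; the lab program itself uses sklearn's KMeans):

import numpy as np

def kmeans(X, K, n_iters=100, seed=42):
    rng = np.random.default_rng(seed)
    # Step 1: initialize centroids as K distinct random data points
    centroids = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iters):
        # Step 2a: assign each point to the nearest centroid (Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 2b: recompute each centroid as the mean of its assigned points
        # (empty clusters are not handled in this sketch)
        new_centroids = np.array([X[labels == k].mean(axis=0) for k in range(K)])
        if np.allclose(new_centroids, centroids):  # converged
            break
        centroids = new_centroids
    # Step 3: return the final assignments and centroids
    return labels, centroids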
silhouette_score
The silhouette_score is a metric used to evaluate the quality of clustering results. It
measures how similar an object is to its own cluster compared to other clusters. The
silhouette score ranges from -1 to 1, where:
• 1 indicates that the data points are well-clustered, meaning they are close to other
points in the same cluster and far from points in other clusters.
• 0 indicates that the data points are on or very close to the decision boundary between
two neighboring clusters.
• -1 indicates that the data points may have been assigned to the wrong cluster, as they
are closer to points in other clusters than to points in their own cluster.
Formula
The silhouette score for a single data point is calculated as:

s = (b - a) / max(a, b)

where:
• a is the mean distance between the data point and all other points in the same cluster.
• b is the mean distance between the data point and all points in the nearest other cluster (i.e., the cluster that minimizes this mean distance).
The overall silhouette score for a clustering solution is the mean silhouette score over all data points.
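As a quick illustration of the metric (the data here is made up purely to show the API), two tight, well-separated clusters score close to 1:

import numpy as np
from sklearn.metrics import silhouette_score

# Two obviously separated clusters, so the score should be near 1
X = np.array([[0, 0], [0, 1], [1, 0],
              [10, 10], [10, 11], [11, 10]])
labels = [0, 0, 0, 1, 1, 1]
print(silhouette_score(X, labels))  # ~0.9, indicating well-separated clusters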
Program:
from sklearn import datasets
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
from sklearn.metrics import silhouette_score
# Step 1: Load the Iris dataset
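# Note: the built-in Iris data stands in here for the .CSV file mentioned in
# the aim; a CSV could be loaded instead (e.g., with pandas.read_csv).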
iris = datasets.load_iris()
X = iris.data
y = iris.target
# Step 2: Cluster the data using k-Means algorithm
kmeans = KMeans(n_clusters=3, random_state=42)
kmeans_labels = kmeans.fit_predict(X)
kmeans_silhouette = silhouette_score(X, kmeans_labels)
# Step 3: Cluster the data using EM algorithm (Gaussian Mixture Model)
gmm = GaussianMixture(n_components=3, random_state=42)
gmm_labels = gmm.fit_predict(X)
gmm_silhouette = silhouette_score(X, gmm_labels)
# Step 4: Compare the clustering results and visualize them
print(f"Silhouette Score for k-Means: {kmeans_silhouette}")
print(f"Silhouette Score for EM (GMM): {gmm_silhouette}")
# Visualization of the clusters
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 6))
# k-Means Clustering
ax1.scatter(X[:, 0], X[:, 1], c=kmeans_labels, cmap='viridis', marker='o',
            edgecolor='k', s=50)
ax1.set_title('k-Means Clustering')
# EM (GMM) Clustering
ax2.scatter(X[:, 0], X[:, 1], c=gmm_labels, cmap='viridis', marker='o',
            edgecolor='k', s=50)
ax2.set_title('EM (GMM) Clustering')
plt.show()
# Step 5: Comment on the quality of clustering
if kmeans_silhouette > gmm_silhouette:
    print("k-Means clustering provides better quality clusters "
          "according to the silhouette score.")
else:
    print("EM (GMM) clustering provides better quality clusters "
          "according to the silhouette score.")
Output
Silhouette Score for k-Means: 0.551191604619592
Silhouette Score for EM (GMM): 0.5011761635067206
k-Means clustering provides better quality clusters according to the silhouette score.