Practical No.: 10
Name: Hiren Daxeshbhai Patel Roll No.: 07
Title: Write a program to implement the K-Means clustering algorithm. Select your own dataset to
test the program. Demonstrate the nature of the output with varying values of K.
Software Requirement:
• Python
• NumPy
• Pandas
• matplotlib
• scikit-learn
• Jupyter Notebook
Source Code:
# Import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Function to perform K-Means clustering
def perform_kmeans(X, k):
    kmeans = KMeans(n_clusters=k, random_state=42)
    kmeans.fit(X)
    return kmeans
# Function to plot K-Means results
def plot_kmeans(X, kmeans, k):
    # Use PCA to reduce dimensionality to 2D for visualization
    pca = PCA(n_components=2)
    X_reduced = pca.fit_transform(X)
    plt.figure(figsize=(8, 6))
    # Data points, coloured by their assigned cluster
    plt.scatter(X_reduced[:, 0], X_reduced[:, 1], c=kmeans.labels_, cmap='viridis', marker='o')
    # Cluster centres, projected into the same 2D PCA space
    centers = pca.transform(kmeans.cluster_centers_)
    plt.scatter(centers[:, 0], centers[:, 1], c='red', s=200, alpha=0.75, marker='X')
    plt.title(f'K-Means Clustering with K={k}')
    plt.xlabel('Principal Component 1')
    plt.ylabel('Principal Component 2')
    plt.grid()
    plt.show()
# Test the K-Means algorithm with varying values of K
K_values = [2, 3, 4, 5]
for k in K_values:
    kmeans_model = perform_kmeans(X, k)
    plot_kmeans(X, kmeans_model, k)
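
The plots show how the grouping changes with K, but a numerical comparison can also help. The sketch below is an illustration only and not part of the original program: it records the inertia (within-cluster sum of squared distances) and the silhouette score for each K. The helper name `evaluate_k` is hypothetical, and `n_init=10` is an assumed setting chosen so that each K is fitted from several starts.

```python
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def evaluate_k(X, k_values, random_state=42):
    """Return {k: (inertia, silhouette)} for each k (hypothetical helper)."""
    results = {}
    for k in k_values:
        # n_init=10 is an assumption: keep the best of 10 initialisations.
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(X)
        results[k] = (model.inertia_, silhouette_score(X, model.labels_))
    return results


if __name__ == "__main__":
    X = load_iris().data
    for k, (inertia, sil) in evaluate_k(X, [2, 3, 4, 5]).items():
        print(f"K={k}: inertia={inertia:.2f}, silhouette={sil:.3f}")
```

Inertia tends to fall as K grows, so it is usually read for an "elbow" rather than minimised outright; the silhouette score (between -1 and 1, higher is better) gives a second view that does not automatically favour larger K.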
Output:
Conclusions:
K-Means is a simple yet powerful clustering technique that efficiently groups data into clusters
based on similarity. The quality of the result depends on the choice of K and on the initial
placement of centroids, so several runs with different initializations may be needed to achieve
good results. Careful selection of K and evaluation of the resulting clusters are essential for
achieving high performance in clustering tasks.
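
The dependence on the initial centroids can be seen in a minimal NumPy sketch of the standard K-Means (Lloyd) iterations, run from several random starts. This is an illustration under stated assumptions, not the scikit-learn implementation used above: the function names `lloyd_kmeans` and `best_of_restarts` are hypothetical, centroids are initialised from randomly chosen data points, and the data are synthetic Gaussian blobs rather than the Iris dataset.

```python
import numpy as np


def lloyd_kmeans(X, k, seed, max_iter=100):
    """One run of Lloyd's iterations from a random start.

    Returns (centers, labels, inertia).
    """
    rng = np.random.default_rng(seed)
    # Initialise centroids as k distinct data points chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its points
        # (an empty cluster keeps its previous centroid).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final labels and inertia (sum of squared distances to assigned centroid).
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    inertia = float((dists[np.arange(len(X)), labels] ** 2).sum())
    return centers, labels, inertia


def best_of_restarts(X, k, seeds):
    """Run lloyd_kmeans once per seed and keep the lowest-inertia run."""
    runs = [lloyd_kmeans(X, k, seed) for seed in seeds]
    return min(runs, key=lambda run: run[2])


if __name__ == "__main__":
    # Synthetic data (an assumption for illustration): three 2-D Gaussian blobs.
    rng = np.random.default_rng(0)
    blob_means = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
    X = np.vstack([rng.normal(m, 0.5, size=(50, 2)) for m in blob_means])

    for seed in range(5):
        print(f"seed={seed}: inertia={lloyd_kmeans(X, 3, seed)[2]:.2f}")
    print(f"best of 5: inertia={best_of_restarts(X, 3, range(5))[2]:.2f}")
```

Keeping the lowest-inertia run across restarts is the same idea that the `n_init` parameter of scikit-learn's `KMeans` provides.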