
ML Lab Exp 7 K-Means Clustering

This document analyzes customer data using k-means clustering in Python. It loads customer data from a CSV file, explores the data, uses the elbow method to choose the number of clusters, and then segments the customers with k-means.

Uploaded by

Yuvraj Singh Rathore


import pandas as pd

import numpy as np
import matplotlib.pyplot as plt

data=pd.read_csv("Mall_Customers.csv")
data.head()

   CustomerID   Genre  Age  Annual Income (k$)  Spending Score (1-100)
0           1    Male   19                  15                      39
1           2    Male   21                  15                      81
2           3  Female   20                  16                       6
3           4  Female   23                  16                      77
4           5  Female   31                  17                      40

data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 5 columns):
 #   Column                  Non-Null Count  Dtype
---  ------                  --------------  -----
 0   CustomerID              200 non-null    int64
 1   Genre                   200 non-null    object
 2   Age                     200 non-null    int64
 3   Annual Income (k$)      200 non-null    int64
 4   Spending Score (1-100)  200 non-null    int64
dtypes: int64(4), object(1)
memory usage: 7.9+ KB

data.isnull().sum()

CustomerID 0
Genre 0
Age 0
Annual Income (k$) 0
Spending Score (1-100) 0
dtype: int64

# Count the number of customers in each Genre class

class_counts = data.groupby('Genre').size()
print(class_counts)

Genre
Female 112
Male 88
dtype: int64

data.hist()
plt.show()
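# The original cell evaluated `data.describe` without parentheses, which only
# echoes the bound method together with the DataFrame's repr (recorded below).
# Calling data.describe() is what returns the summary statistics.
data.describe()

<bound method NDFrame.describe of
     CustomerID   Genre  Age  Annual Income (k$)  Spending Score (1-100)
0             1    Male   19                  15                      39
1             2    Male   21                  15                      81
2             3  Female   20                  16                       6
3             4  Female   23                  16                      77
4             5  Female   31                  17                      40
..          ...     ...  ...                 ...                     ...
195         196  Female   35                 120                      79
196         197  Female   45                 126                      28
197         198    Male   32                 126                      74
198         199    Male   32                 137                      18
199         200    Male   30                 137                      83

[200 rows x 5 columns]>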

# Extracting the two features used for clustering:
# column 3 = Annual Income (k$), column 4 = Spending Score (1-100)

X = data.iloc[:, [3, 4]].values

array([[ 15, 39],

[ 15, 81],
[ 16, 6],
[ 16, 77],
[ 17, 40],
[ 17, 76],
[ 18, 6],
[ 18, 94],
[ 19, 3],
[ 19, 72],
[ 19, 14],
[ 19, 99],
[ 20, 15],
[ 20, 77],
[ 20, 13],
[ 20, 79],
[ 21, 35],
[ 21, 66],
[ 23, 29],
[ 23, 98],
[ 24, 35],
[ 24, 73],
[ 25, 5],
[ 25, 73],
[ 28, 14],
[ 28, 82],
[ 28, 32],
[ 28, 61],
[ 29, 31],
[ 29, 87],
[ 30, 4],
[ 30, 73],
[ 33, 4],
[ 33, 92],
[ 33, 14],
[ 33, 81],
[ 34, 17],
[ 34, 73],
[ 37, 26],
[ 37, 75],
[ 38, 35],
[ 38, 92],
[ 39, 36],
[ 39, 61],
[ 39, 28],
[ 39, 65],
[ 40, 55],
[ 40, 47],
[ 40, 42],
[ 40, 42],
[ 42, 52],
[ 42, 60],
[ 43, 54],
[ 43, 60],
[ 43, 45],
[ 43, 41],
[ 44, 50],
[ 44, 46],
[ 46, 51],
[ 46, 46],
[ 46, 56],
[ 46, 55],
[ 47, 52],
[ 47, 59],
[ 48, 51],
[ 48, 59],
[ 48, 50],
[ 48, 48],
[ 48, 59],
[ 48, 47],
[ 49, 55],
[ 49, 42],
[ 50, 49],
[ 50, 56],
[ 54, 47],
[ 54, 54],
[ 54, 53],
[ 54, 48],
[ 54, 52],
[ 54, 42],
[ 54, 51],
[ 54, 55],
[ 54, 41],
[ 54, 44],
[ 54, 57],
[ 54, 46],
[ 57, 58],
[ 57, 55],
[ 58, 60],
[ 58, 46],
[ 59, 55],
[ 59, 41],
[ 60, 49],
[ 60, 40],
[ 60, 42],
[ 60, 52],
[ 60, 47],
[ 60, 50],
[ 61, 42],
[ 61, 49],
[ 62, 41],
[ 62, 48],
[ 62, 59],
[ 62, 55],
[ 62, 56],
[ 62, 42],
[ 63, 50],
[ 63, 46],
[ 63, 43],
[ 63, 48],
[ 63, 52],
[ 63, 54],
[ 64, 42],
[ 64, 46],
[ 65, 48],
[ 65, 50],
[ 65, 43],
[ 65, 59],
[ 67, 43],
[ 67, 57],
[ 67, 56],
[ 67, 40],
[ 69, 58],
[ 69, 91],
[ 70, 29],
[ 70, 77],
[ 71, 35],
[ 71, 95],
[ 71, 11],
[ 71, 75],
[ 71, 9],
[ 71, 75],
[ 72, 34],
[ 72, 71],
[ 73, 5],
[ 73, 88],
[ 73, 7],
[ 73, 73],
[ 74, 10],
[ 74, 72],
[ 75, 5],
[ 75, 93],
[ 76, 40],
[ 76, 87],
[ 77, 12],
[ 77, 97],
[ 77, 36],
[ 77, 74],
[ 78, 22],
[ 78, 90],
[ 78, 17],
[ 78, 88],
[ 78, 20],
[ 78, 76],
[ 78, 16],
[ 78, 89],
[ 78, 1],
[ 78, 78],
[ 78, 1],
[ 78, 73],
[ 79, 35],
[ 79, 83],
[ 81, 5],
[ 81, 93],
[ 85, 26],
[ 85, 75],
[ 86, 20],
[ 86, 95],
[ 87, 27],
[ 87, 63],
[ 87, 13],
[ 87, 75],
[ 87, 10],
[ 87, 92],
[ 88, 13],
[ 88, 86],
[ 88, 15],
[ 88, 69],
[ 93, 14],
[ 93, 90],
[ 97, 32],
[ 97, 86],
[ 98, 15],
[ 98, 88],
[ 99, 39],
[ 99, 97],
[101, 24],
[101, 68],
[103, 17],
[103, 85],
[103, 23],
[103, 69],
[113, 8],
[113, 91],
[120, 16],
[120, 79],
[126, 28],
[126, 74],
[137, 18],
[137, 83]], dtype=int64)

# Using the elbow method to find the optimal number of clusters

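from sklearn.cluster import KMeans

wcss = []  # within-cluster sum of squares (inertia_) for each k
for i in range(1, 11):
    # Passing n_init=10 explicitly (the default in this scikit-learn version)
    # would silence the FutureWarning recorded in the output.
    kmeans = KMeans(n_clusters=i, init='k-means++', random_state=42)
    kmeans.fit(X)
    wcss.append(kmeans.inertia_)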

C:\Users\Shyam Singh\anaconda3\lib\site-packages\sklearn\cluster\
_kmeans.py:870: FutureWarning: The default value of `n_init` will
change from 10 to 'auto' in 1.4. Set the value of `n_init` explicitly
to suppress the warning
warnings.warn(
C:\Users\Shyam Singh\anaconda3\lib\site-packages\sklearn\cluster\
_kmeans.py:1382: UserWarning: KMeans is known to have a memory leak on
Windows with MKL, when there are less chunks than available threads.
You can avoid it by setting the environment variable
OMP_NUM_THREADS=1.
warnings.warn(

plt.plot(range(1, 11), wcss)  # WCSS: within-cluster sum of squares

plt.title('The Elbow Method')
plt.xlabel('Number of clusters')
plt.ylabel('WCSS')
plt.show()
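• The elbow method is a technique for choosing the number of centroids (k) in the k-means clustering algorithm.

• In this method we fit k-means repeatedly for k = 1 to n, where n is an upper bound chosen in advance (10 in the loop above), record the WCSS for each k, and pick the k at which the curve stops dropping sharply (the "elbow").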
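
The elbow is usually read off the plot by eye. As a rough illustration, it can also be picked automatically by taking the point of the WCSS curve that lies farthest from the straight line joining its first and last points. The sketch below is one common heuristic, not part of the original notebook; the function name `elbow_index` and the WCSS values are made up for illustration and do not come from the mall dataset.

```python
def elbow_index(wcss):
    """Index of the point farthest from the chord joining the first and last
    points of the WCSS curve (a simple elbow heuristic)."""
    n = len(wcss)
    x0, y0 = 0, wcss[0]
    x1, y1 = n - 1, wcss[-1]
    best_i, best_d = 0, -1.0
    for i, y in enumerate(wcss):
        # Unnormalised perpendicular distance from (i, y) to the chord;
        # dividing by the chord length would not change which point wins.
        d = abs((y1 - y0) * (i - x0) - (x1 - x0) * (y - y0))
        if d > best_d:
            best_i, best_d = i, d
    return best_i


# Hypothetical WCSS values for k = 1..8 (illustrative only).
example_wcss = [1000.0, 700.0, 450.0, 250.0, 100.0, 90.0, 82.0, 76.0]
best_k = elbow_index(example_wcss) + 1  # +1 because k starts at 1
print(best_k)
```

With the notebook's own `wcss` list, the same call would be `elbow_index(wcss) + 1`. The result is only a suggestion and is worth checking against the plot.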

# Fitting K-Means to the dataset

from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters = 5, init = 'k-means++', random_state = 42)
print(kmeans)
y_kmeans = kmeans.fit_predict(X)

KMeans(n_clusters=5, random_state=42)

print("Within-cluster sum of squares when k=5:", kmeans.inertia_)

Within-cluster sum of squares when k=5: 44448.45544793371
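
The `inertia_` value is the sum, over all samples, of the squared Euclidean distance from each sample to its closest cluster center. A minimal pure-Python sketch of that quantity follows; the helper name `wcss_of` and the four toy points are assumptions for illustration, not data from the notebook.

```python
def wcss_of(points, centers):
    """Sum over all points of the squared distance to the nearest center."""
    total = 0.0
    for p in points:
        total += min(
            sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers
        )
    return total


# Toy data: two tight pairs of points and one center per pair.
toy_points = [(1.0, 1.0), (1.0, 3.0), (9.0, 1.0), (9.0, 3.0)]
toy_centers = [(1.0, 2.0), (9.0, 2.0)]

print(wcss_of(toy_points, toy_centers))   # every point is 1 unit from its center
print(wcss_of(toy_points, [(5.0, 2.0)]))  # a single center gives a much larger value
```

This is why WCSS always shrinks as k grows: more centers can only bring each point closer to its nearest one, so the elbow plot looks for where the improvement levels off rather than for a minimum.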

print("Cluster centers are:\n", kmeans.cluster_centers_)

Cluster centers are:
 [[55.2962963  49.51851852]
 [88.2        17.11428571]
 [26.30434783 20.91304348]
 [25.72727273 79.36363636]
 [86.53846154 82.12820513]]

print("Number of iterations", kmeans.n_iter_)

Number of iterations 3
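
To make "iteration" concrete: one iteration of the classic k-means (Lloyd) procedure assigns every point to its nearest center and then moves each center to the mean of its points. The sketch below is a simplified pure-Python version under stated assumptions. The function name `lloyd_kmeans`, the toy points, and the starting centers are hypothetical, and scikit-learn's own implementation differs in details such as k-means++ initialisation and its tolerance-based stopping rule, so its `n_iter_` need not match this count.

```python
def lloyd_kmeans(points, centers, max_iter=100):
    """Repeat assignment and update steps until the centers stop moving.
    Returns (centers, labels, number_of_iterations)."""
    centers = [tuple(c) for c in centers]
    labels = []
    for iteration in range(1, max_iter + 1):
        # Assignment step: label each point with the index of its nearest center.
        labels = [
            min(range(len(centers)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(old)  # leave an empty cluster's center in place
        if new_centers == centers:
            return centers, labels, iteration
        centers = new_centers
    return centers, labels, max_iter


toy_points = [(1.0, 1.0), (1.0, 3.0), (9.0, 1.0), (9.0, 3.0)]
final_centers, final_labels, n_iterations = lloyd_kmeans(
    toy_points, [(0.0, 0.0), (10.0, 0.0)]
)
print(final_centers, final_labels, n_iterations)
```

On this toy input the first pass moves the centers to the two pair means and the second pass confirms nothing changes, so the loop stops after two iterations.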

# Visualising the data before clustering

plt.scatter(X[:, 0], X[:, 1], s=100, c='black', label='Data Distribution')
plt.title('Customer Distribution before clustering')
plt.xlabel('Annual Income (k$)')
plt.ylabel('Spending Score (1-100)')
plt.legend()
plt.show()
frame = pd.DataFrame(X)
frame['cluster'] = y_kmeans
frame['cluster'].value_counts()

0 81
4 39
1 35
2 23
3 22
Name: cluster, dtype: int64

Annual_Income = 70  #@param {type:"number"}
Spending_Score = 40  #@param {type:"number"}

# predict() returns an array with one cluster label per input row
predict = kmeans.predict([[Annual_Income, Spending_Score]])
print(predict)

label = predict[0]  # the single predicted cluster label
if label == 0:
    print("Customer is careless")
elif label == 1:
    print("Customer is standard")
elif label == 2:
    print("Customer is Target")
elif label == 3:
    print("Customer is careful")
else:
    print("Customer is sensible")

[0]
Customer is careless
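
For a fitted k-means model, prediction amounts to finding the nearest centroid. The sketch below reproduces the result for the customer above in pure Python, using the centroids printed earlier in this run and the segment names from the if/elif chain. The names `CENTERS`, `SEGMENT_NAMES`, and `nearest_center` are introduced here for illustration and are not part of the notebook.

```python
# Centroids as printed by kmeans.cluster_centers_ in this run
# (list index = cluster label; columns = income, spending score).
CENTERS = [
    (55.2962963, 49.51851852),
    (88.2, 17.11428571),
    (26.30434783, 20.91304348),
    (25.72727273, 79.36363636),
    (86.53846154, 82.12820513),
]

# Segment names copied from the notebook's if/elif chain.
SEGMENT_NAMES = {
    0: "careless",
    1: "standard",
    2: "Target",
    3: "careful",
    4: "sensible",
}


def nearest_center(point, centers):
    """Index of the center with the smallest squared distance to point."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(point, c))
    return min(range(len(centers)), key=lambda i: sq_dist(centers[i]))


label = nearest_center((70, 40), CENTERS)
print(label, "Customer is", SEGMENT_NAMES[label])
```

A dictionary lookup replaces the if/elif chain, but the mapping stays tied to this particular run: cluster numbers are arbitrary and can change with a different `random_state` or library version, so the names should be rechecked against the centroids after any refit.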

# Visualising the clusters

plt.scatter(X[y_kmeans == 0, 0], X[y_kmeans == 0, 1], s=100, c='red', label='Cluster 1')
plt.scatter(X[y_kmeans == 1, 0], X[y_kmeans == 1, 1], s=100, c='blue', label='Cluster 2')
plt.scatter(X[y_kmeans == 2, 0], X[y_kmeans == 2, 1], s=100, c='green', label='Cluster 3')
plt.scatter(X[y_kmeans == 3, 0], X[y_kmeans == 3, 1], s=100, c='cyan', label='Cluster 4')
plt.scatter(X[y_kmeans == 4, 0], X[y_kmeans == 4, 1], s=100, c='magenta', label='Cluster 5')
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=300, c='yellow', label='Centroids')
plt.title('Clusters of customers')
plt.xlabel('Annual Income (k$)')
plt.ylabel('Spending Score (1-100)')
plt.legend()
plt.show()

# Saving the fitted model to disk

import pickle
filename = "model7.sav"
with open(filename, "wb") as f:  # the context manager closes the file after writing
    pickle.dump(kmeans, f)
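
A model saved with `pickle.dump` can be restored later with `pickle.load`. The round trip below is a minimal sketch that uses a plain dictionary as a stand-in for the fitted model and a temporary directory for the file, so that it runs on its own. Both are assumptions for illustration; with the real notebook the object would be `kmeans` and the path `model7.sav`.

```python
import os
import pickle
import tempfile

# Stand-in for the fitted model (assumption: any picklable object
# goes through the same dump/load round trip).
model = {"n_clusters": 5, "random_state": 42}

path = os.path.join(tempfile.mkdtemp(), "model7.sav")

# Write the object; the context manager closes the file afterwards.
with open(path, "wb") as f:
    pickle.dump(model, f)

# Read it back in a later session.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == model)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code. A restored scikit-learn model is also best loaded with the same library version that saved it.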
