0% found this document useful (0 votes)

8 views11 pages

Program 8

The document outlines the implementation of KMeans and DBSCAN clustering algorithms using the 'Mall_Customers.csv' dataset. It includes code snippets for data loading, clustering, and visualization of results, as well as warnings related to the KMeans algorithm's memory usage on Windows. The dataset consists of 200 customers with attributes such as annual income and spending score.

Uploaded by

switchxblade420

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

8 views11 pages

Program 8

Uploaded by

switchxblade420

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 11

program-8

December 7, 2023

0.0.1 Implement KMeans and DBSCAN algorithm using appropriate Data sets.

[5]: import numpy as np import matplotlib.pyplot as plt import

pandas as pd from sklearn.cluster import DBSCAN data =
pd.read_csv("Mall_Customers.csv") data.head() print("Dataset
shape:", data.shape) data.isnull().any().any() x =
data.loc[:, ['Annual Income (k$)','Spending Score (1-
100)']].values
# cluster the data into five clusters
dbscan = DBSCAN(eps = 8, min_samples =
4).fit(x)
# fitting the model
labels = dbscan.labels_ # getting the labels
plt.scatter(x[:, 0], x[:,1], c = labels, cmap= "plasma")
# plotting the clusters
plt.xlabel("Income") # X-axis label
plt.ylabel("Spending Score") # Y-axis
label plt.show() # showing the plot

Dataset shape: (200, 5)

[6]: import numpy as nm

import matplotlib.pyplot as mtp
import pandas as pd

# Importing the dataset

dataset = pd.read_csv('Mall_Customers.csv')

[7]: dataset

[7]: CustomerID Gender Age Annual Income (k$) Spending Score (1-100)
0 1 Male 19 15 39
1 2 Male 21 15 81
2 3 Female 20 16 6
3 4 Female 23 16 77

1
4 5 Female 31 17 40
.. … … … … …
195 196 Female 35 120 79
196 197 Female 45 126 28
197 198 Male 32 126 74
198 199 Male 32 137 18
199 200 Male 30 137 83
[200 rows x 5 columns]

[9]:
[8]: x = dataset.iloc[:, [3, 4]].values

x
[9]: array([[ 15,
39], [ 15,
81],
[ 16, 6],
[ 16, 77],
[ 17, 40],
[ 17, 76],
[ 18, 6],
[ 18, 94],
[ 19, 3],
[ 19, 72],
[ 19, 14],
[ 19, 99],
[ 20, 15],
[ 20, 77],
[ 20, 13],
[ 20, 79],
[ 21, 35],
[ 21, 66],
[ 23, 29],
[ 23, 98],
[ 24, 35],
[ 24, 73],
[ 25, 5],
[ 25, 73],
[ 28, 14],
[ 28, 82],

2
[ 28, 32],
[ 28, 61],
[ 29, 31],
[ 29, 87],
[ 30, 4],
[ 30, 73],
[ 33, 4],
[ 33, 92],
[ 33, 14],
[ 33, 81],
[ 34, 17],
[ 34, 73],
[ 37, 26],
[ 37, 75],
[ 38, 35],
[ 38, 92],
[ 39, 36],
[ 39, 61],
[ 39, 28],
[ 39, 65],
[ 40, 55],
[ 40, 47],
[ 40, 42],
[ 40, 42],
[ 42, 52],
[ 42, 60],
[ 43, 54],
[ 43, 60],
[ 43, 45],
[ 43, 41],
[ 44, 50],
[ 44, 46],
[ 46, 51],
[ 46, 46],
[ 46, 56],
[ 46, 55],
[ 47, 52],
[ 47, 59],
[ 48, 51],
[ 48, 59],
[ 48, 50],
[ 48, 48],
[ 48, 59],
[ 48, 47],
[ 49, 55],
[ 49, 42],

3
[ 50, 49],
[ 50, 56],
[ 54, 47],
[ 54, 54],
[ 54, 53],
[ 54, 48],
[ 54, 52],
[ 54, 42],
[ 54, 51],
[ 54, 55],
[ 54, 41],
[ 54, 44],
[ 54, 57],
[ 54, 46],
[ 57, 58], [ 57, 55],
[ 58, 60],
[ 58, 46],
[ 59, 55],
[ 59, 41],
[ 60, 49],
[ 60, 40],
[ 60, 42],
[ 60, 52],
[ 60, 47],
[ 60, 50],
[ 61, 42],
[ 61, 49],
[ 62, 41],
[ 62, 48],
[ 62, 59],
[ 62, 55],
[ 62, 56],
[ 62, 42],
[ 63, 50],
[ 63, 46],
[ 63, 43],
[ 63, 48],
[ 63, 52],
[ 63, 54],
[ 64, 42],
[ 64, 46],
[ 65, 48],
[ 65, 50],
[ 65, 43],
[ 65, 59],
[ 67, 43],

4
[ 67, 57],
[ 67, 56],
[ 67, 40],
[ 69, 58],
[ 69, 91],
[ 70, 29],
[ 70, 77],
[ 71, 35],
[ 71, 95],
[ 71, 11],
[ 71, 75],
[ 71, 9],
[ 71, 75],
[ 72, 34],
[ 72, 71], [ 73, 5],
[ 73, 88],
[ 73, 7],
[ 73, 73],
[ 74, 10],
[ 74, 72],
[ 75, 5],
[ 75, 93],
[ 76, 40],
[ 76, 87],
[ 77, 12],
[ 77, 97],
[ 77, 36],
[ 77, 74],
[ 78, 22],
[ 78, 90],
[ 78, 17],
[ 78, 88],
[ 78, 20],
[ 78, 76],
[ 78, 16],
[ 78, 89],
[ 78, 1],
[ 78, 78],
[ 78, 1],
[ 78, 73],
[ 79, 35],
[ 79, 83],
[ 81, 5],
[ 81, 93],
[ 85, 26],
[ 85, 75],

5
[ 86, 20],
[ 86, 95],
[ 87, 27],
[ 87, 63],
[ 87, 13],
[ 87, 75],
[ 87, 10],
[ 87, 92],
[ 88, 13],
[ 88, 86],
[ 88, 15],
[ 88, 69],
[ 93, 14],
[ 93, 90],
[ 97, 32], [ 97, 86],
[ 98, 15],
[ 98, 88],
[ 99, 39],
[ 99, 97],
[101, 24],
[101, 68],
[103, 17],
[103, 85],
[103, 23],
[103, 69],
[113, 8],
[113, 91],
[120, 16],
[120, 79],
[126, 28],
[126, 74],
[137, 18],
[137, 83]], dtype=int64)

[10]: #finding optimal number of clusters using the elbow

method from sklearn.cluster import KMeans
wcss_list= [] #Initializing the list for the values
of WCSS

#Using for loop for iterations from 1 to 10. for i in

range(1, 11): kmeans = KMeans(n_clusters=i, init='k-means+
+', random_state= 42) kmeans.fit(x)
wcss_list.append(kmeans.inertia_)
mtp.plot(range(1, 11),
wcss_list) mtp.title('The
Elobw Method Graph')

6
mtp.xlabel('Number of
clusters(k)')
mtp.ylabel('wcss_list')
mtp.show()

C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1412:
FutureWarning: The default value of `n_init` will change from 10 to
'auto' in
1.4. Set the value of `n_init` explicitly to suppress the warning
super()._check_params_vs_input(X, default_n_init=10)
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1436: UserWarning: KMeans is known to have a memory leak
on Windows with MKL, when there are less chunks than available
threads. You can avoid it by setting the environment variable
OMP_NUM_THREADS=1.
warnings.warn(
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1412:
FutureWarning: The default value of `n_init` will change from 10 to
'auto' in
1.4. Set the value of `n_init` explicitly to suppress the warning
super()._check_params_vs_input(X, default_n_init=10)
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1436: UserWarning: KMeans is known to have a memory leak
on Windows with MKL, when there are less chunks than available
threads. You can avoid it by setting the environment variable
OMP_NUM_THREADS=1.
warnings.warn(
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1412:
FutureWarning: The default value of `n_init` will change from 10 to
'auto' in
1.4. Set the value of `n_init` explicitly to suppress the warning
super()._check_params_vs_input(X, default_n_init=10)
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1436: UserWarning: KMeans is known to have a memory leak
on Windows with MKL, when there are less chunks than available
threads. You can avoid it by setting the environment variable
OMP_NUM_THREADS=1.
warnings.warn(
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1412:
FutureWarning: The default value of `n_init` will change from 10 to
'auto' in

7
1.4. Set the value of `n_init` explicitly to suppress the warning
super()._check_params_vs_input(X, default_n_init=10)
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1436: UserWarning: KMeans is known to have a memory leak
on Windows with MKL, when there are less chunks than available
threads. You can avoid it by setting the environment variable
OMP_NUM_THREADS=1.
warnings.warn(
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1412:
FutureWarning: The default value of `n_init` will change from 10 to
'auto' in
1.4. Set the value of `n_init` explicitly to suppress the warning
super()._check_params_vs_input(X, default_n_init=10)
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1436: UserWarning: KMeans is known to have a memory leak
on Windows with MKL, when there are less chunks than available
threads. You can avoid it by setting the environment variable
OMP_NUM_THREADS=1.
warnings.warn(
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1412:
FutureWarning: The default value of `n_init` will change from 10 to
'auto' in
1.4. Set the value of `n_init` explicitly to suppress the warning
super()._check_params_vs_input(X, default_n_init=10)
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1436: UserWarning: KMeans is known to have a memory leak
on Windows with MKL, when there are less chunks than available
threads. You can avoid it by setting the environment variable
OMP_NUM_THREADS=1.
warnings.warn(
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1412:
FutureWarning: The default value of `n_init` will change from 10 to
'auto' in
1.4. Set the value of `n_init` explicitly to suppress the warning
super()._check_params_vs_input(X, default_n_init=10)
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1436:
UserWarning: KMeans is known to have a memory leak on Windows with
MKL, when there are less chunks than available threads. You can
avoid it by setting the environment variable OMP_NUM_THREADS=1.
warnings.warn(
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1412:

8
FutureWarning: The default value of `n_init` will change from 10 to
'auto' in
1.4. Set the value of `n_init` explicitly to suppress the warning
super()._check_params_vs_input(X, default_n_init=10)
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1436: UserWarning: KMeans is known to have a memory leak
on Windows with MKL, when there are less chunks than available
threads. You can avoid it by setting the environment variable
OMP_NUM_THREADS=1.
warnings.warn(
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1412:
FutureWarning: The default value of `n_init` will change from 10 to
'auto' in
1.4. Set the value of `n_init` explicitly to suppress the warning
super()._check_params_vs_input(X, default_n_init=10)
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1436: UserWarning: KMeans is known to have a memory leak
on Windows with MKL, when there are less chunks than available
threads. You can avoid it by setting the environment variable
OMP_NUM_THREADS=1.
warnings.warn(
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1412:
FutureWarning: The default value of `n_init` will change from 10 to
'auto' in
1.4. Set the value of `n_init` explicitly to suppress the warning
super()._check_params_vs_input(X, default_n_init=10)
C:\Users\shilpa\anaconda3\Lib\site-packages\sklearn\cluster\
_kmeans.py:1436: UserWarning: KMeans is known to have a memory leak
on Windows with MKL, when there are less chunks than available
threads. You can avoid it by setting the environment variable
OMP_NUM_THREADS=1.
warnings.warn(

9
[11]: #training the K-means model on a dataset kmeans =
KMeans(n_clusters=5, init='k-means++', random_state= 42)
y_predict= kmeans.fit_predict(x)

#visulaizing the clusters mtp.scatter(x[y_predict == 0, 0],

x[y_predict == 0, 1], s = 100, c = 'blue',␣
↪label = 'Cluster 1') #for first cluster mtp.scatter(x[y_predict

== 1, 0], x[y_predict == 1, 1], s = 100, c = 'green',␣

↪label = 'Cluster 2') #for second cluster

mtp.scatter(x[y_predict== 2, 0], x[y_predict == 2, 1], s = 100, c

= 'red',␣
↪label = 'Cluster 3') #for third cluster mtp.scatter(x[y_predict

== 3, 0], x[y_predict == 3, 1], s = 100, c = 'cyan',␣

↪label = 'Cluster 4') #for fourth cluster mtp.scatter(x[y_predict ==

4, 0], x[y_predict == 4, 1], s = 100, c = 'magenta',␣

↪label = 'Cluster 5') #for fifth cluster

mtp.scatter(kmeans.cluster_centers_[:, 0],
kmeans.cluster_centers_[:, 1], s =␣

10
↪300, c = 'yellow', label = 'Centroid')

mtp.title('Clusters of customers')
mtp.xlabel('Annual Income (k$)')
mtp.ylabel('Spending Score (1-100)')
mtp.legend()
mtp.show()

[ ]:

Exercice 1 TP K-Means
No ratings yet
Exercice 1 TP K-Means
1 page
Tugas Clustering - 132021012 - Kevin Gazkia Naufal
No ratings yet
Tugas Clustering - 132021012 - Kevin Gazkia Naufal
6 pages
Customers K Means
No ratings yet
Customers K Means
11 pages
Lab Assignment 3 Ai
No ratings yet
Lab Assignment 3 Ai
1 page
LAB7 Kmeans
No ratings yet
LAB7 Kmeans
11 pages
K-Means Clustering - Jupyter Notebook
No ratings yet
K-Means Clustering - Jupyter Notebook
11 pages
Experiment 9
No ratings yet
Experiment 9
10 pages
Mlda - Lab
No ratings yet
Mlda - Lab
35 pages
KMeans
No ratings yet
KMeans
1 page
1 Kmeans-Pratical-No-1
No ratings yet
1 Kmeans-Pratical-No-1
8 pages
Customer Segmentation Using K-Means Clustering
No ratings yet
Customer Segmentation Using K-Means Clustering
11 pages
K Means Clustering
No ratings yet
K Means Clustering
6 pages
Day59 K Means Clustering 1701989733
No ratings yet
Day59 K Means Clustering 1701989733
5 pages
Inbuilt Kmeans
No ratings yet
Inbuilt Kmeans
3 pages
ML 5
No ratings yet
ML 5
12 pages
Elbow Method Without Using Sns
No ratings yet
Elbow Method Without Using Sns
3 pages
Data Mining - Project
100% (2)
Data Mining - Project
11 pages
K - Means - Clustering - Ipynb - Colaboratory
No ratings yet
K - Means - Clustering - Ipynb - Colaboratory
2 pages
Implement Clustering Algorithms For Unsupervised Classification
No ratings yet
Implement Clustering Algorithms For Unsupervised Classification
4 pages
K Means Clustering
100% (1)
K Means Clustering
10 pages
SOLUTION ONLY CODE DWDM - Lab - All
No ratings yet
SOLUTION ONLY CODE DWDM - Lab - All
8 pages
DWDM Lab All
No ratings yet
DWDM Lab All
20 pages
ML2 Practical List
No ratings yet
ML2 Practical List
80 pages
ML Lab
No ratings yet
ML Lab
8 pages
Output Xerox
No ratings yet
Output Xerox
12 pages
Mall Customer Segmentation Using KMeans Clustering Algorithm and Classification Algorithm
No ratings yet
Mall Customer Segmentation Using KMeans Clustering Algorithm and Classification Algorithm
40 pages
D3 Docs
No ratings yet
D3 Docs
6 pages
Project 13 Customer Segmentation Using K Means Clustering
No ratings yet
Project 13 Customer Segmentation Using K Means Clustering
9 pages
K-Means Clustering Guide
No ratings yet
K-Means Clustering Guide
26 pages
Aiml Lab
No ratings yet
Aiml Lab
37 pages
Python ML Algorithms Guide
No ratings yet
Python ML Algorithms Guide
7 pages
AdityaGaur BDA Exp8
No ratings yet
AdityaGaur BDA Exp8
4 pages
Practical 5
No ratings yet
Practical 5
6 pages
K Means
No ratings yet
K Means
5 pages
21BCE5775 Clustering
No ratings yet
21BCE5775 Clustering
42 pages
7 Output
No ratings yet
7 Output
4 pages
Kmeansclustering Sales Dataset
No ratings yet
Kmeansclustering Sales Dataset
6 pages
Practical-8: Import As Import As Import As Import Import As
No ratings yet
Practical-8: Import As Import As Import As Import Import As
9 pages
Lab Report6 - B21CI014
No ratings yet
Lab Report6 - B21CI014
8 pages
Merged
No ratings yet
Merged
35 pages
Aam Codes
No ratings yet
Aam Codes
8 pages
DL Lab 3
No ratings yet
DL Lab 3
5 pages
ML 2.3 Prashant
No ratings yet
ML 2.3 Prashant
4 pages
ML External File-43
No ratings yet
ML External File-43
23 pages
Cheat Sheet-Building Unsupervised Learning Models
No ratings yet
Cheat Sheet-Building Unsupervised Learning Models
3 pages
S6 - Data Mining Lab Experiments (Except 1)
No ratings yet
S6 - Data Mining Lab Experiments (Except 1)
6 pages
Assignmnet 5
No ratings yet
Assignmnet 5
11 pages
K Means
No ratings yet
K Means
15 pages
DWM Practical
No ratings yet
DWM Practical
12 pages
Yogesh Siddiq Edited
No ratings yet
Yogesh Siddiq Edited
6 pages
ML Labs
No ratings yet
ML Labs
14 pages
Prac9 23bme053
No ratings yet
Prac9 23bme053
4 pages
Tanu Raman ML Lab File
No ratings yet
Tanu Raman ML Lab File
21 pages
Implementing K-Means Clustering: '/content/mall - Customers (1) .CSV'
No ratings yet
Implementing K-Means Clustering: '/content/mall - Customers (1) .CSV'
8 pages
Programs Lab Bca
No ratings yet
Programs Lab Bca
16 pages
21BECE30036 Prac 1
No ratings yet
21BECE30036 Prac 1
10 pages
Pa66 ML Exp6
No ratings yet
Pa66 ML Exp6
9 pages
EE2211 CheatSheet
No ratings yet
EE2211 CheatSheet
15 pages
Kris R. Carillo Resume
No ratings yet
Kris R. Carillo Resume
1 page
HR Database Management System
No ratings yet
HR Database Management System
1 page
Serial-Parallel Gateway
No ratings yet
Serial-Parallel Gateway
2 pages
Performance BTS 1407951917
No ratings yet
Performance BTS 1407951917
10 pages
Datasheet TL210
No ratings yet
Datasheet TL210
1 page
Swahili Tales As Told by Natives of Zanzibar
No ratings yet
Swahili Tales As Told by Natives of Zanzibar
524 pages
Soc (Bi)
No ratings yet
Soc (Bi)
6 pages
Visual Basic 1
No ratings yet
Visual Basic 1
4 pages
FIR Filter Design via Genetic Algorithm
No ratings yet
FIR Filter Design via Genetic Algorithm
17 pages
Equotip 540 - Operating - Instruction PDF
No ratings yet
Equotip 540 - Operating - Instruction PDF
49 pages
Lab Manual Part 2 Updated
No ratings yet
Lab Manual Part 2 Updated
12 pages
Cs6551 - Computer Networks
No ratings yet
Cs6551 - Computer Networks
93 pages
RObo MT5
No ratings yet
RObo MT5
20 pages
Peacock
No ratings yet
Peacock
1 page
Android App Development Guide
No ratings yet
Android App Development Guide
60 pages
Hvlink Pro
No ratings yet
Hvlink Pro
2 pages
2025-05-09
No ratings yet
2025-05-09
16 pages
02-What Are The Benefits and Drawbacks of Using PATCH For Updating Resources
No ratings yet
02-What Are The Benefits and Drawbacks of Using PATCH For Updating Resources
1 page
Manual Sinamics
100% (1)
Manual Sinamics
346 pages
Master Theorem & Relations Homework
No ratings yet
Master Theorem & Relations Homework
4 pages
Regress It 2020 User Manual
No ratings yet
Regress It 2020 User Manual
15 pages
CNN Basics for Computer Vision Students
No ratings yet
CNN Basics for Computer Vision Students
43 pages
G Suite Enterprise For Education-Getting Started Guide
No ratings yet
G Suite Enterprise For Education-Getting Started Guide
59 pages
Math 220 Past Question
No ratings yet
Math 220 Past Question
6 pages
Jbaci
No ratings yet
Jbaci
15 pages
Network Configuration Toolj: Operation Manual (Java Version)
No ratings yet
Network Configuration Toolj: Operation Manual (Java Version)
78 pages
Windows Activation Script Guide
No ratings yet
Windows Activation Script Guide
12 pages
Sap Datasheet Jan 2019
No ratings yet
Sap Datasheet Jan 2019
2 pages
OS Study Guide for Students
No ratings yet
OS Study Guide for Students
7 pages
Oracle App Cloning Guide: Rapid Clone Method
No ratings yet
Oracle App Cloning Guide: Rapid Clone Method
5 pages

Program 8

Uploaded by

Program 8

Uploaded by

program-8

[5]: import numpy as np import matplotlib.pyplot as plt import

Dataset shape: (200, 5)

[6]: import numpy as nm

# Importing the dataset

[10]: #finding optimal number of clusters using the elbow

#Using for loop for iterations from 1 to 10. for i in

#visulaizing the clusters mtp.scatter(x[y_predict == 0, 0],

== 1, 0], x[y_predict == 1, 1], s = 100, c = 'green',␣

mtp.scatter(x[y_predict== 2, 0], x[y_predict == 2, 1], s = 100, c

== 3, 0], x[y_predict == 3, 1], s = 100, c = 'cyan',␣

4, 0], x[y_predict == 4, 1], s = 100, c = 'magenta',␣

You might also like