Unsupervised Learning - Clustering

This document discusses unsupervised machine learning and clustering. It provides an overview of clustering techniques like k-means clustering. K-means clustering partitions unlabeled data into k clusters where each cluster has a centroid. It discusses how k-means works by assigning data points to the closest centroid, recomputing centroids, and repeating until convergence. The document notes some weaknesses of k-means like sensitivity to outliers, initial seeds, and not knowing the optimal number of clusters k beforehand.

Uploaded by

Spandan Rout ms17a058


Unsupervised Machine Learning - Clustering

May 2020

SECRET
Knowledge Share - Session plan

Topic | Application | Schedule
Overview of Machine Learning and feature selection | Generic | 19-Feb
Regression - Supervised Machine Learning | Market Share Forecast / Inventory Obsolescence | 13-Mar
Classification - Supervised Machine Learning | Technician Attrition | 10-Apr
Clustering - Unsupervised Machine Learning | Dealer/Parts Clustering | 8-May
Bagging & Boosting - Ensemble Methods | Service Parts Forecasting | 5-Jun
Genetic Algorithm - Reinforcement Learning | Vehicle Route Optimization | 3-Jul
Linear programming and mathematical optimization | Container Loading/Vanning | 31-Jul
Dimension Reduction & Pattern Search - Unsupervised Machine Learning | Generic | 28-Aug

Descriptive, Predictive & Prescriptive Analytics

Machine Learning Universe

What is clustering?

• The organization of unlabeled data into similarity groups called clusters.
• A cluster is a collection of data items that are “similar” to one another and “dissimilar” to data items in other clusters.
Historic application of clustering

Clustering techniques

Divisive

K-means
K-Means clustering
• K-means (MacQueen, 1967) is a partitional clustering algorithm.
• Let the set of data points D be $\{x_1, x_2, \dots, x_n\}$, where $x_i = (x_{i1}, x_{i2}, \dots, x_{ir})$ is a vector in $X \subseteq \mathbb{R}^r$, and r is the number of dimensions.
• The k-means algorithm partitions the given data into k clusters:
– Each cluster has a cluster center, called the centroid.
– k is specified by the user.
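
The loop described in the overview (assign each point to its closest centroid, recompute the centroids, repeat until nothing changes) can be sketched in plain Python. This is a minimal illustration, not the deck's own implementation: the function names, the toy data and the choice of the first two points as seeds are all assumptions made for the example.

```python
import math

def assign(points, centroids):
    """Return, for each point, the index of its closest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]

def recompute(points, labels, centroids):
    """Return the mean of each cluster; an empty cluster keeps its old centroid."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_centroids.append(old)
    return new_centroids

def kmeans(points, initial_centroids, max_iter=100):
    """Alternate assignment and centroid updates until the labels stop changing."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centroids)
        if new_labels == labels:  # converged: no point changed cluster
            break
        labels = new_labels
        centroids = recompute(points, labels, centroids)
    return labels, centroids

# Hypothetical toy data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
# Assumed seeding: the first two data points (k = 2), deliberately a poor start.
labels, centroids = kmeans(points, initial_centroids=[points[0], points[1]])
print(labels)
print(centroids)
```

Even with both seeds inside the same group, this run separates the two groups after two assignment passes; that is specific to this toy data and not a general guarantee.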
K-means clustering example: step 1
K-means clustering example: step 2
K-means clustering example: step 3
K-means clustering example
Weaknesses of K-means
• The algorithm is only applicable if the mean is defined.
– For categorical data, the k-modes variant can be used, in which the centroid is represented by the most frequent values.
• The user needs to specify k.
• The algorithm is sensitive to the initial seeds.
• The algorithm is sensitive to outliers.
– Outliers are data points that are very far away from other data points.
– Outliers could be errors in the data recording or some special data points with very different values.
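
The sensitivity to outliers follows from the centroid being a mean. As a small sketch, with made-up points and a hypothetical helper named `centroid`, a single far-away point is enough to drag the centroid out of the cluster it is meant to represent:

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))

# Hypothetical cluster of four nearby points.
cluster = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
# Hypothetical outlier, e.g. a data-recording error.
outlier = (100.0, 100.0)

clean_centroid = centroid(cluster)
shifted_centroid = centroid(cluster + [outlier])

print(clean_centroid)    # centroid of the four nearby points
print(shifted_centroid)  # centroid after adding the outlier
```

Here the centroid moves from (1.5, 1.5) to (21.2, 21.2), far from every one of the original four points.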
Optimal number of clusters

Within Cluster Sum of Squares (WCSS)

Optimal number of clusters
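
One common way to use WCSS when choosing k is to compute it for several values of k and look for the point where it stops dropping sharply. The sketch below assumes 1-D toy data and a deterministic, evenly spaced seeding rule; both are choices made for this illustration, not something taken from the slides.

```python
def kmeans_1d(values, k, max_iter=100):
    """Plain k-means on 1-D data, seeded with evenly spaced points of the sorted data."""
    data = sorted(values)
    # Assumed seeding rule, chosen only to keep the example deterministic.
    centroids = [data[i * len(data) // k] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: abs(x - centroids[j])) for x in data]
        if new_labels == labels:  # converged
            break
        labels = new_labels
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = sum(members) / len(members)
    return data, labels, centroids

def wcss(data, labels, centroids):
    """Sum of squared distances from each point to its own cluster centroid."""
    return sum((x - centroids[lab]) ** 2 for x, lab in zip(data, labels))

# Hypothetical data with three visible groups.
values = [0, 1, 2, 10, 11, 12, 20, 21, 22]
wcss_by_k = {}
for k in range(1, 5):
    data, labels, centroids = kmeans_1d(values, k)
    wcss_by_k[k] = wcss(data, labels, centroids)
    print(k, wcss_by_k[k])
```

On this data WCSS falls from 606 (k = 1) to 156 (k = 2) to 6 (k = 3), and only to 4.5 at k = 4, so the drop flattens at k = 3, which matches the three groups in the data.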
Sensitivity to initial seeds

[Figure: two runs, each starting from a different random selection of seeds (centroids), shown at Iteration 1 and Iteration 2]
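
The effect of the seeds can be reproduced on four points at the corners of a 4-by-1 rectangle. This is a hedged sketch: the data, the two seed choices and the compact `kmeans` helper are assumptions made for the example.

```python
import math

def kmeans(points, seeds, max_iter=100):
    """Run plain k-means from the given seeds; return the final labels and the WCSS."""
    centroids = [tuple(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # converged
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    total = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, total

# Hypothetical data: two tight pairs of points, 4 units apart.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# One seed in each pair.
good_labels, good_wcss = kmeans(points, seeds=[(0.0, 0.0), (4.0, 0.0)])
# Both seeds in the same pair.
bad_labels, bad_wcss = kmeans(points, seeds=[(0.0, 0.0), (0.0, 1.0)])

print(good_labels, good_wcss)
print(bad_labels, bad_wcss)
```

The first run groups the left pair and the right pair (WCSS 1.0). The second run converges to a bottom/top split (WCSS 16.0) and stays there, because no single reassignment improves it.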

Outliers
K-means summary

• Despite its weaknesses, k-means remains one of the most popular clustering algorithms because of its simplicity and efficiency.
• There is no clear evidence that any other clustering algorithm performs better in general.
• Comparing different clustering algorithms is a difficult task: no one knows the correct clusters!
