Hierarchical Clustering
Dr. R. Srinivasan
Assistant Professor
Department of Computing Technologies
School of Computing
SRM Institute of Science and Technology
Hierarchical Clustering
1) Hierarchical clustering is another unsupervised machine learning algorithm, used to group unlabeled data points into clusters. It is also known as hierarchical cluster analysis (HCA).
2) In this algorithm, we develop the hierarchy of clusters in the form of a tree, and this tree-shaped structure is known as the dendrogram.
3) The results of K-means clustering and hierarchical clustering may sometimes look similar, but the two algorithms differ in how they work: unlike K-means, hierarchical clustering does not require the number of clusters to be fixed in advance.
There are two approaches to hierarchical clustering:
1) Agglomerative: a bottom-up approach in which the algorithm starts by taking every data point as a single cluster and keeps merging the closest clusters until only one cluster is left (a short code sketch follows this list).
2) Divisive: the reverse of the agglomerative algorithm; it follows a top-down approach.
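As a quick illustration of the bottom-up approach in point 1, here is a minimal sketch using scikit-learn's AgglomerativeClustering on a small made-up 2-D dataset (the data values and the choice of two clusters are purely illustrative):

```python
# Minimal sketch: agglomerative (bottom-up) clustering of 6 unlabeled points.
# Assumes scikit-learn and NumPy are installed; the data is made up.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],   # one tight group
              [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])  # another tight group

model = AgglomerativeClustering(n_clusters=2, linkage="ward")
labels = model.fit_predict(X)
print(labels)  # e.g. [0 0 0 1 1 1] -- the two well-separated groups
```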
Why hierarchical clustering?
We already have other clustering algorithms such as K-means clustering, so why do we need hierarchical clustering?
1) We have seen that K-means clustering has some challenges: it requires a predetermined number of clusters, and it tends to create clusters of roughly the same size.
2) Hierarchical clustering addresses these two challenges because it does not require the number of clusters to be known in advance.
Agglomerative Hierarchical clustering
The agglomerative hierarchical clustering algorithm is a popular example of HCA. To group the data points into clusters, it follows the bottom-up approach.
This means the algorithm treats each data point as a single cluster at the beginning and then repeatedly merges the closest pair of clusters.
It does this until all the clusters are merged into a single cluster that contains every data point.
This hierarchy of clusters is represented in the form of a dendrogram.
Steps:
Step-1: Treat each data point as a single cluster. If there are N data points, there will initially be N clusters.
Step-2: Take the two closest data points or clusters and merge them to form one cluster, leaving N-1 clusters.
Step-3: Again take the two closest clusters and merge them to form one cluster, leaving N-2 clusters.
Step-4: Repeat Step 3 until only one cluster is left.
Step-5: Once all the clusters are combined into one big cluster, develop the dendrogram and use it to divide the clusters as per the problem.
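The five steps above can be sketched directly in a few lines of NumPy. The helper closest_pair and the use of the single-linkage (closest-points) distance are illustrative assumptions, not a definitive implementation:

```python
# Sketch of Steps 1-5: repeatedly merge the two closest clusters (single linkage).
import numpy as np

def closest_pair(clusters, X):
    """Return the indices of the two clusters whose closest points are nearest."""
    best = (0, 1, np.inf)
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            d = min(np.linalg.norm(X[a] - X[b])
                    for a in clusters[i] for b in clusters[j])
            if d < best[2]:
                best = (i, j, d)
    return best

X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])

# Step 1: every data point starts as its own cluster (N clusters)
clusters = [[i] for i in range(len(X))]

# Steps 2-4: merge the two closest clusters until only one cluster is left
while len(clusters) > 1:
    i, j, d = closest_pair(clusters, X)
    print(f"merge {clusters[i]} + {clusters[j]} at distance {d:.2f}")
    clusters[i] = clusters[i] + clusters[j]
    del clusters[j]

print(clusters)  # a single cluster containing all point indices
```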
Proximity Between Clusters
The distance between two clusters is crucial for hierarchical clustering. There are various ways to calculate the distance between two clusters, and the chosen measure decides the rule for clustering. These measures are called linkage methods.
Single Linkage: the shortest distance between the closest points of the two clusters.
Complete Linkage: the farthest distance between two points in two different clusters. It is one of the popular linkage methods, as it forms tighter clusters than single linkage.
Average Linkage: the distance between every pair of points, one from each cluster, is added up and divided by the total number of pairs to calculate the average distance between the two clusters. It is also one of the most popular linkage methods.
Centroid Linkage: the distance between the centroids of the two clusters.
Ward’s Method: uses the increase in squared error after merging to measure the similarity of two clusters.
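All of these linkage methods are available in SciPy; the short sketch below simply runs each one on the same made-up dataset and prints the height of the final merge (dataset and printout are illustrative only):

```python
# Compare linkage methods with SciPy's linkage(); assumes SciPy and NumPy are installed.
import numpy as np
from scipy.cluster.hierarchy import linkage

X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])

# Each method defines the cluster-to-cluster distance differently,
# so the merge heights (third column of the linkage matrix) differ.
for method in ["single", "complete", "average", "centroid", "ward"]:
    Z = linkage(X, method=method)   # (N-1) x 4 merge table
    print(f"{method:>8}: final merge distance = {Z[-1, 2]:.3f}")
```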
[Figure: measures for the distance between two clusters (single, complete, and centroid linkage)]
Working of Dendrogram in Hierarchical Clustering
The dendrogram is a tree-like structure that records each merge step performed by the HC algorithm. In the dendrogram plot, the y-axis shows the Euclidean distances between the data points (or clusters) being merged, and the x-axis shows all the data points of the given dataset.
In the accompanying diagram, the left part shows how clusters are created in agglomerative clustering, and the right part shows the corresponding dendrogram.
As discussed above, the data points P2 and P3 combine first to form a cluster; correspondingly, a dendrogram link is created that connects P2 and P3 with a rectangular shape. Its height is decided by the Euclidean distance between the data points.
In the next step, P5 and P6 form a cluster, and the corresponding dendrogram link is created. It is higher than the previous one, as the Euclidean distance between P5 and P6 is slightly greater than that between P2 and P3.
Again, two new dendrogram links are created: one combining P1, P2, and P3, and another combining P4, P5, and P6.
At last, the final dendrogram link is created, combining all the data points together.
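The dendrogram described above can be reproduced with SciPy and Matplotlib. A sketch, assuming both libraries are installed (the point coordinates are made up, so the exact merge order may differ from the figure in the slide):

```python
# Build the merge history and plot the dendrogram (x-axis: points, y-axis: distance).
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

X = np.array([[1.0, 1.0], [1.1, 0.9], [0.9, 1.2],   # P1, P2, P3
              [5.0, 5.0], [5.3, 4.9], [5.1, 5.2]])  # P4, P5, P6

Z = linkage(X, method="single")   # merge history of agglomerative HC

plt.figure(figsize=(6, 4))
dendrogram(Z, labels=["P1", "P2", "P3", "P4", "P5", "P6"])
plt.xlabel("Data points")
plt.ylabel("Euclidean distance")
plt.show()
```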
Agglomerative Hierarchical clustering
Advantages
1) No a priori information about the number of clusters is required.
2) Easy to implement, and gives the best results in some cases.
Disadvantages
1) The algorithm can never undo what was done previously: merges are irreversible.
2) A time complexity of at least O(n² log n) is required, where n is the number of data points.
3) Depending on the distance measure (linkage) chosen for merging, different algorithms can suffer from one or more
of the following:
i) Sensitivity to noise and outliers
ii) Breaking large clusters
iii) Difficulty handling different sized clusters and convex shapes
4) No objective function is directly minimized
5) Sometimes it is difficult to identify the correct number of clusters from the dendrogram.
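Regarding point 5, the usual practice is to "cut" the dendrogram at a chosen height, or to ask for a fixed number of clusters. A sketch with SciPy's fcluster, where the threshold 2.0 is an arbitrary illustrative choice:

```python
# Cut the dendrogram to obtain flat cluster labels.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])

Z = linkage(X, method="ward")

# Cut at height 2.0: every merge above this distance is undone
labels = fcluster(Z, t=2.0, criterion="distance")
print(labels)   # e.g. [1 1 1 2 2 2]

# Or ask directly for a fixed number of clusters
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```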
Divisive Clustering
Divisive clustering works just the opposite of agglomerative clustering. It starts by considering all the data points as one big cluster and then keeps splitting them into smaller clusters until every data point ends up in its own cluster.
Thus, it is good at identifying large clusters. It follows a top-down approach and can be more efficient than agglomerative clustering. However, due to its implementation complexity, it does not have a predefined implementation in most major machine learning frameworks.
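Although major frameworks do not ship a divisive routine, the top-down idea can be illustrated by recursively splitting a cluster in two with K-means (a bisecting-style sketch). The split rule, the depth limit, and the function name divisive_split are assumptions made purely for illustration, not a standard library API:

```python
# Rough sketch of divisive (top-down) clustering via recursive 2-means splits.
import numpy as np
from sklearn.cluster import KMeans

def divisive_split(points, depth=0, max_depth=2):
    """Recursively split a cluster into two until max_depth or a single point remains."""
    if depth == max_depth or len(points) < 2:
        print("  " * depth + f"leaf cluster: {points.tolist()}")
        return
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
    for k in (0, 1):
        divisive_split(points[labels == k], depth + 1, max_depth)

X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])

divisive_split(X)   # starts from one big cluster and splits top-down
```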