
Hierarchical Clustering

• Hierarchical clustering is an unsupervised machine learning algorithm that groups unlabeled datasets into clusters. It is also known as Hierarchical Cluster Analysis (HCA).
• In this algorithm, we build the hierarchy of clusters in the form of a tree, and this tree-shaped structure is known as the Dendrogram.

Types of Hierarchical Clustering

• Agglomerative Hierarchical clustering
• Divisive Hierarchical clustering

• Agglomerative Hierarchical clustering:
Agglomerative clustering is a bottom-up approach in which the algorithm starts by treating every data point as a single cluster and repeatedly merges the closest clusters until only one cluster is left (see the sketch after this list).
• Divisive Hierarchical clustering:
Divisive clustering is the reverse of the agglomerative algorithm: it is a top-down approach that starts with all data points in one cluster and recursively splits it.
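As a concrete illustration of the bottom-up approach, here is a minimal sketch using scikit-learn's AgglomerativeClustering (scikit-learn and NumPy are assumed to be installed; the toy data points are illustrative, not from this document):

```python
# Minimal sketch of bottom-up (agglomerative) clustering with scikit-learn,
# assuming scikit-learn and NumPy are installed; the data is a toy example.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

X = np.array([[1.0, 1.1], [1.2, 0.9], [0.8, 1.0],    # group near (1, 1)
              [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]])   # group near (5, 5)

# Merge clusters bottom-up and stop when two clusters remain.
model = AgglomerativeClustering(n_clusters=2)
labels = model.fit_predict(X)
print(labels)  # e.g. [1 1 1 0 0 0] (cluster ids are arbitrary)
```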
Why hierarchical clustering?

If we already have K-Means clustering, why do we need hierarchical clustering? K-Means clustering has some challenges: it requires a predetermined number of clusters, and it always tries to create clusters of the same size. To solve these two challenges, we can opt for the hierarchical clustering algorithm, because in this algorithm we don't need prior knowledge of the number of clusters.
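To make this contrast concrete, here is a small sketch, assuming SciPy and scikit-learn are available and using random toy data (not from this document): K-Means has to be re-fitted for every choice of k, while a hierarchical merge tree is built once and can then be cut into any number of clusters.

```python
# Sketch (assumed libraries: scikit-learn, SciPy, NumPy): K-Means needs k up
# front and must be re-run for each k; a hierarchical tree is built once and
# can be cut into any number of clusters afterwards.
import numpy as np
from sklearn.cluster import KMeans
from scipy.cluster.hierarchy import linkage, fcluster

X = np.random.RandomState(0).rand(20, 2)   # 20 random 2-D toy points

# K-Means: one complete run per choice of k.
for k in (2, 3, 4):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print("k-means   k =", k, labels)

# Hierarchical: build the merge tree once...
Z = linkage(X, method="complete")
# ...then cut it at any number of clusters without re-fitting.
for k in (2, 3, 4):
    print("hierarchy k =", k, fcluster(Z, t=k, criterion="maxclust"))
```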

How does Agglomerative Hierarchical Clustering work?

Step-1: Treat each data point as a single cluster. If there are N data points, the number of clusters will also be N.
Step-2: Take the two closest data points or clusters and merge them to form one cluster. There will now be N-1 clusters.
Step-3: Again, take the two closest clusters and merge them together to form one cluster. There will be N-2 clusters.
Step-4: Repeat Step-3 until only one cluster is left. This produces the full hierarchy of clusters, which the code sketch below traces step by step.
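The following is a minimal sketch of these merge steps using SciPy's hierarchy module (SciPy and Matplotlib are assumed to be installed; the six points are illustrative toy data). The linkage matrix records exactly the N-1 merges described above, and the dendrogram visualises the resulting tree:

```python
# Sketch of the N-1 merge steps, assuming SciPy and Matplotlib are installed;
# the six 2-D points are illustrative toy data.
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
import matplotlib.pyplot as plt

X = np.array([[1.0, 1.1], [1.2, 0.9], [0.8, 1.0],    # points near (1, 1)
              [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]])   # points near (5, 5)

# linkage() starts with N singleton clusters (Step-1) and records every merge
# (Steps 2-4); for N points the result has exactly N-1 rows, one per merge.
Z = linkage(X, method="single")
print(Z.shape)  # (5, 4) for N = 6 points

# Each row is [cluster_i, cluster_j, merge_distance, size_of_new_cluster].
for step, (i, j, dist, size) in enumerate(Z, start=1):
    print(f"Step {step}: merge {int(i)} and {int(j)} "
          f"at distance {dist:.3f} -> new cluster of size {int(size)}")

# The dendrogram visualises the whole merge tree.
dendrogram(Z)
plt.show()
```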

Measures for the distance between two clusters:

How we measure the distance between two clusters is crucial for hierarchical clustering. There are various ways to calculate the distance between two clusters, and the chosen way decides the rule for merging. These measures are called Linkage methods.
• Single Linkage: the shortest distance between the closest points of the two clusters.
• Complete Linkage: the farthest distance between two points of two different clusters. It is one of the popular linkage methods, as it forms tighter clusters than single linkage.
• Centroid Linkage: the linkage method in which the distance between the centroids of the clusters is calculated (a comparison sketch of these methods follows below).
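As a hedged illustration of how the choice of linkage method changes the merge rule, the sketch below compares single, complete, and centroid linkage with SciPy (assumed installed; toy data as before), cutting each tree into two clusters:

```python
# Sketch comparing linkage methods, assuming SciPy is installed;
# the toy data is the same illustrative array as in the earlier sketches.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.array([[1.0, 1.1], [1.2, 0.9], [0.8, 1.0],
              [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]])

for method in ("single", "complete", "centroid"):
    Z = linkage(X, method=method)                     # merge rule = linkage method
    labels = fcluster(Z, t=2, criterion="maxclust")   # cut the tree into 2 clusters
    print(f"{method:>8}: {labels}")
```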
Advantages:
• Does not require specifying the number of clusters in advance.
• Produces a Dendrogram.
• Works well for small datasets.
• Can use various distance metrics.

Disadvantages:
• Lack of scalability.
• Sensitive to noise.
• Difficult to handle high-dimensional data.
• Computationally expensive for large datasets.

Applications:
• Anomaly detection in network security.
• Real-time financial market analysis.
• Real-time social media and sentiment analysis.
• Telecommunication networks.
• Healthcare monitoring and patient management.
• Real-time smart city monitoring.
