ML Ch-3 Unsupervised Learning

The document provides an overview of unsupervised learning techniques. It discusses clustering approaches like K-means and hierarchical clustering. It also covers reinforcement learning and key concepts like agents, environments, actions, states, and rewards. Clustering groups unlabeled data based on similarities and finds hidden patterns in data. Reinforcement learning involves an agent learning through trial-and-error interactions with an environment.

Uploaded by Nasis Dereje

Unsupervised Learning

Introduction to Machine Learning

Lecture 3
Agenda
• Introduction
• Understand the principles of unsupervised learning models
• Clustering approaches:
  - K-Means
  - K-nearest neighbors
  - Hierarchical clustering
• Correctly apply and evaluate clustering models
• Reinforcement learning
• Markov decision process

Unsupervised Learning
• In unsupervised learning, the model itself finds hidden patterns and insights in the given data, without labeled outputs to guide it.
• The goal of unsupervised learning is to find the underlying structure of a dataset, group the data according to similarities, and represent the dataset in a compressed format.

Why unsupervised learning
• Unsupervised learning is helpful for finding useful insights in data.
• Unsupervised learning is similar to the way a human learns to think from their own experiences, which brings it closer to real AI.
• Unsupervised learning works on unlabeled and uncategorized data, which makes it all the more important.
• In the real world, we do not always have input data with the corresponding output, so we need unsupervised learning to solve such cases.

How it works?

Clustering
• Clustering is the process of dividing a large number of items into smaller groups whose members share similar characteristics.
• For example, given a list of all people in Asia, we can group them by nationality:
  - Group 1: People from India
  - Group 2: People from China
  - Group 3: People from Nepal, etc.

Association
• Association is the process of measuring the degree of association between any two items.
• For example:
  - If we go to a grocery shop, there is a high probability that we will buy jam if we have already bought bread there, because bread and jam are closely associated.
  - There is only a low probability that we will buy biscuits if we have already bought a book, because biscuits and books are not closely associated.
• These kinds of associations can be identified through an extensive data mining process known as association rule mining.
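
The bread-and-jam idea can be made concrete with two standard association-rule measures, support and confidence. The sketch below is only illustrative: the transactions are invented, and the helper names `support` and `confidence` are hypothetical, not taken from the slides.

```python
# Toy shopping baskets, invented for illustration only.
transactions = [
    {"bread", "jam"},
    {"bread", "jam", "milk"},
    {"bread", "butter"},
    {"bread", "jam", "biscuit"},
    {"book"},
    {"book", "pen"},
    {"book", "biscuit"},
    {"book", "pen"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def confidence(antecedent, consequent):
    """Estimated probability of buying `consequent` given `antecedent`."""
    return support(antecedent | consequent) / support(antecedent)


bread_to_jam = confidence({"bread"}, {"jam"})
book_to_biscuit = confidence({"book"}, {"biscuit"})
print(bread_to_jam)      # 0.75
print(book_to_biscuit)   # 0.25
```

In this toy data the rule bread → jam has a much higher confidence than book → biscuit, which matches the intuition described above.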
Clustering vs association
• Clustering finds commonalities between data points.
• Association finds relationships between variables in a large database.
  - It determines the sets of items that occur together in the dataset.
  - For example, people who buy item X (say, bread) also tend to purchase item Y (butter or jam).

Unsupervised learning
• Unsupervised learning is used for more complex tasks than supervised learning, since the model must discover structure without labeled examples.

Applications of unsupervised learning
• Clustering automatically splits the dataset into groups based on their similarities.
• Anomaly detection can discover unusual data points in a dataset.
  - It is useful for finding fraudulent transactions.
• Association mining identifies sets of items that often occur together in a dataset.
• Latent variable models are widely used for data preprocessing,
  - such as reducing the number of features in a dataset or decomposing the dataset into multiple components.
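
As a small illustration of the anomaly detection item, the sketch below flags unusual transaction amounts with a median-based score (the modified z-score). This is one possible technique chosen for the example, not the method from the slides; the amounts, the function name `find_anomalies`, and the 3.5 threshold are assumptions.

```python
import statistics


def find_anomalies(values, threshold=3.5):
    """Return the values whose modified z-score exceeds `threshold`.

    The score uses the median and the median absolute deviation (MAD),
    so a single extreme value does not distort the baseline.
    """
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # all values are (almost) identical; nothing to flag
    return [v for v in values if 0.6745 * abs(v - median) / mad > threshold]


# Invented transaction amounts; 500 plays the role of a suspicious payment.
amounts = [20, 22, 19, 21, 23, 20, 18, 22, 500, 21]
anomalies = find_anomalies(amounts)
print(anomalies)  # [500]
```

No labels are needed here: the unusual point is identified purely from how far it lies from the bulk of the data.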
Application areas
• Market Segmentation
• Statistical data analysis
• Social network analysis
• Image segmentation
• Anomaly detection, etc.

Types of Clustering methods
• Partitioning Clustering
• Density-Based Clustering
• Distribution Model-Based Clustering
• Hierarchical Clustering

Types of Clustering methods…
• Partitioning clustering: divides the data into non-hierarchical groups.
  - E.g., K-means clustering, where K is the number of groups and must be specified in advance.
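
A minimal sketch of K-means (Lloyd's algorithm) in plain Python may help here. It assumes 2-D points, Euclidean distance, and the first K points as initial centroids; these choices and the sample data are made for the example and are not prescribed by the slides.

```python
import math


def kmeans(points, k, iterations=100):
    """Cluster `points` into `k` groups; return (centroids, labels)."""
    # Initialisation assumption: the first k points are the starting centroids.
    centroids = [points[i] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, labels


# Invented data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Practical implementations usually pick the initial centroids more carefully and run several restarts, since the result depends on the starting positions.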

Types of Clustering methods…
• Density-based clustering: connects highly dense areas into clusters.
  - The algorithm identifies regions of high density in the dataset and connects them into clusters.
  - The dense areas in the data space are separated from each other by sparser areas.
  - These algorithms can have difficulty clustering the data points if the dataset has varying densities or high dimensionality.
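
The slides do not name a specific density-based algorithm; DBSCAN is a well-known one, and the sketch below is a compact version of it. The parameter names `eps` (neighbourhood radius) and `min_pts` (minimum neighbours for a dense point), as well as the sample points, are assumptions made for the example.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # All points within distance eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is dense: keep growing the cluster
                queue.extend(reachable)
    return labels


# Invented data: two dense groups and one isolated point.
points = [
    (1.0, 1.0), (1.2, 1.1), (0.9, 1.3), (1.1, 0.8),
    (5.0, 5.0), (5.1, 5.2), (4.8, 5.1), (5.2, 4.9),
    (9.0, 1.0),
]
labels = dbscan(points, eps=0.6, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Because a single `eps` is used for the whole dataset, the sketch also shows why varying densities are a problem: one radius cannot suit both a tight and a loose cluster.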

Types of Clustering methods…
• Distribution model-based clustering: the data is divided based on the probability that a data point belongs to a particular distribution.
  - The grouping is done by assuming some distribution, most commonly the Gaussian distribution.
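
To illustrate the idea, the sketch below assigns 1-D points to whichever of two Gaussian components makes them most probable. The component weights, means, and standard deviations are invented and fixed by hand; a full method such as a Gaussian mixture model would estimate them from the data, which this sketch does not attempt.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


# Assumed components as (weight, mean, std); values are illustrative only.
components = [
    (0.5, 0.0, 1.0),
    (0.5, 5.0, 1.0),
]


def responsibilities(x):
    """Probability that x was generated by each component."""
    weighted = [w * gaussian_pdf(x, m, s) for w, m, s in components]
    total = sum(weighted)
    return [v / total for v in weighted]


def assign(x):
    """Index of the most probable component for x."""
    probs = responsibilities(x)
    return max(range(len(probs)), key=probs.__getitem__)


data = [-0.3, 0.8, 4.6, 5.9, 2.4]
labels = [assign(x) for x in data]
print(labels)  # [0, 0, 1, 1, 0]
```

Unlike K-means, each point also gets a probability for every cluster, so borderline points such as 2.4 can be recognised as uncertain.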

Types of Clustering methods…
• Hierarchical clustering: can be used as an alternative to partitioning clustering, as there is no requirement to pre-specify the number of clusters to be created.
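
One common form of hierarchical clustering is agglomerative (bottom-up) clustering. The sketch below assumes 1-D values and single linkage (distance between the closest members of two clusters); both choices, and the sample values, are made only for the example.

```python
def agglomerative(values):
    """Single-linkage agglomerative clustering on 1-D values.

    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    clusters = [[v] for v in values]  # start with one cluster per value
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges


merges = agglomerative([1.0, 2.0, 6.0, 7.0, 20.0])
for a, b, d in merges:
    print(a, b, d)
distances = [d for _, _, d in merges]
print(distances)  # [1.0, 1.0, 4.0, 13.0]
```

The number of clusters is chosen afterwards by cutting the merge sequence at a distance: stopping before the merge at distance 4.0, for instance, leaves three clusters ([1, 2], [6, 7], and [20]).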

Clustering Algorithms
• K-Means algorithm: clusters the dataset by dividing the samples into K clusters of equal variance.
• Mean-shift algorithm: tries to find the dense areas in a smooth density of data points.
• Affinity Propagation: differs from other clustering algorithms in that it does not require the number of clusters to be specified.
  - Messages are exchanged between pairs of data points until convergence.
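
As a rough illustration of mean-shift, the sketch below repeatedly moves each 1-D point to the mean of the data within a fixed window until the positions stop changing. The flat (uniform) window, the `bandwidth` value, and the data are assumptions made for the example.

```python
def mean_shift_1d(values, bandwidth, iterations=50):
    """Shift every starting position towards the nearest dense area."""
    modes = list(values)  # each point starts at its own position
    for _ in range(iterations):
        shifted = []
        for m in modes:
            # Flat kernel: average all data values within `bandwidth` of m.
            window = [v for v in values if abs(v - m) <= bandwidth]
            shifted.append(sum(window) / len(window))
        if shifted == modes:
            break  # no position moved: converged
        modes = shifted
    return modes


values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
modes = mean_shift_1d(values, bandwidth=1.0)
distinct_modes = sorted({round(m, 6) for m in modes})
print(distinct_modes)  # [1.0, 5.0]
```

Points that end up at the same position belong to the same cluster, so the number of clusters (two here) follows from the data and the bandwidth rather than being given up front.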
Applications of Clustering
• Identification of cancer cells: clustering divides cancerous and non-cancerous data into different groups.
• Search engines: search results appear based on the closest objects to the search query.
• Customer segmentation: used in market research to segment customers based on their choices and preferences.
• Biology: used to classify different species of plants and animals using image recognition techniques.
• Land use: used to identify areas of similar land use in a GIS database.

Reinforcement learning
• Reinforcement learning is a type of machine learning method in which an intelligent agent (a computer program) interacts with an environment and learns to act within it.
  - E.g., how a robotic dog learns to move its limbs.
• The agent repeatedly does three things (takes an action, changes state or remains in the same state, and gets feedback), and by doing so it learns about and explores the environment.

Reinforcement learning…
• Terminology:
  - Agent: an entity that can perceive/explore the environment and act upon it.
  - Environment: the situation in which the agent is present or by which it is surrounded.
  - Action: a move taken by the agent within the environment.
  - State: the situation returned by the environment after each action taken by the agent.
  - Reward: feedback returned to the agent by the environment to evaluate the agent's action.
  - Policy: the strategy the agent applies to choose the next action based on the current state.
  - Value: the expected long-term return, discounted by the discount factor, as opposed to the short-term reward.
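
The difference between reward and value can be shown with a discounted return: each reward received k steps in the future is multiplied by gamma to the power k before being added up. The reward sequence and the discount factor `gamma = 0.5` below are invented for illustration.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards where the reward at step k is weighted by gamma**k."""
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


rewards = [0, 0, 10]   # nothing now, a reward of 10 two steps later
gamma = 0.5            # assumed discount factor
value = discounted_return(rewards, gamma)
print(value)  # 2.5
```

The immediate reward is 0, yet the discounted long-term return is 2.5, which is why an agent that looks only at the short-term reward would undervalue this sequence.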
Reinforcement Learning…
• Reinforcement learning is a feedback-based machine learning technique in which an agent learns to behave in an environment by performing actions and observing their results.
• For each good action, the agent gets positive feedback, and for each bad action, it gets negative feedback or a penalty.
• The agent learns automatically from feedback without any labeled data, unlike in supervised learning.
• Since there is no labeled data, the agent is bound to learn from its experience only.

Reinforcement Learning…
• Reinforcement learning solves a specific type of problem in which decision making is sequential and the goal is long-term, such as game playing and robotics.
• The agent interacts with the environment and explores it by itself.
• The primary goal of an agent in reinforcement learning is to improve its performance by collecting the maximum positive reward.

Reinforcement Learning…
• Key features of reinforcement learning:
  - In RL, the agent is not instructed about the environment or about what actions need to be taken.
  - It is based on a trial-and-error process.
  - The agent takes the next action and changes state according to the feedback from the previous action.
  - The agent may get a delayed reward.
  - The environment is stochastic, and the agent needs to explore it to obtain the maximum positive reward.
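
These features can be seen in a small tabular Q-learning sketch. Q-learning is one possible RL algorithm and is not named in the slides; the five-state corridor, the reward of 1 at the goal, and the values of `ALPHA`, `GAMMA`, and `EPSILON` are all assumptions. For simplicity the toy environment is deterministic rather than stochastic.

```python
import random

N_STATES = 5                 # states 0..4 in a corridor; state 4 is the goal
ACTIONS = (-1, +1)           # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # illustrative hyperparameters


def step(state, action):
    """Environment: return (next_state, reward, done) for one move."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done  # reward only at the goal


def choose_action(q, state, rng):
    """Epsilon-greedy: usually exploit, sometimes explore at random."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def train(episodes=500, max_steps=100, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            a = choose_action(q, state, rng)
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning target: reward plus discounted best future value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy action in every non-goal state after training.
policy = [
    ACTIONS[max(range(len(ACTIONS)), key=q_table[s].__getitem__)]
    for s in range(N_STATES - 1)
]
print(policy)
```

The agent is never told that the goal lies to the right. The reward arrives only at the last step (a delayed reward), and trial and error gradually propagates its value back to the earlier states, so the learned policy moves right in every state.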

Quiz
1. What is the difference between regression and classification algorithms?
2. Give real-world examples of problems that can be solved by regression and by classification.
