
DECISION TREE

WHAT IS A DECISION TREE?

◼ Decision Trees are a type of Supervised Machine Learning (that is, the training data
specifies both the input and the corresponding output) in which the data is continuously
split according to a certain parameter. The tree can be explained by two entities, namely
decision nodes and leaves. The leaves are the decisions or final outcomes, and the
decision nodes are where the data is split. A decision tree follows a set of if-else
conditions to visualize the data and classify it according to those conditions. Here is an
example of a decision tree.
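To make the if-else view concrete, here is a minimal hand-written sketch in Python; the attribute names and threshold below (outlook, humidity, the value 70) are hypothetical and serve only to show how decision nodes and leaves map onto nested conditions.

```python
# Hypothetical hand-written decision tree: each if/else is a decision node,
# each returned string is a leaf (the final outcome).
def play_outside(outlook: str, humidity: float) -> str:
    if outlook == "sunny":          # decision node on the 'outlook' attribute
        if humidity > 70:           # decision node on the 'humidity' attribute
            return "no"             # leaf node
        return "yes"                # leaf node
    elif outlook == "rainy":
        return "no"                 # leaf node
    return "yes"                    # leaf node (e.g. overcast)

print(play_outside("sunny", 80))    # -> "no"
```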
IMPORTANT TERMINOLOGY OF DECISION TREE

Root Node

Branch or Sub-Tree

Splitting

Decision Node

Leaf or Terminal Node

Pruning
◼ A decision tree is also an important data structure that is used to solve many computational problems

BASIC CONCEPT
Binary Decision Tree


▪ In the previous slide, we considered a decision tree in which the value of every attribute is binary. Decision trees are also possible where
attributes are of a continuous data type

Decision Tree with numeric data


BASIC CONCEPT
DECISION TREE AND CLASSIFICATION TASK

◼ A decision tree helps us to classify data.

◼ Internal nodes test some attribute
◼ Edges correspond to the values of that attribute
◼ External nodes give the outcome of the classification

◼ Such a classification is, in fact, made by posing questions, starting from the root node and following the answers down to a terminal
node.
Name  Body Temperature  Skin Cover  Gives Birth  Aquatic Creature  Aerial Creature  Has Legs  Hibernates  Class
Human Warm hair yes no no yes no Mammal
Python Cold scales no no no no yes Reptile
Salmon Cold scales no yes no no no Fish
Whale Warm hair yes yes no no no Mammal
Frog Cold none no semi no yes yes Amphibian
Komodo Cold scales no no no yes no Reptile
Bat Warm hair yes no yes yes yes Mammal
Pigeon Warm feathers no no yes yes no Bird
Cat Warm fur yes no no yes no Mammal
Leopard shark Cold scales yes yes no no no Fish
Turtle Cold scales no semi no yes no Reptile
Penguin Warm feathers no semi no yes no Bird
Porcupine Warm quills yes no no yes yes Mammal
Eel Cold scales no yes no no no Fish
Salamander Cold none no semi no yes yes Amphibian

Vertebrate Classification

DECISION TREE AND CLASSIFICATION TASK


● Suppose a new species is discovered as follows.

Name  Body Temperature  Skin Cover  Gives Birth  Aquatic Creature  Aerial Creature  Has Legs  Hibernates  Class
Gila Monster  cold  scales  no  no  no  yes  yes  ?

● A decision tree that can be induced based on the data is as follows.

Vertebrate Classification

DECISION TREE AND CLASSIFICATION TASK


● The example illustrates how we can solve a classification problem by asking a series of questions about the attributes.
● Each time we receive an answer, a follow-up question is asked until we reach a conclusion about the class label of the test instance.

● The series of questions and their answers can be organized in the form of a decision tree
● as a hierarchical structure consisting of nodes and edges.

● Once a decision tree is built, it can be applied to any test instance to classify it.

Vertebrate Classification
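As an illustration of this workflow, the following sketch (assuming pandas and scikit-learn are installed) encodes the vertebrate table above, induces a decision tree, and classifies the newly discovered gila monster; the tree that scikit-learn learns may differ in shape from the one drawn on the slide.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

cols = ["Name", "Body Temperature", "Skin Cover", "Gives Birth",
        "Aquatic Creature", "Aerial Creature", "Has Legs", "Hibernates", "Class"]
rows = [
    ("Human",         "Warm", "hair",     "yes", "no",   "no",  "yes", "no",  "Mammal"),
    ("Python",        "Cold", "scales",   "no",  "no",   "no",  "no",  "yes", "Reptile"),
    ("Salmon",        "Cold", "scales",   "no",  "yes",  "no",  "no",  "no",  "Fish"),
    ("Whale",         "Warm", "hair",     "yes", "yes",  "no",  "no",  "no",  "Mammal"),
    ("Frog",          "Cold", "none",     "no",  "semi", "no",  "yes", "yes", "Amphibian"),
    ("Komodo",        "Cold", "scales",   "no",  "no",   "no",  "yes", "no",  "Reptile"),
    ("Bat",           "Warm", "hair",     "yes", "no",   "yes", "yes", "yes", "Mammal"),
    ("Pigeon",        "Warm", "feathers", "no",  "no",   "yes", "yes", "no",  "Bird"),
    ("Cat",           "Warm", "fur",      "yes", "no",   "no",  "yes", "no",  "Mammal"),
    ("Leopard shark", "Cold", "scales",   "yes", "yes",  "no",  "no",  "no",  "Fish"),
    ("Turtle",        "Cold", "scales",   "no",  "semi", "no",  "yes", "no",  "Reptile"),
    ("Penguin",       "Warm", "feathers", "no",  "semi", "no",  "yes", "no",  "Bird"),
    ("Porcupine",     "Warm", "quills",   "yes", "no",   "no",  "yes", "yes", "Mammal"),
    ("Eel",           "Cold", "scales",   "no",  "yes",  "no",  "no",  "no",  "Fish"),
    ("Salamander",    "Cold", "none",     "no",  "semi", "no",  "yes", "yes", "Amphibian"),
]
df = pd.DataFrame(rows, columns=cols)

# One-hot encode the categorical attributes (everything except Name and Class).
X = pd.get_dummies(df.drop(columns=["Name", "Class"]))
y = df["Class"]
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Classify the new species (gila monster): cold-blooded, scales, does not give
# birth, not aquatic, not aerial, has legs, hibernates.
gila = pd.DataFrame([("cold", "scales", "no", "no", "no", "yes", "yes")],
                    columns=cols[1:-1])
gila = pd.get_dummies(gila).reindex(columns=X.columns, fill_value=0)
print(clf.predict(gila))            # predicted class label for the new species
```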

DECISION TREE AND CLASSIFICATION TASK


DECISION TREES

◼ Assumptions made while creating the decision tree:

◼ While starting the training, the whole data-set is considered as the root.
◼ The input values are preferred to be categorical.
◼ Records are distributed based on attribute values.
◼ The attribute placed as the root node of the tree is chosen based on statistical results.

◼ Optimizing the performance of the tree (see the sketch below):

◼ max_depth: The maximum depth of the tree is defined here.
◼ criterion: This parameter takes the criterion method as the value. The default value is Gini.
◼ splitter: This parameter allows us to choose the split strategy. Best and random are the available types of split. The default value is best.
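These three names correspond to constructor parameters of scikit-learn's DecisionTreeClassifier. A minimal sketch of setting them, using the bundled iris data purely as a stand-in for any training set:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# criterion="gini" and splitter="best" are the scikit-learn defaults;
# max_depth=None (the default) would grow the tree until all leaves are pure.
clf = DecisionTreeClassifier(criterion="gini", splitter="best", max_depth=3)
clf.fit(X, y)
print(clf.get_depth())              # bounded above by max_depth
```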
ATTRIBUTE SELECTION MEASURE (ASM)

An Attribute Selection Measure is a technique used in the data mining process for data reduction. The data reduction is necessary for better analysis and prediction of the target variable.
The two main ASM techniques are
1. Gini index
2. Information Gain (ID3)
GINI INDEX

The Gini index (or Gini impurity) measures the probability of a particular variable being wrongly classified when it is randomly chosen. A Gini index of 0 indicates a pure node, while larger values indicate that the classes are more evenly distributed.

Mathematical formula:

$Gini = 1 - \sum_{i=1}^{n} p_i^2$

where $p_i$ = probability of an object being classified into a particular class.

When the Gini index is used as the criterion for the algorithm to select the feature for the root node, the feature with the least Gini index is selected.
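A minimal pure-Python sketch of this formula, covering both the impurity of a single node and the weighted impurity of a candidate split (the latter anticipates the Gini index of diversity used by CART in the next section); the label lists in the example calls are made-up placeholders:

```python
from collections import Counter

def gini_impurity(labels):
    """Gini index of a list of class labels: 1 - sum(p_i^2)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_of_split(partitions):
    """Weighted Gini index of a split of the data into several label lists."""
    total = sum(len(p) for p in partitions)
    return sum(len(p) / total * gini_impurity(p) for p in partitions)

parent = ["yes"] * 6 + ["no"] * 4
print(gini_impurity(parent))                      # 1 - (0.6^2 + 0.4^2) = 0.48

# A candidate binary split and its reduction in impurity.
split = [["yes"] * 5 + ["no"], ["yes"] + ["no"] * 3]
print(gini_impurity(parent) - gini_of_split(split))
```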
GINI INDEX OF DIVERSITY

Definition 9.7: Gini Index of Diversity
When a data set $D$ is split on an attribute $A$ into $k$ subsets $D_1, D_2, \dots, D_k$, the Gini index of the split is the weighted sum
$Gini_A(D) = \sum_{j=1}^{k} \frac{|D_j|}{|D|} \, Gini(D_j)$
and the Gini index of diversity is the reduction in impurity
$\Delta Gini(A) = Gini(D) - Gini_A(D)$

GINI INDEX OF DIVERSITY AND CART

CART uses this measure to evaluate splits: at each node it chooses the attribute (and binary grouping of its values) that gives the largest reduction $\Delta Gini(A)$, i.e. the smallest weighted Gini index $Gini_A(D)$.
INFORMATION GAIN (ID3)

Entropy is the main concept of this algorithm. The measure that identifies the feature or attribute giving the maximum information about a class is called Information Gain, and the algorithm built on it is ID3. By using this method, we can reduce the level of entropy from the root node to the leaf nodes.

Mathematical formula:

$E(S) = -\sum_{i} p_i \log_2 p_i$

where $p_i$ denotes the probability of class $i$ in the set $S$ and $E(S)$ denotes its entropy. The feature or attribute with the highest information gain is used as the root for the splitting.
ID3: DECISION TREE INDUCTION ALGORITHMS

◼ Quinlan [1986] introduced ID3, a popular short form of Iterative Dichotomizer 3, an algorithm for inducing decision trees from a set of training data.
◼ In ID3, each node corresponds to a splitting attribute and each arc is a possible value
of that attribute.
◼ At each node, the splitting attribute is selected to be the most informative among the
attributes not yet considered in the path starting from the root.
ALGORITHM ID3

◼ In ID3, entropy is used to measure how informative a node is.


◼ It is observed that splitting on any attribute has the property that average entropy of the resulting
training subsets will be less than or equal to that of the previous training set.

◼ ID3 algorithm defines a measurement of a splitting called Information Gain to


determine the goodness of a split.
◼ The attribute with the largest value of information gain is chosen as the splitting attribute, and
◼ it partitions the training set into a number of smaller training sets based on the distinct values of the attribute under split.

DEFINING INFORMATION GAIN

Definition : Entropy
For a training set $D$ with $m$ classes, where $p_i$ is the fraction of records in $D$ belonging to class $C_i$,
$E(D) = -\sum_{i=1}^{m} p_i \log_2 p_i$

Definition : Weighted Entropy
If $D$ is partitioned on an attribute $A$ into subsets $D_1, D_2, \dots, D_k$, the weighted entropy of the split is
$E_A(D) = \sum_{j=1}^{k} \frac{|D_j|}{|D|} \, E(D_j)$

Definition : Information Gain
The information gain of splitting $D$ on attribute $A$ is the reduction in entropy
$Gain(A) = E(D) - E_A(D)$
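A minimal pure-Python sketch of these three definitions; the attribute and class values in the example call are toy placeholders:

```python
import math
from collections import Counter

def entropy(labels):
    """E(D) = -sum(p_i * log2(p_i)) over the classes present in D."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(attribute_values, labels):
    """Gain(A) = E(D) - weighted entropy of the partitions induced by A."""
    n = len(labels)
    partitions = {}
    for value, label in zip(attribute_values, labels):
        partitions.setdefault(value, []).append(label)
    weighted = sum(len(p) / n * entropy(p) for p in partitions.values())
    return entropy(labels) - weighted

# Toy example: splitting on a binary attribute.
attr = ["a", "a", "b", "b", "b", "b"]
cls  = ["yes", "yes", "no", "no", "no", "yes"]
print(round(information_gain(attr, cls), 3))
```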
INFORMATION GAIN CALCULATION

Information gain on splitting OPTH


Age Eye-sight Astigmatism Use type Class
1 1 1 1 3
1 1 1 2 2
1 1 2 1 3
1 1 2 2 1
1 2 1 1 3
1 2 1 2 2
1 2 2 1 3
1 2 2 2 1
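For example, the entropy of this Age = 1 partition can be computed directly from its class counts (four records of class 3, two of class 2 and two of class 1); a short sketch of the arithmetic:

```python
import math
from collections import Counter

labels = [3, 2, 3, 1, 3, 2, 3, 1]           # Class column of the Age = 1 rows above
counts = Counter(labels)                     # {3: 4, 2: 2, 1: 2}
n = len(labels)
entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
print(entropy)                               # 1.5 bits
```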

Information Gain Calculation


Age Eye-sight Astigmatism Use type Class

2 1 1 1 3

2 1 1 2 2

2 1 2 1 3

2 1 2 2 1

2 2 1 1 3

2 2 1 2 2

2 2 2 1 3

2 2 2 2 3

Information Gain Calculation


Age Eye-sight Astigmatism Use type Class

3 1 1 1 3
3 1 1 2 3

3 1 2 1 3

3 1 2 2 1

3 2 1 1 3

3 2 1 2 2

3 2 2 1 3

3 2 2 2 3

INFORMATION GAINS FOR DIFFERENT ATTRIBUTES

DECISION TREE INDUCTION : ID3 WAY
DECISION TREES

ADVANTAGES

◼ Decision trees are easy to visualize.
◼ Non-linear patterns in the data can be captured easily.
◼ They can be used for predicting missing values and are suitable for feature engineering techniques.

DISADVANTAGES

◼ Over-fitting of the data is possible.
◼ A small variation in the input data can result in a different decision tree. This can be reduced by using feature engineering techniques.
◼ We have to balance the data-set before training the model.
WORKING OF A DECISION TREE
1. The root node feature is selected based on the results from the Attribute Selection Measure (ASM).

2. The ASM is applied recursively until a leaf node, i.e. a terminal node that cannot be split into sub-nodes, is reached.
BUILDING DECISION TREE
◼ In principle, there are exponentially many decision trees that can be
constructed from a given database (also called training data).
◼ Some of the trees may not be optimum
◼ Some of them may give inaccurate results

◼ Two approaches are known


◼ Greedy strategy
◼ A top-down recursive divide-and-conquer

◼ Modification of greedy strategy


◼ ID3
◼ C4.5
◼ CART, etc.
BUILD DECISION TREE ALGORITHM
◼ Algorithm BuildDT
◼ Input: D : Training data set
◼ Output: T : Decision tree
◼ Steps
1. If all tuples in D belong to the same class Cj
◼ Add a leaf node labeled as Cj
◼ Return // Termination condition
2. Select an attribute Ai (so that it is not selected twice in the same branch)

3. Partition D = { D1, D2, …, Dp} based on the p different values of Ai in D

4. For each Dk ϵ D
◼ Create a node and add an edge between D and Dk with label as the Ai’s attribute value in Dk

5. For each Dk ϵ D
◼ BuildDT(Dk) // Recursive call
6. Stop
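A minimal runnable Python sketch of BuildDT, using information gain (as in ID3) for the attribute-selection step 2; the nested-dictionary tree representation and the tiny weather-style data set at the bottom are illustrative choices, not part of the original pseudocode.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def build_dt(rows, labels, attributes):
    # Step 1: all tuples belong to the same class Cj -> leaf labeled Cj
    if len(set(labels)) == 1:
        return labels[0]
    if not attributes:                        # no attribute left: majority-class leaf
        return Counter(labels).most_common(1)[0][0]

    # Step 2: select the attribute with the largest information gain
    def gain(a):
        parts = {}
        for row, lab in zip(rows, labels):
            parts.setdefault(row[a], []).append(lab)
        weighted = sum(len(p) / len(labels) * entropy(p) for p in parts.values())
        return entropy(labels) - weighted
    best = max(attributes, key=gain)

    # Steps 3-5: partition D on the values of the chosen attribute and recurse
    tree = {best: {}}
    for value in {row[best] for row in rows}:
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        tree[best][value] = build_dt([rows[i] for i in idx],
                                     [labels[i] for i in idx],
                                     [a for a in attributes if a != best])
    return tree

# Tiny illustrative data set (rows are attribute -> value dictionaries).
rows = [{"Outlook": "sunny", "Windy": "no"}, {"Outlook": "sunny", "Windy": "yes"},
        {"Outlook": "rainy", "Windy": "no"}, {"Outlook": "rainy", "Windy": "yes"}]
labels = ["yes", "no", "yes", "no"]
print(build_dt(rows, labels, ["Outlook", "Windy"]))  # e.g. {'Windy': {'no': 'yes', 'yes': 'no'}}
```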
NODE SPLITTING IN BUILDDT ALGORITHM
◼ The BuildDT algorithm must provide a method for expressing an attribute test condition
and the corresponding outcome for different attribute types

◼ Case: Binary attribute


◼ This is the simplest case of node splitting
◼ The test condition for a binary attribute generates only two outcomes
NODE SPLITTING IN BUILDDT ALGORITHM
◼ Case: Nominal attribute
◼ Since a nominal attribute can have many values, its test condition can be expressed in two ways:
◼ A multi-way split
◼ A binary split

◼ Multi-way split: The outcome depends on the number of distinct values of the corresponding attribute

◼ Binary splitting by grouping attribute values


NODE SPLITTING IN BUILDDT ALGORITHM
◼ Case: Ordinal attribute
◼ It also can be expressed in two ways:
◼ A multi-way split
◼ A binary split

◼ Multi-way split: It is the same as in the case of a nominal attribute


◼ For a binary split, attribute values should be grouped while maintaining the order property of the attribute
values
NODE SPLITTING IN BUILDDT ALGORITHM
◼ Case: Numeric attribute
◼ For a numeric attribute (with discrete or continuous values), a test condition can be expressed as a
comparison
◼ Binary outcome: A > v or A ≤ v
◼ In this case, decision tree induction must consider all possible split positions
◼ Range query : vi ≤ A < vi+1 for i = 1, 2, …, q (if q ranges are chosen)
◼ Here, q should be decided a priori

o For a numeric attribute, decision tree induction is therefore a combinatorial optimization problem (a sketch of the binary-threshold search is given below)
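A sketch of that search for the binary case A ≤ v versus A > v, scanning midpoints between consecutive distinct values and scoring each candidate threshold by the weighted Gini index (the numeric values in the example call are made up):

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_numeric_split(values, labels):
    """Return the threshold v (test: A <= v vs. A > v) with the lowest weighted Gini."""
    pairs = sorted(zip(values, labels))
    best_v, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                          # identical values give no new split position
        v = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [l for x, l in pairs if x <= v]
        right = [l for x, l in pairs if x > v]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_v, best_score = v, score
    return best_v, best_score

heights = [1.5, 1.6, 1.7, 1.8, 1.9, 2.0]      # hypothetical Height values
classes = ["S", "S", "M", "M", "T", "T"]
print(best_numeric_split(heights, classes))    # best threshold and its weighted Gini
```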


○ Consider a training data set as shown.

Attributes:
Gender = {Male (M), Female (F)} // Binary attribute
Height = {1.5, …, 2.5} // Continuous attribute

Class = {Short (S), Medium (M), Tall (T)}

Given a person, we are to test to which class s/he belongs.

ILLUSTRATION: BUILDDT ALGORITHM


◼ To build a decision tree, we can select the attributes in two different orderings: <Gender, Height> or <Height, Gender>
◼ Further, for each ordering, we can choose different ways of splitting
◼ Different instances are shown in the following.
◼ Approach 1 : <Gender, Height>

ILLUSTRATION: BUILDDT ALGORITHM


● Approach 2 :
<Height, Gender>
○ Consider an anonymous database as shown.
• Is there any “clue” that enables us to select the “best” attribute first?
• Suppose the following are two attempts:
• A1🡪A2🡪A3🡪A4 [naïve]
• A3🡪A2🡪A4🡪A1 [random]
• Draw the decision trees for the two orderings mentioned above.

• Are the trees different when classifying any test data?

• If any other sample data is added into the database, is that likely to alter the decision tree already obtained?

ILLUSTRATION: BUILDDT ALGORITHM


OVERFITTING

◼ Overfitting is a concept in data science which occurs when a statistical model fits exactly against its training data. When this
happens, the algorithm cannot perform accurately on unseen data, defeating its purpose. Generalization of a model to new data is
ultimately what allows us to use machine learning algorithms every day to make predictions and classify data.
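A small sketch of the effect (assuming scikit-learn is installed): an unrestricted tree typically scores near-perfectly on its training data but worse on held-out data, while limiting max_depth (a simple form of pruning) narrows that gap. The synthetic data set here is only a stand-in.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for depth in (None, 3):
    clf = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr)
    # Compare training accuracy with test accuracy for each depth setting.
    print(depth, clf.score(X_tr, y_tr), clf.score(X_te, y_te))
```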
Applications of decision tree

Stock Market Prediction


Decision trees can be used to solve problems in the stock market field. Firstly, we use data
mining approaches to evaluate past stock prices and acquire useful knowledge through the
calculation of financial indicators. The transformed data is then classified using decision trees
obtained through the application of artificial intelligence strategies. Finally, the different
decision trees are analyzed and evaluated, showing accuracy rates and emphasizing the total profit
associated with capital gains.

We can therefore say that decision trees have a very crucial role in predicting the stock market.
Marketing

Businesses can use decision trees to enhance the accuracy of their promotional
campaigns by observing the performance of their competitors’ products and
services. Decision trees can help in audience segmentation and support
businesses in producing better-targeted advertisements that have higher
conversion rates. Another use of decision trees is leveraging demographic data to
find prospective clients. They can help streamline a marketing budget and support
informed decisions on the target market that the business is focused on. In the
absence of decision trees, the business may spend its marketing budget without
a specific demographic in mind, which will affect its overall revenues.
Retention of Customers:

Companies use decision trees for customer retention through analyzing their behaviors and
releasing new offers or products to suit those behaviors. By using decision tree models,
companies can figure out the satisfaction levels of their customers as well.

Detection of Frauds:
Companies can prevent fraud by using decision trees to identify fraudulent behavior
beforehand. This can save companies a lot of resources, including time and money.
Diagnosis of Diseases and Ailments:
Decision trees can help physicians and medical professionals in identifying patients that are at a
higher risk of developing serious (or preventable) conditions such as diabetes or dementia. The
ability of decision trees to narrow down possibilities according to specific variables is quite
helpful in such cases.
