Decision Trees
Lecture 5
Salahadin Seid
School of Data Science
Emerland University
September 1, 2024
Outline
• What are Decision Trees?
• Example
• How Does the Decision Tree Algorithm Work?
• Attribute Selection Measures
• Information Gain
• Gini Index
• Python Implementations
What are Decision Trees?
• Decision trees are another popular ML algorithm that can be used for both regression and classification tasks.
• Decision trees make predictions by recursively splitting on different attributes according to a tree structure.
• A decision tree has a hierarchical structure consisting of a root node, branches, internal nodes, and leaf nodes.
• They are easy to understand, interpret, and implement,
making them an ideal choice for beginners in the field of ML.
Example
• Example: classifying fruit as an orange or lemon based on
height and width
Example ...
Example ...
• For continuous attributes, split on whether the value is less than or greater than some threshold (see the sketch below).
• Thus, the input space is divided into regions with boundaries parallel to the axes.
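This threshold-splitting behaviour can be seen directly in scikit-learn. The sketch below fits a tree on a made-up orange/lemon dataset described by height and width; the measurements are illustrative assumptions, not data from the lecture.

    # Minimal sketch: axis-parallel splits learned from made-up fruit data.
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Each row is [height_cm, width_cm]; labels are the fruit type.
    X = [[7.0, 7.2], [6.8, 7.5], [7.5, 7.8],   # oranges: about as wide as tall
         [8.5, 5.5], [9.0, 5.8], [8.8, 6.0]]   # lemons: taller than wide
    y = ["orange", "orange", "orange", "lemon", "lemon", "lemon"]

    clf = DecisionTreeClassifier(random_state=0).fit(X, y)

    # Every internal node tests one feature against a threshold, so the learned
    # decision boundaries are parallel to the height/width axes.
    print(export_text(clf, feature_names=["height", "width"]))
    print(clf.predict([[8.7, 5.6]]))   # -> ['lemon']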
How Does the Decision Tree Algorithm Work?
The basic idea behind any decision tree algorithm is as follows:
1 Select the best attribute using an Attribute Selection Measure (ASM) to split the records.
2 Make that attribute a decision node and break the dataset into smaller subsets.
3 Build the tree by repeating this process recursively for each child until one of the following conditions is met (a code sketch of the procedure follows this list):
• All the tuples belong to the same class.
• There are no more remaining attributes.
• There are no more instances.
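A compact sketch of this recursive procedure is shown below. It assumes the data comes as a list of dicts with a "label" key and takes any scoring function (an ASM, introduced on the next slide) as a parameter; the names build_tree and score are illustrative, not part of a specific library.

    # Illustrative sketch of the generic recursive procedure (ID3-style).
    from collections import Counter

    def build_tree(rows, attributes, score):
        labels = [r["label"] for r in rows]
        # Stopping conditions: pure node, no remaining attributes, or no instances.
        if len(set(labels)) == 1 or not attributes or not rows:
            return Counter(labels).most_common(1)[0][0] if labels else None

        # 1) Pick the best attribute according to the ASM (higher score = better).
        best = max(attributes, key=lambda a: score(rows, a))

        # 2) Make it a decision node and split the data on its values.
        node = {"attribute": best, "children": {}}
        remaining = [a for a in attributes if a != best]
        for value in {r[best] for r in rows}:
            subset = [r for r in rows if r[best] == value]
            # 3) Recurse on each child subset.
            node["children"][value] = build_tree(subset, remaining, score)
        return node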
Attribute Selection Measures
• Attribute selection measures (ASMs), also known as splitting rules, are heuristics for selecting the splitting criterion that partitions the data in the best possible manner.
• An ASM assigns a rank to each feature (or attribute) of the given dataset.
• The attribute with the best score is selected as the splitting attribute.
• The most popular selection measures are Information Gain and the Gini Index.
Information Gain
• Information gain is a measure used to determine which
feature should be used to split the data at each internal node
of the decision tree. It is calculated using entropy.
• Entropy is a metric that measures the impurity of a given attribute; it quantifies the randomness in the data.
• In a decision tree, the goal is to decrease the entropy of the
dataset by creating more pure subsets of data.
• Since entropy is a measure of impurity, by decreasing the
entropy, we are increasing the purity of the data.
Information Gain
Entropy
H(X) = -\sum_{x \in X} p(x) \log_2 p(x)
where p(x) is the probability of randomly selecting an example in class x.
For a dataset made up of three colours (red, purple, and yellow), the equation becomes:
H(X) = -(p_r \log_2 p_r + p_p \log_2 p_p + p_y \log_2 p_y)
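A minimal Python sketch of this entropy formula (a generic illustration, not the entropy_iris.py script referenced at the end of the lecture):

    # H = -sum_x p(x) * log2 p(x) over the classes present in `labels`.
    from collections import Counter
    from math import log2

    def entropy(labels):
        total = len(labels)
        return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

    # Three-colour example: the more mixed the data, the higher the entropy.
    print(entropy(["red"] * 10))                                # 0.0 (pure)
    print(round(entropy(["red", "purple", "yellow"] * 10), 3))  # 1.585 (= log2 3)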
Information Gain...
Information Gain
Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} \, Entropy(S_v)
Equivalently: Information Gain = entropy(parent) - [weighted average] * entropy(children)
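Continuing the sketch, information gain is the parent's entropy minus the weighted average of the children's entropies after a split. The function below reuses the entropy() helper from the previous slide and is illustrative only.

    # Gain(S, A) = Entropy(S) - sum_v |S_v|/|S| * Entropy(S_v)
    # Reuses entropy() from the sketch on the previous slide.
    def information_gain(parent_labels, child_label_groups):
        total = len(parent_labels)
        weighted = sum(
            (len(group) / total) * entropy(group) for group in child_label_groups
        )
        return entropy(parent_labels) - weighted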
Information Gain... example
• A decision tree to predict whether a loan given to a person
would result in a write-off or not.
• Our entire population consists of 30 instances:
• 16 belong to the write-off class, and
• the other 14 belong to the non-write-off class.
• We have two features, namely
• Balance, which can take on two values: < 50K or >= 50K, and
• Residence, which can take on three values: OWN, RENT, or OTHER.
Information Gain... example
• Feature 1: Balance
• Balance < 50K : Sub total: 13 (12 write-off, 1 not)
• Balance >= 50K : Sub total: 17 (4 write-off, 13 not)
E(parent) = -\frac{16}{30}\log_2\frac{16}{30} - \frac{14}{30}\log_2\frac{14}{30} = 0.99
E(Balance < 50K) = -\frac{12}{13}\log_2\frac{12}{13} - \frac{1}{13}\log_2\frac{1}{13} = 0.39
E(Balance >= 50K) = -\frac{4}{17}\log_2\frac{4}{17} - \frac{13}{17}\log_2\frac{13}{17} = 0.79
Weighted average of entropy:
E(Balance) = \frac{13}{30} \cdot 0.39 + \frac{17}{30} \cdot 0.79 = 0.62
Information Gain:
IG(Parent, Balance) = E(Parent) - E(Balance) = 0.99 - 0.62 = 0.37
Information Gain... example
• Feature 2: Residence
• Residence = own : Sub total: 8 (7 write-off, 1 not)
• Residence= rent : Sub total: 10 (4 write-off, 6 not)
• Residence= other : Sub total: 12 (5 write-off, 7 not)
E(parent) = -\frac{16}{30}\log_2\frac{16}{30} - \frac{14}{30}\log_2\frac{14}{30} = 0.99
E(Residence = own) = -\frac{7}{8}\log_2\frac{7}{8} - \frac{1}{8}\log_2\frac{1}{8} = 0.54
E(Residence = rent) = -\frac{4}{10}\log_2\frac{4}{10} - \frac{6}{10}\log_2\frac{6}{10} = 0.97
E(Residence = other) = -\frac{5}{12}\log_2\frac{5}{12} - \frac{7}{12}\log_2\frac{7}{12} = 0.98
Weighted average of entropy:
E(Residence) = \frac{8}{30} \cdot 0.54 + \frac{10}{30} \cdot 0.97 + \frac{12}{30} \cdot 0.98 = 0.86
Information Gain:
IG(Parent, Residence) = E(Parent) - E(Residence) = 0.99 - 0.86 = 0.13
The information gain from Balance (0.37) is almost three times the information gain from Residence (0.13): the feature with the higher information gain, Balance, is more informative and should be used to split the data at this node.
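These numbers can be re-checked with the entropy() and information_gain() helpers sketched earlier. The label lists below are reconstructed from the counts on the slides; the exact results (about 0.38 and 0.14) differ slightly from 0.37 and 0.13 only because the slides round the intermediate entropies to two decimals.

    # Re-checking the worked example; labels reconstructed from the slide counts.
    parent = ["w"] * 16 + ["n"] * 14                      # 16 write-off, 14 not

    balance = [["w"] * 12 + ["n"] * 1,                    # Balance < 50K
               ["w"] * 4 + ["n"] * 13]                    # Balance >= 50K
    print(round(information_gain(parent, balance), 2))    # ~0.38 (slide: 0.37)

    residence = [["w"] * 7 + ["n"] * 1,                   # own
                 ["w"] * 4 + ["n"] * 6,                   # rent
                 ["w"] * 5 + ["n"] * 7]                   # other
    print(round(information_gain(parent, residence), 2))  # ~0.14 (slide: 0.13)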
Gini Index
• The Gini Index is also known as Gini impurity; it is a measure of how mixed or impure a dataset is.
• Gini impurity ranges between 0 and 1, where 0 represents a
pure dataset and 1 represents a completely impure dataset.
• Pure dataset - all the samples belong to the same class or
category.
• Impure dataset - contains a mixture of different classes or
categories.
Gini Index
Gini_impurity = 1 - \sum_{i} p(i)^2
where p(i) is the probability of a specific class and the summation
is done for all classes present in the dataset.
Lets consider a toy dataset with two classes Yes and No and the
following class probabilities: p(Yes) = 0.3 and p(No) = 0.7
Gini_impurity = 1 - (0.3)^2 - (0.7)^2 = 0.42
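A minimal sketch of the formula, checked against the toy probabilities above:

    # Gini impurity = 1 - sum_i p(i)^2 for a list of class probabilities.
    def gini_impurity(probs):
        return 1 - sum(p * p for p in probs)

    print(round(gini_impurity([0.3, 0.7]), 2))   # 0.42
    print(round(gini_impurity([1.0]), 2))        # 0.0  (pure dataset)
    print(round(gini_impurity([0.5, 0.5]), 2))   # 0.5  (maximally mixed, 2 classes)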
Gini Index - Example
Consider a toy dataset with the following features and class labels:
• Target class label is Buys_insurance and it can take two
values Yes or No.
• We want to determine the best feature to use as the root
node for the decision tree.
• To do this, we will calculate the Gini impurity for each feature
and select the feature with the lowest Gini impurity.
Gini Index - Example
Calculate the Gini impurity for each feature using the formula:
Gini_impurity = 1 - p(Feature = value_1)^2 - p(Feature = value_2)^2 - \dots - p(Feature = value_N)^2
1 Age Gini impurity: 1 - p(age = 20)^2 - p(age = 25)^2 - \dots - p(age = 45)^2
Gini_impurity = 1 - (1/6)^2 - (1/6)^2 - \dots - (1/6)^2 \approx 0.83
2 Gender Gini impurity: 1 - p(Gender = male)^2 - p(Gender = female)^2
Gini_impurity = 1 - (3/6)^2 - (3/6)^2 = 0.5
3 Income Gini impurity: 1 - p(Income = high)^2 - p(Income = medium)^2 - p(Income = low)^2
Gini_impurity = 1 - (3/6)^2 - (2/6)^2 - (1/6)^2 \approx 0.61
4 Credit Score Gini impurity: 1 - p(cs = excellent)^2 - p(cs = fair)^2 - p(cs = poor)^2
Gini_impurity = 1 - (3/6)^2 - (2/6)^2 - (1/6)^2 \approx 0.61
From the above calculations, the feature Gender has the lowest Gini impurity (0.5), so we select Gender as the root node for the decision tree.
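The toy table itself is not reproduced in these notes, so the value counts below are assumptions chosen to match the fractions used in the calculations above (6 rows; the specific age values are placeholders). The sketch simply repeats the per-feature Gini computation in code.

    # Per-feature Gini impurity from assumed value counts (6-row toy dataset).
    from collections import Counter

    def gini_from_counts(counts):
        total = sum(counts.values())
        return 1 - sum((n / total) ** 2 for n in counts.values())

    features = {
        "Age":          Counter({20: 1, 25: 1, 30: 1, 35: 1, 40: 1, 45: 1}),
        "Gender":       Counter({"male": 3, "female": 3}),
        "Income":       Counter({"high": 3, "medium": 2, "low": 1}),
        "Credit Score": Counter({"excellent": 3, "fair": 2, "poor": 1}),
    }
    for name, counts in features.items():
        print(name, round(gini_from_counts(counts), 3))
    # Age 0.833, Gender 0.5, Income 0.611, Credit Score 0.611 -> Gender is lowest.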
Decision Trees - over-fitting issue
• Decision Trees are prone to over-fitting.
• Overfitting in decision trees occurs when the tree becomes too
complex and captures the noise in the training data, rather
than the underlying pattern.
• A decision tree will always overfit the training data if we allow
it to grow to its max depth.
• This can lead to poor generalization performance on new, unseen data.
Python Implementations
1 entropy example: entropy_iris.py
2 Training an Unpruned Decision Tree: dt_iris_without.py
• Accuracy on the training set: 1.0
• Accuracy on the testing set: 0.9333
3 Using pruning - removing branches that do not provide much
information gain or that are not necessary for the tree to make
accurate predictions: dt_iris_with_pruning.py
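The course scripts are not reproduced here; the sketch below shows roughly what the unpruned-versus-pruned comparison looks like in scikit-learn (the max_depth=3 choice is an illustrative assumption, and exact accuracies depend on the train/test split).

    # Rough sketch of the unpruned vs. pruned comparison (not the course scripts).
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, random_state=42, stratify=y
    )

    # Unpruned: grown to full depth, so it fits the training set perfectly.
    full = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
    print(full.score(X_train, y_train), full.score(X_test, y_test))

    # Pre-pruned: limiting depth trades a little training accuracy for better
    # generalization; cost-complexity pruning via ccp_alpha is another option.
    pruned = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X_train, y_train)
    print(pruned.score(X_train, y_train), pruned.score(X_test, y_test))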
Thanks For Your Attention!
Any questions?