
Decision Tree Regression

Using Decision Trees for Regression: A Detailed Explanation


The goal of using a decision tree for regression is to predict a continuous numerical output (the
dependent variable) based on one or more input features (the independent variables). The tree
achieves this by recursively partitioning the data space into smaller regions and fitting a simple
prediction (typically the mean of the target variable within that region) within each region.
Here's a breakdown of the process:
1. Data Preparation:
Collect and Prepare Data: Gather your dataset with independent features and the continuous
dependent variable. Handle missing values and encode categorical features if necessary
(though tree-based methods can often handle categoricals directly).
Split the Data: Divide your dataset into training, validation (optional but recommended for
tuning), and test sets. The training set is used to build the tree, the validation set to tune
hyperparameters and prevent overfitting, and the test set to evaluate the final model's
performance on unseen data.
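For a minimal sketch of this split, assuming a pandas DataFrame named df whose continuous target lives in a column named "target" (both names are placeholders):

from sklearn.model_selection import train_test_split

# 'df' and 'target' are placeholder names used only for illustration.
X = df.drop(columns=["target"])   # independent features
y = df["target"]                  # continuous dependent variable

# Carve out a 20% test set first, then split the remainder 75/25 so the
# overall proportions are roughly 60% train / 20% validation / 20% test.
X_train_val, X_test, y_train_val, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_train_val, y_train_val, test_size=0.25, random_state=42)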
2. Choosing the Splitting Criterion:
At each internal node of the tree, the algorithm must decide which feature to split on and what
the split point should be. For regression trees, the goal of the split is to reduce the variance or
the Mean Squared Error (MSE) of the target variable in the resulting child nodes. Common
splitting criteria include:
Variance Reduction: The algorithm selects the split that leads to the largest decrease in the
variance of the target variable across the child nodes. The variance at a node measures the
spread of the target variable values within that node.
Mean Squared Error (MSE) Reduction: The algorithm chooses the split that minimizes the
weighted average of the MSE in the child nodes. The MSE at a node is the average of the
squared differences between the actual target values and the mean target value in that node.
Mean Absolute Error (MAE) Reduction: Similar to MSE, but uses the absolute differences
instead of squared differences.
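As an illustration of the variance-reduction criterion (a sketch, not prescribed by any particular library), the helper below scores a candidate split by how much it lowers the sample-weighted variance of the target in the two child nodes:

import numpy as np

def variance_reduction(y, left_mask):
    # Variance of the target at the parent node minus the weighted
    # average variance of the two child nodes produced by the split.
    y = np.asarray(y, dtype=float)
    y_left, y_right = y[left_mask], y[~left_mask]
    if len(y_left) == 0 or len(y_right) == 0:
        return 0.0  # a split that leaves one side empty reduces nothing
    parent_var = y.var()
    child_var = (len(y_left) * y_left.var() + len(y_right) * y_right.var()) / len(y)
    return parent_var - child_var

Swapping the variance terms for squared or absolute errors around each child's prediction gives the MSE and MAE criteria described above.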
3. Building the Tree (Recursive Partitioning):
The decision tree is built using a greedy and recursive process:
Start at the Root Node: All the training data is in the root node.
Find the Best Split: For the current node, the algorithm iterates through each feature and
considers all possible split points (thresholds for numerical features, subsets for categorical

features). For each potential split, it calculates the reduction in the chosen impurity measure
(e.g., variance or MSE). The split that yields the greatest reduction is selected as the best split
for that node.
Create Child Nodes: The best split divides the data in the current node into two or more child
nodes based on the split condition.
Repeat the Process: The splitting process is recursively applied to each child node. This
continues until a stopping criterion is met.
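A compact sketch of this greedy, recursive procedure for a single numeric feature, reusing the variance_reduction helper above (all names are illustrative):

import numpy as np

def best_split(x, y):
    # Try the midpoints between consecutive sorted feature values and
    # keep the threshold with the largest variance reduction.
    values = np.unique(x)
    best_t, best_gain = None, 0.0
    for t in (values[:-1] + values[1:]) / 2:
        gain = variance_reduction(y, x <= t)
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain

def build_tree(x, y, depth=0, max_depth=3, min_samples=2):
    # Stop when no useful split exists or a stopping criterion is hit;
    # a leaf predicts the mean target of the samples it contains.
    t, gain = best_split(x, y) if len(y) >= min_samples else (None, 0.0)
    if t is None or gain <= 0.0 or depth >= max_depth:
        return {"leaf": True, "value": float(np.mean(y))}
    mask = x <= t
    return {"leaf": False, "threshold": t,
            "left": build_tree(x[mask], y[mask], depth + 1, max_depth, min_samples),
            "right": build_tree(x[~mask], y[~mask], depth + 1, max_depth, min_samples)}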
4. Defining Stopping Criteria:
The recursive splitting must stop to prevent the tree from becoming overly complex and
overfitting the training data. Common stopping criteria include:
Maximum Depth of the Tree: Limits the number of levels in the tree.
Minimum Number of Samples at a Node: Stops splitting if a node contains fewer than a
specified number of data points.
Minimum Number of Samples in a Leaf Node: Ensures that each leaf node has a minimum
number of data points.
Minimum Impurity Reduction: Stops splitting if the improvement in the impurity measure is
below a certain threshold.
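In scikit-learn's DecisionTreeRegressor these stopping criteria correspond to constructor parameters; the specific values below are arbitrary placeholders:

from sklearn.tree import DecisionTreeRegressor

reg = DecisionTreeRegressor(
    max_depth=4,                 # maximum depth of the tree
    min_samples_split=10,        # minimum samples required to split a node
    min_samples_leaf=5,          # minimum samples required in each leaf
    min_impurity_decrease=1e-3,  # minimum impurity reduction to accept a split
    random_state=42,
)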
5. Making Predictions:
Once the regression tree is built, you can use it to predict the target variable for new, unseen
data points:
Traverse the Tree: Starting from the root node, evaluate the split condition at each internal
node on the corresponding feature of the input data point and follow the matching branch.
Reach a Leaf Node: Continue traversing until you reach a leaf node.
Output the Prediction: The predicted value for the input data point is the average (or sometimes
median) of the target variable values of the training samples that ended up in that leaf node.
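With the dictionary-based tree from the sketch in section 3, this traversal is a short loop (illustrative, single feature):

def predict_one(tree, x_value):
    # Walk from the root to a leaf, following the split condition at each node,
    # then return that leaf's stored mean of the training targets.
    node = tree
    while not node["leaf"]:
        node = node["left"] if x_value <= node["threshold"] else node["right"]
    return node["value"]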
Example
Using decision tree regression, predict the efficiency when the engine temperature = 135 °C.

Sr. no.   Engine Temperature (°C)   Efficiency
1         50                        0.4
2         100                       0.6
3         150                       0.55
4         200                       0.7

Let's predict the efficiency at a temperature of 135 °C using the best first split of a decision
tree built from the data above. We'll use the variance reduction criterion to determine the
best split; a sketch of the calculation follows.

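The intermediate arithmetic is not reproduced in the extracted text, so here is a minimal sketch of the comparison, assuming the candidate thresholds are the midpoints 75, 125 and 175 °C between consecutive temperatures:

import numpy as np

temp = np.array([50.0, 100.0, 150.0, 200.0])
eff = np.array([0.4, 0.6, 0.55, 0.7])

for t in [75, 125, 175]:  # candidate thresholds (midpoints between data points)
    left, right = eff[temp <= t], eff[temp > t]
    child_var = (len(left) * left.var() + len(right) * right.var()) / len(eff)
    print(t, round(eff.var() - child_var, 6))
# Approximate reductions: 75 -> 0.008802, 125 -> 0.003906, 175 -> 0.006302,
# so the best first split is Engine Temperature <= 75 °C, and the right child
# predicts the mean of {0.6, 0.55, 0.7}.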
Therefore, using the first best split of the decision tree based on variance reduction
(Engine Temperature <= 75 °C), the predicted efficiency at an engine temperature of 135 °C is
(0.6 + 0.55 + 0.7) / 3 ≈ 0.6167 = 61.67%.

6. Further Splitting

We check if the tree can be further split. The left branch (Temperature <= 75°C) cannot be
split as it has only one data point. The right branch (Temperature > 75°C) can be split further.
We calculate the variance reduction for the potential splits within this branch and find that
splitting the right branch at 175 °C gives the largest variance reduction, 0.002222.

7. Second Split at 175°C

The right branch (Temperature > 75°C) is split at 175°C, resulting in the following tree:

If Engine Temperature (°C) <= 75:
    Predict Efficiency = 0.4
Else (Engine Temperature (°C) > 75):
    If Engine Temperature (°C) <= 175:
        Predict Efficiency = 0.575
    Else (Engine Temperature (°C) > 175):
        Predict Efficiency = 0.7

8. Final Split at 125°C

The middle branch (75 °C < Engine Temperature <= 175 °C) can be further split at 125 °C, giving the final tree:

If Engine Temperature (°C) <= 75:
    Predict Efficiency = 0.4
Else (Engine Temperature (°C) > 75):
    If Engine Temperature (°C) <= 125:
        Predict Efficiency = 0.6
    Else (Engine Temperature (°C) > 125):
        If Engine Temperature (°C) <= 175:
            Predict Efficiency = 0.55
        Else (Engine Temperature (°C) > 175):
            Predict Efficiency = 0.7
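As a cross-check (not part of the original worked example), fitting scikit-learn's DecisionTreeRegressor on the four data points and letting it grow fully should reproduce the same piecewise-constant tree and therefore the same prediction at 135 °C:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

X = np.array([[50], [100], [150], [200]])
y = np.array([0.4, 0.6, 0.55, 0.7])

reg = DecisionTreeRegressor(random_state=0)  # grown until every leaf is pure
reg.fit(X, y)
print(reg.predict([[135]]))                  # expected output: [0.55]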

9. Prediction at 135°C

• After the first split at 75 °C: the predicted efficiency at 135 °C is 0.6167 (≈ 61.67%).
• After the second split at 175 °C: the predicted efficiency at 135 °C is 0.575 (57.5%).
• After the final split at 125 °C: the predicted efficiency at 135 °C is 0.55 (55%).


Vipin V. Palande
NMPL
