AdaBoost
The AdaBoost algorithm, short for Adaptive Boosting, is a boosting technique used as an
ensemble method in machine learning.
It is called Adaptive Boosting because the weights are re-assigned to each instance, with
higher weights assigned to incorrectly classified instances.
Boosting is used to reduce bias as well as variance for supervised learning. It works
on the principle of learners growing sequentially. Except for the first, each
subsequent learner is grown from previously grown learners. In simple words, weak
learners are converted into strong ones. The AdaBoost algorithm works on the same
principle as boosting with a slight difference.
How Does AdaBoost Work?
It makes ‘n’ decision trees during the training period. When the first
decision tree/model is made, the records it classifies incorrectly are
given priority. These records are sent as input to the second model. The process
goes on until the number of base learners we have specified is reached. Repetition of
records is allowed with all boosting techniques.
In particular, the records that are incorrectly classified are used as input for the next
model. This process is repeated until the specified condition is met. There will be ‘n’
models, each built by taking the errors of the previous model into account. This is how
boosting works. The models 1, 2, 3, …, N are individual models, which here are
decision trees. All types of boosting models work on the same principle.
When a random forest is used, the algorithm makes ‘n’ full trees, each consisting of a
root node and several leaf nodes. Some trees might be bigger than others, and there is
no fixed depth in a random forest. With AdaBoost, however, the algorithm only makes a
tree with one node and two leaves, known as a stump.
These stumps are weak learners, which is exactly what boosting techniques prefer. The
order of the stumps is very important in AdaBoost: the error of the first stump influences
how the other stumps are made. (A short sketch of a stump follows below.)
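To make the idea of a stump concrete, here is a minimal sketch that fits a single depth-1 decision tree with scikit-learn; the tiny dataset and its values are made up purely for illustration.

```python
# A "stump" is simply a decision tree with a single split (max_depth=1).
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: 5 records, 2 features, binary labels.
X = [[1, 20], [2, 30], [3, 10], [4, 40], [5, 25]]
y = [0, 0, 1, 1, 0]

stump = DecisionTreeClassifier(max_depth=1)  # one root node and two leaves
stump.fit(X, y)
print(stump.predict([[3, 15]]))              # prediction for a new record
```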
Work Flow
Step 1 – Creating the First Base Learner
1. Let’s assume there are only 5 records. All these records will be assigned a sample
weight.
2. The formula used for this is ‘W=1/N’ where N is the number of records.
3. The sample weight becomes 1/5 initially. Every record gets the same weight. In
this case, it’s 1/5.
4. The next step is to create the same number of stumps as there are features. If there
are 3 features, it will create 3 stumps.
5. Each of these stumps is a small decision tree, so this stage can be called the
stump base-learner model. Out of these 3 models, the algorithm selects
only one.
6. Two properties are considered while selecting a base learner – Gini and Entropy.
We must calculate Gini or Entropy the same way it is calculated for decision
trees.
7. The stump with the least value will be the first base learner.
8. The numbers below the leaves of each stump represent the correctly and incorrectly
classified records. Using these counts, the Gini or entropy index is calculated. The
stump that has the lowest entropy or Gini will be selected as the base learner.
9. Let’s assume that the entropy index is the lowest for stump 1. So, let’s take stump
1, i.e., feature 1, as our first base learner. Let’s assume feature (f1) has classified 4
records correctly and 1 incorrectly. (A rough code sketch of this step follows below.)
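The following is a rough sketch of Step 1, assuming a tiny made-up dataset with 5 records and 3 features: every record starts with weight 1/N, one stump is fit per feature, and the stump with the lowest weighted error, used here as a stand-in for the Gini/entropy comparison, becomes the first base learner. All data and variable names are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: 5 records, 3 features, binary labels.
X = np.array([[1, 0, 3],
              [2, 1, 1],
              [3, 0, 2],
              [4, 1, 5],
              [5, 0, 4]])
y = np.array([0, 0, 1, 1, 1])

# Step 1a: every record gets the same initial sample weight W = 1/N.
n = len(y)
sample_weights = np.full(n, 1 / n)           # [0.2, 0.2, 0.2, 0.2, 0.2]

# Step 1b: fit one stump (entropy criterion) per feature.
stumps = []
for f in range(X.shape[1]):
    stump = DecisionTreeClassifier(max_depth=1, criterion="entropy")
    stump.fit(X[:, [f]], y, sample_weight=sample_weights)
    stumps.append(stump)

# Step 1c: pick the "best" stump; weighted training error is used here
# as a simple proxy for comparing the stumps' impurity.
errors = [np.sum(sample_weights[stumps[f].predict(X[:, [f]]) != y])
          for f in range(len(stumps))]
base_learner = stumps[int(np.argmin(errors))]
print(errors)
```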
Step 2 – Calculating the Total Error (TE)
The total error is the sum of the sample weights of all incorrectly classified records.
In our case, there is only 1 misclassified record, so Total Error (TE) = 1/5.
Step 3 – Calculating Performance of the Stump
1. The formula for calculating the Performance of the Stump is:
Performance of the Stump = 1/2 * ln((1 - TE) / TE)
where ln is the natural log and TE is the Total Error.
2. In our case, TE is 1/5. By substituting the value of total error in the above formula
and solving it, we get the value for the performance of the stump as 0.693.
3. Why is it necessary to calculate the TE and the performance of a stump?
The answer is that we must update the sample weights before proceeding to the
next model or stage, because if the same weights are kept, the next model will
simply reproduce the output of the first one.
In boosting, incorrectly classified records get more preference than correctly
classified records.
In AdaBoost, both kinds of records are allowed to pass, but the wrong records are
repeated more often than the correct ones.
We must increase the weights of the wrongly classified records and decrease
the weights of the correctly classified records.
In the next step, we will update the weights based on the performance of the
stump. (A short numeric sketch of Steps 2 and 3 follows below.)
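Here is a minimal numeric sketch of Steps 2 and 3, using the same numbers as the running example (5 records, 1 misclassified): the total error is the sum of the sample weights of the misclassified records, and the stump's performance is 1/2 * ln((1 - TE) / TE).

```python
import numpy as np

sample_weight = 1 / 5                      # initial weight of every record
total_error = 1 * sample_weight            # one misclassified record -> TE = 0.2

# Performance of the stump: 0.5 * ln((1 - TE) / TE)
performance = 0.5 * np.log((1 - total_error) / total_error)
print(round(performance, 3))               # 0.693
```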
Step 4 – Updating Weights
1. For incorrectly classified records, the formula for updating weights is:
New Sample Weight = Sample Weight * e^(Performance)
In our case Sample weight = 1/5 so, 1/5 * e^ (0.693) = 0.399
2. For correctly classified records, we use the same formula with the performance
value being negative. This leads the weight for correctly classified records to be
reduced as compared to the incorrectly classified ones. The formula is:
New Sample Weight = Sample Weight * e^- (Performance)
Putting the values, 1/5 * e^-(0.693) = 0.100
3. The total sum of all the new sample weights should be 1. In this case, the total
updated weight of all the records is not 1 but 0.799 (0.399 + 4 × 0.100). To bring the
sum to 1, every updated weight must be divided by the total sum of the updated
weights. For example, the updated weight of the wrong record is 0.399, and
0.399/0.799 = 0.50.
0.50 is known as the normalized weight of that record; similarly, each correctly
classified record gets a normalized weight of 0.100/0.799 ≈ 0.13.
Normalized Weight = updated weight divided by the total sum of the updated
weights. (A short sketch of this step follows below.)
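Continuing with the same numbers, here is a rough sketch of Step 4: the misclassified record's weight is multiplied by e^(Performance), the correctly classified records' weights by e^-(Performance), and the results are normalized so that they sum to 1 again. Which record is the misclassified one is assumed purely for illustration.

```python
import numpy as np

performance = 0.693
weights = np.full(5, 1 / 5)                                    # initial weights
misclassified = np.array([False, True, False, False, False])   # assume record 2 was wrong

# Increase the weight of wrong records, decrease the weight of correct ones.
updated = np.where(misclassified,
                   weights * np.exp(performance),    # 0.2 * e^(0.693)  ~ 0.399
                   weights * np.exp(-performance))   # 0.2 * e^(-0.693) ~ 0.100

print(round(updated.sum(), 3))        # ~0.799, so the weights must be normalized
normalized = updated / updated.sum()  # wrong record ~0.50, correct ones ~0.13 each
print(normalized.round(2))
```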
Step 5 – Creating a New Dataset
1. Now, it’s time to create a new dataset from the previous one. In the new
dataset, the frequency of incorrectly classified records will be higher than that of
the correct ones.
2. The new dataset has to be created using the normalized weights, so it will tend
to select the wrong records for training. That dataset is used to build the
second decision tree/stump. To build the new dataset based on the normalized
weights, the algorithm divides the range from 0 to 1 into buckets.
3. So, our first bucket is from 0 – 0.13, the second from 0.13 –
0.63 (0.13 + 0.50), the third from 0.63 – 0.76 (0.63 + 0.13), and so on; each
bucket’s width is one record’s normalized weight.
4. After this, the algorithm runs 5 iterations to select records from the
old dataset.
5. Suppose in the 1st iteration the algorithm draws a random value of 0.46. That value
falls into the second bucket, so the corresponding record (the incorrectly
classified one) is selected into the new dataset.
6. It will again draw a random value, see which bucket it falls into, and select that
record for the new dataset. The same process is repeated 5 times.
7. There is a high probability for wrong records to get selected several times; these
selections form the new dataset.
For example, if row 2 was incorrectly classified in the previous dataset, it will
likely get selected multiple times.
Based on this new dataset, the algorithm will create a new decision tree/stump, and it will
repeat the same process from Step 1 until the specified number of stumps has been built and
the error is lower than it was at the initial stage. (A sketch of the bucket-based resampling
follows below.)
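Below is a rough sketch of the bucket-based resampling, assuming the normalized weights from the running example: the bucket edges are the cumulative sums of the normalized weights, N uniform random draws are mapped to buckets, and the heavily weighted (misclassified) record is likely to appear several times in the new dataset. The random seed and weights are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Normalized weights from Step 4: record 2 (index 1) was misclassified.
updated = np.array([0.100, 0.399, 0.100, 0.100, 0.100])
normalized = updated / updated.sum()
buckets = np.cumsum(normalized)        # roughly [0.13, 0.63, 0.75, 0.88, 1.0]

# Draw 5 random values in [0, 1) and map each one to the bucket it falls into.
draws = rng.random(5)
selected_rows = np.searchsorted(buckets, draws)

# The selected row indices form the new dataset; index 1 (the wrong record)
# is likely to be repeated because its bucket covers half of the range.
print(selected_rows)
```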
How Does the Algorithm Decide Output for Test Data?
1. Suppose with the above dataset, the algorithm constructed 3 decision trees or
stumps.
2. The test dataset will pass through all the stumps which have been constructed by the
algorithm.
3. While passing through the 1st stump, the output it produces is 1. Passing through
the 2nd stump, the output generated once again is 1.
4. While passing through the 3rd stump, it gives the output as 0. In the AdaBoost
algorithm, too, a majority vote takes place among the stumps, in the same way as in
random forests (in the full algorithm, each stump’s vote is additionally weighted by its
performance). In this case, the final output will be 1. This is how the output for
test data is decided. (A short sketch follows below.)
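The sketch below shows how such a final output could be computed for one test record, assuming three already-trained stumps with hypothetical outputs and performances. Each stump's vote is weighted by its performance; with equal performances this reduces to the plain majority vote described above.

```python
import numpy as np

# Hypothetical outputs of the 3 stumps for one test record, and their performances.
stump_outputs = np.array([1, 1, 0])
performances = np.array([0.693, 0.520, 0.310])   # performance ("say") of each stump

# Weighted vote: compare the total performance of stumps voting 1 vs voting 0.
score_for_1 = performances[stump_outputs == 1].sum()
score_for_0 = performances[stump_outputs == 0].sum()
final_output = 1 if score_for_1 >= score_for_0 else 0
print(final_output)   # 1, matching the majority-vote result described above
```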
Important hyperparameters in AdaBoost
The following are the most important hyperparameters of
both AdaBoostClassifier() and AdaBoostRegressor() in scikit-learn.
base_estimator: This is the base learner used in AdaBoost algorithms. The
default and most common learner is a decision tree stump (a decision tree with
max_depth=1) as we discussed earlier.
n_estimators: The maximum number of estimators (models) to train
sequentially. The default is 50. We’ll measure the effect of this hyperparameter
soon.
learning_rate: This determines the weight applied to each estimator in the
boosting process. The default is 1. Smaller values such as 0.05 or 0.1 make the
algorithm train more slowly but often yield higher performance scores (typically in
combination with a larger n_estimators). We’ll measure the effect of this
hyperparameter soon. (A short usage sketch follows below.)
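Here is a minimal usage sketch of these hyperparameters with scikit-learn; the bundled breast-cancer dataset and the train/test split are chosen purely for illustration. The default base learner (a depth-1 stump) is used, so base_estimator does not need to be passed (recent scikit-learn versions call this argument estimator instead).

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Example data; any binary classification dataset would work here.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# n_estimators and learning_rate are the two knobs discussed above;
# the base learner defaults to a decision-tree stump (max_depth=1).
model = AdaBoostClassifier(n_estimators=50, learning_rate=1.0, random_state=42)
model.fit(X_train, y_train)

print(accuracy_score(y_test, model.predict(X_test)))
```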
Advantages and disadvantages
Advantages
1. AdaBoost is less prone to overfitting as the input parameters are not jointly
optimized.
2. The accuracy of weak classifiers can be improved by using AdaBoost. Nowadays,
AdaBoost is used for tasks such as text and image classification, not just binary
classification problems.
Disadvantages
1. The main disadvantage of AdaBoost is that it needs a high-quality dataset.
2. Noisy data and outliers have to be handled before adopting an AdaBoost algorithm.
3. AdaBoost learns progressively (each stump depends on the previous ones), so it
needs high-quality data, more so than Random Forest, for example.
4. It is also very sensitive to outliers and noise in the data, which should be removed
before training. It is also much slower than the XGBoost algorithm.