Machine Learning
Gradient Descent
Gradient Descent
Gradient Descent is just like Agile Methodology
Build something quickly → Get it out there → Get some feedback → Make changes depending upon the feedback → (repeat)
Gradient Descent
Let's have some function J(θ)
Want to min_θ J(θ)
Algorithm:
- initialize θ's randomly
- keep changing θ's to reduce J(θ),
  until we hopefully end up at a minimum
Gradient Descent
Let's have some function J(θ)
Want to min_θ J(θ)
Algorithm:
- initialize θ's randomly
- repeat until convergence {
    θi := θi - α ∂/∂θi J(θ)
  }
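A minimal Python sketch of this update loop, assuming the gradient of J is available as a function grad_J returning all partial derivatives ∂/∂θi J(θ) (grad_J, alpha and num_iters are illustrative names, not from the slide):

    def gradient_descent(grad_J, theta_init, alpha=0.1, num_iters=1000):
        # theta is the full parameter vector; every theta_i is updated using
        # the gradient evaluated at the previous theta (simultaneous update)
        theta = list(theta_init)
        for _ in range(num_iters):
            grad = grad_J(theta)
            theta = [t - alpha * g for t, g in zip(theta, grad)]
        return theta

    # illustrative usage: minimize J(theta) = theta0**2 + theta1**2
    grad_J = lambda th: [2 * th[0], 2 * th[1]]
    print(gradient_descent(grad_J, [4.0, -3.0]))   # approaches [0, 0]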
Gradient Descent
Let's have some function J(θ1)
Want to min_θ1 J(θ1)
Algorithm:
- initialize θ1 randomly
- keep changing θ1 to reduce J(θ1),
  until we hopefully end up at a minimum
Gradient Descent
Let's have some function J(θ1)
Want to min_θ1 J(θ1)
Algorithm:
- initialize θ1 randomly
- repeat until convergence {
    θ1 := θ1 - α ∂/∂θ1 J(θ1)
  }
Gradient Descent
J(θ1) = (θ1 - 3)² + 5
Update rule: θ1 := θ1 - α ∂/∂θ1 J(θ1)
∂/∂θ1 J(θ1) = 2(θ1 - 3),  α = 0.1
Try two starting points: θ1 = 10 and θ1 = -5

θ1    :  -6  -5  -4  -3  -2  -1   0   1   2   3   4   5   6   7   8   9  10  11  12  13
J(θ1) :  86  69  54  41  30  21  14   9   6   5   6   9  14  21  30  41  54  69  86 105
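Working the update rule by hand for the first few steps, starting from θ1 = 10 with α = 0.1:

    θ1 = 10:    ∂/∂θ1 J = 2(10 - 3)  = 14     θ1 := 10 - 0.1·14    = 8.6
    θ1 = 8.6:   ∂/∂θ1 J = 2(8.6 - 3) = 11.2   θ1 := 8.6 - 0.1·11.2 = 7.48
    θ1 = 7.48:  ∂/∂θ1 J = 2(7.48 - 3)= 8.96   θ1 := 7.48 - 0.896   = 6.584

and starting from θ1 = -5:

    θ1 = -5:    ∂/∂θ1 J = 2(-5 - 3)  = -16    θ1 := -5 + 1.6       = -3.4
    θ1 = -3.4:  ∂/∂θ1 J = 2(-3.4 - 3)= -12.8  θ1 := -3.4 + 1.28    = -2.12

From both starting points θ1 moves toward the minimum at θ1 = 3, where J(θ1) = 5.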
Gradient Descent
Q&A
Impact of learning rate in Gradient Descent
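The plots from these slides are not reproduced here; the behaviour they typically illustrate can be checked on the same example cost J(θ1) = (θ1 - 3)² + 5: a very small α converges slowly, a moderate α converges quickly, and a too-large α overshoots and diverges. A small Python experiment (the α values 0.01, 0.1 and 1.1 are illustrative choices, not from the slides):

    def run(alpha, theta=10.0, num_iters=25):
        # gradient descent on J(theta) = (theta - 3)**2 + 5
        for _ in range(num_iters):
            theta = theta - alpha * 2 * (theta - 3)
        return theta

    for alpha in (0.01, 0.1, 1.1):
        print(alpha, run(alpha))
    # alpha = 0.01 -> still far from 3 after 25 steps (too slow)
    # alpha = 0.1  -> very close to 3
    # alpha = 1.1  -> the error grows every step (diverges)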
Q&A
How to implement Gradient Descent
J(θ1) = (θ1 - 3)² + 5

Algorithm:
- initialize θ1 randomly
- repeat until convergence {
    θ1 := θ1 - α ∂/∂θ1 J(θ1)
  }

∂/∂θ1 J(θ1) = 2(θ1 - 3)

Two initializations: θ1 = 10 and θ1 = -5

Plugging the derivative into the update rule:
Repeat until convergence {
    θ1 := θ1 - α · 2(θ1 - 3)
}

(θ1 vs J(θ1) table as on the earlier Gradient Descent slide)
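A runnable Python sketch of this procedure, using a simple convergence test on the size of the update (the tolerance and iteration cap are illustrative choices, not from the slide):

    def gradient_descent(theta1, alpha=0.1, tol=1e-6, max_iters=10000):
        # minimize J(theta1) = (theta1 - 3)**2 + 5 with the update
        # theta1 := theta1 - alpha * 2 * (theta1 - 3)
        for _ in range(max_iters):
            step = alpha * 2 * (theta1 - 3)
            theta1 = theta1 - step
            if abs(step) < tol:   # "until convergence"
                break
        return theta1

    print(gradient_descent(10))   # ~3.0, where J = 5
    print(gradient_descent(-5))   # ~3.0, where J = 5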
How to implement Gradient Descent
Cost function: J(θ0, θ1)
Want to min_{θ0,θ1} J(θ0, θ1)
Algorithm:
- initialize θ's randomly
- repeat until convergence {
    θi := θi - α ∂/∂θi J(θ0, θ1)
  }
How to implement Gradient Descent
Cost function: J(θ0, θ1)
Want to min_{θ0,θ1} J(θ0, θ1)
Algorithm:
- initialize θ's randomly
- repeat until convergence {
    θi := θi - α ∂/∂θi J(θ0, θ1)
  }

Correct: simultaneous update
    temp0 := θ0 - α ∂/∂θ0 J(θ0, θ1)
    temp1 := θ1 - α ∂/∂θ1 J(θ0, θ1)
    θ0 := temp0
    θ1 := temp1

Incorrect:
    temp0 := θ0 - α ∂/∂θ0 J(θ0, θ1)
    θ0 := temp0
    temp1 := θ1 - α ∂/∂θ1 J(θ0, θ1)
    θ1 := temp1
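A small Python sketch of the difference, using a made-up coupled cost J(θ0, θ1) = (θ0 + θ1 - 3)² (not from the slides) so that the order of the updates actually matters:

    def dJ_dtheta0(t0, t1):
        return 2 * (t0 + t1 - 3)

    def dJ_dtheta1(t0, t1):
        return 2 * (t0 + t1 - 3)

    alpha = 0.25
    theta0, theta1 = 0.0, 0.0

    # Correct: both partial derivatives are evaluated at the old values
    temp0 = theta0 - alpha * dJ_dtheta0(theta0, theta1)
    temp1 = theta1 - alpha * dJ_dtheta1(theta0, theta1)
    print(temp0, temp1)          # 1.5 1.5
    # theta0, theta1 are left unchanged here so the incorrect order below
    # can be compared from the same starting point (0, 0)

    # Incorrect: theta0 is overwritten first, so the second derivative is
    # evaluated at the already-updated theta0, giving a different result
    theta0 = theta0 - alpha * dJ_dtheta0(theta0, theta1)
    theta1 = theta1 - alpha * dJ_dtheta1(theta0, theta1)
    print(theta0, theta1)        # 1.5 0.75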
Q&A