
Introduction to Calculus (Part 1)

Jing Li, Assistant Professor
Department of Computing
The Hong Kong Polytechnic University
Teaching Plan

Week  Topic                                          Instructor
1     Data Analytics: An Introduction                Jing/Lotto
2     Introduction to Linear Algebra (Part 1)        Jing/Lotto
3     Introduction to Linear Algebra (Part 2)        Jing/Lotto
4     Introduction to Calculus (Part 1) (+ Quiz 1)   Jing/Lotto
5     Introduction to Calculus (Part 2)              Jing/Lotto
6     In-class Midterm Test                          Jing/Lotto
7     Programming with R (Part 1)                    Jibin
8     Programming with R (Part 2)                    Jibin
9     Data Visualization                             Jibin
10    Monte Carlo Simulation                         Jibin
11    Linear Regression (Assignment out)             Jibin
12    Time-series Analysis (+ Quiz 2)                Jibin
13    Review and Exam Q&A (Assignment due)           Jibin & Jing/Lotto
Course Structure

• Mathematical Basics (4 lectures): Linear Algebra, Calculus
• R Programming (3 lectures): Environment, Data Manipulation
• Advanced Data Analytics (3 lectures): Regression, Simulation, Time-Series Analysis

All three blocks build toward Data Analytics.
Data Analytics is about Data and the Relations

Calculus is the key to understanding functions (for relation analytics).

Take Machine Learning as an example…
ML ≈ finding a suitable function (“model”) given examples of desired input/output behavior

Sentiment Analysis Example: a model $f_\theta$ maps review text $X$ to a sentiment score $Y$, e.g.

$$f_\theta(X) = \sum_{w \in \text{review}} \theta_w$$

$\theta \in \mathbb{R}^d$ (trainable parameters). Training data: $\{(X^{(i)}, Y^{(i)}) : i = 1, \ldots, N\}$.

Training method: minimize the “loss”, the deficiency in the desired fit between the desired output and the model’s output:

$$\min_\theta \sum_{\text{reviews}} \Big( \text{Score} - \sum_{w \in \text{review}} \theta_w \Big)^2$$

(Many other loss formulations exist….)
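A minimal sketch of this bag-of-words scoring model and its squared loss in Python; the words, weights, and reviews below are made-up illustrative data, not from the course:

```python
# Bag-of-words sentiment model: f_theta(X) = sum of theta_w over words w in the
# review, plus the squared loss on toy training data (all values are made up).

theta = {"good": 1.5, "bad": -2.0, "movie": 0.1}   # trainable parameters theta_w

def f_theta(review_words):
    """Model output: sum the weight of every word appearing in the review."""
    return sum(theta.get(w, 0.0) for w in review_words)

training_data = [
    (["good", "movie"], 2.0),   # (X^(i): review words, Y^(i): desired score)
    (["bad", "movie"], -1.0),
]

# Squared loss: sum over reviews of (desired score - model's score)^2
loss = sum((y - f_theta(x)) ** 2 for x, y in training_data)
print(loss)
```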
How to find the “best” parameters for a model?

$\theta \in \mathbb{R}^d$ (trainable parameters). Gradient Descent (GD): starting with some $\theta^{(0)}$, compute for $t = 0, 1, 2, \ldots$

$$\theta^{(t+1)} = \theta^{(t)} - \eta \, \nabla_\theta \, \ell(\theta^{(t)})$$

($\eta \in \mathbb{R}$ is the “learning rate”, say, 0.01.)

The loss $\ell(\theta)$ measures the deficiency in the desired fit. In practice GD can be improved with tricks: time-varying $\eta$, past gradients (“momentum”), “regularization”, ….

[Figure: a convex loss surface next to a non-convex one.] The loss is often nonconvex, especially in deep learning.
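As a minimal illustration of the update rule above, here is gradient descent on a simple convex loss $\ell(\theta) = (\theta - 3)^2$; the loss, starting point, and learning rate are arbitrary choices for the sketch:

```python
# Gradient descent on ell(theta) = (theta - 3)^2, whose gradient is 2*(theta - 3).
# The starting point and learning rate are arbitrary illustrative choices.

def grad(theta):
    return 2.0 * (theta - 3.0)

theta = 0.0        # theta^(0)
eta = 0.01         # learning rate
for t in range(1000):
    theta = theta - eta * grad(theta)   # theta^(t+1) = theta^(t) - eta * gradient

print(theta)       # approaches the minimizer theta = 3
```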
Subcase: deep learning* (deep models = “multilayered”)

$\theta$: $\{M_1, M_2, \ldots\}$

Input $X_1$ → Matrix $M_1$ → $M_1 X_1$ → Nonlinearity → $X_2$ → Matrix $M_2$ → $M_2 X_2$ → Nonlinearity → $X_3$ → Matrix $M_3$ → ⋯ → Output $f_\theta(X_1)$

“Nonlinearity”: given a vector, output the same vector but with its negative entries turned to zero.

Training data: $\{(X_1^{(i)}, Y^{(i)}) : i = 1, \ldots, N\}$

How to compute the gradient of the loss? The Backpropagation Algorithm does it fast: a clever application of the chain rule [Werbos ’74; Rumelhart et al. ’86].

(*Highly simplistic: could have “convolution”, “bias”, “skip connections”, other loss functions, etc.)
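A minimal NumPy sketch of this layered forward computation; the matrix shapes and random values are arbitrary illustrative choices:

```python
import numpy as np

def relu(v):
    """The slide's "nonlinearity": same vector, negative entries turned to zero."""
    return np.maximum(v, 0.0)

rng = np.random.default_rng(0)
M1 = rng.normal(size=(4, 3))   # theta = {M1, M2, M3}, arbitrary shapes
M2 = rng.normal(size=(4, 4))
M3 = rng.normal(size=(1, 4))

def f_theta(x1):
    x2 = relu(M1 @ x1)         # X2 = nonlinearity(M1 X1)
    x3 = relu(M2 @ x2)         # X3 = nonlinearity(M2 X2)
    return M3 @ x3             # output f_theta(X1)

print(f_theta(np.array([1.0, -2.0, 0.5])))
```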
Roadmap

• Linear Algebra Recap
• Functions and Optimization
• Function Derivatives
• Derivative of Univariate Functions
• Partial Derivative and Gradient
Vectors (a list of numbers indicating a line with direction and length in a high-dimensional space)

$\mathbf{u} \in \mathbb{R}^M$, $\mathbf{v} \in \mathbb{R}^M$. By default, we denote vectors as column vectors.

Multiplication of vectors of the same length:
▪ Inner product (dot product), resulting in a scalar: $y = \mathbf{u} \cdot \mathbf{v} = \mathbf{u}^T \mathbf{v} = \sum_{i=1}^{M} u_i v_i$, with $y \in \mathbb{R}$
▪ Outer product, resulting in a matrix: $Y = \mathbf{u} \otimes \mathbf{v} = \mathbf{u}\mathbf{v}^T$, with $Y \in \mathbb{R}^{M \times M}$ and $Y_{i,j} = u_i v_j$
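A quick check of the two products in Python with NumPy; the vectors are arbitrary examples:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

inner = u @ v           # scalar: sum of u_i * v_i = 32.0
outer = np.outer(u, v)  # 3x3 matrix with entries Y[i, j] = u[i] * v[j]

print(inner)
print(outer)
```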
Matrix Product (Inner Product)

$C = AB$, where $A$ is $N \times P$, $B$ is $P \times M$, and $C$ is $N \times M$.

• $C_{ij}$ is the inner product of the $i$th row of $A$ with the $j$th column of $B$
• The inner matrix dimensions must agree
• Note: matrix multiplication doesn’t (generally) commute: $AB \neq BA$
Matrix Product (Outer Product)

$C = AB$ can equivalently be computed as a sum of outer products of the columns of $A$ with the rows of $B$:

$$C = \sum_{p=1}^{P} \mathbf{a}_p \mathbf{b}_p^T$$

where $\mathbf{a}_p$ is the $p$th column of $A$ and $\mathbf{b}_p$ is the $p$th row of $B$.
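A small NumPy sketch verifying that the sum of outer products reproduces the ordinary row-by-column matrix product; the shapes and values are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(2, 3))   # N x P
B = rng.normal(size=(3, 4))   # P x M

# Row-by-column (inner product) view: C[i, j] = A[i, :] . B[:, j]
C_inner = A @ B

# Sum-of-outer-products view: C = sum_p outer(A[:, p], B[p, :])
C_outer = sum(np.outer(A[:, p], B[p, :]) for p in range(A.shape[1]))

print(np.allclose(C_inner, C_outer))   # True
```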
Roadmap
• Linear Algebra Recap
• Functions and Optimization
• Function Derivatives
• Derivative of Univariate Functions
• Partial Derivative and Gradient

Review: Functions

(Note: the input $x$ can be a vector or even a matrix, and so can the output $f(x)$.)

• $y$ is a function of $x$, written as $y = f(x)$:
• Every value of $x$ corresponds to one and only one value of $y$.
• $x$ is the independent variable while $y$ is the dependent variable.
• EXAMPLE: distance traveled per hour $y$ is a function of velocity $x$.

Input $x$ → Function $f$ → Output $y$
Optimization of a Function

• Given a function $f(\mathbf{x})$, where the vector $\mathbf{x}$ denotes a set of variables $x_1, x_2, \ldots, x_n$.
• Optimization of the function is to find the best $\mathbf{x}$ that maximizes or minimizes $f(\mathbf{x}) = f(x_1, x_2, \ldots, x_n)$.
• An optimization problem in everyday life: you selected 3 classes this semester. The three classes have different effects on your GPA, and you want to maximize the GPA (the value of the function) by knowing in advance the number of hours (the function’s variables) you should spend on each class.
Convex Function and Optimization

Convex function: if $f(\mathbf{x})$ is convex, then for any $\mathbf{x}_1, \mathbf{x}_2$ and $0 \le \theta \le 1$:

$$f(\theta \mathbf{x}_1 + (1-\theta)\mathbf{x}_2) \le \theta f(\mathbf{x}_1) + (1-\theta) f(\mathbf{x}_2)$$

Convex optimization: if $f(\mathbf{x})$ is convex, then:
▪ Every local minimum is also a global minimum ☺
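A quick numeric sanity check of the convexity inequality for $f(x) = x^2$; the test points and blend weights are arbitrary random draws:

```python
import numpy as np

def f(x):
    return x ** 2   # a convex function

rng = np.random.default_rng(2)
for _ in range(5):
    x1, x2 = rng.uniform(-10, 10, size=2)
    theta = rng.uniform(0, 1)
    lhs = f(theta * x1 + (1 - theta) * x2)        # f of the blended point
    rhs = theta * f(x1) + (1 - theta) * f(x2)     # blend of the f values
    print(lhs <= rhs + 1e-12)                     # True for a convex f
```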
Concave Function and Optimization

Concave function: if $f(\mathbf{x})$ is concave, then for any $\mathbf{x}_1, \mathbf{x}_2$ and $0 \le \theta \le 1$:

$$f(\theta \mathbf{x}_1 + (1-\theta)\mathbf{x}_2) \ge \theta f(\mathbf{x}_1) + (1-\theta) f(\mathbf{x}_2)$$

Concave optimization: if $f(\mathbf{x})$ is concave, then:
▪ Every local maximum is also a global maximum ☺
Roadmap

• Linear Algebra Recap
• Functions and Optimization
• Function Derivatives
• Derivative of Univariate Functions
• Partial Derivative and Gradient
Secant Line and Tangent Line

Example: $y = x^2$, around the point $P = (3, 9)$.

Average rate of change: the slope of the secant line through $P$ and a second point $Q$ is $m_{sec} = \frac{\Delta y}{\Delta x}$.
▪ $Q = (4, 16)$: $m_{sec} = 7$;  $Q = (3.5, 12.25)$: $m_{sec} = 6.5$;  $Q = (3.1, 9.61)$: $m_{sec} = 6.1$

(Is this function convex?)

Instantaneous rate of change: when $Q$ gets very close to $P$, say $Q = (3 + \Delta x, (3 + \Delta x)^2)$, the secant slope approaches the tangent slope:

$$m_{tan} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$
What are Derivatives?

The derivative of $f(x)$ is the slope of the tangent line (instantaneous rate of change) at $(x, f(x))$:

$$\frac{dy}{dx} = f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$

The process of calculating the derivatives of a function is called differentiation.

Example: $y = x^2$
▪ $\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{(x + \Delta x)^2 - x^2}{\Delta x} = \lim_{\Delta x \to 0} (2x + \Delta x) = 2x$
▪ Also written as $dy = 2x\,dx$ (the differential $dy$ in terms of the differential $dx$)
▪ Used to estimate the output difference $\Delta y$ in terms of a small input difference $\Delta x$
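The limit definition suggests a direct numeric check: for small $\Delta x$, the difference quotient of $f(x) = x^2$ should approach $2x$. A minimal sketch, with an arbitrary evaluation point and step sizes:

```python
def f(x):
    return x ** 2

x = 3.0
for dx in (0.5, 0.1, 0.001):
    quotient = (f(x + dx) - f(x)) / dx   # secant slope (f(x+dx) - f(x)) / dx
    print(dx, quotient)                  # approaches f'(3) = 6 as dx shrinks
```

Note how the printed values (6.5, 6.1, 6.001) match the secant slopes on the earlier slide.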
Example: Derivative of a Univariate Function

$f(x) = 11x + 9$:  $x$ → Function $f$ → $y = f(x)$;  $x + \epsilon$ → $f(x + \epsilon)$

$\frac{dy}{dx}$: how fast does the output change?

$x = 7$: $f(x) = 11x + 9 = 86$
$x + \epsilon = 7.1$: $f(x + \epsilon) = 87.1$

$$\frac{dy}{dx} = \frac{87.1 - 86}{7.1 - 7} = \frac{1.1}{0.1} = 11$$
Some Useful Derivatives

Power Rule: $\frac{d(x^p)}{dx} = p x^{p-1}$
▪ $\frac{d(x^3)}{dx} = 3x^2$;  $\frac{d(1/x)}{dx} = -\frac{1}{x^2}$;  $\frac{d(x^{1/2})}{dx} = \frac{1}{2} x^{-1/2}$
▪ $\frac{dx}{dx} = 1$ ($y = x$ has slope 1 everywhere)

Logarithm Rule: $\frac{d(\log_b x)}{dx} = \frac{1}{x \ln b}$
▪ $\frac{d(\ln x)}{dx} = \frac{1}{x}$

Derivatives of constants: $\frac{dC}{dx} = 0$

Exponential Rule: $\frac{d(b^x)}{dx} = b^x \ln b$
▪ $\frac{d(e^x)}{dx} = e^x$
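These rules can be checked symbolically in Python with the SymPy library; using SymPy here is my assumption for illustration, not part of the course's toolkit:

```python
import sympy as sp

x = sp.symbols('x', positive=True)

print(sp.diff(x**3, x))          # 3*x**2        (power rule)
print(sp.diff(1/x, x))           # -1/x**2       (power rule, p = -1)
print(sp.diff(sp.log(x, 2), x))  # 1/(x*log(2))  (logarithm rule, base 2)
print(sp.diff(sp.exp(x), x))     # exp(x)        (exponential rule)
```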
Properties of Derivatives

For any constant $c$ and any differentiable function $f(x)$: $\frac{d[c f(x)]}{dx} = c \frac{df(x)}{dx}$
▪ $\frac{d[5x^3]}{dx} = 5 \cdot 3x^2 = 15x^2$
▪ $\frac{d[-3e^{2x}]}{dx} = -3 \cdot \frac{d[(e^2)^x]}{dx} = -3 e^{2x} \ln(e^2) = -6e^{2x}$

For any two differentiable functions $f(x)$ and $g(x)$:
▪ Sum and Difference Rules.
▪ Product Rule.
▪ Quotient Rule.
▪ Chain Rule.
Properties of Derivatives (cont.)

For any two differentiable functions $f(x)$ and $g(x)$:

▪ Sum and Difference Rules.
  ▪ $\frac{d[f(x) \pm g(x)]}{dx} = \frac{df(x)}{dx} \pm \frac{dg(x)}{dx}$
  ▪ Example: $y = x^{3/2} - 7x^4 + 10e^{-3x} - 5$
  ▪ $\frac{dy}{dx} = \frac{3}{2} x^{1/2} - 28x^3 - 30e^{-3x}$
▪ Product Rule.
▪ Quotient Rule.
▪ Chain Rule.
Properties of Derivatives (cont.)

For any two differentiable functions $f(x)$ and $g(x)$:

▪ Sum and Difference Rules.
▪ Product Rule.
  ▪ $[f(x) g(x)]' = f'(x) g(x) + f(x) g'(x)$
  ▪ Example: $y = x^{11} e^{6x}$
  ▪ $\frac{dy}{dx} = 11x^{10} e^{6x} + 6e^{6x} x^{11} = (11 + 6x) x^{10} e^{6x}$
▪ Quotient Rule.
▪ Chain Rule.
Properties of Derivatives (cont.)

For any two differentiable functions $f(x)$ and $g(x)$:

▪ Sum and Difference Rules.
▪ Product Rule.
▪ Quotient Rule.
  ▪ $\left[\frac{f(x)}{g(x)}\right]' = \frac{f'(x) g(x) - f(x) g'(x)}{g(x)^2}$, where $g(x) \neq 0$
  ▪ Example: $y = \frac{e^{4x}}{x^7 + 8}$
  ▪ $\frac{dy}{dx} = \frac{4e^{4x}(x^7 + 8) - e^{4x}(7x^6)}{(x^7 + 8)^2} = \frac{e^{4x}(4x^7 + 32 - 7x^6)}{(x^7 + 8)^2}$
▪ Chain Rule.
Properties of Derivatives (cont.)

For any two differentiable functions $f(x)$ and $g(x)$:

▪ Sum and Difference Rules.
▪ Product Rule.
▪ Quotient Rule.
▪ Chain Rule.
  ▪ $[f(g(x))]' = f'(g(x)) \cdot g'(x)$
  ▪ Example: $y = (x^{2/3} + 2e^{-9x})^6$
  ▪ $\frac{dy}{dx} = 6(x^{2/3} + 2e^{-9x})^5 \cdot \left(\frac{2}{3} x^{-1/3} - 18e^{-9x}\right)$
Properties of Derivatives (cont.)

For any two differentiable functions $f(x)$ and $g(x)$:

▪ Sum and Difference Rules.
▪ Product Rule.
▪ Quotient Rule.
▪ Chain Rule.
  ▪ $[f(g(x))]' = f'(g(x)) \cdot g'(x)$
  ▪ Example: $y = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ (the standard normal density)
  ▪ $\frac{dy}{dx} = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \cdot (-x) = -\frac{x}{\sqrt{2\pi}} e^{-x^2/2}$
Example: Chain Rule for Univariate Functions

$g(x) = 3x$, $f(z) = 2z$, $y = f(g(x))$

$x$ → $g(\cdot)$ → $z = 3x$ → $f(\cdot)$ → $y = 2z = f(g(x))$

$$\frac{dy}{dx} = \frac{df}{dz} \cdot \frac{dg}{dx} = 2 \cdot 3 = 6$$
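A finite-difference sketch confirming the composite slope of 6; the evaluation point and step size are arbitrary:

```python
def g(x): return 3 * x
def f(z): return 2 * z

def y(x): return f(g(x))   # the composite y = f(g(x)) = 6x

x, dx = 1.0, 1e-6
print((y(x + dx) - y(x)) / dx)   # ~6.0, matching (df/dz)*(dg/dx) = 2*3
```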
Roadmap
• Linear Algebra Recap
• Functions and Optimization
• Function Derivatives
• Derivative of Univariate Functions
• Partial Derivative and Gradient

Functions × Linear Algebra

The input of a function can be a scalar $x$, a vector $\mathbf{x} \in \mathbb{R}^M$, or a matrix $X \in \mathbb{R}^{M \times L}$; likewise, the output can be a scalar $y$, a vector $\mathbf{y} \in \mathbb{R}^N$, or a matrix $Y \in \mathbb{R}^{N \times K}$. This gives nine input/output combinations:

▪ Scalar output: $y = f(x)$, $y = f(\mathbf{x})$, $y = f(X)$
▪ Vector output: $\mathbf{y} = f(x)$, $\mathbf{y} = f(\mathbf{x})$, $\mathbf{y} = f(X)$
▪ Matrix output: $Y = f(x)$, $Y = f(\mathbf{x})$, $Y = f(X)$

Let us start with the vector-in, scalar-out sample: $y = f(\mathbf{x})$.
Partial Derivatives (for multivariate functions)

A function may have multiple variables, e.g.,
▪ $f(x, y) = x^2 y$ and $g(x_1, x_2, x_3) = x_1 x_2 x_3$

A partial derivative of a function of several variables is its derivative with respect to one of those variables, with the others held constant.

Example (pictured as a surface): $f(x, y) = 100 - x^2 - y^2$
▪ $\frac{\partial f}{\partial x} = -2x$, $\frac{\partial f}{\partial y} = -2y$

Usually denoted by $\frac{\partial f}{\partial x}$ or simply $\frac{df}{dx}$.
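A finite-difference check of the two partial derivatives of $f(x, y) = 100 - x^2 - y^2$; the evaluation point and step size are arbitrary:

```python
def f(x, y):
    return 100 - x**2 - y**2

x, y, h = 2.0, 5.0, 1e-6
df_dx = (f(x + h, y) - f(x, y)) / h   # y held constant -> ~ -2x = -4
df_dy = (f(x, y + h) - f(x, y)) / h   # x held constant -> ~ -2y = -10
print(df_dx, df_dy)
```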
Calculus with Linear Algebra

Vector in, scalar out: define the Gradient (the vector’s derivative). For $y = f(\mathbf{x})$ with $y \in \mathbb{R}$, $\mathbf{x} \in \mathbb{R}^n$, collect the partial derivatives with respect to every entry of $\mathbf{x}$ into one vector:

$$\nabla f(\mathbf{x}) = \left( \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n} \right)$$
Examples of Partial Derivatives and Gradient

[Worked examples shown on the original slide.]
Rules of Partial Differentiation

Many of the univariate function’s derivative rules hold for partial differentiation. These include the sum rule, product rule, quotient rule, chain rule, and power rule.
Functions with Vector or Matrix Output

Vector in, vector out: for $(y_1, \ldots, y_N)^T = f(x_1, \ldots, x_M)$, create the set of partial derivatives $\frac{\partial y_i}{\partial x_j}$. It will form a matrix.

Matrix in, matrix out: for $Y = f(X)$ with $Y \in \mathbb{R}^{N \times K}$ and $X \in \mathbb{R}^{M \times L}$, create the set of partial derivatives $\frac{\partial Y_{i,k}}{\partial X_{j,l}}$. One way to think of it: a “bag of derivatives”. Yes, a four-dimensional tensor.
Vector-Valued Functions: Vector in, Vector out

$\mathbf{y} = f(\mathbf{x})$, with $\mathbf{y} \in \mathbb{R}^N$, $\mathbf{x} \in \mathbb{R}^M$. In numerator layout, $\frac{\partial \mathbf{y}}{\partial \mathbf{x}} \in \mathbb{R}^{N \times M}$.
Gradient: Jacobian Matrix

The matrix of all first-order partial derivatives of a vector-valued function $\mathbf{y} = f(\mathbf{x})$:

$$J = \frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{pmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_M} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_N}{\partial x_1} & \cdots & \frac{\partial y_N}{\partial x_M} \end{pmatrix}$$
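A minimal sketch of a Jacobian estimated by finite differences; the example function, evaluation point, and step size are arbitrary illustrative choices:

```python
import numpy as np

def f(x):
    """Example vector-valued function R^2 -> R^2."""
    return np.array([x[0] * x[1], x[0] + x[1] ** 2])

def jacobian(f, x, h=1e-6):
    """J[i, j] ~= d f_i / d x_j, estimated by finite differences (numerator layout)."""
    y0 = f(x)
    J = np.zeros((y0.size, x.size))
    for j in range(x.size):
        xh = x.copy()
        xh[j] += h
        J[:, j] = (f(xh) - y0) / h
    return J

x = np.array([2.0, 3.0])
print(jacobian(f, x))   # ~ [[3, 2], [1, 6]], matching the analytic partials
```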
Examples about the Use of the Jacobian

[Worked examples shown on the original slide.]
Calculus Chain Rule

Scalar:
▪ $z = g(x)$, $y = f(z)$
▪ $\frac{dy}{dx} = \frac{dy}{dz} \frac{dz}{dx}$

Multivariate:
▪ $\mathbf{z} = g(x)$, $y = f(\mathbf{z})$
▪ $\frac{dy}{dx} = \sum_j \frac{\partial y}{\partial z_j} \frac{\partial z_j}{\partial x}$
Example: Multivariate Chain Rule

$z_1 = g_1(x), \ldots, z_M = g_M(x)$;  $y = f(z_1, \cdots, z_M)$

$x$ → $g_1(x), g_2(x), \ldots, g_M(x)$ → $z_1, z_2, \ldots, z_M$ → $f(z_1, \ldots, z_M)$ → $y$

$$\frac{dy}{dx} = \sum_{j=1}^{M} \frac{\partial y}{\partial z_j} \frac{\partial z_j}{\partial x}$$
Example: Derivatives via the Multivariate Chain Rule

Recall: $\mathbf{z} = g(x)$, $y = f(\mathbf{z})$, $\frac{dy}{dx} = \sum_j \frac{\partial y}{\partial z_j} \frac{\partial z_j}{\partial x}$

$g_1(x) = 3x$, $g_2(x) = 5x$, $f(z_1, z_2) = z_1 + z_2$, $y = f(g_1(x), g_2(x))$

$x$ → $z_1 = 3x$, $z_2 = 5x$ → $z_1 + z_2$ → $y$

$$\frac{dy}{dx} = \frac{\partial f}{\partial z_1} \cdot \frac{\partial z_1}{\partial x} + \frac{\partial f}{\partial z_2} \cdot \frac{\partial z_2}{\partial x} = 1 \cdot 3 + 1 \cdot 5 = 8$$
Example: Derivatives via the Multivariate Chain Rule (cont.)

Recall: $\mathbf{z} = g(x)$, $y = f(\mathbf{z})$, $\frac{dy}{dx} = \sum_j \frac{\partial y}{\partial z_j} \frac{\partial z_j}{\partial x}$

$z_1 = g_1(x) = \sin x$, $z_2 = g_2(x) = x^3$, $y = f(z_1, z_2) = z_1 z_2$

$x$ → $z_1 = \sin(x)$, $z_2 = x^3$ → $z_1 \cdot z_2$ → $y$

$$\frac{dy}{dx} = \frac{\partial f}{\partial z_1} \cdot \frac{\partial z_1}{\partial x} + \frac{\partial f}{\partial z_2} \cdot \frac{\partial z_2}{\partial x} = z_2 \cos x + z_1 \cdot 3x^2 = x^3 \cos x + 3x^2 \sin(x)$$
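A finite-difference check of this chain-rule result at an arbitrary point:

```python
import math

def y(x):
    return math.sin(x) * x**3          # y = f(g1(x), g2(x)) = sin(x) * x^3

def dy_dx(x):
    return x**3 * math.cos(x) + 3 * x**2 * math.sin(x)   # chain-rule result

x, h = 1.3, 1e-7
numeric = (y(x + h) - y(x)) / h
print(numeric, dy_dx(x))   # the two values agree to several decimal places
```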
A Slide to Take Away

• A derivative measures how a function changes as its input changes, essentially representing the rate of change or slope of the function at a given point.
• A partial derivative is a derivative of a function of several variables, where all but the variable of interest are held constant during the differentiation.
• The Jacobian is a matrix of all first-order partial derivatives of a vector-valued function, providing information about the function's rate of change in different directions at a specific point.
• The Chain Rule enables easy derivative computation for composite functions:
  • Identify the outer and inner functions: the chain rule is used when a function is composed of an outer function and an inner function. Identify these two parts.
  • Differentiate the outer function: take the derivative of the outer function, leaving the inner function as it is.
  • Multiply by the derivative of the inner function: the derivative of a composite function is the derivative of the outer function multiplied by the derivative of the inner function.
