
Introduction to Calculus (Part 1)

Jing Li, Assistant Professor
Department of Computing
The Hong Kong Polytechnic University
Teaching Plan

Week  Topic                                          Instructor
1     Data Analytics: An Introduction                Jing/Lotto
2     Introduction to Linear Algebra (Part 1)        Jing/Lotto
3     Introduction to Linear Algebra (Part 2)        Jing/Lotto
4     Introduction to Calculus (Part 1) (+ Quiz 1)   Jing/Lotto
5     Introduction to Calculus (Part 2)              Jing/Lotto
6     In-class Midterm Test                          Jing/Lotto
7     Programming with R (Part 1)                    Jibin
8     Programming with R (Part 2)                    Jibin
9     Data Visualization                             Jibin
10    Monte Carlo Simulation                         Jibin
11    Linear Regression (Assignment out)             Jibin
12    Time-series Analysis (+ Quiz 2)                Jibin
13    Review and Exam Q&A (Assignment due)           Jibin & Jing/Lotto
Course Structure

• Mathematical Basics (4 lectures): Linear Algebra, Calculus
• R Programming (3 lectures): Environment, Data Manipulation
• Advanced Data Analytics (3 lectures): Regression, Simulation, Time-Series Analysis

All three blocks build toward Data Analytics.
Data Analytics is about Data and the Relations

Calculus is the key to understanding functions (for relation analytics).

Take Machine Learning as an example…
ML ≈ finding a suitable function (“model”) given examples of desired input/output behavior

Sentiment Analysis Example: a model $f_\theta$ maps review text $X$ to a sentiment score $Y$, e.g.

$$f_\theta(X) = \sum_{w \in \text{review}} \theta_w$$

$\theta \in \mathbb{R}^d$ (trainable parameters). Training data: $\{(X^{(i)}, Y^{(i)}) : i = 1, \ldots, N\}$.

Training method: minimize the “loss”, the deficiency in the desired fit between the desired output and the model’s output:

$$\min_\theta \sum_{\text{reviews}} \Big( \text{Score} - \sum_{w \in \text{review}} \theta_w \Big)^2$$

(Many other loss formulations exist….)
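A minimal sketch of this bag-of-words scoring model and its squared loss in Python; the words, weights, and reviews below are made-up illustrative data, not from the course:

```python
# Bag-of-words sentiment model: f_theta(X) = sum of theta_w over words w in the
# review, plus the squared loss on toy training data (all values are made up).

theta = {"good": 1.5, "bad": -2.0, "movie": 0.1}   # trainable parameters theta_w

def f_theta(review_words):
    """Model output: sum the weight of every word appearing in the review."""
    return sum(theta.get(w, 0.0) for w in review_words)

training_data = [
    (["good", "movie"], 2.0),   # (X^(i): review words, Y^(i): desired score)
    (["bad", "movie"], -1.0),
]

# Squared loss: sum over reviews of (desired score - model's score)^2
loss = sum((y - f_theta(x)) ** 2 for x, y in training_data)
print(loss)
```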
How to find the “best” parameters for a model?

$\theta \in \mathbb{R}^d$ (trainable parameters). Gradient Descent (GD): starting with some $\theta^{(0)}$, compute for $t = 0, 1, 2, \ldots$

$$\theta^{(t+1)} = \theta^{(t)} - \eta \, \nabla_\theta \, \ell(\theta^{(t)})$$

($\eta \in \mathbb{R}$ is the “learning rate”, say, 0.01.)

The loss $\ell(\theta)$ measures the deficiency in the desired fit. In practice GD can be improved with tricks: time-varying $\eta$, past gradients (“momentum”), “regularization”, ….

[Figure: a convex loss surface next to a non-convex one.] The loss is often nonconvex, especially in deep learning.
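As a minimal illustration of the update rule above, here is gradient descent on a simple convex loss $\ell(\theta) = (\theta - 3)^2$; the loss, starting point, and learning rate are arbitrary choices for the sketch:

```python
# Gradient descent on ell(theta) = (theta - 3)^2, whose gradient is 2*(theta - 3).
# The starting point and learning rate are arbitrary illustrative choices.

def grad(theta):
    return 2.0 * (theta - 3.0)

theta = 0.0        # theta^(0)
eta = 0.01         # learning rate
for t in range(1000):
    theta = theta - eta * grad(theta)   # theta^(t+1) = theta^(t) - eta * gradient

print(theta)       # approaches the minimizer theta = 3
```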
Subcase: deep learning* (deep models = “multilayered”)

$\theta$: $\{M_1, M_2, \ldots\}$

Input $X_1$ → Matrix $M_1$ → $M_1 X_1$ → Nonlinearity → $X_2$ → Matrix $M_2$ → $M_2 X_2$ → Nonlinearity → $X_3$ → Matrix $M_3$ → ⋯ → Output $f_\theta(X_1)$

“Nonlinearity”: given a vector, output the same vector but with its negative entries turned to zero.

Training data: $\{(X_1^{(i)}, Y^{(i)}) : i = 1, \ldots, N\}$

How to compute the gradient of the loss? The Backpropagation Algorithm does it fast: a clever application of the chain rule [Werbos ’74; Rumelhart et al. ’86].

(*Highly simplistic: could have “convolution”, “bias”, “skip connections”, other loss functions, etc.)
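A minimal NumPy sketch of this layered forward computation; the matrix shapes and random values are arbitrary illustrative choices:

```python
import numpy as np

def relu(v):
    """The slide's "nonlinearity": same vector, negative entries turned to zero."""
    return np.maximum(v, 0.0)

rng = np.random.default_rng(0)
M1 = rng.normal(size=(4, 3))   # theta = {M1, M2, M3}, arbitrary shapes
M2 = rng.normal(size=(4, 4))
M3 = rng.normal(size=(1, 4))

def f_theta(x1):
    x2 = relu(M1 @ x1)         # X2 = nonlinearity(M1 X1)
    x3 = relu(M2 @ x2)         # X3 = nonlinearity(M2 X2)
    return M3 @ x3             # output f_theta(X1)

print(f_theta(np.array([1.0, -2.0, 0.5])))
```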
Roadmap

• Linear Algebra Recap
• Functions and Optimization
• Function Derivatives
• Derivative of Univariate Functions
• Partial Derivative and Gradient
Vectors (a list of numbers indicating a line with direction and length in a high-dimensional space)

$\mathbf{u} \in \mathbb{R}^M$, $\mathbf{v} \in \mathbb{R}^M$. By default, we denote vectors as column vectors.

Multiplication of vectors of the same length:
▪ Inner product (dot product), resulting in a scalar: $y = \mathbf{u} \cdot \mathbf{v} = \mathbf{u}^T \mathbf{v} = \sum_{i=1}^{M} u_i v_i$, with $y \in \mathbb{R}$
▪ Outer product, resulting in a matrix: $Y = \mathbf{u} \otimes \mathbf{v} = \mathbf{u}\mathbf{v}^T$, with $Y \in \mathbb{R}^{M \times M}$ and $Y_{i,j} = u_i v_j$
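A quick check of the two products in Python with NumPy; the vectors are arbitrary examples:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

inner = u @ v           # scalar: sum of u_i * v_i = 32.0
outer = np.outer(u, v)  # 3x3 matrix with entries Y[i, j] = u[i] * v[j]

print(inner)
print(outer)
```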
Matrix Product (Inner Product)

$C = AB$, where $A$ is $N \times P$, $B$ is $P \times M$, and $C$ is $N \times M$.

• $C_{ij}$ is the inner product of the $i$th row of $A$ with the $j$th column of $B$
• The inner matrix dimensions must agree
• Note: matrix multiplication doesn’t (generally) commute: $AB \neq BA$
Matrix Product (Outer Product)

$C = AB$ can equivalently be computed as a sum of outer products of the columns of $A$ with the rows of $B$:

$$C = \sum_{p=1}^{P} \mathbf{a}_p \mathbf{b}_p^T$$

where $\mathbf{a}_p$ is the $p$th column of $A$ and $\mathbf{b}_p$ is the $p$th row of $B$.
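A small NumPy sketch verifying that the sum of outer products reproduces the ordinary row-by-column matrix product; the shapes and values are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(2, 3))   # N x P
B = rng.normal(size=(3, 4))   # P x M

# Row-by-column (inner product) view: C[i, j] = A[i, :] . B[:, j]
C_inner = A @ B

# Sum-of-outer-products view: C = sum_p outer(A[:, p], B[p, :])
C_outer = sum(np.outer(A[:, p], B[p, :]) for p in range(A.shape[1]))

print(np.allclose(C_inner, C_outer))   # True
```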
Roadmap
• Linear Algebra Recap
• Functions and Optimization
• Function Derivatives
• Derivative of Univariate Functions
• Partial Derivative and Gradient

Review: Functions

(Note: the input $x$ can be a vector or even a matrix, and so can the output $f(x)$.)

• $y$ is a function of $x$, written as $y = f(x)$:
• Every value of $x$ corresponds to one and only one value of $y$.
• $x$ is the independent variable while $y$ is the dependent variable.
• EXAMPLE: distance traveled per hour $y$ is a function of velocity $x$.

Input $x$ → Function $f$ → Output $y$
Optimization of a Function

• Given a function $f(\mathbf{x})$, where the vector $\mathbf{x}$ denotes a set of variables $x_1, x_2, \ldots, x_n$.
• Optimization of the function is to find the best $\mathbf{x}$ that maximizes or minimizes $f(\mathbf{x}) = f(x_1, x_2, \ldots, x_n)$.
• An optimization problem in everyday life: you selected 3 classes this semester. The three classes have different effects on your GPA, and you want to maximize the GPA (the value of the function) by knowing in advance the number of hours (the function’s variables) you should spend on each class.
Convex Function and Optimization

Convex function: if $f(\mathbf{x})$ is convex, then for any $\mathbf{x}_1, \mathbf{x}_2$ and $0 \le \theta \le 1$:

$$f(\theta \mathbf{x}_1 + (1-\theta)\mathbf{x}_2) \le \theta f(\mathbf{x}_1) + (1-\theta) f(\mathbf{x}_2)$$

Convex optimization: if $f(\mathbf{x})$ is convex, then:
▪ Every local minimum is also a global minimum ☺
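A quick numeric sanity check of the convexity inequality for $f(x) = x^2$; the test points and blend weights are arbitrary random draws:

```python
import numpy as np

def f(x):
    return x ** 2   # a convex function

rng = np.random.default_rng(2)
for _ in range(5):
    x1, x2 = rng.uniform(-10, 10, size=2)
    theta = rng.uniform(0, 1)
    lhs = f(theta * x1 + (1 - theta) * x2)        # f of the blended point
    rhs = theta * f(x1) + (1 - theta) * f(x2)     # blend of the f values
    print(lhs <= rhs + 1e-12)                     # True for a convex f
```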
Concave Function and Optimization

Concave function: if $f(\mathbf{x})$ is concave, then for any $\mathbf{x}_1, \mathbf{x}_2$ and $0 \le \theta \le 1$:

$$f(\theta \mathbf{x}_1 + (1-\theta)\mathbf{x}_2) \ge \theta f(\mathbf{x}_1) + (1-\theta) f(\mathbf{x}_2)$$

Concave optimization: if $f(\mathbf{x})$ is concave, then:
▪ Every local maximum is also a global maximum ☺
Roadmap

• Linear Algebra Recap
• Functions and Optimization
• Function Derivatives
• Derivative of Univariate Functions
• Partial Derivative and Gradient
Secant Line and Tangent Line

Example: $y = x^2$, around the point $P = (3, 9)$.

Average rate of change: the slope of the secant line through $P$ and a second point $Q$ is $m_{sec} = \frac{\Delta y}{\Delta x}$.
▪ $Q = (4, 16)$: $m_{sec} = 7$;  $Q = (3.5, 12.25)$: $m_{sec} = 6.5$;  $Q = (3.1, 9.61)$: $m_{sec} = 6.1$

(Is this function convex?)

Instantaneous rate of change: when $Q$ gets very close to $P$, say $Q = (3 + \Delta x, (3 + \Delta x)^2)$, the secant slope approaches the tangent slope:

$$m_{tan} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$
What are Derivatives?

The derivative of $f(x)$ is the slope of the tangent line (instantaneous rate of change) at $(x, f(x))$:

$$\frac{dy}{dx} = f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$

The process of calculating the derivatives of a function is called differentiation.

Example: $y = x^2$
▪ $\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{(x + \Delta x)^2 - x^2}{\Delta x} = \lim_{\Delta x \to 0} (2x + \Delta x) = 2x$
▪ Also written as $dy = 2x\,dx$ (the differential $dy$ in terms of the differential $dx$)
▪ Used to estimate the output difference $\Delta y$ in terms of a small input difference $\Delta x$
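The limit definition suggests a direct numeric check: for small $\Delta x$, the difference quotient of $f(x) = x^2$ should approach $2x$. A minimal sketch, with an arbitrary evaluation point and step sizes:

```python
def f(x):
    return x ** 2

x = 3.0
for dx in (0.5, 0.1, 0.001):
    quotient = (f(x + dx) - f(x)) / dx   # secant slope (f(x+dx) - f(x)) / dx
    print(dx, quotient)                  # approaches f'(3) = 6 as dx shrinks
```

Note how the printed values (6.5, 6.1, 6.001) match the secant slopes on the earlier slide.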
Example: Derivative of a Univariate Function

$f(x) = 11x + 9$:  $x$ → Function $f$ → $y = f(x)$;  $x + \epsilon$ → $f(x + \epsilon)$

$\frac{dy}{dx}$: how fast does the output change?

$x = 7$: $f(x) = 11x + 9 = 86$
$x + \epsilon = 7.1$: $f(x + \epsilon) = 87.1$

$$\frac{dy}{dx} = \frac{87.1 - 86}{7.1 - 7} = \frac{1.1}{0.1} = 11$$
Some Useful Derivatives

Power Rule: $\frac{d(x^p)}{dx} = p x^{p-1}$
▪ $\frac{d(x^3)}{dx} = 3x^2$;  $\frac{d(1/x)}{dx} = -\frac{1}{x^2}$;  $\frac{d(x^{1/2})}{dx} = \frac{1}{2} x^{-1/2}$
▪ $\frac{dx}{dx} = 1$ ($y = x$ has slope 1 everywhere)

Logarithm Rule: $\frac{d(\log_b x)}{dx} = \frac{1}{x \ln b}$
▪ $\frac{d(\ln x)}{dx} = \frac{1}{x}$

Derivatives of constants: $\frac{dC}{dx} = 0$

Exponential Rule: $\frac{d(b^x)}{dx} = b^x \ln b$
▪ $\frac{d(e^x)}{dx} = e^x$
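These rules can be checked symbolically in Python with the SymPy library; using SymPy here is my assumption for illustration, not part of the course's toolkit:

```python
import sympy as sp

x = sp.symbols('x', positive=True)

print(sp.diff(x**3, x))          # 3*x**2        (power rule)
print(sp.diff(1/x, x))           # -1/x**2       (power rule, p = -1)
print(sp.diff(sp.log(x, 2), x))  # 1/(x*log(2))  (logarithm rule, base 2)
print(sp.diff(sp.exp(x), x))     # exp(x)        (exponential rule)
```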
Properties of Derivatives

For any constant $c$ and any differentiable function $f(x)$: $\frac{d[c f(x)]}{dx} = c \frac{df(x)}{dx}$
▪ $\frac{d[5x^3]}{dx} = 5 \cdot 3x^2 = 15x^2$
▪ $\frac{d[-3e^{2x}]}{dx} = -3 \cdot \frac{d[(e^2)^x]}{dx} = -3 e^{2x} \ln(e^2) = -6e^{2x}$

For any two differentiable functions $f(x)$ and $g(x)$:
▪ Sum and Difference Rules.
▪ Product Rule.
▪ Quotient Rule.
▪ Chain Rule.
Properties of Derivatives (cont.)

For any two differentiable functions $f(x)$ and $g(x)$:

▪ Sum and Difference Rules.
  ▪ $\frac{d[f(x) \pm g(x)]}{dx} = \frac{df(x)}{dx} \pm \frac{dg(x)}{dx}$
  ▪ Example: $y = x^{3/2} - 7x^4 + 10e^{-3x} - 5$
  ▪ $\frac{dy}{dx} = \frac{3}{2} x^{1/2} - 28x^3 - 30e^{-3x}$
▪ Product Rule.
▪ Quotient Rule.
▪ Chain Rule.
Properties of Derivatives (cont.)

For any two differentiable functions $f(x)$ and $g(x)$:

▪ Sum and Difference Rules.
▪ Product Rule.
  ▪ $[f(x) g(x)]' = f'(x) g(x) + f(x) g'(x)$
  ▪ Example: $y = x^{11} e^{6x}$
  ▪ $\frac{dy}{dx} = 11x^{10} e^{6x} + 6e^{6x} x^{11} = (11 + 6x) x^{10} e^{6x}$
▪ Quotient Rule.
▪ Chain Rule.
Properties of Derivatives (cont.)

For any two differentiable functions $f(x)$ and $g(x)$:

▪ Sum and Difference Rules.
▪ Product Rule.
▪ Quotient Rule.
  ▪ $\left[\frac{f(x)}{g(x)}\right]' = \frac{f'(x) g(x) - f(x) g'(x)}{g(x)^2}$, where $g(x) \neq 0$
  ▪ Example: $y = \frac{e^{4x}}{x^7 + 8}$
  ▪ $\frac{dy}{dx} = \frac{4e^{4x}(x^7 + 8) - e^{4x}(7x^6)}{(x^7 + 8)^2} = \frac{e^{4x}(4x^7 + 32 - 7x^6)}{(x^7 + 8)^2}$
▪ Chain Rule.
Properties of Derivatives (cont.)

For any two differentiable functions $f(x)$ and $g(x)$:

▪ Sum and Difference Rules.
▪ Product Rule.
▪ Quotient Rule.
▪ Chain Rule.
  ▪ $[f(g(x))]' = f'(g(x)) \cdot g'(x)$
  ▪ Example: $y = (x^{2/3} + 2e^{-9x})^6$
  ▪ $\frac{dy}{dx} = 6(x^{2/3} + 2e^{-9x})^5 \cdot \left(\frac{2}{3} x^{-1/3} - 18e^{-9x}\right)$
Properties of Derivatives (cont.)

For any two differentiable functions $f(x)$ and $g(x)$:

▪ Sum and Difference Rules.
▪ Product Rule.
▪ Quotient Rule.
▪ Chain Rule.
  ▪ $[f(g(x))]' = f'(g(x)) \cdot g'(x)$
  ▪ Example: $y = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ (the standard normal density)
  ▪ $\frac{dy}{dx} = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \cdot (-x) = -\frac{x}{\sqrt{2\pi}} e^{-x^2/2}$
Example: Chain Rule for Univariate Functions

$g(x) = 3x$, $f(z) = 2z$, $y = f(g(x))$

$x$ → $g(\cdot)$ → $z = 3x$ → $f(\cdot)$ → $y = 2z = f(g(x))$

$$\frac{dy}{dx} = \frac{df}{dz} \cdot \frac{dg}{dx} = 2 \cdot 3 = 6$$
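A finite-difference sketch confirming the composite slope of 6; the evaluation point and step size are arbitrary:

```python
def g(x): return 3 * x
def f(z): return 2 * z

def y(x): return f(g(x))   # the composite y = f(g(x)) = 6x

x, dx = 1.0, 1e-6
print((y(x + dx) - y(x)) / dx)   # ~6.0, matching (df/dz)*(dg/dx) = 2*3
```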
Roadmap
• Linear Algebra Recap
• Functions and Optimization
• Function Derivatives
• Derivative of Univariate Functions
• Partial Derivative and Gradient

Functions × Linear Algebra

The input of a function can be a scalar $x$, a vector $\mathbf{x} \in \mathbb{R}^M$, or a matrix $X \in \mathbb{R}^{M \times L}$; likewise, the output can be a scalar $y$, a vector $\mathbf{y} \in \mathbb{R}^N$, or a matrix $Y \in \mathbb{R}^{N \times K}$. This gives nine input/output combinations:

▪ Scalar output: $y = f(x)$, $y = f(\mathbf{x})$, $y = f(X)$
▪ Vector output: $\mathbf{y} = f(x)$, $\mathbf{y} = f(\mathbf{x})$, $\mathbf{y} = f(X)$
▪ Matrix output: $Y = f(x)$, $Y = f(\mathbf{x})$, $Y = f(X)$

Let us start with the vector-in, scalar-out sample: $y = f(\mathbf{x})$.
Partial Derivatives (for multivariate functions)

A function may have multiple variables, e.g.,
▪ $f(x, y) = x^2 y$ and $g(x_1, x_2, x_3) = x_1 x_2 x_3$

A partial derivative of a function of several variables is its derivative with respect to one of those variables, with the others held constant.

Example (pictured as a surface): $f(x, y) = 100 - x^2 - y^2$
▪ $\frac{\partial f}{\partial x} = -2x$, $\frac{\partial f}{\partial y} = -2y$

Usually denoted by $\frac{\partial f}{\partial x}$ or simply $\frac{df}{dx}$.
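A finite-difference check of the two partial derivatives of $f(x, y) = 100 - x^2 - y^2$; the evaluation point and step size are arbitrary:

```python
def f(x, y):
    return 100 - x**2 - y**2

x, y, h = 2.0, 5.0, 1e-6
df_dx = (f(x + h, y) - f(x, y)) / h   # y held constant -> ~ -2x = -4
df_dy = (f(x, y + h) - f(x, y)) / h   # x held constant -> ~ -2y = -10
print(df_dx, df_dy)
```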
Calculus with Linear Algebra

Vector in, scalar out: define the Gradient (the vector’s derivative). For $y = f(\mathbf{x})$ with $y \in \mathbb{R}$, $\mathbf{x} \in \mathbb{R}^n$, collect the partial derivatives with respect to every entry of $\mathbf{x}$ into one vector:

$$\nabla f(\mathbf{x}) = \left( \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n} \right)$$
Examples of Partial Derivatives and Gradient

[Worked examples shown on the original slide.]
Rules of Partial Differentiation

Many of the univariate function’s derivative rules hold for partial differentiation. These include the sum rule, product rule, quotient rule, chain rule, and power rule.
Functions with Vector or Matrix Output

Vector in, vector out: for $(y_1, \ldots, y_N)^T = f(x_1, \ldots, x_M)$, create the set of partial derivatives $\frac{\partial y_i}{\partial x_j}$. It will form a matrix.

Matrix in, matrix out: for $Y = f(X)$ with $Y \in \mathbb{R}^{N \times K}$ and $X \in \mathbb{R}^{M \times L}$, create the set of partial derivatives $\frac{\partial Y_{i,k}}{\partial X_{j,l}}$. One way to think of it: a “bag of derivatives”. Yes, a four-dimensional tensor.
Vector-Valued Functions: Vector in, Vector out

$\mathbf{y} = f(\mathbf{x})$, with $\mathbf{y} \in \mathbb{R}^N$, $\mathbf{x} \in \mathbb{R}^M$. In numerator layout, $\frac{\partial \mathbf{y}}{\partial \mathbf{x}} \in \mathbb{R}^{N \times M}$.
Gradient: Jacobian Matrix

The matrix of all first-order partial derivatives of a vector-valued function $\mathbf{y} = f(\mathbf{x})$:

$$J = \frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{pmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_M} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_N}{\partial x_1} & \cdots & \frac{\partial y_N}{\partial x_M} \end{pmatrix}$$
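A minimal sketch of a Jacobian estimated by finite differences; the example function, evaluation point, and step size are arbitrary illustrative choices:

```python
import numpy as np

def f(x):
    """Example vector-valued function R^2 -> R^2."""
    return np.array([x[0] * x[1], x[0] + x[1] ** 2])

def jacobian(f, x, h=1e-6):
    """J[i, j] ~= d f_i / d x_j, estimated by finite differences (numerator layout)."""
    y0 = f(x)
    J = np.zeros((y0.size, x.size))
    for j in range(x.size):
        xh = x.copy()
        xh[j] += h
        J[:, j] = (f(xh) - y0) / h
    return J

x = np.array([2.0, 3.0])
print(jacobian(f, x))   # ~ [[3, 2], [1, 6]], matching the analytic partials
```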
Examples about the Use of the Jacobian

[Worked examples shown on the original slide.]
Calculus Chain Rule

Scalar:
▪ $z = g(x)$, $y = f(z)$
▪ $\frac{dy}{dx} = \frac{dy}{dz} \frac{dz}{dx}$

Multivariate:
▪ $\mathbf{z} = g(x)$, $y = f(\mathbf{z})$
▪ $\frac{dy}{dx} = \sum_j \frac{\partial y}{\partial z_j} \frac{\partial z_j}{\partial x}$
Example: Multivariate Chain Rule

$z_1 = g_1(x), \ldots, z_M = g_M(x)$;  $y = f(z_1, \cdots, z_M)$

$x$ → $g_1(x), g_2(x), \ldots, g_M(x)$ → $z_1, z_2, \ldots, z_M$ → $f(z_1, \ldots, z_M)$ → $y$

$$\frac{dy}{dx} = \sum_{j=1}^{M} \frac{\partial y}{\partial z_j} \frac{\partial z_j}{\partial x}$$
Example: Derivatives via the Multivariate Chain Rule

Recall: $\mathbf{z} = g(x)$, $y = f(\mathbf{z})$, $\frac{dy}{dx} = \sum_j \frac{\partial y}{\partial z_j} \frac{\partial z_j}{\partial x}$

$g_1(x) = 3x$, $g_2(x) = 5x$, $f(z_1, z_2) = z_1 + z_2$, $y = f(g_1(x), g_2(x))$

$x$ → $z_1 = 3x$, $z_2 = 5x$ → $z_1 + z_2$ → $y$

$$\frac{dy}{dx} = \frac{\partial f}{\partial z_1} \cdot \frac{\partial z_1}{\partial x} + \frac{\partial f}{\partial z_2} \cdot \frac{\partial z_2}{\partial x} = 1 \cdot 3 + 1 \cdot 5 = 8$$
Example: Derivatives via the Multivariate Chain Rule (cont.)

Recall: $\mathbf{z} = g(x)$, $y = f(\mathbf{z})$, $\frac{dy}{dx} = \sum_j \frac{\partial y}{\partial z_j} \frac{\partial z_j}{\partial x}$

$z_1 = g_1(x) = \sin x$, $z_2 = g_2(x) = x^3$, $y = f(z_1, z_2) = z_1 z_2$

$x$ → $z_1 = \sin(x)$, $z_2 = x^3$ → $z_1 \cdot z_2$ → $y$

$$\frac{dy}{dx} = \frac{\partial f}{\partial z_1} \cdot \frac{\partial z_1}{\partial x} + \frac{\partial f}{\partial z_2} \cdot \frac{\partial z_2}{\partial x} = z_2 \cos x + z_1 \cdot 3x^2 = x^3 \cos x + 3x^2 \sin(x)$$
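A finite-difference check of this chain-rule result at an arbitrary point:

```python
import math

def y(x):
    return math.sin(x) * x**3          # y = f(g1(x), g2(x)) = sin(x) * x^3

def dy_dx(x):
    return x**3 * math.cos(x) + 3 * x**2 * math.sin(x)   # chain-rule result

x, h = 1.3, 1e-7
numeric = (y(x + h) - y(x)) / h
print(numeric, dy_dx(x))   # the two values agree to several decimal places
```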
A Slide to Take Away

• A derivative measures how a function changes as its input changes, essentially representing the rate of change or slope of the function at a given point.
• A partial derivative is a derivative of a function of several variables, where all but the variable of interest are held constant during the differentiation.
• The Jacobian is a matrix of all first-order partial derivatives of a vector-valued function, providing information about the function's rate of change in different directions at a specific point.
• The Chain Rule enables easy derivative computation for composite functions:
  • Identify the outer and inner functions: the chain rule is used when a function is composed of an outer function and an inner function. Identify these two parts.
  • Differentiate the outer function: take the derivative of the outer function, leaving the inner function as it is.
  • Multiply by the derivative of the inner function: the derivative of a composite function is the derivative of the outer function multiplied by the derivative of the inner function.
