Functions of Several Variables
Manoj Pandey
Department of Mathematics
Rajiv Gandhi Proudyogiki Vishwavidyalaya (M.P.)
2 Partial Derivatives
    2.1 Introduction
    2.2 Partial Derivative Along x-Axis
    2.3 Partial Derivative Along y-axis
    2.4 Tangent Plane to a Surface
        2.4.1 Equation of the Tangent Plane
        2.4.2 Second and Higher Order Partial Derivatives
    2.5 Differentiability
        2.5.1 Differentiability of Scalar Function of Several Variables
        2.5.2 Differentiability Implies Continuity
        2.5.3 General Case
3 Chain Rule
    3.1 Composition of Functions
    3.2 The Chain Rule
    3.3 Chain Rule in Higher Dimensions
        3.3.1 Two intermediate and one independent variable
        3.3.2 Two intermediate and two independent variables
        3.3.3 General Case (Optional)
Chapter 1
In this chapter, we introduce scalar functions of several real variables, commonly referred to
as scalar fields. These functions assign a real number to each point in a multi-dimensional
domain and play a crucial role in modeling a wide range of physical and geometric phenom-
ena such as temperature, pressure, and surface area. We will explore fundamental concepts
associated with scalar fields, including domains, level sets, and limits, and develop techniques
for their graphical and analytical interpretation.
1.1 Introduction
In the previous chapter, we studied vector-valued functions of a real variable, given by $c : [a, b] \to \mathbb{R}^n$,
where $n = 2, 3$. These functions map a real input $t$ to a vector $c(t)$ in $\mathbb{R}^n$, representing
curves or paths in space. Such functions are fundamental in describing motion, trajectories,
and geometric forms in two- and three-dimensional spaces.
We now extend our focus to functions of several variables, which differ in that they assign a
real (scalar) value to each point in a multi-dimensional domain. These are known as scalar
fields, especially when defined over a region in space.
A basic example is the area A of a rectangle, which depends on its length x and breadth y.
This relationship is expressed as $A = f(x, y)$, where $f(x, y) = xy$. The function $f$ defines a
scalar field over the xy-plane, assigning a real value to each point $(x, y)$.
Scalar fields also appear in geometric contexts. For instance, the equation of a plane in $\mathbb{R}^3$,
$ax + by + cz = d$ with $c \neq 0$, can be rewritten as
\[ z = g(x, y), \quad \text{where } g(x, y) = \frac{d - ax - by}{c}. \]
In physical contexts, scalar fields are used to describe distributions of quantities such as
temperature. For example, the temperature T at each point (x, y, z) in a room may vary
with position, represented as:
T = T (x, y, z).
This defines a scalar field in $\mathbb{R}^3$, assigning a unique temperature to every point in space.
More generally, a scalar field is a function of $n$ variables that assigns a real number to each
point in a region of $\mathbb{R}^n$. Such functions are essential in mathematics, physics, and engineering,
modeling phenomena including electric potential, pressure, elevation, and population density.
1.2 Definition
Let $D \subseteq \mathbb{R}^n$ be a set of ordered $n$-tuples. A function that assigns to each $n$-tuple
$(x_1, x_2, \ldots, x_n)$ in $D$ a unique real number is called a function of $n$ variables. We write such a
function as
\[ w = f(x_1, x_2, \ldots, x_n). \]
The domain of the function must be a set of inputs for which the function remains meaningful, i.e.,
real and finite. For instance, in the two-dimensional case, i.e., for functions $f : D \subseteq \mathbb{R}^2 \to \mathbb{R}$,
the domain $D$ is the set of all points $(x, y)$ for which $f(x, y)$ is real and finite.
To find the domain of a function of several variables, check which input values
can be substituted into the function without causing problems. Problems usually arise
from operations such as dividing by zero or taking the square root of a negative number. Common
restrictions include:
ILLUSTRATIVE EXAMPLES
Example 1. Describe the domain of each of the following functions and find their values at the
specified points.
4. $f(x, y, z) = \sqrt{36 + x^2 - y^2 - z^2}$; $f(0, 0, 0)$, $f(2, -5, 3)$, $f(1, 2, -3)$
Solution. Let us find the functional values and domains for each of the functions.
1. Given $f(x, y) = x^2 + xy^3$. Clearly, this function is defined for all real values of $x$ and
$y$. Therefore, the domain $D$ is all of $\mathbb{R}^2$, i.e., $D = \{(x, y) : (x, y) \in \mathbb{R}^2\}$. The specific
values of the function are given below:
\[ f(0, 0) = 0^2 + 0 \cdot 0^3 = 0, \]
\[ f(-1, 1) = (-1)^2 + (-1) \cdot (1)^3 = 1 - 1 = 0, \]
\[ f(2, 3) = (2)^2 + (2) \cdot (3)^3 = 4 + 2 \cdot 27 = 4 + 54 = 58, \]
\[ f(-3, -2) = (-3)^2 + (-3) \cdot (-2)^3 = 9 + (-3) \cdot (-8) = 9 + 24 = 33. \]
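2. The function $f(x, y) = \cos(xy)$ is also defined for all real values of $x$ and $y$, so the
domain is the whole of $\mathbb{R}^2$, i.e., $D = \{(x, y) : (x, y) \in \mathbb{R}^2\}$. The specific values of the
function are
\[ f\left(2, \frac{\pi}{2}\right) = \cos\left(2 \cdot \frac{\pi}{2}\right) = \cos(\pi) = -1, \]
\[ f\left(-3, \frac{\pi}{9}\right) = \cos\left(-3 \cdot \frac{\pi}{9}\right) = \cos\left(-\frac{\pi}{3}\right) = \frac{1}{2}, \]
\[ f\left(\pi, \frac{1}{4}\right) = \cos\left(\pi \cdot \frac{1}{4}\right) = \cos\left(\frac{\pi}{4}\right) = \frac{\sqrt{2}}{2}. \]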
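real values of $x$, $y$, and $z$ for which $y^2 + z^2 \neq 0$ (to avoid division by zero). Therefore,
the domain is $D = \{(x, y, z) \in \mathbb{R}^3 : y^2 + z^2 \neq 0\}$. The specific values are given below:
\[ f(3, -1, 2) = \frac{3 - (-1)}{(-1)^2 + 2^2} = \frac{3 + 1}{1 + 4} = \frac{4}{5}, \]
\[ f\left(1, \frac{1}{2}, -\frac{1}{4}\right) = \frac{1 - \frac{1}{2}}{\left(\frac{1}{2}\right)^2 + \left(-\frac{1}{4}\right)^2} = \frac{\frac{1}{2}}{\frac{1}{4} + \frac{1}{16}} = \frac{\frac{1}{2}}{\frac{5}{16}} = \frac{8}{5}, \]
\[ f\left(0, -\frac{1}{3}, 0\right) = \frac{0 - \left(-\frac{1}{3}\right)}{\left(-\frac{1}{3}\right)^2 + 0^2} = \frac{\frac{1}{3}}{\frac{1}{9}} = 3. \]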
4. The function $f(x, y, z) = \sqrt{36 + x^2 - y^2 - z^2}$ has an expression under the square root sign. For the
Solution. From the definition of the function, it is clear that in order to have a real $f(x, y)$,
the expression under the square root must be non-negative, i.e., $4 - x^2 - y^2 \geq 0$. In other
words, $x^2 + y^2 \leq 4$. Thus the domain is the region enclosed by the circle of radius 2 centered at
the origin, together with the circle itself, i.e.,
\[ D = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 \leq 4\}. \]
Thus, the domain is a closed disk (including its boundary) of radius 2, as shown in figure
(1.2.1).
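The domain condition derived above can be checked numerically. The following is a minimal sketch, assuming the function in this example is $f(x, y) = \sqrt{4 - x^2 - y^2}$ (as the inequality in the solution suggests); the helper names `in_domain` and `f` are illustrative only.

```python
import math

def in_domain(x, y):
    """True when 4 - x^2 - y^2 >= 0, i.e. (x, y) lies in the closed disk of radius 2."""
    return 4 - x**2 - y**2 >= 0

def f(x, y):
    """Evaluate sqrt(4 - x^2 - y^2); raise ValueError outside the domain."""
    if not in_domain(x, y):
        raise ValueError(f"({x}, {y}) is outside the domain x^2 + y^2 <= 4")
    return math.sqrt(4 - x**2 - y**2)

print(f(0, 0))           # value at the centre of the disk
print(in_domain(2, 0))   # a boundary point belongs to the domain
print(in_domain(2, 1))   # a point outside the disk does not
```

Points on the circle $x^2 + y^2 = 4$ are accepted because the inequality is non-strict, which matches the closed disk described in the solution.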
Example 3. Find the domain and range of $z = \sqrt{y - x^2}$.
Solution. The function $z = \sqrt{y - x^2}$ is defined at those points $(x, y)$ of the plane $\mathbb{R}^2$ for
which $y - x^2 \geq 0$, i.e., $y \geq x^2$. These are the points (see figure 1.2.2) lying on or inside the
parabola $y = x^2$. The range is the set of all non-negative real
numbers, i.e., $\{z \in \mathbb{R} : z \geq 0\}$.
1.3 Graph
Visualizing a function is essential because it provides an intuitive way to understand how
a function behaves and how its values change in relation to its input variables. The graph of a
function is simply the collection of all ordered pairs (input, output). In the general setting,
the graph $G(f)$ of a function $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ is defined as the set
\[ G(f) = \{(x_1, x_2, \ldots, x_n, w) \in \mathbb{R}^{n+1} : (x_1, x_2, \ldots, x_n) \in D,\ w = f(x_1, x_2, \ldots, x_n)\}. \]
The components of the input, $x_1, x_2, \ldots, x_n$, are the independent variables, while the output
$w$ is the dependent variable.
1.4 Level Sets
A more intuitive and efficient method to visualize a function of two variables is through slic-
ing or sections, i.e., considering the intersection of the surface with horizontal planes (planes
parallel to the xy-plane) at various heights. Specifically, we examine the set of points (x, y, z)
on the surface where the function takes a constant value, say $z = f(x, y) = c$. These
sections are sometimes called contour lines or contour curves.
By plotting several such sections for different values of c, we obtain a contour map, which
reveals important geometric features of the surface. When these slices are projected onto the
xy-plane, they form what are known as level curves. We now formally define this concept
as below:
Definition 1.4.1 (Level Curve). A level curve of a function $f(x, y)$ is the set $S$ of all
points $(x, y)$ in the domain $D$ at which the function has a constant value. That is,
\[ S = \{(x, y) \in D : f(x, y) = c\}. \]
In other words, the level set of a function $f(x, y)$ with value $c$ is the set $S$ of all
points $(x, y)$ in the plane $\mathbb{R}^2$ for which the function $f(x, y)$ reaches a certain 'level' $c$.
For a function $w = f(x, y, z)$, such sets are known as level surfaces. In the case of more than
three variables, they are called level sets. A level surface of a function of three variables is defined
as below:
Definition 1.4.2 (Level Surface). A level surface of a function $f(x, y, z)$ is the set $S$ of
all points $(x, y, z)$ in the domain $D$ at which the function has a constant value. That is,
\[ S = \{(x, y, z) \in D : f(x, y, z) = c\}. \]
In other words, the level surface of a function $f(x, y, z)$ with value $c$ is the set $S$ of
all points $(x, y, z)$ in the space $\mathbb{R}^3$ for which the function $f(x, y, z)$ reaches a certain 'level'
$c$.
Below is an illustrative image of a surface, its contour lines and level curves.
ILLUSTRATIVE EXAMPLES
shown in the following figure, except for $c = 0$. In this case, the level curve is given by
$x^2 - y^2 = 0$, i.e., the pair of lines $x = \pm y$ (a degenerate hyperbola).
Example 6. Sketch the level curves of the exponential function $f(x, y) = e^{x^2 + y^2}$ for
$c = 1, 2, 3, 4$.
Solution. The level curves are obtained by solving $e^{x^2 + y^2} = c$. Taking the natural logarithm
on both sides,
\[ x^2 + y^2 = \log c. \]
For $c > 1$, the level curves are circles centered at the origin with radii $\sqrt{\log c}$; for $c = 1$
the level set reduces to the single point $(0, 0)$.
These circles increase in size as $c$ increases.
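The radii in this example can be tabulated directly. The sketch below assumes the function is $f(x, y) = e^{x^2 + y^2}$ and that $\log$ denotes the natural logarithm; `level_radius` is a hypothetical helper name, and the sample angle 0.7 is arbitrary.

```python
import math

def f(x, y):
    return math.exp(x**2 + y**2)

def level_radius(c):
    """Radius of the level curve f = c, namely sqrt(log c); needs c >= 1."""
    if c < 1:
        raise ValueError("f(x, y) >= 1 everywhere, so no level curve exists for c < 1")
    return math.sqrt(math.log(c))

radii = {c: level_radius(c) for c in (1, 2, 3, 4)}
for c, r in radii.items():
    # Take one point on the circle of radius r and confirm that f equals c there.
    x, y = r * math.cos(0.7), r * math.sin(0.7)
    print(c, round(r, 4), math.isclose(f(x, y), c))
```

For $c = 1$ the radius is zero, so the "circle" degenerates to the origin.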
Example 7. Find the level surfaces of the function $f(x, y, z) = x^2 + y^2 + z^2$ for $c = 4, 9$.
Solution. The level surfaces for $c = 4$ and $c = 9$ are the spheres $x^2 + y^2 + z^2 = 4$ and
$x^2 + y^2 + z^2 = 9$. These spheres are centered at the origin with radii 2 and 3 respectively.
Example 8. Describe the level surface of the function $f(x, y, z) = z - x^2 - y^2$ for $c = 0$.
Solution. The level surface of the function for $c = 0$ is given by $0 = z - x^2 - y^2$. This is
just the paraboloid $z = x^2 + y^2$, opening upward in the positive z-direction.
Example 9. Find the level surface of the function $f(x, y, z) = \log(x^2 + y^2 + z^2)$ for $c = 1$.
Solution. The desired level surface is given by $\log(x^2 + y^2 + z^2) = 1$, where $\log$ is the natural
logarithm. Taking the exponential on both sides, we have $x^2 + y^2 + z^2 = e$. This represents a
sphere of radius $\sqrt{e}$ centered at the origin.
Example 10. Determine the level curve of the function $f(x, y) = x + 2y$ that passes through
the point $(2, 1)$.
Solution. Let $x + 2y = c$ be the level curve that passes through the given point $(2, 1)$.
Therefore, we have $2 + 2(1) = c$, i.e., $c = 4$. Thus, the desired level curve is $x + 2y = 4$. This is just a
straight line.
Example 11. Find the level curve of the function $f(x, y) = x^2 - 3y$ that passes through the
point $(3, 2)$.
Solution. Let the level curve be given by $x^2 - 3y = c$. Substituting $(3, 2)$ into the equation,
we have $c = 3^2 - 3(2) = 3$. Thus, the required level curve is $x^2 - 3y = 3$. This represents a
parabola.
Example 12. Find the level surface of the function $f(x, y, z) = x + y - 2z$ that passes
through the point $(1, 2, 3)$.
Solution. The level surface is given by the equation $x + y - 2z = c$. Substituting
$(1, 2, 3)$ into the equation, we get $c = 1 + 2 - 2(3) = -3$. Thus, the required level surface is
$x + y - 2z = -3$. This is just a plane.
Example 13. Find the level surface of the function $f(x, y, z) = x^2 + y^2 - z$ that passes
through the point $(2, 3, 4)$.
Solution. As in the previous example, the level surfaces are given by $x^2 + y^2 - z = c$. Substituting
$(2, 3, 4)$ into the equation, we get $c = 2^2 + 3^2 - 4 = 4 + 9 - 4 = 9$. Thus, the required
level surface is $x^2 + y^2 - z = 9$. This represents a paraboloid in three-dimensional
space.
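Examples 10 to 13 all follow the same recipe: the level set through a point is found by evaluating the function at that point. A short sketch of this recipe is given below; `level_constant` and the `cases` list are illustrative names, not part of the notes.

```python
def level_constant(f, point):
    """Return c = f(point); the level set through `point` is then {f = c}."""
    return f(*point)

# (description, function, point) for Examples 10 to 13
cases = [
    ("x + 2y",        lambda x, y: x + 2 * y,          (2, 1)),
    ("x^2 - 3y",      lambda x, y: x**2 - 3 * y,       (3, 2)),
    ("x + y - 2z",    lambda x, y, z: x + y - 2 * z,   (1, 2, 3)),
    ("x^2 + y^2 - z", lambda x, y, z: x**2 + y**2 - z, (2, 3, 4)),
]

constants = [level_constant(f, p) for _, f, p in cases]
for (name, _, p), c in zip(cases, constants):
    print(f"{name} = {c} passes through {p}")
```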
Level curves are important for several reasons. In mathematics, level curves provide
a 2D visualization of a 3D surface. Level curves are also related to the gradient:
in forthcoming lectures, we shall see that the gradient vector of $f(x, y)$ at any
point is perpendicular to the level curve passing through that point. Level curves are also used
in weather maps, engineering, physics, economics, and more.
1.5 Limit
In this section, we shall discuss the notions of limits and continuity for functions of several
variables, which are fundamental concepts in multivariable calculus. These ideas extend the
familiar single-variable concepts into higher dimensions and allow us to rigorously understand
how functions behave near a point in space. Just as in single-variable calculus, limits provide
the foundation for defining continuity, derivatives, and integrals for multivariable functions.
However, in the multivariable context, the behavior of a function becomes more intricate
due to the presence of infinitely many paths through which a point can be approached. We
begin our study with the formal definition of the limit of a function of two variables.
\[ \lim_{(x,y)\to(a,b)} f(x, y) = L \]
if for every ε > 0, there exists a δ > 0 such that for all $(x, y) \in D$,
\[ 0 < \sqrt{(x - a)^2 + (y - b)^2} < \delta \;\Rightarrow\; |f(x, y) - L| < \varepsilon. \]
In simpler terms, this means that we can make f (x, y) as close as we want to L (within ε) by
choosing (x, y) sufficiently close to (a, b) (within a distance δ), excluding the point (a, b) itself.
It must be noted that in the single-variable case, a point $a$ can be approached only from
the left or from the right. But in two variables, $(x, y) \to (a, b)$ can occur along infinitely many curves
(straight lines, parabolas, spirals, etc.) and for the limit to exist, the function must approach
the same value regardless of the path.
Thus, geometrically, the definition says that as the point (x, y) approaches (a, b) from any
direction in the plane, along any path, the function values f (x, y) must approach the same
number L. If even two different paths toward (a, b) yield different limiting values for f (x, y),
then the limit does not exist.
It is also important to recognize that the existence of a limit at a point does not depend on
the value of the function at that point. The limit only concerns how the function behaves
near the point, not necessarily at the point itself. Therefore, f (x, y) need not be defined at
(a, b) for the limit to exist.
Remark 1.5.1 (Transition from Spherical to Square Neighborhood). The classical definition
of a limit in two variables employs a spherical neighborhood around the point
$(a, b)$, using the Euclidean distance
\[ \sqrt{(x - a)^2 + (y - b)^2} < \delta, \]
which describes a circular region centered at $(a, b)$. This ensures that the point $(x, y)$ approaches
$(a, b)$ from all directions. However, in many analytical contexts, it is convenient
to work with square neighborhoods defined by bounding each coordinate independently.
This leads to the condition
\[ |x - a| < \delta, \quad |y - b| < \delta, \quad (x, y) \neq (a, b), \]
which describes a punctured open square centered at $(a, b)$ with side length $2\delta$. This
formulation is logically equivalent to the spherical one, due to the equivalence of norms
in finite-dimensional spaces. The use of square neighbourhoods is especially helpful in constructing
δ-values during ε-δ proofs and provides a more direct path to generalization in
higher dimensions.
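The equivalence mentioned in the remark rests on two inclusions: the disc of radius $\delta$ lies inside the square of half-side $\delta$, which in turn lies inside the disc of radius $\delta\sqrt{2}$. The following sketch checks both inclusions on random sample points; the centre $(a, b)$, the value of $\delta$, and the function names are arbitrary choices made for illustration.

```python
import math
import random

def in_disc(x, y, a, b, r):
    """Open disc of radius r centred at (a, b)."""
    return math.hypot(x - a, y - b) < r

def in_square(x, y, a, b, d):
    """Open square of half-side d centred at (a, b)."""
    return abs(x - a) < d and abs(y - b) < d

random.seed(0)
a, b, delta = 1.0, -2.0, 0.5
inclusions_hold = True
for _ in range(10_000):
    x = a + random.uniform(-1, 1)
    y = b + random.uniform(-1, 1)
    # disc(delta) is contained in square(delta)
    if in_disc(x, y, a, b, delta) and not in_square(x, y, a, b, delta):
        inclusions_hold = False
    # square(delta) is contained in disc(delta * sqrt(2))
    if in_square(x, y, a, b, delta) and not in_disc(x, y, a, b, delta * math.sqrt(2)):
        inclusions_hold = False

print(inclusions_hold)
```

A random test of this kind is only a sanity check, not a proof; the inclusions themselves follow from $\max(|u|, |v|) \le \sqrt{u^2 + v^2} \le \sqrt{2}\,\max(|u|, |v|)$.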
\[ \lim_{(x,y)\to(a,b)} f(x, y) = L \]
if for every ε > 0, there exists a δ > 0 such that for all $(x, y) \in D$,
where $L$ and $M$ are real numbers, and let $c$ be a constant. Then the following limit laws hold.
\[ \lim_{(x,y)\to(a,b)} x = a, \qquad \lim_{(x,y)\to(a,b)} y = b, \]
and
\[ \lim_{(x,y)\to(a,b)} k = k. \]
We now present some examples to illustrate how to compute limits of functions of two
variables.
ILLUSTRATIVE EXAMPLES
Solution. Given that $f(x, y) = x + y$, $(a, b) = (1, 1)$ and $L = 2$. In order to prove that the
limit of the function $f(x, y)$ is 2, we need to show that for every $\varepsilon > 0$ there exists $\delta > 0$
such that
\[ |x + y - 2| < \varepsilon \quad \text{whenever } 0 < |x - 1| < \delta,\ 0 < |y - 1| < \delta. \]
Solution. We are given the function $f(x, y) = 3x - 4y$, the point $(a, b) = (0, 0)$, and the
proposed limit $L = 0$. To prove that the limit of $f(x, y)$ is 0 as $(x, y) \to (0, 0)$, we must show
that for every $\varepsilon > 0$, there exists a $\delta > 0$ such that
\[ |3x - 4y - 0| < \varepsilon \quad \text{whenever } 0 < |x| < \delta,\ 0 < |y| < \delta. \]
Let $\varepsilon > 0$ be given. We observe that
\[ |f(x, y) - 0| = |3x - 4y| \leq |3x| + |4y| = 3|x| + 4|y|. \]
Now, suppose that $0 < |x| < \delta$ and $0 < |y| < \delta$. Then,
\[ |3x - 4y| \leq 3|x| + 4|y| < 3\delta + 4\delta = (3 + 4)\delta = 7\delta. \]
To ensure that $|3x - 4y| < \varepsilon$, it is enough to choose
\[ \delta = \frac{\varepsilon}{7}. \]
Then,
\[ |3x - 4y| < 7\delta = \varepsilon. \]
Thus, for every $\varepsilon > 0$, we have found a $\delta = \varepsilon/7$ such that $|3x - 4y| < \varepsilon$
whenever $0 < |x| < \delta$ and $0 < |y| < \delta$. This completes the proof.
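The choice $\delta = \varepsilon/7$ can be sanity-checked by sampling points that satisfy the hypothesis and confirming the conclusion. This is only a numerical illustration of the proof above, not a substitute for it; `delta_for` and `check` are hypothetical helper names.

```python
import random

def f(x, y):
    return 3 * x - 4 * y

def delta_for(eps):
    """The delta chosen in the proof: delta = eps / 7."""
    return eps / 7

def check(eps, trials=10_000, seed=1):
    """Sample points with 0 < |x| < delta and 0 < |y| < delta; test |f(x, y)| < eps."""
    rng = random.Random(seed)
    d = delta_for(eps)
    for _ in range(trials):
        x, y = rng.uniform(-d, d), rng.uniform(-d, d)
        if not (0 < abs(x) < d and 0 < abs(y) < d):
            continue  # skip the rare samples that violate the hypothesis
        if abs(f(x, y)) >= eps:
            return False
    return True

for eps in (1.0, 0.1, 0.001):
    print(eps, delta_for(eps), check(eps))
```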
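\[ \lim_{(x,y)\to(1,1)} (x^2 + 2y) = 3. \]
Solution. We are given $f(x, y) = x^2 + 2y$, the point $(a, b) = (1, 1)$, and the proposed limit
$L = 3$. To prove that $\lim_{(x,y)\to(1,1)} (x^2 + 2y) = 3$, we must show that for every $\varepsilon > 0$, there
exists a $\delta > 0$ such that $|x^2 + 2y - 3| < \varepsilon$ whenever $0 < |x - 1| < \delta$ and $0 < |y - 1| < \delta$.
We observe that
\[ |f(x, y) - 3| = |x^2 + 2y - 3| = |x^2 - 1 + 2(y - 1)| \leq |x^2 - 1| + 2|y - 1|. \]
Now, assume $0 < |x - 1| < \delta$ and $0 < |y - 1| < \delta$, where $\delta \leq 1$. Then $0 < x < 2$, so
$|x + 1| < 3$ and $|x^2 - 1| = |x - 1|\,|x + 1| < 3\delta$. Hence
\[ |x^2 + 2y - 3| < 3\delta + 2\delta = 5\delta. \]
Choosing $\delta = \min\{1, \varepsilon/5\}$, we obtain
\[ |x^2 + 2y - 3| < \varepsilon. \]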
Example 17. Evaluate the limit of the function $f(x, y) = x^2 + y^2$ at $(2, 3)$.
Solution. We use the rules described above to evaluate the limit of the function as below:
\[ \lim_{(x,y)\to(2,3)} (x^2 + y^2) = \Big(\lim_{(x,y)\to(2,3)} x\Big)^2 + \Big(\lim_{(x,y)\to(2,3)} y\Big)^2 = 2^2 + 3^2 = 4 + 9 = 13. \]
An experienced student will notice that the same limit can be calculated much
more quickly through direct substitution of $x = 2$ and $y = 3$.
4. As $(x, y) \to (0, 0)$, the numerator tends to 0, and the denominator tends to 1. Hence,
\[ \lim_{(x,y)\to(0,0)} \frac{x^2 + y^2 + xy}{x^2 + y^2 + 1} = 0. \]
5. As $(x, y) \to (0, 1)$, the numerator tends to 3, and the denominator tends to 2. Thus,
\[ \lim_{(x,y)\to(0,1)} \frac{x + y + 2}{x^2 + y + 1} = \frac{3}{2}. \]
\[ \lim_{(x,y)\to(0,0)} \frac{x^2 - xy + y^2}{\sqrt{|x|} + \sqrt{|y|} + 1} = 0. \]
7. In this case also, we may directly substitute the values. Hence, we have
\[ \lim_{(x,y)\to(0,0)} \frac{x^2 - 3xy + 5y^2}{x - 2y + 1} = 0. \]
9. Let us put $u = x - 3y$. Since $(x, y) \to (0, 0)$, we have $u \to 0$. Thus, the desired limit
is
\[ \lim_{(x,y)\to(0,0)} \frac{\sin(x - 3y)}{x - 3y} = \lim_{u\to 0} \frac{\sin u}{u} = 1. \]
10. Given $(x, y, z) \to (1, 0, -1)$. We directly substitute the values of $x, y, z$. Thus, we have
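Example 19. Evaluate $\lim_{(x,y)\to(2,1)} \dfrac{\sin^{-1}(xy - 2)}{\tan^{-1}(3xy - 6)}$.
Solution. Let us define $u = xy$. As $(x, y) \to (2, 1)$, we see that $u \to 2$. So we rewrite the
limit as
\[ \lim_{(x,y)\to(2,1)} \frac{\sin^{-1}(xy - 2)}{\tan^{-1}(3xy - 6)} = \lim_{u\to 2} \frac{\sin^{-1}(u - 2)}{\tan^{-1}(3u - 6)}. \]
Putting $h = u - 2$, so that $h \to 0$ as $u \to 2$, the limit becomes
\[ \lim_{h\to 0} \frac{\sin^{-1}(h)}{\tan^{-1}(3h)}. \]
Using the standard linear approximations near 0, i.e., $\sin^{-1}(h) \approx h$ and $\tan^{-1}(3h) \approx 3h$ as
$h \to 0$, we get
\[ \lim_{h\to 0} \frac{h}{3h} = \frac{1}{3}. \]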
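The value $1/3$ can be cross-checked numerically by evaluating the quotient at points approaching $(2, 1)$. The sketch below uses two arbitrarily chosen approach paths; the function name `g` and the paths are illustrative assumptions.

```python
import math

def g(x, y):
    """The quotient asin(xy - 2) / atan(3xy - 6) from Example 19."""
    return math.asin(x * y - 2) / math.atan(3 * x * y - 6)

steps = (1e-1, 1e-2, 1e-3, 1e-4)
diagonal = [g(2 + t, 1 + t) for t in steps]      # approach along a diagonal
skew     = [g(2 - t, 1 + 2 * t) for t in steps]  # approach along another line

for t, u, v in zip(steps, diagonal, skew):
    print(t, u, v)
```

Both columns settle near $0.3333$ as the step shrinks, in agreement with the computed limit.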
Example 20. Show that $\lim_{(x,y)\to(0,0)} \dfrac{x^2 - y^2}{x^2 + y^2}$ does not exist.
Solution. To show that the limit does not exist, we shall choose two different paths. First,
we let $(x, y) \to (0, 0)$ along the x-axis, i.e., along the path $y = 0$. We see that
\[ \lim_{(x,y)\to(0,0)} \frac{x^2 - y^2}{x^2 + y^2} = \lim_{(x,y)\to(0,0)} \frac{x^2}{x^2} = 1. \]
Secondly, we let $(x, y) \to (0, 0)$ along the y-axis, i.e., along the path $x = 0$. We see that
\[ \lim_{(x,y)\to(0,0)} \frac{x^2 - y^2}{x^2 + y^2} = \lim_{(x,y)\to(0,0)} \left(-\frac{y^2}{y^2}\right) = -1. \]
We see that the limit is different along two different paths. Since the limit depends on the
path, it does not exist.
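The two-path argument can be illustrated by evaluating the function at points approaching the origin along each axis. This is a small sketch; the extra diagonal path $y = x$ is an additional check not used in the solution above.

```python
def f(x, y):
    return (x**2 - y**2) / (x**2 + y**2)

ts = [10.0 ** (-k) for k in range(1, 6)]   # 0.1, 0.01, ..., 1e-5

along_x_axis = [f(t, 0.0) for t in ts]     # path y = 0
along_y_axis = [f(0.0, t) for t in ts]     # path x = 0
along_diagonal = [f(t, t) for t in ts]     # path y = x (extra check)

print(along_x_axis)
print(along_y_axis)
print(along_diagonal)
```

The function is constant along each of these paths (1, -1 and 0 respectively), so no single value can serve as the limit at the origin.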
Solution. Let us assume that the point $(x, y)$ approaches $(0, 0)$ along the path $y = mx$.
Then,
\[ \lim_{(x,y)\to(0,0)} \frac{xy}{x^2 + y^2} = \lim_{(x,y)\to(0,0)} \frac{mx^2}{x^2 + m^2x^2} = \frac{m}{1 + m^2}. \]
Thus, the limit depends on the slope $m$ of the path $y = mx$, and therefore it does not exist.
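Example 22. Show that $\lim_{(x,y)\to(0,1)} \tan^{-1}\left(\dfrac{y}{x}\right)$ does not exist.
Solution. To analyze the limit, we examine the behavior of the function along different
paths approaching the point $(0, 1)$. For instance, along the line $y = 1$ we have
$\tan^{-1}(1/x) \to \pi/2$ as $x \to 0^+$, whereas $\tan^{-1}(1/x) \to -\pi/2$ as $x \to 0^-$.
Since the function approaches different values along different paths, the given limit
$\lim_{(x,y)\to(0,1)} \tan^{-1}\left(\dfrac{y}{x}\right)$ does not exist.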
Solution. Let the point $(x, y)$ approach $(0, 0)$ along the path $y = mx$, where $m$ is a constant.
Then,
\[ \lim_{(x,y)\to(0,0)} \frac{2y}{x} = \lim_{x\to 0} \frac{2(mx)}{x} = \lim_{x\to 0} 2m = 2m. \]
Since the value of the limit depends on the parameter $m$, which varies with the path chosen,
the limit does not exist.
Solution. Let us approach the point $(0, 0)$ along the path $y = mx$, where $m$ is a real
constant. Substituting $y = mx$ into the function:
\[ f(x, y) = \frac{xy}{x^2 + y^2} = \frac{x(mx)}{x^2 + (mx)^2} = \frac{mx^2}{x^2 + m^2x^2} = \frac{mx^2}{x^2(1 + m^2)} = \frac{m}{1 + m^2}. \]
Thus, along this path, $\lim_{(x,y)\to(0,0)} f(x, y) = \dfrac{m}{1 + m^2}$, which depends on the value of $m$. Since the limit depends
on the path taken (i.e., the value of $m$), the limit does not exist.
Path 1: Let us assume that $(x, y) \to (0, 0)$ along the parabola $y = x^2$. Then,
\[ \lim_{(x,y)\to(0,0)} f(x, y) = \lim_{(x,y)\to(0,0)} \frac{x^2(x^2)}{x^4 + (x^2)^2} = \lim_{(x,y)\to(0,0)} \frac{x^4}{2x^4} = \frac{1}{2}. \]
Path 2: This time, we assume that $(x, y) \to (0, 0)$ along the x-axis, i.e., $y = 0$. Then,
\[ \lim_{(x,y)\to(0,0)} f(x, y) = \lim_{(x,y)\to(0,0)} \frac{x^2 \cdot 0}{x^4 + 0} = 0. \]
Thus, along the path $y = 0$, the function approaches 0. Since the limit depends on the path
taken, the limit does not exist.
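This example is a reminder that checking straight lines alone is not enough. The sketch below assumes the function is $f(x, y) = \dfrac{x^2 y}{x^4 + y^2}$, as the two path computations indicate; the lines $y = mx$ are an extra check not carried out in the solution.

```python
import math

def f(x, y):
    return x**2 * y / (x**4 + y**2)

ts = [10.0 ** (-k) for k in range(1, 6)]

along_parabola = [f(t, t**2) for t in ts]   # path y = x^2
along_x_axis = [f(t, 0.0) for t in ts]      # path y = 0
# Along y = m*x the value is m*t / (t^2 + m^2), which shrinks with t.
along_lines = {m: [f(t, m * t) for t in ts] for m in (1.0, -2.0, 5.0)}

print(along_parabola)
print(along_x_axis)
for m, values in along_lines.items():
    print(m, values[-1])
```

Every straight line through the origin gives values tending to 0, yet the parabola $y = x^2$ gives the constant value $1/2$, so the limit still fails to exist.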
1.6 Continuous Functions
A function $f(x, y)$ is said to be continuous at a point $(a, b)$ if the following three conditions
are satisfied:
1. The value of the function $f(x, y)$ at $(a, b)$ exists, i.e., $f(a, b)$ is defined.
2. The limit of the function $f(x, y)$ at $(a, b)$ exists, i.e., $\lim_{(x,y)\to(a,b)} f(x, y)$ exists.
3. The value of the function and the limit at the point $(a, b)$ are equal, i.e.,
$\lim_{(x,y)\to(a,b)} f(x, y) = f(a, b)$.
If all three conditions are met, we write $\lim_{(x,y)\to(a,b)} f(x, y) = f(a, b)$, and say that $f$ is
continuous at $(a, b)$. A function is said to be continuous on a domain $D \subset \mathbb{R}^2$ if it is
continuous at every point $(x, y) \in D$. More formally, we can define continuity of
functions as below:
That is, for every point $(x, y)$ within a spherical neighborhood of radius $\delta$ centered at
$(a, b)$, the function value $f(x, y)$ lies within an interval of radius $\varepsilon$ around $f(a, b)$. This
formalizes the intuitive idea that small changes in input lead to small changes in output
near the point.
Continuity on a Domain: A function $f : \mathbb{R}^2 \to \mathbb{R}$ is said to be continuous on a
domain $D \subseteq \mathbb{R}^2$ if it is continuous at every point $(x, y) \in D$. That is, for every point
$(a, b) \in D$, the $\varepsilon$-$\delta$ condition of continuity is satisfied. In such cases, we simply say that
$f$ is continuous on $D$.
Having understood the definition of continuity, it is natural to ask how continuity behaves un-
der standard operations like addition, multiplication, and division of functions. Fortunately,
continuity is preserved under these operations, as stated in the following fundamental result.
Theorem 1.6.1 (Algebra of Continuous Functions:). Let f (x, y) and g(x, y) be functions
that are continuous at a point (a, b) ∈ R2 . Then the following functions are also continuous
at (a, b):
Furthermore, all polynomial functions in $x$ and $y$ are continuous. Additionally, a composition
of continuous functions is also continuous at $(a, b)$.
Many commonly encountered functions in mathematics are continuous within their domains.
For instance, polynomial functions such as $f(x, y) = x^2 + y^2$ are continuous everywhere in
$\mathbb{R}^2$. Rational functions like $\dfrac{x^2 + y^2}{x^2 + y^2 + 1}$ are also continuous at all points where the denominator
is nonzero. Similarly, standard trigonometric, exponential, and logarithmic functions are
continuous wherever they are defined. Discontinuities typically arise either at points where
the function is undefined or where the limit does not coincide with the function value. To
deepen our understanding, we now explore several examples that highlight when functions
are continuous at a specific point or throughout their entire domain.
ILLUSTRATIVE EXAMPLES
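Example 26. Using the $\varepsilon$-$\delta$ definition, prove that $f(x, y) = x + y$ is continuous at the point
$(1, 2)$.
Solution. We compute the value of the function at $(1, 2)$, i.e., $f(1, 2) = 1 + 2 = 3$. In order to
show that the function is continuous at $(1, 2)$, we must show that for every $\varepsilon > 0$, there
exists a $\delta > 0$ such that
\[ |f(x, y) - f(1, 2)| < \varepsilon \]
whenever
\[ |x - 1| < \delta \quad \text{and} \quad |y - 2| < \delta. \]
Now, let $\varepsilon > 0$ be given. Observe that
\[ |f(x, y) - f(1, 2)| = |(x - 1) + (y - 2)| \leq |x - 1| + |y - 2| < 2\delta, \]
so the choice $\delta = \varepsilon/2$ gives $|f(x, y) - f(1, 2)| < \varepsilon$.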
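Example 27. Using the $\varepsilon$-$\delta$ definition, prove that $f(x, y) = 2x + y^2$ is continuous at the
point $(1, 1)$.
Solution. The function is $f(x, y) = 2x + y^2$ and at the point $(1, 1)$, we have $f(1, 1) = 3$, so that
\[ |f(x, y) - f(1, 1)| = |2(x - 1) + (y - 1)(y + 1)| \leq 2|x - 1| + |y - 1|\,|y + 1|. \]
To estimate $|y + 1|$, assume $|y - 1| < 1$. This implies that $0 < y < 2$, i.e., $y \in (0, 2)$.
This means that $|y + 1| < 3$. So, under the assumption $\delta \leq 1$, we have
\[ |f(x, y) - f(1, 1)| < 2\delta + 3\delta = 5\delta \leq \varepsilon \quad \text{for } \delta = \min\{1, \varepsilon/5\}, \]
whenever $|x - 1| < \delta$ and $|y - 1| < \delta$. Therefore, we conclude that the function $f(x, y) =
2x + y^2$ is continuous at $(1, 1)$.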
Solution. To show that the function is not continuous at $(0, 0)$, it is enough to show that
the limit
\[ \lim_{(x,y)\to(0,0)} \frac{xy^3}{x^2 + y^6} \]
does not exist. We do this by approaching $(0, 0)$ along different paths and showing that the
resulting limits are not the same.
Approach 1: Along the path $x = y^3$. Then,
\[ \lim_{(x,y)\to(0,0)} f(x, y) = \lim_{(x,y)\to(0,0)} \frac{(y^3)y^3}{(y^3)^2 + y^6} = \lim_{(x,y)\to(0,0)} \frac{y^6}{y^6 + y^6} = \lim_{(x,y)\to(0,0)} \frac{y^6}{2y^6} = \frac{1}{2}. \]
Approach 2: Along the path $x = 0$, i.e., the y-axis. Then,
\[ \lim_{(x,y)\to(0,0)} f(x, y) = \lim_{(x,y)\to(0,0)} \frac{0 \cdot y^3}{0 + y^6} = 0. \]
Since the limits along two different paths are not equal, the limit
$\lim_{(x,y)\to(0,0)} f(x, y)$ does not exist, and hence $f$ is not continuous at $(0, 0)$.
Example 29. Let $f(x, y) = x^2 + 3xy - 2y^2$. Show that $f$ is continuous at $(1, 2)$.
Solution. The value of the function and its limit at $(1, 2)$ are calculated as below:
\[ f(1, 2) = 1^2 + 3(1)(2) - 2(2)^2 = 1 + 6 - 8 = -1 \]
and
\[ \lim_{(x,y)\to(1,2)} f(x, y) = 1^2 + 3(1)(2) - 2(2)^2 = 1 + 6 - 8 = -1. \]
Since the limit equals the functional value, the function is continuous at $(1, 2)$.
Alternatively, the continuity in this particular case can be ascertained just by observing that
the function $f(x, y)$ is a polynomial in $x$ and $y$, and polynomial functions
are continuous everywhere in $\mathbb{R}^2$. Therefore, $f$ is continuous at $(1, 2)$.
Example 30. Let $f(x, y) = \dfrac{x^2 - y^2}{x^2 + y^2 + 1}$. Show that $f$ is continuous everywhere.
Solution. The functional value at a point $(a, b)$ is given by
\[ f(a, b) = \frac{a^2 - b^2}{a^2 + b^2 + 1}. \]
Similarly, the limit at $(a, b)$ is
\[ \lim_{(x,y)\to(a,b)} \frac{x^2 - y^2}{x^2 + y^2 + 1} = \frac{a^2 - b^2}{a^2 + b^2 + 1}. \]
Since the function is defined and the limit equals the function value at every point, $f$ is
continuous everywhere in $\mathbb{R}^2$.
Example 31. Let $f(x, y) = \sqrt{x^2 + y^2}$. Show that $f$ is continuous at all $(x, y) \in \mathbb{R}^2$.
Solution. To prove the continuity of the function $f(x, y) = \sqrt{x^2 + y^2}$ at any point
$(a, b) \in \mathbb{R}^2$, observe that
\[ \lim_{(x,y)\to(a,b)} \sqrt{x^2 + y^2} = \sqrt{a^2 + b^2}, \]
which equals $f(a, b)$. Hence $f$ is continuous at $(a, b)$.
Alternatively, we may go for some computations. To check the continuity of the function f
at a particular point (0, 0), we compute its value and limit both at the point (0, 0). We find
f (0, 0) = sin(0 · 0) = sin(0) = 0,
and
lim sin(xy) = sin(0) = 0.
(x,y)→(0,0)
Since the limit of f (x, y) as (x, y) → (0, 0) exists and equals the value of the function at that
point, we conclude that f is continuous at (0, 0).
Example 33. Let $f(x, y) = e^{x^2 + y^2}$. Show that $f$ is continuous at every point in $\mathbb{R}^2$.
Solution. We aim to show that the function $f(x, y) = e^{x^2 + y^2}$ is continuous at all points
of $\mathbb{R}^2$.
Alternatively, we may compute the value as well as the limit of the function at any point
$(a, b) \in \mathbb{R}^2$. The value of the function $f$ at $(a, b)$ is given by
\[ f(a, b) = e^{a^2 + b^2}, \]
and the limit at $(a, b)$ is given by
\[ \lim_{(x,y)\to(a,b)} e^{x^2 + y^2} = e^{a^2 + b^2}. \]
Note that the value and the limit of the function at the point $(a, b)$ both exist and are equal. Hence,
the function is continuous at $(a, b)$. Since this is true for any $(a, b) \in \mathbb{R}^2$, we
conclude that $f$ is continuous everywhere in $\mathbb{R}^2$.
Chapter 2
Partial Derivatives
In this chapter, we extend the concept of ordinary derivatives to functions of multiple vari-
ables. When dealing with scalar fields such as f (x, y), it becomes essential to understand how
the function changes with respect to each variable independently, as well as in arbitrary di-
rections within the domain. This leads to the foundational concepts of partial derivatives and
directional derivatives, which quantify the rate of change of a multivariable function along
specified directions. These tools are crucial for understanding gradients, tangent planes, and
local linear approximations in higher dimensions.
2.1 Introduction
For a function $y = f(x)$ of a single variable, its derivative $f'(x)$ is defined as
\[ f'(x) = \lim_{h\to 0} \frac{f(x + h) - f(x)}{h}. \]
At a point $x = a$, the derivative $f'(a)$ is then written as
\[ f'(a) = \lim_{h\to 0} \frac{f(a + h) - f(a)}{h}. \]
This expression represents the rate of change of the function f at the point x = a. Geomet-
rically, it corresponds to moving from a point A(x = a) along the x-axis to a nearby point
B(x = a + h), which lies at a distance h from A. We then compute the ratio of the change
in the function values to the change in the input variable and examine the limiting behavior
of this ratio as h → 0.
In the case of a function of a single variable, the direction of movement from the point x = a
is fixed. In fact, one can move just along the x-axis, or in the direction of the unit vector
î. However, for functions of several variables, there are infinitely many directions to proceed
from a given point.
For instance, consider a function $z = f(x, y)$. From any point $(a, b)$, we can move not only
in the directions of î (along the x-axis) or ĵ (along the y-axis), but also in any arbitrary
direction in the plane, determined by a unit vector v̂. This leads us to the concept of the
directional derivative, which measures the rate of change of the function $f(x, y)$ in the
direction of a given unit vector v̂. Of particular interest are the directional derivatives in
the directions of î and ĵ, which correspond to changes along the x and y-axes, respectively.
Figure 2.1.2: Infinitely many directions in the plane from point (a, b)
Thus, while the derivative in one variable captures change along a single path, the framework
of directional and partial derivatives in multivariable calculus enables us to explore how
functions behave across all directions from a given point.
2.2 Partial Derivative Along x-Axis
To begin with, let us consider a point $p = (a, b)$ in the domain of the function $f(x, y)$. We are
interested in analyzing how the function $f$ behaves as we move away from $p$ in the direction
of the x-axis, which corresponds to the unit vector $i$. For notational convenience, we denote
the unit vector $i$ by the symbol $e_1$. Thus, we have
\[ e_1 = i = (1, 0). \]
This choice simplifies expressions and facilitates computation while maintaining clarity in
the directional interpretation.
Next, we take a nearby point $q$ located at a small distance $h$ from $p$ in the direction of $i$.
The coordinates of this new point are
\[ q = p + h e_1 = (a + h, b). \]
We then form the difference quotient
\[ \frac{f(a + h, b) - f(a, b)}{h}, \]
which measures the average rate of change of the function between the points $p$ and $q$ along
the x-direction. Taking the limit as $h \to 0$, we obtain the partial derivative of $f$ with respect
to $x$ at the point $(a, b)$:
\[ \frac{\partial f}{\partial x}(a, b) = \lim_{h\to 0} \frac{f(a + h, b) - f(a, b)}{h}. \]
figure (2.2.1), intersect the surface in a curve. The curve of intersection of the surface with
this plane is described by
z = f (x, b).
This curve lies entirely in the vertical plane y = b, and its shape is determined by how the
function f changes as x varies, with y held constant.
Let us define a single-variable function g(x) = f (x, b), so that the curve becomes
z = g(x).
From single-variable calculus, we know that the slope of the tangent line to this curve at the
point $x = a$ is given by the derivative $\left.\dfrac{dg}{dx}\right|_{x=a}$. Let us compute this derivative:
\[ \left.\frac{dg}{dx}\right|_{x=a} = \lim_{h\to 0} \frac{g(a + h) - g(a)}{h} = \lim_{h\to 0} \frac{f(a + h, b) - f(a, b)}{h} = \frac{\partial f}{\partial x}(a, b). \]
This shows that the partial derivative $\dfrac{\partial f}{\partial x}(a, b)$ represents the slope of the curve formed by
slicing the surface $z = f(x, y)$ with the plane $y = b$. In other words, it measures how the
surface rises or falls in the x-direction at the point $(a, b)$, while keeping $y$ fixed.
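The limit defining the partial derivative can be approximated by evaluating the difference quotient for shrinking values of $h$. The sketch below uses the sample function $f(x, y) = x^2 + xy^3$ from Example 1 at the point $(2, 3)$; both the choice of function and the name `forward_quotient` are assumptions made for illustration.

```python
def f(x, y):
    return x**2 + x * y**3

def forward_quotient(f, a, b, h):
    """Average rate of change of f between (a, b) and (a + h, b), with y held at b."""
    return (f(a + h, b) - f(a, b)) / h

a, b = 2.0, 3.0
exact = 2 * a + b**3   # differentiating x^2 + x*y^3 in x with y fixed gives 2x + y^3

quotients = [forward_quotient(f, a, b, h) for h in (1e-1, 1e-2, 1e-3, 1e-4)]
for h, q in zip((1e-1, 1e-2, 1e-3, 1e-4), quotients):
    print(h, q, abs(q - exact))
```

For this particular function the quotient equals $2a + b^3 + h$ exactly, so the error printed in the last column shrinks in proportion to $h$.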
Summary: The partial derivative fx (a, b) gives the slope of the surface in the x-direction
at the point (a, b). Geometrically, it is the slope of the tangent to the curve obtained by
slicing the surface with the plane y = b. The corresponding tangent vector to this curve at
x = a is
(1, 0, fx (a, b)),
which will be useful later when we discuss tangent planes to surfaces. Furthermore, this
vector forms an angle φ with the x-axis in three-dimensional space, and the tangent of this
angle gives a geometric realization of the partial derivative in the x-direction.
Let the unit vector along the x-axis be i = (1, 0, 0), and consider the angle φ between i and
the tangent vector v = (1, 0, fx (a, b)). Then
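\[
\cos\varphi = \frac{\mathbf{i}\cdot\mathbf{v}}{\|\mathbf{i}\|\,\|\mathbf{v}\|} = \frac{1}{\sqrt{1+f_x(a,b)^2}}.
\]
From this, using $\sin^2\varphi = 1-\cos^2\varphi$ and measuring φ as a signed angle of inclination (so
that sin φ has the same sign as fx (a, b)), we calculate
\[
\sin\varphi = \frac{f_x(a,b)}{\sqrt{1+f_x(a,b)^2}}.
\]
Therefore, the tangent of the angle φ is
\[
\tan\varphi = \frac{\sin\varphi}{\cos\varphi} = \frac{f_x(a,b)}{\sqrt{1+f_x(a,b)^2}}\cdot\sqrt{1+f_x(a,b)^2} = f_x(a,b).
\]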
This confirms that the partial derivative fx (a, b) corresponds to the tangent of the angle
between the surface curve and the x-axis in the vertical plane y = b, thereby providing an
intuitive geometric interpretation of the rate of change in the x-direction.
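The limit definition and the angle formula above can be checked numerically. The following is a minimal Python sketch, not part of the original text: the helper name `partial_x`, the step size `h`, and the sample function f(x, y) = x^2 + y^2 are illustrative assumptions. It approximates fx (a, b) by a central difference quotient and compares cos φ, with φ = arctan fx (a, b), against the formula derived above.

```python
import math

def partial_x(f, a, b, h=1e-6):
    """Central-difference approximation of the partial derivative f_x(a, b)."""
    return (f(a + h, b) - f(a - h, b)) / (2 * h)

# Illustrative sample function (an assumption, chosen for simplicity).
def f(x, y):
    return x**2 + y**2

a, b = 1.0, 2.0
fx = partial_x(f, a, b)                 # exact value is 2a = 2
phi = math.atan(fx)                     # signed angle of inclination
cos_from_formula = 1.0 / math.sqrt(1.0 + fx**2)

print(f"f_x({a}, {b}) is approximately {fx:.6f}")
print(f"cos(phi) = {math.cos(phi):.6f}, formula gives {cos_from_formula:.6f}")
```

A central difference is used here instead of the one-sided quotient from the definition because it is usually more accurate for the same step size; both converge to the same limit when the partial derivative exists.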
2.3. PARTIAL DERIVATIVE ALONG y-AXIS
As before, we begin with a point p = (a, b) in the domain of a function f (x, y), and we
examine how the function changes as we move from p in the direction of the positive y-axis.
This direction is represented by the unit vector j = e2 = (0, 1).
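We choose a nearby point q, located a small distance k away from p in the y-direction. The
coordinates of this point are given by q = (a, b + k). Proceeding exactly as in the
x-direction, the partial derivative of f with respect to y at (a, b) is the limit
\[
\frac{\partial f}{\partial y}(a,b) = \lim_{k\to 0}\frac{f(a,b+k)-f(a,b)}{k},
\]
provided this limit exists. Geometrically, the vertical plane x = a intersects the surface
z = f (x, y) in the curve
z = f (a, y).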
This curve lies entirely in the vertical plane x = a, and its shape is determined by how the
function f changes as y varies, with x held constant.
Let us define a single-variable function h(y) = f (a, y), so that the curve becomes
z = h(y).
From single-variable calculus, we know that the slope of the tangent line to this curve at the
point y = b is given by the derivative $\left.\frac{dh}{dy}\right|_{y=b}$. Let us compute this derivative:
\[
\begin{aligned}
\left.\frac{dh}{dy}\right|_{y=b} &= \lim_{k\to 0}\frac{h(b+k)-h(b)}{k}\\
&= \lim_{k\to 0}\frac{f(a,b+k)-f(a,b)}{k}\\
&= \frac{\partial f}{\partial y}(a,b).
\end{aligned}
\]
This curve lies in three-dimensional space. Parametrizing it by y as $\mathbf{d}(y) = (a, y, f(a,y))$,
its tangent vector is given by the derivative
\[
\mathbf{d}'(y) = \left(0,\; 1,\; \frac{\partial f}{\partial y}(a,y)\right).
\]
Evaluating this at y = b, we get the tangent vector at the point (a, b, f (a, b)) as
\[
\mathbf{d}'(b) = \left(0,\; 1,\; \frac{\partial f}{\partial y}(a,b)\right).
\]
This vector lies in the tangent plane to the surface and points in the direction of increasing
y, holding x constant.
Summary: The partial derivative fy (a, b) gives the slope of the surface in the y-direction at
the point (a, b). Geometrically, it is the slope of the tangent to the curve obtained by slicing
the surface with the plane x = a. The corresponding tangent vector to this curve at y = b is
(0, 1, fy (a, b)),
which will also be useful when we discuss the tangent plane to a surface. Furthermore, this
vector forms an angle θ with the y-axis in three-dimensional space, and the tangent of this
angle provides a geometric interpretation of the partial derivative.
To make this precise, observe that the unit vector in the y-direction is j = (0, 1, 0), and
the angle θ between this vector and the tangent vector v = (0, 1, fy (a, b)) satisfies
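\[
\cos\theta = \frac{\mathbf{j}\cdot\mathbf{v}}{\|\mathbf{j}\|\,\|\mathbf{v}\|} = \frac{1}{\sqrt{1+f_y(a,b)^2}}.
\]
Then, applying the identity $\tan\theta = \frac{\sin\theta}{\cos\theta}$ and computing $\sin\theta$ from $\sin^2\theta = 1-\cos^2\theta$
(with θ measured as a signed angle of inclination, so that sin θ has the same sign as fy (a, b)), we obtain
\[
\sin^2\theta = 1-\left(\frac{1}{\sqrt{1+f_y(a,b)^2}}\right)^2 = \frac{f_y(a,b)^2}{1+f_y(a,b)^2},
\qquad
\sin\theta = \frac{f_y(a,b)}{\sqrt{1+f_y(a,b)^2}}.
\]
Therefore,
\[
\tan\theta = \frac{\sin\theta}{\cos\theta} = \frac{f_y(a,b)}{\sqrt{1+f_y(a,b)^2}}\cdot\sqrt{1+f_y(a,b)^2} = f_y(a,b).
\]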
This confirms that the partial derivative fy (a, b) is, in fact, the tangent of the angle that the
curve makes with the y-axis in the vertical plane x = a, offering a clear geometric interpre-
tation of directional change on the surface.
We now compute partial derivatives for several functions to illustrate the basic techniques
and applications of partial differentiation.
ILLUSTRATIVE EXAMPLES
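Example 34. If $f(x,y) = x^2 + y^2$, then find $f_x(x,y)$ and $f_y(x,y)$ from the definition.
Solution. From the definition of the partial derivative with respect to x,
\[
\begin{aligned}
f_x(x,y) &= \lim_{h\to 0}\frac{f(x+h,y)-f(x,y)}{h}\\
&= \lim_{h\to 0}\frac{(x+h)^2+y^2-(x^2+y^2)}{h}\\
&= \lim_{h\to 0}\frac{(x^2+2xh+h^2+y^2)-(x^2+y^2)}{h}\\
&= \lim_{h\to 0}\frac{2xh+h^2}{h}\\
&= 2x.
\end{aligned}
\]
Similarly, with the increment k taken in the y-direction,
\[
\begin{aligned}
f_y(x,y) &= \lim_{k\to 0}\frac{f(x,y+k)-f(x,y)}{k}\\
&= \lim_{k\to 0}\frac{x^2+(y+k)^2-(x^2+y^2)}{k}\\
&= \lim_{k\to 0}\frac{(x^2+y^2+2yk+k^2)-(x^2+y^2)}{k}\\
&= 2y.
\end{aligned}
\]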
Example 35. Let $f(x,y) = x^2y + 3xy^3$. Compute $f_x(1,2)$ and $f_y(1,3)$.
Solution. To compute the partial derivative with respect to x, treat y as a constant:
\[
f_x(x,y) = \frac{\partial}{\partial x}\left(x^2y+3xy^3\right) = 2xy+3y^3.
\]
Therefore, we get
\[
f_x(1,2) = 2(1)(2)+3(8) = 28.
\]
To compute the partial derivative with respect to y, treat x as a constant:
\[
f_y(x,y) = \frac{\partial}{\partial y}\left(x^2y+3xy^3\right) = x^2+9xy^2.
\]
Therefore, we get
\[
f_y(1,3) = 1+9(1)(9) = 82.
\]
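Solution.
\[
f_x = \frac{\partial}{\partial x}\left(\frac{x}{y}+\frac{y}{x}\right) = \frac{1}{y}-\frac{y}{x^2},
\]
\[
f_y = \frac{\partial}{\partial y}\left(\frac{x}{y}+\frac{y}{x}\right) = -\frac{x}{y^2}+\frac{1}{x}.
\]
Example 39. Let $f(x,y) = \tan^{-1}\left(\frac{y}{x}\right)$. Compute $f_x(x,y)$ and $f_y(x,y)$.
Solution. By the chain rule,
\[
f_x(x,y) = \frac{1}{1+(y/x)^2}\cdot\left(-\frac{y}{x^2}\right) = \frac{-y}{x^2+y^2},
\]
\[
f_y(x,y) = \frac{1}{1+(y/x)^2}\cdot\frac{1}{x} = \frac{x}{x^2+y^2}.
\]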
all the partial derivatives exist at (0, 0), but f (x, y) is not continuous at (0, 0).
Solution. First, we compute partial derivative using first principle. The partial derivative
fx (0, 0) with respect to x is given by
2.4. TANGENT PLANE TO A SURFACE
But this isn’t just a pretty line. It tells you how the function is behaving right there: whether
it’s going up, going down, or leveling off. It’s your go-to tool for understanding local behavior.
In fact, it gives the best linear approximation of the curve near that point.
But what happens when we move from curves to surfaces? Suppose we have a surface defined
by a function of two variables, z = f (x, y). This surface bends in multiple directions - not
just one. If we want to understand how the surface behaves near a point, we need a kind of
plane, a flat surface, that plays the same role the tangent line did for a curve. This is the
tangent plane.
To build an intuitive sense of what the tangent plane is, we can think of every smooth curve
that lies on the surface and passes through the point P (a, b, f (a, b)), as shown in figure
(2.4.1). Each such curve has a tangent line at P (see figure 2.4.2) and, remarkably, all these
tangent lines, despite pointing in different directions, lie in a single plane (see figure 2.4.3).
This plane is known as the tangent plane.
Intuitive Definition: The tangent plane at a point on a smooth surface is the unique plane
that contains tangent vectors to all smooth curves on the surface that pass through
that point.
In essence, the tangent plane is the best flat approximation of the surface near that
point, just as the tangent line is the best straight-line approximation to a curve.
Now, consider a surface given by z = f (x, y), and let P = (a, b, f (a, b)) be a point on this
surface. Our goal is to find the equation of the tangent plane to the surface at P . To do this,
we need a vector N that is normal to the tangent plane at P , known as the surface normal.
To find such a normal vector, we first construct two non-parallel vectors lying in the tangent
plane. These can be obtained from the tangent vectors to curves formed by intersecting the
surface z = f (x, y) with the vertical planes y = b and x = a, respectively. These curves lie
entirely on the surface and pass through the point P , and their tangent vectors at P are:
T1 = (1, 0, fx (a, b)) and T2 = (0, 1, fy (a, b)).
These two vectors span the tangent plane at P , and their cross product gives a vector normal
to this plane:
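\[
\mathbf{N} = \mathbf{T}_1\times\mathbf{T}_2 =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k}\\
1 & 0 & f_x(a,b)\\
0 & 1 & f_y(a,b)
\end{vmatrix}.
\]
Expanding the determinant, we get:
\[
\begin{aligned}
\mathbf{N} &= \mathbf{i}\begin{vmatrix}0 & f_x(a,b)\\ 1 & f_y(a,b)\end{vmatrix}
- \mathbf{j}\begin{vmatrix}1 & f_x(a,b)\\ 0 & f_y(a,b)\end{vmatrix}
+ \mathbf{k}\begin{vmatrix}1 & 0\\ 0 & 1\end{vmatrix}\\
&= -f_x(a,b)\,\mathbf{i} - f_y(a,b)\,\mathbf{j} + \mathbf{k}.
\end{aligned}
\]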
In vector form, the surface normal is
\[
\mathbf{N} = \left(-f_x(a,b),\; -f_y(a,b),\; 1\right).
\]
Using this normal vector and the point P = (a, b, f (a, b)), the equation of the tangent plane
is:
\[
-f_x(a,b)(x-a) - f_y(a,b)(y-b) + 1\cdot(z-f(a,b)) = 0.
\]
Rearranging terms gives the familiar form:
\[
z = f(a,b) + f_x(a,b)(x-a) + f_y(a,b)(y-b),
\]
which is the equation of the tangent plane to the surface z = f (x, y) at the point (a, b, f (a, b)).
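The construction above translates directly into a few lines of code. The following Python sketch is illustrative only: the function name `tangent_plane` and the choice of the paraboloid z = x^2 + y^2 at the point (1, 2, 5) are assumptions made for the demonstration. It builds the tangent plane from the two partial derivatives and checks that the normal N = (−fx, −fy, 1) is perpendicular to both tangent vectors T1 and T2.

```python
def tangent_plane(f, fx, fy, a, b):
    """Return (plane, normal) for the surface z = f(x, y) at (a, b, f(a, b))."""
    c = f(a, b)
    p, q = fx(a, b), fy(a, b)

    def plane(x, y):
        # z = f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b)
        return c + p * (x - a) + q * (y - b)

    normal = (-p, -q, 1.0)  # N = T1 x T2
    return plane, normal

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Illustrative surface: the paraboloid z = x^2 + y^2 with hand-computed partials.
f = lambda x, y: x**2 + y**2
fx = lambda x, y: 2 * x
fy = lambda x, y: 2 * y

plane, normal = tangent_plane(f, fx, fy, 1.0, 2.0)
t1 = (1.0, 0.0, fx(1.0, 2.0))   # tangent vector in the plane y = b
t2 = (0.0, 1.0, fy(1.0, 2.0))   # tangent vector in the plane x = a

print("normal:", normal)
print("plane at (1, 2):", plane(1.0, 2.0))
print("N . T1 =", dot(normal, t1), " N . T2 =", dot(normal, t2))
```

Here the plane works out to z = 2x + 4y − 5, so `plane(0.0, 0.0)` returns −5.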
ILLUSTRATIVE EXAMPLES
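Example 41. Find the equation of the tangent plane to the surface $z = x^2 + y^2$ at the point
(1, 2, 5).
Solution. Here $f(x,y) = x^2+y^2$, so $f_x(x,y) = 2x$ and $f_y(x,y) = 2y$. At the given point,
\[
f_x(1,2) = 2, \qquad f_y(1,2) = 4.
\]
The tangent plane is therefore $z = 5 + 2(x-1) + 4(y-2)$. Simplifying, we get
\[
z = 2x + 4y - 5.
\]
Example 42. Find the tangent plane to the surface $z = e^{xy}$ at the point (0, 1, 1).
Solution. Here $f_x(x,y) = y\,e^{xy}$ and $f_y(x,y) = x\,e^{xy}$, so
\[
f_x(0,1) = 1\cdot e^{0} = 1, \qquad f_y(0,1) = 0\cdot e^{0} = 0.
\]
The tangent plane is therefore $z = 1 + 1\cdot(x-0) + 0\cdot(y-1)$, that is, $z = 1 + x$.
Example 43. Let $f(x,y) = \log(x^2+y^2)$. Find the equation of the tangent plane at the
point (1, 1, log 2).
Solution. Here $f_x(x,y) = \frac{2x}{x^2+y^2}$ and $f_y(x,y) = \frac{2y}{x^2+y^2}$, so $f_x(1,1) = 1$ and $f_y(1,1) = 1$.
The tangent plane is $z = \log 2 + (x-1) + (y-1)$, or simplifying:
\[
z = \log 2 + x + y - 2.
\]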
Example 44. Find the equation of the tangent plane to the surface $z = \sqrt{1-x^2-y^2}$ at
the point (x, y) = (a, b), where $a^2 + b^2 < 1$. Interpret your calculations geometrically.
Solution. Let $f(x,y) = \sqrt{1-x^2-y^2}$. To find the tangent plane at (a, b, f (a, b)), we first
compute the partial derivatives:
\[
f_x(a,b) = \frac{-a}{\sqrt{1-a^2-b^2}}, \qquad f_y(a,b) = \frac{-b}{\sqrt{1-a^2-b^2}}.
\]
Also,
\[
f(a,b) = \sqrt{1-a^2-b^2} = c \ (\text{say}),
\]
so that $f_x(a,b) = -a/c$ and $f_y(a,b) = -b/c$. Now use the formula for the tangent plane:
\[
z = c - \frac{a}{c}(x-a) - \frac{b}{c}(y-b).
\]
The normal vector to this plane is $\mathbf{N} = (-f_x(a,b), -f_y(a,b), 1) = (a/c,\, b/c,\, 1)$, while the
position vector of the point P = (a, b, c) is
\[
\mathbf{r} = a\,\mathbf{i} + b\,\mathbf{j} + c\,\mathbf{k}.
\]
Since c > 0 on the upper hemisphere and $\mathbf{r} = c\,\mathbf{N}$, the vectors r and N are parallel and point
in the same direction. This means
the normal vector points outward from the surface, and is aligned with the radius vector of
the sphere at that point. Therefore, the tangent plane at P is perpendicular to the radius
vector r.
This leads to a fundamental geometric fact: At any point on the surface of a sphere,
the radius vector from the center to that point is perpendicular to the tangent
plane. In other words, the radial direction defines the outward normal to the surface.
where now $c = -\sqrt{1-a^2-b^2} < 0$. In this case, even though the formula gives a normal
vector, its direction is opposite to the position vector r = (a, b, c). Hence, the computed
normal points inward, toward the center of the sphere, rather than outward.
To maintain geometric and physical consistency (e.g., when defining flux or orientation for
surface integrals), we reverse the direction of the normal on the lower hemisphere and define
the outward normal as
\[
\mathbf{N}_{\text{out}} = \left(f_x(a,b),\; f_y(a,b),\; -1\right).
\]
This ensures that the surface normal always points outward, away from the center of the
sphere, regardless of whether we are on the upper or lower hemisphere.
Conclusion: While standard formulas like (−fx , −fy , 1) yield inward-pointing normals on
the lower hemisphere, we explicitly reverse their direction to preserve the outward-pointing
convention. This aligns the normal vector with the position vector throughout the sphere,
and ensures that the radius vector remains perpendicular to the tangent plane at every point
on the surface.
Let f (x, y) be a function of two variables. The second-order partial derivatives of f (x, y)
are obtained by differentiating the first-order partial derivatives again with respect to either
x or y. There are four such second order derivatives:
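• Second derivative with respect to x: $f_{xx}(x,y) = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2}$.
• Second derivative with respect to y: $f_{yy}(x,y) = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2}$.
• Mixed partial derivative x then y: $f_{xy}(x,y) = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x}$.
• Mixed partial derivative y then x: $f_{yx}(x,y) = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y}$.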
ILLUSTRATIVE EXAMPLES
Example 45. Let $f(x,y) = x^2y + 3xy^3$. Compute the second-order partial derivatives.
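Solution. Let us compute the mixed partial derivatives $f_{xy}$ and $f_{yx}$ and verify that they are
equal. First, we compute $f_x(x,y)$, treating y as a constant:
\[
f_x(x,y) = \frac{\partial}{\partial x}\sin(xy^3) = \cos(xy^3)\cdot\frac{\partial}{\partial x}(xy^3) = y^3\cos(xy^3).
\]
Now, in order to compute $f_{xy}(x,y)$, we differentiate $f_x$ with respect to y:
\[
\begin{aligned}
f_{xy}(x,y) &= \frac{\partial}{\partial y}\left[\cos(xy^3)\cdot y^3\right]\\
&= \frac{\partial}{\partial y}\left(\cos(xy^3)\right)\cdot y^3 + \cos(xy^3)\cdot\frac{\partial}{\partial y}(y^3)\\
&= \left[-\sin(xy^3)\cdot 3xy^2\right]\cdot y^3 + \cos(xy^3)\cdot 3y^2\\
&= -3xy^5\sin(xy^3) + 3y^2\cos(xy^3).
\end{aligned}
\]
Next, treating x as a constant, we compute
\[
f_y(x,y) = \frac{\partial}{\partial y}\sin(xy^3) = \cos(xy^3)\cdot\frac{\partial}{\partial y}(xy^3) = \cos(xy^3)\cdot 3xy^2.
\]
Now, we compute $f_{yx}(x,y)$ as below:
\[
\begin{aligned}
f_{yx}(x,y) &= \frac{\partial}{\partial x}\left[3xy^2\cos(xy^3)\right]\\
&= \frac{\partial}{\partial x}(3xy^2)\cdot\cos(xy^3) + 3xy^2\cdot\frac{\partial}{\partial x}\cos(xy^3)\\
&= 3y^2\cos(xy^3) + 3xy^2\cdot\left(-\sin(xy^3)\cdot y^3\right)\\
&= 3y^2\cos(xy^3) - 3xy^5\sin(xy^3).
\end{aligned}
\]
Comparing the two results, we see that $f_{xy}(x,y) = f_{yx}(x,y)$.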
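The closed form obtained for the mixed partial derivative of sin(xy^3) can be sanity-checked numerically. The sketch below is an illustration under stated assumptions: the helper `mixed_partial`, the step size, and the test point (0.7, 1.1) are not from the text. It uses a symmetric four-point difference stencil, which treats x and y on an equal footing and therefore approximates both f_xy and f_yx at once.

```python
import math

def f(x, y):
    return math.sin(x * y**3)

def fxy_exact(x, y):
    # Closed form derived in the solution: 3y^2 cos(xy^3) - 3xy^5 sin(xy^3)
    return 3 * y**2 * math.cos(x * y**3) - 3 * x * y**5 * math.sin(x * y**3)

def mixed_partial(g, x, y, h=1e-4):
    """Symmetric four-point approximation of the mixed second partial derivative."""
    return (g(x + h, y + h) - g(x + h, y - h)
            - g(x - h, y + h) + g(x - h, y - h)) / (4 * h * h)

x0, y0 = 0.7, 1.1          # arbitrary test point (an assumption)
approx = mixed_partial(f, x0, y0)
exact = fxy_exact(x0, y0)
print(f"numerical: {approx:.6f}, closed form: {exact:.6f}")
```

Agreement at a handful of points is not a proof, but it is a quick way to catch sign or exponent slips in a hand computation.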
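\[
u_z = \frac{\partial}{\partial z}\left(e^{xyz}\right) = e^{xyz}\cdot\frac{\partial}{\partial z}(xyz) = e^{xyz}\cdot xy.
\]
Next, differentiate $u_z$ with respect to y to obtain
\[
u_{zy} = \frac{\partial}{\partial y}\left(xy\cdot e^{xyz}\right) = \frac{\partial}{\partial y}(xy)\cdot e^{xyz} + xy\cdot\frac{\partial}{\partial y}\left(e^{xyz}\right).
\]
This gives
\[
u_{zy} = x\cdot e^{xyz} + xy\cdot\left(e^{xyz}\cdot xz\right) = (x + x^2yz)\,e^{xyz}.
\]
Finally, differentiating $u_{zy}$ with respect to x, we get
\[
\begin{aligned}
u_{zyx} &= \frac{\partial}{\partial x}\left[x\,e^{xyz} + x^2yz\,e^{xyz}\right]\\
&= \frac{\partial}{\partial x}(x)\cdot e^{xyz} + x\cdot\frac{\partial}{\partial x}\left(e^{xyz}\right) + \frac{\partial}{\partial x}(x^2yz)\cdot e^{xyz} + x^2yz\cdot\frac{\partial}{\partial x}\left(e^{xyz}\right)\\
&= 1\cdot e^{xyz} + x\cdot\left(e^{xyz}\cdot yz\right) + 2xyz\cdot e^{xyz} + x^2yz\cdot\left(e^{xyz}\cdot yz\right)\\
&= e^{xyz} + xyz\,e^{xyz} + 2xyz\,e^{xyz} + x^2y^2z^2\,e^{xyz}\\
&= (1 + 3xyz + x^2y^2z^2)\,e^{xyz}.
\end{aligned}
\]
Hence,
\[
\frac{\partial^3 u}{\partial x\,\partial y\,\partial z} = (1 + 3xyz + x^2y^2z^2)\,e^{xyz},
\]
as required.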
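1. $x\frac{\partial u}{\partial x} + y\frac{\partial u}{\partial y} + z\frac{\partial u}{\partial z} = -u$.
2. $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0$.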
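\[
\frac{\partial u}{\partial x} = \frac{\partial}{\partial x}(x^2+y^2+z^2)^{-1/2} = -\frac{1}{2}(x^2+y^2+z^2)^{-3/2}\cdot 2x = -x(x^2+y^2+z^2)^{-3/2}.
\]
Similarly,
\[
\frac{\partial u}{\partial y} = -y(x^2+y^2+z^2)^{-3/2}, \qquad \frac{\partial u}{\partial z} = -z(x^2+y^2+z^2)^{-3/2}.
\]
Now compute the expression:
\[
\begin{aligned}
x\frac{\partial u}{\partial x} + y\frac{\partial u}{\partial y} + z\frac{\partial u}{\partial z}
&= -x^2(x^2+y^2+z^2)^{-3/2} - y^2(x^2+y^2+z^2)^{-3/2} - z^2(x^2+y^2+z^2)^{-3/2}\\
&= -(x^2+y^2+z^2)\cdot(x^2+y^2+z^2)^{-3/2}\\
&= -(x^2+y^2+z^2)^{-1/2} = -u.
\end{aligned}
\]
Hence,
\[
x\frac{\partial u}{\partial x} + y\frac{\partial u}{\partial y} + z\frac{\partial u}{\partial z} = -u.
\]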
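\[
\frac{\partial^2 u}{\partial x^2} = \frac{\partial}{\partial x}\left[-x(x^2+y^2+z^2)^{-3/2}\right].
\]
Using the product rule:
\[
\begin{aligned}
\frac{\partial^2 u}{\partial x^2}
&= -\frac{\partial}{\partial x}(x)\cdot(x^2+y^2+z^2)^{-3/2} - x\cdot\frac{\partial}{\partial x}(x^2+y^2+z^2)^{-3/2}\\
&= -(x^2+y^2+z^2)^{-3/2} - x\cdot\left(-\frac{3}{2}(x^2+y^2+z^2)^{-5/2}\cdot 2x\right)\\
&= -(x^2+y^2+z^2)^{-3/2} + 3x^2(x^2+y^2+z^2)^{-5/2}.
\end{aligned}
\]
Similarly,
\[
\frac{\partial^2 u}{\partial y^2} = -(x^2+y^2+z^2)^{-3/2} + 3y^2(x^2+y^2+z^2)^{-5/2},
\]
\[
\frac{\partial^2 u}{\partial z^2} = -(x^2+y^2+z^2)^{-3/2} + 3z^2(x^2+y^2+z^2)^{-5/2}.
\]
Adding all three:
\[
\begin{aligned}
\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}
&= -3(x^2+y^2+z^2)^{-3/2} + 3(x^2+y^2+z^2)(x^2+y^2+z^2)^{-5/2}\\
&= -3(x^2+y^2+z^2)^{-3/2} + 3(x^2+y^2+z^2)^{-3/2}\\
&= 0.
\end{aligned}
\]
Hence,
\[
\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0.
\]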
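Both identities for u = (x^2 + y^2 + z^2)^(−1/2) can be spot-checked with finite differences. The following Python sketch is illustrative: the helper names `first_partial` and `second_partial`, the step sizes, and the test point (1, 2, 2) are assumptions rather than part of the text.

```python
import math

def u(x, y, z):
    return 1.0 / math.sqrt(x * x + y * y + z * z)

def first_partial(g, point, i, h=1e-5):
    """Central-difference approximation of the first partial in coordinate i."""
    plus, minus = list(point), list(point)
    plus[i] += h
    minus[i] -= h
    return (g(*plus) - g(*minus)) / (2 * h)

def second_partial(g, point, i, h=1e-3):
    """Central-difference approximation of the pure second partial in coordinate i."""
    plus, minus = list(point), list(point)
    plus[i] += h
    minus[i] -= h
    return (g(*plus) - 2 * g(*point) + g(*minus)) / (h * h)

point = (1.0, 2.0, 2.0)     # here x^2 + y^2 + z^2 = 9, so u = 1/3
euler_sum = sum(point[i] * first_partial(u, point, i) for i in range(3))
laplacian = sum(second_partial(u, point, i) for i in range(3))

print(f"x u_x + y u_y + z u_z = {euler_sum:.8f}  (expected -u = {-u(*point):.8f})")
print(f"u_xx + u_yy + u_zz    = {laplacian:.2e}  (expected 0)")
```

The Laplacian comes out as a small nonzero number rather than exactly zero, which reflects truncation and rounding error in the difference quotients, not a failure of the identity.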
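Example 49. If u = f (r), where $r^2 = x^2 + y^2$, show that
\[
\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f''(r) + \frac{1}{r}f'(r).
\]
Solution. Since $r = \sqrt{x^2+y^2}$, u = f (r) is
a function of x and y. We will compute the second partial derivatives using the chain rule.
First, compute the partial derivatives of r:
\[
\frac{\partial r}{\partial x} = \frac{1}{2\sqrt{x^2+y^2}}\cdot 2x = \frac{x}{r}, \qquad \frac{\partial r}{\partial y} = \frac{y}{r}.
\]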
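Now compute $\frac{\partial u}{\partial x}$ using the chain rule:
\[
\frac{\partial u}{\partial x} = \frac{df}{dr}\cdot\frac{\partial r}{\partial x} = f'(r)\cdot\frac{x}{r}.
\]
Differentiate again with respect to x:
\[
\begin{aligned}
\frac{\partial^2 u}{\partial x^2} &= \frac{\partial}{\partial x}\left(f'(r)\cdot\frac{x}{r}\right)\\
&= \frac{\partial f'}{\partial x}\cdot\frac{x}{r} + f'(r)\cdot\frac{\partial}{\partial x}\left(\frac{x}{r}\right).
\end{aligned}
\]
We compute the two terms on the right-hand side separately. For the first term,
\[
\frac{\partial f'}{\partial x} = f''(r)\cdot\frac{\partial r}{\partial x} = f''(r)\cdot\frac{x}{r}.
\]
Similarly, for the second term,
\[
\frac{\partial}{\partial x}\left(\frac{x}{r}\right) = \frac{r\cdot 1 - x\cdot\frac{\partial r}{\partial x}}{r^2} = \frac{r - x\cdot\frac{x}{r}}{r^2} = \frac{r^2-x^2}{r^3}.
\]
So,
\[
\frac{\partial^2 u}{\partial x^2} = f''(r)\cdot\frac{x^2}{r^2} + f'(r)\cdot\frac{r^2-x^2}{r^3}.
\]
Similarly,
\[
\frac{\partial^2 u}{\partial y^2} = f''(r)\cdot\frac{y^2}{r^2} + f'(r)\cdot\frac{r^2-y^2}{r^3}.
\]
Now add the two second derivatives:
\[
\begin{aligned}
\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
&= f''(r)\,\frac{x^2+y^2}{r^2} + f'(r)\,\frac{2r^2-(x^2+y^2)}{r^3}\\
&= f''(r)\cdot\frac{r^2}{r^2} + f'(r)\cdot\frac{r^2}{r^3}\\
&= f''(r) + \frac{1}{r}f'(r).
\end{aligned}
\]
Hence,
\[
\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f''(r) + \frac{1}{r}f'(r),
\]
as required.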
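\[
\frac{\partial^2 z}{\partial x\,\partial y} = -\frac{1}{x\log(ex)}.
\]
\[
\log\left(x^x y^y z^z\right) = \log c.
\]
This gives
\[
x\log x + y\log y + z\log z = \log c.
\]
Since c is a constant, log c is also constant. Now, treat z as a function of x and y, i.e.,
z = z(x, y). Differentiating both sides of the above equation partially with respect to y, we
get
\[
\frac{\partial}{\partial y}\left(x\log x + y\log y + z\log z\right) = 0.
\]
This gives us
\[
\frac{\partial}{\partial y}(y\log y) + \frac{\partial}{\partial y}(z\log z) = 0.
\]
After a little simplification, we arrive at
\[
(\log y + 1) + \log z\,\frac{\partial z}{\partial y} + \frac{\partial z}{\partial y} = 0,
\]
so that
\[
\frac{\partial z}{\partial y} = -\frac{\log y + 1}{\log z + 1}, \qquad\text{and, by symmetry,}\qquad \frac{\partial z}{\partial x} = -\frac{\log x + 1}{\log z + 1}.
\]
Differentiating the first of these with respect to x, we get
\[
\frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left[-\frac{\log y + 1}{\log z + 1}\right].
\]
Now,
\[
\frac{\partial}{\partial x}(\log z + 1) = \frac{1}{z}\cdot\frac{\partial z}{\partial x}.
\]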
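Therefore, since log y + 1 does not depend on x,
\[
\begin{aligned}
\frac{\partial^2 z}{\partial x\,\partial y}
&= \frac{\log y + 1}{(\log z + 1)^2}\cdot\frac{1}{z}\cdot\frac{\partial z}{\partial x}\\
&= \frac{\log y + 1}{(\log z + 1)^2}\cdot\frac{1}{z}\cdot\left(-\frac{\log x + 1}{\log z + 1}\right)\\
&= -\frac{(\log y + 1)(\log x + 1)}{z(\log z + 1)^3}.
\end{aligned}
\]
At points where x = y = z, the constraint reads 3x log x = log c, and the expression above
becomes
\[
\frac{\partial^2 z}{\partial x\,\partial y} = -\frac{(\log x + 1)^2}{x(\log x + 1)^3} = -\frac{1}{x(1+\log x)}.
\]
Note that:
\[
\log(ex) = \log e + \log x = 1 + \log x.
\]
So,
\[
\frac{\partial^2 z}{\partial x\,\partial y} = -\frac{1}{x(1+\log x)} = -\frac{1}{x\log(ex)},
\]
as required.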
2.5 Differentiability
For a general function $f : D \subset \mathbb{R}^n \to \mathbb{R}^m$, giving a rigorous definition of differentiability
involves deeper ideas from linear algebra and analysis. It requires understanding how to
approximate functions using linear maps between vector spaces, which is beyond our current
scope. Instead of focusing on these technicalities, our goal here is to build an intuitive and
practical understanding of what it means for such functions to be differentiable, and how we
define and work with their derivatives.
To guide our way forward, we will begin by recalling what differentiability means in the
simpler case of functions of a single variable. From there, we will extend the ideas first to
scalar-valued functions of several variables, that is, functions of the form
\[
f : D \subset \mathbb{R}^n \to \mathbb{R},
\]
also known as scalar fields. This intermediate step will help us bridge our intuition from
one-dimensional calculus to functions of several variables. After developing a clear picture in
this scalar case, we will then proceed to define and explore differentiability for more general
vector-valued functions of the form
\[
f : D \subset \mathbb{R}^n \to \mathbb{R}^m.
\]
Let us begin by recalling the familiar definition from single-variable calculus. For a function
$f : \mathbb{R} \to \mathbb{R}$, the derivative at a point x = a is defined by the limit
\[
f'(a) = \lim_{h\to 0}\frac{f(a+h)-f(a)}{h}.
\]
This expression captures the idea of the instantaneous rate of change of the function at the
point a, or the slope of the tangent line to the graph at that point.
To better prepare for generalization to multiple variables, we can rewrite this definition in a
slightly more abstract form. Instead of directly talking about slopes, we think of the derivative
as the best linear approximation to the function near the point of interest. With that
in mind, we say: a function f is differentiable at x = a if there exists a real number λ such that
\[
\lim_{h\to 0}\frac{f(a+h)-f(a)-\lambda h}{h} = 0,
\]
and in that case λ = f ′(a).
This reformulation emphasizes the idea that the derivative is the number λ such that the
function f (x) can be closely approximated by the linear expression f (a) + λ · (x − a) near
x = a. In the next section, we will carry this idea into higher dimensions and see how similar
reasoning leads to the definition of the derivative in the multivariable setting.
To address this, we replace (h, k) in the denominator with its magnitude $\sqrt{h^2+k^2}$, leading
to the revised definition:
\[
\lim_{(h,k)\to(0,0)}\frac{f(a+h,b+k)-f(a,b)-\lambda\cdot(h,k)}{\sqrt{h^2+k^2}} = 0. \tag{2.2}
\]
While this resolves the division issue, the meaning of λ remains unclear. What is this λ?
Just a real number as before or something else?
If λ were a single real number, then λ·(h, k) would be an ordered pair, making the expression
f (a + h, b + k) − f (a, b) − λ · (h, k) in the numerator meaningless for both f (a + h, b + k) and
f (a, b) are real numbers and addition of a real number with an ordered pair is not defined.
The natural choice is to treat λ as a vector (α, β), and the operation ‘·’ as the scalar product, so
that
\[
\lambda\cdot(h,k) = (\alpha,\beta)\cdot(h,k) = \alpha h + \beta k.
\]
With this formulation, the expression f (a + h, b + k) − f (a, b) − λ · (h, k) now makes sense.
We can now rigorously state that a function f (x, y) is differentiable at (a, b) if there exists
λ = (α, β) such that
\[
\lim_{(h,k)\to(0,0)}\frac{f(a+h,b+k)-f(a,b)-(\alpha h+\beta k)}{\sqrt{h^2+k^2}} = 0. \tag{2.3}
\]
This provides a solid foundation for defining the derivative of a function of two variables.
The vector (α, β) is called the derivative of f (x, y) at (a, b), written as:
\[
Df(a,b) = (\alpha,\beta).
\]
In order to determine the numbers α and β, we now examine the behavior of f (x, y) along
the coordinate axes. First, we let (h, k) → (0, 0) along the x-axis, where k = 0.
Setting k = 0 in the definition, we get:
\[
\lim_{h\to 0}\frac{f(a+h,b)-f(a,b)-\alpha h}{|h|} = 0.
\]
This implies
\[
\lim_{h\to 0}\frac{f(a+h,b)-f(a,b)}{h} = \alpha.
\]
This is precisely the partial derivative of f with respect to x at (a, b), i.e.,
\[
f_x(a,b) = \alpha.
\]
Similarly, letting (h, k) → (0, 0) along the y-axis, where h = 0, we get
\[
\lim_{k\to 0}\frac{f(a,b+k)-f(a,b)-\beta k}{|k|} = 0.
\]
This implies
\[
\lim_{k\to 0}\frac{f(a,b+k)-f(a,b)}{k} = \beta.
\]
Again, this is precisely the partial derivative of f with respect to y at (a, b), i.e.,
\[
f_y(a,b) = \beta.
\]
From the above two observations, we conclude that the numbers α and β appearing in the
definition of differentiability are precisely the partial derivatives of f at (a, b), i.e.,
\[
Df(a,b) = \left(f_x(a,b),\; f_y(a,b)\right).
\]
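To see the definition at work, one can evaluate the quotient in (2.3) for a function known to be differentiable and watch it shrink. The sketch below is illustrative only: the helper name `differentiability_quotient`, the sample function f(x, y) = x^2 + y^2, and the base point (1, 2) are assumptions. For this particular f the quotient equals sqrt(h^2 + k^2) exactly, so it tends to 0 with (h, k).

```python
import math

def differentiability_quotient(f, grad, a, b, h, k):
    """Quotient from definition (2.3), with (alpha, beta) taken to be grad."""
    alpha, beta = grad
    numerator = f(a + h, b + k) - f(a, b) - (alpha * h + beta * k)
    return numerator / math.sqrt(h * h + k * k)

# Illustrative sample function (an assumption).
def f(x, y):
    return x**2 + y**2

a, b = 1.0, 2.0
grad = (2 * a, 2 * b)   # (f_x(a, b), f_y(a, b)), computed by hand

steps = (1e-1, 1e-2, 1e-3)
quotients = [differentiability_quotient(f, grad, a, b, t, t) for t in steps]
for t, q in zip(steps, quotients):
    print(f"h = k = {t:g}: quotient = {q:.6e}")
```

Sampling a single direction h = k only illustrates the definition; differentiability requires the quotient to tend to 0 along every approach to (0, 0).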
ILLUSTRATIVE EXAMPLES
Example 51. Compute derivative Df (a, b) or gradient for each of the functions given below:
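\[
\frac{\partial f}{\partial x} = 2x\sin y - y^2\sin x, \qquad \frac{\partial f}{\partial y} = x^2\cos y + 2y\cos x.
\]
Therefore, the derivative matrix of the function f is given by
\[
Df(x,y) = \left(2x\sin y - y^2\sin x,\; x^2\cos y + 2y\cos x\right).
\]
For the next function, the partial derivatives are
\[
\frac{\partial f}{\partial x} = \frac{2x}{x^2+y^2}, \qquad \frac{\partial f}{\partial y} = \frac{2y}{x^2+y^2}.
\]
Therefore, the derivative matrix is
\[
Df(x,y) = \left(\frac{2x}{x^2+y^2},\; \frac{2y}{x^2+y^2}\right).
\]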
Example 52. Compute the derivative matrix for the function $f(x,y) = e^x\cos y + xy^2$ and
evaluate it at the points (1, 0) and (0, 1).
Solution. The derivative matrix of a function of two variables is the matrix of its partial
derivatives, given by
\[
Df(x,y) = f'(x,y) = \left(\frac{\partial f}{\partial x},\; \frac{\partial f}{\partial y}\right).
\]
For the function $f(x,y) = e^x\cos y + xy^2$, the partial derivatives are
\[
\frac{\partial f}{\partial x} = e^x\cos y + y^2, \qquad \frac{\partial f}{\partial y} = -e^x\sin y + 2xy.
\]
Therefore, the derivative matrix of the function f is
\[
Df(x,y) = \left(e^x\cos y + y^2,\; -e^x\sin y + 2xy\right).
\]
Evaluating at the given points, we get $Df(1,0) = (e,\, 0)$ and $Df(0,1) = (1+\cos 1,\, -\sin 1)$.
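Example 53. Compute the derivative matrix for the function $f(x,y,z) = xye^z$ and evaluate
it at the points (1, 1, 0) and (2, −1, ln 2).
Solution. The derivative matrix (or gradient) of a scalar function of three variables is a
vector of partial derivatives and is given by
\[
Df(x,y,z) = f'(x,y,z) = \left(\frac{\partial f}{\partial x},\; \frac{\partial f}{\partial y},\; \frac{\partial f}{\partial z}\right).
\]
For the function $f(x,y,z) = xye^z$, the partial derivatives are
\[
\frac{\partial f}{\partial x} = ye^z, \qquad \frac{\partial f}{\partial y} = xe^z, \qquad \frac{\partial f}{\partial z} = xye^z.
\]
Evaluating at the given points, we get $Df(1,1,0) = (1,\, 1,\, 1)$ and, since $e^{\ln 2} = 2$,
$Df(2,-1,\ln 2) = (-2,\, 4,\, -4)$.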
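Example 54. Compute the derivative matrix for the function $f(x,y,z) = \tan^{-1}(x+yz)$
and evaluate it at the points (0, 1, 1) and (1, 2, 0).
Solution. The derivative matrix (or gradient) of a scalar function of three variables is given
by
\[
Df(x,y,z) = f'(x,y,z) = \left(\frac{\partial f}{\partial x},\; \frac{\partial f}{\partial y},\; \frac{\partial f}{\partial z}\right).
\]
For the function $f(x,y,z) = \tan^{-1}(x+yz)$, the partial derivatives are
\[
\frac{\partial f}{\partial x} = \frac{1}{1+(x+yz)^2}, \qquad \frac{\partial f}{\partial y} = \frac{z}{1+(x+yz)^2}, \qquad \frac{\partial f}{\partial z} = \frac{y}{1+(x+yz)^2}.
\]
Therefore, the derivative matrix is
\[
Df(x,y,z) = \left(\frac{1}{1+(x+yz)^2},\; \frac{z}{1+(x+yz)^2},\; \frac{y}{1+(x+yz)^2}\right).
\]
Now, we evaluate the derivative matrix at the given points:
a) At (0, 1, 1), we have x + yz = 1, so
\[
Df(0,1,1) = \left(\frac{1}{1+1^2},\; \frac{1}{1+1^2},\; \frac{1}{1+1^2}\right) = \left(\frac{1}{2},\; \frac{1}{2},\; \frac{1}{2}\right).
\]
b) At (1, 2, 0), we again have x + yz = 1, so
\[
Df(1,2,0) = \left(\frac{1}{1+1^2},\; \frac{0}{1+1^2},\; \frac{2}{1+1^2}\right) = \left(\frac{1}{2},\; 0,\; 1\right).
\]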
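Example 55. Compute the derivative matrix for the function $f(x,y,z) = \log(x^2+y^2+z^2)$
and evaluate it at the points (1, 0, 0) and (1, 2, 2).
Solution. The derivative matrix (or gradient) of a scalar function of three variables is a
vector of partial derivatives and is given by
\[
Df(x,y,z) = f'(x,y,z) = \left(\frac{\partial f}{\partial x},\; \frac{\partial f}{\partial y},\; \frac{\partial f}{\partial z}\right).
\]
For the function $f(x,y,z) = \log(x^2+y^2+z^2)$, the partial derivatives are
\[
\frac{\partial f}{\partial x} = \frac{2x}{x^2+y^2+z^2}, \qquad \frac{\partial f}{\partial y} = \frac{2y}{x^2+y^2+z^2}, \qquad \frac{\partial f}{\partial z} = \frac{2z}{x^2+y^2+z^2}.
\]
Therefore, the derivative matrix is
\[
Df(x,y,z) = \left(\frac{2x}{x^2+y^2+z^2},\; \frac{2y}{x^2+y^2+z^2},\; \frac{2z}{x^2+y^2+z^2}\right).
\]
Now, we evaluate the derivative matrix at the given points: at (1, 0, 0) we have
$x^2+y^2+z^2 = 1$, so $Df(1,0,0) = (2,\, 0,\, 0)$; at (1, 2, 2) we have $x^2+y^2+z^2 = 9$, so
$Df(1,2,2) = \left(\frac{2}{9},\, \frac{4}{9},\, \frac{4}{9}\right)$.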
In this section, we prove that differentiability implies continuity. We then present a few
counterexamples to show that continuity alone does not guarantee differentiability, and even
the existence of partial derivatives is not sufficient for the function to be continuous or
differentiable.
Theorem 2.5.1. If f (x, y) is differentiable at a point (a, b), then it is continuous at (a, b)
but the converse need not be true.
Proof: To establish that differentiability implies continuity, we assume that f (x, y) is dif-
ferentiable at (a, b). Therefore there exist real numbers α and β (which turn out to be the
partial derivatives fx (a, b) and fy (a, b), respectively) such that
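\[
\lim_{(h,k)\to(0,0)}\frac{f(a+h,b+k)-f(a,b)-(\alpha h+\beta k)}{\sqrt{h^2+k^2}} = 0.
\]
This means that the difference
\[
E(h,k) = f(a+h,b+k) - f(a,b) - (\alpha h+\beta k)
\]
satisfies $E(h,k)/\sqrt{h^2+k^2} \to 0$ as (h, k) → (0, 0), and hence
$E(h,k) = \frac{E(h,k)}{\sqrt{h^2+k^2}}\cdot\sqrt{h^2+k^2} \to 0$ as well. Since αh + βk → 0 too, we obtain
\[
\lim_{(h,k)\to(0,0)} f(a+h,b+k) = f(a,b).
\]
This confirms that f (x, y) is continuous at (a, b), proving that differentiability implies
continuity.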
To prove that the converse of the theorem need not be true, we produce a counterexample.
Example 56. Consider the function f (x, y) of two variables given by
\[
f(x,y) =
\begin{cases}
\dfrac{xy}{\sqrt{x^2+y^2}}, & (x,y)\neq(0,0);\\[2mm]
0, & (x,y)=(0,0).
\end{cases}
\]
Show that the function f (x, y) is continuous and partial derivatives fx (0, 0) and fy (0, 0) exist
but the function is not differentiable at (0, 0).
Solution. First, we prove the continuity of the function. Note that the value of the function at
(0, 0), i.e., f (0, 0) = 0, is already given. Now, in order to prove that the function is continuous
at (0, 0), we shall prove that the limit of the function at (0, 0) equals f (0, 0) = 0. For this, we
shall use the ε-δ definition of continuity.
From the definition of continuity, it follows that the function f (x, y) will be continuous at
(0, 0) if, for every ε > 0, there exists a δ > 0 such that
\[
\sqrt{x^2+y^2} < \delta \implies |f(x,y)-f(0,0)| < \varepsilon.
\]
Since f (0, 0) = 0, to prove that the function is continuous we must show that
\[
|f(x,y)| < \varepsilon \quad\text{whenever}\quad \sqrt{x^2+y^2} < \delta.
\]
Let ε > 0 be given. Now, we see that for (x, y) ≠ (0, 0), we have
\[
|f(x,y)| = \frac{|xy|}{\sqrt{x^2+y^2}} = \frac{|x|\,|y|}{\sqrt{x^2+y^2}}.
\]
Since $|x| \le \sqrt{x^2+y^2}$ and $|y| \le \sqrt{x^2+y^2}$ for all real numbers x and y, we get
\[
\frac{|x|\,|y|}{\sqrt{x^2+y^2}} \le \frac{\left(\sqrt{x^2+y^2}\right)\left(\sqrt{x^2+y^2}\right)}{\sqrt{x^2+y^2}} = \sqrt{x^2+y^2}.
\]
Thus,
\[
|f(x,y)| \le \sqrt{x^2+y^2}.
\]
Now, to ensure that |f (x, y)| < ε, it suffices to choose δ = ε; then
\[
\sqrt{x^2+y^2} < \delta \implies |f(x,y)| \le \sqrt{x^2+y^2} < \varepsilon.
\]
Thus, for a given ε > 0, we can find a δ (= ε) such that |f (x, y) − f (0, 0)| < ε
whenever $\sqrt{x^2+y^2} < \delta$.
Thus, the function is continuous at (0, 0).
Next, we compute the partial derivatives at the origin. Since f (h, 0) = 0 and f (0, k) = 0 for
all h and k, the difference quotients $\frac{f(h,0)-f(0,0)}{h}$ and $\frac{f(0,k)-f(0,0)}{k}$ are identically zero, so
fx (0, 0) = 0 and fy (0, 0) = 0.
Finally, we discuss differentiability of the function. Since both the partial derivatives exist,
the given function will be differentiable at (0, 0) if the following limit exists and
equals zero, i.e.,
\[
\lim_{(h,k)\to(0,0)}\frac{f(h,k)-f(0,0)-(\alpha h+\beta k)}{\sqrt{h^2+k^2}} = 0.
\]
Since f (0, 0) = 0 and α = fx (0, 0) = 0, β = fy (0, 0) = 0, this simplifies to
\[
\lim_{(h,k)\to(0,0)}\frac{f(h,k)}{\sqrt{h^2+k^2}} = 0.
\]
For (h, k) ≠ (0, 0), the quotient is
\[
\frac{f(h,k)}{\sqrt{h^2+k^2}} = \frac{\dfrac{hk}{\sqrt{h^2+k^2}}}{\sqrt{h^2+k^2}} = \frac{hk}{h^2+k^2},
\]
so the given function will be differentiable at (0, 0) only if this expression tends to 0 as
(h, k) → (0, 0).
To check whether this limit exists or not, we try different paths. First, we let (h, k) → (0, 0)
along the line y = x. Along this line k = h and therefore, we get
\[
\frac{f(h,h)}{\sqrt{h^2+h^2}} = \frac{h\cdot h}{h^2+h^2} = \frac{h^2}{2h^2} = \frac{1}{2}.
\]
Along this path the quotient tends to 1/2, not 0, so the required limit cannot be 0. Thus the
function is continuous at (0, 0) and both partial derivatives exist there,
but yet the function f (x, y) is not differentiable at (0, 0). This example shows that mere
continuity of the function or the existence of partial derivatives does not guarantee
differentiability.
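The path argument for Example 56 is easy to reproduce numerically. The following Python sketch is an illustration (the helper name `quotient` and the sample step sizes are assumptions): it evaluates f (h, k)/sqrt(h^2 + k^2) along the diagonal k = h and along the x-axis k = 0, and shows that the two paths give different values however small the step.

```python
import math

def f(x, y):
    """Function of Example 56: xy / sqrt(x^2 + y^2), with f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / math.sqrt(x * x + y * y)

def quotient(h, k):
    # f(0, 0) = 0 and f_x(0, 0) = f_y(0, 0) = 0, so only f(h, k) remains
    # in the numerator of the differentiability quotient.
    return f(h, k) / math.sqrt(h * h + k * k)

steps = (1e-1, 1e-3, 1e-6)
along_diagonal = [quotient(t, t) for t in steps]
along_x_axis = [quotient(t, 0.0) for t in steps]

print("along y = x  :", along_diagonal)
print("along x-axis :", along_x_axis)
```

The diagonal values stay at 1/2 while the axis values are 0, which is exactly the path dependence used in the solution.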
Show that the partial derivatives fx (0, 0) and fy (0, 0) exist, but the function is not continuous
at (0, 0).
Solution. If a point (x, y) lies on either of the axes, then xy = 0, and for points (x, y)
not lying on the axes, xy ≠ 0 (why?). From the definition of the function, we find that if
(x, y) → (0, 0) along the x-axis or the y-axis, the functional values remain 1, i.e., along both
the axes
\[
\lim_{(x,y)\to(0,0)} f(x,y) = 1.
\]
Also, we find that along paths other than the axes (such as the line y = x), the functional
values remain 0, i.e., along such a path,
\[
\lim_{(x,y)\to(0,0)} f(x,y) = 0.
\]
Thus, the limit of the function f (x, y) at (0, 0) is path dependent. Therefore, it does not
exist. As a consequence, we conclude that the function is not continuous at (0, 0), since the
limit does not exist at the origin. However, we are given that f (x, y) = 1 for points where
xy = 0, i.e., the function is constant along both the axes; therefore, the partial derivatives
exist and fx (0, 0) = 0 and fy (0, 0) = 0, since the rate of change of a constant is zero. More
precisely, one may proceed as below to check that the partial derivatives exist at the origin.
• Partial derivative with respect to x at (0, 0):
\[
f_x(0,0) = \lim_{h\to 0}\frac{f(h,0)-f(0,0)}{h}.
\]
Given that f (x, y) = 1 when xy = 0, we have f (h, 0) = 1 and f (0, 0) = 1.
Thus, we get
\[
f_x(0,0) = \lim_{h\to 0}\frac{1-1}{h} = 0.
\]
• Partial derivative with respect to y at (0, 0):
\[
f_y(0,0) = \lim_{k\to 0}\frac{f(0,k)-f(0,0)}{k}.
\]
Using similar arguments, we have f (0, k) = 1 and f (0, 0) = 1, so
\[
f_y(0,0) = \lim_{k\to 0}\frac{1-1}{k} = 0.
\]
Therefore, both partial derivatives fx (0, 0) and fy (0, 0) exist and are equal to 0.
This example shows that the partial derivatives fx (0, 0) and fy (0, 0) exist but f (x, y) is not
continuous at (0, 0). Thus, existence of partial derivatives does not imply continuity of a
function of two variables.
\[
\mathbf{f}(\mathbf{x}) = \mathbf{y}.
\]
It is important to note that the output vector $\mathbf{y} = (y_1, y_2, \ldots, y_m)$ depends on the input vector
$\mathbf{x} = (x_1, x_2, \ldots, x_n)$. Thus, each component $y_i$ can be viewed as a real-valued function of x,
and we write:
\[
\mathbf{f}(\mathbf{x}) = \left(y_1(\mathbf{x}),\, y_2(\mathbf{x}),\, \ldots,\, y_m(\mathbf{x})\right).
\]
In other words, each $y_i$ is a scalar function of n real variables, i.e.,
\[
y_i = y_i(\mathbf{x}) = y_i(x_1, x_2, \ldots, x_n).
\]
ILLUSTRATIVE EXAMPLES
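Therefore, we compute
\[
Df(x,y) =
\begin{bmatrix}
\frac{\partial}{\partial x}(x+y) & \frac{\partial}{\partial y}(x+y)\\[1mm]
\frac{\partial}{\partial x}(x^2-y) & \frac{\partial}{\partial y}(x^2-y)
\end{bmatrix}
=
\begin{bmatrix}
1 & 1\\
2x & -1
\end{bmatrix}.
\]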
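The derivative matrix above can be cross-checked with a small numerical Jacobian. The sketch below is illustrative: the helper name `jacobian`, the step size, and the evaluation point (3, 1) are assumptions. Each column is obtained by perturbing one input variable and taking a central difference of every component of the output.

```python
def jacobian(f, point, h=1e-6):
    """Central-difference Jacobian of f: R^n -> R^m at the given point.

    Returns a list of m rows, each with n entries, so that J[i][j]
    approximates the partial derivative of component y_i with respect to x_j.
    """
    n = len(point)
    columns = []
    for j in range(n):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = f(*plus), f(*minus)
        columns.append([(p - m) / (2 * h) for p, m in zip(f_plus, f_minus)])
    m_rows = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m_rows)]

# The map from the example: f(x, y) = (x + y, x^2 - y).
def f(x, y):
    return (x + y, x**2 - y)

J = jacobian(f, (3.0, 1.0))
for row in J:
    print([round(entry, 6) for entry in row])
```

At (3, 1) the exact matrix has rows (1, 1) and (2x, −1) = (6, −1), which the numerical result should reproduce up to rounding.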
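Solution. The given transformation expresses the polar-to-Cartesian conversion. That is, with
\[
r = \sqrt{x^2+y^2}, \qquad \theta = \tan^{-1}\left(\frac{y}{x}\right),
\]
we have
\[
f(x,y) = (r\cos\theta,\; r\sin\theta) = (x,\; y).
\]
Hence, the Jacobian matrix is simply the identity matrix:
\[
Df(x,y) =
\begin{bmatrix}
\frac{\partial x}{\partial x} & \frac{\partial x}{\partial y}\\[1mm]
\frac{\partial y}{\partial x} & \frac{\partial y}{\partial y}
\end{bmatrix}
=
\begin{bmatrix}
1 & 0\\
0 & 1
\end{bmatrix}.
\]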
Now, we conclude with an important result that provides a criterion for determining whether
a function is differentiable, based on the continuity of its partial derivatives. This result not
only links differentiability to the smoothness of partial derivatives but also clarifies how the
existence and continuity of partial derivatives play a pivotal role in ensuring the differentia-
bility of multivariable functions.
Theorem 2.5.2. Let $f : U \subset \mathbb{R}^n \to \mathbb{R}^m$ be a function defined on an open set U. Suppose that
all the partial derivatives $\frac{\partial y_i}{\partial x_j}$ of f exist and are continuous in a neighborhood of a point
x ∈ U . Then, f is differentiable at x.
Functions whose partial derivatives exist and are continuous are referred to as continuously
differentiable or of class $C^1$. A function being $C^1$ implies that not only does it have well-
defined derivatives, but these derivatives also vary continuously, without abrupt changes.
Thus, the theorem provides the important conclusion that all $C^1$ functions are differentiable.
In practice, this means that to check the differentiability of a function, one can focus on the
continuity of its partial derivatives. If they are continuous, differentiability follows directly.
This insight simplifies the verification process and gives a clear criterion for working with
functions in multivariable calculus.
ILLUSTRATIVE EXAMPLES
Solution. We begin by calculating the partial derivatives of f . The function is given by:
\[
f(x,y) = x^2y + \sin(xy).
\]
Its partial derivatives are $\frac{\partial f}{\partial x} = 2xy + y\cos(xy)$ and $\frac{\partial f}{\partial y} = x^2 + x\cos(xy)$, so that
\[
\frac{\partial f}{\partial x}(1,1) = 2(1)(1) + (1)\cos(1\cdot 1) = 2 + \cos(1),
\]
\[
\frac{\partial f}{\partial y}(1,1) = (1)^2 + (1)\cos(1\cdot 1) = 1 + \cos(1).
\]
Since both partial derivatives exist and are continuous (since the components involve
elementary functions), f is differentiable at (1, 1).
Solution. First, we compute the partial derivatives of f . The function is given by:
\[
f(x,y) = x^3 + 2xy + y^3,
\]
so that $\frac{\partial f}{\partial x} = 3x^2 + 2y$ and $\frac{\partial f}{\partial y} = 2x + 3y^2$.
Since all partial derivatives exist and are continuous (as they involve elementary functions),
f is differentiable at (1, 1).
Since all partial derivatives exist and are continuous (as they involve elementary functions),
f is differentiable at (1, 0, 0).
Chapter 3
Chain Rule
In earlier classes, we studied the composition of two real-valued functions of a single vari-
able and the corresponding chain rule for differentiating such composite functions. In this
chapter, we develop the chain rule for functions of several variables, a powerful extension of
the single-variable chain rule. The chain rule enables us to differentiate composite functions
where variables are interdependent or expressed in terms of other variables. While we will
not attempt a formal proof here, our focus will be on understanding the structure of com-
positions, computing their derivatives, and interpreting the chain rule in both explicit and
implicit formulations.
Before diving into the general formulation of the chain rule, let us first revisit the idea of
composite functions to prepare ourselves for the multivariable chain rule.
3.1. COMPOSITION OF FUNCTIONS
Obviously, if we want to keep the above operations on functions f (x) and g(x) meaningful
then the input must belong to domain of both the functions, i.e., x ∈ D(f ) ∩ D(g) = A ∩ B.
There is one more specific way to combine the two functions known as composition of two
functions. Given an input x, we first operate f on it yielding the output f (x). The output
f (x) lies in B and serves as an input for the other function g. Now, g acts on this new input
f (x) and yields an output g(f (x)) in C. The whole process can be thought of as a single
operation (more precisely a function, say h) that carries the input x to an output g(f (x))
in $C$. This function $h$ is called the composition of $f$ and $g$ and is written as $g \circ f$. Its domain is
the set $A$ of all inputs $x$ and its codomain is $C$, i.e., $h : A \to C$ is a map from $A$ to $C$ with
\[ h = g \circ f, \]
and the action of $h$ on $x$ is defined by
\[ h(x) = g(f(x)). \]
We now precisely write the definition of composition of two functions as below:
Sometimes the composition $f \circ g$ is also possible, provided that $R(g) \subseteq D(f)$. Further, these
definitions are general enough to apply to functions of several variables by
merely making suitable changes in domain and range, i.e., replacing $A$ or $B$ by $\mathbb{R}^n$ for
some desired $n$.
ILLUSTRATIVE EXAMPLES
Example 67. Take the two functions $f : [0, \infty) \to \mathbb{R}$ given by $f(x) = \sqrt{x}$ and $g : \mathbb{R} \to \mathbb{R}$
given by $g(x) = -x^2 - 1$. Find $f \circ g$ and $g \circ f$.
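Solution. First, we consider the composition $g \circ f$. Since $f$ is the square root
function, its range consists of nonnegative reals only, i.e.,
\[ R(f) = [0, \infty). \]
Because $R(f) \subseteq D(g) = \mathbb{R}$, the composition $g \circ f$ is defined on $[0, \infty)$, and
\[ (g \circ f)(x) = g(\sqrt{x}) = -(\sqrt{x})^2 - 1 = -x - 1. \]
For the other composition $f \circ g$, we see that the range of $g$ consists of negative real numbers
only, i.e.,
\[ R(g) = (-\infty, -1]. \]
Clearly
\[ R(g) \nsubseteq D(f) = [0, \infty). \]
Therefore, $f \circ g$ is not defined.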
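Example 68. Let $f : \mathbb{R} \to \mathbb{R}$, $f(x) = x^3$, and $g : \mathbb{R} \to \mathbb{R}$, $g(x) = \sin x$. Find $f \circ g$ and $g \circ f$.
Solution. In this case both $f \circ g$ and $g \circ f$ are defined. These compositions are given by
\[ (f \circ g)(x) = f(g(x)) = f(\sin x) = \sin^3 x \]
and
\[ (g \circ f)(x) = g(f(x)) = g(x^3) = \sin(x^3). \]
Observe that both the compositions are defined but $f \circ g \neq g \circ f$.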
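The two examples above can be mirrored in a few lines of code. This is only an illustrative sketch: `compose` and the names `f67`, `g67`, `f68`, `g68` are hypothetical, and the domain restriction of the square root is modeled by the `ValueError` that Python's `math.sqrt` raises for negative input.

```python
import math

def compose(outer, inner):
    """Return the composition outer ∘ inner, i.e. x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

# Example 67: f(x) = sqrt(x) on [0, inf), g(x) = -x^2 - 1 on R.
f67 = math.sqrt
g67 = lambda x: -x**2 - 1

g_after_f_value = compose(g67, f67)(4.0)   # g(sqrt(4)) = -(2^2) - 1

try:
    compose(f67, g67)(4.0)                  # g(4) = -17 lies outside D(f)
    f_after_g_defined = True
except ValueError:
    f_after_g_defined = False

# Example 68: f(x) = x^3, g(x) = sin x; both orders are defined but differ.
f68 = lambda x: x**3
g68 = math.sin
f_after_g68 = compose(f68, g68)             # x -> (sin x)^3
g_after_f68 = compose(g68, f68)             # x -> sin(x^3)

print(g_after_f_value, f_after_g_defined)
print(f_after_g68(2.0), g_after_f68(2.0))
```

The failed call reflects the condition $R(g) \subseteq D(f)$ discussed above: a composition is only available when every output of the inner function is a legal input of the outer one.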
In a similar fashion, we can build compositions in case of multivariable functions too. The
following examples will illustrate the concept.
Also g ◦ f : R2 → R2 is given by
3.2. THE CHAIN RULE
Also $g \circ f : \mathbb{R}^3 \to \mathbb{R}^3$ is given by
\[ (g \circ f)(x, y, z) = g(f(x, y, z)) = g(xe^{yz}) = \left(e^{xe^{yz}},\ xe^{yz},\ \sin(xe^{yz})\right). \]
\[ y = \sin u, \qquad u = x^2. \]
Now, $y$ becomes a composite function of $x$ (via $u$). The chain rule is given by
\[ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}. \]
Footnote 1. We do not give a proof here. The curious reader may find one in Thomas' Calculus.
The chain rule described above is notationally simple and can be used efficiently everywhere
in the single-variable case. However, in the more general multivariate setting
it is better to write the chain rule (3.3) using functional notation, which is more
convenient and often more useful. Therefore, we state the formula (3.3) in functional notation.
Now, replacing $u$ by $g(x)$ in the first member of (3.4), we may write $y$ as a composite function
\[ y = f(g(x)) = (f \circ g)(x). \]
Now, with this new notation, the equation (3.3) may be rewritten as below:
Now, we write the statement of the chain rule for the single-variable case as below:
Theorem 3.2.1 (The Chain Rule). Let $f$ and $g$ be two differentiable functions of a single
real variable. Then the composition $f \circ g$ is differentiable and its derivative at $x$ is
given by
\[ (f \circ g)'(x) = f'(g(x)) \cdot g'(x). \tag{3.5} \]
Also, at a particular point $x_0$,
\[ (f \circ g)'(x_0) = f'(g(x_0)) \cdot g'(x_0). \tag{3.6} \]
ILLUSTRATIVE EXAMPLES
3.3. CHAIN RULE IN HIGHER DIMENSIONS
Since $f(x) = e^x$, we have $f'(g(x)) = e^{g(x)} = e^{ax+b}$ and $g'(x) = a$. Plugging these values
into the chain rule given above, we get
\[ (f \circ g)'(x) = f'(g(x)) \cdot g'(x) = a\,e^{ax+b}. \]
Example 74. Given $f(x) = \log x$ and $g(x) = \sin x$. Find $(f \circ g)'$.
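As a hedged check of formula (3.5) on Example 74, the sketch below compares the chain-rule product $f'(g(x)) \cdot g'(x)$ with a finite-difference derivative of the composite. It assumes that $\log$ denotes the natural logarithm; the function names and the sample point $x_0 = 1$ are illustrative choices, not taken from the text.

```python
import math

def f(x):
    return math.log(x)      # assumption: log is the natural logarithm

def g(x):
    return math.sin(x)

def f_prime(x):
    return 1.0 / x          # derivative of log x

def g_prime(x):
    return math.cos(x)      # derivative of sin x

def chain_rule_derivative(x):
    # (f ∘ g)'(x) = f'(g(x)) * g'(x)
    return f_prime(g(x)) * g_prime(x)

def numerical_derivative(func, x, h=1e-6):
    # Central difference, used only as an independent cross-check.
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 1.0  # sin(1) > 0, so log(sin x) is defined near x0
by_chain_rule = chain_rule_derivative(x0)
by_difference = numerical_derivative(lambda x: f(g(x)), x0)
print(by_chain_rule, by_difference)
```

Both numbers agree with $\cos x_0 / \sin x_0 = \cot x_0$, which is what the chain rule gives symbolically wherever $\sin x > 0$.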
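Theorem 3.3.1. If $z = f(x, y)$ is differentiable and $x = x(t)$ and $y = y(t)$ are also
differentiable functions of $t$, then their composition $z = f(x(t), y(t))$ is differentiable and
\[ \frac{dz}{dt} = \frac{\partial f}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial f}{\partial y} \cdot \frac{dy}{dt}. \tag{3.9} \]
Also, at $t = t_0$ with $(x_0, y_0) = (x(t_0), y(t_0))$,
\[ \left.\frac{dz}{dt}\right|_{t=t_0} = \left.\frac{\partial f}{\partial x}\right|_{(x_0,y_0)} \cdot \left.\frac{dx}{dt}\right|_{t=t_0} + \left.\frac{\partial f}{\partial y}\right|_{(x_0,y_0)} \cdot \left.\frac{dy}{dt}\right|_{t=t_0}. \]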
This can be written more formally in functional notation. For this, we note that $x$ and
$y$ are given to be functions of $t$, so we may write the ordered pair $(x(t), y(t)) = g(t)$. Thus,
we get a function $g : \mathbb{R} \to \mathbb{R}^2$ given by $g(t) = (x, y)$. The function $f : \mathbb{R}^2 \to \mathbb{R}$ is already
given by $z = f(x, y)$. Now, the composition of $f$ with $g$ is
\[ z = (f \circ g)(t) = f(g(t)) = f(x(t), y(t)). \]
The chain rule for the derivative $z'(t)$ may now be restated as below:
\[ \frac{dz}{dt} = (f \circ g)'(t) = f'(g(t)) \cdot g'(t). \tag{3.10} \]
Also
\[ \left.\frac{dz}{dt}\right|_{t=t_0} = (f \circ g)'(t_0) = f'(g(t_0)) \cdot g'(t_0). \]
It should be noted that the formulae (3.9) and (3.10) are essentially the same; they differ
only in notation. Observe that $f(g(t))$ is simply $f(x, y)$, i.e.,
\[ f(g(t)) = f(x, y), \]
therefore, we have
\[ f'(g(t)) = f'(x, y) = \left[\frac{\partial f}{\partial x},\ \frac{\partial f}{\partial y}\right] \]
and
\[ g'(t) = (x', y') = \left[\frac{dx}{dt},\ \frac{dy}{dt}\right]. \]
In the case of three or more intermediate variables, the formula (3.9) extends easily. If we
take a differentiable function $u = f(x, y, z)$ of three variables where $x = x(t)$, $y = y(t)$ and
$z = z(t)$ are themselves differentiable functions of $t$, then the composite function $u = u(t)$ is
differentiated as below:
\[ \frac{du}{dt} = \frac{\partial f}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial f}{\partial y} \cdot \frac{dy}{dt} + \frac{\partial f}{\partial z} \cdot \frac{dz}{dt}. \tag{3.11} \]
ILLUSTRATIVE EXAMPLES
Finally, we get
\[ \frac{df}{dt} = 26t + 76. \]
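\[ \left.\frac{dw}{dt}\right|_{t=t_0} = \left.\frac{\partial w}{\partial x}\right|_{(x_0,y_0)} \cdot \left.\frac{dx}{dt}\right|_{t=t_0} + \left.\frac{\partial w}{\partial y}\right|_{(x_0,y_0)} \cdot \left.\frac{dy}{dt}\right|_{t=t_0}, \]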
Example 77. Suppose a duck is swimming around in a pond. The position $(x, y)$ of the
duck at time $t$ is given by $\vec{c}(t) = (3 + 8t, 3 - 2t)$, while the water temperature is given by the
formula $T(x, y) = 25 + x^2 + y^2$. Find the position of the duck at time $t = 0$. What is the
temperature at that point, and what is the rate of change of temperature with respect to time at
$t = 0$?
Solution. Given that at time $t$, the position of the duck is given by
\[ (x, y) = \vec{c}(t) = (3 + 8t, 3 - 2t). \]
Therefore, at $t = 0$, the position of the duck is
\[ (x_0, y_0) = \vec{c}(0) = (3, 3). \]
Also, the temperature at $t = 0$ is
\[ T(x, y)\big|_{t=0} = T(\vec{c}(0)) = 25 + x_0^2 + y_0^2 = 25 + 9 + 9 = 43. \]
The rate of change in temperature $T$ with respect to time is given by
\[ \frac{dT}{dt} = \frac{\partial T}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial T}{\partial y} \cdot \frac{dy}{dt} = (2x)(8) + (2y)(-2) = 16(3 + 8t) - 4(3 - 2t) = 36 + 136t. \]
Therefore,
\[ \left.\frac{dT}{dt}\right|_{t=0} = 36. \]
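The computation in Example 77 follows formula (3.9) step by step, and it can be written out as a short program. This is a minimal sketch with illustrative function names; it evaluates the chain-rule sum directly and compares it with the closed form $36 + 136t$ obtained above.

```python
def position(t):
    # Path of the duck: c(t) = (3 + 8t, 3 - 2t)
    return (3 + 8 * t, 3 - 2 * t)

def temperature(x, y):
    # Water temperature: T(x, y) = 25 + x^2 + y^2
    return 25 + x**2 + y**2

def dT_dt(t):
    """Rate of change of temperature along the path, via the chain rule (3.9)."""
    x, y = position(t)
    dT_dx, dT_dy = 2 * x, 2 * y   # partial derivatives of T
    dx_dt, dy_dt = 8, -2          # derivatives of the path components
    return dT_dx * dx_dt + dT_dy * dy_dt

print(position(0))                # position at t = 0
print(temperature(*position(0)))  # temperature at that position
print(dT_dt(0))                   # rate of change at t = 0
print(dT_dt(1), 36 + 136 * 1)     # chain rule vs. closed form at t = 1
```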
Example 78. Consider $f(x, y) = 9 - x^2 - y^2$ and $\vec{r}(t) = (2\cos t, 3\sin t)$. For this problem,
imagine the following scenario. A horse is running around outside in the cold. The horse's
position at time $t$ is given by the elliptical path $\vec{r}(t)$. The temperature of the air at any point
$(x, y)$ is given by $T = f(x, y)$. Now, answer the following:
1. At time $t = 0$, what is the horse's position $\vec{r}(0)$, and what is the temperature $f(\vec{r}(0))$
at that position? Find the temperatures at $t = \pi/2$, $t = \pi$, and $t = 3\pi/2$ as well.
2. In the plane, draw the path of the horse for $t \in [0, 2\pi]$. Then, on the same 2D graph,
include a contour plot of $f$. Make sure you include the level curves that pass through
the points in part 1. At the points addressed in part 1, write the temperature on
the curve.
Solution. Left as an exercise.
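Now, we need to find $h'(u, v)$. Note that $h$ is now a function of two variables, therefore
$h'(u, v) = [\partial h/\partial u,\ \partial h/\partial v]$. The components of this derivative matrix, i.e., the partial derivatives
$\partial h/\partial u$ and $\partial h/\partial v$, may be calculated with the help of the chain rule given below:
\[ \frac{\partial h}{\partial u} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial u} + \frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial u}, \tag{3.12} \]
\[ \frac{\partial h}{\partial v} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial v} + \frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial v}. \tag{3.13} \]
Since $x = x(u, v)$ and $y = y(u, v)$, in functional notation we may write $(x, y) =
g(u, v)$. Thus, we get a function $g : \mathbb{R}^2 \to \mathbb{R}^2$ given by
\[ g(u, v) = (x(u, v), y(u, v)). \]
Now, the chain rule given by the equations (3.12) and (3.13) may be formally restated as
below:
\[ h'(u, v) = (f \circ g)'(u, v) = f'(g(u, v)) \cdot g'(u, v). \tag{3.14} \]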
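The formula (3.14) can be written conveniently in terms of the derivative matrix also. Note that
$g(u, v) = (x, y)$, therefore, $f(g(u, v)) = f(x, y)$ and
\[ f'(g(u, v)) = \left[\frac{\partial f}{\partial x},\ \frac{\partial f}{\partial y}\right] \]
and
\[ g'(u, v) = \begin{bmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{bmatrix}. \]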
ILLUSTRATIVE EXAMPLES
Example 79. Given $f(x, y) = xy$ and $g(u, v) = (u^2 - v^2, u^2 + v^2)$. Find $(f \circ g)'$ (i) by the
chain rule and (ii) by expressing $f$ in terms of $u$ and $v$.
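Solution. (i) Note that the function $g$ gives an ordered pair as an output. Let us call it
$(x, y)$, i.e., $g(u, v) = (x, y)$ so that $x = u^2 - v^2$ and $y = u^2 + v^2$. Now, given $f(x, y) = xy$
and $g(u, v) = (x, y)$, the composition $h$ is
\[ h(u, v) = (f \circ g)(u, v) = f(x, y) = xy. \]
By the chain rule, $h'(u, v) = f'(g(u, v)) \cdot g'(u, v)$. This gives
\[ \left[\frac{\partial h}{\partial u},\ \frac{\partial h}{\partial v}\right] = [y,\ x] \cdot \begin{bmatrix} 2u & -2v \\ 2u & 2v \end{bmatrix}. \]
This further gives
\begin{align*}
\left[\frac{\partial h}{\partial u},\ \frac{\partial h}{\partial v}\right] &= \left[(2u)y + (2u)x,\ (-2v)y + (2v)x\right] \\
&= \left[(2u)(u^2 + v^2) + (2u)(u^2 - v^2),\ (-2v)(u^2 + v^2) + 2v(u^2 - v^2)\right] \\
&= \left[4u^3,\ -4v^3\right].
\end{align*}
Therefore,
\[ \frac{\partial h}{\partial u} = 4u^3 \quad\text{and}\quad \frac{\partial h}{\partial v} = -4v^3. \]
Therefore, the derivative of the composition $h$ is
\[ \left[\frac{\partial h}{\partial u},\ \frac{\partial h}{\partial v}\right] = (f \circ g)'(u, v) = \left[4u^3,\ -4v^3\right]. \]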
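Part (ii) of Example 79 amounts to substituting first, which gives $h(u, v) = (u^2 - v^2)(u^2 + v^2) = u^4 - v^4$, and then differentiating directly. The sketch below compares that direct result with the row-vector-times-matrix product used in part (i). All function names and the sample point $(2, 1)$ are illustrative.

```python
def g(u, v):
    # Inner map: (x, y) = (u^2 - v^2, u^2 + v^2)
    return (u * u - v * v, u * u + v * v)

def f_prime(x, y):
    # Derivative (row) matrix of f(x, y) = xy: [df/dx, df/dy] = [y, x]
    return [y, x]

def g_prime(u, v):
    # Derivative matrix of g: rows correspond to x and y, columns to u and v
    return [[2 * u, -2 * v],
            [2 * u,  2 * v]]

def row_times_matrix(row, matrix):
    # Product of a 1 x m row with an m x n matrix, giving a 1 x n row.
    return [sum(row[k] * matrix[k][j] for k in range(len(row)))
            for j in range(len(matrix[0]))]

def h_prime_chain(u, v):
    x, y = g(u, v)
    return row_times_matrix(f_prime(x, y), g_prime(u, v))

def h_prime_direct(u, v):
    # Derivative of h(u, v) = u^4 - v^4 obtained by direct substitution
    return [4 * u**3, -4 * v**3]

print(h_prime_chain(2, 1), h_prime_direct(2, 1))
```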
Example 80. Compute the derivative matrices using chain rule for the following functions.
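\begin{align*}
\left[\frac{\partial z}{\partial x},\ \frac{\partial z}{\partial y}\right] &= \left[\frac{\partial z}{\partial u},\ \frac{\partial z}{\partial v}\right] \cdot \begin{bmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{bmatrix} \\
&= [2u,\ 2v] \cdot \begin{bmatrix} 2 & 0 \\ 3 & 1 \end{bmatrix} \\
&= [4u + 6v,\ 2v].
\end{align*}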
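where
\[ z'(g(u, v)) = z'(x, y) = \left[\frac{\partial z}{\partial x},\ \frac{\partial z}{\partial y}\right] = \left[2x - y/x^2,\ 1/x\right] \]
and
\[ g'(u, v) = \begin{bmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{bmatrix} = \begin{bmatrix} 1 & -2 \\ 2 & 1 \end{bmatrix}. \]
For $(u, v) = (0, 0)$, we have $(x, y) = (1, -2)$, therefore, we have
\[ z'(1, -2) = \left[2(1) - (-2)/1^2,\ 1/1\right] = [4,\ 1] \]
and
\[ g'(0, 0) = \begin{bmatrix} 1 & -2 \\ 2 & 1 \end{bmatrix}. \]
Thus,
\[ z'(0, 0) = z'(1, -2) \cdot g'(0, 0) = [4,\ 1] \cdot \begin{bmatrix} 1 & -2 \\ 2 & 1 \end{bmatrix} = [6,\ -7]. \]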
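If we write $h(u_1, u_2, \ldots, u_n) = (h_1, h_2, \ldots, h_p)$, then the chain rule can be stated in terms
of derivative matrices as under:
\[
\begin{bmatrix}
\dfrac{\partial h_1}{\partial u_1} & \dfrac{\partial h_1}{\partial u_2} & \cdots & \dfrac{\partial h_1}{\partial u_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial h_p}{\partial u_1} & \dfrac{\partial h_p}{\partial u_2} & \cdots & \dfrac{\partial h_p}{\partial u_n}
\end{bmatrix}
=
\begin{bmatrix}
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_m} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial f_p}{\partial x_1} & \dfrac{\partial f_p}{\partial x_2} & \cdots & \dfrac{\partial f_p}{\partial x_m}
\end{bmatrix}
\cdot
\begin{bmatrix}
\dfrac{\partial x_1}{\partial u_1} & \dfrac{\partial x_1}{\partial u_2} & \cdots & \dfrac{\partial x_1}{\partial u_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial x_m}{\partial u_1} & \dfrac{\partial x_m}{\partial u_2} & \cdots & \dfrac{\partial x_m}{\partial u_n}
\end{bmatrix}.
\]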
ILLUSTRATIVE EXAMPLES
Example 81. Given $g(x, y) = (x^2 + 1, y^2)$ and $f(u, v) = (u + v, u, v^2)$. Compute the
derivative matrix of $f \circ g$ at $(x, y) = (1, 1)$.
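One way to organize the computation in Example 81 is as a product of a $3 \times 2$ and a $2 \times 2$ derivative matrix, following the general matrix form of the chain rule. The sketch below is only illustrative: `matmul`, `f_prime`, `g_prime`, and `composite_prime` are hypothetical names, and the derivative matrices were worked out by hand from the given formulas for $f$ and $g$.

```python
def matmul(A, B):
    # Plain nested-list matrix product: (p x m) times (m x n) gives (p x n).
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def g(x, y):
    return (x * x + 1, y * y)

def g_prime(x, y):
    # Rows: components (x^2 + 1, y^2); columns: variables (x, y)
    return [[2 * x, 0],
            [0, 2 * y]]

def f_prime(u, v):
    # Rows: components (u + v, u, v^2); columns: variables (u, v)
    return [[1, 1],
            [1, 0],
            [0, 2 * v]]

def composite_prime(x, y):
    # (f ∘ g)'(x, y) = f'(g(x, y)) · g'(x, y)
    u, v = g(x, y)
    return matmul(f_prime(u, v), g_prime(x, y))

print(composite_prime(1, 1))
```

As a cross-check, substituting directly gives $(f \circ g)(x, y) = (x^2 + y^2 + 1,\ x^2 + 1,\ y^4)$, whose derivative matrix at $(1, 1)$ has rows $[2, 2]$, $[2, 0]$, $[0, 4]$.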
Chapter 4
In the previous chapters, we explored partial derivatives and the gradient of scalar fields,
tools for analyzing how functions change along principal coordinate directions. A natural
progression is to ask how a function changes in an arbitrary direction. This leads to the
concept of the directional derivative, which generalizes the notion of rate of change by
incorporating both direction and magnitude. The directional derivative quantifies how a function
changes in a given direction from a point and reveals its connection to the gradient vector
as a projection.
In this chapter, we also introduce vector fields, mathematical constructs that assign a vector
to every point in a domain. Unlike scalar fields, vector fields represent quantities such as
velocity or force, which have both magnitude and direction. We examine vector fields
geometrically, with particular attention to electric fields and normal vectors to surfaces. These
concepts deepen our understanding of multivariable behavior, linking analytical techniques
with geometric intuition.
4.1. FROM SCALAR FIELDS TO VECTOR FIELDS
a value to each point $(x, y) \in \mathbb{R}^2$, and similarly, a function $f(x, y, z)$ defines a scalar field in
$\mathbb{R}^3$.
When we take the gradient of such a scalar field, we obtain a new kind of object: a function
that assigns a vector to each point. Specifically, for a differentiable function $f : \mathbb{R}^n \to \mathbb{R}$,
the gradient
\[ \nabla f(x_1, \ldots, x_n) = \left(\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right) \]
produces a vector in $\mathbb{R}^n$ at every point in the domain. This vector points in the direction of
fastest increase of the function and captures local behavior of the scalar field.
This motivates the more general idea of a vector field: a function that directly assigns a
vector to each point in a domain, without requiring that it arise as the gradient of a scalar
function. That is, a vector field on a subset of $\mathbb{R}^n$ is a function
\[ \mathbf{F} : D \subseteq \mathbb{R}^n \to \mathbb{R}^n. \]
Such functions arise naturally when we consider quantities that inherently involve both
magnitude and direction, such as velocity, acceleration, or force.
Thus, vector fields generalize and extend the idea of gradients. Note that every gradient is a
vector field, but not every vector field arises as a gradient. This distinction leads to rich
mathematical structures and questions, such as which vector fields are gradients of scalar
functions, a topic we will revisit under conservative fields.
\[ \mathbf{F}(x, y, z) = (P(x, y, z), Q(x, y, z), R(x, y, z)) = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k}. \]
Vector fields naturally extend multivariable functions by assigning a vector to each point in
a domain D ⊆ Rn . For instance, while a scalar function f (x, y) assigns a number to each
point (x, y), a vector field F(x, y) = (P (x, y), Q(x, y)) assigns a vector. These fields can be
studied independently of physical interpretation, focusing on their structure, continuity, and
geometric behavior. The following examples illustrate this mathematical perspective.
Example 82. The function $\mathbf{F}(x, y) = (x + y, 2xy) = (x + y)\mathbf{i} + (2xy)\mathbf{j}$ is a
two-dimensional vector field. To the point $(1, 3)$ it assigns the vector $\mathbf{F}(1, 3) = (4, 6) =
4\mathbf{i} + 6\mathbf{j}$. To the point $(2, 3)$ it assigns the vector $\mathbf{F}(2, 3) = (5, 12) = 5\mathbf{i} + 12\mathbf{j}$.
Example 83. The function $\mathbf{F}(x, y, z) = (x, x - z, 2yz) = x\mathbf{i} + (x - z)\mathbf{j} + 2yz\mathbf{k}$ is a vector
field in space $\mathbb{R}^3$. To the point $(4, 1, -3)$ it assigns the vector $\mathbf{F}(4, 1, -3) = (4, 7, -6) =
4\mathbf{i} + 7\mathbf{j} - 6\mathbf{k}$. Similarly, for other points, we may find the associated vectors.
Example 84 (Tangent Vector Field on a Circle). Consider the unit circle in the plane,
given by $x^2 + y^2 = 1$. A vector field that assigns to each point on the circle a unit tangent
vector in the counterclockwise direction is
\[ \mathbf{T}(x, y) = (-y, x) = -y\,\mathbf{i} + x\,\mathbf{j}. \]
This field is tangent to the circle at every point because the dot product $\mathbf{T}(x, y) \cdot (x, y) =
-yx + xy = 0$ vanishes identically. Thus, $\mathbf{T}$ is orthogonal to the radial vector $(x, y)$, and hence
tangent to the circle. For example, at the point $(1, 0)$, the tangent vector is $\mathbf{T}(1, 0) = (0, 1)$,
pointing in the counterclockwise direction.
Example 85 (Normal Vector Field on a Circle). Let us again consider the unit circle
$x^2 + y^2 = 1$. A natural choice for a normal vector field (pointing radially outward) is
\[ \mathbf{N}(x, y) = (x, y) = x\,\mathbf{i} + y\,\mathbf{j}. \]
This field assigns to each point on the circle the position vector itself, which is normal to
the circle at that point. The field points outward and has unit length since $\|\mathbf{N}(x, y)\| =
\sqrt{x^2 + y^2} = 1$ on the circle. For instance, at the point $(0, 1)$, the normal vector is $\mathbf{N}(0, 1) = (0, 1)$,
pointing vertically upward.
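The claims in Examples 84 and 85 can be spot-checked at sample points of the circle. The sketch below assumes the tangent field $\mathbf{T}(x, y) = (-y, x)$, as implied by the dot product $-yx + xy$ in Example 84, and the radial field $\mathbf{N}(x, y) = (x, y)$. The function names and the choice of twelve sample angles are illustrative.

```python
import math

def tangent_field(x, y):
    # Assumed form of T, inferred from the dot product -yx + xy = 0
    return (-y, x)

def normal_field(x, y):
    # Outward radial field: the position vector itself
    return (x, y)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

max_dot = 0.0          # largest |T · N| seen on the sample points
max_norm_error = 0.0   # largest deviation of |T| or |N| from 1

for k in range(12):
    theta = 2 * math.pi * k / 12
    p = (math.cos(theta), math.sin(theta))   # a point on the unit circle
    t = tangent_field(*p)
    n = normal_field(*p)
    max_dot = max(max_dot, abs(dot(t, n)))
    max_norm_error = max(max_norm_error,
                         abs(math.hypot(*t) - 1.0),
                         abs(math.hypot(*n) - 1.0))

print(max_dot, max_norm_error)
print(tangent_field(1, 0), normal_field(0, 1))
```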
There are physical motivations too for consideration of vector fields. We know that physical
vector quantities such as velocity and force may change from point to point. Therefore, they
are often modelled by vector fields. The velocity field represents speed and direction (at any
point) of a moving fluid in space and force fields (such as magnetic or gravitational) give
strength and direction of the force at any point in space.
Example 86 (Velocity Vector Field). Let a fluid flow in space be described by the velocity
vector field
\[ \mathbf{v}(x, y, z) = (-y, x, 0). \]
This field assigns to each point $(x, y, z)$ a velocity vector indicating the direction and speed
of the fluid at that point. For example, at the point $(2, 3, 0)$, we have
\[ \mathbf{v}(2, 3, 0) = (-3, 2, 0) = -3\hat{\imath} + 2\hat{\jmath}. \]
This means that a fluid particle at $(2, 3, 0)$ moves in the direction of the vector $-3\hat{\imath} + 2\hat{\jmath}$, i.e.,
perpendicular to the position vector in the $xy$-plane, suggesting a rotational flow around the
$z$-axis. The vector field $\mathbf{v}$ thus models a swirling or vortex-type motion in the plane.
Example 87 (Spring Force Field). Consider a force field representing the restoring force
in a spring-like medium, governed by Hooke’s Law. In two dimensions, it is given by
F(x, y) = −k(x, y),
where k > 0 is the spring constant. This force pulls the particle back toward the origin and
increases in magnitude as the particle moves farther away. For example, if k = 2, then at
the point (1, −3), the force is
\[ \mathbf{F}(1, -3) = -2(1, -3) = (-2, 6) = -2\hat{\imath} + 6\hat{\jmath}. \]
This means a particle at (1, −3) experiences a force directed toward the origin, trying to
restore equilibrium. This field is radial and linear, making it a simple and classic example
of a conservative force field.
Example 88 (Gradient Vector Field). Let $f(x, y, z) = x^2 + y^2 + z^2$ be a scalar field
defined on $\mathbb{R}^3$. Its gradient is given by
\[ \nabla f = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) = (2x, 2y, 2z). \]
This expression defines a vector field in space $\mathbb{R}^3$, since to every point $(x, y, z)$, the gradient
assigns a vector. For example, at the point $(1, 2, 3)$, we have
\[ \nabla f(1, 2, 3) = (2, 4, 6) = 2\hat{\imath} + 4\hat{\jmath} + 6\hat{k}. \]
Thus, the gradient acts as a rule that associates a vector to each point in space, making it
a vector field.
Example 89 (Gravitational Vector Field). According to Newton's law of universal
gravitation, any two point masses attract each other with a force that is proportional to the
product of their masses and inversely proportional to the square of the distance between them.
The force is directed along the line joining the two masses and is given by the formula:
\[ \mathbf{F} = -\frac{G m_1 m_2}{r^2}\,\hat{\mathbf{r}}, \]
where $G = 6.67430 \times 10^{-11}\ \mathrm{m^3\,kg^{-1}\,s^{-2}}$ is the universal gravitational constant, $m_1$ and $m_2$
are the two masses, $r$ is the distance between them, and $\hat{\mathbf{r}}$ is the unit vector pointing from the
mass exerting the force toward the mass experiencing it. The negative sign signifies that the
force is attractive.
This leads naturally to the concept of a gravitational vector field, which describes the
gravitational force exerted by a fixed mass $M$ on a unit mass placed at various points in space.
If $M$ is located at the origin, then the gravitational field at a point $(x, y, z) \in \mathbb{R}^3 \setminus \{(0, 0, 0)\}$
is given by:
\[ \mathbf{G}(x, y, z) = -\frac{GM}{(x^2 + y^2 + z^2)^{3/2}}\,(x\hat{\imath} + y\hat{\jmath} + z\hat{k}). \]
This vector field points toward the origin, and its magnitude is inversely proportional to the
square of the distance from the origin.
Numerical Example: Suppose the fixed mass is $M = 5.972 \times 10^{24}$ kg (mass of Earth) and
we evaluate the gravitational field at the point $(6.371 \times 10^6, 0, 0)$, which lies on the surface
of Earth, assuming that the Earth is a perfect sphere of radius $R = 6.371 \times 10^6$ m. Then,
we get
\[ \mathbf{G}(6.371 \times 10^6, 0, 0) = -\frac{(6.67430 \times 10^{-11})(5.972 \times 10^{24})}{(6.371 \times 10^6)^3}\,(6.371 \times 10^6)\,\hat{\imath}. \]
Simplifying:
\[ \mathbf{G}(6.371 \times 10^6, 0, 0) \approx -9.8\,\hat{\imath}\ \mathrm{m/s^2}, \]
which means a unit mass (1 kg) placed at that point experiences a force of magnitude about 9.8 N directed
toward the center of Earth, along the negative $x$-axis.
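The arithmetic of the numerical example can be reproduced with a short function that evaluates the field formula of Example 89. This is a sketch using the constants quoted in the text; the names `G_CONST`, `M_EARTH`, and `gravitational_field` are illustrative.

```python
G_CONST = 6.67430e-11   # universal gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24      # mass of Earth in kg, as quoted in the text

def gravitational_field(x, y, z, mass=M_EARTH):
    """Field of a point mass at the origin: -G M (x, y, z) / |r|^3."""
    r_squared = x * x + y * y + z * z
    if r_squared == 0:
        raise ValueError("the field is undefined at the origin")
    factor = -G_CONST * mass / r_squared**1.5
    return (factor * x, factor * y, factor * z)

gx, gy, gz = gravitational_field(6.371e6, 0.0, 0.0)
print(gx, gy, gz)   # x-component is close to -9.82 m/s^2, others vanish
```

Because the point lies on the positive $x$-axis, only the $x$-component is nonzero, and its negative sign confirms that the field points back toward the origin.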
Example 90 (Electric Field of a Point Charge). According to Coulomb's law, the
electric force between two point charges is proportional to the product of the charges and
inversely proportional to the square of the distance between them. The force acts along the
line joining the charges and is given by the formula:
\[ \mathbf{F} = k\,\frac{q_1 q_2}{r^2}\,\hat{\mathbf{r}}, \]
where $k = \dfrac{1}{4\pi\varepsilon_0} \approx 8.9875 \times 10^{9}\ \mathrm{N \cdot m^2/C^2}$ is known as Coulomb's constant, $q_1$ and $q_2$
are the charges, $r$ is the distance between them, and $\hat{\mathbf{r}}$ is
the unit vector pointing from the charge exerting the force to the one experiencing it.
From this principle, we define the electric field generated by a point charge $q$ located at the
origin. The electric field at a point $(x, y, z) \in \mathbb{R}^3 \setminus \{(0, 0, 0)\}$ is the force per unit positive
charge placed at that point and is given by:
\[ \mathbf{E}(x, y, z) = k\,\frac{q}{(x^2 + y^2 + z^2)^{3/2}}\,(x\hat{\imath} + y\hat{\jmath} + z\hat{k}). \]
This defines a vector field in space, where each point is associated with a vector pointing
away from the origin if $q > 0$ (positive charge) or toward the origin if $q < 0$ (negative charge).
The magnitude of the field is inversely proportional to the square of the distance from the origin.
This shows that at $(1, 0, 0)$, the electric field vector points in the positive $x$-direction with
magnitude $kq$ (for $q > 0$). Thus, the electric field describes the influence of a point charge
on its surrounding space, assigning a vector to every point that indicates the direction and
strength of the electric force on a unit charge.
In particular, for a charge of $q = 1$ C at the origin, the electric field at the point $(1, 0, 0)$
(distances in meters) is approximately $8.9875 \times 10^{9}$ N/C in the positive $x$-direction.
assigns a vector (an arrow) to each point in the plane. In the example above, only the arrows
are shown originating from the origin, while the points to which they are associated are com-
pletely disregarded. Thus, this approach neglects the crucial point-to-vector correspondence.
To correctly represent a vector field, we must associate each vector with its corresponding
point. This can be achieved by translating each arrow so that it starts at the input point
(x, y). In other words, instead of placing the vector at the origin, we draw the arrow with its
tail at (x, y) and head at (x + P, y + Q). This is shown as the red arrow in the figure above.
In this way, both the input point and the output vector are represented, revealing the full
structure of the vector field.
Example 91. Consider the vector field $\mathbf{F}(x, y) = x\mathbf{i} - y\mathbf{j}$, where $P(x, y) = x$
and $Q(x, y) = -y$. To plot this vector field, we compute a few sample vectors:
F(1, 1) = i − j = (1, −1),
F(0, 1) = −j = (0, −1),
F(1, −2) = i + 2j = (1, 2),
F(−2, 1) = −2i − j = (−2, −1).
Now, we plot these four vectors by placing each one at its corresponding initial point
$(x, y)$, with the arrow pointing to the terminal point $(x + P, y + Q)$: from $(1, 1)$ to $(2, 0)$,
from $(0, 1)$ to $(0, 0)$, from $(1, -2)$ to $(2, 0)$, and from $(-2, 1)$ to $(-4, 0)$.
These vectors are shown in the figure (4.1.1). Plotting a few vectors gives some
idea of the vector field's behavior, but software tools such as GeoGebra can generate
many more vectors, offering a more detailed visualization. A computer-generated plot of this
vector field is shown in the figure (4.1.2).
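The tail-to-head rule described above is easy to tabulate. The sketch below computes, for the four sample points of Example 91, the pair (initial point, terminal point) of each arrow. The names `F`, `sample_points`, and `arrows` are illustrative, and no plotting library is assumed.

```python
def F(x, y):
    # The field of Example 91: P(x, y) = x, Q(x, y) = -y
    return (x, -y)

sample_points = [(1, 1), (0, 1), (1, -2), (-2, 1)]

arrows = []
for (x, y) in sample_points:
    P, Q = F(x, y)
    tail = (x, y)            # arrow starts at the input point
    head = (x + P, y + Q)    # and ends at (x + P, y + Q)
    arrows.append((tail, head))

for tail, head in arrows:
    print(tail, "->", head)
```

Feeding such (tail, head) pairs to any plotting tool reproduces the hand-drawn picture; a denser grid of sample points gives the kind of computer-generated plot mentioned in the text.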
Example 92. A computer-generated graphic visualization of the vector field F(x, y) = yi−xj
is shown below.
4.2. GRADIENT VECTOR FIELD
\[ \nabla f(x, y) = (f_x(x, y), f_y(x, y)) = \frac{\partial f}{\partial x}(x, y)\,\mathbf{i} + \frac{\partial f}{\partial y}(x, y)\,\mathbf{j}. \]
Example 93. Plot the gradient vector field of the function $z = x^2 + y^2$. Also, draw a level
curve of the surface that passes through $(1, 0)$. Draw a tangent to this level curve at the
point $(1, 0)$ and also plot $\nabla f(1, 0)$ at that point. What relation can be seen between the gradient
vector $\nabla f(1, 0)$ and the tangent vector to the level curve at that point?
Solution. The gradient vector field of the function $z = x^2 + y^2$ at the point $(x, y)$ is given by
\[ \nabla f(x, y) = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} = 2x\,\mathbf{i} + 2y\,\mathbf{j}. \]
The gradient vector at the point $(1, 0)$ is given by
\[ \nabla f(1, 0) = 2\mathbf{i} + 0\mathbf{j} = 2\mathbf{i}. \]
A plot of the gradient vector field $\nabla f(x, y)$ is given below: A level curve that passes through
4.3. GRADIENT AND DIRECTIONAL DERIVATIVE
Example 94. Find the gradient of the function $f(x, y) = \log(x^2 + y^2)$ at the point $(1, 1)$.
Sketch the gradient vector together with the level curve that passes through the point.
Solution. Left as an exercise.
Taking the limit as $t \to 0$, we obtain the directional derivative of $f$ at the point $p = (a, b)$ in the
direction of the unit vector $\mathbf{v}$:
\[ D_{\mathbf{v}} f(a, b) = \lim_{t \to 0} \frac{f(p + t\mathbf{v}) - f(p)}{t}. \]
In this expression, the point $p$ and the vector $\mathbf{v}$ are fixed. Therefore, $p + t\mathbf{v}$ is just a function
of $t$. Define
\[ \mathbf{c}(t) = p + t\mathbf{v}, \quad\text{so that}\quad \mathbf{c}'(t) = \mathbf{v}. \]
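At $t = 0$, we obtain:
\[ g'(0) = \nabla f(\mathbf{c}(0)) \cdot \mathbf{c}'(0). \]
Since $\mathbf{c}(0) = p = (a, b)$ and $\mathbf{c}'(0) = \mathbf{v}$, we have:
\[ D_{\mathbf{v}} f(a, b) = \nabla f(a, b) \cdot \mathbf{v}. \]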
Thus, the directional derivative of a function $f(x, y)$ in the direction $\mathbf{v}$ is the component of
the gradient vector $\nabla f(x, y)$ in that direction.
In other words, the directional derivative captures the instantaneous rate of change of the
surface elevation as one walks away from the point $(a, b)$ in the direction of $\mathbf{v}$. When $\mathbf{v}$ aligns
with the coordinate axes, this reduces to the familiar partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$.
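By the geometric form of the dot product,
\[ D_{\mathbf{v}} f(a, b) = \nabla f(a, b) \cdot \mathbf{v} = \|\nabla f(a, b)\|\,\|\mathbf{v}\| \cos\theta, \]
where $\theta$ is the angle between the gradient vector $\nabla f(a, b)$ and the direction vector $\mathbf{v}$. Since
$\|\mathbf{v}\| = 1$, the formula simplifies to:
\[ D_{\mathbf{v}} f(a, b) = \|\nabla f(a, b)\| \cos\theta. \]
Since $-1 \le \cos\theta \le 1$, the value of the directional derivative depends on the alignment of
the direction vector $\mathbf{v}$ with the gradient. The fastest change in the function occurs when
$\cos\theta = 1$, i.e., $\theta = 0$, meaning $\mathbf{v}$ is in the same direction as the gradient vector. In this case,
the function is increasing the fastest, and the maximum rate of change is:
\[ \max_{\mathbf{v}} D_{\mathbf{v}} f(a, b) = \|\nabla f(a, b)\|. \]
Conclusion: The gradient vector $\nabla f(a, b)$ points in the direction where the function
increases the fastest, and its magnitude $\|\nabla f(a, b)\|$ gives the fastest rate of change. The
function decreases the fastest in the opposite direction.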
ILLUSTRATIVE EXAMPLES
Example 95. Let $f(x, y) = x^2 y + y^3$. Compute the directional derivative at the point $(1, 2)$
in the direction of the vector $\mathbf{v} = (3, 4)$.
Example 96. Let $f(x, y) = e^x \sin y$. Find the directional derivative at the point $(0, \pi/2)$ in
the direction of the vector $\mathbf{v} = (1, -1)$.
Example 97. Let $f(x, y) = \log(x^2 + y^2)$. Find the directional derivative at the point $(1, -1)$
in the direction pointing from $(1, -1)$ to $(2, 1)$.
Solution. The gradient vector of the function $f$ is given by
\[ \nabla f(x, y) = \left(\frac{2x}{x^2 + y^2},\ \frac{2y}{x^2 + y^2}\right). \]
At $(1, -1)$, the gradient is the vector
\[ \nabla f(1, -1) = \left(\frac{2 \cdot 1}{2},\ \frac{2 \cdot (-1)}{2}\right) = (1, -1). \]
The vector from the initial point $(1, -1)$ to the point $(2, 1)$ is $\mathbf{v} = (1, 2)$, so the unit vector along
this direction is given by
\[ \hat{\mathbf{v}} = \frac{1}{\sqrt{1^2 + 2^2}}\,(1, 2) = \left(\frac{1}{\sqrt{5}},\ \frac{2}{\sqrt{5}}\right). \]
Thus, the required directional derivative is
\[ D_{\mathbf{v}} f(1, -1) = (1, -1) \cdot \left(\frac{1}{\sqrt{5}},\ \frac{2}{\sqrt{5}}\right) = \frac{1 - 2}{\sqrt{5}} = -\frac{1}{\sqrt{5}}. \]
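Example 98. Let $f(x, y, z) = x^2 yz + yz^2$. Compute the directional derivative at the point
$(1, 2, 3)$ in the direction of the vector $\mathbf{v} = (2, -1, 2)$.
Solution. The gradient vector $\nabla f(x, y, z)$ of the function $f$ is given by
\[ \nabla f = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right) = (2xyz,\ x^2 z + z^2,\ x^2 y + 2yz). \]
We evaluate the gradient at $(1, 2, 3)$ as below:
\[ \nabla f(1, 2, 3) = (12,\ 3 + 9,\ 2 + 12) = (12, 12, 14). \]
Next, we compute the unit vector in the direction of $\mathbf{v} = (2, -1, 2)$:
\[ \hat{\mathbf{v}} = \frac{\mathbf{v}}{\|\mathbf{v}\|} = \frac{(2, -1, 2)}{\sqrt{2^2 + (-1)^2 + 2^2}} = \left(\frac{2}{3},\ -\frac{1}{3},\ \frac{2}{3}\right). \]
Now, the directional derivative is given by
\begin{align*}
D_{\mathbf{v}} f(1, 2, 3) &= \nabla f(1, 2, 3) \cdot \hat{\mathbf{v}} \\
&= (12, 12, 14) \cdot \left(\frac{2}{3},\ -\frac{1}{3},\ \frac{2}{3}\right) \\
&= \frac{1}{3}(24 - 12 + 28) \\
&= \frac{40}{3}.
\end{align*}
Thus, the directional derivative of $f$ at $(1, 2, 3)$ in the direction of $\mathbf{v}$ is $\frac{40}{3}$.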
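The recipe used in these examples (evaluate the gradient, normalize the direction, take the dot product) can be collected into one small function. The following is a sketch only, with illustrative names, and the gradient of Example 98 entered by hand.

```python
import math

def directional_derivative(gradient, direction):
    """Dot product of the gradient with the unit vector along `direction`."""
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0:
        raise ValueError("direction must be a nonzero vector")
    return sum(g * c for g, c in zip(gradient, direction)) / norm

def grad_f(x, y, z):
    # Gradient of f(x, y, z) = x^2 yz + yz^2, computed by hand
    return (2 * x * y * z, x * x * z + z * z, x * x * y + 2 * y * z)

value = directional_derivative(grad_f(1, 2, 3), (2, -1, 2))
print(value)   # compare with 40/3 from the worked solution
```

Normalizing inside the function means the caller may pass any nonzero direction vector, which guards against the common slip of forgetting to convert $\mathbf{v}$ to a unit vector.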
Example 99. Let $f(x, y) = 3x^2 + 4xy + y^2$. Find the direction and value of the maximum
and minimum rate of change of $f$ at the point $(1, 2)$.
Solution. First, we compute the gradient of $f$ as below:
\[ \nabla f(x, y) = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right) = (6x + 4y,\ 4x + 2y). \]
At $(1, 2)$, the gradient is given by
\[ \nabla f(1, 2) = (14, 8) = 14\mathbf{i} + 8\mathbf{j}. \]
The maximum rate of change at a point is given by the norm of the gradient vector at that
point. Therefore, at the point $(1, 2)$, the maximum rate of change is given by
\[ \max D_{\mathbf{v}} f(1, 2) = \|\nabla f(1, 2)\| = \sqrt{14^2 + 8^2} = \sqrt{196 + 64} = \sqrt{260}. \]
Also, the direction of maximum rate of change, i.e., the direction of fastest increase, is the
direction given by the gradient vector itself. A unit vector along this direction is given by
\[ \hat{\mathbf{v}} = \frac{\nabla f(1, 2)}{\|\nabla f(1, 2)\|} = \frac{1}{\sqrt{260}}\,(14\mathbf{i} + 8\mathbf{j}). \]
The minimum rate of change at a point is just the negative of the norm of the gradient vector
at that point. Therefore, the minimum rate of change at $(1, 2)$ is given by
\[ \min D_{\mathbf{v}} f(1, 2) = -\|\nabla f(1, 2)\| = -\sqrt{260}, \]
and it occurs in the direction opposite to that of the gradient vector at that point.
Thus, the direction of fastest decrease is given by the unit vector
\[ \hat{\mathbf{u}} = -\frac{\nabla f(1, 2)}{\|\nabla f(1, 2)\|} = \frac{1}{\sqrt{260}}\,(-14\mathbf{i} - 8\mathbf{j}). \]
Example 100. Let $f(x, y, z) = xyz$. Find the maximum and minimum rate of change of $f$
at the point $(1, -1, 2)$ and the directions in which they occur.
Solution. The gradient is $\nabla f(x, y, z) = (yz, xz, xy)$. At the point $(1, -1, 2)$, we have
\[ \nabla f(1, -1, 2) = (-2, 2, -1). \]
The maximum rate of change at the point $(1, -1, 2)$ is given by
\[ \|\nabla f(1, -1, 2)\| = \sqrt{(-2)^2 + 2^2 + (-1)^2} = \sqrt{4 + 4 + 1} = \sqrt{9} = 3, \]
and the direction of maximum rate of change is the direction given by
\[ \hat{\mathbf{v}} = \frac{1}{3}(-2, 2, -1) = \frac{1}{3}(-2\mathbf{i} + 2\mathbf{j} - \mathbf{k}). \]
Similarly, the minimum rate of change at the point $(1, -1, 2)$ is $-\|\nabla f(1, -1, 2)\| = -3$. Also,
the direction of minimum rate of change is
\[ \hat{\mathbf{u}} = -\hat{\mathbf{v}} = -\frac{1}{3}(-2\mathbf{i} + 2\mathbf{j} - \mathbf{k}). \]
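The pattern in Examples 99 and 100 is always the same: the maximum rate is the norm of the gradient, attained along the normalized gradient, and the minimum is its negative along the opposite direction. A minimal sketch, with the hypothetical helper `steepest_ascent` and the gradient of Example 100 entered by hand:

```python
import math

def steepest_ascent(gradient):
    """Return (maximum rate of change, unit direction of fastest increase)."""
    norm = math.sqrt(sum(c * c for c in gradient))
    if norm == 0:
        raise ValueError("zero gradient: no preferred direction")
    return norm, tuple(c / norm for c in gradient)

def grad_f(x, y, z):
    # Gradient of f(x, y, z) = xyz
    return (y * z, x * z, x * y)

max_rate, direction = steepest_ascent(grad_f(1, -1, 2))
min_rate = -max_rate
opposite = tuple(-c for c in direction)   # direction of fastest decrease

print(max_rate, direction)
print(min_rate, opposite)
```

The explicit check for a zero gradient matters: at a critical point the directional derivative vanishes in every direction, so no unit vector can be singled out.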
4.4. GRADIENT AND ORTHOGONALITY TO LEVEL SETS
Example 101. Suppose $f(x, y) = e^x \sin y$. Find the direction of fastest increase and the
maximum rate of change of $f$ at the point $(0, \pi/2)$.
Solution. The gradient is given by $\nabla f(x, y) = (e^x \sin y,\ e^x \cos y)$. At the point $(0, \pi/2)$,
this becomes
\[ \nabla f(0, \pi/2) = (1 \cdot 1,\ 1 \cdot 0) = (1, 0). \]
Thus, the maximum rate of change is $\|\nabla f(0, \pi/2)\| = \sqrt{1^2 + 0^2} = 1$ and the
direction of fastest increase is given by
\[ \hat{\mathbf{v}} = \frac{\nabla f(0, \pi/2)}{\|\nabla f(0, \pi/2)\|} = (1, 0). \]
The minimum rate of change is $-1$, in the direction of the vector $(-1, 0)$.
Example 102. The temperature at a point $(x, y)$ in a metal plate is given by $T(x, y) =
100 - x^2 - y^2$, where $T$ is in degrees Celsius and $x, y$ are in meters. A bug is sitting at the
point $(2, 1)$. In which direction should the bug move to warm up as quickly as possible?
Also, find the maximum rate of increase in temperature at that point.
Solution. We begin by computing the gradient of the temperature function.
\[ \nabla T(x, y) = \left(\frac{\partial T}{\partial x}, \frac{\partial T}{\partial y}\right) = (-2x, -2y). \]
At the point $(2, 1)$, we get $\nabla T(2, 1) = (-4, -2)$. This vector points in the direction of the
steepest temperature increase. So, the bug should move in the direction:
\[ \hat{\mathbf{v}} = \frac{1}{\sqrt{(-4)^2 + (-2)^2}}\,(-4, -2) = \frac{1}{\sqrt{20}}\,(-4, -2). \]
The maximum rate of increase is given by the magnitude of the gradient:
\[ \|\nabla T(2, 1)\| = \sqrt{(-4)^2 + (-2)^2} = \sqrt{20} = 2\sqrt{5}. \]
Therefore, the bug should move in the direction $\frac{1}{\sqrt{20}}(-4\mathbf{i} - 2\mathbf{j})$, i.e., toward the origin, and the
temperature will increase at a rate of $2\sqrt{5}$ degrees Celsius per meter.
Let c(t) = (x(t), y(t)) be a smooth parameterization of this level curve. Let (a, b) be a point
on the curve corresponding to the parameter value t = t0 . Then c(t0 ) = (x(t0 ), y(t0 )) = (a, b)
and $f(a, b) = c$. Since every point on the level curve satisfies $f(x(t), y(t)) = c$, we differentiate
both sides with respect to $t$:
\[ \frac{d}{dt} f(\mathbf{c}(t)) = 0. \]
Applying the chain rule:
\[ \nabla f(\mathbf{c}(t)) \cdot \mathbf{c}'(t) = 0. \]
Evaluating this at $t = t_0$, we get:
\[ \nabla f(\mathbf{c}(t_0)) \cdot \mathbf{c}'(t_0) = 0, \quad\text{i.e.,}\quad \nabla f(a, b) \cdot \mathbf{c}'(t_0) = 0. \]
Thus, we conclude that the gradient $\nabla f(a, b)$ is perpendicular to the tangent line to the
level curve $f(x, y) = c$ at the point $(a, b)$. The equation of the tangent line at the point
$(a, b)$ is then given by
\[ f_x(a, b)(x - a) + f_y(a, b)(y - b) = 0. \]
\[ \frac{d}{dt} f(\mathbf{c}(t)) = 0. \]
By the chain rule, this gives
\[ \nabla f(\mathbf{c}(t)) \cdot \mathbf{c}'(t) = 0. \]
At $t = t_0$, if $\mathbf{c}(t_0) = (a, b, c)$, then:
\[ \nabla f(a, b, c) \cdot \mathbf{c}'(t_0) = 0. \]
So, the gradient vector $\nabla f(a, b, c)$ is perpendicular to the tangent vector of any curve lying
on the level surface and passing through $(a, b, c)$. This implies that the gradient vector is
normal to the surface itself at that point. Therefore, we can write
Thus, we conclude that the gradient $\nabla f(a, b, c)$ is perpendicular to the tangent plane of the
level surface $f(x, y, z) = k$ at the point $(a, b, c)$. The equation of the tangent plane at
the point $(a, b, c)$ is then given by
\[ f_x(a, b, c)(x - a) + f_y(a, b, c)(y - b) + f_z(a, b, c)(z - c) = 0. \]
Conclusion: In both two and three dimensions, the gradient vector of a scalar function at
a point is orthogonal to the level set (curve or surface) that passes through that point. In
particular:
• In 2D, $\nabla f(x, y)$ is perpendicular to the tangent line of the level curve.
• In 3D, $\nabla f(x, y, z)$ is perpendicular to the tangent plane of the level surface.
The gradient vector is normal to the level set and points in the direction of fastest increase.
ILLUSTRATIVE EXAMPLES
Example 103. Find a normal vector and the equation of the tangent line to the circle $x^2 + y^2 = 25$ at the point $(3, 4)$.
Solution. Let $f(x, y) = x^2 + y^2$. Then $\nabla f(x, y) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) = (2x, 2y)$. At the point $(3, 4)$, the gradient is $\nabla f(3, 4) = (6, 8)$, which is a normal vector to the circle at $(3, 4)$. The equation of the tangent line to the circle at $(3, 4)$ is therefore
\[ 6(x - 3) + 8(y - 4) = 0, \]
that is, $3x + 4y = 25$.
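As a small illustration, the step from a normal vector to a line equation can be written as a helper. The function name `tangent_line` is hypothetical; it returns the coefficients of $Ax + By = C$ for the line through a given point with a given normal vector.

```python
def tangent_line(grad, point):
    """Return (A, B, C) such that A*x + B*y = C is the line through
    `point` whose normal vector is `grad`."""
    a, b = grad
    x0, y0 = point
    return (a, b, a * x0 + b * y0)

# Example 103: normal (6, 8) at the point (3, 4).
A, B, C = tangent_line((6, 8), (3, 4))
print(A, B, C)  # 6 8 50

# The direction (-B, A) lies along the line and is orthogonal to the normal.
direction_dot_normal = (-B) * A + A * B
```

Dividing $6x + 8y = 50$ by 2 gives the same line in the form $3x + 4y = 25$.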
4.5. ELECTRIC FIELD (OPTIONAL)
Solution. Let $f(x, y, z) = x^2 + y^2 + z^2$. Then $\nabla f(x, y, z) = (2x, 2y, 2z)$. At the point $(6, 0, 0)$, we have $\nabla f(6, 0, 0) = (12, 0, 0)$, which is a normal vector to the sphere at the point $(6, 0, 0)$. Also, the equation of the tangent plane is
\[ 12(x - 6) + 0(y - 0) + 0(z - 0) = 0. \]
Simplifying, we get $x = 6$.
Example 106. Find a normal vector and the equation of the tangent plane to the ellipsoid $\frac{x^2}{4} + \frac{y^2}{9} + \frac{z^2}{16} = 1$ at the point $(2, 0, 0)$.
Solution. Let $f(x, y, z) = \frac{x^2}{4} + \frac{y^2}{9} + \frac{z^2}{16}$. Then $\nabla f(x, y, z) = \left( \frac{x}{2}, \frac{2y}{9}, \frac{z}{8} \right)$. At the point $(2, 0, 0)$, the gradient becomes $\nabla f(2, 0, 0) = (1, 0, 0)$, which is a normal vector to the ellipsoid at $(2, 0, 0)$. So, the equation of the tangent plane is:
\[ 1(x - 2) + 0(y - 0) + 0(z - 0) = 0, \]
that is, $x = 2$.
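When the partial derivatives are awkward to compute by hand, the normal vector can be approximated numerically. The following is a minimal sketch, assuming a central-difference approximation with a hypothetical step size `h`; it is a check on the worked example, not a method used in the text.

```python
def f(x, y, z):
    """Left-hand side of the ellipsoid equation x^2/4 + y^2/9 + z^2/16 = 1."""
    return x**2 / 4 + y**2 / 9 + z**2 / 16

def numerical_gradient(func, point, h=1e-6):
    """Central-difference approximation of the gradient of `func` at `point`."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return grad

point = (2.0, 0.0, 0.0)
normal = numerical_gradient(f, point)

# Tangent plane: normal . (X - point) = 0, i.e. normal . X = d.
d = sum(n * p for n, p in zip(normal, point))
print(normal, d)  # approximately [1, 0, 0] and 2, i.e. the plane x = 2
```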
If the International System of Units (SI) is used, the magnitude of the electric force $F$ is measured in newtons (N), the electric charges $Q_1$ and $Q_2$ are measured in coulombs (C), and the distance $r$ between the charges is measured in meters (m). The constant $k = \frac{1}{4\pi\varepsilon_0}$ is known as Coulomb's constant and has a value of approximately $8.99 \times 10^9\ \mathrm{N \cdot m^2 / C^2}$.
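Since force is a vector quantity, it is better to express it in vector terms. If $\mathbf{r}_1$ locates $Q_1$ and $\mathbf{r}_2$ locates $Q_2$, then the vector $\mathbf{R}_{12} = \mathbf{r}_2 - \mathbf{r}_1$ represents the directed line segment from $Q_1$ to $Q_2$, and $R_{12} = |\mathbf{R}_{12}|$ is the distance between the charges. The vector $\mathbf{F}_2$ is the force on $Q_2$. The vector form of Coulomb's law is
\[ \mathbf{F}_2 = \frac{1}{4\pi\varepsilon_0} \frac{Q_1 Q_2}{R_{12}^3} (\mathbf{r}_2 - \mathbf{r}_1). \]
If $Q_1$ and $Q_2$ have the same sign, $\mathbf{F}_2$ points along $\mathbf{R}_{12}$ (repulsion); if they have opposite signs, it points the opposite way (attraction). If we take a unit vector $\mathbf{a}_{12}$ in the direction of $\mathbf{R}_{12}$, then Coulomb's law can be written as
\[ \mathbf{F}_2 = \frac{1}{4\pi\varepsilon_0} \frac{Q_1 Q_2}{R_{12}^2} \mathbf{a}_{12}. \]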
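Example 107. Let a charge $Q_1 = 3 \times 10^{-4}$ C be located at $M(1, 2, 3)$ and a charge $Q_2 = -10^{-4}$ C be at $N(2, 0, 5)$ in a vacuum. Find the force on $Q_2$ by $Q_1$.
Solution. Here $\mathbf{R}_{12} = \mathbf{r}_2 - \mathbf{r}_1 = (2 - 1)\mathbf{i} + (0 - 2)\mathbf{j} + (5 - 3)\mathbf{k} = \mathbf{i} - 2\mathbf{j} + 2\mathbf{k}$, so $R_{12} = \sqrt{1 + 4 + 4} = 3$ and $\mathbf{a}_{12} = \frac{1}{3}(\mathbf{i} - 2\mathbf{j} + 2\mathbf{k})$. Taking $\varepsilon_0 \approx \frac{1}{36\pi} \times 10^{-9}$ F/m, we get
\[ \mathbf{F}_2 = \frac{3 \times 10^{-4}\,(-10^{-4})}{4\pi (1/36\pi)(10^{-9})\, 3^2} \left( \frac{\mathbf{i} - 2\mathbf{j} + 2\mathbf{k}}{3} \right) = -30 \left( \frac{\mathbf{i} - 2\mathbf{j} + 2\mathbf{k}}{3} \right) \text{N}. \]
The magnitude of the force is 30 N. Because of the negative sign, the force acts opposite to the unit vector $\mathbf{a}_{12}$, that is, $Q_2$ is attracted toward $Q_1$.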
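The arithmetic in this example can be reproduced with a short script. This is a sketch under stated assumptions: it uses the rounded permittivity $\varepsilon_0 \approx \frac{1}{36\pi} \times 10^{-9}$ F/m and $Q_1 = 3 \times 10^{-4}$ C, the values that appear in the computation above, and the function name `coulomb_force_on_q2` is hypothetical.

```python
import math

# Rounded permittivity of free space used in the worked example (F/m).
EPS0 = 1e-9 / (36 * math.pi)

def coulomb_force_on_q2(q1, q2, r1, r2):
    """Force on q2 due to q1: q1*q2 / (4*pi*eps0*R^3) * (r2 - r1)."""
    R = [b - a for a, b in zip(r1, r2)]
    dist = math.sqrt(sum(c * c for c in R))
    scale = q1 * q2 / (4 * math.pi * EPS0 * dist**3)
    return [scale * c for c in R]

F2 = coulomb_force_on_q2(3e-4, -1e-4, (1, 2, 3), (2, 0, 5))
magnitude = math.sqrt(sum(c * c for c in F2))
print(F2, magnitude)  # approximately [-10, 20, -20] and 30 (newtons)
```

The components $(-10, 20, -20)$ agree with $-30 \cdot \frac{1}{3}(1, -2, 2)$, and the sign confirms that the force on $Q_2$ points back toward $Q_1$.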
Electric Field: The electric field at a point in space is a vector quantity that describes the force experienced by a unit positive test charge placed at that point due to the presence of other electric charges. We now describe the electric field due to a point charge and due to a dipole.
For the sake of simplicity, we assume the point charge $Q$ is located at the origin, i.e., $\mathbf{r}_0 = (0, 0, 0)$. Then $\mathbf{r} - \mathbf{r}_0 = (x, y, z)$, and the electric field becomes
\[ \mathbf{E}(x, y, z) = \frac{1}{4\pi\varepsilon_0} \cdot \frac{Q\,(x, y, z)}{(x^2 + y^2 + z^2)^{3/2}}. \]
In component form the electric field is written as
\[ \mathbf{E}(x, y, z) = \left( \frac{kQx}{(x^2 + y^2 + z^2)^{3/2}},\ \frac{kQy}{(x^2 + y^2 + z^2)^{3/2}},\ \frac{kQz}{(x^2 + y^2 + z^2)^{3/2}} \right), \]
where $k = \frac{1}{4\pi\varepsilon_0} \approx 8.988 \times 10^9\ \mathrm{N \cdot m^2 / C^2}$. For a two-dimensional space, the electric field due to a point charge is
\[ \mathbf{E}(x, y) = \left( \frac{kQx}{(x^2 + y^2)^{3/2}},\ \frac{kQy}{(x^2 + y^2)^{3/2}} \right). \]
A plot of the electric field is given in the following figure. The illustration visually represents the electric field lines generated by point charges. On the left, the diagram shows the field of a positive point charge, where field vectors radiate outward symmetrically in all directions, indicating the repulsive nature of a positive charge. On the right, the diagram displays the field of a negative point charge, where field vectors point inward, converging toward the charge, reflecting the attractive nature of a negative charge. These vector diagrams help in understanding both the direction and relative strength of the electric field at various points in space around the charges.
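The outward and inward directions described here can be confirmed from the two-dimensional formula. The sketch below assumes a charge of magnitude $1\ \mu$C at the origin and samples points on a circle of radius 2; these sample values and the helper names are illustrative choices, not taken from the text. It computes the component of $\mathbf{E}$ along the outward radial direction.

```python
import math

K = 8.988e9  # approximate Coulomb constant, N*m^2/C^2

def field_2d(Q, x, y):
    """Field of a point charge Q at the origin, evaluated at (x, y) != (0, 0)."""
    r3 = (x * x + y * y) ** 1.5
    return (K * Q * x / r3, K * Q * y / r3)

def radial_component(Q, x, y):
    """Component of E along the outward unit vector (x, y) / r."""
    ex, ey = field_2d(Q, x, y)
    r = math.hypot(x, y)
    return (ex * x + ey * y) / r

# Twelve sample points on the circle of radius 2.
ring = [(2 * math.cos(0.5 * k), 2 * math.sin(0.5 * k)) for k in range(12)]
outward = all(radial_component(1e-6, x, y) > 0 for x, y in ring)
inward = all(radial_component(-1e-6, x, y) < 0 for x, y in ring)
print(outward, inward)  # True True
```

The radial component equals $kQ/r^2$, so its sign is the sign of $Q$ and its size falls off with the square of the distance.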
Example 109. A point charge $Q = 1\ \mu\text{C} = 1 \times 10^{-6}$ C is placed at the origin in the xy-plane. Let $\mathbf{r} = (x, y)$ be the position vector of a point in the plane where the electric field is to be computed.
a) Derive the expression for the 2D electric field vector E(x, y) at any point in the plane
(neglecting the z-component).
b) Compute the electric field vector E at the point (3, 4) and write it in vector form.
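Solution: (a) In two dimensions, the electric field due to a point charge located at the origin is given by
\[ \mathbf{E}(x, y) = \frac{1}{4\pi\varepsilon_0} \cdot \frac{Q\,(x, y)}{(x^2 + y^2)^{3/2}}, \]
where $\frac{1}{4\pi\varepsilon_0} \approx 9 \times 10^9\ \mathrm{N \cdot m^2 / C^2}$. Substituting $Q = 1 \times 10^{-6}$ C, we get
\[ \mathbf{E}(x, y) = 9 \times 10^3 \cdot \frac{(x, y)}{(x^2 + y^2)^{3/2}}\ \text{N/C}. \]
(b) To compute the electric field at $(3, 4)$, we first find the squared distance:
\[ x^2 + y^2 = 3^2 + 4^2 = 9 + 16 = 25. \]
Hence $(x^2 + y^2)^{3/2} = 25^{3/2} = 125$, and
\[ \mathbf{E}(3, 4) = 9 \times 10^3 \cdot \frac{(3, 4)}{125} = (216, 288)\ \text{N/C}. \]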
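The arithmetic for part (b) can be checked in a few lines. This sketch uses the rounded constant $9 \times 10^9$ from part (a); the variable names are illustrative only.

```python
import math

K = 9e9   # rounded value of 1/(4*pi*eps0) used in the example
Q = 1e-6  # 1 microcoulomb

x, y = 3.0, 4.0
r_squared = x**2 + y**2      # squared distance from the origin
r_cubed = r_squared ** 1.5   # (x^2 + y^2)^(3/2)
E = (K * Q * x / r_cubed, K * Q * y / r_cubed)
E_magnitude = math.hypot(*E)
print(E, E_magnitude)  # approximately (216, 288) and 360, in N/C
```

The magnitude also follows directly from $kQ/r^2 = 9000/25 = 360$ N/C, which is a useful cross-check on the components.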
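Let the positive charge $+q$ be located at position vector $\mathbf{r}_+$, and the negative charge $-q$ at $\mathbf{r}_-$. The electric field $\mathbf{E}(\mathbf{r})$ at an arbitrary point $\mathbf{r}$ in space is given by
\[ \mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \left[ \frac{q(\mathbf{r} - \mathbf{r}_+)}{|\mathbf{r} - \mathbf{r}_+|^3} - \frac{q(\mathbf{r} - \mathbf{r}_-)}{|\mathbf{r} - \mathbf{r}_-|^3} \right]. \]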
Component Form: Assuming the dipole is aligned along the z-axis, with the charges located at $\mathbf{r}_+ = (0, 0, a)$ and $\mathbf{r}_- = (0, 0, -a)$, the electric field at the point $\mathbf{r} = (x, y, z)$ has components
\[ E_x = \frac{q}{4\pi\varepsilon_0} \left[ \frac{x}{(x^2 + y^2 + (z - a)^2)^{3/2}} - \frac{x}{(x^2 + y^2 + (z + a)^2)^{3/2}} \right], \]
\[ E_y = \frac{q}{4\pi\varepsilon_0} \left[ \frac{y}{(x^2 + y^2 + (z - a)^2)^{3/2}} - \frac{y}{(x^2 + y^2 + (z + a)^2)^{3/2}} \right], \]
and
\[ E_z = \frac{q}{4\pi\varepsilon_0} \left[ \frac{z - a}{(x^2 + y^2 + (z - a)^2)^{3/2}} - \frac{z + a}{(x^2 + y^2 + (z + a)^2)^{3/2}} \right]. \]
Similar formulae can be derived if the dipole is aligned along the x-axis or the y-axis.
Specific 2D Case: For a planar configuration, consider placing the charges along the y-axis. Let the positive charge $+q$ be at $\mathbf{r}_+ = (0, a)$ and the negative charge $-q$ be at $\mathbf{r}_- = (0, -a)$. Then, the electric field at a point $\mathbf{r} = (x, y)$ in the plane is
\[ \mathbf{E}(x, y) = \frac{1}{4\pi\varepsilon_0} \left[ \frac{q\,(x, y - a)}{[x^2 + (y - a)^2]^{3/2}} - \frac{q\,(x, y + a)}{[x^2 + (y + a)^2]^{3/2}} \right]. \]
Expressing $\mathbf{E}(x, y)$ in terms of its components, we get the x-component as
\[ E_x(x, y) = \frac{q}{4\pi\varepsilon_0} \left[ \frac{x}{[x^2 + (y - a)^2]^{3/2}} - \frac{x}{[x^2 + (y + a)^2]^{3/2}} \right] \]
and the y-component as
\[ E_y(x, y) = \frac{q}{4\pi\varepsilon_0} \left[ \frac{y - a}{[x^2 + (y - a)^2]^{3/2}} - \frac{y + a}{[x^2 + (y + a)^2]^{3/2}} \right]. \]
The accompanying figure shows the electric field lines of a dipole in the xy-plane.
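The planar dipole formulas translate directly into code. The sketch below is illustrative: the function name `dipole_field_2d`, the sample charge $q = 1\ \mu$C, the half-separation $a = 1$, and the evaluation point $(3, 0)$ are assumptions chosen for the demonstration.

```python
K = 8.988e9  # approximate Coulomb constant, N*m^2/C^2

def dipole_field_2d(q, a, x, y):
    """Field at (x, y) of +q at (0, a) and -q at (0, -a), by superposition."""
    d_plus = (x * x + (y - a) ** 2) ** 1.5
    d_minus = (x * x + (y + a) ** 2) ** 1.5
    ex = K * q * (x / d_plus - x / d_minus)
    ey = K * q * ((y - a) / d_plus - (y + a) / d_minus)
    return (ex, ey)

# On the x-axis (y = 0) both charges are equally far away, so the
# x-components cancel and the field points in the negative y-direction.
ex, ey = dipole_field_2d(1e-6, 1.0, 3.0, 0.0)
print(ex, ey)
```

On the perpendicular bisector of the dipole the field is antiparallel to the vector from the negative to the positive charge, which matches the picture of field lines leaving $+q$ and returning to $-q$.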
Example 110. Consider an electric dipole consisting of a positive charge +q = 1 µC located
at r+ = (0, 0, 1) and a negative charge −q = −1 µC located at r− = (0, 0, −1). Compute
the electric field vector E at the point (3, 0, 0) and express it in vector form.
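Solution: (a) The electric field at a point $\mathbf{r} = (x, y, z)$ due to a point charge $+q$ located at $(0, 0, a)$ and a point charge $-q$ located at $(0, 0, -a)$ is given by
\[ \mathbf{E}(x, y, z) = \frac{q}{4\pi\varepsilon_0} \left[ \frac{(x, y, z - a)}{[x^2 + y^2 + (z - a)^2]^{3/2}} - \frac{(x, y, z + a)}{[x^2 + y^2 + (z + a)^2]^{3/2}} \right]. \]
Here $a = 1$ and $(x, y, z) = (3, 0, 0)$, so $\mathbf{r} - \mathbf{r}_+ = (3, 0, -1)$ and $\mathbf{r} - \mathbf{r}_- = (3, 0, 1)$. Both vectors have length $\sqrt{10}$, so each denominator equals $10\sqrt{10}$ and the bracket reduces to $\frac{(0, 0, -2)}{10\sqrt{10}} = -\frac{(0, 0, 1)}{5\sqrt{10}}$. With $\frac{1}{4\pi\varepsilon_0} \approx 9 \times 10^9$ and $q = 1 \times 10^{-6}$ C,
\[ \mathbf{E}(3, 0, 0) = -9 \times 10^9 \cdot \frac{1 \times 10^{-6}}{5\sqrt{10}} \cdot (0, 0, 1) = -\frac{9 \times 10^3}{5\sqrt{10}} \cdot (0, 0, 1). \]
Computing the numerical value, we get
\[ \frac{9 \times 10^3}{5\sqrt{10}} \approx \frac{9000}{15.811} \approx 569.2. \]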
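The same value can be obtained by evaluating the superposition formula directly. This is a minimal sketch using the rounded constant $9 \times 10^9$ from the example; the function name `dipole_field_3d` is hypothetical.

```python
import math

K = 9e9  # rounded value of 1/(4*pi*eps0) used in the example

def dipole_field_3d(q, a, r):
    """Field at r = (x, y, z) of +q at (0, 0, a) and -q at (0, 0, -a)."""
    x, y, z = r
    to_plus = (x, y, z - a)    # r - r_plus
    to_minus = (x, y, z + a)   # r - r_minus
    d_plus = math.sqrt(sum(c * c for c in to_plus)) ** 3
    d_minus = math.sqrt(sum(c * c for c in to_minus)) ** 3
    return tuple(K * q * (p / d_plus - m / d_minus)
                 for p, m in zip(to_plus, to_minus))

E = dipole_field_3d(1e-6, 1.0, (3.0, 0.0, 0.0))
print(E)  # approximately (0, 0, -569.2), in N/C
```

Only the z-component survives at $(3, 0, 0)$ because the point is equidistant from both charges, so the x-contributions cancel exactly.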