Functions of Several Variables

The document consists of lecture notes on functions of several variables, covering topics such as definitions, graphs, limits, and continuity. It introduces scalar functions, their applications in modeling physical phenomena, and provides examples of domain analysis. The content is structured into chapters that include partial derivatives, the chain rule, and gradient vector fields.
Lecture Notes

Functions of Several Variables

Manoj Pandey
Department of Mathematics
Rajiv Gandhi Proudyogiki Vishwavidyalaya,(M.P)

April 20, 2025


Contents

1 Functions of Several Variables
1.1 Introduction
1.2 Definition
1.3 Graph
1.4 Level Sets
1.5 Limit
1.5.1 Limit Laws for Functions of Two Variables
1.6 Continuous Functions

2 Partial Derivatives
2.1 Introduction
2.2 Partial Derivative Along x-Axis
2.3 Partial Derivative Along y-axis
2.4 Tangent Plane to a Surface
2.4.1 Equation of the Tangent Plane
2.4.2 Second and Higher Order Partial Derivatives
2.5 Differentiability
2.5.1 Differentiability of Scalar Function of Several Variables
2.5.2 Differentiability Implies Continuity
2.5.3 General Case

3 Chain Rule
3.1 Composition of Functions
3.2 The Chain Rule
3.3 Chain Rule in Higher Dimensions
3.3.1 Two intermediate and one independent variable
3.3.2 Two intermediate and two independent variables
3.3.3 General Case (Optional)

4 Gradient Vector Field
4.1 From Scalar Fields to Vector Fields
4.1.1 Plotting a Vector Field
4.2 Gradient Vector Field
4.3 Gradient and Directional Derivative
4.4 Gradient and Orthogonality to Level Sets
4.5 Electric Field (Optional)

Chapter 1

Functions of Several Variables

In this chapter, we introduce scalar functions of several real variables, commonly referred to
as scalar fields. These functions assign a real number to each point in a multi-dimensional
domain and play a crucial role in modeling a wide range of physical and geometric phenom-
ena such as temperature, pressure, and surface area. We will explore fundamental concepts
associated with scalar fields, including domains, level sets, and limits, and develop techniques
for their graphical and analytical interpretation.

At the end of this chapter, you will be able to:


1. Distinguish between vector-valued and scalar-valued functions with respect to their
definitions and applications,
2. Analyze the geometry of functions of two variables and interpret their graphical rep-
resentations,
3. Evaluate level sets of scalar functions,
4. Compute limits of multivariable functions using appropriate analytical methods.

1.1 Introduction
In the previous chapter, we studied vector-valued functions of a real variable, given by
$c : [a, b] \to \mathbb{R}^n$, where $n = 2, 3$. These functions map a real input $t$ to a vector $c(t)$ in
$\mathbb{R}^n$, representing curves or paths in space. Such functions are fundamental in describing
motion, trajectories, and geometric forms in two- and three-dimensional spaces.

We now extend our focus to functions of several variables, which differ in that they assign a
real (scalar) value to each point in a multi-dimensional domain. These are known as scalar
fields, especially when defined over a region in space.

A basic example is the area A of a rectangle, which depends on its length x and breadth y.
This relationship is expressed as A = f (x, y), where f (x, y) = xy. The function f defines a
scalar field over the xy-plane, assigning a real value to each point (x, y).

Scalar fields also appear in geometric contexts. For instance, the equation of a plane in $\mathbb{R}^3$,
$ax + by + cz = d$ with $c \neq 0$, can be rewritten as:

$$z = g(x, y),$$

where $g(x, y) = D - Ax - By$, and $D = \frac{d}{c}$, $A = \frac{a}{c}$, $B = \frac{b}{c}$. This defines a scalar field g,
mapping points in the xy-plane to corresponding z-coordinates on the plane.

In physical contexts, scalar fields are used to describe distributions of quantities such as
temperature. For example, the temperature T at each point (x, y, z) in a room may vary
with position, represented as:
T = T (x, y, z).
This defines a scalar field in R3 , assigning a unique temperature to every point in space.

More generally, a scalar field is a function of n variables that assigns a real number to each
point in a region of Rn . Such functions are essential in mathematics, physics, and engineering,
modeling phenomena including electric potential, pressure, elevation, and population density.

1.2 Definition
Let D ⊆ Rn represent a set of ordered n-tuples. A function assigning each n-tuple (x1 , x2 , . . . , xn )
a unique real number is termed a function of n variables. Formally, we define the functions
of several variables as below:

Definition 1.2.1. A function $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ is known as a function of n variables
$x_1, x_2, \ldots, x_n$, and it is often represented as:

w = f (x1 , x2 , . . . , xn ).

The set D is termed the domain of the function.

The domain of a function must consist of inputs for which the function remains meaningful,
i.e., real and finite. For instance, in the two-dimensional case, i.e., for functions
$f : D \subseteq \mathbb{R}^2 \to \mathbb{R}$, the domain D is the set of all points (x, y) for which f (x, y) is
real and finite. To find the domain of a function of several variables, check, just as for a
function of one variable, which input values can be substituted into the function without
causing problems. Problems usually arise from dividing by zero or taking the square root of
a negative number. Common restrictions include:

1. (Division by zero): The denominator in a function, if any, must be nonzero.

2. (Square or even-order roots): Expressions inside square roots, or any roots of even order, must be non-negative.

3. (Logarithms): The argument of the logarithm, if any, must be strictly greater than
zero.

ILLUSTRATIVE EXAMPLES

Example 1. Describe the domain of each of the following functions and find its values at the
specified points.

1. $f(x, y) = x^2 + xy^3$; $f(0, 0)$, $f(-1, 1)$, $f(2, 3)$, $f(-3, -2)$

2. $f(x, y) = \cos(xy)$; $f(2, \pi/2)$, $f(-3, \pi/9)$, $f(\pi, 1/4)$

3. $f(x, y, z) = \dfrac{x - y}{y^2 + z^2}$; $f(3, -1, 2)$, $f(1, 1/2, -1/4)$, $f(0, -1/3, 0)$

4. $f(x, y, z) = \sqrt{36 + x^2 - y^2 - z^2}$; $f(0, 0, 0)$, $f(2, -5, 3)$, $f(1, 2, -3)$

Solution. Let us find the functional values and domains for each of the functions.

1. Given $f(x, y) = x^2 + xy^3$. Clearly, this function is defined for all real values of x and
y. Therefore, the domain D is all of $\mathbb{R}^2$, i.e., $D = \{(x, y) : (x, y) \in \mathbb{R}^2\}$. The specific
values of the function are given below:

$$f(0, 0) = 0^2 + 0 \cdot 0^3 = 0,$$
$$f(-1, 1) = (-1)^2 + (-1) \cdot (1)^3 = 1 - 1 = 0,$$
$$f(2, 3) = (2)^2 + (2) \cdot (3)^3 = 4 + 2 \cdot 27 = 4 + 54 = 58,$$
$$f(-3, -2) = (-3)^2 + (-3) \cdot (-2)^3 = 9 + (-3) \cdot (-8) = 9 + 24 = 33.$$

2. The function $f(x, y) = \cos(xy)$ is also defined for all real values of x and y, so the
domain is the whole of $\mathbb{R}^2$, i.e., $D = \{(x, y) : (x, y) \in \mathbb{R}^2\}$. The specific values of the
function are

$$f\left(2, \frac{\pi}{2}\right) = \cos\left(2 \cdot \frac{\pi}{2}\right) = \cos(\pi) = -1,$$
$$f\left(-3, \frac{\pi}{9}\right) = \cos\left(-3 \cdot \frac{\pi}{9}\right) = \cos\left(-\frac{\pi}{3}\right) = \frac{1}{2},$$
$$f\left(\pi, \frac{1}{4}\right) = \cos\left(\pi \cdot \frac{1}{4}\right) = \cos\left(\frac{\pi}{4}\right) = \frac{\sqrt{2}}{2}.$$


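3. The function $f(x, y, z) = \dfrac{x - y}{y^2 + z^2}$ has three independent variables. It is defined for all
real values of x, y, and z for which $y^2 + z^2 \neq 0$ (to avoid division by zero), i.e., for which
y and z are not both zero. Therefore, the domain is $D = \{(x, y, z) \in \mathbb{R}^3 : y^2 + z^2 \neq 0\}$.
The specific values are given below:

$$f(3, -1, 2) = \frac{3 - (-1)}{(-1)^2 + 2^2} = \frac{3 + 1}{1 + 4} = \frac{4}{5},$$
$$f\left(1, \frac{1}{2}, -\frac{1}{4}\right) = \frac{1 - \frac{1}{2}}{\left(\frac{1}{2}\right)^2 + \left(-\frac{1}{4}\right)^2} = \frac{\frac{1}{2}}{\frac{1}{4} + \frac{1}{16}} = \frac{\frac{1}{2}}{\frac{5}{16}} = \frac{8}{5},$$
$$f\left(0, -\frac{1}{3}, 0\right) = \frac{0 - \left(-\frac{1}{3}\right)}{\left(-\frac{1}{3}\right)^2 + 0^2} = \frac{\frac{1}{3}}{\frac{1}{9}} = 3.$$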

4. The function $f(x, y, z) = \sqrt{36 + x^2 - y^2 - z^2}$ has an expression under the square root sign.
For the function to remain meaningful, we must choose $(x, y, z) \in \mathbb{R}^3$ so that
$36 + x^2 - y^2 - z^2 \geq 0$. Thus, the domain is $D = \{(x, y, z) \in \mathbb{R}^3 : 36 + x^2 - y^2 - z^2 \geq 0\}$.
The specific values are given below:

$$f(0, 0, 0) = \sqrt{36 + 0^2 - 0^2 - 0^2} = \sqrt{36} = 6,$$
$$f(2, -5, 3) = \sqrt{36 + 2^2 - (-5)^2 - 3^2} = \sqrt{36 + 4 - 25 - 9} = \sqrt{6},$$
$$f(1, 2, -3) = \sqrt{36 + 1^2 - 2^2 - (-3)^2} = \sqrt{36 + 1 - 4 - 9} = \sqrt{24} = 2\sqrt{6}.$$
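
The values in Example 1 can be cross-checked numerically. The following is a minimal sketch, not part of the original notes: the function names `f1`, `f3` and `f4` are hypothetical labels for parts 1, 3 and 4, and the choice to raise an error outside the domain is an assumption made here for illustration.

```python
import math

def f1(x, y):
    """f(x, y) = x^2 + x*y^3, defined on all of R^2."""
    return x**2 + x * y**3

def f3(x, y, z):
    """f(x, y, z) = (x - y) / (y^2 + z^2), defined where y^2 + z^2 != 0."""
    denominator = y**2 + z**2
    if denominator == 0:
        raise ValueError("outside the domain: y^2 + z^2 = 0")
    return (x - y) / denominator

def f4(x, y, z):
    """f(x, y, z) = sqrt(36 + x^2 - y^2 - z^2), defined where the radicand is >= 0."""
    radicand = 36 + x**2 - y**2 - z**2
    if radicand < 0:
        raise ValueError("outside the domain: negative radicand")
    return math.sqrt(radicand)

print(f1(2, 3), f1(-3, -2))   # values from part 1
print(f3(3, -1, 2))           # value from part 3
print(f4(2, -5, 3))           # value from part 4
```

Calling `f3(1, 0, 0)` raises a `ValueError`, which mirrors the fact that points with y = z = 0 lie outside the domain.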

Figure 1.2.1: Domain: The Circular Disk
Figure 1.2.2: Domain of $z = \sqrt{y - x^2}$

Example 2. Find the domain of the function $f(x, y) = \sqrt{4 - x^2 - y^2}$.

Solution. From the definition of the function, it is clear that for f (x, y) to be real,
the expression under the square root must be non-negative, i.e., $4 - x^2 - y^2 \geq 0$. In other
words, $x^2 + y^2 \leq 4$. Thus the domain is the region enclosed by the circle of radius 2 centered at
the origin, i.e.,
$$D = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 \leq 4\}.$$
That is, the domain is a closed disk (including its boundary) of radius 2, as shown in
figure (1.2.1).
Example 3. Find the domain and range of $z = \sqrt{y - x^2}$.

Solution. The function $z = \sqrt{y - x^2}$ is defined at those points (x, y) of the plane $\mathbb{R}^2$ for
which $y - x^2 \geq 0$, i.e., $y \geq x^2$. These are the points (see figure 1.2.2) lying on or above
the parabola $y = x^2$. The range is the set of all non-negative real
numbers, i.e., $\{z \in \mathbb{R} : z \geq 0\}$.
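
The domain conditions in Examples 2 and 3 translate directly into membership tests. This is only an illustrative sketch: the predicate names `in_disk_domain` and `in_parabola_domain` are hypothetical and do not appear in the notes.

```python
def in_disk_domain(x, y):
    """True when sqrt(4 - x^2 - y^2) is real, i.e. x^2 + y^2 <= 4."""
    return 4 - x**2 - y**2 >= 0

def in_parabola_domain(x, y):
    """True when sqrt(y - x^2) is real, i.e. y >= x^2."""
    return y - x**2 >= 0

# A boundary point of the disk and a point outside it.
print(in_disk_domain(2, 0), in_disk_domain(1.5, 1.5))
# A point on the parabola y = x^2 and a point below it.
print(in_parabola_domain(1, 1), in_parabola_domain(2, 1))
```

Both domains are closed sets: boundary points, where the radicand is exactly zero, are included.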

1.3 Graph
Visualizing a function is essential because it provides an intuitive way to understand how
a function behaves and how its values change in relation to its input variables. The graph of a
function is simply the collection of all ordered pairs (input, output). In the general setting,
the graph G(f ) of a function $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ is defined as the set:

$$G(f) = \{(x_1, x_2, \ldots, x_n, w) \mid (x_1, x_2, \ldots, x_n) \in D,\ w = f(x_1, x_2, \ldots, x_n)\}.$$

The components of the inputs x1 , x2 , . . . , xn are the independent variables, while the output
w is the dependent variable.

In particular, for n = 2, the functions $f : D \subseteq \mathbb{R}^2 \to \mathbb{R}$ of two variables are given by
z = f (x, y) and their graph by G(f ) = {(x, y, z) | z = f (x, y)}. These functions possess
a distinct geometric interpretation, as their graph forms a surface in R3 . Each pair (a, b)
in the domain D ⊆ R2 corresponds to exactly one height c = f (a, b), generating a three-
dimensional surface (see the figure 1.3.1). Thus, the domain D itself is a planar region,
typically visualized in the xy-plane, and the value z = f (x, y) indicates how high the surface
lies above (or below) this plane.

For n = 3, we get the functions $f : D \subseteq \mathbb{R}^3 \to \mathbb{R}$ of three variables, given by w = f (x, y, z).
Their graph is given by G(f ) = {(x, y, z, w) | w = f (x, y, z)}. These functions, however,
cannot easily be represented geometrically due to the necessity of four-dimensional space for
their graphical interpretation.

1.4 Level Sets


Although graphing software and machines can handle the visualization of functions effectively,
manually graphing a surface z = f (x, y) remains a challenging task. It often requires
plotting a large number of points in three-dimensional space, which makes it difficult to
comprehend the overall shape of the graph.

Figure 1.3.1: Graph of z = f (x, y)

A more intuitive and efficient method to visualize a function of two variables is through slic-
ing or sections, i.e., considering the intersection of the surface with horizontal planes (planes
parallel to the xy-plane) at various heights. Specifically, we examine the set of points (x, y, z)
on the surface where the function takes a constant value, say z = f (x, y) = c. These
sections are sometimes called contour lines or contour curves.

By plotting several such sections for different values of c, we obtain a contour map, which
reveals important geometric features of the surface. When these slices are projected onto the
xy-plane, they form what are known as level curves. We now formally define this concept
as below:

Definition 1.4.1 (Level Curve). A level curve of a function f (x, y) is the set S of all
points (x, y) in the domain D at which the function has a constant value c. That is,

$$S = \{(x, y) \in D \subseteq \mathbb{R}^2 \mid f(x, y) = c\}.$$

In other words, the level curve of a function f (x, y) with value c is the set S of all
points (x, y) in the plane $\mathbb{R}^2$ at which the function f (x, y) attains the 'level' c.
For a function w = f (x, y, z), such sets are known as level surfaces. For functions of more than
three variables, they are called level sets. A level surface of a function of three variables is defined
as below:

Definition 1.4.2 (Level Surface). A level surface of a function f (x, y, z) is the set S of
all points (x, y, z) in the domain D at which the function has a constant value c. That is,

$$S = \{(x, y, z) \in D \subseteq \mathbb{R}^3 \mid f(x, y, z) = c\}.$$

In other words, the level surface of a function f (x, y, z) with value c is the set S of
all points (x, y, z) in the space $\mathbb{R}^3$ at which the function f (x, y, z) attains the 'level'
c.
Below is an illustrative image of a surface, its contour lines and level curves.

Figure 1.4.1: Contour Lines and Level Curves

ILLUSTRATIVE EXAMPLES

The following examples will illustrate the concept of level curves.


Example 4. Sketch the level curves of the paraboloid given by $z = x^2 + y^2$ for the levels
c = 1, 2, 3.
Solution. The level curves of the given paraboloid are given by $x^2 + y^2 = c$, i.e., circles of
radius $\sqrt{c}$ centered at the origin. For c = 1, c = 2 and c = 3, the level curves are the circles
with radii 1, $\sqrt{2}$ and $\sqrt{3}$. The level curves and contour lines are shown in figures (1.4.2)
and (1.4.3). So, the level curves are circles of increasing radius.


Figure 1.4.2: Contour Lines Figure 1.4.3: Level Curves

Example 5. Sketch the level curves of the hyperbolic paraboloid $f(x, y) = x^2 - y^2$ for
c = −3, −2, −1, 0, 1, 2, 3.
Solution. A computer-generated sketch of the given hyperbolic paraboloid and its level curves
is given in figures (1.4.4) and (1.4.5).

Figure 1.4.4: Hyperbolic Paraboloid Figure 1.4.5: Level Curves

The level curves for c = −3, −2, −1, 1, 2, 3 are all hyperbolas $x^2 - y^2 = c$. The exception is
c = 0: in this case, the level curve is given by $x^2 - y^2 = 0$, i.e., the pair of lines x = ±y
(a degenerate hyperbola).
Example 6. Sketch the level curves of the exponential function $f(x, y) = e^{x^2 + y^2}$ for
c = 1, 2, 3, 4.
Solution. The level curves are obtained by solving $e^{x^2 + y^2} = c$. Taking the natural logarithm
on both sides,
$$x^2 + y^2 = \log c.$$
Since $x^2 + y^2 \geq 0$, a level curve exists only for c ≥ 1. For c > 1, the level curves are circles
centered at the origin with radii $\sqrt{\log c}$; for c = 1, the level curve reduces to the single
point (0, 0). These circles increase in size as c increases.
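
As a quick numerical check of Example 6: since $x^2 + y^2 = \log c$, the radius r of the level curve satisfies $r^2 = \log c$. The sketch below, with the hypothetical helper name `level_radius`, computes r and verifies that a point on the circle is mapped back to c.

```python
import math

def level_radius(c):
    """Radius of the level curve exp(x^2 + y^2) = c; requires c >= 1."""
    if c < 1:
        raise ValueError("no level curve: exp(x^2 + y^2) is never below 1")
    return math.sqrt(math.log(c))

for c in (1, 2, 3, 4):
    r = level_radius(c)
    x, y = r, 0.0                      # one point on the circle of radius r
    print(c, r, math.exp(x**2 + y**2))  # last column should reproduce c
```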
Example 7. Find the level surface of the function $f(x, y, z) = x^2 + y^2 + z^2$ for c = 4, 9.
Solution. The level surfaces for c = 4 and c = 9 are the spheres $x^2 + y^2 + z^2 = 4$ and
$x^2 + y^2 + z^2 = 9$. These spheres are centered at the origin with radii 2 and 3 respectively.
Example 8. Describe the level surface of the function $f(x, y, z) = z - x^2 - y^2$ for c = 0.
Solution. The level surface of the function for c = 0 is given by $0 = z - x^2 - y^2$. This is
just the paraboloid $z = x^2 + y^2$ opening upward in the positive z-direction.
Example 9. Find the level surface of the function $f(x, y, z) = \log(x^2 + y^2 + z^2)$ for c = 1.
Solution. The desired level surface is given by $\log(x^2 + y^2 + z^2) = 1$. Taking the exponential
on both sides, we have $x^2 + y^2 + z^2 = e$. This represents a sphere of radius $\sqrt{e}$ centered at
the origin.
Example 10. Determine the level curve of the function $f(x, y) = x + 2y$ that passes through
the point (2, 1).
Solution. Let $x + 2y = c$ be the level curve that passes through the given point (2, 1).
Therefore, we have $2 + 2(1) = c$, i.e., c = 4. Thus, the desired level curve is $x + 2y = 4$. This is just a
straight line.
Example 11. Find the level curve of the function $f(x, y) = x^2 - 3y$ that passes through the
point (3, 2).
Solution. Let the level curve be given by $x^2 - 3y = c$. Substituting (3, 2) into the equation,
we have $c = 3^2 - 3(2) = 3$. Thus, the required level curve is $x^2 - 3y = 3$. This represents a
parabola.
Example 12. Find the level surface of the function $f(x, y, z) = x + y - 2z$ that passes
through the point (1, 2, 3).
Solution. The level surface is given by the equation $x + y - 2z = c$. Substituting
(1, 2, 3) into the equation, we get $c = 1 + 2 - 2(3) = -3$. Thus, the required level surface is
$x + y - 2z = -3$. This is just a plane.
Example 13. Find the level surface of the function $f(x, y, z) = x^2 + y^2 - z$ that passes
through the point (2, 3, 4).
Solution. As in the previous example, the level surfaces are given by $x^2 + y^2 - z = c$.
Substituting (2, 3, 4) into the equation, we get $c = 2^2 + 3^2 - 4 = 4 + 9 - 4 = 9$. Thus, the required
level surface is $x^2 + y^2 - z = 9$. This represents the paraboloid $z = x^2 + y^2 - 9$ in
three-dimensional space.
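
Examples 10 to 13 all follow the same recipe: the level set through a point p is f = c with c = f(p). A minimal sketch of that recipe follows; the helper name `level_constant` and the dictionary of cases are hypothetical, introduced only for illustration.

```python
def level_constant(f, point):
    """Return the constant c for which the level set f = c passes through `point`."""
    return f(*point)

# The four functions and points used in Examples 10-13.
cases = {
    "x + 2y through (2, 1)": (lambda x, y: x + 2 * y, (2, 1)),
    "x^2 - 3y through (3, 2)": (lambda x, y: x**2 - 3 * y, (3, 2)),
    "x + y - 2z through (1, 2, 3)": (lambda x, y, z: x + y - 2 * z, (1, 2, 3)),
    "x^2 + y^2 - z through (2, 3, 4)": (lambda x, y, z: x**2 + y**2 - z, (2, 3, 4)),
}

for label, (f, p) in cases.items():
    print(label, "-> c =", level_constant(f, p))
```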
Level curves are important for several reasons. In mathematics, they provide
a 2D visualization of a 3D surface. Level curves are also related to the gradient:
in forthcoming lectures, we shall see that the gradient vector of f (x, y) at any
point is perpendicular to the level curve passing through that point. Level curves are used
in weather maps, engineering, physics, economics, and more.


1.5 Limit
In this section, we shall discuss the notions of limits and continuity for functions of several
variables, which are fundamental concepts in multivariable calculus. These ideas extend the
familiar single-variable concepts into higher dimensions and allow us to rigorously understand
how functions behave near a point in space. Just as in single-variable calculus, limits provide
the foundation for defining continuity, derivatives, and integrals for multivariable functions.
However, in the multivariable context, the behavior of a function becomes more intricate
due to the presence of infinitely many paths through which a point can be approached. We
begin our study with the formal definition of the limit of a function of two variables.

Let $f : D \subseteq \mathbb{R}^2 \to \mathbb{R}$ be a real-valued function defined by z = f (x, y) on a domain
$D \subseteq \mathbb{R}^2$, and let (a, b) be a point that can be approached through points of D (it may or
may not belong to D). We say that the limit of f (x, y) is
a real number L as (x, y) → (a, b), if the values of f (x, y) can be made arbitrarily close to L
by taking (x, y) sufficiently close to (a, b), but not equal to (a, b). In this case, we write
$$\lim_{(x,y)\to(a,b)} f(x, y) = L.$$

The precise definition of limit is given in the following box.

Definition 1.5.1 (Limit). Let $f : D \subseteq \mathbb{R}^2 \to \mathbb{R}$ be a real-valued function defined
by z = f (x, y). We say that

$$\lim_{(x,y)\to(a,b)} f(x, y) = L$$

if for every ε > 0, there exists a δ > 0 such that for all (x, y) ∈ D,
$$0 < \sqrt{(x - a)^2 + (y - b)^2} < \delta \;\Rightarrow\; |f(x, y) - L| < \varepsilon.$$

In simpler terms, this means that we can make f (x, y) as close as we want to L (within ε) by
choosing (x, y) sufficiently close to (a, b) (within a distance δ), excluding the point (a, b) itself.

We now present a theorem on uniqueness of the limit without proof.

Theorem 1.5.1 (Uniqueness of Limit). Let $f : D \subset \mathbb{R}^2 \to \mathbb{R}$ be a real-valued function
defined on a domain D, and let (a, b) ∈ D. If the limit of f (x, y) as (x, y) → (a, b)
exists, then it is unique.
In other words, if there exist two real numbers $L_1$ and $L_2$ such that

$$\lim_{(x,y)\to(a,b)} f(x, y) = L_1 \quad\text{and}\quad \lim_{(x,y)\to(a,b)} f(x, y) = L_2,$$

then it must follow that $L_1 = L_2$.


It must be noted that in the single-variable case, a point a can be approached only from
the left or from the right. But in two variables, (x, y) → (a, b) can occur along infinitely many curves
(straight lines, parabolas, spirals, etc.), and for the limit to exist, the function must approach
the same value regardless of the path.

Thus, geometrically, the definition says that as the point (x, y) approaches (a, b) from any
direction in the plane, along any path, the function values f (x, y) must approach the same
number L. If even two different paths toward (a, b) yield different limiting values for f (x, y),
then the limit does not exist.

It is also important to recognize that the existence of a limit at a point does not depend on
the value of the function at that point. The limit only concerns how the function behaves
near the point, not necessarily at the point itself. Therefore, f (x, y) need not be defined at
(a, b) for the limit to exist.

Remark 1.5.1 (Transition from Spherical to Square Neighborhood). The classical
definition of a limit in two variables employs a spherical neighborhood around the point
(a, b), using the Euclidean distance
$$\sqrt{(x - a)^2 + (y - b)^2} < \delta,$$

which describes a circular region centered at (a, b). This ensures that the point (x, y)
approaches (a, b) from all directions. However, in many analytical contexts, it is convenient
to work with square neighborhoods defined by bounding each coordinate independently.
This leads to the condition

$$|x - a| < \delta, \quad |y - b| < \delta, \quad (x, y) \neq (a, b),$$

which describes a punctured open square centered at (a, b) with side length 2δ. Only the
center is excluded: requiring 0 < |x − a| and 0 < |y − b| separately would also discard the
lines x = a and y = b, along which (x, y) may approach (a, b). This
formulation is logically equivalent to the spherical one, due to the equivalence of norms
in finite-dimensional spaces. The use of square neighbourhoods is especially helpful in
constructing δ-values during ε-δ proofs and provides a more direct path to generalization in
higher dimensions.
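
The equivalence claimed in the remark can be made explicit. The following inequality chain is a standard fact, added here as a sketch and not taken from the notes; it compares the Euclidean distance with the larger of the two coordinate distances.

```latex
\max\{|x-a|,\,|y-b|\}
  \;\le\; \sqrt{(x-a)^2+(y-b)^2}
  \;\le\; \sqrt{2}\,\max\{|x-a|,\,|y-b|\}
```

Consequently, the disk of radius δ about (a, b) lies inside the square of side 2δ, which in turn lies inside the disk of radius √2·δ. A δ that works for one kind of neighborhood therefore yields a (possibly smaller) δ that works for the other.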

Definition 1.5.2 (Limit using Square Neighborhood). Let $f : D \subseteq \mathbb{R}^2 \to \mathbb{R}$ be a
real-valued function defined by z = f (x, y). We say that

$$\lim_{(x,y)\to(a,b)} f(x, y) = L$$

if for every ε > 0, there exists a δ > 0 such that for all (x, y) ∈ D with (x, y) ≠ (a, b),

$$|x - a| < \delta \ \text{ and } \ |y - b| < \delta \;\Rightarrow\; |f(x, y) - L| < \varepsilon.$$


1.5.1 Limit Laws for Functions of Two Variables


Let f (x, y) and g(x, y) be functions such that

$$\lim_{(x,y)\to(a,b)} f(x, y) = L \quad\text{and}\quad \lim_{(x,y)\to(a,b)} g(x, y) = M,$$

where L and M are real numbers, and let c be a constant. Then the following limit laws hold.

1. Sum & Difference Rule: $\lim_{(x,y)\to(a,b)} [f(x, y) \pm g(x, y)] = L \pm M$.

2. Constant Multiple Rule: $\lim_{(x,y)\to(a,b)} [c \cdot f(x, y)] = cL$.

3. Product Rule: $\lim_{(x,y)\to(a,b)} [f(x, y) \cdot g(x, y)] = L \cdot M$.

4. Quotient Rule: $\lim_{(x,y)\to(a,b)} \dfrac{f(x, y)}{g(x, y)} = \dfrac{L}{M}$, provided $M \neq 0$.

5. Power Rule: $\lim_{(x,y)\to(a,b)} [f(x, y)]^n = L^n$, for any positive integer n.

6. Root Rule: $\lim_{(x,y)\to(a,b)} \sqrt[n]{f(x, y)} = \sqrt[n]{L}$, for any positive integer n, provided L ≥ 0 when n is even.

In particular, for the simple case f (x, y) = x, we have

$$\lim_{(x,y)\to(a,b)} x = a.$$

Similar rules apply for f (x, y) = y and f (x, y) = k (k a constant):

$$\lim_{(x,y)\to(a,b)} y = b \quad\text{and}\quad \lim_{(x,y)\to(a,b)} k = k.$$

We now present some examples to illustrate how to compute limits of functions of two
variables.

ILLUSTRATIVE EXAMPLES

Example 14. Using the $\varepsilon$-$\delta$ definition, show that $\lim_{(x,y)\to(1,1)} (x + y) = 2$.

Solution. Here f (x, y) = x + y, (a, b) = (1, 1) and L = 2. To prove that the
limit of the function f (x, y) is 2, we need to show that for every $\varepsilon > 0$ there exists $\delta > 0$
such that
$$|x + y - 2| < \varepsilon \quad\text{whenever}\quad |x - 1| < \delta,\ |y - 1| < \delta,\ (x, y) \neq (1, 1).$$

Let $\varepsilon > 0$ be given. Observe that
$$|f(x, y) - 2| = |x + y - 2| = |(x - 1) + (y - 1)| \leq |x - 1| + |y - 1|.$$
If we choose $\delta = \frac{\varepsilon}{2}$, we get
$$|f(x, y) - 2| \leq |x - 1| + |y - 1| < \delta + \delta = 2\delta = \varepsilon.$$
Thus, for a given $\varepsilon > 0$, we have found $\delta = \frac{\varepsilon}{2}$ such that $|x + y - 2| < \varepsilon$ whenever
$|x - 1| < \delta$ and $|y - 1| < \delta$. This completes the proof.
Example 15. Using the $\varepsilon$-$\delta$ definition, show that
$$\lim_{(x,y)\to(0,0)} (3x - 4y) = 0.$$

Solution. We are given the function f (x, y) = 3x − 4y, the point (a, b) = (0, 0), and the
proposed limit L = 0. To prove that the limit of f (x, y) is 0 as (x, y) → (0, 0), we must show
that for every $\varepsilon > 0$, there exists a $\delta > 0$ such that
$$|3x - 4y - 0| < \varepsilon \quad\text{whenever}\quad |x| < \delta,\ |y| < \delta,\ (x, y) \neq (0, 0).$$
Let $\varepsilon > 0$ be given. We observe that
$$|f(x, y) - 0| = |3x - 4y| \leq |3x| + |4y| = 3|x| + 4|y|.$$
Now, suppose that $|x| < \delta$ and $|y| < \delta$. Then,
$$|3x - 4y| \leq 3|x| + 4|y| < 3\delta + 4\delta = 7\delta.$$
To ensure that $|3x - 4y| < \varepsilon$, it is enough to choose
$$\delta = \frac{\varepsilon}{7}.$$
Then,
$$|3x - 4y| < 7\delta = \varepsilon.$$
Thus, for every $\varepsilon > 0$, we have found a $\delta = \frac{\varepsilon}{7}$ such that $|3x - 4y| < \varepsilon$
whenever $|x| < \delta$ and $|y| < \delta$. This completes the proof.


Example 16. Using the $\varepsilon$-$\delta$ definition, show that

$$\lim_{(x,y)\to(1,1)} (x^2 + 2y) = 3.$$

Solution. We are given $f(x, y) = x^2 + 2y$, the point (a, b) = (1, 1), and the proposed limit
L = 3. To prove that $\lim_{(x,y)\to(1,1)} (x^2 + 2y) = 3$, we must show that for every $\varepsilon > 0$, there
exists a $\delta > 0$ such that

$$|x^2 + 2y - 3| < \varepsilon \quad\text{whenever}\quad |x - 1| < \delta,\ |y - 1| < \delta,\ (x, y) \neq (1, 1).$$

Let $\varepsilon > 0$ be given. We begin by rewriting the expression:

$$|f(x, y) - 3| = |x^2 + 2y - 3| = |x^2 - 1 + 2(y - 1)| \leq |x^2 - 1| + 2|y - 1|.$$

Now note that
$$|x^2 - 1| = |(x - 1)(x + 1)| = |x - 1||x + 1|.$$
To control this, we restrict x to stay close to 1. Suppose $|x - 1| < 1$, so that $x \in (0, 2)$. Then
$|x + 1| < 3$, and thus:
$$|x^2 - 1| = |x - 1||x + 1| \leq 3|x - 1|.$$
Putting it all together:

$$|f(x, y) - 3| \leq |x^2 - 1| + 2|y - 1| \leq 3|x - 1| + 2|y - 1|.$$

Now, assume $|x - 1| < \delta$ and $|y - 1| < \delta$, where $\delta \leq 1$. Then:

$$|f(x, y) - 3| < 3\delta + 2\delta = 5\delta.$$

To ensure $|f(x, y) - 3| < \varepsilon$, choose
$$\delta = \min\left\{1, \frac{\varepsilon}{5}\right\},$$
so that $5\delta \leq \varepsilon$. Then for all (x, y) satisfying $|x - 1| < \delta$, $|y - 1| < \delta$, we have:

$$|x^2 + 2y - 3| < \varepsilon.$$

This completes the proof.
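
A numerical experiment cannot replace the proof, but it can catch a wrong choice of δ. The sketch below samples random points of the open square around (1, 1) for the choice δ = min{1, ε/5} from Example 16 and checks the conclusion |f(x, y) − 3| < ε. The names `delta_for` and `check`, the sample size and the seed are assumptions made for illustration.

```python
import random

def f(x, y):
    return x**2 + 2 * y

def delta_for(eps):
    """The delta chosen in Example 16."""
    return min(1.0, eps / 5.0)

def check(eps, trials=10_000, seed=0):
    """Sample points with |x-1| < delta, |y-1| < delta and test |f - 3| < eps."""
    rng = random.Random(seed)
    d = delta_for(eps)
    for _ in range(trials):
        x = 1 + rng.uniform(-d, d)
        y = 1 + rng.uniform(-d, d)
        # Skip the (extremely unlikely) samples on the boundary of the square.
        if abs(x - 1) >= d or abs(y - 1) >= d:
            continue
        if abs(f(x, y) - 3) >= eps:
            return False
    return True

for eps in (10.0, 0.5, 0.01):
    print(eps, delta_for(eps), check(eps))
```

A result of `True` only means that no counterexample was found among the sampled points; the inequality chain in the solution is what guarantees the bound for every point.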

Example 17. Evaluate the limit of the function $f(x, y) = x^2 + y^2$ at (2, 3).

Solution. We use the rules described above to evaluate the limit of the function as below:

$$\lim_{(x,y)\to(2,3)} (x^2 + y^2) = \lim_{(x,y)\to(2,3)} x^2 + \lim_{(x,y)\to(2,3)} y^2$$
$$= \left(\lim_{(x,y)\to(2,3)} x\right)\left(\lim_{(x,y)\to(2,3)} x\right) + \left(\lim_{(x,y)\to(2,3)} y\right)\left(\lim_{(x,y)\to(2,3)} y\right)$$
$$= 2^2 + 3^2 = 4 + 9 = 13.$$

The same limit can be calculated much more quickly through direct substitution, as given below:

$$\lim_{(x,y)\to(2,3)} (x^2 + y^2) = 2^2 + 3^2 = 4 + 9 = 13.$$

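Example 18. Evaluate each of the following limits:

1. $\lim_{(x,y)\to(1,-2)} (3x - 2y + 5)$
2. $\lim_{(x,y)\to(4,\pi)} x^2 \sin\dfrac{y}{x}$
3. $\lim_{(x,y)\to(-1,2)} \dfrac{x^3 + y^3}{x^2 + y^2}$
4. $\lim_{(x,y)\to(0,0)} \dfrac{x^2 + y^2 + xy}{x^2 + y^2 + 1}$
5. $\lim_{(x,y)\to(0,1)} \dfrac{x + y + 2}{x^2 + y + 1}$
6. $\lim_{(x,y)\to(0,0)} \dfrac{x^2 - xy + y^2}{\sqrt{|x|} + \sqrt{|y|} + 1}$
7. $\lim_{(x,y)\to(1,\frac{\pi}{6})} \dfrac{x \sin y}{x + 3}$
8. $\lim_{(x,y)\to(0,0)} \dfrac{x^2 - 3xy + 5y^2}{x - 2y + 1}$
9. $\lim_{(x,y)\to(0,0)} \dfrac{\sin(x - 3y)}{x - 3y}$
10. $\lim_{(x,y,z)\to(1,0,-1)} \dfrac{e^{x + 2z}}{\sqrt{z^2 + \cos\sqrt{xy + 2}}}$
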

Solution. The limits are evaluated as below:

1. Substituting x = 1 and y = −2 directly, we get

$$\lim_{(x,y)\to(1,-2)} (3x - 2y + 5) = 3(1) - 2(-2) + 5 = 12.$$

2. Proceeding as in the previous part, we have

$$\lim_{(x,y)\to(4,\pi)} x^2 \sin\frac{y}{x} = 4^2 \cdot \sin\frac{\pi}{4} = 16 \cdot \frac{\sqrt{2}}{2} = 8\sqrt{2}.$$

3. As (x, y) → (−1, 2), both the numerator and denominator approach finite values, and
the denominator tends to 5 ≠ 0. Hence,
$$\lim_{(x,y)\to(-1,2)} \frac{x^3 + y^3}{x^2 + y^2} = \frac{(-1)^3 + 2^3}{(-1)^2 + 2^2} = \frac{7}{5}.$$


4. As (x, y) → (0, 0), the numerator tends to 0, and the denominator tends to 1. Hence,

$$\lim_{(x,y)\to(0,0)} \frac{x^2 + y^2 + xy}{x^2 + y^2 + 1} = 0.$$

5. As (x, y) → (0, 1), the numerator tends to 3, and the denominator tends to 2. Thus,

$$\lim_{(x,y)\to(0,1)} \frac{x + y + 2}{x^2 + y + 1} = \frac{3}{2}.$$

6. The numerator approaches 0 and the denominator approaches 1. Hence,

$$\lim_{(x,y)\to(0,0)} \frac{x^2 - xy + y^2}{\sqrt{|x|} + \sqrt{|y|} + 1} = 0.$$

7. In this case also, we may directly put the values. Hence, we have

$$\lim_{(x,y)\to(1,\frac{\pi}{6})} \frac{x \sin y}{x + 3} = \frac{1 \cdot \sin(\pi/6)}{1 + 3} = \frac{1/2}{4} = \frac{1}{8}.$$

8. The numerator tends to 0, and the denominator tends to 1. So,

$$\lim_{(x,y)\to(0,0)} \frac{x^2 - 3xy + 5y^2}{x - 2y + 1} = 0.$$

9. Let us put u = x − 3y. Since (x, y) → (0, 0), therefore u → 0. Thus, the desired limit
is
$$\lim_{(x,y)\to(0,0)} \frac{\sin(x - 3y)}{x - 3y} = \lim_{u\to 0} \frac{\sin u}{u} = 1.$$

10. Given (x, y, z) → (1, 0, −1). We directly put the values of x, y, z. Thus, we have

$$\lim_{(x,y,z)\to(1,0,-1)} \frac{e^{x + 2z}}{\sqrt{z^2 + \cos\sqrt{xy + 2}}} = \frac{e^{1 - 2}}{\sqrt{1 + \cos(\sqrt{2})}} = \frac{e^{-1}}{\sqrt{1 + \cos(\sqrt{2})}}.$$

Example 19. Evaluate $\lim_{(x,y)\to(2,1)} \dfrac{\sin^{-1}(xy - 2)}{\tan^{-1}(3xy - 6)}$.

Solution. Let us define u = xy. As (x, y) → (2, 1), we see that u → 2. So we rewrite the
limit as
$$\lim_{(x,y)\to(2,1)} \frac{\sin^{-1}(xy - 2)}{\tan^{-1}(3xy - 6)} = \lim_{u\to 2} \frac{\sin^{-1}(u - 2)}{\tan^{-1}(3u - 6)}.$$

Now make the substitution h = u − 2, so that h → 0 as u → 2. Then the expression becomes

$$\lim_{h\to 0} \frac{\sin^{-1}(h)}{\tan^{-1}(3h)}.$$

Using the standard linear approximations near 0, i.e., $\sin^{-1}(h) \approx h$ and $\tan^{-1}(3h) \approx 3h$ as
h → 0, we get
$$\lim_{h\to 0} \frac{h}{3h} = \frac{1}{3}.$$
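
The value 1/3 from Example 19 can be checked numerically by evaluating the quotient at points approaching (2, 1). This is a rough sketch: the function name `ratio` and the diagonal approach (2 + h, 1 + h) are choices made here for illustration, and a single approach path only supports the computed limit rather than proving it.

```python
import math

def ratio(x, y):
    """The quotient asin(xy - 2) / atan(3xy - 6) from Example 19."""
    return math.asin(x * y - 2) / math.atan(3 * x * y - 6)

# Approach (2, 1) along the points (2 + h, 1 + h) with h shrinking to 0.
for k in range(1, 6):
    h = 10.0 ** (-k)
    print(h, ratio(2 + h, 1 + h))
```

The printed values should settle near 0.3333 as h decreases.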
Example 20. Show that $\lim_{(x,y)\to(0,0)} \dfrac{x^2 - y^2}{x^2 + y^2}$ does not exist.

Solution. To show that the limit does not exist, we shall choose two different paths. First,
we let (x, y) → (0, 0) along the x-axis, i.e., along the path y = 0. We see that

$$\lim_{\substack{(x,y)\to(0,0) \\ y = 0}} \frac{x^2 - y^2}{x^2 + y^2} = \lim_{x\to 0} \frac{x^2}{x^2} = 1.$$

Secondly, we let (x, y) → (0, 0) along the y-axis, i.e., along the path x = 0. We see that

$$\lim_{\substack{(x,y)\to(0,0) \\ x = 0}} \frac{x^2 - y^2}{x^2 + y^2} = \lim_{y\to 0} \left(-\frac{y^2}{y^2}\right) = -1.$$

We see that the limit is different along two different paths. Since the limit depends on the
path, it does not exist.

Example 21. Evaluate $\lim_{(x,y)\to(0,0)} \dfrac{xy}{x^2 + y^2}$, if it exists.

Solution. Let us assume that the point (x, y) approaches (0, 0) along the path y = mx.
Then,

$$\lim_{\substack{(x,y)\to(0,0) \\ y = mx}} \frac{xy}{x^2 + y^2} = \lim_{x\to 0} \frac{mx^2}{x^2 + m^2x^2} = \frac{m}{1 + m^2}.$$
Thus, the limit depends on the slope m of the path y = mx; therefore, it does not exist.
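
The path dependence in Example 21 is easy to observe numerically. In the sketch below, the names `g` and `value_along_line` are hypothetical; the code evaluates xy/(x² + y²) at points of the line y = mx ever closer to the origin and shows that the value stays at m/(1 + m²) for each slope.

```python
def g(x, y):
    """xy / (x^2 + y^2), defined away from the origin."""
    return x * y / (x**2 + y**2)

def value_along_line(m, t):
    """Value of g at the point (t, m*t) on the line y = m*x, for t != 0."""
    return g(t, m * t)

for m in (0, 1, -1, 2):
    samples = [value_along_line(m, t) for t in (1e-1, 1e-3, 1e-6)]
    print(m, samples, "expected", m / (1 + m**2))
```

Different slopes give different constant values (0, 1/2, −1/2, 2/5), so no single number L can serve as the limit.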

Example 22. Show that $\lim_{(x,y)\to(0,1)} \tan^{-1}\dfrac{y}{x}$ does not exist.

Solution. To analyze the limit, we examine the behavior of the function along different
paths approaching the point (0, 1).

Approach 1: Let y = 1 and $x \to 0^+$. Then,

$$\lim_{(x,y)\to(0,1)} \tan^{-1}\frac{y}{x} = \lim_{x\to 0^+} \tan^{-1}\frac{1}{x} = \lim_{h\to 0^+} \tan^{-1}\frac{1}{0 + h} = \frac{\pi}{2}.$$

Approach 2: Let y = 1 and $x \to 0^-$. Then,

$$\lim_{(x,y)\to(0,1)} \tan^{-1}\frac{y}{x} = \lim_{x\to 0^-} \tan^{-1}\frac{1}{x} = \lim_{h\to 0^+} \tan^{-1}\frac{1}{0 - h} = -\frac{\pi}{2}.$$

Since the function approaches different values along different paths, the given limit
$\lim_{(x,y)\to(0,1)} \tan^{-1}\dfrac{y}{x}$ does not exist.

Example 23. Show that $\lim_{(x,y)\to(0,0)} \frac{2y}{x}$ does not exist.

Solution. Let the point (x, y) approach (0, 0) along the path y = mx, where m is a constant.
Then,

\[
\lim_{(x,y)\to(0,0)} \frac{2y}{x} = \lim_{x\to 0} \frac{2(mx)}{x} = \lim_{x\to 0} 2m = 2m.
\]

Since the value of the limit depends on the parameter m, which varies with the path chosen,
the limit does not exist.

Example 24. Let f : R2 → R be defined by


\[
f(x,y) = \begin{cases} \dfrac{xy}{x^2+y^2}, & (x,y) \neq (0,0), \\[1ex] 0, & (x,y) = (0,0). \end{cases}
\]

Show that the lim(x,y)→(0,0) f (x, y) does not exist.

Solution. Let us approach the point (0, 0) along the path y = mx, where m is a real
constant. Substituting y = mx into the function:

\[
f(x,y) = \frac{xy}{x^2+y^2} = \frac{x(mx)}{x^2+(mx)^2} = \frac{mx^2}{x^2+m^2x^2} = \frac{mx^2}{x^2(1+m^2)} = \frac{m}{1+m^2}.
\]
Thus, $\lim_{(x,y)\to(0,0)} f(x,y) = \frac{m}{1+m^2}$, which depends on the value of m. Since the limit depends
on the path taken (i.e., the value of m), the limit does not exist.


Example 25. Let a function f : R2 → R of two real variables be defined by


\[
f(x,y) = \begin{cases} \dfrac{x^2y}{x^4+y^2}, & (x,y) \neq (0,0), \\[1ex] 0, & (x,y) = (0,0). \end{cases}
\]

Show that the lim(x,y)→(0,0) f (x, y) does not exist.


Solution. We investigate the limit along two different paths to check if it yields different
values.

Path 1: Let us assume that (x, y) → (0, 0) along the parabola y = x2 . Then,

\[
\lim_{(x,y)\to(0,0)} f(x,y) = \lim_{(x,y)\to(0,0)} \frac{x^2(x^2)}{x^4+(x^2)^2} = \lim_{(x,y)\to(0,0)} \frac{x^4}{2x^4} = \frac{1}{2}.
\]

Thus, along the path $y = x^2$, the function approaches $\frac{1}{2}$.

Path 2: This time, we assume that (x, y) → (0, 0) along the x-axis, i.e., y = 0. Then,

\[
\lim_{(x,y)\to(0,0)} f(x,y) = \lim_{(x,y)\to(0,0)} \frac{x^2\cdot 0}{x^4+0} = 0.
\]

Thus, along the path y = 0, the function approaches 0. Since the limit depends on the path
taken, the limit does not exist.
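
This function is a useful warning that testing straight lines alone is not enough. As a hedged numerical sketch (the helper names `along_line` and `along_parabola` are illustrative, not from the notes), the values along lines y = mx shrink toward 0, while the values along the parabola y = x² stay at 1/2.

```python
def f(x, y):
    """f(x, y) = x^2 y / (x^4 + y^2) away from the origin, and 0 at the origin."""
    if x == 0 and y == 0:
        return 0.0
    return x * x * y / (x ** 4 + y * y)

def along_line(m, x):
    """Value of f on the straight line y = m x."""
    return f(x, m * x)

def along_parabola(x):
    """Value of f on the parabola y = x^2."""
    return f(x, x * x)

for x in (1e-1, 1e-2, 1e-3):
    # Line values tend to 0; parabola values remain 0.5.
    print(x, along_line(1.0, x), along_line(5.0, x), along_parabola(x))
```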

1.6 Continuous Functions


In the study of functions of several variables, continuity plays a central role in understanding
how functions behave locally and globally. Just as with functions of a single variable, the
concept of continuity captures the idea that small changes in input produce small changes
in output. We now extend this intuitive idea to functions defined on subsets of R2 and
formalize what it means for such functions to be continuous.

A function f (x, y) is said to be continuous at a point (a, b) if the following three conditions
are satisfied:
1. The value of the function f (x, y) at (a, b) exists, i.e., f (a, b) is defined.
2. The limit of the function f (x, y) at (a, b) exists, i.e., lim(x,y)→(a,b) f (x, y) exists.
3. Both the value of the function f (x, y) and the limit at point (a, b) are equal, i.e.,

\[
\lim_{(x,y)\to(a,b)} f(x,y) = f(a,b).
\]


If all three conditions are met, we write lim(x,y)→(a,b) f (x, y) = f (a, b), and say that f is
continuous at (a, b). A function is said to be continuous on a domain D ⊂ R2 if it is
continuous at every point (x, y) ∈ D. More formally, continuity can be defined as
follows:

Definition 1.6.1 (ε-δ Definition). Let f : R2 → R be a real-valued function. We say
that f is continuous at a point (a, b) ∈ R2 if, for every ε > 0, there exists a δ > 0 such
that
\[
\text{if } \sqrt{(x-a)^2+(y-b)^2} < \delta, \text{ then } |f(x,y)-f(a,b)| < \varepsilon.
\]
That is, for every point (x, y) within a circular neighborhood (an open disk) of radius δ centered at
(a, b), the function value f (x, y) lies within an interval of radius ε around f (a, b). This
formalizes the intuitive idea that small changes in input lead to small changes in output
near the point.
Continuity on a Domain: A function f : R2 → R is said to be continuous on a
domain D ⊆ R2 if it is continuous at every point (x, y) ∈ D. That is, for every point
(a, b) ∈ D, the ε-δ condition of continuity is satisfied. In such cases, we simply say that
f is continuous on D.

Having understood the definition of continuity, it is natural to ask how continuity behaves un-
der standard operations like addition, multiplication, and division of functions. Fortunately,
continuity is preserved under these operations, as stated in the following fundamental result.

Theorem 1.6.1 (Algebra of Continuous Functions:). Let f (x, y) and g(x, y) be functions
that are continuous at a point (a, b) ∈ R2 . Then the following functions are also continuous
at (a, b):

1. f (x, y) + g(x, y) (Sum),


2. f (x, y) − g(x, y) (Difference),
3. f (x, y) · g(x, y) (Product),
4. $\frac{f(x,y)}{g(x,y)}$, provided g(a, b) ≠ 0 (Quotient),
5. c · f (x, y), where c ∈ R is a constant (Scalar multiple).

Furthermore, all polynomial functions in x and y are continuous. Additionally, a composition
h(g(x, y)) is continuous at (a, b) whenever g is continuous at (a, b) and h is continuous at g(a, b).

Many commonly encountered functions in mathematics are continuous within their domains.
For instance, polynomial functions such as $f(x, y) = x^2 + y^2$ are continuous everywhere in
$\mathbb{R}^2$. Rational functions like $\frac{x^2+y^2}{x^2+y^2+1}$ are also continuous at all points where the denominator
is nonzero. Similarly, standard trigonometric, exponential, and logarithmic functions are
continuous wherever they are defined. Discontinuities typically arise either at points where


the function is undefined or where the limit does not coincide with the function value. To
deepen our understanding, we now explore several examples that highlight when functions
are continuous at a specific point or throughout their entire domain.

ILLUSTRATIVE EXAMPLES

Example 26. Using the ε-δ definition, prove that f (x, y) = x + y is continuous at the point
(1, 2).

Solution. We compute the value of the function at (1, 2), i.e., f (1, 2) = 1 + 2 = 3. In order to
show that the function is continuous at (1, 2), we must show that for every ε > 0, there
exists a δ > 0 such that
|f (x, y) − f (1, 2)| < ε,
whenever
|x − 1| < δ and |y − 2| < δ.
(Every point with $\sqrt{(x-1)^2+(y-2)^2} < \delta$ satisfies both of these inequalities, so this is enough for the ε-δ definition.)
Now, let ε > 0 be given. Observe that

|f (x, y) − f (1, 2)| = |x + y − 3| = |(x − 1) + (y − 2)| ≤ |x − 1| + |y − 2|.

If |x − 1| < δ and |y − 2| < δ, then

|f (x, y) − 3| < δ + δ = 2δ.

To make this less than ε, we choose δ = ε/2. Then,

|x − 1| < δ and |y − 2| < δ ⇒ |f (x, y) − 3| < ε.

Hence, f (x, y) = x + y is continuous at (1, 2).

Example 27. Using the ε-δ definition, prove that $f(x, y) = 2x + y^2$ is continuous at the
point (1, 1).

Solution. The function is f (x, y) = 2x + y 2 and at the point (1, 1), we have

f (1, 1) = 2(1) + (1)2 = 3.

Now, compute the difference

\begin{align*}
|f(x,y) - f(1,1)| &= |2x + y^2 - 3| \\
&= |2(x-1) + (y^2-1)| \\
&= |2(x-1) + (y-1)(y+1)| \\
&\le 2|x-1| + |y-1|\,|y+1|.
\end{align*}


To estimate |y + 1|, assume |y − 1| < 1. This implies that 0 < y < 2, i.e., y ∈ (0, 2),
and hence |y + 1| < 3. So, under the assumption δ ≤ 1, whenever |x − 1| < δ and |y − 1| < δ we have:

|f (x, y) − 3| ≤ 2|x − 1| + |y − 1||y + 1| < 2δ + 3δ = 5δ.

To ensure this is less than ε, we choose
\[
\delta = \min\left\{1, \frac{\varepsilon}{5}\right\}.
\]
Thus, we arrive at the conclusion that

|f (x, y) − f (1, 1)| < ε,

whenever |x − 1| < δ and |y − 1| < δ. Therefore, we conclude that the function f (x, y) =
2x + y 2 is continuous at (1, 1).
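
The choice δ = min{1, ε/5} can be spot-checked numerically. The following is only a sampling sketch under stated assumptions (random points in the open box |x − 1| < δ, |y − 1| < δ, with hypothetical helper names `delta_for` and `max_deviation`); it illustrates the estimate but does not replace the proof.

```python
import random

def f(x, y):
    return 2 * x + y * y

def delta_for(eps):
    """The delta chosen in the proof: min(1, eps / 5)."""
    return min(1.0, eps / 5.0)

def max_deviation(eps, samples=10_000, seed=0):
    """Largest |f(x, y) - f(1, 1)| seen over random points with
    |x - 1| <= delta and |y - 1| <= delta."""
    rng = random.Random(seed)
    d = delta_for(eps)
    worst = 0.0
    for _ in range(samples):
        x = 1 + d * rng.uniform(-1, 1)
        y = 1 + d * rng.uniform(-1, 1)
        worst = max(worst, abs(f(x, y) - 3))
    return worst

for eps in (2.0, 0.5, 0.01):
    # The third column should always be smaller than the first.
    print(eps, delta_for(eps), max_deviation(eps))
```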

Example 28. Show that the function f : R2 → R defined by


\[
f(x,y) = \begin{cases} \dfrac{xy^3}{x^2+y^6}, & (x,y) \neq (0,0), \\[1ex] 0, & (x,y) = (0,0) \end{cases}
\]

is not continuous at (0, 0).

Solution. To show that the function is not continuous at (0, 0), it is enough to show that
the limit
\[
\lim_{(x,y)\to(0,0)} \frac{xy^3}{x^2+y^6}
\]

does not exist. We do this by approaching (0, 0) along different paths and showing that the
resulting limits are not the same.

Approach 1: Along the path x = y 3 . Substitute x = y 3 into the function:

\[
\lim_{(x,y)\to(0,0)} f(x,y) = \lim_{(x,y)\to(0,0)} \frac{(y^3)y^3}{(y^3)^2+y^6} = \lim_{(x,y)\to(0,0)} \frac{y^6}{y^6+y^6} = \lim_{(x,y)\to(0,0)} \frac{y^6}{2y^6} = \frac{1}{2}.
\]
Approach 2: Along the path x = 0, i.e., y-axis. Then,

\[
\lim_{(x,y)\to(0,0)} f(x,y) = \lim_{(x,y)\to(0,0)} \frac{0\cdot y^3}{0+y^6} = 0.
\]

Since the limits along two different paths are not equal, $\lim_{(x,y)\to(0,0)} f(x,y)$
does not exist. Hence, f is not continuous at (0, 0).


Example 29. Let f (x, y) = x2 + 3xy − 2y 2 . Show that f is continuous at (1, 2).
Solution. Value of the function and limit at (1, 2) are calculated as below:
\[
f(1,2) = 1^2 + 3(1)(2) - 2(2)^2 = 1 + 6 - 8 = -1
\]
and
\[
\lim_{(x,y)\to(1,2)} f(x,y) = 1^2 + 3(1)(2) - 2(2)^2 = 1 + 6 - 8 = -1.
\]

Since the limit equals the functional value, the function is continuous at (1, 2).

Alternatively, the continuity in this particular case can be ascertained just by observing that
the function f (x, y) is a polynomial in x and y and it is known that the polynomial functions
are continuous everywhere in R2 . Therefore, f is continuous at (1, 2).

Example 30. Let $f(x, y) = \frac{x^2-y^2}{x^2+y^2+1}$. Show that f is continuous everywhere.
Solution. Functional value at a point (a, b) is given by
\[
f(a,b) = \frac{a^2-b^2}{a^2+b^2+1}.
\]
Similarly, the limit at (a, b) is
\[
\lim_{(x,y)\to(a,b)} \frac{x^2-y^2}{x^2+y^2+1} = \frac{a^2-b^2}{a^2+b^2+1}.
\]
Since the function is defined and the limit equals the function value at every point, f is
continuous everywhere in R2 .

Example 31. Let $f(x, y) = \sqrt{x^2 + y^2}$. Show that f is continuous at all $(x, y) \in \mathbb{R}^2$.

Solution. To prove the continuity of the function $f(x, y) = \sqrt{x^2 + y^2}$ at any point $(a, b) \in \mathbb{R}^2$, observe that
\[
\lim_{(x,y)\to(a,b)} \sqrt{x^2+y^2} = \sqrt{a^2+b^2},
\]

which directly implies that f is continuous at (a, b).


Alternatively, we can establish the continuity of f by expressing it as a composition¹ of two
continuous functions. Define
\[
g(x,y) = x^2 + y^2 \quad\text{and}\quad h(t) = \sqrt{t}.
\]
Then the function f can be written as:
\[
f(x,y) = (h\circ g)(x,y) = h(g(x,y)) = \sqrt{x^2+y^2}.
\]
Since g(x, y) is a polynomial in x and y, it is continuous on all of $\mathbb{R}^2$. The function $h(t) = \sqrt{t}$
is continuous for all t ≥ 0, and g(x, y) ≥ 0 for all $(x, y) \in \mathbb{R}^2$. Therefore, the composition
f (x, y) = h(g(x, y)) is continuous on $\mathbb{R}^2$. Hence, f is continuous at every point $(x, y) \in \mathbb{R}^2$.

¹ For more on function composition, refer to Chapter 3.


Example 32. Let f (x, y) = sin(xy). Is f continuous at (0, 0)?


Solution. We aim to determine whether the function f (x, y) = sin(xy) is continuous at the
point (0, 0) ∈ R2 . Observe that the function can be written as a composition of the following
two functions
g(x, y) = xy and h(t) = sin(t).
Then,
f (x, y) = (h ◦ g)(x, y) = h(g(x, y)) = sin(xy).
The function g(x, y) = xy is continuous on R2 because it is the product of continuous func-
tions x and y. The function h(t) = sin(t) is continuous for all real numbers. Therefore, the
composition f (x, y) = sin(xy) is continuous on R2 . In particular, the function is continuous
at (0, 0).

Alternatively, we may go for some computations. To check the continuity of the function f
at a particular point (0, 0), we compute its value and limit both at the point (0, 0). We find
f (0, 0) = sin(0 · 0) = sin(0) = 0,
and
\[
\lim_{(x,y)\to(0,0)} \sin(xy) = \sin(0) = 0.
\]

Since the limit of f (x, y) as (x, y) → (0, 0) exists and equals the value of the function at that
point, we conclude that f is continuous at (0, 0).

Example 33. Let $f(x, y) = e^{x^2+y^2}$. Show that f is continuous at every point in $\mathbb{R}^2$.

Solution. We aim to show that the function $f(x, y) = e^{x^2+y^2}$ is continuous at all points
$(x, y) \in \mathbb{R}^2$. Consider the function as a composition of two functions:


\[
g(x,y) = x^2 + y^2 \quad\text{and}\quad h(t) = e^{t}.
\]
Then, we can write
\[
f(x,y) = (h\circ g)(x,y) = h(g(x,y)) = e^{x^2+y^2}.
\]
The function $g(x, y) = x^2 + y^2$ is a polynomial in two variables and hence continuous on
$\mathbb{R}^2$. The exponential function $h(t) = e^{t}$ is continuous for all real numbers. Therefore, the
composition $f(x, y) = e^{x^2+y^2}$ is continuous everywhere in $\mathbb{R}^2$.

Alternatively, we may compute the value as well as the limit of the function at any point
(a, b) ∈ R2 . The value of the function f at (a, b) is given by
\[
f(a,b) = e^{a^2+b^2},
\]
and the limit at (a, b) is given by
\[
\lim_{(x,y)\to(a,b)} e^{x^2+y^2} = e^{a^2+b^2}.
\]


Note that the value and the limit of the function at the point (a, b) both exist and are equal. Hence,
the function is continuous at (a, b). Since this is true for any (a, b) ∈ R2, we
conclude that f is continuous everywhere in R2.

Chapter 2

Partial Derivatives

In this chapter, we extend the concept of ordinary derivatives to functions of multiple vari-
ables. When dealing with scalar fields such as f (x, y), it becomes essential to understand how
the function changes with respect to each variable independently, as well as in arbitrary di-
rections within the domain. This leads to the foundational concepts of partial derivatives and
directional derivatives, which quantify the rate of change of a multivariable function along
specified directions. These tools are crucial for understanding gradients, tangent planes, and
local linear approximations in higher dimensions.

At the end of this chapter, you will be able to:


1. Compute partial derivatives and gradient of functions of two or more variables,
2. Interpret partial derivatives geometrically,
3. Compute tangent planes to a surface.

2.1 Introduction
For a function y = f (x) of a single variable, its derivative $f'(x)$ is defined as
\[
f'(x) = \lim_{h\to 0} \frac{f(x+h)-f(x)}{h}.
\]
At a point x = a, the derivative $f'(a)$ is then written as
\[
f'(a) = \lim_{h\to 0} \frac{f(a+h)-f(a)}{h}.
\]
This expression represents the rate of change of the function f at the point x = a. Geomet-
rically, it corresponds to moving from a point A(x = a) along the x-axis to a nearby point
B(x = a + h), which lies at a distance h from A. We then compute the ratio of the change
in the function values to the change in the input variable and examine the limiting behavior
of this ratio as h → 0.


Figure 2.1.1: Geometric interpretation of the derivative in one variable

In the case of a function of a single variable, the direction of movement from the point x = a
is fixed. In fact, one can move just along the x-axis, or in the direction of the unit vector
î. However, for functions of several variables, there are infinitely many directions to proceed
from a given point.

Figure 2.1.2: Infinitely many directions in the plane from point (a, b)

For instance, consider a function z = f (x, y). From any point (a, b), we can move not only
in the directions of î (along the x-axis) or ĵ (along the y-axis), but also in any arbitrary
direction in the plane, determined by a unit vector v̂. This leads us to the concept of the
directional derivative, which measures the rate of change of the function f (x, y) in the
direction of a given unit vector v̂. Of particular interest are the directional derivatives in
the directions of î and ĵ, which correspond to changes along the x and y-axes, respectively.


These are known as the partial derivatives of f and are denoted by

fx (a, b) and fy (a, b).

Thus, while the derivative in one variable captures change along a single path, the framework
of directional and partial derivatives in multivariable calculus enables us to explore how
functions behave across all directions from a given point.

2.2 Partial Derivative Along x-Axis


Let us now explore how to express the derivative of a function with respect to one of its
variables, in particular, the partial derivative with respect to x, denoted by fx (a, b).

To begin with, let us consider a point p = (a, b) in the domain of the function f (x, y). We are
interested in analyzing how the function f behaves as we move away from p in the direction
of the x-axis, which corresponds to the unit vector i. For notational convenience, we denote
the unit vector i by the symbol e1 . Thus, we have

e1 = i = (1, 0).

This choice simplifies expressions and facilitates computation while maintaining clarity in
the directional interpretation.

Next, we take a nearby point q located at a small distance h from p in the direction of i.
The coordinates of this new point are

q = p + he1 = (a, b) + h(1, 0) = (a + h, b).

We now consider the difference quotient

\[
\frac{f(a+h,b)-f(a,b)}{h},
\]
which measures the average rate of change of the function between the points p and q along
the x-direction. Taking the limit as h → 0, we obtain the partial derivative of f with respect
to x at the point (a, b):

\[
f_x(a,b) = \lim_{h\to 0} \frac{f(a+h,b)-f(a,b)}{h} = \lim_{h\to 0} \frac{f(p+he_1)-f(p)}{h}.
\]

Figure 2.2.1: Partial Derivative with respect to x

Geometric Interpretation: To understand the meaning of the partial derivative $\frac{\partial f}{\partial x}(a, b)$,
consider the surface defined by z = f (x, y). Fix a particular value of y = b. Now, the
equation y = b defines a vertical plane parallel to the xz-plane. This plane, as shown in
Figure 2.2.1, intersects the surface in a curve. The curve of intersection of the surface with
this plane is described by
z = f (x, b).
This curve lies entirely in the vertical plane y = b, and its shape is determined by how the
function f changes as x varies, with y held constant.

Let us define a single-variable function g(x) = f (x, b), so that the curve becomes

z = g(x).

From single-variable calculus, we know that the slope of the tangent line to this curve at the
point x = a is given by the derivative $\left.\frac{dg}{dx}\right|_{x=a}$. Let us compute this derivative:
\begin{align*}
\left.\frac{dg}{dx}\right|_{x=a} &= \lim_{h\to 0} \frac{g(a+h)-g(a)}{h} \\
&= \lim_{h\to 0} \frac{f(a+h,b)-f(a,b)}{h} \\
&= \frac{\partial f}{\partial x}(a,b).
\end{align*}
This shows that the partial derivative $\frac{\partial f}{\partial x}(a, b)$ represents the slope of the curve formed by
slicing the surface z = f (x, y) with the plane y = b. In other words, it measures how the
surface rises or falls in the x-direction at the point (a, b), while keeping y fixed.

We can also describe this curve of intersection using a vector-valued function:


c(x) = (x, b, g(x)) = (x, b, f (x, b)).
This curve lives in three-dimensional space, and its tangent vector is given by the derivative
\[
c'(x) = (1, 0, g'(x)).
\]
Evaluating this at x = a, we get the tangent vector at the point (a, b, f (a, b)) as
\[
c'(a) = \left(1, 0, \frac{\partial f}{\partial x}(a,b)\right).
\]
This vector lies in the tangent plane to the surface and points in the direction of increasing
x, holding y constant.

Summary: The partial derivative fx (a, b) gives the slope of the surface in the x-direction
at the point (a, b). Geometrically, it is the slope of the tangent to the curve obtained by
slicing the surface with the plane y = b. The corresponding tangent vector to this curve at
x = a is
(1, 0, fx (a, b)),
which will be useful later when we discuss tangent planes to surfaces. Furthermore, this
vector forms an angle φ with the x-axis in three-dimensional space, and the tangent of this
angle gives a geometric realization of the partial derivative in the x-direction.

Let the unit vector along the x-axis be i = (1, 0, 0), and consider the angle φ between i and
the tangent vector v = (1, 0, fx (a, b)). Then
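\[
\cos\varphi = \frac{i\cdot v}{\|i\|\,\|v\|} = \frac{1}{\sqrt{1+f_x(a,b)^2}}.
\]
Taking $f_x(a,b) \ge 0$ (for a negative slope, φ is measured as a negative angle of inclination and the same formula holds), we calculate
\[
\sin\varphi = \sqrt{1-\cos^2\varphi} = \frac{f_x(a,b)}{\sqrt{1+f_x(a,b)^2}}.
\]
Therefore, the tangent of the angle φ is
\[
\tan\varphi = \frac{\sin\varphi}{\cos\varphi} = \frac{f_x(a,b)}{\sqrt{1+f_x(a,b)^2}}\cdot\sqrt{1+f_x(a,b)^2} = f_x(a,b).
\]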
This confirms that the partial derivative fx (a, b) corresponds to the tangent of the angle
between the surface curve and the x-axis in the vertical plane y = b, thereby providing an
intuitive geometric interpretation of the rate of change in the x-direction.
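
The angle computation above can be replayed numerically. This is a small sketch, assuming a non-negative slope so that the square root gives sin φ with the correct sign; the function name `inclination_tangent` and the sample surface z = x² + y² at (1, 2) are illustrative choices.

```python
import math

def inclination_tangent(fx):
    """tan of the angle between i = (1, 0, 0) and v = (1, 0, fx), for fx >= 0."""
    v = (1.0, 0.0, fx)
    norm_v = math.sqrt(sum(c * c for c in v))
    cos_phi = v[0] / norm_v                 # i . v / (|i| |v|), with |i| = 1
    sin_phi = math.sqrt(1.0 - cos_phi ** 2)
    return sin_phi / cos_phi

# For f(x, y) = x^2 + y^2 at (a, b) = (1, 2), the slope is fx(a, b) = 2a = 2.
print(inclination_tangent(2.0))   # agrees with 2 up to rounding
```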


2.3 Partial Derivative Along y-axis


Having defined the partial derivative with respect to x, we now turn our attention to the
partial derivative with respect to y, denoted by fy (a, b).

As before, we begin with a point p = (a, b) in the domain of a function f (x, y), and we
examine how the function changes as we move from p in the direction of the positive y-axis.
This direction is represented by the unit vector j = e2 = (0, 1).

Figure 2.3.1: Partial Derivative with respect to y

We choose a nearby point q, located a small distance k away from p in the y-direction. The
coordinates of this point are given by

q = p + ke2 = (a, b) + k(0, 1) = (a, b + k).

We now form the difference quotient


\[
\frac{f(a,b+k)-f(a,b)}{k},
\]
which measures the average rate of change of the function between the points p and q along
the y-direction. Taking the limit as k → 0, we obtain the partial derivative of f with respect
to y at the point (a, b):
\[
f_y(a,b) = \lim_{k\to 0} \frac{f(a,b+k)-f(a,b)}{k} = \lim_{k\to 0} \frac{f(p+ke_2)-f(p)}{k}.
\]


Geometric Interpretation: To understand the meaning of the partial derivative $\frac{\partial f}{\partial y}(a, b)$,
consider the surface defined by z = f (x, y). Fix a particular value x = a. The equation
x = a defines a vertical plane parallel to the yz-plane. The intersection of the surface with
this plane, as shown in Figure 2.3.1, forms a curve described by

z = f (a, y).

This curve lies entirely in the vertical plane x = a, and its shape is determined by how the
function f changes as y varies, with x held constant.

Let us define a single-variable function h(y) = f (a, y), so that the curve becomes

z = h(y).

From single-variable calculus, we know that the slope of the tangent line to this curve at the
point y = b is given by the derivative $\left.\frac{dh}{dy}\right|_{y=b}$. Let us compute this derivative:
\begin{align*}
\left.\frac{dh}{dy}\right|_{y=b} &= \lim_{k\to 0} \frac{h(b+k)-h(b)}{k} \\
&= \lim_{k\to 0} \frac{f(a,b+k)-f(a,b)}{k} \\
&= \frac{\partial f}{\partial y}(a,b).
\end{align*}

This shows that the partial derivative $\frac{\partial f}{\partial y}(a, b)$ represents the slope of the curve formed by
slicing the surface z = f (x, y) with the plane x = a. In other words, it measures how the
surface rises or falls in the y-direction at the point (a, b), while keeping x fixed.

We can also describe this curve of intersection using a vector-valued function:

d(y) = (a, y, h(y)) = (a, y, f (a, y)).

This curve lies in three-dimensional space, and its tangent vector is given by the derivative

\[
d'(y) = (0, 1, h'(y)).
\]

Evaluating this at y = b, we get the tangent vector at the point (a, b, f (a, b)) as
\[
d'(b) = \left(0, 1, \frac{\partial f}{\partial y}(a,b)\right).
\]

This vector lies in the tangent plane to the surface and points in the direction of increasing
y, holding x constant.


Summary: The partial derivative fy (a, b) gives the slope of the surface in the y-direction at
the point (a, b). Geometrically, it is the slope of the tangent to the curve obtained by slicing
the surface with the plane x = a. The corresponding tangent vector to this curve at y = b is

(0, 1, fy (a, b)),

which will also be useful when we discuss the tangent plane to a surface. Furthermore, this
vector forms an angle θ with the y-axis in three-dimensional space, and the tangent of this
angle provides a geometric interpretation of the partial derivative.

To make this precise, observe that the unit vector in the y-direction is j = (0, 1, 0), and
the angle θ between this vector and the tangent vector v = (0, 1, fy (a, b)) satisfies

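\[
\cos\theta = \frac{j\cdot v}{\|j\|\,\|v\|} = \frac{1}{\sqrt{1+f_y(a,b)^2}}.
\]

Then, applying the identity $\tan\theta = \frac{\sin\theta}{\cos\theta}$, and computing $\sin\theta = \sqrt{1-\cos^2\theta}$ (taking $f_y(a,b) \ge 0$; for a negative slope, θ is measured as a negative angle and the same formula holds), we obtain
\[
\sin\theta = \sqrt{1-\left(\frac{1}{\sqrt{1+f_y(a,b)^2}}\right)^{2}} = \frac{f_y(a,b)}{\sqrt{1+f_y(a,b)^2}}.
\]

Therefore,
\[
\tan\theta = \frac{\sin\theta}{\cos\theta} = \frac{f_y(a,b)}{\sqrt{1+f_y(a,b)^2}}\cdot\sqrt{1+f_y(a,b)^2} = f_y(a,b).
\]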
This confirms that the partial derivative fy (a, b) is, in fact, the tangent of the angle that the
curve makes with the y-axis in the vertical plane x = a, offering a clear geometric interpre-
tation of directional change on the surface.

We now compute partial derivatives for several functions to illustrate the basic techniques
and applications of partial differentiation.

ILLUSTRATIVE EXAMPLES

Example 34. If f (x, y) = x2 + y 2 , then find fx (x, y) and fy (x, y) from the definition.

Solution. We will use the definition of partial derivatives:

\[
f_x(x,y) = \lim_{h\to 0} \frac{f(x+h,y)-f(x,y)}{h}, \qquad f_y(x,y) = \lim_{k\to 0} \frac{f(x,y+k)-f(x,y)}{k}.
\]


In view of these definitions, we get

\begin{align*}
f_x(x,y) &= \lim_{h\to 0} \frac{f(x+h,y)-f(x,y)}{h} \\
&= \lim_{h\to 0} \frac{(x+h)^2+y^2-(x^2+y^2)}{h} \\
&= \lim_{h\to 0} \frac{(x^2+2xh+h^2+y^2)-(x^2+y^2)}{h} \\
&= \lim_{h\to 0} \frac{2xh+h^2}{h} \\
&= 2x.
\end{align*}

Similarly, we compute fy (x, y) as below

\begin{align*}
f_y(x,y) &= \lim_{k\to 0} \frac{f(x,y+k)-f(x,y)}{k} \\
&= \lim_{k\to 0} \frac{x^2+(y+k)^2-(x^2+y^2)}{k} \\
&= \lim_{k\to 0} \frac{(x^2+y^2+2yk+k^2)-(x^2+y^2)}{k} \\
&= \lim_{k\to 0} \frac{2yk+k^2}{k} \\
&= 2y.
\end{align*}

Thus, we finally get


fx (x, y) = 2x, fy (x, y) = 2y.
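
The limit definition also suggests a simple numerical approximation. The sketch below uses central differences, a symmetric variant of the one-sided quotient in the definition, to estimate the partial derivatives of f(x, y) = x² + y² and compare them with 2x and 2y. The helper names `partial_x` and `partial_y`, the step size, and the sample point are assumptions made for illustration.

```python
def f(x, y):
    return x * x + y * y

def partial_x(g, x, y, h=1e-6):
    """Central-difference estimate of g_x(x, y): vary x, hold y fixed."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def partial_y(g, x, y, k=1e-6):
    """Central-difference estimate of g_y(x, y): vary y, hold x fixed."""
    return (g(x, y + k) - g(x, y - k)) / (2 * k)

a, b = 1.5, -0.5
print(partial_x(f, a, b), 2 * a)   # estimate versus exact value 2x
print(partial_y(f, a, b), 2 * b)   # estimate versus exact value 2y
```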

Example 35. Let $f(x, y) = x^2y + 3xy^3$. Compute $f_x(1, 2)$ and $f_y(1, 3)$.

Solution. To compute the partial derivative with respect to x, treat y as a constant:

\[
f_x(x,y) = \frac{\partial}{\partial x}\left(x^2y + 3xy^3\right) = 2xy + 3y^3.
\]
Therefore, we get
\[
f_x(1,2) = 2(1)(2) + 3(8) = 28.
\]
To compute the partial derivative with respect to y, treat x as a constant:
\[
f_y(x,y) = \frac{\partial}{\partial y}\left(x^2y + 3xy^3\right) = x^2 + 9xy^2.
\]
Therefore, we get
\[
f_y(1,3) = 1 + 9(1)(9) = 82.
\]

Example 36. Let $f(x, y) = e^{xy} + x\cos y$. Find $f_x$ and $f_y$.


Solution. The partial derivatives are given by


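\begin{align*}
f_x &= \frac{\partial}{\partial x}\left(e^{xy} + x\cos y\right) = ye^{xy} + \cos y, \\
f_y &= \frac{\partial}{\partial y}\left(e^{xy} + x\cos y\right) = xe^{xy} - x\sin y.
\end{align*}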

Example 37. Let $f(x, y) = \ln(x^2 + y^2)$. Find $f_x$ and $f_y$.

Solution. Partial derivatives are computed as below
\[
f_x = \frac{\partial}{\partial x}\ln(x^2+y^2) = \frac{2x}{x^2+y^2}, \qquad f_y = \frac{\partial}{\partial y}\ln(x^2+y^2) = \frac{2y}{x^2+y^2}.
\]
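
Example 38. Let $f(x, y) = \frac{x}{y} + \frac{y}{x}$. Find $f_x$ and $f_y$.

Solution. Treating y as a constant in the first computation and x as a constant in the second, we get
\begin{align*}
f_x &= \frac{\partial}{\partial x}\left(\frac{x}{y} + \frac{y}{x}\right) = \frac{1}{y} - \frac{y}{x^2}, \\
f_y &= \frac{\partial}{\partial y}\left(\frac{x}{y} + \frac{y}{x}\right) = -\frac{x}{y^2} + \frac{1}{x}.
\end{align*}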

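Example 39. Let $f(x, y) = \tan^{-1}\left(\frac{y}{x}\right)$. Compute $f_x(x, y)$ and $f_y(x, y)$.

Solution. We have $f(x, y) = \tan^{-1}\left(\frac{y}{x}\right)$; therefore, by the chain rule, we have
\[
f_x(x,y) = \frac{1}{1+(y/x)^2}\cdot\left(\frac{-y}{x^2}\right) = \frac{-y}{x^2+y^2}.
\]

Similarly, we compute the partial derivative with respect to y as below.
\[
f_y(x,y) = \frac{1}{1+(y/x)^2}\cdot\left(\frac{1}{x}\right) = \frac{x}{x^2+y^2}.
\]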
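
Closed-form partial derivatives like these can be checked against finite differences. The sketch below does so for the inverse tangent of y/x at the sample point (2, 1); the names `fx_formula`, `fy_formula` and `central_diff` are hypothetical helpers introduced for the illustration.

```python
import math

def f(x, y):
    return math.atan(y / x)          # requires x != 0

def fx_formula(x, y):
    return -y / (x * x + y * y)

def fy_formula(x, y):
    return x / (x * x + y * y)

def central_diff(g, t, h=1e-6):
    """Central-difference estimate of g'(t) for a one-variable function g."""
    return (g(t + h) - g(t - h)) / (2 * h)

x0, y0 = 2.0, 1.0
fx_num = central_diff(lambda x: f(x, y0), x0)   # vary x, hold y = y0
fy_num = central_diff(lambda y: f(x0, y), y0)   # vary y, hold x = x0
print(fx_num, fx_formula(x0, y0))   # both near -0.2
print(fy_num, fy_formula(x0, y0))   # both near  0.4
```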

Example 40. Show that for the function f (x, y) given by


\[
f(x,y) = \begin{cases} \dfrac{xy}{x^2+y^2}, & \text{if } (x,y) \neq (0,0), \\[1ex] 0, & \text{if } (x,y) = (0,0), \end{cases}
\]

all the partial derivatives exist at (0, 0), but f (x, y) is not continuous at (0, 0).

Solution. First, we compute partial derivative using first principle. The partial derivative
fx (0, 0) with respect to x is given by

\[
f_x(0,0) = \lim_{h\to 0} \frac{f(h,0)-f(0,0)}{h} = \lim_{h\to 0} \frac{0-0}{h} = 0.
\]


Similarly, the partial derivative fy (0, 0) with respect to y is given by


\[
f_y(0,0) = \lim_{h\to 0} \frac{f(0,h)-f(0,0)}{h} = \lim_{h\to 0} \frac{0-0}{h} = 0.
\]
Thus both fx (0, 0) and fy (0, 0) exist. Now, we shall check the continuity of f (x, y) at (0, 0).
For this, we first try to compute the limit at (0, 0). We approach (0, 0) along two different paths
y = x and y = −x. Along y = x, we have
\[
\lim_{(x,y)\to(0,0)} f(x,y) = \lim_{(x,y)\to(0,0)} \frac{x^2}{2x^2} = \frac{1}{2}.
\]
Similarly, along the path y = −x, we have
\[
\lim_{(x,y)\to(0,0)} f(x,y) = \lim_{(x,y)\to(0,0)} -\frac{x^2}{2x^2} = -\frac{1}{2}.
\]
Since the function approaches different values along different paths, the limit does not exist.
Therefore, f (x, y) is not continuous at (0, 0) although the partial derivatives fx (0, 0) and
fy (0, 0) both exist.
This example demonstrates that the existence of partial derivatives at a point
does not guarantee continuity at that point.
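
Both halves of this example can be observed numerically. In the sketch below (helper names are illustrative), the difference quotients at the origin are identically zero, while the function values on the diagonals y = x and y = −x stay at 1/2 and −1/2 however close the point is to (0, 0).

```python
def f(x, y):
    """xy / (x^2 + y^2) away from the origin, and 0 at the origin."""
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x * x + y * y)

def fx_quotient_at_origin(h):
    """Difference quotient (f(h, 0) - f(0, 0)) / h."""
    return (f(h, 0.0) - f(0.0, 0.0)) / h

def fy_quotient_at_origin(k):
    """Difference quotient (f(0, k) - f(0, 0)) / k."""
    return (f(0.0, k) - f(0.0, 0.0)) / k

for t in (1e-1, 1e-3, 1e-6):
    # Quotients are 0, yet f on the two diagonals does not approach f(0, 0) = 0.
    print(fx_quotient_at_origin(t), fy_quotient_at_origin(t), f(t, t), f(t, -t))
```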

2.4 Tangent Plane to a Surface


In earlier classes, we studied the tangent line to a curve. Informally, a tangent line to a
curve at a point is the line that just ‘touches’ the curve at that point and matches its
slope there. More precisely, the tangent line to a curve y = f (x) at a
point x = a is the line given by
\[
y = f(a) + f'(a)(x-a).
\]

But this isn’t just a pretty line. It tells you how the function is behaving right there: whether
it’s going up, going down, or leveling off. It’s your go-to tool for understanding local behavior.
In fact, it gives the best linear approximation of the curve near that point.

But what happens when we move from curves to surfaces? Suppose we have a surface defined
by a function of two variables, z = f (x, y). This surface bends in multiple directions - not
just one. If we want to understand how the surface behaves near a point, we need a kind of
plane, a flat surface, that plays the same role the tangent line did for a curve. This is the
tangent plane.

To build an intuitive sense of what the tangent plane is, we can think of every smooth curve
that lies on the surface and passes through the point P (a, b, f (a, b)), as shown in
Figure 2.4.1. Each such curve has a tangent line at P (see Figure 2.4.2) and, remarkably, all these
tangent lines, despite pointing in different directions, lie in a single plane (see Figure 2.4.3).
This plane is known as the tangent plane.

Figure 2.4.1: Curves on the Surface
Figure 2.4.2: Tangent Lines

Intuitive Definition: The tangent plane at a point on a smooth surface is the unique plane
that contains tangent vectors to all smooth curves on the surface that pass through
that point.

Figure 2.4.3: The Tangent Plane

In essence, the tangent plane is the best flat approximation of the surface near that


point, just as the tangent line is the best straight-line approximation to a curve.

2.4.1 Equation of the Tangent Plane


We know that the equation of a plane passing through a point P with position vector r0 ,
and perpendicular to a vector N, is given by
(r − r0 ) · N = 0,
where r = (x, y, z) is the position vector of a general point on the plane.

Now, consider a surface given by z = f (x, y), and let P = (a, b, f (a, b)) be a point on this
surface. Our goal is to find the equation of the tangent plane to the surface at P . To do this,
we need a vector N that is normal to the tangent plane at P , known as the surface normal.

To find such a normal vector, we first construct two non-parallel vectors lying in the tangent
plane. These can be obtained from the tangent vectors to curves formed by intersecting the
surface z = f (x, y) with the vertical planes y = b and x = a, respectively. These curves lie
entirely on the surface and pass through the point P , and their tangent vectors at P are:
T1 = (1, 0, fx (a, b)) and T2 = (0, 1, fy (a, b)).
These two vectors span the tangent plane at P , and their cross product gives a vector normal
to this plane:
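\[
N = T_1 \times T_2 = \begin{vmatrix} i & j & k \\ 1 & 0 & f_x(a,b) \\ 0 & 1 & f_y(a,b) \end{vmatrix}.
\]
Expanding the determinant, we get:
\begin{align*}
N &= i \begin{vmatrix} 0 & f_x(a,b) \\ 1 & f_y(a,b) \end{vmatrix} - j \begin{vmatrix} 1 & f_x(a,b) \\ 0 & f_y(a,b) \end{vmatrix} + k \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} \\
&= -f_x(a,b)\, i - f_y(a,b)\, j + k.
\end{align*}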
In vector form, the surface normal is

N = (−fx (a, b), −fy (a, b), 1) .

Using this normal vector and the point P = (a, b, f (a, b)), the equation of the tangent plane
is:
−fx (a, b)(x − a) − fy (a, b)(y − b) + 1 · (z − f (a, b)) = 0.
Rearranging terms gives the familiar form:

z = f (a, b) + fx (a, b)(x − a) + fy (a, b)(y − b) ,

which is the equation of the tangent plane to the surface z = f (x, y) at the point (a, b, f (a, b)).
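
The construction can be traced in a few lines of code. This is a minimal sketch, assuming the values f(a, b), fx(a, b) and fy(a, b) are already known; the function names `cross` and `tangent_plane` are hypothetical, and the sample data are for the surface z = x² + y² at the point (1, 2, 5), where fx = 2 and fy = 4.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def tangent_plane(f_ab, fx_ab, fy_ab, a, b):
    """Return the normal N = T1 x T2 and the plane as a function z = plane(x, y)."""
    t1 = (1.0, 0.0, fx_ab)     # tangent to the slice y = b
    t2 = (0.0, 1.0, fy_ab)     # tangent to the slice x = a
    n = cross(t1, t2)          # equals (-fx_ab, -fy_ab, 1)

    def plane(x, y):
        return f_ab + fx_ab * (x - a) + fy_ab * (y - b)

    return n, plane

n, plane = tangent_plane(5.0, 2.0, 4.0, 1.0, 2.0)
print(n)                 # (-2.0, -4.0, 1.0)
print(plane(1.0, 2.0))   # 5.0, the plane passes through the point of tangency
```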


ILLUSTRATIVE EXAMPLES

Example 41. Find the equation of the tangent plane to the surface z = x2 + y 2 at the point
(1, 2, 5).

Solution. We are given f (x, y) = x2 + y 2 . First, compute the partial derivatives:

fx (x, y) = 2x, fy (x, y) = 2y.

Evaluating at the point (1, 2), we get:

fx (1, 2) = 2, fy (1, 2) = 4.

The equation of the tangent plane at (1, 2, 5) is given by

z = f (1, 2) + fx (1, 2)(x − 1) + fy (1, 2)(y − 2).

Since $f(1, 2) = 1^2 + 2^2 = 5$, we substitute:

z = 5 + 2(x − 1) + 4(y − 2).

Simplifying, we get
z = 2x + 4y − 5.

Example 42. Find the tangent plane to the surface $z = e^{xy}$ at the point (0, 1, 1).

Solution. Let $f(x, y) = e^{xy}$. Compute the partial derivatives:
\[
f_x(x,y) = ye^{xy}, \qquad f_y(x,y) = xe^{xy}.
\]
At the point (0, 1), we get:
\[
f_x(0,1) = 1\cdot e^{0} = 1, \qquad f_y(0,1) = 0\cdot e^{0} = 0.
\]
Also, $f(0, 1) = e^{0} = 1$. The equation of the tangent plane is:

z = 1 + 1(x − 0) + 0(y − 1),

which simplifies to:


z = 1 + x.

Example 43. Let f (x, y) = log(x2 + y 2 ). Find the equation of the tangent plane at the
point (1, 1, log 2).


Figure 2.4.4: Normal to the Sphere

Solution. We have f (x, y) = log(x2 + y 2 ). Compute the partial derivatives:


\[
f_x(x,y) = \frac{2x}{x^2+y^2}, \qquad f_y(x,y) = \frac{2y}{x^2+y^2}.
\]

At the point (1, 1), we get:


\[
f_x(1,1) = \frac{2}{2} = 1, \qquad f_y(1,1) = \frac{2}{2} = 1.
\]
Since $f(1, 1) = \log(1^2 + 1^2) = \log 2$, the tangent plane is:

z = log 2 + 1(x − 1) + 1(y − 1),

or simplifying:
z = log 2 + x + y − 2.

Example 44. Find the equation of the tangent plane to the surface $z = \sqrt{1 - x^2 - y^2}$ at
the point (x, y) = (a, b), where $a^2 + b^2 < 1$. Interpret your calculations geometrically.

Solution. Let $f(x, y) = \sqrt{1 - x^2 - y^2}$. To find the tangent plane at (a, b, f (a, b)), we first
compute the partial derivatives:
\[
f_x(x,y) = \frac{-x}{\sqrt{1-x^2-y^2}}, \qquad f_y(x,y) = \frac{-y}{\sqrt{1-x^2-y^2}}.
\]
Evaluating at (a, b), we get
\[
f_x(a,b) = \frac{-a}{\sqrt{1-a^2-b^2}}, \qquad f_y(a,b) = \frac{-b}{\sqrt{1-a^2-b^2}}.
\]

45
2.4. TANGENT PLANE TO A SURFACE

Also, √
f (a, b) = 1 − a2 − b2 = c(say).
Now use the formula for the tangent plane:

z = f (a, b) + fx (a, b)(x − a) + fy (a, b)(y − b).

Substitute the values, we get


a b
z = c − (x − a) − (y − b).
c c
Thus, the equation of the tangent plane to the hemisphere at the point (a, b, c) is given by
a b
z = c − (x − a) − (y − b).
c c
Geometric Interpretation: From the earlier calculations, we observed that the normal
vector to the surface defined by p
z = 1 − x2 − y 2

at a point P (a, b, c), where c = 1 − a2 − b2 , is given by
 
a b
N = (−fx (a, b), −fy (a, b), 1) = , ,1 .
c c
The position vector of the point P on the surface is

r = a i + b j + c k.

Since $c > 0$ on the upper hemisphere and $\mathbf{r} = c\,\mathbf{N}$, the vectors $\mathbf{r}$ and $\mathbf{N}$ are parallel
and point in the same direction. This means the normal vector points outward from the
surface and is aligned with the radius vector of the sphere at that point. Therefore, the
tangent plane at $P$ is perpendicular to the radius vector $\mathbf{r}$.

This leads to a fundamental geometric fact: At any point on the surface of a sphere,
the radius vector from the center to that point is perpendicular to the tangent
plane. In other words, the radial direction defines the outward normal to the surface.
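As a quick numerical illustration of this fact, the sketch below (an addition, not part of the original notes) estimates the normal $(-f_x, -f_y, 1)$ on the upper unit hemisphere by central differences and checks that it is parallel to the position vector. The sample point $(0.3, 0.4)$, the step size `h`, and the helper names are arbitrary illustrative choices.

```python
import math


def f(x, y):
    # Upper unit hemisphere z = sqrt(1 - x^2 - y^2)
    return math.sqrt(1.0 - x * x - y * y)


def surface_normal(a, b, h=1e-6):
    """Approximate N = (-f_x, -f_y, 1) at (a, b) using central differences."""
    fx = (f(a + h, b) - f(a - h, b)) / (2 * h)
    fy = (f(a, b + h) - f(a, b - h)) / (2 * h)
    return (-fx, -fy, 1.0)


def cross(u, v):
    """Cross product of two 3-vectors; it vanishes when u and v are parallel."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


a, b = 0.3, 0.4
c = f(a, b)
r = (a, b, c)              # position (radius) vector of P
N = surface_normal(a, b)   # numerical surface normal at P
residual = cross(r, N)     # should be close to the zero vector
print("N =", N)
print("r x N =", residual)
```

A cross product close to zero indicates that $\mathbf{r}$ and $\mathbf{N}$ are parallel, consistent with $\mathbf{r} = c\,\mathbf{N}$.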

Remark 2.4.1. Lower Hemisphere: For the lower hemisphere, defined by

\[ z = -\sqrt{1 - x^2 - y^2}, \]

the same method gives the surface normal as

\[ \mathbf{N} = \left(-f_x(a, b),\, -f_y(a, b),\, 1\right) = \left(\frac{a}{c},\, \frac{b}{c},\, 1\right), \]

where now $c = -\sqrt{1 - a^2 - b^2} < 0$. In this case, even though the formula gives a normal
vector, its direction is opposite to the position vector $\mathbf{r} = (a, b, c)$. Hence, the computed
normal points inward, toward the center of the sphere, rather than outward.

To maintain geometric and physical consistency (e.g., when defining flux or orientation for
surface integrals), we reverse the direction of the normal and define the outward normal
as

\[ \mathbf{N}_{\text{out}} = \left(f_x(a, b),\, f_y(a, b),\, -1\right). \]

This ensures that the surface normal always points away from the center of the sphere, regardless of
whether we are on the upper or lower hemisphere.

Conclusion: While standard formulas like $(-f_x, -f_y, 1)$ yield inward-pointing normals on
the lower hemisphere, we explicitly reverse their direction to preserve the outward-pointing
convention. This aligns the normal vector with the position vector throughout the sphere,
and ensures that the radius vector remains perpendicular to the tangent plane at every point
on the surface.

Figure 2.4.5: Normal Vector Field on Sphere

2.4.2 Second and Higher Order Partial Derivatives


Just as with functions of a single variable, we can compute higher-order derivatives for func-
tions of several variables by repeatedly applying partial differentiation.

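Let $f(x, y)$ be a function of two variables. The second-order partial derivatives of $f(x, y)$
are obtained by differentiating the first-order partial derivatives again with respect to either
$x$ or $y$. There are four such second-order derivatives:

• Second derivative with respect to $x$: $f_{xx}(x, y) = \dfrac{\partial}{\partial x}\left(\dfrac{\partial f}{\partial x}\right) = \dfrac{\partial^2 f}{\partial x^2}$.

• Second derivative with respect to $y$: $f_{yy}(x, y) = \dfrac{\partial}{\partial y}\left(\dfrac{\partial f}{\partial y}\right) = \dfrac{\partial^2 f}{\partial y^2}$.

• Mixed partial derivative, $x$ then $y$: $f_{xy}(x, y) = \dfrac{\partial}{\partial y}\left(\dfrac{\partial f}{\partial x}\right) = \dfrac{\partial^2 f}{\partial y\,\partial x}$.

• Mixed partial derivative, $y$ then $x$: $f_{yx}(x, y) = \dfrac{\partial}{\partial x}\left(\dfrac{\partial f}{\partial y}\right) = \dfrac{\partial^2 f}{\partial x\,\partial y}$.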

Theorem 2.4.1. (Equality of Mixed Partial Derivatives)


If f (x, y) and all the partial derivatives fx , fy , fxy and fyx are defined throughout an
open region containing a point (a, b) and are all continuous at (a, b), then

fxy (a, b) = fyx (a, b).


This result is known as the Schwarz Theorem and it simplifies computation by assuring
us that the order of differentiation does not matter in such cases.

Higher-Order Derivatives Beyond Second Order


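The idea of repeated differentiation can be extended to obtain third-order, fourth-order, and
in general, $n$th-order partial derivatives. These are denoted using subscript notation such
as:

\[ f_{xxy}(x, y) = \frac{\partial^3 f}{\partial y\,\partial x^2}, \qquad f_{yyx}(x, y) = \frac{\partial^3 f}{\partial x\,\partial y^2}, \quad \text{etc.} \]

Here the subscripts are read left to right (differentiate first with respect to the leftmost variable), in keeping with the convention $f_{xy} = \partial^2 f / \partial y\,\partial x$ used above.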
Such higher-order derivatives play important roles in approximation theory (e.g., Taylor
series), optimization, and differential equations.

ILLUSTRATIVE EXAMPLES

Example 45. Let $f(x, y) = x^2 y + 3xy^3$. Compute the second-order partial derivatives.

Solution. The first-order partial derivatives are given by

\[ f_x(x, y) = 2xy + 3y^3, \qquad f_y(x, y) = x^2 + 9xy^2. \]

Differentiating again yields the second-order partial derivatives

\[ f_{xx} = 2y, \qquad f_{yy} = 18xy, \qquad f_{xy} = 2x + 9y^2, \qquad f_{yx} = 2x + 9y^2. \]

It is now easy to see that $f_{xy} = f_{yx}$.

Example 46. If $f(x, y) = \sin(xy^3)$, then show that $f_{xy} = f_{yx}$.

Solution. Let us compute the mixed partial derivatives $f_{xy}$ and $f_{yx}$ and verify that they are
equal. First, we compute $f_x(x, y)$. Treating $y$ as a constant, we get

\[ f_x(x, y) = \frac{\partial}{\partial x}\sin(xy^3) = \cos(xy^3) \cdot \frac{\partial}{\partial x}(xy^3) = y^3 \cos(xy^3). \]

Now, in order to compute $f_{xy}(x, y)$, we differentiate $f_x$ with respect to $y$:

\begin{align*}
f_{xy}(x, y) &= \frac{\partial}{\partial y}\left[\cos(xy^3) \cdot y^3\right] \\
&= \frac{\partial}{\partial y}\left(\cos(xy^3)\right) \cdot y^3 + \cos(xy^3) \cdot \frac{\partial}{\partial y}(y^3) \\
&= \left[-\sin(xy^3) \cdot 3xy^2\right] \cdot y^3 + \cos(xy^3) \cdot 3y^2 \\
&= -3xy^5 \sin(xy^3) + 3y^2 \cos(xy^3).
\end{align*}

Similarly, we first compute $f_y(x, y)$, treating $x$ as a constant:

\[ f_y(x, y) = \frac{\partial}{\partial y}\sin(xy^3) = \cos(xy^3) \cdot \frac{\partial}{\partial y}(xy^3) = \cos(xy^3) \cdot 3xy^2. \]

Now, we compute $f_{yx}(x, y)$ as below:

\begin{align*}
f_{yx}(x, y) &= \frac{\partial}{\partial x}\left[3xy^2 \cos(xy^3)\right] \\
&= \frac{\partial}{\partial x}(3xy^2) \cdot \cos(xy^3) + 3xy^2 \cdot \frac{\partial}{\partial x}\cos(xy^3) \\
&= 3y^2 \cos(xy^3) + 3xy^2 \cdot \left(-\sin(xy^3) \cdot y^3\right) \\
&= 3y^2 \cos(xy^3) - 3xy^5 \sin(xy^3).
\end{align*}

Hence, $f_{xy} = f_{yx}$, as required.
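The closed form obtained for the mixed partial derivatives of $\sin(xy^3)$ can be sanity-checked numerically. The sketch below is an illustrative addition: `d_dx` and `d_dy` are hypothetical helper names, and the sample point and step size are arbitrary. Note that nested central differences evaluate the function at the same four points in either order, so the two numerical values agree up to rounding almost by construction; the more informative check is the comparison with the closed form.

```python
import math


def f(x, y):
    return math.sin(x * y**3)


def d_dx(g, h=1e-4):
    """Return a function approximating the partial derivative of g with respect to x."""
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)


def d_dy(g, h=1e-4):
    """Return a function approximating the partial derivative of g with respect to y."""
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)


f_xy = d_dy(d_dx(f))   # differentiate with respect to x first, then y
f_yx = d_dx(d_dy(f))   # differentiate with respect to y first, then x


def closed_form(x, y):
    # 3 y^2 cos(x y^3) - 3 x y^5 sin(x y^3), as derived in the example
    return 3 * y**2 * math.cos(x * y**3) - 3 * x * y**5 * math.sin(x * y**3)


x0, y0 = 0.7, 1.2
print(f_xy(x0, y0), f_yx(x0, y0), closed_form(x0, y0))
```

All three printed numbers should agree to several decimal places.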


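Example 47. If $u = e^{xyz}$, show that

\[ \frac{\partial^3 u}{\partial x\,\partial y\,\partial z} = (1 + 3xyz + x^2 y^2 z^2)\,e^{xyz}. \]

Solution. We are given $u = e^{xyz}$. We want to compute the third mixed partial derivative
$\dfrac{\partial^3 u}{\partial x\,\partial y\,\partial z}$. First, compute the partial derivative with respect to $z$:

\[ u_z = \frac{\partial}{\partial z}\left(e^{xyz}\right) = e^{xyz} \cdot \frac{\partial}{\partial z}(xyz) = e^{xyz} \cdot xy. \]

Next, differentiate $u_z$ with respect to $y$ to obtain

\[ u_{zy} = \frac{\partial}{\partial y}\left(xy \cdot e^{xyz}\right) = \frac{\partial}{\partial y}(xy) \cdot e^{xyz} + xy \cdot \frac{\partial}{\partial y}\left(e^{xyz}\right). \]

This gives

\[ u_{zy} = x \cdot e^{xyz} + xy \cdot \left(e^{xyz} \cdot xz\right) = (x + x^2 yz)\,e^{xyz}. \]

Finally, we differentiate $u_{zy}$ with respect to $x$ to get

\begin{align*}
u_{zyx} &= \frac{\partial}{\partial x}\left[x e^{xyz} + x^2 yz\, e^{xyz}\right] \\
&= \frac{\partial}{\partial x}(x) \cdot e^{xyz} + x \cdot \frac{\partial}{\partial x}\left(e^{xyz}\right) + \frac{\partial}{\partial x}(x^2 yz) \cdot e^{xyz} + x^2 yz \cdot \frac{\partial}{\partial x}\left(e^{xyz}\right) \\
&= 1 \cdot e^{xyz} + x \cdot \left(e^{xyz} \cdot yz\right) + 2xyz \cdot e^{xyz} + x^2 yz \cdot \left(e^{xyz} \cdot yz\right) \\
&= e^{xyz} + xyz\,e^{xyz} + 2xyz\,e^{xyz} + x^2 y^2 z^2 e^{xyz} \\
&= (1 + 3xyz + x^2 y^2 z^2)\,e^{xyz}.
\end{align*}

Hence, we have proved that

\[ \frac{\partial^3 u}{\partial x\,\partial y\,\partial z} = (1 + 3xyz + x^2 y^2 z^2)\,e^{xyz}, \]

as required.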

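Example 48. If $u = (x^2 + y^2 + z^2)^{-1/2}$, where $x^2 + y^2 + z^2 \neq 0$, then prove the following:

1. $x\,\dfrac{\partial u}{\partial x} + y\,\dfrac{\partial u}{\partial y} + z\,\dfrac{\partial u}{\partial z} = -u$.

2. $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} + \dfrac{\partial^2 u}{\partial z^2} = 0$.

Solution. Let us define

\[ u = (x^2 + y^2 + z^2)^{-1/2} = r^{-1}, \quad \text{where } r = \sqrt{x^2 + y^2 + z^2}. \]

Part (1): We first compute the partial derivatives of $u$ as below:

\[ \frac{\partial u}{\partial x} = \frac{\partial}{\partial x}(x^2 + y^2 + z^2)^{-1/2} = -\frac{1}{2}(x^2 + y^2 + z^2)^{-3/2} \cdot 2x = -x(x^2 + y^2 + z^2)^{-3/2}. \]

Similarly,

\[ \frac{\partial u}{\partial y} = -y(x^2 + y^2 + z^2)^{-3/2}, \qquad \frac{\partial u}{\partial z} = -z(x^2 + y^2 + z^2)^{-3/2}. \]

Now compute the expression:

\begin{align*}
x\frac{\partial u}{\partial x} + y\frac{\partial u}{\partial y} + z\frac{\partial u}{\partial z}
&= -x^2(x^2 + y^2 + z^2)^{-3/2} - y^2(x^2 + y^2 + z^2)^{-3/2} - z^2(x^2 + y^2 + z^2)^{-3/2} \\
&= -(x^2 + y^2 + z^2) \cdot (x^2 + y^2 + z^2)^{-3/2} \\
&= -(x^2 + y^2 + z^2)^{-1/2} = -u.
\end{align*}

Hence,

\[ x\frac{\partial u}{\partial x} + y\frac{\partial u}{\partial y} + z\frac{\partial u}{\partial z} = -u. \]

Part (2): We already have

\[ \frac{\partial u}{\partial x} = -x(x^2 + y^2 + z^2)^{-3/2}. \]

Now differentiate again with respect to $x$:

\[ \frac{\partial^2 u}{\partial x^2} = \frac{\partial}{\partial x}\left[-x(x^2 + y^2 + z^2)^{-3/2}\right]. \]

Using the product rule:

\begin{align*}
\frac{\partial^2 u}{\partial x^2}
&= -\frac{\partial}{\partial x}(x) \cdot (x^2 + y^2 + z^2)^{-3/2} - x \cdot \frac{\partial}{\partial x}(x^2 + y^2 + z^2)^{-3/2} \\
&= -(x^2 + y^2 + z^2)^{-3/2} - x \cdot \left(-\frac{3}{2}(x^2 + y^2 + z^2)^{-5/2} \cdot 2x\right) \\
&= -(x^2 + y^2 + z^2)^{-3/2} + 3x^2(x^2 + y^2 + z^2)^{-5/2}.
\end{align*}

Similarly,

\[ \frac{\partial^2 u}{\partial y^2} = -(x^2 + y^2 + z^2)^{-3/2} + 3y^2(x^2 + y^2 + z^2)^{-5/2}, \]

\[ \frac{\partial^2 u}{\partial z^2} = -(x^2 + y^2 + z^2)^{-3/2} + 3z^2(x^2 + y^2 + z^2)^{-5/2}. \]

Adding all three:

\begin{align*}
\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}
&= -3(x^2 + y^2 + z^2)^{-3/2} + 3(x^2 + y^2 + z^2)(x^2 + y^2 + z^2)^{-5/2} \\
&= -3(x^2 + y^2 + z^2)^{-3/2} + 3(x^2 + y^2 + z^2)^{-3/2} \\
&= 0.
\end{align*}

Hence,

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0. \]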
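Both identities for $u = (x^2 + y^2 + z^2)^{-1/2}$ can be spot-checked numerically at a sample point. The following Python sketch is an illustrative addition: the helper names `first` and `second`, the sample point $(1, 2, 2)$ and the step sizes are assumptions, and finite differences only approximate the derivatives.

```python
def u(x, y, z):
    return (x * x + y * y + z * z) ** -0.5


def shifted(p, i, delta):
    """Copy of point p with the i-th coordinate shifted by delta."""
    q = list(p)
    q[i] += delta
    return q


def first(g, p, i, h=1e-5):
    """Central-difference estimate of the first partial derivative in direction i."""
    return (g(*shifted(p, i, h)) - g(*shifted(p, i, -h))) / (2 * h)


def second(g, p, i, h=1e-4):
    """Central-difference estimate of the second partial derivative in direction i."""
    return (g(*shifted(p, i, h)) - 2 * g(*p) + g(*shifted(p, i, -h))) / (h * h)


p = (1.0, 2.0, 2.0)                                    # here r = 3 and u = 1/3
euler_sum = sum(p[i] * first(u, p, i) for i in range(3))
laplacian = sum(second(u, p, i) for i in range(3))

print("x u_x + y u_y + z u_z =", euler_sum, " (compare with -u =", -u(*p), ")")
print("u_xx + u_yy + u_zz    =", laplacian, " (compare with 0)")
```

The first printed value should be close to $-u = -1/3$, and the second close to $0$, up to finite-difference error.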
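Example 49. If $u = f(r)$, where $r^2 = x^2 + y^2$, show that

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f''(r) + \frac{1}{r} f'(r). \]

Solution. We are given $u = f(r)$, where $r = \sqrt{x^2 + y^2}$. So, $u$ is a function of $r$, and $r$ is
a function of $x$ and $y$. We will compute the second partial derivatives using the chain rule.
First, compute the partial derivatives of $r$:

\[ \frac{\partial r}{\partial x} = \frac{1}{2\sqrt{x^2 + y^2}} \cdot 2x = \frac{x}{r}, \qquad \frac{\partial r}{\partial y} = \frac{y}{r}. \]

Now compute $\dfrac{\partial u}{\partial x}$ using the chain rule:

\[ \frac{\partial u}{\partial x} = \frac{df}{dr} \cdot \frac{\partial r}{\partial x} = f'(r) \cdot \frac{x}{r}. \]

Differentiate again with respect to $x$:

\begin{align*}
\frac{\partial^2 u}{\partial x^2} &= \frac{\partial}{\partial x}\left[f'(r) \cdot \frac{x}{r}\right] \\
&= \frac{\partial f'}{\partial x} \cdot \frac{x}{r} + f'(r) \cdot \frac{\partial}{\partial x}\left(\frac{x}{r}\right).
\end{align*}

We compute the two derivatives on the right-hand side separately. The first is

\[ \frac{\partial f'}{\partial x} = f''(r) \cdot \frac{\partial r}{\partial x} = f''(r) \cdot \frac{x}{r}. \]

Similarly, the second is

\[ \frac{\partial}{\partial x}\left(\frac{x}{r}\right) = \frac{r \cdot 1 - x \cdot \frac{\partial r}{\partial x}}{r^2} = \frac{r - x \cdot \frac{x}{r}}{r^2} = \frac{r^2 - x^2}{r^3}. \]

So,

\[ \frac{\partial^2 u}{\partial x^2} = f''(r) \cdot \frac{x^2}{r^2} + f'(r) \cdot \frac{r^2 - x^2}{r^3}. \]

Similarly,

\[ \frac{\partial^2 u}{\partial y^2} = f''(r) \cdot \frac{y^2}{r^2} + f'(r) \cdot \frac{r^2 - y^2}{r^3}. \]

Now add the two second derivatives:

\begin{align*}
\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
&= f''(r)\left[\frac{x^2 + y^2}{r^2}\right] + f'(r)\left[\frac{2r^2 - (x^2 + y^2)}{r^3}\right] \\
&= f''(r) \cdot \frac{r^2}{r^2} + f'(r) \cdot \frac{r^2}{r^3} \\
&= f''(r) + \frac{1}{r} f'(r).
\end{align*}

Hence,

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f''(r) + \frac{1}{r} f'(r), \]

as required.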
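To see the identity in action, one can pick a concrete radial profile. The sketch below is an illustrative addition, not part of the original notes: it assumes the hypothetical choice $f(r) = r^3$, for which $f''(r) + f'(r)/r = 6r + 3r = 9r$, and compares this with a five-point finite-difference estimate of $u_{xx} + u_{yy}$ at the sample point $(3, 4)$, where $r = 5$.

```python
import math


def u(x, y):
    # u = f(r) with the illustrative choice f(r) = r^3 and r = sqrt(x^2 + y^2)
    r = math.sqrt(x * x + y * y)
    return r**3


def laplacian(g, x, y, h=1e-3):
    """Five-point finite-difference estimate of g_xx + g_yy at (x, y)."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h) - 4 * g(x, y)) / (h * h)


x0, y0 = 3.0, 4.0
r0 = math.sqrt(x0 * x0 + y0 * y0)      # r0 = 5

lhs = laplacian(u, x0, y0)             # numerical u_xx + u_yy
rhs = 6 * r0 + (3 * r0**2) / r0        # f''(r) + f'(r)/r for f(r) = r^3

print(lhs, rhs)
```

Both numbers should be close to $9r = 45$.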

Example 50. If $x^x y^y z^z = c$, then show that at $x = y = z$,

\[ \frac{\partial^2 z}{\partial x\,\partial y} = -\frac{1}{x \log(ex)}. \]


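Solution. We are given the implicit equation $x^x y^y z^z = c$. Taking logarithms on both
sides to simplify the expression,

\[ \log(x^x y^y z^z) = \log c, \]

gives

\[ x \log x + y \log y + z \log z = \log c. \]

Since $c$ is a constant, $\log c$ is also constant. Now, treat $z$ as a function of $x$ and $y$, i.e.,
$z = z(x, y)$. Differentiating both sides of the above equation partially with respect to $y$, we
get

\[ \frac{\partial}{\partial y}\left(x \log x + y \log y + z \log z\right) = 0. \]

Since $x \log x$ does not depend on $y$, this gives us

\[ \frac{\partial}{\partial y}(y \log y) + \frac{\partial}{\partial y}(z \log z) = 0. \]

Carrying out the differentiation, we arrive at

\[ (\log y + 1) + \log z\,\frac{\partial z}{\partial y} + \frac{\partial z}{\partial y} = 0, \]

that is,

\[ \log y + 1 + (\log z + 1)\frac{\partial z}{\partial y} = 0. \]

Solving for $\dfrac{\partial z}{\partial y}$, we obtain

\[ \frac{\partial z}{\partial y} = -\frac{\log y + 1}{\log z + 1}. \tag{1} \]

Similarly, we can get

\[ \frac{\partial z}{\partial x} = -\frac{\log x + 1}{\log z + 1}. \tag{2} \]

Now differentiating equation (1) with respect to $x$, we get

\[ \frac{\partial^2 z}{\partial x\,\partial y} = -\frac{\partial}{\partial x}\left(\frac{\log y + 1}{\log z + 1}\right). \]

Since $\log y + 1$ is constant with respect to $x$, the quotient rule with $A = \log y + 1$ and $B = \log z + 1$ gives

\[ \frac{\partial}{\partial x}\left(\frac{A}{B}\right) = \frac{0 \cdot B - A \cdot \frac{\partial B}{\partial x}}{B^2} = -\frac{(\log y + 1) \cdot \frac{\partial}{\partial x}(\log z + 1)}{(\log z + 1)^2}. \]

Now,

\[ \frac{\partial}{\partial x}(\log z + 1) = \frac{1}{z} \cdot \frac{\partial z}{\partial x}. \]

Combining these and keeping track of the leading minus sign, and then using (2), we get

\begin{align*}
\frac{\partial^2 z}{\partial x\,\partial y}
&= \frac{(\log y + 1)}{(\log z + 1)^2} \cdot \frac{1}{z} \cdot \frac{\partial z}{\partial x} \\
&= \frac{(\log y + 1)}{(\log z + 1)^2} \cdot \frac{1}{z} \cdot \left(-\frac{\log x + 1}{\log z + 1}\right) \\
&= -\frac{(\log x + 1)(\log y + 1)}{z(\log z + 1)^3}.
\end{align*}

At a point where $x = y = z$, this reduces to

\[ \frac{\partial^2 z}{\partial x\,\partial y} = -\frac{(\log x + 1)^2}{x(\log x + 1)^3} = -\frac{1}{x(1 + \log x)}. \]

Note that

\[ \log(ex) = \log e + \log x = 1 + \log x. \]

So,

\[ \frac{\partial^2 z}{\partial x\,\partial y} = -\frac{1}{x(1 + \log x)} = -\frac{1}{x \log(ex)}, \]

as required.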
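The value $-1/(x \log(ex))$ at $x = y = z$ can be cross-checked numerically. The following Python sketch is an illustrative addition under stated assumptions: it takes $c = 64$ so that $x = y = z = 2$ lies on the surface, solves for $z(x, y)$ by bisection on a bracket $[1, 10]$ (assumed to contain the root, where $z \log z$ is increasing), and estimates the mixed partial derivative by a central difference. The names `solve_z` and `mixed_partial` are hypothetical helpers.

```python
import math

# With c = 2^2 * 2^2 * 2^2 = 64, the point x = y = z = 2 satisfies x^x y^y z^z = c.
LOG_C = 6.0 * math.log(2.0)


def solve_z(x, y):
    """Solve x log x + y log y + z log z = log c for z by bisection on [1, 10]."""
    target = LOG_C - x * math.log(x) - y * math.log(y)
    lo, hi = 1.0, 10.0          # z log z is increasing here; root assumed inside
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * math.log(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mixed_partial(x, y, h=1e-3):
    """Central-difference estimate of the mixed second partial derivative of z."""
    return (solve_z(x + h, y + h) - solve_z(x + h, y - h)
            - solve_z(x - h, y + h) + solve_z(x - h, y - h)) / (4 * h * h)


numeric = mixed_partial(2.0, 2.0)
closed_form = -1.0 / (2.0 * math.log(2.0 * math.e))   # -1 / (x log(ex)) at x = 2

print(numeric, closed_form)
```

The two printed values should agree to several decimal places, and both are negative.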

2.5 Differentiability
For a general function $f : D \subset \mathbb{R}^n \to \mathbb{R}^m$, giving a rigorous definition of differentiability
involves deeper ideas from linear algebra and analysis. It requires understanding how to
approximate functions using linear maps between vector spaces, which is beyond our current
scope. Instead of focusing on these technicalities, our goal here is to build an intuitive and
practical understanding of what it means for such functions to be differentiable, and how we
define and work with their derivatives.

To guide our way forward, we will begin by recalling what differentiability means in the
simpler case of functions of a single variable. From there, we will extend the ideas first to
scalar-valued functions of several variables, that is, functions of the form

\[ f : D \subset \mathbb{R}^n \to \mathbb{R}, \]

also known as scalar fields. This intermediate step will help us bridge our intuition from
one-dimensional calculus to functions of several variables. After developing a clear picture in
this scalar case, we will then proceed to define and explore differentiability for more general
vector-valued functions of the form

\[ f : D \subset \mathbb{R}^n \to \mathbb{R}^m. \]

Let us begin by recalling the familiar definition from single-variable calculus. For a function
$f : \mathbb{R} \to \mathbb{R}$, the derivative at a point $x = a$ is defined by the limit

\[ f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}. \]

This expression captures the idea of the instantaneous rate of change of the function at the
point $a$, or the slope of the tangent line to the graph at that point.

To better prepare for generalization to multiple variables, we can rewrite this definition in a
slightly more abstract form. Instead of directly talking about slopes, we think of the deriva-
tive as the best linear approximation to the function near the point of interest. With that
in mind, we say:

Definition 2.5.1. A function $y = f(x)$ is said to be differentiable at $x = a$ if there exists
a number $\lambda \in \mathbb{R}$ such that

\[ \lim_{h \to 0} \frac{f(a + h) - f(a) - \lambda \cdot h}{h} = 0. \]

In this case, we say that the function is differentiable at $a$ and we define the derivative
as $Df(a) = f'(a) = \lambda$.

This reformulation emphasizes the idea that the derivative is the number λ such that the
function f (x) can be closely approximated by the linear expression f (a) + λ · (x − a) near
x = a. In the next section, we will carry this idea into higher dimensions and see how similar
reasoning leads to the definition of the derivative in the multivariable setting.

2.5.1 Differentiability of Scalar Function of Several Variables


Now, let us extend this concept to functions of two variables. As a first attempt, we say that a function $f(x, y)$
is differentiable at $(a, b)$ if there exists some $\lambda$ such that the following limit holds:

\[ \lim_{(h,k) \to (0,0)} \frac{f(a + h, b + k) - f(a, b) - \lambda \cdot (h, k)}{(h, k)} = 0. \tag{2.1} \]

However, this expression is meaningless since the denominator is an ordered pair and division
by an ordered pair is undefined.

To address this, we replace $(h, k)$ in the denominator with its magnitude $\sqrt{h^2 + k^2}$, leading
to the revised definition:

\[ \lim_{(h,k) \to (0,0)} \frac{f(a + h, b + k) - f(a, b) - \lambda \cdot (h, k)}{\sqrt{h^2 + k^2}} = 0. \tag{2.2} \]

While this resolves the division issue, the meaning of $\lambda$ remains unclear. What is this $\lambda$?
Just a real number as before or something else?

If $\lambda$ were a single real number, then $\lambda \cdot (h, k)$ would be an ordered pair, making the expression
$f(a + h, b + k) - f(a, b) - \lambda \cdot (h, k)$ in the numerator meaningless: both $f(a + h, b + k)$ and
$f(a, b)$ are real numbers, and the sum of a real number and an ordered pair is not defined.
The natural choice is to treat $\lambda$ as a vector $(\alpha, \beta)$, and the operation '$\cdot$' as the scalar product, so
that

\[ \lambda \cdot (h, k) = (\alpha, \beta) \cdot (h, k) = \alpha h + \beta k. \]

With this formulation, the expression $f(a + h, b + k) - f(a, b) - \lambda \cdot (h, k)$ now makes sense.
We can now rigorously state that a function $f(x, y)$ is differentiable at $(a, b)$ if there exists
$\lambda = (\alpha, \beta)$ such that

\[ \lim_{(h,k) \to (0,0)} \frac{f(a + h, b + k) - f(a, b) - (\alpha h + \beta k)}{\sqrt{h^2 + k^2}} = 0. \tag{2.3} \]

This provides a solid foundation for defining the derivative of a function of two variables.
The vector $(\alpha, \beta)$ is called the derivative of $f(x, y)$ at $(a, b)$, written as:

\[ Df(a, b) = f'(a, b) = \lambda = (\alpha, \beta). \]

With this, we formally define the derivative of a function of two variables:

Definition 2.5.2. A function $z = f(x, y)$ is differentiable at a point $(a, b)$ if there exist
two numbers $\alpha$ and $\beta$ such that

\[ \lim_{(h,k) \to (0,0)} \frac{f(a + h, b + k) - f(a, b) - (\alpha h + \beta k)}{\sqrt{h^2 + k^2}} = 0. \tag{2.4} \]

We write $Df(a, b) = f'(a, b) = (\alpha, \beta)$.

In order to determine the numbers $\alpha$ and $\beta$, we now examine the behavior of $f(x, y)$ along
the coordinate axes. First, we let $(h, k) \to (0, 0)$ along the $x$-axis, where $k = 0$.
Setting $k = 0$ in the definition, we get:

\[ \lim_{h \to 0} \frac{f(a + h, b) - f(a, b) - \alpha h}{|h|} = 0. \]

This implies that

\[ \lim_{h \to 0} \frac{f(a + h, b) - f(a, b)}{h} = \alpha. \]

But the left-hand side is precisely the partial derivative of $f$ with respect to $x$ at $(a, b)$, i.e.,

\[ f_x(a, b) = \alpha. \]

Similarly, if $(h, k) \to (0, 0)$ along the $y$-axis, then $h = 0$. Setting $h = 0$, we obtain

\[ \lim_{k \to 0} \frac{f(a, b + k) - f(a, b) - \beta k}{|k|} = 0. \]

This implies

\[ \lim_{k \to 0} \frac{f(a, b + k) - f(a, b)}{k} = \beta. \]

Again, this is precisely the partial derivative of $f$ with respect to $y$ at $(a, b)$, i.e.,

\[ f_y(a, b) = \beta. \]

From the above two observations, we conclude that the numbers α and β appearing in the
definition of differentiability are precisely the partial derivatives of f at (a, b), i.e.,

α = fx (a, b), β = fy (a, b).

Thus, the derivative vector can be rewritten as

\[ Df(a, b) = f'(a, b) = (\alpha, \beta) = \left(f_x(a, b),\, f_y(a, b)\right) = \left(\frac{\partial f}{\partial x}(a, b),\, \frac{\partial f}{\partial y}(a, b)\right). \]

This row vector, which contains the partial derivatives with respect to each variable, is
commonly called the gradient of $f$ at the point $(a, b)$. It is often denoted by the symbol
$\nabla f$, which is read as "del f" or "grad f", i.e.,

\[ \nabla f(a, b) = \left(\frac{\partial f}{\partial x}(a, b),\, \frac{\partial f}{\partial y}(a, b)\right). \]

This gradient vector can also be thought of as a matrix, usually called the derivative matrix.
The derivative matrix at a point $(a, b)$ is

\[ Df(a, b) = \begin{bmatrix} \dfrac{\partial f}{\partial x}(a, b) & \dfrac{\partial f}{\partial y}(a, b) \end{bmatrix}. \]

ILLUSTRATIVE EXAMPLES

Example 51. Compute the derivative $Df(x, y)$, or gradient, for each of the functions given below:

a) $f(x, y) = x^2 \sin(y) + y^2 \cos(x)$,

b) $f(x, y) = \log(x^2 + y^2)$.

Solution. The derivative matrix (or the gradient) is a matrix (or vector) comprised of partial
derivatives and given by

\[ Df(x, y) = f'(x, y) = \begin{bmatrix} \dfrac{\partial f}{\partial x}, & \dfrac{\partial f}{\partial y} \end{bmatrix}. \]

a) For the function $f(x, y) = x^2 \sin(y) + y^2 \cos(x)$, the partial derivatives are

\[ \frac{\partial f}{\partial x} = 2x \sin y - y^2 \sin x, \qquad \frac{\partial f}{\partial y} = x^2 \cos y + 2y \cos x. \]

Therefore, the derivative matrix of the function $f$ is given by

\[ f'(x, y) = Df(x, y) = \begin{bmatrix} 2x \sin y - y^2 \sin x, & x^2 \cos y + 2y \cos x \end{bmatrix}. \]

b) Similarly, for the function $f(x, y) = \log(x^2 + y^2)$, we have

\[ \frac{\partial f}{\partial x} = \frac{2x}{x^2 + y^2}, \qquad \frac{\partial f}{\partial y} = \frac{2y}{x^2 + y^2}. \]

Therefore, the derivative matrix is

\[ Df(x, y) = \begin{bmatrix} \dfrac{2x}{x^2 + y^2}, & \dfrac{2y}{x^2 + y^2} \end{bmatrix}. \]

Example 52. Compute the derivative matrix for the function $f(x, y) = e^x \cos(y) + xy^2$ and
evaluate it at the points $(1, 0)$ and $(0, 1)$.

Solution. The derivative matrix of a function of two variables is a matrix comprised of partial
derivatives and given by

\[ Df(x, y) = f'(x, y) = \begin{bmatrix} \dfrac{\partial f}{\partial x}, & \dfrac{\partial f}{\partial y} \end{bmatrix}. \]

For the function $f(x, y) = e^x \cos y + xy^2$, the partial derivatives are

\[ \frac{\partial f}{\partial x} = e^x \cos y + y^2, \qquad \frac{\partial f}{\partial y} = -e^x \sin y + 2xy. \]

Therefore, the derivative matrix of the function $f$ is

\[ Df(x, y) = \begin{bmatrix} e^x \cos y + y^2, & -e^x \sin y + 2xy \end{bmatrix}. \]

Now, we evaluate the derivative matrix at the given points:

a) At $(1, 0)$, the derivative matrix is

\[ Df(1, 0) = [\,e^1 \cos 0 + 0^2,\; -e^1 \sin 0 + 2(1)(0)\,] = [\,e,\; 0\,]. \]

b) At $(0, 1)$, the derivative matrix is

\[ Df(0, 1) = [\,e^0 \cos 1 + 1^2,\; -e^0 \sin 1 + 2(0)(1)\,] = [\,\cos 1 + 1,\; -\sin 1\,]. \]
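The two evaluations above can be reproduced approximately with a small numerical gradient routine. This Python sketch is an illustrative addition: `gradient` is a hypothetical helper that uses central differences with an assumed step size `h`, so its output only approximates the exact partial derivatives.

```python
import math


def gradient(f, point, h=1e-6):
    """Approximate the gradient (derivative matrix) of a scalar function at `point`."""
    grad = []
    for i in range(len(point)):
        up, down = list(point), list(point)
        up[i] += h
        down[i] -= h
        grad.append((f(*up) - f(*down)) / (2 * h))
    return grad


def f(x, y):
    return math.exp(x) * math.cos(y) + x * y**2


g_a = gradient(f, (1.0, 0.0))   # compare with [e, 0]
g_b = gradient(f, (0.0, 1.0))   # compare with [cos 1 + 1, -sin 1]

print(g_a, [math.e, 0.0])
print(g_b, [math.cos(1.0) + 1.0, -math.sin(1.0)])
```

The same helper works unchanged for scalar functions of three or more variables, since it loops over however many coordinates `point` has.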

Example 53. Compute the derivative matrix for the function $f(x, y, z) = xye^z$ and evaluate
it at the points $(1, 1, 0)$ and $(2, -1, \log 2)$.

Solution. The derivative matrix (or gradient) of a scalar function of three variables is a
vector of partial derivatives and is given by

\[ Df(x, y, z) = f'(x, y, z) = \begin{bmatrix} \dfrac{\partial f}{\partial x}, & \dfrac{\partial f}{\partial y}, & \dfrac{\partial f}{\partial z} \end{bmatrix}. \]

For the function $f(x, y, z) = xye^z$, the partial derivatives are

\[ \frac{\partial f}{\partial x} = ye^z, \qquad \frac{\partial f}{\partial y} = xe^z, \qquad \frac{\partial f}{\partial z} = xye^z. \]

Therefore, the derivative matrix of the function $f$ is

\[ Df(x, y, z) = [\,ye^z,\; xe^z,\; xye^z\,]. \]

Now, we evaluate the derivative matrix at the given points:

a) At $(1, 1, 0)$, the derivative matrix is

\[ Df(1, 1, 0) = [\,1 \cdot e^0,\; 1 \cdot e^0,\; 1 \cdot 1 \cdot e^0\,] = [\,1,\; 1,\; 1\,]. \]

b) At $(2, -1, \log 2)$, we use $e^{\log 2} = 2$:

\[ Df(2, -1, \log 2) = [\,-1 \cdot 2,\; 2 \cdot 2,\; 2 \cdot (-1) \cdot 2\,] = [\,-2,\; 4,\; -4\,]. \]

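Example 54. Compute the derivative matrix for the function $f(x, y, z) = \tan^{-1}(x + yz)$
and evaluate it at the points $(0, 1, 1)$ and $(1, 2, 0)$.

Solution. The derivative matrix (or gradient) of a scalar function of three variables is given
by

\[ Df(x, y, z) = f'(x, y, z) = \begin{bmatrix} \dfrac{\partial f}{\partial x}, & \dfrac{\partial f}{\partial y}, & \dfrac{\partial f}{\partial z} \end{bmatrix}. \]

For the function $f(x, y, z) = \tan^{-1}(x + yz)$, the partial derivatives are

\[ \frac{\partial f}{\partial x} = \frac{1}{1 + (x + yz)^2}, \qquad \frac{\partial f}{\partial y} = \frac{z}{1 + (x + yz)^2}, \qquad \frac{\partial f}{\partial z} = \frac{y}{1 + (x + yz)^2}. \]

Therefore, the derivative matrix is

\[ Df(x, y, z) = \begin{bmatrix} \dfrac{1}{1 + (x + yz)^2}, & \dfrac{z}{1 + (x + yz)^2}, & \dfrac{y}{1 + (x + yz)^2} \end{bmatrix}. \]

Now, we evaluate the derivative matrix at the given points:

a) At $(0, 1, 1)$, we have $x + yz = 1$, so

\[ Df(0, 1, 1) = \begin{bmatrix} \dfrac{1}{1 + 1^2}, & \dfrac{1}{1 + 1^2}, & \dfrac{1}{1 + 1^2} \end{bmatrix} = \begin{bmatrix} \dfrac{1}{2}, & \dfrac{1}{2}, & \dfrac{1}{2} \end{bmatrix}. \]

b) At $(1, 2, 0)$, we have $x + yz = 1$, so

\[ Df(1, 2, 0) = \begin{bmatrix} \dfrac{1}{2}, & \dfrac{0}{2}, & \dfrac{2}{2} \end{bmatrix} = \begin{bmatrix} \dfrac{1}{2}, & 0, & 1 \end{bmatrix}. \]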

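Example 55. Compute the derivative matrix for the function $f(x, y, z) = \log(x^2 + y^2 + z^2)$
and evaluate it at the points $(1, 0, 0)$ and $(1, 2, 2)$.

Solution. The derivative matrix (or gradient) of a scalar function of three variables is a
vector of partial derivatives and is given by

\[ Df(x, y, z) = f'(x, y, z) = \begin{bmatrix} \dfrac{\partial f}{\partial x}, & \dfrac{\partial f}{\partial y}, & \dfrac{\partial f}{\partial z} \end{bmatrix}. \]

For the function $f(x, y, z) = \log(x^2 + y^2 + z^2)$, the partial derivatives are

\[ \frac{\partial f}{\partial x} = \frac{2x}{x^2 + y^2 + z^2}, \qquad \frac{\partial f}{\partial y} = \frac{2y}{x^2 + y^2 + z^2}, \qquad \frac{\partial f}{\partial z} = \frac{2z}{x^2 + y^2 + z^2}. \]

Therefore, the derivative matrix is

\[ Df(x, y, z) = \begin{bmatrix} \dfrac{2x}{x^2 + y^2 + z^2}, & \dfrac{2y}{x^2 + y^2 + z^2}, & \dfrac{2z}{x^2 + y^2 + z^2} \end{bmatrix}. \]

Now, we evaluate the derivative matrix at the given points:

a) At $(1, 0, 0)$, we have $x^2 + y^2 + z^2 = 1$, so

\[ Df(1, 0, 0) = \begin{bmatrix} \dfrac{2 \cdot 1}{1}, & \dfrac{2 \cdot 0}{1}, & \dfrac{2 \cdot 0}{1} \end{bmatrix} = [\,2,\; 0,\; 0\,]. \]

b) At $(1, 2, 2)$, we have $x^2 + y^2 + z^2 = 1 + 4 + 4 = 9$, so

\[ Df(1, 2, 2) = \begin{bmatrix} \dfrac{2 \cdot 1}{9}, & \dfrac{2 \cdot 2}{9}, & \dfrac{2 \cdot 2}{9} \end{bmatrix} = \begin{bmatrix} \dfrac{2}{9}, & \dfrac{4}{9}, & \dfrac{4}{9} \end{bmatrix}. \]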

2.5.2 Differentiability Implies Continuity


In calculus, differentiability gives us insight into how a function behaves near a point. In
single-variable calculus, if a function is differentiable at a point, then it is also continuous
there. However, the converse is not always true: a function can be continuous without being
differentiable. The same holds in multivariable calculus.

In this section, we prove that differentiability implies continuity. We then present a few
counterexamples to show that continuity alone does not guarantee differentiability, and even
the existence of partial derivatives is not sufficient for the function to be continuous or
differentiable.
Theorem 2.5.1. If f (x, y) is differentiable at a point (a, b), then it is continuous at (a, b)
but the converse need not be true.
Proof: To establish that differentiability implies continuity, we assume that $f(x, y)$ is
differentiable at $(a, b)$. Therefore there exist real numbers $\alpha$ and $\beta$ (which turn out to be the
partial derivatives $f_x(a, b)$ and $f_y(a, b)$, respectively) such that

\[ \lim_{(h,k) \to (0,0)} \frac{f(a + h, b + k) - f(a, b) - (\alpha h + \beta k)}{\sqrt{h^2 + k^2}} = 0. \]

This means that the difference

\[ f(a + h, b + k) - f(a, b) - (\alpha h + \beta k) \]

vanishes faster than $\sqrt{h^2 + k^2}$ as $(h, k) \to (0, 0)$, implying that

\[ f(a + h, b + k) = f(a, b) + \alpha h + \beta k + \varepsilon(h, k), \]

where $\varepsilon(h, k)$ satisfies

\[ \lim_{(h,k) \to (0,0)} \frac{\varepsilon(h, k)}{\sqrt{h^2 + k^2}} = 0. \]

Now, taking the limit as $(h, k) \to (0, 0)$, we get

\[ \lim_{(h,k) \to (0,0)} f(a + h, b + k) = \lim_{(h,k) \to (0,0)} \left[f(a, b) + \alpha h + \beta k + \varepsilon(h, k)\right]. \]

Here $\alpha h + \beta k \to 0$, and $\varepsilon(h, k) = \dfrac{\varepsilon(h, k)}{\sqrt{h^2 + k^2}} \cdot \sqrt{h^2 + k^2} \to 0$ as $(h, k) \to (0, 0)$, since both factors tend to zero. Hence we get

\[ \lim_{(h,k) \to (0,0)} f(a + h, b + k) = f(a, b). \]

This confirms that $f(x, y)$ is continuous at $(a, b)$, proving that differentiability implies
continuity.

To prove that the converse of the theorem need not be true, we produce a counter example.
Example 56. Consider the following function $f(x, y)$ of two variables given by

\[ f(x, y) = \begin{cases} \dfrac{xy}{\sqrt{x^2 + y^2}}, & (x, y) \neq (0, 0); \\[2mm] 0, & (x, y) = (0, 0). \end{cases} \]

Show that the function $f(x, y)$ is continuous at $(0, 0)$ and that the partial derivatives $f_x(0, 0)$ and $f_y(0, 0)$ exist,
but the function is not differentiable at $(0, 0)$.
Solution. First, we prove the continuity of the function. Note that the value of the function at
$(0, 0)$, i.e., $f(0, 0) = 0$, is already given. Now, in order to prove that the function is continuous
at $(0, 0)$, we shall prove that the limit of the function equals $f(0, 0) = 0$. For this, we shall
use the $\varepsilon$-$\delta$ definition of continuity.

From the definition of continuity, it follows that the function $f(x, y)$ will be continuous at
$(0, 0)$ if, for every $\varepsilon > 0$, there exists a $\delta > 0$ such that

\[ \sqrt{x^2 + y^2} < \delta \implies |f(x, y) - f(0, 0)| < \varepsilon. \]

Since $f(0, 0) = 0$, to prove that the function is continuous, we must show that

\[ |f(x, y)| < \varepsilon \quad \text{whenever} \quad \sqrt{x^2 + y^2} < \delta. \]

Let $\varepsilon > 0$ be given. For $(x, y) \neq (0, 0)$, we have

\[ |f(x, y)| = \frac{|xy|}{\sqrt{x^2 + y^2}} = \frac{|x||y|}{\sqrt{x^2 + y^2}}. \]

Using the fact that $|x| \le \sqrt{x^2 + y^2}$ and $|y| \le \sqrt{x^2 + y^2}$, we obtain

\[ \frac{|x||y|}{\sqrt{x^2 + y^2}} \le \frac{\left(\sqrt{x^2 + y^2}\right)\left(\sqrt{x^2 + y^2}\right)}{\sqrt{x^2 + y^2}} = \sqrt{x^2 + y^2}. \]

Thus,

\[ |f(x, y)| \le \sqrt{x^2 + y^2}. \]

Now, to ensure that $|f(x, y)| < \varepsilon$, it suffices to choose $\delta = \varepsilon$; then

\[ \sqrt{x^2 + y^2} < \delta \implies |f(x, y)| \le \sqrt{x^2 + y^2} < \varepsilon. \]

Thus for a given $\varepsilon > 0$, we can find a $\delta$ (namely $\delta = \varepsilon$) such that

\[ |f(x, y)| < \varepsilon \quad \text{whenever} \quad \sqrt{x^2 + y^2} < \delta. \]

Thus, the function is continuous at $(0, 0)$.

The partial derivatives of $f(x, y)$ at $(0, 0)$ are given by

\[ f_x(0, 0) = \lim_{h \to 0} \frac{f(h, 0) - f(0, 0)}{h} = \lim_{h \to 0} \frac{0 - 0}{h} = 0 \]

and

\[ f_y(0, 0) = \lim_{k \to 0} \frac{f(0, k) - f(0, 0)}{k} = \lim_{k \to 0} \frac{0 - 0}{k} = 0. \]

Thus, the partial derivatives exist and are both $0$ at $(0, 0)$.

Finally, we discuss differentiability of the function. Since both partial derivatives exist,
the given function will be differentiable at $(0, 0)$ if and only if the following limit exists and
equals zero, i.e.,

\[ \lim_{(h,k) \to (0,0)} \frac{f(h, k) - f(0, 0) - (\alpha h + \beta k)}{\sqrt{h^2 + k^2}} = 0. \]

Since $f(0, 0) = 0$ and $\alpha = f_x(0, 0) = 0$, $\beta = f_y(0, 0) = 0$, this simplifies to

\[ \lim_{(h,k) \to (0,0)} \frac{f(h, k)}{\sqrt{h^2 + k^2}} = 0. \]

For $(h, k) \neq (0, 0)$, the quotient in this limit is

\[ \frac{f(h, k)}{\sqrt{h^2 + k^2}} = \frac{\dfrac{hk}{\sqrt{h^2 + k^2}}}{\sqrt{h^2 + k^2}} = \frac{hk}{h^2 + k^2}, \]

so the function is differentiable at $(0, 0)$ precisely when $\dfrac{hk}{h^2 + k^2} \to 0$ as $(h, k) \to (0, 0)$.

To check whether this limit exists or not, we try different paths. First, we let $(h, k) \to (0, 0)$
along the line $y = x$. Along this line $k = h$ and therefore, we get

\[ \frac{f(h, h)}{\sqrt{h^2 + h^2}} = \frac{h \cdot h}{h^2 + h^2} = \frac{h^2}{2h^2} = \frac{1}{2}. \]

Taking the limit as $h \to 0$,

\[ \lim_{h \to 0} \frac{f(h, h)}{\sqrt{h^2 + h^2}} = \frac{1}{2}. \]

Now, we choose another path, $y = -x$. Substituting $k = -h$ along this path, we have

\[ \frac{f(h, -h)}{\sqrt{h^2 + (-h)^2}} = \frac{h(-h)}{h^2 + h^2} = \frac{-h^2}{2h^2} = -\frac{1}{2}. \]

Taking the limit as $h \to 0$,

\[ \lim_{h \to 0} \frac{f(h, -h)}{\sqrt{h^2 + (-h)^2}} = -\frac{1}{2}. \]

Since different paths give different values, the limit does not exist. Thus, $f(x, y)$ is not
differentiable at $(0, 0)$.

Thus, we conclude that although

a) $f(x, y)$ is continuous at $(0, 0)$, and

b) $f_x(0, 0)$ and $f_y(0, 0)$ exist,

the function $f(x, y)$ is not differentiable at $(0, 0)$. This example shows that mere
continuity of the function, or the existence of its partial derivatives, does not guarantee
differentiability.
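The path dependence in this example is easy to observe numerically. The Python sketch below is an illustrative addition: it evaluates the differentiability quotient $f(h, k)/\sqrt{h^2 + k^2}$ (with $\alpha = \beta = 0$ and $f(0, 0) = 0$) along the two lines $k = h$ and $k = -h$ for a few shrinking step sizes, which are arbitrary choices.

```python
import math


def f(x, y):
    """f(x, y) = xy / sqrt(x^2 + y^2) away from the origin, and 0 at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / math.sqrt(x * x + y * y)


def quotient(h, k):
    """Differentiability quotient at (0, 0), using alpha = beta = 0 and f(0, 0) = 0."""
    return f(h, k) / math.sqrt(h * h + k * k)


steps = (1e-1, 1e-3, 1e-6)
along_diag = [quotient(t, t) for t in steps]    # path k = h
along_anti = [quotient(t, -t) for t in steps]   # path k = -h
values = [f(t, t) for t in steps]               # f itself shrinks toward f(0, 0) = 0

print(along_diag)
print(along_anti)
print(values)
```

The quotient stays near $1/2$ along one line and near $-1/2$ along the other, however small the step, while the function values themselves shrink toward $0$, in line with continuity at the origin.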

Example 57. Consider the function $f(x, y)$ defined by

\[ f(x, y) = \begin{cases} 1, & \text{if } xy = 0, \\ 0, & \text{if } xy \neq 0. \end{cases} \]

Show that the partial derivatives $f_x(0, 0)$ and $f_y(0, 0)$ exist, but the function is not continuous
at $(0, 0)$.

Solution. If a point $(x, y)$ lies on either coordinate axis, then $xy = 0$, and for points $(x, y)$
not lying on the axes, $xy \neq 0$ (why?). From the definition of the function, we find that if
$(x, y) \to (0, 0)$ along the $x$-axis or the $y$-axis, the function values remain $1$, i.e., along both
axes

\[ \lim_{(x,y) \to (0,0)} f(x, y) = 1. \]

Also, for paths other than the axes (such as the line $y = x$), the function values
remain $0$, i.e., along such a path,

\[ \lim_{(x,y) \to (0,0)} f(x, y) = 0. \]

Thus, the limit of the function $f(x, y)$ at $(0, 0)$ is path dependent, and therefore it does not
exist. Consequently, the function is not continuous at $(0, 0)$. However, $f(x, y) = 1$ at all points
where $xy = 0$, i.e., the function is constant along both axes. Therefore, the partial derivatives exist and
$f_x(0, 0) = 0$ and $f_y(0, 0) = 0$, since the rate of change of a constant is zero. More precisely, one may
proceed as below to check that the partial derivatives exist at the origin.

• Partial derivative with respect to $x$ at $(0, 0)$:

\[ f_x(0, 0) = \lim_{h \to 0} \frac{f(h, 0) - f(0, 0)}{h}. \]

Given that $f(x, y) = 1$ when $xy = 0$, we have $f(h, 0) = 1$ and $f(0, 0) = 1$.
Thus, we get

\[ f_x(0, 0) = \lim_{h \to 0} \frac{1 - 1}{h} = 0. \]

• Partial derivative with respect to $y$ at $(0, 0)$:

\[ f_y(0, 0) = \lim_{k \to 0} \frac{f(0, k) - f(0, 0)}{k}. \]

Using similar arguments, we have $f(0, k) = 1$ and $f(0, 0) = 1$, so

\[ f_y(0, 0) = \lim_{k \to 0} \frac{1 - 1}{k} = 0. \]

Therefore, both partial derivatives fx (0, 0) and fy (0, 0) exist and are equal to 0.

This example shows that the partial derivatives fx (0, 0) and fy (0, 0) exist but f (x, y) is not
continuous at (0, 0). Thus, existence of partial derivatives does not imply continuity of a
function of two variables.
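A few direct evaluations make this behaviour concrete. The Python sketch below is an illustrative addition (the step sizes are arbitrary): the difference quotients along the axes are all zero, while the function values along the line $y = x$ stay at $0$ and so never approach $f(0, 0) = 1$.

```python
def f(x, y):
    """f = 1 on the coordinate axes (where xy = 0) and 0 everywhere else."""
    return 1.0 if (x == 0.0 or y == 0.0) else 0.0


steps = (1e-1, 1e-3, 1e-6)

# Difference quotients defining f_x(0, 0) and f_y(0, 0)
qx = [(f(h, 0.0) - f(0.0, 0.0)) / h for h in steps]
qy = [(f(0.0, k) - f(0.0, 0.0)) / k for k in steps]

# Function values approaching the origin along the line y = x
diag = [f(t, t) for t in steps]

print(qx, qy)             # all zeros: both partial derivatives are 0
print(diag, f(0.0, 0.0))  # values along y = x stay at 0, yet f(0, 0) = 1
```

The test `x == 0.0 or y == 0.0` is used in place of `x * y == 0` so that the product of two tiny nonzero floats underflowing to zero cannot be mistaken for a point on an axis.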

2.5.3 General Case


For a general function $f : D \subset \mathbb{R}^n \to \mathbb{R}^m$, defining differentiability in full detail requires
more advanced mathematical tools. However, rather than going into those complexities, we
aim to build a basic and intuitive understanding of what it means for such functions to be
differentiable and how to describe their derivative.


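The function $f : \mathbb{R}^n \to \mathbb{R}^m$ takes an input point $\mathbf{x} = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$ and produces an
output vector $\mathbf{y} = (y_1, y_2, \ldots, y_m) \in \mathbb{R}^m$. Symbolically, we write

\[ f(\mathbf{x}) = \mathbf{y}. \]

It is important to note that the output vector $\mathbf{y} = (y_1, y_2, \ldots, y_m)$ depends on the input vector
$\mathbf{x} = (x_1, x_2, \ldots, x_n)$. Thus, each component $y_i$ can be viewed as a real-valued function of $\mathbf{x}$,
and we write:

\[ f(\mathbf{x}) = \left(y_1(\mathbf{x}), y_2(\mathbf{x}), \ldots, y_m(\mathbf{x})\right). \]

In other words, each $y_i$ is a scalar function of $n$ real variables, i.e.,

\[ y_i = y_i(\mathbf{x}) = y_i(x_1, x_2, \ldots, x_n). \]

For a function $f : \mathbb{R}^n \to \mathbb{R}^m$, the derivative at a point $\mathbf{a} = (a_1, a_2, \ldots, a_n) \in \mathbb{R}^n$ is represented
by an $m \times n$ matrix, where each entry is the partial derivative of one component of the function
with respect to one of the input variables. This matrix is called the derivative matrix or
the Jacobian matrix and is given by:

\[ Df(\mathbf{a}) = \begin{bmatrix} \dfrac{\partial y_1}{\partial x_1}(\mathbf{a}) & \cdots & \dfrac{\partial y_1}{\partial x_n}(\mathbf{a}) \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_m}{\partial x_1}(\mathbf{a}) & \cdots & \dfrac{\partial y_m}{\partial x_n}(\mathbf{a}) \end{bmatrix}. \]

Formally, the definition can be stated as follows:

Definition 2.5.3. The derivative matrix of a function $f : \mathbb{R}^n \to \mathbb{R}^m$ at a point
$\mathbf{a} = (a_1, a_2, \ldots, a_n)$ is the matrix $Df(\mathbf{a})$, whose $(i, j)$-th entry is $\dfrac{\partial y_i}{\partial x_j}(\mathbf{a})$. The function $f$
is said to be differentiable at the point $\mathbf{a}$ if all partial derivatives $\dfrac{\partial y_i}{\partial x_j}$ exist at $\mathbf{a}$, and
the following limit holds:

\[ \lim_{\mathbf{x} \to \mathbf{a}} \frac{\left\| f(\mathbf{x}) - f(\mathbf{a}) - Df(\mathbf{a})(\mathbf{x} - \mathbf{a}) \right\|}{\left\| \mathbf{x} - \mathbf{a} \right\|} = 0. \]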

ILLUSTRATIVE EXAMPLES

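Example 58. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by $f(x, y) = (x^2 y, \sin(xy))$. Compute the
Jacobian matrix $Df(x, y)$.

Solution. The two components of the output vector are $y_1 = x^2 y$, $y_2 = \sin(xy)$. Now, the
Jacobian matrix or the derivative matrix is given by

\[ Df(\mathbf{x}) = \begin{bmatrix} \dfrac{\partial y_1}{\partial x} & \dfrac{\partial y_1}{\partial y} \\[2mm] \dfrac{\partial y_2}{\partial x} & \dfrac{\partial y_2}{\partial y} \end{bmatrix}. \]

Therefore, the desired Jacobian matrix is

\[ Df(x, y) = \begin{bmatrix} \dfrac{\partial}{\partial x}(x^2 y) & \dfrac{\partial}{\partial y}(x^2 y) \\[2mm] \dfrac{\partial}{\partial x}(\sin(xy)) & \dfrac{\partial}{\partial y}(\sin(xy)) \end{bmatrix} = \begin{bmatrix} 2xy & x^2 \\ y \cos(xy) & x \cos(xy) \end{bmatrix}. \]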

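Example 59. Let $f : \mathbb{R}^3 \to \mathbb{R}^2$ be defined by $f(x, y, z) = (xyz, e^{x+y+z})$. Compute the
Jacobian matrix $Df(x, y, z)$.

Solution. The components of the output vector are $y_1 = xyz$, $y_2 = e^{x+y+z}$. Now, the
Jacobian matrix or the derivative matrix is given by

\[ Df(\mathbf{x}) = \begin{bmatrix} \dfrac{\partial y_1}{\partial x} & \dfrac{\partial y_1}{\partial y} & \dfrac{\partial y_1}{\partial z} \\[2mm] \dfrac{\partial y_2}{\partial x} & \dfrac{\partial y_2}{\partial y} & \dfrac{\partial y_2}{\partial z} \end{bmatrix}. \]

Therefore, the desired Jacobian matrix is

\[ Df(x, y, z) = \begin{bmatrix} \dfrac{\partial}{\partial x}(xyz) & \dfrac{\partial}{\partial y}(xyz) & \dfrac{\partial}{\partial z}(xyz) \\[2mm] \dfrac{\partial}{\partial x}(e^{x+y+z}) & \dfrac{\partial}{\partial y}(e^{x+y+z}) & \dfrac{\partial}{\partial z}(e^{x+y+z}) \end{bmatrix} = \begin{bmatrix} yz & xz & xy \\ e^{x+y+z} & e^{x+y+z} & e^{x+y+z} \end{bmatrix}. \]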

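Example 60. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be given by $f(x, y) = (x + y, x^2 - y)$. Compute the Jacobian
matrix $Df(x, y)$ at the point $(1, 2)$.

Solution. The components of the output vector are $y_1 = x + y$, $y_2 = x^2 - y$. The general
form of the Jacobian matrix is

\[ Df(\mathbf{x}) = \begin{bmatrix} \dfrac{\partial y_1}{\partial x} & \dfrac{\partial y_1}{\partial y} \\[2mm] \dfrac{\partial y_2}{\partial x} & \dfrac{\partial y_2}{\partial y} \end{bmatrix}. \]

Therefore, we compute

\[ Df(x, y) = \begin{bmatrix} \dfrac{\partial}{\partial x}(x + y) & \dfrac{\partial}{\partial y}(x + y) \\[2mm] \dfrac{\partial}{\partial x}(x^2 - y) & \dfrac{\partial}{\partial y}(x^2 - y) \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 2x & -1 \end{bmatrix}. \]

Evaluating at $(1, 2)$, we get

\[ Df(1, 2) = \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix}. \]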
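
Example 61. Let $f : \mathbb{R}^2 \to \mathbb{R}^3$ be defined by $f(x, y) = (x + y, xy, x^2 + y^2)$. Compute the
Jacobian matrix $Df(x, y)$ and evaluate it at the point $(1, 2)$.
Solution. The components of the output vector are $y_1 = x + y$, $y_2 = xy$, $y_3 = x^2 + y^2$.
Since $f : \mathbb{R}^2 \to \mathbb{R}^3$, the Jacobian matrix is of size $3 \times 2$. It is given by
\[
Df(x, y) =
\begin{bmatrix}
\frac{\partial y_1}{\partial x} & \frac{\partial y_1}{\partial y} \\
\frac{\partial y_2}{\partial x} & \frac{\partial y_2}{\partial y} \\
\frac{\partial y_3}{\partial x} & \frac{\partial y_3}{\partial y}
\end{bmatrix}.
\]
Therefore, the Jacobian matrix is
\[
Df(x, y) =
\begin{bmatrix}
\frac{\partial}{\partial x}(x + y) & \frac{\partial}{\partial y}(x + y) \\
\frac{\partial}{\partial x}(xy) & \frac{\partial}{\partial y}(xy) \\
\frac{\partial}{\partial x}(x^2 + y^2) & \frac{\partial}{\partial y}(x^2 + y^2)
\end{bmatrix}
=
\begin{bmatrix}
1 & 1 \\
y & x \\
2x & 2y
\end{bmatrix}.
\]
Evaluating at $(1, 2)$, we get
\[
Df(1, 2) =
\begin{bmatrix}
1 & 1 \\
2 & 1 \\
2 & 4
\end{bmatrix}.
\]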

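Example 62. A function $f : \mathbb{R}^2 \to \mathbb{R}^2$ is given by $f(x, y) = (r\cos\theta, r\sin\theta)$, where
$r = \sqrt{x^2 + y^2}$ and $\theta = \tan^{-1}\frac{y}{x}$. Evaluate the Jacobian matrix $Df(x, y)$ at the point $(3, 4)$.

Solution. Here $(r, \theta)$ are the polar coordinates of the point $(x, y)$, and $f$ converts them back to
Cartesian coordinates. For $x > 0$ (which covers the point $(3, 4)$), we have $r\cos\theta = x$ and $r\sin\theta = y$, so that
\[
f(x, y) = (r\cos\theta, r\sin\theta) = (x, y).
\]
Hence, the Jacobian matrix is simply the identity matrix:
\[
Df(x, y) =
\begin{bmatrix}
\frac{\partial x}{\partial x} & \frac{\partial x}{\partial y} \\
\frac{\partial y}{\partial x} & \frac{\partial y}{\partial y}
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 \\
0 & 1
\end{bmatrix}.
\]
Therefore, evaluating at the point $(3, 4)$, we get
\[
Df(3, 4) =
\begin{bmatrix}
1 & 0 \\
0 & 1
\end{bmatrix}.
\]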

Now, we conclude with an important result that provides a criterion for determining whether
a function is differentiable, based on the continuity of its partial derivatives. This result not
only links differentiability to the smoothness of partial derivatives but also clarifies how the
existence and continuity of partial derivatives play a pivotal role in ensuring the differentia-
bility of multivariable functions.

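Theorem 2.5.2. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be a function with components $y_1, \ldots, y_m$. Suppose that all the
partial derivatives $\frac{\partial y_i}{\partial x_j}$ of $f$ exist and are continuous in a neighborhood of a point $a \in \mathbb{R}^n$. Then $f$ is
differentiable at $a$.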

This theorem is significant because it provides a straightforward criterion to verify differen-


tiability: if the partial derivatives of a function exist and are continuous in a neighborhood
of a point, then the function is guaranteed to be differentiable at that point. This result is
a cornerstone in the study of multivariable calculus, as it connects the geometric notion of
differentiability to the analytical properties of the function's derivatives.

Functions whose partial derivatives exist and are continuous are referred to as continuously
differentiable or of class C 1 . A function being C 1 implies that not only does it have well-
defined derivatives, but these derivatives also vary smoothly without abrupt changes.

Thus, the theorem provides the important conclusion that all C 1 functions are differentiable.

In practice, this means that to check the differentiability of a function, one can focus on the
continuity of its partial derivatives. If they are continuous, differentiability follows directly.
This insight simplifies the verification process and gives a clear criterion for working with
functions in multivariable calculus.
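The limit in Definition 2.5.3 can also be observed numerically. The sketch below is an illustration only: it assumes the sample function f(x, y) = x^2 y + sin(xy), whose partial derivatives are continuous everywhere, and evaluates the error ratio |f(x) - f(a) - Df(a)(x - a)| / ||x - a|| along one arbitrarily chosen unit direction as x approaches a = (1, 1). The names `error_ratio` and `grad_f` are hypothetical helpers.

```python
import math

def f(x, y):
    return x * x * y + math.sin(x * y)

def grad_f(x, y):
    # Partial derivatives of f, computed by hand
    return (2 * x * y + y * math.cos(x * y), x * x + x * math.cos(x * y))

def error_ratio(a, direction, t):
    """|f(x) - f(a) - Df(a)(x - a)| / ||x - a|| for x = a + t * direction."""
    dx = (t * direction[0], t * direction[1])
    x = (a[0] + dx[0], a[1] + dx[1])
    gx, gy = grad_f(*a)
    linear_part = gx * dx[0] + gy * dx[1]
    return abs(f(*x) - f(*a) - linear_part) / math.hypot(*dx)

a = (1.0, 1.0)
# Step sizes t = 0.1, 0.01, ..., 0.00001 along the unit direction (0.6, 0.8)
ratios = [error_ratio(a, (0.6, 0.8), 10.0 ** (-k)) for k in range(1, 6)]
```

The ratios shrink roughly in proportion to the step size, which is consistent with the limit being zero. A numerical experiment along one direction is of course not a proof of differentiability; the theorem above is what guarantees it.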

ILLUSTRATIVE EXAMPLES

Example 63. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by $f(x, y) = x^2 y + \sin(xy)$. Determine whether $f$
is differentiable at $(1, 1)$.

Solution. We begin by calculating the partial derivatives of $f$. The function is given by:
\[
f(x, y) = x^2 y + \sin(xy).
\]
The partial derivatives are:
\[
\frac{\partial f}{\partial x} = 2xy + y\cos(xy), \qquad
\frac{\partial f}{\partial y} = x^2 + x\cos(xy).
\]
At the point $(1, 1)$, we evaluate the partial derivatives:
\[
\frac{\partial f}{\partial x}(1, 1) = 2(1)(1) + (1)\cos(1 \cdot 1) = 2 + \cos(1),
\]
\[
\frac{\partial f}{\partial y}(1, 1) = (1)^2 + (1)\cos(1 \cdot 1) = 1 + \cos(1).
\]
Since both partial derivatives exist and are continuous (they are built from elementary
functions), $f$ is differentiable at $(1, 1)$.

Example 64. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by $f(x, y) = x^3 + 2xy + y^3$. Determine whether $f$
is differentiable at $(1, -1)$.

Solution. First, we compute the partial derivatives of $f$. The function is given by:
\[
f(x, y) = x^3 + 2xy + y^3.
\]
The partial derivatives are:
\[
\frac{\partial f}{\partial x} = 3x^2 + 2y, \qquad
\frac{\partial f}{\partial y} = 2x + 3y^2.
\]
At the point $(1, -1)$, we evaluate the partial derivatives:
\[
\frac{\partial f}{\partial x}(1, -1) = 3(1)^2 + 2(-1) = 3 - 2 = 1,
\]
\[
\frac{\partial f}{\partial y}(1, -1) = 2(1) + 3(-1)^2 = 2 + 3 = 5.
\]
Both partial derivatives exist and are continuous because they are polynomial functions.
Therefore, $f$ is differentiable at $(1, -1)$.
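
Example 65. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by $f(x, y) = (x^2 y, \sin(xy))$. Show that $f$ is
differentiable at $(1, 1)$.
Solution. The components of the vector-valued function are:
\[
f_1(x, y) = x^2 y, \qquad f_2(x, y) = \sin(xy).
\]
We compute the partial derivatives:
\[
\frac{\partial f_1}{\partial x} = 2xy, \qquad \frac{\partial f_1}{\partial y} = x^2,
\]
\[
\frac{\partial f_2}{\partial x} = y\cos(xy), \qquad \frac{\partial f_2}{\partial y} = x\cos(xy).
\]
At the point $(1, 1)$, we evaluate these partial derivatives:
\[
\frac{\partial f_1}{\partial x}(1, 1) = 2(1)(1) = 2, \qquad \frac{\partial f_1}{\partial y}(1, 1) = (1)^2 = 1,
\]
\[
\frac{\partial f_2}{\partial x}(1, 1) = (1)\cos(1) = \cos(1), \qquad \frac{\partial f_2}{\partial y}(1, 1) = (1)\cos(1) = \cos(1).
\]
Thus, the Jacobian matrix at $(1, 1)$ is:
\[
Df(1, 1) =
\begin{bmatrix}
2 & 1 \\
\cos(1) & \cos(1)
\end{bmatrix}.
\]
Since all partial derivatives exist and are continuous (they are built from elementary functions),
$f$ is differentiable at $(1, 1)$.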


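Example 66. Let $f : \mathbb{R}^3 \to \mathbb{R}^2$ be defined by $f(x, y, z) = (e^{xy}, \cos(yz))$. Show that $f$ is
differentiable at $(1, 0, 0)$.

Solution. The components of the vector-valued function are:
\[
f_1(x, y, z) = e^{xy}, \qquad f_2(x, y, z) = \cos(yz).
\]
We compute the partial derivatives:
\[
\frac{\partial f_1}{\partial x} = y e^{xy}, \qquad \frac{\partial f_1}{\partial y} = x e^{xy}, \qquad \frac{\partial f_1}{\partial z} = 0,
\]
\[
\frac{\partial f_2}{\partial x} = 0, \qquad \frac{\partial f_2}{\partial y} = -z\sin(yz), \qquad \frac{\partial f_2}{\partial z} = -y\sin(yz).
\]
At the point $(1, 0, 0)$, we evaluate these partial derivatives:
\[
\frac{\partial f_1}{\partial x}(1, 0, 0) = (0)e^{(1)(0)} = 0, \qquad
\frac{\partial f_1}{\partial y}(1, 0, 0) = (1)e^{(1)(0)} = 1, \qquad
\frac{\partial f_1}{\partial z}(1, 0, 0) = 0,
\]
\[
\frac{\partial f_2}{\partial x}(1, 0, 0) = 0, \qquad
\frac{\partial f_2}{\partial y}(1, 0, 0) = -(0)\sin((0)(0)) = 0, \qquad
\frac{\partial f_2}{\partial z}(1, 0, 0) = -(0)\sin((0)(0)) = 0.
\]
Thus, the Jacobian matrix at $(1, 0, 0)$ is:
\[
Df(1, 0, 0) =
\begin{bmatrix}
0 & 1 & 0 \\
0 & 0 & 0
\end{bmatrix}.
\]
Since all partial derivatives exist and are continuous (they are built from elementary functions),
$f$ is differentiable at $(1, 0, 0)$.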

Chapter 3

Chain Rule

In earlier classes, we studied the composition of two real-valued functions of a single vari-
able and the corresponding chain rule for differentiating such composite functions. In this
chapter, we develop the chain rule for functions of several variables, a powerful extension of
the single-variable chain rule. The chain rule enables us to differentiate composite functions
where variables are interdependent or expressed in terms of other variables. While we will
not attempt a formal proof here, our focus will be on understanding the structure of com-
positions, computing their derivatives, and interpreting the chain rule in both explicit and
implicit formulations.

At the end of this chapter, you will be able to:

1. Describe the concept of function composition in multivariable settings,

2. Apply the chain rule to compute derivatives of composite scalar functions.

Before diving into the general formulation of the chain rule, let us first revisit the idea of
composite functions to prepare ourselves for the multivariable chain rule.

3.1 Composition of Functions


We can always build new functions out of given ones through various operations. We can
add, subtract, and multiply two functions, or divide one by the other. More precisely, if
$f : A \to \mathbb{R}$ and $g : B \to \mathbb{R}$ are two real-valued functions, then we define
\[
\begin{aligned}
\text{Sum:} \quad & (f + g)(x) = f(x) + g(x) \\
\text{Difference:} \quad & (f - g)(x) = f(x) - g(x) \\
\text{Product:} \quad & (fg)(x) = f(x)\,g(x) \\
\text{Quotient:} \quad & \left(\frac{f}{g}\right)(x) = \frac{f(x)}{g(x)}, \ \text{provided that } g(x) \neq 0.
\end{aligned}
\tag{3.1}
\]


Obviously, if we want to keep the above operations on functions f (x) and g(x) meaningful
then the input must belong to domain of both the functions, i.e., x ∈ D(f ) ∩ D(g) = A ∩ B.

There is one more specific way to combine the two functions known as composition of two
functions. Given an input x, we first operate f on it yielding the output f (x). The output

f (x) lies in B and serves as an input for the other function g. Now, g acts on this new input
f (x) and yields an output g(f (x)) in C. The whole process can be thought of as a single
operation (more precisely a function, say h) that carries the input x to an output g(f (x))
in C. This function h is called the composition of g with f and is written as g ◦ f . Its domain is
the set A of all inputs x and its codomain is C, i.e., h : A → C is a map from A to C with
h = g ◦ f,
and the action of h on x is defined by
h(x) = g(f (x)).
We now state the definition of the composition of two functions precisely:

Definition 3.1.1. Let f : A → B and g : B → C be two functions. Then the


function h : A → C defined by
h(x) = g(f (x))
is called the composition of g with f and is denoted by g ◦ f , i.e., h = g ◦ f .

Sometimes the composition f ◦ g is also possible, provided that R(g) ⊆ D(f ). Further, these
definitions are general enough to apply to functions of several variables as well, merely by
making suitable changes in the domain and range, i.e., replacing A or B by Rn for
some desired n.
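Composition translates directly into code: the output of the inner function is passed as the input of the outer one. The sketch below is illustrative only; `compose` is a hypothetical helper, points of R^2 are represented as Python tuples (an assumption of this sketch), and the two maps are the ones used in Example 69.

```python
def compose(outer, inner):
    """Return the composition outer ∘ inner, i.e. the map x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

# f : R^2 -> R, f(x, y) = xy (a point of R^2 is passed as a tuple)
f = lambda p: p[0] * p[1]
# g : R -> R^2, g(t) = (t^3, t^2)
g = lambda t: (t ** 3, t ** 2)

f_after_g = compose(f, g)   # R -> R, t -> t^3 * t^2 = t^5
g_after_f = compose(g, f)   # R^2 -> R^2, (x, y) -> ((xy)^3, (xy)^2)
```

Note that `compose(f, g)` only makes sense when every output of `g` is a valid input of `f`, which is the condition R(g) ⊆ D(f) stated above.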

ILLUSTRATIVE EXAMPLES


Example 67. Take the two functions $f : [0, \infty) \to \mathbb{R}$ given by $f(x) = \sqrt{x}$ and $g : \mathbb{R} \to \mathbb{R}$
given by $g(x) = -x^2 - 1$. Find $f \circ g$ and $g \circ f$.

Solution. First, we consider the composition $g \circ f$. Since $f$ is the square root function,
the range of $f$ consists of non-negative reals only, i.e.,
\[
R(f) = [0, \infty),
\]
and the domain $D(g)$ of $g$ is the whole of $\mathbb{R}$, i.e., $D(g) = \mathbb{R}$. Obviously,
\[
R(f) = [0, \infty) \subset D(g) = \mathbb{R}.
\]
So, the composition $g \circ f$ is possible and
\[
(g \circ f)(x) = g(f(x)) = g(\sqrt{x}) = -(\sqrt{x})^2 - 1 = -x - 1.
\]
For the other composition $f \circ g$, we see that the range of $g$ consists of negative real numbers
only, i.e.,
\[
R(g) = (-\infty, -1].
\]
Clearly,
\[
R(g) \not\subseteq D(f) = [0, \infty).
\]
Therefore, $f \circ g$ is not possible.

Example 68. Let $f : \mathbb{R} \to \mathbb{R}$, $f(x) = x^3$ and $g : \mathbb{R} \to \mathbb{R}$, $g(x) = \sin x$. Find $f \circ g$ and $g \circ f$.
Solution. In this case both $f \circ g$ and $g \circ f$ are defined. These compositions are given by
\[
(f \circ g)(x) = f(g(x)) = f(\sin x) = \sin^3 x
\]
and
\[
(g \circ f)(x) = g(f(x)) = g(x^3) = \sin(x^3).
\]
Observe that both the compositions are defined but $f \circ g \neq g \circ f$.

In a similar fashion, we can build compositions in case of multivariable functions too. The
following examples will illustrate the concept.

Example 69. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by $f(x, y) = xy$ and $g : \mathbb{R} \to \mathbb{R}^2$ given by
$g(t) = (t^3, t^2)$. Find $f \circ g$ and $g \circ f$.
Solution. Clearly $f \circ g : \mathbb{R} \to \mathbb{R}$ is defined and
\[
(f \circ g)(t) = f(g(t)) = f(t^3, t^2) = t^3 \cdot t^2 = t^5.
\]
Also $g \circ f : \mathbb{R}^2 \to \mathbb{R}^2$ is given by
\[
(g \circ f)(x, y) = g(f(x, y)) = g(xy) = \left((xy)^3, (xy)^2\right).
\]

Example 70. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by $f(x, y) = xy$ and $g : \mathbb{R}^2 \to \mathbb{R}^2$ given by
$g(x, y) = (x^2 - y^2, x^2 + y^2)$. Find $f \circ g$ and $g \circ f$.
Solution. Clearly $f \circ g : \mathbb{R}^2 \to \mathbb{R}$ is defined and
\[
(f \circ g)(x, y) = f(g(x, y)) = f(x^2 - y^2, x^2 + y^2) = (x^2 - y^2)(x^2 + y^2) = x^4 - y^4.
\]
But $g \circ f$ is not defined. (Why?)


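Example 71. Let $f : \mathbb{R}^3 \to \mathbb{R}$ be given by $f(x, y, z) = x e^{yz}$ and $g : \mathbb{R} \to \mathbb{R}^3$ by
$g(t) = (e^t, t, \sin t)$. Find $f \circ g$ and $g \circ f$.
Solution. Clearly $f \circ g : \mathbb{R} \to \mathbb{R}$ is defined and
\[
(f \circ g)(t) = f(g(t)) = f(e^t, t, \sin t) = e^t \cdot e^{t\sin t} = e^{t(1 + \sin t)}.
\]
Also $g \circ f : \mathbb{R}^3 \to \mathbb{R}^3$ is given by
\[
(g \circ f)(x, y, z) = g(f(x, y, z)) = g(x e^{yz}) = \left(e^{x e^{yz}},\; x e^{yz},\; \sin(x e^{yz})\right).
\]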

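Example 72. Let $f : \mathbb{R}^3 \to \mathbb{R}$ be given by $f(u, v, w) = u^2 + v^2 - w$ and $g : \mathbb{R}^3 \to \mathbb{R}^3$ given
by $g(x, y, z) = (x^2 y, y^2, e^{-xz})$. Find $f \circ g$ and $g \circ f$.
Solution. We see that $R(g) \subseteq D(f)$, so the composition $f \circ g$ is possible and further
\[
\begin{aligned}
(f \circ g)(x, y, z) &= f(g(x, y, z)) \\
&= f(x^2 y, y^2, e^{-xz}) \\
&= x^4 y^2 + y^4 - e^{-xz}.
\end{aligned}
\tag{3.2}
\]
But $g \circ f$ is not possible, as $R(f) \not\subseteq D(g)$: the outputs of $f$ are real numbers, whereas $g$ requires inputs from $\mathbb{R}^3$.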

3.2 The Chain Rule


Let us recall how we applied the chain rule¹ to composite functions of a single real
variable in school. Consider the function $y = \sin x^2$, whose derivative with respect
to $x$ is to be obtained. At first sight, this function looks a bit more complicated than the function
$\sin x$ itself. So, in order to simplify the situation, we put $u = x^2$, which at once allows
us to split the function into two parts as below:
\[
y = \sin u, \qquad u = x^2.
\]
Now, $y$ becomes a composite function of $x$ (via $u$). The chain rule is given by
\[
\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}.
\]
This may be written as
\[
y'(x) = y'(u) \cdot u'(x), \tag{3.3}
\]
which gives us
\[
\frac{dy}{dx} = \cos u \cdot 2x = 2x\cos x^2.
\]

¹ We do not give a proof here. The curious reader may find one in Thomas' Calculus.

The chain rule described above is notationally simple and can be used efficiently throughout
the single-variable case. However, in the more general multivariable setting
it is better to write the chain rule (3.3) in functional notation, which is more convenient
and often more useful. Therefore, we restate the formula (3.3) in functional notation.

In the example above, y appears as function of u and u as a function of x, so let us write


u = g(x). Therefore, we have

y = f (u) and u = g(x). (3.4)

Now, replacing $u$ by $g(x)$ in the first member of (3.4), we may write $y$ as a composite function:
\[
y = f(g(x)) = (f \circ g)(x).
\]
With this new notation, the equation (3.3) may be rewritten as below:
\[
(f \circ g)'(x) = f'(u) \cdot u'(x) = f'(g(x)) \cdot g'(x).
\]

Now, we write the statement of chain rule for single variable case as below:

Theorem 3.2.1 (The Chain Rule). Let $f$ and $g$ be two differentiable functions of a single
real variable. Then the composition $f \circ g$ is differentiable and its derivative at $x$ is
given by
\[
(f \circ g)'(x) = f'(g(x)) \cdot g'(x). \tag{3.5}
\]
In particular, at a point $x_0$,
\[
(f \circ g)'(x_0) = f'(g(x_0)) \cdot g'(x_0). \tag{3.6}
\]
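A quick numerical sanity check of the single-variable chain rule is sketched below for y = sin(x^2), the function discussed at the start of this section. The helper `derivative` is a hypothetical central-difference approximation and the evaluation point `x0` is an arbitrary choice; neither comes from these notes.

```python
import math

def derivative(func, x, h=1e-6):
    """Central-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

g = lambda x: x * x           # inner function u = g(x) = x^2
f = math.sin                  # outer function y = f(u) = sin u
composite = lambda x: f(g(x))

x0 = 1.3
chain_rule_value = math.cos(g(x0)) * 2 * x0     # f'(g(x0)) * g'(x0)
numeric_value = derivative(composite, x0)       # direct numerical derivative
```

The two values agree up to the small error of the finite-difference approximation.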

ILLUSTRATIVE EXAMPLES

Example 73. Given $f(x) = e^x$ and $g(x) = ax + b$. Find the derivative $(f \circ g)'$.

Solution. The composition $f \circ g$ of the two functions is given by
\[
(f \circ g)(x) = f(g(x)) = f(ax + b) = e^{ax+b}.
\]
The chain rule to find the derivative is
\[
(f \circ g)'(x) = f'(g(x)) \cdot g'(x). \tag{3.7}
\]
Since $f(x) = e^x$, we have $f'(g(x)) = e^{g(x)} = e^{ax+b}$ and $g'(x) = a$. Plugging these values
into the chain rule given above, we get
\[
(f \circ g)'(x) = e^{ax+b} \cdot a = a e^{ax+b}.
\]

Example 74. Given $f(x) = \log x$ and $g(x) = \sin x$. Find $(f \circ g)'$.

Solution. The composition $f \circ g$ of the two functions is given by
\[
(f \circ g)(x) = f(g(x)) = f(\sin x) = \log \sin x.
\]
The chain rule to find the derivative is
\[
(f \circ g)'(x) = f'(g(x)) \cdot g'(x). \tag{3.8}
\]
Since $f(x) = \log x$ and $g(x) = \sin x$, we find that $f'(g(x)) = \frac{1}{g(x)} = \frac{1}{\sin x}$ and $g'(x) = \cos x$.
Plugging these values into the chain rule given above, we get
\[
(f \circ g)'(x) = \frac{1}{\sin x} \cdot \cos x = \frac{\cos x}{\sin x}.
\]

3.3 Chain Rule in Higher Dimensions


In this section, we shall use the chain rule for multivariable functions. First, we use it in the
simple case when there are two intermediate variables and one independent variable. The chain
rule is stated without proof; a proof can be found in Thomas' Calculus.

3.3.1 Two intermediate and one independent variable


Let $z = f(x, y)$ be a differentiable function of two variables, and let $x$ and $y$ be differentiable
functions of a variable $t$, i.e., $x = x(t)$ and $y = y(t)$. Then $z$ is a composite function of
$t$, i.e.,
\[
z = z(t) = f(x(t), y(t)).
\]
Now, we need to find $z'(t)$, i.e., $dz/dt$. The derivative $z'(t)$ can be obtained with the help of
the chain rule given below.

Theorem 3.3.1. If $z = f(x, y)$ is differentiable and $x = x(t)$ and $y = y(t)$ are also
differentiable functions of $t$, then their composition $z = f(x(t), y(t))$ is differentiable and
\[
\frac{dz}{dt} = \frac{\partial f}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial f}{\partial y} \cdot \frac{dy}{dt}. \tag{3.9}
\]
Also,
\[
\left.\frac{dz}{dt}\right|_{t=t_0}
= \left.\frac{\partial f}{\partial x}\right|_{(x_0, y_0)} \cdot \left.\frac{dx}{dt}\right|_{t=t_0}
+ \left.\frac{\partial f}{\partial y}\right|_{(x_0, y_0)} \cdot \left.\frac{dy}{dt}\right|_{t=t_0},
\]
where $x(t_0) = x_0$ and $y(t_0) = y_0$.

This can be written more formally in functional notations. For this, we note that x and
y are given to be functions of t, so we may write the ordered pair (x(t), y(t)) = g(t). Thus,
we get a function g : R → R2 given by g(t) = (x, y). The function f : R2 → R is already
given by z = f (x, y). Now, composition of f with g is

z(t) = (f ◦ g)(t) = f (g(t)) = f (x(t), y(t)).

The chain rule for derivative z 0 (t) may now be restated as below:

Theorem 3.3.2. Let $f : \mathbb{R}^2 \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}^2$ be two differentiable functions given
by $z = f(x, y)$ and $g(t) = (x, y)$ respectively. Then
\[
\frac{dz}{dt} = (f \circ g)'(t) = f'(g(t)) \cdot g'(t). \tag{3.10}
\]
Also,
\[
\left.\frac{dz}{dt}\right|_{t=t_0} = (f \circ g)'(t_0) = f'(g(t_0)) \cdot g'(t_0).
\]

It should be noted that the formulae (3.9) and (3.10) are essentially the same; they differ only
in notation. Observe that $f(g(t))$ is simply $f(x, y)$, i.e.,
\[
f(g(t)) = f(x, y),
\]
therefore, we have
\[
f'(g(t)) = f'(x, y) = \left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]
\]
and
\[
g'(t) = (x', y') = \left(\frac{dx}{dt}, \frac{dy}{dt}\right).
\]
Thus, the equation (3.10) takes the form
\[
\frac{dz}{dt} = \left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right] \cdot \left(\frac{dx}{dt}, \frac{dy}{dt}\right),
\]
which is nothing but
\[
\frac{dz}{dt} = \frac{\partial f}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial f}{\partial y} \cdot \frac{dy}{dt},
\]
the same as the formula (3.9).

In the case of three or more intermediate variables, the formula (3.9) extends easily. If we
take a differentiable function $u = f(x, y, z)$ of three variables, where $x = x(t)$, $y = y(t)$ and
$z = z(t)$ are themselves differentiable functions of $t$, then the composite function $u = u(t)$ is
differentiated as below:
\[
\frac{du}{dt} = \frac{\partial f}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial f}{\partial y} \cdot \frac{dy}{dt} + \frac{\partial f}{\partial z} \cdot \frac{dz}{dt}. \tag{3.11}
\]

ILLUSTRATIVE EXAMPLES

Example 75. Given $f(x, y) = x^2 + y^2$, where $x = 2t + 7$ and $y = 3t + 8$. Find $df/dt$ (i) by the
chain rule and (ii) by expressing $f$ in terms of $t$.
Solution. (i) The chain rule is given by
\[
\frac{df}{dt} = \frac{\partial f}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial f}{\partial y} \cdot \frac{dy}{dt}.
\]
Now, we see that
\[
\frac{\partial f}{\partial x} = 2x, \qquad \frac{\partial f}{\partial y} = 2y, \qquad
\frac{dx}{dt} = 2, \qquad \frac{dy}{dt} = 3.
\]
Therefore,
\[
\frac{df}{dt} = 2x \cdot 2 + 2y \cdot 3 = 4x + 6y = 4(2t + 7) + 6(3t + 8) = 26t + 76.
\]
(ii) Putting $x = 2t + 7$ and $y = 3t + 8$ in $f(x, y) = x^2 + y^2$ gives
\[
f(t) = (2t + 7)^2 + (3t + 8)^2 = 13t^2 + 76t + 113.
\]
Finally, we get
\[
\frac{df}{dt} = 26t + 76.
\]
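Formula (3.9) can be packaged as a small reusable routine. The sketch below is a minimal illustration under the assumption that the gradient of f and the velocity (dx/dt, dy/dt) of the curve are supplied by hand; the helper name `derivative_along_curve` is hypothetical. It is applied to the data of Example 75.

```python
def derivative_along_curve(grad_f, curve, velocity, t):
    """dz/dt = f_x * dx/dt + f_y * dy/dt, evaluated at the point curve(t)."""
    x, y = curve(t)
    fx, fy = grad_f(x, y)
    dx_dt, dy_dt = velocity(t)
    return fx * dx_dt + fy * dy_dt

grad_f = lambda x, y: (2 * x, 2 * y)       # partial derivatives of f(x, y) = x^2 + y^2
curve = lambda t: (2 * t + 7, 3 * t + 8)   # x(t), y(t)
velocity = lambda t: (2, 3)                # dx/dt, dy/dt

# Values of df/dt at a few sample times; each should match 26t + 76
values = [derivative_along_curve(grad_f, curve, velocity, t) for t in (0, 1, 2.5)]
```

Keeping the gradient and the velocity as separate inputs mirrors the structure of the formula: the first is evaluated at the point (x(t), y(t)), the second at the time t.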


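Example 76. Given $w = x^2 + y^2$, $x = \cos t$, $y = \sin t$. Evaluate $dw/dt$ at $t = \pi$.

Solution. The function $w$ is a composite function of $t$. Therefore, $\left.\frac{dw}{dt}\right|_{t=t_0}$ is given by
\[
\left.\frac{dw}{dt}\right|_{t=t_0}
= \left.\frac{\partial w}{\partial x}\right|_{(x_0, y_0)} \cdot \left.\frac{dx}{dt}\right|_{t=t_0}
+ \left.\frac{\partial w}{\partial y}\right|_{(x_0, y_0)} \cdot \left.\frac{dy}{dt}\right|_{t=t_0},
\]
where $x(t_0) = x_0$ and $y(t_0) = y_0$. At $t = \pi$, we have $x = \cos\pi = -1$ and $y = \sin\pi = 0$,
i.e., at $t = \pi$, $(x_0, y_0) = (-1, 0)$. Now, we have
\[
\frac{\partial w}{\partial x} = 2x \quad \text{and} \quad \left.\frac{\partial w}{\partial x}\right|_{(-1, 0)} = -2,
\]
\[
\frac{\partial w}{\partial y} = 2y \quad \text{and} \quad \left.\frac{\partial w}{\partial y}\right|_{(-1, 0)} = 0,
\]
\[
\frac{dx}{dt} = -\sin t \quad \text{and} \quad \left.\frac{dx}{dt}\right|_{t=\pi} = 0,
\]
\[
\frac{dy}{dt} = \cos t \quad \text{and} \quad \left.\frac{dy}{dt}\right|_{t=\pi} = -1.
\]
Thus, from the chain rule, we get
\[
\left.\frac{dw}{dt}\right|_{t=\pi}
= \left.\frac{\partial w}{\partial x}\right|_{(-1, 0)} \cdot \left.\frac{dx}{dt}\right|_{t=\pi}
+ \left.\frac{\partial w}{\partial y}\right|_{(-1, 0)} \cdot \left.\frac{dy}{dt}\right|_{t=\pi}
= (-2) \cdot 0 + 0 \cdot (-1) = 0.
\]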

Example 77. Suppose a duck is swimming around in a pond. The position $(x, y)$ of the
duck at time $t$ is given by $\vec{c}(t) = (3 + 8t, 3 - 2t)$, while the water temperature is given by the
formula $T(x, y) = 25 + x^2 + y^2$. Find the position of the duck at time $t = 0$. What is the
temperature at that point, and what is the rate of change of temperature with respect to time at
$t = 0$?
Solution. At time $t$, the position of the duck is given by
\[
(x, y) = \vec{c}(t) = (3 + 8t, 3 - 2t).
\]
Therefore, at $t = 0$, the position of the duck is
\[
(x_0, y_0) = \vec{c}(0) = (3, 3).
\]
Also, the temperature at $t = 0$ is
\[
\left. T(x, y) \right|_{t=0} = T(\vec{c}(0)) = 25 + x_0^2 + y_0^2 = 25 + 9 + 9 = 43.
\]
The rate of change of the temperature $T$ with respect to time is given by
\[
\frac{dT}{dt} = \frac{\partial T}{\partial x} \cdot \frac{dx}{dt} + \frac{\partial T}{\partial y} \cdot \frac{dy}{dt}
= (2x)(8) + (2y)(-2) = 16(3 + 8t) - 4(3 - 2t) = 36 + 136t.
\]
Therefore,
\[
\left.\frac{dT}{dt}\right|_{t=0} = 36.
\]
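The answer of Example 77 can be cross-checked by differentiating the composite T(c(t)) numerically instead of using the chain rule. The sketch below is illustrative; the function names and the step size `h` are assumptions made for this check and are not part of the example.

```python
def position(t):
    return (3 + 8 * t, 3 - 2 * t)

def temperature(x, y):
    return 25 + x * x + y * y

def temperature_along_path(t):
    # The composite function T(c(t)) experienced by the duck
    return temperature(*position(t))

def rate_by_chain_rule(t):
    x, y = position(t)
    grad_T = (2 * x, 2 * y)      # (dT/dx, dT/dy)
    velocity = (8, -2)           # (dx/dt, dy/dt)
    return grad_T[0] * velocity[0] + grad_T[1] * velocity[1]

def rate_by_finite_difference(t, h=1e-5):
    return (temperature_along_path(t + h) - temperature_along_path(t - h)) / (2 * h)
```

Both routines describe the same quantity: the first uses the gradient of T together with the velocity of the duck, the second only looks at the temperature readings along the path.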


Example 78. Consider f (x, y) = 9 − x2 − y 2 and ~r(t) = (2 cos t, 3 sin t). For this problem,
imagine the following scenario. A horse is running around outside in the cold. The horse’s
position at time t is given by the elliptical path ~r(t). The temperature of the air at any point
(x, y) is given by T = f (x, y). Now, answer the following:

1. At time t = 0, what is the horse’s position ~r(0), and what is the temperature f (~r(0))
at that position? Find the temperatures at t = π/2, t = π, and t = 3π/2 as well.
2. In the plane, draw the path of the horse for t ∈ [0, 2π]. Then, on the same 2D graph,
include a contour plot of f . Make sure you include the level curves that pass through
the points in part 1. At the points addressed in part 1, write the temperature on
the curve.
Solution. Left as an exercise for the reader.

3.3.2 Two intermediate and two independent variables


Consider a differentiable function $h = f(x, y)$. Suppose that $x$ and $y$ are differentiable
functions of two independent variables $u$ and $v$, i.e., $x = x(u, v)$ and $y = y(u, v)$. Then $h$ is
a composite function of $u$ and $v$, given by
\[
h = h(u, v) = f(x(u, v), y(u, v)).
\]
Now, we need to find $h'(u, v)$. Note that $h$ is now a function of two variables, therefore
$h'(u, v) = [\partial h/\partial u, \partial h/\partial v]$. The components of this derivative matrix, i.e., the partial derivatives
$\partial h/\partial u$ and $\partial h/\partial v$, may be calculated with the help of the chain rule given below:

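Theorem 3.3.3. If $f(x, y)$ is a differentiable function and $x = x(u, v)$ and $y = y(u, v)$
are also differentiable functions of $u$ and $v$, then their composition $h = f(x(u, v), y(u, v))$
is differentiable and
\[
\frac{\partial h}{\partial u} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial u} + \frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial u} \tag{3.12}
\]
and
\[
\frac{\partial h}{\partial v} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial v} + \frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial v}. \tag{3.13}
\]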

Since x = x(u, v) and y = y(u, v), so in functional notation, we may write that (x, y) =
g(u, v). Thus, we get a function g : R2 → R2 given by

(x, y) = g(u, v).

Now, the chain rule given by the equations (3.12) and (3.13) may be formally restated as
below:


Theorem 3.3.4. Let $f : \mathbb{R}^2 \to \mathbb{R}$ and $g : \mathbb{R}^2 \to \mathbb{R}^2$ be two differentiable functions given
by $h = f(x, y)$ and $(x, y) = g(u, v)$ respectively. Then their composition is a differentiable
function $h : \mathbb{R}^2 \to \mathbb{R}$ given by
\[
h(u, v) = (f \circ g)(u, v) = f(g(u, v))
\]
and the derivative of $h$ is given by
\[
h'(u, v) = (f \circ g)'(u, v) = f'(g(u, v)) \cdot g'(u, v). \tag{3.14}
\]
Also, at $u = u_0$ and $v = v_0$, the derivative is
\[
h'(u_0, v_0) = (f \circ g)'(u_0, v_0) = f'(g(u_0, v_0)) \cdot g'(u_0, v_0).
\]

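The formula (3.14) can be written conveniently in terms of derivative matrices also. Note that
$g(u, v) = (x, y)$, therefore, $f(g(u, v)) = f(x, y)$ and
\[
f'(g(u, v)) = \left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right]
\]
and
\[
g'(u, v) =
\begin{bmatrix}
\frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\
\frac{\partial y}{\partial u} & \frac{\partial y}{\partial v}
\end{bmatrix}.
\]
The formula (3.14) may now be written as
\[
\left[\frac{\partial h}{\partial u}, \frac{\partial h}{\partial v}\right]
= \left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right] \cdot
\begin{bmatrix}
\frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\
\frac{\partial y}{\partial u} & \frac{\partial y}{\partial v}
\end{bmatrix}.
\tag{3.15}
\]
This further yields
\[
\frac{\partial h}{\partial u} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial u} + \frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial u}
\]
and
\[
\frac{\partial h}{\partial v} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial v} + \frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial v},
\]
which are the same as (3.12) and (3.13).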

ILLUSTRATIVE EXAMPLES

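Example 79. Given $f(x, y) = xy$, where $g(u, v) = (u^2 - v^2, u^2 + v^2)$. Find $(f \circ g)'$ (i) by the
chain rule and (ii) by expressing $f$ in terms of $u$ and $v$.

Solution. (i) Note that the function $g$ gives an ordered pair as an output. Let us call it
$(x, y)$, i.e., $g(u, v) = (x, y)$ so that $x = u^2 - v^2$ and $y = u^2 + v^2$. Now, given $f(x, y) = xy$
and $g(u, v) = (x, y)$, the composition $h$ is
\[
h(u, v) = (f \circ g)(u, v) = f(g(u, v)).
\]
The chain rule gives us
\[
h'(u, v) = (f \circ g)'(u, v) = f'(g(u, v)) \cdot g'(u, v).
\]
Writing the above in terms of derivative matrices, we get
\[
\left[\frac{\partial h}{\partial u}, \frac{\partial h}{\partial v}\right]
= \left[\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right] \cdot
\begin{bmatrix}
\frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\
\frac{\partial y}{\partial u} & \frac{\partial y}{\partial v}
\end{bmatrix}.
\]
This gives
\[
\left[\frac{\partial h}{\partial u}, \frac{\partial h}{\partial v}\right]
= [y, x] \cdot
\begin{bmatrix}
2u & -2v \\
2u & 2v
\end{bmatrix}.
\]
This further gives
\[
\begin{aligned}
\left[\frac{\partial h}{\partial u}, \frac{\partial h}{\partial v}\right]
&= \left[(2u)y + (2u)x,\; (-2v)y + (2v)x\right] \\
&= \left[(2u)(u^2 + v^2) + (2u)(u^2 - v^2),\; (-2v)(u^2 + v^2) + 2v(u^2 - v^2)\right] \\
&= \left[4u^3, -4v^3\right].
\end{aligned}
\]
Thus $h'(u, v) = [4u^3, -4v^3]$.

(ii) By direct substitution, we find that
\[
h(u, v) = f(g(u, v)) = f(u^2 - v^2, u^2 + v^2) = (u^2 - v^2)(u^2 + v^2) = u^4 - v^4.
\]
Therefore,
\[
\frac{\partial h}{\partial u} = 4u^3
\quad \text{and} \quad
\frac{\partial h}{\partial v} = -4v^3.
\]
Therefore, the derivative of the composition $h$ is
\[
\left[\frac{\partial h}{\partial u}, \frac{\partial h}{\partial v}\right] = (f \circ g)'(u, v) = \left[4u^3, -4v^3\right].
\]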

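Example 80. Compute the derivative matrices using the chain rule for the following functions.

a) $z = x + y$, $x = 3u^2 - 2v$, $y = u - 3v$.
b) $z = u^2 + v^2$, $u = 2x + 7$, $v = 3x + y + 7$.
c) $z = x^2 + (y/x)$, $x = u - 2v + 1$, $y = 2u + v - 2$ at $(u, v) = (0, 0)$.

Solution. (a) The derivative matrix is
\[
\begin{aligned}
\left[\frac{\partial z}{\partial u}, \frac{\partial z}{\partial v}\right]
&= \left[\frac{\partial z}{\partial x}, \frac{\partial z}{\partial y}\right] \cdot
\begin{bmatrix}
\frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\
\frac{\partial y}{\partial u} & \frac{\partial y}{\partial v}
\end{bmatrix} \\
&= [1, 1] \cdot
\begin{bmatrix}
6u & -2 \\
1 & -3
\end{bmatrix} \\
&= [6u + 1, -2 - 3] \\
&= [6u + 1, -5].
\end{aligned}
\]

(b) In this case the derivative matrix is given by
\[
\begin{aligned}
\left[\frac{\partial z}{\partial x}, \frac{\partial z}{\partial y}\right]
&= \left[\frac{\partial z}{\partial u}, \frac{\partial z}{\partial v}\right] \cdot
\begin{bmatrix}
\frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\
\frac{\partial v}{\partial x} & \frac{\partial v}{\partial y}
\end{bmatrix} \\
&= [2u, 2v] \cdot
\begin{bmatrix}
2 & 0 \\
3 & 1
\end{bmatrix} \\
&= [4u + 6v, 2v].
\end{aligned}
\]

(c) Given $z = x^2 + (y/x)$, $x = u - 2v + 1$ and $y = 2u + v - 2$. In functional
notation, we may write $z = z(x, y)$ and $(x, y) = g(u, v)$. The chain rule is
\[
z'(u, v) = z'(g(u, v)) \cdot g'(u, v),
\]
where
\[
z'(g(u, v)) = z'(x, y) = \left[\frac{\partial z}{\partial x}, \frac{\partial z}{\partial y}\right] = \left[2x - \frac{y}{x^2},\; \frac{1}{x}\right]
\]
and
\[
g'(u, v) =
\begin{bmatrix}
\frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\
\frac{\partial y}{\partial u} & \frac{\partial y}{\partial v}
\end{bmatrix}
=
\begin{bmatrix}
1 & -2 \\
2 & 1
\end{bmatrix}.
\]
For $(u, v) = (0, 0)$, we have $(x, y) = (1, -2)$, therefore, we have
\[
z'(g(0, 0)) = z'(1, -2) = [4, 1]
\]
and
\[
g'(0, 0) =
\begin{bmatrix}
1 & -2 \\
2 & 1
\end{bmatrix}.
\]
Thus,
\[
z'(0, 0) = z'(1, -2) \cdot g'(0, 0) = [4, 1] \cdot
\begin{bmatrix}
1 & -2 \\
2 & 1
\end{bmatrix}
= [6, -7].
\]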
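The result of part (c) can be checked without the chain rule by substituting x and y into z and differentiating numerically with respect to u and v. The sketch below assumes central differences with an illustrative step size `h`; the helper names `z_of_uv` and `partials` are hypothetical.

```python
def z_of_uv(u, v):
    # Substitute x = u - 2v + 1 and y = 2u + v - 2 into z = x^2 + y/x
    x = u - 2 * v + 1
    y = 2 * u + v - 2
    return x * x + y / x

def partials(func, u, v, h=1e-6):
    """Central-difference approximations of the partial derivatives w.r.t. u and v."""
    d_du = (func(u + h, v) - func(u - h, v)) / (2 * h)
    d_dv = (func(u, v + h) - func(u, v - h)) / (2 * h)
    return d_du, d_dv

# Should be close to the chain-rule answer [6, -7]
dz_du, dz_dv = partials(z_of_uv, 0.0, 0.0)
```

The agreement is only approximate, since finite differences carry a small discretization error.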


3.3.3 General Case (Optional)


The general case may be omitted on a first reading, as it involves heavier notation in a more
general setting. However, we state it for the curious reader.

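Theorem 3.3.5. Let $f : \mathbb{R}^m \to \mathbb{R}^p$ and $g : \mathbb{R}^n \to \mathbb{R}^m$ be two differentiable functions
given by
\[
f(x_1, x_2, \ldots, x_m) = (f_1, f_2, \ldots, f_p) \quad \text{and} \quad g(u_1, u_2, \ldots, u_n) = (x_1, x_2, \ldots, x_m).
\]
Let their composition be denoted by $h$, i.e., $h = f \circ g$. Then $h : \mathbb{R}^n \to \mathbb{R}^p$ is a differentiable
function with
\[
(f \circ g)'(u_1, u_2, \ldots, u_n) = f'(g(u_1, u_2, \ldots, u_n)) \cdot g'(u_1, u_2, \ldots, u_n).
\]
If we write $h(u_1, u_2, \ldots, u_n) = (h_1, h_2, \ldots, h_p)$, then the chain rule can be stated in terms
of derivative matrices as under:
\[
\begin{bmatrix}
\frac{\partial h_1}{\partial u_1} & \frac{\partial h_1}{\partial u_2} & \cdots & \frac{\partial h_1}{\partial u_n} \\
\vdots & \vdots & & \vdots \\
\frac{\partial h_p}{\partial u_1} & \frac{\partial h_p}{\partial u_2} & \cdots & \frac{\partial h_p}{\partial u_n}
\end{bmatrix}
=
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \cdots & \frac{\partial f_1}{\partial x_m} \\
\vdots & \vdots & & \vdots \\
\frac{\partial f_p}{\partial x_1} & \frac{\partial f_p}{\partial x_2} & \cdots & \frac{\partial f_p}{\partial x_m}
\end{bmatrix}
\cdot
\begin{bmatrix}
\frac{\partial x_1}{\partial u_1} & \frac{\partial x_1}{\partial u_2} & \cdots & \frac{\partial x_1}{\partial u_n} \\
\vdots & \vdots & & \vdots \\
\frac{\partial x_m}{\partial u_1} & \frac{\partial x_m}{\partial u_2} & \cdots & \frac{\partial x_m}{\partial u_n}
\end{bmatrix}.
\]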

ILLUSTRATIVE EXAMPLES

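Example 81. Given $g(x, y) = (x^2 + 1, y^2)$ and $f(u, v) = (u + v, u, v^2)$. Compute the derivative
matrix of $f \circ g$ at $(x, y) = (1, 1)$.

Solution. Write $g(x, y) = (u, v)$ so that $u = x^2 + 1$ and $v = y^2$. Let $f(u, v) = (f_1, f_2, f_3) =
(u + v, u, v^2)$. Now the chain rule is
\[
(f \circ g)'(x, y) = f'(g(x, y)) \cdot g'(x, y).
\]
Note that $f'(g(x, y)) = f'(u, v)$, therefore, we get
\[
f'(u, v) =
\begin{bmatrix}
\frac{\partial f_1}{\partial u} & \frac{\partial f_1}{\partial v} \\
\frac{\partial f_2}{\partial u} & \frac{\partial f_2}{\partial v} \\
\frac{\partial f_3}{\partial u} & \frac{\partial f_3}{\partial v}
\end{bmatrix}
=
\begin{bmatrix}
1 & 1 \\
1 & 0 \\
0 & 2v
\end{bmatrix}
\]
and
\[
g'(x, y) =
\begin{bmatrix}
\frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\
\frac{\partial v}{\partial x} & \frac{\partial v}{\partial y}
\end{bmatrix}
=
\begin{bmatrix}
2x & 0 \\
0 & 2y
\end{bmatrix}.
\]
At $(x, y) = (1, 1)$, we have $(u, v) = g(x, y) = (2, 1)$, therefore
\[
(f \circ g)'(1, 1) = f'(2, 1) \cdot g'(1, 1) =
\begin{bmatrix}
1 & 1 \\
1 & 0 \\
0 & 2
\end{bmatrix}
\cdot
\begin{bmatrix}
2 & 0 \\
0 & 2
\end{bmatrix}
=
\begin{bmatrix}
2 & 2 \\
2 & 0 \\
0 & 4
\end{bmatrix}.
\]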
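In the general case the chain rule is simply a product of derivative matrices. The sketch below illustrates this for the maps of Example 81 and compares the product with the Jacobian of the explicitly composed map h(x, y) = (x^2 + 1 + y^2, x^2 + 1, y^4), which was worked out by hand for this check. The helper `matmul` and the other function names are hypothetical; matrices are represented as nested Python lists.

```python
def matmul(A, B):
    """Product of a p x m matrix A and an m x n matrix B (nested lists)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def g(x, y):
    return (x * x + 1, y * y)

def f_prime(u, v):
    # Derivative matrix of f(u, v) = (u + v, u, v^2)
    return [[1, 1], [1, 0], [0, 2 * v]]

def g_prime(x, y):
    # Derivative matrix of g(x, y) = (x^2 + 1, y^2)
    return [[2 * x, 0], [0, 2 * y]]

def jacobian_by_chain_rule(x, y):
    u, v = g(x, y)
    return matmul(f_prime(u, v), g_prime(x, y))

def jacobian_of_explicit_composite(x, y):
    # h(x, y) = f(g(x, y)) = (x^2 + 1 + y^2, x^2 + 1, y^4), differentiated directly
    return [[2 * x, 2 * y], [2 * x, 0], [0, 4 * y ** 3]]
```

Note the order of evaluation: `f_prime` is evaluated at the intermediate point g(x, y), while `g_prime` is evaluated at (x, y) itself.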

Chapter 4

Gradient Vector Field

In the previous chapters, we explored partial derivatives and the gradient of scalar fields,
tools for analyzing how functions change along principal coordinate directions. A natural
progression is to ask how a function changes in an arbitrary direction. This leads to the
concept of the directional derivative, which generalizes the notion of rate of change by incor-
porating both direction and magnitude. The directional derivative quantifies how a function
changes in a given direction from a point and reveals its connection to the gradient vector
as a projection.

In this chapter, we also introduce vector fields, mathematical constructs that assign a vector
to every point in a domain. Unlike scalar fields, vector fields represent quantities such as
velocity or force, which have both magnitude and direction. We examine vector fields geo-
metrically, with particular attention to electric fields and normal vectors to surfaces. These
concepts deepen our understanding of multivariable behavior, linking analytical techniques
with geometric intuition.

At the end of this chapter, you will be able to:

1. Analyze and visualize vector fields in two and three dimensions.


2. Compute the directional derivative of a scalar field in any specified direction,
3. Compute normal vectors to the (level) surfaces,
4. Derive equation of the tangent plane to a surface,
5. Determine the direction of fastest change (steepest ascent or descent) using the gradient
vector.

4.1 From Scalar Fields to Vector Fields


In the study of functions of several variables, we often begin with scalar fields—functions
that assign a real number to every point in space. For example, a scalar field f (x, y) assigns


a value to each point (x, y) ∈ R2 , and similarly, a function f (x, y, z) defines a scalar field in
R3 .
When we take the gradient of such a scalar field, we obtain a new kind of object: a function
that assigns a vector to each point. Specifically, for a differentiable function $f : \mathbb{R}^n \to \mathbb{R}$,
the gradient
\[
\nabla f(x_1, \ldots, x_n) = \left(\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right)
\]
produces a vector in $\mathbb{R}^n$ at every point in the domain. This vector points in the direction of
fastest increase of the function and captures local behavior of the scalar field.
This motivates the more general idea of a vector field: a function that directly assigns a
vector to each point in a domain, without requiring that it arise as the gradient of a scalar
function. That is, a vector field on a subset of Rn is a function

F : D ⊆ Rn → Rn .

Such functions arise naturally when we consider quantities that inherently involve both
magnitude and direction—such as velocity, acceleration, or force.
Thus, vector fields generalize and extend the idea of gradients. Note that every gradient is a
vector field, but not every vector field arises as a gradient. This distinction leads to rich
mathematical structures and questions, such as which vector fields are gradients of scalar
functions, a topic we will revisit under conservative fields.

Formally, we define a vector field as follows:

Definition 4.1.1. A vector field in Rn is a function F : D ⊆ Rn → Rn .


For $n = 2$, $\mathbf{F}$ is called a two-dimensional vector field and is given by
\[
\mathbf{F}(x, y) = (P(x, y), Q(x, y)) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j},
\]
and for $n = 3$, it is a three-dimensional vector field given by
\[
\mathbf{F}(x, y, z) = (P(x, y, z), Q(x, y, z), R(x, y, z)) = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k}.
\]

Vector fields naturally extend multivariable functions by assigning a vector to each point in
a domain D ⊆ Rn . For instance, while a scalar function f (x, y) assigns a number to each
point (x, y), a vector field F(x, y) = (P (x, y), Q(x, y)) assigns a vector. These fields can be
studied independently of physical interpretation, focusing on their structure, continuity, and
geometric behavior. The following examples illustrate this mathematical perspective.

Example 82. The function F(x, y) = (x + y, 2xy) = (x + y)i + (2xy)j is a
two-dimensional vector field. To a point such as (1, 3), it assigns the vector F(1, 3) = (4, 6) =
4i + 6j. To the point (2, 3), it assigns the vector F(2, 3) = (5, 12) = 5i + 12j.


Example 83. The function F(x, y, z) = (x, x − z, 2yz) = xi + (x − z)j + 2yzk is a vector
field in space R3 . For a point such as (4, 1, −3), it assigns a vector F(4, 1, −3) = (4, 7, −6) =
4i + 7j − 6k. Similarly, for other points, we may find associated vectors.
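In code, a vector field is simply a function that takes a point and returns a vector. The sketch below is illustrative: it encodes the fields of Examples 82 and 83 as Python functions on tuples, and `sample_field` is a hypothetical helper that tabulates a field on a list of points, which is the kind of data one would feed to a plotting routine.

```python
def F2(x, y):
    """Two-dimensional field F(x, y) = (x + y, 2xy)."""
    return (x + y, 2 * x * y)

def F3(x, y, z):
    """Three-dimensional field F(x, y, z) = (x, x - z, 2yz)."""
    return (x, x - z, 2 * y * z)

def sample_field(field, points):
    """Map each point to the vector the field attaches to it."""
    return {p: field(*p) for p in points}

samples = sample_field(F2, [(1, 3), (2, 3)])
vector_in_space = F3(4, 1, -3)
```

The values reproduce the vectors computed by hand in the two examples above.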

Example 84 (Tangent Vector Field on a Circle). Consider the unit circle in the plane,
given by x2 + y 2 = 1. A vector field that assigns to each point on the circle a unit tangent
vector in the counterclockwise direction is

T(x, y) = (−y, x).

This field is tangent to the circle at every point because the dot product T(x, y) · (x, y) =
−yx+xy = 0 vanishes identically. Thus, T is orthogonal to the radial vector (x, y), and hence
tangent to the circle. For example, at the point (1, 0), the tangent vector is T(1, 0) = (0, 1),
pointing in the counterclockwise direction.

Example 85 (Normal Vector Field on a Circle). Let us again consider the unit circle
$x^2 + y^2 = 1$. A natural choice for a normal vector field (pointing radially outward) is

N(x, y) = (x, y).

This field assigns to each point on the circle the position vector itself, which is normal to
the circle at that point. The field points outward and has unit length since
$\|\mathbf{N}(x, y)\| = \sqrt{x^2 + y^2} = 1$. For instance, at the point (0, 1), the normal vector is N(0, 1) = (0, 1),
pointing vertically upward.
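
As a numerical check of Examples 84 and 85, the following sketch samples points on the unit circle and verifies that T and N are orthogonal and of unit length there. The helper names (`tangent_field`, `normal_field`, `sample_circle`) are hypothetical and introduced only for this illustration.

```python
import math


def tangent_field(x, y):
    """T(x, y) = (-y, x)."""
    return (-y, x)


def normal_field(x, y):
    """N(x, y) = (x, y)."""
    return (x, y)


def dot(u, v):
    """Euclidean dot product of two tuples."""
    return sum(a * b for a, b in zip(u, v))


def sample_circle(n):
    """Return n equally spaced points on the unit circle."""
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k in range(n)]


for p in sample_circle(8):
    t = tangent_field(*p)
    n_vec = normal_field(*p)
    # T . N should vanish and both lengths should equal 1, up to rounding.
    print(p, dot(t, n_vec), math.hypot(*t), math.hypot(*n_vec))
```
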

There are physical motivations too for consideration of vector fields. We know that physical
vector quantities such as velocity and force may change from point to point. Therefore, they
are often modelled by vector fields. The velocity field represents speed and direction (at any
point) of a moving fluid in space and force fields (such as magnetic or gravitational) give
strength and direction of the force at any point in space.

Example 86 (Velocity Vector Field). Let a fluid flow in space be described by the velocity
vector field
v(x, y, z) = (−y, x, 0).
This field assigns to each point (x, y, z) a velocity vector indicating the direction and speed
of the fluid at that point. For example, at the point (2, 3, 0), we have

v(2, 3, 0) = (−3, 2, 0) = −3î + 2ĵ.

This means that a fluid particle at (2, 3, 0) moves in the direction of the vector −3î + 2ĵ, i.e.,
perpendicular to the position vector in the xy-plane, suggesting a rotational flow around the
z-axis. The vector field v thus models a swirling or vortex-type motion in the plane.


Example 87 (Spring Force Field). Consider a force field representing the restoring force
in a spring-like medium, governed by Hooke’s Law. In two dimensions, it is given by
F(x, y) = −k(x, y),
where k > 0 is the spring constant. This force pulls the particle back toward the origin and
increases in magnitude as the particle moves farther away. For example, if k = 2, then at
the point (1, −3), the force is
F(1, −3) = −2(1, −3) = (−2, 6) = −2î + 6ĵ.
This means a particle at (1, −3) experiences a force directed toward the origin, trying to
restore equilibrium. This field is radial and linear, making it a simple and classic example
of a conservative force field.
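
Example 88 (Gradient Vector Field). Let $f(x, y, z) = x^2 + y^2 + z^2$ be a scalar field
defined on $\mathbb{R}^3$. Its gradient is given by
\[ \nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right) = (2x, 2y, 2z). \]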
This expression defines a vector field in space R3 , since to every point (x, y, z), the gradient
assigns a vector. For example, at the point (1, 2, 3), we have
∇f (1, 2, 3) = (2, 4, 6) = 2î + 4ĵ + 6k̂.
Thus, the gradient acts as a rule that associates a vector to each point in space, making it
a vector field.

Example 89 (Gravitational Vector Field). According to Newton's law of universal gravitation,
any two point masses attract each other with a force that is proportional to the product
of their masses and inversely proportional to the square of the distance between them.
The force is directed along the line joining the two masses and is given by the formula:
\[ \mathbf{F} = -\frac{G m_1 m_2}{r^2}\, \hat{\mathbf{r}}, \]
where $G = 6.67430 \times 10^{-11}\ \mathrm{m^3\,kg^{-1}\,s^{-2}}$ is the universal gravitational constant, $m_1$ and $m_2$
are the two masses, $r$ is the distance between them, and $\hat{\mathbf{r}}$ is the unit vector pointing from the
other mass toward the mass experiencing the force. The negative sign signifies that the force is
attractive: it pulls the mass back toward the other one.

This leads naturally to the concept of a gravitational vector field, which describes the gravitational
force exerted by a fixed mass M on a unit mass placed at various points in space.
If M is located at the origin, then the gravitational field at a point $(x, y, z) \in \mathbb{R}^3 \setminus \{(0, 0, 0)\}$
is given by:
\[ \mathbf{G}(x, y, z) = -\frac{GM}{(x^2 + y^2 + z^2)^{3/2}}\, (x\,\hat{\imath} + y\,\hat{\jmath} + z\,\hat{k}). \]


This vector field points toward the origin, and its magnitude decreases with the square of
the distance from the origin.

Numerical Example: Suppose the fixed mass is $M = 5.972 \times 10^{24}$ kg (mass of Earth) and
we evaluate the gravitational field at the point $(6.371 \times 10^{6}, 0, 0)$, which lies on the surface
of Earth, assuming that the Earth is a perfect sphere of radius $R = 6.371 \times 10^{6}$ m. Then,
we get
\[ \mathbf{G}(6.371 \times 10^{6}, 0, 0) = -\frac{(6.67430 \times 10^{-11})(5.972 \times 10^{24})}{(6.371 \times 10^{6})^{3}}\, (6.371 \times 10^{6})\,\hat{\imath} \approx -9.8\,\hat{\imath}\ \mathrm{m/s^2}. \]
This means that a unit mass (1 kg) placed at that point experiences a force of magnitude about 9.8 N directed
toward the center of Earth, along the negative x-axis.
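
The numerical example can be checked with a few lines of code. This is a minimal sketch that evaluates the field formula of Example 89 with the constants quoted above; the function name `gravitational_field` and the constant names are assumptions made for the illustration.

```python
import math

G_CONST = 6.67430e-11   # universal gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24      # mass of Earth in kg, as quoted in the notes
R_EARTH = 6.371e6       # radius of Earth in m, as quoted in the notes


def gravitational_field(x, y, z, mass=M_EARTH):
    """Field of a point mass at the origin: -G M (x, y, z) / r^3."""
    r_squared = x * x + y * y + z * z
    if r_squared == 0:
        raise ValueError("the field is undefined at the origin")
    scale = -G_CONST * mass / r_squared ** 1.5
    return (scale * x, scale * y, scale * z)


g = gravitational_field(R_EARTH, 0.0, 0.0)
print(g)                                    # components in m/s^2
print(math.sqrt(sum(c * c for c in g)))     # magnitude, close to 9.8
```
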

Example 90 (Electric Field of a Point Charge). According to Coulomb's law, the
electric force between two point charges is proportional to the product of the charges and
inversely proportional to the square of the distance between them. The force acts along the
line joining the charges and is given by the formula:
\[ \mathbf{F} = k\, \frac{q_1 q_2}{r^2}\, \hat{\mathbf{r}}, \]
where $k = \frac{1}{4\pi\varepsilon_0}$ is known as Coulomb's constant and has the numerical value
$k \approx 8.9875 \times 10^{9}\ \mathrm{N \cdot m^2/C^2}$, $q_1$ and $q_2$ are the charges, $r$ is the distance between them, and $\hat{\mathbf{r}}$ is
the unit vector pointing from the charge exerting the force to the one experiencing it.

From this principle, we define the electric field generated by a point charge q located at the
origin. The electric field at a point (x, y, z) ∈ R3 \ {(0, 0, 0)} is the force per unit positive
charge placed at that point and is given by:
\[ \mathbf{E}(x, y, z) = k\, \frac{q}{(x^2 + y^2 + z^2)^{3/2}}\, (x\,\hat{\imath} + y\,\hat{\jmath} + z\,\hat{k}). \]
This defines a vector field in space, where each point is associated with a vector pointing
away from the origin if q > 0 (positive charge) or toward the origin if q < 0 (negative charge).
The magnitude of the field diminishes with the square of the distance from the origin.

In particular, at the point (1, 0, 0), the electric field becomes:


\[ \mathbf{E}(1, 0, 0) = k\, \frac{q}{1^3}\, (1, 0, 0) = kq\,\hat{\imath}. \]


This shows that at (1, 0, 0), for q > 0, the electric field vector points in the positive x-direction with
magnitude kq. Thus, the electric field describes the influence of a point charge
on its surrounding space, assigning a vector to every point that indicates the direction and
strength of the electric force on a unit positive charge.

To conclude, if we take q = 1 C and measure distances in meters, the electric field at (1, 0, 0) becomes:

\[ \mathbf{E}(1, 0, 0) = \frac{(8.9875 \times 10^{9}\ \mathrm{N \cdot m^2/C^2})(1\ \mathrm{C})}{(1\ \mathrm{m})^2}\, \hat{\imath} = 8.9875 \times 10^{9}\ \mathrm{N/C}\ \hat{\imath}. \]

Thus, the electric field at the point (1, 0, 0) due to a charge of 1 C at the origin is approximately
$8.9875 \times 10^{9}$ N/C in the positive x-direction.

4.1.1 Plotting a Vector Field


Consider a 2-dimensional vector field F(x, y) = (P (x, y), Q(x, y)). To obtain a graphical view
of this vector field, one might begin by plotting the vectors (P, Q) in the plane. Initially, these
vectors are often drawn starting from the origin, as illustrated by the blue vector in the image
below. However, such a depiction misrepresents the nature of a vector field. A vector field

assigns a vector (an arrow) to each point in the plane. In the example above, only the arrows
are shown originating from the origin, while the points to which they are associated are com-
pletely disregarded. Thus, this approach neglects the crucial point-to-vector correspondence.

To correctly represent a vector field, we must associate each vector with its corresponding
point. This can be achieved by translating each arrow so that it starts at the input point
(x, y). In other words, instead of placing the vector at the origin, we draw the arrow with its
tail at (x, y) and head at (x + P, y + Q). This is shown as the red arrow in the figure above.
In this way, both the input point and the output vector are represented, revealing the full
structure of the vector field.
Example 91. For example, consider the vector field F(x, y) = xi − yj, where P (x, y) = x
and Q(x, y) = −y. To plot this vector field, we compute a few sample vectors:
F(1, 1) = i − j = (1, −1),
F(0, 1) = −j = (0, −1),
F(1, −2) = i + 2j = (1, 2),
F(−2, 1) = −2i − j = (−2, −1).


Now, we plot the following four vectors by placing each one at its corresponding initial point
(x, y), with the arrow pointing to the terminal point (x + P, y + Q):

Initial point (1, 1) → Terminal point (2, 0),


Initial point (0, 1) → Terminal point (0, 0),
Initial point (1, −2) → Terminal point (2, 0),
Initial point (−2, 1) → Terminal point (−4, 0).

These vectors are shown in Figure 4.1.1. Although plotting a few vectors gives some
idea of the vector field's behavior, software tools such as GeoGebra can generate
many more vectors, offering a more detailed visualization. A computer-generated plot of this
vector field is shown in Figure 4.1.2.

Figure 4.1.1: Manual Plot          Figure 4.1.2: Computer Generated Plot
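
The tail-to-head rule used in Example 91 can be sketched in code. The helper `arrow` below is a hypothetical name; it only computes the endpoints of each arrow, which could then be passed to whatever plotting tool is at hand.

```python
def F(x, y):
    """The field of Example 91: F(x, y) = (x, -y)."""
    return (x, -y)


def arrow(field, point):
    """Return (tail, head) for the arrow drawn at `point`.

    The tail is the input point (x, y) and the head is (x + P, y + Q),
    where (P, Q) = field(x, y).
    """
    x, y = point
    p, q = field(x, y)
    return (x, y), (x + p, y + q)


sample_points = [(1, 1), (0, 1), (1, -2), (-2, 1)]
for pt in sample_points:
    tail, head = arrow(F, pt)
    print("initial point", tail, "-> terminal point", head)
```
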
Example 92. A computer-generated graphic visualization of the vector field F(x, y) = yi−xj
is shown below.

Figure 4.1.3: Vector Field F(x, y) = yi − xj


4.2 Gradient Vector Field


Let us take a function z = f (x, y). The derivative of this function (if it exists) at a point (a, b) is the
1 × 2 matrix of its two partial derivatives:
\[ f'(a, b) = Df(a, b) = [\, f_x(a, b) \quad f_y(a, b) \,]. \]
Regarded as a vector, it is called the gradient vector of f at (a, b); letting the point vary gives
the gradient vector field. Thus, we define the gradient vector field as below:

Definition 4.2.1. The gradient vector field of a scalar field z = f (x, y) is

\[ \nabla f(x, y) = (f_x(x, y), f_y(x, y)) = \frac{\partial f}{\partial x}(x, y)\,\mathbf{i} + \frac{\partial f}{\partial y}(x, y)\,\mathbf{j}. \]

Example 93. Plot the gradient vector field of the function $z = x^2 + y^2$. Also, draw the level
curve of the surface that passes through (1, 0). Draw a tangent to this level curve at the
point (1, 0) and also plot ∇f (1, 0) at that point. What relation can be seen between the gradient
vector ∇f (1, 0) and the tangent vector to the level curve at that point?

Solution. The gradient vector field of the function $z = x^2 + y^2$ at the point (x, y) is given by
\[ \nabla f(x, y) = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} = 2x\,\mathbf{i} + 2y\,\mathbf{j}. \]
The gradient vector at the point (1, 0) is given by
∇f (1, 0) = 2i + 0j = 2i.
A plot of the gradient vector field ∇f (x, y) is given in Figure 4.2.1.

Figure 4.2.1: $\nabla f(x, y) = 2x\,\mathbf{i} + 2y\,\mathbf{j}$

The level curve that passes through the point (1, 0) is $x^2 + y^2 = 1$. A tangent vector to this level curve at (1, 0) is

v = (0, 1).

Since ∇f (1, 0) · v = (2, 0) · (0, 1) = 0, the gradient vector at (1, 0) is orthogonal to the tangent
vector of the level curve at that point.

Figure 4.2.2: Without Field          Figure 4.2.3: With Field

Example 94. Find the gradient of the function $f(x, y) = \log(x^2 + y^2)$ at the point (1, 1).
Sketch the gradient vector together with the level curve that passes through the point.
Solution. Left as an exercise.

4.3 Gradient and Directional Derivative


Let us now extend the notion of partial derivatives to the derivatives taken in an arbitrary
direction. Consider a unit vector v = (v1 , v2 ) that defines a direction in the plane. We wish
to compute the rate at which a function f (x, y) changes at a point p = (a, b) as we move in
the direction of v. To analyze this, let us move a small distance t from the point p in the
direction of v. The new point q, reached by moving along the direction v, is given by:

q = p + tv = (a, b) + t(v1 , v2 ) = (a + tv1 , b + tv2 ).

We now consider the difference quotient
\[ \frac{f(\mathbf{p} + t\mathbf{v}) - f(\mathbf{p})}{t}, \]
which measures the average rate of change of f in the direction v over a small interval of
length t.

Taking the limit as t → 0, we obtain the directional derivative of f at the point (a, b) in the
direction of the unit vector v:
\[ D_{\mathbf{v}} f(a, b) = \lim_{t \to 0} \frac{f(\mathbf{p} + t\mathbf{v}) - f(\mathbf{p})}{t}. \]
In this expression, the point p and the vector v are fixed. Therefore, p + tv is just a function
of t. Define
\[ \mathbf{c}(t) = \mathbf{p} + t\mathbf{v}, \quad \text{so that} \quad \mathbf{c}'(t) = \mathbf{v}. \]

Figure 4.3.1: Directional Derivative $D_{\mathbf{v}} f$

Also, at t = 0, we have $\mathbf{c}(0) = \mathbf{p}$ and $\mathbf{c}'(0) = \mathbf{v}$. Then the expression becomes
\[ D_{\mathbf{v}} f(a, b) = \lim_{t \to 0} \frac{f(\mathbf{c}(t)) - f(\mathbf{c}(0))}{t}. \]
To simplify further, define
\[ g(t) = f(\mathbf{c}(t)), \quad \text{so that} \quad g(t) = (f \circ \mathbf{c})(t). \]
Then, at t = 0, we have $g(0) = (f \circ \mathbf{c})(0) = f(\mathbf{c}(0))$. Therefore, the expression becomes
\[ D_{\mathbf{v}} f(a, b) = \lim_{t \to 0} \frac{g(t) - g(0)}{t}. \]
The right-hand side is just the ordinary derivative of g(t) evaluated at t = 0, that is,
\[ D_{\mathbf{v}} f(a, b) = \left. \frac{dg}{dt} \right|_{t=0}. \]

Now, using the chain rule (assuming f is differentiable at p), we get:
\[ g'(t) = (f \circ \mathbf{c})'(t) = f'(\mathbf{c}(t)) \cdot \mathbf{c}'(t) = \nabla f(\mathbf{c}(t)) \cdot \mathbf{c}'(t). \]
At t = 0, we obtain:
\[ g'(0) = \nabla f(\mathbf{c}(0)) \cdot \mathbf{c}'(0). \]
Since $\mathbf{c}(0) = \mathbf{p} = (a, b)$ and $\mathbf{c}'(0) = \mathbf{v}$, we have:
\[ D_{\mathbf{v}} f(a, b) = g'(0) = \nabla f(a, b) \cdot \mathbf{v}. \]


Thus, the directional derivative of a function f (x, y) in the direction v is the component of
the gradient vector ∇f (x, y) in that direction.
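
The formula can be compared against the defining difference quotient numerically. The sketch below uses the function f(x, y) = x²y + y³ at the point (1, 2) in the direction (3, 4), which is the data of Example 95 later in this section; the helper names and the step size `t = 1e-6` are illustrative assumptions.

```python
import math


def f(x, y):
    return x ** 2 * y + y ** 3


def grad_f(x, y):
    """Gradient of f, computed by hand: (2xy, x^2 + 3y^2)."""
    return (2 * x * y, x ** 2 + 3 * y ** 2)


def unit(direction):
    """Normalize a direction vector to unit length."""
    norm = math.hypot(*direction)
    return tuple(c / norm for c in direction)


def directional_derivative(grad, point, direction):
    """Gradient formula: D_v f = grad f(p) . v for the unit vector v."""
    v = unit(direction)
    return sum(g * c for g, c in zip(grad(*point), v))


def difference_quotient(func, point, direction, t=1e-6):
    """Definition: (f(p + t v) - f(p)) / t for a small step t."""
    v = unit(direction)
    moved = tuple(p + t * c for p, c in zip(point, v))
    return (func(*moved) - func(*point)) / t


print(directional_derivative(grad_f, (1, 2), (3, 4)))   # exact formula
print(difference_quotient(f, (1, 2), (3, 4)))           # numerical estimate
```

The two printed values should agree to several decimal places; the small discrepancy is the truncation error of the one-sided difference quotient.
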

Geometric Interpretation: The directional derivative $D_{\mathbf{v}} f(a, b)$ has a clear geometric
meaning. It represents the slope of the tangent line to the curve formed by the intersection
of the surface z = f (x, y) with the vertical plane that passes through the point
B = (a, b, f (a, b)) and is parallel to the direction vector v.

In other words, the directional derivative captures the instantaneous rate of change of the
surface elevation as one walks away from the point (a, b) in the direction of v. When v aligns
with the coordinate axes, this reduces to the familiar partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$.

Maximum Rate of Change: The directional derivative of a scalar function f at a
point (a, b) in the direction of a unit vector $\mathbf{v} = (v_1, v_2)$ is given by:
\[ D_{\mathbf{v}} f(a, b) = \nabla f(a, b) \cdot \mathbf{v} = \|\nabla f(a, b)\|\, \|\mathbf{v}\| \cos\theta, \]
where θ is the angle between the gradient vector ∇f (a, b) and the direction vector v. Since
$\|\mathbf{v}\| = 1$, the formula simplifies to:
\[ D_{\mathbf{v}} f(a, b) = \|\nabla f(a, b)\| \cos\theta. \]
Since −1 ≤ cos θ ≤ 1, the value of the directional derivative depends on the alignment of
the direction vector v with the gradient. The fastest change in the function occurs when
cos θ = 1, i.e., θ = 0, meaning v is in the same direction as the gradient vector. In this case,
the function is increasing the fastest, and the maximum rate of change is:
\[ \max_{\mathbf{v}} D_{\mathbf{v}} f(a, b) = \|\nabla f(a, b)\|. \]
Provided ∇f (a, b) ≠ 0, a unit vector pointing in the direction of the fastest increase is:
\[ \mathbf{v} = \frac{\nabla f(a, b)}{\|\nabla f(a, b)\|}. \]
On the other hand, the function is decreasing the fastest when cos θ = −1, i.e., when v
points directly opposite to the gradient vector. A unit vector in this direction of the fastest
decrease is:
\[ \mathbf{u} = -\frac{\nabla f(a, b)}{\|\nabla f(a, b)\|}. \]

Conclusion: The gradient vector ∇f (a, b) points in the direction where the function
increases the fastest, and its magnitude $\|\nabla f(a, b)\|$ gives the fastest rate of change. The
function decreases the fastest in the opposite direction.
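
The conclusion can be illustrated by brute force: scan unit vectors around the circle and record where the directional derivative is largest. The sketch below does this for the gradient (14, 8), which is the gradient of f(x, y) = 3x² + 4xy + y² at (1, 2); the function name `scan_directions` and the number of steps are assumptions chosen for the illustration.

```python
import math


def grad_f(x, y):
    """Gradient of f(x, y) = 3x^2 + 4xy + y^2, computed by hand."""
    return (6 * x + 4 * y, 4 * x + 2 * y)


def scan_directions(gradient, steps=3600):
    """Return (largest rate, angle) over unit vectors (cos t, sin t)."""
    gx, gy = gradient
    best_rate, best_theta = None, None
    for k in range(steps):
        theta = 2 * math.pi * k / steps
        rate = gx * math.cos(theta) + gy * math.sin(theta)   # grad . v
        if best_rate is None or rate > best_rate:
            best_rate, best_theta = rate, theta
    return best_rate, best_theta


g = grad_f(1, 2)
rate, theta = scan_directions(g)
print(rate, math.hypot(*g))               # scanned maximum vs. norm of gradient
print(theta, math.atan2(g[1], g[0]))      # scanned angle vs. angle of gradient
```

Up to the resolution of the scan, the largest rate matches the norm of the gradient and the best angle matches the direction of the gradient.
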


ILLUSTRATIVE EXAMPLES

Example 95. Let $f(x, y) = x^2 y + y^3$. Compute the directional derivative at the point (1, 2)
in the direction of the vector v = (3, 4).

Solution. First, compute the gradient of the function f as below:
\[ \nabla f(x, y) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) = (2xy,\; x^2 + 3y^2). \]
At the point (1, 2), the gradient is given by
\[ \nabla f(1, 2) = (2 \cdot 1 \cdot 2,\; 1^2 + 3 \cdot 2^2) = (4, 13). \]
Now, a unit vector in the direction of v = (3, 4) is given by
\[ \hat{\mathbf{v}} = \frac{1}{\sqrt{3^2 + 4^2}}\,(3, 4) = \left( \frac{3}{5}, \frac{4}{5} \right). \]
Therefore, the directional derivative of f in the direction of the vector v is
\[ D_{\mathbf{v}} f(1, 2) = \nabla f(1, 2) \cdot \hat{\mathbf{v}} = (4, 13) \cdot \left( \frac{3}{5}, \frac{4}{5} \right) = \frac{12 + 52}{5} = \frac{64}{5}. \]

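Example 96. Let $f(x, y) = e^x \sin y$. Find the directional derivative at the point (0, π/2) in
the direction of the vector v = (1, −1).

Solution. The gradient of the function f is
\[ \nabla f(x, y) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) = (e^x \sin y,\; e^x \cos y). \]
At (0, π/2), we get
\[ \nabla f(0, \pi/2) = (1 \cdot 1,\; 1 \cdot 0) = (1, 0). \]
Now, we find a unit vector along the direction of the vector v:
\[ \hat{\mathbf{v}} = \frac{1}{\sqrt{1^2 + (-1)^2}}\,(1, -1) = \left( \frac{1}{\sqrt{2}}, \frac{-1}{\sqrt{2}} \right). \]
Therefore, the required directional derivative is
\[ D_{\mathbf{v}} f(0, \pi/2) = (1, 0) \cdot \left( \frac{1}{\sqrt{2}}, \frac{-1}{\sqrt{2}} \right) = \frac{1}{\sqrt{2}}. \]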


Example 97. Let $f(x, y) = \log(x^2 + y^2)$. Find the directional derivative at the point (1, −1)
in the direction pointing from (1, −1) to (2, 1).

Solution. The gradient vector of the function f is given by
\[ \nabla f(x, y) = \left( \frac{2x}{x^2 + y^2}, \frac{2y}{x^2 + y^2} \right). \]
At (1, −1), the gradient is
\[ \nabla f(1, -1) = \left( \frac{2 \cdot 1}{2}, \frac{2 \cdot (-1)}{2} \right) = (1, -1). \]
The vector from the initial point (1, −1) to the point (2, 1) is v = (1, 2), so the unit vector along
this direction is given by
\[ \hat{\mathbf{v}} = \frac{1}{\sqrt{1^2 + 2^2}}\,(1, 2) = \left( \frac{1}{\sqrt{5}}, \frac{2}{\sqrt{5}} \right). \]
Thus, the required directional derivative is
\[ D_{\mathbf{v}} f(1, -1) = (1, -1) \cdot \left( \frac{1}{\sqrt{5}}, \frac{2}{\sqrt{5}} \right) = \frac{1 - 2}{\sqrt{5}} = -\frac{1}{\sqrt{5}}. \]

Example 98. Let $f(x, y, z) = x^2 yz + yz^2$. Compute the directional derivative at the point
(1, 2, 3) in the direction of the vector v = (2, −1, 2).

Solution. The gradient vector ∇f (x, y, z) of the function f is given by
\[ \nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right) = (2xyz,\; x^2 z + z^2,\; x^2 y + 2yz). \]
We evaluate the gradient at (1, 2, 3) as below:
\[ \nabla f(1, 2, 3) = (12,\; 3 + 9,\; 2 + 12) = (12, 12, 14). \]
Next, we compute the unit vector in the direction of v = (2, −1, 2):
\[ \hat{\mathbf{v}} = \frac{\mathbf{v}}{\|\mathbf{v}\|} = \frac{(2, -1, 2)}{\sqrt{2^2 + (-1)^2 + 2^2}} = \left( \frac{2}{3}, -\frac{1}{3}, \frac{2}{3} \right). \]
Now, the directional derivative is given by
\[ D_{\mathbf{v}} f(1, 2, 3) = \nabla f(1, 2, 3) \cdot \hat{\mathbf{v}} = (12, 12, 14) \cdot \left( \frac{2}{3}, -\frac{1}{3}, \frac{2}{3} \right) = \frac{1}{3}(24 - 12 + 28) = \frac{40}{3}. \]
Thus, the directional derivative of f at (1, 2, 3) in the direction of v is $\frac{40}{3}$.


Example 99. Let $f(x, y) = 3x^2 + 4xy + y^2$. Find the direction and value of the maximum
and minimum rate of change of f at the point (1, 2).

Solution. First, we compute the gradient of f as below:
\[ \nabla f(x, y) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) = (6x + 4y,\; 4x + 2y). \]
At (1, 2), the gradient is given by
\[ \nabla f(1, 2) = (14, 8) = 14\mathbf{i} + 8\mathbf{j}. \]
The maximum rate of change at a point is given by the norm of the gradient vector at that
point. Therefore, at the point (1, 2), the maximum rate of change is given by
\[ \max_{\mathbf{v}} D_{\mathbf{v}} f(1, 2) = \|\nabla f(1, 2)\| = \sqrt{14^2 + 8^2} = \sqrt{196 + 64} = \sqrt{260}. \]
Also, the direction of maximum rate of change, i.e., the direction of fastest increase, is the
direction given by the gradient vector itself. A unit vector along this direction is given by
\[ \hat{\mathbf{v}} = \frac{\nabla f(1, 2)}{\|\nabla f(1, 2)\|} = \frac{1}{\sqrt{260}}\,(14\mathbf{i} + 8\mathbf{j}). \]
The minimum rate of change at a point is just the negative of the norm of the gradient vector
at that point. Therefore, the minimum rate of change at (1, 2) is given by
\[ \min_{\mathbf{v}} D_{\mathbf{v}} f(1, 2) = -\|\nabla f(1, 2)\| = -\sqrt{260}, \]
and it occurs in the direction opposite to that of the gradient vector at that point.
Thus, the direction of fastest decrease is given by the unit vector
\[ \hat{\mathbf{u}} = -\frac{\nabla f(1, 2)}{\|\nabla f(1, 2)\|} = \frac{1}{\sqrt{260}}\,(-14\mathbf{i} - 8\mathbf{j}). \]

Example 100. Let f (x, y, z) = xyz. Find the maximum and minimum rate of change of f
at the point (1, −1, 2) and the directions in which they occur.

Solution. The gradient is ∇f (x, y, z) = (yz, xz, xy). At the point (1, −1, 2), we have
\[ \nabla f(1, -1, 2) = (-2, 2, -1). \]
The maximum rate of change at the point (1, −1, 2) is given by
\[ \|\nabla f(1, -1, 2)\| = \sqrt{(-2)^2 + 2^2 + (-1)^2} = \sqrt{4 + 4 + 1} = \sqrt{9} = 3, \]
and the direction of maximum rate of change is given by the unit vector
\[ \hat{\mathbf{v}} = \frac{1}{3}\,(-2, 2, -1) = \frac{1}{3}\,(-2\mathbf{i} + 2\mathbf{j} - \mathbf{k}). \]
Similarly, the minimum rate of change at the point (1, −1, 2) is $-\|\nabla f(1, -1, 2)\| = -3$. Also,
the direction of minimum rate of change is
\[ \hat{\mathbf{u}} = -\hat{\mathbf{v}} = -\frac{1}{3}\,(-2\mathbf{i} + 2\mathbf{j} - \mathbf{k}). \]


Example 101. Suppose $f(x, y) = e^x \sin y$. Find the direction of fastest increase and the
maximum rate of change of f at the point (0, π/2).

Solution. The gradient is given by $\nabla f(x, y) = (e^x \sin y,\; e^x \cos y)$. At the point (0, π/2),
this becomes
\[ \nabla f(0, \pi/2) = (1 \cdot 1,\; 1 \cdot 0) = (1, 0). \]
Thus, the maximum rate of change is $\|\nabla f(0, \pi/2)\| = \sqrt{1^2 + 0^2} = 1$ and the
direction of fastest increase is given by
\[ \hat{\mathbf{v}} = \frac{\nabla f(0, \pi/2)}{\|\nabla f(0, \pi/2)\|} = (1, 0). \]
The minimum rate of change is −1, attained in the direction of the vector (−1, 0).

Example 102. The temperature at a point (x, y) in a metal plate is given by
$T(x, y) = 100 - x^2 - y^2$, where T is in degrees Celsius and x, y are in meters. A bug is sitting at the
point (2, 1). In which direction should the bug move to warm up as quickly as possible?
Also, find the maximum rate of increase in temperature at that point.

Solution. We begin by computing the gradient of the temperature function:
\[ \nabla T(x, y) = \left( \frac{\partial T}{\partial x}, \frac{\partial T}{\partial y} \right) = (-2x, -2y). \]
At the point (2, 1), we get ∇T (2, 1) = (−4, −2). This vector points in the direction of the
steepest temperature increase. So, the bug should move in the direction:
\[ \hat{\mathbf{v}} = \frac{1}{\sqrt{(-4)^2 + (-2)^2}}\,(-4, -2) = \frac{1}{\sqrt{20}}\,(-4, -2). \]
The maximum rate of increase is given by the magnitude of the gradient:
\[ \|\nabla T(2, 1)\| = \sqrt{(-4)^2 + (-2)^2} = \sqrt{20} = 2\sqrt{5}. \]
Therefore, the bug should move in the direction $\frac{1}{\sqrt{20}}(-4\mathbf{i} - 2\mathbf{j})$, that is, toward the origin, and the
temperature will increase at a rate of $2\sqrt{5}$ degrees Celsius per meter.

4.4 Gradient and Orthogonality to Level Sets


First, we consider the level curves in two dimensions. Let a function z = f (x, y) be given
and suppose the equation f (x, y) = c defines a level curve in the plane. That is, all points
(x, y) on this curve have the same function value c, where c is a constant.

Let c(t) = (x(t), y(t)) be a smooth parameterization of this level curve. Let (a, b) be a point
on the curve corresponding to the parameter value $t = t_0$. Then $\mathbf{c}(t_0) = (x(t_0), y(t_0)) = (a, b)$
and f (a, b) = c. Since every point on the level curve satisfies f (x(t), y(t)) = c, we differentiate
both sides with respect to t:
\[ \frac{d}{dt} f(\mathbf{c}(t)) = 0. \]
Applying the chain rule:
\[ \nabla f(\mathbf{c}(t)) \cdot \mathbf{c}'(t) = 0. \]
Evaluating this at $t = t_0$, we get:
\[ \nabla f(\mathbf{c}(t_0)) \cdot \mathbf{c}'(t_0) = 0, \]
which simplifies to:
\[ \nabla f(a, b) \cdot \mathbf{c}'(t_0) = 0. \]
This shows that the gradient vector ∇f (a, b) is orthogonal to the vector $\mathbf{c}'(t_0)$, which
is tangent to the level curve at the point (a, b).

Thus, we conclude that the gradient ∇f (a, b) is perpendicular to the tangent line to the
level curve f (x, y) = c at the point (a, b). Provided ∇f (a, b) ≠ 0, the equation of the tangent line at the point
(a, b) is then given by

\[ f_x(a, b)(x - a) + f_y(a, b)(y - b) = 0. \]

Level Surfaces in Three Dimensions: In a similar fashion, we consider a scalar function
w = f (x, y, z) and suppose the equation f (x, y, z) = k defines a level surface in three-dimensional
space. Let c(t) = (x(t), y(t), z(t)) be a smooth curve lying entirely on the level
surface. Then for all t, we have:
\[ f(\mathbf{c}(t)) = k. \]
Differentiating both sides with respect to t gives
\[ \frac{d}{dt} f(\mathbf{c}(t)) = 0. \]
By the chain rule, this gives
\[ \nabla f(\mathbf{c}(t)) \cdot \mathbf{c}'(t) = 0. \]
At $t = t_0$, if $\mathbf{c}(t_0) = (a, b, c)$, then:
\[ \nabla f(a, b, c) \cdot \mathbf{c}'(t_0) = 0. \]
So, the gradient vector ∇f (a, b, c) is perpendicular to the tangent vector of any curve lying
on the level surface and passing through (a, b, c). This implies that the gradient vector is
normal to the surface itself at that point.

Thus, we conclude that the gradient ∇f (a, b, c) is perpendicular to the tangent plane to the
level surface f (x, y, z) = k at the point (a, b, c). Provided ∇f (a, b, c) ≠ 0, the equation of the tangent plane at
the point (a, b, c) is then given by

\[ f_x(a, b, c)(x - a) + f_y(a, b, c)(y - b) + f_z(a, b, c)(z - c) = 0. \]

Conclusion: In both two and three dimensions, the gradient vector of a scalar function at
a point is orthogonal to the level set (curve or surface) that passes through that point. In
particular:
• In 2D, ∇f (x, y) is perpendicular to the tangent line of the level curve.

• In 3D, ∇f (x, y, z) is perpendicular to the tangent plane of the level surface.

The gradient vector is normal to the level set and points in the direction of fastest increase.
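
The tangent line and tangent plane equations share one pattern: ∇f(p) · (x − p) = 0, or equivalently ∇f(p) · x = ∇f(p) · p. The following sketch, with the hypothetical helper `tangent_at`, returns the coefficients and the right-hand side of that equation in any dimension; the gradients are supplied by hand for a circle and a sphere.

```python
def tangent_at(gradient, point):
    """Return (coefficients, rhs) of the tangent line or plane at `point`.

    The equation grad . (x - p) = 0 is rewritten as grad . x = grad . p,
    so the coefficients are the gradient components and rhs = grad . p.
    """
    if all(g == 0 for g in gradient):
        raise ValueError("the gradient vanishes, so it gives no normal direction")
    rhs = sum(g * p for g, p in zip(gradient, point))
    return tuple(gradient), rhs


def grad_circle(x, y):
    """Gradient of f(x, y) = x^2 + y^2."""
    return (2 * x, 2 * y)


def grad_sphere(x, y, z):
    """Gradient of f(x, y, z) = x^2 + y^2 + z^2."""
    return (2 * x, 2 * y, 2 * z)


# Circle x^2 + y^2 = 25 at (3, 4): 6x + 8y = 50, i.e. 3x + 4y = 25.
print(tangent_at(grad_circle(3, 4), (3, 4)))
# Sphere x^2 + y^2 + z^2 = 36 at (6, 0, 0): 12x = 72, i.e. x = 6.
print(tangent_at(grad_sphere(6, 0, 0), (6, 0, 0)))
```
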

ILLUSTRATIVE EXAMPLES

Example 103. Find a normal vector and the equation of the tangent line to the circle
$x^2 + y^2 = 25$ at the point (3, 4).

Solution. Let $f(x, y) = x^2 + y^2$. Then $\nabla f(x, y) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) = (2x, 2y)$. At the point (3, 4),
the gradient is ∇f (3, 4) = (6, 8), which is a normal vector to the circle at (3, 4). The
equation of the tangent line to the circle at the point (3, 4) is therefore

6(x − 3) + 8(y − 4) = 0.

After simplification, the equation is 3x + 4y = 25.


Example 104. Find the equation of the tangent line to the ellipse $x^2 + 4y^2 = 20$ at the point
(2, 2).

Solution. Let $f(x, y) = x^2 + 4y^2$. Then $\nabla f(x, y) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) = (2x, 8y)$. At the point
(2, 2), the gradient becomes
∇f (2, 2) = (4, 16),
which is a normal vector to the ellipse at the point (2, 2). Also, the equation of the tangent
line is
4(x − 2) + 16(y − 2) = 0.
After simplification, this gives x + 4y = 10.

Example 105. Find a normal vector and the equation of the tangent plane to the sphere
$x^2 + y^2 + z^2 = 36$ at the point (6, 0, 0).

Solution. Let $f(x, y, z) = x^2 + y^2 + z^2$. Then ∇f (x, y, z) = (2x, 2y, 2z). At the point
(6, 0, 0), we have ∇f (6, 0, 0) = (12, 0, 0), which is a normal vector to the sphere at the point
(6, 0, 0). Also, the equation of the tangent plane is

12(x − 6) + 0(y − 0) + 0(z − 0) = 0.

Simplifying, we get x = 6.

Example 106. Find a normal vector and the equation of the tangent plane to the ellipsoid
$\frac{x^2}{4} + \frac{y^2}{9} + \frac{z^2}{16} = 1$ at the point (2, 0, 0).

Solution. Let $f(x, y, z) = \frac{x^2}{4} + \frac{y^2}{9} + \frac{z^2}{16}$. Then $\nabla f(x, y, z) = \left( \frac{x}{2}, \frac{2y}{9}, \frac{z}{8} \right)$. At the point (2, 0, 0),
the gradient becomes ∇f (2, 0, 0) = (1, 0, 0), which is a normal vector to the ellipsoid at
(2, 0, 0). So, the equation of the tangent plane is:

1(x − 2) + 0(y − 0) + 0(z − 0) = 0.

Thus, the tangent plane is x = 2.

4.5 Electric Field (Optional)


Coulomb's Law: Coulomb's law states that the electric force between two charged
particles is directly proportional to the product of their charges and inversely proportional
to the square of the distance between them. The formula is given by:
\[ F = \frac{1}{4\pi\varepsilon_0}\, \frac{Q_1 Q_2}{r^2}, \]
where F is the magnitude of the electric force, $Q_1$ and $Q_2$ are the electric charges, r is the
distance between the charges, and $\varepsilon_0$ is the permittivity of free space, given by
\[ \varepsilon_0 = 8.854 \times 10^{-12}\ \mathrm{F/m} \approx \frac{1}{36\pi} \times 10^{-9}\ \mathrm{F/m}. \]
The force acts along the line connecting the two charges. If the charges are like (both positive
or both negative), they repel each other. If they are unlike (one positive and one negative),
they attract each other.

If the International System of Units (SI) is used, the magnitude of the electric force F is measured
in newtons (N), the electric charges $Q_1$ and $Q_2$ are measured in coulombs (C), and the
distance r between the charges is measured in meters (m). The constant $k = \frac{1}{4\pi\varepsilon_0}$ is known
as Coulomb's constant and has a value of approximately $8.99 \times 10^{9}\ \mathrm{N \cdot m^2/C^2}$.

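Since force is a vector quantity, it is better to express it in vector terms. If $\mathbf{r}_1$ locates $Q_1$ and
$\mathbf{r}_2$ locates $Q_2$, then the vector $\mathbf{R}_{12} = \mathbf{r}_2 - \mathbf{r}_1$ represents the directed line segment from $Q_1$ to
$Q_2$, and we write $R_{12} = \|\mathbf{R}_{12}\|$ for its length. Let $\mathbf{F}_2$ denote the force on $Q_2$. The vector form
of Coulomb's law is then
\[ \mathbf{F}_2 = \frac{1}{4\pi\varepsilon_0}\, \frac{Q_1 Q_2}{R_{12}^{3}}\, (\mathbf{r}_2 - \mathbf{r}_1). \]
When $Q_1$ and $Q_2$ have the same sign, $\mathbf{F}_2$ points along $\mathbf{R}_{12}$ (repulsion); when they have opposite
signs, it points along $-\mathbf{R}_{12}$ (attraction).

If we take a unit vector $\mathbf{a}_{12}$ in the direction of $\mathbf{R}_{12}$, then Coulomb's law can be written
as
\[ \mathbf{F}_2 = \frac{1}{4\pi\varepsilon_0}\, \frac{Q_1 Q_2}{R_{12}^{2}}\, \mathbf{a}_{12}. \]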
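
Example 107. Let a charge $Q_1 = 3 \times 10^{-4}$ C be located at M (1, 2, 3) and a charge
$Q_2 = -10^{-4}$ C be at N (2, 0, 5) in a vacuum. Find the force on $Q_2$ by $Q_1$.

Solution: First we find the vector $\mathbf{R}_{12}$, which is given by
\[ \mathbf{R}_{12} = (2, 0, 5) - (1, 2, 3) = (1, -2, 2) = \mathbf{i} - 2\mathbf{j} + 2\mathbf{k}. \]
The norm of $\mathbf{R}_{12}$ is $\|\mathbf{R}_{12}\| = \sqrt{1^2 + (-2)^2 + 2^2} = 3$. Thus, the unit vector $\mathbf{a}_{12}$ is given by
\[ \mathbf{a}_{12} = \frac{1}{3}\,(\mathbf{i} - 2\mathbf{j} + 2\mathbf{k}). \]
Therefore, the force $\mathbf{F}_2$ on $Q_2$ due to $Q_1$ is given by
\[ \mathbf{F}_2 = \frac{(3 \times 10^{-4})(-10^{-4})}{4\pi\,(1/36\pi)(10^{-9})\,3^2} \left( \frac{\mathbf{i} - 2\mathbf{j} + 2\mathbf{k}}{3} \right) = -30 \left( \frac{\mathbf{i} - 2\mathbf{j} + 2\mathbf{k}}{3} \right)\ \mathrm{N}. \]
Thus the force is given by the vector
\[ \mathbf{F}_2 = (-10\mathbf{i} + 20\mathbf{j} - 20\mathbf{k})\ \mathrm{N}. \]
The magnitude of the force is 30 N, and its direction is opposite to $\mathbf{a}_{12}$, i.e., from $Q_2$ toward $Q_1$, as expected for charges of opposite sign.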
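
The vector form of Coulomb's law translates directly into code. The sketch below uses the rounded constant 9 × 10⁹ N·m²/C² (equivalent to the approximation ε₀ ≈ 10⁻⁹/36π used in the solution) and the charge values that appear in the computation of Example 107; the function name `coulomb_force_on` is a hypothetical choice.

```python
import math

K = 9e9   # rounded value of 1/(4 pi eps0) in N m^2 / C^2


def coulomb_force_on(q_target, r_target, q_source, r_source, k=K):
    """Force on q_target due to q_source: k q_t q_s (r_t - r_s) / |r_t - r_s|^3."""
    # Separation vector from the source charge to the target charge.
    sep = tuple(t - s for t, s in zip(r_target, r_source))
    dist = math.sqrt(sum(c * c for c in sep))
    if dist == 0:
        raise ValueError("the two charges must be at different points")
    scale = k * q_target * q_source / dist ** 3
    return tuple(scale * c for c in sep)


# Force on Q2 = -1e-4 C at N(2, 0, 5) due to Q1 = 3e-4 C at M(1, 2, 3).
force = coulomb_force_on(-1e-4, (2, 0, 5), 3e-4, (1, 2, 3))
print(force)                                     # components in newtons
print(math.sqrt(sum(c * c for c in force)))      # magnitude in newtons
```

Because the separation vector always runs from the source to the target, the sign of the product of the charges alone decides whether the force is repulsive or attractive.
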

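Example 108. A charge $Q_A = -20\ \mu\mathrm{C}$ is located at A(−6, 4, 7) and a charge $Q_B = 50\ \mu\mathrm{C}$
is at B(5, 8, −2) in free space. If distances are given in meters, find the vector force exerted
on $Q_A$ by $Q_B$.

Solution: First we find the vector $\mathbf{R}_{AB}$, which is given by
\[ \mathbf{R}_{AB} = \mathbf{r}_B - \mathbf{r}_A = (5, 8, -2) - (-6, 4, 7) = (11, 4, -9) = 11\mathbf{i} + 4\mathbf{j} - 9\mathbf{k}. \]
The norm of $\mathbf{R}_{AB}$ is
\[ \|\mathbf{R}_{AB}\| = \sqrt{11^2 + 4^2 + (-9)^2} = \sqrt{121 + 16 + 81} = \sqrt{218}. \]
The unit vector $\mathbf{a}_{AB}$ is given by
\[ \mathbf{a}_{AB} = \frac{1}{\sqrt{218}}\,(11\mathbf{i} + 4\mathbf{j} - 9\mathbf{k}). \]
Now, we compute the force $\mathbf{F}_A$ on $Q_A$ due to $Q_B$. In the vector form of Coulomb's law, the unit
vector points from the source charge to the charge experiencing the force; here that is
$\mathbf{a}_{BA} = -\mathbf{a}_{AB}$, so
\[ \mathbf{F}_A = \frac{1}{4\pi\varepsilon_0} \cdot \frac{Q_A Q_B}{\|\mathbf{R}_{AB}\|^2}\, \mathbf{a}_{BA}. \]
Using $Q_A = -20 \times 10^{-6}$ C, $Q_B = 50 \times 10^{-6}$ C, and $\frac{1}{4\pi\varepsilon_0} = 9 \times 10^{9}\ \mathrm{N \cdot m^2/C^2}$, we get
\[ \mathbf{F}_A = 9 \times 10^{9} \cdot \frac{(-20 \times 10^{-6})(50 \times 10^{-6})}{218} \cdot \left( -\frac{1}{\sqrt{218}} \right)(11\mathbf{i} + 4\mathbf{j} - 9\mathbf{k}) = \frac{9}{218\sqrt{218}}\,(11\mathbf{i} + 4\mathbf{j} - 9\mathbf{k})\ \mathrm{N}, \]
that is,
\[ \mathbf{F}_A \approx (30.76\,\mathbf{i} + 11.18\,\mathbf{j} - 25.16\,\mathbf{k}) \times 10^{-3}\ \mathrm{N}. \]
Thus, the force on $Q_A$ points in the direction of $\mathbf{R}_{AB}$, that is, toward $Q_B$ (because the
charges are opposite in sign and therefore attract), with magnitude $9/218 \approx 0.0413$ N, proportional to the
product of the magnitudes of the charges and inversely proportional to the square of the distance.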

Electric Field: The electric field at a point in space is a vector quantity that describes
the force experienced by a unit positive test charge placed at that point due to the presence
of other electric charges. We now describe the electric field due to a point charge and
that of a dipole.

Electric Field of Point Charge: Mathematically, if a point charge Q is located at a
position $\mathbf{r}_0 \in \mathbb{R}^3$, then the electric field E at any point $\mathbf{r} = (x, y, z) \in \mathbb{R}^3 \setminus \{\mathbf{r}_0\}$ is given by:
\[ \mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \cdot \frac{Q(\mathbf{r} - \mathbf{r}_0)}{|\mathbf{r} - \mathbf{r}_0|^3}, \]
where $\varepsilon_0 \approx 8.854 \times 10^{-12}\ \mathrm{C^2/(N \cdot m^2)}$ is the permittivity of free space, Q is the
value of the point charge (in coulombs), r is the observation point, and $\mathbf{r}_0$ is the location of the charge.

For the sake of simplicity, we assume the point charge Q is located at the origin, i.e., $\mathbf{r}_0 = (0, 0, 0)$.
Then $\mathbf{r} - \mathbf{r}_0 = (x, y, z)$, and the electric field becomes
\[ \mathbf{E}(x, y, z) = \frac{1}{4\pi\varepsilon_0} \cdot \frac{Q(x, y, z)}{(x^2 + y^2 + z^2)^{3/2}}. \]
In component form the electric field is written as
\[ \mathbf{E}(x, y, z) = \left( \frac{kQx}{(x^2 + y^2 + z^2)^{3/2}},\; \frac{kQy}{(x^2 + y^2 + z^2)^{3/2}},\; \frac{kQz}{(x^2 + y^2 + z^2)^{3/2}} \right), \]
where $k = \frac{1}{4\pi\varepsilon_0} \approx 8.988 \times 10^{9}\ \mathrm{N \cdot m^2/C^2}$. For a two-dimensional space, the electric field due
to a point charge is
\[ \mathbf{E}(x, y) = \left( \frac{kQx}{(x^2 + y^2)^{3/2}},\; \frac{kQy}{(x^2 + y^2)^{3/2}} \right). \]
A plot of the electric field is given in Figure 4.5.1. The illustration visually represents
the electric field generated by point charges. On the left, the diagram shows the
field of a positive point charge, where field vectors radiate outward symmetrically in all
directions, indicating the repulsive effect of a positive charge on a positive test charge. On the right, the diagram
displays the field of a negative point charge, where field vectors point inward, converging
toward the charge, reflecting the attractive effect of a negative charge. These vector diagrams
help in understanding both the direction and relative strength of the electric field at various
points in space around the charges.

Figure 4.5.1: Electric Field Plot

Example 109. A point charge $Q = 1\ \mu\mathrm{C} = 1 \times 10^{-6}$ C is placed at the origin in the xy-plane.
Let r = (x, y) be the position vector of a point in the plane where the electric field is
to be computed.
a) Derive the expression for the 2D electric field vector E(x, y) at any point in the plane
(neglecting the z-component).
b) Compute the electric field vector E at the point (3, 4) and write it in vector form.

Solution: (a) In two dimensions, the electric field due to a point charge located at the
origin is given by
\[ \mathbf{E}(x, y) = \frac{1}{4\pi\varepsilon_0} \cdot \frac{Q(x, y)}{(x^2 + y^2)^{3/2}}, \]
where $\frac{1}{4\pi\varepsilon_0} \approx 9 \times 10^{9}\ \mathrm{N \cdot m^2/C^2}$. Substituting $Q = 1 \times 10^{-6}$ C, we get
\[ \mathbf{E}(x, y) = \frac{9 \times 10^{9} \cdot 1 \times 10^{-6}}{(x^2 + y^2)^{3/2}}\,(x, y) = \frac{9000}{(x^2 + y^2)^{3/2}}\,(x, y). \]
Thus, the electric field vector in the plane is
\[ \mathbf{E}(x, y) = \left( \frac{9000x}{(x^2 + y^2)^{3/2}},\; \frac{9000y}{(x^2 + y^2)^{3/2}} \right). \]
A plot of this electric field is given in Figure 4.5.2.

Figure 4.5.2: Electric Field

(b) To compute the electric field at (3, 4), we first find the squared distance as below:
\[ x^2 + y^2 = 3^2 + 4^2 = 9 + 16 = 25. \]
Then compute the denominator:
\[ (x^2 + y^2)^{3/2} = 25^{3/2} = (5^2)^{3/2} = 5^3 = 125. \]
Now, plugging these into the formula, we get
\[ \mathbf{E}(3, 4) = \frac{9000}{125}\,(3, 4) = 72 \cdot (3, 4), \]
so that
\[ \mathbf{E}(3, 4) = (216, 288)\ \mathrm{N/C}. \]
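
Part (b) can be reproduced with a short function. This is a minimal sketch of the planar formula above with the rounded constant k = 9 × 10⁹ N·m²/C²; the name `electric_field_2d` is an assumption made for the illustration.

```python
K = 9e9   # rounded value of 1/(4 pi eps0) in N m^2 / C^2


def electric_field_2d(x, y, charge, k=K):
    """Field at (x, y) of a point charge at the origin: k Q (x, y) / r^3."""
    r_squared = x * x + y * y
    if r_squared == 0:
        raise ValueError("the field is undefined at the location of the charge")
    scale = k * charge / r_squared ** 1.5
    return (scale * x, scale * y)


# Field of Q = 1 microcoulomb at the point (3, 4), in N/C.
print(electric_field_2d(3, 4, 1e-6))
```
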


Electric Field of an Electric Dipole: An electric dipole consists of two equal and opposite point charges, $+q$ and $-q$, separated by a fixed distance. The electric field generated by such a configuration is the vector sum of the fields due to each charge.

Let the positive charge $+q$ be located at position vector $\mathbf{r}_+$, and the negative charge $-q$ at $\mathbf{r}_-$. The electric field $\mathbf{E}(\mathbf{r})$ at an arbitrary point $\mathbf{r}$ in space is given by
\[
\mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \left[ \frac{q\,(\mathbf{r} - \mathbf{r}_+)}{|\mathbf{r} - \mathbf{r}_+|^{3}} - \frac{q\,(\mathbf{r} - \mathbf{r}_-)}{|\mathbf{r} - \mathbf{r}_-|^{3}} \right].
\]


Here, ε0 is the vacuum permittivity.

Component Form: Assuming the dipole is aligned along the z-axis, with the charges located at $\mathbf{r}_+ = (0, 0, a)$ and $\mathbf{r}_- = (0, 0, -a)$, the electric field at the point $\mathbf{r} = (x, y, z)$ has components
\[
E_x = \frac{q}{4\pi\varepsilon_0} \left[ \frac{x}{(x^2 + y^2 + (z - a)^2)^{3/2}} - \frac{x}{(x^2 + y^2 + (z + a)^2)^{3/2}} \right],
\]
\[
E_y = \frac{q}{4\pi\varepsilon_0} \left[ \frac{y}{(x^2 + y^2 + (z - a)^2)^{3/2}} - \frac{y}{(x^2 + y^2 + (z + a)^2)^{3/2}} \right],
\]
and
\[
E_z = \frac{q}{4\pi\varepsilon_0} \left[ \frac{z - a}{(x^2 + y^2 + (z - a)^2)^{3/2}} - \frac{z + a}{(x^2 + y^2 + (z + a)^2)^{3/2}} \right].
\]
Similar formulae can be derived if the dipole is aligned along the x-axis or the y-axis.

Specific 2D Case: For a planar configuration, consider placing the charges along the y-axis. Let the positive charge $+q$ be at $\mathbf{r}_+ = (0, a)$ and the negative charge $-q$ be at $\mathbf{r}_- = (0, -a)$. Then, the electric field at a point $\mathbf{r} = (x, y)$ in the plane is
\[
\mathbf{E}(x, y) = \frac{1}{4\pi\varepsilon_0} \left[ \frac{q\,(x, y - a)}{[x^2 + (y - a)^2]^{3/2}} - \frac{q\,(x, y + a)}{[x^2 + (y + a)^2]^{3/2}} \right].
\]
Expressing $\mathbf{E}(x, y)$ in terms of its components, we get the x-component as
\[
E_x(x, y) = \frac{q}{4\pi\varepsilon_0} \left[ \frac{x}{[x^2 + (y - a)^2]^{3/2}} - \frac{x}{[x^2 + (y + a)^2]^{3/2}} \right]
\]
and the y-component as
\[
E_y(x, y) = \frac{q}{4\pi\varepsilon_0} \left[ \frac{y - a}{[x^2 + (y - a)^2]^{3/2}} - \frac{y + a}{[x^2 + (y + a)^2]^{3/2}} \right].
\]
Figure (??) shows the electric field lines of a dipole in the xy-plane.
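The planar component formulas can be evaluated directly. The sketch below is an illustration rather than part of the original notes: the function name `dipole_field_2d`, the default values $q = 1\,\mu\text{C}$ and $a = 1$, and the rounded constant `K` $= 9 \times 10^{9}$ are all assumptions made for the example.

```python
# Rounded Coulomb constant 1/(4*pi*eps0) in N*m^2/C^2 (assumption, as in the notes' examples).
K = 9e9


def dipole_field_2d(x, y, q=1e-6, a=1.0, k=K):
    """Return (Ex, Ey) in N/C for +q at (0, a) and -q at (0, -a)."""
    d_plus = (x * x + (y - a) ** 2) ** 1.5   # [x^2 + (y - a)^2]^(3/2)
    d_minus = (x * x + (y + a) ** 2) ** 1.5  # [x^2 + (y + a)^2]^(3/2)
    if d_plus == 0.0 or d_minus == 0.0:
        raise ValueError("the field is undefined at the location of a charge")
    ex = k * q * (x / d_plus - x / d_minus)
    ey = k * q * ((y - a) / d_plus - (y + a) / d_minus)
    return (ex, ey)


# On the x-axis both charges are equally distant, so the x-components cancel
# and the field points in the negative y-direction (from +q toward -q).
on_x_axis = dipole_field_2d(3.0, 0.0)
print(on_x_axis)

# On the y-axis above the positive charge the field points away from +q.
on_y_axis = dipole_field_2d(0.0, 2.0)
print(on_y_axis)
```

At $(0, 2)$ the formula reduces to $9000\,(1/1 - 3/27) = 8000$ N/C in the y-direction, which is a convenient hand check on the signs.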
Example 110. Consider an electric dipole consisting of a positive charge $+q = 1\,\mu\text{C}$ located at $\mathbf{r}_+ = (0, 0, 1)$ and a negative charge $-q = -1\,\mu\text{C}$ located at $\mathbf{r}_- = (0, 0, -1)$. Compute the electric field vector $\mathbf{E}$ at the point (3, 0, 0) and express it in vector form.
Solution: The electric field at a point $\mathbf{r} = (x, y, z)$ due to a point charge $+q$ located at $(0, 0, a)$ and a point charge $-q$ located at $(0, 0, -a)$ is given by
\[
\mathbf{E}(x, y, z) = \frac{q}{4\pi\varepsilon_0} \left[ \frac{(x, y, z - a)}{[x^2 + y^2 + (z - a)^2]^{3/2}} - \frac{(x, y, z + a)}{[x^2 + y^2 + (z + a)^2]^{3/2}} \right].
\]


Figure 4.5.3: Electric Field
Figure 4.5.4: Field Lines

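Therefore, with $a = 1$, $\mathbf{E}$ at the point (3, 0, 0) is given by
\[
\mathbf{E}(3, 0, 0) = \frac{q}{4\pi\varepsilon_0} \left[ \frac{(3, 0, -1)}{10\sqrt{10}} - \frac{(3, 0, 1)}{10\sqrt{10}} \right] = \frac{q}{4\pi\varepsilon_0} \cdot \frac{(0, 0, -2)}{10\sqrt{10}}.
\]
Simplify:
\[
\mathbf{E}(3, 0, 0) = -\frac{q}{4\pi\varepsilon_0} \cdot \frac{(0, 0, 2)}{10\sqrt{10}} = -\frac{q}{4\pi\varepsilon_0} \cdot \frac{(0, 0, 1)}{5\sqrt{10}}.
\]
Substitute $q = 1 \times 10^{-6}\,\text{C}$ and $\frac{1}{4\pi\varepsilon_0} \approx 9 \times 10^{9}\ \text{N}\cdot\text{m}^2/\text{C}^2$:
\[
\mathbf{E}(3, 0, 0) = -9 \times 10^{9} \cdot \frac{1 \times 10^{-6}}{5\sqrt{10}} \cdot (0, 0, 1) = -\frac{9 \times 10^{3}}{5\sqrt{10}} \cdot (0, 0, 1).
\]
Computing the numerical value, we get
\[
\frac{9 \times 10^{3}}{5\sqrt{10}} \approx \frac{9000}{15.811} \approx 569.2.
\]
Therefore, the desired electric field is
\[
\mathbf{E}(3, 0, 0) \approx (0, 0, -569.2)\ \text{N/C}.
\]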
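Since the dipole field is the vector sum of two point-charge fields, the result of Example 110 can also be checked by direct superposition. The following Python sketch is illustrative only; the helper `field_of_charges` and the list-of-pairs representation of the charges are hypothetical choices, and `K` is the rounded constant $9 \times 10^{9}$ used in the example.

```python
import math

# Rounded Coulomb constant 1/(4*pi*eps0) in N*m^2/C^2 (assumption, as in the example).
K = 9e9


def field_of_charges(point, charges, k=K):
    """Sum the fields at `point` due to a list of (charge in C, position) pairs."""
    total = [0.0, 0.0, 0.0]
    for q, position in charges:
        # Displacement r - r_i from the charge to the field point.
        d = [p - s for p, s in zip(point, position)]
        dist = math.sqrt(sum(c * c for c in d))
        if dist == 0.0:
            raise ValueError("the field is undefined at the location of a charge")
        scale = k * q / dist ** 3
        for i in range(3):
            total[i] += scale * d[i]
    return tuple(total)


# Dipole of Example 110: +1 microcoulomb at (0, 0, 1), -1 microcoulomb at (0, 0, -1).
dipole = [(1e-6, (0.0, 0.0, 1.0)), (-1e-6, (0.0, 0.0, -1.0))]
e_field = field_of_charges((3.0, 0.0, 0.0), dipole)
print(e_field)  # x- and y-components vanish; z-component is close to -569.2 N/C
```

Writing the field as a sum over charges, rather than hard-coding the dipole formula, keeps the check independent of the component expressions derived above.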
