1 Chain Rule: 1.1 Composition of Functions

Tornike Kadeishvili
WEEK 5
Reading [SB], 4.1-4.2, pp. 70-81
1 Chain Rule
1.1 Composition of Functions
Suppose $f : X \to Y$ and $g : Y \to Z$. Then the composition $g \circ f : X \to Z$ is defined by $(g \circ f)(x) = g(f(x))$. In this composition $g \circ f$ the function $f$ is the inside function, and the function $g$ is the outside function.
Examples
1. Let $f(x) = x^2$ and $g(x) = 2x + 3$, then $(f \circ g)(x) = (2x + 3)^2$ and $(g \circ f)(x) = 2x^2 + 3$.
2. Let $f(x) = x^3$ and $g(x) = \sqrt[3]{x}$, then $(f \circ g)(x) = (\sqrt[3]{x})^3 = x$ and $(g \circ f)(x) = \sqrt[3]{x^3} = x$, so both compositions are identity functions: $f \circ g = \mathrm{id}$, $g \circ f = \mathrm{id}$.
3. Let $f(x) = e^x$ and $g(x) = \ln x$, then $(f \circ g)(x) = e^{\ln x} = x$ and $(g \circ f)(x) = \ln e^x = x$, so both compositions are identity functions: $f \circ g = \mathrm{id}$, $g \circ f = \mathrm{id}$.
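The examples above can be replayed in a few lines of Python. This is only an illustrative sketch: the helper name `compose` is an assumption made here, not something from the notes, and the last check holds only up to floating-point rounding.

```python
import math

def compose(outer, inner):
    """Return the composition x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

f = lambda x: x ** 2       # f(x) = x^2
g = lambda x: 2 * x + 3    # g(x) = 2x + 3

f_after_g = compose(f, g)  # (f o g)(x) = (2x + 3)^2
g_after_f = compose(g, f)  # (g o f)(x) = 2x^2 + 3

print(f_after_g(1))  # (2*1 + 3)^2 = 25
print(g_after_f(1))  # 2*1^2 + 3 = 5

# Example 3: exp and ln undo each other (up to rounding error).
exp_after_ln = compose(math.exp, math.log)
print(math.isclose(exp_after_ln(2.5), 2.5))  # True
```

Note that the order matters: `f_after_g` and `g_after_f` are different functions, exactly as in Example 1.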
Exercise
For the composite function $(f \circ g)(x) = 5e^{2x} + 3e^x + 1$, what are the inside and outside functions?
Solution. $5e^{2x} + 3e^x + 1 = 5(e^x)^2 + 3e^x + 1$, so the inside function is $g(x) = e^x$ and the outside function is $f(x) = 5x^2 + 3x + 1$.
1.2 Differentiating Composite Functions: the Chain Rule
Theorem. The derivative of the composite function $(h \circ g)(x)$ can be calculated as
$$(h \circ g)'(x) = h'(g(x)) \cdot g'(x)$$
(the chain rule).
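Proof*.
$$(h \circ g)'(x_0) = \lim_{x_1 \to x_0} \frac{(h \circ g)(x_1) - (h \circ g)(x_0)}{x_1 - x_0} = \lim_{x_1 \to x_0} \frac{h(g(x_1)) - h(g(x_0))}{x_1 - x_0}$$
$$= \lim_{x_1 \to x_0} \frac{h(g(x_1)) - h(g(x_0))}{g(x_1) - g(x_0)} \cdot \frac{g(x_1) - g(x_0)}{x_1 - x_0}$$
$$= \lim_{x_1 \to x_0} \frac{h(g(x_1)) - h(g(x_0))}{g(x_1) - g(x_0)} \cdot \lim_{x_1 \to x_0} \frac{g(x_1) - g(x_0)}{x_1 - x_0}$$
$$= \lim_{g(x_1) \to g(x_0)} \frac{h(g(x_1)) - h(g(x_0))}{g(x_1) - g(x_0)} \cdot \lim_{x_1 \to x_0} \frac{g(x_1) - g(x_0)}{x_1 - x_0} = h'(g(x_0)) \cdot g'(x_0).$$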
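Well, this proof has a small gap (the difference $g(x_1) - g(x_0)$ in the denominator may vanish for $x_1 \neq x_0$), but we will not worry about it here.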
In particular,
$$\frac{d}{dx}(g(x))^k = k\,(g(x))^{k-1} \cdot g'(x).$$
Examples
1. Find the derivative of $f(x) = (2x + 3)^7$.
Solution. The function $f(x)$ is a composition $f(x) = h(g(x))$ with $g(x) = 2x + 3$ and $h(z) = z^7$. Thus, by the chain rule,
$$f'(x) = h'(g(x)) \cdot g'(x) = 7(2x + 3)^6 \cdot (2x + 3)' = 7(2x + 3)^6 \cdot 2 = 14(2x + 3)^6.$$
2. A firm computes that at the present moment its output is increasing at
the rate of 2 units per hour and that its marginal cost is 12. At what rate is
its cost increasing per hour?
Solution. Let $x(t)$ be the production function (output $x$ depends on time $t$); at the present moment $t = t_0$ we have $x'(t_0) = 2$. Let $C(x)$ be the cost function, so we have $C'(x_0) = 12$, where $x_0 = x(t_0)$. Then
$$\frac{dC}{dt}(t_0) = \frac{dC}{dx}(x(t_0)) \cdot \frac{dx}{dt}(t_0) = 12 \cdot 2 = 24.$$
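Both examples above can be checked numerically. The sketch below compares the chain-rule value for $f(x) = (2x+3)^7$ with a central-difference approximation; the helper names `central_difference` and `chain_rule`, the step size and the sample point `x0 = 0.5` are assumptions chosen for illustration only.

```python
import math

def central_difference(func, x, step=1e-6):
    """Approximate func'(x) by a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)

def chain_rule(outer_prime, inner, inner_prime, x):
    """Evaluate h'(g(x)) * g'(x)."""
    return outer_prime(inner(x)) * inner_prime(x)

g = lambda x: 2 * x + 3          # inside function
g_prime = lambda x: 2.0
h = lambda z: z ** 7             # outside function
h_prime = lambda z: 7 * z ** 6

f = lambda x: h(g(x))            # f(x) = (2x + 3)^7

x0 = 0.5                         # arbitrary sample point
exact = chain_rule(h_prime, g, g_prime, x0)   # 14 * (2*0.5 + 3)^6
numeric = central_difference(f, x0)
print(math.isclose(exact, numeric, rel_tol=1e-6))  # True

# Example 2: dC/dt = C'(x0) * x'(t0) with the values given in the text.
marginal_cost = 12
output_rate = 2
print(marginal_cost * output_rate)  # 24
```

The finite-difference value is only an approximation, so the comparison uses a relative tolerance rather than exact equality.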
Exercises 4.1-4.6
2 Again About Functions
A function (map, transformation) from the set $X$ (domain, or source) to the set $Y$ (codomain, or target),
$$f : X \to Y,$$
is a rule that assigns to each element $x \in X$ exactly one element $f(x) \in Y$.

The image of $f$ is the set of all elements $y \in Y$ that correspond to some $x \in X$:
$$\mathrm{Im}\, f = \{y \in Y : y = f(x) \text{ for some } x \in X\}.$$
For an element $y \in Y$ its preimage $f^{-1}(y)$ is the set of all elements $x \in X$ such that $f(x) = y$:
$$f^{-1}(y) = \{x \in X : f(x) = y\}.$$
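For finite sets, the image and the preimage can be computed directly from the definitions. The following is a minimal sketch; the function names `image` and `preimage` and the sample set `X` are assumptions made for this illustration.

```python
def image(f, domain):
    """Im f: all values f(x) for x in the (finite) domain."""
    return {f(x) for x in domain}

def preimage(f, domain, y):
    """f^{-1}(y): all x in the (finite) domain with f(x) == y."""
    return {x for x in domain if f(x) == y}

X = {-2, -1, 0, 1, 2}
square = lambda x: x * x

print(sorted(image(square, X)))        # [0, 1, 4]
print(sorted(preimage(square, X, 4)))  # [-2, 2]
print(sorted(preimage(square, X, 3)))  # []
```

The preimage of a single element is a set: it may contain several elements, as for $y = 4$, or none at all, as for $y = 3$.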
2.1 Again About Surjections, Injections, Bijections
A function $f : X \to Y$ is called surjective (onto) if
$$\forall y \in Y \ \exists x \in X \text{ s.t. } f(x) = y.$$
A function $f : X \to Y$ is called injective (one-to-one) if
$$f(x_1) = f(x_2) \ \Rightarrow\ x_1 = x_2.$$
A function is called a bijection if it is a surjection and an injection simultaneously.

In other words, for every $y \in Y$:
$f$ is a surjection if the equation $f(x) = y$ has at least one solution;
$f$ is an injection if the equation $f(x) = y$ has at most one solution;
$f$ is a bijection if the equation $f(x) = y$ has exactly one solution.
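On finite sets these three properties can be tested by brute force. The sketch below is an illustration under that finiteness assumption; the helper names are hypothetical and not part of the notes.

```python
def is_injective(f, domain):
    """True if no two elements of the finite domain share a value."""
    values = [f(x) for x in domain]
    return len(values) == len(set(values))

def is_surjective(f, domain, codomain):
    """True if every element of the finite codomain is attained by f."""
    return set(codomain) <= {f(x) for x in domain}

def is_bijective(f, domain, codomain):
    """True if f is both injective and surjective."""
    return is_injective(f, domain) and is_surjective(f, domain, codomain)

square = lambda x: x * x

print(is_injective(square, [-2, -1, 0, 1, 2]))              # False: (-2)^2 == 2^2
print(is_surjective(square, [-2, -1, 0, 1, 2], [0, 1, 4]))  # True
print(is_bijective(square, [0, 1, 2], [0, 1, 4]))           # True
print(is_surjective(square, [0, 1, 2], [0, 1, 2, 3, 4]))    # False: 2 and 3 are never attained
```

The same rule $x \mapsto x^2$ gives different answers depending on the chosen domain and codomain, which is why both are part of the definition of a function.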
2.2 Inverse Function
When $f : X \to Y$ is bijective, there is an inverse function $g : Y \to X$ which assigns to $y \in Y$ the unique element $g(y) = x$ such that $f(x) = y$.
Definition. A function $g : Y \to X$ is the inverse of $f$ if $g(f(x)) = x$ and $f(g(y)) = y$ for arbitrary $x \in X$ and $y \in Y$. In other words,
$$f \circ g = \mathrm{id}_Y, \quad g \circ f = \mathrm{id}_X.$$
If $f$ is invertible, then its inverse function is often denoted by $f^{-1}$.
Theorem 1. If $f : X \to Y$ is invertible then it is a bijection.
Proof. Let $g$ be an inverse of $f$.
(i) Surjectivity. For any $y \in Y$ we must find $x \in X$ s.t. $f(x) = y$. Let us take $x := g(y)$. Then
$$f(x) = f(g(y)) = y \quad \text{since } f \circ g = \mathrm{id}_Y,$$
QED.
(ii) Injectivity. Suppose $f(x_1) = f(x_2)$; we must show that $x_1 = x_2$. Indeed,
$$f(x_1) = f(x_2) \ \Rightarrow\ g(f(x_1)) = g(f(x_2)) \ \Rightarrow\ x_1 = x_2 \quad \text{since } g \circ f = \mathrm{id}_X,$$
QED.
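Theorem 2. If $f : X \to Y$ is invertible then its inverse is uniquely determined.

Proof. Suppose $g, h : Y \to X$ are two inverses of $f$:
$$f \circ g = \mathrm{id}_Y,\ g \circ f = \mathrm{id}_X \quad \text{and} \quad f \circ h = \mathrm{id}_Y,\ h \circ f = \mathrm{id}_X.$$
We claim that $g = h$, i.e. $g(y) = h(y)$ for arbitrary $y \in Y$. Indeed, by the bijectivity (in fact, by the surjectivity) of $f$,
$$\exists x \in X \text{ s.t. } f(x) = y.$$
Then
$$g(y) = g(f(x)) = x \quad \text{and} \quad h(y) = h(f(x)) = x \quad \text{since } g \circ f = h \circ f = \mathrm{id}_X,$$
thus $g(y) = h(y)$, QED.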
Theorem 3. A continuous function $f$ defined on an interval $I \subset \mathbb{R}$ is invertible if and only if it is strictly monotonically increasing or strictly monotonically decreasing.
Examples
1. The function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = x^2$ is not invertible (why?), but the function $f : [0, \infty) \to [0, \infty)$ is: the inverse function $g = f^{-1} : [0, \infty) \to [0, \infty)$ is $g(y) = \sqrt{y} = y^{1/2}$. Indeed,
$$f(g(y)) = (\sqrt{y})^2 = y, \quad g(f(x)) = \sqrt{x^2} = x.$$

Remark. This example shows that in the definition of inverse function both conditions
$$f \circ g = \mathrm{id}, \quad g \circ f = \mathrm{id}$$
are essential: here we have $f(g(x)) = (\sqrt{x})^2 = x$, i.e. the first condition $f \circ g = \mathrm{id}$ is satisfied, but $g(f(-3)) = \sqrt{(-3)^2} = \sqrt{9} = 3 \neq -3$, that is, the second condition $g \circ f = \mathrm{id}$ is not satisfied for $f : \mathbb{R} \to [0, \infty)$.
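For $f(x) = x^2$, the properties depend on the chosen domain and codomain:
$f : \mathbb{R} \to \mathbb{R}$: neither injective nor surjective;
$f : [0, +\infty) \to \mathbb{R}$: injective but not surjective;
$f : \mathbb{R} \to [0, +\infty)$: not injective but surjective;
$f : [0, +\infty) \to [0, +\infty)$: injective and surjective.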
2. The function $f : \mathbb{R} \to \mathbb{R}_+$ given by $f(x) = e^x$ is invertible, and its inverse is $g : \mathbb{R}_+ \to \mathbb{R}$ given by $g(y) = \ln y$ (why?).
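The two examples above behave differently when evaluated, and a short Python sketch makes the difference visible. The sample values are arbitrary choices for illustration, and the `exp`/`ln` checks hold only up to floating-point rounding.

```python
import math

# Example 1: f(x) = x^2 on all of R, with g(y) = sqrt(y).
f = lambda x: x ** 2
g = math.sqrt

print(f(g(9.0)))   # 9.0  -> f o g = id holds on [0, inf)
print(g(f(-3.0)))  # 3.0  -> g o f != id on R, since the input was -3.0

# Example 2: exp and ln are inverse to each other on R and R_+.
print(math.isclose(math.log(math.exp(-3.0)), -3.0))  # True
print(math.isclose(math.exp(math.log(5.0)), 5.0))    # True
```

Restricting the squaring function to $[0, \infty)$ removes the failing case, because negative inputs such as $-3$ are no longer in the domain.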
Exercise
Calculate an expression for the inverse of the function $y = \frac{1}{x+1}$, specifying the domain.
Solution. Solve the equation $y = \frac{1}{x+1}$ for $x$:
$$y(x + 1) = 1, \quad x + 1 = \frac{1}{y}, \quad x = \frac{1}{y} - 1.$$
So the inverse function for $f(x) = \frac{1}{x+1}$ is $g(y) = \frac{1}{y} - 1$; indeed,
$$f(g(y)) = \frac{1}{\left(\frac{1}{y} - 1\right) + 1} = \frac{1}{\frac{1}{y}} = y$$
and
$$g(f(x)) = \frac{1}{\frac{1}{x+1}} - 1 = (x + 1) - 1 = x.$$
The domain of the inverse function is $(-\infty, 0) \cup (0, \infty)$.
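The two identities in this solution can be verified with exact rational arithmetic. The sketch below uses Python's `fractions.Fraction` to avoid rounding; the sample points are arbitrary choices that avoid $x = -1$ and $y = 0$.

```python
from fractions import Fraction

def f(x):
    """f(x) = 1 / (x + 1), defined for x != -1."""
    return 1 / (x + 1)

def g(y):
    """g(y) = 1 / y - 1, defined for y != 0."""
    return 1 / y - 1

samples_x = [Fraction(-3), Fraction(-1, 2), Fraction(0), Fraction(2)]
samples_y = [Fraction(-4), Fraction(-1, 3), Fraction(1, 2), Fraction(5)]

print(all(g(f(x)) == x for x in samples_x))  # True
print(all(f(g(y)) == y for y in samples_y))  # True
```

A finite set of sample points does not prove the identities, of course; it only guards against algebra slips in the derivation.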

Notice that the condition $f \circ g = \mathrm{id}$ alone guarantees the surjectivity of $f$; the condition $g \circ f = \mathrm{id}$ alone guarantees the injectivity of $f$; and only both conditions $f \circ g = \mathrm{id}$, $g \circ f = \mathrm{id}$ together guarantee the bijectivity of $f$, and consequently its invertibility.
2.2.1 Graph of Inverse Function
Suppose $f$ is invertible and $g$ is its inverse. This means that if $f(a) = b$ then $g(b) = a$.
Suppose a point $(a, b)$ belongs to the graph of $f$ (notation: $(a, b) \in \Gamma(f)$), i.e. $f(a) = b$. Then we have $g(b) = a$, thus the point $(b, a)$ belongs to the graph of $g$. In short,
$$(a, b) \in \Gamma(f) \ \Rightarrow\ f(a) = b \ \Rightarrow\ g(b) = a \ \Rightarrow\ (b, a) \in \Gamma(g).$$
Similarly,
$$(b, a) \in \Gamma(g) \ \Rightarrow\ g(b) = a \ \Rightarrow\ f(a) = b \ \Rightarrow\ (a, b) \in \Gamma(f).$$
This means that the graphs of $f$ and $g$ are symmetric with respect to the bisector $y = x$.
[Figures: graphs of $f(x) = x^2$ and $g(x) = \sqrt{x}$; graphs of $f(x) = e^x$ and $g(x) = \ln x$.]
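The symmetry of the two graphs can be illustrated on a handful of points. In this sketch the helper names `graph` and `reflect` are assumptions, and `math.isqrt` (the integer square root) stands in for $\sqrt{y}$ on perfect squares so that the comparison is exact.

```python
import math

def graph(func, inputs):
    """Finite piece of the graph: the set of points (a, func(a))."""
    return {(a, func(a)) for a in inputs}

def reflect(points):
    """Mirror points across the line y = x by swapping coordinates."""
    return {(b, a) for (a, b) in points}

square = lambda a: a * a

graph_f = graph(square, range(5))                # (0,0), (1,1), (2,4), (3,9), (4,16)
graph_g = graph(math.isqrt, [0, 1, 4, 9, 16])    # (0,0), (1,1), (4,2), (9,3), (16,4)

print(reflect(graph_f) == graph_g)  # True
```

Swapping the coordinates of every point is exactly the reflection across $y = x$, which is why the reflected graph of $f$ coincides with the graph of $g$.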
2.2.2 The Derivative of the Inverse Function
Theorem 4. Let $f$ be a $C^1$ function on an interval $I \subset \mathbb{R}$ and $f'(x) \neq 0$ for all $x \in I$. Then $f$ is invertible on $I$, its inverse $g$ is $C^1$ on the interval $f(I)$ and
$$g'(y) = \frac{1}{f'(g(y))}.$$
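The formula of Theorem 4 can be sanity-checked numerically. The sketch below reuses the pair $f(x) = \frac{1}{x+1}$, $g(y) = \frac{1}{y} - 1$ from the earlier exercise, on the interval $I = (-1, \infty)$; the helper `central_difference`, the step size and the sample point `y0 = 0.25` are assumptions made for illustration.

```python
import math

def central_difference(func, x, step=1e-6):
    """Approximate func'(x) by a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)

f_prime = lambda x: -1 / (x + 1) ** 2   # derivative of f(x) = 1 / (x + 1)
g = lambda y: 1 / y - 1                 # inverse of f

y0 = 0.25                               # arbitrary point in f(I) = (0, inf)
by_formula = 1 / f_prime(g(y0))         # 1 / f'(g(y0)), here 1 / f'(3)
by_difference = central_difference(g, y0)

print(by_formula)                                              # -16.0
print(math.isclose(by_formula, by_difference, rel_tol=1e-6))   # True
```

The formula needs only $f'$ and the value $g(y_0)$, so it applies even when no closed form for $g'$ is available.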
Proof. Invertibility of $f$ on $I$ follows from its monotonicity. Suppose $g = f^{-1}$, then $f(g(y)) = y$ for each $y \in f(I)$. Differentiating this equality using the chain rule we obtain
$$f'(g(y)) \cdot g'(y) = y' = 1,$$
thus $g'(y) = \frac{1}{f'(g(y))}$.
2.2.3 Application*
The formula
$$(x^k)' = k x^{k-1}$$
was proven only for natural $k$. The above theorem allows us to generalize this formula to arbitrary rational $k$:
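1. The function $g(y) = y^{1/n}$ is the inverse of $f(x) = x^n$ (why?). This allows us to calculate the derivative of $g(y) = y^{1/n}$:
$$(y^{1/n})' = g'(y) = \frac{1}{f'(g(y))} = \frac{1}{n\,(g(y))^{n-1}} = \frac{1}{n\,(y^{1/n})^{n-1}} = \frac{1}{n\,y^{\frac{n-1}{n}}} = \frac{1}{n}\, y^{\frac{1-n}{n}} = \frac{1}{n}\, y^{\frac{1}{n} - 1}.$$
2. Now take an arbitrary rational number $\frac{m}{n} \in \mathbb{Q}$. Let us prove that
$$(x^{m/n})' = \frac{m}{n}\, x^{\frac{m}{n} - 1}.$$
Indeed, first let us assume that $m, n \in \mathbb{N}$, i.e. $q = \frac{m}{n}$ is a positive rational number. Since $x^{m/n} = (x^{1/n})^m$, by the Chain Rule we have
$$(x^{m/n})' = ((x^{1/n})^m)' = m\,(x^{1/n})^{m-1} \cdot (x^{1/n})' = m\, x^{\frac{m-1}{n}} \cdot \frac{1}{n}\, x^{\frac{1-n}{n}} = \frac{m}{n}\, x^{\frac{m-1}{n} + \frac{1-n}{n}} = \frac{m}{n}\, x^{\frac{m-1+1-n}{n}} = \frac{m}{n}\, x^{\frac{m}{n} - 1}.$$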
n
So we have already proved $(x^q)' = q x^{q-1}$ for any positive rational $q \in \mathbb{Q}$.

It remains to generalize this formula to negative rational exponents, $(x^{-q})' = -q x^{-q-1}$. Indeed, by the quotient rule,
$$(x^{-q})' = \left(\frac{1}{x^q}\right)' = \frac{1' \cdot x^q - 1 \cdot (x^q)'}{x^{2q}} = \frac{-q x^{q-1}}{x^{2q}} = -q x^{-q-1}.$$
The further generalization of the formula $(x^r)' = r x^{r-1}$ to a real $r \in \mathbb{R}$ uses approximation of a real number by a sequence of rational numbers.
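The power rule for rational exponents can be checked numerically at a sample point. This is only a sketch: the helper names, the chosen exponents and the point `x0 = 2.0` are assumptions, and a finite-difference check illustrates the formula rather than proving it.

```python
import math
from fractions import Fraction

def central_difference(func, x, step=1e-6):
    """Approximate func'(x) by a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)

def power_rule(q, x):
    """Right-hand side q * x^(q - 1) of the power rule."""
    return float(q) * x ** (float(q) - 1)

x0 = 2.0
exponents = [Fraction(1, 3), Fraction(5, 2), Fraction(-3, 4)]

# For each rational q, compare the formula with a numerical derivative of x^q.
checks = [
    math.isclose(
        central_difference(lambda x, q=q: x ** float(q), x0),
        power_rule(q, x0),
        rel_tol=1e-6,
    )
    for q in exponents
]
print(checks)  # [True, True, True]
```

The list covers the three cases treated in the text: a root $x^{1/n}$, a general positive rational exponent, and a negative rational exponent.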
Exercise
Calculate the derivative of the inverse of the function $f(x) = \frac{1}{x+1}$ at the point $f(1) = \frac{1}{2}$.
Solution.
$$g'\left(\tfrac{1}{2}\right) = g'(f(1)) = \frac{1}{f'(g(f(1)))} = \frac{1}{f'(1)} = \left.\frac{1}{-\frac{1}{(x+1)^2}}\right|_{x=1} = \left.-(x+1)^2\right|_{x=1} = -4.$$
By the way, as we know, the inverse of $f(x) = \frac{1}{x+1}$ is $g(y) = \frac{1}{y} - 1$. The direct calculation of $g'\left(\frac{1}{2}\right)$ gives the same result.
Exercises 4.7-4.10
Homework 4
Exercises 4.3 (c), 4.5 (e,g), 4.6, 4.8 (c), 4.9 (c)
