An introduction to ordinary differential equations

Lecture notes, Fall 2024


(Version: November 21, 2024)

Pu-Zhao Kow
DEPARTMENT OF MATHEMATICAL SCIENCES, NATIONAL CHENGCHI UNIVERSITY
Email address: pzkow@g.nccu.edu.tw
Preface

This lecture note is prepared mainly based on [BD22, HS99] for the course Differential Equations, at the graduate level, during Fall 2024 (113-1). Next semester we will focus more on partial differential equations (PDE), and we will use the lecture note [Kow24]. There is another Differential Equations course at the undergraduate level. This lecture note may be updated during the course.

Title. Differential Equations


Lectures. Thursday (13:10–16:00)
Language. Chinese and English. Materials will be prepared in English
Instructor. Pu-Zhao Kow (Email: pzkow@g.nccu.edu.tw)
Office hour. Thursday (16:10–17:00)
Completion. Homework Assignments 60% (must be prepared using LaTeX), Midterm presentation 20%, Final presentation 20%

We will focus on the relation of ordinary differential equations (ODE) to other (mathematical) fields, especially partial differential equations (PDE), rather than going through every detail in class. One may choose to present the proof of a theorem whose proof is skipped in this lecture note, or another interesting topic (for which more credit will be earned), during the midterm or final presentations.

Contents

Chapter 1. Introduction
1.1. Some mathematical models
1.2. Classification of ODE

Chapter 2. First order nonlinear ODE
2.1. Well-posedness of ODE
2.2. Some techniques for solving the equation
2.3. From ODE to PDE
2.3.1. Linear equations
2.3.2. Quasilinear equations

Chapter 3. Linear ODE
3.1. Homogeneous ODE with constant coefficients
3.1.1. Computations of the exponential
3.1.2. The matrix logarithm
3.1.3. One parameter subgroup, Lie group and Lie algebra
3.2. Homogeneous ODE with variable coefficients
3.3. Nonhomogeneous equations
3.4. Higher order linear ODE
3.5. Sturm-Liouville eigenvalue problem

Bibliography

CHAPTER 1

Introduction

1.1. Some mathematical models

In order to motivate this course, we begin with some examples from [BD22, Che16]. The simplest and most important kind of example which can be modeled by ordinary differential equations (ODE) is a relaxation process, i.e. a system that starts from some state and eventually reaches an equilibrium state.

EXAMPLE 1.1.1 (A falling object). We now consider an object with mass m falling from height y0 at time t = 0. Let v(t) be its velocity at time t. By the laws of physics, the acceleration of the object at time t is the rate of change of the velocity v(t), that is,

a(t) = d/dt v(t) ≡ v′(t).

According to Newton's second law, the net force F exerted on the object is expressed by the equation

(1.1.1)   F(t) = ma(t) = mv′(t).

Next, we consider the forces that act on the object as it falls. Gravity exerts a force equal to the weight of the object, given by mg, where g is the acceleration due to gravity. The drag force due to air resistance has magnitude γv(t), where γ is a constant called the drag coefficient. Therefore the net force is given by

(1.1.2)   F(t) = mg − γv(t).

Combining (1.1.1) and (1.1.2), we reach the ODE

mv′ (t) = mg − γv(t) for t > 0.
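This ODE has the closed-form solution v(t) = mg/γ + (v(0) − mg/γ) e^{−γt/m}, which relaxes to the terminal velocity mg/γ. The following sketch compares it against a crude forward Euler integration; the parameter values m = 2, γ = 0.5, v(0) = 0 are assumed purely for illustration.

```python
import numpy as np

# Illustrative (assumed) parameters: m = 2 kg, gamma = 0.5 kg/s, v(0) = 0.
m, g, gamma = 2.0, 9.8, 0.5

def v_exact(t, v0=0.0):
    # Closed-form solution of m v' = m g - gamma v.
    vt = m * g / gamma  # terminal velocity mg/gamma
    return vt + (v0 - vt) * np.exp(-gamma * t / m)

def v_euler(T, n, v0=0.0):
    # Forward Euler discretization of v' = g - (gamma/m) v.
    dt, v = T / n, v0
    for _ in range(n):
        v += dt * (g - gamma / m * v)
    return v

print(v_exact(10.0), v_euler(10.0, 10_000))
```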

EXAMPLE 1.1.2 (Heating (or cooling) of an object). We now consider an object, with initial temperature T0 at time t = 0, which is taken out of a refrigerator to defrost. Let T(t) be its temperature at time t. Suppose that the room temperature is given by K. Newton's law of cooling/heating says that the rate of change of the temperature T(t) is proportional to the difference between T(t) and K; more precisely,

T′(t) = −α(T(t) − K),

where α > 0 is a conductivity coefficient.
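This linear ODE is solved by T(t) = K + (T0 − K) e^{−αt}, which relaxes to the room temperature K. A quick finite-difference sanity check; the numbers T0 = −5, K = 20, α = 0.1 are assumed for illustration only.

```python
import math

# Assumed illustrative values: initial temperature -5, room temperature 20, alpha = 0.1.
T0, K, alpha = -5.0, 20.0, 0.1

def T(t):
    # Closed-form solution of T' = -alpha (T - K) with T(0) = T0.
    return K + (T0 - K) * math.exp(-alpha * t)

# Central-difference check of the ODE at t = 3.
h = 1e-6
lhs = (T(3 + h) - T(3 - h)) / (2 * h)
rhs = -alpha * (T(3) - K)
print(lhs, rhs)
```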

EXAMPLE 1.1.3 (Population growth model). We first describe the population model proposed by Malthus (1766–1834). Let y(t) be the population (in a "large" area) at time t. He built a model based on the following hypothesis:

y′(t) = births − deaths + migration,

and he assumed that the births and the deaths are proportional to the current population y(t), that is,

births − deaths = ry(t),

where the constant r ∈ R is called the net growth rate. If there is no migration at all, then the model reads

y′(t) = ry(t),

which is called the simple population growth model. Suppose that the initial population is y0 at time t = 0. In fact, the unique solution is

y(t) = y0 e^{rt},

which does not make sense for large times, since the limitation of the environment is not taken into account. With this consideration, we should expect that there is an environmental carrying capacity K such that

y′(t) > 0 when y(t) < K,   y′(t) < 0 when y(t) > K

due to competition for resources. Verhulst (1804–1849) proposed another model which takes the limitation of the environment into consideration (i.e. in a "small" area):

y′(t) = ry(t) (1 − y(t)/K),

which is called the logistic population model. Note that the above simple population growth model formally corresponds to the case K = +∞. See also [BD22, Section 2.5] for further explanations.
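The logistic equation can be solved by separation of variables, giving y(t) = K / (1 + ((K − y0)/y0) e^{−rt}), which tends to the carrying capacity K as t → ∞. A numerical sanity check; the values r = 0.5, K = 100, y0 = 10 are assumed for illustration only.

```python
import math

# Assumed illustrative values: growth rate 0.5, carrying capacity 100, y(0) = 10.
r, K, y0 = 0.5, 100.0, 10.0

def y(t):
    # Closed-form solution of the logistic equation y' = r y (1 - y/K).
    return K / (1 + ((K - y0) / y0) * math.exp(-r * t))

# Central-difference check of the ODE at t = 2, plus the long-time limit.
h = 1e-6
lhs = (y(2 + h) - y(2 - h)) / (2 * h)
rhs = r * y(2) * (1 - y(2) / K)
print(lhs, rhs, y(100.0))
```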

1.2. Classification of ODE

We say that a system of equations of the form

F(t, u(t), u′(t), · · · , u^(m)(t)) = 0,

or, written componentwise,

F_1(t, u(t), u′(t), · · · , u^(m)(t)) = 0,
· · ·
F_ℓ(t, u(t), u′(t), · · · , u^(m)(t)) = 0,

is an ordinary differential equation (ODE) of order m, where we write u^(k) := (u_1^(k), · · · , u_n^(k)) for the kth-order derivative of u = (u_1, · · · , u_n). Throughout this note, we use bold font to emphasize vector-valued functions.

E XAMPLE 1.2.1. For example,

u′′′ + 2et u′′ + uu′ = t 4

is a third order ODE.

In many cases, we only consider systems of ODE of the form (with n = ℓ)

u^(m)(t) = f(t, u(t), u′(t), · · · , u^(m−1)(t)).

Otherwise, for example, the equation

(u′)^2 + tu′ + 4u = 0

leads to the two equations

u′ = (−t + √(t^2 − 16u))/2,   u′ = (−t − √(t^2 − 16u))/2.
An ODE is said to be linear if it takes the form

a0 (t)u(n) + a1 (t)u(n−1) + · · · + an (t)u = g(t),

for some matrices a0 , · · · , an , otherwise we say that the ODE is nonlinear. If g(t) ≡ 0, then we say
that the ODE is homogeneous, otherwise inhomogeneous.
CHAPTER 2

First order nonlinear ODE

This chapter deals with first order nonlinear ODE of the form

(2.0.1) u′ (t) = f (t, u), u(t0 ) = u0

for some vector u0 ∈ R^n. We again remind the readers that we use the notation

u(t) = (u_1(t), · · · , u_n(t)),   f(t) = (f_1(t), · · · , f_n(t)),

and we also use the notation

|u(t)| = max{|u_1(t)|, · · · , |u_n(t)|}.

2.1. Well-posedness of ODE

If f ≡ 0, then (2.0.1) reads


u′ (t) = 0, u(t0 ) = u0 ,
and one easily sees that the constant function u = u0 is a solution which is valid for all t ∈ R. We
first state the fundamental existence theorem when f ̸≡ 0.
THEOREM 2.1.1 ([HS99, Theorem I-2-5]). Let a > 0 and b > 0. If f = f(t, y) is a (real-valued) continuous function on a closed cylinder

R = {(t, y) ∈ R × R^n : |t − t0| ≤ a, |y − u0| ≤ b}

such that

M := max_{(t,y)∈R} |f(t, y)| > 0,

then there exists a function u ∈ C^1((t0 − α, t0 + α))^n with α = min{a, b/M} satisfying (2.0.1) in (t0 − α, t0 + α).
However, the uniqueness does not hold true in general without further assumption on f . We
demonstrate this in the following few examples.
EXAMPLE 2.1.2. We define the function

u(t) := { 0,                          t ≤ 3,
        { ((2/5)(t^2 − 9))^{5/4},     t > 3.
By using left and right limits, it is not difficult to check that u ∈ C(R). By elementary calculus, one computes that

u′(t) = (2/5)^{1/4} t (t^2 − 9)^{1/4}   for t > 3,
u′(t) = 0   for t < 3.

Since

lim_{h→0+} (u(3 + h) − u(3))/h = lim_{h→0+} (1/h) ((2/5)((3 + h)^2 − 9))^{5/4}
  = lim_{h→0+} (1/h) ((2/5) h(h + 6))^{5/4} = lim_{h→0+} h^{1/4} ((2/5)(h + 6))^{5/4} = 0

and

lim_{h→0−} (u(3 + h) − u(3))/h = lim_{h→0−} 0/h = 0,

then

u′(3) := lim_{h→0} (u(3 + h) − u(3))/h = 0.

Now we also see that

lim_{t→3+} u′(t) = lim_{t→3+} (2/5)^{1/4} t (t^2 − 9)^{1/4} = 0,
lim_{t→3−} u′(t) = lim_{t→3−} 0 = 0,

which shows that u′ ∈ C(R), and thus u ∈ C^1(R). One can easily check that

(2.1.1)   u′(t) = f(t, u(t)) for all t ∈ R,   u(t0) = 0,

with f(t, y) = t y^{1/5} and t0 = 3.

Note that f is continuous in R × R, and hence the assumptions in Theorem 2.1.1 are satisfied. Since u ≡ 0 is also a solution of (2.1.1), one sees that the solution of the initial value problem (2.1.1) is not unique.
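The nontrivial solution above can also be checked pointwise with a small numerical experiment (a sketch; the test point t = 4 is an arbitrary choice):

```python
import math

def u(t):
    # The nontrivial solution of u' = t u^{1/5}, u(3) = 0, constructed above.
    return ((2.0 / 5.0) * (t * t - 9.0)) ** 1.25 if t > 3 else 0.0

# Central-difference check of u'(t) = t * u(t)^{1/5} at t = 4.
t1, h = 4.0, 1e-6
lhs = (u(t1 + h) - u(t1 - h)) / (2 * h)
rhs = t1 * u(t1) ** 0.2
print(lhs, rhs)
```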

EXERCISE 2.1.3 ([HS99, Example III-1-1]). Verify that the initial-value problem

u′(t) = (u(t))^{1/3} for all t ∈ R,   u(t0) = 0

has at least two nontrivial C^1(R)-solutions:

u(t) = { 0,                           t ≤ t0,
       { ((2/3)(t − t0))^{3/2},       t > t0,

and

u(t) = { 0,                           t ≤ t0,
       { −((2/3)(t − t0))^{3/2},      t > t0.
EXERCISE 2.1.4 ([HS99, Example III-1-3]). Verify that the initial-value problem

u′(t) = √|u(t)| for all t ∈ R,   u(t0) = 0

has at least one nontrivial C^1(R)-solution:

u(t) = { −(1/4)(t − t0)^2,   t ≤ t0,
       {  (1/4)(t − t0)^2,   t > t0.
We now state a sufficient condition to guarantee also the uniqueness of the solution.
THEOREM 2.1.5 (Fundamental theorem of ODE [HS99, Theorem I-1-4]). Suppose that all assumptions in Theorem 2.1.1 hold. If we additionally assume that

(2.1.2)   |f(t, y1) − f(t, y2)| ≤ L |y1 − y2|

whenever (t, y1) and (t, y2) are in R, then the solution described in Theorem 2.1.1 is the unique C^1((t0 − α, t0 + α))^n solution.
REMARK 2.1.6. See also Theorem 2.1.10 below.

EXERCISE 2.1.7. Verify that the ODEs in Example 2.1.2, Exercise 2.1.3 and Exercise 2.1.4 do not satisfy the Lipschitz condition (2.1.2).
EXERCISE 2.1.8. Under the assumptions of Theorem 2.1.5, show that the initial value problem (2.0.1) is equivalent to the integral equation

u(t) = u0 + ∫_{t0}^{t} f(s, u(s)) ds

for all t ∈ (t0 − α, t0 + α).


By using Exercise 2.1.8, under the assumptions of Theorem 2.1.5, if u1 ∈ C^1((t0 − α, t0 + α)) and u2 ∈ C^1((t0 − α, t0 + α)) are the unique solutions of (2.0.1) corresponding to initial data u10 and u20 respectively, then one sees that

u1(t) − u2(t) = u10 − u20 + ∫_{t0}^{t} ( f(s, u1(s)) − f(s, u2(s)) ) ds.

By using the Lipschitz condition (2.1.2), one sees that

          |u1(t) − u2(t)| ≤ |u10 − u20| + | ∫_{t0}^{t} ( f(s, u1(s)) − f(s, u2(s)) ) ds |
(2.1.3)                   ≤ |u10 − u20| + L ∫_{t0}^{t} |u1(s) − u2(s)| ds   for all t ∈ (t0 − α, t0 + α).

If t ≤ t0, then one immediately reaches

(2.1.4)   |u1(t) − u2(t)| ≤ |u10 − u20|.

If t ≥ t0 , we will use the following useful lemma.

LEMMA 2.1.9 (Gronwall [HS99, Lemma I-1-5]). If g ∈ C([t0, t1]) satisfies the inequality

(2.1.5)   0 ≤ g(t) ≤ K + L ∫_{t0}^{t} g(s) ds   for all t ∈ [t0, t1],

then

0 ≤ g(t) ≤ K e^{L(t−t0)}   for all t ∈ [t0, t1].

PROOF. Set v(t) := ∫_{t0}^{t} g(s) ds; from (2.1.5) we have

dv/dt ≤ K + L v(t)   for all t ∈ (t0, t1),   v(t0) = 0.

One sees that (this technique is called the method of integrating factors)

d/dt ( e^{−L(t−t0)} v(t) ) = e^{−L(t−t0)} dv/dt − L e^{−L(t−t0)} v(t)
  = e^{−L(t−t0)} ( dv/dt − L v(t) ) ≤ K e^{−L(t−t0)}   for all t ∈ (t0, t1).

Now the fundamental theorem of calculus implies (i.e. integrating the above inequality from t0 to τ ∈ (t0, t1))

e^{−L(τ−t0)} v(τ) ≤ K ∫_{t0}^{τ} e^{−L(t−t0)} dt = (K/L)(1 − e^{−L(τ−t0)})   for all τ ∈ (t0, t1),

which implies

∫_{t0}^{t} g(s) ds = v(t) ≤ (K/L)(e^{L(t−t0)} − 1).

Finally, plugging this into (2.1.5), we see that

g(t) ≤ K + L ∫_{t0}^{t} g(s) ds ≤ K + K(e^{L(t−t0)} − 1) = K e^{L(t−t0)}

for all t ∈ (t0, t1), and our result follows from the continuity of g. □
In view of (2.1.4) and (2.1.3), we now choose any t0 < t1 < t0 + α, g(t) = |u1(t) − u2(t)| and K = |u10 − u20| in Lemma 2.1.9 to see that:

THEOREM 2.1.10 (Dependence on data). If all assumptions in Theorem 2.1.5 hold, then the stability estimate

|u1(t) − u2(t)| ≤ |u10 − u20| e^{αL}   for all t ∈ (t0 − α, t0 + α)

holds, where u1 ∈ C^1((t0 − α, t0 + α)) and u2 ∈ C^1((t0 − α, t0 + α)) are the unique solutions of (2.0.1) corresponding to initial data u10 and u20 respectively.
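A hedged numerical illustration of this stability estimate: take f(t, u) = sin u, which satisfies (2.1.2) with L = 1, and integrate two nearby initial data with forward Euler (the data 1.0 and 1.001 and the horizon T = 2 are assumed choices). Since |sin a − sin b| ≤ |a − b|, the discrete iteration also obeys the bound.

```python
import math

# Illustrating Theorem 2.1.10 with f(t, u) = sin(u), Lipschitz constant L = 1.
L, T, n = 1.0, 2.0, 2000
dt = T / n

def euler(u0):
    # Forward Euler for u' = sin(u), u(0) = u0.
    u = u0
    for _ in range(n):
        u += dt * math.sin(u)
    return u

u1, u2 = euler(1.0), euler(1.001)
bound = abs(1.0 - 1.001) * math.exp(L * T)
print(abs(u1 - u2), bound)
```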

REMARK 2.1.11. We refer to [HS99, Chapter II] for further generalizations of Theorem 2.1.10.
EXERCISE 2.1.12. Show that the initial value problem

u′(t) = 1 / ( (3 − (t − 1)^2)(9 − (u − 5)^2) )   for all t ∈ (1 − √2, 1 + √2),
u(1) = 5,

has a unique C^1((1 − √2, 1 + √2))-solution.

EXERCISE 2.1.13. Show that the initial value problem

u′(t) = 1 / ( (1 + (t − 4)^2)(5 + (u − 3)^2) )   for all t ∈ R,
u(4) = 3,

has a unique C^1(R)-solution.

2.2. Some techniques for solving the equation

We now consider a single equation of ODE:

(2.2.1) u′ (t) = f (t, u(t)), u(t0 ) = u0 .

Unfortunately, there is no universally applicable method for finding a solution u ∈ C^1 of equation (2.2.1). We now exhibit some methods which help to solve certain classes of ODE.

DEFINITION 2.2.1. The ODE (2.2.1) is said to be separable if it can be expressed in the form

(2.2.2)   M(t) + N(u(t)) u′(t) = 0,

for some continuous functions M and N; sometimes we abuse the notation by writing M(t) dt + N(u) du = 0.

For u ∈ C^1, we see that

d/dt ( ∫_{u0}^{u(t)} N(z) dz ) = N(u(t)) u′(t),

so one can rewrite (2.2.2) as

d/dt ( ∫_{u0}^{u(t)} N(z) dz ) = −M(t).

Integrating both sides with respect to the variable t from t0 to τ, we see that

(2.2.3)   ∫_{u0}^{u(τ)} N(z) dz = ∫_{t0}^{τ} d/dt ( ∫_{u0}^{u(t)} N(z) dz ) dt = − ∫_{t0}^{τ} M(t) dt,

which solves the ODE implicitly.
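The implicit recipe (2.2.3) can be turned into a numerical scheme: evaluate both integrals and solve for u(τ) by bisection. A minimal sketch for the separable equation u′ = −t/u (so M(t) = t, N(u) = u, and the solutions are arcs of circles t^2 + u^2 = const; the initial data (t0, u0) = (0, 1) is an assumed choice):

```python
import math

# Separable recipe (2.2.3) for M(t) = t, N(u) = u, i.e. u' = -t/u, u(0) = 1.
t0, u0 = 0.0, 1.0

def lhs(u):    # integral of N(z) dz from u0 to u
    return 0.5 * (u * u - u0 * u0)

def rhs(tau):  # minus the integral of M(t) dt from t0 to tau
    return -0.5 * (tau * tau - t0 * t0)

def solve(tau, lo=1e-9, hi=2.0):
    # Bisection on lhs(u) = rhs(tau); assumes the root lies in [lo, hi]
    # and that lhs is increasing there (true since N(u) = u > 0).
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if lhs(mid) < rhs(tau):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(solve(0.6), math.sqrt(1 - 0.36))  # should agree: u(tau)^2 = 1 - tau^2
```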

EXAMPLE 2.2.2. We now want to find the general solution of

(2.2.4)   du/dt = t^2 / (1 − u^2).

We see that, if we impose the initial condition u(t0) = u0 with u0 ≠ ±1, the conditions in Theorem 2.1.5 hold, and therefore there exists a unique solution u ∈ C^1 near t0. Note that u0 ≠ ±1 and the continuity of u imply that u(t) ≠ ±1 for all t near t0, therefore (2.2.4) makes sense. Now we use the chain rule to see that

d/dt ( u(t) − (u(t))^3/3 ) = (1 − u^2) du/dt = t^2,

and thus

( u(τ) − (u(τ))^3/3 ) − ( u0 − u0^3/3 ) = ∫_{t0}^{τ} d/dt ( u(t) − (u(t))^3/3 ) dt = ∫_{t0}^{τ} t^2 dt = τ^3/3 − t0^3/3,

which gives

−t^3 + 3u(t) − (u(t))^3 = −t0^3 + 3u0 − u0^3   for all t near t0,

or, in depressed cubic form,

(u(t))^3 − 3u(t) + (t^3 − t0^3 + 3u0 − u0^3) = 0,

and thus u can be expressed by using Cardano's formula^1 case by case.

EXERCISE 2.2.3. Do the same thing for the ODE

du/dt = (3t^2 + 4t + 2) / (2(u − 1)).

EXERCISE 2.2.4. Do the same thing for the ODE

du/dt = (4t − t^3) / (4 + u^3).
We now consider the following ODE:

(2.2.5) M(t, u(t)) + N(t, u(t))u′ (t) = 0, u(t0 ) = u0 .

Note that (2.2.1) and (2.2.2) are both special case of (2.2.5). We now want to solve (2.2.5) under
some sufficient conditions.
Assume that M = M(t, y), N = N(t, y), ∂yM and ∂tN are continuous in an open rectangle (t1, t2) × (y1, y2) and u ∈ C^1((t1, t2)) satisfies y1 < u(t) < y2 for all t ∈ (t1, t2). For each ψ ∈ C^1((t1, t2) × (y1, y2)), by using the chain rule one sees that

(2.2.6)   d/dt ( ψ(t, u(t)) ) = ∂tψ(t, u(t)) + ∂yψ(t, u(t)) u′(t).
^1 https://en.wikipedia.org/wiki/Cubic_equation

Comparing this equality with the ODE (2.2.5), it is natural to find ψ such that ∂yψ = N in (t1, t2) × (y1, y2), which can be achieved by choosing

ψ(t, y) := ∫_{u0}^{y} N(t, z) dz   for all (t, y) ∈ (t1, t2) × (y1, y2).

We need the following lemma for further computations (one way to prove it is to use the Lebesgue dominated convergence theorem):

LEMMA 2.2.5 ([Str08, Theorem 1 in Appendix A.3]). Suppose that f(t, y) and ∂t f(t, y) are continuous in the closed rectangle [s1, s2] × [z1, z2]. Then

d/dt ( ∫_{z1}^{z2} f(t, y) dy ) = ∫_{z1}^{z2} ∂t f(t, y) dy   for all t ∈ [s1, s2].

Now the above lemma guarantees that

∂tψ(t, y) = ∫_{u0}^{y} ∂t N(t, z) dz   for all (t, y) ∈ (t1, t2) × (y1, y2).

Now if

(2.2.7)   ∂t N = ∂y M   in (t1, t2) × (y1, y2),

then we reach

∂tψ(t, y) = ∫_{u0}^{y} ∂z M(t, z) dz = M(t, y) − M(t, u0)   for all (t, y) ∈ (t1, t2) × (y1, y2).

Now from (2.2.6), and consequently by (2.2.5), we see that

d/dt ( ψ(t, u(t)) ) = M(t, u(t)) − M(t, u0) + N(t, u(t)) u′(t) = −M(t, u0),

thus

(2.2.8)   ψ(τ, u(τ)) − ψ(t0, u0) = ∫_{t0}^{τ} d/dt ( ψ(t, u(t)) ) dt = − ∫_{t0}^{τ} M(t, u0) dt,

which solves u implicitly. In view of the above ideas, it is natural to introduce the following definition.

DEFINITION 2.2.6. The ODE (2.2.5) is said to be exact if (2.2.7) holds.

REMARK 2.2.7. If the ODE is separable (in the sense of Definition 2.2.1), then it is also exact. In this case, (2.2.8) reduces to (2.2.3).

EXAMPLE 2.2.8. We now want to solve the ODE

(u(t) cos t + 2t e^{u(t)}) + (sin t + t^2 e^{u(t)} − 1) u′(t) = 0

with suitable initial condition u(t0) = u0. In view of the chain rule

(2.2.9)   d/dt ( ψ(t, u(t)) ) = ∂tψ(t, u(t)) + ∂yψ(t, u(t)) u′(t),

it is natural to choose

ψ(t, y) = ∫_{u0}^{y} (sin t + t^2 e^z − 1) dz = (y sin t + t^2 e^y − y) − (u0 sin t + t^2 e^{u0} − u0),

so that ∂yψ = sin t + t^2 e^y − 1. Now we compute

∂tψ(t, y) = y cos t + 2t e^y − u0 cos t − 2t e^{u0},

and from (2.2.9), and consequently the ODE, we see that

d/dt ( ψ(t, u(t)) ) = (u(t) cos t + 2t e^{u(t)} − u0 cos t − 2t e^{u0}) + (sin t + t^2 e^{u(t)} − 1) u′(t) = −u0 cos t − 2t e^{u0},

thus, since ψ(t0, u0) = 0,

ψ(τ, u(τ)) = − ∫_{t0}^{τ} (u0 cos t + 2t e^{u0}) dt = u0 sin t0 − u0 sin τ + t0^2 e^{u0} − τ^2 e^{u0}.

Expanding ψ(τ, u(τ)) = (u(τ) sin τ + τ^2 e^{u(τ)} − u(τ)) − (u0 sin τ + τ^2 e^{u0} − u0) and cancelling the common terms, we conclude that

u(t) sin t + t^2 e^{u(t)} − u(t) = u0 sin t0 + t0^2 e^{u0} − u0   for all t near t0.
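As a sanity check (a sketch with assumed initial data t0 = 0, u0 = −5, chosen so that the coefficient of u′ stays away from zero), one can integrate the ODE numerically and verify that the quantity F(t, u) = u sin t + t^2 e^u − u is conserved along the trajectory:

```python
import math

def f(t, u):
    # u' from (u cos t + 2 t e^u) + (sin t + t^2 e^u - 1) u' = 0.
    return -(u * math.cos(t) + 2 * t * math.exp(u)) / (math.sin(t) + t * t * math.exp(u) - 1)

def F(t, u):
    # The conserved quantity derived in Example 2.2.8.
    return u * math.sin(t) + t * t * math.exp(u) - u

t, u, h = 0.0, -5.0, 1e-3   # assumed initial data and step size
F0 = F(t, u)
for _ in range(1000):       # classical RK4 up to t = 1
    k1 = f(t, u)
    k2 = f(t + h / 2, u + h / 2 * k1)
    k3 = f(t + h / 2, u + h / 2 * k2)
    k4 = f(t + h, u + h * k3)
    u += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    t += h
print(F(t, u) - F0)         # should be close to zero
```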


We now want to deal with the ODE (2.2.5) when it is not necessarily exact in the sense of Definition 2.2.6. The idea is quite simple: we want to multiply by an integrating factor µ(t, y) so that

M̃(t, u(t)) + Ñ(t, u(t)) u′(t) = 0,   u(t0) = u0

is exact, where M̃(t, y) = µ(t, y) M(t, y) and Ñ(t, y) = µ(t, y) N(t, y). Now (2.2.7) reads

N ∂tµ + µ ∂tN = ∂tÑ = ∂yM̃ = M ∂yµ + µ ∂yM,

that is,

(2.2.10)   M ∂yµ − N ∂tµ + (∂yM − ∂tN) µ = 0.

This is nothing but a transport equation, which will be discussed in Section 2.3 below. It is remarkable that one can directly check that if

k := (∂yM − ∂tN)/N   is a function of t only,
then the integrating factor µ is also a function of t only (independent of y), and it satisfies the linear ODE

µ′(t) = k(t) µ(t),

so that for each K with K′ = k one has

d/dt ( e^{−K(t)} µ(t) ) = −k(t) e^{−K(t)} µ(t) + e^{−K(t)} µ′(t) = 0.

This can be achieved by choosing

µ(t) := e^{K(t)}.

EXAMPLE 2.2.9. We now want to solve the ODE (3tu + u^2) + (t^2 + tu)u′ = 0 with suitable initial condition u(t0) = u0. We want to multiply by an integrating factor µ = µ(t, y) so that

∂t ( µ(t, y)(t^2 + ty) ) = ∂y ( µ(t, y)(3ty + y^2) )
⟺ ∂tµ(t, y)(t^2 + ty) + µ(t, y)(2t + y) = ∂yµ(t, y)(3ty + y^2) + µ(t, y)(3t + 2y)
⟺ (t ∂tµ(t, y) − µ(t, y))(t + y) = y ∂yµ(t, y)(3t + y).

Note that the choice µ(t, y) = t fulfills the above requirement; we only need to find one µ, not its general solution. We now write the ODE as

(2.2.11)   (3t^2 u + tu^2) + (t^3 + t^2 u)u′ = 0,   u(t0) = u0.

In view of the chain rule

(2.2.12)   d/dt ( ψ(t, u(t)) ) = ∂tψ(t, u(t)) + ∂yψ(t, u(t)) u′(t),

it is natural to choose ψ such that ∂yψ(t, y) = t^3 + t^2 y. One way to achieve this is to take

ψ(t, y) = ∫_{u0}^{y} (t^3 + t^2 z) dz = ( t^3 y + (1/2) t^2 y^2 ) − ( t^3 u0 + (1/2) t^2 u0^2 ).

We now compute that

∂tψ(t, y) = 3t^2 y + t y^2 − 3t^2 u0 − t u0^2.

Now from (2.2.12), and consequently from (2.2.11), we reach

d/dt ( ψ(t, u(t)) ) = ( 3t^2 u + tu^2 − 3t^2 u0 − tu0^2 ) + (t^3 + t^2 u)u′ = −3t^2 u0 − tu0^2,

thus, since ψ(t0, u0) = 0,

( τ^3 u(τ) + (1/2) τ^2 (u(τ))^2 ) − ( τ^3 u0 + (1/2) τ^2 u0^2 ) = ψ(τ, u(τ))
  = − ∫_{t0}^{τ} (3t^2 u0 + t u0^2) dt = −(τ^3 − t0^3) u0 − (1/2)(τ^2 − t0^2) u0^2,

which concludes that

τ^2 (u(τ))^2 + 2τ^3 u(τ) = 2 t0^3 u0 + t0^2 u0^2.

2.3. From ODE to PDE

2.3.1. Linear equations. We now give an application of ODE in the theory of partial differential equations (PDE). We begin our discussion with a simple model. Consider a horizontal pipe of fixed cross section in the (positive) x-direction. Suppose that a fluid flows in the pipe at a constant rate c (c = 0 means the fluid is stationary; c > 0 means it flows toward the right, otherwise toward the left). We now assume that a substance is suspended in the water. Fix a point of the pipe, set it as the origin 0, and let u(t, x) be the concentration of the substance. The amount of pollutant in the interval [0, y] at time t is given by

∫_{0}^{y} u(t, x) dx.

At the later time t + τ, the same molecules of pollutant have moved by the displacement cτ, and this means

∫_{0}^{y} u(t, x) dx = ∫_{cτ}^{y+cτ} u(t + τ, x) dx.

If u is continuous, by using the fundamental theorem of calculus and differentiating the above equation with respect to y, one sees that

(2.3.1)   u(t, y) = u(t + τ, y + cτ)   for all y ∈ R.

If we further assume u ∈ C^1, then differentiating (2.3.1) with respect to τ, we reach the following transport equation:

0 = ∂τ u(t + τ, y + cτ)|_{τ=0} = ∂t u(t, x) + c ∂x u(t, x)   for all (t, x) ∈ R × R.

We now consider the transport equation with a variable coefficient, of the form

(2.3.2)   ∂t u + c(t, x) ∂x u = 0,   u(0, x) = f(x),

where f ∈ C^1(R) and c = c(t, x) satisfies all assumptions in Theorem 2.1.5. Given any s ∈ R, we consider a curve x = γ_s(t), where γ_s solves the ODE

(2.3.3)   γ_s′(t) = c(t, γ_s(t)),   γ_s(0) = s.

We now restrict u to the curve x = γ_s(t), and one sees that

∂t ( u(t, γ_s(t)) ) = ( ∂t u + γ_s′(t) ∂x u )|_{x=γ_s(t)} = ( ∂t u + c(t, x) ∂x u )|_{x=γ_s(t)} = 0.

This means that u is constant along the characteristic curve γ_s. Hence

(2.3.4)   u(t, γ_s(t)) = u(0, γ_s(0)) = f(γ_s(0)) = f(s).

For later convenience, we write γ(t, s) = γ_s(t). Fix x ∈ R; we now want to solve the equation x = γ(t, s). From γ(0, x) = x, and since ∂sγ(0, x) = (∂sγ_s(0))|_{s=x} = 1 ≠ 0, we can apply the implicit function theorem [Apo74, Theorem 13.7] to guarantee that there exist an open neighborhood U_x ⊂ R of 0 and g_x ∈ C^1(U_x) such that g_x(0) = x and x = γ(t, s)|_{s=g_x(t)} for all t ∈ U_x. In other words, we have found a solution s = g_x(t) ≡ g(x, t) of the equation x = γ(t, s) in U_x. Plugging this solution into (2.3.4), we conclude

(2.3.5)   u(t, x) = f(g(x, t))   for all x ∈ R and t ∈ U_x.

This completes the local existence proof. Uniqueness follows from the fact that u is constant along the characteristic curves γ.

EXAMPLE 2.3.1. Given any f ∈ C^1(R), let us now consider (2.3.2) with c = constant. In this case, (2.3.3) reads γ_s′(t) = c. For each s ∈ R, it is easy to see that the solution of γ_s′(t) = c with γ_s(0) = s is

γ(t, s) ≡ γ_s(t) = ct + s.

For each x ∈ R, the solution of x = γ(t, s) is clearly given by s = g(x, t) ≡ x − ct, and thus from (2.3.5) we conclude that

u(t, x) = f(x − ct),

and the solution is valid for all x ∈ R and t ∈ R.
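A quick finite-difference check that u(t, x) = f(x − ct) indeed satisfies ∂t u + c ∂x u = 0 (a sketch; the profile f(x) = e^{−x^2}, the speed c = 2, and the test point are assumed choices):

```python
import math

# Assumed illustrative choices: Gaussian profile and speed c = 2.
c = 2.0
f = lambda x: math.exp(-x * x)
u = lambda t, x: f(x - c * t)   # the traveling-wave solution from Example 2.3.1

# Central differences for u_t and u_x at an arbitrary point (t, x) = (0.7, 0.3).
t0, x0, h = 0.7, 0.3, 1e-5
u_t = (u(t0 + h, x0) - u(t0 - h, x0)) / (2 * h)
u_x = (u(t0, x0 + h) - u(t0, x0 - h)) / (2 * h)
print(u_t + c * u_x)            # should be close to zero
```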

EXAMPLE 2.3.2. Given any f ∈ C^1(R), we now want to solve ∂t u + x ∂x u = 0 with u(0, x) = f(x) for all x ∈ R. Write c(t, x) = x, and for each s ∈ R we consider the ODE

γ_s′(t) = c(t, γ_s(t)) ≡ γ_s(t),   γ_s(0) = s.

By using the integrating factor, one can easily see that the solution of the ODE is

γ_s(t) = e^t s.

For each x ∈ R, the solution of x = γ_s(t) is given by s = g(x, t) ≡ e^{−t} x, and thus from (2.3.5) we conclude that

u(t, x) = f(g(x, t)) = f(e^{−t} x),

and the solution is valid for all x ∈ R and t ∈ R.

EXAMPLE 2.3.3. Given any f ∈ C^1(R), we now want to solve ∂t u + 2tx^2 ∂x u = 0 with u(0, x) = f(x) for all x ∈ R. Write c(t, x) = 2tx^2, and for each s ≠ 0 we consider the ODE

γ_s′(t) = 2t (γ_s(t))^2,   γ_s(0) = s^{−1}.

By using the method of separation of variables, one can easily see that the solution of the ODE is

γ_s(t) = (s − t^2)^{−1},

which is valid

(2.3.6)   for all t ∈ R when s < 0,   for all t^2 < s when s > 0,

but the ODE is not solvable when s = 0. When s ≠ 0, the solution of x = γ_s(t) is given by s = t^2 + 1/x, and thus from (2.3.5) we conclude that

u(t, x) = f(s^{−1}) = f( x / (1 + t^2 x) )   for all x ≠ −t^{−2}.

Here we emphasize that the uniqueness only holds true in the region

{(t, x) : x > 0} ∪ {(t, x) : x < 0, t < |x|^{−1/2}}.

We now summarize the above ideas in the following algorithm:

Algorithm 1 Solving ∂t u + c(t, x) ∂x u = 0 with u(0, x) = f(x)
1: Solve the ODE γ_s′(t) = c(t, γ_s(t)) with given γ_s(0) for any suitable parameter s.
2: Compute ∂t ( u(t, γ_s(t)) ).
3: Rewrite the identity x = γ_s(t) in the form s = g(x, t).
4: Identify the domain in which u(t, x) = f(g(x, t)) solves ∂t u + c(t, x) ∂x u = 0.
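Algorithm 1 can also be run numerically when γ_s has no closed form: trace the characteristic through (t, x) backwards to time 0 with a Runge-Kutta step and evaluate the initial datum there. A sketch, tested against Example 2.3.2, where the exact answer u(t, x) = f(e^{−t} x) is known (f = sin and the sample point are assumed choices):

```python
import math

def solve_transport(f, c, t, x, n=1000):
    # Trace the characteristic gamma' = c(t, gamma) from (t, x) backwards to time 0
    # with classical RK4, then evaluate the initial datum there (Algorithm 1).
    h = -t / n
    tau, g = t, x
    for _ in range(n):
        k1 = c(tau, g)
        k2 = c(tau + h / 2, g + h / 2 * k1)
        k3 = c(tau + h / 2, g + h / 2 * k2)
        k4 = c(tau + h, g + h * k3)
        g += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        tau += h
    return f(g)   # u is constant along the characteristic

f = lambda x: math.sin(x)
num = solve_transport(f, lambda t, x: x, 1.5, 2.0)   # c(t, x) = x as in Example 2.3.2
exact = f(math.exp(-1.5) * 2.0)
print(num, exact)
```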

EXERCISE 2.3.4. Given any f ∈ C^1(R), solve the equation (1 + t^2) ∂t u + ∂x u = 0 with u(0, x) = f(x) and identify the range of x.

EXERCISE 2.3.5. Given any f ∈ C^1(R), solve the equation t ∂t u + x ∂x u = 0 with u(0, x) = f(x) and identify the range of x.
EXERCISE 2.3.6. Solve the equation x ∂t u + t ∂x u = 0 with u(0, x) = e^{−x^2}.

2.3.2. Quasilinear equations. The ideas in the previous subsection can be extended to quasilinear equations of the form

(2.3.7)   a(x, y, u) ∂x u + b(x, y, u) ∂y u = c(x, y, u).

Here we follow the approach in [Joh78, Sections 1.4–1.6]. We write (2.3.7) as

(a, b, c) · (∂x u, ∂y u, −1) = 0.

We represent the function u by a surface z = u(x, y) in R^3, and we write

(a, b, c) · ( dz/dx, dz/dy, −1 ) = 0.

Note that (dz/dx, dz/dy, −1) is the normal vector of the surface, thus (a, b, c) is a tangent vector. We now consider a "regular" curve (x(t), y(t), z(t)) in that surface, and we see that (x′(t), y′(t), z′(t)) is
a tangent vector at the point (x(t), y(t), z(t)). This suggests that we consider the characteristic ODE:

(2.3.8)   x′(t) = a(x(t), y(t), z(t)),
          y′(t) = b(x(t), y(t), z(t)),
          z′(t) = c(x(t), y(t), z(t)),

which is a special case of the ODE (2.0.1). Here the system is even autonomous, i.e. the variable t does not appear explicitly in the coefficients. If we assume that a, b, c ∈ C^1, then one can apply Theorem 2.1.1 to ensure the existence of a characteristic curve (x(t), y(t), z(t)) which is C^1. We now prove that the above choice of the characteristic ODE really describes the surface z = u(x, y).

LEMMA 2.3.7. Assume that a, b, c ∈ C^1 near (x0, y0, z0) ∈ S, where S is the surface described by z = u(x, y). If γ is a C^1 curve described by (x(t), y(t), z(t)) with (x(t0), y(t0), z(t0)) = (x0, y0, z0), then γ lies completely on S.

PROOF. For convenience, we write U(t) := z(t) − u(x(t), y(t)), so that U(t0) = 0 since (x0, y0, z0) ∈ S. Using the chain rule, z = U + u(x, y) and (2.3.8), one sees that

          U′(t) = z′(t) − (∂x u) x′(t) − (∂y u) y′(t)
                = c(x, y, z) − ∂x u(x, y) a(x, y, z) − ∂y u(x, y) b(x, y, z)
(2.3.9)         = c(x, y, U + u(x, y)) − ∂x u(x, y) a(x, y, U + u(x, y)) − ∂y u(x, y) b(x, y, U + u(x, y)).

From (2.3.7), we see that U ≡ 0 is a solution of the ODE (2.3.9). By the fundamental theorem of ODE (Theorem 2.1.5), U ≡ 0 is the unique solution of the ODE (2.3.9), which concludes our lemma. □

We now want to solve the Cauchy problem for (2.3.7) with the Cauchy data

(2.3.10)   h(s) = u(f(s), g(s))   for some f, g, h ∈ C^1 near s0.

Note that the initial value problem we previously considered is simply the special case when f(s) ≡ x0 and g(s) = s. Now the characteristic ODE (2.3.8) (with suitable parameterization) reads

(2.3.11)   ∂t X(s, t) = a(X(s, t), Y(s, t), Z(s, t)),
           ∂t Y(s, t) = b(X(s, t), Y(s, t), Z(s, t)),
           ∂t Z(s, t) = c(X(s, t), Y(s, t), Z(s, t)),

with initial conditions

X(s, 0) = f(s),   Y(s, 0) = g(s),   Z(s, 0) = h(s).


If a, b, c ∈ C^1 near (f(s0), g(s0), h(s0)), then the fundamental theorem of ODE (Theorem 2.1.5) guarantees that there exists a unique solution of (2.3.11):

(X(s, t), Y(s, t), Z(s, t)),

which is C^1 for (s, t) near (s0, 0). If the 2 × 2 matrix

( f′(s0)          g′(s0)        )   ( ∂_s X(s0, 0)   ∂_s Y(s0, 0) )
( a(x0, y0, z0)   b(x0, y0, z0) ) = ( ∂_t X(s0, 0)   ∂_t Y(s0, 0) )

is invertible, then we can use the implicit function theorem [Apo74, Theorem 13.7] to guarantee that there exists a unique solution (s, t) = (S(x, y), T(x, y)) of

x = X(S(x, y), T(x, y)),   y = Y(S(x, y), T(x, y)),

of class C^1 in a neighborhood of (x0, y0), satisfying

S(x0, y0) = s0,   T(x0, y0) = 0,

so that we finally conclude that the local solution of the Cauchy problem for (2.3.7) with the Cauchy data (2.3.10) is given by

u(x, y) = Z(S(x, y), T(x, y)).

The above arguments can be readily extended to the higher dimensional case:

THEOREM 2.3.8. We now consider the Cauchy problem

∑_{i=1}^{n} a_i(x_1, · · · , x_n, u) u_{x_i} = c(x_1, · · · , x_n, u)

with Cauchy data

h(s_1, · · · , s_{n−1}) = u(f_1(s_1, · · · , s_{n−1}), · · · , f_n(s_1, · · · , s_{n−1}))

for some f_1, · · · , f_n, h ∈ C^1 near (s_1^0, · · · , s_{n−1}^0). If a_1, · · · , a_n, c ∈ C^1 near (f_1(s^0), · · · , f_n(s^0), h(s^0)) and the n × n matrix

( ∂_{s_1} f_1(s_1^0, · · · , s_{n−1}^0)       · · ·   ∂_{s_1} f_n(s_1^0, · · · , s_{n−1}^0)     )
(          ⋮                                              ⋮                                  )
( ∂_{s_{n−1}} f_1(s_1^0, · · · , s_{n−1}^0)   · · ·   ∂_{s_{n−1}} f_n(s_1^0, · · · , s_{n−1}^0) )
( a_1(x_1^0, · · · , x_n^0, z^0)              · · ·   a_n(x_1^0, · · · , x_n^0, z^0)           )

is invertible, where x_i^0 = f_i(s_1^0, · · · , s_{n−1}^0) for all i = 1, · · · , n and z^0 = h(s_1^0, · · · , s_{n−1}^0), then there exists a unique C^1 solution u = u(x_1, · · · , x_n) near (x_1^0, · · · , x_n^0).

REMARK 2.3.9. The corresponding characteristic ODE is

∂t x_i(s_1, · · · , s_{n−1}, t) = a_i(x_1, · · · , x_n, z)   for i = 1, · · · , n,
∂t z(s_1, · · · , s_{n−1}, t) = c(x_1, · · · , x_n, z),

with initial conditions

x_i(s_1, · · · , s_{n−1}, 0) = f_i(s_1, · · · , s_{n−1})   for i = 1, · · · , n,
z(s_1, · · · , s_{n−1}, 0) = h(s_1, · · · , s_{n−1}).

EXAMPLE 2.3.10. We now want to solve the initial value problem

u ∂x u + ∂y u = 0,   u(x, 0) = h(x).

The characteristic ODE is

∂t x(s, t) = z,   ∂t y(s, t) = 1,   ∂t z(s, t) = 0,

with initial conditions

x(s, 0) = s,   y(s, 0) = 0,   z(s, 0) = h(s).

Solving the ODE yields

x = s + zt,   y = t,   z = h(s).

Eliminating s, t yields the implicit equation

u(x, y) = h(x − u(x, y) y).

It is interesting to see that


∂x u = h′ (x − u(x, y)y)(1 − y∂x u),
then
∂x u + yh′ (x − u(x, y)y)∂x u = h′ (x − u(x, y)y),
and this implies
h′ (x − u(x, y)y)
∂x u(x, y) = .
1 + yh′ (x − u(x, y)y)
We see, for example when h(z) = −z, that
−1
∂x u(x, y) = .
1−y
and this quantity will blow up at y = 1, which means that there cannot exist a strict solution u of
class C1 beyond y = 1. This type of behavior is typical for a nonlinear partial differential equation.
In general, we need to consider “weak” solutions to study the PDE, but we will not go far beyond this point.
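The blow-up computed above can be checked numerically. The following sketch (a hypothetical check assuming NumPy is available; it is not part of the notes) verifies that for h(z) = −z the implicit equation u = h(x − u(x, y)y) is solved by u(x, y) = −x/(1 − y), and that the slope ∂x u = −1/(1 − y) grows without bound as y → 1:

```python
import numpy as np

# For h(z) = -z the implicit equation u = h(x - u*y) can be solved
# explicitly: u(x, y) = -x/(1 - y) for y != 1.
def u(x, y):
    return -x / (1.0 - y)

h = lambda z: -z

# Verify the implicit equation on a grid away from the blow-up line y = 1.
xs, ys = np.meshgrid(np.linspace(-2.0, 2.0, 9), np.linspace(-0.9, 0.9, 9))
residual = np.max(np.abs(u(xs, ys) - h(xs - u(xs, ys) * ys)))

# The slope |du/dx| = 1/|1 - y| blows up as y -> 1.
slopes = [abs(-1.0 / (1.0 - y)) for y in (0.0, 0.9, 0.99, 0.999)]
```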

E XERCISE 2.3.11. Solve the initial value problem

xu∂x u − ∂y u = 0, u(x, 0) = x.

R EMARK 2.3.12. For the general first order equation, we refer to [Joh78, Section 1.7] for details; we will not discuss it here.
CHAPTER 3

Linear ODE

We now study the structure of solutions of a linear system

y ′ (t) = A(t)y + b(t),

where the entries of the n × n matrix A(t) are complex-valued continuous functions of a real
independent variable t, and b(t) is a complex-valued continuous function. Well-posedness of
the ODE can be guaranteed by the fundamental theorem of ODE (Theorem 2.1.5). We will follow
the approach in some parts of [HS99, Chapter IV].

3.1. Homogeneous ODE with constant coefficients

The main theme of this section is to show that the unique (guaranteed by the fundamental
theorem of ODE in Theorem 2.1.5) matrix-valued solution Y = Y (t) of

(3.1.1) Y ′ (t) = AY (t) for all t ∈ R, Y (0) = I,

which is called the fundamental matrix solution, where I is the n × n identity matrix, takes the form

(3.1.2) Y (t) = exp(tA) for all t ∈ R.

If this is the case, then the unique solution of

y ′ (t) = Ay(t), y(t0 ) = p

is exactly
y(t) = exp((t − t0 )A)p for all t ∈ R.
If n = 1 the above discussion is trivial, so we are interested in the case n ≥ 2.
Let Cn×n denote the set of all n × n matrices whose entries are complex numbers. The set of all invertible matrices with entries in C is denoted by GL(n, C), which can also be characterized by

GL(n, C) = {A ∈ Cn×n : det A ̸= 0}.

The collection GL(n, C) is known as the general linear group of order n. For any A ∈ Cn×n , we
define the Hilbert-Schmidt norm
(3.1.3)    ∥A∥ := ( ∑_{j,k=1}^{n} |A_{jk}|² )^{1/2} ,

where A_{jk} is the entry of A in the jth row and the kth column. The trace of A ∈ Cn×n is defined by

tr (A) := ∑_{j=1}^{n} A_{jj} .

E XERCISE 3.1.1. Let A, B ∈ Cn×n , show that


(a) ∥A∥ = (tr (A∗ A))1/2 , where A∗ is the conjugate transpose (or adjoint) of A ∈ Cn×n .
(b) ∥A + B∥ ≤ ∥A∥ + ∥B∥,
(c) ∥AB∥ ≤ ∥A∥∥B∥.

D EFINITION 3.1.2. Let {Am } be a sequence of complex matrices in Cn×n . We say that Am
converges to matrix A if
lim (Am ) jk = A jk for all 1 ≤ j, k ≤ n.
m→∞

E XERCISE 3.1.3. Show that Am converges to A if and only if limm→∞ ∥Am − A∥ = 0.

We first need to prove the following lemma.

L EMMA 3.1.4 ([Hal15, Proposition 2.1]). For each A ∈ Cn×n , let A^m denote the m-fold matrix product of A with itself, with A^0 = I. Then the series

(3.1.4)    exp(A) := ∑_{m=0}^{∞} A^m / m!

converges absolutely (in the sense of Exercise 3.1.3). In addition, the function

A ∈ Cn×n 7→ eA ∈ Cn×n

is a continuous function (with respect to the Hilbert-Schmidt norm (3.1.3)).

R EMARK 3.1.5. Lemma 3.1.4 can be rephrased as: the radius of convergence of the power series (3.1.4) is +∞.

P ROOF OF L EMMA 3.1.4. By using Exercise 3.1.1(c), we see that

∥Am ∥ ≤ ∥A∥m for all m ∈ N,

hence

∑_{m=0}^{∞} ∥A^m / m!∥ ≤ ∑_{m=0}^{∞} ∥A∥^m / m! = e^{∥A∥} < +∞,
which shows that (3.1.4) converges absolutely. On the other hand, we see that
sup_{∥A∥≤R} ∥ exp(A) − ∑_{m=0}^{N} A^m / m! ∥ = sup_{∥A∥≤R} ∥ ∑_{m=N+1}^{∞} A^m / m! ∥ ≤ ∑_{m=N+1}^{∞} R^m / m! ,

hence

sup_{∥A∥≤R} ∥ exp(A) − ∑_{m=0}^{N} A^m / m! ∥ → 0    as N → +∞,

thus (3.1.4) converges uniformly on the closed ball {A ∈ Cn×n : ∥A∥ ≤ R}, and thus A 7→ eA is
continuous on the open ball {A ∈ Cn×n : ∥A∥ < R}. Since this holds true for all R > 0, we conclude our result. □
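The absolute convergence proved above suggests a direct numerical check: the partial sums of the power series (3.1.4) should reproduce a library implementation of the matrix exponential. A minimal sketch, assuming NumPy and SciPy are available (not part of the notes):

```python
import numpy as np
from scipy.linalg import expm

# Partial sums of the power series (3.1.4), compared with SciPy's expm.
def exp_series(A, N=60):
    term = np.eye(A.shape[0])          # A^0 / 0!
    total = term.copy()
    for m in range(1, N + 1):
        term = term @ A / m            # A^m/m! obtained from A^(m-1)/(m-1)!
        total = total + term
    return total

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
err = np.linalg.norm(exp_series(A) - expm(A))
```

The tail estimate ∑_{m>N} R^m/m! from the proof explains why N = 60 terms already suffice for machine precision here.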

It is easy to see that

(3.1.5)    exp(0) = I,    (exp(A))∗ = exp(A∗ ),    exp(BAB−1 ) = B exp(A)B−1 for all B ∈ GL(n, C).

The following lemma is crucial (see also [Hal15, Theorem 5.1] for a generalization).

L EMMA 3.1.6 ([Hal15, Proposition 2.3]). If A ∈ Cn×n and B ∈ Cn×n commute (i.e. AB = BA), then

(3.1.6) exp(A + B) = exp(A) exp(B) = exp(B) exp(A).

P ROOF. We simply multiply the two power series exp(A) and exp(B) term by term, which is permitted because both series converge absolutely (by Lemma 3.1.4). We may also rearrange the terms (since A and B commute), so we can collect the terms in which the power of A plus the power of B equals m:
exp(A) exp(B) = ∑_{m=0}^{∞} ∑_{k=0}^{m} (A^k / k!) (B^{m−k} / (m−k)!)

             = ∑_{m=0}^{∞} (1/m!) ∑_{k=0}^{m} (m! / (k!(m−k)!)) A^k B^{m−k}

             = ∑_{m=0}^{∞} (A + B)^m / m! = exp(A + B),

which concludes the lemma. □

E XAMPLE 3.1.7. In general, the identity (3.1.6) does not hold true for A ∈ Cn×n and B ∈ Cn×n which do not commute. For example, we choose

A = ( 0 1 )      B = ( 0 0 )
    ( 0 0 ),         ( 1 0 ).

One sees that

AB = ( 1 0 ) ≠ ( 0 0 ) = BA.
     ( 0 0 )   ( 0 1 )

Since A² = 0 and B² = 0, we see that

exp(A) = I + A = ( 1 1 )
                 ( 0 1 ),

exp(B) = I + B = ( 1 0 )
                 ( 1 1 ),

thus

exp(A) exp(B) = ( 2 1 ) ≠ ( 1 1 ) = exp(B) exp(A).
                ( 1 1 )   ( 1 2 )

Since (A + B)² = I, we see that

exp(A + B) = ∑_{m even} (1/m!) I + ∑_{m odd} (1/m!) ( 0 1 )
                                                    ( 1 0 )

           = ∑_{m=0}^{∞} (1/(2m)!) I + ∑_{m=0}^{∞} (1/(2m+1)!) ( 0 1 )
                                                               ( 1 0 )

           = cosh(1) I + sinh(1) ( 0 1 ) = ( cosh(1) sinh(1) )
                                 ( 1 0 )   ( sinh(1) cosh(1) ),

where

cosh x = (e^x + e^{−x})/2 ,    sinh x = (e^x − e^{−x})/2 .
In fact, cosh(1) ≈ 1.5431 and sinh(1) ≈ 1.1752, and we see that the three matrices

exp(A + B),    exp(A) exp(B),    exp(B) exp(A)

are pairwise distinct. It is also interesting to compare Lemma 3.1.6 with the Lie product formula
(Theorem 3.1.51) below.
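The computations of Example 3.1.7 can be reproduced numerically. A sketch assuming NumPy and SciPy are available (not part of the notes):

```python
import numpy as np
from scipy.linalg import expm

# The two non-commuting nilpotent matrices of Example 3.1.7.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])

eA, eB, eAB = expm(A), expm(B), expm(A + B)
c, s = np.cosh(1.0), np.sinh(1.0)

ok_A = np.allclose(eA, np.eye(2) + A)        # exp(A) = I + A since A^2 = 0
ok_B = np.allclose(eB, np.eye(2) + B)        # exp(B) = I + B since B^2 = 0
ok_AB = np.allclose(eAB, [[c, s], [s, c]])   # exp(A+B) = cosh(1) I + sinh(1) [[0,1],[1,0]]

# exp(A+B), exp(A)exp(B) and exp(B)exp(A) are pairwise distinct.
pairwise_distinct = (not np.allclose(eA @ eB, eB @ eA)
                     and not np.allclose(eAB, eA @ eB)
                     and not np.allclose(eAB, eB @ eA))
```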

The following are immediate consequences of Lemma 3.1.6.

C OROLLARY 3.1.8. Given any A ∈ Cn×n , one has

exp(αA) exp(β A) = exp((α + β )A) for all α, β ∈ C.

C OROLLARY 3.1.9. Given any A ∈ Cn×n , one has exp(A) ∈ GL(n, C) with

(exp(A))−1 = exp(−A).

It is worth mentioning the following theorem, although we do not use it in this course.

T HEOREM 3.1.10 ([Hal15, Theorem 2.10]). Each A ∈ GL(n, C) can be expressed as exp(B)
for some B ∈ Cn×n . In other words, the mapping B ∈ Cn×n → exp(B) ∈ GL(n, C) is surjective.

We now verify that (3.1.2) solves (3.1.1):



T HEOREM 3.1.11. Let A ∈ Cn×n . Then t 7→ exp(tA) is a smooth curve in Cn×n and
d
exp(tA) = A exp(tA) = exp(tA)A.
dt
P ROOF. It is well-known that one can differentiate a power series term by term within its radius
of convergence (see e.g. [Pug15, Theorem 12 in Chapter 4]). In view of Remark 3.1.5, our theorem immediately follows by differentiating the power series exp(tA). □
If we take A with ∥A∥ = 1, we see that Theorem 3.1.11 is nothing but a directional derivative. For each fixed 1 ≤ j0 , k0 ≤ n, if we choose A with entries

A_{jk} = 1 if j = j0 and k = k0 ,    and A_{jk} = 0 otherwise,
then the derivative in Theorem 3.1.11 is simply a partial derivative. In fact, the matrix exponential
map is (total) differentiable:

T HEOREM 3.1.12 ([Hal15, Theorem 2.16]). The matrix exponential map exp : Cn×n ≅ R^{2n²} → GL(n, C) is an infinitely differentiable map.

P ROOF. Fix any A ∈ Cn×n . Note that for each j and k, the quantity (A^m)_{jk} is a homogeneous polynomial of degree m in the entries of A. Thus, the series for the function (exp(A))_{jk} has the form of a multivariable power series on Cn×n ≅ R^{2n²} . Since the series converges on all of R^{2n²} (more precisely, the radius of convergence is ∞), it is permissible to differentiate the series term by term as many times as we wish (see e.g. [Pug15, Theorem 12 in Chapter 4]). □

3.1.1. Computations of the exponential. For A ∈ Cn×n , we denote by A( jk) ∈ C(n−1)×(n−1) the matrix obtained from A by crossing out the jth row and the kth column. We now define the determinant by induction.

D EFINITION 3.1.13. For A ∈ C1×1 ≅ C, we simply define det(A) := A. For each A ∈ Cn×n , we define

det(A) := ∑_{k=1}^{n} (−1)^{1+k} A_{1k} det(A(1k)).

T HEOREM 3.1.14 (Cofactor expansion [Tre17, Theorem 5.1]). For each A ∈ Cn×n and each fixed j = 1, · · · , n, we have

det(A) = ∑_{k=1}^{n} (−1)^{j+k} A_{jk} det(A( jk)),

that is, the determinant is independent of j. In addition, for each fixed k = 1, · · · , n, we have

det(A) = ∑_{j=1}^{n} (−1)^{j+k} A_{jk} det(A( jk)),

that is, the determinant is independent of k.



Let Sn be the set of permutation on the indices {1, · · · , n}, that is,

Sn = {bijective function σ : {1, · · · , n} → {1, · · · , n}}.

T HEOREM 3.1.15 ([Tre17, (4.2)]). The determinant of A ∈ Cn×n can be computed by

det(A) = ∑_{σ ∈ Sn} c_σ A_{σ(1),1} A_{σ(2),2} · · · A_{σ(n),n} ,

for some constants c_σ ∈ {−1, 1}.

R EMARK 3.1.16. For those who are familiar with abstract algebra, we also remark here that c_σ is exactly the sign of the permutation σ ∈ Sn , denoted by sign (σ ). In addition, it is also related to the
determinant via the formula
 
sign (σ ) = det eσ (1) · · · eσ (n) ,

where e j is the jth column of I.

It is important to mention the following properties:

T HEOREM 3.1.17 ([Tre17, Theorem 3.4 and Theorem 3.5]). Given any A ∈ Cn×n and B ∈ Cn×n ,
one has
det(A) = det(A⊺ ), det(AB) = det(A) det(B).

A matrix A ∈ Cn×n is said to be diagonal if A_{jk} = 0 for all j ̸= k. We denote by diag (λ1 , · · · , λn ) the diagonal matrix with entries λ1 , · · · , λn on the main diagonal. A matrix A ∈ Cn×n is said to be diagonalizable if there exists a matrix P ∈ GL(n, C) such that P−1 AP is diagonal. If we write

P−1 AP = diag (λ1 , · · · , λn ),

then by using (3.1.5) we can easily compute the exponential of a diagonalizable matrix A as

then by using (3.1.5) we can easily compute its exponential of a diagonalizable matrix A as

exp(A) = exp(Pdiag (λ1 , · · · , λn )P−1 )


(3.1.7) = P exp(diag (λ1 , · · · , λn ))P−1 = Pdiag (eλ1 , · · · , eλn ))P−1 .
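Formula (3.1.7) gives a practical way to compute exp(A) for a diagonalizable A. A minimal numerical sketch, assuming NumPy and SciPy are available (not part of the notes):

```python
import numpy as np
from scipy.linalg import expm

# A symmetric (hence diagonalizable) matrix with eigenvalues 1 and 3.
A = np.array([[2.0, 1.0], [1.0, 2.0]])

# A = P diag(lam) P^{-1}, so exp(A) = P diag(exp(lam)) P^{-1} as in (3.1.7).
lam, P = np.linalg.eig(A)
expA = P @ np.diag(np.exp(lam)) @ np.linalg.inv(P)

err = np.linalg.norm(expA - expm(A))
```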

A nontrivial vector p is said to be an eigenvector of A with eigenvalue λ if Ap = λ p. Since

λ is an eigenvalue of A ⇐⇒ det(A − λ I) = 0,

this suggests considering the characteristic polynomial

p(z) = det(zI − A) = z^n + ∑_{j=0}^{n−1} c_j z^j

for some c0 , · · · , cn−1 ∈ C. By the fundamental theorem of algebra (see e.g. [Kow23]), p has exactly n complex roots (counted with multiplicity). We also define
(3.1.8)    p(B) := B^n + ∑_{j=0}^{n−1} c_j B^j    for all B ∈ Cn×n .

It is important to mention the following theorem regarding the characteristic polynomial (3.1.8):

T HEOREM 3.1.18 (Cayley-Hamilton). If A ∈ Cn×n , then p(A) = 0.

A complex number λ is called a root of p if p(λ ) = 0. The multiplicity of this root is called the
algebraic multiplicity of the eigenvalue λ . There is another notion of multiplicity of an eigenvalue:
the dimension of the eigenspace

ker(λ I − A) := {p ∈ Cn : (λ I − A)p = 0}

is called the geometric multiplicity of the eigenvalue λ .

T HEOREM 3.1.19 ([Tre17, Theorem 2.8]). Let A ∈ Cn×n . Then A is diagonalizable if and only
if for each eigenvalue λ the dimension of the eigenspace ker(A − λ I) coincides with its algebraic
multiplicity.

E XAMPLE 3.1.20 (A nondiagonalizable matrix). We consider the matrix

A = ( 1 1 )
    ( 0 1 ).

Its characteristic polynomial is

p(z) = det(zI − A) = det ( z − 1    −1    ) = (z − 1)² ,
                         ( 0        z − 1 )

so A has an eigenvalue 1 of algebraic multiplicity 2. However, one sees that

dim ker(I − A) = dim ker ( 0 −1 ) = 1,
                         ( 0  0 )

which shows that A is not diagonalizable.

A set of vectors {p1 , · · · , pk } is said to be linearly independent if

∑_{i=1}^{k} ci pi = 0    =⇒    ci = 0 for all i = 1, · · · , k.

L EMMA 3.1.21 ([HS99, Lemma IV-1-2]). A matrix A ∈ Cn×n is diagonalizable if and only if A
has n linearly independent eigenvectors p1 , · · · , pn .

From Lemma 3.1.21, we immediately obtain the following corollary.

C OROLLARY 3.1.22. If A ∈ Cn×n has n distinct eigenvalues, then A is diagonalizable.



E XERCISE 3.1.23. Show that the set of diagonalizable n × n matrices is a proper subset (i.e. a subset which is not equal to the whole set) of Cn×n for all n ≥ 2. (Hint. modify the ideas in Example 3.1.20).

L EMMA 3.1.24. The set of diagonalizable n × n matrices is dense in Cn×n (in the sense of Exercise 3.1.3). In other words, given any A ∈ Cn×n , there exists a sequence of diagonalizable matrices Bk which converges to A.

P ROOF. By using Lemma 3.1.30, there exists P ∈ GL(n, C) such that B = P−1 AP is upper triangular. If we can show that there exists a sequence of diagonalizable matrices B̃k which converges to B, then from Exercise 3.1.1 and Exercise 3.1.3 we have

lim sup_{k→+∞} ∥PB̃k P−1 − A∥ = lim sup_{k→+∞} ∥P(B̃k − B)P−1 ∥ ≤ ∥P∥∥P−1 ∥ lim_{k→+∞} ∥B̃k − B∥ = 0,

which concludes the lemma with Bk = PB̃k P−1 . This can be done by setting

B̃k := B + diag (εk,1 , · · · , εk,n ),

where the quantities εk, j are chosen in such a way that the n numbers B11 + εk,1 , · · · , Bnn + εk,n are distinct and εk, j → 0 as k → +∞ for all j = 1, · · · , n; by Corollary 3.1.22 such B̃k (upper triangular with n distinct eigenvalues) are diagonalizable. □

E XERCISE 3.1.25. For each A ∈ Cn×n , show that det(exp(A)) = etr (A) . In addition, show that
tr (A) = λ1 + · · · + λn , where λ j ∈ C are the eigenvalues of A (counted with multiplicity).

However, in practice, it is not easy to check whether a matrix A is diagonalizable. There are some sufficient conditions which are relatively easy to check.

D EFINITION 3.1.26 ([Tre17, Section 6]). A matrix U ∈ Cn×n is called unitary if U ∗U = I. The set of unitary matrices is denoted by U(n, C), and is called the unitary group.

L EMMA 3.1.27. If U ∈ Cn×n is unitary, then U ∈ GL(n, C) with U −1 = U ∗ , which is also


unitary. In other words, U(n, C) ⊂ GL(n, C).

A matrix A ∈ Cn×n is called normal if A∗ A = AA∗ . A matrix A ∈ Cn×n is called Hermitian (or
self-adjoint) if A = A∗ . We write

(u, v)Cn := v ∗ u for all u, v ∈ Cn .

It is important to notice that

(3.1.9)    (v, u)Cn = u∗ v = (v ∗ u)∗ = ((u, v)Cn )∗    for all u, v ∈ Cn .

L EMMA 3.1.28. A is Hermitian if and only if (Au, v)Cn = (u, Av)Cn for all u, v ∈ Cn .

P ROOF. The lemma follows easily from the identities

(u, Av)Cn = (Av)∗ u = v ∗ A∗ u

and

(Au, v)Cn = v ∗ Au

for all A ∈ Cn×n as well as for all u, v ∈ Cn . □

In fact, all normal matrices are unitarily diagonalizable:

T HEOREM 3.1.29 ([Tre17, Theorem 2.4]). If A ∈ Cn×n is normal, then there exists a unitary
matrix U ∈ Cn×n such that D := U ∗ AU is diagonal. If A ∈ Cn×n is Hermitian, then D is real-valued.

We can still simplify an arbitrary matrix by using unitary matrices. A matrix A ∈ Cn×n is said to be upper triangular if A_{jk} = 0 for j > k.

L EMMA 3.1.30 (Schur representation [Tre17, Theorem 1.1 and Theorem 1.2]). For each A ∈
Cn×n , there exists a unitary matrix U ∈ Cn×n such that U ∗ AU is upper triangular. If A ∈ Rn×n and
all its eigenvalues are real, then we can choose U ∈ Rn×n .

In view of the power series of exp(A), it is also natural to study the following class of matrices.

D EFINITION 3.1.31. A matrix N ∈ Cn×n is said to be nilpotent if N n = 0.

R EMARK 3.1.32. If N is nilpotent, then exp(N) is simply a finite sum. One can directly verify that exp(N) is unipotent (i.e. exp(N) − I is nilpotent).

L EMMA 3.1.33. A matrix N ∈ Cn×n is nilpotent if and only if all eigenvalues of N are zero.

P ROOF. Let λ be an eigenvalue of a nilpotent matrix N with eigenvector p ̸= 0, i.e.

N p = λ p.

Then 0 = N^n p = λ^n p, which implies λ^n = 0. Since |λ|^n = |λ^n | = 0, we conclude that λ = 0. The converse can be easily verified as well. □

By using Schur’s representation (Lemma 3.1.30) and Lemma 3.1.33, we reach the following
lemma.

L EMMA 3.1.34. A matrix N ∈ Cn×n (resp. N ∈ Rn×n ) is nilpotent if and only if there exists a
unitary matrix U ∈ Cn×n (resp. U ∈ Rn×n ) such that T = U ∗ NU is upper triangular with T j j = 0 for
all j = 1, · · · , n.

We have already discussed a relation between the set of diagonalizable matrices and Cn×n in Lemma 3.1.24. The following theorem gives another relation between them.

T HEOREM 3.1.35 (Jordan-Chevalley decomposition [HS99, Theorem IV-1-11]). Let A ∈ Cn×n .


Then there exist a diagonalizable matrix D ∈ Cn×n and a nilpotent matrix N ∈ Cn×n such that

(3.1.10) A = D+N and DN = ND,

and the decomposition (3.1.10) is unique. If A ∈ Rn×n , then D ∈ Rn×n and N ∈ Rn×n .

Since the trace operator is linear and tr (N) = 0 (see Lemma 3.1.33), we immediately reach the
following corollary.

C OROLLARY 3.1.36. If A ∈ Cn×n satisfies tr (A) = 0, then the diagonalizable matrix D in


(3.1.10) satisfies tr (D) = 0.

E XERCISE 3.1.37. Let

A1 = ( 0 −a )      A2 = ( 0 a b )      A3 = ( a b )
     ( a  0 ),          ( 0 0 c ),          ( 0 a ).
                        ( 0 0 0 )

Compute exp(A1 ), exp(A2 ) and exp(A3 ).

E XERCISE 3.1.38. Show for any a, b, d ∈ C that

exp ( a b )  =  ( e^a    b (e^a − e^d)/(a − d) )
    ( 0 d )     ( 0      e^d                   ).

Since

lim_{a→d} (e^a − e^d)/(a − d) = e^d ,

we simply interpret (e^a − e^d)/(a − d) as e^a when d = a. (Hint. Show that

( a b )^m  =  ( a^m    b (a^m − d^m)/(a − d) )
( 0 d )       ( 0      d^m                   )

for all m ∈ N and a ̸= d.)
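The closed form in Exercise 3.1.38 can be sanity-checked numerically. A sketch assuming NumPy and SciPy are available (not part of the notes):

```python
import numpy as np
from scipy.linalg import expm

# Closed-form exponential of an upper triangular 2x2 matrix (a != d).
a, b, d = 1.3, 0.7, -0.4
M = np.array([[a, b], [0.0, d]])

predicted = np.array([[np.exp(a), b * (np.exp(a) - np.exp(d)) / (a - d)],
                      [0.0, np.exp(d)]])

err = np.linalg.norm(expm(M) - predicted)
```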

We now exhibit the general algorithm to compute the decomposition in Theorem 3.1.35.

Algorithm 2 Computation of D and N in Theorem 3.1.35
1: Input a matrix A ∈ Cn×n .
2: Compute the characteristic polynomial p(z) = det(zI − A).
3: Decompose 1/p(z) into partial fractions

1/p(z) = ∑_{j=1}^{k} Q j (z) / (z − λ j )^{m_j} ,

where for each j the quantity Q j (z) is a nonzero polynomial with deg(Q j ) ≤ m j − 1 and λ1 , · · · , λk are the distinct zeros of p (i.e. eigenvalues of A).
4: For each j = 1, · · · , k, define

Pj (A) := Q j (A) ∏_{ℓ ̸= j} (A − λℓ I)^{m_ℓ} .

5: Output D = λ1 P1 (A) + · · · + λk Pk (A) and N = A − D.

R EMARK 3.1.39. There always exists a unique decomposition in Step 3 of Algorithm 2, see e.g. [Kow23] for a proof. This algorithm follows the idea of the proof of Theorem 3.1.35. Here we also highlight that Pi (A)Pj (A) = 0 for all i ̸= j and Pj (A)² = Pj (A) for all j = 1, · · · , k, see [HS99, Lemma IV-1-9].

Combining Lemma 3.1.6 and Theorem 3.1.35, we reach

exp(A) = exp(D) exp(N) = exp(λ1 P1 (A)) · · · exp(λk Pk (A)) exp(N)


= exp(N) exp(D) = exp(N) exp(λ1 P1 (A)) · · · exp(λk Pk (A)).

E XAMPLE 3.1.40 ([HS99, Example IV-1-18]). The characteristic polynomial of the matrix

A = (  252    498    4134    698 )
    ( −234   −465   −3885   −656 )
    (   15     30     252     42 )
    (  −10    −20    −166    −25 )

is p(z) = (z − 4)²(z − 3)² . One can compute

1/p(z) = 1/(z − 4)² − 2/(z − 4) + 1/(z − 3)² + 2/(z − 3)
       = (1 − 2(z − 4))/(z − 4)² + (1 + 2(z − 3))/(z − 3)² .

Accordingly, we set

P1 (z) := (1 − 2(z − 4))(z − 3)² ,   λ1 = 4,
P2 (z) := (1 + 2(z − 3))(z − 4)² ,   λ2 = 3,

and we compute

P1 (A) = ( −1 −2  134  198 )      P2 (A) = (  2  2 −134 −198 )
         (  1  2 −125 −186 )               ( −1 −1  125  186 )
         (  0  0    9   12 )               (  0  0   −8  −12 )
         (  0  0   −6   −8 ),              (  0  0    6    9 ).

Therefore

S = λ1 P1 (A) + λ2 P2 (A) = ( 2 −2  134  198 )
                            ( 1  5 −125 −186 )
                            ( 0  0   12   12 )
                            ( 0  0   −6   −5 )

and

N = A − S = (  250  500  4000  500 )
            ( −235 −470 −3760 −470 )
            (   15   30   240   30 )
            (  −10  −20  −160  −20 ).
 
E XERCISE 3.1.41. Decompose the matrix

A = (  3  4  3 )
    (  2  7  4 )
    ( −4  8  3 )

by using Algorithm 2.

E XERCISE 3.1.42 (Bochner’s subordination). Let 0 < s < 1. By using integration by parts on Γ(1 − s), where Γ is the gamma function, show that

λ^s = (1/|Γ(−s)|) ∫_0^∞ (1 − e^{−tλ}) t^{−1−s} dt    for all λ > 0.

D EFINITION 3.1.43. A Hermitian matrix A ∈ Cn×n is said to be positive definite, denoted by A ≻ 0, if

p∗ Ap > 0 for all p ∈ Cn \ {0}.

By using Theorem 3.1.29, we see that A ≻ 0 if and only if

A = Udiag (λ1 , · · · , λn )U ∗

for some λ1 > 0, · · · , λn > 0 and unitary U ∈ Cn×n , and accordingly we define

As := Udiag (λ1s , · · · , λns )U ∗ .

Now using Bochner’s subordination (Exercise 3.1.42) and (3.1.7), we can compute As via the formula

(3.1.11)    A^s = (1/|Γ(−s)|) ∫_0^∞ (I − exp(−tA)) t^{−1−s} dt    for all A ≻ 0,

which gives an application of the matrix fundamental solution (3.1.2).
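The scalar identity behind (3.1.11) can be checked numerically. A sketch assuming SciPy is available (not part of the notes), with the hypothetical choices s = 1/2 and λ = 2:

```python
import numpy as np
from math import gamma
from scipy.integrate import quad

# Check: lam^s = (1/|Gamma(-s)|) * int_0^inf (1 - e^{-t*lam}) t^{-1-s} dt.
s, lam = 0.5, 2.0
integrand = lambda t: (1.0 - np.exp(-t * lam)) * t ** (-1.0 - s)

# Split at t = 1: near 0 the integrand behaves like lam * t^{-s}, integrable.
part0, _ = quad(integrand, 0.0, 1.0)
part1, _ = quad(integrand, 1.0, np.inf)

approx = (part0 + part1) / abs(gamma(-s))
err = abs(approx - lam ** s)
```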

R EMARK 3.1.44. The Fourier transform suggests that we (formally) replace A by −∆ in (3.1.11), and we reach the fractional Laplacian

(−∆)^s := (1/|Γ(−s)|) ∫_0^∞ (I − e^{t∆}) t^{−1−s} dt.

One can see e.g. [Kwa17] for an introduction.

3.1.2. The matrix logarithm. We now wish to define a matrix logarithm, which should be an inverse function to the matrix exponential. The simplest way to define the matrix logarithm is by a power series. We recall the following fact concerning the principal branch of the complex logarithm (see e.g. [Kow23] for more details):

L EMMA 3.1.45. The radius of convergence of the complex power series

(3.1.12)    log z := ∑_{m=1}^{∞} (−1)^{m+1} (z − 1)^m / m

is 1; in other words, the series (3.1.12) is defined and holomorphic in the disk of radius 1 about
z = 1. In addition, we have

elog z = z for all z with |z − 1| < 1.

Moreover, we have

|eu − 1| < 1 and log eu = u for all u with |u| < log 2.

Based on the above lemma, we can now define the matrix logarithm by the following theorem.

T HEOREM 3.1.46. [Hal15, Theorem 2.8] The matrix logarithm

log(A) := ∑_{m=1}^{∞} (−1)^{m+1} (A − I)^m / m

is defined and continuous on the set of all matrices A ∈ Cn×n with ∥A − I∥ < 1. In addition, we
have
exp(log(A)) = A
for all matrices A ∈ Cn×n with ∥A − I∥ < 1. Moreover, we have ∥ exp(B) − I∥ < 1 and

log(exp(B)) = B

for all matrices B ∈ Cn×n with ∥B∥ < log 2.

P ROOF. By using Exercise 3.1.1, we have ∥(A − I)m ∥ ≤ ∥A − I∥m ; by arguments similar to those in Lemma 3.1.4, one can show that log(A) is defined and continuous on the set of all matrices A ∈ Cn×n with ∥A − I∥ < 1. We leave the details to the reader as an exercise.
Let A ∈ Cn×n with ∥A − I∥ < 1. By using Lemma 3.1.24, one can find a sequence of diagonalizable matrices Ak ∈ Cn×n such that ∥Ak − A∥ → 0 as k → ∞. Since ∥Ak − I∥ < 1 for all
sufficiently large k, we know that

log(Ak ) = ∑_{m=1}^{∞} (−1)^{m+1} (Ak − I)^m / m
is defined and continuous for all sufficiently large k. We write

Ak = Qk diag (λk,1 , · · · , λk,n ) Qk^{−1} ,

and we see that

(Ak − I)^m = Qk diag ((λk,1 − 1)^m , · · · , (λk,n − 1)^m ) Qk^{−1} ,

thus

log(Ak ) = Qk ( ∑_{m=1}^{∞} (−1)^{m+1} diag ((λk,1 − 1)^m , · · · , (λk,n − 1)^m ) / m ) Qk^{−1}
         = Qk diag (log(λk,1 ), · · · , log(λk,n )) Qk^{−1} .

Now by using (3.1.5) and Lemma 3.1.45 we see that

exp(log(Ak )) = Qk exp (diag (log(λk,1 ), · · · , log(λk,n ))) Qk^{−1}
             = Qk diag (exp(log(λk,1 )), · · · , exp(log(λk,n ))) Qk^{−1}
             = Qk diag (λk,1 , · · · , λk,n ) Qk^{−1} = Ak .

Finally, by continuity of the mapping A 7→ log(A) and B 7→ exp(B), we conclude

exp(log(A)) = A

by taking k → +∞.
Now, if ∥B∥ < log 2, then using Exercise 3.1.1 we see that

∥ exp(B) − I∥ = ∥ ∑_{m=1}^{∞} B^m / m! ∥ ≤ ∑_{m=1}^{∞} ∥B∥^m / m! = e^{∥B∥} − 1 < 1,

thus log(exp(B)) is well-defined. The proof of log(exp(B)) = B is very similar to the proof of exp(log(A)) = A, therefore we leave the details to the reader as an exercise. □
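The identity log(exp(B)) = B for ∥B∥ < log 2 can be checked by truncating both power series. A sketch assuming NumPy is available (not part of the notes):

```python
import numpy as np

# Partial sums of the exponential series (3.1.4).
def exp_series(M, N=60):
    term = np.eye(M.shape[0])
    total = term.copy()
    for m in range(1, N + 1):
        term = term @ M / m
        total = total + term
    return total

# Partial sums of the logarithm series; needs ||M - I|| < 1 to converge.
def log_series(M, N=200):
    X = M - np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    total = np.zeros_like(M)
    for m in range(1, N + 1):
        term = term @ X                       # X^m
        total = total + ((-1.0) ** (m + 1) / m) * term
    return total

B = np.array([[0.1, 0.2], [0.05, -0.1]])      # Hilbert-Schmidt norm 0.25 < log 2
err = np.linalg.norm(log_series(exp_series(B)) - B)
```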

R EMARK 3.1.47. If A is unipotent (i.e. A − I is nilpotent), then log(A) is simply a finite sum,
which can be defined without the assumption ∥A − I∥ < 1. In this case, one can easily verify that
log(A) is nilpotent. See also Remark 3.1.32.

E XERCISE 3.1.48. Show that:


(a) If A is unipotent, then exp(log(A)) = A.
(b) If B is nilpotent, then log(exp(B)) = B.
(Hint. Let A(t) := I + t(A − I) and show that exp(log(A(t))) depends polynomially on t and that
exp(log(A(t))) = A(t) for all sufficiently small t)

E XERCISE 3.1.49. Show that there exists a constant c > 0 such that

(3.1.13) ∥ log(I + A) − A∥ ≤ c∥A∥2

holds true for all A ∈ Cn×n with ∥A∥ ≤ 1/2.

R EMARK 3.1.50. We may restate (3.1.13) by saying that

log(I + A) = A + O(∥A∥2 ),

where O(∥A∥2 ) denotes a quantity of order ∥A∥2 , i.e. a quantity that is bounded by a constant times
∥A∥2 for all sufficiently small values of ∥A∥.

3.1.3. One parameter subgroup, Lie group and Lie algebra. We now exhibit a result
involving the exponential of a matrix that will be important in the study of Lie algebras.

T HEOREM 3.1.51 (Lie product formula [Hal15, Theorem 2.11]). For each A, B ∈ Cn×n , we
have
exp(A + B) = lim (exp(A/m) exp(B/m))m .
m→∞

P ROOF. By multiplying the power series for exp(A/m) and exp(B/m), one sees that

exp(A/m) exp(B/m) = I + A/m + B/m + O(m^{−2} ).

Now since exp(A/m) exp(B/m) → I as m → ∞, log(exp(A/m) exp(B/m)) is well-defined for all sufficiently large m. By using Exercise 3.1.49 (applied with A replaced by A/m + B/m + O(m^{−2} )), we see that

log (exp(A/m) exp(B/m)) = log (I + A/m + B/m + O(m^{−2} ))
                        = A/m + B/m + O(m^{−2} ) + O(∥A/m + B/m + O(m^{−2} )∥² )
                        = A/m + B/m + O(m^{−2} ).

Now Theorem 3.1.46 guarantees that

exp(A/m) exp(B/m) = exp (A/m + B/m + O(m^{−2} )),
therefore,
(exp(A/m) exp(B/m))^m = exp (A + B + O(m^{−1} )).
By the continuity of the exponential, we conclude our theorem by taking m → +∞. □
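The O(m^{−1}) rate in the proof can be observed numerically. A sketch assuming NumPy and SciPy are available (not part of the notes), using the non-commuting pair from Example 3.1.7:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])

# (exp(A/m) exp(B/m))^m should approach exp(A+B) as m grows.
def lie_product(m):
    return np.linalg.matrix_power(expm(A / m) @ expm(B / m), m)

target = expm(A + B)
errs = [np.linalg.norm(lie_product(m) - target) for m in (1, 10, 100, 1000)]
```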

R EMARK 3.1.52. There is a version of this result, known as the Trotter product formula, which
holds for suitable unbounded operators on an infinite-dimensional Hilbert space, see e.g. [Hal13,
Theorem 20.1].

D EFINITION 3.1.53. We call {Y (t)}t∈R a one parameter subgroup of GL(n, C) if


(a) Y : R → GL(n, C) is continuous;
(b) Y (0) = I; and
(c) Y (t + s) = Y (t)Y (s) for all t, s ∈ R.

E XAMPLE 3.1.54. The fundamental matrix solution {exp(tA)}t∈R given in (3.1.2) forms a
one parameter subgroup, where (a) is verified by Theorem 3.1.11, (b) can be found in the basic
properties (3.1.5), and (c) is a special case of Corollary 3.1.8.

We now prove the following lemma.

L EMMA 3.1.55. Let 0 < ε < log 2, let Bε/2 := {A ∈ Cn×n : ∥A∥ < ε/2} and let

exp(Bε/2 ) := {exp(A) ∈ Cn×n : ∥A∥ < ε/2}.

Given any A ∈ exp(Bε/2 ), there exists a unique √A ∈ exp(Bε/2 ) such that (√A)² = A, which is given by

√A = exp((1/2) log(A)).
P ROOF. By using Lemma 3.1.6, one sees that

(√A)² = exp((1/2) log(A) + (1/2) log(A)) = exp(log(A)) = A.
To establish uniqueness, suppose B ∈ exp(Bε/2 ) satisfies B² = A. Now using Lemma 3.1.6 we see that

exp(2 log(B)) = exp(log(B) + log(B)) = exp(log(B)) exp(log(B)) = B² = A.

Since B ∈ exp(Bε/2 ), then we can check that ∥2 log(B)∥ < ε < log 2. Now we can use
Theorem 3.1.46 to see that

log(A) = log(exp(2 log(B))) = 2 log(B),

thus log(B) = (1/2) log(A). Finally, again using Theorem 3.1.46 we conclude that

B = exp(log(B)) = exp((1/2) log(A)) = √A,

which proves uniqueness. □
We now show that Example 3.1.54 already exhibits all one parameter subgroups.

T HEOREM 3.1.56 ([Hal15, Theorem 2.14]). If {Y (t)}t∈R is a one parameter subgroup of


GL(n, C), then there exists a unique A ∈ Cn×n such that

Y (t) = etA for all t ∈ R.



P ROOF. We first prove the uniqueness of such A. Suppose that e^{tA} = e^{tB} for all t ∈ R; by using Theorem 3.1.11, we can differentiate the power series of the matrix exponential term by term and see that

A = (d/dt) e^{tA} |_{t=0} = (d/dt) e^{tB} |_{t=0} = B.
Since the function log : exp(Bε/2 ) → Bε/2 is bijective and continuous, we see that exp(Bε/2 ) is
an open set in GL(n, C). Since Y (0) = I ∈ exp(Bε/2 ) and t 7→ Y (t) is continuous, then there exists
t0 > 0 such that
Y (t) ∈ exp(Bε/2 ) for all t ∈ [−t0 ,t0 ].
Now we define
1
A := log(Y (t0 )),
t0
so that t0 A = log(Y (t0 )). Now we see that t0 A ∈ Bε/2 and thus Theorem 3.1.46 allows us to apply
matrix exponential on the identity t0 A = log(Y (t0 )) to see that

exp(t0 A) = exp(log(Y (t0 ))) = Y (t0 ).

Now Y (t0 /2) is again in exp(Bε/2 ) and by Definition 3.1.53(c) we have Y (t0 /2)² = Y (t0 ). By using Lemma 3.1.55 we see that

Y (t0 /2) = √(Y (t0 )) = exp((1/2) log(Y (t0 ))) = exp(t0 A/2).
Applying this argument repeatedly, we conclude that

Y (t0 /2k ) = exp(t0 A/2k ) for all k ∈ N.

Now by Definition 3.1.53(c) and Lemma 3.1.6 we see that

Y (mt0 /2k ) = Y (t0 /2k )m = exp(mt0 A/2k ) for all k ∈ N and m ∈ Z,

in other words,

Y (t) = exp(tA) for all t ∈ R of the form t = (m/2^k ) t0 .
Since the set {t ∈ R : t = (m/2^k ) t0 for some k ∈ N and m ∈ Z} is dense in R, by continuity of t 7→ Y (t) and t 7→ exp(tA), it follows that Y (t) = exp(tA) for all t ∈ R, which concludes our theorem. □
Theorem 3.1.56 says that there is a one-to-one correspondence between Cn×n and the collection of one parameter subgroups of GL(n, C). This also suggests extending Definition 3.1.53 as follows:
D EFINITION 3.1.57. Let G be a matrix Lie group, i.e. a subgroup of GL(n, C) with respect to
matrix multiplication. In other words, G is a subset of GL(n, C) satisfying
• AB ∈ G for all A ∈ G and B ∈ G ;
• I ∈ G ; and
• A−1 ∈ G for all A ∈ G .

We call {Y (t)}t∈R a one parameter subgroup of G if


(a) Y : R → G is continuous;
(b) Y (0) = I; and
(c) Y (t + s) = Y (t)Y (s) for all t, s ∈ R.

E XAMPLE 3.1.58. The trivial subgroup {Y (t) = I}t∈R is a one parameter subgroup of G , so one parameter subgroups always exist.

We are now able to give some examples. By using Lemma 3.1.27, it is easy to check that the unitary group U(n, C) is a matrix Lie group.

T HEOREM 3.1.59. If {Y (t)}t∈R is a one parameter subgroup of U(n, C), then there exists a
unique A ∈ Cn×n which is skew-Hermitian (i.e. A∗ = −A) such that

Y (t) = exp(tA) for all t ∈ R.

P ROOF. It is easy to check that

exp(tA∗ ) = exp((tA)∗ ) = I + (tA)∗ + ((tA)∗ )²/2! + · · ·
          = (I + tA + (tA)²/2! + · · · )∗ = (exp(tA))∗ for all t ∈ R.
If A ∈ Cn×n is skew-Hermitian, then by Corollary 3.1.9 we see that

(exp(tA))∗ = exp(tA∗ ) = exp(−tA) = (exp(tA))−1 for all t ∈ R,

which shows that exp(tA) ∈ U(n, C) for all t ∈ R, that is, {exp(tA)}t∈R forms a one parameter
subgroup of U(n, C). Now let {Y (t)}t∈R be a one parameter subgroup of U(n, C). Since {Y (t)}t∈R
is also a one parameter subgroup of GL(n, C), then by Theorem 3.1.56 there exists a matrix B ∈
Cn×n such that
Y (t) = exp(tB) for all t ∈ R.
Since exp(tB) = Y (t) ∈ U(n, C) for all t ∈ R, then by Corollary 3.1.9 we see that

exp(tB∗ ) = (exp(tB))∗ = (exp(tB))−1 = exp(−tB).

Now by Theorem 3.1.46 we see that

tB∗ = log(exp(tB∗ )) = log(exp(−tB)) = −tB for all t with small |t|,

which shows that B is skew-Hermitian. □
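The forward direction of Theorem 3.1.59 is easy to check numerically: for a skew-Hermitian A, the matrices exp(tA) are unitary, and det(exp(A)) = e^{tr(A)} as in Exercise 3.1.25. A sketch assuming NumPy and SciPy are available (not part of the notes):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = X - X.conj().T                     # A* = -A, i.e. A is skew-Hermitian

# exp(tA)* exp(tA) = I for several values of t.
unitary_ok = all(
    np.allclose(expm(t * A).conj().T @ expm(t * A), np.eye(3))
    for t in (-1.0, 0.5, 2.0)
)

# det(exp(A)) = e^{tr(A)}; here tr(A) is purely imaginary.
det_ok = np.allclose(np.linalg.det(expm(A)), np.exp(np.trace(A)))
```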

It is not difficult to see that the special linear group SL(n, C) := {A ∈ Cn×n : det(A) = 1} is a
matrix Lie group.

T HEOREM 3.1.60. If {Y (t)}t∈R is a one parameter subgroup of SL(n, C), then there exists a
unique A ∈ Cn×n with tr (A) = 0 such that

Y (t) = exp(tA) for all t ∈ R.

R EMARK 3.1.61. See also Corollary 3.1.36. It is interesting to mention that, if A ∈ Cn×n satisfies tr (A) = 0, then there exist matrices X ∈ Cn×n and Y ∈ Cn×n such that A = XY − Y X, where X is Hermitian and tr (Y ) = 0.

P ROOF. Let A ∈ Cn×n with tr (A) = 0. By using Exercise 3.1.25, one can check that

det(exp(tA)) = etr (tA) = e0 = 1 for all t ∈ R,

that is, {exp(tA)}t∈R forms a one parameter subgroup of SL(n, C). Now let {Y (t)}t∈R be a one
parameter subgroup of SL(n, C). By Theorem 3.1.56 there exists a matrix A ∈ Cn×n such that

Y (t) = exp(tA) for all t ∈ R.

Again by Exercise 3.1.25 we see that

1 = det(Y (t)) = e^{t tr (A)} for all t ∈ R;

differentiating at t = 0 yields tr (A) = 0, which concludes our theorem. □

One can also check that the special unitary group SU(n, C) := U(n, C) ∩ SL(n, C) is a matrix Lie group. Imitating the proofs of Theorem 3.1.59 and Theorem 3.1.60, one can easily check the following corollary.

C OROLLARY 3.1.62. If {Y (t)}t∈R is a one parameter subgroup of SU(n, C), then there exists
a unique skew-Hermitian matrix A ∈ Cn×n with tr (A) = 0 such that

Y (t) = exp(tA) for all t ∈ R.

We now define the following sets:

gl(n, C) := Cn×n ,
u(n, C) := {A ∈ Cn×n : A∗ = −A},
sl(n, C) := {A ∈ Cn×n : tr (A) = 0},
su(n, C) := u(n, C) ∩ sl(n, C).

We point out that Theorem 3.1.56, Theorem 3.1.59, Theorem 3.1.60 and Corollary 3.1.62 say that
one has the following one-to-one correspondence:

GL(n, C) ↔ gl(n, C),


U(n, C) ↔ u(n, C),
SL(n, C) ↔ sl(n, C),
SU(n, C) ↔ su(n, C).

Now it is also natural to introduce the following terminologies:

D EFINITION 3.1.63. Let G be a matrix Lie group (see Definition 3.1.57). The Lie algebra of
G, denoted g, is the set of all matrices X such that etX ∈ G for all t ∈ R.

We now summarize the above examples in the following table, see [Hal15] for more examples.

matrix Lie group G Lie algebra g


GL(n, C) gl(n, C)
U(n, C) u(n, C)
SL(n, C) sl(n, C)
SU(n, C) su(n, C)

TABLE 1. Some examples of Lie algebra g of matrix Lie group G

We define the commutator by

[A, B] := AB − BA for all A, B ∈ Cn×n .

Note that A and B commute if and only if [A, B] = 0. The following theorem exhibits some basic properties of Lie algebras.

T HEOREM 3.1.64 ([Hal15, Theorem 3.20]). Let G be a matrix Lie group with Lie algebra g.
The following holds:
(a) AgA^{−1} ⊂ g for all A ∈ G.1
(b) g is R-linear, that is, aA + bB ∈ g for all A, B ∈ g and a, b ∈ R.
(c) [A, B] ∈ g for all A, B ∈ g.

The proof of Theorem 3.1.64(a) is easy; we leave the details to the reader as an exercise.

P ROOF OF T HEOREM 3.1.64( B ). By definition, for each A ∈ g, it is easy to check that tA ∈ g for all t ∈ R. For each A, B ∈ g and for each m ∈ N, we see that

(exp(tA/m) exp(tB/m))^m ∈ G.
1In other words, for each B ∈ g, one has ABA−1 ∈ g for all A ∈ G.

Now using the Lie product formula (Theorem 3.1.51), together with the closedness of G, we conclude that

exp(t(A + B)) = lim_{m→∞} (exp(tA/m) exp(tB/m))^m ∈ G,

which proves (b). □
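The Lie product formula (Theorem 3.1.51) invoked above can itself be checked numerically. A sketch assuming SciPy, with two arbitrary non-commuting matrices chosen for illustration:

```python
import numpy as np
from scipy.linalg import expm

# Two non-commuting matrices (arbitrary illustrative choice).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
target = expm(A + B)

def trotter_error(m):
    # Error of the Lie product approximant (exp(A/m) exp(B/m))^m.
    approx = np.linalg.matrix_power(expm(A / m) @ expm(B / m), m)
    return np.linalg.norm(approx - target)

errs = {m: trotter_error(m) for m in (10, 100, 1000)}
```

The error decreases as m grows, consistent with the convergence (exp(A/m) exp(B/m))^m → exp(A + B).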
P ROOF OF T HEOREM 3.1.64( C ). By using (a), one sees that exp(tA)B exp(−tA) ∈ g for all t ∈ R, and by (b) we see that

(exp(tA)B exp(−tA) − B)/t ∈ g for all t ∈ R \ {0},

thus (since g is an R-linear subspace of C^{n×n} by (b), hence closed)

(d/dt)(exp(tA)B exp(−tA)) |_{t=0} = lim_{t→0} (exp(tA)B exp(−tA) − B)/t ∈ g.

Now using the product rule we see that

(d/dt)(exp(tA)B exp(−tA)) |_{t=0} = ( A exp(tA)B exp(−tA) + exp(tA)B(−A exp(−tA)) ) |_{t=0} = AB − BA = [A, B],

which completes the proof of (c). □

D EFINITION 3.1.65. Let G and H be matrix Lie groups. A map Φ : G → H is called a Lie group
homomorphism if:
(a) Φ : G → H is a group homomorphism2;
(b) Φ : G → H is continuous.
In addition, if Φ : G → H is bijective and its inverse Φ−1 : H → G is continuous, then Φ is called a
Lie group isomorphism.

The following theorem tells us that a Lie group homomorphism between two Lie groups gives
rise in a natural way to a map between the corresponding Lie algebras.

T HEOREM 3.1.66 ([Hal15, Theorem 3.28]). Let G and H be matrix Lie groups, with Lie
algebras g and h, respectively. Suppose that Φ : G → H is a Lie group homomorphism. Then
there exists a unique R-linear map φ : g → h such that

Φ(exp(A)) = exp(φ (A)) for all A ∈ g,

which satisfies
(a) φ(BAB^{−1}) = Φ(B)φ(A)Φ(B)^{−1} for all A ∈ g and B ∈ G;
(b) φ ([A, B]) = [φ (A), φ (B)] for all A, B ∈ g.
2This means that Φ(A)Φ(B) = Φ(AB) for all A, B ∈ G. Here Φ(A)Φ(B) is matrix multiplication in H, while AB is
matrix multiplication in G.

(c) φ(A) = (d/dt) Φ(exp(tA)) |_{t=0} for all A ∈ g.
If, in addition, Φ : G → H is a Lie group isomorphism, then such R-linear map φ : g → h is bijective.

D EFINITION 3.1.67. We call such a mapping φ : g → h the associated Lie algebra homomorphism of the Lie group homomorphism Φ : G → H.

The proof of Theorem 3.1.66 is similar to that of Theorem 3.1.64; we leave the details to the reader as an exercise.

D EFINITION 3.1.68. Let G be a matrix Lie group. A representation of a matrix Lie group G is
a Lie group homomorphism
Π : G → GL(n, C).
A representation of a Lie algebra g is a Lie algebra homomorphism

π : g → gl(n, C).

In this note, we only deal with C^{n×n}. The notions in this section can be extended to abstract vector spaces; this is related to group representation theory, see e.g. the monograph [Hal15] for further details.

3.2. Homogeneous ODE with variable coefficients

In this section, we explain the basic results concerning the structure of solutions of a
homogeneous system of ODE given by

(3.2.1) y ′ (t) = A(t)y(t), y(t0 ) = p = (p1 , · · · , pn ).

We assume that A is continuous near t0. By using the fundamental theorem of ODE (Theorem 2.1.5), there exists a unique C^1-solution Y such that

(3.2.2) Y′(t) = A(t)Y(t) near t0, Y(t0) = I.

By using det(Y (t0 )) = 1 and the continuity of det : Cn×n → C, one sees that det(Y (t)) ̸= 0 for all t
near t0 , i.e.
Y (t) ∈ GL(n, C) for all t near t0 .
We call such a Y(t) a fundamental matrix solution near t0. Furthermore, the columns y1(t), · · · , yn(t) of Y(t), which form a linearly independent set in C^n near t0, are called a fundamental set of n linearly independent solutions near t0. We see that

Y(t)p = Σ_{j=1}^{n} p_j y_j(t)

is the unique solution of (3.2.1) near t0 . By arbitrariness of p, we can rephrase the above in the
following theorem.

T HEOREM 3.2.1. If A is continuous near t0 , then the set of all solutions of the ODE y ′ (t) =
A(t)y(t) near t0 forms an n-dimensional vector space over C.

Similarly, by using the fundamental theorem of ODE (Theorem 2.1.5), there exists a unique C^1-solution Z such that

(3.2.3) Z′(t) = −Z(t)A(t) near t0, Z(t0) = I,

which satisfies
Z(t) ∈ GL(n, C) for all t near t0 .
By using the product rule, one sees that

(d/dt)(Z(t)Y(t)) = Z′(t)Y(t) + Z(t)Y′(t) = −Z(t)A(t)Y(t) + Z(t)A(t)Y(t) = 0

and Z(t0 )Y (t0 ) = I. Hence we see that


(d/dt)(Z(t)Y(t) − I) = 0, (Z(t)Y(t) − I) |_{t=t0} = 0.

By using (uniqueness part in) the fundamental theorem of ODE (Theorem 2.1.5), we now see that

Z(t)Y (t) − I = 0 for all t near t0 ,

and hence
Y (t) = Z(t)−1 ∈ GL(n, C) for all t near t0 .
We refer to (3.2.3) as the adjoint problem of (3.2.2).
We now want to compute Y(t) in terms of A(t). In the case when A(t) ∈ C^{1×1} ≅ C, i.e. when the ODE (3.2.2) is scalar (n = 1), by using the fundamental theorem of calculus we can easily obtain
(3.2.4) Y(t) = exp( ∫_{t0}^{t} A(s) ds ) for all t near t0.

When n ≥ 2, the situation becomes tricky: by using the product rule, we see that

Y′(t) = ( exp( ∫_{t0}^{t} A(s) ds ) )′
      = ( I + ∫_{t0}^{t} A(s) ds + (1/2!)( ∫_{t0}^{t} A(s) ds )² + · · · )′
      = A(t) + (1/2!)( ( ∫_{t0}^{t} A(s) ds )² )′ + · · ·
      = A(t) + (1/2!)( A(t) ∫_{t0}^{t} A(s) ds + ( ∫_{t0}^{t} A(s) ds ) A(t) ) + · · · .

If

(3.2.5) A(t) and ∫_{t0}^{t} A(s) ds commute,

then we reach

Y′(t) = A(t) + A(t) ∫_{t0}^{t} A(s) ds + (1/2!) A(t) ( ∫_{t0}^{t} A(s) ds )² + · · · = A(t)Y(t).

In addition, by using Exercise 3.1.25, from (3.2.4) we see that

(3.2.6) det(Y(t)) = exp( ∫_{t0}^{t} tr(A(s)) ds ) for all t near t0.

E XAMPLE 3.2.2. A(t) = diag(λ1(t), · · · , λn(t)) for some scalar functions λ1, · · · , λn which are continuous near t0. In this case A(t) and ∫_{t0}^{t} A(s) ds are both diagonal, so they commute, (3.2.5) holds, and (3.2.4) gives Y(t) = diag( exp(∫_{t0}^{t} λ1(s) ds), · · · , exp(∫_{t0}^{t} λn(s) ds) ).

We remind the readers that the existence of the unique fundamental matrix solution Y(t) does not require the additional assumption (3.2.5). The requirement (3.2.5) is quite restrictive: in general we do not know an explicit formula for the unique fundamental matrix solution Y(t). For this reason, it is worth mentioning the following theorem.

T HEOREM 3.2.3 (Abel’s formula [HS99, Remark IV-2-7(4)]). If A is continuous near t0 , then
the fundamental matrix solution Y (t) satisfying (3.2.2) satisfies (3.2.6).

D EFINITION 3.2.4. The quantity W(t) := det(Y(t)) is called the Wronskian, and the Abel formula (3.2.6) reads

(3.2.7) W(t) = exp( ∫_{t0}^{t} tr(A(s)) ds ) for all t near t0.

P ROOF OF T HEOREM 3.2.3. Using the product rule and Theorem 3.1.15, we see that

(d/dt)(det(Y(t))) = Σ_{σ∈Sn} sign(σ) ( (d/dt)Y_{σ(1),1}(t) ) Y_{σ(2),2}(t) · · · Y_{σ(n),n}(t)
                    + · · ·
                    + Σ_{σ∈Sn} sign(σ) Y_{σ(1),1}(t) Y_{σ(2),2}(t) · · · ( (d/dt)Y_{σ(n),n}(t) ).

Since Y′(t) = A(t)Y(t), we reach

(d/dt)(det(Y(t)))
= Σ_{σ∈Sn} sign(σ) (A(t)Y(t))_{σ(1),1} Y_{σ(2),2}(t) · · · Y_{σ(n),n}(t)
  + · · ·
  + Σ_{σ∈Sn} sign(σ) Y_{σ(1),1}(t) · · · Y_{σ(n−1),n−1}(t) (A(t)Y(t))_{σ(n),n}
= Σ_{σ∈Sn} sign(σ) (a_{σ(1)}(t)y1(t)) (e⊺_{σ(2)} y2(t)) · · · (e⊺_{σ(n)} yn(t))
  + · · ·
  + Σ_{σ∈Sn} sign(σ) (e⊺_{σ(1)} y1(t)) · · · (e⊺_{σ(n−1)} y_{n−1}(t)) (a_{σ(n)}(t)yn(t))
= det(M1(t)Y(t)) + · · · + det(Mn(t)Y(t)),

where a_j(t) denotes the jth row of A(t), y_j(t) denotes the jth column of Y(t), and M_j(t) denotes the identity matrix with its jth row e⊺_j replaced by a_j(t). Now we use Theorem 3.1.17 to see that

(d/dt)(det(Y(t))) = det(M1(t)) det(Y(t)) + · · · + det(Mn(t)) det(Y(t))
                  = a11(t) det(Y(t)) + · · · + ann(t) det(Y(t))
                  = tr(A(t)) det(Y(t)),

which shows that det(Y(t)) satisfies a scalar ODE of the form (3.2.2) with coefficient tr(A(t)). Since det(Y(t0)) = 1, by using the fundamental theorem of ODE (Theorem 2.1.5), we conclude that the unique solution of this scalar ODE is given by (3.2.6), and we complete the proof of the theorem. □
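Abel's formula is easy to test numerically, even for an A(t) that does not satisfy the commutativity assumption (3.2.5). A sketch assuming SciPy, with an arbitrary illustrative A(t) and t0 = 0:

```python
import numpy as np
from scipy.integrate import solve_ivp

# A (hypothetical) variable-coefficient matrix for which A(t) and its
# antiderivative do NOT commute, so (3.2.4) is not available.
def A(t):
    return np.array([[np.sin(t), 1.0],
                     [0.0, t]])

# Integrate Y'(t) = A(t) Y(t), Y(0) = I, entrywise.
def rhs(t, y):
    return (A(t) @ y.reshape(2, 2)).ravel()

t1 = 1.5
sol = solve_ivp(rhs, (0.0, t1), np.eye(2).ravel(), rtol=1e-10, atol=1e-12)
Y_t1 = sol.y[:, -1].reshape(2, 2)

# Abel: det(Y(t)) = exp(int_0^t tr(A(s)) ds); here tr(A(s)) = sin(s) + s,
# so the integral equals (1 - cos(t)) + t^2 / 2.
predicted = np.exp((1.0 - np.cos(t1)) + t1 ** 2 / 2.0)
abel_err = abs(np.linalg.det(Y_t1) - predicted)
```

The determinant of the numerically computed fundamental matrix matches the prediction (3.2.7) to integrator accuracy.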

3.3. Nonhomogeneous equations

In this section, we explain how to solve an initial-value problem

(3.3.1) y ′ (t) = A(t)y(t) + b(t), y(t0 ) = p.

Here we assume that both A and b are continuous for all t near t0 . Let Y (t) = Y (t;t0 ) ∈ GL(n, C)
be the fundamental matrix solution satisfying

Y ′ (t) = A(t)Y (t), Y (t0 ) = I,



which was mentioned in previous sections (Section 3.1 and Section 3.2).
By plugging the ansatz

y(t) = Y(t)z(t) for all t near t0

into (3.3.1) and by using the product rule, we see that

A(t)Y(t)z(t) + Y(t)z′(t) = Y′(t)z(t) + Y(t)z′(t) = y′(t) = A(t)y(t) + b(t) = A(t)Y(t)z(t) + b(t),

and the terms A(t)Y(t)z(t) on both sides cancel, leaving Y(t)z′(t) = b(t).

Multiplying the above equation by (Y(t))^{−1} ∈ GL(n, C), we now see that

(3.3.2) z ′ (t) = (Y (t))−1 b(t), z(t0 ) = p,

and its unique solution (guaranteed by the fundamental theorem of ODE (Theorem 2.1.5)) is given
by

z(t) = p + ∫_{t0}^{t} (Y(s))^{−1} b(s) ds for all t near t0.
Now we see that the unique solution y(t) of (3.3.1) is given by

y(t) = Y(t; t0) ( p + ∫_{t0}^{t} (Y(s; t0))^{−1} b(s) ds ) for all t near t0.

R EMARK 3.3.1. The numerical computation of (Y(t))^{−1}b(t) is quite fundamental, but keep in mind that one never computes the inverse (Y(t))^{−1} directly (doing so is both inaccurate and slow). In practice, one solves the linear system Y(t)x = b(t) by an iterative algorithm such as GMRES, conjugate gradient, etc. One can refer to the monograph [TB22] for a nice introduction to numerical linear algebra. Here we recall that (Y(t))^{−1} is in fact the fundamental matrix solution of the adjoint problem (3.2.3), which allows us to obtain (Y(t))^{−1} by solving an ODE rather than by inverting a matrix.

In fact, the above formula can be further simplified; we now display the arguments of Y for clarity. For each fixed s near t0, we see that

(d/dt)[ Y(t; t0)(Y(s; t0))^{−1} ] = A(t)[ Y(t; t0)(Y(s; t0))^{−1} ], [ Y(t; t0)(Y(s; t0))^{−1} ] |_{t=s} = I,

and the fundamental theorem of ODE (Theorem 2.1.5) says that

Y(t; s) = Y(t; t0)(Y(s; t0))^{−1},

and we now conclude that the unique solution of (3.3.1) is given by

(3.3.3) y(t) = Y(t; t0)p + ∫_{t0}^{t} Y(t; s)b(s) ds for all t near t0.

It is worth mentioning that the solution formula (3.3.3) does not involve (Y(t))^{−1}, nor the adjoint problem (3.2.3), at all.

R EMARK 3.3.2. When A(t) ≡ A is a constant matrix, we see that

Y(t; s) = exp((t − s)A) for all t, s ∈ R,

and now (3.3.3) reads

y(t) = exp((t − t0)A)p + ∫_{t0}^{t} exp((t − s)A)b(s) ds.

Note that the term ∫_{t0}^{t} exp((t − s)A)b(s) ds is a convolution.
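The formula of Remark 3.3.2 can be checked against direct numerical integration. A sketch assuming SciPy; the matrix A, the forcing b and the initial datum p below are arbitrary illustrative choices, with t0 = 0.

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import quad, solve_ivp

# Illustrative data: a rotation generator with oscillatory forcing.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
p = np.array([1.0, 0.0])

def b(t):
    return np.array([0.0, np.cos(t)])

t1 = 2.0

# Formula (3.3.3) with Y(t; s) = exp((t - s)A): evaluate the convolution
# integral componentwise by quadrature.
conv = np.array([
    quad(lambda s, i=i: (expm((t1 - s) * A) @ b(s))[i], 0.0, t1)[0]
    for i in range(2)
])
y_formula = expm(t1 * A) @ p + conv

# Direct integration of y' = A y + b(t), y(0) = p.
sol = solve_ivp(lambda t, y: A @ y + b(t), (0.0, t1), p,
                rtol=1e-10, atol=1e-12)
vp_err = np.linalg.norm(y_formula - sol.y[:, -1])
```

Both routes give the same value of y(t1) up to quadrature and integrator tolerances.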

E XAMPLE 3.3.3. Let us solve the initial value problem

(3.3.4) y′(t) = Ay(t) + b(t), y(0) = p

with

A = [ −2  1  0
       0 −2  0
       3  2  1 ],  b(t) = (2, 0, t)⊺,  p = (1, 1, 0)⊺.

The fundamental matrix solution is given by

(3.3.5) Y(t) = exp(tA) = [ e^{−2t}        t e^{−2t}              0
                           0              e^{−2t}                0
                           e^t − e^{−2t}  e^t − (1 + t)e^{−2t}   e^t ].

Therefore, the solution of (3.3.4) is given by

y(t) = ( 1 + t e^{−2t},  e^{−2t},  −4 − t + 5e^t − (1 + t)e^{−2t} )⊺.

E XERCISE 3.3.4. Prove (3.3.5).
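A numerical sanity check of the closed form (3.3.5) is easy to run (a sketch assuming SciPy; this does not replace the proof requested in Exercise 3.3.4):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[-2.0, 1.0, 0.0],
              [0.0, -2.0, 0.0],
              [3.0, 2.0, 1.0]])

def Y_claimed(t):
    # The closed form (3.3.5) for exp(tA).
    e2, e1 = np.exp(-2.0 * t), np.exp(t)
    return np.array([[e2, t * e2, 0.0],
                     [0.0, e2, 0.0],
                     [e1 - e2, e1 - (1.0 + t) * e2, e1]])

# Compare against SciPy's matrix exponential at a few sample times.
exp_err = max(np.linalg.norm(expm(t * A) - Y_claimed(t))
              for t in (0.0, 0.5, 1.0, 2.0))
```

The two agree to machine precision at every sampled time.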

3.4. Higher order linear ODE

In this section, we explain how to solve the initial value problem for an nth order linear ODE

(3.4.1)  u^{(n)} + a1(t)u^{(n−1)} + · · · + a_{n−1}(t)u′ + a_n(t)u = b(t) for all t near t0,
         u(t0) = p1, u′(t0) = p2, · · · , u^{(n−1)}(t0) = pn.

We first prove the following fundamental result.

T HEOREM 3.4.1. If the coefficients a1 , · · · , an , b are continuous near t0 , then there exists a
unique Cn -solution u of (3.4.1) near t0 .

P ROOF. Existence of solution. We now define

(3.4.2) A(t) = [ 0_{n−1}   I_{n−1}
                 −a_n(t)   −a_{n−1}(t) · · · −a_1(t) ]
             = [ 0         1          0          · · ·  0
                 0         0          1          · · ·  0
                 ⋮          ⋮           ⋱           ⋱    ⋮
                 0         0          · · ·      0     1
                 −a_n(t)   −a_{n−1}(t)   −a_{n−2}(t)   · · ·   −a_1(t) ],

where 0_{n−1} ∈ R^{n−1} is the zero vector and I_{n−1} ∈ R^{(n−1)×(n−1)} is the identity matrix, and

(3.4.3) b(t) = (0, · · · , 0, b(t))⊺,  p = (p1, p2, · · · , pn)⊺.

In the previous section (Section 3.3), we showed that there exists a unique C^1-solution y(t) of

(3.4.4) y′(t) = A(t)y(t) + b(t), y(t0) = p.

For each t and s close to t0, let

Y(t; s) = ( y1(t; s) ; · · · ; yn(t; s) )

be the fundamental matrix solution satisfying Y(s; s) = I, where y1(t; s), · · · , yn(t; s) denote the rows of Y(t; s). With A(t) given by (3.4.2), the equation Y′(t; s) = A(t)Y(t; s) reads, row by row,

( y1′(t; s) ; · · · ; y′_{n−1}(t; s) ; yn′(t; s) ) = ( y2(t; s) ; · · · ; yn(t; s) ; −a_n(t)y1(t; s) − · · · − a1(t)yn(t; s) ),

and we see that

(3.4.5) y′_j(t; s) = y_{j+1}(t; s) for all j = 1, · · · , n − 1.

From (3.3.3), we now have

(3.4.6) y(t) = Y(t; t0)p + ∫_{t0}^{t} Y(t; s)b(s) ds.

We now define

(3.4.7) u(t) := y1(t) = y1(t; t0)p + ∫_{t0}^{t} y1(t; s)b(s) ds, which belongs to C^1 near t0.

Now from (3.4.5) we see that u ∈ C^n near t0 and

(3.4.8) y(t) = ( u(t), u′(t), · · · , u^{(n−1)}(t) )⊺,
then we see that

A(t)y(t) + b(t) = ( u′(t), u′′(t), · · · , u^{(n−1)}(t), −a1(t)u^{(n−1)}(t) − a2(t)u^{(n−2)}(t) − · · · − a_n(t)u(t) + b(t) )⊺,

and hence (3.4.4) gives (3.4.1).

Uniqueness. If u ∈ C^n is a solution of (3.4.1) near t0, then the C^1-function y given in (3.4.8) satisfies (3.4.4); hence u is unique by the uniqueness part of the fundamental theorem of ODE (Theorem 2.1.5). □
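The reduction in the proof is directly usable numerically: form the companion matrix (3.4.2) and integrate the first-order system (3.4.4). A sketch (assuming SciPy) for the third-order equation u′′′ − 2u′′ − 5u′ + 6u = 3t with u(0) = 1, u′(0) = 2, u′′(0) = 0, compared against its closed-form solution (this example is worked out by hand in Example 3.4.6 below):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Companion form (3.4.2) for u''' - 2u'' - 5u' + 6u = 3t:
# a1 = -2, a2 = -5, a3 = 6, so the last row of A is (-6, 5, 2).
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [-6.0, 5.0, 2.0]])
p = np.array([1.0, 2.0, 0.0])  # (u(0), u'(0), u''(0))

def rhs(t, y):
    return A @ y + np.array([0.0, 0.0, 3.0 * t])

t1 = 1.0
sol = solve_ivp(rhs, (0.0, t1), p, rtol=1e-10, atol=1e-12)
u_numeric = sol.y[0, -1]

# Closed-form solution of this initial value problem (cf. Example 3.4.6).
u_exact = (30.0 * t1 + 25.0 - 17.0 * np.exp(-2.0 * t1)
           + 50.0 * np.exp(t1) + 2.0 * np.exp(3.0 * t1)) / 60.0
companion_err = abs(u_numeric - u_exact)
```

The first component of the integrated system reproduces u(t1) to integrator accuracy.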

R EMARK 3.4.2. We recall (Definition 3.2.4) that the Wronskian is defined by W(t) := det(Y(t)), and using Abel's formula (Theorem 3.2.3), we see that

W(t) = exp( ∫_{t0}^{t} tr(A(s)) ds ) = exp( −∫_{t0}^{t} a1(s) ds ).

R EMARK 3.4.3 (2-dimensional case [HS99, Remark IV-7-2 and Remark IV-7-3]). In practical computation, the most difficult part is to compute Y(t; s) = Y(t)(Y(s))^{−1}. When n = 2, the computation of (Y(s))^{−1} can be further simplified. As mentioned in the proof, one has

Y(t) = [ φ1(t)   φ2(t)
         φ1′(t)  φ2′(t) ]

where φ1(t) and φ2(t) are two linearly independent solutions of the associated homogeneous equation

(3.4.9) v′′ + a1(t)v′ + a2(t)v = 0.

We see that

(Y(s))^{−1} = (1/W(s)) [  φ2′(s)  −φ2(s)
                         −φ1′(s)   φ1(s) ].
In this case, the Wronskian reads

(3.4.10) W(t) = φ1(t)φ2′(t) − φ1′(t)φ2(t).

Therefore from (3.4.7) we know that the unique solution u of

u′′ + a1(t)u′ + a2(t)u = b(t) for all t near t0,  u(t0) = p1, u′(t0) = p2,

is given by

u(t) = p1φ1(t) + p2φ2(t) − φ1(t) ∫_{t0}^{t} (φ2(s)/W(s)) b(s) ds + φ2(t) ∫_{t0}^{t} (φ1(s)/W(s)) b(s) ds

(here φ1, φ2 are normalized by Y(t0) = I).
We now write (3.4.10) as

( φ2(t)/φ1(t) )′ = ( φ1(t)φ2′(t) − φ1′(t)φ2(t) )/(φ1(t))² = W(t)/(φ1(t))².

Let W̃ be any function such that W̃′(t) = W(t)/(φ1(t))²; then one sees that φ2(t) = φ1(t)W̃(t) satisfies the above ODE. Now using Corollary 3.4.5, we conclude that the solution set of (3.4.9) is a 2-dimensional vector space with basis

{φ1(t), φ1(t)W̃(t)}.
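The reduction-of-order recipe above can be checked on a concrete (illustrative) equation: for v′′ − 2v′ + v = 0 one has a1(t) ≡ −2, so W(t) = e^{2t} (taking t0 = 0 and W(0) = 1), and with φ1(t) = e^t one gets W̃′(t) = e^{2t}/(e^t)² = 1, hence W̃(t) = t and φ2(t) = t e^t. A finite-difference residual check (a sketch assuming NumPy):

```python
import numpy as np

def phi2(t):
    # Second solution produced by reduction of order: phi1 * Wtilde = t e^t.
    return t * np.exp(t)

def residual(t, h=1e-4):
    # Central-difference approximation of phi2'' - 2 phi2' + phi2.
    d1 = (phi2(t + h) - phi2(t - h)) / (2.0 * h)
    d2 = (phi2(t + h) - 2.0 * phi2(t) + phi2(t - h)) / h ** 2
    return d2 - 2.0 * d1 + phi2(t)

max_res = max(abs(residual(t)) for t in np.linspace(0.0, 2.0, 11))
```

The residual vanishes up to finite-difference error, confirming that t e^t is indeed a second solution.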

R EMARK 3.4.4 (3-dimensional case [HS99, Remark IV-7-4]). We write

Y(t) = [ φ(t)    φ1(t)   φ2(t)
         φ′(t)   φ1′(t)  φ2′(t)
         φ′′(t)  φ1′′(t) φ2′′(t) ],

and we see that φ(t), φ1(t) and φ2(t) are linearly independent solutions of the homogeneous equation

u′′′ + a1(t)u′′ + a2(t)u′ + a3(t)u = 0.

By using the cofactor expansion of the determinant (Theorem 3.1.14), the Wronskian reads

W(t) = φ′′(t)A0(t) − φ′(t)A1(t) + φ(t)A2(t),

and we write

φ′′(t) − (A1(t)/A0(t)) φ′(t) + (A2(t)/A0(t)) φ(t) = W(t)/A0(t),

where

A0(t) = det[ φ1(t) φ2(t) ; φ1′(t) φ2′(t) ],
A1(t) = det[ φ1(t) φ2(t) ; φ1′′(t) φ2′′(t) ],
A2(t) = det[ φ1′(t) φ2′(t) ; φ1′′(t) φ2′′(t) ]

(rows separated by semicolons). This means that the general solution φ can be expressed in terms of φ1, φ2 and W.

The proof of Theorem 3.4.1 itself gives an algorithm to compute the unique solution. Since Y(t; s) ∈ GL(n, C), we also have the following corollary.

C OROLLARY 3.4.5. If the coefficients a1 , · · · , an are continuous near t0 , then the solution set

{u ∈ C^n near t0 : u^{(n)} + a1(t)u^{(n−1)} + · · · + a_{n−1}(t)u′ + a_n(t)u = 0 near t0}

forms an n-dimensional vector space.

E XAMPLE 3.4.6. Let us solve the initial value problem

(3.4.11) u′′′ − 2u′′ − 5u′ + 6u = 3t, u(0) = 1, u′(0) = 2, u′′(0) = 0.

In this case, the matrix A given in (3.4.2) and the vectors b and p given in (3.4.3) read

A = [  0  1  0
       0  0  1
      −6  5  2 ],  b(t) = (0, 0, 3t)⊺,  p = (1, 2, 0)⊺.

We can compute

(3.4.12) Y(t) = exp(tA)
= −(e^t/6) [ −6 −1  1
             −6 −1  1
             −6 −1  1 ]
+ (e^{−2t}/15) [  3  −4   1
                 −6   8  −2
                 12 −16   4 ]
+ (e^{3t}/10) [ −2  1  1
                −6  3  3
               −18  9  9 ].

By a direct but long (and boring) computation we reach

∫_{0}^{t} Y(s)^{−1} b(s) ds
= −(1/2)(1 − (t + 1)e^{−t}) (1, 1, 1)⊺ + (1/20)(1 + (2t − 1)e^{2t}) (1, −2, −4)⊺ + (1/30)(1 + (3t + 1)e^{−3t}) (1, 3, 9)⊺,

and hence

y(t) = (1/60) ( 30t + 25 − 17e^{−2t} + 50e^t + 2e^{3t},  2(15 + 17e^{−2t} + 25e^t + 3e^{3t}),  2(−34e^{−2t} + 25e^t + 9e^{3t}) )⊺.

We finally conclude from (3.4.7) that

u(t) = (1/60)(30t + 25 − 17e^{−2t} + 50e^t + 2e^{3t})

is the unique solution of (3.4.11).

E XERCISE 3.4.7. Prove (3.4.12) using Algorithm 2. (Hint. See Exercise 3.1.40)

Let u ∈ C^n be the unique solution of (3.4.1). Let v ∈ C^n be any function satisfying

v^{(n)} + a1(t)v^{(n−1)} + · · · + a_{n−1}(t)v′ + a_n(t)v = b(t) for all t near t0;

then one sees that w := u − v is the unique solution to

(3.4.13)  w^{(n)} + a1(t)w^{(n−1)} + · · · + a_{n−1}(t)w′ + a_n(t)w = 0 for all t near t0,
          w(t0) = p1 − v(t0), w′(t0) = p2 − v′(t0), · · · , w^{(n−1)}(t0) = pn − v^{(n−1)}(t0).

For the case when a1, · · · , an are constants, there is an efficient way (based on Corollary 3.4.5) to compute the solution w of (3.4.13). By plugging the ansatz w(t) = e^{λt} into the equation

(3.4.14) w^{(n)} + a1w^{(n−1)} + · · · + a_{n−1}w′ + a_nw = 0,

we reach the equation

P(λ) := λ^n + a1λ^{n−1} + · · · + a_n = 0.
It is important to observe that (3.4.14) can be rewritten as

P(d/dt) w = 0.

By using the fundamental theorem of algebra (see e.g. [Kow23] for a proof), there exist distinct λ1, · · · , λk ∈ C such that

P(λ) = (λ − λ1)^{m1} · · · (λ − λk)^{mk} with m1 + · · · + mk = n.



For any differentiable function g, we see that

(d/dt − λ_j)( e^{λ_j t} g(t) ) = e^{λ_j t} g′(t),

and inductively one can show

(d/dt − λ_j)^{m_j}( e^{λ_j t} g(t) ) = e^{λ_j t} g^{(m_j)}(t).

For any c1, · · · , c_{m_j} ∈ C, we now choose g(t) = c1 + c2 t + · · · + c_{m_j} t^{m_j − 1}; then g^{(m_j)}(t) ≡ 0, and hence

(d/dt − λ_j)^{m_j}( c1 e^{tλ_j} + c2 t e^{tλ_j} + · · · + c_{m_j} t^{m_j − 1} e^{tλ_j} ) = 0,

thus

P(d/dt)( c1 e^{tλ_j} + c2 t e^{tλ_j} + · · · + c_{m_j} t^{m_j − 1} e^{tλ_j} ) = 0.

Since the above argument works for all j = 1, · · · , k, combining with Corollary 3.4.5, we reach the following theorem.

T HEOREM 3.4.8. The solution set of (3.4.14) is a C-vector space with basis

⋃_{j=1}^{k} { e^{tλ_j}, t e^{tλ_j}, · · · , t^{m_j − 1} e^{tλ_j} }.

E XAMPLE 3.4.9. We now revisit Example 3.4.6. We first note that

v(t) := (1/60)(30t + 25) = t/2 + 5/12

is a particular solution to v′′′ − 2v′′ − 5v′ + 6v = 3t. We now see that w = u − v solves

w′′′ − 2w′′ − 5w′ + 6w = 0, w(0) = 7/12, w′(0) = 3/2, w′′(0) = 0.

We now consider

P(λ) := λ³ − 2λ² − 5λ + 6 = 0.

One can check that the roots of P are −2, 1, 3. Hence the general solution of w is

w(t) = c1 e^{−2t} + c2 e^t + c3 e^{3t}.

Now we see that

7/12 = w(0) = c1 + c2 + c3,
3/2 = w′(0) = −2c1 + c2 + 3c3,
0 = w′′(0) = 4c1 + c2 + 9c3.

Solving the above system gives c1 = −17/60, c2 = 5/6 and c3 = 1/30, and we reach

w(t) = (1/60)(−17e^{−2t} + 50e^t + 2e^{3t}).

Finally, we conclude that

u(t) = v(t) + w(t) = (1/60)(30t + 25 − 17e^{−2t} + 50e^t + 2e^{3t})

is the unique solution of (3.4.11).
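The two computational steps of this example, finding the roots of P and solving the 3 × 3 linear system for c1, c2, c3, can be checked directly (a sketch assuming NumPy):

```python
import numpy as np

# Roots of the characteristic polynomial P(l) = l^3 - 2 l^2 - 5 l + 6.
roots = np.sort(np.roots([1.0, -2.0, -5.0, 6.0]).real)

# Initial-condition system for c1, c2, c3 on the basis e^{-2t}, e^t, e^{3t}:
# the rows are w(0), w'(0), w''(0).
M = np.array([[1.0, 1.0, 1.0],
              [-2.0, 1.0, 3.0],
              [4.0, 1.0, 9.0]])
c = np.linalg.solve(M, np.array([7.0 / 12.0, 3.0 / 2.0, 0.0]))
```

One recovers the roots −2, 1, 3 and the coefficients −17/60, 5/6, 1/30 computed by hand above.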

3.5. Sturm-Liouville eigenvalue problem

Let a, b, θ1, θ2 ∈ R with a < b, let q ∈ C([a, b]) be real-valued, and let p ∈ C^1((a, b)) ∩ C([a, b]) be real-valued with

p(t) > 0 for all t ∈ [a, b].

We define the linear operator L : C²((a, b)) ∩ C^1([a, b]) → C((a, b)) by

(L[u])(t) := (d/dt)( p(t) (du/dt) ) + q(t)u.
In this section, we consider the eigenvalue problem

(3.5.1a) L[u] = −λu for all t ∈ (a, b)

subject to the boundary conditions

(3.5.1b)  u(a) cos θ1 − p(a)u′(a) sin θ1 = 0,
          u(b) cos θ2 − p(b)u′(b) sin θ2 = 0.

It is easy to see that (3.5.1a)–(3.5.1b) has the solution u ≡ 0, which is called the trivial solution. The boundary value problem (3.5.1a)–(3.5.1b) is over-determined, so for a generic λ we expect no nontrivial solution (i.e. u ̸≡ 0) to exist. We are interested in the following object:

D EFINITION 3.5.1. If there exist λ ∈ C and a nontrivial solution u ∈ C²((a, b)) ∩ C^1([a, b]) of (3.5.1a)–(3.5.1b), then we say that such a λ is a Sturm-Liouville eigenvalue and such a nontrivial solution u is called a corresponding Sturm-Liouville eigenfunction. The boundary value problem (3.5.1a)–(3.5.1b) is called the Sturm-Liouville eigenvalue problem.

We define

(u, v)_{L²(a,b)} := ∫_a^b u(t)\overline{v(t)} dt,

and

‖u‖_{L²(a,b)} := (u, u)^{1/2}_{L²(a,b)} = ( ∫_a^b |u(t)|² dt )^{1/2}.

We remind the reader that the inner product is conjugate symmetric (Hermitian):

(v, u)_{L²(a,b)} = ∫_a^b v(t)\overline{u(t)} dt = \overline{(u, v)_{L²(a,b)}};

one can compare this with the inner product (·, ·)_{C^{n×n}} given in (3.1.9). We first observe the following lemma.

L EMMA 3.5.2. The operator L : C²((a, b)) → C((a, b)) is Hermitian or self-adjoint in the sense of

(3.5.2) (L[u], v)_{L²(a,b)} = (u, L[v])_{L²(a,b)}

for all u, v ∈ C²((a, b)) ∩ C^1([a, b]) both satisfying the boundary condition (3.5.1b). In addition, if λ is a Sturm-Liouville eigenvalue, then λ ∈ R.

R EMARK 3.5.3. It is interesting to compare (3.5.2) with the characterization of Hermitian matrices in Lemma 3.1.28, and to compare the second statement with the corresponding result for Hermitian matrices in Theorem 3.1.29.

P ROOF OF L EMMA 3.5.2. By integration by parts, we see that

(L[u], v)_{L²(a,b)} = [ p(t)u′(t)\overline{v(t)} ]_{t=a}^{t=b} − ∫_a^b p(t)u′(t)\overline{v′(t)} dt + ∫_a^b q(t)u(t)\overline{v(t)} dt,

and, since p and q are real-valued, interchanging the roles of u and v and taking conjugates gives

(u, L[v])_{L²(a,b)} = [ p(t)u(t)\overline{v′(t)} ]_{t=a}^{t=b} − ∫_a^b p(t)u′(t)\overline{v′(t)} dt + ∫_a^b q(t)u(t)\overline{v(t)} dt.

It therefore suffices to show that the boundary terms agree, i.e. p(b)u′(b)\overline{v(b)} = p(b)u(b)\overline{v′(b)} and similarly at t = a. We treat the endpoint t = b in two cases:

Case 1. If θ2 ∉ πZ, then sin θ2 ≠ 0, and the boundary condition (3.5.1b) for u, together with its complex conjugate for v (recall that p and θ2 are real), gives

p(b)u′(b) = (cos θ2/sin θ2) u(b) and p(b)\overline{v′(b)} = (cos θ2/sin θ2) \overline{v(b)},

hence

p(b)u′(b)\overline{v(b)} = (cos θ2/sin θ2) u(b)\overline{v(b)} = p(b)u(b)\overline{v′(b)}.

Case 2. Otherwise, if θ2 ∈ πZ, then sin θ2 = 0 and cos θ2 ∈ {−1, 1}. Now from (3.5.1b) we see that

u(b) = v(b) = 0,

and thus p(b)u′(b)\overline{v(b)} = 0 = p(b)u(b)\overline{v′(b)}.

A similar argument applies at t = a, and we conclude that L is Hermitian.

Now let λ ∈ C be an eigenvalue with eigenfunction u. Since L[u] = −λu, we see that

−λ‖u‖²_{L²(a,b)} = (L[u], u)_{L²(a,b)} = (u, L[u])_{L²(a,b)} = −\overline{λ}‖u‖²_{L²(a,b)}.

Since ‖u‖²_{L²(a,b)} ≠ 0, we conclude that λ = \overline{λ}. □

In fact, we have the following theorem.

T HEOREM 3.5.4 ([HS99, Theorem VI-3-11, Theorem VI-4-1 and Theorem VI-4-4]). Let a, b, θ1, θ2 ∈ R with a < b, let q ∈ C([a, b]) be real-valued, and let p ∈ C^1((a, b)) ∩ C([a, b]) be real-valued with

p(t) > 0 for all t ∈ [a, b].

Then:
(a) There exists a countable sequence of eigenvalues λ1 < λ2 < λ3 < · · · → +∞ of the Sturm-Liouville eigenvalue problem (3.5.1a)–(3.5.1b).
(b) Let λi and λj be two distinct eigenvalues; then the corresponding eigenfunctions ui and uj are orthogonal in L²(a, b), that is,

(ui, uj)_{L²(a,b)} = 0.

(c) For every u ∈ C²((a, b)) ∩ C^1([a, b]) satisfying the boundary condition (3.5.1b), the series

Σ_{j=1}^{+∞} (u, uj)_{L²(a,b)} uj

converges to u in L^∞(a, b), provided the eigenfunctions are normalized so that ‖uj‖_{L²(a,b)} = 1.

E XAMPLE 3.5.5. For simplicity, we put a = 0 and b = π. We now choose p(t) ≡ 1 and q(t) ≡ 0, and the Sturm-Liouville problem (3.5.1a)–(3.5.1b) reads

(3.5.3a) u′′(t) = −λu(t) for all t ∈ (0, π)

subject to the boundary conditions

(3.5.3b)  u(0) cos θ1 − u′(0) sin θ1 = 0,
          u(π) cos θ2 − u′(π) sin θ2 = 0.

By choosing θ1 = θ2 = 0, we reach an orthogonal sequence {sin(nt)}_{n∈N} of eigenfunctions. By choosing θ1 = θ2 = π/2, we reach an orthogonal sequence {cos(nt)}_{n=0}^{∞} of eigenfunctions. This leads to Fourier series [Kow22]. If we still have time, we can continue the course by using the material [Kow22].
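The orthogonality claimed in Theorem 3.5.4(b) can be checked numerically for the Dirichlet eigenfunctions sin(nt). A sketch assuming SciPy: one should find (sin(m·), sin(n·))_{L²(0,π)} = 0 for m ≠ n and = π/2 for m = n.

```python
import numpy as np
from scipy.integrate import quad

def ip(m, n):
    # L^2(0, pi) inner product of sin(m t) and sin(n t).
    val, _ = quad(lambda t: np.sin(m * t) * np.sin(n * t), 0.0, np.pi)
    return val

off_diag = max(abs(ip(m, n)) for m in range(1, 5)
               for n in range(1, 5) if m != n)
diag_err = max(abs(ip(n, n) - np.pi / 2.0) for n in range(1, 5))
```

Dividing each sin(nt) by the common norm (π/2)^{1/2} produces the orthonormal family used in Theorem 3.5.4(c).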
Bibliography

[Apo74] T. M. Apostol. Mathematical analysis. Addison-Wesley Publishing Co., second edition, 1974. MR0344384,
Zbl:0309.26002.
[BD22] W. E. Boyce and R. C. DiPrima. Elementary differential equations and boundary value problems. John Wiley
& Sons, Inc., Hoboken, NJ, 12th edition, 2022. MR0179403, Zbl:1492.34001.
[Che16] I-L. Chern. Mathematical modeling and ordinary differential equations. Lecture notes. National Taiwan
University, 2016. https://www.math.ntu.edu.tw/~chern/notes/ode2015.pdf.
[Hal15] B. Hall. Lie groups, Lie algebras, and representations. An elementary introduction, volume 222 of Grad.
Texts in Math. Springer, Cham, second edition, 2015. MR3331229, Zbl:1316.22001, doi:10.1007/978-3-
319-13467-3.
[Hal13] B. C. Hall. Quantum theory for mathematicians, volume 267 of Grad. Texts in Math. Springer, New York,
NY, 2013. Zbl:1273.81001, doi:10.1007/978-1-4614-7116-5.
[HS99] P.-F. Hsieh and Y. Sibuya. Basic theory of ordinary differential equations. Universitext. Springer-Verlag,
New York, 1999. MR1697415, doi:10.1007/978-1-4612-1506-6.
[Joh78] F. John. Partial differential equations, volume 1 of Appl. Math. Sci. Springer-Verlag, New York-Berlin, third
edition, 1978. MR0514404, Zbl:0426.35002.
[Kow22] P.-Z. Kow. Fourier analysis and distribution theory. University of Jyväskylä, 2022.
https://puzhaokow1993.github.io/homepage.
[Kow23] P.-Z. Kow. Complex Analysis. National Chengchi University, Taipei, 2023.
https://puzhaokow1993.github.io/homepage.
[Kow24] P.-Z. Kow. An introduction to partial differential equations and functional analysis. National Chengchi
University, Taipei, 2024. https://puzhaokow1993.github.io/homepage.
[Kwa17] M. Kwaśnicki. Ten equivalent definitions of the fractional Laplace operator. Fractional Calculus
and Applied Analysis, 20(1):7–51, 2017. MR3613319, Zbl:1375.47038, doi:10.1515/fca-2017-0002,
arXiv:1507.07356.
[Pug15] C. C. Pugh. Real mathematical analysis. Undergrad. Texts Math. Springer, Cham, second edition, 2015.
MR3380933, Zbl:1329.26003, doi:10.1007/978-3-319-17771-7.
[Str08] W. A. Strauss. Partial differential equations: An introduction. John Wiley & Sons, Ltd., Chichester, second
edition, 2008. MR2398759, Zbl:1160.35002.
[TB22] L. N. Trefethen and D. Bau III. Numerical linear algebra. Society for Industrial and Applied Mathematics
(SIAM), Philadelphia, PA, 25th anniversary edition, 2022. MR4713493, Zbl:1510.65092.
[Tre17] S. Treil. Linear algebra done wrong. Brown University, 2017. https://www.math.brown.edu.
