
Stochastic Control:

With Applications to Financial Mathematics∗

Summer 2006, JKU Linz

Jörn Saß

joern.sass@oeaw.ac.at
RICAM
Austrian Academy of Sciences
Altenbergerstraße 54

June 27, 2007


Web: http://www.ricam.oeaw.ac.at/people/page/sass/teaching/stochcontrol/

Contents

1 Introduction
  1.1 Stochastic control problem
  1.2 Portfolio optimization: first example

2 Dynamic Programming
  2.1 Itô diffusions and their generators
  2.2 The idea of dynamic programming
  2.3 A verification theorem
  2.4 Example: Optimal investment
  2.5 Types of controls

3 Some extensions
  3.1 Minimization
  3.2 Infinite time horizon
  3.3 Example: Discounted utility of consumption
  3.4 Stopping the state process
  3.5 Example: Portfolio Insurer
  3.6 Dynamic programming for deterministic optimal control problems
  3.7 Stochastic linear regulator problem

4 Viscosity solutions and stochastic control
  4.1 A deterministic example
  4.2 Upper and lower semi-continuous functions
  4.3 A stochastic example
  4.4 Viscosity solutions
  4.5 Properties
  4.6 Viscosity solutions and HJB equations
  4.7 Portfolio optimization under transaction costs
    4.7.1 Wealth processes under transaction costs
    4.7.2 Value function
    4.7.3 Heuristic derivation of the HJB equation
    4.7.4 No short selling, no borrowing
    4.7.5 Reduction of the dimension
    4.7.6 A Semi-Smooth Newton Method
    4.7.7 More complex transaction costs

5 Optimal Stopping
  5.1 Some results on optimal stopping
  5.2 Optimal Stopping for underlying Markov processes
  5.3 Reminder on pricing of European options
  5.4 American options
  5.5 The American put option

A Conditional Expectation
  A.1 Conditional expectation and conditional probability
  A.2 Some properties

B Stochastic Processes in Continuous Time
  B.1 Stochastic processes
  B.2 Filtrations
  B.3 Stopping times
  B.4 Martingales

C The Stochastic Integral
  C.1 Stochastic integral for simple processes
  C.2 The stochastic integral
  C.3 A generalization
  C.4 Itô Formula
  C.5 The multidimensional case

D Stochastic Differential Equations
  D.1 Problem formulation
  D.2 Uniqueness and existence
References

[CIL92] Crandall, M.G., Ishii, H., Lions, P.-L. (1992): User's guide to viscosity solutions of second order partial differential equations. Bulletin of the American Mathematical Society 27, 1–67.

[ElK81] El Karoui, N. (1981): Les aspects probabilistes du contrôle stochastique. In: Ninth Saint Flour Probability Summer School 1979 (Saint Flour, 1979), pp. 73–238. Lecture Notes in Mathematics 876. Berlin, New York: Springer.

[FR75] Fleming, W.H., Rishel, R.W. (1975): Deterministic and Stochastic Optimal Control. New York: Springer.

[FS93] Fleming, W.H., Soner, H.M. (1993): Controlled Markov Processes and Viscosity Solutions. New York: Springer.

[He94] Hernández-Lerma, O. (1994): Lectures on Continuous-Time Markov Control Processes. Aportaciones Matemáticas 3. Sociedad Matemática Mexicana.

[Ir03] Irle, A. (2003): Finanzmathematik: Die Bewertung von Derivaten. 2nd ed. Stuttgart: Teubner.

[KS98] Karatzas, I., Shreve, S.E. (1998): Methods of Mathematical Finance. New York, Berlin, Heidelberg: Springer.

[KS99] Karatzas, I., Shreve, S.E. (1999): Brownian Motion and Stochastic Calculus, 2nd ed. New York, Berlin, Heidelberg: Springer.

[Ko99] Korn, R. (1999): Optimal Portfolios. Singapore: World Scientific.

[K99] Korn, R., Korn, E. (1999): Optionsbewertung und Portfolio-Optimierung. Braunschweig: Vieweg.

[Ok00] Øksendal, B. (2000): Stochastic Differential Equations. Berlin: Springer.

[PS06] Peskir, G., Shiryaev, A. (2006): Optimal Stopping and Free Boundary Problems. Basel, Boston, Berlin: Birkhäuser.

[Pr04] Protter, P.E. (2004): Stochastic Integration and Differential Equations. Berlin: Springer.

[Ru03] Runggaldier, W.J. (2003): On stochastic control in finance. In: Rosenthal, J., Gilliam, D.S. (eds.): Mathematical Systems Theory in Biology, Communication, Computation and Finance. IMA Volumes in Mathematics and its Applications, Vol. 134. New York: Springer, 317–344.

[St00] Steele, J.M. (2000): Stochastic Calculus and Financial Applications. New York: Springer.

[To02] Touzi, N. (2002): Stochastic Control Problems, Viscosity Solutions, and Application to Finance. Lecture Notes, Special Semester on Financial Markets: Mathematical, Statistical and Economic Analysis. Pisa, April 29–July 15, 2002.

[YZ99] Yong, J., Zhou, X.Y. (1999): Stochastic Controls. New York, Berlin, Heidelberg: Springer.

For more references regarding the "basics" in the appendix we refer to the
lecture notes Stochastische Differentialgleichungen, available at
http://www.ricam.oeaw.ac.at/people/page/sass/teaching/sdes/
and the references therein. These are mainly based on [KS99, Ok00]; more
general results can be found in [Pr04], covering also non-continuous processes.
A highly readable book is [St00].
Textbooks on stochastic control are always difficult to read. Short and good
introductions are given in [K99, Ko99], the latter with an extensive overview of
the applications in portfolio optimization. In addition, the lecture notes [To02]
provide a good introduction to the concept of viscosity solutions. There are
also some good review papers on applications of stochastic control methods in
finance, e.g. [Ru03]. The course will be based on the references made so far
and to a certain extent on [ElK81, FR75, FS93, He94, Ok00, YZ99].

1 Introduction
We consider optimal control of Itô-type processes which satisfy a stochastic
differential equation (SDE) w.r.t. some Wiener process.

1.1 Stochastic control problem


Let (Ω, F, P) be a probability space, T > 0 the terminal time, F a filtration
satisfying the usual conditions and W = (Wt)t∈[0,T], Wt = (Wt1, . . . , Wtm)⊤,
an m-dimensional Wiener process w.r.t. F.
• A control process (the actions) is an F -progressively measurable process
u = (ut )t∈[0,T ] with values in some set U ⊆ IRp .
• The n-dimensional controlled process (state of the system) X = (Xt )t∈[0,T ]
is given by
dXt = b(t, Xt , ut)dt + σ(t, Xt , ut )dWt , X0 = x0 , (1.1)
where
b : [0, T ] × IRn × U → IRn , σ : [0, T ] × IRn × U → IRn×m
are measurable. Further conditions have to be specified for each control
problem separately. In particular conditions are needed which guarantee
the existence of X. To emphasize the dependency of X on the control u
we may write Xtu when convenient.
• As optimization/performance criterion we use (in the beginning)

  J(t, x, u) = E[ ∫_t^T ψ(s, Xsu, us) ds + Ψ(T, XTu) | Xtu = x ].   (1.2)

• The set of admissible controls is denoted by A(t, x) and consists of all


controls (us)s∈[t,T] for which a unique strong solution of (1.1)
exists on [t, T ] given Xt = x and for which the performance measure
in (1.2) is well defined. Depending on the particular problem further
conditions may be specified. We denote A(x0 ) = A(0, x0 ).
• The value function of the control problem is then defined as

  V(t, x) = sup_{u∈A(t,x)} J(t, x, u).

• Our aim is to find V (0, x0 ) and a control strategy u∗ for which this
optimal value is attained, i.e. for which V (0, x0 ) = J(0, x0 , u∗ ). Then u∗
will be called optimal.

1.2 Portfolio optimization: first example
We consider a financial market consisting of one bond with prices

dBt = Bt r dt, B0 = 1, i.e. Bt = e^{rt},

and one stock with prices evolving like

dSt = St (µ dt + σ dWt ), S0 = s0 > 0,

with trend parameter µ ∈ IR and volatility σ > 0. The unique solution of this
SDE is

  St = s0 exp((µ − σ²/2) t + σ Wt),
as can be verified by Itô’s formula, Theorem C.3. The wealth (the portfolio
value) of an investor with initial capital x0 > 0 evolves like

dXt = NtB dBt + NtS dSt , X0 = x0 ,

where NtB and NtS are the number of bonds and stocks, respectively, held by
the investor at time t. This definition corresponds to a self-financing portfolio
since changes in the wealth are only due to changes in the bond or stock prices
(there is no consumption or endowment).

As control at time t we may use the fraction ut of the wealth which should be
invested in the stocks. Then
  NtB = (1 − ut) Xt / Bt,   NtS = ut Xt / St,
yielding

dXt = (1 − ut )Xt r dt + ut Xt (µ dt + σ dWt )


= Xt ((r + ut (µ − r))dt + ut σ dWt ) .

Guessing

  Xt = x0 exp(∫_0^t gs ds + ∫_0^t hs dWs),

applying Itô's formula, and comparing the coefficients yields

  gt + ½ ht² = r + (µ − r) ut,   ht = σ ut.

Thus

  Xt = x0 exp(∫_0^t (r + (µ − r) us − ½ σ² us²) ds + ∫_0^t σ us dWs).

We want to maximize (1.2) with ψ ≡ 0, Ψ(T, x) = log(x), i.e.

J(0, x0 , u) = E[log(XTu ) | X0 = x0 ]

over all control strategies in

  A(x0) = {u : E[∫_0^T (|b ut| + |σ ut|²) dt] < ∞, Xtu > 0, E[(log XTu)−] < ∞}.
These conditions imply that ∫ σ ut dWt is a martingale, in particular

  E[∫_0^T σ ut dWt] = 0.

Therefore we obtain for u ∈ A(x0)

  J(0, x0, u) = log x0 + E[∫_0^T (r + (µ − r) ut − ½ σ² ut²) dt].

Taking derivatives for the integrand yields

  ∂/∂u (r + (µ − r) u − ½ σ² u²) = µ − r − σ² u,

  ∂²/∂u² (r + (µ − r) u − ½ σ² u²) = −σ² < 0.

So a pointwise maximization yields that the best choice of ut is always (setting
the first line equal to 0 and solving for u)

  ut∗ = π∗ := (µ − r)/σ²   for all t ∈ [0, T].   (1.3)
The value function is

  V(0, x0) = sup_{u∈A(x0)} J(0, x0, u) = J(0, x0, u∗) = log(x0) + (r + (µ − r)²/(2σ²)) T.

The strategy given by (1.3) is the Merton strategy (for logarithmic utility), and
we call π∗ the Merton fraction. So if, e.g., π∗ = 0.4, this means that an investor
should always keep 40% of his money invested in the stock. Note that this
strategy requires a lot of trading.
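
As a quick numerical illustration (a sketch; the market parameters µ = 0.08,
r = 0.03, σ = 0.2, the horizon and the initial capital are chosen for illustration
only), the Merton fraction (1.3) and the optimal value can be computed directly:

    import math

    # Merton fraction and optimal value for logarithmic utility, cf. (1.3);
    # all parameter values are illustrative.
    mu, r, sigma = 0.08, 0.03, 0.2
    T, x0 = 1.0, 1.0

    pi_star = (mu - r) / sigma**2                               # fraction (1.3)
    V0 = math.log(x0) + (r + 0.5 * (mu - r)**2 / sigma**2) * T  # V(0, x0)

    print(f"Merton fraction: {pi_star:.3f}")   # 1.25, i.e. 125% in the stock
    print(f"Optimal value:   {V0:.4f}")

A fraction above 1, as in this example, means borrowing from the money market
in order to invest more than the current wealth in the stock.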

Remark 1.1 We can get a corresponding result for n stocks with prices
(St)t∈[0,T], St = (St1, . . . , Stn)⊤,
with dynamics
  dSt = Diag(St)(µ dt + σ dWt),   S0 = s0,
where Diag(St) is the diagonal matrix with diagonal St, and W is an n-dimensional
Wiener process, s0i > 0 for i = 1, . . . , n, µ ∈ IRn, and σ a non-singular volatility
matrix in IRn×n. So for stock i we have dynamics
  dSti = Sti (µi dt + Σ_{j=1}^n σij dWtj),   i = 1, . . . , n.

We can define controls as an n-dimensional process u, where the i-th component
uti corresponds to the fraction of wealth which is invested in stock i. We then
get the optimal solution
u∗t = π ∗ = (σσ⊤ )−1 (µ − r), t ∈ [0, T ],
where ⊤ denotes transposition, i.e. σ⊤ is the transposed matrix to σ.

2 Dynamic Programming
2.1 Itô diffusions and their generators
We consider an n-dimensional SDE
dXt = b(t, Xt )dt + σ(t, Xt )dWt , (2.4)
where W is an m-dimensional Wiener process and the measurable drift and
diffusion coefficients
b : [0, ∞) × IRn → IRn , σ : [0, ∞) × IRn → IRn×m
satisfy for some constant K > 0 and for all x, y ∈ IRn, s, t ≥ 0

  ‖b(s, x) − b(t, y)‖ + ‖σ(s, x) − σ(t, y)‖ ≤ K(‖y − x‖ + |t − s|),   (2.5)
  ‖b(t, x)‖² + ‖σ(t, x)‖² ≤ K²(1 + ‖x‖²).   (2.6)
We consider the filtration F generated by W and augmented with the null
sets. Under these conditions (2.4) has a unique strong solution X, which we
call an Itô diffusion. We call
a(t, x) := σ(t, x)σ(t, x)⊤
the diffusion matrix of X.
For a random variable Y we write

  Et,x[Y] = E[Y | Xt = x]   and   Ex[Y] = E0,x[Y].

Theorem 2.1 Suppose that X is a time-homogeneous Itô diffusion and f :
IRn → IR bounded and measurable.
(i) For all ω ∈ Ω

Ex [f (Xt+s ) | Ft](ω) = EXt (ω) [f (Xs )], t, s ≥ 0.

(ii) If τ is a stopping time with τ < ∞, then

Ex [f (Xτ +s ) | Fτ ](ω) = EXτ (ω) [f (Xs )], s ≥ 0, ω ∈ Ω.

Proof: Theorems 7.1.2 and 7.2.4 in [Ok00].

Part (i) is called Markov property, part (ii) strong Markov property. The de-
pendency on ω is usually not written explicitly. Hence EXt [f (Xs )] is a random
variable
g(Xt ), where g : IRn → IR, g(x) = Ex [f (Xs )].
From now on suppose that X is an Itô diffusion as in (2.4).

The infinitesimal generator L of X is defined as

  Lf(s, x) = lim_{t↓s} (Es,x[f(t, Xt)] − f(s, x))/(t − s)

for all s ≥ 0, x ∈ IRn and f in the domain DL of L, which is the class of
functions f : [0, ∞) × IRn → IR for which the limit exists for all s, x.

Further we define a partial differential operator L by

  L := ∂/∂t + Σ_{i=1}^n bi ∂/∂xi + ½ Σ_{i,j=1}^n aij ∂²/(∂xi ∂xj).   (2.7)

This operator can be applied to functions f in

  C^{1,2} := {g = g(t, x) : [0, ∞) × IRn → IR : g once continuously differentiable in t, twice continuously differentiable in x}

yielding

  Lf(t, x) = ft(t, x) + Σ_{i=1}^n fxi(t, x) bi(t, x) + ½ Σ_{i,j=1}^n aij(t, x) fxi,xj(t, x)
           = ft(t, x) + (Dx f(t, x))⊤ b(t, x) + ½ tr((Dxx f(t, x)) a(t, x)),

where Dx f denotes the gradient of f, Dxx f the Hessian of f, i.e. (Dxx f)ij =
fxi,xj, and tr(A) is the trace of the matrix A, the sum of the diagonal elements
of A.

Remark 2.2 (i) Note that Itô’s formula can be written as

df (t, Xt ) = Lf (t, Xt ) dt + (Dx f (t, Xt ))⊤ σ(t, Xt )dWt .

(ii) If there is no dependency on t and all processes are 1-dimensional, we
simply have

  Lf(x) = b(x) f′(x) + ½ a(x) f″(x).
2
Theorem 2.3 Suppose that f ∈ C^{1,2} and for all u ≥ t ≥ 0, x ∈ IRn

  Et,x[∫_t^u |Lf(s, Xs)| ds] < ∞,   Et,x[∫_t^u |(Dx f(s, Xs))⊤ σ(s, Xs)|² ds] < ∞.

Then f ∈ DL and the generator coincides with the differential operator,
Lf(t, x) = Lf(t, x), t ≥ 0, x ∈ IRn.

Proof: Follows directly from Itô’s formula, cf. Remark 2.2 (i). 

Naturally, L is also called the generator of X. The conditions in Theorem 2.3
hold if f ∈ C^{1,2} with compact support, i.e. f(x) = 0 for x ∉ C for some
compact set C ⊂ IRn.

Theorem 2.4 Dynkin's Formula.
Let f ∈ C^{1,2} be a function with compact support and τ a stopping time satisfying
Ex[τ] < ∞. Then

  Ex[f(τ, Xτ)] = f(0, x) + Ex[∫_0^τ Lf(s, Xs) ds].

Proof: Theorem 7.4.1 in [Ok00]. 

If τ is a first exit time from a bounded set A ⊂ IRn, then Dynkin's formula
holds for any f ∈ C^{1,2}, since f|A can be extended outside of A correspondingly.

Example 2.5 For a 1-dimensional Wiener process W we consider

Xt = x0 + Wt , t ≥ 0,

and the first exit time τ of the interval (a, b) with a ≤ x0 ≤ b,

  τ = inf{t ≥ 0 : Xt ∉ (a, b)}.

We want to find px0 := P(Xτ = b | X0 = x0) and Ex0[τ]. We can proceed as
follows:

• Show that P (τ < ∞ |X0 = x0 ) = 1 for all x0 . This can be done using
that Wt+∆t − Wt is normally distributed with mean 0 and variance ∆t.
Then
  Ex0[f(Xτ)] = px0 f(b) + (1 − px0) f(a).   (2.8)

• Find the generator of X. Since dXt = 0 dt + 1 dWt we get from (2.7)

    L = ½ ∂²/∂x².

• Solve Lf0 = 0. This gives f0(x) = c0 x + d0 for constants c0, d0 ∈ IR.
  Dynkin's formula then yields

    Ex0[f0(Xτ)] = f0(x0).

  A comparison with (2.8) yields

    px0 = (f0(x0) − f0(a))/(f0(b) − f0(a)) = (x0 − a)/(b − a).

• Solve Lf1 = 1. This yields f1(x) = x² + c1 x + d1 for constants c1, d1 ∈ IR.
  From Dynkin's formula we get

    Ex0[f1(Xτ)] = f1(x0) + Ex0[τ].

  Comparing with (2.8) and using the solution for px0 we get

    Ex0[τ] = (b − x0)(x0 − a).

  For x0 = a or x0 = b we have Ex0[τ] = 0, as expected. The maximum
  expected time is obtained when starting at the midpoint x0 = (a + b)/2.
  This yields

    p_{(a+b)/2} = 1/2   and   E_{(a+b)/2}[τ] = (b − a)²/4.
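
Both formulas can be checked by a simple Monte Carlo simulation of the
discretized Brownian path (a sketch; the finite time step introduces a small
discretization bias at the exit):

    import numpy as np

    # Monte Carlo check of Example 2.5: exact values are
    # p_{x0} = (x0 - a)/(b - a) and E_{x0}[tau] = (b - x0)(x0 - a).
    a, b, x0 = -1.0, 1.0, 0.3
    dt, n = 1e-3, 5000
    rng = np.random.default_rng(0)

    x = np.full(n, x0)                 # current position of each path
    t = np.zeros(n)                    # elapsed time of each path
    alive = np.ones(n, dtype=bool)
    while alive.any():
        x[alive] += rng.normal(0.0, np.sqrt(dt), alive.sum())
        t[alive] += dt
        alive &= (a < x) & (x < b)     # paths still inside (a, b)

    print("P(X_tau = b):", (x >= b).mean(), " exact:", (x0 - a) / (b - a))
    print("E[tau]:      ", t.mean(), " exact:", (b - x0) * (x0 - a))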

2.2 The idea of dynamic programming


To solve the stochastic control problem presented in Section 1.1 we may pro-
ceed as follows:

• Use the Bellman Principle (if it holds)

  V(t, x) = sup_{u∈A(t,x)} Et,x[ ∫_t^{t1} ψ(s, Xsu, us) ds + V(t1, Xt1u) ].

  This principle states that choosing an optimal control on [t, t1] and then
  continuing optimally from t1 onwards yields an overall optimal control.
  This principle has to be proved.

• Apply Itô's formula to V (if V is smooth enough, e.g. V ∈ C^{1,2}), yielding

  V(t, x) = sup_{u∈A(t,x)} Et,x[ ∫_t^{t1} ψ(s, Xs, us) ds + V(t, Xt)
      + ∫_t^{t1} (Vt(s, Xs) + (Dx V(s, Xs))⊤ b(s, Xs, us)) ds
      + ∫_t^{t1} ½ tr((Dxx V(s, Xs)) a(s, Xs, us)) ds
      + ∫_t^{t1} (Dx V(s, Xs))⊤ σ(s, Xs, us) dWs ],

where a is the diffusion matrix


a(s, Xs , us ) := σ(s, Xs , us )σ(s, Xs , us )⊤ .
If ∫_t^{t1} (Dx V(s, Xs))⊤ σ(s, Xs, us) dWs, t1 ≥ t, is a martingale and hence its
expectation equals 0, we obtain

  V(t, x) = sup_{u∈A(t,x)} Et,x[ ∫_t^{t1} ψ(s, Xs, us) ds + V(t, Xt)
      + ∫_t^{t1} (Vt(s, Xs) + (Dx V(s, Xs))⊤ b(s, Xs, us)) ds
      + ∫_t^{t1} ½ tr((Dxx V(s, Xs)) a(s, Xs, us)) ds ].

• Subtract V(t, x) on both sides, divide by t1 − t and go to the limit t1 ↓ t.
  If 'sup' and 'lim' as well as expectation and 'lim' can be interchanged we get
  – using V(t, Xt) = V(t, x) under Et,x and ut ∈ U –

  0 = sup_{u∈U} { ψ(t, x, u) + Vt(t, x) + (Dx V(t, x))⊤ b(t, x, u) + ½ tr((Dxx V(t, x)) a(t, x, u)) }.   (2.9)
Defining an operator depending on u by

  Lu f(t, x) = ft(t, x) + (Dx f(t, x))⊤ b(t, x, u) + ½ tr((Dxx f(t, x)) a(t, x, u))

we can write (2.9) as

  0 = sup_{u∈U} { ψ(t, x, u) + Lu V(t, x) }.   (2.10)

The equation (2.9) (or equivalently (2.10)) is called the Hamilton-Jacobi-Bellman
equation, HJB equation for short. The above reasoning shows that under cer-
tain conditions the value function solves the HJB equation, so it provides a
necessary condition.
Vice versa we may ask when a solution of the HJB equation is the value
function of the corresponding control problem. To this end we proceed as follows
to 'solve' the HJB equation.

Algorithm 2.6
1. Find an optimal u = û(t, x) in (2.9).
2. If it exists, û formally depends on the derivatives Vt , Dx V , Dxx V , i.e.
û(t, x) = ũ(t, x, Vt (t, x), Dx V (t, x), Dxx V (t, x)).
Substituting û in (2.9) leads to a partial differential equation for V which
has to be solved with boundary condition V (T, x) = Ψ(T, x) to find a
candidate V ∗ for the optimal value function.
3. If V ∗ satisfies certain conditions (see below) and u∗t = û(t, Xt∗ ), t ∈ [0, T ],
is an admissible control strategy, then V ∗ is indeed the value function of
the control problem and u∗t = û(t, Xt∗ ) defines an optimal control strategy
in Markovian form. Here Xt∗ is the solution of (1.1) using the optimal
control strategy u∗ in [0, t).
A theorem which provides a set of conditions on V , ψ, Ψ, b, σ and u such that
a solution V of the HJB equation and the corresponding maximizer û found
in steps 1 and 2 of the above algorithm provide indeed the value function and
an optimal control strategy, is called a verification theorem.

2.3 A verification theorem


In Section 2.2 we saw that under certain conditions, the value function V
satisfies the HJB equation. Here we shall provide a verification theorem which
guarantees that vice versa the solution found by Algorithm 2.6, steps 1 and 2,
indeed provides the value function and an optimal control strategy.
We consider the control problem of Section 1.1. First we have to specify when
a control strategy is admissible. Say, u ∈ A(t, x) if
(A1) u = (us)s∈[t,T] is progressively measurable, has values in U, and satisfies
     E[∫_t^T ‖us‖² ds] < ∞.

(A2) (1.1) has a unique strong solution (Xs)s∈[t,T] with Xt = x and
     Et,x[sup_{t≤s≤T} ‖Xs‖²] < ∞.

(A3) J(t, x, u) is well defined.


Condition (A3) will be guaranteed by (A1) and the conditions of the following
theorem.
Theorem 2.7 Verification Theorem
Suppose that ‖σ(t, x, u)‖² ≤ Cσ(1 + ‖x‖² + ‖u‖²) and that ψ is continuous
with ‖ψ(t, x, u)‖² ≤ Cψ(1 + ‖x‖² + ‖u‖²) for some Cσ, Cψ > 0 and all t ≥ 0,
x ∈ IRn, u ∈ U.
(i) Suppose that Φ lies in C^{1,2}([0, T) × IRn), is continuous on [0, T] × IRn
with ‖Φ(t, x)‖ ≤ CΦ(1 + ‖x‖²), and satisfies the HJB equation and the
boundary condition, i.e.

  sup_{u∈U} { ψ(t, x, u) + Lu Φ(t, x) } = 0,   t ∈ [0, T), x ∈ IRn,
  Φ(T, x) = Ψ(T, x),   x ∈ IRn.

Then for all t ∈ [0, T ], x ∈ IRn

Φ(t, x) ≥ V (t, x).

(ii) If a maximizer û(t, x) of u 7→ ψ(t, x, u) + Lu Φ(t, x) exists such that


u∗ = (u∗t )t∈[0,T ] , u∗t = û(t, Xt∗ ) is admissible, then Φ(t, x) = V (t, x) for all
t ∈ [0, T ], x ∈ IRn and u∗ is an optimal control strategy, i.e. V (t, x) =
J(t, x, ut,x ) where ut,x = (u∗s )s∈[t,T ] ∈ A(t, x). Here Xt∗ is the solution of
(1.1) using control u∗s on [0, t).

Proof: Keep some t ∈ [0, T], x ∈ IRn fixed. For arguing with bounded
processes, we introduce

  τn = T ∧ inf{s > t : ‖Xs − Xt‖ ≥ n},   n ∈ IN.

The Itô formula for Xt = x and admissible u yields

  Φ(τn, Xτn) = Φ(t, x) + ∫_t^{τn} Lus Φ(s, Xs) ds + ∫_t^{τn} Φx(s, Xs)⊤ σ(s, Xs, us) dWs.

From the admissibility of u, the continuity of Φ and the boundedness of X on
[t, τn] we get Et,x[∫_t^{τn} ‖Φx(s, Xs)⊤ σ(Xs, us)‖² ds] < ∞ and thus

  Et,x[ ∫_t^{τn} Φx(s, Xs)⊤ σ(Xs, us) dWs ] = 0.

Therefore we get

  Et,x[ ∫_t^{τn} ψ(s, Xs, us) ds + Φ(τn, Xτn) ]
    = Et,x[ ∫_t^{τn} ψ(s, Xs, us) ds + Φ(t, x) + ∫_t^{τn} Lus Φ(s, Xs) ds ]
    = Φ(t, x) + Et,x[ ∫_t^{τn} (ψ(s, Xs, us) + Lus Φ(s, Xs)) ds ]
    ≤ Φ(t, x),   (2.11)

where the integrand in the last expectation is ≤ 0 since Φ satisfies the HJB
equation and us ∈ U for s ∈ [t, T].

For n → ∞ we get τn → T. The quadratic growth conditions for ψ and Φ and
the admissibility of u imply

  ∫_t^{τn} ψ(s, Xs, us) ds + Φ(τn, Xτn) ≤ Cψ ∫_t^T (1 + ‖Xs‖² + ‖us‖²) ds + CΦ(1 + sup_{t≤s≤T} ‖Xs‖²) ∈ L¹.

So we get by dominated convergence and the continuity of Φ

  Et,x[ ∫_t^{τn} ψ(s, Xs, us) ds + Φ(τn, Xτn) ] → J(t, x, u)   (n → ∞).

Thus we obtain from (2.11) that also J(t, x, u) ≤ Φ(t, x) and by taking the
supremum finally V (t, x) ≤ Φ(t, x). This proves part (i).
For (ii) only observe that we have equality in (2.11) if we can find a maximizer
û(t, x) and consider the strategy u∗t = û(t, Xt∗ ) defined by this maximizer.
Since this was the only inequality we get with the same arguments V (t, x) =
J(t, x, u∗ ) = Φ(t, x). 

Remark 2.8

(i) Under the conditions of Theorem 2.7 the Bellman Principle holds: For
any stopping time τ with values in [t, T]

  V(t, x) = sup_{u∈A(t,x)} Et,x[ ∫_t^τ ψ(s, Xsu, us) ds + V(τ, Xτu) ].

(ii) By definition the value function is always unique. Therefore Theorem 2.7
shows that a solution of the HJB equation is unique in the class of C 1,2 -
functions with quadratic growth. Note that a control strategy does not
have to be unique.

(iii) Existence is difficult to show in general. A typical set of very strong


conditions is

– U is compact,
– Ψ is three times continuously differentiable in x and is bounded,
– b, σ, ψ in C 1,2 and bounded,
– The uniform parabolicity condition holds, i.e.

  y⊤ a(t, x, u) y ≥ c ‖y‖²

for all t ∈ [0, T], x, y ∈ IRn, u ∈ U.

Then the HJB equation has a unique bounded solution V ∈ C 1,2 .

Example 2.9 We consider the control of

  Xtu = x + ∫_0^t us ds + Wt,

everything one-dimensional, with performance criterion (for α, β > 0)

  J(t, x, u) = Et,x[ β XTu − α ∫_t^T us² ds ]

and progressively measurable controls u with values in IR satisfying
∫_0^T us² ds < ∞. The augmented generator of Xu is given by

  Lu f(t, x) = ft(t, x) + u fx(t, x) + ½ fxx(t, x).
So in the first step of Algorithm 2.6 we have to solve the maximization problem
in

  sup_{u∈IR} { Lu V(t, x) − α u² } = 0,

yielding the maximizer û(t, x) = Vx(t, x)/(2α). Plugging this into the HJB
equation and using the boundary condition V(T, x) = βx we now have to solve
(step 2)

  −α û²(t, x) + L^{û(t,x)} V(t, x) = Vt(t, x) + (1/(4α)) Vx²(t, x) + ½ Vxx(t, x) = 0.
This yields

  V(t, x) = βx + (β²/(4α)) (T − t).

Step 3 consists of checking the conditions of the verification theorem which
can easily be done for this example.
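
A small simulation confirms the solution (a sketch): under the constant
maximizer û = β/(2α) the criterion J(0, x, û) should match V(0, x) = βx +
(β²/(4α)) T.

    import numpy as np

    # Monte Carlo check of Example 2.9 with illustrative parameters.
    alpha, beta, T, x = 1.0, 2.0, 1.0, 0.5
    u = beta / (2 * alpha)                      # maximizer u_hat = V_x/(2 alpha)
    rng = np.random.default_rng(0)

    W_T = rng.normal(0.0, np.sqrt(T), 100_000)
    X_T = x + u * T + W_T                       # X_t = x + int_0^t u ds + W_t
    J = np.mean(beta * X_T - alpha * u**2 * T)  # running cost is deterministic

    print("MC estimate:", J)                    # approx 2.0
    print("V(0, x):   ", beta * x + beta**2 / (4 * alpha) * T)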

2.4 Example: Optimal investment


We use a model like in Remark 1.1, only that we now allow for more Wiener
processes than we have stocks in the market (so called incomplete market).
We consider a financial market consisting of one bond with prices

dBt = Bt r dt, B0 = 1, i.e. Bt = e^{rt},

and n stocks with prices evolving like

  dSt = Diag(St)(µ dt + σ dWt),   S0 = s0,

where Diag(St) is the diagonal matrix with diagonal St, and W is an m-
dimensional Wiener process, m ≥ n, s0i > 0 for i = 1, . . . , n, µ ∈ IRn, and
σ a matrix in IRn×m with maximal rank. The latter implies that the IRn×n-
matrix σσ⊤ is non-singular. So for stock i we have dynamics
  dSti = Sti (µi dt + Σ_{j=1}^m σij dWtj),   i = 1, . . . , n.

As controls u = (π, c) we consider the vector of risky fractions πt = (πt1, . . . , πtn)⊤,
where πti is the fraction of the wealth invested in stock i, and the consumption
rate ct. The corresponding wealth process satisfies

  dXtu = Xtu Σ_{i=1}^n πti dSti/Sti + Xtu (1 − πt⊤ 1) dBt/Bt − ct dt
       = ((r + πt⊤(µ − r 1)) Xtu − ct) dt + Xtu πt⊤ σ dWt,
where 1 denotes the n-dimensional vector (1, . . . , 1)⊤. The investor assigns
utility U1(ct) to the payout given by the consumption rate ct and utility U2(XTu)
to the terminal wealth. We consider power utility functions U1 = U2 = U,

  U(x) = x^α/α,   α < 1, α ≠ 0,
and a discounting factor e^{−βt}, β ≥ 0. Thus we want to maximize

  J(t, x, u) = Et,x[ ∫_t^T e^{−βs} U(cs) ds + e^{−βT} U(XTu) ].
The augmented generator is given by

  L(π,c) v(t, x) = vt(t, x) + ((r + π⊤(µ − r 1))x − c) vx(t, x) + ½ π⊤σσ⊤π x² vxx(t, x)

and the corresponding HJB equation reads as

  sup_{π,c} { e^{−βt} c^α/α + L(π,c) V(t, x) } = 0.

Step 1: If x is strictly positive and V is increasing and concave, then we get
maximizers

  ĉ(t, x) = (e^{βt} Vx(t, x))^{1/(α−1)},   (2.12)
  π̂(t, x) = −(σσ⊤)^{-1} (µ − r 1) Vx(t, x)/(x Vxx(t, x)).   (2.13)

Step 2: Putting this in the HJB equation, it remains to solve

  ((1−α)/α) e^{−βt/(1−α)} Vx^{α/(α−1)} + Vt + r x Vx − ½ (µ − r 1)⊤(σσ⊤)^{-1}(µ − r 1) Vx²/Vxx = 0

with boundary condition V(T, x) = e^{−βT} x^α/α. Making an ansatz

  V(t, x) = h(t)^{1−α} x^α/α   (2.14)

yields for h the boundary condition h(T) = e^{−βT/(1−α)} and

  e^{−βt/(1−α)} + c h(t) + h′(t) = 0,

where

  c = (α/(1−α)) (r + (1/(2(1−α))) (µ − r 1)⊤(σσ⊤)^{-1}(µ − r 1)).

This can be solved using standard methods, yielding

  h(t) = e^{−ct} ( e^{−(β−(1−α)c)T/(1−α)} + ((1−α)/(β−(1−α)c)) (e^{−(β−(1−α)c)t/(1−α)} − e^{−(β−(1−α)c)T/(1−α)}) )

if β − (1 − α)c ≠ 0 and

  h(t) = e^{−ct} (1 + T − t)

if β − (1 − α)c = 0.
Step 3: Note that h(t) is always strictly positive. Hence the value function
(2.14) is strictly positive if the SDE for the controlled process has a strictly
positive unique solution when using the controls given by the maximizers π̂, ĉ.
Then V would clearly lie in C^{1,2} and be strictly increasing and concave, since
Vx(t, x) = h(t)^{1−α} x^{α−1} > 0 and Vxx(t, x) = −(1 − α) h(t)^{1−α} x^{α−2} < 0.
Note further that with (2.14) we get from (2.12), (2.13)

  ĉ(t, x) = (e^{−βt/(1−α)}/h(t)) x,
  π̂(t, x) = (1/(1−α)) (σσ⊤)^{-1} (µ − r 1).
For controls

  ct∗ = ĉ(t, Xt∗) = (e^{−βt/(1−α)}/h(t)) Xt∗,   (2.15)
  πt∗ = π̂(t, Xt∗) = (1/(1−α)) (σσ⊤)^{-1} (µ − r 1)   (2.16)

the SDE for the controlled process X∗ := X^{(π∗,c∗)} is of the form dXt∗ =
Xt∗((c1 + f1(t)) dt + c2⊤ dWt) and hence admits a unique strong solution Xt∗ =
x0 exp{something} which is strictly positive and (as a strong solution) satisfies
the integrability conditions we need. Since π ∗ is constant and h(t)−1 bounded
the admissibility and growth conditions can be verified using a suitable set U
for the controls (partly difficult).
Thus the value function is indeed given by (2.14) and an optimal control strat-
egy by (2.15), (2.16).
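
Numerically the optimal controls are easy to evaluate (a sketch with two
illustrative stocks; h(t) is the function derived in Step 2 for the case
β − (1 − α)c ≠ 0):

    import numpy as np

    # Optimal controls (2.15), (2.16) for power utility; parameters illustrative.
    alpha, beta, r, T = 0.3, 0.05, 0.02, 1.0
    mu = np.array([0.07, 0.10])
    sigma = np.array([[0.25, 0.00], [0.10, 0.30]])
    ssT = sigma @ sigma.T

    excess = mu - r                                        # mu - r 1
    pi_star = np.linalg.solve(ssT, excess) / (1 - alpha)   # (2.16), constant in t

    kappa = excess @ np.linalg.solve(ssT, excess)          # (mu-r1)'(ss')^{-1}(mu-r1)
    c = alpha / (1 - alpha) * (r + kappa / (2 * (1 - alpha)))
    d = beta - (1 - alpha) * c                             # nonzero for these data

    def h(t):
        # h(t) from Step 2, case beta - (1 - alpha) c != 0
        eT = np.exp(-d * T / (1 - alpha))
        return np.exp(-c * t) * (eT + (1 - alpha) / d
                                 * (np.exp(-d * t / (1 - alpha)) - eT))

    print("pi* =", pi_star)
    print("c*_0/X_0 =", np.exp(-beta * 0.0 / (1 - alpha)) / h(0.0))  # from (2.15)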

2.5 Types of controls
In general a control at time t is given by some random variable ut(ω). One can
distinguish some special cases:
• Open loop or deterministic controls, if ut(ω) = f(t) is only a function in
t (non-random).
• Closed loop or feedback controls: u is adapted to the filtration generated
by the controlled process, i.e. ut is σ(Xsu , s ≤ t)-measurable.
• A special case of the feedback controls are Markovian controls which are
of the form ut (ω) = f (t, Xt (ω)).
In particular the controls given by the maximizers û like in Algorithm 2.6
and in the Verification Theorem 2.7 are Markovian. The proof of Theorem
2.7 shows that a Markovian control will be at least as good as any feedback
control, if the former exists.

3 Some extensions
3.1 Minimization
Suppose we want to minimize the performance criterion. Then the value func-
tion would be of the form

  Ṽ(t, x) = inf_u Et,x[ ∫_t^T ψ̃(s, Xs, us) ds + Ψ̃(T, XT) ].

Switching to the supremum and defining ψ := −ψ̃, Ψ := −Ψ̃ yields

  Ṽ(t, x) = − sup_u Et,x[ ∫_t^T ψ(s, Xs, us) ds + Ψ(T, XT) ] = −V(t, x),

where we can find V as before as the value function for maximizing the per-
formance criterion with ψ and Ψ.
In Section 3.7 we will see an example.

3.2 Infinite time horizon


We shall assume that neither the coefficients b(x, u) and σ(x, u) of the con-
trolled process X nor ψ(x, u) depend explicitly on time t. So we look at
dynamics
dXt = b(Xt , ut )dt + σ(Xt , ut )dWt , X0 = x0 , (3.17)
and define the corresponding differential operator depending on u by

  Lu f(x) = (Dx f(x))⊤ b(x, u) + ½ tr((Dxx f(x)) a(x, u)).
As performance criterion we use

  J(x, u) = Ex[ ∫_0^∞ e^{−βs} ψ(Xs, us) ds ]

with discount factor β > 0. The conditions on b and σ and the admissibility
of control strategies – now described by class A(x) – are defined analogously
to Section 2.3. The value function is

  V(x) = sup_{u∈A(x)} J(x, u).

Theorem 3.1 Verification Theorem


Suppose that ‖σ(x, u)‖² ≤ Cσ(1 + ‖x‖² + ‖u‖²) and that ψ is continuous with
‖ψ(x, u)‖² ≤ Cψ(1 + ‖x‖² + ‖u‖²) for some Cσ, Cψ > 0 and all x ∈ IRn, u ∈ U.

(i) Suppose that Φ lies in C²(IRn) with ‖Φ(x)‖ ≤ CΦ(1 + ‖x‖²), and satisfies
the HJB equation

  sup_{u∈U} { ψ(x, u) + Lu Φ(x) − βΦ(x) } = 0,   x ∈ IRn.

Then for all x ∈ IRn

  Φ(x) ≥ V(x).

(ii) If a maximizer û(x) of u 7→ ψ(x, u) + Lu Φ(x) − βΦ(x) exists such that


u∗ = (u∗t )t≥0 , u∗t = û(Xt∗ ) is admissible, then Φ(x) = V (x) for all x ∈ IRn
and u∗ is an optimal control strategy, i.e. V (x) = J(x, u∗ ). Here Xt∗ is
the solution of (3.17) using control u∗s on [0, t).

The proof is similar to the proof of Theorem 2.7.

3.3 Example: Discounted utility of consumption


In the model of Section 2.4 we now would like to maximize

  J(x, u) = Ex[ ∫_0^∞ e^{−βt} (1/α) ct^α dt ]

over admissible controls u = (π, c) which control

  dXt = ((r + πt⊤(µ − r 1)) Xt − ct) dt + Xt πt⊤ σ dWt,   X0 = x.

We consider β > 0, α ∈ (0, 1), x > 0 and we shall require as an additional
admissibility condition P(Xt > 0) = 1 for all t > 0. The value function is

  V(x) = sup_{u∈A(x)} J(x, u).

The HJB equation reads

  sup_{(π,c)∈IRn×[0,∞)} { ((r + π⊤(µ − r 1))x − c) Vx + ½ π⊤σσ⊤π x² Vxx − βV + c^α/α } = 0.

Step 1: Suppose Vx > 0 and Vxx < 0. Then we get maximizers

  π̂(x) = −η Vx(x)/(x Vxx(x)),   where η := (σσ⊤)^{-1}(µ − r 1),
  ĉ(x) = Vx(x)^{1/(α−1)}.

Step 2: Plugging these into the HJB we get the differential equation

  −½ η⊤σσ⊤η Vx²/Vxx + r x Vx − βV + ((1−α)/α) Vx^{α/(α−1)} = 0.

Making an ansatz V(x) = A x^α/α for some A > 0 we have Vx(x) = A x^{α−1},
Vxx(x) = −(1 − α) A x^{α−2} and it remains to solve (if x > 0)

  (1/(2(1−α))) η⊤σσ⊤η + r − β/α + ((1−α)/α) A^{1/(α−1)} = 0.
So we have to assume

  β > (α/(2(1−α))) η⊤σσ⊤η + α r,

since then A > 0. Solving for A yields

  A = ( (α/(1−α)) (β/α − r − (1/(2(1−α))) η⊤σσ⊤η) )^{α−1}.
Step 3: As candidates for the optimal policy we thus get

  πt∗ = (1/(1−α)) η = (1/(1−α)) (σσ⊤)^{-1}(µ − r 1),
  ct∗ = A^{1/(α−1)} Xt∗,

and the wealth process X∗ controlled by (π∗, c∗) satisfies

  dXt∗ = Xt∗ ( (r + (1/(1−α)) η⊤(µ − r 1) − A^{1/(α−1)}) dt + (1/(1−α)) η⊤σ dWt ).
1−α
This SDE has a unique strong solution

  Xt∗ = X0 exp( (r + ((1−2α)/(2(1−α)²)) η⊤σσ⊤η − A^{1/(α−1)}) t + (1/(1−α)) η⊤σ Wt ),

which is strictly positive. Thus we also get Vx > 0, Vxx < 0 and one can prove
the integrability conditions using that X is an L²-process. Obviously, V lies
in C²(0, ∞). Hence by Theorem 3.1 we have found the optimal solution.

But note that we have restricted the domain of V to (0, ∞) by the admissibility
condition on the control strategies (only strictly positive wealth processes were
allowed). Another way would be to terminate the evaluation as soon as Xt ≤ 0.
This can be modelled by a suitable stopping time τ . Then boundary conditions
would have to be specified for the case that we stop early. This can be done
similarly as we do it in the next section for a finite time horizon.
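
The constants of this section can be evaluated directly (a sketch with
illustrative parameters; the assertion checks the condition on β derived in
Step 2):

    import numpy as np

    # Constants for discounted utility of consumption; parameters illustrative.
    alpha, r, beta = 0.3, 0.02, 0.1
    mu = np.array([0.07, 0.10])
    sigma = np.array([[0.25, 0.00], [0.10, 0.30]])
    ssT = sigma @ sigma.T

    eta = np.linalg.solve(ssT, mu - r)           # eta = (ss')^{-1}(mu - r 1)
    quad = eta @ ssT @ eta                       # eta' ss' eta
    assert beta > alpha / (2 * (1 - alpha)) * quad + alpha * r, "beta too small"

    A = (alpha / (1 - alpha)
         * (beta / alpha - r - quad / (2 * (1 - alpha)))) ** (alpha - 1)
    pi_star = eta / (1 - alpha)                  # constant risky fractions
    c_over_x = A ** (1 / (alpha - 1))            # c*_t = A^{1/(alpha-1)} X*_t

    print("A =", A, " pi* =", pi_star, " c*/X =", c_over_x)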

3.4 Stopping the state process


Let u and X be control and state process as defined in Section 1.1. Now the
wealth process should be constrained to a certain set. To this end let Q be
an open set in [0, T ] × IRn and ∂Q the boundary of Q. The process should be
stopped when leaving Q as described by the following stopping time

  τ := inf{t > 0 : (t, Xt) ∉ Q}.

In particular ({T} × IRn) ∩ Q̄, where Q̄ = Q ∪ ∂Q, is part of the boundary, so
Pt,x(τ ≤ T) = 1 for all (t, x) ∈ Q. Now let ∂∗Q be a subset of the boundary
which satisfies

  Pt,x((τ, Xτ) ∈ ∂∗Q) = 1   for all (t, x) ∈ Q.

As performance criterion we use

  J(t, x, u) = Et,x[ ∫_t^τ ψ(s, Xs, us) ds + Ψ(τ, Xτ) ],

where we have to specify Ψ also for t < T on the boundary ∂∗Q. As usual,

  V(t, x) = sup_{u∈A(t,x)} J(t, x, u),

and we define the admissibility conditions and the assumptions on b and σ as
in Section 2.3.

Theorem 3.2 Verification Theorem


Suppose that ‖σ(t, x, u)‖² ≤ Cσ(1 + ‖x‖² + ‖u‖²) and that ψ is continuous with
‖ψ(t, x, u)‖² ≤ Cψ(1 + ‖x‖² + ‖u‖²) for some Cσ, Cψ > 0 and all (t, x) ∈ Q,
u ∈ U.

(i) Suppose Φ ∈ C^{1,2}(Q) ∩ C(Q̄), with ‖Φ(t, x)‖ ≤ CΦ(1 + ‖x‖²), satisfying
the HJB equation and the boundary condition, i.e.

  sup_{u∈U} { ψ(t, x, u) + Lu Φ(t, x) } = 0,   (t, x) ∈ Q,
  Φ(t, x) = Ψ(t, x),   (t, x) ∈ ∂∗Q.

Then Φ(t, x) ≥ V (t, x) for all (t, x) ∈ Q.

(ii) If û(t, x) is a maximizer of u 7→ ψ(t, x, u) + Lu Φ(t, x) on Q and u∗ =
(u∗t )t≤τ , u∗t = û(t, Xt∗ ) is admissible, then Φ(t, x) = V (t, x) for all (t, x) ∈
Q and u∗ is an optimal control strategy, i.e. V (t, x) = J(t, x, ut,x ) where
ut,x = (u∗s )s∈[t,T ] ∈ A(t, x).

The proof is essentially the same as the proof of Theorem 2.7. Instead of τn
we have to use stopping times τn ∧ τ .

Example 3.3 We look at the market model of Section 2.4 but consider only
investment without consumption, i.e. we use controls ut = πt , where πti is the
fraction of wealth invested in stock i. For α ∈ (0, 1) we want to maximize
expected power utility of terminal wealth,

  E[(1/α) XT^α]   such that   P(XT ≥ q) = 1.
This problem is called portfolio insurer problem since there is a lower boundary
for the payoff. It is quite attractive, since the distribution of the optimal
terminal wealth of the unconstrained problem can be very skew, allowing for
losses with high probability, and big gains only with a very low probability.
We have to distinguish 3 cases:
• If x0 < e^{−rT} q, we cannot reach the minimum payout at time T with
probability 1, since by investing in the bond we only get x0 e^{rT} < q.
• If x0 = e^{−rT} q, pure investment in the bond yields exactly the payout q.
So we cannot invest in the stocks since then we would make losses with
a strictly positive probability, so we would miss q with a strictly positive
probability.
• If x0 > e^{−rT} q, investment in bond and stocks is possible.
So let us assume that x0 > e^{−rT} q.
The same considerations show that at time t we need at least wealth Xt ≥
e^{−r(T−t)} q to be able to reach q with probability 1.
Thus we may define

  Q = {(t, x) ∈ (0, T) × IR : x > e^{−r(T−t)} q}.

Then

  ∂Q = ({0} × [e^{−rT} q, ∞))
       ∪ {(t, x) : t ∈ (0, T), x = e^{−r(T−t)} q}
       ∪ ({T} × [q, ∞)).

Assuming that we start at some x0 > e^{−rT} q, we can only stop at

  ∂∗Q = {(t, x) : t ∈ (0, T), x = e^{−r(T−t)} q} ∪ ({T} × [q, ∞)).

So we define boundary conditions

  Ψ(t, x) = (1/α) q^α for (t, x) ∈ ∂∗Q, t < T,   and   Ψ(T, x) = (1/α) x^α.

So for τ = inf{t > 0 : (t, Xt) ∉ Q} we define

  J(t, x, u) = Et,x[Ψ(τ, Xτ)],   (t, x) ∈ Q,

and

  V(t, x) = sup_{u∈A(t,x)} J(t, x, u),

the admissibility conditions defined similarly as in Section 2.3. The HJB equa-
tion with boundary conditions then reads as

  sup_{π∈IRn} { Vt + (r + π⊤(µ − r 1)) x Vx + ½ π⊤σσ⊤π x² Vxx } = 0,   (t, x) ∈ Q,
  V(t, x) = Ψ(t, x),   (t, x) ∈ ∂∗Q.

3.5 Example: Portfolio Insurer


We shall have a closer look at Example 3.3, for simplicity only for one stock
(n = 1). First, without consumption and without constraints the value func-
tion V0 can be determined as in Section 2.4, yielding

  V0(t, x) = e^{(αr+c0)(T−t)} x^α/α,   t ∈ [0, T], x ∈ IR,   (3.18)

where

  c0 = (α/(2(1−α))) ((µ − r)/σ)².

Further, the optimal risky fraction is

  π0 = (1/(1−α)) (µ − r)/σ².
With the constraint P(XT ≥ q) = 1 we have seen in Example 3.3 that we have
to solve

  sup_{π∈IR} { Vt + (r + π(µ − r)) x Vx + ½ π²σ² x² Vxx } = 0,   (t, x) ∈ Q,
  V(t, x) = Ψ(t, x),   (t, x) ∈ ∂∗Q,

with Q, ∂∗Q, Ψ as given in Example 3.3. Taking derivatives we get as candidate

  π̂(t, x) = −((µ − r)/σ²) Vx(t, x)/(x Vxx(t, x)).

So we have to solve

  Vt(t, x) + r x Vx(t, x) − ½ ((µ − r)/σ)² Vx²(t, x)/Vxx(t, x) = 0   (3.19)

subject to

  V(t, x) = Ψ(t, x) = (1/α) q^α for (t, x) ∈ ∂∗Q, t < T,   and   V(T, x) = (1/α) x^α.

A separation approach for V (factorization in f(t) x^α as in Section 2.4) won't
work. To see this, note that using the terminal condition we would only get the
solution V0 of the unconstrained problem, which does not satisfy the boundary
conditions for t < T.
So we may try to use a finite difference method to find a numerical approxima-
tion of V . Therefore it is convenient to work on some grid. Unfortunately the
lower boundary t ↦ e^{−r(T−t)} q depends on time. To get a constant boundary
we will make a change of variables

  y = e^{−rt} x.
Then we get the domain

  Q̃ = (0, T) × (e^{−rT} q, ∞)

with (the interesting part of the) boundary

  ∂∗Q̃ = ((0, T) × {e^{−rT} q}) ∪ ({T} × [e^{−rT} q, ∞)).

We then introduce

  Ṽ(t, y) := V(t, e^{rt} y),   (t, y) ∈ Q̃.
Then

  Ṽt(t, y) = Vt(t, e^{rt} y) + r y e^{rt} Vx(t, e^{rt} y),
  Ṽy(t, y) = e^{rt} Vx(t, e^{rt} y),
  Ṽyy(t, y) = e^{2rt} Vxx(t, e^{rt} y).
We then have for x = e^{rt} y

  π̂(t, x) = −((µ − r)/σ²) Vx(t, x)/(x Vxx(t, x))
           = −((µ − r)/σ²) Vx(t, e^{rt} y)/(e^{rt} y Vxx(t, e^{rt} y))
           = −((µ − r)/σ²) Ṽy(t, y)/(y Ṽyy(t, y))
           =: π̃(t, y).

So it remains to solve

  Ṽt(t, y) − ½ ((µ − r)/σ)² Ṽy²(t, y)/Ṽyy(t, y) = 0

subject to

  Ṽ(t, y) = V(t, e^{rt} y) = e^{αrT} y^α/α = q^α/α for (t, y) ∈ ∂∗Q̃, t < T,
  Ṽ(T, y) = e^{αrT} y^α/α.

We get these boundary conditions using that (t, y) ∈ ∂ ∗ Q̃ if and only if


(t, ert y) ∈ ∂ ∗ Q. For t < T this implies y = e−rT q and V (t, ert y) = q α /α.
Remark 3.4 This change of variables corresponds to controlling the dis-
counted wealth process

  Yt = e^{−rt} Xt

with performance criterion

  J̃(t, y, π) = Et,y[ (1/α) (e^{rT} YTπ)^α ].
But we still have a problem: we need an upper boundary. So we choose
ȳ ≫ e^{−rT} q and make the reasonable assumption that Ṽ(t, ȳ) = V(t, e^{rt} ȳ) ≈
V0(t, e^{rt} ȳ), where V0 is the value function (3.18) of the unconstrained problem.
Then we get from (3.18) the additional boundary condition

  Ṽ(t, ȳ) = e^{c0(T−t)} (e^{rT} ȳ)^α/α,   t ∈ [0, T].
We shall use an explicit finite difference scheme on the grid

  t0 = 0, . . . , tN = T,   ti = iΔt,   Δt = T/N,
  y0 = e^{−rT} q, . . . , yM = ȳ,   yj = y0 + jΔy,   Δy = (ȳ − y0)/M.
We approximate the derivatives by finite differences. So for vi,j := Ṽ(ti, yj) we
may use

  Ṽt(ti, yj) ≈ (v_{i+1,j} − v_{i,j})/Δt,
  Ṽy(ti, yj) ≈ (v_{i,j+1} − v_{i,j})/Δy,
  Ṽyy(ti, yj) ≈ (v_{i,j+1} − 2v_{i,j} + v_{i,j−1})/Δy².
Going backwards in time we set

  v_{N,j} = e^{αrT} yj^α/α
Figure 1: Value functions V0 and Ṽ (red) for Example 3.5

and for i = N − 1, . . . , 0 we determine vi,j, j = 0, . . . , M, by solving

  v_{i,0} = Ṽ(ti, y0) = q^α/α,
  v_{i,M} = Ṽ(ti, yM) = e^{c0(T−iΔt)} (e^{rT} ȳ)^α/α,
  0 = (v_{i+1,j} − v_{i,j})/Δt − ½ ((µ − r)/σ)² (v_{i,j+1} − v_{i,j})²/(v_{i,j+1} − 2v_{i,j} + v_{i,j−1}),   j = 1, . . . , M − 1.

Example 3.5 We implement the algorithm for parameters α = 0.1, r = 0.02,
µ = 0.1, σ = 0.4 and bound q = 0.9 for initial capital x0 = 1. The
numerical results (see Figures 1, 2) indicate

• Ṽ(t, y) ≤ V0(t, e^{rt} y),
• π̃ and hence π̂ are increasing in t and increasing in y with π̃ ∈ [0, π0],
  where π0 is the risky fraction corresponding to the unconstrained prob-
  lem,
• further, π̃(t, y) ց 0 for y ց e^{−rT} q.

Figure 2: Optimal strategy π0 and π̃ (red) for Example 3.5

Note that these results are only numerical, without saying anything about
existence. Further, since we only used an approximate boundary condition,
there is little hope to show convergence to the true solution.
Even with a correct boundary condition it would be difficult to obtain conver-
gence results since the equations for vi,j are nonlinear. But one can discretize
the HJB equation itself to get a discrete-time control problem which leads to
an approximate solution. This will be discussed in a subsequent chapter.
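
For completeness, here is a compact implementation of the scheme above with
the parameters of Example 3.5 (a sketch, not the original course code). To
keep each backward step explicit and to avoid the nonlinear equations in vi,j
just mentioned, the spatial differences are evaluated at the known time level
i + 1 instead of level i; this is one of several possible variants.

    import numpy as np

    # Explicit finite difference sketch for the portfolio insurer problem.
    alpha, r, mu, sigma, q, T = 0.1, 0.02, 0.1, 0.4, 0.9, 1.0
    ybar = 2.0                           # upper boundary, ybar >> exp(-rT) q
    N, M = 4000, 100                     # dt chosen small enough for stability
    dt = T / N
    y0 = np.exp(-r * T) * q
    y = np.linspace(y0, ybar, M + 1)
    dy = y[1] - y[0]

    c0 = alpha / (2 * (1 - alpha)) * ((mu - r) / sigma) ** 2
    lam = 0.5 * ((mu - r) / sigma) ** 2

    v = np.exp(alpha * r * T) * y ** alpha / alpha      # terminal values v_{N,j}
    for i in range(N - 1, -1, -1):
        vy = (v[2:] - v[1:-1]) / dy                     # one-sided difference
        vyy = (v[2:] - 2 * v[1:-1] + v[:-2]) / dy ** 2  # stays negative here
        new = v.copy()
        new[1:-1] = v[1:-1] - dt * lam * vy ** 2 / vyy  # backward step
        new[0] = q ** alpha / alpha                     # lower boundary
        new[-1] = (np.exp(c0 * (T - i * dt))
                   * (np.exp(r * T) * ybar) ** alpha / alpha)
        v = new

    print("Vtilde(0, y) on a few grid points:", v[::25])

The output reproduces the qualitative behaviour reported in Example 3.5.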

3.6 Dynamic programming for deterministic optimal con-


trol problems
We consider a deterministic control problem, where the n-dimensional state
X(t) of the system evolves according to the ODE

  Ẋ(t) = b(t, X(t), u(t)),   X(0) = x0,   (3.20)

and the controls are given by measurable functions

u : [0, T ] → U ⊆ IRp .

We use the notation X(t) and u(t) for deterministic functions of time while
we reserve Xt and ut (’t’ as index) for stochastic processes. For a dynamic
programming approach we also look at starting time t and consider the per-
formance criterion
Z T
J(t, x, u) = ψ(s, X(s; t, x), u(s))ds + Ψ(T, X(T ; t, x))
t

where X(s; t, x) is a solution of (3.20) starting at t with x. Proper conditions


have to be imposed on ψ,Ψ and the controls u such that a solution of (3.20)
exists and J is well defined. Suppose that for the latter these are specified in
some admissibility set A(t, x).
We consider the minimization of J,

  V(t, x) = inf_{u∈A(t,x)} J(t, x, u).

Remark 3.6 A problem like the above with running cost ψ and terminal cost Ψ
is called a Bolza problem, with ψ and without terminal cost a Lagrange prob-
lem, and with Ψ and without running cost a Mayer problem. In deterministic
optimal control these three problems are equivalent in the sense that each one
of them can be transformed into one of the other formulations.
E.g. transforming a Bolza problem into a Mayer problem can be done by intro-
ducing a further variable xn+1 with dynamics Ẋn+1(t) = ψ(t, X(t), u(t)) =:
bn+1(t, X(t), Xn+1(t), u(t)). Then we consider the (n + 1)-dimensional con-
trolled process X̃ = (X⊤, Xn+1)⊤ and define b̃ = (b⊤, bn+1)⊤, J̃(t, x, xn+1, u) =
Xn+1(T; t, xn+1) + Ψ(T, X(T; t, x)), Ṽ(t, x, xn+1) = xn+1 + V(t, x), yielding the HJB

  Ṽt + inf_u { b⊤ Dx Ṽ + bn+1 Dxn+1 Ṽ } = 0,

which is the same as for the non-reformulated problem. These transforma-
tions are in general not possible for stochastic control due to the conditional
expectations involved.

Suitable conditions for the dynamic programming principle to work are in the
deterministic case similar to those for the stochastic control problem. Usually
we have two choices: being more restrictive on the controls or more restrictive
on the cost functions. Supposing that U is closed, conditions for the first
approach are e.g. that the control functions are piecewise continuous and b, ψ,
Ψ are continuous and continuously differentiable such that a solution X always
exists, cf. e.g. [FR75]. On the other hand we could only assume that the
controls are measurable functions and then need further boundedness conditions
on f = b, ψ, Ψ like

  ‖f(t, x, u) − f(t, y, u)‖ ≤ K ‖x − y‖ for all x, y,   ‖f(t, 0, u)‖ ≤ K

for all t, u, cf. e.g. [YZ99].


Under such conditions one can show that if V ∈ C 1,1 ([0, T ], IRn), then V satis-
fies the HJB equation

Vt (t, x) + inf ψ(t, x, u) + b(t, x, u)⊤ Dx V (t, x) = 0 (3.21)
u∈U

such that V (T, x) = Ψ(T, x).


On the other hand, under suitable conditions, if Φ ∈ C 1,1 ([0, T ], IRn) is a
solution of (3.21) with V (T, x) = Φ(T, x) and u∗ and the states X ∗ controlled
by u∗ solving (3.20) satisfy

Φt (t, X ∗ (t)) + ψ(t, X ∗ (t), u∗ (t)) + b(t, X ∗ (t), u∗ (t))⊤ Dx Φ(t, X ∗ (t)) = 0

then u∗ is an optimal control and V = Φ.


So we have the same algorithm as for the stochastic counterpart:

1. Write down the HJB and find a minimizer û(t, x).

2. Solve the HJB after plugging in û.

3. Verify the conditions on ψ, Ψ, u.

As we will see in the following example and in Section 3.7 we may mix or
interchange steps 1 and 2 if it works.

Example 3.7 Linear regulator problem (LQ system)
Consider the n-dimensional controlled system (with p-dimensional controls)

  Ẋ(t) = A(t)X(t) + B(t)u(t),

  J(t, x, u) = ∫_t^T ( X(s)⊤ C(s) X(s) + u(s)⊤ D(s) u(s) ) ds + X(T)⊤ R X(T),

where A, B, C, D are continuous, A, C, R are (n × n)-dimensional, B is (n ×
p)-dimensional, D is (p × p)-dimensional, C, D, R are symmetric, D positive
definite, C, R nonnegative definite. The HJB equation reads

  Vt(t, x) + inf_{u∈U} { (A(t)x + B(t)u)⊤ Dx V(t, x) + x⊤ C(t)x + u⊤ D(t)u } = 0

with boundary condition V(T, x) = x⊤ R x. We make an ansatz V(t, x) =
x⊤ K(t) x, where K is C¹ and symmetric and satisfies K(T) = R. Then
Vt(t, x) = x⊤ K̇(t) x, Dx V(t, x) = 2K(t)x, Dxx V(t, x) = 2K(t), yielding the
HJB equation

  x⊤ K̇(t) x + inf_{u∈U} { 2x⊤ A(t)⊤ K(t) x + 2u⊤ B(t)⊤ K(t) x + x⊤ C(t) x + u⊤ D(t) u } = 0.

A pointwise minimization yields

û(t, x) = −D(t)−1 B(t)⊤ K(t)x.

Plugging this in the HJB equation yields

x⊤ K̇x + x⊤ (KA + A⊤ K)x − x⊤ KB(D −1 )B⊤ Kx + x⊤ Cx = 0.

A sufficient condition is that K satisfies

K̇(t) = −(K(t)A(t) + A(t)⊤ K(t)) + K(t)B(t)(D(t)−1 )B(t)⊤ K(t) − C. (3.22)

Under our conditions (C, D, R symmetric, D positive definite, C, R nonnegative
definite) a C¹-solution of this matrix Riccati equation with K(T) = R exists,
cf. [FR75, Theorem 5.2]. Then it can be shown that V(t, x) = x⊤ K(t) x and
that an optimal control is given by û.
So the optimally controlled system state satisfies

  (d/dt) X∗(t) = A(t) X∗(t) − B(t) D(t)^{-1} B(t)⊤ K(t) X∗(t),

which is a linear ODE and hence relatively easy to handle. This model is
often used in engineering, partly as an approximation for more complex models.
But for dimensions n > 1 there is no explicit solution of the matrix Riccati
equation. Of course there is a bunch of numerical methods available to ap-
proximate the solution.
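
A numerical sketch (with illustrative 2-dimensional data): integrating (3.22)
backwards from K(T) = R with an explicit Euler scheme gives K(t) and the
feedback gain of û.

    import numpy as np

    # Backward Euler integration of the matrix Riccati equation (3.22).
    T, N = 1.0, 10_000
    dt = T / N
    A = np.array([[0.0, 1.0], [-1.0, -0.5]])   # illustrative system matrices
    B = np.array([[0.0], [1.0]])
    C = np.eye(2)
    D = np.array([[1.0]])
    R = np.eye(2)

    def riccati_rhs(K):
        # Kdot = -(KA + A'K) + K B D^{-1} B' K - C, cf. (3.22)
        return -(K @ A + A.T @ K) + K @ B @ np.linalg.solve(D, B.T) @ K - C

    K = R.copy()
    for _ in range(N):                          # step from t = T down to 0
        K = K - dt * riccati_rhs(K)

    print("K(0) =\n", K)
    print("gain D^{-1} B' K(0) =", np.linalg.solve(D, B.T @ K))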

3.7 Stochastic linear regulator problem
We now want to look at a stochastic version of Example 3.7. The comparison
will allow us to see what the influence of the additional noise really is. Consider
with the same conditions on the still deterministic matrices A, B, C, D, R and
for m-dimensional Brownian motion W and non-singular σσ⊤

  dXt = (A(t)Xt + B(t)ut) dt + σ dWt,

  J(t, x, u) = Et,x[ ∫_t^T ( Xs⊤ C(s) Xs + us⊤ D(s) us ) ds + XT⊤ R XT ]

and V(t, x) = inf_u J(t, x, u). Based on Section 3.1 we get the HJB equation

  0 = inf_{u∈U} { x⊤ C(t)x + u⊤ D(t)u + (A(t)x + B(t)u)⊤ Dx V(t, x) + ½ tr(σσ⊤ Dxx V(t, x)) }.
Due to the expectation we have to expect a more complicated dependency of
the value function on time and hence make the ansatz

  V(t, x) = k(t) + x⊤ K(t) x,   k(T) = 0, K(T) = R,

with continuously differentiable k and K and symmetric K. Similarly to
Example 3.7 we have Vt(t, x) = k̇(t) + x⊤ K̇(t) x, Dx V(t, x) = 2K(t)x,
Dxx V(t, x) = 2K(t), yielding the HJB equation

  inf_{u∈U} { k̇(t) + x⊤ K̇(t) x + 2x⊤ A(t)⊤ K(t) x + 2u⊤ B(t)⊤ K(t) x + x⊤ C(t) x + u⊤ D(t) u + tr(σσ⊤ K(t)) } = 0,

and a pointwise minimization yields the same minimizer

  û(t, x) = −D(t)^{-1} B(t)⊤ K(t) x.

Plugging this into the HJB equation we obtain

  k̇ + x⊤ K̇ x + x⊤(KA + A⊤K) x − x⊤ K B D^{-1} B⊤ K x + x⊤ C x + tr(σσ⊤ K) = 0.

A sufficient condition is that k and K satisfy

k̇(t) = −tr(σσ⊤ K(t)),


K̇(t) = −(K(t)A(t) + A(t)⊤ K(t)) + K(t)B(t)(D(t)−1 )B(t)⊤ K(t) − C.

The latter is the same matrix Riccati equation as (3.22) and the differential
equation for k simply yields for k(T) = 0

  k(t) = ∫_t^T tr(σσ⊤ K(s)) ds.

Since we minimize, this can be seen as the additional costs we have to pay for
the noise. Apart from that the solution has absolutely the same structure as
for the deterministic problem in Example 3.7.

4 Viscosity solutions and stochastic control
We have seen that the value function V solves the HJB equation if V is smooth,
which means for deterministic control problems V ∈ C^{1,1} and for stochastic
control problems V ∈ C^{1,2}. Often V is not smooth, as the examples in
Sections 4.1 and 4.3 will show. Then we need another concept of solutions of
the HJB equation; these are viscosity solutions, for which we can show in
Section 4.6 that the value function solves the HJB equation in the viscosity sense.

4.1 A deterministic example


In the setting of Section 3.6, U = [−1, 1], b(x, u) = u, A(t, x) measurable
controls with values in U. The dynamics of the controlled process is

  Ẋ(t) = b(X(t), u(t)),

i.e.

  X(t) = x0 + ∫_0^t u(s) ds = X(t0) + ∫_{t0}^t u(s) ds.
As performance criterion we use J(t, x, u) = X(T; t, x)², which we maximize:

  V(t, x) = sup_{u∈A(t,x)} J(t, x, u) = sup_{u∈A(t,x)} ( x + ∫_t^T u(s) ds )².

We can see directly

  V(t, x) = (x + T − t)² for x ≥ 0,   V(t, x) = (x − T + t)² for x ≤ 0,

with optimal control û(x) = 1 for x ≥ 0 and û(x) = −1 for x ≤ 0. This yields

  Vx(t, x) = 2(x + T − t) for x > 0,   Vx(t, x) = 2(x − T + t) for x < 0,

so

  Vx(t, 0+) = 2(T − t) > 0 > −2(T − t) = Vx(t, 0−).

Thus V is not smooth, since it is not differentiable at x = 0.

4.2 Upper and lower semi-continuous functions

A real-valued function f on IRn is called upper semi-continuous (u.s.c.) if

  lim sup_{xn→x} f(xn) = f(x),   x ∈ IRn,

and lower semi-continuous (l.s.c.) if

  lim inf_{xn→x} f(xn) = f(x),   x ∈ IRn.

So, if an u.s.c. function f has a jump, the upper point belongs to the graph
of f; if f is l.s.c., the lower point. An u.s.c. function attains its supremum on
compacta (the maximum exists), a l.s.c. function attains its infimum (the
minimum exists) on compacta.

4.3 A stochastic example
We consider U = IR and a two-dimensional controlled process (X, Y)⊤ with
dynamics

  dXt = Yt dWt1,
  dYt = ut dt + dWt2,

where (W1, W2)⊤ is a standard Wiener process. So we have coefficients

  b(x, y, u) = (0, u)⊤,   σ(x, y) = Diag((y, 1)⊤).
For any l.s.c. function g let

  J(t, x, y, u) = Et,x,y[g(XT)]

and A(t, x, y) = {u | X, Y exist, Et,x,y[g−(XT)] < ∞, X a martingale},

  V(t, x, y) = sup_{u∈A(t,x,y)} J(t, x, y, u).

Suppose that V ∈ C^{1,2}. Then one can show analogously to Section 2.2 – using
that we have continuous b and σ – that V satisfies

  sup_{u∈U} { Vt + b⊤ DV + ½ tr(σσ⊤ D²V) } ≤ 0.

Computing the gradient DV and the Hessian D²V we get on [0, T) × IR²

  sup_{u∈U} { Vt + u Vy + ½ (y² Vxx + Vyy) } ≤ 0.
Since we can choose any u ∈ U = IR this implies Vy(t, x, y) = 0 for all (t, x, y) ∈
[0, T) × IR². Therefore V is constant w.r.t. y and we may consider from now
on V(t, x) = V(t, x, y) for any y ∈ IR. We get

  Vt + ½ y² Vxx ≤ 0.

Choosing y = 0 yields Vt(t, x) ≤ 0, so V is decreasing in t ∈ [0, T). Looking
at y → ∞ we obtain Vxx(t, x) ≤ 0 for all x, so V is concave in x for all t. Using
these properties and that g is l.s.c. one can prove that

  V(T−, x) ≥ V(T, x) = g(x),   x ∈ IR.
So V(t, x) ≥ g(x) for all x and V(t, x) is concave in x, which implies that
V(t, x) ≥ ğ(x), where ğ is the concave envelope of g (the smallest concave
function greater than or equal to g).
On the other hand we have by Jensen's inequality and the martingale property
of X

  V(t, x) ≤ sup_{u∈A(t,x)} Et,x[ğ(XT)] ≤ sup_{u∈A(t,x)} ğ(Et,x[XT]) = ğ(x).

Therefore V (t, x) = ğ(x). Now, if ğ is not twice continuously differentiable we


have a contradiction to our assumption V ∈ C 1,2 , i.e. V cannot be smooth.
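
On a grid the concave envelope ğ can be computed as the upper hull of the
points (x, g(x)) (a sketch; the function g below is an illustrative choice):

    import numpy as np

    def concave_envelope(x, g):
        """Smallest concave function >= g on the grid x (upper hull)."""
        hull = [0]                             # indices of the hull points
        for i in range(1, len(x)):
            hull.append(i)
            # drop middle points while the slopes are not decreasing
            while len(hull) >= 3:
                i0, i1, i2 = hull[-3], hull[-2], hull[-1]
                s1 = (g[i1] - g[i0]) / (x[i1] - x[i0])
                s2 = (g[i2] - g[i1]) / (x[i2] - x[i1])
                if s1 >= s2:                   # decreasing slopes: concave
                    break
                del hull[-2]
        return np.interp(x, x[hull], g[hull])  # piecewise linear envelope

    x = np.linspace(-2.0, 2.0, 401)
    g = np.sin(3 * x)                          # continuous (hence l.s.c.), not concave
    g_env = concave_envelope(x, g)
    print(g_env[::80])                         # flat at 1 between the two peaks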

4.4 Viscosity solutions
Consider
F (x, v(x), Dx v(x), Dxx v(x)) = 0 (4.23)
for x ∈ O ⊂ IRn open. Let F be continuous with values in IR, satisfying the
ellipticity condition

  F(x, r, p, A) ≤ F(x, r, p, A′)   for all A ≥ A′,   (4.24)

where 'A ≥ A′' means that A − A′ is positive definite.

Before defining viscosity solutions we shall look at classical (smooth) solutions
to gain some insight.

Definition 4.1 v : O → IR is a classical supersolution (subsolution) of (4.23)
if v ∈ C²(O) and

  F(x, v(x), Dx v(x), Dxx v(x)) ≥ 0 (≤ 0)   for all x ∈ O.

Proposition 4.2 Suppose v ∈ C 2 (O). Then v is a classical supersolution


(subsolution) if and only if for all x̂ ∈ O, w ∈ C 2 (O), for which x̂ is a mini-
mizer (maximizer) of v − w, we have

F (x̂, v(x̂), Dx w(x̂), Dxx w(x̂)) ≥ 0 (≤ 0). (4.25)

Proof: Given that (4.25) holds, we simply have to choose w = v. For the
opposite direction suppose that v is a classical supersolution and that x̂ ∈ O,
w ∈ C 2 (O), x̂ a minimizer of v − w. Since x̂ is a minimizer we have Dx v(x̂) =
Dx w(x̂), Dxx v(x̂) ≥ Dxx w(x̂), hence by (4.24)

F (x̂, v(x̂), Dx w(x̂), Dxx w(x̂)) = F (x̂, v(x̂), Dx v(x̂), Dxx w(x̂))
≥ F (x̂, v(x̂), Dx v(x̂), Dxx v(x̂))
≥ 0.

Thinking of F as an HJB equation, the idea is to define (viscosity) sub- and
supersolutions v by (4.25) holding for all x̂ and smooth w, and to weaken the
conditions on v such that one can show that the value function of the control
problem is a viscosity solution. Even if v is not smooth, one can then work
with smooth w.
To this end define for a real-valued g the lower semi-continuous envelope

  g̲(x) = lim inf_{y→x} g(y)

and the upper semi-continuous envelope

  ḡ(x) = lim sup_{y→x} g(y).

Definition 4.3 Suppose that v : O → IR is locally bounded.

(i) v is a (viscosity) supersolution of (4.23) if

  F(x̂, v̲(x̂), Dx w(x̂), Dxx w(x̂)) ≥ 0

for all x̂ ∈ O, w ∈ C²(O) such that x̂ is a minimizer of v̲ − w.

(ii) v is a (viscosity) subsolution of (4.23) if

  F(x̂, v̄(x̂), Dx w(x̂), Dxx w(x̂)) ≤ 0

for all x̂ ∈ O, w ∈ C²(O) such that x̂ is a maximizer of v̄ − w.

(iii) v is a viscosity solution of (4.23) if v is both a super- and a subsolution
of (4.23).

4.5 Properties
We start with a change of variables formula (proof not difficult):
Proposition 4.4 Let v be a l.s.c. supersolution of (4.23). If f ∈ C¹(IR) with
f′ ≠ 0 on IR, then

  ṽ = f^{-1} ∘ v

is a supersolution (subsolution) of

  F̃(x, ṽ(x), Dx ṽ(x), Dxx ṽ(x)) = 0

if f′ > 0 (if f′ < 0), where

  F̃(x, r, p, A) = F(x, f(r), f′(r) p, f″(r) p p⊤ + f′(r) A).
The proof of the following proposition is relatively easy compared to other
convergence results for PDEs. It is very important for numerical schemes.
Proposition 4.5 (i) Let vε be a l.s.c. supersolution of Fε(x, Dx v(x), Dxx v(x)) =
0, ε > 0, where Fε is continuous and satisfies (4.24). Suppose (ε, x) ↦ vε(x),
(ε, x, p, A) ↦ Fε(x, p, A) are locally bounded. Define

  v0(x) = lim inf_{(ε,y)→(0,x)} vε(y),   F0(x, p, A) = lim sup_{(ε,x′,p′,A′)→(0,x,p,A)} Fε(x′, p′, A′).

Then v0 is a l.s.c. supersolution of F0(x, Dx v(x), Dxx v(x)) = 0.

(ii) Let vε be a u.s.c. subsolution of Fε(x, Dx v(x), Dxx v(x)) = 0, ε > 0, where
Fε is continuous and satisfies (4.24). Suppose (ε, x) ↦ vε(x), (ε, x, p, A) ↦
Fε(x, p, A) are locally bounded. Define

  v0(x) = lim sup_{(ε,y)→(0,x)} vε(y),   F0(x, p, A) = lim inf_{(ε,x′,p′,A′)→(0,x,p,A)} Fε(x′, p′, A′).

Then v0 is a u.s.c. subsolution of F0(x, Dx v(x), Dxx v(x)) = 0.


Example 4.6 For the example of Section 4.3 one can show, by an approximation
argument using Proposition 4.5, that V (t, x) = ğ(x) is a viscosity solution of
the given HJB equation.
4.6 Viscosity solutions and HJB equations
Now let us also take the dependency on time into account, i.e. consider F of
the form

F (t, x, D_x V (t, x), D_{xx} V (t, x)) = −V_t(t, x) − sup_{u∈U} {ψ(t, x, u) + L^u V (t, x)} = 0    (4.26)

on Q = [0, T) × IR^n.
Theorem 4.7 Suppose that b, σ, ψ are, for any fixed u, in C¹(Q) ∩ C(Q̄), and
that b, σ have uniformly bounded derivatives w.r.t. t, x and are of linear growth
in x and u. Then for any u.s.c. subsolution V_* and any l.s.c. supersolution V^*
of (4.26),

sup_{(t,x)∈Q} (V_*(t, x) − V^*(t, x)) = sup_{x∈IR^n} (V_*(T, x) − V^*(T, x)).
Theorem 4.8 Suppose that the value function V is locally bounded on Q, that ψ
is continuous in t, x for all fixed u, and that

(t, x) ↦ sup_{u∈U} {ψ(t, x, u) + L^u V (t, x)}

is continuous in t, x. Then V is a viscosity solution of (4.26).
4.7 Portfolio optimization under transaction costs
We consider the model of Section 2.4 with one stock and finite time horizon T .
So we have a bond and a stock with prices following

dBt = Bt r dt, B0 = 1
dSt = St (µ dt + σ dWt ), S0 = 1.

Without costs we saw in Section 2.4 that for maximizing expected power utility
x^α/α of terminal wealth it is optimal to keep a constant fraction

π_M = (1/(1 − α)) (μ − r)/σ²

of the portfolio value (wealth) invested in the stock. Such a strategy requires
continuous trading, since the position always has to be adjusted when the stock
does not evolve like the money market. Because the stock prices are not of
finite variation, any reasonable costs for such trading would lead to the
immediate ruin of an investor trying to follow it.
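As a quick numerical illustration, the Merton fraction is straightforward to
evaluate. A minimal sketch in Python (the function name is ours; the parameter
values are the ones used again in Example 4.12 below):

    # Merton fraction pi_M = (mu - r) / ((1 - alpha) * sigma^2):
    # the optimal constant risky fraction without transaction costs
    def merton_fraction(mu, r, sigma, alpha):
        return (mu - r) / ((1.0 - alpha) * sigma**2)

    print(merton_fraction(0.096, 0.0, 0.4, 0.1))   # 0.667 for alpha = 0.1
    print(merton_fraction(0.096, 0.0, 0.4, -1.0))  # 0.3   for alpha = -1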

4.7.1 Wealth processes under transaction costs
We consider proportional transaction costs: an investor has to pay a fraction
γ ∈ (0, 1) of his transaction volume as fees, i.e. he pays γ|∆|, where ∆ is the
amount of money for which stocks are bought (∆ > 0) or sold (∆ < 0). The fees
are paid from the bond (bank account).
To compute the costs we need two processes to describe the portfolio. We use
the wealth X^0 in the bond and the wealth X^1 in the stock. It is not clear what
the value of the portfolio should be. At the terminal time we distinguish two
possibilities:

X_T = X_T^0 + X_T^1  (total wealth, no liquidation costs),
X̄_T = X_T − γ|X_T^1|  (wealth of the liquidated portfolio).

To make sure that it is always possible to liquidate the position in the stocks
and thus to end up with a (strictly) positive wealth after liquidation, one has
to ensure that the two-dimensional wealth process (X_t^0, X_t^1)_{t∈[0,T]} stays
in (the closure of) the solvency cone S given by

S = {(x0, x1) ∈ IR² : x0 + x1 − γ|x1| > 0},

which is the interior of a cone with boundaries x1(x0) = −x0/(1 − γ) for x0 < 0
and x1(x0) = −x0/(1 + γ) for x0 ≥ 0.
As controls we consider the cumulated purchases L_t and the cumulated sales M_t
up to time t, which are assumed to be adapted, positive, increasing and right
continuous. The controlled wealth processes are then, for (x0, x1) ∈ S, given by

X_t^0 = x0 + ∫_0^t r X_s^0 ds + (1 − γ)M_t − (1 + γ)L_t,    (4.27)
X_t^1 = x1 + ∫_0^t μ X_s^1 ds + ∫_0^t σ X_s^1 dW_s + L_t − M_t.    (4.28)

The controls L and M are admissible if X^0, X^1 are well defined and
(X_t^0, X_t^1) ∈ S̄ for all t ∈ [0, T].

4.7.2 Value function

For (x0, x1) ∈ S we consider the value functions

V (t, x0, x1) = sup_{L,M} E_{t,x0,x1}[ (1/α) X_T^α ],
V̄ (t, x0, x1) = sup_{L,M} E_{t,x0,x1}[ (1/α) X̄_T^α ],

where the supremum is taken over all admissible control processes L, M starting
at t with L_{t−} = 0, M_{t−} = 0; this class is denoted by A(t, x0, x1).

Proposition 4.9 V and V̄ are continuous on S̄, concave, strictly increasing
in x0, x1, and satisfy the homotheticity property

V (t, y x0, y x1) = y^α V (t, x0, x1) for all y > 0.
Proof: We shall only give the argument for the homotheticity property. The
proof of the concavity uses similar arguments, based on the elementary
definition of a concave function.

Due to the linearity of (4.27), (4.28) one can easily verify that at t the
controls L, M are admissible for (x0, x1) ∈ S if and only if yL, yM are
admissible for (yx0, yx1), and that

X_T^0(t, yx0, yx1, yL, yM) = y X_T^0(t, x0, x1, L, M),
X_T^1(t, yx0, yx1, yL, yM) = y X_T^1(t, x0, x1, L, M),

where e.g. X_T^0(t, x0, x1, L, M) denotes the terminal wealth in the bond when
starting at t with (x0, x1) and using L, M ∈ A(t, x0, x1). Therefore

X_T(t, yx0, yx1, yL, yM) = X_T^0(t, yx0, yx1, yL, yM) + X_T^1(t, yx0, yx1, yL, yM) = y X_T(t, x0, x1, L, M)

and

V (t, yx0, yx1) = sup_{(L,M)∈A(t,x0,x1)} E_{t,x0,x1}[ (1/α) (y X_T(t, x0, x1, L, M))^α ] = y^α V (t, x0, x1).  □

4.7.3 Heuristic derivation of the HJB equation

In general, the controlled processes might have jumps, so we are no longer in
the situation considered before. The correct methods are provided by singular
stochastic control theory, see [FS93]. Here we outline a heuristic approach
which allows us to conjecture the HJB equation and the form of optimal control
strategies, and which then requires suitable verification results.
Suppose that L and M are absolutely continuous, i.e.

L_t = ∫_0^t l_s ds,  M_t = ∫_0^t m_s ds,    (4.29)

where 0 ≤ l_t, m_t ≤ κ for some maximum rate κ > 0. Then we can rewrite (4.27),
(4.28) as

dX_t^0 = (r X_t^0 + (1 − γ)m_t − (1 + γ)l_t) dt,
dX_t^1 = (μ X_t^1 + l_t − m_t) dt + σ X_t^1 dW_t,
and under further regularity conditions we obtain the HJB equation

sup_{l,m} {V_t(t, x0, x1) + LV (t, x0, x1) + L^B V (t, x0, x1) l + L^S V (t, x0, x1) m} = 0,
where

LV = r x0 V_{x0} + μ x1 V_{x1} + (1/2) σ² x1² V_{x1x1},
L^B V = V_{x1} − (1 + γ)V_{x0},
L^S V = (1 − γ)V_{x0} − V_{x1}.

Then, if V is smooth enough, maximizers are given by

l̂(t, x0, x1) = 0 if L^B V (t, x0, x1) < 0,  l̂(t, x0, x1) = κ if L^B V (t, x0, x1) ≥ 0,

and

m̂(t, x0, x1) = 0 if L^S V (t, x0, x1) < 0,  m̂(t, x0, x1) = κ if L^S V (t, x0, x1) ≥ 0.
Thus on the no trading region NT, defined by

NT = {(t, x0, x1) ∈ [0, T] × S : l̂(t, x0, x1) = 0, m̂(t, x0, x1) = 0},

we have V_t + LV = 0. Further we can introduce the buy region B and the sell
region S where it is optimal to buy and to sell, respectively,

B = {(t, x0, x1) ∈ [0, T] × S : l̂(t, x0, x1) > 0},
S = {(t, x0, x1) ∈ [0, T] × S : m̂(t, x0, x1) > 0}.
Due to Proposition 4.9, V_{x0} is strictly positive and thus
(1 + γ)V_{x0} > (1 − γ)V_{x0}. Therefore the condition L^B V ≥ 0 implies
L^S V < 0, and L^S V ≥ 0 implies L^B V < 0, hence the regions may equivalently
be defined by

NT = {(t, x0, x1) : V_t(t, x0, x1) + LV (t, x0, x1) = 0},
B = {(t, x0, x1) : L^B V (t, x0, x1) ≥ 0},    (4.30)
S = {(t, x0, x1) : L^S V (t, x0, x1) ≥ 0}.

So they provide a partition of [0, T] × S, i.e. NT ∪ B ∪ S = [0, T] × S and the
three sets are pairwise disjoint.
We shall assume that for given t the slices NT(t), B(t), S(t) have an interval
structure with NT(t) = (x̲1(t, x0), x̄1(t, x0)), S(t) = [x̄1(t, x0), ∞),
B(t) = (−x0/(1 − γ), x̲1(t, x0)] for x0 < 0 and B(t) = (−x0/(1 + γ), x̲1(t, x0)]
for x0 ≥ 0, where x̲1(t, x0) ≤ x̄1(t, x0) and (x0, x̲1(t, x0)), (x0, x̄1(t, x0)) ∈ S.
Further suppose that we start at some (x0, x1) ∈ NT(0). Then the process
(X_t^0, X_t^1) will only hit the boundaries of B and S, never their interiors.
Assuming that V is continuously differentiable, we have L^B V = 0 and L^S V = 0
on the boundaries ∂B ∩ ∂NT and ∂S ∩ ∂NT, respectively, and L^B V < 0, L^S V < 0
on NT. Thus we might extend V as a solution of L^B V = 0 on B and of L^S V = 0
on S without changing the optimal policy. This leads to the variational
inequalities

max{V_t + LV, L^B V, L^S V} = 0    (4.31)
whose solution provides us with the optimal trading regions. According to
(4.30) we have equality V_t + LV = 0, L^B V = 0, L^S V = 0 on NT, B, S,
respectively. Note that this is not an HJB equation in the sense that we
maximize over possible strategies. Rather, it is a set of variational
inequalities, and we have to find the free boundaries between the regions on
which one of the inequalities is active (= 0). Solving these free boundary
problems yields the trading regions NT, B, S. One may then view the maximum in
(4.31) as a supremum over the three possible actions which choose one of the
inequalities, so the maximum of these choices corresponds to the optimal action
(hold, buy or sell stocks).
Using the homotheticity property of Proposition 4.9 and assuming that V is
continuously differentiable, we get for y > 0

∂V (t, y x0, y x1)/∂x0 = ∂[y^α V (t, x0, x1)]/∂x0 = y^α V_{x0}(t, x0, x1)

and on the other hand – not using the homotheticity property – by the chain rule

∂V (t, y x0, y x1)/∂x0 = y V_{x0}(t, y x0, y x1).

Comparing these two we have (the same holds for the partial derivative w.r.t. x1)

V_{x0}(t, y x0, y x1) = y^{α−1} V_{x0}(t, x0, x1),
V_{x1}(t, y x0, y x1) = y^{α−1} V_{x1}(t, x0, x1).    (4.32)

Since L^B V is linear in V_{x0} and V_{x1}, this means that if we have found a
point (t, x0, x1) ∈ B, then (t, y x0, y x1) ∈ B for all y > 0. So at t the whole
ray y(x0, x1), y > 0, belongs to B. This shows that NT(t), B(t), S(t) are cones.
So we know quite a bit about the trading regions. What about the strategy?

Remark 4.10 In similar problems it can be shown that controls L and M exist
such that (X_t^0, X_t^1) ∈ NT(t) and

L_t = ∫_0^t 1_{{(X_s^0, X_s^1) ∈ ∂B(s)}} dL_s,  M_t = ∫_0^t 1_{{(X_s^0, X_s^1) ∈ ∂S(s)}} dM_s,

compare [Ko99] and the references therein. So trading occurs only on the
boundary. Further it can be shown that only as much is traded as is needed for
the process to stay on the boundary. Mathematically, the controlled process
(X_t^0, X_t^1)_{t∈[0,T]} is a continuous reflected diffusion process, reflected
at the boundaries of NT, and trading only occurs through infinitesimally small
transactions at the local time on the boundary.

For the form of a verification theorem, which still has to be proved to
guarantee that the optimal strategy is of the conjectured form, we refer for a
similar problem to [Ko99] and the references therein.
4.7.4 No short selling, no borrowing

If, in addition, we require in the admissibility conditions that no short
selling takes place (X_t^1 ≥ 0) and no borrowing is allowed (X_t^0 ≥ 0), then we
consider instead of the solvency region the domain D = [0, ∞)² \ {(0, 0)} and
define the trading regions as subsets of D. Then it might happen that one of
the trading regions is empty. Further, if x0 = 0 (x1 = 0), we should exclude
the second (third) inequality in (4.31), since buying (selling) is not
admissible. This leads to the following theorem, for which we refer to Akian et
al. (1996)¹.

¹ M. Akian, A. Sulem, P. Séquier (1996): A finite horizon multidimensional
portfolio selection problem with singular transactions. In: Proceedings of the
34th Conference on Decision & Control, New Orleans, 2193–2197.

Theorem 4.11 V is a concave and continuous viscosity solution of

max{V_t + LV, L^B V, L^S V} = 0 on [0, T) × D \ ({0} × (0, ∞) ∪ (0, ∞) × {0}),
max{V_t + LV, L^S V} = 0 on [0, T) × {0} × (0, ∞),
max{V_t + LV, L^B V} = 0 on [0, T) × (0, ∞) × {0},

with V (T, x0, x1) = (1/α)(x0 + x1)^α. Further, V is unique in the class of
continuous functions h satisfying |h(t, x0, x1)| ≤ K(1 + (x0² + x1²)^α) for all
(x0, x1) ∈ D, t ∈ [0, T], and some constant K. The same is true for V̄ with
boundary condition V̄ (T, x0, x1) = (1/α)(x0 + (1 − γ)x1)^α.

The proof in Akian et al. (1996) is based on the derivation of a weak dynamic
programming principle leading to (4.31). The uniqueness is shown following
the Ishii technique, see [CIL92].

4.7.5 Reduction of the dimension

By homotheticity we have for the total wealth x = x0 + x1 and the risky
fraction π = x1/x

V (t, x0, x1) = x^α V (t, x0/x, x1/x) = x^α V (t, 1 − π, π) =: x^α Φ(t, π).

So we may try to reparameterize the problem as a control problem for the risky
fraction π_t = X_t^1/(X_t^0 + X_t^1). For V (t, x0, x1) = (x0 + x1)^α Φ(t, x1/(x0 + x1))
we have

V_t = x^α Φ_t,
V_{x0} = x^{α−1} (αΦ − πΦ_π),
V_{x1} = x^{α−1} (αΦ + (1 − π)Φ_π),
V_{x1x1} = x^{α−2} ((1 − π)² Φ_ππ − 2(1 − α)(1 − π)Φ_π − α(1 − α)Φ),

where we used x = x0 + x1 and π = x1/x as above. Plugging this into the
definition of the operators we get

LV = x^α L^π Φ,  L^B V = x^{α−1} L^π_B Φ,  L^S V = x^{α−1} L^π_S Φ,
where

L^π Φ(t, π) = α (r + π(μ − r) − (1/2)σ²(1 − α)π²) Φ(t, π)
  + (μ − r − σ²(1 − α)π) π(1 − π) Φ_π(t, π)
  + (1/2)σ² π²(1 − π)² Φ_ππ(t, π),
L^π_B Φ(t, π) = (1 + γπ)Φ_π(t, π) − αγΦ(t, π),
L^π_S Φ(t, π) = −(1 − γπ)Φ_π(t, π) − αγΦ(t, π).

Since x > 0, we thus get the variational inequalities

max{Φ_t(t, π) + L^π Φ(t, π), L^π_B Φ(t, π), L^π_S Φ(t, π)} = 0    (4.33)

for (t, π) ∈ [0, T) × (0, 1). Note that x0 ≥ 0, x1 ≥ 0, x > 0 imply π ∈ [0, 1].
We shall denote the corresponding trading regions in terms of π by NT^π, B^π,
S^π. The boundary conditions at terminal time T now read

Φ(T, π) = 1/α  and  Φ̄(T, π) = (1/α)(1 − γπ)^α.
On B^π we can solve L^π_B Φ = 0, yielding

Φ(t, π) = C_B(t)(1 + γπ)^α    (4.34)

with an unknown strictly positive function C_B. On S^π we get correspondingly

Φ(t, π) = C_S(t)(1 − γπ)^α.    (4.35)
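To see where (4.34) comes from, note that L^π_B Φ = 0 is just the linear ODE
(1 + γπ)Φ_π = αγΦ in π, so by separation of variables

(ln Φ)_π = αγ/(1 + γπ),  hence  ln Φ(t, π) = α ln(1 + γπ) + c(t),

which is (4.34) with C_B(t) = e^{c(t)}; the computation for (4.35) is analogous,
with 1 − γπ in place of 1 + γπ.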

We still have to discuss what happens on the boundaries π = 0 (corresponding to
x1 = 0 in Theorem 4.11) and π = 1 (corresponding to x0 = 0). We assume that
π_M, the optimal fraction without costs, lies in (0, 1). Then we can expect
NT^π(t) ≠ ∅ for t < T, so at t < T it can happen that (t, 0) ∈ NT^π or that
(t, 0) ∈ B^π. In the latter case we have the boundary condition
L^π_B Φ(t, 0) = 0. In the first case we have

Φ_t(t, 0) + αrΦ(t, 0) = 0.    (4.36)

For π = 1 we have the boundary condition L^π_S Φ(t, 1) = 0 if (t, 1) ∈ S^π, and

Φ_t(t, 1) + α(μ − (1/2)σ²(1 − α))Φ(t, 1) = 0    (4.37)

if (t, 1) ∈ NT^π.

Now we have everything at hand to set up a good numerical procedure to find the
free boundaries

a(t) = inf NT(t),  b(t) = sup NT(t).    (4.38)
4.7.6 A Semi-Smooth Newton Method

The algorithm we present to solve (4.33) is based on a primal-dual active set
strategy, compare Hintermüller et al. (2003)² and Ito and Kunisch (2006)³. Here
we face two free boundaries and a different type of constraints and have to
adapt their algorithm. We now work in the setting of Section 4.7.5, but no
longer use the superscript π.

² M. Hintermüller, K. Ito, K. Kunisch (2003): The primal-dual active set
strategy as a semismooth Newton method. SIAM Journal on Optimization 13, 865–888.
³ K. Ito, K. Kunisch (2006): Parabolic variational inequalities: The Lagrange
multiplier approach. Journal de Mathématiques Pures et Appliquées (9) 85, 415–449.

Problem (4.33) is equivalent to solving

Φ_t + LΦ + λ_B + λ_S = 0,    (4.39)
L_B Φ ≤ 0,  λ_B ≥ 0,  λ_B L_B Φ = 0,    (4.40)
L_S Φ ≤ 0,  λ_S ≥ 0,  λ_S L_S Φ = 0.    (4.41)

The two complementarity problems in (4.40), (4.41) can be written as

λ_B = max{0, λ_B + c L_B Φ},  λ_S = max{0, λ_S + c L_S Φ}    (4.42)

for any constant c > 0. So we have to solve (4.39), (4.42). At T the trading
regions are given by S(T) = [0, 1] for Φ̄ and NT(T) = [0, 1] for Φ. We split
[0, T] into N intervals and go backwards in time with t_N = T, t_n = t_{n+1} − ∆t,
∆t = T/N. Having computed Φ(t_{n+1}, ·) and the corresponding regions, we use
the following algorithm to compute v = Φ(t_n, ·) and NT(t_n):

0. Set v̄ = Φ(t_{n+1}, ·), k = 0, choose an interval NT_0 in [0, 1] and a
constant c > 0.

1. Define the boundaries a_k and b_k of NT_k as in (4.38).

2. On [a_k, b_k] solve (numerically) (1/∆t)(v̄ − v) + Lv = 0, using the boundary
conditions L_B v = 0 if a_k ∉ NT_k, (4.36) if a_k ∈ NT_k (implying a_k = 0),
and L_S v = 0 if b_k ∉ NT_k, (4.37) if b_k ∈ NT_k (implying b_k = 1).

3. If a_k ≠ 0, define v on [0, a_k) by (4.34). If b_k ≠ 1, define v on (b_k, 1]
by (4.35). Choose C_B and C_S such that v is continuous in a_k and b_k. Then
v_{k+1} = v is continuously differentiable.

4. Set

λ_B^{k+1} = 0 on (a_k, 1],  λ_B^{k+1} = −(1/∆t)(v̄ − v_{k+1}) − Lv_{k+1} on [0, a_k],

and

λ_S^{k+1} = 0 on [0, b_k),  λ_S^{k+1} = −(1/∆t)(v̄ − v_{k+1}) − Lv_{k+1} on [b_k, 1].
[Figure 3: Trading regions for α = 0.1, for Φ̄ (left, with liquidation) and Φ
(right, without liquidation); horizontal axis t ∈ [0, 1], vertical axis risky
fraction y ∈ [0, 1]; regions S (top), NT (middle), B (bottom).]

5. Introduce the active sets

B_{k+1} = {y ∈ [0, 1] : λ_B^{k+1}(y) + c L_B v_{k+1}(y) > 0},
S_{k+1} = {y ∈ [0, 1] : λ_S^{k+1}(y) + c L_S v_{k+1}(y) > 0}

and set NT_{k+1} = [0, 1] \ (B_{k+1} ∪ S_{k+1}). Verify that the interval
structure holds and define the boundaries a_{k+1} and b_{k+1} by (4.38).

6. If a_{k+1} = a_k and b_{k+1} = b_k, then set NT(t_n) = (a_{k+1}, b_{k+1}),
Φ(t_n, ·) = v_{k+1} and STOP; otherwise increase k by 1 and continue with step 1.
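To make the loop concrete, the following Python fragment sketches the core of
steps 4 and 5 for the first sweep at t_{N−1} with initial guess NT_0 = [0, 1],
so that both buy and sell sets start empty. This is only a schematic
illustration under simplifying assumptions: the derivatives are discretized
crudely with numpy.gradient, and the implicit solve of step 2 is replaced by
one explicit time step; all names are ours, not from the cited papers.

    import numpy as np

    # Illustrative parameters as in Example 4.12
    r, mu, sigma, alpha, gamma = 0.0, 0.096, 0.4, 0.1, 0.01
    J, dt, c = 1000, 0.01, 1.0
    y = np.linspace(0.0, 1.0, J + 1)       # grid for the risky fraction pi
    dy = y[1] - y[0]

    def d1(f):                             # first derivative, central differences
        return np.gradient(f, dy)

    def L(f):                              # operator L^pi of Section 4.7.5
        return (alpha * (r + y * (mu - r) - 0.5 * sigma**2 * (1 - alpha) * y**2) * f
                + (mu - r - sigma**2 * (1 - alpha) * y) * y * (1 - y) * d1(f)
                + 0.5 * sigma**2 * y**2 * (1 - y)**2 * d1(d1(f)))

    def LB(f):                             # buy operator L^pi_B
        return (1 + gamma * y) * d1(f) - alpha * gamma * f

    def LS(f):                             # sell operator L^pi_S
        return -(1 - gamma * y) * d1(f) - alpha * gamma * f

    vbar = np.full(J + 1, 1.0 / alpha)     # Phi(T, .) = 1/alpha (no liquidation)
    v = vbar + dt * L(vbar)                # crude explicit step instead of step 2

    buy0 = np.zeros(J + 1, bool)           # initial guess NT_0 = [0, 1]:
    sell0 = np.zeros(J + 1, bool)          # buy and sell sets start empty
    resid = (vbar - v) / dt + L(v)         # discretised Phi_t + L Phi
    lamB = np.where(buy0, -resid, 0.0)     # step 4: multipliers vanish off the
    lamS = np.where(sell0, -resid, 0.0)    # current buy/sell sets
    buy = lamB + c * LB(v) > 0             # step 5: refreshed active sets
    sell = lamS + c * LS(v) > 0
    nt = ~(buy | sell)
    print("NT approx. [%.3f, %.3f]" % (y[nt].min(), y[nt].max()))

Close to the terminal time this prints NT ≈ [0, 1] for Φ, which is consistent
with the right-hand panel of Figure 3, where without liquidation no trading
occurs near T.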

Example 4.12
We consider a bond and a stock with parameters r = 0, μ = 0.096, σ = 0.4, and
horizon T = 1. We use mesh sizes ∆t = 0.01 and ∆y = 0.001, choose c = 1, at
t_{N−1} use NT_0 = (0.1, 0.8), and at all other time steps t_n use
NT_0 = NT(t_{n+1}). For the utility function we consider both α = 0.1 and the
more risk averse parameter α = −1; without transaction costs these yield the
optimal risky fractions 0.667 and 0.3, respectively (dotted lines in Figure 3).
We consider proportional costs γ = 0.01. In Figure 3 we look at α = 0.1,
left-hand at V̄ with liquidation at the end, right-hand at V. We see that the
liquidation costs we have to pay at T imply that we also trade close to the
terminal time, while without liquidation this is never optimal. Using
Mathematica, the computation time for each graph was about 18 s.

4.7.7 More complex transaction costs

Typical transaction costs considered in portfolio theory are constant costs,
fixed costs (proportional to the portfolio value), and proportional costs
(proportional to the transaction volume). While the latter penalize the size of
a transaction, the first two punish the frequency of trading. So a combination
of both types is of interest, from a practical as well as from a theoretical
point of view. The trading strategies of interest are then so-called impulse
control strategies, consisting of a sequence of stopping times at which trading
takes place and of the transactions at those times. Optimal impulse control
strategies can then be described as solutions of quasi-variational inequalities,
which differ from variational inequalities like (4.31) by the inclusion of a
maximization problem in the inequalities for the buy and sell regions, which
determines the optimal transaction when trading.
The first type of costs considered were purely proportional costs – as above –
for which the optimal solution is given by a cone in which it is optimal not to
trade at all, and which corresponds to an interval for the risky fraction. When
the boundaries are reached, infinitesimal trading occurs in such a way that the
wealth process just stays in the cone.

Adding a constant component to the transaction costs punishes very frequent
trading and so avoids the occurrence of infinitesimal trading at the boundary.
An investor now has to choose discrete trading times and optimal transactions
at these times, so that the methodology of optimal impulse control comes into
play. The insight is that there is still some no-transaction region, but on
reaching its boundary, transactions are carried out in such a way that the
wealth process restarts at some curve between the boundary and the Merton line.
For constant costs the trading regions also depend on the total wealth.

A simpler approach is possible when considering fixed costs instead of constant
costs. For purely fixed costs we get a constant new risky fraction close to the
Merton fraction π_M. If combined with proportional costs, we get two different
new risky fractions after buying and after selling.

5 Optimal Stopping
We will cite some results on optimal stopping in Section 5.1 and then look at
the pricing of American options as an important application of the theory of
optimal stopping.

5.1 Some results on optimal stopping

For the results cited below we refer to [KS98, Appendix D], [Ok00, Chapter 10],
and [PS06].
Suppose that Y = (Y_t)_{t∈[0,T]} is a right continuous, positive stochastic
process, adapted to some filtration F = (F_t)_{t∈[0,T]} satisfying the usual
conditions and with trivial F_0. We would like to find

V (0) = sup_{τ∈S_{0,T}} E[Y_τ],

where S_{s,t} is the class of stopping times with values in [s, t], and an
optimal stopping time τ* for which V (0) is attained, i.e. V (0) = E[Y_{τ*}].
We assume that V (0) ∈ (0, ∞). The main idea to solve the optimal stopping
problem is the introduction of the Snell envelope

Ȳ_t = esssup_{τ∈S_{t,T}} E[Y_τ | F_t],  t ∈ [0, T].
The essential supremum X* = esssup X of a family of random variables X is
characterized by two conditions: (i) for all X ∈ X we have X ≤ X* (a.s.);
(ii) if Z is a random variable with X ≤ Z (a.s.) for all X ∈ X, then X* ≤ Z
(a.s.).

If the essential supremum is taken over a countable family of random variables
or over expectations (real numbers), it coincides (a.s.) with the supremum.
Therefore Ȳ_0 = V (0). By definition we also have Ȳ_T = Y_T.
Theorem 5.1 The Snell envelope Ȳ is the smallest supermartingale majorant of
Y. In particular we have for any t ∈ [0, T], τ ∈ S_{t,T},

E[Ȳ_τ | F_t] ≤ Ȳ_t.

For the characterization of an optimal stopping time the following theorem is
of utmost importance.

Theorem 5.2 τ* is optimal if and only if (i) (Ȳ_{τ*∧t})_{t∈[0,T]} is a
martingale and (ii) Y_{τ*} = Ȳ_{τ*}.

For the existence we need:

Assumption 5.3 E[sup_{0≤t≤T} Y_t] < ∞.

Theorem 5.4 Suppose Y satisfies the conditions above, in particular
Assumption 5.3. Then

τ* = inf{t ∈ [0, T] : Y_t = Ȳ_t}

is optimal.

Remark 5.5 In discrete time the Snell envelope can be introduced similarly
and constructed by backward induction. Say we have a stochastic process
(Y_n)_{n=0,...,N}, adapted to a filtration (F_n)_{n=0,...,N}, and we want to
find the value

v_n = sup_{τ∈S_{n,N}} E[Y_τ]

and an optimal stopping time τ_n* in the class S_{n,N} of stopping times with
values in {n, . . . , N}, i.e. for which E[Y_{τ_n*}] = v_n. In particular we are
interested in n = 0. Define

Ȳ_N = Y_N

and backwards for n = N − 1, . . . , 0

Ȳ_n = max{Y_n, E[Ȳ_{n+1} | F_n]}.

The resulting process (Ȳ_n)_{n=0,...,N} is the smallest supermartingale
majorant of (Y_n)_{n=0,...,N}, the Snell envelope. Further, by induction it
follows that

τ_n* = min{m = n, . . . , N : Ȳ_m = Y_m} = min{m = n, . . . , N : Ȳ_m ≤ Y_m}

is optimal with v_n = E[Y_{τ_n*}] = E[Ȳ_n].
Example 5.6 The problem of best choice (also known as the secretary problem):
A company wants to hire a mathematician and has invited N applicants for job
interviews. They are interviewed sequentially, and after each candidate it has
to be decided whether s/he is hired or not. The aim is to maximize the
probability of getting the best candidate. The rank of candidate n is given by
a random variable X_n, and we assume that all permutations of ranks have the
same probability. What can be observed up to time n are the relative ranks
R_1, . . . , R_n, where R_k takes values in {1, . . . , k}. So we consider the
filtration F_n = σ(R_1, . . . , R_n), n = 1, . . . , N. The company faces the
optimal stopping problem

v_n := sup_{τ∈S_{n,N}} E[Y_τ],  n = 1, . . . , N,

where

Y_n = P (X_n = 1 | R_1, . . . , R_n) = P (X_n = 1 | F_n),

and is interested in the optimal stopping times τ_n*, in particular for n = 1.
It can be shown that R_1, . . . , R_N are independent with

P (R_n = k) = 1/n,  k = 1, . . . , n,

and that

Y_n = (n/N) 1_{{R_n = 1}}.

Using that, due to independence, E[Ȳ_{n+1} | F_n] = v_{n+1}, we can compute the
Snell envelope Ȳ_n, τ_n*, and v_n by backward induction and show that the
optimal strategy can be described by

τ_1* = inf{n ≥ σ* : R_n = 1 or n = N},
σ* = inf{k ∈ {1, . . . , N} : k/N ≥ v_{k+1}}.

Thus – as long as σ* is not reached – the decision is always no, and from σ*
onwards the first candidate who is the best of those seen so far gets the job.
Further it can be shown that, looking at σ* and v_1 as functions of the number
of candidates N, we get

lim_{N→∞} σ*(N)/N = 1/e,  lim_{N→∞} v_1(N) = 1/e,

i.e., it is asymptotically optimal to send home the first N/e candidates after
their interview and then take the next relatively best candidate. This yields
approximately a probability of 1/e of getting the best of all candidates. For
more details we refer e.g. to [Ir03].
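The backward induction of Remark 5.5 makes these quantities easy to compute. A
minimal sketch in Python, using that E[Ȳ_{n+1} | F_n] = v_{n+1} and
P(R_n = 1) = 1/n, so that v_n = (1/n) max{n/N, v_{n+1}} + (1 − 1/n) v_{n+1}
(the function name is ours):

    def best_choice(N):
        # Backward induction for Example 5.6; returns (v_1, sigma*)
        v = 1.0 / N                     # v_N = P(R_N = 1) = 1/N
        sigma = N
        for n in range(N - 1, 0, -1):   # compute v_n from v_{n+1}
            if n / N >= v:              # condition k/N >= v_{k+1} in sigma*
                sigma = n
            v = max(n / N, v) / n + (1.0 - 1.0 / n) * v
        return v, sigma

    print(best_choice(100))             # approx (0.371, 38): close to 1/e, N/e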

5.2 Optimal Stopping for underlying Markov processes
Suppose now that Y_t = g(t, X_t), where (X_t)_{t∈[0,T]} is a strong Markov
process with values in IR^n, e.g. an Itô diffusion as defined in (2.4). Then
the optimal stopping problem reads

V (t, x) = sup_{τ∈S_{t,T}} E_{t,x}[g(τ, X_τ)],    (5.43)

where V is the value function of the optimal stopping problem. Assumption 5.3
now reads E_{0,x}[sup_{t∈[0,T]} g(t, X_t)] < ∞. Denoting by X_s(t, x) the
process starting at t in x, we get, corresponding to Theorem 5.4:

Theorem 5.7 Suppose that Assumption 5.3 holds, that V is lower semi
continuous and g upper semi continuous. Then

τ_{t,x} = inf{s ∈ [t, T] : V (s, X_s(t, x)) = g(s, X_s(t, x))}

is optimal for (5.43).

Furthermore, one can introduce the continuation set C and the stopping set D,

C = {(t, x) ∈ [0, T] × IR^n : V (t, x) > g(t, x)},
D = {(t, x) ∈ [0, T] × IR^n : V (t, x) = g(t, x)}.

Then for (0, x) ∈ C, the optimal stopping time τ_{0,x} = τ_D is the first entry
time of (t, X_t) into D.

5.3 Reminder on pricing of European options

We consider the Black-Scholes model for a financial market consisting of one
bond with prices

dB_t = B_t r dt,  B_0 = 1,  i.e. B_t = e^{rt},  t ∈ [0, T],

and one stock with prices evolving according to

dS_t = S_t (μ dt + σ dW_t),  S_0 = s_0 > 0,

where W is a Wiener process, μ ∈ IR, σ > 0, and the interest rate is r > 0.
Risk neutral pricing requires a change of measure to a new probability measure
under which the discounted price processes become martingales. This risk
neutral measure P̃ can be defined by a Radon-Nikodym density Z_T = dP̃/dP in
the following way:

P̃ (A) := E[Z_T 1_A],  A ∈ F_T,

where

Z_T = exp{ −((μ − r)/σ) W_T − (1/2)((μ − r)/σ)² T }.
Note that the density process (Z_t)_{t∈[0,T]}, defined by

Z_t := E[Z_T | F_t] = exp{ −((μ − r)/σ) W_t − (1/2)((μ − r)/σ)² t },

is a martingale under P. The expectation and the conditional expectation under
P̃ can then be computed by

Ẽ[X] = E[Z_T X],  Ẽ[X | F_t] = E[Z_T X | F_t] / E[Z_T | F_t] = Z_t⁻¹ E[Z_T X | F_t].

By the Girsanov theorem,

W̃_t := W_t + ((μ − r)/σ) t,  t ∈ [0, T],

defines a Wiener process under P̃. Rewriting the stock dynamics in terms of W̃
we get

dS_t = S_t (r dt + σ dW̃_t),

and applying Itô's formula to the discounted stock prices S̃_t := S_t/B_t yields

dS̃_t = S̃_t σ dW̃_t,

hence S̃ is a martingale under P̃. If we consider a self-financing trading
strategy π = (π_t)_{t∈[0,T]}, where π_t denotes the fraction of wealth X_t
invested in the stocks, we get for the wealth process

dX_t = r X_t dt + π_t X_t σ dW̃_t

and for the discounted wealth X̃_t := X_t/B_t

dX̃_t = π_t X̃_t σ dW̃_t.

Thus the discounted wealth process is also a (local) martingale under P̃. This
was to be expected, since we can only invest in the two martingales
B_t/B_t = 1 and S̃.
Say we have a financial derivative which pays at the terminal time the amount C
to its buyer. The arbitrage free price p(C) of this contingent claim C is given
by x_0 if we can find initial capital x_0 and a self-financing trading strategy
π such that the corresponding discounted wealth process X̃ = X̃^π satisfies

C/B_T = x_0 + ∫_0^T π_t X̃_t σ dW̃_t.    (5.44)

The Black-Scholes market is a so-called complete market model, in which we know
that every F_T-measurable, square-integrable claim C can be hedged by a trading
strategy as in (5.44). If the wealth process is indeed a martingale, we get
from (5.44)

p(C) = Ẽ[X̃_T^π] = Ẽ[C/B_T],

where the latter no longer depends directly on the trading strategy, so we can
simply compute the price by taking the expectation of the discounted claim with
respect to the risk neutral measure P̃.

Furthermore one can show that under these conditions the arbitrage free price
of C at time t is given by

p_t(C) = B_t Ẽ[C/B_T | F_t].

There are several approaches to find p_t(C). If we have a financial derivative
which pays at terminal time

C = Φ(S_T),

one can show under some integrability conditions on Φ that prices are of the
form

p_t(C) = f(t, S_t),

using the Markov property of the underlying stock prices. Applying a result
like the Feynman-Kac formula (see e.g. [St00]) we get, under suitable
conditions, that f satisfies for t ∈ [0, T), x > 0,

f_t(t, x) + r x f_x(t, x) + (1/2)σ² x² f_{xx}(t, x) − r f(t, x) = 0,
f(T, x) = Φ(x).

For example, for Φ(x) = (x − K)^+ we have a European call option, which offers
the right to buy the stock for the strike price K > 0 at time T. The solution
of the corresponding PDE above yields the so-called Black-Scholes formula for
the call price, cf. any textbook on financial mathematics.
The price of a put option, given by Φ(x) = (K − x)^+, can then be determined by
the put-call parity, cf. e.g. [KS98, Example 2.4.3].
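For concreteness, the Black-Scholes formula for the call reads
C(t, x) = x Φ(d_1) − K e^{−rτ} Φ(d_2) with τ = T − t,
d_{1,2} = (log(x/K) + (r ± σ²/2)τ)/(σ√τ), and Φ the standard normal
distribution function. A minimal sketch in Python (function names ours):

    from math import log, sqrt, exp, erf

    def norm_cdf(x):
        # standard normal distribution function via the error function
        return 0.5 * (1.0 + erf(x / sqrt(2.0)))

    def bs_call(x, K, tau, r, sigma):
        # Black-Scholes call price; tau = T - t is the time to maturity
        d1 = (log(x / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
        d2 = d1 - sigma * sqrt(tau)
        return x * norm_cdf(d1) - K * exp(-r * tau) * norm_cdf(d2)

    def bs_put(x, K, tau, r, sigma):
        # put price via the put-call parity p = c - x + K e^{-r tau}
        return bs_call(x, K, tau, r, sigma) - x + K * exp(-r * tau)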

5.4 American options
Using the same model and notation as in Section 5.3, we shall now look at
claims of American type, which guarantee a certain payout C_t if exercised at
time t. Such an American contingent claim consists of a continuous process
(C_t)_{t∈[0,T]} of F-adapted possible payouts, and the investor chooses a time
τ at which to exercise. Since at time t he can only use the information F_t, τ
should be modelled as a stopping time. So for the buyer of an American
contingent claim we have a problem of optimal stopping. For the meaning of
'optimal' see the discussion after Theorem 5.9.

So the buyer chooses a stopping time in S_{0,T}. In particular, if he has not
exercised before T, he receives the payout C_T at the terminal time. At time 0
the maximum the buyer is willing to pay is

p_B(C·) = sup{y : there exist τ ∈ S_{0,T} and π s.t. X_τ^π(0, −y) + C_τ ≥ 0}.

On the other hand, the seller needs at least

p_S(C·) = inf{z : there exists π s.t. X_t^π(0, z) − C_t ≥ 0 for all t ∈ [0, T]}.

We need the following assumption:

Assumption 5.8 sup_{τ∈S_{0,T}} Ẽ[C_τ/B_τ] < ∞.

Then we get:

Theorem 5.9 p_B(C·) = sup_{τ∈S_{0,T}} Ẽ[C_τ/B_τ] = p_S(C·). Furthermore, an
optimal τ* and a corresponding hedging strategy π* exist, satisfying

sup_{τ∈S_{0,T}} Ẽ[C_τ/B_τ] = Ẽ[C_{τ*}/B_{τ*}]

and

C_{τ*}/B_{τ*} = p_B(C·) + ∫_0^{τ*} π_t* X̃_t^{π*} σ dW̃_t.

Proof: This is a combination of Theorem 12.3.8 (b) in [Ok00] and Theorem 2.5.3
in [KS98]. □

So the theorem says that the arbitrage free price

p(C·) = p_B(C·) = p_S(C·)

is unique. It allows the buyer to find a stopping time τ and a trading strategy
for his initial debt −p(C·) such that he will make no losses almost surely. For
the seller it guarantees that he can cover the claim at any time the investor
exercises. The buyer will also be interested to know the optimal stopping time
τ* for which

sup_{τ∈S_{0,T}} Ẽ[C_τ/B_τ] = Ẽ[C_{τ*}/B_{τ*}],

because choosing τ* guarantees that he makes no losses if he hedges the payout.
Only in this sense is τ* optimal for the buyer. Note that the supremum in
Theorem 5.9 is over expectations under P̃ and not under the original measure P,
and therefore the strategy τ* is not optimal in the sense of best expected
payoff.
The proof of Theorem 5.9 shows that we have as arbitrage free price at time t

p_t(C·) = B_t esssup_{τ∈S_{t,T}} Ẽ[C_τ/B_τ | F_t].

This is B_t times the Snell envelope Ȳ_t of Y = (Y_s)_{s∈[0,T]}, Y_s = C_s/B_s.
Thus, under the conditions on Y from Section 5.1, those results carry over
directly – now applied under the measure P̃ – and they yield the existence of
τ*. So the main part in the proof of Theorem 5.9 concerns the existence of π*
and the equality of the prices. By Theorem 5.4 we know

τ* = inf{s ∈ [t, T] : p_s(C·)/B_s = Y_s} = inf{s ∈ [t, T] : p_s(C·) = C_s}.

Let us now consider the Markovian case, i.e. assume that

C_t = ψ(t, S_t)

for a suitable ψ. Then the price of the American contingent claim C is given by
the value function V, i.e. p_t(C·) = V (t, x) on {S_t = x}, where

V (t, x) = B_t sup_{τ∈S_{t,T}} Ẽ_{t,x}[ψ(τ, S_τ)/B_τ]

for t ∈ [0, T) and V (T, x) = ψ(T, x) for x > 0. Note that
Ẽ_{t,x}[ · ] = Ẽ[ · | S_t = x]. Following Section 5.2, the continuation region
is

C = {(t, x) ∈ [0, T] × (0, ∞) : V (t, x) > ψ(t, x)},    (5.45)

the stopping region is

D = {(t, x) ∈ [0, T] × (0, ∞) : V (t, x) = ψ(t, x)},    (5.46)

and the optimal stopping time is τ* = τ_D.

Example 5.10 For the American call option we have ψ(t, x) = (x − K)^+ for
some strike price K > 0. Then for τ ∈ S_{t,T}, Jensen's inequality, Theorem 5.1
and r ≥ 0 imply

Ẽ_{t,x}[(S_τ − K)^+/B_τ] ≥ (Ẽ_{t,x}[S̃_τ] − Ẽ_{t,x}[K/B_τ])^+
  = (1/B_t) (x − K Ẽ_{t,x}[e^{−r(τ−t)}])^+
  ≥ (1/B_t) (x − K)^+ = (1/B_t) ψ(t, x),

where the last inequality is strict if t < T and P̃_{t,x}(τ = T) > 0.
Therefore V (t, x) > ψ(t, x) for t < T, and D = {T} × (0, ∞), i.e. it is
optimal to exercise at the terminal time. Thus the American call has the same
(optimal) payout as the European call. This is no longer true if one considers
e.g. dividend payments.
5.5 The American put option
In the notation of the preceding section we now look at

ψ(t, x) = (K − x)+

for some K > 0. This corresponds to the American put option with strike
price K.
Following [PS06, Section 25], the optimal stopping problem can be transformed
to a free boundary problem to find the boundary between the continuation and
stopping regions. We sketch the procedure:

Step 1: From Section 5.4 we know that the arbitrage free price is given by

V (t, x) = B_t sup_{τ∈S_{t,T}} Ẽ_{t,x}[(K − S_τ)^+/B_τ],

and that we continue (do not stop) as long as (u, S_u(t, x))_{u∈[t,T]} lies in
the continuation region

C = {(t, x) ∈ [0, T] × (0, ∞) : V (t, x) > (K − x)^+},

and that the optimal stopping time τ* is the first entry time of
(u, S_u(t, x))_{u∈[t,T]} into the stopping region

D = {(t, x) ∈ [0, T] × (0, ∞) : V (t, x) = (K − x)^+}.

Step 2: All points (t, x) with x ≥ K, t < T belong to C. Further one can show
that all points with 0 < x < b_∞ belong to D, where b_∞ < K is the constant
boundary of the stopping region for the corresponding infinite time horizon
problem, for which an explicit solution can be derived, see e.g. [PS06,
Section 25]. Showing that x ↦ V (t, x) is convex on (0, ∞), it follows that a
function t ↦ b(t) (the boundary between C and D) exists such that

C = {(t, x) ∈ [0, T) × (0, ∞) : x > b(t)}

and

D = {(t, x) ∈ [0, T] × (0, ∞) : x ≤ b(t)} ∪ ({T} × (b(T), ∞)).

Since (K − x)^+ does not depend on time, t ↦ V (t, x) is decreasing and
therefore t ↦ b(t) is increasing.
Step 3: V is continuous.
Step 4: The smooth fit condition for x 7→ V (t, x) holds at b(t), i.e.

Vx (t, b(t)) = −1.

Step 5: b is continuous on [0, T) and satisfies b(T−) = K. So we may set
b(T) = K.
Step 6: Arguments like the ones we used to derive the HJB equation lead to the
fact that V is C^{1,2} on C and satisfies

V_t + L_S V − r V = 0

on C, where L_S is the generator of S under P̃,

L_S f(x) = r x f_x(x) + (1/2) σ² x² f_{xx}(x).

Steps 1–6 together yield the following free boundary problem for the value
function V and the unknown boundary b:

V_t(t, x) + L_S V (t, x) − r V (t, x) = 0,  (t, x) ∈ C,
V (t, x) > (K − x)^+,  (t, x) ∈ C,
V (t, x) = (K − x)^+,  (t, x) ∈ D,
V_x(t, x) = −1,  x = b(t).

It can then be proved (Theorem 25.3 in [PS06]):

Theorem 5.11 The boundary b(t) is the unique solution of

e^{−r(T−t)} ∫_0^K Φ( [log((K − y)/b(t)) − (r − σ²/2)(T − t)] / (σ√(T − t)) ) dy
  + rK ∫_t^T e^{−r(u−t)} Φ( [log(b(u)/b(t)) − (r − σ²/2)(u − t)] / (σ√(u − t)) ) du = K − b(t)

in the class of continuous increasing functions f : [0, T] → (0, ∞) satisfying
f(t) < K for t < T. (Here Φ denotes the standard normal distribution function.)

So far it does not seem to be possible to be more explicit, and numerical
schemes, e.g. like those we used in Section 4.7.6, have to be used.
Alternatively, one may approximate the Black-Scholes model by a binomial tree
model and proceed as in Remark 5.5 to find an approximation for the value
function and the optimal stopping time (or the free boundary).
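As an illustration of the binomial approach, the following Python sketch prices
the American put by the backward induction of Remark 5.5 in a
Cox-Ross-Rubinstein tree: at each node we take the maximum of the payoff and
the discounted risk neutral continuation value. The parameter values are only
illustrative.

    import math

    def american_put(s0, K, T, r, sigma, N):
        dt = T / N
        u = math.exp(sigma * math.sqrt(dt))     # up factor
        d = 1.0 / u                             # down factor
        q = (math.exp(r * dt) - d) / (u - d)    # risk neutral up-probability
        disc = math.exp(-r * dt)
        # payoffs at the terminal time t_N = T
        v = [max(K - s0 * u**j * d**(N - j), 0.0) for j in range(N + 1)]
        for n in range(N - 1, -1, -1):          # backward induction, Remark 5.5
            v = [max(K - s0 * u**j * d**(n - j),
                     disc * (q * v[j + 1] + (1 - q) * v[j]))
                 for j in range(n + 1)]
        return v[0]

    print(american_put(100.0, 100.0, 1.0, 0.05, 0.2, 500))   # approx 6.09

An approximation of the free boundary b(t_n) can be read off as the largest
stock price in the tree at which early exercise is optimal at time t_n.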

Appendix
The results in the appendix are formulated for stochastic processes with infinite
time horizon, i.e. t ≥ 0, and can easily be adapted to a finite time horizon, i.e.
t ∈ [0, T ]. Further all (in)equalities for random variables are to be understood
P -almost surely (a.s.). Only to emphasize this we may write ’P − a.s.’ in some
places.

A Conditional Expectation
The conditional expectation is the expected value of a random variable given
the available information, which can be described by a σ-algebra. It is again
a random variable!

A.1 Conditional expectation and conditional probability
Let (Ω, F , P ) be a probability space, X : Ω → IR a random variable with
E|X| < ∞, and G ⊆ F a sub-σ-algebra.

A random variable Z with values in IR is (a version of) the conditional
expectation of X given G, if

(i) Z is G-measurable and

(ii) E[1_G Z] = E[1_G X] for all G ∈ G.

It is denoted by Z = E[X | G].

In fact, for square integrable X, E[X | G] is the orthogonal projection of X
onto L²(G). This can be used to show existence and (a.s.) uniqueness of
E[X | G].

For X = 1_A, A ∈ F, the conditional probability given G is

P (A | G) = E[1_A | G].

Since the conditional expectation is a random variable, all (in)equalities
below are to be understood P-almost surely (a.s.).

A.2 Some properties
(B1) E[E[X | G]] = E[X].

(B2) Linearity: E[α X1 + β X2 | G] = α E[X1 | G] + β E[X2 | G] for α, β ∈ IR.

(B3) E[X1 | G] ≤ E[X2 | G] for X1 ≤ X2 .

(B4) E[X | G] = E[X] if X is independent of G.

(B5) E[Y X | G] = Y E[X | G] for G-measurable Y .

(B6) Tower property: E[E[X | G2 ] | G1] = E[X | G1 ] for G1 ⊆ G2 .

(B7) Jensen’s Inequality: If f is convex with E|f (X)| < ∞, then f (E[X | G]) ≤
E[f (X) | G].

B Stochastic Processes in Continuous Time

B.1 Stochastic processes
Let (Ω, F, P ) be a suitable probability space. A stochastic process is a family
of random variables X = (Xt )t≥0 with values in the state space (IR, B(IR)),
where B(IR) is the Borel σ-algebra on IR. The index t is usually interpreted
as time. For each ω ∈ Ω the map t ↦ X_t(ω) is called a path of X.

A stochastic process (Yt )t≥0 is a version of X, if P (Xt = Yt ) = 1 for all t ≥ 0.

If the paths of X are (P-a.s.) continuous (left continuous, right continuous),
we call X continuous (left continuous, right continuous).

If all Xt are integrable (i.e. lie in L1 ) or square integrable (in L2 ), then we call
the process integrable or square integrable, respectively.

B.2 Filtrations
A filtration F = (Ft )t≥0 is an increasing family of σ-algebras in F, i.e. for
s < t we have Fs ⊆ Ft ⊆ F. Ft may be seen as the information which is
available at time t, i.e. for each event A ∈ Ft it can be decided at time t if it
has occurred or not.

A stochastic process X is F -adapted, if Xt is Ft -measurable for all t ≥ 0.

An important filtration is the filtration generated by a process X, which we
denote by F^X. It is defined as

F_t^X = σ(X_s, s ∈ [0, t]).
So F_t^X is the smallest σ-algebra for which all X_s, 0 ≤ s ≤ t, are
measurable. In particular, X is F^X-adapted.
A filtration F is called right continuous if F_t = ∩_{u>t} F_u for all t ≥ 0. A
filtration F is said to fulfill the usual conditions if F is right continuous
and F_0 contains all P-null sets.

A stochastic process X is progressively measurable w.r.t. a filtration F if for
all t ≥ 0 the map (s, ω) ↦ X_s(ω) on [0, t] × Ω is measurable w.r.t.
B([0, t]) ⊗ F_t. In particular, every progressively measurable process is
adapted.

Proposition B.1 Left or right continuous adapted processes are progressively
measurable.

B.3 Stopping times

A random variable τ : Ω → [0, ∞] is called a stopping time w.r.t. the
filtration F, or simply an F-stopping time, if {τ ≤ t} ∈ F_t for all
t ∈ [0, ∞]. If X is F-adapted and τ an F-stopping time, we set
X_τ(ω) = X_{τ(ω)}(ω). We might have to define a suitable X_∞.

For an F -stopping time τ we have {τ < t}, {τ = t}, {τ > t} ∈ Ft .

Lemma B.2 If τ1, τ2 are stopping times, so are τ1 + τ2,
τ1 ∧ τ2 := min{τ1, τ2}, τ1 ∨ τ2 := max{τ1, τ2}.

Lemma B.3 If τ1 , τ2 , . . . are stopping times, so is supn∈IN τn .

For an F-stopping time τ, the σ-algebra F_τ of the events determined before τ
consists of all events A for which A ∩ {τ ≤ t} ∈ F_t for all t ≥ 0.

If X is F-progressively measurable and τ an F-stopping time, it can be shown
that X_τ is F_τ-measurable.

B.4 Martingales
An F -adapted process X is a martingale, if E|Xt | < ∞ for all t ≥ 0 and

E[Xt | Fs ] = Xs (P − a.s.)

for all 0 ≤ s ≤ t.

Example B.4 Wiener process

A Wiener process or a Brownian motion (w.r.t. F) is an F-adapted process
W = (W_t)_{t≥0} which satisfies

(W1) W0 = 0,

(W2) Wt − Ws is independent of Fs , t > s ≥ 0,

(W3) Wt −Ws is normally distributed with mean 0 and variance t−s, t > s ≥ 0.

(W4) W is continuous.

The processes

W,  (W_t² − t)_{t≥0},  (exp(a W_t − (1/2) a² t))_{t≥0} for a > 0

are martingales.
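The third example can also be checked by simulation: since W_t ~ N(0, t), a
Monte Carlo average of exp(a W_t − a²t/2) should be close to
E[exp(a W_t − a²t/2)] = 1. A small sketch in Python (values illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    a, t = 0.7, 2.0
    w = rng.normal(0.0, np.sqrt(t), 10**6)          # samples of W_t
    print(np.exp(a * w - 0.5 * a**2 * t).mean())    # close to 1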

As in the example, the filtration under consideration might not always be
mentioned explicitly. But W is always a Wiener process w.r.t. F^W.
A stochastic process X is called uniformly integrable if

sup_{t≥0} E|X_t| < ∞  and  sup_{t≥0} E[1_A |X_t|] → 0 for P (A) → 0.

Lemma B.5 Let X be an IR-valued random variable with E|X| < ∞ and F a
filtration. Then

(E[X | F_t])_{t≥0}

is a uniformly integrable martingale.

Theorem B.6 Optional Sampling
(i) Let M = (M_t)_{t≥0} be a right continuous F-martingale and τ a bounded
F-stopping time. Then E M_τ = E M_0.
(ii) Let M = (M_t)_{t≥0} be a right continuous, uniformly integrable
F-martingale and τ any F-stopping time. Then E M_τ = E M_0.

C The Stochastic Integral

From now on we assume that the filtration F satisfies the usual conditions and
that W is a Wiener process w.r.t. F. For a stochastic process X we consider the
norms

||X_T||_2 = (E[X_T²])^{1/2},
||X||_{L²,T} = (E ∫_0^T X_t² dt)^{1/2}.

C.1 Stochastic integral for simple processes

A simple process X is of the form

X_t = ξ_0 1_{{0}}(t) + Σ_{j=0}^∞ ξ_j 1_{(t_j, t_{j+1}]}(t),    (C.1)

where ξ_j is F_{t_j}-measurable and sup_{j∈IN} |ξ_j(ω)| < C for all ω ∈ Ω and
some C ∈ IR. Furthermore,

0 = t_0 < t_1 < . . . ,  lim_{n→∞} t_n = ∞.
We denote the class of simple processes by P_0.

The stochastic integral of X ∈ P_0 (using representation (C.1)) is defined, for
t_n ≤ t < t_{n+1}, as

I_t(X) := Σ_{j=0}^{n−1} ξ_j (W_{t_{j+1}} − W_{t_j}) + ξ_n (W_t − W_{t_n}).

Thus the stochastic integral at time t is a random variable, and hence
I(X) = (I_t(X))_{t≥0} is a stochastic process!

Proposition C.1 For t ≥ s ≥ 0, X, Y ∈ P_0, α, β ∈ IR:

(I1) I_0(X) = 0.

(I2) I_t(α X + β Y) = α I_t(X) + β I_t(Y).

(I3) E[I_t(X) | F_s] = I_s(X).

(I4) E[(I_t(X) − I_s(X))² | F_s] = E[ ∫_s^t X_u² du | F_s ].

(I5) E[(I_t(X))²] = E[ ∫_0^t X_u² du ].

(I6) ||I(X)||_t = ||X||_{L²,t} for all t ≥ 0.

C.2 The stochastic integral

Suppose X ∈ P, the class of progressively measurable processes which satisfy

||X||_{L²} = Σ_{n=1}^∞ 2^{−n} (1 ∧ ||X||_{L²,n}) < ∞.

One can show that P_0 is dense in P. Thus there exist simple processes
(X^n)_{n∈IN} satisfying

lim_{n→∞} ||X^n − X||_{L²} = 0.

Proposition C.1 (I3), (I5) shows that I(X^n) is a square integrable martingale.
One can show that the class M of square integrable martingales is a complete
metric space under the metric

M ↦ Σ_{n=1}^∞ 2^{−n} (1 ∧ ||M_n||_2).

Using also the isometry in Proposition C.1 (I6), this implies that the limit
I(X) := lim_{n→∞} I(X^n) in M is well defined, and the stochastic integral can
be set as

∫_0^t X_u dW_u := I_t(X),  t ≥ 0.
We also write

∫ X_u dW_u = ( ∫_0^t X_u dW_u )_{t≥0}

if we consider the stochastic integral as a process.

Proposition C.2 The stochastic integral

I(X) = ∫ X_t dW_t

satisfies (I1), . . . , (I6) and hence is a square integrable martingale.

Example C.3

∫_0^t W_s dW_s = (1/2) W_t² − (1/2) t,  t ≥ 0.
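This identity can be verified numerically: approximating the stochastic
integral by the left-endpoint (Itô) Riemann sum along a simulated path
reproduces (1/2)W_t² − (1/2)t pathwise, up to discretization error. A sketch in
Python:

    import numpy as np

    rng = np.random.default_rng(0)
    n, t = 100_000, 1.0
    dt = t / n
    dw = rng.normal(0.0, np.sqrt(dt), n)            # Wiener increments
    w = np.concatenate(([0.0], np.cumsum(dw)))      # path of W on the grid
    ito = np.sum(w[:-1] * dw)                       # left-endpoint Riemann sum
    print(ito, 0.5 * w[-1]**2 - 0.5 * t)            # nearly equal, pathwise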

C.3 A generalization

The stochastic integral can be extended to cover integrands in P*, the class of
progressively measurable processes X which satisfy

∫_0^T X_s² ds < ∞ (P-a.s.) for all T ≥ 0.

This can be done by defining the stopping times

τ_n = inf{t ≥ 0 : ∫_0^t X_s² ds ≥ n} ∧ n.

These satisfy τ_n → ∞, and we have X_t^{(n)} := X_t 1_{{τ_n ≥ t}} ∈ P. Thus

I_t^{(n)} := ∫_0^t X_s^{(n)} dW_s

exists, and we can define

I_t(X) := I_t^{(n)} on 0 ≤ t ≤ τ_n,  n ∈ IN.

We use the same notation

∫_0^t X_s dW_s = I_t(X),  t ≥ 0.

Remark C.4 Local martingales

Note that for X ∈ P* the properties (I3)–(I6) might no longer hold! But I(X)
is still a continuous local martingale. A stochastic process M is a local
martingale if there exist stopping times (τ_n)_{n∈IN} with τ_n → ∞ such that
the processes (M_{t∧τ_n})_{t≥0} are martingales.
C.4 Itô Formula

A (one-dimensional) Itô process is a stochastic process which admits a
representation of the form

X_t = ξ + ∫_0^t b_s ds + ∫_0^t σ_s dW_s,  t ≥ 0,    (C.2)

where ξ is F_0-measurable and (b_t)_{t≥0} and (σ_t)_{t≥0} are F-progressively
measurable with

∫_0^t (|b_s| + |σ_s|²) ds < ∞ for all t > 0.

In particular, X is continuous and F-adapted. In differential form we may write

dX_t = b_t dt + σ_t dW_t,  X_0 = ξ.

Theorem C.5 Itô Formula

Suppose X is an Itô process as given in (C.2) and f : IR → IR is twice
continuously differentiable. Then

f(X_t) = f(X_0) + ∫_0^t f′(X_s) dX_s + (1/2) ∫_0^t f″(X_s) d[X]_s,    (C.3)

where the quadratic variation [X] of X is defined as

[X]_t = ∫_0^t σ_s² ds.

For a shorter notation, we usually write only

df(X_t) = f′(X_t) dX_t + (1/2) f″(X_t) d[X]_t.

C.5 The multidimensional case

An m-dimensional Wiener process W = (W_t)_{t≥0} with components
W_t = (W_t^1, . . . , W_t^m)⊤ consists of independent Wiener processes
W^1, . . . , W^m, all adapted to the same filtration F. The stochastic integral
can be defined componentwise, so for n × m-matrices A_t with components in P*
we have

( ∫_0^t A_s dW_s )_i = Σ_{j=1}^m ∫_0^t A_s^{ij} dW_s^j,  i = 1, . . . , n.
An n-dimensional stochastic process which admits a representation of the form

X_t = X_0 + ∫_0^t b_s ds + ∫_0^t σ_s dW_s,  t ≥ 0,    (C.4)

where X_0 is F_0-measurable and (b_t)_{t≥0} and (σ_t)_{t≥0} are IR^n-valued and
IR^{n×m}-valued, respectively, as well as F-progressively measurable with

∫_0^t ( Σ_{i=1}^n |b_s^i| + Σ_{i=1}^n Σ_{j=1}^m |σ_s^{ij}|² ) ds < ∞ for all t > 0,

is called a (multi-dimensional) Itô process. Again X is continuous and
F-adapted.

Theorem C.6 Multidimensional Itô Formula

Suppose X is an Itô process as given in (C.4) and f : [0, ∞) × IR^n → IR is
continuously differentiable in the first component with derivative f_t = ∂f/∂t,
and twice continuously differentiable in the other components with partial
derivatives f_{x_i} := ∂f/∂x_i, i = 1, . . . , n, and
f_{x_i,x_j} := ∂²f/(∂x_j ∂x_i), i, j = 1, . . . , n. Then

f(t, X_t) = f(0, X_0) + ∫_0^t f_t(s, X_s) ds
  + Σ_{i=1}^n ∫_0^t f_{x_i}(s, X_s) dX_s^i
  + (1/2) Σ_{i,j=1}^n ∫_0^t f_{x_i,x_j}(s, X_s) d[X^i, X^j]_s,

where the covariations [X^i, X^j] of X^i and X^j are defined as

[X^i, X^j]_t = ∫_0^t (σ_s σ_s⊤)_{ij} ds,

where ⊤ denotes transposition, so

(σ_t σ_t⊤)_{ij} = Σ_{k=1}^m σ_t^{ik} σ_t^{jk}.

Corollary C.7 Product Rule

For (one-dimensional) Itô processes X, Y w.r.t. the same Wiener process and
with diffusion coefficients σ^X, σ^Y we have

X_t Y_t = X_0 Y_0 + ∫_0^t X_s dY_s + ∫_0^t Y_s dX_s + [X, Y]_t,

where

[X, Y]_t = ∫_0^t σ_s^X σ_s^Y ds,  t ≥ 0.

If X and Y were defined w.r.t. independent Wiener processes, we would have
[X, Y]_t = 0.

D Stochastic Differential Equations

D.1 Problem formulation

We want to solve the stochastic differential equation (SDE)

dX_t = b(t, X_t) dt + σ(t, X_t) dW_t,    (D.5)

where

b : [0, T] × IR^n → IR^n,  σ : [0, T] × IR^n → IR^{n×m}

are measurable and W is an m-dimensional Wiener process w.r.t. some filtration
F. X is n-dimensional, so (D.5) consists of the SDEs

dX_t^i = b_i(t, X_t) dt + Σ_{j=1}^m σ_{ij}(t, X_t) dW_t^j,  i = 1, . . . , n.

Usually we also require that some initial condition X_0 = ξ holds, where ξ is
an F_0-measurable random variable independent of W.

A continuous process X = (X_t)_{t∈[0,T]} is a solution of (D.5) if it is
F-adapted with X_0 = ξ, satisfies (D.5), and P-a.s.

∫_0^t |b_i(s, X_s)| ds < ∞,  ∫_0^t |σ_{ij}(s, X_s)|² ds < ∞.

The SDE is said to have a strong solution if, for a given probability space
(Ω, F, P), initial condition ξ and Wiener process W, a solution X can be found
which is adapted to the filtration generated by W and ξ (augmented with the
null sets).

A weak solution means that a probability space (Ω, F, P), a Wiener process W
and a filtration F can be found such that a solution exists.

D.2 Uniqueness and existence

To show uniqueness we need that the difference quotients of the coefficients b
and σ are bounded. This follows from a Lipschitz condition as formulated in the
following theorem.

Theorem D.1 (Uniqueness) Suppose that b, σ are Lipschitz continuous in x,
i.e. there exists a constant K > 0 such that

||b(t, x) − b(t, y)|| + ||σ(t, x) − σ(t, y)|| ≤ K ||x − y||,  t ∈ [0, T], x, y ∈ IR^n.

Then any two solutions X, X̃ of (D.5) satisfy

P (X_t = X̃_t for all t ∈ [0, T]) = 1.
The condition in Theorem D.1 is not restrictive enough to avoid so-called
explosions of the process. Therefore we also need the additional growth
condition in the following theorem.

Theorem D.2 Suppose that b, σ are Lipschitz continuous and satisfy a linear
growth condition, i.e. there exists a constant K > 0 such that for all
t ∈ [0, T], x, y ∈ IR^n,

||b(t, x) − b(t, y)|| + ||σ(t, x) − σ(t, y)|| ≤ K ||x − y||,    (D.6)
||b(t, x)||² + ||σ(t, x)||² ≤ K² (1 + ||x||²).    (D.7)

Let the initial condition be given by some random variable ξ which is
independent of W and satisfies E||ξ||² < ∞. Then (D.5) has a continuous, strong
solution X with

E[ ∫_0^T ||X_t||² dt ] < ∞.
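Under the conditions of Theorem D.2 the strong solution can also be
approximated numerically, e.g. by the Euler-Maruyama scheme
X_{t+∆t} ≈ X_t + b(t, X_t)∆t + σ(t, X_t)∆W. A one-dimensional sketch in Python;
the scheme is standard (strong convergence of order 1/2 under (D.6), (D.7)),
while the concrete coefficients below are just an illustrative Lipschitz
example:

    import numpy as np

    def euler_maruyama(b, sigma, x0, T, n, rng):
        # Euler-Maruyama approximation of dX = b(t,X) dt + sigma(t,X) dW
        dt = T / n
        x = np.empty(n + 1)
        x[0] = x0
        for i in range(n):
            dw = rng.normal(0.0, np.sqrt(dt))       # Wiener increment
            x[i + 1] = x[i] + b(i * dt, x[i]) * dt + sigma(i * dt, x[i]) * dw
        return x

    # Ornstein-Uhlenbeck type example: dX = -X dt + 0.5 dW, X_0 = 1
    rng = np.random.default_rng(1)
    path = euler_maruyama(lambda t, x: -x, lambda t, x: 0.5, 1.0, 1.0, 1000, rng)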
