Stochastic Control: With Applications to Financial Mathematics
Jörn Saß
joern.sass@oeaw.ac.at
RICAM
Austrian Academy of Sciences
Altenbergerstraße 54
Web: http://www.ricam.oeaw.ac.at/people/page/sass/teaching/stochcontrol/
Contents
1 Introduction
1.1 Stochastic control problem
1.2 Portfolio optimization: first example
2 Dynamic Programming
2.1 Itô diffusions and their generators
2.2 The idea of dynamic programming
2.3 A verification theorem
2.4 Example: Optimal investment
2.5 Types of controls
3 Some extensions
3.1 Minimization
3.2 Infinite time horizon
3.3 Example: Discounted utility of consumption
3.4 Stopping the state process
3.5 Example: Portfolio Insurer
3.6 Dynamic programming for deterministic optimal control problems
3.7 Stochastic linear regulator problem
4 Viscosity solutions and stochastic control
5 Optimal Stopping
5.1 Some results on optimal stopping
5.2 Optimal Stopping for underlying Markov processes
5.3 Reminder on pricing of European options
5.4 American options
5.5 The American put option
A Conditional Expectation
A.1 Conditional expectation and conditional probability
A.2 Some properties
References
[CIL92] Crandall, M.G., Ishii, H., Lions, P.-L. (1992): User's guide to viscosity solutions of second order partial differential equations. Bulletin of the American Mathematical Society 27, 1–67.
[ElK81] El Karoui, N. (1981): Les aspects probabilistes du contrôle stochastique. In: Ninth Saint Flour Probability Summer School 1979 (Saint Flour, 1979), Lecture Notes in Mathematics 876, pp. 73–238. Berlin, New York: Springer.
[FR75] Fleming, W.H., Rishel, R.W. (1975): Deterministic and Stochastic Optimal Control. New York: Springer.
[FS93] Fleming, W.H., Soner, H.M. (1993): Controlled Markov Processes and Viscosity Solutions. New York: Springer.
[He94] Hernández-Lerma, O. (1994): Lectures on Continuous-Time Markov Control Processes. Aportaciones Matemáticas 3. Sociedad Matemática Mexicana.
[St00] Steele, J.M. (2000): Stochastic Calculus and Financial Applications. New York: Springer.
[YZ99] Yong, J., Zhou, X.Y. (1999): Stochastic Controls. New York, Berlin, Heidelberg: Springer.
For more references regarding the "basics" in the appendix we refer to the lecture notes Stochastische Differentialgleichungen (Stochastic Differential Equations), available at
http://www.ricam.oeaw.ac.at/people/page/sass/teaching/sdes/
and the references therein. These are mainly based on [KS99, Ok00]; more general results, covering also non-continuous processes, can be found in [Pr04]. A highly readable book is [St00].
Textbooks on stochastic control tend to be difficult to read. Short and good introductions are given in [K99, Ko99], the latter with an extensive overview of applications in portfolio optimization. In addition, the lecture notes [To02] provide a good introduction to the concept of viscosity solutions. There are also some good review papers on applications of stochastic control methods in finance, e.g. [Ru03]. The course will be based on the references made so far and to a certain extent on [ElK81, FR75, FS93, He94, Ok00, YZ99].
1 Introduction
We consider optimal control of Itô-type processes which satisfy a stochastic
differential equation (SDE) w.r.t. some Wiener process.
• Our aim is to find V (0, x0 ) and a control strategy u∗ for which this
optimal value is attained, i.e. for which V (0, x0 ) = J(0, x0 , u∗ ). Then u∗
will be called optimal.
1.2 Portfolio optimization: first example
We consider a financial market consisting of one bond with prices dBt = Bt r dt, B0 = 1, where r ≥ 0 is the interest rate, and one stock with prices given by the SDE

dSt = St (µ dt + σ dWt),  S0 = s0 > 0,

with trend parameter µ ∈ IR and volatility σ > 0. The unique solution of this
SDE is
St = s0 exp( (µ − σ²/2) t + σ Wt ),
as can be verified by Itô's formula, Theorem C.3. The wealth (the portfolio value) of an investor with initial capital x0 > 0 evolves like

dXt = N^B_t dBt + N^S_t dSt,  X0 = x0,
where NtB and NtS are the number of bonds and stocks, respectively, held by
the investor at time t. This definition corresponds to a self-financing portfolio
since changes in the wealth are only due to changes in the bond or stock prices
(there is no consumption or endowment).
As control at time t we may use the fraction ut of the wealth which should be
invested in the stocks. Then
N^B_t = (1 − ut) Xt / Bt ,   N^S_t = ut Xt / St ,
yielding

dXt = Xt ( (r + ut (µ − r)) dt + ut σ dWt ).
Guessing

Xt = x0 exp ( ∫_0^t gs ds + ∫_0^t hs dWs ),
We want to maximize (1.2) with ψ ≡ 0, Ψ(T, x) = log(x), i.e.
J(0, x0, u) = E[log(X^u_T) | X0 = x0].
The strategy given by (1.3) is the Merton strategy (for logarithmic utility), and we call π∗ the Merton fraction. So if, e.g., π∗ = 0.4, an investor should always keep 40% of his money invested in the stock. Note that this strategy requires a lot of trading.
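As a quick illustration (this sketch is not part of the original notes, and the parameter values are invented), one can compare constant-fraction strategies by Monte Carlo; for logarithmic utility the Merton fraction is π∗ = (µ − r)/σ², the α → 0 case of the formula derived in Section 2.4:

    import numpy as np

    # Compare constant-fraction strategies u_t = pi by estimating E[log X_T].
    # The wealth SDE dX = X((r + pi(mu-r))dt + pi sigma dW) has the explicit
    # solution X_T = x0 exp((r + pi(mu-r) - pi^2 sigma^2/2) T + pi sigma W_T).
    rng = np.random.default_rng(0)
    mu, r, sigma, T, x0 = 0.08, 0.02, 0.3, 1.0, 1.0   # illustrative parameters
    n_paths = 100_000

    def mean_log_wealth(pi):
        WT = rng.normal(0.0, np.sqrt(T), n_paths)
        logXT = np.log(x0) + (r + pi*(mu - r) - 0.5*(pi*sigma)**2)*T + pi*sigma*WT
        return logXT.mean()

    pi_star = (mu - r) / sigma**2   # Merton fraction for log utility, here 0.667
    for pi in (0.0, 0.5*pi_star, pi_star, 1.5*pi_star):
        print(f"pi = {pi:.3f}: E[log X_T] ~ {mean_log_wealth(pi):.5f}")

The Monte Carlo estimates are largest near π∗, in line with the optimality of the Merton strategy.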
Remark 1.1 We can get a corresponding result for n stocks with prices (St)t∈[0,T],

St = (S^1_t, . . . , S^n_t)⊤,
with dynamics
dSt = Diag(St )(µ dt + σ dWt ), S0 = s 0 ,
where Diag(St) is the diagonal matrix with diagonal St, W is an n-dimensional Wiener process, s^i_0 > 0 for i = 1, . . . , n, µ ∈ IR^n, and σ a non-singular volatility matrix in IR^{n×n}. So for stock i we have dynamics

dS^i_t = S^i_t ( µi dt + Σ_{j=1}^n σij dW^j_t ),   i = 1, . . . , n.
2 Dynamic Programming
2.1 Itô diffusions and their generators
We consider an n-dimensional SDE
dXt = b(t, Xt )dt + σ(t, Xt )dWt , (2.4)
where W is an m-dimensional Wiener process and the measurable drift and
diffusion coefficients
b : [0, ∞) × IRn → IRn , σ : [0, ∞) × IRn → IRn×m
satisfy for some constant K > 0 and for all x, y ∈ IRn , s, t ≥ 0
‖b(s, x) − b(t, y)‖ + ‖σ(s, x) − σ(t, y)‖ ≤ K(‖y − x‖ + |t − s|),      (2.5)
‖b(t, x)‖² + ‖σ(t, x)‖² ≤ K²(1 + ‖x‖²).      (2.6)
We consider the filtration F generated by W and augmented with the null
sets. Under these conditions (2.4) has a unique strong solution X which
we call Itô diffusion. We call
a(t, x) := σ(t, x)σ(t, x)⊤
the diffusion matrix of X.
For a random variable Y we write

E_{t,x}[Y] = E[Y | Xt = x]   and   E_x[Y] = E_{0,x}[Y].
Theorem 2.1 Suppose that X is a time-homogeneous Itô diffusion and f :
IRn → IR bounded and measurable.
(i) For all ω ∈ Ω

E_x[f(X_{t+s}) | Ft](ω) = E_{Xt(ω)}[f(Xs)].

(ii) For every stopping time τ with P(τ < ∞) = 1 and all ω ∈ Ω

E_x[f(X_{τ+s}) | Fτ](ω) = E_{Xτ(ω)}[f(Xs)].

Part (i) is called the Markov property, part (ii) the strong Markov property. The dependency on ω is usually not written explicitly. Hence E_{Xt}[f(Xs)] is a random variable g(Xt), where g : IR^n → IR, g(x) = E_x[f(Xs)].
From now on suppose that X is an Itô diffusion as in (2.4).
yielding

Lf(t, x) = ft(t, x) + Σ_{i=1}^n fxi(t, x) bi(t, x) + (1/2) Σ_{i,j=1}^n aij(t, x) fxi,xj(t, x)
         = ft(t, x) + (Dx f(t, x))⊤ b(t, x) + (1/2) tr((Dxx f(t, x)) a(t, x)),
where Dx f denotes the gradient of f w.r.t. x, Dxx f the Hessian of f, i.e. (Dxx f)ij = fxi,xj, and tr(A) is the trace of the matrix A, the sum of the diagonal elements of A.
Remark 2.2 (i) Note that Itô’s formula can be written as
Proof: Follows directly from Itô’s formula, cf. Remark 2.2 (i).
If τ is a first exit time from a bounded set A ⊂ IR^n, then Dynkin's formula holds for any f ∈ C^{1,2}, since f|_A can be extended suitably outside of A.
Xt = x0 + Wt , t ≥ 0,
• Show that P (τ < ∞ |X0 = x0 ) = 1 for all x0 . This can be done using
that Wt+∆t − Wt is normally distributed with mean 0 and variance ∆t.
Then

E_{x0}[f(Xτ)] = p_{x0} f(b) + (1 − p_{x0}) f(a).      (2.8)
L = (1/2) ∂²/∂x².
Comparing with (2.8) and using the solution for px0 we get
p_{(a+b)/2} = 1/2   and   E_{(a+b)/2}[τ] = (b − a)²/4.
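A small Monte Carlo check of these two formulas (added for illustration; a simple Euler discretization of the Brownian path with step size dt is used, which slightly biases the exit time):

    import numpy as np

    # Exit of X_t = x0 + W_t from (a, b), started at the midpoint x0 = (a+b)/2.
    rng = np.random.default_rng(1)
    a, b = -1.0, 2.0
    n_paths, dt = 50_000, 1e-3
    x = np.full(n_paths, 0.5 * (a + b))
    tau = np.zeros(n_paths)
    alive = np.ones(n_paths, dtype=bool)
    exit_at_b = np.zeros(n_paths, dtype=bool)

    while alive.any():
        x[alive] += np.sqrt(dt) * rng.normal(size=alive.sum())
        tau[alive] += dt
        exit_at_b |= alive & (x >= b)
        alive &= (a < x) & (x < b)

    print("P(exit at b) ~", exit_at_b.mean(), "(exact: 0.5)")
    print("E[tau]       ~", tau.mean(), "(exact:", (b - a)**2 / 4, ")")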
• Apply Itô's formula to V (if V is smooth enough, e.g. V ∈ C^{1,2}), yielding

V(t, x) = sup_{u∈A(t,x)} E_{t,x} [ ∫_t^{t1} ψ(s, Xs, us) ds + V(t, Xt)
        + ∫_t^{t1} ( Vt(s, Xs) + (Dx V(s, Xs))⊤ b(s, Xs, us) ) ds
        + ∫_t^{t1} (1/2) tr( (Dxx V(s, Xs)) a(s, Xs, us) ) ds
        + ∫_t^{t1} (Dx V(s, Xs))⊤ σ(s, Xs, us) dWs ],
The equation (2.9) (or equivalently (2.10)) is called the Hamilton-Jacobi-Bellman equation, HJB equation for short. The above reasoning shows that under certain conditions the value function solves the HJB equation, so the HJB equation provides a necessary condition.
Conversely, we may ask when a solution of the HJB equation is the value function of the corresponding control problem. To this end we proceed as follows to 'solve' the HJB equation.
Algorithm 2.6
1. Find a maximizer u = û(t, x) of the supremum in (2.9).
2. If it exists, û formally depends on the derivatives Vt , Dx V , Dxx V , i.e.
û(t, x) = ũ(t, x, Vt (t, x), Dx V (t, x), Dxx V (t, x)).
Substituting û in (2.9) leads to a partial differential equation for V which
has to be solved with boundary condition V (T, x) = Ψ(T, x) to find a
candidate V ∗ for the optimal value function.
3. If V ∗ satisfies certain conditions (see below) and u∗t = û(t, Xt∗ ), t ∈ [0, T ],
is an admissible control strategy, then V ∗ is indeed the value function of
the control problem and u∗t = û(t, Xt∗ ) defines an optimal control strategy
in Markovian form. Here Xt∗ is the solution of (1.1) using the optimal
control strategy u∗ in [0, t).
A theorem which provides a set of conditions on V , ψ, Ψ, b, σ and u such that
a solution V of the HJB equation and the corresponding maximizer û found
in steps 1 and 2 of the above algorithm provide indeed the value function and
an optimal control strategy, is called a verification theorem.
(A2) (1.1) has a unique strong solution (Xs)s∈[t,T] with Xt = x and

E_{t,x} [ sup_{t≤s≤T} ‖Xs‖² ] < ∞
(i) Suppose that Φ lies in C^{1,2}([0, T) × IR^n), is continuous on [0, T] × IR^n with |Φ(t, x)| ≤ CΦ (1 + ‖x‖²), and satisfies the HJB equation and the boundary condition, i.e.
Proof: Fix some t ∈ [0, T], x ∈ IR^n. In order to argue with bounded processes, we introduce
Therefore we get

E_{t,x} [ ∫_t^{τn} ψ(s, Xs, us) ds + Φ(τn, X_{τn}) ]
  = E_{t,x} [ ∫_t^{τn} ψ(s, Xs, us) ds + Φ(t, x) + ∫_t^{τn} L^{us} Φ(s, Xs) ds ]
  = Φ(t, x) + E_{t,x} [ ∫_t^{τn} ( ψ(s, Xs, us) + L^{us} Φ(s, Xs) ) ds ]      (the integrand is ≤ 0)
  ≤ Φ(t, x).      (2.11)
For n → ∞ we get τn → T. The quadratic growth conditions for ψ and Φ and the admissibility of u imply

| ∫_t^{τn} ψ(s, Xs, us) ds + Φ(τn, X_{τn}) | ≤ Cψ ∫_t^T (1 + ‖Xs‖² + ‖us‖²) ds + CΦ (1 + sup_{t≤s≤T} ‖Xs‖²) ∈ L¹.
Thus we obtain from (2.11) that also J(t, x, u) ≤ Φ(t, x) and by taking the
supremum finally V (t, x) ≤ Φ(t, x). This proves part (i).
For (ii) observe that we have equality in (2.11) if we can find a maximizer û(t, x) and consider the strategy u∗_t = û(t, X∗_t) defined by this maximizer. Since this was the only inequality, the same arguments yield V(t, x) = J(t, x, u∗) = Φ(t, x).
Remark 2.8
(i) Under the conditions of Theorem 2.7 the Bellman Principle holds: For
any stopping time τ with values in [t, T ]
V(t, x) = sup_{u∈A(t,x)} E_{t,x} [ ∫_t^τ ψ(s, X^u_s, us) ds + V(τ, X^u_τ) ].
(ii) By definition the value function is always unique. Therefore Theorem 2.7 shows that a solution of the HJB equation is unique in the class of C^{1,2}-functions with quadratic growth. Note that an optimal control strategy does not have to be unique.
– U is compact,
– Ψ is three times continuously differentiable in x and is bounded,
– b, σ, ψ in C 1,2 and bounded,
– The uniform parabolicity condition holds, i.e.
Example 2.9 We consider the control of

X^u_t = x + ∫_0^t us ds + Wt,
σ a matrix in IR^{n×m} with maximal rank. The latter implies that the IR^{n×n}-matrix σσ⊤ is non-singular. So for stock i we have dynamics

dS^i_t = S^i_t ( µi dt + Σ_{j=1}^m σij dW^j_t ),   i = 1, . . . , n,
where 1 denotes the n-dimensional vector (1, . . . , 1)⊤ . The investor assigns
utility U1 (ct ) to the payout given by the consumption rate ct and utility U2 (XTu )
to the terminal wealth. We consider power utility functions

U(x) = x^α/α,   α < 1, α ≠ 0,

and a discounting factor e^{−βt}, β ≥ 0. Thus we want to maximize

J(t, x, u) = E_{t,x} [ ∫_t^T e^{−βs} U(cs) ds + e^{−βT} U(X^u_T) ].
((1 − α)/α) e^{−βt/(1−α)} Vx^{α/(α−1)} + Vt + r x Vx − (1/2) (µ − r1)⊤ (σσ⊤)^{−1} (µ − r1) Vx²/Vxx = 0

with boundary condition V(T, x) = e^{−βT} x^α/α. Making the ansatz

V(t, x) = h(t)^{1−α} x^α/α      (2.14)

yields for h the boundary condition h(T) = e^{−βT/(1−α)} and

e^{−βt/(1−α)} + c h(t) + h′(t) = 0,

where

c = (α/(1−α)) ( r + (1/(2(1−α))) (µ − r1)⊤ (σσ⊤)^{−1} (µ − r1) ).
This can be solved using standard methods, yielding

h(t) = e^{−βT/(1−α)} e^{c(T−t)} + ((1 − α) e^{−ct})/(β − (1 − α)c) · ( e^{−(β−(1−α)c) t/(1−α)} − e^{−(β−(1−α)c) T/(1−α)} )

if β − (1 − α)c ≠ 0 and

h(t) = e^{−ct} (1 + T − t)

if β − (1 − α)c = 0.
Step 3: Note that h(t) is always strictly positive. Hence the value function
(2.14) is strictly positive if the SDE for the controlled process has a strictly
positive unique solution when using controls given by the maximizers π̂, ĉ.
Then V would clearly lie in C^{1,2} and be strictly increasing and concave, since

Vx(t, x) = h(t)^{1−α} x^{α−1} > 0   and   Vxx(t, x) = −(1 − α) h(t)^{1−α} x^{α−2} < 0.
Note further that with (2.14) we get from (2.12), (2.13)

ĉ(t, x) = ( e^{−βt/(1−α)} / h(t) ) x,
π̂(t, x) = (1/(1−α)) (σσ⊤)^{−1} (µ − r1).

For the controls

c∗_t = ĉ(t, X∗_t) = ( e^{−βt/(1−α)} / h(t) ) X∗_t,      (2.15)
π∗_t = π̂(t, X∗_t) = (1/(1−α)) (σσ⊤)^{−1} (µ − r1)      (2.16)
the SDE for the controlled process X∗ := X^{(π∗,c∗)} is of the form dX∗_t = X∗_t ((c1 + f1(t)) dt + c2 dWt) and hence admits a unique strong solution X∗_t = x0 exp{. . .} which is strictly positive and (as a strong solution) satisfies the integrability conditions we need. Since π∗ is constant and h(t)^{−1} is bounded, the admissibility and growth conditions can be verified using a suitable set U for the controls (partly difficult).
Thus the value function is indeed given by (2.14) and an optimal control strat-
egy by (2.15), (2.16).
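To make the solution concrete, the following sketch (illustrative parameters, one stock, so that (µ − r1)⊤(σσ⊤)^{−1}(µ − r1) = ((µ − r)/σ)²) evaluates the closed-form h(t) and the resulting feedback controls:

    import numpy as np

    # Optimal consumption-investment with power utility: evaluate h(t) and the
    # feedback controls c*(t, x) = e^{-beta t/(1-alpha)} / h(t) * x and
    # pi* = (mu - r) / ((1 - alpha) sigma^2) (constant in time and wealth).
    mu, r, sigma, alpha, beta, T = 0.08, 0.02, 0.3, -1.0, 0.1, 10.0
    theta = (mu - r)**2 / sigma**2
    c = alpha / (1 - alpha) * (r + theta / (2 * (1 - alpha)))
    g = beta - (1 - alpha) * c          # the constant beta - (1-alpha)c from above

    def h(t):
        if abs(g) < 1e-12:
            return np.exp(-c * t) * (1 + T - t)
        return (np.exp(-beta * T / (1 - alpha)) * np.exp(c * (T - t))
                + (1 - alpha) * np.exp(-c * t) / g
                  * (np.exp(-g * t / (1 - alpha)) - np.exp(-g * T / (1 - alpha))))

    print("risky fraction pi* =", (mu - r) / ((1 - alpha) * sigma**2))
    for t in (0.0, 0.5 * T, T):
        print(f"t = {t:4.1f}: consumption rate per unit wealth = "
              f"{np.exp(-beta * t / (1 - alpha)) / h(t):.4f}")

Note that the consumption rate per unit wealth equals e^{−βT/(1−α)}/h(T) = 1 at t = T, consistent with the boundary condition for h.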
2.5 Types of controls
In general a control at time t is given by some random variable ut(ω). One can
distinguish some special cases:
• Open loop or deterministic controls: ut(ω) = f(t) is a function of t only (non-random).
• Closed loop or feedback controls: u is adapted to the filtration generated
by the controlled process, i.e. ut is σ(Xsu , s ≤ t)-measurable.
• A special case of the feedback controls are Markovian controls which are
of the form ut (ω) = f (t, Xt (ω)).
In particular the controls given by the maximizers û as in Algorithm 2.6 and in the Verification Theorem 2.7 are Markovian. The proof of Theorem 2.7 shows that a Markovian control, if it exists, will be at least as good as any feedback control.
3 Some extensions
3.1 Minimization
Suppose we want to minimize the performance criterion. Then the value func-
tion would be of the form
Ṽ(t, x) = inf_u E_{t,x} [ ∫_t^T ψ̃(s, Xs, us) ds + Ψ̃(T, XT) ].

Setting ψ := −ψ̃ and Ψ := −Ψ̃ we get Ṽ(t, x) = −V(t, x), where we can find V as before as the value function for maximizing the performance criterion with ψ and Ψ.
In Section 3.7 we will see an example.
3.2 Infinite time horizon

We now consider an infinite time horizon with a discounted performance criterion of the form

J(x, u) = E_x [ ∫_0^∞ e^{−βs} ψ(Xs, us) ds ]

with discount factor β > 0. The conditions on b and σ and the admissibility of control strategies, now described by a class A(x), are defined analogously to Section 2.3. The value function is
(i) Suppose that Φ lies in C²(IR^n) with |Φ(x)| ≤ CΦ (1 + ‖x‖²), and satisfies the HJB equation
The HJB equation reads

sup_{(π,c)∈IR^n×[0,∞)} { ((r + π⊤(µ − r1)) x − c) Vx + (1/2) π⊤σσ⊤π x² Vxx − β V + c^α/α } = 0.
But note that we have restricted the domain of V to (0, ∞) by the admissibility
condition on the control strategies (only strictly positive wealth processes were
allowed). Another way would be to terminate the evaluation as soon as Xt ≤ 0.
This can be modelled by a suitable stopping time τ . Then boundary conditions
would have to be specified for the case that we stop early. This can be done
similarly as we do it in the next section for a finite time horizon.
(ii) If û(t, x) is a maximizer of u 7→ ψ(t, x, u) + L^u Φ(t, x) on Q and u∗ = (u∗_t)_{t≤τ}, u∗_t = û(t, X∗_t), is admissible, then Φ(t, x) = V(t, x) for all (t, x) ∈ Q and u∗ is an optimal control strategy, i.e. V(t, x) = J(t, x, u^{t,x}) where u^{t,x} = (u∗_s)_{s∈[t,T]} ∈ A(t, x).
The proof is essentially the same as the proof of Theorem 2.7. Instead of τn
we have to use stopping times τn ∧ τ .
Example 3.3 We look at the market model of Section 2.4 but consider only
investment without consumption, i.e. we use controls ut = πt , where πti is the
fraction of wealth invested in stock i. For α ∈ (0, 1) we want to maximize
expected power utility of terminal wealth
E[ (1/α) X_T^α ]   such that   P(X_T ≥ q) = 1.
This problem is called the portfolio insurer problem since there is a lower bound for the payoff. It is quite attractive, since the distribution of the optimal terminal wealth of the unconstrained problem can be very skewed, allowing for losses with high probability and big gains only with a very low probability.
We have to distinguish 3 cases:
• If x0 < e−r T q, we cannot reach the minimum payout at time T with
probability 1 since by investing in the bond we only get x0 er T < q.
• If x0 = e−r T q, pure investment in the bond yields exactly the payout q.
So we cannot invest in the stocks since then we would make losses with
a strictly positive probability, so we would miss q with a strictly positive
probability.
• If x0 > e−r T q investment in bond and stocks is possible.
So let us assume that x0 > e−r T q.
The same considerations show that at time t we need at least wealth Xt ≥ e^{−r(T−t)} q to be able to reach q with probability 1.
Thus we may define
Then
So we define boundary conditions

Ψ(t, x) = (1/α) q^α for (t, x) ∈ ∂∗Q, t < T,   and   Ψ(t, x) = (1/α) x^α for t = T,
with the admissibility conditions defined similarly to Section 2.3. The HJB equation with boundary conditions then reads

sup_{π∈IR^n} { Vt + (r + π⊤(µ − r1)) x Vx + (1/2) π⊤σσ⊤π x² Vxx } = 0,   (t, x) ∈ Q,
V(t, x) = Ψ(t, x),   (t, x) ∈ ∂∗Q.
So we have to solve

Vt(t, x) + r x Vx(t, x) − (1/2) ((µ − r)/σ)² Vx²(t, x)/Vxx(t, x) = 0      (3.19)

subject to

V(t, x) = Ψ(t, x) = (1/α) q^α for (t, x) ∈ ∂∗Q, t < T,   and   V(T, x) = (1/α) x^α.
So it remains to solve

Ṽt(t, y) − (1/2) ((µ − r)/σ)² Ṽy²(t, y)/Ṽyy(t, y) = 0

subject to

Ṽ(t, y) = V(t, e^{rt} y) = e^{αrT} y^α/α = q^α/α for (t, y) ∈ ∂∗Q̃ with t < T,   and
Ṽ(t, y) = e^{αrT} y^α/α for (t, y) ∈ ∂∗Q̃ with t = T.
v_{i,0} = Ṽ(ti, y0) = e^{−αr(T−i∆t)} q^α/α,
v_{i,M} = Ṽ(ti, yM) = e^{c0(T−i∆t)} (e^{rT} yM)^α/α,

0 = (v_{i+1,j} − v_{i,j})/∆t − (1/2) ((µ − r)/σ)² (v_{i,j+1} − v_{i,j})² / (v_{i,j+1} − 2v_{i,j} + v_{i,j−1}),   j = 1, . . . , M − 1.
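A possible implementation of this scheme is sketched below. It is only illustrative: the parameters are invented, the nonlinear quotient is evaluated at the known time level i + 1 (an explicit variant of the scheme above), and the constant c0 of the upper boundary condition, which is not specified in this excerpt, is set to the growth constant of the unconstrained problem as an assumption:

    import numpy as np

    mu, r, sigma, alpha, q, T = 0.08, 0.02, 0.3, 0.1, 1.0, 1.0
    theta = ((mu - r) / sigma) ** 2
    c0 = alpha * theta / (2 * (1 - alpha))     # assumed value of the constant c0

    N, M = 200, 100
    dt = T / N
    y = np.linspace(np.exp(-r * T) * q, 2.0, M + 1)   # y_0 = e^{-rT} q, artificial y_M

    v = np.exp(alpha * r * T) * y ** alpha / alpha    # terminal condition at t_N = T
    for i in range(N - 1, -1, -1):
        t = i * dt
        num = (v[2:] - v[1:-1]) ** 2
        den = v[2:] - 2.0 * v[1:-1] + v[:-2]
        den = np.where(np.abs(den) < 1e-12, -1e-12, den)  # guard for the quotient
        v_new = v.copy()
        v_new[1:-1] = v[1:-1] - 0.5 * dt * theta * num / den
        v_new[0] = np.exp(-alpha * r * (T - t)) * q ** alpha / alpha
        v_new[-1] = np.exp(c0 * (T - t)) * (np.exp(r * T) * y[-1]) ** alpha / alpha
        v = v_new

    print("V~(0, y) at a few grid points:", np.round(v[::25], 4))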
[Figure: numerical approximation of Ṽ(t, y), plotted over t and y.]
Note that these results are only numerical and say nothing about existence. Further, since we only used an approximate boundary condition, there is little hope to show convergence to the true solution.
3.6 Dynamic programming for deterministic optimal control problems

We now consider deterministic control problems with control functions

u : [0, T] → U ⊆ IR^p.

We use the notation X(t) and u(t) for deterministic functions of time, while we reserve Xt and ut ('t' as index) for stochastic processes. For a dynamic programming approach we also look at starting time t and consider the performance criterion

J(t, x, u) = ∫_t^T ψ(s, X(s; t, x), u(s)) ds + Ψ(T, X(T; t, x)).
Remark 3.6 A problem like the above with running cost ψ and terminal cost Ψ is called a Bolza problem, with ψ and without terminal cost a Lagrange problem, and with Ψ and without running cost a Mayer problem. In deterministic optimal control these three problems are equivalent in the sense that each of them can be transformed into one of the other formulations.
E.g., transforming a Bolza problem to a Mayer problem can be done by introducing a further variable x_{n+1} with dynamics Ẋ_{n+1}(t) = ψ(t, X(t), u(t)) =: b_{n+1}(t, X(t), X_{n+1}(t), u(t)). Then we consider the (n + 1)-dimensional controlled process X̃ = (X⊤, X_{n+1})⊤ and define b̃ = (b⊤, b_{n+1})⊤, J̃(t, x, x_{n+1}, u) = X_{n+1}(T; t, x_{n+1}) + Ψ(T, X(T; t, x)), Ṽ(t, x, x_{n+1}) = x_{n+1} + V(t, x), yielding the HJB equation

Ṽt + sup_u { b⊤ Dx Ṽ + b_{n+1} D_{x_{n+1}} Ṽ } = 0.
Suitable conditions for the dynamic programming principle to work are in the deterministic case similar to those for the stochastic control problem. Usually we have two choices: being more restrictive on the controls, or more restrictive on the cost functions. Supposing that U is closed, conditions for the first approach are e.g. that the control functions are piecewise continuous and b, ψ, Ψ are continuous and continuously differentiable such that a solution X always exists, cf. e.g. [FR75]. On the other hand we could assume only that the controls are measurable functions and then need further boundedness conditions on f = b, ψ, Ψ like

Φt(t, X∗(t)) + ψ(t, X∗(t), u∗(t)) + b(t, X∗(t), u∗(t))⊤ Dx Φ(t, X∗(t)) = 0
As we will see in the following example and in Section 3.7 we may mix or
interchange steps 1 and 2 if it works.
Example 3.7 Linear regulator problem (LQ system)
Consider the n-dimensional controlled system (with p-dimensional controls)
3.7 Stochastic linear regulator problem
We now want to look at a stochastic version of Example 3.7. The comparison
will allow us to see what the influence of the additional noise really is. Consider
with the same conditions on the still deterministic matrices A, B, C, D, R and
for m-dimensional Brownian motion W and non-singular σσ⊤
and V(t, x) = inf_u J(t, x, u). Based on Section 3.1 we get the HJB equation

0 = inf_{u∈U} { x⊤C(t)x + u⊤D(t)u + (A(t)x + B(t)u)⊤ Dx V(t, x) + (1/2) tr(σσ⊤ Dxx V(t, x)) }.
Due to the expectation we have to expect a more complicated dependency of
the value function on time and hence make the ansatz
The latter is the same matrix Riccati equation as (3.22), and the differential equation for k with k(T) = 0 simply yields

k(t) = ∫_t^T tr(σσ⊤ K(s)) ds.
Since we minimize, this can be seen as the additional costs we have to pay for
the noise. Apart from that the solution has absolutely the same structure as
for the deterministic problem in Example 3.7.
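As a sketch of how K and k can be computed numerically, assuming the standard form −K̇ = A⊤K + KA − KBD⁻¹B⊤K + C, K(T) = R, of the matrix Riccati equation (3.22) (the equation itself is not reproduced in this excerpt, and the data below are invented):

    import numpy as np

    n = 2
    A = np.array([[0.0, 1.0], [0.0, 0.0]])
    B = np.array([[0.0], [1.0]])
    C = np.eye(n)                  # running state cost x'Cx
    D = np.array([[1.0]])          # running control cost u'Du
    R = np.eye(n)                  # terminal cost x'Rx
    Sig = 0.1 * np.eye(n)          # noise coefficient sigma

    T, N = 1.0, 10_000
    dt = T / N
    K, k = R.copy(), 0.0
    for _ in range(N):             # integrate backwards from T to 0 (explicit Euler)
        dK = A.T @ K + K @ A - K @ B @ np.linalg.solve(D, B.T @ K) + C
        k += np.trace(Sig @ Sig.T @ K) * dt   # k(t) = int_t^T tr(sigma sigma' K) ds
        K = K + dt * dK
    print("K(0) =\n", K)
    print("k(0) =", k)             # the additional cost caused by the noise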
4 Viscosity solutions and stochastic control
We have seen that the value function V solves the HJB equation if V is smooth, which means V ∈ C^{1,1} for deterministic control problems and V ∈ C^{1,2} for stochastic control problems. Often V is not smooth, as the examples in Sections 4.1 and 4.3 will show. Then we need another concept of solutions of the HJB equation, namely viscosity solutions, for which we show in Section 4.6 that the value function solves the HJB equation in the viscosity sense.
So, if an u.s.c. function f has a jump, the upper point belongs to the graph of f; if f is l.s.c., the lower point does. An u.s.c. function attains its supremum on compacta (the maximum exists), an l.s.c. function attains its infimum on compacta (the minimum exists).
4.3 A stochastic example
We consider U = IR and a two-dimensional controlled process (X, Y )⊤ with
dynamics
dXt = Yt dW¹_t,
dYt = ut dt + dW²_t,
where (W¹, W²)⊤ is a standard two-dimensional Wiener process. So we have coefficients

b(x, y, u) = ( 0, u )⊤,   σ(x, y) = ( y 0 ; 0 1 ).
For any l.s.c. function g let

J(t, x, y, u) = E_{t,x,y}[g(XT)],
A(t, x, y) = { u : X, Y exist, E_{t,x,y}[g⁻(XT)] < ∞, X a martingale },
V(t, x, y) = sup_{u∈A(t,x,y)} J(t, x, y, u).
Suppose that V ∈ C 1,2 . Then one can show analogously to Section 2.2 – using
that we have continuous b and σ – that V satisfies
sup_{u∈U} { Vt + b⊤DV + (1/2) tr(σσ⊤D²V) } ≤ 0.
Computing the gradient DV and the Hessian D 2 V we get on [0, T ) × IR2
sup_{u∈U} { Vt + u Vy + (1/2)(y² Vxx + Vyy) } ≤ 0.
Since we can choose any u ∈ U = IR this implies Vy (t, x, y) = 0 for all (t, x, y) ∈
[0, T ) × IR2 . Therefore V is constant w.r.t. y and we may consider from now
on V (t, x) = V (t, x, y) for any y ∈ IR. We get
Vt + (1/2) y² Vxx ≤ 0.
Choosing y = 0 yields Vt(t, x) ≤ 0, so V is decreasing in t on [0, T). Letting y → ∞ we obtain Vxx(t, x) ≤ 0 for all x, so V is concave in x for all t. Using these properties and that g is l.s.c. one can prove that

V(T−, x) ≥ V(T, x) = g(x),   x ∈ IR.

So V(t, x) ≥ g(x) for all x and V(t, x) is concave in x, which implies that V(t, x) ≥ ğ(x), where ğ is the concave envelope of g (the smallest concave function greater than or equal to g).
On the other hand we have by Jensen's inequality and the martingale property of X

V(t, x) ≤ sup_{u∈A(t,x)} E_{t,x}[ğ(XT)] ≤ sup_{u∈A(t,x)} ğ(E_{t,x}[XT]) = ğ(x).
4.4 Viscosity solutions
Consider
F (x, v(x), Dx v(x), Dxx v(x)) = 0 (4.23)
for x ∈ O ⊂ IR^n open. Let F be continuous with values in IR, satisfying the ellipticity condition
Proof: Given that (4.25) holds, we simply have to choose w = v. For the opposite direction suppose that v is a classical supersolution and that x̂ ∈ O, w ∈ C²(O), x̂ a minimizer of v − w. Since x̂ is a minimizer we have Dx v(x̂) = Dx w(x̂), Dxx v(x̂) ≥ Dxx w(x̂), hence by (4.24)

F(x̂, v(x̂), Dx w(x̂), Dxx w(x̂)) = F(x̂, v(x̂), Dx v(x̂), Dxx w(x̂))
  ≥ F(x̂, v(x̂), Dx v(x̂), Dxx v(x̂))
  ≥ 0.
Definition 4.3 Suppose that v : O → IR is locally bounded.
(i) v is a (viscosity) supersolution of (4.23) if
F (x̂, v(x̂), Dx w(x̂), Dxx w(x̂)) ≥ 0
for all x̂ ∈ O, w ∈ C 2 (O) such that x̂ is a minimizer of v − w.
(ii) v is a (viscosity) subsolution of (4.23) if
F (x̂, v(x̂), Dx w(x̂), Dxx w(x̂)) ≤ 0
for all x̂ ∈ O, w ∈ C 2 (O) such that x̂ is a maximizer of v − w.
(iii) v is a viscosity solution of (4.23) if v is both a super- and a subsolution
of (4.23).
4.5 Properties
We start with a change of variables formula (proof not difficult):
Proposition 4.4 Let v be a l.s.c. supersolution of (4.23). If f ∈ C¹(IR) with f′ ≠ 0 on IR, then ṽ = f⁻¹ ∘ v is a supersolution (subsolution) of

F̃(x, ṽ(x), Dx ṽ(x), Dxx ṽ(x)) = 0

if f′ > 0 (if f′ < 0), where

F̃(x, r, p, A) = F(x, f(r), f′(r) p, f′′(r) p p⊤ + f′(r) A).
The proof of the following proposition is relatively easy compared to other
convergence results for PDEs. It is very important for numerical schemes.
Proposition 4.5 (i) Let vε be a l.s.c. supersolution of Fε(x, Dx v(x), Dxx v(x)) = 0, ε > 0, where Fε is continuous and satisfies (4.24). Suppose (ε, x) 7→ vε(x) and (ε, x, p, A) 7→ Fε(x, p, A) are locally bounded. Define

v⁰(x) = liminf_{(ε,y)→(0,x)} vε(y),   F⁰(x, p, A) = limsup_{(ε,x′,p′,A′)→(0,x,p,A)} Fε(x′, p′, A′).
4.6 Viscosity solutions and HJB equations
Now let us also consider the dependency on time, i.e. F of the form

F(t, x, Dx V(t, x), Dxx V(t, x)) = −Vt(t, x) − sup_{u∈U} { ψ(t, x, u) + L^u V(t, x) } = 0      (4.26)

on Q = [0, T) × IR^n.
Theorem 4.7 Suppose that b, σ, ψ are for any fixed u in C¹(Q) ∩ C(Q̄), that b, σ have uniformly bounded derivatives w.r.t. t, x, and that they are of linear growth in x and u. Then for any u.s.c. subsolution V_∗ and any l.s.c. supersolution V^∗ of (4.26)

sup_{(t,x)∈Q} ( V_∗(t, x) − V^∗(t, x) ) = sup_{x∈IR^n} ( V_∗(T, x) − V^∗(T, x) ).
dBt = Bt r dt, B0 = 1
dSt = St (µ dt + σ dWt ), S0 = 1.
Without costs we saw in Section 2.4 that for maximizing expected power utility
xα /α of terminal wealth it is optimal to keep a constant fraction
π_M = (1/(1−α)) (µ − r)/σ²
of the portfolio value (wealth) invested in the stock. Such a strategy requires continuous trading since the position always has to be adjusted when the stock does not evolve like the money market. Because the stock prices are not of finite variation, any reasonable costs for trading would lead to an immediate ruin of an investor trying to do so.
of money for which stocks are bought (∆ > 0) or sold (∆ < 0). The fees are
paid from the bond (bank account).
To compute the costs we need two processes to describe the portfolio. We use
the wealth X 0 in the bond and the wealth X 1 in the stock. It is not clear what
the value of the portfolio should be. At the terminal time we distinguish two
possibilities:
XT = X⁰_T + X¹_T   (total wealth, no liquidation costs),
X̲T = XT − γ|X¹_T|   (wealth of the liquidated portfolio).
To make sure that it is always possible to liquidate the position in the stocks
and thus to end up with a (strictly) positive wealth after liquidation one has
to ensure that the two-dimensional wealth process (Xt0 , Xt1 )t∈[0,T ] stays in (the
closure of) the solvency cone S given by
S = {(x0 , x1 ) ∈ IR2 : x0 + x1 − γ|x1 | > 0}
which is the interior of a cone with boundaries x1(x0) = −x0/(1−γ) for x0 < 0 and x1(x0) = −x0/(1+γ) for x0 ≥ 0.
As controls we consider the cumulated purchases Lt and the cumulated sales
Mt up to time t which are assumed to be adapted, positive, increasing and
right continuous. The controlled wealth processes are for (x0 , x1 ) ∈ S then
given by

X⁰_t = x0 + ∫_0^t r X⁰_s ds + (1 − γ) Mt − (1 + γ) Lt,      (4.27)
X¹_t = x1 + ∫_0^t µ X¹_s ds + ∫_0^t σ X¹_s dWs + Lt − Mt.      (4.28)
The controls L and M are admissible if X 0 , X 1 are well defined and (Xt0 , Xt1 ) ∈
S for all t ∈ [0, T ].
Proof: We shall only give the argument for the homotheticity property.
The proof for the concavity uses similar arguments based on the elementary
definition of a concave function.
Due to the linearity of (4.27), (4.28) one can easily verify that at t the controls L, M are admissible for (x0, x1) ∈ S if and only if yL, yM are admissible for (yx0, yx1), and that

X⁰_T(t, yx0, yx1, yL, yM) = y X⁰_T(t, x0, x1, L, M),   X¹_T(t, yx0, yx1, yL, yM) = y X¹_T(t, x0, x1, L, M),

where e.g. X⁰_T(t, x0, x1, L, M) denotes the terminal wealth in the bond for starting at t with x0, x1 and using L, M ∈ A(t, x0, x1). Therefore

XT(t, yx0, yx1, yL, yM) = X⁰_T(t, yx0, yx1, yL, yM) + X¹_T(t, yx0, yx1, yL, yM) = y XT(t, x0, x1, L, M)
and

V(t, yx0, yx1) = sup_{(L,M)∈A(t,x0,x1)} E_{t,x0,x1} [ (y^α/α) XT(t, x0, x1, L, M)^α ] = y^α V(t, x0, x1).
where

LV = r x0 V_{x0} + µ x1 V_{x1} + (1/2) σ² x1² V_{x1x1},
L^B V = V_{x1} − (1 + γ) V_{x0},
L^S V = (1 − γ) V_{x0} − V_{x1},
and

m̂(t, x0, x1) = 0 if L^S V(t, x0, x1) < 0,   m̂(t, x0, x1) = κ if L^S V(t, x0, x1) ≥ 0.
Thus on the no trading region NT , defined by
we have Vt + LV = 0. Further we can introduce the buy region B and the sell
region S where it is optimal to buy and to sell respectively,
Due to Proposition 4.9, V_{x0} is strictly positive and thus (1 + γ)V_{x0} > (1 − γ)V_{x0}. Therefore the condition L^B V ≥ 0 implies L^S V < 0, and L^S V ≥ 0 implies L^B V < 0; hence the regions may equivalently be defined by
whose solution provides us with the optimal trading regions. According to (4.30) we have equality Vt + LV = 0, L^B V = 0, L^S V = 0 on NT, B, S, respectively. Note that this is not an HJB equation in the sense that we maximize over possible strategies. Rather it is a set of variational inequalities, and we have to find the free boundaries between the regions where one of the inequalities is active (= 0). Solving this free boundary problem yields the trading regions NT, B, S. One may then view the maximum in (4.31) as a maximum over the three possible actions corresponding to the inequalities, so the maximizing choice corresponds to the optimal action (hold, buy or sell stocks).
Using the homotheticity property in Lemma 4.9 and assuming that V is continuously differentiable we get for y > 0

∂V(t, yx0, yx1)/∂x0 = ∂( y^α V(t, x0, x1) )/∂x0 = y^α V_{x0}(t, x0, x1)

and on the other hand, not using the homotheticity property, by the chain rule

∂V(t, yx0, yx1)/∂x0 = y V_{x0}(t, yx0, yx1).

Comparing these two we have V_{x0}(t, yx0, yx1) = y^{α−1} V_{x0}(t, x0, x1) (the same holds for the partial derivative w.r.t. x1).
Remark 4.10 In similar problems it can then be shown that controls L and M exist such that (X⁰_t, X¹_t) ∈ NT(t) and

Lt = ∫_0^t 1_{{(X⁰_s, X¹_s) ∈ ∂B(s)}} dLs,   Mt = ∫_0^t 1_{{(X⁰_s, X¹_s) ∈ ∂S(s)}} dMs,
compare [Ko99] and the references therein. So trading occurs only on the boundary. Further it can be shown that only so much is traded that the process stays on the boundary. Mathematically, the controlled process (X⁰_t, X¹_t)_{t∈[0,T]} is a continuous reflected diffusion process, reflected at the boundaries of NT, and trading only occurs with infinitesimally small transactions at the local time on the boundary.
For the form of a verification theorem, which still would have to be proved to guarantee that the optimal strategy is of the conjectured form, we refer for a similar problem to [Ko99] and the references therein.
4.7.4 No short selling, no borrowing
If, in addition, we require in the admissibility conditions that no short selling takes place (X¹_t ≥ 0) and no borrowing is allowed (X⁰_t ≥ 0), then we consider instead of the solvency region the domain D = [0, ∞)² \ {(0, 0)} and define the trading regions as subsets of D. Then it might happen that one of the trading regions is empty. Further, if x0 = 0 (x1 = 0) we should exclude the second (third) inequality in (4.31) since buying (selling) is not admissible. This leads to the following theorem, for which we refer to Akian et al. (1996)¹.
The proof in Akian et al. (1996) is based on the derivation of a weak dynamic
programming principle leading to (4.31). The uniqueness is shown following
the Ishii technique, see [CIL92].
Vt = x^α Φt,
V_{x0} = x^{α−1} (α Φ − π Φπ),
V_{x1} = x^{α−1} (α Φ + (1 − π) Φπ),
V_{x1x1} = x^{α−2} ( (1 − π)² Φππ − 2(1 − α)(1 − π) Φπ − α(1 − α) Φ ),

where we used x = x0 + x1 and π = x1/x as above. Plugging this into the definition of the operators we get

LV = x^α L^π Φ,   L^B V = x^{α−1} L^π_B Φ,   L^S V = x^{α−1} L^π_S Φ,
¹ M. Akian, A. Sulem, P. Séquier (1996): A finite horizon multidimensional portfolio selection problem with singular transactions. In: Proceedings of the 34th Conference on Decision and Control, New Orleans, 2193–2197.
where

L^π Φ(t, π) = α ( r + π(µ − r) − (1/2) σ² (1 − α) π² ) Φ(t, π)
            + ( µ − r − σ²(1 − α)π ) π(1 − π) Φπ(t, π)
            + (1/2) σ² π² (1 − π)² Φππ(t, π),
L^π_B Φ(t, π) = (1 + γπ) Φπ(t, π) − αγ Φ(t, π),
L^π_S Φ(t, π) = −(1 − γπ) Φπ(t, π) − αγ Φ(t, π).
max{ Φt(t, π) + L^π Φ(t, π), L^π_B Φ(t, π), L^π_S Φ(t, π) } = 0      (4.33)

for (t, π) ∈ [0, T) × (0, 1). Note that x0 ≥ 0, x1 ≥ 0, x > 0 imply π ∈ [0, 1]. We shall denote the corresponding trading regions in terms of π by NT^π, B^π, S^π.
The boundary conditions at terminal time T now read

Φ̃(T, π) = 1/α   and   Φ(T, π) = (1/α)(1 − γπ)^α,

for the problems without and with liquidation at maturity, respectively.
On B^π we can solve L^π_B Φ = 0, yielding
4.7.6 A Semi-Smooth Newton Method
The algorithm we present to solve (4.33) is based on a primal-dual active set strategy, compare Hintermüller et al. (2003)² and Ito and Kunisch (2006)³. Here we face two free boundaries and a different type of constraints and have to adapt their algorithm. We now work in the setting of Section 4.7.5 but no longer use the superscript π.
Problem (4.33) is equivalent to solving
Φt + LΦ + λ_B + λ_S = 0,      (4.39)
L^B Φ ≤ 0,   λ_B ≥ 0,   λ_B L^B Φ = 0,      (4.40)
L^S Φ ≤ 0,   λ_S ≥ 0,   λ_S L^S Φ = 0.      (4.41)
for any constant c > 0. So we have to solve (4.39), (4.42). At T the trading regions are given by S(T) = [0, 1] for Φ and NT(T) = [0, 1] for Φ̃. We split [0, T] into N intervals and go backwards in time with tN = T, tn = t_{n+1} − ∆t, ∆t = T/N. Having computed Φ(t_{n+1}, ·) and the corresponding regions we use the following algorithm to compute Φ(tn, ·) and NT(tn):
0. Set v = Φ(tn+1 , ·), k = 0, choose an interval NT0 in [0, 1], constant c > 0.
4. Set

λ^{k+1}_B = 0 on (a_k, 1],   λ^{k+1}_B = −(1/∆t)(v − v_{k+1}) − L v_{k+1} on [0, a_k],

and

λ^{k+1}_S = 0 on [0, b_k),   λ^{k+1}_S = −(1/∆t)(v − v_{k+1}) − L v_{k+1} on [b_k, 1].
² M. Hintermüller, K. Ito, K. Kunisch (2003): The primal-dual active set strategy as a semismooth Newton method. SIAM Journal on Optimization 13, 865–888.
³ K. Ito, K. Kunisch (2006): Parabolic variational inequalities: The Lagrange multiplier approach. Journal de Mathématiques Pures et Appliquées (9) 85, 415–449.
[Figure: two panels showing the trading regions B (bottom), NT (middle) and S (top) in the (t, y)-plane, t, y ∈ [0, 1]; left with liquidation at maturity, right without.]
and set NT_{k+1} = [0, 1] \ (B_{k+1} ∪ S_{k+1}). Verify that the interval structure holds and define the boundaries a_{k+1} and b_{k+1} by (4.38).
6. If a_{k+1} = a_k and b_{k+1} = b_k then set NT(tn) = (a_{k+1}, b_{k+1}), Φ(tn, ·) = v_{k+1} and STOP; otherwise increase k by 1 and continue with step 1.
Example 4.12
We consider a bond and a stock with parameters r = 0, µ = 0.096, σ = 0.4, and horizon T = 1. We use mesh sizes ∆t = 0.01 and ∆y = 0.001, choose c = 1, at t_{N−1} use NT₀ = (0.1, 0.8), and at all other time steps tn use NT₀ = NT(t_{n+1}). For the utility function we consider both α = 0.1 and the more risk averse parameter α = −1. These yield, without transaction costs, optimal risky fractions 0.667 and 0.3 (dotted lines in Figure 3). We consider proportional costs γ = 0.01. In Figure 3 we look at α = 0.1, left-hand at V with liquidation at the end, right-hand at Ṽ. We see that the liquidation costs we have to pay at T imply that we also trade close to the terminal time, while without liquidation this is never optimal. Using Mathematica the computation time for each graph was about 18 s.
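The two risky fractions quoted above follow directly from π_M = (µ − r)/((1 − α)σ²) and can be checked in one line (added for convenience):

    # r = 0, mu = 0.096, sigma = 0.4
    for alpha in (0.1, -1.0):
        print(alpha, 0.096 / ((1 - alpha) * 0.4 ** 2))   # 0.667 and 0.3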
from variational inequalities like (4.31) by inclusion of a maximization problem
in the inequalities for the buy and sell region which determines the optimal
transaction when trading.
The first type of costs considered were purely proportional costs, as above, for which the optimal solution is given by a cone in which it is optimal not to trade at all, and which corresponds to an interval for the risky fraction. When reaching the boundaries, infinitesimal trading occurs in such a way that the wealth process just stays in the cone.
Adding a constant component to the transaction costs punishes very frequent trading and so avoids the occurrence of infinitesimal trading at the boundary. An investor now has to choose discrete trading times and optimal transactions at these times, so that the methodology of optimal impulse control comes into play. The insight is that there is still some no-transaction region, but on reaching the boundary, transactions are made in such a way that the wealth process restarts at some curve between the boundary and the Merton line. For constant costs the trading regions also depend on the total wealth.
A simpler approach is possible when considering fixed costs instead of constant
costs. For purely fixed costs we get a constant new risky fraction close to the
Merton fraction πM . If combined with proportional costs we get two different
new risky fractions after buying and after selling.
5 Optimal Stopping
We will cite some results on optimal stopping in Section 5.1 and then look at
the pricing of American options as an important application of the theory of
optimal stopping.
5.1 Some results on optimal stopping

For an adapted payoff process (Yt)_{t∈[0,T]} we want to find the value

V(0) = sup_{τ∈S_{0,T}} E[Yτ],

where S_{s,t} is the class of stopping times with values in [s, t], and an optimal stopping time τ∗ for which V(0) is attained, i.e. V(0) = E[Y_{τ∗}]. We assume that V(0) ∈ (0, ∞). The main idea to solve the optimal stopping problem is the introduction of the Snell envelope
The essential supremum X∗ = ess sup X of a family X of random variables is characterized by two conditions: (i) for all X ∈ X we have X ≤ X∗ (a.s.), and (ii) if Y is a random variable with X ≤ Y (a.s.) for all X ∈ X, then X∗ ≤ Y (a.s.).
If the essential supremum is taken over a countable number of random variables or over expectations (real numbers), it coincides (a.s.) with the supremum. Therefore Ȳ₀ = V(0). By definition we also have Ȳ_T = Y_T.
Remark 5.5 In discrete time the Snell envelope can be introduced similarly and constructed by backward induction. Say we have a stochastic process (Yn)_{n=0,...,N}, adapted to a filtration (Fn)_{n=0,...,N}, and we want to find the value

vn = sup_{τ∈S_{n,N}} E[Yτ]

and an optimal stopping time τ∗_n in the class S_{n,N} of stopping times with values in n, . . . , N, i.e. for which E[Y_{τ∗_n}] = vn. In particular we are interested in n = 0. Define

Ȳ_N = Y_N

and backwards for n = N − 1, . . . , 0

Ȳ_n = max{ Yn, E[Ȳ_{n+1} | Fn] }.

The resulting process (Ȳ_n)_{n=0,...,N} is the smallest supermartingale majorant of (Yn)_{n=0,...,N}, the Snell envelope. Further, by induction it follows that

τ∗_n = min{ m = n, . . . , N : Ȳ_m = Y_m } = min{ m = n, . . . , N : Ȳ_m ≤ Y_m }

is optimal with vn = E[Y_{τ∗_n}] = E[Ȳ_n].
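The backward induction is straightforward to implement. The following sketch (added for illustration) runs it for an invented payoff process, the undiscounted payoff of a put on a binomial stock:

    import numpy as np

    # Snell envelope by backward induction: Ybar_N = Y_N,
    # Ybar_n = max(Y_n, E[Ybar_{n+1} | F_n]), here on a binomial tree.
    N, u, d, p, S0, K = 50, 1.02, 0.98, 0.5, 100.0, 100.0
    payoff = lambda s: np.maximum(K - s, 0.0)
    S = lambda n: S0 * u ** np.arange(n + 1) * d ** (n - np.arange(n + 1))

    Ybar = payoff(S(N))
    for n in range(N - 1, -1, -1):
        cont = p * Ybar[1:] + (1 - p) * Ybar[:-1]   # E[Ybar_{n+1} | F_n] at level n
        Ybar = np.maximum(payoff(S(n)), cont)
    print("value v_0 = E[Ybar_0] =", Ybar[0])
    # tau*_0 = first n at which the immediate payoff attains the envelope.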
Example 5.6 The problem of best choice (also known as the secretary problem):
A company wants to hire a mathematician and has invited N applicants for
a job interview. They are interviewed sequentially and after each candidate it
has to be decided whether s/he is hired or not. The aim is to maximize the
probability to get the best candidate. The rank of candidate n is given by a
random variable Xn and we assume that all permutations of ranks have the
same probability. What can be observed are the relative ranks Rn with values
in {1, . . . , n} up to time n. So we consider the filtration Fn = σ(R1 , . . . , Rn ),
n = 1, . . . , N. The company faces the optimal stopping problem
vn := sup_{τ∈S_{n,N}} E[Yτ],   n = 1, . . . , N,
where

Yn = P(Xn = 1 | R1, . . . , Rn) = P(Xn = 1 | Fn),

and is interested in the optimal stopping times τ∗_n at n, in particular for n = 1. It can be shown that R1, . . . , RN are independent with

P(Rn = k) = 1/n,   k = 1, . . . , n,

and that

Yn = (n/N) 1_{{Rn = 1}}.
Using that due to independence E[Ȳ_{n+1} | Fn] = v_{n+1}, we can compute the Snell envelope Ȳ_n, τ∗_n, and vn by backward induction and show that the optimal strategy can be described by
Thus – as long as σ∗ is not reached – the decision is always no, and from σ∗ onwards the first candidate who is the best of those seen so far gets the job. Further it can be shown that, viewing σ∗ and v₁ as functions of the number of candidates N,

lim_{N→∞} σ∗(N)/N = 1/e,   lim_{N→∞} v₁(N) = 1/e,
i.e., it is asymptotically optimal to send home N/e candidates after their in-
terview and then take the next relatively best candidate. This yields approxi-
mately a probability of 1/e to get the best of all candidates. For more details
we refer e.g. to [Ir03].
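Since E[Ȳ_{n+1} | Fn] = v_{n+1} and P(Rn = 1) = 1/n, the backward induction reduces to the scalar recursion v_n = (1/n) max{n/N, v_{n+1}} + (1 − 1/n) v_{n+1} with v_{N+1} = 0, and σ∗(N) is the smallest n with n/N ≥ v_{n+1}. A short sketch (added here) illustrates the 1/e asymptotics:

    def best_choice(N):
        v, sigma_star = 0.0, N          # v plays the role of v_{n+1}
        for n in range(N, 0, -1):
            if n / N >= v:              # stopping at a relatively best candidate is optimal
                sigma_star = n
            v = (1 / n) * max(n / N, v) + (1 - 1 / n) * v   # now v = v_n
        return sigma_star, v            # returns sigma*(N) and v_1(N)

    for N in (10, 100, 1000):
        s, v1 = best_choice(N)
        print(f"N={N}: sigma*={s}, sigma*/N={s/N:.3f}, v1={v1:.4f} (1/e ~ 0.3679)")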
5.2 Optimal Stopping for underlying Markov processes
Suppose now that Yt = g(t, Xt), where (Xt)_{t∈[0,T]} is a strong Markov process with values in IR^n, e.g. an Itô diffusion as defined in (2.4).
Then the optimal stopping problem reads

V(t, x) = sup_{τ∈S_{t,T}} E[ g(τ, Xτ(t, x)) ].
Theorem 5.7 Suppose that Assumption 5.3 holds, that V is lower semicontinuous and g upper semicontinuous. Then

τ∗_{t,x} = inf{ s ∈ [t, T] : V(s, Xs(t, x)) = g(s, Xs(t, x)) }

is an optimal stopping time.
Furthermore, one can introduce the continuation set C and the stopping set D
Then for (0, x) ∈ C, the optimal stopping time τ∗_{0,x} = τ_D is the first entry time of (t, Xt) into D.
5.3 Reminder on pricing of European options

We consider again the Black-Scholes market

dBt = Bt r dt,  B0 = 1,
dSt = St (µ dt + σ dWt),  S0 > 0,

where W is a Wiener process, µ ∈ IR, σ > 0, and r > 0 is the interest rate. Risk neutral pricing requires a change of measure to a new probability measure under which the discounted price processes become martingales. This risk neutral measure P̃ can be defined by a Radon-Nikodym density Z_T = dP̃/dP in the following way:

P̃(A) := E[Z_T 1_A],   A ∈ F_T,
where

Z_T = exp( −((µ − r)/σ) W_T − (1/2) ((µ − r)/σ)² T ).
Note that the density process (Zt)_{t∈[0,T]}, defined by

Zt := E[Z_T | Ft] = exp( −((µ − r)/σ) Wt − (1/2) ((µ − r)/σ)² t ),

is a martingale (Lemma B.5), and that

Ẽ[X] = E[Z_T X],   Ẽ[X | Ft] = E[Z_T X | Ft] / E[Z_T | Ft] = Zt⁻¹ E[Z_T X | Ft].
Thus also the wealth process is a P̃ (local) martingale. This was to be expected
since we can only invest in the two martingales Bt /Bt = 1 and S̃.
Say we have a financial derivative which pays at the terminal time the amount
C to its buyer. The arbitrage free price p(C) of this contingent claim C is given
by x0 if we can find initial capital x0 and a self-financing trading strategy π
such that the corresponding discounted wealth process X̃ = X̃ π satisfies
Z T
C
= x0 + πt X̃t σ dW̃t . (5.44)
BT 0
The Black-Scholes market is a so-called complete market model in which we
know that every FT -measurable, square-integrable claim C can be hedged by
a trading strategy like in (5.44). If the wealth process is indeed a martingale,
we get from (5.44)
p(C) = Ẽ[X̃Tπ ] = Ẽ[C/BT ],
where the latter does no longer depend directly on the trading strategy, so we
can simply compute the price by taking expectation of the discounted claim
with respect to the risk neutral measure P̃ .
Furthermore one can show that under these conditions the arbitrage free price
of C at time t is given by
For example, for Φ(x) = (x − K)⁺ we have a European call option which gives the right to buy the stock for the strike price K > 0 at time T. The solution of the corresponding PDE above yields the so-called Black-Scholes formula for the call price, cf. any textbook on financial mathematics.
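For convenience, the Black-Scholes call price can be evaluated as follows (the standard formula; the numerical values are only an example):

    from math import log, sqrt, exp
    from statistics import NormalDist

    def bs_call(x, K, r, sigma, tau):
        # Call price at stock price x with time to maturity tau = T - t.
        Phi = NormalDist().cdf
        d1 = (log(x / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
        d2 = d1 - sigma * sqrt(tau)
        return x * Phi(d1) - K * exp(-r * tau) * Phi(d2)

    print(bs_call(x=100.0, K=100.0, r=0.05, sigma=0.2, tau=1.0))   # ~10.45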
The price for a put option given by Φ(x) = (K − x)+ can then be determined
by the put-call-parity, cf. e.g. [KS98, Example 2.4.3].
5.4 American options

So the buyer chooses a stopping time in S_{0,T}. In particular, if he has not exercised before T, he receives the payout C_T at terminal time. At time 0 the maximum the buyer is willing to pay is

p_B(C·) = sup{ y : there exist τ ∈ S_{0,T} and π s.t. X^π_τ(0, −y) + Cτ ≥ 0 },

while the minimum initial capital the seller needs to cover the claim is

p_S(C·) = inf{ z : there exists π s.t. X^π_t(0, z) − Ct ≥ 0 for all t ∈ [0, T] }.
Then we get
and

C_{τ∗}/B_{τ∗} = p_B(C·) + ∫_0^{τ∗} π∗_t X̃^{π∗}_t σ dW̃t.
is unique. It allows the buyer to find a stopping time τ and a trading strategy for his initial debt −p(C·) such that he will make no losses almost surely. For the seller it guarantees that he can cover the claim at any time when the investor exercises. The buyer will also be interested in the optimal stopping time τ∗ for which

sup_{τ∈S_{0,T}} Ẽ[Cτ/Bτ] = Ẽ[C_{τ∗}/B_{τ∗}],
The proof of Theorem 5.9 shows that the arbitrage free price at time t is of the form

Ct = ψ(t, St)

for a suitable ψ. Then the price of the American contingent claim C is given by the value function V, i.e. p_t(C·) = V(t, x) on {St = x}, where
for t ∈ [0, T ) and V (T, x) = ψ(T, x) for x > 0. Note that Ẽt,x [ · ] = Ẽ[ · | St = x].
Following Section 5.2 the continuation region is
Example 5.10 For the American call option we have ψ(t, x) = (x − K)+
for some strike price K > 0. Then for τ ∈ S_{t,T}, Jensen's inequality, Theorem 5.1 and r ≥ 0 imply

Ẽ_{t,x}[(Sτ − K)⁺/Bτ] ≥ ( Ẽ_{t,x}[S̃τ] − Ẽ_{t,x}[K/Bτ] )⁺
  = (1/Bt) ( x − K Ẽ_{t,x}[e^{−r(τ−t)}] )⁺
  ≥ (1/Bt) (x − K)⁺ = (1/Bt) ψ(t, x),
where the last inequality is strict if t < T and P̃t,x (τ = T ) > 0. Therefore,
V (t, x) > ψ(t, x) for t < T and D = {T } × (0, ∞), i.e. it is optimal to exercise
at the terminal time. Thus the American call has the same (optimal) payout
as the European call. This is no longer true if one considers e.g. dividend
payments.
5.5 The American put option
In the notation of the preceding section we now look at
ψ(t, x) = (K − x)+
for some K > 0. This corresponds to the American put option with strike
price K.
Following [PS06, Section 25], the optimal stopping problem can be transformed
to a free boundary problem to find the boundary between the continuation and
stopping regions. We sketch the procedure:
Step 1: From Section 5.4 we know that the arbitrage free price is given by
and that we continue (do not stop) if (u, Su (t, x))u∈[t,T ] lies in the continuation
region
C = {(t, x) ∈ [0, T ] × (0, ∞) : V (t, x) > (K − x)+ },
and the optimal stopping time τ ∗ is the first entry time of (u, Su (t, x))u∈[t,T ] in
the stopping region
Step 2: All points (t, x) with x ≥ K, t < T belong to C. Further one can show
that all points with 0 < x < b∞ belong to D, where b∞ < K is the constant
boundary of the stopping region for the corresponding infinite time horizon
problem, where an explicit solution can be derived, see e.g. [PS06, Section
25]. Showing that x 7→ V (t, x) is convex on (0, ∞) it follows that a function
t 7→ b(t) (boundary between C and D) exists such that
and
D = {(t, x) ∈ [0, T ] × (0, ∞) : x ≤ b(t)} ∪ ({T } × (b(T ), ∞)).
Since (K−x)+ does not depend on time, t 7→ V (t, x) is decreasing and therefore
t 7→ b(t) increasing.
Step 3: V is continuous.
Step 4: The smooth fit condition for x 7→ V(t, x) holds at b(t), i.e. Vx(t, b(t)) = −1.
Step 6: Arguments like those we used to derive the HJB equation lead to the fact that V is C^{1,2} on C and satisfies

Vt + L_S V − r V = 0
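The free boundary problem is usually solved numerically. As an illustration (not part of the notes), the following sketch approximates the American put price and the exercise boundary b(t) on a Cox-Ross-Rubinstein binomial tree with invented parameters:

    import numpy as np

    S0, K, r, sigma, T, N = 100.0, 100.0, 0.05, 0.2, 1.0, 500
    dt = T / N
    u, d = np.exp(sigma * np.sqrt(dt)), np.exp(-sigma * np.sqrt(dt))
    qu = (np.exp(r * dt) - d) / (u - d)          # risk neutral up-probability
    disc = np.exp(-r * dt)

    S = S0 * u ** np.arange(N + 1) * d ** (N - np.arange(N + 1))
    V = np.maximum(K - S, 0.0)                   # exercise value at maturity
    boundary = np.full(N + 1, np.nan)
    for n in range(N - 1, -1, -1):
        S = S0 * u ** np.arange(n + 1) * d ** (n - np.arange(n + 1))
        V = np.maximum(K - S, disc * (qu * V[1:] + (1 - qu) * V[:-1]))
        ex = K - S >= V - 1e-12                  # nodes where early exercise is optimal
        if ex.any():
            boundary[n] = S[ex].max()            # approximate b(t_n)
    print("price ~", V[0])
    print("b(0) ~", boundary[0], " b(T/2) ~", boundary[N // 2])

The computed boundary is increasing in t, as shown in Step 2 above.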
Appendix
The results in the appendix are formulated for stochastic processes with infinite
time horizon, i.e. t ≥ 0, and can easily be adapted to a finite time horizon, i.e.
t ∈ [0, T ]. Further all (in)equalities for random variables are to be understood
P -almost surely (a.s.). Only to emphasize this we may write ’P − a.s.’ in some
places.
A Conditional Expectation
The conditional expectation is the expected value of a random variable given
the available information, which can be described by a σ-algebra. It is again
a random variable!
P (A | G) = E[1A | G].
(B4) E[X | G] = E[X] if X is independent of G.
(B7) Jensen’s Inequality: If f is convex with E|f (X)| < ∞, then f (E[X | G]) ≤
E[f (X) | G].
If all Xt are integrable (i.e. lie in L1 ) or square integrable (in L2 ), then we call
the process integrable or square integrable, respectively.
B.2 Filtrations
A filtration F = (Ft )t≥0 is an increasing family of σ-algebras in F, i.e. for
s < t we have Fs ⊆ Ft ⊆ F. Ft may be seen as the information which is
available at time t, i.e. for each event A ∈ Ft it can be decided at time t if it
has occurred or not.
A stochastic process X is progressively measurable w.r.t. a filtration F , if for all
t ≥ 0 the maps (s, ω) 7→ Xs (ω) on [0, t] × Ω are measurable w.r.t B([0, t]) ⊗ Ft .
B.4 Martingales
An F -adapted process X is a martingale, if E|Xt | < ∞ for all t ≥ 0 and
E[Xt | Fs ] = Xs (P − a.s.)
for all 0 ≤ s ≤ t.
(W1) W0 = 0,
(W3) Wt −Ws is normally distributed with mean 0 and variance t−s, t > s ≥ 0.
(W4) W is continuous.
The processes

W,   (Wt² − t)_{t≥0},   ( exp( a Wt − (1/2) a² t ) )_{t≥0} for a > 0

are martingales.
Lemma B.5 Let X be an IR-valued random variable with E|X| < ∞ and F a filtration. Then

( E[X | Ft] )_{t≥0}

is a uniformly integrable martingale.
where ξj is F_{tj}-measurable and sup_{j∈IN} |ξj(ω)| < C for all ω ∈ Ω and some C ∈ IR. Furthermore,
We denote the class of simple processes by P0 .
Thus the stochastic integral at time t is a random variable and hence I(X) =
(It (X))t≥0 is a stochastic process!
One can show that P₀ is dense in P. Thus there exist simple processes (X^n)_{n∈IN} satisfying

lim_{n→∞} ‖X^n − X‖_M = 0.
Using also the isometry in Proposition C.1 (I6), this implies that the limit is well defined and the stochastic integral can be set as

∫_0^t Xu dWu := It(X),   t ≥ 0.
We also write

∫ Xu dWu = ( ∫_0^t Xu dWu )_{t≥0},
Example C.3

∫_0^t Ws dWs = (1/2) Wt² − (1/2) t,   t ≥ 0.
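This identity is easy to check numerically: the Itô integral is the L²-limit of left-endpoint Riemann sums, which the following short sketch (added for illustration) approximates on one simulated path:

    import numpy as np

    # Left-endpoint sums sum_i W_{t_i} (W_{t_{i+1}} - W_{t_i}) approximate the
    # Ito integral; the left endpoint is essential.
    rng = np.random.default_rng(3)
    t, n = 1.0, 100_000
    dW = rng.normal(0.0, np.sqrt(t / n), n)
    W = np.concatenate(([0.0], np.cumsum(dW)))
    print("sum W dW      =", np.sum(W[:-1] * dW))
    print("W_t^2/2 - t/2 =", 0.5 * W[-1] ** 2 - 0.5 * t)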
C.3 A generalization
The stochastic integral can be extended to cover integrands in P ∗ , the class of
progressively measurable processes X which satisfy
∫_0^T Xs² ds < ∞ (P-a.s.) for all T ≥ 0.
These satisfy τn → ∞ and we have X^{(n)}_t := Xt 1_{{τn ≥ t}} ∈ P. Thus one defines

I^{(n)}_t := ∫_0^t X^{(n)}_s dWs.
C.4 Itô Formula
A (one-dimensional) Itô process is a stochastic process which admits a repre-
sentation of the form
Xt = ξ + ∫_0^t bs ds + ∫_0^t σs dWs,   t ≥ 0,      (C.2)
where ξ is F₀-measurable and (bt)_{t≥0} and (σt)_{t≥0} are F-progressively measurable with

∫_0^t ( |bs| + |σs|² ) ds < ∞ for all t > 0.
In particular, X is continuous and F -adapted. In differential form we may
write
dXt = bt dt + σt dWt , X0 = ξ.
An n-dimensional stochastic process which admits a representation of the form
Xt = X₀ + ∫_0^t bs ds + ∫_0^t σs dWs,   t ≥ 0,      (C.4)

where X₀ is F₀-measurable and (bt)_{t≥0} and (σt)_{t≥0} are IR^n-valued and IR^{n×m}-valued, respectively, as well as F-progressively measurable with

∫_0^t ( Σ_{i=1}^n |b^i_s| + Σ_{i=1}^n Σ_{j=1}^m |σ^{ij}_s|² ) ds < ∞ for all t > 0
where

[X, Y]t = ∫_0^t σ^X_s σ^Y_s ds,   t ≥ 0.
If X and Y were defined w.r.t. independent Wiener processes, we would have [X, Y]t = 0.
D Stochastic Differential Equations
D.1 Problem formulation
We want to solve the stochastic differential equation (SDE)

dXt = b(t, Xt) dt + σ(t, Xt) dWt,   X₀ = ξ,      (D.5)
where
b : [0, T ] × IRn → IRn , σ : [0, T ] × IRn → IRn×m
are measurable and W is an m-dimensional Wiener process w.r.t. some filtration F. X is n-dimensional, so (D.5) consists of the SDEs

dX^i_t = b_i(t, Xt) dt + Σ_{j=1}^m σ_{ij}(t, Xt) dW^j_t,   i = 1, . . . , n.
The SDE is said to have a strong solution, if for given probability space
(Ω, F , P ), initial condition ξ and Wiener process W a solution X can be found
which is adapted to the filtration generated by W and ξ (augmented with the
null sets).
The condition in Theorem D.1 is not restrictive enough to avoid so-called explosions of the process. Therefore we also need the additional condition in the following theorem.

Theorem D.2 Suppose that b, σ are Lipschitz continuous and satisfy a linear growth condition, i.e. there exists a constant K > 0 such that for all t ∈ [0, T], x, y ∈ IR^n,

‖b(t, x) − b(t, y)‖ + ‖σ(t, x) − σ(t, y)‖ ≤ K ‖x − y‖,
‖b(t, x)‖² + ‖σ(t, x)‖² ≤ K² (1 + ‖x‖²).
Let the initial condition be given by some random variable ξ which is independent of W and satisfies E‖ξ‖² < ∞.
Then (D.5) has a continuous, strong solution X with

E [ ∫_0^T ‖Xt‖² dt ] < ∞.