MARKOV CHAINS
MARIA CAMERON
Contents
1. Discrete time Markov chains
1.1. Time evolution of the probability distribution
1.2. Communicating classes and irreducibility
1.3. Hitting times and absorption probabilities
1.4. Solving recurrence relationships
1.5. Recurrence and transience
1.6. Invariant distributions and measures
2. Time reversal and detailed balance
3. Markov Chain Monte Carlo methods
3.1. Metropolis and Metropolis-Hastings algorithms
3.2. Ising Model
3.3. MCMC for cryptography
4. Continuous time Markov chains
4.1. Right-continuous random processes
4.2. The exponential distribution
4.3. Jump chains and holding times
4.4. Class structure
4.5. Hitting times and absorption probabilities
4.6. Recurrence and transience
4.7. Invariant distributions and convergence to equilibrium
4.8. Time reversal and detailed balance for continuous-time Markov chains
5. Transition Path Theory
5.1. Settings
5.2. Reactive trajectories
5.3. The forward and backward committors
5.4. Probability distribution of reactive trajectories
5.5. Probability current of reactive trajectories
5.6. Effective current
5.7. Transition rate
5.8. Reaction pathways
5.9. Simplifications for time-reversible Markov chains
5.10. Sampling reactive trajectories and no-detour reactive trajectories
References
Notation. $P_i(X_n = j)$ denotes the probability that the Markov process starting at $i$ at time 0 will reach state $j$ at time $n$:
$$P_i(X_n = j) := P(X_n = j \mid X_0 = i).$$
Theorem 1. Let $(X_n)_{n\ge 0}$ be a Markov chain with initial distribution $\lambda$ and transition matrix $P$. Then for all $n, m \ge 0$:
(1) $P(X_n = j) = (\lambda P^n)_j$;
(2) $P_i(X_n = j) = P(X_{n+m} = j \mid X_m = i) = p_{ij}^{(n)}$.
Proof. (1)
$$P(X_n = j) = \sum_{i_0\in S}\cdots\sum_{i_{n-1}\in S} P(X_n = j,\ X_{n-1} = i_{n-1},\ \ldots,\ X_0 = i_0)$$
$$= \sum_{i_0\in S}\cdots\sum_{i_{n-1}\in S} P(X_n = j \mid X_{n-1} = i_{n-1}, \ldots, X_0 = i_0)\, P(X_{n-1} = i_{n-1}, \ldots, X_0 = i_0)$$
$$= \sum_{i_0\in S}\cdots\sum_{i_{n-1}\in S} P(X_n = j \mid X_{n-1} = i_{n-1})\, P(X_{n-1} = i_{n-1} \mid X_{n-2} = i_{n-2}) \cdots P(X_0 = i_0)$$
$$= \sum_{i_0\in S}\cdots\sum_{i_{n-1}\in S} \lambda_{i_0}\, p_{i_0 i_1} \cdots p_{i_{n-1} j} = (\lambda P^n)_j.$$
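To make part (1) concrete, here is a minimal numerical sketch (the 3-state transition matrix, the initial distribution, and the horizon $n = 5$ are made-up for illustration) showing that propagating the distribution one step at a time agrees with $\lambda P^n$:

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])   # hypothetical transition matrix
lam = np.array([1.0, 0.0, 0.0])   # start deterministically at state 0

dist = lam.copy()
for _ in range(5):
    dist = dist @ P               # one step: lambda P^n -> lambda P^{n+1}

print(dist)                                # P(X_5 = j) for j = 0, 1, 2
print(lam @ np.linalg.matrix_power(P, 5))  # the same, via Theorem 1(1)
```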
1.2. Communicating classes and irreducibility. We say that state $i$ leads to state $j$ (denote it by $i \to j$) if
$$P_i(X_n = j \text{ for some } n \ge 0) > 0.$$
If $i$ leads to $j$ and $j$ leads to $i$, we say that $i$ and $j$ communicate and write $i \leftrightarrow j$. Note that $i$ leads to $j$ if and only if one can find a finite sequence $i_1, \ldots, i_{n-1}$ such that
$$p_{ii_1} > 0,\quad p_{i_1 i_2} > 0,\quad \ldots,\quad p_{i_{n-1} j} > 0.$$
This, in turn, is equivalent to the condition that $p_{ij}^{(n)} > 0$ for some $n$.
The relation $\leftrightarrow$ is an equivalence relation as it is
(1) symmetric: if $i \leftrightarrow j$ then $j \leftrightarrow i$;
(2) reflexive: $i \leftrightarrow i$;
(3) transitive: $i \leftrightarrow j$ and $j \leftrightarrow k$ imply $i \leftrightarrow k$.
Therefore, the set of states is divided into equivalence classes with respect to the relation $\leftrightarrow$ called communicating classes.
Definition 2. We say that a communicating class $C$ is closed if
$$i \in C,\quad i \to j \quad\text{imply}\quad j \in C.$$
Once the chain jumps into a closed class, it stays there forever.
A state i is called absorbing if {i} is a closed class. In the corresponding network, the
vertex i has either only incoming edges, or no incident edges at all.
[Figure: an example Markov chain with five states; the edges are labeled by the transition probabilities.]
1.3. Hitting times and absorption probabilities.

Definition 4. Let $(X_n)_{n\ge 0}$ be a Markov chain with transition matrix $P$. The hitting time of a subset $A \subset S$ is the random variable $\tau^A: \Omega \to \{0, 1, 2, \ldots\} \cup \{\infty\}$ given by
$$\tau^A = \inf\{n \ge 0 \mid X_n \in A\},$$
where we agree that $\inf \emptyset = \infty$.
Definition 5. The probability starting from state $i$ that $(X_n)_{n\ge 0}$ ever hits $A$ is
$$(2)\qquad h_i^A = P_i(\tau^A < \infty).$$
If $A$ is a closed class, $h_i^A$ is called the absorption probability. The mean time taken for $(X_n)_{n\ge 0}$ to reach $A$ starting from $i$ is
$$(3)\qquad k_i^A = E_i[\tau^A] \equiv E[\tau^A \mid X_0 = i] = \sum_{n < \infty} n\, P_i(\tau^A = n) + \infty \cdot P_i(\tau^A = \infty).$$
The quantities $h_i^A$ and $k_i^A$ can be calculated by solving certain systems of linear equations.

Theorem 2. The vector of hitting probabilities $h^A = \{h_i^A \mid i \in S\}$ is the minimal nonnegative solution to the system of linear equations
$$(4)\qquad \begin{cases} h_i^A = 1, & i \in A, \\ h_i^A = \sum_{j\in S} p_{ij} h_j^A, & i \notin A. \end{cases}$$
(Minimality means that if $x = \{x_i \mid i \in S\}$ is another solution with $x_i \ge 0$ for all $i$, then $h_i^A \le x_i$ for all $i$.)
Proof. First we show that the hitting probabilities satisfy Eq. (4). Indeed, if $i \in A$ then $\tau^A = 0$ and hence $P_i(\tau^A < \infty) = 1$. If $i \notin A$, then
$$P_i(\tau^A < \infty) = \sum_{j\in S} P_i(\tau^A < \infty \mid X_1 = j)\, P_i(X_1 = j) = \sum_{j\in S} P_j(\tau^A < \infty)\, p_{ij} = \sum_{j\in S} h_j^A p_{ij}.$$
Now we show that if $x = \{x_i \mid i \in S\}$ is another nonnegative solution of Eq. (4), then $x_i \ge h_i^A$ for all $i \in S$. If $i \in A$ then $h_i^A = x_i = 1$. If $i \notin A$, we have
$$x_i = \sum_{j\in S} p_{ij} x_j = \sum_{j\in A} p_{ij} + \sum_{j\notin A} p_{ij} x_j = \sum_{j\in A} p_{ij} + \sum_{j\notin A} p_{ij} \sum_{k\in S} p_{jk} x_k$$
$$= \sum_{j\in A} p_{ij} + \sum_{j\notin A} p_{ij}\Big(\sum_{k\in A} p_{jk} + \sum_{k\notin A} p_{jk} x_k\Big) = P_i(\tau^A = 1) + P_i(\tau^A = 2) + \sum_{j\notin A}\sum_{k\notin A} p_{ij} p_{jk} x_k.$$
Since $x_j \ge 0$ for all $j \in S$, the last term is nonnegative. Repeating this substitution $n$ times, we obtain
$$x_i \ge P_i(\tau^A \le n) \quad\text{for all } n.$$
Hence
$$x_i \ge \lim_{n\to\infty} P_i(\tau^A \le n) = P_i(\tau^A < \infty) = h_i^A. \qquad\square$$
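The linear system (4) can be solved numerically by restricting it to the complement of $A$. A minimal sketch (the function name, the helper set $Z$ of states where the minimal solution vanishes, and the 4-state example are assumptions for illustration):

```python
import numpy as np

def hitting_probabilities(P, A, Z=()):
    """Minimal nonnegative solution of Eq. (4): h = 1 on A, h = P h off A.

    Sketch: assumes every closed class of the chain lies inside A or Z,
    where Z collects states that cannot reach A (there the minimal
    solution is 0), so the remaining linear system is nonsingular.
    """
    n = P.shape[0]
    C = [i for i in range(n) if i not in set(A) | set(Z)]
    M = np.eye(len(C)) - P[np.ix_(C, C)]       # (I - P_CC) h_C = P_CA 1
    b = P[np.ix_(C, sorted(A))].sum(axis=1)
    h = np.zeros(n)
    h[sorted(A)] = 1.0
    h[C] = np.linalg.solve(M, b)
    return h

# Hypothetical chain with two absorbing states, 0 and 3:
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.3, 0.0, 0.7, 0.0],
              [0.0, 0.6, 0.0, 0.4],
              [0.0, 0.0, 0.0, 1.0]])
print(hitting_probabilities(P, A=[0], Z=[3]))  # ~ [1, 0.517, 0.310, 0]
```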
Theorem 3. The vector of mean hitting times $k^A = \{k_i^A \mid i \in S\}$ is the minimal nonnegative solution to the system of linear equations
$$(5)\qquad \begin{cases} k_i^A = 0, & i \in A, \\ k_i^A = 1 + \sum_{j\in S} p_{ij} k_j^A, & i \notin A. \end{cases}$$
Proof. First we show that the mean hitting times satisfy Eq. (5). Indeed, if $i \in A$ then $k_i^A = 0$ as $\tau^A = 0$. Let us consider two cases.
Case 1: there is $i_* \in S\setminus A$ such that $h_{i_*}^A < 1$.
Case 2: $h_i^A = 1$ for all $i \in S\setminus A$.
In Case 1, Eq. (4) implies that $h_i^A < 1$ for all $i \notin A$ such that $i \to i_*$. For all such $i$, $k_i^A = \infty$ by Eq. (3). Hence Eq. (5) holds. Let us consider Case 2. If $i \notin A$ then
$$k_i^A = E_i[\tau^A] = \sum_{n=1}^{\infty} n\, P(\tau^A = n \mid X_0 = i) = \sum_{n=1}^{\infty} n \sum_{j\in S} P(\tau^A = n \mid X_1 = j, X_0 = i)\, P_i(X_1 = j).$$
We can switch the order of summation because all terms are nonnegative (this follows from the monotone convergence theorem). Also, the Markov property implies that for $i \notin A$,
$$P(\tau^A = n \mid X_1 = j, X_0 = i) = P(\tau^A = n - 1 \mid X_0 = j).$$
We continue:
$$k_i^A = \sum_{j\in S}\sum_{n=1}^{\infty} n\, P(\tau^A = n - 1 \mid X_0 = j)\, p_{ij} = \sum_{j\in S}\Big(\sum_{m=0}^{\infty} (m+1)\, P(\tau^A = m \mid X_0 = j)\Big) p_{ij}$$
$$= \sum_{j\in S}\Big(\sum_{m=0}^{\infty} m\, P(\tau^A = m \mid X_0 = j)\Big) p_{ij} + \sum_{j\in S}\Big(\sum_{m=0}^{\infty} P(\tau^A = m \mid X_0 = j)\Big) p_{ij} = \sum_{j\in S} p_{ij} k_j^A + \sum_{j\in S} p_{ij},$$
where we used that in Case 2, $\sum_{m=0}^{\infty} P(\tau^A = m \mid X_0 = j) = P_j(\tau^A < \infty) = 1$. The last sum equals 1 as it is a row sum of the transition matrix. As a result, we obtain what we desired:
$$k_i^A = 1 + \sum_{j\in S} p_{ij} k_j^A.$$
Now we show that if $y = \{y_i \mid i \in S\}$ with $y_i \ge 0$ for every $i \in S$ is another solution of Eq. (5), then $k_i^A \le y_i$ for all $i \in S$. If $i \in A$, then $k_i^A = y_i = 0$. For $i \notin A$ we have:
$$y_i = 1 + \sum_{j\in S} p_{ij} y_j = 1 + \sum_{j\notin A} p_{ij} y_j = 1 + \sum_{j\notin A} p_{ij}\Big(1 + \sum_{k\notin A} p_{jk} y_k\Big)$$
$$= P_i(\tau^A \ge 1) + P_i(\tau^A \ge 2) + \sum_{j\notin A}\sum_{k\notin A} p_{ij} p_{jk} y_k.$$
Repeating this substitution and using $y_k \ge 0$, we get $y_i \ge P_i(\tau^A \ge 1) + \ldots + P_i(\tau^A \ge n)$ for all $n$, and hence
$$y_i \ge \sum_{n=1}^{\infty} P_i(\tau^A \ge n) = E_i[\tau^A] = k_i^A. \qquad\square$$
Example. A particle performs a random walk on the vertices of the unit cube, jumping from each vertex to any of the three adjacent vertices with probability 1/3, and it disappears upon reaching $(0,0,0)$ or $(1,1,1)$. Find the probability that it disappears at $(0,0,0)$.

[Figure 2. (a): the unit cube graph whose vertices are grouped by the number of unit coordinates into the subsets 0, 1, 2, 3. (b): the reduced Markov chain on these subsets with transition probabilities 1/3 and 2/3.]

Solution 1: Transition probabilities between the subsets 0, 1, 2, and 3 are
shown in Fig. 2(b). Let $P_i$ be the probability for the particle to disappear at $(0,0,0)$ provided that it is initially at the subset of vertices $i$. Then we have:
$$P_0 = 1, \qquad P_1 = \frac13 P_0 + \frac23 P_2, \qquad P_2 = \frac23 P_1 + \frac13 P_3, \qquad P_3 = 0.$$
The solution of this system is $P_0 = 1$, $P_1 = \frac35$, $P_2 = \frac25$, $P_3 = 0$.
Solution 2: Transition probabilities between the subsets 0, 1, 2, and 3 are shown in Fig. 2(b). The probability to get to 0 starting from 1 is the sum of the probabilities to get to 0 from the $n$th visit of 1:
$$P = \sum_{n=1}^{\infty} \frac13\Big(\frac23\cdot\frac23\Big)^{n-1} = \frac{\frac13}{1 - \frac49} = \frac35.$$
Answer: $\frac35$.
Example. Now suppose the particle jumps from each vertex of the cube to any of the vertices connected to it by an edge with equal probability. Find the expected time for a process starting at each vertex to be absorbed at $(0,0,0)$.
Solution: Taking symmetry into account, we define a reduced Markov chain on the subsets 0, 1, 2, 3 of vertices grouped by the number of unit coordinates.

[Figure: the reduced Markov chain; the transition probabilities are 1/3 and 2/3 between neighboring subsets, and 1 from subset 3 to subset 2.]
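For this reduced chain, Eq. (5) is a small linear solve; the sketch below (numpy is an assumed tool, $A = \{0\}$) yields the expected absorption times 7, 9, and 10 from the subsets 1, 2, and 3:

```python
import numpy as np

# Reduced chain: 1 -> 0 w.p. 1/3, 1 -> 2 w.p. 2/3,
# 2 -> 1 w.p. 2/3, 2 -> 3 w.p. 1/3, 3 -> 2 w.p. 1; A = {0} is absorbing.
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [1/3, 0.0, 2/3, 0.0],
              [0.0, 2/3, 0.0, 1/3],
              [0.0, 0.0, 1.0, 0.0]])
C = [1, 2, 3]                     # states off A
k = np.zeros(4)
# Eq. (5) restricted to C: (I - P_CC) k_C = 1
k[C] = np.linalg.solve(np.eye(3) - P[np.ix_(C, C)], np.ones(3))
print(k)                          # -> [0, 7, 9, 10]
```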
1.4. Solving recurrence relationships. In the case where the Markov chain has an
infinite set of states, Z or {0, 1, 2, . . .}, and only transitions between nearest neighbors are
possible, Eqs. (4) and (5) become linear 2nd order recurrence relationships, homogeneous
and nonhomogeneous respectively. A recipe for solving linear recurrence relationships with constant coefficients, homogeneous and nonhomogeneous, can be found, e.g., in a presentation by Niloufar Shafiei.
Second order recurrence relationships can be solved uniquely if one has two initial (boundary) conditions. However, if the set of states is $S = \{0, 1, 2, \ldots\}$ and $A = \{0\}$ (as in the Gambler's ruin Markov chain), Eqs. (4) and (5) have only one boundary condition. The solutions $h^A$ and $k^A$ are then determined by the additional requirement that they must be minimal and nonnegative.
Now we consider the "birth-and-death" Markov chain, where the coefficients of the transition matrix $P$ are
$$p_{00} = 1, \qquad p_{i,i+1} = p_i, \qquad p_{i,i-1} = q_i, \qquad p_i + q_i = 1, \quad i \ge 1.$$
In this chain, 0 is an absorbing state, and we wish to calculate the absorption probability starting from an arbitrary state $i$. Eq. (4) gives:
$$h_0 = 1, \qquad h_i = q_i h_{i-1} + p_i h_{i+1}, \quad i \ge 1.$$
This recurrence relationship cannot be solved by the tools for the case of constant coefficients. However, another technique works in this case. Consider
$$u_i := h_{i-1} - h_i.$$
Subtracting $h_i$ from both sides of $h_i = q_i h_{i-1} + p_i h_{i+1}$ and taking into account that $q_i + p_i = 1$, we get:
$$p_i u_{i+1} = q_i u_i.$$
Therefore,
$$u_{i+1} = \frac{q_i}{p_i} u_i = \frac{q_i q_{i-1}\cdots q_1}{p_i p_{i-1}\cdots p_1} u_1 =: \gamma_i u_1.$$
Then
$$u_1 + u_2 + \ldots + u_i = h_0 - h_1 + h_1 - h_2 + \ldots + h_{i-1} - h_i = h_0 - h_i.$$
Hence, setting $\gamma_0 := 1$,
$$h_i = h_0 - u_1(1 + \gamma_1 + \ldots + \gamma_{i-1}) = 1 - u_1\sum_{j=0}^{i-1}\gamma_j.$$
Suppose $\sum_{j=0}^{\infty}\gamma_j < \infty$. Then the minimal nonnegative solution must satisfy $h_i \to 0$ as $i \to \infty$. This will take place if we set
$$u_1 = \Big(\sum_{j=0}^{\infty}\gamma_j\Big)^{-1}.$$
Then
$$h_i = 1 - \frac{\sum_{j=0}^{i-1}\gamma_j}{\sum_{j=0}^{\infty}\gamma_j} = \frac{\sum_{j=i}^{\infty}\gamma_j}{\sum_{j=0}^{\infty}\gamma_j}.$$
Therefore, the absorption probabilities $h_i < 1$ for $i \ge 1$.
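As a concrete illustration (a standard special case, not worked out in the text above): if the probabilities are constant, $p_i = p > 1/2$ and $q_i = q = 1 - p$, then $\gamma_j = (q/p)^j$, the series $\sum_j \gamma_j$ converges, and
$$h_i = \frac{\sum_{j=i}^{\infty}(q/p)^j}{\sum_{j=0}^{\infty}(q/p)^j} = \Big(\frac{q}{p}\Big)^i,$$
so the chain escapes to infinity with positive probability $1 - (q/p)^i$. If instead $p \le 1/2$, then $\sum_j \gamma_j = \infty$, the only admissible choice is $u_1 = 0$, and the minimal nonnegative solution is $h_i = 1$ for all $i$.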
1.5. Recurrence and transience. Let $T_i = \inf\{n \ge 1 \mid X_n = i\}$ denote the first return time to state $i$, and let $V_i = \sum_{n=0}^{\infty} 1_{\{X_n = i\}}$ denote the total number of visits to $i$. Note that
$$(8)\qquad E_i[V_i] = E_i\Big[\sum_{n=0}^{\infty} 1_{\{X_n = i\}}\Big] = \sum_{n=0}^{\infty} E_i\big[1_{\{X_n = i\}}\big] = \sum_{n=0}^{\infty} P_i(X_n = i) = \sum_{n=0}^{\infty} p_{ii}^{(n)}.$$
Also note that the conditions for a state to be recurrent or transient can be written as
• state $i$ is recurrent if $P_i(V_i = \infty) = 1$;
• state $i$ is transient if $P_i(V_i = \infty) = 0$.
Theorem 4. The following dichotomy holds:
(1) if $P_i(T_i < \infty) = 1$, then $i$ is recurrent and $\sum_{n=0}^{\infty} p_{ii}^{(n)} = \infty$;
(2) if $P_i(T_i < \infty) < 1$, then $i$ is transient and $\sum_{n=0}^{\infty} p_{ii}^{(n)} < \infty$.
In particular, every state is either transient or recurrent.
Proof. (1) Let us denote $P_i(T_i < \infty)$ by $f_i$. First we show that
$$P_i(V_i > r) = f_i^r.$$
Let $T_i^{(r)}$ denote the time of the $r$th return to $i$ (with $T_i^{(0)} = 0$ and $T_i^{(1)} = T_i$), and let $S_i^{(r)} = T_i^{(r)} - T_i^{(r-1)}$ be the length of the $r$th excursion. By the strong Markov property, conditional on $T_i^{(r-1)} < \infty$, the excursion length $S_i^{(r)}$ is distributed as $T_i$ and is independent of the past. Hence
$$P_i(V_i > r) = P_i(T_i^{(r)} < \infty) = P_i(S_i^{(r)} < \infty \mid T_i^{(r-1)} < \infty)\, P_i(T_i^{(r-1)} < \infty)$$
$$= P_i(S_i^{(r)} < \infty \mid T_i^{(r-1)} < \infty)\, P_i(S_i^{(r-1)} < \infty \mid T_i^{(r-2)} < \infty) \cdots P_i(T_i^{(1)} < \infty) = f_i^r.$$
(2) If $f_i = P_i(T_i < \infty) = 1$, then
$$P_i(V_i = \infty) = \lim_{r\to\infty} P_i(V_i > r) = \lim_{r\to\infty} f_i^r = 1.$$
Hence $i$ is recurrent and $\sum_{n=0}^{\infty} p_{ii}^{(n)} = E_i[V_i] = \infty$.
(3) If $f_i = P_i(T_i < \infty) < 1$, then
$$P_i(V_i = \infty) = \lim_{r\to\infty} P_i(V_i > r) = \lim_{r\to\infty} f_i^r = 0.$$
Hence $i$ is transient and
$$\sum_{n=0}^{\infty} p_{ii}^{(n)} = E_i[V_i] = \sum_{r=0}^{\infty} P_i(V_i > r) = \sum_{r=0}^{\infty} f_i^r = \frac{1}{1 - f_i} < \infty. \qquad\square$$
Now I will list some facts about recurrence and transience. I will not prove them. Proofs can be found, e.g., in [1].
• In a communicating class, states are either all transient or all recurrent.
• Every recurrent class is closed.
• Every finite closed class is recurrent.
• For a simple random walk on $\mathbb{Z}$, where the entries of the transition matrix are all zeros except for $p_{i,i+1} = q$, $p_{i,i-1} = 1 - q$, all states are transient if $q \ne 1/2$, and all states are recurrent if $q = 1/2$.
• For a simple symmetric random walk on $\mathbb{Z}^2$, all states are recurrent.
• For a simple symmetric random walk on $\mathbb{Z}^n$, $n \ge 3$, all states are transient.
1.6. Invariant distributions and measures. A measure $\lambda = \{\lambda_i \ge 0 \mid i \in S\}$ is invariant for $P$ if $\lambda P = \lambda$; an invariant distribution is an invariant measure whose entries sum to 1. Suppose the set of states is finite and $p_{ij}^{(n)} \to \pi_j$ as $n \to \infty$ for all $i, j$. Show that $\pi = \pi P$:
$$\pi_j = \lim_{n\to\infty} p_{ij}^{(n)} = \lim_{n\to\infty}\sum_{k\in S} p_{ik}^{(n-1)} p_{kj} = \sum_{k\in S}\lim_{n\to\infty} p_{ik}^{(n-1)} p_{kj} = \sum_{k\in S}\pi_k p_{kj}.$$
Remark. If the set of states is not finite, then one cannot exchange summation and taking the limit. For example, $\lim_{n\to\infty} p_{ij}^{(n)} = 0$ for all $i, j$ for a simple symmetric random walk on $\mathbb{Z}$. The measure $\{\pi_i = 0 \mid i \in \mathbb{Z}\}$ is certainly invariant, but it is not a distribution.
The existence of an invariant distribution does not guarantee convergence to it. For example, consider the two-state Markov chain with transition matrix
$$P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
Its invariant distribution is $\pi = (1/2, 1/2)$, but the chain alternates deterministically between the two states, so $P(X_n = j)$ does not converge unless the initial distribution is $\pi$ itself.
Definition 10. We will call a Markov chain aperiodic if all its states are aperiodic (a state $i$ is aperiodic if $p_{ii}^{(n)} > 0$ for all sufficiently large $n$).
Theorem 7. Suppose that $(X_n)_{n\ge 0}$ is a Markov chain with transition matrix $P$ and initial distribution $\lambda$. Let $P$ be irreducible and aperiodic, and suppose that $P$ has an invariant distribution $\pi$. Then
$$P(X_n = j) \to \pi_j \quad\text{as } n \to \infty \text{ for all } j.$$
In particular,
$$p_{ij}^{(n)} \to \pi_j \quad\text{as } n \to \infty \text{ for all } i, j.$$
A proof of this theorem is found in [1]. In the case where the set of states is finite, this result can be proven by means of linear algebra. A building block of this proof is the Perron-Frobenius theorem.

Theorem 8. Let $A$ be an $N \times N$ matrix with nonnegative entries such that, for some $M$, all entries of $A^m$ are strictly positive for all $m > M$. Then
(1) $A$ has a positive eigenvalue $\lambda_0 > 0$ with a corresponding left eigenvector $x_0$ whose entries are all positive;
(2) if $\lambda \ne \lambda_0$ is any other eigenvalue, then $|\lambda| < \lambda_0$;
(3) $\lambda_0$ has geometric and algebraic multiplicity one.
Let $P$ be the stochastic matrix of a Markov chain with $N$ states. If $P$ is irreducible and aperiodic, then all entries of $P^n$ become positive for all sufficiently large $n$; the proof of this fact is similar to that of Theorem 6. Furthermore, the largest eigenvalue of a stochastic matrix is equal to 1. Indeed, since the row sums of $P$ are ones, $\lambda_0 = 1$ is an eigenvalue with the right eigenvector $e = [1, \ldots, 1]^{\top}$.
Now let us show that the other eigenvalues do not exceed $\lambda_0 = 1$ in absolute value. Let $(\lambda, v)$ be an eigenvalue and a corresponding right eigenvector of a stochastic matrix $P$. We normalize $v$ so that the maximal absolute value of its entries equals 1 and is attained at an entry with $v_i = 1$. Since
$$\lambda v_i = \sum_{k\in S} p_{ik} v_k,$$
we have
$$|\lambda| = \frac{1}{v_i}\Big|\sum_{k\in S} p_{ik} v_k\Big| \le \frac{1}{v_i}\sum_{k\in S} p_{ik}|v_k| \le \sum_{k\in S} p_{ik} = 1.$$
Remark. The fact that the eigenvalues of a stochastic matrix do not exceed 1 in absolute value is also an instance of the Gershgorin circle theorem.
Theorem 9. Every irreducible aperiodic Markov chain with a finite number of states $N$ has a unique invariant distribution $\pi$. Moreover,
$$(9)\qquad \lim_{n\to\infty} qP^n = \pi$$
for any initial distribution $q$.

Proof. The Perron-Frobenius theorem applied to a finite stochastic irreducible aperiodic matrix $P$ implies that the largest eigenvalue of $P$ is $\lambda_0 = 1$ and all other eigenvalues are strictly less than 1 in absolute value. The left eigenvector $\pi$ corresponding to $\lambda_0$ has positive entries and can be normalized so that they sum up to 1. Hence,
$$\pi = \pi P, \qquad \sum_{i=1}^{N} \pi_i = 1.$$
Now let us establish convergence. First we consider the case when $P$ is diagonalizable:
$$P = V\Lambda U,$$
where $\Lambda$ is the diagonal matrix of ordered eigenvalues,
$$\Lambda = \mathrm{diag}(1,\ \lambda_1,\ \ldots,\ \lambda_{N-1}), \qquad 1 > |\lambda_1| \ge \ldots \ge |\lambda_{N-1}|,$$
$V$ is the matrix of right eigenvectors of $P$ ($PV = V\Lambda$) whose first column is $e = [1, \ldots, 1]^{\top}$, and $U = V^{-1}$ is the matrix of left eigenvectors of $P$ ($UP = \Lambda U$) whose first row is $\pi = [\pi_1, \ldots, \pi_N]$. One can check that if $UV = I_N$, these choices of the first column of $V$ and the first row of $U$ are consistent. Therefore, taking into account that $\sum_{i=1}^{N} q_i = 1$, so that the first entry of $qV$ is $qe = 1$, we calculate:
$$\lim_{n\to\infty} qP^n = \lim_{n\to\infty} qV\Lambda^n U = qV\,\mathrm{diag}(1, 0, \ldots, 0)\,U = [1, 0, \ldots, 0]\,U = [\pi_1, \ldots, \pi_N] = \pi,$$
where we used that $\lambda_k^n \to 0$ as $n \to \infty$ for $k = 1, \ldots, N-1$.
In the case when $P$ is not diagonalizable, the argument is almost identical, just a bit more tedious. We consider the Jordan decomposition of $P$,
$$P = VJU,$$
where $U = V^{-1}$ and $J$ is the Jordan form of $P$, i.e., a block-diagonal matrix of the form
$$J = \begin{pmatrix} 1 & & & \\ & J_1 & & \\ & & \ddots & \\ & & & J_r \end{pmatrix},$$
with the first block being the $1\times 1$ matrix $J_0 \equiv 1$ (and, respectively, the first column of $V$ being $[1, \ldots, 1]^{\top}$ and the first row of $U$ being $\pi$, the right and left eigenvectors corresponding to the eigenvalue 1), and the other blocks $J_i$ of sizes $m_i \times m_i$, where $1 \le m_i \le N - 1$ and $m_1 + \ldots + m_r = N - 1$, of the form
$$(10)\qquad J_i = \begin{pmatrix} \lambda_i & 1 & & \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_i \end{pmatrix} =: \lambda_i I_{m_i\times m_i} + E.$$

Exercise. (1) Check that the matrix $E$ in Eq. (10), with ones right above the diagonal and all other entries zero, is nilpotent. More precisely, $E^{m_i} = 0_{m_i\times m_i}$.
(2) Check that the matrices $\lambda_i I_{m_i\times m_i}$ and $E$ commute.
(3) Check that
$$J_i^n = \sum_{k=0}^{m_i - 1}\binom{n}{k}\lambda_i^{n-k} E^k,$$
and conclude that $J_i^n \to 0$ as $n \to \infty$ since $|\lambda_i| < 1$.
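Numerically, the conclusion of Theorem 9 is easy to check; a sketch with a made-up 3-state irreducible aperiodic matrix (the matrix, the initial distribution, and the power 50 are illustrative assumptions):

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])      # hypothetical irreducible aperiodic chain

w, V = np.linalg.eig(P.T)            # left eigenvectors of P = right ones of P^T
pi = np.real(V[:, np.argmax(np.real(w))])
pi /= pi.sum()                       # normalize the PF eigenvector into a distribution

q = np.array([0.2, 0.5, 0.3])        # an arbitrary initial distribution
print(q @ np.linalg.matrix_power(P, 50))   # ~ pi, as Eq. (9) predicts
print(pi)
```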
2. Time reversal and detailed balance

Definition 11. Let $(X_n)_{0\le n\le N}$ be a Markov chain and set $Y_n = X_{N-n}$. The chain $(Y_n)_{0\le n\le N}$ is called the time-reversal of $(X_n)_{0\le n\le N}$.
Definition 12. A stochastic matrix $P$ and a measure $\lambda$ are in detailed balance if
$$\lambda_i p_{ij} = \lambda_j p_{ji} \quad\text{for all } i, j \in S.$$
Suppose the set of states $S$ is finite, the matrix $P$ is irreducible, and the system is distributed according to the invariant distribution $\pi$. The condition of detailed balance means the following. Let $N_{i\to j}(n)$ be the number of transitions from $i$ to $j$ observed by time $n$. Then for all $i, j \in S$,
$$\lim_{n\to\infty} \frac{N_{i\to j}(n)}{N_{j\to i}(n)} = 1,$$
if $P$ is in detailed balance with $\pi$. In words, given detailed balance, over large time intervals one observes on average equal numbers of transitions from $i$ to $j$ and from $j$ to $i$ for all $i, j \in S$.
The detailed balance condition gives us another way to check whether a given measure $\lambda$ is invariant.

Theorem 11. Let $P$ and $\lambda$ be in detailed balance. Then $\lambda$ is invariant for $P$.

Proof.
$$(\lambda P)_i = \sum_{j\in S}\lambda_j p_{ji} = \sum_{j\in S}\lambda_i p_{ij} = \lambda_i.$$
Hence $\lambda P = \lambda$. $\square$
Definition 13. Let $(X_n)_{n\ge 0}$ be $\mathrm{Markov}(\lambda, P)$, where $P$ is irreducible. We say that $(X_n)_{n\ge 0}$ is reversible if, for all $N \ge 1$, $(X_{N-n})_{0\le n\le N}$ is $\mathrm{Markov}(\lambda, P)$.

Theorem 12. Let $P$ be an irreducible stochastic matrix and let $\lambda$ be a distribution. Suppose that $(X_n)_{n\ge 0}$ is $\mathrm{Markov}(\lambda, P)$. Then the following are equivalent:
(1) $(X_n)_{n\ge 0}$ is reversible;
(2) $P$ and $\lambda$ are in detailed balance.

Proof. Both (1) and (2) imply that $\lambda$ is invariant for $P$. Then both (1) and (2) are equivalent to the statement that $\hat P = P$, where $\hat P$, with entries $\hat p_{ij} = \lambda_j p_{ji}/\lambda_i$, is the transition matrix of the time-reversed chain. $\square$
3. Markov Chain Monte Carlo methods

Consider the problem of finding the expected value of $g(\eta)$, where $\eta$ is a random variable with pdf $\pi(x)$, $x \in \mathbb{R}^d$:
$$(11)\qquad E[g(\eta)] = \int_{\mathbb{R}^d} g(x)\pi(x)\,dx.$$
Or, consider the problem of finding the expected value of $g(\eta)$ in the case where $\Omega$ is finite but huge, i.e., $|\Omega| = N$ with $N$ very large. Let $\pi(\omega)$ be the probability distribution on $\Omega$; then
$$(12)\qquad E[g(\eta)] = \sum_{\omega\in\Omega} g(\eta(\omega))\pi(\omega).$$
Note that in both of these cases, one rarely knows $\pi$ per se. Instead, often only a measure $f$ proportional to $\pi$ is known. For example, think about the canonical pdf for $n$ particles in 3D:
$$\mu(x, p) = \frac{1}{Z} e^{-\beta(V(x) + |p|^2/2)}, \qquad Z = \int_{\mathbb{R}^{6n}} e^{-\beta(V(x) + |p|^2/2)}\,dx\,dp.$$
The normalization constant $Z$, except for some simple cases, cannot be evaluated analytically. Therefore, $\mu(x, p)$ is, strictly speaking, unknown. However, for each $(x, p)$ one can calculate the measure
$$f(x, p) = e^{-\beta(V(x) + |p|^2/2)},$$
which is proportional to $\mu(x, p)$.
Therefore, the problem is two-fold:
• The expected value is hard to evaluate due to either the high dimensionality of the integral, which makes numerical quadrature methods inapplicable, or the huge number of summands in the sum (think about numbers like $N = 2^n$ where $n \sim 10^k$, $k = 2, 3, 4, \ldots$). Moreover, $\pi$ is often far from uniform, and some kind of importance sampling is necessary to obtain a satisfactory estimate using a reasonable number of samples of $\eta$.
• The pdf or the probability distribution $\pi$ is unknown. Instead, $f$, which is proportional to $\pi$, is given.
3.1. Metropolis and Metropolis-Hastings algorithms. In order to make $P$ have the desired invariant measure $f$, it suffices to pick $P$ in detailed balance with the measure $f$, i.e., the transition probabilities should satisfy
$$f_i p_{ij} = f_j p_{ji}.$$
Such a transition matrix is constructed in two steps. As A. Chorin puts it, first do something stupid and then improve it.
(1) Suppose at time $n$, $X_n = k$. Propose a move from state $k$ according to some irreducible aperiodic transition matrix $Q = (q_{ij})_{i,j\in S}$ made up by you. In the original Metropolis algorithm, the matrix $Q$ must be symmetric, i.e., $q_{ij} = q_{ji}$. Suppose the proposed move is from state $k$ to state $l$.
(2) To guarantee that the condition $f_i p_{ij} = f_j p_{ji}$ holds, accept the proposed move with the probability
$$(13)\qquad \alpha = \min\Big\{\frac{f_l}{f_k},\ 1\Big\}.$$
I.e., if the proposed state $l$ is more likely than the current state $k$, move to the new state. Otherwise, move there with probability $f_l/f_k$ or stay at state $k$ with probability $1 - f_l/f_k$.
As a result, the transition probabilities $p_{ij}$ are given by
$$(14)\qquad p_{ij} = q_{ij}\min\Big\{\frac{f_j}{f_i},\ 1\Big\},\quad j \ne i, \qquad p_{ii} = 1 - \sum_{j\ne i} q_{ij}\min\Big\{\frac{f_j}{f_i},\ 1\Big\}.$$
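A minimal sketch of the algorithm just described, on a made-up 4-state space (the weights $f$, the uniform symmetric proposal, and the chain length are illustrative assumptions); the empirical visit frequencies should approach $f/\sum_k f_k$:

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.array([1.0, 4.0, 2.0, 3.0])     # hypothetical unnormalized weights
N = len(f)

def metropolis_step(k):
    l = rng.integers(N - 1)
    l = l if l < k else l + 1           # symmetric proposal: uniform over l != k
    alpha = min(f[l] / f[k], 1.0)       # acceptance probability, Eq. (13)
    return l if rng.random() < alpha else k

x, counts = 0, np.zeros(N)
for _ in range(100_000):
    x = metropolis_step(x)
    counts[x] += 1
print(counts / counts.sum())            # ~ f / f.sum() = [0.1, 0.4, 0.2, 0.3]
```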
3.2. Ising Model. A description of the Ising model is found in A. Chorin's and O. Hald's book (see Section 5.4 in the 2nd edition). The 2D Ising model is a popular toy example for learning the Metropolis algorithm. Due to its simplicity, an interesting theoretical analysis of this model has been possible, and numerous internet resources describing its behavior near the critical temperature can be readily found.
The Ising model is also considered on other kinds of lattices and on graphs. Variations of the Ising model are used, for example, to model opinion dynamics.
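A sketch of single-spin-flip Metropolis for the 2D Ising model with periodic boundary conditions (the lattice size, inverse temperature, and sweep count are made-up parameters). The point is that only the ratio $f_l/f_k = e^{-\beta\Delta E}$ enters the acceptance test, so the partition function is never needed:

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta = 32, 0.44                       # lattice size; beta near the critical value
s = rng.choice([-1, 1], size=(n, n))     # random initial spin configuration

def sweep(s):
    for _ in range(n * n):
        i, j = rng.integers(n, size=2)
        nb = s[(i+1) % n, j] + s[(i-1) % n, j] + s[i, (j+1) % n] + s[i, (j-1) % n]
        dE = 2.0 * s[i, j] * nb          # energy change if spin (i, j) is flipped
        if rng.random() < np.exp(-beta * dE):
            s[i, j] *= -1                # accept with probability min(1, e^{-beta dE})

for _ in range(100):
    sweep(s)
print(abs(s.mean()))                     # magnetization per spin
```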
3.3. MCMC for cryptography. The Metropolis algorithm can also be used to decipher encrypted messages. A popular description of this application can be found in a paper by P. Diaconis.
4.2. The exponential distribution. We interpret the absolute values of the diagonal entries of the generator matrix $L$ as the escape rates: $L_i$ is the escape rate from the state $i$. Correspondingly, if $L_i > 0$, $L_i^{-1}$ is the expected holding time at state $i$. The off-diagonal entries $L_{ij}$, $i \ne j$, of $L$ are often called the pairwise transition rates.
Now we will go over some important properties of exponential random variables. Suppose the jump time $T$ from $i$ to $j$ is an exponentially distributed random variable with parameter $L_i$, i.e.,
$$P(T > t) = e^{-L_i t} \quad\text{for all } t \ge 0.$$
If $L_i > 0$, then $T$ has pdf
$$f_T(t) = \begin{cases} L_i e^{-L_i t}, & t \ge 0, \\ 0, & t < 0. \end{cases}$$
The expected value of $T$ is
$$E[T] = \int_0^{\infty} L_i e^{-L_i t}\, t\, dt = \frac{1}{L_i},$$
i.e., $L_i$ is the reciprocal of the expected jump time.
Why did we choose the exponential distribution for the jump times? The reason is that
the exponential random variable is the only random variable that possesses the memoryless
property.
Theorem 13. A random variable T : Ω → [0, ∞] has an exponential distribution if and
only if it has the following memoryless property:
(18) P(T > t + s | T > s) = P(T > t) for all s, t ≥ 0.
Exercise. Prove it.
Exercise. Show that if $T$ is an exponential random variable with parameter $\lambda$ and $a > 0$, then $T/a$ is an exponential random variable with parameter $\lambda a$.
Theorem 14. Let $S$ be a countable set and let $T_k$, $k \in S$, be independent exponential random variables with parameters $q_k$. Let
$$0 < q := \sum_{k\in S} q_k < \infty.$$
Set
$$T := \inf_{k\in S} T_k.$$
Then this infimum is attained at a unique random value $K \in S$ with probability 1. Moreover, $T$ and $K$ are independent, $T$ is exponential with parameter $q$, and
$$P(K = k) = \frac{q_k}{q}.$$
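A quick numerical check of Theorem 14 (the three rates and the sample size are made-up): the minimum of the $T_k$ is exponential with parameter $q = \sum_k q_k$, and the index of the minimum has distribution $q_k/q$.

```python
import numpy as np

rng = np.random.default_rng(0)
rates = np.array([1.0, 2.0, 3.0])                    # hypothetical parameters q_k
samples = rng.exponential(1.0 / rates, size=(100_000, 3))

T = samples.min(axis=1)                              # T = inf_k T_k
K = samples.argmin(axis=1)                           # index attaining the infimum
print(T.mean(), 1.0 / rates.sum())                   # both ~ 1/6
print(np.bincount(K) / len(K), rates / rates.sum())  # both ~ [1/6, 1/3, 1/2]
```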
4.3. Jump chains and holding times. Given a generator matrix $L$, one can define the jump matrix $\Pi = (\pi_{ij} \mid i, j \in S)$ as follows:
$$(19)\qquad \pi_{ij} = \begin{cases} \dfrac{L_{ij}}{L_i}, & j \ne i \text{ and } L_i \ne 0, \\ 0, & j \ne i \text{ and } L_i = 0, \end{cases} \qquad \pi_{ii} = \begin{cases} 0, & L_i \ne 0, \\ 1, & L_i = 0. \end{cases}$$
Now we can give a definition of a continuous-time Markov chain in terms of its jump chain and holding times.

Definition 16. A right-continuous process $(X_t)_{t\ge 0}$ on $S$ is a continuous-time Markov chain with initial distribution $\lambda$ and generator matrix $L$ if its jump chain is a discrete-time Markov chain $(Y_n)_{n\ge 0}$ with initial distribution $\lambda$ and transition matrix $\Pi$ defined from $L$ by Eq. (19), and if for each $n \ge 1$, conditional on $Y_0, \ldots, Y_{n-1}$, its holding times $S_1, \ldots, S_n$ are independent exponential random variables with parameters $L_{Y_0}, \ldots, L_{Y_{n-1}}$, respectively.
Given a discrete-time Markov chain $(Y_n)_{n\ge 0}$ with initial distribution $\lambda$ and transition matrix $\Pi$, and independent exponential random variables $T_1, T_2, \ldots$ with parameter 1, independent of $(Y_n)_{n\ge 0}$, one can construct a continuous-time random chain $(X_t)_{t\ge 0}$ with the same set of states and the same initial distribution as follows. Set the holding times and the jump times according to
$$S_n = \frac{T_n}{L_{Y_{n-1}}}, \qquad J_n = S_1 + \ldots + S_n,$$
and define
$$X_t = Y_n, \quad\text{where } n \text{ is such that } J_n \le t < J_{n+1}.$$
Given a continuous-time Markov chain $(X_t)_{t\ge 0}$ with initial distribution $\lambda$ and generator matrix $L$, one can construct a discrete-time random chain $(Y_n)_{n\ge 0}$ with the same set of states and the same initial distribution as follows. We begin with an initial state $X_0 = Y_0$.

[Figure: left, a continuous-time Markov chain on four states with edges labeled by the pairwise transition rates; right, the corresponding jump chain with edges labeled by the transition probabilities.]
The continuous-time Markov chain on the left has the generator matrix $L$ given by
$$L = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 7 & -15 & 2 & 6 \\ 0 & 1 & -5 & 4 \\ 0 & 5 & 3 & -8 \end{pmatrix}.$$
Using Eq. (19) we calculate the jump probabilities for the corresponding jump chain on the right:
$$\Pi = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 7/15 & 0 & 2/15 & 6/15 \\ 0 & 1/5 & 0 & 4/5 \\ 0 & 5/8 & 3/8 & 0 \end{pmatrix}.$$
Note that any jump chain is a discrete-time Markov chain in which the probability to stay at any state, unless it is an absorbing state, is zero. Vice versa, given a jump matrix $\Pi$ and the escape rates $L_i$, $i \in S$, one can construct the generator matrix $L$ for the corresponding continuous-time Markov chain.
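A sketch of the conversion in Eq. (19) (the function name is an assumption), checked against the example above:

```python
import numpy as np

def generator_to_jump(L):
    """Jump matrix Pi of Eq. (19) for a generator matrix L."""
    n = L.shape[0]
    Pi = np.zeros_like(L, dtype=float)
    for i in range(n):
        Li = -L[i, i]                    # escape rate from state i
        if Li > 0:
            Pi[i] = L[i] / Li            # pi_ij = L_ij / L_i for j != i
            Pi[i, i] = 0.0
        else:
            Pi[i, i] = 1.0               # absorbing state: pi_ii = 1
    return Pi

L = np.array([[0, 0, 0, 0],
              [7, -15, 2, 6],
              [0, 1, -5, 4],
              [0, 5, 3, -8]], dtype=float)
print(generator_to_jump(L))              # reproduces the matrix Pi above
```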
Theorem 15. Let $(X_t)_{t\ge 0}$ be a right-continuous random process with values in a finite set $S$. Let $L$ be a generator matrix on $S$ with jump matrix $\Pi$. Then the following three conditions are equivalent:
(1) (jump chain/holding time definition) conditional on $X_0 = i$, the jump chain $(Y_n)_{n\ge 0}$ of $(X_t)_{t\ge 0}$ is a discrete-time Markov chain and for each $n \ge 1$, conditional on $Y_0, \ldots, Y_{n-1}$, the holding times $S_1, \ldots, S_n$ are independent exponential random variables with parameters $L_{Y_0}, \ldots, L_{Y_{n-1}}$, respectively;
(2) (infinitesimal definition) for all $t, h \ge 0$, conditional on $X_t = i$, $X_{t+h}$ is independent of $(X_s \mid s < t)$ and, as $h \downarrow 0$, uniformly in $t$, for all $j$,
$$P(X_{t+h} = j \mid X_t = i) = \delta_{ij} + L_{ij}h + o(h),$$
where $\delta_{ij}$ is the Kronecker symbol:
$$\delta_{ij} = \begin{cases} 1, & i = j, \\ 0, & i \ne j; \end{cases}$$
(3) (transition probability definition) for all $n = 0, 1, 2, \ldots$, all times $0 < t_0 < t_1 < \ldots < t_{n+1}$, and all states $i_0, \ldots, i_{n+1}$,
$$P(X_{t_{n+1}} = i_{n+1} \mid X_{t_0} = i_0, \ldots, X_{t_n} = i_n) = P(X_{t_{n+1}} = i_{n+1} \mid X_{t_n} = i_n)$$
$$(23)\qquad = p_{i_n i_{n+1}}(t_{n+1} - t_n),$$
where $\{p_{ij}(t) \mid i, j \in S,\ t \ge 0\}$ is the solution of the forward equation
$$P'(t) = P(t)L, \qquad P(0) = I.$$
If (Xt )t≥0 satisfies any of these three conditions, we say that it is a continuous-time
Markov chain with the generator matrix L.
Proof. (1) Suppose (1) holds and prove (2). As $h \downarrow 0$,
$$P_i(X_h = i) \ge P_i(J_1 > h) = e^{-L_i h} = 1 + L_{ii}h + o(h)$$
(recall that $L_{ii} = -L_i$), and for $j \ne i$,
$$P_i(X_h = j) \ge P_i(J_1 \le h,\ Y_1 = j,\ S_2 > h) = (1 - e^{-L_i h})\pi_{ij} e^{-L_j h} = L_{ij}h + o(h).$$
Thus, for every state $j$ there is an inequality
$$P_i(X_h = j) \ge \delta_{ij} + L_{ij}h + o(h).$$
By taking the finite sum over $j$ we see that these must be equalities: the left-hand sides sum up to 1 while the right-hand sides sum up to $1 + o(h)$. Then, by the Markov property, for any $t, h \ge 0$, conditional on $X_t = i$, $X_{t+h}$ is independent of $(X_s \mid s \le t)$ and, as $h \downarrow 0$, uniformly in $t$,
$$P(X_{t+h} = j \mid X_t = i) = P_i(X_h = j) = \delta_{ij} + L_{ij}h + o(h).$$
(2) Suppose (2) holds and prove (3). Set $p_{ij}(t) = P_i(X_t = j) = P(X_t = j \mid X_0 = i)$. For all $t, h \ge 0$, as $h \downarrow 0$, uniformly in $t$,
$$p_{ij}(t + h) = \sum_{k\in S} P_i(X_t = k)\, P(X_{t+h} = j \mid X_t = k) = \sum_{k\in S} p_{ik}(t)\big(\delta_{kj} + L_{kj}h + o(h)\big).$$
So, letting $h \downarrow 0$, we see that $p_{ij}(t)$ is differentiable on the right. Then by uniformity we replace $t$ by $t - h$ in the above and let $h \downarrow 0$ to see first that $p_{ij}(t)$ is continuous on the left, then differentiable on the left, hence differentiable, and satisfies the forward equations
$$p'_{ij}(t) = \sum_{k\in S} p_{ik}(t) L_{kj}, \qquad p_{ij}(0) = \delta_{ij}.$$
Since $S$ is finite, $p_{ij}(t)$ is the unique solution of the forward equation. Also, if (2) holds then Eq. (23) holds.
(3) Suppose (3) holds and prove (1). Condition (3) determines the finite-dimensional distributions of $(X_t)_{t\ge 0}$, and hence the distribution of the jump chain and the holding times. Hence condition (1) is satisfied. $\square$
4.4. Class structure. The concepts $i$ leads to $j$ ($i \to j$), $i$ communicates with $j$ ($i \leftrightarrow j$), communicating class, closed class, absorbing state, and irreducibility are inherited from discrete-time Markov chains.
Theorem 16. For distinct states $i, j \in S$ the following are equivalent:
(1) $i \to j$;
(2) $i \to j$ for the jump chain;
(3) $L_{ii_1} L_{i_1 i_2} \cdots L_{i_{n-1} j} > 0$ for some states $i_1, \ldots, i_{n-1}$;
(4) $p_{ij}(t) > 0$ for all $t > 0$;
(5) $p_{ij}(t) > 0$ for some $t > 0$.

Proof. Implications (4) ⇒ (5) ⇒ (1) ⇒ (2) are clear. If (2) holds, then there are states $i_1, \ldots, i_{n-1}$ such that $\pi_{ii_1}\pi_{i_1 i_2}\cdots\pi_{i_{n-1} j} > 0$, which implies (3).
If $L_{ij} > 0$, then
$$p_{ij}(t) \ge P_i(J_1 \le t,\ Y_1 = j,\ S_2 > t) = (1 - e^{-L_i t})\pi_{ij} e^{-L_j t} > 0 \quad\text{for all } t > 0.$$
So, if (3) holds, then
$$p_{ij}(t) \ge p_{ii_1}(t/n)\, p_{i_1 i_2}(t/n)\cdots p_{i_{n-1} j}(t/n) > 0 \quad\text{for all } t > 0.$$
Hence (4) holds. $\square$
4.5. Hitting times and absorption probabilities. The hitting time $\tau^A$ of a set $A \subset S$ and the hitting probabilities $h_i^A = P_i(\tau^A < \infty)$ are defined as for discrete-time chains.

Theorem 17. The vector of hitting probabilities $h^A = \{h_i^A \mid i \in S\}$ is the minimal nonnegative solution of
$$(24)\qquad \begin{cases} h_i^A = 1, & i \in A, \\ \sum_{j\in S} L_{ij} h_j^A = 0, & i \notin A. \end{cases}$$

Proof. Since the hitting probabilities are the same for continuous-time Markov chains and the corresponding jump chains, we will start with Eq. (4), where $\pi_{ij}$ are now the transition probabilities of the jump chain. Consider $i \notin A$. First assume $L_i > 0$. Then $\pi_{ij} = L_{ij}/L_i$ for $j \ne i$ and $\pi_{ii} = 0$. Taking into account that $L_{ii} = -L_i$, we have:
$$h_i^A = \sum_{j\in S}\pi_{ij} h_j^A = \sum_{j\ne i}\frac{L_{ij}}{L_i} h_j^A = \sum_{j\in S}\frac{L_{ij}}{L_i} h_j^A - \frac{L_{ii}}{L_i} h_i^A = \sum_{j\in S}\frac{L_{ij}}{L_i} h_j^A + h_i^A.$$
Canceling $h_i^A$ on the right- and left-hand sides and multiplying through by $L_i$, we get
$$\sum_{j\in S} L_{ij} h_j^A = 0.$$
If $L_i = 0$, then $i$ is an absorbing state, $\pi_{ii} = 1$, and $\pi_{ij} = 0$ for $j \ne i$. Then Eq. (4) gives $h_i^A = h_i^A$, while Eq. (24) gives $0 = 0$. $\square$
The mean hitting times $k_i^A$ are defined as they were for discrete-time chains:
$$k_i^A = E_i[\tau^A] = E[\tau^A \mid X_0 = i].$$

Theorem 18. Assume that $L_i > 0$ for all $i \notin A$. The vector of mean hitting times $k^A = \{k_i^A \mid i \in S\}$ is the minimal nonnegative solution of
$$(25)\qquad \begin{cases} k_i^A = 0, & i \in A, \\ \sum_{j\in S} L_{ij} k_j^A = -1, & i \notin A. \end{cases}$$
The proof follows the same lines as the one for the case of discrete-time Markov chains. Here we will only verify that Eq. (25) must be satisfied. If $X_0 \in A$ then $\tau^A = 0$ and hence $k_i^A = 0$. If $X_0 = i \notin A$ then $\tau^A \ge J_1$, so, by the Markov property of the jump chain,
$$E_i[\tau^A - J_1 \mid Y_1 = j] = E_j[\tau^A].$$
Therefore,
$$k_i^A = E_i[\tau^A] = E_i[J_1] + \sum_{j\ne i} E_i[\tau^A - J_1 \mid Y_1 = j]\, P_i(Y_1 = j) = L_i^{-1} + \sum_{j\ne i}\pi_{ij} k_j^A$$
$$= L_i^{-1} + \sum_{j\ne i}\frac{L_{ij}}{L_i} k_j^A = L_i^{-1} + \sum_{j\in S}\frac{L_{ij}}{L_i} k_j^A + k_i^A.$$
Canceling $k_i^A$ on both sides and multiplying through by $L_i$, we arrive at $\sum_{j\in S} L_{ij} k_j^A = -1$.
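Numerically, Eq. (25) is again a linear solve on the complement of $A$; a minimal sketch (the 3-state generator is a made-up example satisfying the assumptions of Theorem 18):

```python
import numpy as np

def mean_hitting_times(L, A):
    """Solve Eq. (25): k = 0 on A and L_CC k_C = -1 on the complement C."""
    n = L.shape[0]
    C = [i for i in range(n) if i not in set(A)]
    k = np.zeros(n)
    k[C] = np.linalg.solve(L[np.ix_(C, C)], -np.ones(len(C)))
    return k

L = np.array([[0.0, 0.0, 0.0],
              [2.0, -3.0, 1.0],
              [0.0, 4.0, -4.0]])          # state 0 is absorbing
print(mean_hitting_times(L, A=[0]))       # -> [0, 0.625, 0.875]
```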
4.7. Invariant distributions and convergence to equilibrium. The following argument establishes that an irreducible generator matrix $L$ on a finite set of states has a unique invariant distribution $\pi$ (i.e., $\pi L = 0$), and that $qP(t) \to \pi$ as $t \to \infty$ for any initial distribution $q$, where $P(t) = e^{tL}$.

Proof. Existence and uniqueness of the invariant distribution follow from their existence and uniqueness for the matrix $P(t) = e^{tL}$. Note that all entries of $P(t)$ are positive for all $t > 0$ owing to irreducibility and Theorem 16. Applying the Perron-Frobenius theorem (Theorem 8), we conclude that $P(t)$ has an eigenvalue $\mu_0 = 1$ whose algebraic multiplicity is equal to 1 and $\mu_0 > |\mu_k|$ for all other eigenvalues $\mu_k$ of $P(t)$. The unique invariant distribution $\pi$ is the left eigenvector of $P(t)$ corresponding to the eigenvalue $\mu_0 = 1$, i.e.,
$$\pi = \pi P(t).$$
By Theorem 22, this is equivalent to $\pi L = 0$. Hence there exists a unique invariant distribution $\pi$ such that $\pi L = 0$.
Now we will show convergence to $\pi$. We conduct the proof for the case when $L$ is diagonalizable. If it is not, the proof is similar but a bit more tedious (see the proof of the analogous theorem, Theorem 9, for discrete-time Markov chains). Let $\lambda_0 = 0, \lambda_1, \ldots, \lambda_{N-1}$ be the eigenvalues of $L$. If
$$L = V\Lambda U$$
is the eigendecomposition of $L$, then $P(t)$ has the eigendecomposition
$$P(t) \equiv e^{tL} = V\,\mathrm{diag}\big(1,\ e^{\lambda_1 t},\ \ldots,\ e^{\lambda_{N-1} t}\big)\,U.$$
Since $|e^{\lambda_k t}| < 1$ for $t > 0$ and $k \ge 1$, we have $\mathrm{Re}\,\lambda_k < 0$, so $e^{\lambda_k t} \to 0$ as $t \to \infty$, and the same computation as in the proof of Theorem 9 yields $qP(t) \to \pi$ for any initial distribution $q$. $\square$
4.8. Time reversal and detailed balance for continuous-time Markov chains. For continuous-time Markov chains, analogous results on time reversal and detailed balance hold. Let $L$ be an irreducible generator matrix and $\pi$ an invariant distribution. The generator matrix $\hat L$ for the time-reversal is defined by
$$\pi_i \hat L_{ij} = \pi_j L_{ji}.$$
The detailed balance condition reads
$$\pi_i L_{ij} = \pi_j L_{ji}.$$
Analogs of Theorems 10, 11, and 12 hold.
5. Transition Path Theory

5.1. Settings. We will consider a continuous-time Markov chain with a finite set of states $S$, $|S| = N$, and an irreducible generator matrix $L$. This Markov chain can be represented as a network, where states correspond to the vertices of a graph, and two vertices $i$ and $j$ are connected by a directed edge if and only if $L_{ij} > 0$. If both $L_{ij} > 0$ and $L_{ji} > 0$, we will draw an undirected edge between $i$ and $j$.
Let $A$ and $B$ be selected nonintersecting subsets of $S$. For simplicity, we assume that there exists no edge $(i, j)$ such that $i \in A$ and $j \in B$, i.e., one cannot get from $A$ to $B$ without spending some time in $S\setminus(A\cup B) \equiv (A\cup B)^c$. The sets $A$ and $B$ can be interpreted as the reactant set and the product set, respectively. For example, if you are modeling protein folding, $A$ can be a collection of unfolded states, while $B$ is a collection of folded states.
There exists a unique invariant distribution $\pi = (\pi_i)_{i\in S}$, i.e., $\pi L = 0$. We do not assume that $\pi$ and $L$ are in detailed balance. We will also need to consider a family of time-reversed chains $(\hat X_t)_{t\in\mathbb{R}}$, $\hat X_t = X_{\tau - t}$, where $\tau$ is some moment of time. The generator matrix for the time-reversed process is $\hat L = (\hat L_{ij})_{i,j\in S}$ defined by
$$\hat L_{ij} = \frac{\pi_j}{\pi_i} L_{ji}.$$
5.2. Reactive trajectories. The subject of TPT is reactive trajectories, which are defined as follows. Consider a very long trajectory starting from an arbitrary state $i$, i.e., $(X_t)_{t\in\mathbb{R}}$ such that $X_0 = i$. Since the Markov chain is irreducible and finite, every state is recurrent. Hence this trajectory will visit all of the states infinitely many times with probability 1. Let us prune those pieces of it that go from $A$ to $B$, i.e., we will detect the collections of moments of time $\{t_n^A\}_{n\in\mathbb{N}}$ and $\{t_n^B\}_{n\in\mathbb{N}}$ such that
$$t_n^A < t_n^B < t_{n+1}^A, \quad n \in \mathbb{N},$$
$$\lim_{t\uparrow t_n^A} X(t) = x_n^A \in A, \qquad X(t_n^B) = x_n^B \in B,$$
and $X(t) \in (A\cup B)^c$ for $t \in (t_n^A, t_n^B)$. The pieces $(X_t)_{t\in(t_n^A,\, t_n^B)}$ are called reactive trajectories.
[Figure: an example network with 26 states; the sets A and B are marked, and the reactive pieces of a long trajectory travel from A to B through $(A\cup B)^c$.]
5.3. The forward and backward committors. The forward committor $q_i^+$ is the probability that the process starting at state $i$ will first reach $B$ rather than $A$; the backward committor $q_i^-$ is the probability that the process arriving at state $i$ last came from $A$ rather than $B$. They satisfy
$$(27)\qquad \begin{cases} \sum_{j\in S} L_{ij} q_j^+ = 0, & i \in (A\cup B)^c, \\ q_i^+ = 0, & i \in A, \\ q_i^+ = 1, & i \in B, \end{cases}$$
and
$$(28)\qquad \begin{cases} \sum_{j\in S} \hat L_{ij} q_j^- = 0, & i \in (A\cup B)^c, \\ q_i^- = 1, & i \in A, \\ q_i^- = 0, & i \in B, \end{cases}$$
where $\hat L$ is the generator matrix for the time-reversed process.
Eq. (27) is justified as follows. Let us modify our network and make all states in $A$ absorbing, i.e., set $L_{ij} = 0$ for all $i \in A$. The other $L_{ij}$'s are unchanged. Then Eq. (27) becomes the equation for the hitting probabilities of the set $B$ for the modified network. I.e., $q_i^+$ is the probability that the process starting at $i$ will hit $B$ prior to being absorbed by one of the states in $A$. This is exactly what the forward committor is. A similar argument applied to the reversed process shows that the backward committor satisfies Eq. (28).
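Both committors are hitting probabilities, so they are computed by linear solves; a sketch (the function names are assumptions; assumes $L$ irreducible and $\pi$ with $\pi L = 0$, so the restricted systems are nonsingular):

```python
import numpy as np

def committor(G, A, B):
    """Solve sum_j G_ij q_j = 0 on (A u B)^c with q = 0 on A and q = 1 on B."""
    n = G.shape[0]
    C = [i for i in range(n) if i not in set(A) | set(B)]
    q = np.zeros(n)
    q[list(B)] = 1.0
    rhs = -G[np.ix_(C, list(B))].sum(axis=1)   # move boundary values to the rhs
    q[C] = np.linalg.solve(G[np.ix_(C, C)], rhs)
    return q

def forward_backward_committors(L, pi, A, B):
    q_plus = committor(L, A, B)                # Eq. (27)
    Lhat = L.T * pi[None, :] / pi[:, None]     # generator of the time reversal
    q_minus = committor(Lhat, B, A)            # Eq. (28): q^- = 1 on A, 0 on B
    return q_plus, q_minus
```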
5.4. Probability distribution of reactive trajectories. What is the probability to find a reactive trajectory at state $i$ at any time $t$? To answer this question, consider an infinitely long trajectory $(X_t)_{t\in\mathbb{R}}$ where $X_0$ is distributed according to the invariant distribution $\pi$. For any fixed time $t$, the probability to find $X_t$ at state $i$ is $\pi_i$. If $X_t = i$ where $i \in A$ or $i \in B$, then $t$ cannot be a reactive time. For $i \in (A\cup B)^c$, the time $t$ is reactive if, in addition, the trajectory last came from $A$ and will proceed to $B$, which happens with probability $q_i^- q_i^+$. Hence the probability to find a reactive trajectory at state $i$ at time $t$ is
$$m_i^R = \pi_i q_i^- q_i^+.$$
In [11], $m_i^R$ is called the probability distribution of reactive trajectories. Note that $m^R$ is not a distribution, as it is not normalized; it is a measure. The normalization constant for $m_i^R$,
$$Z^R = \sum_{i\in S} m_i^R = \sum_{i\in S} \pi_i q_i^- q_i^+,$$
is the probability that any given $t$ belongs to the set of reactive times, i.e.,
$$Z^R = P\Big(t \in \bigcup_{n}(t_n^A, t_n^B)\Big) \equiv P(t \in R).$$
5.5. Probability current of reactive trajectories. The probability current of reactive trajectories is defined by
$$f_{ij} = \pi_i q_i^- L_{ij} q_j^+, \quad i \ne j, \qquad f_{ii} = 0.$$
This current is conserved at every state $i \in (A\cup B)^c$. Indeed, Eq. (27) gives $\sum_{j\ne i} L_{ij} q_j^+ = -L_{ii} q_i^+ = L_i q_i^+$, so that
$$\sum_{j\ne i} f_{ij} = \pi_i q_i^- L_i q_i^+,$$
and
$$\sum_{j\ne i} \pi_j L_{ji} q_j^- = \sum_{j\ne i} \pi_i \hat L_{ij} q_j^- = \pi_i \hat L_i q_i^- = \pi_i L_i q_i^-,$$
where we used Eq. (28) and the fact that the diagonal entries of $\hat L$ and $L$ coincide. Therefore,
$$\sum_{j\in S}(f_{ij} - f_{ji}) = \pi_i q_i^- L_i q_i^+ - q_i^+ \pi_i L_i q_i^- = 0.$$
5.6. Effective current. As we have mentioned, the reactive trajectories can go back and forth along an edge $(i, j)$, where $i, j \in (A\cup B)^c$, on their way from $A$ to $B$, making both $f_{ij}$ and $f_{ji}$ positive. The difference $f_{ij} - f_{ji}$ is the net current from $i$ to $j$ carried by reactive trajectories. The nonnegative part of $f_{ij} - f_{ji}$, denoted by $f_{ij}^+$, is called the effective current:
$$(32)\qquad f_{ij}^+ := \max\{f_{ij} - f_{ji},\ 0\}.$$
5.7. Transition rate. The transition rate from $A$ to $B$ (the reaction rate) is the average number of transitions from $A$ to $B$ per unit time performed by an infinite trajectory $(X_t)_{t\in\mathbb{R}}$. It is equal to the total reactive current coming out of $A$, which is the same as the total reactive current going into $B$, i.e.,
$$(33)\qquad \nu_R = \sum_{i\in A,\, j\in S} f_{ij} = \sum_{i\in A,\, j\in S} f_{ij}^+ = \sum_{i\in S,\, j\in B} f_{ij} = \sum_{i\in S,\, j\in B} f_{ij}^+.$$
One can obtain another expression for the reaction rate $\nu_R$ as the total reactive current through an arbitrary cut. A cut in a network $G(S, E)$ is a partition of the nodes in $S$ into two disjoint subsets that are joined by at least one edge in $E$. The set of edges whose endpoints are in different subsets of the partition is referred to as the cut-set. Here we will focus on $A$-$B$-cuts, which are such that $A$ and $B$ are on different sides of the cut-set. Any $A$-$B$-cut leads to the decomposition $S = S_A \cup S_B$ such that $S_A \supseteq A$ and $S_B \supseteq B$ (see Fig. 6).
Theorem 25. The transition rate $\nu_R$ is given by
$$(34)\qquad \nu_R = \sum_{i\in S_A}\sum_{j\in S_B} F_{i,j}, \qquad\text{where } F_{i,j} := f_{ij} - f_{ji}.$$

[Figure 6: an $A$-$B$-cut; the set $S_A$ containing $A$ and the set $S_B$ containing $B$ lie on opposite sides of the cut-set.]

Proof. Note that
$$(35)\qquad \sum_{i\in S_A}\sum_{j\in S_A} F_{i,j} = 0,$$
because for every term $F_{i,j} = f_{ij} - f_{ji}$ in this sum there is a term $-F_{i,j} = f_{ji} - f_{ij}$. We have:
$$(36)\qquad \sum_{\substack{i\in S_A\\ j\in S_B}} F_{i,j} = \sum_{\substack{i\in A\cup(S_A\setminus A)\\ j\in S\setminus S_A}} F_{i,j} = \sum_{\substack{i\in A\\ j\in S}} F_{i,j} + \sum_{\substack{i\in S_A\setminus A\\ j\in S}} F_{i,j} - \sum_{\substack{i\in S_A\\ j\in S_A}} F_{i,j}.$$
The first sum in (36) equals $\nu_R$ by Eq. (33) (for $i \in A$, $f_{ji} = 0$ since $q_i^+ = 0$). The second sum in (36) is zero by current conservation, while the third sum is zero by (35). $\square$
5.8. Reaction pathways. The effective current $f^+ = (f_{ij}^+)_{i,j\in S}$ defined by Eq. (32) induces a directed graph on the set of states $S$. In other words, we connect states $i$ and $j$ by a directed edge $(i \to j)$ if and only if $f_{ij}^+ > 0$. We denote this graph by $G\{f^+\}$. A reaction pathway is a directed path $(i_0, i_1, \ldots, i_n)$ in $G\{f^+\}$ such that
$$i_0 \in A, \qquad i_n \in B, \qquad i_k \in (A\cup B)^c, \quad 1 \le k \le n - 1.$$
5.9. Simplifications for time-reversible Markov chains. The case where the Markov chain is time-reversible, i.e., $\hat L = L$, which is equivalent to the statement that $L$ and $\pi$ are in detailed balance, i.e.,
$$\pi_i L_{ij} = \pi_j L_{ji},$$
is worthy of special consideration. Many interesting systems possess this property, and the formulas for the backward committor, the reactive current, and the transition rate can be given in terms of the forward committor.

Exercise. (1) Show that the forward and backward committors are related via
$$q_i^- = 1 - q_i^+, \quad i \in S.$$
Hence we can simplify the notation: denote the forward committor by $q = (q_i)_{i\in S}$. Then the backward committor is merely $1 - q$.
(2) Show that the reactive current $F_{ij} := f_{ij} - f_{ji}$ is given by
$$F_{ij} = \pi_i L_{ij}(q_j - q_i).$$
(3) Starting from the expression for the transition rate from $A$ to $B$ (the reaction rate), $\nu_R = \sum_{i\in A,\, j\in S} F_{ij}$, show that it can be rewritten as
$$(38)\qquad \nu_R = \frac12\sum_{i,j\in S}\pi_i L_{ij}(q_j - q_i)^2.$$
Besides the transition rate $\nu_R$, one can consider the rates $k_{A,B}$ and $k_{B,A}$, defined relative to the proportion of time during which the last set hit by the trajectory was $A$ or $B$, respectively. These rates are given by
$$(39)\qquad k_{A,B} = \nu_R/\rho_A, \qquad k_{B,A} = \nu_R/\rho_B,$$
where
$$(40)\qquad \rho_A = \sum_{i\in S}\pi_i(1 - q_i), \qquad \rho_B = \sum_{i\in S}\pi_i q_i \qquad (\rho_A + \rho_B = 1)$$
are the proportions of time during which the trajectory last hit $A$ or $B$, respectively.
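For a time-reversible chain, the quantities of this section are assembled directly from $\pi$, $L$, and the forward committor $q$; a compact sketch (the function name is an assumption):

```python
import numpy as np

def tpt_summary(L, pi, q):
    """Reactive current F, rate nu_R (Eq. (38)), and rates k_AB, k_BA (Eq. (39))."""
    dq = q[None, :] - q[:, None]                  # dq[i, j] = q_j - q_i
    F = pi[:, None] * L * dq                      # F_ij = pi_i L_ij (q_j - q_i)
    nu_R = 0.5 * np.sum(pi[:, None] * L * dq**2)  # Eq. (38)
    rho_A = np.sum(pi * (1.0 - q))                # Eq. (40)
    rho_B = np.sum(pi * q)
    return F, nu_R, nu_R / rho_A, nu_R / rho_B
```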
The directed graph $G\{f^+\}$ induced by the effective current contains no directed cycles in the case of detailed balance, because each of its directed edges connects a state with a smaller value of the committor $q$ to a state with a larger value of the committor. As a result, the committor is strictly increasing along every directed path in the graph $G\{f^+\}$ (see Fig. 7).
We can use cuts to characterize the width of the transition tube carrying the current of reactive trajectories. A specific set of cuts is convenient for this purpose, namely the family of isocommittor cuts, whose cut-sets $C$ are given by
$$(41)\qquad C(q^*) = \{(i, j) \mid q_i \le q^*,\ q_j > q^*\}, \qquad q^* \in [0, 1).$$
Isocommittor cuts [5] are special because if $(i, j)$ is an edge of the cut-set, the reactive current between these nodes is nonnegative, $F_{ij} \ge 0$, which also means that every reaction pathway (no-detour reactive trajectory) contains exactly one edge belonging to an isocommittor cut, since the committor increases monotonically along these transition paths. Therefore, we can sort the edges in the isocommittor cut $C(q)$ according to the reactive current they carry, in descending order, and find the minimal number of edges $N(q)$ carrying at least $p\%$ of this current. By doing so for each value of the committor $0 \le q \le 1$ and for different values of the percentage $p \in (0, 100)$, one can then analyze the geometry of the transition channel: how broad it is, how many sub-channels there are, etc.
Remark. In the case of time-reversible Markov chains, the forward committor strictly increases along the edges of the graph $G\{f^+\}$ (check this!). Therefore, the committor strictly increases along the reaction pathways. The reaction pathways were dubbed no-detour reactive trajectories in [5].
5.10. Sampling reactive trajectories and no-detour reactive trajectories. Since $f_{ij}$ is the transition rate for the reactive trajectories along the edge $(i, j)$, we can replace the original generator matrix $L$ with the generator matrix $F$ whose off-diagonal entries are $f_{ij}$ and whose diagonal entries are defined so that the row sums of $F$ are zeros.
Then one can convert the generator matrix $F$ into the corresponding jump matrix $\Pi_F$ according to Eq. (19) and generate a collection of reactive trajectories starting from the states in $A$ that have outgoing edges.
One can do the same procedure but using the effective current $f^+$ instead of the probability current of reactive trajectories $f$. The detailed balance condition guarantees that these pathways are simple and that the committor strictly increases along them.
Propositions justifying these instructions are found in [5].
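A sketch of this sampling procedure (it assumes the converter generator_to_jump from Section 4.3 and a current matrix F with the diagonal already set so that row sums vanish; starting states are drawn uniformly here, while a more careful sampler would weight them by their total outgoing current):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_reactive_path(F, A, B):
    Pi = generator_to_jump(F)                  # jump matrix of F via Eq. (19)
    starts = [i for i in A if F[i].max() > 0]  # states in A with outgoing current
    path = [int(rng.choice(starts))]
    while path[-1] not in B:                   # walk until the path enters B
        path.append(int(rng.choice(len(F), p=Pi[path[-1]])))
    return path
```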
while $L_E$ and $L_W$ are found by Eqs. (44)-(45). For the left and right boundaries, $L_E$ and $L_W$ are modified as
$$(48)\qquad L_E(i, N) = 0, \qquad L_E(i, 1) = \frac{2T}{h^2},$$
$$(49)\qquad L_W(i, 1) = 0, \qquad L_W(i, N) = \frac{2T}{h^2},$$
while $L_N$ and $L_S$ are found by Eqs. (42)-(43). Therefore, the generator matrix $L$ is 5-diagonal. Its diagonal entries are all equal to $-4T/h^2$. The forward and backward committors for this system are shown in Fig. 8 (a) and (b), respectively. A reactive trajectory superimposed with the probability distribution of reactive trajectories is shown in Fig. 8 (c). A collection of reaction pathways superimposed with the probability distribution of reactive trajectories is shown in Fig. 8 (d).
References
[1] J. R. Norris, Markov Chains, Cambridge University Press, 1998.
[2] A. Bovier, Metastability, in "Methods of Contemporary Statistical Mechanics" (ed. R. Kotecky), LNM 1970, Springer, 2009.
[3] A. Bovier, M. Eckhoff, V. Gayrard, and M. Klein, Metastability and low lying spectra in reversible Markov chains, Comm. Math. Phys. 228 (2002), 219-255.
[4] A. Bovier, Metastability, in "Methods of Contemporary Statistical Mechanics" (ed. R. Kotecky), LNM 1970, Springer, 2009.
[5] M. Cameron and E. Vanden-Eijnden, Flows in complex networks: theory, algorithms, and application to Lennard-Jones cluster rearrangement, J. Stat. Phys. 156 (2014), no. 3, 427-454, arXiv:1402.1736.
[6] M. Cameron, Metastability, spectrum, and eigencurrents of the Lennard-Jones-38 network, J. Chem. Phys. 141 (2014), 184113, arXiv:1408.5630.
[7] W. E and E. Vanden-Eijnden, Towards a theory of transition paths, J. Stat. Phys. 123 (2006), 503.
[8] W. E and E. Vanden-Eijnden, Transition-path theory and path-finding algorithms for the study of rare events, Ann. Rev. Phys. Chem. 61 (2010), 391-420.
[9] M. Holmes-Cerfon, S. J. Gortler, and M. P. Brenner, A geometrical approach to computing free-energy landscapes from short-ranged potentials, PNAS 110 (2013), no. 1, E5-E14.
[10] M. Holmes-Cerfon, Sticky-sphere clusters, Annual Review of Condensed Matter Physics 8, 77-98.
[11] P. Metzner, Ch. Schütte, and E. Vanden-Eijnden, Transition path theory for Markov jump processes, SIAM Multiscale Model. Simul. 7 (2009), 1192-1219.
Figure 8. Example 9. Thin white curves show level sets of the 7-well po-
tential. Thick white curves represent the boundaries of the sets A and B.
(a): The forward committor. (b): The backward committor. (c): A reac-
tive trajectory superimposed with the probability distribution of reactive
trajectories. (d): A collection of reaction pathways superimposed with the
probability distribution of reactive trajectories.