Conference Paper · January 2006 · DOI: 10.1049/ic:20060556
The Viterbi Algorithm
Graham Pulford
March 2006
Contents
• History & Why Comms engineers get more $$$
• Sequence Estimation - pictorial
• Sequence Estimation - maths
• Dynamic Programming Solution (a.k.a. VA)
• Trellis Interpretation - pictures
• Movie!
• Concluding Remarks
Time Line for the Viterbi Algorithm
• 1967 - Andrew Viterbi publishes paper on “optimum decoding algorithm for
convolutional codes” in IEEE Trans. Info. Theory (supported by AFOSR)
• 1968 - Viterbi & Jacobs co-found LINKABIT Corporation to develop
decoder chips for satellite comms and modems
• 1971 - G. D. Forney designs Codex 9600 Modem
• 1973 - Forney publishes paper “The Viterbi Algorithm” in Proc. IEEE
• 1977 - NASA uses VA for comms on Voyager Saturn/Jupiter flights
• 1985 - Viterbi & Jacobs co-found Qualcomm
• 1990 - IBM use VA in PRML magnetic hard disk drives
• 1993 - Qualcomm releases CDMA standard IS-95A
• 1997 - Qualcomm releases single-chip Viterbi + trellis decoder
• 2006 - CDMA subscribers surpass 200 million, growing by 5 million/month
Now you know why Comms engineers get paid more!
Discrete-Time Finite-State Markov Process
[Figure: a 7-state Markov chain with transition probabilities p(i, j) (arcs labelled p11, p67, p71); the state x(k) drives a memoryless observation channel in which noise n(k) is added to produce the noisy observations y(k).]
a.k.a. Hidden Markov Model
Finite State Markov Process (or chain)
Finite state ⇒ x(k) ∈ {1, …, N} ∀ k = 1, 2, 3, …
Markov chain transition probabilities
Pr{x(k+1) | x(1), …, x(k)} = Pr{x(k+1) | x(k)}, with Pr{x(k+1)=j | x(k)=i} = p(i, j)
Initial State Distribution
Pr{x(1) = i} = π(i), i = 1, …, N
Observation Process
Memoryless ⇒ p{y(k) | x(1), …, x(k)} = p{y(k) | x(k)}
Examples
y(k) = f(x(k), w(k))
y(k) = f(x(k)) + w(k)
where w(k) is white noise with E{w(k)w(j)} = q δ(k−j)
Actually the VA allows for p{y(k) | x(k+1), x(k)}
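To make the setup concrete, here is a minimal Python sketch (ours, not from the original slides) that simulates such a hidden Markov model; the 3-state transition matrix, initial distribution and Gaussian observation model y(k) = f(x(k)) + w(k) are illustrative assumptions, and the slide's diagram has 7 states rather than 3.

```python
import numpy as np

# Illustrative 3-state chain (the slide's diagram has 7 states).
P = np.array([[0.8, 0.1, 0.1],      # p(i, j) = Pr{x(k+1)=j | x(k)=i}
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
pi = np.array([0.6, 0.2, 0.2])      # initial distribution Pr{x(1) = i}
f = np.array([-1.0, 0.0, 1.0])      # state-to-output map f(x(k))
q = 0.25                            # noise variance: E{w(k)w(j)} = q delta(k-j)

def simulate(K, rng=np.random.default_rng(0)):
    """Draw x(1..K) from the Markov chain and observe y(k) = f(x(k)) + w(k)."""
    x = np.empty(K, dtype=int)
    x[0] = rng.choice(3, p=pi)
    for k in range(1, K):
        x[k] = rng.choice(3, p=P[x[k - 1]])    # Markov: depends on x(k-1) only
    y = f[x] + rng.normal(0.0, np.sqrt(q), K)  # memoryless noisy observation
    return x, y

x_true, y = simulate(10)
```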
MAP Sequence Estimation Problem
Given data {y(1), y(2), …, y(K)}
find the state sequence {x*(1), x*(2), …, x*(K)}
= arg max p{x(1), …, x(K) | y(1), …, y(K)} = arg max p(X | Y)
Sounds easy?
Remember: there are N^K state sequences to consider!
e.g. N = 100 states, K = 10 data points ⇒ 10^20 sequences
Equivalent Estimation Problem
1) p(X | Y) = p(X, Y) / p(Y)
2) max(f) ⇔ min(−log f)
Equivalent MAP estimation problem:
arg min −log p{x(1), …, x(K), y(1), …, y(K)} = arg min −log p(X, Y)
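To see the equivalence in action, here is a brute-force sketch (illustrative; the function names are ours) that enumerates every sequence and minimises −log p(X, Y), with a slightly fuller bookkeeping than the slide's product: it includes the initial-state prior and all K observations. It is only feasible for tiny N and K, which is exactly why the dynamic programming recursion below matters.

```python
import numpy as np
from itertools import product

def neg_log_joint(X, y, P, pi, f, q):
    """-log p(X, Y), assuming the Gaussian model y(k) = f(x(k)) + w(k)."""
    cost = -np.log(pi[X[0]])                     # -log Pr{x(1)}
    for k in range(len(X)):
        cost += 0.5 * (y[k] - f[X[k]]) ** 2 / q  # -log p{y(k)|x(k)} + const
        if k + 1 < len(X):
            cost -= np.log(P[X[k], X[k + 1]])    # -log Pr{x(k+1)|x(k)}
    return cost

def brute_force_map(y, P, pi, f, q):
    """Exhaustive MAP search over all N**K state sequences."""
    N = len(pi)
    return min(product(range(N), repeat=len(y)),
               key=lambda X: neg_log_joint(X, y, P, pi, f, q))
```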
Dynamic Programming Recursion I
−log p(X, Y) = −log ∏_{k=1}^{K−1} Pr{x(k+1) | x(k)} p{y(k) | x(k)}
             = ∑_{k=1}^{K−1} c[k, k+1]
where the transition cost
c[k, k+1] = −log Pr{x(k+1) | x(k)} − log p{y(k) | x(k)}
is the cost due to the Markov transition plus the cost due to the observation.
Dynamic Programming Recursion II
Let c(i, j)(k) = – log Pr{x(k+1)=j | x(k)=i} – log p{y(k) | x(k)=i}
Also define the path cost as:
d(i, k) = { minimum cost sequence to x(k)=i }
then
d(j, k) = min_{i=1,…,N} { d(i, k−1) + c(i, j)(k) }, j = 1, …, N
⇒ forward dynamic programming recursion based on
Add-Compare-Select logic
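The recursion maps directly onto code. Below is a minimal Python sketch (ours, not the slides'), again assuming the illustrative Gaussian model from earlier; it groups the observation cost with the destination state, an equivalent bookkeeping to the slides' c(i, j)(k), which attaches it to the source state.

```python
import numpy as np

def viterbi(y, P, pi, f, q):
    """MAP sequence by forward DP: d(j,k) = min_i { d(i,k-1) + c(i,j)(k) }."""
    N, K = len(pi), len(y)
    obs = 0.5 * (np.asarray(y)[:, None] - f[None, :]) ** 2 / q
                                         # obs[k, j] = -log p{y(k)|x(k)=j} + const
    logP = -np.log(P)                    # branch cost -log Pr{x(k+1)=j | x(k)=i}
    d = -np.log(pi) + obs[0]             # path costs d(j, 1)
    back = np.zeros((K, N), dtype=int)   # survivor (best-predecessor) pointers
    for k in range(1, K):
        total = d[:, None] + logP        # Add:     d(i, k-1) + transition cost
        back[k] = total.argmin(axis=0)   # Compare/Select: best predecessor i
        d = total.min(axis=0) + obs[k]
    x = np.empty(K, dtype=int)           # backtrack along the survivor pointers
    x[-1] = int(d.argmin())
    for k in range(K - 1, 0, -1):
        x[k - 1] = back[k, x[k]]
    return x
```

With the toy model above, viterbi(y, P, pi, f, q) recovers the same minimising sequence as brute_force_map (up to ties) while performing only the O(N^2 K) add-compare-select work counted on the next slide.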
Computational Complexity
• At each stage k of the DP recursion we need to do O(N×N) operations.
• For K stages the complexity is O(N^2 K).
Our example with N = 100 states, K = 10 data points now requires only O(100,000) operations instead of O(10^20).
Trellis Interpretation
[Figure: trellis diagram with states 1, 2, … on the vertical axis and time on the horizontal axis. Each node is annotated with its path cost d(i, k) (e.g. 5, 10) and each branch with its transition cost c(i, j)(k) (e.g. 1). A sequence of frames animates the forward add-compare-select recursion, extending the survivor paths stage by stage.]
Viterbi Data Association Movie
Concluding Remarks
• Viterbi algorithm efficiently solves the MAP sequence estimation problem on a trellis in O(N^2 K), where N = #states, K = batch length.
• It is easily analysable in terms of “error events”.
• It underpins many technologies (modems, sat comms,
hard drives, mobile phones).
• It is also applicable to tracking on discrete and
continuous state spaces.
References
• A. J. Viterbi, “Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm”, IEEE Trans. Inform. Theory, vol. IT-13, pp. 260 - 269, Apr. 1967.
• J. K. Omura, “On the Viterbi Algorithm”, IEEE Trans. Inform. Theory, vol. IT-15, pp. 177 - 179, Jan. 1969.
• G. D. Forney Jr., “Maximum Likelihood Sequence Estimation of Digital Sequences in the Presence of Intersymbol Interference”, IEEE Trans. Inform. Theory, vol. IT-18, pp. 363 - 378, May 1972.
• G. D. Forney Jr., “The Viterbi Algorithm”, Proc. IEEE, vol. 61, no. 3, pp. 268 - 278, Mar. 1973.
• B. La Scala and G. W. Pulford, “A Viterbi Algorithm for Data Association”, Proc. International Radar Symposium, vol. 3, pp. 1155 - 1164, Munich, Sept. 1998.