A New Description of Trellis Codes
A New Description of Trellis Codes
6, NOVEMBER 1984
z3E
provides the connection.Substituting (4) for a, in (3) yields 01
the desired form
x A x(b,; . . , b,) = x,, + xx,b, + c xijbibj
i i,i II -3
j>i
STATES
+ ... +xl.unbl ..a b,, (5) Fig. 2. A trellis code.
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on September 02,2024 at 08:10:43 UTC from IEEE Xplore. Restrictions apply.
786 IEEE TRANSACTIONS O N INFORMATION THEORY, VOL. IT-30, NO. 6, NOVEMBER 1984
TABLE I TABLE II
OPTIMUM TWO-TERM CODES WITH ONE AND THREE COEFFICIENT ELEVATED k = 2 CODES DERIVED FROM THE k = 1 CODES IN TABLE I
Coding Upper Coding Upper
Gain Bound Ungerboeck [3] Gain Bound Ungerboek [3]
u Best Code (dB) (W (W u Elevated Code (W (dB) (W
2 b, - 3b,b2b3 3.0 4.3 2.5 2 b, - 3b,b,b, + 6b, 3.4 8.2 3.3
3 b, - 3b,b,b, 3.4 5.2 3.0 3 b, - 3b,b,b, + 6b, 3.8 10 3.8
4 b4 - 3b,b2bs 4.2 6.0 3.4 4 b, - 3b,blb, + 6b2 4.5 10 4.2
5 b,b, - 3blb4b6 4.6 6.7 4.2 5 bib, - jb;b;b,, +-6b, 5.0 11.3 5.0
6 b,b, - 3b,b,b,b,b, 4.9 7.3 4.5 6 b,b,, - 3b,b3bgb,,b,, + 6b, 5.3 11.3 5.2
7 b,b, - 3b,b,b,b,b, 5.3 7.8 5.1 7 b5b,, - 3b,b,b,b,,b,, + 6b, 5.7 12.2 5.8
pairs b,b,. The bit b, and the other even bits will not enter
00 the encoder, and the code (10) will be used only on the odd
bits. For k = 2 Ungerboeck choosesequally spacedchan-
nel inputs + 1, f 3, f 5, f 7. We require a three-term code
with coefficients one, two, and four. The four should go
IO with the uncoded bit to give it the largest signal, and so
x = b, - 2b,b,b, + 4b,. (11)
Since k = 2, in (ll), any even indexed bit influences the
channel input only once; in this senseit is uncoded. The
Oi state of the encoder is determined by the pair (b3, b,),
because k = 2. Again, we have a four-state code, but now
the code has the trellis structure shown in Fig. 3. Note in
Fig. 3 that the trellis has double edges,or parallel transi-
tions, because of the uncoded bit. These edgeshave been
Ii
LI
labeled with the new bit pair and the channel input.
STATES The same method leads to a k = 3 code
Fig. 3. Trellis diagram of the code given in (11).
x = 8b, + 4b, + b4 - 2b,b,b, (12)
with four parallel transitions in the four-state trellis.
These codes are evidently similar to (9) and (lo), but they
Table II presents the results of “elevating” the codes of
use (after an unimportant scaling) f. 1, +2 for channel
Table I to rate k = 2 codes.O ther coefficients close to the
inputs instead of + 1, + 3. It is possible to show that the
ratio of those given in Table II may result in further small
coding gain of (9’) is still 2.5 dB, whereas(10’) provides a
gains. For example, the k = 2, u = 2 code
3-dB gain.
For small u, Dan Sleator performed a computer search x = 3b, - 7b,b,b, + 14b, (13)
of all two-term k = 1 codeswith coefficients fixed at 1 and yields a 3.5 dB coding gain, and the k = 2, u = 3 code
3. Details are given in Appendix A, and the results are
presented in Table I. The designerof a trellis code is free to x = 3b, - 7b,b,b, + 14b, tw
chooseboth the signal constellation and the rule for assign- achieves4.1 dB.
ing signal points to successiveblocks of input data. A final remark about Table II. The upper bounds given
Ungerboeck always choosesequally-spacedchannel sym- are for arbitrary k = 2 codes, and they are valid for any
bols +l, +3, f5, -.. . Table I demonstrates that it is choice of signal constellation and any rule for mapping
sometimes possible to increase the coding gain by using input data to signal points. It can be shown that the coding
unequally-spacedchannel symbols and a well-chosencod- gain for k = 2 codes with parallel transitions cannot ex-
ing scheme. ceed 7 dB just .by consideringthe length-one error eventsof
For u = 2-5 inclusive, the codesof Table I were uniquely the parallel transitions. O f course, an optimal rate k = 2
optimum-at least up to a reversalof indices. Observethat code might not involve parallel transitions.
if u = 3, then b, - 3b,b,b, has the same coding gain as
b, - 3b,b,b,. IV. PERFORMANCE MEASURES
The upper bound in Table I is an upper bound on the
The coding gains quoted in earlier sections referred to
gain of any k, u trellis code (any signal constellation and
any rule for assigning signals to blocks of input data), and the number
it was obtained using the results given in [4].
We now turn to a technique given by Ungerboeck to coding gain (dB) = 10 log,,
i i%cded/i+i..odedl
obtain a code with k = 2 given a code with k = 1, and we
illustrate this in our notation using (10) as the k = 1 code. 05)
We want a k = 2 code, so that new input bits come in where di,, refers to a characteristicsquared distance and
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on September 02,2024 at 08:10:43 UTC from IEEE Xplore. Restrictions apply.
CALDERBANK AND MAZO: NEW DESCRIPTION OF TRELLIS CODES 787
b,b,b, 111 3
b, 010 1
Fig. 4. Two paths forming an error event. Fig. 5. A sample distance computation.
the power P is the averagedsquaredchannel input (P = N = 2” trellis states, and an error event is specified by
Ex2). In the uncoded situation with k bits/symbol the giving a sequenceof pairs of states. Therefore, a state
channel inputs are taken to be the 2k real numbers diagram representationof error events requires N2 states.
fl, &-3,-a. , ~b(2~- l), and so P = (4k - 1)/3. The By contrast, for a linear convolutional codewe may assume
quantity d&, for the uncoded case is the squared m ini- that the all zero input sequencedeterminesone of the paths
m u m distance betweendistinct channel inputs and is easily and so only N states are needed.If we are willing to settle
seen to be 4. For the (trellis) coded case, d& is the for a lower bound on distancesof error events in a trellis
m inimum distance of error eventsin the trellis. code (hence a lower bound on coding gain for a given
Two paths of finite length through the trellis form an code), then the representation(5) suggestshow to reduce
error event if they start in the samestate, finish in the same the number of statesrequired from N2 to N. For example,
state, and do not simultaneouslyoccupy the same state in consider the code x = b, - 3b,b2b3.If b,b,b, are the bits
between. An error event is shown in F ig. 4, and with it we associatedwith the channelsymbol on one path in an error
associate the squared distance d2 = (a - b)2 + (b - c)~ event and if b;b;b; are those for the other path, then this
+ (d - e)2. component of the error event contributes
The distance d2(8) associatedwith an error event d is
of interest becausethe probability Pi that a Viterbi decoder [(b, - b;) - 3( b,b,b, - b;b;b;)]’
outputs an incorrect path starting at state i is upper to the distance. In general, two products llbi and KIb(
bounded (see [5]) by differ if and only if an odd number of pairs (b,, bj) are
different. If such is the case, we say that the term is
Pi12 fJ -$ c g(q) “excited” becauseonly then can it contribute to the dis-
I=I,, IES(I,i)
tance. If there are p terms excited in a componentand they
SE
I=&
exp(-d2(&‘)/Se2). (16) have coefficients ti, i = 1 . * . p in the code (5), then the
contribution of that component to the squared distance
must be at least
In (16), S(I, i) is the set of error events of length 1 that
starts at state i. The function Q is given by
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on September 02,2024 at 08:10:43 UTC from IEEE Xplore. Restrictions apply.
788 IEEE TRANSACTIONS ON INFORMATIONTHEORY,VOL. IT-30, NO. &NOVEMBER 1984
TABLE III
TRANSFORMSOFTHEPOINT
where the sum is taken over all error events 2,. The usual Rotation Angle Transformed Point
transition matrix evaluation of (19) is applicable, and only 0 (x2 Y)
an N X N matrix needs to be inverted, where N = 2”. n/2 c-v, x)
As a final remark on the use of (5) in performance
3:,2 ‘;y”-5’
, x
evaluations, we note that in an earlier publication Massey
[6] gave a lower bound on the minimum distance of con-
volutional codes. The representation (5) allows a modifica- With a 180” phase hit sequence(21) becomes
tion of this argument to be made so that a lower bound on
minimum distance for k = 1 trellis codes can be obtained. 1, -3, -1,3, -l,l, -1, -l,l, -1, -1, -3;**
This is given in Appendix B. (24
and as can be seen from Fig. 2, this is not an allowed
V. SYSTEM CONSIDERATIONS
sequence of output symbols for the code (9). One proce-
Trellis codes generate the real-valued channel inputs x dure is to monitor the Viterbi decoder, and if it is not
sequentially. For transmission over a passband channel, “ working”, assume the phase error and change the phase
these inputs might be taken in pairs and transmitted as the by 7~. This takes its toll in time, however, and for this
inphase and quadrature components of a carrier. However, reason the differential encoding procedure that we shall
we use an alternate approach, splitting the incoming bit describe is preferred.
stream into two streams and using a separate trellis en- Our main requirement for differential encoding is that
coder for each stream. The output symbols of each encoder the code must have the following reflection property: if
then separately modulate the two carrier components. {xi} is a sequence of output symbols corresponding to
Thus we consider a new two-dimensional transmitted inputs { bi}, then { -xi} is the sequenceof output symbols
symbol, with components corresponding to the inphase corresponding to inputs { - bi}. Thus, a code such as (10)
and quadrature components. The set of all possible such that has an odd number of variables bi in each term is
pairs (i.e., the signal constellation) will then have a 7r/2 required. We then precede the trellis encoder by the usual
rotational symmetry. Becauseof the way an equalizer or differential encoding operation. By this we mean that a
phase-locked loop recovers after a phase hit, the received dummy bit say + 1 is inserted at the start of (20). There-
signal points are sometimes rotated by a multiple of r/2 after, an input + 1 is encoded by repeating the preceding
compared to an initial determination of phase. This section binary symbol, and an input - 1 is encodedby changing it.
describes the first method of “differentially encoding” data The resulting differentially encoded bit stream for (20),
in a trellis code to overcomethis phase ambiguity.5 including the initial dummy bit, would then be
To illustrate this, we begin with a one-dimensional ver- l,l,-l,l,-l,-l,l,-l,-l,-l,l,l,l;... (23)
sion of the differential encoding problem. The sequence
{xi} is transmitted, but the sequence {-xi} is received. Now the trellis code (10) is used with (23) as input. The
For example, consider the code of Fig. 2 and (9). Assume entire procedure is sketchedin Fig. 6.
that the encoder is initially in the state (00), and let the Now consider a two-dimensional differential encoding of
input data be pairs of binary symbols. This form of encoding (and decod-
ing) will provide end-to-end transparency to “channel”
l,-l,-l,-l,l,-l,-l,l,l,-l,l,l;**. (20) rotations that are multiples of m/2. Direct encoding of the
binary pairs (1, l), (- 1, l), (- 1, -l), and (1, - 1) associ-
The output sequenceis ates with each pair the point of the plane having these
-1,3,1,-3,1,-1,1,1,-1,1,1,3;*~. coordinates. If one transmits an initial dummy pair, then
(21)
the pairs of binary data bits can be differentially encoded
into these four pairs by advancing the last transmitted
4The k = I codes 46, b, - 2 b, b4 T b, be provide examples of the fact
that it need not be true that d,, = dmin and that signs may matter. As a
symbol by 0, ?r/2, 7~, or 3a/2 radians if, for example,
hint, consider the edge label sequence lOOOO0,110000, 011000, 001100, (1, l), (- 1, l), (- 1, -l), or (1, - 1) are the raw-data pairs.
etc. In general, a point (x, y) in the plane is transformed
5A method simplifying the implementation of this differential encoding
scheme and others was recently discovered by Ungerboeck and indepen- under the rotations we are considering according to Table
dently by Lee-Fang Wei. By introducing feedback, they were able to III.
transform a trellis code requiring external differential encoding/decoding We now turn to two-dimensional differential encoding
into a single finite-state machine with built-in transparency to 90” phase
shifts. Their methods apply to the coding schemes described here and with trellis codes. First consider the case when k = 1, and
eliminate the need for external differential encoding/decoding. then separate the initial binary data stream into two paral-
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on September 02,2024 at 08:10:43 UTC from IEEE Xplore. Restrictions apply.
CALDERBANK AND MAZO: NEW DESCRIPTION OF TRELLIS CODES 789
(4
2AD!M.
ENCODER
TRELLIS
ENCODER
CHANNEL:
+=o, t9v
180”
2 DIM TRELLIS 42
ENCODER
EN&R 000
(b) Fig. 8. Possibletransitions for the code given by (24).
Fig. 7. (a) Proposed differential encoding with k = 1 trellis codes. (b)
Proposed differential encodingwith k = 2 trellis codes.
“m o d 4” differential encodingis used in place of the one
we have specified, the overall procedureof encodingtrellis
(or block) codes will not work. Thus, place (O,O), (0, l),
lel streams. Encode each stream with a trellis code having (l,O), (1,l) consecutivelyat the corners of a square, and
only odd-order terms (as in the one-dimensionalexample), then replace each label (a, p) by (p, a). Note that the
and transmit the resulting output pair (x, y). As we note in resulting figure is not a rotation of the original one.
Table III, a r/2 rotation will causethe x-channel decoder Our procedureclearly dependson the reflection property
to output the negative of the y-channel bits, and the holding and on the specialway that we applied differential
y-channel decoder to output the x-channel bits with the encoding to uncoded bit pairs. Further, this procedure
proper sign. Therefore, regarding the bit pairs as points in clearly applies to block codes as well. All we require is
the plane, they too will be rotated by n/2. Thus, we independent block encoderson each channel. If the trans-
precede the trellis encodersby a conventional differential m itted block of channel symbols is negated, the resulting
encoder and follow the Viterbi decoderwith a differential block is in the code and complementsall input bits.
decoder,as shown in F ig. 7(a). This completesthe differen-
tial encoding for trellis codeswhen k = 1. ACKNOWLEDGMENT
The procedure generalizeswhen the higher rate (k > 1)
trellis codes are used on each carrier component. These It is the authors’pleasureto acknowledgethe valuable
codes must be identical odd-order codes. Consider the contribution m a d e by Dan Sleator in performing the com-
input bit stream to be 2k parallel bit streamsregardedas k puter searchof two-term codes.
parallel pairs of bit streams. Each of these k pairs is
independently differentially encoded;that is, one bit from APPENDIX A
each encodedpair is sent to the inphasetrellis encoderand MINIMUM DISTANCE CALCULATIONS
the other bit is sent to the quadraturetrellis encoder.Thus, W e elaboratehereon somespecifictechnical points that ap-
each encoder receivesits k-bit input block. After transmis- pear in Sections 1-V. W e discuss the computer search used to
sion and Viterbi decoding,the bits are recombinedinto the find d,?,,,for two-term k = 1 codes, we give an example of an
k differentially encodedpairs, and each of these k streams explicit evaluationof the upperbound(19),and we givea proof
of pairs is differentially decoded.This is illustrated in F ig. that d&, = 8 for nonlinear one-term k = 1 codes.
7(b) for the case k = 2. The procedureworks becausethe W e illustratethe first two pointsusingthe k = 1 code
choice of odd-order codesguaranteesthat the net result of b, - ab, b, b, (24)
the trellis encoding, channel, and trellis decodingsequence
where O Lis an arbitraryfixed constant.In light of the discussion
is to rotate each bit pair of each differentially encoded preceeding Fig. 5, introduce a directed graph of 2k+” vertices
stream by the angle of the phasehit (assumedhere to be a (eight in this case) by labeling the vertices with the labeling used
m u ltiple of m /2). for edges in Fig. 5; that is, a 1 in the ith position means that bi
The differential encoding of these bit pairs is unusual, and b( are different. The first k bits are the new input, and the
and this is best seenin 0,l notation. Denote a binary pair last v are the trellis state. A sequence of components of the error
by ((Y,j?) and use a bar to denote the complementof a bit. event 8 is a walk on this directed graph. Thus, for (24) we have
Our procedure is to uniquely assign to each of the four the graphof F ig. 8, whereedgesarelabeledwith the contribution
possible input pairs either (a, p), (fl, cu),(& p), or (p, (u), of the transition to d* as calculated from (18). More specifically,
where (cw,/I) was the previous output of the differential in Fig. 8 the state to which the transition is made is used to
encoder. By contrast, a common implementation of dif- determine which terms of the code are excited when evaluating
ferential encodingis as follows. Assign the numbers0, 1,2,3 (18). A state (000) has been introduced as a starting state, and an
error event c? of length I corresponds in Fig. 8 to a path of length
to the pairs (0, 0), (0, l), (1, 0), (1, 1), respectively.If the last 1 from 000 to 001. The search for Table I (a = 3) took less than
output pair of the differential encodercorrespondedto the five minutes on a VAX@11/780.
number x, and if y correspondsto the new input pair, then
the new output pair correspondsto (x + y) (mod4). If this mVAX 11/780 is a trademark of Digital Equipment Corporation.
Authorized licensed use limited to: Georgia Institute of Technology. Downloaded on September 02,2024 at 08:10:43 UTC from IEEE Xplore. Restrictions apply.
790 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-M, NO. 6, NOVEMBER 1984
ox
2.0 2.5 3.0
a
3.5 1 0
9-w To and
ym = number of pairs 01 (score 16).
An output sequence is bad if
y+P+4ylh
l-wgTl
and the number of bad sequences is
T(x) = T,(x2) + xT,(x’) whenever (Y + p + 4-r = X. Now the left side of the equation
simplifies to
= (x + x2 + x3 + x5)Z(x2). -(Ylogcr - p1ogp - ylogy
The polynomial G(x) = x + x2 + x3 + x5 is called the generu- -(l - a - p - y) log(1 - (Y - p - y)
tor polynomial of the convolutional code. The constraint length of
a convolutional code is the number of output symbols influenced and since -(Y log (Y - /I logp is maximized by setting LY= p =
by a given input signal. Then (X - 4y)/2, we must guarantee
00 I 0