Transform Techniques For Error Control Codes: R. E. Blahut
Introduction

Fourier transforms have found wide application in signal processing and in the study of communication waveforms; study of these transforms has been well rewarded. A close analogue of the Fourier transform can be defined on the vector space of n-tuples over the Galois field GF(q) whenever n = q^m - 1 or a submultiple, as was noticed by Pollard [1]. Transforms over Galois fields have recently been introduced into the study of error control codes as a vehicle to reduce decoder complexity, first by Gore [2], and later by Michelson [3], Lempel and Winograd [4], and Chien and Choy [5]. However, these transforms can be made to play a much more central role in the subject. Known ideas of coding theory can be described in a frequency domain setting that is much different from the familiar time domain setting, but closely related to treatments based on the so-called Mattson-Solomon polynomial [6]. Cyclic codes can be defined as codes whose codewords have certain specified spectral components equal to zero. Alternant codes (and Goppa codes) can be given a similar interpretation. Also, the decoding of many codes (including BCH, Reed-Solomon, and Goppa codes) can be described spectrally. This emerging viewpoint is welcome and important for a number of reasons. Firstly, any new vantage point on an established discipline will usually bring new insights. Thus, existing codes can be classified in yet another way,
and hence interrelationships can be seen in a new light. Secondly, there are strong pedagogical advantages, since most engineers are well versed in transform techniques, especially those dealing with waveform design or signal processing. Thirdly, computational or implementation advantages often are found in a frequency domain decoder. Certainly, it is important to know as many decoder techniques as possible so that the simplest can be chosen for a given application. Finally, new codes, algorithms, and techniques can be sought from another point of view. This paper will begin with some tutorial sections that develop part of the subject of error control codes in a transform setting. First of all, we want to set up the viewpoint, terminology, and analogies with signal processing theory so that the new results that come later can be explained easily. But we also hope to stimulate interest in and to accelerate the development of a spectral point of view of coding, and to popularize the ideas of [2, 4] and [5]. It is our belief that the spectral formulation and terminology bring the subject much closer to the subject of signal processing and make error control coding more accessible to the nonspecialist in coding theory. We then turn from the tutorial tone to give some new techniques. The spectral interpretation is used to describe encoder/decoder implementations for BCH and Reed-
Copyright 1979 by International Business Machines Corporation. Copying is permitted without payment of royalty provided that (1) each reproduction is done without alteration and (2) the Journal reference and IBM copyright notice are included on the first page. The title and abstract may be used without further permission in computer-based and other information-service systems. Permission to republish other excerpts should be obtained from the Editor.
IBM J. Res. Develop., Vol. 23, No. 3, May 1979
Solomon codes. A new technique for correcting erasures and errors is described; this technique requires virtually no additional complexity over the decoder for correcting errors only. We also describe techniques for decoding codes, including Goppa codes, beyond the designed distance; techniques for modifying the definition of BCH codes so that the decoder is simpler; and techniques for designing codes of very long blocklength that, although weak in performance, can be decoded using arithmetic in small fields. These techniques are described for their own merits, but they are also presented as evidence supporting our claim that the spectral point of view is quite rewarding and worth further development. Finally, by the examples discussed and the general tone of the paper, we hope to underscore our view that it is as important to massage codes to fit known decoding algorithms as it is to seek new codes with good properties but having no known practical decoders.

Finite field transforms

The Fourier transform plays a basic role in the study of real-valued or complex-valued signals when the time variable is continuous, and the discrete Fourier transform plays a parallel role when the time variable is discrete. Fourier transforms also exist for functions of a discrete index that take values in a finite field. Such transforms are very useful in the study of error control codes, but they are less well known than Fourier transforms over the complex field, and so we review them in this section. The basic ideas appear earlier in [1]. Recall the definition of the discrete Fourier transform of a vector of complex numbers
    V_k = Σ_{i=0}^{N-1} e^{-j2πik/N} v_i ,   k = 0, ..., N - 1,

where here j = √-1. The Fourier kernel e^{-j2π/N} is an Nth root of unity in the field of complex numbers. In the finite field GF(q^m), an element α of order n is an nth root of unity. Drawing on the analogy between e^{-j2π/N} and α, we have the following definition.

Definition 1  Let e = {e_i | i = 0, ..., n - 1} be a vector over GF(q), where n divides q^m - 1 for some m, and let α be an element of GF(q^m) of order n. The finite field Fourier transform of the vector e is the vector E = {E_j | j = 0, ..., n - 1} over GF(q^m), given by

    E_j = Σ_{i=0}^{n-1} α^{ij} e_i ,   j = 0, ..., n - 1.

For the most part we restrict attention to values of n satisfying n = q^m - 1. These values of n will be called primitive blocklengths. Then α is a primitive element of GF(q^m). It is natural to call the discrete index i "time" and e the time domain function or the signal, and likewise to call the discrete index j "frequency" and E the frequency domain function or the spectrum. Just as real-valued functions can have complex-valued Fourier transforms, so too can GF(q)-valued signals have GF(q^m)-valued Fourier transforms.

Theorem 1  Over GF(q), a field of characteristic p, a vector and its spectrum are related by

    E_j = Σ_{i=0}^{n-1} α^{ij} e_i ,   j = 0, ..., n - 1;

    e_i = (1/n) Σ_{j=0}^{n-1} α^{-ij} E_j ,   i = 0, ..., n - 1,

where n is interpreted as an integer of the field (n modulo p).

Proof

    x^n - 1 = (x - 1)(x^{n-1} + x^{n-2} + ... + x + 1),

and by definition of α, α^r is a root of the left side for all r. Hence α^r is a root of the last term for all r ≠ 0 modulo n. But this is equivalent to

    Σ_{j=0}^{n-1} α^{rj} = 0 ,   r ≠ 0 modulo n,

while if r = 0,

    Σ_{j=0}^{n-1} α^{rj} = n modulo p,

which is not zero if n is not a multiple of the field characteristic p. Using these facts, we have

    Σ_{j=0}^{n-1} α^{-ij} E_j = Σ_{j=0}^{n-1} α^{-ij} Σ_{k=0}^{n-1} α^{kj} e_k = Σ_{k=0}^{n-1} e_k Σ_{j=0}^{n-1} α^{(k-i)j} = (n modulo p) e_i .

Finally, if the field has characteristic p, then for some integer s, q^m - 1 = p^s - 1 is a multiple of n, and consequently n is not a multiple of p. Hence n modulo p ≠ 0. This proves the theorem.

The Fourier transform has many strong properties which carry over to the finite field case. Suppose that e_i = f_i g_i , i = 0, ..., n - 1. Then the spectra are related by the cyclic convolution

    E_j = (1/n) Σ_{k=0}^{n-1} F_{j-k} G_k ,

with the indices of the convolution interpreted
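The transform pair of Definition 1 and Theorem 1 can be tried out directly. The following is a minimal sketch in the simplest setting m = 1: the field GF(7) with n = q - 1 = 6 and α = 3, an element of order 6. These parameters, and the helper names `fft` and `ifft`, are illustrative choices, not anything fixed by the paper.

```python
P = 7          # the field GF(7)
N = 6          # blocklength n = q - 1
ALPHA = 3      # element of order 6 in GF(7): powers 3, 2, 6, 4, 5, 1

def fft(e):
    """E_j = sum_i alpha^(ij) e_i over GF(7)."""
    return [sum(pow(ALPHA, i * j % N, P) * e[i] for i in range(N)) % P
            for j in range(N)]

def ifft(E):
    """e_i = (1/n) sum_j alpha^(-ij) E_j; 1/n is the inverse of n modulo p."""
    n_inv = pow(N % P, P - 2, P)          # n interpreted modulo p
    return [n_inv * sum(pow(ALPHA, (-i * j) % N, P) * E[j] for j in range(N)) % P
            for i in range(N)]

e = [1, 0, 3, 2, 0, 5]
E = fft(e)
assert ifft(E) == e                       # Theorem 1: the pair inverts
```

Since gcd(n, p) = gcd(6, 7) = 1, the factor n modulo p is invertible, exactly as the last step of the proof requires.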
modulo n (or equivalently, the spectra are defined for all j and are periodic with period n). There is also a Parseval formula. From the convolution,

    E_j = Σ_{i=0}^{n-1} α^{ij} f_i g_i = (1/n) Σ_{k=0}^{n-1} F_{j-k} G_k .

Take j = 0 to get

    Σ_{i=0}^{n-1} f_i g_i = (1/n) Σ_{k=0}^{n-1} F_{-k} G_k .

[Figure 1  Encoding in the frequency domain: k components of information are loaded into the unconstrained components of the n-component spectrum over GF(q^m), and an inverse transform produces the codeword over GF(q).]
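The multiplication/convolution pair and the Parseval formula above can be checked numerically. The parameters below (GF(7), n = 6, α = 3) are again illustrative choices for a sketch, not values from the paper.

```python
P, N, ALPHA = 7, 6, 3

def fft(v):
    return [sum(pow(ALPHA, i * j % N, P) * v[i] for i in range(N)) % P
            for j in range(N)]

f = [2, 0, 1, 6, 3, 0]
g = [1, 5, 0, 2, 2, 4]
F, G = fft(f), fft(g)
E = fft([f[i] * g[i] % P for i in range(N)])   # transform of the pointwise product
n_inv = pow(N, P - 2, P)

# E_j = (1/n) sum_k F_{j-k} G_k, indices taken modulo n
for j in range(N):
    conv = sum(F[(j - k) % N] * G[k] for k in range(N)) % P
    assert E[j] == n_inv * conv % P

# Parseval formula (the j = 0 case): sum_i f_i g_i = (1/n) sum_k F_{-k} G_k
lhs = sum(f[i] * g[i] for i in range(N)) % P
rhs = n_inv * sum(F[-k % N] * G[k] for k in range(N)) % P
assert lhs == rhs
```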
When dealing with polynomials, the polynomial e(x) = e_{n-1}x^{n-1} + ... + e_1 x + e_0 is associated with the polynomial E(x) = E_{n-1}x^{n-1} + ... + E_1 x + E_0 by means of the finite field Fourier transform. This polynomial is called the spectrum polynomial or the associated polynomial of e(x). It was introduced to the study of error control codes by Mattson and Solomon [6], although not in the terminology of the Fourier transform. The following theorem relates the roots of these polynomials to the properties of the spectrum.
Theorem 2  a) The polynomial e(x) has a root at α^j if and only if the jth spectral component E_j equals zero. b) The polynomial E(x) has a root at α^{-i} if and only if the ith time component e_i equals zero.

Proof

    e(α^j) = Σ_{i=0}^{n-1} e_i α^{ij} = E_j .

Part b follows in the same way. Thus, in the finite fields, when one speaks of roots of polynomials or of spectral components equal to zero, one really speaks of the same thing, but the terminology and the insights are different, and the two notions appeal to different audiences.

Cyclic codes

A code over GF(q) is a set of time domain signals of length n called codewords. If a Fourier transform exists for length n, then each codeword has a spectrum in an extension field GF(q^m), called the frequency domain codeword. A cyclic code is a code such that any linear combination of two codewords is a codeword, and any cyclic shift of a codeword is a codeword.

A cyclic code over GF(q) is conventionally described in terms of a generator polynomial g(x) over GF(q) of degree n - k. Every codeword is represented by a polynomial of degree n - 1, written c(x) = s(x)g(x), where s(x) is a signal polynomial of degree k - 1. This is a convolution in the time domain: c_i = Σ_k g_{i-k} s_k. Therefore, in the frequency domain, the encoding operation can be written as a product C_j = G_j S_j. For a fixed generator G_j, any spectrum that satisfies this expression is a frequency domain codeword, provided that all components in the time domain are GF(q)-valued. Because the signal spectrum S_j is arbitrary, the only significant role of G_j is to specify the frequencies where the codeword spectrum C_j is zero. Thus, we can define a cyclic code alternatively as follows. Given a set of spectral components j_1, ..., j_{n-k}, called parity frequencies, the set of words over GF(q) whose spectrum is zero in components j_1, ..., j_{n-k} is a cyclic code.

Notice that although each codeword in a cyclic code is a vector over GF(q), the spectrum is a vector over GF(q^m). Hence, we may think of a cyclic code as the set of inverse Fourier transforms of all spectral vectors that are constrained to zero in several prescribed components, provided that said inverse transforms are GF(q)-valued. It is not possible to choose any spectrum that is zero in the prescribed components; some may have inverse transforms with components that are not in GF(q). However, if m = 1, that is, if n = q - 1, then every spectrum consistent with the constraints yields a codeword. One may encode by filling the unconstrained components of the spectrum with information symbols and then taking the inverse transform, as illustrated in Fig. 1. The most popular class of cyclic codes is the class of BCH codes. From the spectral point of view we have the following definition.
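The m = 1 encoding procedure just described can be sketched directly. The parameters below, GF(7), n = 6, α = 3, and parity frequencies j = 1, 2 (so a distance-3 code with four information symbols), are illustrative assumptions, as is the helper name `encode`.

```python
P, N, ALPHA = 7, 6, 3
PARITY = (1, 2)                      # spectral components forced to zero

def ifft(E):
    n_inv = pow(N, P - 2, P)
    return [n_inv * sum(pow(ALPHA, (-i * j) % N, P) * E[j] for j in range(N)) % P
            for i in range(N)]

def encode(info):
    """Fill the unconstrained frequencies with information symbols,
    then take the inverse transform (Fig. 1 style, nonsystematic)."""
    C = [0] * N
    free = [j for j in range(N) if j not in PARITY]
    for j, s in zip(free, info):
        C[j] = s % P
    return ifft(C)

c = encode([4, 1, 0, 6])
# check: the codeword's spectrum really is zero at the parity frequencies
spec = [sum(pow(ALPHA, i * j % N, P) * c[i] for i in range(N)) % P for j in range(N)]
assert all(spec[j] == 0 for j in PARITY)
```

Because m = 1 here, no conjugacy constraints arise: every choice of the four free symbols yields a GF(q)-valued codeword, just as the text states.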
Definition 2  A primitive t-error-correcting BCH code of blocklength n = q^m - 1 is the set of all words over GF(q) whose spectrum is zero in a specified block of 2t consecutive
frequencies.

The adjective "consecutive" is the key one in specializing the definition of a cyclic code to that of a BCH code. It is well known that a BCH code corrects t errors; in the next section we give the proof couched in spectral terminology. The remainder of this section is concerned with encoding.

First suppose that n = q - 1 (or possibly a submultiple of q - 1). The BCH code is then known as a Reed-Solomon code. The encoding is as follows. Some set of 2t consecutive frequencies (e.g., the first 2t) is chosen as the symbols constrained to zero. The information symbols are loaded into the remaining n - 2t symbols, and the inverse Fourier transform is taken to produce a (nonsystematic) codeword.

For the more general BCH code, the encoding is more complex. Again, 2t consecutive locations are chosen to be zero. The remaining symbols must be chosen to represent the information in only those ways that have GF(q)-valued inverse transforms; but of course we do not wish to do this by trial and error. Recall that over the complex numbers, a function V(f) has a real-valued inverse transform if V(f) = V*(-f). The analogous condition is given by the following well-known theorem, which may be found in [7].

Theorem 3  Let C_j, j = 0, ..., n - 1, take values in GF(q^m), where n is a divisor of q^m - 1. Then c_i, i = 0, ..., n - 1, are all elements of GF(q) if and only if the following equations, called the conjugacy constraints, are satisfied:

    C_j^q = C_((qj)) ,   j = 0, ..., n - 1,

where ((qj)) denotes qj modulo n.

Proof  By definition,

    c_i = (1/n) Σ_{j=0}^{n-1} α^{-ij} C_j .

As is well known, for a field of characteristic p and any integer r, (a + b)^{p^r} = a^{p^r} + b^{p^r}. Therefore,

    c_i^q = (1/n) Σ_{j=0}^{n-1} α^{-iqj} C_j^q = (1/n) Σ_{j=0}^{n-1} α^{-iqj} C_((qj)) .

Let k = qj. Since q is relatively prime to n, as j ranges over all values between 0 and n - 1, k also ranges over all values between 0 and n - 1. Hence

    c_i^q = (1/n) Σ_{k=0}^{n-1} α^{-ik} C_k = c_i ,

and by uniqueness of the Fourier transform, c_i^q = c_i for all i. Thus each c_i is a root of x^q - x, and all such roots are elements of GF(q).

Using Theorem 3, we can easily construct the Hamming (7, 4) code in the frequency domain. This is shown in Fig. 2. Frequencies C_1 and C_2 are chosen as parity frequencies so that a single error can be corrected. The information is contained in frequencies C_0 and C_3. The remaining frequencies are constrained by Theorem 3. Theorem 3 also requires that C_0^2 = C_0, so that C_0 can only have the value 0 or 1. Thus, the equivalent bit content of C_0 is one bit, and the equivalent bit content of C_3 is three bits. Thus, the four information bits of the Hamming code can be used to uniquely specify the spectrum. The information bits are inserted into the frequency domain rather than the time domain.

In the general case, the integers modulo n are divided into conjugacy classes of the form

    A_j = {j, qj, q²j, ..., q^{r-1}j}   (modulo n).

If the spectral component C_j is specified, then every other spectral component whose index is in the conjugacy class of j must be a power of C_j and hence cannot be separately specified. (It is suggestive to use the term chord for the set of frequencies whose indices are in the same conjugacy class.) Further, if the conjugacy class has r members, then we must have

    C_j^{q^r} = C_j .

Consequently, we are not free to choose any element of GF(q^m) for C_j, but only those satisfying this constraint, that is, elements of the subfield GF(q^r). It is also clear that the size r of every conjugacy class divides m.
Thus, to specify an encoder, we break the first q^m - 1 integers into conjugacy classes and select one integer to represent each class. These representatives specify the uniquely assignable symbols. To form a BCH code, a block of 2t spectral components is chosen as parity frequencies and set to zero. The remaining assignable symbols are information symbols, arbitrary except for occasional constraints on the subfield from which they may be chosen. All other symbols in a chord are not free; they are obligatory frequencies. Table 1 shows the situation for GF(64). If we choose the first column as free symbols and take C_1, C_2, C_3, C_4, C_5, and C_6 as parity frequencies, then we have a triple-error-correcting BCH code. Then C_0, C_7, C_9, C_11, C_13, C_15, C_21, C_23, C_27, and C_31 are the information symbols. C_9 and C_27 must be elements of the subfield GF(8), C_21 must be an element of GF(4), and C_0 must be 0 or 1; all other information symbols are arbitrary elements of GF(64). It requires a total of 45 information bits to specify these symbols. Hence, we have the (63, 45) t = 3 BCH code. After these free symbols are loaded, the obligatory symbols are filled in with the appropriate powers. The complete spectrum is then transformed into the codeword.
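The bookkeeping above is easy to verify mechanically. The sketch below partitions the integers modulo 63 into conjugacy classes under doubling (q = 2), zeroes out every class that meets the parity block 1..6, and counts the information bits carried by the free classes; a class of size r carries r bits, since its representative ranges over GF(2^r). The function name is an illustrative choice.

```python
q, n, two_t = 2, 63, 6        # GF(64) over GF(2), triple-error-correcting block

def conjugacy_classes(q, n):
    """Partition the integers mod n into classes {j, qj, q^2 j, ...}."""
    seen, classes = set(), []
    for j in range(n):
        if j in seen:
            continue
        cls, k = [], j
        while k not in cls:
            cls.append(k)
            k = (k * q) % n
        seen.update(cls)
        classes.append(cls)
    return classes

classes = conjugacy_classes(q, n)
# any class containing a parity frequency 1..6 is forced entirely to zero
parity = {j for cls in classes if any(1 <= x <= two_t for x in cls) for j in cls}
info_bits = sum(len(cls) for cls in classes if not set(cls) & parity)
assert info_bits == 45        # the (63, 45) t = 3 binary BCH code
```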
Decoding in the frequency domain

The BCH bound is proved very simply and intuitively in the frequency domain. This proof is a variation of a time domain proof of Chien [8] that we have transferred into the frequency domain.
Theorem 4 (BCH bound)  If a vector c_i, i = 0, ..., n - 1, has fewer than d nonzero components, and if the spectrum is zero on any d - 1 successive components (C_j = 0, j = j_0 + 1, ..., j_0 + d - 1), then c_i = 0 for all i.

Proof  Let i_1, ..., i_ν denote the ν nonzero components of c, ν < d. Define the locator polynomial Λ(x):

    Λ(x) = Π_{k=1}^{ν} (1 - x α^{i_k}) = Λ_ν x^ν + Λ_{ν-1} x^{ν-1} + ... + Λ_1 x + Λ_0 .

Interpreted as a vector, Λ is a frequency spectrum which is judiciously defined so that its inverse transform λ = {λ_i} equals zero (by Theorem 2) at every time i at which c_i ≠ 0. Hence the product λ_i c_i in the time domain is everywhere zero, and therefore the convolution in the frequency domain is zero:

    Σ_{k=0}^{n-1} Λ_k C_{j-k} = 0 .

But the vector Λ is nonzero only in a block of length at most d, and Λ_0 = 1, so that

    C_j = - Σ_{k=1}^{d-1} Λ_k C_{j-k} .

This is the equation of a linear feedback shift register that generates all the C_j from any block of them of length d - 1. But C is zero in a block of length d - 1. Using this block as an initial condition, the linear feedback shift register generates all the rest; hence all terms of C are zero, and c must be the all-zero vector.

Now consider the task of decoding a received word for a BCH or Reed-Solomon code. The original decoding technique described by Reed and Solomon [9] had a strong frequency domain flavor, but thereafter techniques evolved primarily in the time domain, although some frequency domain variables (the syndromes) do creep in. What amounts to a frequency domain decoder was first proposed by Mandelbaum [10], though in the terminology of the Chinese Remainder Theorem. Such a decoder was implemented in a computer program by Paschburg.
Consider decoding a received word r_i = c_i + e_i, i = 0, ..., n - 1, consisting of a codeword and an error word. Figure 3 shows a comparison of a time domain implementation of a BCH code, a frequency domain implementation, and several hybrid implementations. In the frequency domain implementation, one first computes R, the transform of the received word r. The transform consists of the transform of the codeword and the transform of the error:

    R_j = C_j + E_j ,   j = 0, ..., n - 1.

Since the codeword spectrum C_j is zero on a block of 2t components (say from 1 to 2t), we have 2t known values of E_j, called the syndromes:

    S_j = E_j = R_j ,   j = 1, ..., 2t.
Suppose there are ν ≤ t errors. As in the proof of Theorem 4, define the error-locator polynomial Λ(x):

    Λ(x) = Π_{k=1}^{ν} (1 - x α^{i_k}).

Since λ_i = 0 whenever e_i ≠ 0, we have

    Σ_{k=0}^{n-1} Λ_k E_{j-k} = 0 ,   j = 0, ..., n - 1.
(It does no harm to sum out to n - 1 even though Λ_k = 0 for k > ν.) The convolution is a set of n equations in n - t unknowns: the t coefficients of Λ(x) and the n - 2t unknown components of E. Of the n equations, there are t equations that involve only components of Λ and known components of E, and these are always solvable for Λ. The remaining components of E can be obtained by recursive extension; that is, they are sequentially computed from Λ using the above convolution equation. This computation can be described as the operation of a linear feedback shift register with tap weights given by the coefficients of Λ. In this way E_j is computed for all j, and C_j = R_j - E_j. An inverse Fourier transform completes the decoding. The key step of computing Λ from the known 2t components of E can be done by the elegant algorithm of Berlekamp [12], which was described in terms of shift registers by Massey [13]. We will discuss some modifications of this algorithm in later sections, so we restate it here.
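The whole decoding chain just described, syndromes, Berlekamp-Massey, recursive extension, subtraction, inverse transform, can be sketched end to end. Everything below is an illustrative construction: GF(7), n = 6, α = 3, parity frequencies 1..4 (t = 2), and helper names of my own choosing.

```python
P, N, ALPHA = 7, 6, 3           # GF(7), n = q - 1 = 6, alpha of order 6
PARITY = [1, 2, 3, 4]           # 2t = 4 parity frequencies, t = 2

def fft(v):
    return [sum(pow(ALPHA, i * j % N, P) * v[i] for i in range(N)) % P
            for j in range(N)]

def ifft(V):
    ninv = pow(N, P - 2, P)
    return [ninv * sum(pow(ALPHA, (-i * j) % N, P) * V[j] for j in range(N)) % P
            for i in range(N)]

def bm(S):
    """Berlekamp-Massey over GF(P); returns [1, Lambda_1, ..., Lambda_L]."""
    C = [1] + [0] * len(S)      # connection (locator) polynomial estimate
    B = [1] + [0] * len(S)      # update polynomial
    L, m, b = 0, 1, 1
    for r in range(len(S)):
        d = (S[r] + sum(C[j] * S[r - j] for j in range(1, L + 1))) % P
        if d == 0:
            m += 1
            continue
        coef = d * pow(b, P - 2, P) % P
        prev = C[:]
        for j in range(len(B) - m):
            C[j + m] = (C[j + m] - coef * B[j]) % P
        if 2 * L <= r:          # length change: update B as well
            L, B, b, m = r + 1 - L, prev, d, 1
        else:
            m += 1
    return C[:L + 1]

c = ifft([5, 0, 0, 0, 0, 2])    # a codeword: spectrum zero at j = 1..4
r = c[:]
r[1] = (r[1] + 3) % P           # two channel errors
r[4] = (r[4] + 6) % P

R = fft(r)                      # transform of the received word
Lam = bm([R[j] for j in PARITY])  # syndromes S_j = R_j feed the algorithm

E = [0] * N                     # error spectrum, known at the syndrome block
for j in PARITY:
    E[j] = R[j]
for j in (5, 0):                # recursive extension: an LFSR with taps Lam
    E[j] = (-sum(Lam[k] * E[(j - k) % N] for k in range(1, len(Lam)))) % P

decoded = ifft([(R[j] - E[j]) % P for j in range(N)])
assert decoded == c             # both errors corrected
```

Note that the recursive extension wraps cyclically, exactly as the periodicity of the spectrum permits.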
The decoder can be used with an encoder that is either in the time domain or the frequency domain. If the encoder is in the frequency domain, then the information symbols are used to specify certain components of the spectrum, whose inverse transform then gives the time domain codeword. With this scheme, the corrected spectrum is the information, and the decoder does not need an inverse transform. The frequency domain encoder may be simpler than the time domain encoder if n is composite, because a fast transform may be simpler than a convolution. The final circuit of Fig. 3 shows a hybrid implementation. Here the transform of the error pattern is computed by recursive extension in the frequency domain, but the correction is done in the time domain. This circuit is similar in appearance to the first circuit, but the development has been much different. The syndrome generator is the same as the direct transform. The Chien search is essentially the same as the inverse transform; it is a ν × n transform with a GF(q^m)-valued output vector, compared to an n × n transform with a GF(q)-valued output. The fourth circuit has the advantage of a simpler appearance than the first. In view of the many variations summarized by Fig. 3, the designer has a number of options from which to choose. It should be obvious that the choice will depend not only on the code parameters, such as blocklength and minimum distance, but also on how the implementation is divided between hardware and software, and even on the type of circuitry available. Notice that each of the circuits of Fig. 3 has both a Fourier transform and an inverse Fourier transform, though in some cases these appear under the names "Chien search" or "syndrome generator," and in some cases not all of the output components need be computed. Thus, one needs efficient methods for computing the Fourier transforms.
As is well known, the Fourier transform can be efficiently computed by a fast Fourier transform algorithm whenever n = q^m - 1 is composite, and this is sometimes used to justify the choice of a composite blocklength. But even when q^m - 1 is prime, the transform is often still practical; circuitry to implement the full n × n matrix multiplication can be quite simple for moderate n. For example, when n is prime and α has a square root β, one can also use the chirp transform. This is a convenient variation of the Fourier transform based on the calculation

    E_j = β^{j²} Σ_{i=0}^{n-1} (β^{i²} c_i) β^{-(i-j)²} ,

which follows from α^{ij} = β^{2ij} = β^{i²} β^{j²} β^{-(i-j)²}. The term in parentheses is easy to implement in hardware: it is a pointwise product of c_i with β^{i²} (n multiplications); the transform is then completed by a convolution with the fixed sequence β^{-i²} and a pointwise product with β^{j²} (n multiplications).

Theorem 5 (Berlekamp-Massey algorithm)  Let S_{j_0+1}, ..., S_{j_0+2t} be given. Let the following set of recursive equations be used to compute Λ^{(2t)}(x):

    Δ_r = Σ_{j=0}^{n-1} Λ_j^{(r-1)} S_{j_0+r-j} ,
    Λ^{(r)}(x) = Λ^{(r-1)}(x) - Δ_r x B^{(r-1)}(x) ,
    B^{(r)}(x) = (1 - δ_r) x B^{(r-1)}(x) + δ_r Δ_r^{-1} Λ^{(r-1)}(x) ,

for r = 1, ..., 2t, where the initial conditions are Λ^{(0)}(x) = 1 and B^{(0)}(x) = 1, and where δ_r = 1 if both Δ_r ≠ 0 and 2 deg Λ^{(r-1)}(x) ≤ r - 1, and otherwise δ_r = 0. Then Λ^{(2t)}(x) is the smallest degree polynomial such that Λ_0^{(2t)} = 1 and

    Σ_{j=0}^{n-1} Λ_j^{(2t)} S_{j_0+r-j} = 0 ,   r = 1 + deg Λ^{(2t)}, ..., 2t.

A proof can be found in [12] or [13]. If S_{j_0+1}, ..., S_{j_0+2t} are the syndromes of a codeword for which at most t errors occurred, then Λ^{(2t)}(x) has degree equal to the number of errors and is the error-locator polynomial.
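The chirp decomposition can be checked numerically. The parameters below are illustrative assumptions: GF(11), the prime blocklength n = 5, α = 3 of order 5, and β = 5 = α³, which is a square root of α since 5² = 25 ≡ 3 (mod 11).

```python
P, N, ALPHA, BETA = 11, 5, 3, 5

def direct(c):
    """Direct n x n transform E_j = sum_i alpha^(ij) c_i."""
    return [sum(pow(ALPHA, i * j % N, P) * c[i] for i in range(N)) % P
            for j in range(N)]

def chirp(c):
    """Chirp transform: premultiply, convolve with a fixed chirp, postmultiply."""
    pre = [pow(BETA, i * i % N, P) * c[i] % P for i in range(N)]
    out = []
    for j in range(N):
        s = sum(pre[i] * pow(BETA, (-(i - j) ** 2) % N, P) for i in range(N)) % P
        out.append(pow(BETA, j * j % N, P) * s % P)
    return out

c = [7, 0, 2, 10, 1]
# beta^(i^2 + j^2 - (i-j)^2) = beta^(2ij) = alpha^(ij), so the two agree
assert chirp(c) == direct(c)
```

The appeal in hardware is that the data-dependent work reduces to two pointwise products and one convolution against a fixed sequence.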
[Figure 3  Decoder variations for BCH codes: syndrome generator, Berlekamp algorithm, Chien search (j = 1, ..., n), and error magnitude computer, arranged in time domain, frequency domain, and hybrid implementations.]
Erasure and error decoding
BCH codes are also used for protection on channels that make both erasures and errors. A decoding algorithm for this purpose was discovered by Forney [14]. The derivation is manipulative and difficult to understand intuitively, since it introduces some new variables in an arbitrary way. By transforming the discussion into the frequency domain, algorithms for decoding messages with erasures and errors are very simply explained, and the reason why the decoder works is clearly visible. Further, the sharper insight allows us to propose a simple adaptation of the Berlekamp-Massey algorithm so that both erasures and errors can be decoded with virtually no hardware beyond that required for the errors-only decoder.

Let v be the vector of erased symbols. Suppose that erasures are made in locations i_1, i_2, ..., i_ρ. (In other components, v_i = 0.) The received word is a codeword corrupted by errors and erasures:

    r_i = c_i + e_i + v_i ,   i = 0, ..., n - 1.
Let ω be a vector that is zero at every erased location and otherwise nonzero. In particular, define ω as follows. Let u_l = α^{i_l}, l = 1, ..., ρ, denote the erasure locations. Define the erasure-locator polynomial

    Ω(x) = Σ_{k=0}^{n-1} Ω_k x^k = Π_{l=1}^{ρ} (1 - x u_l).

This is defined so that the inverse transform ω of the vector Ω has components ω_i equal to zero whenever v_i ≠ 0. Therefore ω_i v_i = 0. Then

    ω_i r_i = ω_i (c_i + e_i + v_i) = ω_i c_i + ω_i e_i ,
or

    r′_i = ω_i c_i + e′_i ,

where we have defined the modified received word r′_i = ω_i r_i and the modified error vector e′_i = ω_i e_i. The modified error vector e′ has errors in the same locations as e. The problem now is to decode r′ to find e′. In the frequency domain,

    R′ = (1/n)(Ω * C) + E′ .
But Ω is nonzero only in a block of length ρ + 1, and by construction of a BCH code, C is zero in a block of length 2t. Consequently, Ω * C is zero in a block of length 2t - ρ. In this block, define the modified syndromes T_j by T_j = R′_j. Then T_j = (1/n)(Ω * R)_j = E′_j. Hence, just as in the errors-only case, from these 2t - ρ known values of E′ we can find the error-locator polynomial Λ(x), provided the number of errors ν satisfies 2ν ≤ 2t - ρ. Once the error-locator polynomial is known, we can combine it with the erasure-locator polynomial and proceed as in the errors-only case. To do this, first define the error-and-erasure-locator polynomial Γ(x) = Ω(x)Λ(x). The inverse Fourier transform γ of Γ is zero at every erasure or error location; that is, γ_i = 0 if e_i ≠ 0 or v_i ≠ 0. Therefore γ_i (e_i + v_i) = 0 and Γ * (E + V) = 0, where Γ is nonzero only in a block of length at most ν + ρ + 1 ≤ 2t + 1. Hence, the 2t known values of E + V can be recursively extended to n values by using this convolution equation and the known Γ. Then

    C_j = R_j - (E_j + V_j).
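The two-step procedure just described, modified syndromes followed by the combined locator Γ = ΩΛ, can be sketched end to end. The parameters are illustrative assumptions: GF(7), n = 6, α = 3, parity frequencies 1..4, with two erasures and one error (2·1 ≤ 2t - ρ). The constant 1/n is omitted from the modified syndromes, since scaling a sequence does not change its locator polynomial.

```python
P, N, ALPHA = 7, 6, 3
PARITY = [1, 2, 3, 4]
T2 = 4                                           # 2t

def fft(v):
    return [sum(pow(ALPHA, i * j % N, P) * v[i] for i in range(N)) % P
            for j in range(N)]

def ifft(V):
    ninv = pow(N, P - 2, P)
    return [ninv * sum(pow(ALPHA, (-i * j) % N, P) * V[j] for j in range(N)) % P
            for i in range(N)]

def bm(S):
    """Berlekamp-Massey over GF(P); returns [1, Lambda_1, ...]."""
    C, B, L, m, b = [1] + [0] * len(S), [1] + [0] * len(S), 0, 1, 1
    for r in range(len(S)):
        d = (S[r] + sum(C[j] * S[r - j] for j in range(1, L + 1))) % P
        if d == 0:
            m += 1
            continue
        coef, prev = d * pow(b, P - 2, P) % P, C[:]
        for j in range(len(B) - m):
            C[j + m] = (C[j + m] - coef * B[j]) % P
        if 2 * L <= r:
            L, B, b, m = r + 1 - L, prev, d, 1
        else:
            m += 1
    return C[:L + 1]

def polymul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return out

c = ifft([5, 0, 0, 0, 0, 2])                     # codeword [0, 3, 5, 4, 1, 6]
erasures = [1, 2]
r = c[:]
for i in erasures:
    r[i] = 0                                     # any filler value works
r[5] = (r[5] + 2) % P                            # one error

R = fft(r)
S = [0] + [R[j] for j in PARITY]                 # S[j] = R_j, j = 1..2t

Omega = [1]                                      # erasure locator, known a priori
for i in erasures:
    Omega = polymul(Omega, [1, -pow(ALPHA, i, P) % P])

rho = len(erasures)
# modified syndromes T_j = (Omega * S)_j in the block j = rho+1 .. 2t
T = [sum(Omega[k] * S[j - k] for k in range(len(Omega))) % P
     for j in range(rho + 1, T2 + 1)]
Lam = bm(T)                                      # error locator
Gam = polymul(Omega, Lam)                        # error-and-erasure locator

EV = [0] * N                                     # E + V, known at j = 1..2t
for j in PARITY:
    EV[j] = R[j]
for j in (5, 0):                                 # recursive extension with taps Gam
    EV[j] = (-sum(Gam[k] * EV[(j - k) % N] for k in range(1, len(Gam)))) % P
decoded = ifft([(R[j] - EV[j]) % P for j in range(N)])
assert decoded == c                              # erasures and error all repaired
```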
After 2t iterations starting with the initial values Λ^{(0)}(x) = B^{(0)}(x) = 1, the error-locator polynomial Λ(x) is obtained. But what happens if we start instead with the values Λ^{(0)}(x) = B^{(0)}(x) = Ω(x)? Multiplying the recursion of Theorem 5 through by Ω(x) gives

    Λ^{(r)}(x)Ω(x) = Λ^{(r-1)}(x)Ω(x) - Δ_r x B^{(r-1)}(x)Ω(x),
    B^{(r)}(x)Ω(x) = (1 - δ_r) x B^{(r-1)}(x)Ω(x) + δ_r Δ_r^{-1} Λ^{(r-1)}(x)Ω(x).

If we compute Δ_r by

    Δ_r = Σ_{j=0}^{n-1} Γ_j^{(r-1)} S_{r-j} ,

then, with Γ^{(r-1)}(x) = Λ^{(r-1)}(x)Ω(x),

    Δ_r = Σ_{j=0}^{n-1} Σ_{k=0}^{n-1} Λ_j^{(r-1)} Ω_k S_{r-j-k} = Σ_{k=0}^{n-1} Λ_k^{(r-1)} T_{r-k} ,

which is the discrepancy of Theorem 5 computed with the modified syndromes. Therefore, if we initialize the Berlekamp-Massey algorithm with Ω(x) instead of with 1, the modified syndromes are computed implicitly and need not appear explicitly, while the algorithm recursively generates the error-and-erasure-locator polynomial Γ(x) according to the equations

    Γ^{(r)}(x) = Γ^{(r-1)}(x) - Δ_r x B^{(r-1)}(x),
    B^{(r)}(x) = (1 - δ_r) x B^{(r-1)}(x) + δ_r Δ_r^{-1} Γ^{(r-1)}(x),
    Δ_r = Σ_{j=0}^{n-1} Γ_j^{(r-1)} S_{r-j} .
The resulting decoder is shown in Fig. 4. The only change from the decoder for errors only is the computation of the erasure-locator polynomial, which is trivial compared to the other decoding computations. Finally, notice that it does not matter what symbol actually appears in an erased position; it can be set to the most likely estimate of the received symbol if the application uses this information to assess the probability of correct decoding.
Alternant codes

The decoding techniques we have described apply not only to BCH codes but also to alternant codes. Alternant codes comprise a class of linear codes introduced by
An inverse Fourier transform completes the decoding. The step of computing the error-locator polynomial from the modified syndromes can use the Berlekamp-Massey algorithm. However, it is possible to do much better by combining several steps. To describe how, it is necessary to refer back to the procedure of the Berlekamp-Massey algorithm as summarized by Theorem 5. The idea of the algorithm is to compute Λ(x) by a recursive procedure, starting with an initial estimate Λ^{(0)}(x) = 1 and an initial choice of another polynomial, called the update polynomial, B^{(0)}(x) = 1, and proceeding through 2t iterations. In the case of erasures, the syndrome is replaced with the modified syndrome in the equation for Δ_r.
[Figure 4  Error and erasure decoding for BCH codes: (a) frequency domain implementation (information and obligatory symbols into a transform computer, channel, transform computer, buffer register); (b) time domain encoder with hybrid decoder, in which the Berlekamp-Massey algorithm is initialized with the erasure-locator polynomial and followed by recursive extension.]
Helgert [15] and include the class of codes introduced by Goppa [16]. Delsarte [17] showed that an alternant code is a subset of a slightly modified Reed-Solomon code. Choose h, a fixed n-vector of nonzero components over GF(q^m), and choose a Reed-Solomon code over GF(q^m) with designed distance 2t + 1. The alternant code consists of all GF(q)-valued vectors c such that h_i c_i, i = 0, ..., n - 1, is a codeword in the Reed-Solomon code. Alternant codes are highly regarded because some of them have true minimum distance considerably larger than the designed distance (asymptotically close to the Gilbert bound).
The definition of the alternant codes is easily translated into the frequency domain, where it takes on more of a signal processing flavor. Let g_i = h_i^{-1}, which is always defined since h_i ≠ 0, and let G and H denote the transforms of g and h. Then, since g_i h_i = 1 for all i, (1/n)(G * H) equals one at j = 0 and otherwise equals zero (H is an invertible filter). The alternant code is the set of vectors whose transforms C_j, j = 0, ..., n - 1, satisfy two conditions:

    (1/n) Σ_{k=0}^{n-1} H_{j-k} C_k = 0 ,   j = 1, ..., 2t,

and

    C_j^q = C_((qj)) ,

with indices interpreted modulo n in both conditions. The first of these conditions is a convolution corresponding to the time domain product of the more usual definition; the second condition ensures that the time domain codewords are GF(q)-valued. Thus, the codeword spectrum is filtered by H prior to specifying the 2t contiguous parity frequencies. A Goppa code is an alternant code of designed distance 2t + 1 with G described by a polynomial of degree 2t, called the Goppa polynomial.
The alternant codes can be decoded just as the BCH codes. All that needs to be added is a step to modify the syndromes by the inverse of the vector h, either by multiplying in the time domain or convolving in the frequency domain. No other change is necessary. Hence, any frequency domain or time domain BCH decoder can decode alternant codes out to the designed distance 2t + 1. However, since the appeal of alternant codes lies in their much larger minimum distance, it is not clear that an alternant code used with a BCH decoder has any advantage over a BCH code used with a BCH decoder. Alternant codes will not have practical importance until a constructive procedure is found for obtaining the good ones, and a decoding algorithm is found for decoding beyond the designed distance. Some small steps in this direction are discussed in the next section.
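The syndrome-modification step can be sketched in the degenerate m = 1 setting, where a word c is in the code exactly when h_i c_i is a Reed-Solomon codeword; the decoder simply multiplies the received word by h and then runs the unmodified errors-only chain. The template vector h, the field GF(7), and all helper names below are illustrative assumptions.

```python
P, N, ALPHA = 7, 6, 3
PARITY = [1, 2, 3, 4]
h = [1, 3, 2, 6, 4, 5]                          # fixed vector of nonzero components

def fft(v):
    return [sum(pow(ALPHA, i * j % N, P) * v[i] for i in range(N)) % P
            for j in range(N)]

def ifft(V):
    ninv = pow(N, P - 2, P)
    return [ninv * sum(pow(ALPHA, (-i * j) % N, P) * V[j] for j in range(N)) % P
            for i in range(N)]

def bm(S):
    C, B, L, m, b = [1] + [0] * len(S), [1] + [0] * len(S), 0, 1, 1
    for r in range(len(S)):
        d = (S[r] + sum(C[j] * S[r - j] for j in range(1, L + 1))) % P
        if d == 0:
            m += 1
            continue
        coef, prev = d * pow(b, P - 2, P) % P, C[:]
        for j in range(len(B) - m):
            C[j + m] = (C[j + m] - coef * B[j]) % P
        if 2 * L <= r:
            L, B, b, m = r + 1 - L, prev, d, 1
        else:
            m += 1
    return C[:L + 1]

def rs_decode(r):                               # errors-only RS decoding chain
    R = fft(r)
    Lam = bm([R[j] for j in PARITY])
    E = [0] * N
    for j in PARITY:
        E[j] = R[j]
    for j in (5, 0):
        E[j] = (-sum(Lam[k] * E[(j - k) % N] for k in range(1, len(Lam)))) % P
    return ifft([(R[j] - E[j]) % P for j in range(N)])

h_inv = [pow(x, P - 2, P) for x in h]
u = ifft([3, 0, 0, 0, 0, 1])                    # an RS codeword
c = [u[i] * h_inv[i] % P for i in range(N)]     # the matching alternant codeword
r = c[:]
r[2] = (r[2] + 5) % P                           # one channel error
u_hat = rs_decode([h[i] * r[i] % P for i in range(N)])  # modify, then decode
assert [u_hat[i] * h_inv[i] % P for i in range(N)] == c
```

Multiplying by h maps an error in c to an error of different magnitude but the same location in the Reed-Solomon domain, which is why no other change to the decoder is needed.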
Decoding beyond the BCH bound

If every codeword in a code has a certain set of 2t contiguous spectral components equal to zero, then by the BCH bound the minimum distance of the code is at least 2t + 1, and most of the common decoding algorithms will correct up to t errors. However, the minimum distance of the code might actually be larger than 2t + 1, and in any case some patterns of more than t errors may be correctable. The variations that can occur are illustrated by the following four examples:

1. A binary BCH code with all of the parity frequencies in a block of length 2t but with the actual minimum distance larger than 2t + 1. (The Golay code is an example of such a code.)
2. A cyclic code with parity frequencies that are not contiguous.
3. A decoder for some of the (t + 1)-error patterns in a t-error-correcting Reed-Solomon code.
4. A decoder for an alternant code, such as a Goppa code.
Berlekamp [I21 discusses decoding of BCH codes beyond the BCH bound by forcing appropriate constraint equations in the frequency domain to be satisfied, but the techniques quickly become unmanageable as the number of errors increases beyond t. Hartmann [I81 gives some closely related frequency domain techniques that again involve searching through sets of nonlinear equations for solutions. Wewill describe here some time domain decoding techniques that decode BCH or alternant codes a small distance beyond the designed distance. These techniques are motivated by [12, 181, but for someapplications the complexity appears to grow more slowly as the number of errors increases beyond t. Thebasic idea is to add extra discrepancies as unknown variables, decode in terms of these variables, and then solve for the variables by some kind of search over low weight error patterns and when available, by using a priori facts such as that the codeword is GF(q)-valued. We will only discuss the decoding of errors; if desired, the ideas of the previous section may be added to decode erasures as well. We start the discussion with a Reed-Solomon code of designed distance 2t + I . Then any polynomial A(x) of degree t + u with t + u distinct roots is a legitimate errorlocator polynomial if
$$\sum_{j=0}^{t+u} \Lambda_j S_{r-j} = 0, \qquad r = t+u+1, \ldots, 2t.$$

Two analytically continued Berlekamp-Massey iterations give

$$\Lambda^{(2t+2)}(x) = \Lambda^{(2t)}(x) - \Delta_{2t+1}\,x\,B^{(2t)}(x) - \Delta_{2t+2}\,x\left[\delta_{2t+1}\,\Delta_{2t+1}^{-1}\,\Lambda^{(2t)}(x) + (1-\delta_{2t+1})\,x\,B^{(2t)}(x)\right],$$
where δ_{2t+1} ∈ {0, 1}, Δ_{2t+1} ∈ GF(q^m), and S_{2t+1} = 0 whenever Δ_{2t+1} = 0. Except for S_{2t+1} and Δ_{2t+1}, everything on the right is known from the 2t syndromes. Transform the frequency domain vector into the time domain by transforming the two components on the right to get
$$\lambda_i^{(2t+2)} = \lambda_i^{(2t)} - \Delta_{2t+1}\,\alpha^{-i}\, b_i^{(2t)} - \Delta_{2t+2}\,\alpha^{-i}\left[\delta_{2t+1}\,\Delta_{2t+1}^{-1}\,\lambda_i^{(2t)} + (1-\delta_{2t+1})\,\alpha^{-i}\, b_i^{(2t)}\right],$$
where we have used the general fact that if E'_j = E_{j-1}, then e'_i = α^{-i} e_i. We must now choose the unknowns, if possible, so that the error pattern contains at most t + 1 nonzero components. If deg Λ^{(2t)} ≤ t and the number of distinct zeros of λ^{(2t)} equals the degree of Λ^{(2t)}, then the number of errors equals the degree of Λ^{(2t)}. This case is easily checked. If there is only one solution with t + 1 zero components and a corresponding Λ^{(2t+2)}(x) of degree t + 1, a unique pattern of t + 1 errors can be found. Let Δ_{2t+1} = α^{k_1} and Δ_{2t+2} = α^{k_2} whenever they are nonzero. The cases to be considered are
1. λ_i^{(2t+2)} = λ_i^{(2t)} − α^{k_1} α^{−i} b_i^{(2t)} − α^{k_2} α^{−2i} b_i^{(2t)},
2. λ_i^{(2t+2)} = λ_i^{(2t)} − α^{k_1} α^{−i} b_i^{(2t)} − α^{k_2 − k_1} α^{−i} λ_i^{(2t)},
3. λ_i^{(2t+2)} = λ_i^{(2t)} − α^{k_1} α^{−i} b_i^{(2t)},
4. λ_i^{(2t+2)} = λ_i^{(2t)} − α^{k_2} α^{−2i} b_i^{(2t)},

corresponding to the possible combinations of zero and nonzero discrepancies.
Each of these cases is to be searched over k_1 = 0, ..., q − 2 and k_2 = 0, ..., q − 2 for a solution with λ_i^{(2t+2)} = 0 for exactly t + 1 values of i. With these values for the unknowns, the polynomial Λ^{(2t+2)}(x) must have degree t + 1. Then Λ^{(2t+2)}(x) can be used to recursively extend the syndromes, starting from the known syndromes, and using
$$S_j = -\sum_{k=1}^{t+1} \Lambda_k^{(2t+2)} S_{j-k}, \qquad j = 2t+1, \ldots, n.$$

The smallest-degree such polynomial (if there is one) corresponds to the maximum-likelihood codeword. If it is of degree at most t, this polynomial is produced by the Berlekamp-Massey algorithm.
R. E. BLAHUT · IBM J. RES. DEVELOP. · VOL. 23, NO. 3, MAY 1979
Searching through the four cases above may appear quite tedious, but it is very orderly and simple in structure, and shift register circuits can easily be designed to search for a solution. One can organize the search in a variety of ways. Among the possibilities, one of the less obvious is a histogram approach. For example, with case 4, for each i prepare a histogram of the candidate values α^{2i}λ_i^{(2t)}(b_i^{(2t)})^{-1}; any component α^{k_2} of the histogram that takes the value t + 1 corresponds to a possible (t + 1)-error pattern. The decoder can be further extended to decode t + u errors. Although the equations become lengthy, it seems that such an approach may be practical out to t + 3 or t + 4, depending on the blocklength of the code. Now consider binary codes. These differ from Reed-Solomon codes in that the decoder can be simplified as described below, and also because many of the t + u error patterns found may correspond to nonbinary error patterns and so must be discarded. When treating binary cyclic codes, we will make use of the fact that the Berlekamp-Massey algorithm of Theorem 5 can be simplified because Δ_r = 0 for all even r. Published proofs of this important fact [7, 12] are quite lengthy. An easy proof is given in the following theorem.
Theorem 6
In GF(2^m), suppose that for a linear feedback shift register Λ(x) and a sequence S_1, S_2, ..., S_{2ν−1} satisfying S_{2j} = S_j^2, we have

$$S_j = -\sum_{i=1}^{\nu} \Lambda_i S_{j-i}, \qquad j = \nu, \ldots, 2\nu - 1.$$

Then

$$S_{2\nu} = -\sum_{i=1}^{\nu} \Lambda_i S_{2\nu - i}.$$
Thus, for even r, Δ_r is zero, and we can analytically combine two iterations to give, for even r,

$$\Lambda^{(r)}(x) = \Lambda^{(r-2)}(x) - \Delta_{r-1}\, x\, B^{(r-2)}(x),$$

$$B^{(r)}(x) = \begin{cases} x^2\, B^{(r-2)}(x), \\ \Delta_{r-1}^{-1}\, x\, \Lambda^{(r-2)}(x), \end{cases}$$

the second case applying when Δ_{r−1} ≠ 0 and the register length must change.
Now suppose that we have a binary BCH code of designed distance 2t + 1, and we wish to correct all patterns of t + 1 (or fewer) errors whenever they are uniquely decodable. The only measurement data available to the decoder are contained in the 2t syndromes S_1, S_2, ..., S_{2t}. All other frequencies either can take on arbitrary values or are completely determined from the syndromes by the constraints. The algorithm can be iterated again to give

$$\Lambda^{(2t+2)}(x) = \Lambda^{(2t)}(x) - \Delta_{2t+1}\, x\, B^{(2t)}(x).$$
Transform the frequency domain vector Λ_j^{(2t+2)} into the time domain by transforming the components on the right to get

$$\lambda_i^{(2t+2)} = \lambda_i^{(2t)} - \Delta_{2t+1}\, \alpha^{-i}\, b_i^{(2t)},$$
and suppose a pattern of t or fewer errors was not found. Prepare a histogram of the candidate values α^{i}λ_i^{(2t)}(b_i^{(2t)})^{-1} over the nonzero components. If one component (or more) of the histogram equals t + 1, it corresponds to a candidate error pattern for that value of Δ_{2t+1}. For each of these candidates, the corresponding polynomial Λ^{(2t+2)}(x) can then be used to extend the syndromes in the frequency domain. Those cases that do not satisfy the conjugacy constraints can be discarded at this point. An inverse Fourier transform for each candidate gives an error pattern. If it is unique, it is the correct error pattern. Next consider a binary code for which the parity frequencies are not contiguous. An example is the (63, 28, 15) binary cyclic code with parity frequencies C_1, C_3, C_5, C_7, C_9, C_11, and C_21. This code should be preferred to the (63, 24, 15) BCH code because of its superior rate, but the BCH code might be chosen because of its well-known decoding algorithms. However, with a little extra complexity, we can modify a frequency domain BCH decoder to handle the (63, 28, 15) code. Using the procedure discussed above, all patterns of seven or fewer errors that agree with the twelve contiguous parity frequencies are found. Then S_21 is computed for each of these candidates. Only one will agree with the measured value of S_21. The same ideas apply to a BCH code with more than t + 1 errors. To extend the decoder more than one error beyond the BCH bound requires more complex equations, but to go a small distance they are still quite manageable. For t + 2 errors,
$$\lambda_i^{(2t+4)} = \lambda_i^{(2t)} - \Delta_{2t+1}\,\alpha^{-i}\, b_i^{(2t)} - \Delta_{2t+3}\,\alpha^{-i}\, b_i^{(2t+2)},$$
Proof
By the conjugacy property, S_{2ν} = S_ν^2. Using the recursion for S_ν,

$$S_\nu^2 = \left(\sum_{i=1}^{\nu} \Lambda_i S_{\nu-i}\right)^2 = \sum_{k=1}^{\nu}\sum_{i=1}^{\nu} \Lambda_k \Lambda_i S_{\nu-k} S_{\nu-i}.$$

But by symmetry every term in the sum with i ≠ k appears twice, so in GF(2^m) those two terms add to zero. Hence, only the diagonal terms contribute, and

$$S_\nu^2 = \sum_{i=1}^{\nu} \Lambda_i^2 S_{\nu-i}^2,$$

which agrees with the expression for S_{2ν}, and so proves the theorem.
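The cancellation step in this proof — in characteristic 2, squaring distributes over sums because all cross terms pair off — can be checked numerically. A small sketch in GF(16) (polynomial basis with modulus x^4 + x + 1; the helper names are ours, not the paper's):

```python
MOD = 0b10011  # x^4 + x + 1, a primitive polynomial for GF(16)

def gf16_mul(a, b):
    """Multiply two GF(16) elements (4-bit integers, polynomial basis)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0b10000:          # reduce when the degree reaches 4
            a ^= MOD
        b >>= 1
    return r

def square(a):
    return gf16_mul(a, a)

# squaring is linear: (a + b)^2 = a^2 + b^2 for all field elements
frobenius_ok = all(square(a ^ b) == square(a) ^ square(b)
                   for a in range(16) for b in range(16))
```

The exhaustive check over all 256 pairs confirms the identity the proof relies on.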
Figure 5  Two-dimensional spectrum over Galois field GF(8): (a) unconstrained spectrum, (b) constrained spectrum.
Codes based on multidimensional transforms
Multidimensional Fourier transforms also can be used to define error control codes. We shall consider several examples, but the most familiar example is the two-dimensional product code. This is a two-dimensional array of elements from GF(q) such that every row is a codeword in a code C_1 and every column is a codeword in a code C_2. A cyclic product code is a product code in which the rows and columns are from cyclic codes C_1 and C_2. To ensure that the cyclic product code is actually cyclic, one imposes the condition that the number of rows and the number of columns are relatively prime. But, for a general multidimensional transform code, the dimensions need not be relatively prime.
Multidimensional transforms have been used for the study of error control codes in the guise of the Mattson-Solomon polynomial. A treatment of cyclic product codes with two-dimensional transforms can be found in Lin and Weldon [19]. Papers by Delsarte, Goethals, and MacWilliams [20] and by Kasami, Lin, and Peterson [21] are representative of the use of multidimensional transforms. For simplicity, we will limit discussion to the two-dimensional transform. Let e_{ii'} be an n × n' two-dimensional array, which will be called a two-dimensional time function, where n and n' both divide q^m − 1 for some m. Let β and γ be elements of GF(q^m) of order n and n', respectively. The array

$$E_{jj'} = \sum_{i=0}^{n-1} \sum_{i'=0}^{n'-1} \beta^{ij} \gamma^{i'j'} e_{ii'}$$
and search for values of the unknowns that have t + 1 or t + 2 components of λ^{(2t+4)} equal to zero. Many of these will correspond to nonbinary error patterns and so must be rejected later. Finally, we come to Goppa codes. Let G(x) be the Goppa polynomial, and from the spectrum of the received signal R_j, let the syndromes be computed by
$$S_j = \sum_{i=0}^{n-1} H_{i+j} R_i$$

will be called the two-dimensional spectrum, and the indices j and j' are the frequency variables. It is obvious that
(or, instead, in the time domain, multiply to find g_iλ_i = h_ir_i). Decode beyond the designed distance just as for the Reed-Solomon code. The simplifications to the Berlekamp-Massey algorithm for BCH codes due to conjugacy constraints do not apply. In general, one can expect that many candidate error patterns with t + 1 errors will initially be found. For each of these, the filtered syndromes can be extended; then the inverse Fourier transform is taken and multiplied by g_i. No more than one pattern of t + 1 errors can be binary. Alternatively, working in the frequency domain, the filtered syndromes can be inverse-filtered using G as the tap weights of a finite impulse response filter. This convolves G with the filtered syndromes. The filter output must satisfy the conjugacy constraints or be rejected. Only one candidate error pattern will survive this test. An inverse Fourier transform gives the time domain error pattern.
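The recursive extension of filtered syndromes invoked here, and throughout the paper, is tiny in code. A hypothetical sketch over a prime field (the real decoders run in GF(2^m)); `extend_syndromes` and its parameters are our own names:

```python
def extend_syndromes(Lam, S, n, p):
    """Extend S (with S[0] = S_1) to n syndromes using
    S_j = -sum_{k=1}^{v} Lambda_k S_{j-k}, where Lam = [1, Lambda_1, ...]."""
    S = list(S)
    v = len(Lam) - 1             # degree of the error-locator polynomial
    while len(S) < n:
        j = len(S) + 1           # next 1-based syndrome index
        S.append(-sum(Lam[k] * S[j - 1 - k] for k in range(1, v + 1)) % p)
    return S
```

For a single error at locator X = 2 over GF(31), Λ(x) = 1 − 2x (that is, [1, 29]) and the syndromes S_j = 2^j extend as expected.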
by inspection of the one-dimensional inverse transform. We can choose n = n' = q^m − 1. Then β = γ = α, a primitive element, and

$$e_{ii'} = \frac{1}{n^2} \sum_{j=0}^{n-1} \sum_{j'=0}^{n-1} \alpha^{-ij} \alpha^{-i'j'} E_{jj'}.$$
Consider a two-dimensional spectrum over GF(q). For definiteness we will illustrate with GF(8) and n = 7, as shown in Fig. 5(a). Each square in the grid contains an octal symbol. We define a code by selecting a set of N − K of these components to be the (two-dimensional) parity frequencies, which are constrained to be zero as in Fig. 5(b).
VOL. 23
NO. 3
M A Y 1979
The remaining set of K components are filled with K information symbols, and then the (two-dimensional) inverse transform is taken. The time function is the codeword corresponding to the information symbols. Clearly, this is a linear code, and any choice of the parity frequencies defines another linear code. In general, these codes are not cyclic codes.
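As an illustration of this define-by-spectrum encoding, here is a sketch over the prime field GF(7) (n = 6, with α = 3 of order 6) instead of GF(8); the transform helper and the particular choice of parity frequencies are ours:

```python
p, n, alpha = 7, 6, 3            # alpha = 3 has order 6 mod 7
inv_n2 = pow(n * n, -1, p)       # 1/n^2 in GF(7), for the inverse transform

def fft2(e, sign):
    """Two-dimensional GF(7) Fourier transform; sign = -1 gives the inverse."""
    scale = 1 if sign == 1 else inv_n2
    return [[scale * sum(pow(alpha, sign * (i * j + ii * jj), p) * e[i][ii]
                         for i in range(n) for ii in range(n)) % p
             for jj in range(n)] for j in range(n)]

parity = {(j, jj) for j in range(2) for jj in range(n)}  # two rows of parities

# spectrum: zero at parity frequencies, arbitrary information symbols elsewhere
C = [[0 if (j, jj) in parity else (3 * j + jj + 1) % p for jj in range(n)]
     for j in range(n)]
c = fft2(C, -1)                  # time-domain codeword
```

Transforming the codeword back recovers the chosen spectrum, with zeros exactly at the parity frequencies.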
If the code is in a subfield of GF(q) [GF(2) is the only subfield of GF(8)], one must restrict the set of codewords to those that have components in the subfield. This is only a subfield-subcode. One could also extend the idea of an alternant code to a multidimensional alternant code by multiplying by a two-dimensional template before extracting the subfield-subcode. For an example, as shown in Fig. 6(a), choose all of the elements in a set of vertical stripes and a set of horizontal stripes to be parity frequencies. All the two-dimensional time domain functions with these frequencies equal to zero are the codewords. That is,
E_{jj'} = 0 for each parity frequency. This can be interpreted in another way by defining the two-dimensional polynomial
Figure 6  Spectra of some codes over Galois field GF(8): (a) product of cyclic codes; (b) product of Reed-Solomon codes; (c) dual of a product code; (d) product of (7, 4) BCH codes.
$$e(x, y) = \sum_{i=0}^{n-1} \sum_{i'=0}^{n-1} e_{ii'}\, x^i y^{i'}$$
Table 2
so that the code satisfies e(α^j, α^{j'}) = 0 for every j and every j' that are parity frequencies. Since the parity frequencies were defined on vertical and horizontal stripes, we have
$$e(\alpha^j, y) = 0, \qquad e(x, \alpha^{j'}) = 0$$
for every j and every j' that are parity frequencies. But this says that for fixed i', e_{ii'} is a codeword of a cyclic code, and for fixed i, e_{ii'} is a codeword of a cyclic code. That is, e_{ii'} is a product code. Product codes were studied by Elias [22], who showed that the minimum distance is the product of the minimum distances of the two codes. It was proved by Burton and Weldon [23] that if the dimensions n and n' are relatively prime, then the product code of two cyclic codes is equivalent to a cyclic code. If we take the stripes of parity frequencies to be contiguous, then we have a code that is the product of two Reed-Solomon codes. Figure 6(b) illustrates a (49, 25) d = 9 code over GF(8) defined spectrally. Each of the 25 information symbols can be loaded with an octal information character, and the result is transformed to the time domain to obtain the codeword.
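Elias's distance property is easy to confirm exhaustively on a toy case: the product of two binary (3, 2) single-parity-check codes, each of minimum distance 2, is a (9, 4) code of minimum distance 2 × 2 = 4. (A binary stand-in for the GF(8) codes above; purely illustrative.)

```python
from itertools import product

def spc(bits):                       # (3, 2) single-parity-check encoder
    bits = list(bits)
    return bits + [sum(bits) % 2]

min_wt = 9
for info in product([0, 1], repeat=4):
    top = spc(info[0:2])             # first information row
    mid = spc(info[2:4])             # second information row
    bot = [(a + b) % 2 for a, b in zip(top, mid)]   # column parity row
    wt = sum(top) + sum(mid) + sum(bot)
    if any(info):
        min_wt = min(min_wt, wt)
```

Every nonzero codeword touches at least two rows and two columns, so the exhaustive minimum weight comes out to 4.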
The same structure can be used to obtain a code over GF(2) by selecting only those codewords that are binary. To do this constructively in the frequency domain, only an independent set of frequencies may be specified. Theorem 3 is easily extended to a two-dimensional version, which requires that
$$C_{jj'}^2 = C_{(2j \bmod n),\,(2j' \bmod n)},$$
from which we can construct Table 2. Each row of the table shows a constrained set of frequencies. Any member of the row can be chosen as parity or as an arbitrary
information symbol. The remaining symbols in a row are not arbitrary. The frequency C_{0,0} can only be 0 or 1 because it is its own square. The remaining information symbols are octal. Altogether 49 bits specify the spectrum, but of course some of these are parity and contain no information. Figure 6(d) shows the specialization of Fig. 6(b) to a binary code. There are only 16 open frequencies which, because of the constraints, can encode 16 bits. This is a consequence of the fact that row 1 and column 1 have their parity symbols scattered among different rows of Table 2. The code is an unimpressive (49, 16, 9) code. The second case, illustrated in Fig. 6(c), is a dual to the idea of a product code. A rectangle a frequencies high and b frequencies wide is chosen for the set of parity frequencies. It is easily seen that the minimum distance satisfies
$$d \geq 1 + \min(a, b).$$
Expand the product in the exponent; since α^n = 1, the factor α^{n'n''i''j''} equals one and can be dropped. Then

$$E_{n''j''+j'} = \sum_{i'=0}^{n'-1} \alpha^{i'j'} \beta^{i'j''} \left[ \sum_{i''=0}^{n''-1} \gamma^{i''j'}\, e_{n'i''+i'} \right],$$

where β = α^{n''} and γ = α^{n'}.
Notice that the inner sum is an n''-point Fourier transform for each value of i', and the outer sum is an n'-point Fourier transform for each value of j''. The factor multiplying the inner sum is a minor nuisance. We can make it vanish simply by changing the definition of the BCH code. We will define an equivalent two-dimensional code, whose performance properties are the same as those of a BCH code and which circumvents the need for the extra compensation factor. Let n = n'n'', where n' and n'' are relatively prime. Let the code consist of all two-dimensional GF(q)-valued time functions c_{ii'}, i = 0, ..., n' − 1, i' = 0, ..., n'' − 1, such that the two-dimensional transform {C_{jj'}} satisfies

$$C_{(j \bmod n'),\,(j \bmod n'')} = 0, \qquad j = 1, \ldots, 2t,$$
+ min ( a , b ) .
Hence the example gives (49,45,3) code overGF(8). The binary subfield subcode is a (49, 39) d 2 3 code. In the next two sections, we will make use of two-dimensional codes to introduce some new codes with special properties.
Fast BCH codes
We have seen that for any BCH code, the encoder/decoder involves two Fourier transforms, possibly realized as a Chien search or as a syndrome computer. If n is composite, then a fast algorithm can be used for the Fourier transforms so as to reduce considerably the computational load. However, the fast Fourier transform requires some adjustment terms when the factors are nonbinary. (Finite field transforms normally have nonbinary factors.) This is only a minor problem, but it does disrupt the otherwise orderly organization of the calculations. If it can be eliminated at no cost, it should be.
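The decomposition this section goes on to use can be sketched concretely. Below, a 15-point transform over GF(31) (α = 9 has order 15) is computed as 5 × 3 short transforms plus the twiddle factor α^{i'j'} — the "adjustment term" the text calls a minor nuisance. All parameter choices here are illustrative only.

```python
p = 31
n1, n2 = 5, 3                    # n = n1 * n2 = 15, playing the roles of n', n''
n = n1 * n2
alpha = 9                        # order 15 mod 31
beta = pow(alpha, n2, p)         # order n1
gamma = pow(alpha, n1, p)        # order n2

e = [(7 * i + 3) % p for i in range(n)]     # arbitrary time function

direct = [sum(pow(alpha, i * j, p) * e[i] for i in range(n)) % p
          for j in range(n)]

fast = [0] * n
for j2 in range(n1):             # coarse output index j''
    for j1 in range(n2):         # vernier output index j'
        acc = 0
        for i1 in range(n1):     # vernier input index i'
            inner = sum(pow(gamma, i2 * j1, p) * e[n1 * i2 + i1]
                        for i2 in range(n2)) % p    # short n''-point DFT
            acc += pow(alpha, i1 * j1, p) * pow(beta, i1 * j2, p) * inner
        fast[n2 * j2 + j1] = acc % p
```

The mixed-radix result agrees with the direct transform at every frequency.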
where the subscripts are modulo n' and modulo n'', respectively. This is a linear t-error-correcting code which is different from a BCH code in only a trivial way. The rate and minimum distance are unchanged. The rate is the same because of the following theorem.
Theorem 7
The two-dimensional conjugacy class of the pair (j modulo n', j modulo n'') has the same number of elements as the conjugacy class of j modulo n'n''.
Proof
Let r be the smallest integer such that both 2^r j = j modulo n' and 2^r j = j modulo n'' are satisfied, and let s be the smallest integer such that 2^s j = j modulo n'n''. Then 2^r j = an'n'' + j and 2^s j = bn'n'' + j for some a and some b, because n' and n'' are relatively prime. Obviously, the smallest such r and the smallest such s are identical.
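Theorem 7 is easy to spot-check numerically. A sketch with illustrative moduli n' = 5 and n'' = 51 (so n'n'' = 255 = 2^8 − 1); the helper names are ours:

```python
n1, n2 = 5, 51                   # relatively prime; n1 * n2 = 255
n = n1 * n2

def class_size(j, mod):
    """Smallest r > 0 with 2^r * j = j (mod mod)."""
    r, v = 1, (2 * j) % mod
    while v != j % mod:
        v, r = (2 * v) % mod, r + 1
    return r

def pair_class_size(j):
    """Smallest r > 0 with 2^r * j = j both mod n1 and mod n2."""
    r, v1, v2 = 1, (2 * j) % n1, (2 * j) % n2
    while (v1, v2) != (j % n1, j % n2):
        v1, v2, r = (2 * v1) % n1, (2 * v2) % n2, r + 1
    return r

same = all(class_size(j, n) == pair_class_size(j) for j in range(n))
```

Every j in 0, ..., 254 gives the same class size under either description, as the theorem asserts.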
To see the fast algorithm, consider the Fourier transform

$$E_j = \sum_{i=0}^{n-1} \alpha^{ij} e_i,$$

and suppose n = n'n''. Replace each of the indices by a coarse and a vernier index as follows:

i = n'i'' + i',   i' = 0, ..., n' − 1;  i'' = 0, ..., n'' − 1;
j = n''j'' + j',   j' = 0, ..., n'' − 1;  j'' = 0, ..., n' − 1.
We show the distance of the code is at least 2t + 1 by exhibiting a decoding procedure for t errors. Given a received word with two-dimensional transform R_{jj'}, define the syndromes S_j = R_{(j mod n'),(j mod n'')}, j = 1, ..., 2t. Use the Berlekamp-Massey algorithm and a recursive extension to obtain S_j, j = 1, ..., n; and set E_{(j mod n'),(j mod n'')} = S_j, j = 1, ..., n. Since n' and n'' are relatively prime, every syndrome finds its own place in E_{jj'}. We must prove that this procedure gives the correct frequency domain error pattern if fewer than t errors occurred.
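The "every syndrome finds its own place" claim is the Chinese remainder theorem; a quick sketch with illustrative sizes:

```python
n1, n2 = 7, 15                   # relatively prime row and column lengths
n = n1 * n2

# fold the one-dimensional index j into its two-dimensional slot
slots = [(j % n1, j % n2) for j in range(1, n + 1)]

distinct = len(set(slots))       # coprime moduli: no two values of j collide
```

All n = 105 indices land in distinct slots of the n1 × n2 array, so the folding is one to one.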
But if a single error takes place in row i'_1 and column i''_1, then S_j = (β^{i'_1} γ^{i''_1})^j. The parenthesized term is a power of the primitive element α, unique for each row and column pair. If ν errors take place, then
$$S_j = \sum_{k=1}^{\nu} X_k^j,$$
where X_k is a unique power of α for each error location (i'_k, i''_k). Recursively extending these syndromes and folding them back into the two-dimensional spectrum gives E_{jj'} if t or fewer errors took place. Finally,
$$C_{jj'} = R_{jj'} - E_{jj'}.$$
Long codes in small fields
As the blocklength increases, BCH codes become unsatisfactory for several reasons. Not only does d/n vanish if
the rate is fixed, but at the same time the decoding computations must take place in an ever larger field. The alternant codes represent one way to modify BCH codes to improve minimum distance and so offset the first disadvantage. The second disadvantage, however, has not received much attention. We will develop this problem here and give some early steps toward a solution. A practical decoder for a BCH code of blocklength n = q^m − 1 must do its computations in GF(q^m), which
This set of equations is familiar from the decoding of a Reed-Solomon code. There is one difference; here the Y_k need not be distinct because several errors might occur in the same column. However, it is a simple matter to combine the terms with the same Y_k to obtain a similar set of equations with smaller ν, and this smaller ν also satisfies ν ≤ t. Then, the Berlekamp-Massey algorithm followed by recursive extension will yield S_{11}, S_{12}, ..., S_{1n}. Because errors can occur in the same column, the procedure does not uniquely give (X_k, Y_k), k = 1, ..., ν, but it does give partial information which can be used in decoding. In general, we have the following.
Theorem 8
Suppose that ν ≤ t errors occur, and that for any integers m_0, m'_0, a, a', the syndromes S_{m_0+al, m'_0+a'l}, l = 1, ..., 2t, are known. Then these uniquely determine the syndromes S_{m_0+al, m'_0+a'l}, l = 1, ..., n.
Proof

$$S_{m_0+al,\,m_0'+a'l} = \sum_{k=1}^{\nu} X_k^{m_0+al}\, Y_k^{m_0'+a'l} = \sum_{k=1}^{\nu} \left(X_k^{m_0} Y_k^{m_0'}\right)\left(X_k^{a} Y_k^{a'}\right)^{l}.$$
is a large Galois field. We will describe some codes of large blocklength that can be decoded in a small Galois field. Although the rate of these codes is inferior to BCH codes of the same n and d, their lesser complexity may make them the only affordable choice in some applications. We will use a two-dimensional code with n = 2^m − 1 rows and the same number of columns. Hence, the blocklength of the code is n², but we can hope to do all of the decoding with computations in the field GF(2^m). Before defining the codes, we first discuss decoding procedures and a two-dimensional version of the BCH bound. The codes will be defined to fit the desired decoding procedure. Let a single error occur at row i' and column i'', let the row locator be X_1 = α^{i'}, and let the column locator be Y_1 = α^{i''}. Then the syndrome S_{jl} is
Let ν' be the number of distinct values of X_k^a Y_k^{a'} over k, and let Y'_k, k = 1, ..., ν', denote these. Let X'_k denote the sum of the factors X_k^{m_0} Y_k^{m_0'} multiplying (Y'_k)^l in each equation. (It is the same for each l.) Then

$$S_{m_0+al,\,m_0'+a'l} = \sum_{k=1}^{\nu'} X_k' \left(Y_k'\right)^{l}, \qquad l = 1, \ldots, 2t,$$

where the Y'_k are now distinct and ν' ≤ t. The Berlekamp-Massey algorithm followed by recursive extension will produce the remaining syndromes

$$S_{m_0+al,\,m_0'+a'l} = \sum_{k=1}^{\nu'} X_k' \left(Y_k'\right)^{l}, \qquad l = 1, \ldots, n.$$
$$S_{jl} = \alpha^{i'j}\,\alpha^{i''l} = X_1^j\, Y_1^l$$
and if ν errors occur, then
$$S_{jl} = \sum_{k=1}^{\nu} X_k^j\, Y_k^l.$$
Hence, by this theorem, any 2t syndromes in a straight line (horizontal, vertical, or at any angle) can be extended to all syndromes in that line. Further, because of conjugacy constraints, each of these new syndromes also determines all syndromes in its conjugacy class. We will return to this point in the examples below. Now let us see how the BCH bound generalizes to two (or more) dimensions. Suppose that we had 2t contiguous syndromes anywhere in the first row. These can be extended to give all syndromes in the first row. Similarly, 2t
Suppose S_{11}, ..., S_{1,2t} are known; then

$$S_{1l} = X_1 Y_1^l + X_2 Y_2^l + \cdots + X_\nu Y_\nu^l.$$
We give an example of a binary code defined as a square two-dimensional code in the field GF(2^m); take t = 8 and n = 255, so the blocklength is 255² = 65 025. We will work through the selection of parity frequencies so that all of the parity frequencies in the block j = 1, ..., 2t; j' = 1, ..., 2t can be computed. Theorem 9 then guarantees that the remaining syndromes can be computed. First take S_{1,1}, ..., S_{1,16} as parity frequencies. Each of these is in a different conjugacy class, and each class has eight elements, so each of these parity frequencies is equivalent to eight parity bits. These can be extended to S_{1,j'}, j' = 1, ..., n, if at most t errors occurred, and then by the conjugacy constraints S_{1j'}, S_{2j'}, S_{4j'}, S_{8j'}, and S_{16j'}, j' = 1, ..., n, are all known. Next take S_{3,1}, S_{3,2},
Figure 7  Choice of parity frequencies (P) for the (65 025, 64 337, 17) two-dimensional code.
contiguous syndromes anywhere in the second row can be extended to give all syndromes in the second row. Further, 2t contiguous syndromes anywhere in each of the first 2t rows can be extended to give all syndromes in each of the first 2t rows, and hence 2t contiguous syndromes in every column. These can be extended to give all syndromes and hence suffice to give the error pattern. The simplest example is the case of a square array of 2t × 2t known syndromes. The general situation is as follows.

Theorem 9
Given any a, a' and m_l, l = 1, ..., 2t, the set of (2t)² syndromes
$$S_{a(m_l+l'),\,a'(m_l+l')+l}, \qquad l = 1, \ldots, 2t;\ l' = 1, \ldots, 2t,$$
S_{3,4}, ..., S_{3,16} as parity frequencies. This adds 11 × 8 more parity bits and determines S_{3j'}, S_{6j'}, S_{12j'}, and S_{24j'}, j' = 1, ..., n. Continue in this way to choose all the parity
frequencies shown in Fig. 7. These determine the remaining frequencies in the 2t × 2t corner and hence all of the frequencies if at most t errors occurred. Each parity frequency is equivalent to eight parity bits. The code is a (65 025, 64 337, 17) code. Its virtue is that it is easily decoded despite its blocklength. We first describe a conceptual frequency domain decoder; later we simplify this by bypassing many of the Fourier transforms. Given a received word, compute its two-dimensional Fourier transform. This requires 510 two-hundred-and-fifty-five-point Fourier transforms. Perform a Berlekamp-Massey algorithm along the first row, recursively extend, and use conjugacy constraints to fill in rows 2, 4, 8, and 16. Do the same along the first column to find columns 2, 4, 8, and 16. Repeat for row 3, then column 3, and so on. When 2t rows are complete, then all columns can be found. An inverse two-dimensional Fourier transform gives the error pattern. A simpler procedure is as follows. Compute the two-dimensional Fourier transform only at the 86 parity frequencies. Each of these is an eight-bit number. Insert these at the appropriate positions of a 16 by 16 array of numbers representing the 16 by 16 frequencies in the upper left corner. Now decode the first row, extending syndromes and using conjugacy constraints to fill in all possible entries in the 16 × 16 array. Take the inverse Fourier transform of the first row. The nonzero locations specify columns in the time domain codeword at which errors occur. Each nonzero magnitude gives the sum of
uniquely determines the error pattern, provided that ν ≤ t errors took place and a is relatively prime to n.

Proof
Apply Theorem 8 for each l to determine the syndromes S_{a(m_l+l'), a'(m_l+l')+l}, l' = 1, ..., n; l = 1, ..., 2t. Now, because l' ranges over all n values, we can redefine the index l' to absorb m_l, so that the known syndromes are S_{al', a'l'+l}, l' = 1, ..., n; l = 1, ..., 2t. Apply Theorem 8 again for each l' to determine the syndromes S_{al', a'l'+l}, l' = 1, ..., n; l = 1, ..., n. We can now redefine the index l to absorb a'l', so that we know S_{al', l}, l' = 1, ..., n; l = 1, ..., n. Since a is relatively prime to n, the index al' ranges over all n values. Hence, all syndromes are determined, and so is the error pattern.
Based upon Theorem 9, for each a, a', m_l, l = 1, ..., 2t, we can define a two-dimensional code 𝒞 as the set of arrays c_{ii'} such that the two-dimensional transform satisfies
$$C_{a(m_l+l'),\,a'(m_l+l')+l} = 0, \qquad l = 1, \ldots, 2t;\ l' = 1, \ldots, 2t.$$
the row locators for errors in this column. Some columns with three or more errors may not show up here because the row locators add to zero. Discard the syndromes outside the 16 by 16 array. Continue in this way with each odd-numbered row of the sixteen rows, when necessary decoding a column just to obtain some needed syndromes through the conjugacy constraints. Each of the sixteen rows, when the inverse transform is taken, has nonzero values only in the eight (or fewer) columns containing errors. Identify these eight columns. In each column the transforms in the sixteen rows provide sixteen magnitudes. One such set of magnitudes can be written in terms of the row error locators for that column:
4. A. Lempel and S. Winograd, "A New Approach to Error-Correcting Codes," IEEE Trans. Inf. Theory IT-23, 503-508 (1977).
5. R. T. Chien and D. M. Choy, "Algebraic Generalization of BCH-Goppa-Helgert Codes," IEEE Trans. Inf. Theory IT-21, 70-79 (1975).
6. H. F. Mattson and G. Solomon, "A New Treatment of Bose-Chaudhuri Codes," J. Soc. Indus. Appl. Math. 9, No. 4, 654-669 (1961).
7. W. W. Peterson and E. J. Weldon, Jr., Error-Correcting Codes, 2nd ed., MIT Press, Cambridge, MA, 1972.
$$T_l = X_1^l + X_2^l + \cdots + X_\nu^l, \qquad l = 1, \ldots, 16,$$
where ν is the number of errors in that column. Since ν ≤ 8, this set can be decoded in the same way to find the rows in which this column has errors. Altogether, this decoder requires the computation of 86 Fourier transforms and 24 passes through the basic decoding algorithm, each such pass consisting of a Berlekamp-Massey algorithm, a recursive extension, possible computation of conjugacy relations, and a 255-point inverse Fourier transform. All data paths are eight bits wide, Galois field computations are eight bits by eight bits, and most of the computations simply re-exercise the same procedures, and so can use the same hardware.

References
1. J. M. Pollard, "The Fast Fourier Transform in a Finite Field," Mathematics of Computation 25, No. 114, 365-374 (1971).
2. W. C. Gore, "Transmitting Binary Symbols with Reed-Solomon Codes," Proceedings of the Princeton Conference on Information Sciences and Systems, Princeton, NJ, 1973, pp. 495-497.
3. A. Michelson, "A Fast Transform in Some Galois Fields and
20. P. Delsarte, J. M. Goethals, and F. J. MacWilliams, "Generalized Reed-Muller Codes and Their Relatives," Info. Control 16, 403-442 (1970).
21. T. Kasami, S. Lin, and W. W. Peterson, "Polynomial Codes," IEEE Trans. Info. Theory IT-14, 807-814 (1968).
22. P. Elias, "Error-Free Coding," IRE Trans. Info. Theory IT-4, 29-37 (1954).
23. H. O. Burton and E. J. Weldon, Jr., "Cyclic Product Codes," IEEE Trans. Info. Theory IT-11, 433-439 (1965).
an Application to Decoding Reed-Solomon Codes," Abstracts of Papers—IEEE International Symposium on Information Theory, Ronneby, Sweden, 1976, p. 49.
The author is located at the IBM Federal Systems Division laboratory, Owego, New York 13827.