Detecting and Correcting Bit Errors
COS 463: Wireless Networks
Lecture 8
Kyle Jamieson
Bit errors on links
• Links in a network go through hostile environments
– Both wired, and wireless:
Scattering
Diffraction
Reflection
– Consequently, errors will occur on links
– Today: How can we detect and correct these errors?
• There is limited capacity available on any link
– Tradeoff between link utilization & amount of error control
2
Today
1. Error control codes
– Encoding and decoding fundamentals
– Measuring a code’s error correcting power
– Measuring a code’s overhead
– Practical error control codes
• Parity check, Hamming block code
2. Error detection codes
– Cyclic redundancy check (CRC)
3
Where is error control coding used?
• The techniques we’ll discuss today are
pervasive throughout the internetworking stack
Application
• Based on theory, but broadly applicable in
practice, in other areas: Transport
– Hard disk drives Network
– Optical media (CD, DVD, & c.) Link
– Satellite, mobile communications
Physical
• In 463, we cover the “tip of the iceberg” in the
Internetworking stack
4
Error control in the Internet stack
• Transport layer
– Internet Checksum (IC)
over TCP/UDP header, data IC TCP payload
TCP header
5
Error control in the Internet stack
• Transport layer
– Internet Checksum (IC)
over TCP/UDP header, data IC TCP payload
TCP header
• Network layer (L3)
– IC over IP header only IC IP payload
IP header
6
Error control in the Internet stack
• Transport layer
– Internet Checksum (IC)
over TCP/UDP header, data IC TCP payload
TCP header
• Network layer (L3)
– IC over IP header only IC IP payload
IP header
• Link layer (L2)
– Cyclic Redundancy Check (CRC) LL header LL payload LL CRC
7
Error control in the Internet stack
• Transport layer
– Internet Checksum (IC)
over TCP/UDP header, data IC TCP payload
TCP header
• Network layer (L3)
– IC over IP header only IC IP payload
IP header
• Link layer (L2)
– Cyclic Redundancy Check (CRC) LL header LL payload LL CRC
• Physical layer (PHY)
– Error Control Coding (ECC), or PHY payload
– Forward Error Correction (FEC)
8
Today
1. Error control codes
– Encoding and decoding fundamentals
– Measuring a code’s error correcting power
– Measuring a code’s overhead
– Practical error control codes
• Parity check, Hamming block code
2. Error detection codes
– Cyclic redundancy check (CRC)
9
Error control: Motivation
00 Network 00
01 01
10 10
11 message 11
Sender Receiver
“Allowed”
messages
• Every string of bits is an “allowed” message
– Hence any changes to the bits (bit errors) the sender
transmits produce “allowed” messages
• Therefore without error control, receiver wouldn’t
know errors happened!
10
Error control: Key Ideas
• Reduce the set of “allowed” messages
– Not every string of bits is an “allowed” message
– Receipt of a disallowed string of bits means that the
message was garbled in transit over the network
• We call an allowable message (of n bits) a codeword
– Not all n-bit strings are codewords!
– The remaining n-bit strings are “space” between
codewords
• Plan: Receiver will use that space to both detect and
correct errors in transmitted messages
11
Encoding and decoding
• Problem: Not every string of bits is “allowed”
– But we want to be able to send any message!
– How can we send a “disallowed” message?
• Answer: Codes, as a sender-receiver protocol
– The sender must encode its messages codewords
– The receiver then decodes received bits messages
• The relationship between messages and codewords
isn’t always obvious!
12
A simple error-detecting code
• Let’s start simple: suppose messages are one bit long
• Take the message bit, and repeat it once
– This is called a two-repetition code
Sender:
0 ➜ 00
01
10
1 ➜ 11
13
Receiving the two-repetition code
• Suppose the network causes no bit error
• Receiver removes repetition to correctly decode the
message bits
Sender: Network: Receiver:
0 ➜ 00 00 ➜ 0
01
10
1 ➜ 11 11 ➜ 1
14
Detecting one bit error
• Suppose the network causes up to one bit error
• The receiver can detect the error:
– It received a non-codeword
• Can the receiver correct the error?
– No! The other codeword could have been sent as well
Sender: Network: Receiver:
0 ➜ 00 00 ➜ 0
01 01 Error detected
10 10 Error detected
1 ➜ 11 11 ➜ 1
15
Reception with two bit errors
• Can receiver detect presence of two bit errors?
– No: It has no way of telling which codeword was sent!
• Enough bit errors that the sent codeword “jumped
over” the space between codewords
Sender: Network: Receiver:
0 ➜ 00 00 ➜ 0
Space between 01
codewords:
10
1 ➜ 11 11 ➜ 1
16
Hamming distance
• Measures the number of bit flips to change one
codeword into another
• Hamming distance between two messages m1, m2: The
number of bit flips needed to change m1 into m2
• Example: Two bit flips needed to change codeword 00 to
codeword 11, so they are Hamming distance of two apart:
00 01 11
17
How many bit errors can we detect?
• Suppose the minimum Hamming distance between any
pair of codewords is dmin
• Then, we can detect at most dmin− 1 bit errors
– Will land in space between codewords, as we just saw
2 bit errors
dmin = 3
– Receiver will flag message as “Error detected”
18
Decoding error detecting codes
• The receiver decodes in a two-step process:
1. Map received bits à codeword
• Decoding rule: Consider all codewords
– Choose the one that exactly matches the
received bits
– Return “error detected” if none match
2. Map codeword à source bits and “error detected”
• Use the reverse map of the sender
19
A simple error-correcting code
• Let’s look at a three-repetition code
• If no errors, it works like the two-repetition code:
Sender: Network: Receiver:
0 ➜ 000 000 ➜ 0
001 001
010 010
100 100
011 011
101 101
110 110
1 ➜ 111 111 ➜ 1
20
Correcting one bit error
• Receiver chooses the closest codeword (measured by
Hamming distance) to the received bits
– A decision boundary exists halfway between codewords
Sender: Network: Receiver:
0 ➜ 000 000 ➜ 0
001 001
Fix error
010 010
100 100
Decision boundary
011 011
101 101
Fix error
110 110
1 ➜ 111 111 ➜ 1
21
Decoding error correcting codes
• The receiver decodes in a two-step process:
1. Map received bits à codeword
• Decoding rule: Consider all codewords
– Choose one with the minimum Hamming
distance to the received bits
2. Map codeword à source bits
• Use the reverse map of the sender
22
How many bit errors can we correct?
• Suppose there is at least dmin Hamming distance between
any two codewords Round
down
• Then, we can correct at most " dmin −1$ bit flips
"# 2 $%
– This many bit flips can’t move received bits closer to
another codeword, across the decision boundary:
2 bit errors
dmin = 5
Decision boundary
23
Code rate
• Suppose codewords of length n, messages length k (k < n)
• The code rate R = k/n is a fraction between 0 and 1
• So, we have a tradeoff:
– High-rate codes (R approaching one) correct fewer errors,
but add less overhead
– Low-rate codes (R close to zero) correct more errors, but
add more overhead
24
Today
1. Error control codes
– Encoding and decoding fundamentals
– Measuring a code’s error correcting power
– Measuring a code’s overhead
– Practical error control codes
• Parity check, Hamming block code
2. Error detection codes
– Cyclic redundancy check (CRC)
25
Parity bit
• Given a message of k data bits D1, D2, …, Dk, append a
parity bit P to make a codeword of length n = k + 1
– P is the exclusive-or of the data bits:
• P = D1 ⊕ D2 ⊕ ⋯ ⊕ Dk
– Pick the parity bit so that total number of 1’s is even
k data bits parity bit
011100 1
26
Checking the parity bit
• Receiver: counts number of 1s in received message
– Even: received message is a codeword
– Odd: isn’t a codeword, and error detected
• But receiver doesn’t know where, so can’t correct
• What about dmin?
– Change one data bit à change parity bit, so dmin = 2
• So parity bit detects 1 bit error, corrects 0
• Can we detect and correct more errors, in general?
27
Two-dimensional parity
• Break up data into multiple rows d1,1 d1,2 d1,3 d1,4 p1
– Start with normal parity within d2,1 d2,2 d2,3 d2,4 p2
each row (pi)
d3,1 d3,2 d3,3 d3,4 p3
– Do the same down columns (qi)
– Add a parity bit r covering row d4,1 d4,2 d4,3 d4,4 p4
parities q1 q2 q3 q4 r
• This example has rate 16/25
pj = dj,1 ⨁ dj,2 ⨁ dj,3 ⨁ dj,4
qj = d1,j ⨁ d2,j ⨁ d3,j ⨁ d4,j
r = p1 ⨁ p2 ⨁ p3 ⨁ p4
28
Two-dimensional parity: Properties
• Flip 1 data bit, 3 parity bits flip d1,1 d1,2 d1,3 d1,4 p1
• Flip 2 data bits, ≥ 2 parity bits flip
• Flip 3 data bits, ≥ 3 parity bits flip d2,1 d2,2 d2,3 d2,4 p2
d3,1 d3,2 d3,3 d3,4 p3
• Therefore, dmin = 4, so
– Can detect ≤ 3 bit errors d4,1 d4,2 d4,3 d4,4 p4
– Can correct single-bit errors (how?)
q1 q2 q3 q4 r
• 2-D parity detects most four-bit errors
29
Block codes
• Let’s fully generalize the parity bit for even more error
detecting/correcting power
• Split message into k-bit blocks, and add n−k parity bits
to the end of each block:
– This is called an (n, k) block code
k bits n−k bits
data bits parity bits
codeword: n bits
30
A higher rate error correcting code?
• What if we repeat the parity bit 3 ? D1D2D3D4 PPP
– P = D1 ⊕ D2 ⊕ D3 ⊕ D4; R = 4/7
– Flip one data bit, all parity bits flip. So dmin = 4?
• No! Flip another data bit, all parity bits flip back to
original values! So dmin = 2
– What happened? Parity checks either all failed or all
succeeded, giving no additional information
31
Hamming (7, 4) code
k = 4 bits n − k = 3 bits D1 D4
P1
D1D2D3D4 P1P2P3
P1 = D 1 ⊕ D3 ⊕ D4 P2 P3
P2 = D 1 ⊕ D 2 ⊕ D 3
P3 = ⊕ D2 ⊕ D3 ⊕ D4 D2
D3:all
32
Hamming (7, 4) code: dmin
• Change one data bit, either: D1 D4
– Two Pi change, or
P1
– Three Pi change
• Change two data bits, either: P2 P3
– Two Pi change, or D2
– One Pi changes D3:all
dmin = 3: Detect 2 bit errors, correct 1 bit error
33
Hamming (7, 4): Correcting One Bit Error
• Infer which corrupt bit from D1 D4
which parity checks fail:
P1
• P1 and P2 fail ⇒ Error in D1
• P2 and P3 fail ⇒ Error in D2 P2 P3
• P1, P2, & P3 fail ⇒ Error in D3 D2
• P1 and P3 fail ⇒ Error in D4 D3:all
• What if just one parity check fails?
Summary: Higher rate (R = 4/7) code correcting one bit error
34
Today
1. Error control codes
2. Error detection codes
– Cyclic redundancy check (CRC)
35
Cyclic redundancy check (CRC)
• Most popular method error detecting code at L2
– Found in Ethernet, Wi-Fi, token ring, many many others
• Often implemented in hardware at the link layer
• Represent k-bit messages as degree k − 1 polynomials
– Each coefficient in the polynomial is either zero or
one, e.g.:
k = 6 bits of message
1 0 1 1 1 0
M(x) = 1x5 + 0x4 + 1x3 + 1x2 + 1x + 0
36
Modulo-2 Arithmetic
• Addition and subtraction are both exclusive-or without
carry or borrow
Multiplication example: Division example:
1101 1101
110 110 101110
110
0000 111
11010 110
110100 011
000
101110 110
110
37
CRC at the sender
• M(x) is our message of length k
– e.g.: M(x) = x5 + x3 + x2 + x (k = 6) 101110
• Sender and receiver agree on a generator polynomial
G(x) of degree g − 1 (i.e., g bits)
– e.g.: G(x) = x3 + 1 (g = 4) 1001
1. Calculate padded message T(x) = M(x)∙xg−1
– i.e., right-pad with g − 1 zeroes
– e.g.: T(x) = M(x)∙x3 = x8 + x6 + x5 + x4
101110 000
38
CRC at the sender
2. Divide padded message T(x) by generator G(x)
– The remainder R(x) is the CRC:
101 011
1001 101110 000
1001
0101
0000
1010
1001
011 0
000 0
11 00
10 01
1 010
1 001
011 R(x) = x + 1
39
CRC at the sender
3. The sender transmits codeword C(x) = T(x) + R(x)
– i.e., the sender transmits the original message with the
CRC bits appended to the end
– Continuing our example, C(x) = x8 + x6 + x5 + x4 + x + 1
101110 011
40
Properties of CRC codewords
• Remember: Remainder [ T(x)/G(x) ] = R(x)
• What happens when we divide C(x) / G(x)?
• C(x) = T(x) + R(x) so remainder is
– Remainder [ T(x)/G(x) ] = R(x), plus
– Remainder [ R(x)/G(x) ] = R(x)
– Recall, addition is exclusive-or operation, so:
• Remainder [ C(x)/G(x) ] = R(x) + R(x) = 0
41
Detecting errors at the receiver
• Divide received message C′(x) by generator G(x)
– If no errors occur, remainder will be zero
1 0 1 0 1 1
1001 101110 011
1001
0101
0000
1010
1001
011 0
000 0
11 01
10 01
1 001
1 001
0 0 0 à no error detected
42
Detecting errors at the receiver
• Divide received message C′(x) by generator G(x)
– If errors occur, remainder may be non-zero
1 0 1 0 1 1
1001 101111 011
1001
0101
0000
1011
1001
010 0
000 0
10 01
10 01
0 0 0 1 à error detected
43
Detecting errors at the receiver
• Divide received message C′(x) by generator G(x)
– If errors occur, remainder may be non-zero
101 011
1001 101111 010
1001
How0many
1 0 1errors can the CRC detect?
0000
☟
1
How do
0 1 1
1 0we0 choose
1 generator G(x)?
010 0
000 0
10 01
10 01
0 0 0 0 à undetected error!
44
Detecting errors with the CRC
• The error polynomial E(x) = C(x) + C′(x) is the difference
between the transmitted and received codeword
– E(x) tells us which bits the channel flipped
• We can write the received message C′(x) in terms of C(x)
and E(x): C′(x) = C(x) + E(x), so:
– Remainder [C′(x) / G(x) ] = Remainder [ E(x) / G(x) ]
• When does an error go undetected?
– When Remainder [ E(x) / G(x) ] = 0
45
Detecting single-bit errors w/CRC
• Suppose a single-bit error in bit-position i: E(x) = xi
– Choose G(x) with ≥ 2 non-zero terms: xg−1 and 1
– Remainder [ xi / (xg−1 + ⋯ + 1) ] ≠ 0, e.g.:
1
1001 001000
1001
1
• Therefore a CRC with this choice of G(x) always detects
single-bit errors in the received message
46
Error detecting properties of the CRC
• The CRC will detect:
All single-bit errors
• Provided G(x) has two non-zero terms
– All burst errors of length ≤ g − 1
• Provided G(x) begins with xg−1 and ends with 1
• Similar argument to previous property
– All double-bit errors
• With conditions on the frame length and choice of
G(x)
– Any odd number of errors
• Provided G(x) contains an even number of non-zero
coefficients
47
Error detecting code: CRC
• Far less overhead than error correcting codes
– Typically 16 to 32 bits on a 1,500 byte (12 Kbit) frame
• Error detecting properties are more complicated
– But in practice, “missed” bit errors are exceedingly rare
48
Friday Precept:
Work on Lab 2
Tuesday Topic:
Convolutional Codes
49