Discrete Memoryless Channel
[Channel diagram: inputs $x_0, x_1, \dots, x_{J-1}$ of alphabet X are mapped to outputs $y_0, y_1, \dots, y_{K-1}$ of alphabet Y through the transition probabilities $P(y_k \mid x_j)$.]
Definition of DMC
A channel with input X and output Y, where Y is a noisy version of X.
Discrete: both alphabets X and Y have finite sizes.
Memoryless: the current output symbol depends only on the current input symbol, with no dependency on previous input symbols.
Discrete Memoryless Channel (cont’)
Channel Matrix (Transition Probability Matrix)
$$\mathbf{P} = \begin{bmatrix}
p(y_0 \mid x_0) & p(y_1 \mid x_0) & \cdots & p(y_{K-1} \mid x_0) \\
p(y_0 \mid x_1) & p(y_1 \mid x_1) & \cdots & p(y_{K-1} \mid x_1) \\
\vdots & \vdots & & \vdots \\
p(y_0 \mid x_{J-1}) & p(y_1 \mid x_{J-1}) & \cdots & p(y_{K-1} \mid x_{J-1})
\end{bmatrix}$$
The size is J by K
$$\sum_{k=0}^{K-1} p(y_k \mid x_j) = 1 \quad \text{for all } j$$
The a priori input probabilities are $p(x_j) = P(X = x_j), \; j = 0, 1, \dots, J-1$.
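As a quick illustration, here is a minimal Python sketch (using NumPy) of how a channel matrix and the a priori input probabilities can be represented and checked; the 2-by-3 channel and the prior values are hypothetical.

```python
import numpy as np

# Hypothetical 2x3 channel matrix P: rows indexed by inputs x_j,
# columns by outputs y_k, entries p(y_k | x_j).
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.3, 0.6]])

# Each row must sum to 1: sum_k p(y_k | x_j) = 1 for all j.
assert np.allclose(P.sum(axis=1), 1.0)

# Hypothetical a priori input probabilities p(x_j).
p_x = np.array([0.4, 0.6])
assert np.isclose(p_x.sum(), 1.0)
```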
Discrete Memoryless Channel (cont’)
Given the a priori probabilities $p(x_j)$ and the channel matrix $\mathbf{P}$, we can find the probabilities of the various output symbols, $p(y_k)$, as follows.
The joint probability distribution of X and Y:
$$p(x_j, y_k) = P(X = x_j, Y = y_k) = P(Y = y_k \mid X = x_j)\, P(X = x_j) = p(y_k \mid x_j)\, p(x_j)$$
The marginal probability distribution of the output Y:
$$p(y_k) = P(Y = y_k) = \sum_{j=0}^{J-1} P(Y = y_k \mid X = x_j)\, P(X = x_j) = \sum_{j=0}^{J-1} p(y_k \mid x_j)\, p(x_j), \quad k = 0, 1, \dots, K-1$$
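Continuing the hypothetical 2-by-3 example, a minimal sketch of computing the joint distribution and the output marginal from the prior and the channel matrix:

```python
import numpy as np

# Same hypothetical channel and prior as before.
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.3, 0.6]])   # P[j, k] = p(y_k | x_j)
p_x = np.array([0.4, 0.6])        # p(x_j)

# Joint distribution: p(x_j, y_k) = p(y_k | x_j) * p(x_j).
p_xy = P * p_x[:, None]

# Marginal of the output: p(y_k) = sum_j p(y_k | x_j) p(x_j).
p_y = p_xy.sum(axis=0)            # equivalently p_x @ P

print(p_y)                        # [0.34 0.26 0.4 ] for these values
assert np.isclose(p_y.sum(), 1.0)
```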
Discrete Memoryless Channel (cont’)
BSC (Binary Symmetric Channel)
[BSC diagram: input $x_0 = 0$ is received as $y_0 = 0$ with probability $1-p$ and as $y_1 = 1$ with probability $p$; input $x_1 = 1$ is received as $y_1 = 1$ with probability $1-p$ and as $y_0 = 0$ with probability $p$.]
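A minimal simulation sketch of the BSC, assuming an illustrative crossover probability p = 0.1 and a 100,000-bit equiprobable input (the function name bsc is hypothetical):

```python
import numpy as np

def bsc(bits, p, rng):
    """Flip each input bit independently with crossover probability p."""
    flips = (rng.random(bits.shape) < p).astype(bits.dtype)
    return bits ^ flips

rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=100_000)      # equiprobable binary input
y = bsc(x, p=0.1, rng=rng)
print((x != y).mean())                    # empirical crossover rate, close to 0.1
```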
Mutual Information
Conditional Entropy
$$H(X \mid Y = y_k) = \sum_{j=0}^{J-1} p(x_j \mid y_k) \log_2\!\left[\frac{1}{p(x_j \mid y_k)}\right]$$
The mean value over the output Y:
$$H(X \mid Y) = \sum_{k=0}^{K-1} H(X \mid Y = y_k)\, p(y_k) = \sum_{k=0}^{K-1} \sum_{j=0}^{J-1} p(x_j, y_k) \log_2\!\left[\frac{1}{p(x_j \mid y_k)}\right]$$
-> H(X|Y) is the conditional entropy (equivocation): the amount of uncertainty remaining about the channel input after the channel output has been observed.
Mutual Information: the uncertainty about the input that is resolved by observing the output.
$$I(X;Y) \equiv H(X) - H(X \mid Y), \quad \text{and} \quad I(X;Y) = \sum_{j=0}^{J-1} \sum_{k=0}^{K-1} p(x_j, y_k) \log_2\!\left[\frac{p(y_k \mid x_j)}{p(y_k)}\right]$$
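A minimal sketch of computing the equivocation H(X|Y) and the mutual information I(X;Y) for the hypothetical 2-by-3 channel used earlier:

```python
import numpy as np

P = np.array([[0.7, 0.2, 0.1],            # P[j, k] = p(y_k | x_j), hypothetical
              [0.1, 0.3, 0.6]])
p_x = np.array([0.4, 0.6])                # a priori input probabilities

p_xy = P * p_x[:, None]                   # joint p(x_j, y_k)
p_y = p_xy.sum(axis=0)                    # output marginal p(y_k)
p_x_given_y = p_xy / p_y                  # p(x_j | y_k); each column sums to 1

H_X = -(p_x * np.log2(p_x)).sum()
# Equivocation H(X|Y) = sum_{j,k} p(x_j, y_k) log2 [1 / p(x_j | y_k)]
H_X_given_Y = -(p_xy * np.log2(p_x_given_y)).sum()
I_XY = H_X - H_X_given_Y                  # mutual information
print(H_X_given_Y, I_XY)
```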
Properties of Mutual Information
(A simple numerical check for a 2-by-2 DMC is sketched after this list.)
Symmetric: $I(X;Y) = I(Y;X)$
Non-negative: $I(X;Y) \ge 0$
$I(X;Y) = H(Y) - H(Y \mid X)$
$I(X;Y) = H(X) + H(Y) - H(X,Y)$, where
$$H(X,Y) = \sum_{j=0}^{J-1} \sum_{k=0}^{K-1} p(x_j, y_k) \log_2\!\left[\frac{1}{p(x_j, y_k)}\right]$$
[Venn diagram: $H(X,Y)$ is partitioned into $H(X \mid Y)$, $I(X;Y)$, and $H(Y \mid X)$; $H(X) = H(X \mid Y) + I(X;Y)$ and $H(Y) = H(Y \mid X) + I(X;Y)$.]
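As noted above, a simple 2-by-2 example: a minimal sketch that checks the symmetry, non-negativity, and entropy identities numerically for a hypothetical asymmetric binary channel with a uniform prior.

```python
import numpy as np

# Hypothetical 2-by-2 DMC and input prior.
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])                       # P[j, k] = p(y_k | x_j)
p_x = np.array([0.5, 0.5])

p_xy = P * p_x[:, None]                          # joint p(x_j, y_k)
p_y = p_xy.sum(axis=0)                           # output marginal p(y_k)

def H(p):
    """Entropy (base 2) of a discrete distribution given as an array."""
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

H_X, H_Y, H_XY = H(p_x), H(p_y), H(p_xy.ravel())
I_def = (p_xy * np.log2(P / p_y)).sum()          # sum p(x,y) log2 p(y|x)/p(y)

assert np.isclose(I_def, H_X - (H_XY - H_Y))     # I = H(X) - H(X|Y)
assert np.isclose(I_def, H_Y - (H_XY - H_X))     # I = H(Y) - H(Y|X)  (symmetry)
assert np.isclose(I_def, H_X + H_Y - H_XY)       # I = H(X) + H(Y) - H(X,Y)
assert I_def >= 0                                # non-negativity
print(I_def)
```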
Channel Capacity
For a DMC with input X, output Y, and transition probabilities $p(y_k \mid x_j)$,
$$I(X;Y) = \sum_{j=0}^{J-1} \sum_{k=0}^{K-1} p(x_j, y_k) \log_2\!\left[\frac{p(y_k \mid x_j)}{p(y_k)}\right]$$
where $p(x_j, y_k) = p(y_k \mid x_j)\, p(x_j)$ and $p(y_k) = \sum_{j=0}^{J-1} p(y_k \mid x_j)\, p(x_j)$.
I(X;Y) depends only on $\{p(x_j),\; j = 0, 1, \dots, J-1\}$ and on the channel.
Since $\{p(x_j)\}$ is independent of the channel, it is possible to maximize I(X;Y) with respect to $\{p(x_j)\}$.
Definition of channel capacity:
$$C = \max_{\{p(x_j)\}} I(X;Y) \quad \text{(bits per channel use)}$$
Ex.) for BSC
$$C = \max_{\{p(x_j)\}} I(X;Y) = I(X;Y)\big|_{p(x_0) = 0.5}$$
$$C = 1 + p \log_2 p + (1-p) \log_2 (1-p) = 1 - H(p)$$
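A minimal sketch that evaluates the closed form C = 1 - H(p) for an assumed crossover probability p = 0.1 and cross-checks it by a brute-force search over the input prior q = P(x0):

```python
import numpy as np

def binary_entropy(p):
    """H(p) = -p log2 p - (1-p) log2 (1-p)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def mutual_info_bsc(p, q):
    """I(X;Y) for a BSC with crossover probability p and input prior P(x0) = q."""
    p_y0 = q * (1 - p) + (1 - q) * p                  # probability of receiving a 0
    return binary_entropy(p_y0) - binary_entropy(p)   # H(Y) - H(Y|X)

p = 0.1                                   # assumed crossover probability
C_closed = 1 - binary_entropy(p)          # C = 1 - H(p)
C_search = max(mutual_info_bsc(p, q) for q in np.linspace(0.0, 1.0, 1001))
print(C_closed, C_search)                 # both ~0.531; the maximum is at q = 0.5
```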
Channel Coding Theorem
For reliable communication, channel encoding and decoding are needed.
"Is there a coding scheme that makes the error probability as small as desired while keeping the code rate from being too small?"
=> Shannon's second theorem (the noisy channel coding theorem)
Let a DMS with alphabet X have entropy H(X) and produce symbols once every $T_s$ seconds, and let a DMC have capacity C and be used once every $T_c$ seconds. Then,
i) if $\dfrac{H(X)}{T_s} \le \dfrac{C}{T_c}$, there exists a coding scheme for which the source output can be transmitted over the channel with an arbitrarily small probability of error;
ii) if $\dfrac{H(X)}{T_s} > \dfrac{C}{T_c}$, it is not possible to transmit with arbitrarily small error.
Ex.) for a BSC with equiprobable source symbols, $p_0 = 0.5$, so that H(X) = 1 bit
The condition for reliable communication becomes
$$\frac{1}{T_s} \le \frac{C}{T_c}$$
Let $r = \dfrac{T_c}{T_s}$; then $r \le C$.
For $r \le C$, there exists a code (with code rate less than or equal to C) capable of achieving an arbitrarily low probability of error.
"The code rate is $r = \dfrac{k}{n}$, where the encoder maps k input bits into n coded bits." A numerical check of this condition is sketched below.
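A minimal sketch of the rate-versus-capacity check, assuming an illustrative crossover probability p = 0.1 and a rate-1/3 repetition code (k = 1, n = 3); the theorem only asserts that some code of rate r <= C works, not this particular one.

```python
import numpy as np

def binary_entropy(p):
    return 0.0 if p in (0.0, 1.0) else -p * np.log2(p) - (1 - p) * np.log2(1 - p)

p = 0.1                        # assumed BSC crossover probability
C = 1 - binary_entropy(p)      # ~0.531 bits per channel use

k, n = 1, 3                    # hypothetical rate-1/3 repetition code
r = k / n
print(f"r = {r:.3f}, C = {C:.3f}, arbitrarily small error achievable: {r <= C}")
```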
Differential Entropy
Differential Entropy
$$h(X) = \int_{-\infty}^{\infty} f_X(x) \log_2\!\left[\frac{1}{f_X(x)}\right] dx$$
where $f_X(x)$ is the p.d.f. of X.
- Extension of entropy to continuous random variables.
- The basis for deriving the channel capacity theorem.
Maximum Differential Entropy for Specified Variance
Find the p.d.f. for which h(X) is maximum, subject to
$$\text{i)} \;\int_{-\infty}^{\infty} f_X(x)\, dx = 1, \qquad \text{ii)} \;\int_{-\infty}^{\infty} (x - \mu)^2 f_X(x)\, dx = \sigma^2 = \text{const}$$
where $\mu$ is the mean and $\sigma^2$ is the variance.
Since $\sigma^2$ is a measure of the average power, this amounts to maximizing the entropy under a constant-power constraint.
Maximum Differential Entropy for Specified Variance
The solution is based on the calculus of variations and the use of Lagrange multipliers:
$$I = \int_{-\infty}^{\infty} \left[ -f_X(x) \log_2 f_X(x) + \lambda_1 f_X(x) + \lambda_2 (x - \mu)^2 f_X(x) \right] dx$$
I should be stationary to obtain the maximum entropy.
- The desired form of $f_X(x)$ is
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$$
i.e., the Gaussian p.d.f.
- The maximum entropy is
$$h(X) = \frac{1}{2} \log_2 (2\pi e \sigma^2)$$
=> This is one reason the Gaussian channel model is so widely used; a numerical comparison is sketched below.
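A minimal numerical comparison, assuming an illustrative standard deviation sigma = 1.5: the Gaussian's differential entropy (1/2) log2(2*pi*e*sigma^2) exceeds that of a uniform density with the same variance.

```python
import numpy as np

sigma = 1.5                                     # assumed common standard deviation

# Gaussian: closed form h = (1/2) log2(2*pi*e*sigma^2)
h_gauss = 0.5 * np.log2(2 * np.pi * np.e * sigma**2)

# Uniform density with the same variance sigma^2 has width a = sigma*sqrt(12)
# and differential entropy h = log2(a).
a = sigma * np.sqrt(12.0)
h_uniform = np.log2(a)

print(h_gauss, h_uniform)                       # the Gaussian value is the larger one
assert h_gauss > h_uniform
```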
Mutual Information for Continuous r.v.
$$I(X;Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(x, y) \log_2\!\left[\frac{f_X(x \mid y)}{f_X(x)}\right] dx\, dy$$
where $f_{X,Y}(x,y)$ is the joint p.d.f. of X and Y, and $f_X(x \mid y)$ is the conditional p.d.f. of X given that $Y = y$.
Maximizing $I(X_k; Y_k)$ over $f_{X_k}(x)$ subject to $E[X_k^2] = P$ leads to the most important channel capacity theorem.
Shannon’s Channel Capacity Theorem
For band-limited, power-limited Gaussian channels:
$$C = B \log_2\!\left(1 + \frac{P}{N}\right) \quad \text{(bits/s)}$$
This is the capacity of a channel of bandwidth B perturbed by additive white Gaussian noise of power spectral density $N_0/2$ and limited in bandwidth to B, where P is the average transmitted power and $N = N_0 B$ is the noise power.
- It is not possible to transmit reliably at a rate higher than C by any means.
- It does not say how to find the coding and modulation that achieve the maximum capacity, but it indicates that, to approach this limit, the transmitted signal should have statistical properties approximating those of Gaussian noise.
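A minimal sketch evaluating the capacity formula for assumed values (a 3 kHz bandwidth and a 30 dB signal-to-noise ratio):

```python
import numpy as np

B = 3.0e3                    # assumed channel bandwidth, Hz
snr_db = 30.0                # assumed signal-to-noise ratio P/(N0*B), in dB
snr = 10 ** (snr_db / 10)    # P/N as a linear ratio

C = B * np.log2(1 + snr)     # capacity in bits per second
print(C)                     # ~29.9 kbit/s for these values
```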
Bandwidth Efficiency Diagram
Define the ideal system as $R_b = C$, with $P = E_b C$, where $E_b$ is the transmitted energy per bit.
Then
$$\frac{C}{B} = \log_2\!\left(1 + \frac{E_b C}{N_0 B}\right)$$
so that
$$\frac{E_b}{N_0} = \frac{2^{C/B} - 1}{C/B}$$
For an infinite-bandwidth channel:
$$\frac{E_b}{N_0} \to \ln 2 = 0.693 \approx -1.6\ \text{dB (Shannon's limit)}, \qquad C_\infty = \lim_{B \to \infty} C = \frac{P}{N_0} \log_2 e$$
[Bandwidth-efficiency diagram: spectral efficiency $C/B$ versus $E_b/N_0$ in dB, showing the regions $R_b < C$ (error-free transmission possible), $R_b = C$ (ideal system boundary), and $R_b > C$ (error-free transmission not possible), with Shannon's limit at $E_b/N_0 = -1.6$ dB.]
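A minimal sketch of the $R_b = C$ boundary: the required Eb/N0 for a few spectral efficiencies C/B, approaching Shannon's limit of about -1.6 dB as C/B -> 0 (the chosen C/B values are illustrative).

```python
import numpy as np

def ebno_required(spectral_efficiency):
    """Eb/N0 (linear) on the Rb = C boundary: (2**(C/B) - 1) / (C/B)."""
    r = spectral_efficiency
    return (2.0 ** r - 1.0) / r

for r in (4.0, 1.0, 0.1, 0.001):                   # illustrative C/B values
    print(f"C/B = {r:6.3f} -> Eb/N0 = {10 * np.log10(ebno_required(r)):6.2f} dB")

# As C/B -> 0 the required Eb/N0 approaches ln 2 = 0.693, i.e. about -1.6 dB.
print(10 * np.log10(np.log(2)))
```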