ECNG 3001 Communication Systems II
Topic 3 Source Coding
Source coding introduction
A/D Conversion
Sampling Theorem
Encoding
Waveform Coding - Pulse
Modulation (PM) Techniques
PAM, PCM, Companding
Huffman Coding Algorithm
Performance in source coding
Source Coding Introduction
A source encoder (source coder) converts the
output of an information source into a code whose
form may be processed at the transmitter end of a
communications link
ECNG 3001 is concerned with digital source codes
How are analog or digital source outputs prepared
for source coding?
Source Coding: prep for coding
Before source coding can occur,
the source's output must be represented in electronic form
Transduce non-electronic source outputs, e.g. using a microphone,
sensor, etc.
Condition electronic sources (apply impedance matching, filtering, etc.)
the source's output must also be represented in digital
form
Convert analog source information to digital via A/D conversion
Sample
Quantize
Directly encode digital source outputs
May employ sampling and/or quantization
Preparing source types: Analog
Analog source outputs - convert to
electronic, digital form
Analog non-electronic
Information Source & preprocessing:
Source -> Transducer -> A/D Converter -> Source Coder
Analog electronic
Information Source & preprocessing:
Source -> Signal Conditioner -> A/D Converter -> Source Coder
Preparing source types: Digital
Input is already digital - convert to
electronic form
Digital non-electronic
Information Source & preprocessing:
Source -> Transducer -> Source Coder
Digital electronic
Information Source & preprocessing:
Source -> Signal Conditioner -> Source Coder
Preparing source types for
digital communication
Digital source outputs generally require less
processing than analog outputs prior to source coding
Analog inputs must be digitized
What is the digitization process for analog
sources?
Transduce the analog source output if not already
electronic (make electronic)
Analog-to-digital (A/D) conversion (make digital)
Sample
Quantize
Analog to digital conversion
The process of converting a
continuous-time, continuous-amplitude (analog) source output
into a discrete-time, discrete-amplitude (digital) signal
The A/D Process
Obtain the instantaneous
value of the analog input at
regular intervals: sampling
Approximate instantaneous
(still continuous) values to
a finite set of values: quantization
[Figure: original signal before sampling, and the resulting quantized signal]
Sampling
The electronic analog source is tapped at
regular intervals to acquire its instantaneous
value
But why sample?
As the first step in digitizing the source output:
discretizing the source output in time
The original (source) signal can be approximated
at the receiver by interpolating between
successive restored sample values
How often should the source be tapped or
sampled?
Sampling: how often?
ORIGINAL SOURCE OUTPUT: SINE WAVE
a) Sample once per cycle: constant level recovered - completely distorted
b) 1.5 times per cycle: lower-frequency sine wave recovered (less distorted than in a))
c) 2 times per cycle: recovered signal begins to resemble the original sine wave
d) > 2 times per cycle: even better!
Sampling: how often (2)
It is clear that sampling at a rate of at
least 2x the source's output
frequency allows a reasonably
accurate reconstruction of the original
signal
Illustration was for a single frequency.
An analog source may generate
multiple frequencies. Considerations?
Observation and considerations for
complex signals summarised in
Nyquist's sampling theorem
The Nyquist Sampling
Theorem
If a signal has a maximum frequency of B Hz,
it is completely defined by samples which
occur at intervals of 1/(2B) seconds
Interval 1/(2B) seconds aka sampling interval, Ts
In other words,
a signal band-limited to B Hz can be sufficiently
reconstructed from samples taken at a
minimum rate of 2B samples per second
Or,
the sampling frequency must be equal to or
greater than twice the highest frequency
component of the signal
Proof: Nyquist Sampling Theorem
Let the discrete signal x_\delta(t) represent the result of
sampling the source output using impulses at nTs time
instants
x_\delta(t) is the time-domain product of the electronic source
output x(t) and a pulse train \sum_n \delta(t - nT_s), denoted as:

x_\delta(t) = \sum_n x(nT_s)\,\delta(t - nT_s), or
x_\delta(t) = x(t) \sum_n \delta(t - nT_s)

since each sampled pulse is defined only when t =
nTs, yielding the value of x(t) only at each multiple n of Ts
Proof of Nyquist Sampling Theorem (2)
Taking the Fourier Transform of both sides yields:

X_\delta(f) = X(f) * \mathcal{F}\left\{\sum_n \delta(t - nT_s)\right\}, or
X_\delta(f) = X(f) * \frac{1}{T_s}\sum_n \delta\!\left(f - \frac{n}{T_s}\right), thus
X_\delta(f) = \frac{1}{T_s}\sum_n X\!\left(f - \frac{n}{T_s}\right)

In words, the Fourier Transform of the
product of the impulse train and original
signal is a repetition of the FT of the
original signal, with each replica
separated by (1/Ts) Hz
Nyquist sampling: Fourier Analysis
[Figure: original signal and its FT; multiplier pulse train; sampled signal and its FT.
Multiplication in the t domain results in replication of the original FT in the f domain,
each replica shifted by 1/Ts]
Nyquist Sampling: Theorem Result
For a source signal x(t) of bandwidth B, spectrum
replicas are adjacent and do not overlap for Ts =
1/(2B), or equivalently for fs = 2B
fs = 2B: adjacent spectra
For Ts < 1/(2B), or fs > 2B,
replicas are separated from
each other
Spectrum overlap occurs for Ts > 1/(2B), or
equivalently for fs < 2B (time domain change has
inverse effect in freq. domain - Fourier)
Results in distortion of the source output's spectrum:
receiver cannot completely recover the original signal
Sampling Problem: Aliasing
For overlapping replicas of the source spectrum
(under-sampling), distortion is due to:
The loss of the tail of any spectrum copy
beyond |f| > fs/2 Hz (the folding frequency)
The reappearance of this tail inverted or folded
onto an adjacent spectrum replica (spectral
folding or aliasing)
[Figure: overlapping spectrum replicas, with the folding frequency marked]
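The folding effect can be checked numerically. Below is a minimal numpy sketch (an illustration, not from the slides): a 5 kHz tone sampled at 8 kHz produces exactly the same sample values as a 3 kHz tone, i.e. the component beyond fs/2 folds back to |f0 - fs|.

```python
import numpy as np

fs = 8000          # sampling frequency, Hz
f0 = 5000          # tone frequency, Hz (above fs/2 = 4000 Hz, so it aliases)
n = np.arange(64)  # sample indices

# Sampling the 5 kHz tone at 8 kHz...
x = np.cos(2 * np.pi * f0 * n / fs)

# ...produces exactly the same samples as a 3 kHz tone:
# the tail beyond fs/2 folds back to |f0 - fs| = 3000 Hz
alias = np.cos(2 * np.pi * abs(f0 - fs) * n / fs)

print(np.allclose(x, alias))  # True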
Effect of Aliasing
To reconstruct the signal at the receiver:
A single signal spectrum copy is selected by appropriate band
filtering with an effectively ideal low-pass filter of bandwidth fs/2
Hz
However, aliased signal cannot be recovered
completely
Folding results in loss of some frequency components after
filtering
Remaining frequency components sum with their folded versions
from adjacent spectrum copy
How can this be resolved?
Solving the Aliasing problem
In practical systems, sampling is done at a
higher rate than Nyquist
Allows the construction of a realizable filter at the
receiver to extract one spectral replica by creating
a guard band between adjacent source spectrum
replicas. Filter response needs not approach ideal
Size of guard band denoted by BG = fs 2B for
source output bandwidth of B and sampling
frequency of fs
20
Example 1
A bandlimited signal has a bandwidth equal
to 3400Hz. What sampling rate should be
used to guarantee a guard band of 1200Hz?
Let fs = sampling frequency, B = signal
bandwidth, and BG = guard band.
fs = 2B + BG
fs = 2(3400) + 1200 = 8000 Hz or
Samples/sec.
Example 2
A source with bandwidth 4000 Hz is
sampled at the Nyquist rate. Assuming
that the resulting sampled sequence can
be approximately modeled by a DMS with
alphabet A = {-2, -1, 0, 1, 2} and with
corresponding probabilities {1/2, 1/4, 1/8, 1/16, 1/16},
determine the rate at which information is
generated at this source in bits per
second.
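A quick check of this example (a sketch; it simply evaluates the standard entropy formula H(X) = -Σ p_i log2(p_i) and multiplies by the sampling rate):

```python
import math

probs = [1/2, 1/4, 1/8, 1/16, 1/16]        # DMS probabilities
H = -sum(p * math.log2(p) for p in probs)  # source entropy, bits/sample
fs = 2 * 4000                              # Nyquist rate, samples/sec

print(H)       # 1.875 bits per sample
print(H * fs)  # 15000 bits per second
```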
The next step: Quantization
Quantization: the process of
approximating a continuous range of
values with a relatively small set of
integer values or discrete symbols
Source output is digitized in
amplitude in this process; the output
now becomes fully digitized
Quantization: the process
Continuous signal has amplitudes within the range (-mp, mp)
This range is segmented into L equally-spaced levels:
termed linear quantization
A representation point is selected, either at each of the L
levels or at the mid-point of the sub-ranges between each
level, to which the source's output samples are rounded
Thus, the process produces only a finite set of amplitude
values: digitized amplitudes
Determining Quantization Error
Consider a sampled source output quantized to 8 levels
Peak-to-peak input = 2mp volts
Number of quantization levels = L
Spacing between each level:

\Delta V = \frac{2m_p}{L}

Quantization error q for a sampled input value x, after
quantization, is

q = x - \hat{x}

where x̂ is the level to which the actual value of x is
quantized
Max. error: q_max = ΔV/2
Determining Quantization Noise Power
Quantization error q is continuous over any given
quantization interval
x is continuous, and so q can take on any value up to
|ΔV/2| for a given quantization interval
If x takes on units of voltage, then q is also defined in volts
Taking the square of the instantaneous error
voltage q referenced to 1 Ω gives instantaneous
noise power:

q^2 = (x - \hat{x})^2 \ \mathrm{W}

The average value of q^2 within any quantization
interval is the mean of all possible instantaneous
noise power values over the quantization interval
Mean-squared noise power found by weighting noise
power with the likelihood of occurrence for all values of
noise power over the interval
Quantization Noise Power (2)
Thus mean-squared noise power D_i per interval is
the weighted average noise power over an interval:

D_i = \overline{q^2} = \int q^2 f_X(x)\,dx

where f_X(x) is a continuous function of the probability of any x
(and hence difference q) occurring. Simplest case is the
equiprobable (uniform) distribution
For x equally likely to occur anywhere in the
interval ΔV (and not outside), the probability of
any given x occurring is 1/ΔV, hence

D_i = \overline{q^2} = \frac{1}{\Delta V}\int_{-\Delta V/2}^{\Delta V/2} q^2\,dq

(interval defined over the range ΔV)
Quantization noise power (3)
Quantization noise power, N_q, for a signal equally likely to
fall anywhere in a quantization interval 2m_p/L, is therefore:

N_q = D_i = \frac{1}{\Delta V}\int_{-\Delta V/2}^{\Delta V/2} q^2\,dq = \frac{(\Delta V)^2}{12} = \frac{m_p^2}{3L^2}
Performance in Quantization
Quality of quantized signal described quantitatively in
terms of the Signal to Quantization Noise Ratio
(SQNR): ratio of signal power to quantization noise power
Let's consider signal = m(t) (often denoted x(t))
Avg signal power = S_0 = \overline{m^2(t)} (referenced to 1 Ω)
Quantization noise power = N_q
Thus SQNR for equal-sized quantization
intervals, 2m_p/L (uniform quantization):

\mathrm{SQNR} = \frac{S_0}{N_q} = 3L^2\,\frac{\overline{m^2(t)}}{m_p^2}
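Both results can be verified numerically. The sketch below assumes a mid-point uniform quantizer and a test signal uniformly distributed over (-mp, mp) (an illustrative choice, not from the slides); the measured noise power should approach (ΔV)^2/12 and the measured SQNR should approach 3L^2 m̄^2/mp^2.

```python
import numpy as np

rng = np.random.default_rng(0)
mp, L = 1.0, 8                  # peak amplitude and number of levels
dV = 2 * mp / L                 # quantization interval Delta-V = 2 mp / L

x = rng.uniform(-mp, mp, 1_000_000)    # test signal: uniform over (-mp, mp)
xq = dV * (np.floor(x / dV) + 0.5)     # round each sample to its interval mid-point

Nq = np.mean((x - xq) ** 2)            # measured quantization noise power
print(Nq, dV**2 / 12)                  # both ~ (Delta-V)^2 / 12 = mp^2 / (3 L^2)

sqnr = np.mean(x**2) / Nq                      # measured SQNR
print(sqnr, 3 * L**2 * np.mean(x**2) / mp**2)  # both ~ L^2 = 64 for this input
```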
Uniform Quantization:
Observations
SQNR increases with
Increasing signal power
Decreasing quantization noise power
In uniform quantization, quantization noise
power is constant (m_p^2/3L^2) over the signal dynamic
range
SQNR varies with signal power
Uniform quantization not optimum because lower
signal amplitudes have lower SQNR (consider voice)
Increasing SQNR
In uniform quantization schemes, SQNR can be
increased by decreasing quantization noise at lower
amplitudes. How do we do this?
Nq decreased by increasing no. of quantization
levels
To equalize SQNR across signal dynamic range, we
may
Reduce quantization interval for small signal amplitudes:
non-uniform quantization or
Nonlinearly expand signal dynamic range so small
amplitudes see same Nq as large ones: Companding
Non-uniform Quantization
One scheme designed to
compensate for low SQNR of
uniform quantization
Lower m(t) amplitudes divided
into more quantization levels
Lower amplitudes subject to
lower quantization noise
SQNR equalized over
dynamic range of m(t)
Companding
Another scheme designed to compensate for low SQNR of
uniform quantization
Signal passed through a non-linear
element that compresses large
amplitudes
Compressed signal quantized uniformly
At receiver, dynamic range of
received signal restored (expanded)
with inverse characteristic of
compressor
Compressing and Expanding
referred to as Companding
For voice signals, SQNR typically
improved by ~ 24 dB by companding
[Figure: typical compression characteristic]
Common compression characteristics
(a) μ-law characteristic (North America)
(b) A-law characteristic (Europe)
To read graphs:
As the actual input signal m approaches its maximum value (x
axis), apply less gain to it, according to an amount of
compression set by μ or A (y axis).
For low signal levels m, apply more gain according to the
values of μ or A
μ- and A-law companding characteristics
μ-law characteristic (used in North America)
Output y is expressed as:

y = \frac{\ln\!\left(1 + \mu\,\frac{|m|}{m_p}\right)}{\ln(1 + \mu)}, \qquad \frac{|m|}{m_p} \le 1

A-law characteristic (used in Europe):

y = \frac{A\,\frac{|m|}{m_p}}{1 + \ln A}, \qquad 0 \le \frac{|m|}{m_p} \le \frac{1}{A}

y = \frac{1 + \ln\!\left(A\,\frac{|m|}{m_p}\right)}{1 + \ln A}, \qquad \frac{1}{A} \le \frac{|m|}{m_p} \le 1

NOTE: The greater μ or A, the more the
larger amplitudes of m are
compressed in the output y.
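For illustration, a minimal sketch of the μ-law compressor and its inverse (the expander applied at the receiver). The value μ = 255 is an assumption here, chosen as a typical North American setting; it is not specified on these slides.

```python
import numpy as np

def mu_compress(m, mp=1.0, mu=255.0):
    """mu-law compressor: y = ln(1 + mu|m|/mp) / ln(1 + mu), sign preserved."""
    return np.sign(m) * np.log1p(mu * np.abs(m) / mp) / np.log1p(mu)

def mu_expand(y, mp=1.0, mu=255.0):
    """Inverse characteristic applied at the receiver (the expander)."""
    return np.sign(y) * (mp / mu) * np.expm1(np.abs(y) * np.log1p(mu))

m = np.linspace(-1, 1, 11)
print(np.allclose(mu_expand(mu_compress(m)), m))  # True: expander inverts compressor
```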
Improving SQNR with companding
For a μ-law companded source
output, SQNR is given by:

\frac{S_0}{N_q} = \frac{3L^2}{[\ln(1 + \mu)]^2} \quad \text{(with compression)}

\frac{S_0}{N_q} = 3L^2\,\frac{\overline{m^2(t)}}{m_p^2} \quad \text{(without compression)}

S_0/N_q often denoted as S_0/N_0
Summary of Source Prep Performance
Sampling
Sampling frequency must be at least twice the highest
frequency component in the signal:

f_s \ge 2B \ \text{samples/second}

Quantization
Quantization introduces error, q, and noise, N_q,
which increase with the quantization interval, ΔV:

N_q = \overline{q^2} = \frac{1}{\Delta V}\int_{-\Delta V/2}^{\Delta V/2} q^2\,dq

[Figure: original signal before sampling, and the quantized signal]
Summary of Source Prep Performance (2)
Uniform Quantization
For L equally-sized quantization intervals,
the Signal to Noise Ratio is:

\frac{S_0}{N_q} = 3L^2\,\frac{\overline{m^2(t)}}{m_p^2}

Alternatives
Non-uniform quantization
Companding
Encoding
In the encoding process, a sequence
of bits is assigned to different
quantization values
May be independent of the quantization
process or integral to quantization
Encoding Independent of Quantization
Quantized levels mapped to fixed-length binary
code words
Original signal
Samples: continuous values at discrete times (PAM)
Quantization: continuous values rounded to a set of
discrete values
Coding: each discrete value mapped to a code
Mapping quantized values - PCM
Pulse Code Modulation, PCM
[Figure: a sampled signal encoded with 4-bit PCM]
Uniform PCM: the process
The source output (bandlimited to B Hz) is usually
passed through an anti-aliasing filter of the same BW
before sampling to eliminate aliasing of spurious
signals
Sampling is then done at the Nyquist rate or
higher: provides a guard band
Samples are then passed to a uniform quantizer
(even-sized quantization intervals), L levels
Each quantized source output is mapped to a
binary sequence of fixed length v bits, where L =
2^v (encoding)
L generally chosen to be a power of 2
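A minimal sketch of the quantize-and-encode steps (sampling assumed already done). The interval-indexing scheme and the example sample values are illustrative assumptions, not a standard implementation.

```python
import numpy as np

def uniform_pcm(samples, mp, L):
    """Quantize samples to L uniform levels and map each to v = log2(L) bits."""
    v = int(np.log2(L))
    dV = 2 * mp / L
    # index of the quantization interval each sample falls in: 0 .. L-1
    idx = np.clip(np.floor((samples + mp) / dV), 0, L - 1).astype(int)
    return [format(i, f"0{v}b") for i in idx]   # fixed-length v-bit codewords

x = np.array([-0.9, -0.2, 0.05, 0.7])   # example sample values, mp = 1
print(uniform_pcm(x, mp=1.0, L=8))       # ['000', '011', '100', '110']
```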
Source-coded bit rate example
An analog source bandlimited to 4000 Hz is sampled at
the Nyquist rate, then quantized and encoded with 4-level
uniform PCM. Determine the source coding bit rate for this
source. Compare this rate with that obtained in Example 2
above.
Nyquist sampling rate = 2(4000) = 8000
samples/sec
Uniform PCM employs uniform quantization, hence
no. of bits per level is v = log2(4) = 2 bits
Thus the source coded bit rate is 2(8000) = 16000
bits/s
PCM Observations
The output of the PCM process is a bit stream
Each quantized, encoded sample generates v bits
per sample
Sampling carried out at a minimum of 2B samples/sec
v bits per sample generates 2vB bits per second output
2vB bits/sec referred to as source-coded data or bit
rate
Represents the minimum coding rate for uniform PCM
Waveform coding schemes integral to quantization,
such as DPCM and DM, may be used to encode
source output with fewer bits than standard PCM
Increasing PCM Efficiency:
DPCM
Differential PCM, DPCM, codes the difference
between two adjacent samples rather than the
absolute value of the sample itself
Codes fewer bits than PCM
Used with sources which change slowly, i.e.
whose samples are correlated
Sources can be sampled faster to improve
correlation
Encoding fewer bits improves efficiency but
a higher sampling rate may counteract this
improvement (this will be explored further in
Topic 4, The Channel)
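A minimal sketch of the DPCM idea: encode differences, decode by accumulation. Quantization of the differences is omitted for clarity, so this only illustrates the reduced dynamic range of the coded values.

```python
import numpy as np

def dpcm_encode(samples):
    """Code each sample as its difference from the previous one."""
    return np.diff(samples, prepend=0.0)

def dpcm_decode(diffs):
    """Accumulate the differences to restore the original samples."""
    return np.cumsum(diffs)

x = np.array([0.50, 0.52, 0.55, 0.54, 0.56])   # slowly changing (correlated) samples
d = dpcm_encode(x)
print(d)                # after the first sample, differences span a much smaller range
print(np.allclose(dpcm_decode(d), x))   # True
```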
Increasing PCM Efficiency: DM
Delta Modulation: output encoded using a single bit
Output is a bit sequence representing the slope of the analog
input (i.e. the derivative of the input analog signal)
If the staircase function goes up by one step, a 1 is generated;
a 0 is generated otherwise
Staircase always changes in the same direction as the
analog input
Thus, the staircase very closely follows or represents the analog
input
Binary output can be used to regenerate the staircase
function and thus an estimate of the original signal
Original signal approximated by recreating and smoothing the
staircase function with an integrator or a filter
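A minimal sketch of a delta modulator and demodulator with a fixed step size (the step value is an assumption; the smoothing filter that would follow at the receiver is left out).

```python
import numpy as np

def dm_encode(x, step):
    """1-bit delta modulation: emit 1 if the staircase must step up, else 0."""
    bits, stair = [], 0.0
    for sample in x:
        bit = 1 if sample > stair else 0
        stair += step if bit else -step   # staircase tracks the input
        bits.append(bit)
    return bits

def dm_decode(bits, step):
    """Rebuild the staircase; smoothing (e.g. low-pass filtering) would follow."""
    return np.cumsum([step if b else -step for b in bits])

t = np.linspace(0, 1, 100)
x = np.sin(2 * np.pi * t)
bits = dm_encode(x, step=0.1)
stair = dm_decode(bits, step=0.1)
print(np.max(np.abs(x - stair)))  # tracking error; grows if step is badly chosen
```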
DM performance
If the analog source output changes too slowly or too
rapidly, distortion is introduced
If the input changes too slowly, quantization error becomes
significant
If it changes too rapidly, slope overload occurs
Either condition introduces distortion noise
Step size influences slope overload and quantization
error
As step size increases, quantization error increases
As step size decreases, slope overload increases
How can this issue be addressed?
Tracking the input with DM
[Figure: DM staircase tracking the analog input]
Improving DM performance
Achieved by selecting optimum step
size
Use adaptive DM
For rapid changes in input, use large
steps: eliminates slope overload
For slow changes, use smaller steps:
significantly reduces quantization noise
Encoding efficiency
Source output is represented by H(X) bits per
quantized sample
Thus the encoder must map each quantized sample to
a binary code of length v ≥ H(X) bits to maintain
representation accuracy
Binary codes can only be of integer lengths (e.g. 4
bits/sample, 8 bits/sample, etc.)
The value of H(X) is not always an integer (Why?)
Coding process is most efficient for v = H(X), i.e.
H(X) being an integer value
Efficiency is lower if more bits are encoded than
are required to accurately represent the quantized
source output, i.e. for v > H(X)
Improving encoding
efficiency
Efficiency can be improved by compressing the output of the
source. How?
Consider that H(X) is a measure of the average information
content per source output (sample)
For H(X) less than max, not all samples reveal exactly the
same amount of information (H(X) is the average of all
values)
Recall that information content of any given sample depends on
its likelihood of occurrence (higher likelihood of occurrence: less
information conveyed, and vice versa)
Thus no. of bits needed to represent less likely samples > no. of
bits for more likely samples
Hence we can encode with codes of variable lengths, based on
the source's output probability distribution
With this, the average code word length approaches H(X),
reducing the average number of bits generated by the source
coder, and improving efficiency
For H(X) = max, compression possible?
Implementing variable length
coding
Implemented by applying one of
several available algorithms to
quantized outputs
A common algorithm is the Huffman
Source Coding Algorithm (Huffman
Coding)
Huffman Coding
The probability of occurrence of each
quantized sample is used to
construct a matching codeword
This is accomplished using the
Huffman Coding Algorithm
The Huffman Coding Algorithm
1. Sort source outputs in decreasing order of their
probabilities
2. Merge the two least probable outputs into a single output
whose probability is the sum of the corresponding
probabilities
3. No. of remaining outputs = 2? If no, go back to step 2;
otherwise go on to step 4
4. Arbitrarily assign 0 and 1 as codewords for the two
remaining outputs
5. If an output is the result of the merger of two outputs
from a preceding step, append the current codeword with
a 0 and a 1 to obtain the codeword for the preceding
outputs; then repeat step 5. If no output is preceded by
another output in another step, then stop
NOTE: The same convention as used in step 4 should be applied in step
5, i.e. if a 0 were assigned to the upper branch and a 1 to the lower in
step 4, the same should be done for each iteration in step 5.
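For illustration, a compact Python sketch of the algorithm using a priority queue. The tie-breaking and branch-labelling conventions are implementation choices, so the exact codeword assignments (though not their lengths) may differ from a hand construction.

```python
import heapq

def huffman(probs):
    """Build a Huffman code; probs maps symbol -> probability."""
    # heap items: (probability, tie-breaker, {symbol: codeword-so-far})
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    n = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # two least probable outputs...
        p2, _, c2 = heapq.heappop(heap)
        # ...are merged; prepend 0 to one branch and 1 to the other
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, n, merged))
        n += 1
    return heap[0][2]

code = huffman({"a1": 1/2, "a2": 1/4, "a3": 1/8, "a4": 1/16, "a5": 1/16})
print(code)  # {'a1': '0', 'a2': '10', 'a3': '110', 'a4': '1110', 'a5': '1111'}
```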
Huffman Coding Example
An information source generates outputs with the
alphabet {a1, a2, a3, a4, a5} with probabilities {1/2,
1/4, 1/8, 1/16, 1/16}. Design a Huffman code for
this source.

Symbol | Probability | Codeword
a1     | 1/2         | 0
a2     | 1/4         | 10
a3     | 1/8         | 110
a4     | 1/16        | 1110
a5     | 1/16        | 1111

[Figure: Huffman tree - the two least probable outputs are merged at
each step; 0 is assigned to the upper branch and 1 to the lower]
Huffman average codeword length
Average codeword length is the weighted
average of all codeword lengths
Evaluated by weighting individual codeword
lengths by their probabilities of occurrence
and summing the result
Let p(i) = probability of the ith sample occurring (ith codeword)
Let l(i) = the length of the ith codeword
Average codeword length \bar{R} given by:

\bar{R} = \sum_{i=1}^{L} p(i)\,l(i)
Coding efficiency gain with Huffman
The efficiency of the Huffman code is given by:

\eta = \frac{H(X)}{\bar{R}} \le 1

\bar{R} satisfies the inequality

H(X) \le \bar{R} < H(X) + 1

Compared to standard PCM, \bar{R} < v: on
average, fewer bits required to represent the
source without sacrificing accuracy
Huffman efficiency gain
example
For the source in the previous example,
calculate H(X), \bar{R} and the efficiency of the
Huffman code. Determine how many bits are
required to represent the source with
standard PCM, and compare the efficiency of
the Huffman code with that obtained by
coding the source with PCM
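A worked check of this exercise (a sketch): it uses the codeword lengths {1, 2, 3, 4, 4} from the example above, and assumes the PCM comparison means fixed-length codewords of v = ceil(log2 5) = 3 bits per symbol.

```python
import math

p = [1/2, 1/4, 1/8, 1/16, 1/16]
l = [1, 2, 3, 4, 4]                       # Huffman codeword lengths from the example

H = -sum(pi * math.log2(pi) for pi in p)  # 1.875 bits/sample
R = sum(pi * li for pi, li in zip(p, l))  # average codeword length = 1.875 bits
print(H, R, H / R)                        # Huffman efficiency = 1.0 (dyadic probs)

v = math.ceil(math.log2(len(p)))          # fixed-length PCM needs 3 bits/sample
print(v, H / v)                           # PCM efficiency = 0.625
```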
Limitations of Huffman
coding
Huffman coding requires advance knowledge
of the source's probability distribution
Cannot accommodate sources with differing statistics
Thus implementation can become quite complicated
in practice
Other techniques can be employed
E.g. Lempel-Ziv coding
Also employs variable-length encoding
However, does not rely on the statistics of the source
Encodes a string of consecutive binary codewords by parsing
it into phrases that are then coded into variable-length codes
Source Coding Brief Review
Source coding comprises:
Digitization (for analog sources)
Sampling
Quantization
Encoding
Waveform coding
Various techniques for improving coding
efficiency
Basic performance metrics examined
Summary of source coding
performance
For sampling:
A bandlimited analog source output of
maximum frequency B must be sampled
at the Nyquist rate (2B Hz) or higher
Sampling at lower rates is termed
undersampling and prevents accurate
source representation
Effect of aliasing reduced for sampling rates
higher than Nyquist: provides a guard band
Source coding performance
(2)
For Quantization:
The sampled source output is quantized using L
levels
Process introduces quantization noise
Ratio of signal power to quantization noise power
measures the quality of the sampled output SQNR
For uniform quantization, SQNR varies with signal
level: lower levels, lower SQNR, and vice versa
SQNR improved by employing non-uniform
quantization
Companding a type of non-uniform quantization
in which large amplitudes are compressed before
quantization, effectively increasing the SQNR of
lower-amplitude inputs
Source coding performance
(3)
For PCM:
Encoding produces unique fixed-length code
words for each quantized sample
DPCM produces shorter codewords per quantized sample
DM produces 1-bit codewords
DPCM and DM assume high correlation between
successive samples
Source coding data rate defined as a minimum of
2vB bits per second
Coding efficiency can be improved for quantized
samples from sources with non-uniform probability
distributions
Source coding performance
(4)
For improving encoding efficiency:
A minimum of H(X) bits per quantized sample
required to accurately encode the source output
PCM encodes fixed-length blocks of v bits per
quantized sample
Necessary that v ≥ H(X)
Not most efficient way to code for sources with varying
probabilities for each sample
Thus encode with variable-length codewords:
encode each source output according to its
probability of occurrence. Average number of bits
per quantized sample moves closer to H(X).
Average codeword length R < v, improving
efficiency
64
Reading
Mandatory reading:
Proakis, J., Masoud Salehi. Fundamentals of
Communication Systems: Sections 7.1-7.2.1, 7.3, 7.4, 12.2, 12.3-12.3.1
Recommended reading:
Proakis, J., Masoud Salehi. Fundamentals of
Communication Systems: Section 12.3.2
Proakis, J., Masoud Salehi. Communication
Systems Engineering: Section 1.