Lossless compression in
lossy compression systems
Almost every lossy compression system contains a lossless
compression system
[Block diagram: a lossy compression system consists of Transform → Quantizer → Lossless Encoder → (bitstream) → Lossless Decoder → Dequantizer → Inverse Transform; the lossless encoder/decoder pair forms the embedded lossless compression system]
We discuss the basics of lossless compression first,
then move on to lossy compression
Topics in lossless compression
Binary decision trees and variable length coding
Entropy and bit-rate
Prefix codes, Huffman codes, Golomb codes
Joint entropy, conditional entropy, sources with memory
Fax compression standards
Arithmetic coding
Example: 20 Questions
Alice thinks of an outcome (from a finite set), but does not
disclose her selection.
Bob asks a series of yes/no questions to uniquely
determine the outcome chosen. The goal of the game is to
ask as few questions as possible on average.
Our goal: Design the best strategy for Bob.
Example: 20 Questions (cont.)
Which strategy is better?
[Figure: two alternative question strategies shown as binary decision trees, branches labeled 0 (=no) and 1 (=yes), leaves labeled with the possible outcomes (B, C, D, ...)]
Observation: The collection of questions and answers yields a binary code for each outcome.
Fixed length codes
[Figure: balanced binary code tree whose leaves are the outcomes (B, C, D, E, F, G, ...)]
Average description length for K outcomes: $l_{av} = \lceil \log_2 K \rceil$
Optimum for equally likely outcomes
Verify by modifying tree
Variable length codes
If outcomes are NOT equally probable:
Use shorter descriptions for likely outcomes
Use longer descriptions for less likely outcomes
Intuition:
Optimum balanced code trees, i.e., with equally likely outcomes, can be pruned to yield unbalanced trees with unequal probabilities.
The unbalanced code trees thus obtained are also optimum.
Hence, an outcome of probability p should require about $\log_2 \frac{1}{p}$ bits
Entropy of a random variable
Consider a discrete, finite-alphabet random variable X
Alphabet $\mathcal{X} = \{0, 1, 2, \ldots, K-1\}$
PMF $f_X(x) = P(X = x)$ for each $x \in \mathcal{X}$
Information associated with the event X = x: $h_X(x) = -\log_2 f_X(x)$
Entropy of X is the expected value of that information: $H(X) = E[h_X(X)] = -\sum_{x} f_X(x) \log_2 f_X(x)$
Unit: bits
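As an illustration (not from the original slides), a minimal Python sketch that evaluates this definition for a given PMF; the function name and the example PMF are my own choices:

    import math

    def entropy(pmf):
        """Entropy H(X) = -sum_x f_X(x) * log2 f_X(x), in bits.
        Zero-probability outcomes contribute nothing (0 * log 0 := 0)."""
        return -sum(p * math.log2(p) for p in pmf if p > 0)

    # Example PMF over a 4-symbol alphabet
    print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75 bits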
Information and entropy: properties
Information $h_X(x) \ge 0$
Information hX(x) strictly increases with decreasing
probability fX(x)
Boundedness of entropy: $0 \le H(X) \le \log_2 K$
Left equality if only one outcome can occur
Right equality if all outcomes are equally likely
Very likely and very unlikely events do not substantially change entropy: $-p \log_2 p \to 0$ for $p \to 0$ or $p \to 1$
Example: Binary random variable
$H(X) = -p \log_2 p - (1-p) \log_2 (1-p)$
[Plot: binary entropy function H(X) versus p; H(X) = 0 for the deterministic cases p = 0 and p = 1, maximum H(X) = 1 bit for equally likely outcomes, p = 0.5]
Entropy and bit-rate
Consider an IID random process $\{X_n\}$ (or source) where each sample $X_n$ (or symbol) possesses identical entropy H(X)
H(X) is called the entropy rate of the random process.
Noiseless Source Coding Theorem [Shannon, 1948]
The entropy H(X) is a lower bound for the average word length R of
a decodable variable-length code for the symbols.
Conversely, the average word length R can approach H(X), if
sufficiently large blocks of symbols are encoded jointly.
Redundancy of a code: $R - H(X) \ge 0$
Variable length codes
Given IID random process $\{X_n\}$ with alphabet $\mathcal{X}$ and PMF $f_X(x)$
Task: assign a distinct code word, $c_x$, to each element $x \in \mathcal{X}$, where $c_x$ is a string of $\|c_x\|$ bits, such that each symbol $x_n$ can be determined, even if the code words $c_{x_n}$ are directly concatenated in a bitstream
Codes with the above property are said to be uniquely decodable.
Prefix codes
No code word is a prefix of any other code word
Uniquely decodable, symbol by symbol, in natural order 0, 1, 2, . . . , n, . . .
Example of non-decodable code
Code words: $c_0 = $ "0", $c_1 = $ "01", $c_2 = $ "10", $c_3 = $ "11"
Encode the sequence of source symbols 0, 2, 3, 0, 1
Resulting bit-stream: 0 10 11 0 01
Encode the sequence of source symbols 1, 0, 3, 0, 1
Resulting bit-stream: 01 0 11 0 01
Same bit-stream for different sequences of source symbols:
ambiguous, not uniquely decodable
BTW: Not a prefix code.
Unique decodability: McMillan and Kraft conditions
Necessary condition for unique decodability [McMillan]: $\sum_{x \in \mathcal{X}} 2^{-\|c_x\|} \le 1$
Given a set of code word lengths $\|c_x\|$ satisfying the McMillan condition, a corresponding prefix code always exists [Kraft]
Hence, the McMillan inequality is both necessary and sufficient.
Also known as the Kraft inequality or Kraft-McMillan inequality.
No loss by considering only prefix codes.
Prefix code is not unique.
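For illustration, a small Python sketch (names are my own) that checks the Kraft-McMillan inequality for a set of code word lengths and, when it holds, builds one corresponding prefix code with the canonical construction (assign code words in order of increasing length):

    def kraft_sum(lengths):
        """McMillan/Kraft sum over the code word lengths: sum of 2^(-||c_x||)."""
        return sum(2.0 ** -l for l in lengths)

    def prefix_code_from_lengths(lengths):
        """Build a prefix code for the given lengths (Kraft sum must be <= 1)."""
        assert kraft_sum(lengths) <= 1.0, "no uniquely decodable code exists"
        code, value, prev_len = {}, 0, 0
        for idx, l in sorted(enumerate(lengths), key=lambda t: t[1]):
            value <<= (l - prev_len)                  # descend to depth l in the code tree
            code[idx] = format(value, '0{}b'.format(l))
            value += 1                                # next unused node at this depth
            prev_len = l
        return code

    print(prefix_code_from_lengths([2, 2, 2, 4, 4, 3]))
    # {0: '00', 1: '01', 2: '10', 5: '110', 3: '1110', 4: '1111'}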
Prefix Decoder
[Block diagram: the input buffer feeds a shift register long enough to hold the longest code word; the register contents address a code word LUT (decoded symbol) and a code word length LUT, and the buffer is advanced by $\|c_x\|$ bits]
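As a simplified software analogue of this decoder (dictionary lookup on a growing bit prefix instead of the shift register and LUTs; the code table below is an illustrative example):

    def prefix_decode(bits, codebook):
        """Decode a bit string formed by concatenating prefix-code words.
        codebook maps code word (e.g. '110') -> symbol. Because no code word is
        a prefix of another, a symbol is emitted as soon as the current word matches."""
        symbols, word = [], ''
        for b in bits:
            word += b
            if word in codebook:
                symbols.append(codebook[word])
                word = ''
        assert word == '', "bitstream ended in the middle of a code word"
        return symbols

    codebook = {'00': 'A', '01': 'B', '10': 'C', '110': 'D', '111': 'E'}
    print(prefix_decode('0010110111', codebook))   # ['A', 'C', 'D', 'E']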
Binary trees and prefix codes
Any binary tree can be
converted into a prefix code
by traversing the tree from
root to leaves.
Any prefix code whose code words are the leaves of a full binary tree meets the McMillan condition with equality
[Example: code tree with leaves (code words) 00, 01, 10, 1100, 1101, 111]
$\sum_x 2^{-\|c_x\|} = 3 \cdot 2^{-2} + 2 \cdot 2^{-4} + 2^{-3} = 1$
Binary trees and prefix codes (cont.)
Augmenting binary tree by two
new nodes does not change
McMillan sum.
Pruning binary tree does not
change McMillan sum.
McMillan sum for the simplest binary tree (root with leaves 0 and 1): $2^{-1} + 2^{-1} = 1$
Replacing a leaf at depth $l$ (contribution $2^{-l}$) by an internal node with two leaves at depth $l+1$ contributes $2^{-(l+1)} + 2^{-(l+1)} = 2^{-l}$, so the sum is unchanged.
Instantaneous variable length encoding
without redundancy
A code without redundancy, i.e. $R = H(X)$, requires all individual code word lengths $l_k = -\log_2 f_X(k)$
All probabilities would then have to be binary fractions (negative integer powers of two): $f_X(k) = 2^{-l_k}$
Example: $H(X) = 1.75$ bits, $R = 1.75$ bits, redundancy $= 0$
Huffman Code
Design algorithm for variable length codes proposed by
Huffman (1952) always finds a code with minimum
redundancy.
Obtain code tree as follows:
1 Pick the two symbols with lowest probabilities and
merge them into a new auxiliary symbol.
2 Calculate the probability of the auxiliary symbol.
3 If more than one symbol remains, repeat steps
1 and 2 for the new auxiliary alphabet.
4 Convert the code tree into a prefix code.
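A compact Python sketch of this procedure using a binary heap (for illustration; the function name, tie-breaking counter, and return format are my own choices):

    import heapq
    from itertools import count

    def huffman_code(pmf):
        """Huffman code for a PMF given as {symbol: probability}: repeatedly merge
        the two least probable (auxiliary) symbols, then read code words off the tree."""
        tiebreak = count()                        # avoids comparing subtrees on probability ties
        heap = [(p, next(tiebreak), sym) for sym, p in pmf.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            p1, _, t1 = heapq.heappop(heap)       # two lowest probabilities
            p2, _, t2 = heapq.heappop(heap)
            heapq.heappush(heap, (p1 + p2, next(tiebreak), (t1, t2)))
        code = {}
        def assign(node, word):
            if isinstance(node, tuple):           # internal node: branch with 0 / 1
                assign(node[0], word + '0')
                assign(node[1], word + '1')
            else:
                code[node] = word or '0'          # single-symbol alphabet edge case
            return code
        return assign(heap[0][2], '')

    print(huffman_code({'a': 0.5, 'b': 0.25, 'c': 0.125, 'd': 0.125}))
    # {'a': '0', 'b': '10', 'c': '110', 'd': '111'} (average length 1.75 bits = H(X))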
Huffman Code - Example
Fixed length coding: $R_{fixed} = 4$ bits/symbol
Huffman code: $R_{Huffman} = 2.77$ bits/symbol
Entropy: $H(X) = 2.69$ bits/symbol
Redundancy of the Huffman code: $0.08$ bits/symbol
Redundancy of prefix code for general distribution
Huffman code redundancy: between 0 and 1 bit/symbol
Theorem: For any distribution $f_X$, a prefix code can be found whose rate R satisfies $H(X) \le R < H(X) + 1$
Proof
Left-hand inequality: Shannon's noiseless coding theorem
Right-hand inequality: choose code word lengths $\|c_x\| = \lceil -\log_2 f_X(x) \rceil$
Resulting rate: $R = \sum_x f_X(x) \lceil -\log_2 f_X(x) \rceil < \sum_x f_X(x)\left(1 - \log_2 f_X(x)\right) = H(X) + 1$
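For illustration, the construction used in this proof can be checked numerically: compute the lengths $\lceil -\log_2 f_X(x) \rceil$, verify the Kraft inequality, and compare the resulting rate with H(X) and H(X) + 1 (the PMF below is an arbitrary example):

    import math

    pmf = [0.4, 0.3, 0.2, 0.1]
    lengths = [math.ceil(-math.log2(p)) for p in pmf]    # [2, 2, 3, 4]
    kraft = sum(2.0 ** -l for l in lengths)              # 0.6875 <= 1, so a prefix code exists
    rate = sum(p * l for p, l in zip(pmf, lengths))      # 2.4 bits/symbol
    H = -sum(p * math.log2(p) for p in pmf)              # about 1.846 bits/symbol
    print(lengths, kraft, rate, H, H + 1)                # H <= rate < H + 1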
Vector Huffman coding
Huffman coding very inefficient for H(X) << 1 bit/symbol
Remedy:
Combine m successive symbols into a new block symbol
Huffman code for the block symbols
Redundancy (per source symbol): $H(X) \le R < H(X) + \frac{1}{m}$
Can also be used to exploit statistical dependencies between
successive symbols
Disadvantage: exponentially growing alphabet size $|\mathcal{X}|^m$
Truncated Huffman Coding
Idea: reduce size of Huffman code table and maximum
Huffman code word length by Huffman-coding only the most
probable symbols.
Combine the J least probable symbols of an alphabet of size K into an auxiliary symbol ESC
Use Huffman code for alphabet consisting of remaining K-J most
probable symbols and the symbol ESC
If the ESC symbol is encoded, append $\lceil \log_2 J \rceil$ bits to specify the exact symbol from the full alphabet
Results in increased average code word length; trade off complexity and efficiency by choosing J
Adaptive Huffman Coding
Use if source statistics are not known ahead of time
Forward adaptation
Measure source statistics at encoder by analyzing entire data
Transmit Huffman code table ahead of compressed bit-stream
JPEG uses this concept (even though default tables are often transmitted)
Backward adaptation
Measure source statistics both at encoder and decoder, using the
same previously decoded data
Regularly generate identical Huffman code tables at transmitter and
receiver
Saves the overhead of forward adaptation, but usually yields poorer code tables, since they are based on past observations
Generally avoided due to computational burden at decoder
Unary coding
Geometric source
Alphabet $\mathcal{X} = \{0, 1, 2, \ldots\}$, PMF $f_X(x) = 2^{-(x+1)}$, $x \ge 0$
Optimal prefix code with redundancy 0 is the unary code (comma code):
$c_0 = $ "1", $c_1 = $ "01", $c_2 = $ "001", $c_3 = $ "0001", ...
Consider a geometric source with faster decay:
PMF $f_X(x) = (1-\theta)\,\theta^x$, with $0 < \theta < \tfrac{1}{2}$; $x \ge 0$
Unary code is still optimum prefix code (i.e., Huffman code), but
not redundancy-free
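A minimal sketch (for illustration) of the unary/comma code with the convention used here, where a '1' terminates each code word:

    def unary_encode(x):
        """Unary (comma) code: x zeros followed by a terminating '1'."""
        return '0' * x + '1'

    def unary_decode_stream(bits):
        """Decode a concatenation of unary code words by counting zeros before each '1'."""
        symbols, run = [], 0
        for b in bits:
            if b == '0':
                run += 1
            else:
                symbols.append(run)
                run = 0
        return symbols

    print([unary_encode(x) for x in range(4)])    # ['1', '01', '001', '0001']
    print(unary_decode_stream('1010010001'))      # [0, 1, 2, 3]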
Golomb coding
For geometric source with slower decay
PMF $f_X(x) = (1-\theta)\,\theta^x$, with $\tfrac{1}{2} < \theta < 1$; $x \ge 0$
Idea: express each x by quotient and remainder with respect to an integer divisor m:
$x = m\,x_q + x_r$, with $x_q = \lfloor x/m \rfloor$ and $x_r = x \bmod m$
Distribution of the new random variables:
$f_{X_q}(x_q) = \sum_{i=0}^{m-1} f_X(m\,x_q + i) = \theta^{\,m x_q} \sum_{i=0}^{m-1} f_X(i)$
$f_{X_r}(x_r) = \dfrac{f_X(x_r)}{\sum_{i=0}^{m-1} f_X(i)} = \dfrac{(1-\theta)\,\theta^{x_r}}{1-\theta^m}$ for $0 \le x_r < m$
X q and X r statistically independent.
Golomb coding (cont.)
Golomb coding
Choose an integer divisor m
Encode $x_q$ optimally by the unary code
Encode $x_r$ by a modified binary code, using code word lengths $k_a = \lfloor \log_2 m \rfloor$ and $k_b = \lceil \log_2 m \rceil$
Concatenate the bits for $x_q$ and $x_r$
In practice, $m = 2^k$ is often used, so $x_r$ can be encoded with constant code word length $\log_2 m = k$
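An illustrative Python sketch of the encoder described above; the truncated binary code used for $x_r$ is one common way to realize the modified binary code with lengths $\lfloor \log_2 m \rfloor$ and $\lceil \log_2 m \rceil$, and the names are my own:

    import math

    def golomb_encode(x, m):
        """Golomb code for x >= 0 with divisor m: unary code for x_q = x // m,
        then a truncated binary code for x_r = x % m."""
        xq, xr = divmod(x, m)
        unary = '0' * xq + '1'                    # unary (comma) code, as on the earlier slide
        kb = max(1, math.ceil(math.log2(m)))      # ceil(log2 m)
        u = (1 << kb) - m                         # number of residues coded with kb - 1 bits
        if xr < u:
            binary = format(xr, '0{}b'.format(kb - 1)) if kb > 1 else ''
        else:
            binary = format(xr + u, '0{}b'.format(kb))
        return unary + binary

    print([golomb_encode(x, 2) for x in range(4)])   # ['10', '11', '010', '011']
    print([golomb_encode(x, 4) for x in range(4)])   # ['100', '101', '110', '111']
    print([golomb_encode(x, 3) for x in range(3)])   # ['10', '110', '111']

For $m = 2^k$ the truncated binary part reduces to a constant k-bit code, as noted above.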
Golomb code examples
Unary code (m = 1), Golomb code with m = 2, and Golomb code with m = 4; each Golomb code word is the unary code for $x_q$ followed by a constant-length code for $x_r$ (1 bit for m = 2, 2 bits for m = 4):

 x   Unary code    Golomb, m = 2   Golomb, m = 4
 0   1             10              100
 1   01            11              101
 2   001           010             110
 3   0001          011             111
 4   00001         0010            0100
 5   000001        0011            0101
 6   0000001       00010           0110
 7   00000001      00011           0111
 8   000000001     000010          00100
 9   ...           000011          00101
10                 0000010         00110
11                 0000011         00111
12                 00000010        000100
13                 00000011        000101
14                 ...             000110
15                                 000111
16                                 ...
Golomb parameter estimation
Expected value for the geometric distribution:
$E[X] = (1-\theta)\sum_{x=0}^{\infty} x\,\theta^x = \dfrac{\theta}{1-\theta}$, hence $\theta = \dfrac{E[X]}{1+E[X]}$
Approximation for $E[X] \gg 1$ leads to $m \approx \dfrac{E[X]}{2}$
Rule for optimum performance of the Golomb code: $m = 2^k$ with $k = \max\left(0, \left\lceil \log_2 \dfrac{E[X]}{2} \right\rceil\right)$
Reasonable setting, even if $E[X] \gg 1$ does not hold
Adaptive Golomb coder (JPEG-LS)
Initialize $A = \bar{x}$ (initial estimate of the mean) and $N = 1$
For each $n = 0, 1, 2, \ldots$
Set $k = \max\left(0, \left\lceil \log_2 \dfrac{A}{2N} \right\rceil\right)$ (pick the best Golomb code, $m = 2^k$)
Code symbol $x_n$ using the Golomb code with parameter $k$
If $N \ge N_{\max}$: set $A \leftarrow A/2$ and $N \leftarrow N/2$ (avoid overflow and slowly forget the past)
Update $A \leftarrow A + x_n$ and $N \leftarrow N + 1$
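A minimal Python sketch of this adaptation loop, using the k-rule reconstructed above together with a Rice code (Golomb code with $m = 2^k$); the initial mean estimate and N_max values are illustrative choices, not prescribed by the slide:

    import math

    def rice_encode(x, k):
        """Golomb code with m = 2^k: unary code for x >> k, then k bits for x mod 2^k."""
        xq, xr = x >> k, x & ((1 << k) - 1)
        return '0' * xq + '1' + (format(xr, '0{}b'.format(k)) if k > 0 else '')

    def adaptive_golomb_encode(symbols, mean_estimate=2, n_max=64):
        """JPEG-LS-style adaptation: A accumulates coded symbol values (seeded with a
        rough mean estimate), N counts them; k is chosen from A and N before each symbol."""
        A, N = mean_estimate, 1
        out = []
        for x in symbols:
            k = max(0, math.ceil(math.log2(A / (2.0 * N))))   # pick the best Golomb code
            out.append(rice_encode(x, k))
            if N >= n_max:                                    # avoid overflow, slowly forget the past
                A, N = (A + 1) // 2, (N + 1) // 2
            A, N = A + x, N + 1                               # update running statistics
        return out

    print(adaptive_golomb_encode([0, 3, 1, 7, 2, 0, 5]))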