Huffman Code

The document discusses Huffman coding, a method for data compression that assigns variable-length binary codes to source symbols based on their probabilities. It outlines the two-step algorithm for creating Huffman codes, necessary conditions for optimality, and the complexity of the algorithm. Additionally, it covers the pros and cons of Huffman coding, including its efficiency and challenges with varying symbol probabilities.

Uploaded by

Sk Fayad

Lecture 6:

Huffman Code

Thinh Nguyen
Oregon State University
Review
Coding: Assigning binary codewords to
(blocks of) source symbols.

Variable-length codes (VLC)

Tree codes (prefix codes) are instantaneous.

Example of VLC
Creating a Code: The Data Compression Problem
Assume a source with an alphabet A and known
symbol probabilities $\{p_i\}$.

Goal: Choose the codeword lengths $l_i$ so as to minimize
the bitrate, i.e., the average number of bits per
symbol, $\sum_i l_i p_i$.

Trivial solution: $l_i = 0$ for all $i$.

Restriction: We want a decodable code, so
$\sum_i 2^{-l_i} \le 1$ (Kraft inequality) must hold.

Solution (at least in theory): $l_i = -\log_2 p_i$
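
As a small illustration of the Kraft inequality and the theoretical lengths, the following sketch checks both for one dyadic source. The function names and the probability values are assumptions made for this example.

```python
import math

def kraft_sum(lengths):
    """Return the sum of 2^(-l) over the given codeword lengths."""
    return sum(2.0 ** -l for l in lengths)

def ideal_lengths(probs):
    """Return the theoretical lengths -log2(p); in general not integers."""
    return [-math.log2(p) for p in probs]

# Assumed example source: every probability is a power of 1/2,
# so the theoretical lengths happen to be whole numbers.
probs = [0.5, 0.25, 0.125, 0.125]
lengths = ideal_lengths(probs)

print(lengths)             # theoretical codeword lengths
print(kraft_sum(lengths))  # these lengths meet the Kraft bound with equality

# Lengths (1, 1, 2) violate the inequality, so no prefix code can have them.
print(kraft_sum([1, 1, 2]) <= 1)
```

When the probabilities are not powers of 1/2, the values of -log2(p) are fractional and cannot be used directly as codeword lengths, which is why a construction algorithm is needed in practice.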

In practice…
Use some nice algorithm to find the codes

Huffman coding
Tunstall coding
Golomb coding
Huffman Average Code Length
Input: Probabilities $p_1, p_2, \ldots, p_m$ for symbols $a_1, a_2, \ldots, a_m$,
respectively.

Output: A tree that minimizes the average number of bits
(bit rate) needed to code a symbol. That is, it minimizes

$$\bar{l} = \sum_{i=1}^{m} p_i l_i$$

where $l_i$ is the length of the codeword for $a_i$.

Huffman Coding
Two-step algorithm:
1. Iterate:
– Merge the least probable symbols.
– Sort.
2. Assign bits.

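Symbol   Original        After 1st merge   After 2nd merge
a        0.5   (0)       0.5  (0)          0.5 (0)
b        0.25  (10)      0.25 (10)         0.5 (1)
c        0.125 (110)     0.25 (11)
d        0.125 (111)

Merge and sort from left to right; then assign bits from right to left
(codewords in parentheses) to get the code.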
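
The merge-and-assign procedure above can be sketched in Python roughly as follows. This is a minimal illustration, not the lecture's implementation: the function name `huffman_code`, the use of a heap in place of an explicit sort, and the tie-breaking counter are all assumptions of this sketch.

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Return {symbol: bitstring} for a dict {symbol: probability}."""
    if len(probs) == 1:
        # A single symbol still needs one bit to be transmitted.
        return {sym: "0" for sym in probs}
    tiebreak = count()  # unique counter so equal probabilities never compare dicts
    # Each heap entry: (probability, tiebreak, {symbol: partial codeword})
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least probable groups.
        p0, _, group0 = heapq.heappop(heap)
        p1, _, group1 = heapq.heappop(heap)
        # Assign bits: prepend 0 to one group and 1 to the other.
        merged = {s: "0" + c for s, c in group0.items()}
        merged.update({s: "1" + c for s, c in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

code = huffman_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
print(code)
```

With this tie-breaking rule the sketch reproduces the codewords of the example (a: 0, b: 10, c: 110, d: 111). Other tie-breaking rules can yield different codewords with the same average length. Prepending bits to whole groups keeps the code short but costs extra time per merge; an implementation that builds tree nodes avoids that.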
More Examples of Huffman Code
Average Huffman Code Length
Optimality of A Prefix Code
Necessary conditions for an optimal variable-length binary
code:

1. Given any two letters aj and ak, if P(aj) >= P(ak), then lj <= lk,
where lj is the length of the codeword for aj.

2. The two least probable letters have codewords with the same
maximum length lm.
3. In the tree corresponding to the optimum code, there must be
two branches stemming from each intermediate node.

4. Suppose we change an intermediate node into a leaf node by

combining all the leaves descending from it into a composite
word of a reduced alphabet. Then if the original tree was
optimal for the original alphabet, the reduced tree is optimal for
the reduced alphabet.
Condition 1: If P(aj) >= P(ak), then lj <= lk, where lj is the length
of the codeword for aj.

Easy to see why?

Proof by contradiction:

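Suppose a code X is optimal with P(aj) > P(ak), but lj > lk.

By simply exchanging the codewords of aj and ak, we get a new code Y whose
average length ∑ li pi is smaller than that of code X.
Hence a contradiction is reached, and the condition must hold.
(If P(aj) = P(ak), the exchange leaves the average length unchanged, so an
optimal code with lj <= lk still exists.)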
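
The exchange step can be written out explicitly. Writing $p_j = P(a_j)$ and $p_k = P(a_k)$ (shorthand introduced here), only two terms of the average length change when the codewords are swapped:

$$\bar{l}_X - \bar{l}_Y = (p_j l_j + p_k l_k) - (p_j l_k + p_k l_j) = (p_j - p_k)(l_j - l_k)$$

With $p_j \ge p_k$ and $l_j > l_k$ the right-hand side is non-negative, and it is strictly positive whenever $p_j > p_k$, so code Y is then strictly shorter on average than code X.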
Condition 2: The two least probable letters have codewords with
the same maximum length lm.

Easy to see why?

Proof by contradiction:
Suppose we have an optimal code X in which the two codewords with the lowest
probabilities are ci and cj, and ci is longer than cj by k bits.

Because this is a prefix code, cj cannot be a prefix of ci. So we
can drop the last k bits of ci, and the result still differs from cj.

Dropping the last k bits of ci also keeps the code decodable. By condition 1,
no other codeword is longer than cj, because ci and cj belong to the least
probable letters. Hence the shortened ci cannot be a proper prefix of any
other codeword, and no other codeword was a prefix of ci to begin with.
By dropping the k bits of ci, we create a new code Y with a shorter
average length, hence a contradiction is reached.
Condition 3: In the tree corresponding to the optimum code,
there must be two branches stemming from each intermediate
node.

Easy to see why?

If there were any intermediate node with only one branch

coming from that node, we could remove it without affecting the
decodability of the code while reducing its average length.

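[Figure: two code trees. Left: a tree containing an intermediate node with a
single branch, giving a: 000, b: 001, c: 1. Right: the same tree with that
node removed, giving a: 00, b: 01, c: 1.]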
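
The two codes in the figure can be compared numerically. The sketch below checks that both are prefix-free and that removing the single-branch node shortens the average length. The probabilities are assumed for illustration only, since the slide does not give any, and the helper names are hypothetical.

```python
def is_prefix_free(codewords):
    """True if no codeword is a prefix of another.

    After sorting, any prefix relation shows up between neighbours.
    """
    words = sorted(codewords)
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))

def average_length(code, probs):
    """Average number of bits per symbol for a {symbol: codeword} mapping."""
    return sum(probs[s] * len(w) for s, w in code.items())

probs = {"a": 0.25, "b": 0.25, "c": 0.5}     # assumed, not from the slides
before = {"a": "000", "b": "001", "c": "1"}  # tree with a single-branch node
after = {"a": "00", "b": "01", "c": "1"}     # single-branch node removed

print(is_prefix_free(before.values()), is_prefix_free(after.values()))
print(average_length(before, probs), average_length(after, probs))
```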
Condition 4:
Suppose we change an intermediate node into a leaf node by combining
all the leaves descending from it into a composite word of a reduced
alphabet. Then if the original tree was optimal for the original alphabet,
the reduced tree is optimal for the reduced alphabet.

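[Figure: two code trees. Left: the original tree, giving a: 000, b: 001,
c: 01, d: 1. Right: the reduced tree, in which leaves a and b are combined
into the composite letter e, giving e: 00, c: 01, d: 1.]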
Huffman code satisfies all four conditions
Less probable symbols lie deeper in the tree
(condition 1).

The two least probable symbols have codewords of equal length (condition 2).

Every intermediate node has two branches (condition 3).

By construction, the code for the original alphabet is optimal only if the
code for the reduced alphabet is optimal (condition 4).
Optimal Code Length (Huffman Code Length)

$$H(S) \le \bar{l} < H(S) + 1$$

$\bar{l}$: Average length of an optimal code

Entropy of the source:

$$H(S) = -\sum_{i=1}^{m} P(a_i) \log_2 P(a_i)$$
Proof:
Extended Huffman Code
$A = \{a_1, a_2, \ldots, a_m\}$, $A^n = \{\underbrace{a_1 a_1 \ldots a_1}_{n \text{ times}}, a_1 a_1 \ldots a_2, \ldots, a_m a_m \ldots a_m\}$

$m^n$ symbols in the $A^n$ alphabet

$$H(S) \le \bar{l} < H(S) + 1/n$$

$\bar{l}$: Average length of the Huffman code, in bits per source symbol
$H(S)$: Entropy of the source

Proof: page 53 of the book
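
To see the effect of extending the source, the sketch below builds Huffman codes over blocks of n symbols and reports the average length per source symbol. It assumes independent symbols and an example source with probabilities 0.9 and 0.1; the function names and the heap-based construction are choices made for this illustration.

```python
import heapq
import math
from itertools import count, product

def huffman_lengths(probs):
    """Codeword lengths of a Huffman code for a list of probabilities."""
    if len(probs) == 1:
        return [1]
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)
        p1, _, group1 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for i in group0 + group1:
            lengths[i] += 1
        heapq.heappush(heap, (p0 + p1, next(tiebreak), group0 + group1))
    return lengths

def bits_per_symbol(probs, n):
    """Average Huffman code length per source symbol for blocks of n symbols."""
    # Assumes independent symbols, so block probabilities are products.
    block_probs = [math.prod(block) for block in product(probs, repeat=n)]
    lengths = huffman_lengths(block_probs)
    return sum(p * l for p, l in zip(block_probs, lengths)) / n

probs = [0.9, 0.1]  # assumed example source
h = -sum(p * math.log2(p) for p in probs)
for n in (1, 2, 3):
    print(n, len(probs) ** n, bits_per_symbol(probs, n), h)
```

The rate per symbol moves toward the entropy as n grows, while the number of block symbols (second column) grows as $m^n$.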

Huffman Coding: Pros and Cons
+ Fast implementations.

+ Error resilient: resynchronizes in ~ $l^2$ steps.

- The code tree grows exponentially when the source is extended.

- The symbol probabilities are built into the code.

Hard to use Huffman coding for extended sources / large
alphabets or when the symbol probabilities vary over
time.
Huffman Coding of 16-bit CD-quality audio

Filename           Original file size (bytes)   Entropy (bits)   Compressed file size (bytes)   Compression ratio
Mozart symphony    939,862                      12.8             725,420                        1.30
Folk rock (Cohn)   402,442                      13.8             349,300                        1.15

Huffman coding of the Differences

Filename           Original file size (bytes)   Entropy (bits)   Compressed file size (bytes)   Compression ratio
Mozart symphony    939,862                      9.7              569,792                        1.65
Folk rock (Cohn)   402,442                      10.4             261,590                        1.54
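
The gain from coding differences can be illustrated on synthetic data. The sketch below assumes that "differences" means the differences between consecutive samples, and uses a made-up slowly varying signal rather than the audio files in the tables, so it does not reproduce their numbers. All names are hypothetical.

```python
import math
from collections import Counter
from itertools import accumulate

def empirical_entropy(values):
    """First-order empirical entropy in bits per value."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def first_differences(samples):
    """Keep the first sample, then store each sample minus its predecessor."""
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

# Synthetic, slowly varying signal (an assumption, not real audio).
samples = [round(10000 * math.sin(2 * math.pi * t / 2000)) for t in range(8000)]
diffs = first_differences(samples)

print(empirical_entropy(samples))
print(empirical_entropy(diffs))

# The decoder recovers the samples with a running sum of the differences.
print(list(accumulate(diffs)) == samples)
```

Because neighbouring samples are close in value, the differences cluster in a narrow range and have lower entropy than the raw samples, which is what makes them cheaper to Huffman-code.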
Complexity of Huffman Code
O(n log(n)) for n symbols.
There are on the order of n merges, and finding the lowest probabilities
costs O(log(n)) per merge when the candidates are kept in a tree (heap) of
depth log(n).
Notes on Huffman Code
1. Frequencies computed for each input
– Must transmit the Huffman code or frequencies as well
as the compressed input.
– Requires two passes

2. Fixed Huffman tree designed from training data
– Do not have to transmit the Huffman tree because it is
known to the decoder.
– Used in the H.263 video coder

3. Adaptive Huffman code
– One pass
– Huffman tree changes as frequencies change
