BY-CHITESH KUMAR
1
Dynamic Huffman Coding
of
"TENNESSEE"
DEFINITION
• Definition: A near-minimal variable-length
character coding that changes based on the
frequency of characters processed. As
characters are processed, frequencies are
updated and codes are changed (or, the
coding tree is modified).Also known
as dynamic Huffman coding.
• EXAMPLE:---
3
"TENNESSEE“--T
➢ Stage 1 (First occurrence of t )
r
/\
0 t(1)
✓ Order: 0,t(1)
where:-
* ‘r’ represents the root
* ‘0’ represents the null node
* ‘t(1)’ denotes the occurrence of T with a frequency of 1
4
"TENNESSEE“-TE
➢Stage 2 (First occurrence of e)
r
/ \
1 t(1)
/ \
0 e(1)
✓Order: 0,e(1),1,t(1)
5
"TENNESSEE“-TEN
➢ Stage 3 (First occurrence of n )***
r
/ \
2 t(1)
/ \
1 e(1)
/ \
0 n(1)
❖Order: 0,n(1),1,e(1),2,t(1) : Misfit
6
Reorder: TEN
r
/ \
t(1) 2
/ \
1 e(1)
/ \
0 n(1)
✓ Order: 0,n(1),1,e(1),t(1),2
✓ WE swapped 2 with t(1)
7
"TENNESSEE“-TENN
• Stage 4 ( Repetition of n ) ***
r
/ \
t(1) 3
/ \
2 e(1)
/ \
0 n(2)
• Order: 0,n(2),2,e(1),t(1),3 : Misfit
8
Reorder: TENN
r
/ \
n(2) 2
/ \
1 e(1)
/ \
0 t(1)
✓ Order: 0,t(1),1,e(1),n(2),2
▪ t(1),n(2) are swapped
9
"TENNESSEE“- TENNE
➢ Stage 5 (Repetition of e )
r
/ \
n(2) 3
/ \
1 e(2)
/ \
0 t(1)
✓ Order: 0,t(1),1,e(2),n(2),3
10
"TENNESSEE“-TENNES
➢ Stage 6 (First occurrence of s)
r
/ \
n(2) 4
/ \
2 e(2)
/ \
1 t(1)
/ \
0 s(1)
• Order: 0,s(1),1,t(1),2,e(2),n(2),4
11
"TENNESSEE“ -TENNESS
➢ Stage 7 (Repetition of s)***
r
/ \
n(2) 5
/ \
3 e(2)
/ \
2 t(1)
/ \
0 s(2)
❖ Order: 0,s(2),2,t(1),3,e(2),n(2),5 : Misfit
12
Reorder: TENNESS
r
/ \
n(2) 5
/ \
3 e(2)
/ \
1 s (2)
/ \
0 t(1)
✓ Order : 0,t(1),1,s(2),3,e(2),n(2),5 , ( this is also a misfit, but we have
no other choise , so left it , encoder will try to correct it in next
transmission of letter)
▪ s(2) and t(1) are swapped
13
"TENNESSEE“-TENNESSE
➢ Stage 8 (Third repetition of e )
r
/ \
n(2) 6
/ \
3 e(3)
/ \
1 s(2)
/ \
0 t(1)
❖ Order : 0,t(1),1,s(2),3,e(3),n(2),6 : Misfit
14
Reorder: TENNESSE
r
/ \
e(3) 5
/ \
n(2) 3
/ \
1 s(2)
/ \
0 t(1)
✓ Order : 1,t(1),1,s(2),n(2),3,e(3),5 ( here error has been corrected)
n(2) and e(3) are swapped (and also n(2) swapped with 3 on same
level)
15
"TENNESSEE“-TENNESSEE
➢ Stage 9 (fourth repetition of e )
r
0 / \1
e(4) 5
0 / \1
n(2) 3
0/ \1
1 s(2)
0/ \1
0 t(1)
✓ Order : 0,t(1),1,s(2), n(2), 3, e(4),5
16
ENCODING
The letters can be encoded as follows:
TENNESSEE
• (4) e : 0
• (2)n : 11
• (2)s : 101
• (1) t : 1001
17
Average Code Length
Average code length = i=0,n (length*frequency)/ i=0,n frequency
= { 1(4) + 2(2) + 2(3) + 1(4) } / (4+2+2+1)
= 18 / 9 = 2
how to find Pi:
Total letter =9, 4/9=P(e)=0.44
18
ENTROPY
Entropy = - i=1,n (pi log2 pi)
= - ( 0.44 * log20.44 + 0.22 * log20.22
+ 0.22 * log20.22 + 0.11 * log20.11 )
= - (0.44 * log0.44 + 2(0.22 * log0.22 + 0.11 * log0.11)
/ log2
= 1.8367
19
Ordinary Huffman Coding
• TENNESSE • ENCODING
9 • E:1
0/ \1 • S : 00
5 e(4) • T : 010
0/ \1
• N : 011
s(2) 3
0/ \1
Average code length = (1*4 + 2*2 +
t(1) n(2) 2*3 + 3*1) / 9 = 1.89
20
SUMMARY
1. The average code length of ordinary Huffman coding seems to be better
than the Dynamic version , in this exercise.
2. But, actually the performance of dynamic coding is better.
3. The problem with static coding is that the tree has to be constructed in
the transmitter and sent to the receiver. The tree may change because
the frequency distribution of the English letters may change in plain text
technical paper, piece of code etc.
4. Since the tree in dynamic coding is constructed on the receiver as well, it
need not be sent.
5. Considering this, Dynamic coding is better.
6. Also, the average code length will improve if the transmitted text is
bigger.
21
WHAT NEXT…???
In next video we shall continue to
“Arithmetic Coding”
THANK YOU
22