[go: up one dir, main page]

0% found this document useful (0 votes)
212 views29 pages

Lec5 - LZW Compression

The document provides an example of applying LZW compression to the string "ababababa", showing how the algorithm initializes with an alphabet dictionary then encodes strings as codes are added to the dictionary without transmitting it. Each new string encountered is added to the dictionary and encoded until the full

Uploaded by

Ali Ahmed
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
212 views29 pages

Lec5 - LZW Compression

The document provides an example of applying LZW compression to the string "ababababa", showing how the algorithm initializes with an alphabet dictionary then encodes strings as codes are added to the dictionary without transmitting it. Each new string encountered is added to the dictionary and encoded until the full

Uploaded by

Ali Ahmed
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 29

CS 411 : Data Compression

Lecture 5
Lempel Ziv Welsch (LZW) Compression
and Decompression Method
Text Compression: Lempel Ziv Welsch (LZW) algorithm
 The original Lempel Ziv approach to data compression was first
published in in 1977.
 Terry Welch's refinements to the algorithm were published in
1984. The algorithm is surprisingly simple. In a nutshell, LZW
compression replaces strings of characters with single codes.

Encoder Decoder
As the input is processed, As the code is processed,
develop a dictionary and select reconstruct the dictionary to invert
string of symbols and encode the process of encoding
each string as a token using the
dictionary.
1- LZW coding (Dictionary-based Algorithm)

LZW (Lempel-Ziv-Welch) coding, assigns fixed-length code


words to variable length sequences of source symbols, but
requires no a priori knowledge of the probability of the source
symbols.
LZW is used in:
•Tagged Image file format (TIFF)
•Graphic interchange format (GIF)
•Portable document format (PDF)

LZW was formulated in 1984

11/9/2022 3
Important features of LZW

1. The dictionary is created while the data are being


encoded. So encoding can be done “on the fly”
2. The dictionary is not required to be transmitted. The
dictionary will be built up in the decoding
3. If the dictionary “overflows” then we have to reinitialize
the dictionary and add a bit to each one of the code words.
4. Choosing a large dictionary size avoids overflow, but
spoils compressions

11/9/2022 4
LZW: Compression algorithm:
 Initialize Dictionary with alphabet:

STRING = get input character


WHILE there are still input characters DO
CHAR = get input character
IF STRING+CHAR is in the Dictionary
THEN
STRING = STRING+CHAR
ELSE
output the code for STRING
add STRING+CHAR to Dictionary
STRING=CHAR
ENDIF
END of WHILE
Output code for the STRING
LZW Compression : Example (1):
What would be the encoded version of the following message :
ababababa
If LZW compression algorithm starting with the dictionary containing a,
b, and space were used
a b
0 1

Solution

 Dictionary Message: a b a b a b a b a
 0a Codes :
1b

Initialize Dictionary with alphabet.


LZW Compression : Example (1):

 Dictionary Message: a b a b a b a b a
 0a Codes :
1b

STRING=a STRING = get input character.


LZW Compression : Example (1):

 Dictionary Message: a b a b a b a b a
 0a Codes : 0
1b
 2 ab IF STRING+CHAR=ab is in the
Dictionary: No
THEN
STRING = STRING+CHAR
ELSE
STRING=a output the code for STRING: 0
CHAR =b add STRING+CHAR=ab to
STRING+CHAR=ab Dictionary
STRING=b STRING=CHAR=b
ENDIF
LZW Compression : Example (1):

 Dictionary Message: a b a b a b a b a
 0a Codes : 0 1
1b
 2 ab IF STRING+CHAR=ba is in the
Dictionary: No
 3 ba THEN
STRING = STRING+CHAR
ELSE
STRING=b output the code for STRING: 1
CHAR =a add STRING+CHAR=ba to
STRING+CHAR=ba Dictionary
STRING=a STRING=CHAR=a
ENDIF
LZW Compression : Example (1):

 Dictionary Message: a b ab a b a b a
 0a Codes : 0 1
1b
 2 ab * IF STRING+CHAR=ab is in the Dictionary:
Yes
 3 ba THEN
STRING = STRING+CHAR=ab
ELSE
STRING=a output the code for STRING
CHAR =b add STRING+CHAR to Dictionary
STRING+CHAR=ab STRING=CHAR
STRING=ab ENDIF
LZW Compression : Example (1):

 Dictionary Message: a b ab a b a b a
 0a Codes : 0 1 2
1b
IF STRING+CHAR=aba is in the
 2 ab Dictionary: No
THEN
 3 ba STRING = STRING+CHAR
ELSE
 4 aba output the code for STRING: 2
STRING=ab add STRING+CHAR=aba to
CHAR =a Dictionary
STRING+CHAR=aba STRING=CHAR=a
STRING=a ENDIF
LZW Compression : Example (1):

 Dictionary Message: a b ab ab a b a
 0a Codes : 0 1 2
1b
 2 ab * IF STRING+CHAR=ab is in the
Dictionary: Yes
 3 ba THEN
STRING = STRING+CHAR: ab
 4 aba ELSE
output the code for STRING
STRING=a add STRING+CHAR to
CHAR =b Dictionary
STRING+CHAR=ab STRING=CHAR
STRING=ab ENDIF
LZW Compression : Example (1):

 Dictionary Message: a b ab aba b a


 0a Codes : 0 1 2
1b
 2 ab IF STRING+CHAR=aba is in the
Dictionary: Yes
 3 ba
THEN
 4 aba * STRING = STRING+CHAR: aba
ELSE
STRING=ab output the code for STRING
CHAR =a add STRING+CHAR to Dictionary
STRING+CHAR=aba STRING=CHAR
STRING=aba ENDIF
LZW Compression : Example (1):

 Dictionary Message: a b ab aba b a


 0a Codes : 0 1 2 4
1b
 2 ab
IF STRING+CHAR=abab is in the
 3 ba Dictionary: No
THEN
 4 aba STRING = STRING+CHAR
ELSE
 5 abab output the code for STRING: 4
add STRING+CHAR=abab to
STRING=aba Dictionary
CHAR =b STRING=CHAR= b
STRING+CHAR=abab ENDIF
STRING=b
LZW Compression : Example (1):

 Dictionary Message: a b ab aba ba


 0a Codes : 0 1 2 4
1b
 2 ab IF STRING+CHAR=ba is in the Dictionary:
 3 ba Yes
THEN
 4 aba STRING = STRING+CHAR: ba
ELSE
 5 abab output the code for STRING
STRING=b add STRING+CHAR to Dictionary
CHAR =a STRING=CHAR
STRING+CHAR=ba ENDIF
STRING=ba
LZW Compression: Example (1):

 Dictionary Message: a b ab aba ba


 0a Codes : 0 1 2 4 3
1b
 2 ab
 3 ba *
 4 aba
 5 abab

STRING=b
CHAR =a
Output code for the STRING: 3
STRING+CHAR=ba
LZW Compression : Example (2)
What would be the encoded version of the following message :
aab bba aab aab bba
If LZW compression algorithm starting with the dictionary containing a,
b, and space were used
a b space
1 2 3

Solution
1) word aab not exist in the dictionary  add aab in the dictionary with code : 4
and output codes of aab and space as 1123
a b space aab
1 2 3 4
1) word bba not exist in the dictionary  add bba in the dictionary with code : 5
and output codes of bba and space as 2213
a b space aab bba
1 2 3 4 5
1) word aab exist in the dictionary  output codes of aab and space as 43
2) word aab exist in the dictionary  output codes of aab and space as 43
3) word bba exist in the dictionary  output codes of bba and space as 53
Then the compressed text codes are:
1123221343435
LZW Compression : Example (3)
What would be the encoded version of the following message :
aba bba baaab aba baaab baaab
If LZW compression algorithm starting with the dictionary containing a, b, and
space were used
a b space
1 2 3

Solution
 word aba not exist in the dictionary  add aba in the dictionary with code : 4 and output codes of aab
and space as 1213
a b space aba
1 2 3 4
 word bba not exist in the dictionary  add bba in the dictionary with code : 5 and output codes of bba
and space as 2213
a b space aba bba
1 2 3 4 5
 word baaab not exist in the dictionary  add baaab in the dictionary with code : 6 and output codes of
baaab and space as 211123
a b space aba bba baaab
1 2 3 4 5 6
 word aba exist in the dictionary  output codes of aba and space as 43
 word baaab exist in the dictionary  output codes of baaab and space as 63
 word baaab exist in the dictionary  output codes of baaab as 6
Then the compressed text codes are:
12132213 211123 43636
LZW Compression : Example (4)

• AB not exist in the dictionary  add AB in the dictionary with code 128 and output code of A : 65
• BA not exist in the dictionary  add BA in the dictionary with code 129 and output code of B : 66
• AA not exist in the dictionary  add AA in the dictionary with code 130 and output code of A : 65
• ABA not exist in the dictionary  add ABA in the dictionary with code 131 and output code of AB : 128
• ABB not exist in the dictionary  add ABB in the dictionary with code 132 and output code of AB : 128
• BAA not exist in the dictionary  add BAA in the dictionary with code 133 and output code of AB : 129
• ABAA not exist in the dictionary  add ABAA in the dictionary with code 134 and output code of ABA :
131
• ABAAA not exist in the dictionary  add ABAAA in the dictionary with code 135 and output code of
ABAA : 134
• AAB not exist in the dictionary  add AAB in the dictionary with code 136 and output code of AA : 130
• BAB not exist in the dictionary  add BAB in the dictionary with code 137 and output code of BA : 1329
• BB not exist in the dictionary  add BB in the dictionary with code 138 and output code of B : 66
• BBB not exist in the dictionary  add BBB in the dictionary with code 139 and output code of BB : 138
• BBBB Bnot exist in the dictionary  add BBBB in the dictionary with code 140 and output code of BBB
: 139
• BB exist in the dictionary  output code of BB : 138
2- LZW Decompression Algorithm
Initialize Dictionary
Input code c
Decode code c (index) to w
Output decoded string w
Put w? in Dictionary
REPEAT
a) Input code c
Decode the 1st symbol s1 of the code c
Complete the previous Dictionary entry with s1
b) Finish decoding the remainder of the code c
Output decoded string w
Put put w? in Dictionary
UNTIL no more codes
LZW Decompression : Example (1)
Consider the below code and its predefined dictionary.
0 2 13 1 1
Code 0 1

Key P Q

Here they go by using LZW decompression method...

It is noted that the first code will always defined in the dictionary. Thus,
we will get
P 2 13 1 1 - Since this is the first code that we decompress. We will just
left it without adding any new code to dictionary.

Code 0 1

Key P Q
LZW Decompression : Example (1)
P 2 13 1 1 - Since 2 is not found in dictionary. We will replace 2 with text(0)fc(0) which is
equivalent to PP. New code added into dictionary based on key text(0)fc(0) which is
equivalent to PP.
PPP 1 3 1 1
Code 0 1 2

Key P Q PP

PPP 1 3 1 1 - Since 1 is found in dictionary. We will replace 1 with text(1) which is


equivalent to Q. New code added into dictionary based on key text(2)fc(1) which is
equivalent to PPQ.
PPPQ 3 1 1

Code 0 1 2 3

Key P Q PP PPQ
LZW Decompression : Example (1)
PPPQ 3 1 1 - Since 3 is found in dictionary. We will replace 3 with text(3)which is
equivalent to PPQ. New code added into dictionary based on key text(1)fc(3) which
is equivalent to QP.
PPPQPPQ 1 1
Code 0 1 2 3 4

Key P Q PP PPQ QP

PPPQPPQ 1 1 - Since 1 is found in dictionary. We will replace 1 with text(1) which is


equivalent to Q. New code added into dictionary based on key text(3)fc(1) which is
equivalent to PPQQ.
PPPQPPQQ 1

Code 0 1 2 3 4 5

Key P Q PP PPQ QP PPQQ


LZW Decompression : Example (1)

PPPQPPQQ 1 - Since 1 is found in dictionary. We will replace 1with text(1)which


is equivalent to Q. New code added into dictionary based on key with text(1)fc(1)
which is equivalent to QQ.
PPPQPPQQQ

Code 0 1 2 3 4 5 6

Key P Q PP PPQ QP PPQQ QQ


LZW Decompression : Example (2)

Input Code T Dictionary Items Output Text

012233588 a 0 1 a
a b
012233588 b 0 1 2 ab
a b ab
012233588 ab 0 1 2 3 abab
a b ab ba

012233588 ab 0 1 2 3 4 ababab

a b ab ba aba
012233588 ba 0 1 2 3 4 5 abababba
a b ab ba aba abb
LZW Decompression : Example (2)

Input Code T Dictionary Items Output


Text
012233588 ba 0 1 2 3 4 5 6 abababba
ba
a b ab ba aba abb bab
012233588 abb 0 1 2 3 4 5 6 7 abababba
baabb
a b ab ba aba abb bab baa
012233588 abba 0 1 2 3 4 5 6 7 8 abababba
baabbabb
a b ab ba aba abb bab baa abba a
012233588 abba 0 1 2 3 4 5 6 7 8 9 abababba
baabbabb
a b ab ba aba abb bab baa abba abbaa aabba
LZW Decompression : Example (3)
the following message was compressed using LZW compression algorithm starting
with a dictionary whose first, second, and third entries are b : 1, c: 2 , and space : 3.
What is the decompressed message? 2123113431213536
Solution
 2123113431213536
Output messege: cbc space , add cbc in the dictionary
b c space cbc
1 2 3 4
 2123113431213536
Output messege: bb space , add bb in the dictionary
b c space cbc bb
1 2 3 4 5
 2123113431213536
Output code : cbc space ,
b c space cbc bb
1 2 3 4 5
 2123113431213536
Output code : bcb space , add bcb in the dictionary
b c space cbc bb bcb
1 2 3 4 5 6
 2123113431213536
Output code : bb space ,
b c space cbc bb bcb
1 2 3 4 5 6
 2123113431213536
Output code : bcb space ,
b c space cbc bb bcb
1 2 3 4 5 6
Then the compressed text codes are:
cbc bb cbc bcb bb bcb
LZW Decompression : Example (4)
65 A
65 66 65 128 128 129 131 134 130 129 66 130 139 138
A B A AB AB AB BA ABA ABAA AA BA B BB BBB BB 66 B
..
65 Exist in dictionary  output : A
66 Exist in dictionary  output : B , add (previous + first character) AB in dictionary 128 AB
with code 128 129 BA
65 Exist in dictionary  output : A , add BA in dictionary with code 129
128 Exist in dictionary  output : A B , add AA in dictionary with code 130 130 AA
128 Exist in dictionary  output : A B , add ABA in dictionary with code 131 131 ABA
129Exist in dictionary  output : BA , add ABB in dictionary with code 132
131Exist in dictionary  output : A BA , add BAA in dictionary with code 133 132 ABB
134 NOT Exist in dictionary  output : A BA? , first character 133 BAA
130 exist in dictionary  output : AA SO, the output of 134 is ABAA, , add ABAA in
dictionary with code 134, add ABAAA in dictionary with code 135 134 ABA
129 Exist in dictionary  output : BA , add AAB in dictionary with code 136 A
66 Exist in dictionary  output : B , add BAB in dictionary with code 137.
139 NOT Exist in dictionary  output : BB? , then output of 139 is BBB,
135 ABA
138 NOT Exist in dictionary  output : B? , then output of 138 is BB AA
136 AAB
.
. 137 BAB
. 138 BB
The decoded message is
139 BBB
A B AA B A B B AA B AA B AAAA B A B B B B B B B B
LZW advantages/disadvantages
Advantages:
simple, fast and good compression
Extremely effective when there are repeated patterns
in the data that are widely spread
dynamic codeword table built for each file
decompression recreates the codeword table so it
does not need to be passed
disadvantages
not the optimum compression ratio
actual compression hard to predict
Create entries in the dictionary that may never be
used

You might also like