Unit IV
Image Compression Fundamentals in Digital Image Processing
© Dr. Dafda
Introduction
What is Image Compression?
• Image compression is the art and science of reducing the amount of data required to represent an image.
• The data required for a two-hour standard-definition (SD) television movie stored as 720×480 pixel arrays with 24 bits per pixel is about 224 GB.
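As a rough check of this figure (assuming the usual 30 frames per second for SD video):

30 frames/s × (720 × 480) pixels/frame × 3 bytes/pixel ≈ 31.1 MB/s
31.1 MB/s × (2 × 60 × 60) s ≈ 2.24 × 10^11 bytes ≈ 224 GB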
Coding, Spatial (Inter-Pixel) and Psychovisual (Irrelevant Information) Redundancy
Principal types of data redundancies

• Coding redundancy: Each piece of information or event is assigned a sequence of code symbols, called a code word. The number of symbols in each code word is its length.
• Spatial and temporal redundancy: Pixels of most 2-D intensity arrays are correlated spatially (i.e., each pixel is similar to or dependent on neighboring pixels), so information is unnecessarily replicated in the representations of the correlated pixels.
• Irrelevant information: Most 2-D intensity arrays contain information that is ignored by the human visual system and/or is extraneous to the intended use of the image. It is redundant in the sense that it is not used.
Huffman Coding in Digital Image Processing
Huffman coding
• The most popular technique for removing coding redundancy is due to Huffman (Huffman [1952]). When coding the symbols of an information source individually, Huffman coding yields the smallest possible number of code symbols per source symbol.
• In terms of the noiseless coding theorem, the resulting code is optimal for a fixed
value of n, subject to the constraint that the source symbols be coded one at a
time.
• At the far left, a hypothetical set of source symbols and their probabilities are
ordered from top to bottom in terms of decreasing probability values. To form
the first source reduction, the bottom two probabilities, 0.06 and 0.04, are
combined to form a "compound symbol" with probability 0.1.
• This compound symbol and its associated probability are placed in the first
source reduction column so that the probabilities of the reduced source are also
ordered from the most to the least probable. This process is then repeated until
a reduced source with two symbols (at the far right) is reached.
• The second step in Huffman's procedure is to code each reduced source,
starting with the smallest source and working back to the original source. The
minimal length binary code for a two-symbol source, of course, is the symbols 0
and 1.
• These symbols are assigned to the two symbols on the right (the assignment is
arbitrary; reversing the order of the 0 and 1 would work just as well). As the
reduced source symbol with probability 0.6 was generated by combining two
symbols in the reduced source to its left, the 0 used to code it is now assigned to
both of these symbols, and a 0 and 1 are arbitrarily appended to each to
distinguish them from each other. This operation is then repeated for each
reduced source until the original source is reached. The final code appears at
the far left.
• The entropy of the source is 2.1435 bits/symbol and the average code length is 2.2 bits/symbol, so the resulting Huffman code efficiency is 2.1435/2.2 ≈ 97.43 % (these figures are verified in the MATLAB sketch after these bullets).
• Huffman's procedure creates the optimal code for a set of symbols and probabilities
subject to the constraint that the symbols be coded one at a time. After the code has
been created, coding and/or decoding is accomplished in a simple lookup table
manner.
• The code itself is an instantaneous uniquely decodable block code. It is called a block
code because each source symbol is mapped into a fixed sequence of code symbols. It
is instantaneous, because each code word in a string of code symbols can be decoded
without referencing succeeding symbols.
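The quoted entropy and efficiency can be checked with a few MATLAB lines; the probabilities and code lengths below are assumed to be those of the standard six-symbol example that the 0.06, 0.04 and 0.1 values above come from:

p   = [0.4 0.3 0.1 0.1 0.06 0.04];   % assumed symbol probabilities
len = [1 2 3 4 5 5];                 % code lengths produced by the reduction
H    = -sum(p .* log2(p));           % source entropy   -> ~2.1435 bits/symbol
Lavg = sum(p .* len);                % average length   -> 2.2 bits/symbol
eff  = 100 * H / Lavg;               % efficiency       -> ~97.43 %
fprintf('H = %.4f, Lavg = %.2f, efficiency = %.2f%%\n', H, Lavg, eff);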
Arithmetic coding
• Arithmetic coding takes in a complete data stream and outputs one specific codeword, which is a floating-point number between 0 and 1 written with as few digits as possible. As the number of symbols in the message increases, the interval used to represent it becomes smaller; the longer the message, the more precision is needed in the output number.
• Huffman coding encodes source symbols one at a time, which may not be efficient, while arithmetic coding encodes whole sequences of source symbols into variable-length codewords.
• Arithmetic coding is a non-block code, which means that it does not generate an individual code for each character.
• In principle, an efficiency of 100 % can be achieved with arithmetic coding, as the average length Lavg approaches the entropy H; it therefore satisfies Shannon's first theorem, which states that the minimum number of coding bits per symbol equals the entropy H. The practical difficulty with arithmetic coding is that long messages cannot be encoded exactly, because arbitrary-precision floating-point libraries/registers are not available.
Differences between Huffman coding and Arithmetic coding

1. Huffman coding is more popular; arithmetic coding is less popular.
2. Huffman coding is not always optimal; arithmetic coding is always optimal.
3. Huffman coding is a simpler technique; arithmetic coding is a complex technique.
4. Huffman coding gives a poorer compression ratio; arithmetic coding gives a very good compression ratio.
5. Huffman compression is fast; arithmetic compression is slow.
6. Huffman coding requires more memory; arithmetic coding requires less memory.
7. Precision is not an important factor for Huffman coding; precision is an important consideration for arithmetic coding.
Steps of Arithmetic coding:
1. Arrange the characters/numbers of the string in ascending order.
2. Divide the numeric range 0 to 1 among the different symbols present in the message.
3. Expand (zoom into) the sub-range corresponding to the first symbol to be coded.
4. Further subdivide this range among the symbols.
5. Repeat the procedure until the termination character is encoded.
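A minimal MATLAB sketch of this interval-narrowing procedure follows; the three-symbol alphabet, its probabilities and the message are illustrative assumptions, and ordinary double precision limits it to short messages:

symbols = ['a' 'b' 'c'];            % assumed source alphabet
p       = [0.5 0.3 0.2];            % assumed symbol probabilities
cumLow  = [0 cumsum(p(1:end-1))];   % lower cumulative bounds: [0 0.5 0.8]
cumHigh = cumsum(p);                % upper cumulative bounds: [0.5 0.8 1.0]
msg = 'acb';                        % message to encode
lo = 0; hi = 1;
for k = 1:numel(msg)
    idx   = find(symbols == msg(k));
    width = hi - lo;
    hi = lo + width * cumHigh(idx); % shrink the interval to the symbol's slot
    lo = lo + width * cumLow(idx);
end
codeword = (lo + hi)/2;             % any number inside [lo, hi) identifies msg
fprintf('Interval: [%.6f, %.6f), codeword: %.6f\n', lo, hi, codeword);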
LZW coding
• LZW uses a code table in which the first 256 entries (0-255) represent the 256 gray levels, while the remaining entries represent patterns of gray levels present in the image. For example, code 256 may represent a sequence of 4 bytes: 231, 123, 124 and 125.
• During decompression, code 256 is translated via the code table to recreate the 4 bytes. The advantage of LZW coding is that while storing or transmitting, we only need the encoded data and not the dictionary.
Steps of LZW encoding:
1. Keep all the symbols in the table, starting from 0 (up to 255 for images). The image is encoded by processing its pixels in a left-to-right, top-to-bottom order.
2. Each successive intensity value is concatenated with a variable called the currently recognized sequence.
3. At each step, the table (dictionary) is searched for the concatenated sequence. If it is found, it becomes the new currently recognized (i.e., located in the dictionary) sequence. If the concatenated sequence is not found, the address of the currently recognized sequence is output as the next encoded value, the concatenated but unrecognized sequence is added to the dictionary, and the currently recognized sequence is initialized to the current pixel value.
4. Continue until the whole sequence is covered.
5. The produced sequence of numbers can later be encoded in binary.
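The steps above condense into a short MATLAB sketch; the eight-pixel stream is an illustrative assumption, and storing each sequence under a character key (intensity + 1) is just a convenient way to index the dictionary:

pixels = uint16([39 39 126 126 39 39 126 126]);   % assumed example pixel stream
dict = containers.Map('KeyType','char','ValueType','double');
for v = 0:255
    dict(char(v+1)) = v;          % seed the dictionary with intensities 0-255
end
nextCode = 256;
current = '';                     % currently recognized sequence
codes = [];
for px = pixels
    sym = char(double(px) + 1);
    candidate = [current sym];    % recognized sequence + next pixel
    if isKey(dict, candidate)
        current = candidate;              % keep growing the recognized sequence
    else
        codes(end+1) = dict(current);     % output code of the recognized sequence
        dict(candidate) = nextCode;       % add the unrecognized sequence
        nextCode = nextCode + 1;
        current = sym;                    % restart from the current pixel
    end
end
codes(end+1) = dict(current);     % flush the last recognized sequence
disp(codes)                       % -> 39 39 126 126 256 258 for this stream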
Run-Length Coding in Digital Image Processing and its implementation in MATLAB
Run-length encoding (RLE)
• RLE is one of the simplest lossless data compression techniques. It reduces interpixel redundancy. It consists of replacing a sequence (run) of identical symbols by a pair containing the symbol and the run length.
• Scan the image horizontally or vertically and, while scanning, replace each group of pixels with the same intensity by a pair (G, L), where G is the intensity and L is the length of the run. This method can also be used for detecting edges and boundaries of an object. It is mostly used for images with a small number of gray levels and is not effective for highly textured images.
str = [5 5 5 5 5 5 4 4 4 3 3 2];                                      % example with several gray levels
str = [1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1];   % binary (bi-level) example
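A minimal MATLAB sketch that turns such a vector into (G, L) pairs (one possible implementation, not the only one):

str = [5 5 5 5 5 5 4 4 4 3 3 2];                % sequence to encode
runEnds = [find(diff(str) ~= 0), numel(str)];   % last index of every run
runLens = diff([0, runEnds]);                   % run length L of every run
runVals = str(runEnds);                         % intensity G of every run
pairs   = [runVals; runLens]                    % -> [5 4 3 2; 6 3 2 1]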
• Run-length coding (RLC) works by counting adjacent pixels with the same gray-level value; this count, called the run length, is then encoded and stored.
• RLC works best for binary (two-valued) images.
• RLC can also work with complex images that have been preprocessed by thresholding to reduce the number of gray levels to two.
• RLC can be implemented in various ways, but the first step is to define the required parameters.
• Horizontal RLC (counting along the rows) or vertical RLC (counting along the columns) can be used.
• This technique is especially successful in compressing bi-level images, since long runs of a single value are rare in ordinary gray-scale images. A solution to this is to decompose the gray-scale image into bit planes and compress every bit plane separately.
Bit-Plane Coding in Digital Image Processing and its implementation in MATLAB
Bit-Plane coding
• Bit-Plane coding is a type of lossless compression algorithm and it is an effective
technique for reducing an image's interpixel redundancy.
• Here the idea is to process the image's bit planes individually.
• The technique, called bit-plane coding, is based on the concept of decomposing
a multilevel (monochrome or color) image into a series of binary images and
compressing each binary image via one of several well-known binary
compression methods.
The gray levels of an m-bit gray-scale image can be represented in the form of the base-2 polynomial

a_{m-1} 2^{m-1} + a_{m-2} 2^{m-2} + … + a_1 2^1 + a_0 2^0

Based on this property, a simple method of decomposing the image into a collection of binary images is to separate the m coefficients of the polynomial into m 1-bit bit planes. The zeroth-order bit plane is generated by collecting the a_0 bits of each pixel, while the (m-1)st-order bit plane contains the a_{m-1} bits or coefficients. In general, each bit plane is numbered from 0 to m-1 and is constructed by setting its pixels equal to the values of the appropriate bits or polynomial coefficients from each pixel in the original image.
The inherent disadvantage of this approach is that small changes in gray level can have a significant impact on the complexity of the bit planes. If a pixel of intensity 127 (01111111) is adjacent to a pixel of intensity 128 (10000000), for instance, every bit plane will contain a corresponding 0 to 1 (or 1 to 0) transition. For example, as the most significant bits of the two binary codes for 127 and 128 are different, bit plane 7 will contain a zero-valued pixel next to a pixel of value 1, creating a 0 to 1 (or 1 to 0) transition at that point.

An alternative decomposition approach (which reduces the effect of small gray-level variations) is to first represent the image by an m-bit Gray code. The m-bit Gray code g_{m-1} … g_2 g_1 g_0 that corresponds to the polynomial above can be computed from

g_i = a_i ⊕ a_{i+1}, for 0 ≤ i ≤ m-2, and g_{m-1} = a_{m-1}

Here, ⊕ denotes the exclusive OR operation. This code has the unique property that successive code words differ in only one bit position. Thus, small changes in gray level are less likely to affect all m bit planes. For instance, when gray levels 127 and 128 are adjacent, only the 7th bit plane will contain a 0 to 1 transition, because the Gray codes that correspond to 127 and 128 are 01000000 and 11000000, respectively.
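A minimal MATLAB sketch of bit-plane decomposition and the Gray-code variant (it assumes an 8-bit grayscale image; cameraman.tif is the Image Processing Toolbox demo image, used here only for illustration):

img = imread('cameraman.tif');            % any 8-bit grayscale image
planes = false([size(img) 8]);
for k = 1:8
    planes(:,:,k) = bitget(img, k);       % bit plane k-1 in the numbering above
end
gray = bitxor(img, bitshift(img, -1));    % m-bit Gray code of every pixel
grayPlanes = false([size(img) 8]);
for k = 1:8
    grayPlanes(:,:,k) = bitget(gray, k);  % Gray-coded bit planes
end
imshow(planes(:,:,8))                     % most significant bit plane (plane 7)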
Lossless Compression and Lossy Compression in Digital Image Processing
DCT (cont'd)

Forward: F(u,v) = α(u) α(v) Σ_{x=0}^{N-1} Σ_{y=0}^{N-1} f(x,y) cos[(2x+1)uπ/2N] cos[(2y+1)vπ/2N]

Inverse: f(x,y) = Σ_{u=0}^{N-1} Σ_{v=0}^{N-1} α(u) α(v) F(u,v) cos[(2x+1)uπ/2N] cos[(2y+1)vπ/2N]

where α(u) = √(1/N) if u = 0 and α(u) = √(2/N) if u > 0 (and similarly for α(v)).

[Figure: images reconstructed after truncating 50% of the transform coefficients; RMS errors 2.32, 1.78 and 1.13; the DCT gives the more compact transformation.]
Quantization:
• In JPEG, each DCT coefficient F[u,v] is divided by a constant q(u,v). The table of q(u,v) values is called the quantization table T.
• After the T table is created, the coefficient values are quantized to reduce the number of bits needed for encoding.
• Quantization divides each coefficient by a constant and then drops the fraction. This is done to optimize the number of bits and the number of 0s for each particular application.

Compression:
• Quantized values are read from the table and redundant 0s are removed.
• To cluster the 0s together, the table is read diagonally in a zigzag fashion. The reason is that if the block does not have fine changes, the bottom-right corner of the table is all 0s.
• JPEG usually uses lossless run-length encoding and Huffman coding in the compression phase.

[Figure: block diagram of (a) JPEG encoder, (b) JPEG decoder]
JPEG compression for color images

[Figure: block diagram of JPEG encoder and decoder for color images]
STEPS IN JPEG COMPRESSION
1. Pre-process the image (8×8 block subdivision and level shifting for grayscale images; color transformation, down-sampling and 8×8 block subdivision for color images).
2. Apply the 2-D forward DCT. Each block is represented by 64 frequency components, one of which (the DC component) is the average color of the 64 pixels in the block.
3. Quantize the DCT coefficients.
4. Apply RLE and Huffman encoding (entropy encoding).
8×8 block of an image (first three rows shown):
183 160  94 153 194 163 132 165
183 153 116 176 187 166 130 169
179 168 171 182 179 170 131 167

Quantization table:
Qmat =
16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99

[Figure: the DCT matrix of the block and the corresponding quantized values.]
Thresholding/Threshold coding:

t = round(dcts./Qmat) =
20   5  -3   1   3  -2   1   0
-3  -2   1   2   1   0   0   0
-1  -1   1   1   1   0   0   0
-1   0   0   1   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
Zig-Zag Scanning of the Coefficients

Scanning the quantized matrix t above diagonally in zigzag order gives:

[20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,0,EOB]
• Here the EOB symbol denotes the end-of-block condition. A special EOB
Huffman code word is provided to indicate that the remainder of the coefficients
in a reordered sequence are zeros.
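One way to produce this zigzag ordering in MATLAB (a sketch for an 8×8 block t of quantized coefficients; the sort-key construction is just one convenient trick):

[r, c] = ndgrid(1:8, 1:8);                 % row/column index of every entry
d      = r + c;                            % anti-diagonal index
up     = mod(d, 2) == 0;                   % these diagonals are scanned upward
key    = d*100 + r.*~up + (9 - r).*up;     % ordering within each diagonal
[~, idx] = sort(key(:));
zz = t(idx).';                             % zigzag-ordered coefficient vector

Applied to the matrix t above, zz begins 20, 5, -3, -1, -2, -3, …; the trailing zeros are replaced by the EOB code word during entropy coding.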
Decoding the Coefficients

ds_hat = t .* Qmat =
320  55 -30  16  72 -80  51   0
-36 -24  14  38  26   0   0   0
-14 -13  16  24  40   0   0   0
-14   0   0  29   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
Computing the IDCT

fs_hat = round(idct2(ds_hat)) =
67 12  -9 20 69 43  -8 42
58 25  15 30 65 40  -4 47
46 41  44 40 59 38   0 49
41 52  59 43 57 42   3 42
44 54  58 40 58 47   3 33
49 52  53 40 61 47   1 33
53 50  53 46 63 41   0 45
55 50  56 53 64 34  -1 57
Shifting Back the Coefficients

f_hat = fs_hat + 128 =
195 140 119 148 197 171 120 170
186 153 143 158 193 168 124 175
174 169 172 168 187 166 128 177
169 180 187 171 185 170 131 170
172 182 186 168 186 175 131 161
177 180 181 168 189 175 129 161
181 178 181 174 191 169 128 173
183 178 184 181 192 162 127 185

Reconstructed image (compare with the original 8×8 block above).
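The whole worked example can be reproduced with a few MATLAB lines (a sketch: f is assumed to hold the original 8×8 image block and Qmat the quantization table shown above; dct2/idct2 require the Image Processing Toolbox):

fs     = double(f) - 128;        % level shift the 8x8 block
dcts   = dct2(fs);               % 2-D forward DCT
t      = round(dcts ./ Qmat);    % quantize (threshold coding)
ds_hat = t .* Qmat;              % dequantize at the decoder
fs_hat = round(idct2(ds_hat));   % inverse DCT
f_hat  = fs_hat + 128;           % shift back to the 0-255 range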
Limitations of JPEG
• Low bit-rate compression: JPEG offers excellent quality at high and mid bit-rates. However, the quality is unacceptable at low bit-rates (e.g. below 0.25 bpp).
• Lossless and lossy compression: JPEG cannot provide a superior
performance at lossless and lossy compression in a single code-stream.
• Transmission in noisy environments: the current JPEG standard provides
some resynchronization markers, but the quality still degrades when bit-
errors are encountered.
• Different types of still images: JPEG was optimized for natural images.
Its performance on computer generated images and bi-level (text) images
is poor.
JPEG 2000
• JPEG 2000 is based on the DWT (Discrete Wavelet Transform). The DWT decomposes the image using functions called wavelets.
• The main advantage of wavelets is that they offer both time (location) and frequency information.
• The idea here is to have more localized analysis of the information which
is not possible using cosine functions whose spatial supports are identical
to the data.
• JPEG 2000 distinguishes itself from JPEG compression standards not only
by virtue of its higher compression ratios, but also by its many new
functionalities.
• The most noticeable among them is its scalability. From a compressed
JPEG 2000 bitstream, it is possible to extract a subset of the bitstream that
decodes to an image of variable quality and resolution.
What is Wavelet Transform ?
• Wavelets are functions that are defined in time as well as in frequency around a
certain point.
• The wavelet transform is a localized image transform. Because of this localization, the low-frequency (LF) and high-frequency (HF) components are obtained by scanning the image row-wise as well as column-wise.
• Wavelet-based transforms are mathematical tools used to extract information from images. A significant benefit they have over Fourier transforms is temporal (time) resolution: they capture both the frequency and the location information of the image.
• Wavelet transform can be viewed as the projection of a signal into a set of basis
functions named wavelets. Such basis functions offer localization in the frequency
domain.
• The wavelet transform is designed in such a way that we get good frequency resolution for low-frequency components (average intensity values of the image) and high temporal resolution for high-frequency components (edges of the image).
Steps of Wavelet Transform
Step 1: Start with a mother wavelet such as Haar, Morlet or Daubechies. The image is then represented in terms of shifted and scaled versions of this mother wavelet. The original image is first passed through a low-pass filter and a high-pass filter applied to each row. Applying the LPF gives the approximation, and applying the HPF gives the details.
Step 2: Now, to the horizontal approximation, we again apply the LPF and HPF along the columns. This gives the approximate image (LL) and the vertical detail of the horizontal approximation (LH).
Step 3: Next, we apply the LPF and HPF to the horizontal detail. This gives the horizontal detail of the image (HL) and the diagonal detail of the image (HH).
If the three detail sub-signals LH, HL and HH are small, they can be assumed to be zero without any significant change in the image. Hence large compression can be achieved using the wavelet transform, as the sketch below illustrates.
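A minimal MATLAB sketch of this one-level decomposition (it assumes the Wavelet Toolbox for dwt2/idwt2; the Haar wavelet and the cameraman.tif demo image are illustrative choices):

img = double(imread('cameraman.tif'));
[LL, LH, HL, HH] = dwt2(img, 'haar');   % approximation + three detail sub-bands

% Crude compression: assume the detail sub-bands are negligible and
% reconstruct the image from the approximation alone.
rec = idwt2(LL, zeros(size(LH)), zeros(size(HL)), zeros(size(HH)), 'haar');
imshow(uint8(rec))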
Features of JPEG 2000
• Lossless and lossy compression: the standard provides lossy compression with a
superior performance at low bit-rates. It also provides lossless compression with
progressive decoding. Applications such as digital libraries/databases and
medical imagery can benefit from this feature.
• Protective image security: the open architecture of the JPEG 2000 standard makes it easy to apply protection techniques to digital images, such as watermarking, labelling, stamping or encryption.
• Region-of-interest coding: in this mode, regions of interest (ROIs) can be defined. These ROIs can be encoded and transmitted with better quality than the rest of the image.
• Robustness to bit errors: the standard incorporates a set of error-resilient tools to make the bit-stream more robust to transmission errors.
Block diagram of JPEG 2000
Comparison of JPEG and JPEG 2000

1. Definition: JPEG was created by the Joint Photographic Experts Group in 1986. JPEG 2000 was created by the JPEG committee in the year 2000 and is a newer standard.
2. Lossy/Lossless: JPEG uses lossy compression. JPEG 2000 offers both lossless and lossy compression.
3. Complexity: the JPEG codec has low complexity. The JPEG 2000 codec is highly complex.
4. Resolution and quality: JPEG gives a single resolution and a single quality. JPEG 2000 gives multiple resolutions and full quality scalability.
5. Compression ratio: JPEG gives a CR of around 10 to 20. JPEG 2000 can give a CR of around 30 to 200.
6. Algorithms/techniques used: JPEG uses the DCT. JPEG 2000 uses the DWT.
7. Computation: JPEG requires less computational complexity and less computation time. JPEG 2000 requires more computational complexity and more computation time.
8. Image quality: blocking artifacts are present in JPEG images, and regions of interest cannot be selected. In JPEG 2000, blocking artifacts are absent, image quality is very good, and regions of interest can be selected for higher quality.
9. Applications: JPEG is used widely on the World Wide Web and in digital cameras. JPEG 2000 is widely used in the medical and wireless multimedia arenas; most diagnostic imagery, such as MRI, CT scans and X-rays, is encoded as JPEG 2000.
Comparison
[Figure: original image (979 KB) alongside compressed versions]