ADAPTIVE WAVELET FILTERS IN IMAGE CODERS: HOW IMPORTANT ARE THEY?
Subhasis Saha
Rao Vemuri
University of California, Davis
Lawrence Livermore National Laboratory
7000 East Ave.
Livermore, CA 94550
email: {saha, vemuri1}@llnl.gov
Abstract: Wavelet-based image coding algorithms, lossy or lossless, use a fixed perfect reconstruction (PR) filterbank built into the algorithm for coding and decoding all kinds of images. This generic approach to filter selection and usage may not always give the best compression from the viewpoint of a specific application. However, no systematic study has been done to see if using different wavelet filters for different image types improves the coding performance. To explore this problem, a variety of wavelets are used to compress a variety of images at different compression ratios, and the results are reported here. The results, intuitive at best, show that the performance of lossy coders is image dependent: while some wavelet filters perform better than others depending on the image being coded, no specific wavelet filter performs uniformly better than others on all test images. This observation leads to the hypothesis that, for both lossy and lossless compression, the "most appropriate" wavelets should be chosen adaptively, depending on the statistical nature of the image being coded, to achieve better compression.
I. INTRODUCTION
Uncompressed text, graphics, audio and video data require considerable storage capacity given today's storage technology. Similarly, for multimedia communications, the transfer of uncompressed images and video over digital networks requires very high bandwidth. For example, an uncompressed still image of 640x480 pixels with 24 bits of color requires about 7.37 Mbits of storage, and 10 seconds of uncompressed full-motion video (30 frames/sec) needs 2.21 Gbits of storage and a bandwidth of 221 Mbits/sec. Even if we assume that enough storage capacity is available, it is impossible to transmit large numbers of images or play video (a sequence of images) in real time, due to insufficient data transfer rates and limited network bandwidths. In summary, at the present state of technology, the only solution is to compress multimedia data before storage and transmission, and decompress it at the receiver for playback.
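The figures quoted above follow directly from the pixel counts; a quick arithmetic check:

```python
# Back-of-the-envelope storage and bandwidth figures for the example above.
width, height, bits_per_pixel = 640, 480, 24
frame_bits = width * height * bits_per_pixel      # one uncompressed frame
print(frame_bits / 1e6)                           # ~7.37 Mbits per still image

fps, seconds = 30, 10
video_bits = frame_bits * fps * seconds
print(video_bits / 1e9)                           # ~2.21 Gbits for 10 s of video
print(frame_bits * fps / 1e6)                     # ~221 Mbits/sec bandwidth
```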
For still images, the subject of this paper, modern image compression techniques offer a solution to this problem by reducing the storage requirements and transmission bandwidths. Compression ratios of 16 to 32 are quite common, and higher ratios can be achieved, but at the expense of picture quality. Image compression is achieved by exploiting the spatial and spectral redundancy or irrelevancy present in the image data. There are many compression techniques that are in part competitive and in part complementary. However, the most important compression technique for still images has been transform coding based on the Discrete Cosine Transform (DCT), which underlies the popular JPEG standard.
In recent years, the wavelet transform has become a cutting-edge technology in signal processing in general and in image data compression in particular. A wide variety of wavelet-based image compression schemes have been developed, ranging from simple entropy coding to more complex techniques such as vector quantization, tree encoding and edge-based coding using wavelets. Although none of these is part of a standard yet, because of their inherent advantages that day may not be far away. In fact, in the upcoming JPEG-2000 standard, the top contenders are all wavelet-based compression algorithms.
Like most other lossy coders, a wavelet-based image coder can be divided into three components: decomposition of the image using wavelet representations, quantization of the transformed coefficients, and entropy encoding of the quantized coefficients, as shown in Fig. 1. Although in each component one has the freedom to choose from a pool of candidates, and this choice ultimately affects the overall coder performance, the choice of the appropriate wavelet representation, or filterbank selection, is very important. This is because, when the performance of the wavelet filter is poor in the first place, even adequate quantization and sophisticated entropy encoders may not provide enough gain to maintain satisfactory picture quality. Hence, the importance of searching for and using "good" wavelet filters in most coding schemes can't be overemphasized. It is therefore worthwhile to evaluate the choice of the filters independently, keeping the other parts of the coder fixed. In [1], the effect of varying the quantizers and encoders individually was evaluated.
However, there are only a few published results to date, and no systematic study has been done to see if using different wavelets for different image types improves the coding performance.
In section II, first a brief description of the two
important wavelet families (filterbanks) that can be
used in wavelet-based image coders is given. This is
followed by the discussion of the role of regularity
and smoothness of such wavelets. The experiments
and results using different wavelet filters on different
images are presented in section III. Finally, section
IV gives the summary and future research directions.
Fig. 1 A typical wavelet-based image coder: input image -> wavelet transform -> quantizer -> entropy encoder -> compressed image
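The three-stage pipeline of Fig. 1 can be sketched numerically. The block below uses a one-level 2-D Haar transform, a uniform scalar quantizer, and a zeroth-order entropy estimate standing in for a real arithmetic coder; these choices are illustrative, not the ones used in the paper's experiments.

```python
import numpy as np

def haar2d(img):
    """One level of a 2-D Haar transform (rows, then columns)."""
    def haar1d(x):
        a = (x[..., ::2] + x[..., 1::2]) / np.sqrt(2)   # lowpass half
        d = (x[..., ::2] - x[..., 1::2]) / np.sqrt(2)   # highpass half
        return np.concatenate([a, d], axis=-1)
    return haar1d(haar1d(img).T).T

def encode(img, step=8.0):
    """Transform -> uniform quantizer -> zeroth-order entropy estimate."""
    coeffs = haar2d(img.astype(float))
    q = np.round(coeffs / step).astype(int)             # uniform quantizer
    _, counts = np.unique(q, return_counts=True)
    p = counts / q.size
    bpp = -(p * np.log2(p)).sum()                       # bits/pixel estimate
    return q, bpp

img = np.arange(64, dtype=float).reshape(8, 8)          # toy 8x8 "image"
q, bpp = encode(img)
print(q.shape, round(bpp, 2))
```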
All well-known lossy image coding algorithms developed so far use a particular filterbank chosen from a pool of filters designed and developed by researchers over more than a decade. Once chosen, the filter coefficients are hard-coded into the algorithm. In other words, the same filters are used for coding and decoding all types of images, whether natural, synthetic, medical, aerial, scanned or compound. This generic approach to filter selection may not always give the best quality of service (compression) from the viewpoint of a specific application.
When deciding on a filterbank for a particular application, there are many variables to take into account. Subband decompositions come in several varieties: they can have uniform-band splits, octave-band splits, or, more generally, nonuniform-band splits [2]. They can use filter sets with finite impulse response (FIR) or infinite impulse response (IIR), and these can be orthogonal or biorthogonal. Furthermore, they can be perfectly reconstructing (PR), such as many biorthogonal filter sets or conjugate quadrature filter banks (CQFs), or near-perfectly reconstructing, like the quadrature mirror filter bank (QMF). The regularity and smoothness, or number of zero moments, of the filters are the other important factors to be decided upon.
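The PR property can be checked numerically. The sketch below builds the analysis operator of an orthonormal (CQF) filter pair, the 4-tap Daubechies filter, with circular extension on a length-8 signal; the extension and signal length are choices made here, not taken from the paper.

```python
import numpy as np

# 4-tap Daubechies lowpass filter and its CQF highpass companion.
s3 = np.sqrt(3.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2))
g = np.array([h[3], -h[2], h[1], -h[0]])

# Build the one-level analysis matrix: even circular shifts of h and g.
N = 8
W = np.zeros((N, N))
for k in range(N // 2):
    for n in range(len(h)):
        W[k, (2 * k + n) % N] += h[n]           # lowpass rows
        W[N // 2 + k, (2 * k + n) % N] += g[n]  # highpass rows

x = np.random.default_rng(0).normal(size=N)
y = W @ x                                       # analysis
x_hat = W.T @ y                                 # synthesis: W is orthogonal
print(np.allclose(x_hat, x))                    # True: perfect reconstruction
```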
In the context of image coding, some filters have been found to perform better overall than others. However, most of the research and experiments in image coding, and most of the published results, are based on a few well-known test images ('Lena' being the most popular and widely used), an overwhelming majority of which are natural images. Recently, it has been observed [3], [4] that for lossless compression using various integer-to-integer wavelet transforms, no specific wavelet filter performed uniformly better than others on a variety of test images, and the performance was found to be much more image dependent.
II. ROLE OF REGULARITY AND SMOOTHNESS OF WAVELETS IN IMAGE COMPRESSION
A. Orthogonal and Biorthogonal Filterbanks
There are two well-known wavelet filter families used in wavelet-based image coders, viz. orthogonal and biorthogonal wavelets. Orthogonal wavelets are the family of wavelets that generate orthonormal bases of L2(Rn). Among them, the most important ones for image coding are the compactly supported orthogonal wavelets. In the discrete wavelet transform (DWT), compactly supported wavelets correspond to finite impulse response (FIR) filters and thus lead to efficient implementations. A systematic way of constructing compactly supported wavelets was developed by Daubechies [5], and a fast algorithm for computing the DWT was given by Mallat [6]. The popular Daubechies family of compactly supported wavelets is parameterized by an integer that is proportional to the length of the wavelet filter. For compactly supported wavelets, the length of the wavelet filter is related to the degree of smoothness and regularity of the wavelet, which in turn can affect the coding performance.
The main attraction of biorthogonal wavelets is the linear phase of the FIR filters (symmetric / antisymmetric impulse response). There are also systematic ways of constructing compactly supported biorthogonal wavelets [7]. One can choose, for example, to build filters with similar or dissimilar lengths for decomposition and reconstruction, or filters which are nearly orthogonal. Linear phase (symmetric) FIR filters are used because such filters can be easily cascaded in pyramidal filter structures without the need for phase compensation. Although the advantages of using linear phase biorthogonal filters in image coding have been conjectured, a previous study by Rioul [8] did not clearly support them. However, since there is little extra cost associated with biorthogonal wavelets, they are adopted in most wavelet image coders.
Image compression applications using wavelet bases exploit their ability to efficiently decorrelate and approximate image data with few non-zero wavelet coefficients. The design and choice of the wavelet ψ(t), or equivalently of the corresponding filter, must therefore be such that it produces the maximum number of wavelet coefficients that are close to zero. This depends mostly on the regularity of the function (the image data here), the number of vanishing moments of the wavelet, and the size of its support.
B. Regularity
A discrete-time filter is called regular if it converges, through the iteration scheme, to a scaling function and wavelet with some degree of regularity (e.g. piecewise smooth, continuous, etc.). Equivalently, a filter with a certain number of zeros at z = -1 (corresponding to the aliasing frequency ω = π) is called regular if the iteration of the filter tends to a continuous function. In the limit, iteration of the lowpass filter leads to the scaling function ϕ(t). The wavelet ψ(t) follows from ϕ(t) by just one application of the highpass filter. It is the regularity of the scaling function, or of the lowpass filter, that matters: since ψ(t) is a finite linear combination of the ϕ(2t − n), the regularity of the wavelet is equal to that of the scaling function. Stated otherwise, regularity r means that the rth derivative exists almost everywhere. While this regularity of the wavelet or the scaling function is linked to the number of zeros at ω = π of the lowpass filter, the link is not an obvious one.
The regularity of a scaling function (or wavelet) is more or less a measure of its smoothness. More specifically, if a scaling function is m times differentiable and its mth derivative is Hölder continuous of order α, then its regularity is m + α. The corresponding wavelet has the same regularity as the scaling function.
C. Smoothness
A wavelet is said to have N vanishing moments (or smoothness of order N) if

    ∫ t^n ψ(t) dt = 0,    n = 0, 1, ..., N−1        (1)

(the integral taken over −∞ < t < ∞); that is, the first N moments of the wavelet are zero.
The smoothness of continuous-time wavelet systems has been the object of intensive study for years. Because the wavelet ψ(t) is determined from the scaling function ϕ(t) by means of the highpass filter coefficients as

    ψ(t) = √2 Σn g1(n) ϕ(2t − n)        (2)

it is the smoothness of the scaling function (resulting from the infinitely iterated lowpass filter) which determines the smoothness of the overall wavelet system.
It can be shown [2] that for the infinite product formulas to converge, the lowpass filters H0(z) and G0(z) must have a sufficient number of zeros at z = −1. This condition leads to two more restrictions on the filter coefficients, as follows:

    H0(1) = G0(1) = √2,    H0(−1) = G0(−1) = 0        (3)

The fact that the analysis lowpass filter H0(z) has N zeros at z = −1 implies that the synthesis wavelet ψ(t) has N vanishing moments. Conversely, if the synthesis lowpass filter G0(z) has N zeros at z = −1, then the analysis wavelet ψ(t) has N vanishing moments. If ψ(t) has N vanishing moments, any function x(t) that is N times differentiable can be represented through the wavelet transform with great compression potential. It is thus desirable for the analysis wavelet to have a large number of vanishing moments.
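This correspondence between zeros at z = −1 and vanishing moments can be verified numerically. The sketch below uses the 4-tap Daubechies filter (N = 2 vanishing moments) as a worked example of the paper's general statement; the choice of filter is an assumption made here.

```python
import numpy as np

# 4-tap Daubechies lowpass filter and its highpass (wavelet) companion.
s3 = np.sqrt(3.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2))
g = np.array([h[3], -h[2], h[1], -h[0]])
n = np.arange(len(h))

# H0(z) = sum_n h[n] z^{-n}: value at z = -1 and the first-derivative sum
# there are both ~0, i.e. a double zero at z = -1 (N = 2).
H_at_minus1 = np.sum(h * (-1.0) ** n)
dH_sum = np.sum(n * h * (-1.0) ** n)     # vanishing marks the second zero
print(abs(H_at_minus1) < 1e-12, abs(dH_sum) < 1e-12)

# Discrete analogue of eq. (1): sum_n n^k g[n] = 0 for k = 0, 1.
for k in (0, 1):
    print(k, abs(np.sum(n ** k * g)) < 1e-12)
```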
The regularity and smoothness of a few popular wavelets are shown in Table 1. α(N) is a linearly increasing function of N, which approaches 0.2075N for large N [2].
Table 1. Regularity and smoothness of a few wavelets (from [2])

    Wavelet           Smoothness   Regularity
    Haar              1            0
    Sinc              ∞            ∞
    Meyer             ∞            ∞
    Battle-Lemarié    N            N
    Daubechies        N            α(N)
D. Role of Regularity and Smoothness
Arbitrarily high regularity can be achieved by both the decomposition wavelet ψ and the reconstruction wavelet ψ̃, provided sufficiently long filters are chosen. With the link between filterbanks and wavelets well established, some research has been done on the effect of the regularity and smoothness of the wavelet on the performance of image coding, without any conclusive result [8], [9], [10], [3]. While some researchers have found the regularity of the filters important [3], others have not. For example, Villasenor [9] has shown with examples that regularity alone is not a sufficient criterion, and that filters with high regularity can have extremely poor compression performance. Impulse and step response properties need to be considered as well, and filters with poor regularity can still achieve reasonably good compression if they have good impulse and step response properties.
III. EXPERIMENTS AND RESULTS
The number of PR filterbanks that have been developed over the years is prohibitively large, even after eliminating some apparently unreasonable choices. In this paper, a selection of such filters, both orthogonal and biorthogonal, is used, and comparative results using thirteen different wavelets on thirteen test images of various sizes are presented. For the biorthogonal filters, the experiment is also carried out with the roles of the analysis and synthesis filterbanks reversed, which brings the effective number of filters used to twenty. Due to limitations of space, only a small portion of the results (for a compression ratio of 16:1 only) is presented and discussed here. More detailed results will be presented in [13].
A. Filters Used
The following wavelet filters have been used to compress various images in this experiment.
• Orthogonal filters
  • Haar - orthonormal linear phase filter with two coefficients
  • Daubechies orthogonal filters of order two, four, and eight, with four, eight, and sixteen coefficients respectively
  • Adelson's symmetric filters with nine coefficients
• Biorthogonal filters
  • Cohen, Daubechies, and Feauveau (CDF) biorthogonal filters, e.g. the CDF-9/7, CDF-9/11 and CDF-13/3 filters
  • Villasenor biorthogonal filters, e.g. the Vill-18/10, Vill-13/11 and Vill-6/10 filters
  • Odegard biorthogonal filter with 9/7 coefficients
  • Brislawn 10/10 biorthogonal filter
B. Images Used
Test images from five different categories, viz. natural images (Lena, Barbara, Baboon, Peppers, Cameraman, Couple, Seagull, and Airplane), synthetic images (scientific simulation visualization data (Terascale data)), compound images (Bengali script), medical images (MRI, Nervecell), and aerial images (Aerial1), have been coded using the above wavelets in this experiment.
Using the same filters to code each image, the performance of the image coder is evaluated at four different compression ratios. The quantizer is similar to the one used in [11], and the entropy encoder is an adaptive arithmetic coder [12]. Five levels of decomposition are used. Again, due to limitations of space, only a small portion of the results (for a compression ratio of 16:1) is presented in Table-2.
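The PSNR values reported in Table-2 follow the standard definition for 8-bit images, PSNR = 10 log10(255² / MSE). A minimal sketch (the toy 4x4 arrays below are illustrative only, not the test images):

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB for images with the given peak value."""
    mse = np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

a = np.zeros((4, 4))
b = a + 1.0                      # every pixel off by one gray level: MSE = 1
print(round(psnr(a, b), 2))      # 48.13 dB
```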
C. Results
While the variation of the peak signal to noise ratio (PSNR) values across filters for the same image was not that wide, there was up to 3 dB of difference between the results using a Haar-type filter and a biorthogonal linear phase filter. The performance of the biorthogonal filters was always significantly better than that of the orthogonal filters, except for images containing binary data (text), viz. compound images. For pure binary text images, as expected, the Haar filter significantly outperformed the rest of the filters. Within the biorthogonal filters, although CDF-9/7 performed well for many images, for a number of other images Villasenor's filters outperformed the others. We also observed that it is better to use a filter with fewer zeros at the decomposition than at the reconstruction, since a smooth wavelet at the reconstruction introduces a smooth error and a more pleasing image quality.
We observed that both the regularity of the reconstruction wavelet ψ̃ and the number of vanishing moments of the analysis wavelet ψ are important. In particular, after experimenting by exchanging the roles of the analysis and synthesis filters, in most cases we observed the following:
• For the same number of vanishing moments for both ψ and ψ̃, the scheme with the most regular ψ̃ performed best.
• Increasing the regularity of ψ̃, even at the expense of the number of vanishing moments for ψ, led to better results.
• For comparable regularity of ψ̃, the scheme with the largest number of vanishing moments for ψ performed best.
For example, in the case of the CDF-9/7 biorthogonal filters, using the 9-tap filter as the analysis lowpass filter and the 7-tap filter as the reconstruction lowpass filter (which corresponds to a more regular ψ̃, though both ψ and ψ̃ have four vanishing moments), better PSNR results (by more than one dB) are obtained for most images than by exchanging the two filters.
IV. SUMMARY AND FUTURE RESEARCH DIRECTIONS
In the lossy image coding experiment presented here, some wavelet filters have been found to perform better than others depending on the image being coded, and no specific wavelet filter has performed uniformly better than others on these images. Similar results have also been observed in the context of lossless compression using various integer-to-integer wavelet transforms. Our conclusion is that the performance of both lossless and lossy coders is image dependent. While this simple finding is very intuitive, to date no thorough analysis and investigation has been done to see if there is a clear link between the characteristics of an image and the coding performance, so that the "most appropriate" wavelets can be used based on the image being coded.
This can also be viewed from a different angle. When we compress a set of different images with a linear phase biorthogonal filterbank, say the popular CDF-9/7 pair, using an embedded quantization scheme and an adaptive arithmetic encoder, at the same compression ratio, say 16:1, we get widely varying PSNR values, ranging from as low as 25 dB to more than 40 dB depending on the image being coded. This large variation in PSNR can only be attributed to the nature and inherent characteristics of the image, since everything else is fixed. But we don't know for sure what exactly in the image characteristics makes so much of a difference. Is it the high-frequency content of the image, is it the brightness, is it the dynamic range of the pixel values, or is it something else?
In general, images from the different categories mentioned earlier have different inherent characteristics. For example, it is a common observation that most natural images are continuous tone, whereas most synthetic images are discrete tone; that is, the dynamic range of the pixel bit depth is underutilized. Such images generally have numerical structures that are not well represented by smooth basis functions. Many medical images, like MRI or CT scans, contain significant low-intensity (black) regions along the image boundaries. Compound images with a significant amount of text are a mixture of binary and continuous tone data. Even within a particular category, images vary in many ways, with widely varying first and second order Markov statistics.
Whereas some are relatively flat, others are very busy, with more edges and contours in them. Some are darker, and others have more sharpness. An analysis of these images thus shows different characteristics, such as the mean, median, variance and histogram, in the spatial domain. In [13], we analyze different images to obtain the various spatial domain characteristics stated above. We use the spectral flatness measure (SFM) to determine the overall image activity. We also analyze the transform coefficients of the various subbands (before any quantization and entropy encoding is performed) using different wavelet filters, to see the effect of the various wavelets on an image, including the transform coding gain (TCG), which measures the energy compaction of the transform for the different filters used.
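The two activity measures named above can be sketched as follows. These are the standard textbook definitions, SFM as the geometric-to-arithmetic mean ratio of the power spectrum and TCG as the arithmetic-to-geometric mean ratio of subband variances; the exact parameters used in [13] (subband weighting, spectrum estimation) may differ.

```python
import numpy as np

def sfm(img):
    """Spectral flatness: geometric / arithmetic mean of the power spectrum.
    Values lie in (0, 1]; flatter (busier) spectra give values nearer 1."""
    p = np.abs(np.fft.fft2(img.astype(float))) ** 2
    p = p.ravel() + 1e-12                    # guard against log(0)
    return np.exp(np.mean(np.log(p))) / np.mean(p)

def tcg(subband_vars):
    """Transform coding gain from equal-size subband variances:
    arithmetic mean / geometric mean (1 means no energy compaction)."""
    v = np.asarray(subband_vars, dtype=float)
    return np.mean(v) / np.exp(np.mean(np.log(v)))

rng = np.random.default_rng(1)
noise = rng.normal(size=(32, 32))            # an unstructured test input
print(round(float(tcg([4.0, 1.0, 1.0, 0.25])), 4))   # 1.5625 for these variances
print(0.0 < sfm(noise) <= 1.0)
```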
These factors should help to determine the proper filters to use for compressing a given image for better performance. Our belief is that, for both lossy and lossless compression, the "most appropriate" wavelets should be chosen dynamically, depending on the nature of the image being coded, to achieve better compression. This is important because, when the performance of the wavelet filter is poor in the first place, context modeling of the transform coefficients cannot provide enough gain. Hence, the importance of searching for and using "good" wavelet filters in most coding schemes can't be overemphasized, and the factors mentioned above should be considered to help determine the "right" filter for compressing each image for best performance. In summary, for better performance of both lossless and lossy coding schemes, the filters used to perform the subband decomposition should adapt to the statistical nature of the input image.
Acknowledgement: We would like to thank Mark Duchanieu of LLNL for various helpful discussions.
V. REFERENCES
[1] J. Liu, V. R. Algazi, and R. R. Estes, "A comparative study of wavelet image coders", Optical Engineering, 35(9), Sep. 1996.
[2] M. Vetterli and J. Kovacevic, Wavelets and Subband Coding, Prentice Hall, Englewood Cliffs, NJ, 1995.
[3] R. Calderbank, I. Daubechies, W. Sweldens, and B. L. Yeo, "Lossless image compression using integer to integer wavelet transforms", in Proc. IEEE International Conference on Image Processing '97, Oct. 1997.
[4] M. Zandi, M. Boliek, E. L. Schwartz, and A. Keith, "CREW second evaluation", ISO/IEC/JTC1/SC29/WG1, CRC-TR-9534, Nov. 1995.
[5] I. Daubechies, "Orthonormal bases of compactly supported wavelets", Comm. Pure and Applied Math., Vol. 41, pp. 909-996, Nov. 1988.
[6] S. G. Mallat, "A theory for multiresolution signal decomposition: the wavelet representation", IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 11, No. 7, pp. 674-693, July 1989.
[7] A. Cohen, I. Daubechies, and J. C. Feauveau, "Biorthogonal bases of compactly supported wavelets", Comm. on Pure and Applied Mathematics, Vol. XLV, pp. 485-560, 1992.
[8] O. Rioul, "On the choice of wavelet filters for still image compression", in Proc. IEEE Data Compression Conference '93, pp. 550-553, Mar. 1993.
[9] J. D. Villasenor, B. Belzer, and J. Liao, "Wavelet filter evaluation for image compression", IEEE Trans. on Image Processing, Vol. 4, No. 8, pp. 1053-1060, Aug. 1995.
[10] E. A. B. da Silva and M. Ghanbari, "On the performance of linear phase wavelet transforms in low bit-rate image coding", IEEE Trans. on Image Processing, Vol. 5, No. 5, pp. 689-704, May 1996.
[11] D. Taubman and A. Zakhor, "Multirate 3-D subband coding of video", IEEE Trans. Image Processing, Vol. 3, No. 5, Sep. 1994.
[12] T. C. Bell, J. G. Cleary, and I. H. Witten, Text Compression, Prentice Hall, Englewood Cliffs, NJ, 1990.
[13] S. Saha and R. Vemuri, "Analysis Based Coding of Images", in preparation.
Table 2 PSNR values (in dB) for different images for a compression ratio of 16:1. For each biorthogonal filter, the second row (in parentheses) gives the results with the roles of the analysis and synthesis filterbanks exchanged. Columns, left to right: Lena, Barbara, Baboon, Peppers, Airplane, Aerial, Teradata, Bengali, Camera, Couple, Seagull, MRI, Nervecell.

Image Statistics
    Mean          99    117    129    120    179    181    111    213    119     33     87    117     98
    Median        97    117    130    121    200    192    113    255    144     24     57     53     44
    Std. Dev.     53     55     42     54     46     39     25     78     62     32     66     54     84
    Variance    2796   2982   1789   2894   2154   1556    621   6116   3886    998   4304   2900   7001

Wavelets
    Haar               31.51  27.23  24.19  33.04  32.51  25.99  36.05  27.30  29.30  33.18  27.38  29.42  30.66
    Daub2 (4 coeff)    33.03  28.32  24.77  34.37  34.02  27.20  39.63  24.92  29.45  34.08  27.94  31.59  32.40
    Daub4 (8 coeff)    33.48  29.10  25.04  34.74  34.65  27.63  41.35  24.25  29.02  34.24  27.88  32.31  32.73
    Daub8 (16 coeff)   33.65  29.64  25.10  34.38  34.45  27.64  42.15  22.77  28.76  33.96  27.59  32.15  33.04
    Adelson (9 coeff)  33.93  29.51  25.00  34.78  34.96  27.70  41.34  24.66  29.89  34.89  28.29  32.38  33.60
    CDF-9/7            34.28  29.54  25.05  35.19  35.34  27.89  42.40  24.71  30.19  35.26  28.50  32.77  33.96
     (CDF-7/9)         33.10  28.76  24.47  34.36  33.97  26.79  39.59  24.58  29.20  34.21  27.77  31.39  32.83
    CDF-9/11           33.97  29.80  24.68  34.68  34.73  27.72  41.72  24.21  29.63  34.62  28.16  32.28  33.46
    Odegard-9/7        34.30  30.16  24.98  35.08  35.27  28.11  42.41  24.62  30.09  35.04  28.42  32.73  33.90
    Brislawn-10/10     33.68  29.07  24.25  34.53  35.07  27.62  41.86  23.07  29.10  34.18  28.23  32.63  33.71
    Villasenor-10/18   34.15  30.09  25.33  35.19  35.55  28.23  42.83  23.21  29.79  34.95  28.48  32.85  33.96
     (Vill-18/10)      33.41  29.20  24.65  34.39  33.85  27.12  40.42  24.36  29.13  33.74  27.85  31.67  32.85
    Villasenor-13/11   34.28  30.18  24.86  34.99  35.27  28.01  42.42  24.47  29.88  34.89  28.32  32.67  33.66
     (Vill-11/13)      33.82  29.39  24.75  34.57  34.80  27.52  41.05  24.60  29.73  34.79  28.06  32.11  33.49
    Villasenor-6/10    34.04  29.57  24.57  35.16  35.03  27.90  42.32  23.85  29.89  34.86  28.46  32.73  33.72
     (Vill-10/6)       33.14  28.75  24.50  34.15  33.40  26.84  39.61  24.84  28.96  33.53  27.60  31.26  32.61
    CDF-13/3           33.65  28.81  24.78  34.69  34.93  27.55  41.34  24.51  29.46  35.13  28.21  32.30  33.58
     (CDF-3/13)        32.98  28.54  23.82  33.85  33.28  26.89  39.59  23.75  28.60  33.61  27.40  31.04  32.24