ADAPTIVE WAVELET FILTERS IN IMAGE CODERS: HOW IMPORTANT ARE THEY?

Subhasis Saha and Rao Vemuri
University of California, Davis
Lawrence Livermore National Laboratory
7000 East Ave., Livermore, CA 94550
email: {saha, vemuri1}@llnl.gov

Abstract: Wavelet-based image coding algorithms, lossy or lossless, use a fixed perfect reconstruction (PR) filterbank built into the algorithm for coding and decoding all kinds of images. This generic approach to filter selection and usage may not always give the best compression from the viewpoint of a specific application. However, no systematic study has been done to see whether using different wavelet filters for different image types improves coding performance. To explore this problem, a variety of wavelets are used to compress a variety of images at different compression ratios, and the results are reported here. The results, intuitive at best, show that the performance of lossy coders is image dependent: while some wavelet filters perform better than others depending on the image being coded, no specific wavelet filter performs uniformly better than the others on all test images. This observation leads to the hypothesis that, for both lossy and lossless compression, the "most appropriate" wavelets should be chosen adaptively, depending on the statistical nature of the image being coded, to achieve better compression.

I. INTRODUCTION

Uncompressed text, graphics, audio and video data require considerable storage capacity given today's storage technology. Similarly, for multimedia communications, transfer of uncompressed images and video over digital networks requires very high bandwidth. For example, an uncompressed still image of 640x480 pixels with 24 bits of color requires about 7.37 Mbits of storage, and an uncompressed full-motion video (30 frames/sec) of 10 sec duration needs 2.21 Gbits of storage and a bandwidth of 221 Mbits/sec.
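These figures follow directly from the pixel arithmetic; as an illustrative check (Python, assuming decimal units, 1 Mbit = 10^6 bits and 1 Gbit = 10^9 bits):

```python
# Worked arithmetic for the storage and bandwidth figures above.
width, height, bits_per_pixel = 640, 480, 24

image_bits = width * height * bits_per_pixel      # one uncompressed still image
fps, seconds = 30, 10
video_bits = image_bits * fps * seconds           # 10 s of full-motion video
bandwidth_mbits = image_bits * fps / 1e6          # required transfer rate

print(image_bits / 1e6)    # 7.3728  (about 7.37 Mbits)
print(video_bits / 1e9)    # 2.21184 (about 2.21 Gbits)
print(bandwidth_mbits)     # 221.184 (about 221 Mbits/sec)
```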
Even if we assume that enough storage capacity is available, it is impossible to transmit a large number of images or play video (a sequence of images) in real time due to insufficient data transfer rates and limited network bandwidths. In summary, at the present state of technology, the only solution is to compress multimedia data before its storage and transmission, and decompress it at the receiver for playback. For still images, the subject of this paper, modern image compression techniques offer a solution to this problem by reducing the storage requirements and transmission bandwidths. Compression ratios of 16 to 32 are quite common, and higher ratios can be achieved, but at the expense of picture quality.

Image compression is achieved by exploiting the spatial and spectral redundancy or irrelevancy present in the image data. There are many compression techniques that are in part competitive and in part complementary. However, the most important compression technique for still images has been transform coding based on the Discrete Cosine Transform (DCT), used in the popular JPEG standard. In recent years, the wavelet transform has become a cutting-edge technology in signal processing in general and in image data compression in particular. A wide variety of wavelet-based image compression schemes have been developed, ranging from simple entropy coding to more complex techniques such as vector quantization, tree encoding and edge-based coding using wavelets. Although none of these is part of a standard yet, given their inherent advantages that may not be far away; in fact, in the upcoming JPEG2000 standard, the top contenders are all wavelet-based compression algorithms. Like most other lossy coders, a wavelet-based image coder can be divided into three components: decomposition of the image using wavelet representations, quantization of the transformed coefficients, and entropy encoding of the quantized coefficients, as shown in Fig. 1.
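The three components can be sketched as a toy pipeline. The code below is an illustrative sketch, not the coder used in the experiments: it stands in a one-level 2-D Haar transform for the wavelet decomposition, a plain uniform scalar quantizer for the quantization stage, and a first-order entropy estimate in place of a real arithmetic coder.

```python
import numpy as np

def haar2d(img):
    """One decomposition level of a 2-D Haar transform (rows, then columns)."""
    a = np.asarray(img, dtype=float)
    lo = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2)   # row lowpass
    hi = (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)   # row highpass
    a = np.hstack([lo, hi])
    lo = (a[0::2, :] + a[1::2, :]) / np.sqrt(2)   # column lowpass
    hi = (a[0::2, :] - a[1::2, :]) / np.sqrt(2)   # column highpass
    return np.vstack([lo, hi])                    # LL band in the top-left quadrant

def quantize(coeffs, step=8.0):
    """Uniform scalar quantizer (a stand-in for the coder's real quantizer)."""
    return np.round(coeffs / step).astype(int)

def entropy_bits(symbols):
    """First-order entropy in bits/symbol: a lower bound on what an ideal
    entropy coder (e.g. an arithmetic coder) would spend."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / symbols.size
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 8))
rate = entropy_bits(quantize(haar2d(image)))   # estimated bits/coefficient
```

A real coder replaces each stage with something far more sophisticated, but the division of labor is exactly this.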
Although in each component one has the freedom to choose from a pool of candidates, and this choice ultimately affects the overall coder performance, the choice of the appropriate wavelet representation, i.e. the filterbank selection, is particularly important. This is because, when the performance of the wavelet filter is poor in the first place, adequate quantization and even sophisticated entropy encoders may not provide enough gain to maintain satisfactory picture quality. Hence, the importance of searching for and using "good" wavelet filters in most coding schemes cannot be overemphasized. It is therefore worthwhile to evaluate the choice of the filters independently, keeping the other parts of the coder fixed. In [1] the effect of varying the quantizers and encoders individually was evaluated. However, there are only a few published results to date, and no systematic study has been done to see whether using different wavelets for different image types improves coding performance.

Fig. 1. A typical wavelet-based image coder: Input Image -> Wavelet Transform -> Quantizer -> Entropy Encoder -> Compressed Image

In section II, a brief description of the two important wavelet families (filterbanks) that can be used in wavelet-based image coders is given first, followed by a discussion of the role of regularity and smoothness of such wavelets. The experiments and results using different wavelet filters on different images are presented in section III. Finally, section IV gives the summary and future research directions.

All well-known lossy image coding algorithms developed so far use a particular filterbank chosen from a pool of filters designed and developed by researchers over more than a decade. Once chosen, the filter coefficients are hard-coded into the algorithm. In other words, the same filters are used for coding and decoding all types of images, whether natural, synthetic, medical, aerial, scanned, or compound.
This generic approach to filter selection may not always give the best quality of service (compression) from the viewpoint of a specific application. When deciding on a filterbank for a particular application, there are many variables to take into account. Subband decompositions come in several varieties: they can have uniform-band splits, octave-band splits, or, more generally, nonuniform-band splits [2]. They can use filter sets that have finite impulse response (FIR) or infinite impulse response (IIR), and that are orthogonal or biorthogonal. Furthermore, they can be perfectly reconstructing (PR), such as many biorthogonal filter sets or conjugate quadrature filterbanks (CQFs), or near-perfectly reconstructing, like the quadrature mirror filterbank (QMF). Regularity and smoothness, or the number of zero moments of the filters, are other important factors to be decided upon. In the context of image coding, some filters have been found to perform better overall than others. However, most of the research and experiments in image coding, and most of the published results, are based on a few well-known test images ('Lena' being the most popular and widely used), an overwhelming majority of which are natural images. Recently, it has been observed [3], [4] that for lossless compression using various integer-to-integer wavelet transforms, no specific wavelet filter performs uniformly better than others on a variety of test images, and the performance is much more image dependent.

II. ROLE OF REGULARITY AND SMOOTHNESS OF WAVELETS IN IMAGE COMPRESSION

A. Orthogonal and Biorthogonal Filterbanks

There are two well-known wavelet filter families used in wavelet-based image coders, viz. orthogonal and biorthogonal wavelets. Orthogonal wavelets are the family of wavelets that generate orthonormal bases of L^2(R^n). Among them, the most important ones for image coding are the compactly supported orthogonal wavelets.
In the discrete wavelet transform (DWT), compactly supported wavelets correspond to finite impulse response (FIR) filters and thus lead to efficient implementations. A systematic way of constructing compactly supported wavelets was developed by Daubechies [5], and a fast algorithm for computing the DWT was given by Mallat [6]. The popular Daubechies family of compactly supported wavelets is parameterized by an integer that is proportional to the length of the wavelet filter. For compactly supported wavelets, the length of the filter is related to the degree of smoothness and regularity of the wavelet, which in turn can affect the coding performance.

The main attraction of biorthogonal wavelets is the linear phase of their FIR filters (symmetric/antisymmetric impulse response). There are also systematic ways of constructing compactly supported biorthogonal wavelets [7]. One can choose, for example, to build filters with similar or dissimilar lengths for decomposition and reconstruction, or filters that are nearly orthogonal. Linear phase (symmetric) FIR filters are used because such filters can easily be cascaded in pyramidal filter structures without the need for phase compensation. Although the advantages of using linear phase biorthogonal filters in image coding have been conjectured, a previous study by Rioul [8] did not clearly support them. However, since there is little extra cost associated with biorthogonal wavelets, they are adopted in most wavelet image coders.

Image compression applications using wavelet bases exploit their ability to efficiently decorrelate and approximate image data with few non-zero wavelet coefficients. The design and choice of the wavelet ψ(t), or equivalently the corresponding filter, must therefore be such that it produces the maximum number of wavelet coefficients that are close to zero. This depends mostly on the regularity of the function (here, the image data), the number of vanishing moments of the wavelet, and the size of the support.
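A minimal numerical illustration of this decorrelation property (using the Haar filter for simplicity; not part of the experiments reported here): the detail coefficients vanish wherever the signal matches what the wavelet can annihilate, here piecewise-constant data, while uncorrelated noise leaves essentially no near-zero details.

```python
import numpy as np

def haar_step(x):
    """One level of the 1-D Haar DWT: returns (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    return (x[0::2] + x[1::2]) / np.sqrt(2), (x[0::2] - x[1::2]) / np.sqrt(2)

def near_zero_fraction(x, tol=1e-9):
    """Fraction of detail coefficients that are numerically zero."""
    _, detail = haar_step(x)
    return float(np.mean(np.abs(detail) < tol))

# Piecewise-constant data: every Haar detail inside a flat region vanishes.
piecewise = np.repeat([10.0, 20.0, 30.0, 40.0], 64)
# Uncorrelated noise: essentially no detail coefficient is near zero.
noise = np.random.default_rng(1).normal(size=256)

frac_piecewise = near_zero_fraction(piecewise)   # 1.0
frac_noise = near_zero_fraction(noise)           # ~0.0
```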
B. Regularity

A discrete-time filter is called regular if it converges, through the iteration scheme, to a scaling function and wavelet with some degree of regularity (e.g. piecewise smooth, continuous, etc.). A filter with a certain number of zeros at z = -1 (corresponding to the aliasing frequency ω = π) is called regular if the iteration of the filter tends to a continuous function. In the limit, iteration of the lowpass filter leads to the scaling function ϕ(t). The wavelet ψ(t) follows from ϕ(t) by just one application of the highpass filter. Regularity of the scaling function, or of the lowpass filter, is what matters: since ψ(t) is a finite linear combination of the ϕ(2t - n), the regularity of the wavelet is equal to that of the scaling function. Stated otherwise, regularity r means that the rth derivative exists almost everywhere. While this regularity of the wavelet or scaling function is linked to the number of zeros at ω = π of the lowpass filter, the link is not so obvious. The regularity of a wavelet (or scaling function) is more or less a measure of its smoothness. More specifically, if a scaling function is m times differentiable and its mth derivative is Hölder continuous of order α, then its regularity is m + α. The corresponding wavelet has the same regularity as the scaling function.

C. Smoothness

A wavelet is said to have N vanishing moments (or smoothness of order N) if

    ∫ t^n ψ(t) dt = 0,   n = 0, 1, ..., N-1        (1)

with the integral taken over -∞ < t < ∞, that is, the first N moments of the wavelet are zero. The smoothness of continuous-time wavelet systems has been the object of intensive study for years. Because the wavelet ψ(t) is determined from the scaling function ϕ(t) by means of the highpass filter coefficients as

    ψ(t) = √2 Σ_n g1(n) ϕ(2t - n)                  (2)

it is the smoothness of the scaling function (resulting from the infinitely iterated lowpass filter) that determines the smoothness of the overall wavelet system.
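The discrete analogue of Eq. (1) can be checked numerically. The sketch below (illustrative, not from the paper) verifies that the highpass filter derived from the Daubechies D4 lowpass filter has its first two discrete moments equal to zero while the second-order moment does not vanish (N = 2), and that the lowpass filter takes the expected values at DC and at the aliasing frequency.

```python
import numpy as np

# Illustrative check: the Daubechies D4 filter has N = 2 vanishing moments,
# so its highpass filter annihilates sampled polynomials of degree < 2.
s3 = np.sqrt(3.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))  # lowpass
L = len(h)
g = np.array([(-1) ** n * h[L - 1 - n] for n in range(L)])           # highpass

n = np.arange(L)
m0 = g.sum()              # 0th discrete moment -> 0
m1 = (n * g).sum()        # 1st discrete moment -> 0
m2 = (n ** 2 * g).sum()   # 2nd discrete moment -> nonzero (only N = 2 vanish)

# The lowpass filter equals sqrt(2) at DC (z = 1) and 0 at the aliasing
# frequency (z = -1), reflecting its zeros at z = -1.
H_dc = h.sum()
H_pi = (h * (-1.0) ** n).sum()
```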
It can be shown [2] that for the infinite product formulas to converge, the lowpass filters H0(z) and G0(z) must have a sufficient number of zeros at z = -1. This condition leads to two more restrictions on the filter coefficients:

    H0(1) = G0(1) = √2,   H0(-1) = G0(-1) = 0      (3)

The fact that the analysis lowpass filter H0(z) has N zeros at z = -1 implies that the synthesis wavelet ψ̃(t) has N vanishing moments. Conversely, if the synthesis lowpass filter G0(z) has N zeros at z = -1, then the analysis wavelet ψ(t) has N vanishing moments. If ψ(t) has N vanishing moments, any function x(t) that is N times differentiable can be represented through the wavelet transform with great compression potential. Thus it is desirable for the analysis wavelet to have a large number of vanishing moments. The regularity and smoothness of a few popular wavelets are shown in Table 1, where α(N) is a linearly increasing function of N that approaches 0.2075N for large N [2].

Table 1. Regularity and smoothness of a few wavelets (from [2])

    Wavelet          Smoothness   Regularity
    Haar             1            0
    Sinc             ∞            ∞
    Meyer            ∞            ∞
    Battle-Lemarié   N            N
    Daubechies       N            α(N)

D. Role of Regularity and Smoothness

Arbitrarily high regularity can be achieved by both the decomposition wavelet ψ and the reconstruction wavelet ψ̃, provided sufficiently long filters are chosen. With the link between filterbanks and wavelets well established, some research has been done on the effect of the regularity and smoothness of the wavelet on image coding performance, without any conclusive result [8], [9], [10], [3]. While some researchers have found the regularity of the filters important [3], others have not. For example, Villasenor et al. [9] have shown with examples that regularity alone is not a sufficient criterion, and that filters with high regularity can have extremely poor compression performance.
Impulse and step response properties need to be considered as well, and filters with poor regularity can still achieve reasonably good compression if they have good impulse and step response properties.

III. EXPERIMENTS AND RESULTS

The number of PR filterbanks that have been developed over the years is prohibitively large, even after eliminating some apparently unreasonable choices. In this paper, a selection of such filters, both orthogonal and biorthogonal, is used, and comparative results using thirteen different wavelets on thirteen test images of various sizes are presented. For the biorthogonal filters, the experiment is also carried out with the roles of the analysis and synthesis filterbanks reversed, which brings the effective number of filters used to twenty. Due to limitations of space, only a small portion of the results (for a compression ratio of 16:1) is presented and discussed here. More detailed results will be presented in [13].

A. Filters Used

The following wavelet filters have been used to compress various images in this experiment.

• Orthogonal filters
  • Haar: orthonormal linear phase filter with two coefficients
  • Daubechies orthogonal filters of order two, four, and eight, with four, eight, and sixteen coefficients respectively
  • Adelson's symmetric filter with nine coefficients
• Biorthogonal filters
  • Cohen, Daubechies, and Feauveau (CDF) biorthogonal filters, e.g. the CDF-9/7, CDF-9/11, and CDF-13/3 filters
  • Villasenor biorthogonal filters, e.g. the Vill-18/10, Vill-13/11, and Vill-6/10 filters
  • Odegard biorthogonal filter with 9/7 coefficients
  • Brislawn 10/10 biorthogonal filter

B. Images Used

Test images from five different categories, viz.
natural images (Lena, Barbara, Baboon, Peppers, Cameraman, Couple, Seagull, and Airplane), synthetic images (scientific simulation visualization data, i.e. Terascale data), compound images (Bengali script), medical images (MRI, Nervecell), and aerial images (Aerial1), have been coded using the above wavelets in this experiment. Using the same filters to code an image, the performance of the image coder is evaluated at four different compression ratios. The quantizer is similar to the one used in [11], and the entropy encoder is an adaptive arithmetic coder [12]. Five levels of decomposition are used. Again, due to limitations of space, only a small portion of the results (for a compression ratio of 16:1) is presented in Table 2.

C. Results

While the variation of the peak signal-to-noise ratio (PSNR) values across filters for the same image was not that wide, there was up to 3 dB of difference between results using a Haar type filter and a biorthogonal linear phase filter. The performance of the biorthogonal filters was always significantly better than that of the orthogonal filters, except for images containing binary data (text), viz. compound images. For pure binary text images, as expected, the Haar filter significantly outperformed the rest of the filters. However, among the biorthogonal filters, although CDF-9/7 performed well for many images, for a number of other images Villasenor's filters outperformed it. We also observed that it is better to use a filter with fewer zeros on the decomposition side than on the reconstruction side, since a smooth wavelet at the reconstruction introduces a smooth error and a more pleasing image quality. We observed that both the regularity of the reconstruction wavelet ψ̃ and the number of vanishing moments of the analysis wavelet ψ are important.
In particular, after experimenting by exchanging the roles of the analysis and synthesis filters, in most cases we observed the following:

• For the same number of vanishing moments for both ψ and ψ̃, the scheme with the more regular ψ̃ performed best.
• Increasing the regularity of ψ̃, even at the expense of the number of vanishing moments of ψ, led to better results.
• For comparable regularity of ψ̃, the scheme with the largest number of vanishing moments for ψ performed best.

For example, in the case of the CDF-9/7 biorthogonal filters, using the 9-tap filter as the analysis lowpass filter and the 7-tap filter as the reconstruction lowpass filter (which corresponds to a more regular ψ̃, though both ψ and ψ̃ have four vanishing moments) gives better PSNR results for most images (by more than one dB) than exchanging the two filters.

IV. SUMMARY AND FUTURE RESEARCH DIRECTION

In the lossy image coding experiment presented here, some wavelet filters have been found to perform better than others depending on the image being coded, and no specific wavelet filter has performed uniformly better than the others on these images. Similar results have also been observed in the context of lossless compression using various integer-to-integer wavelet transforms. Our conclusion is that the performance of both lossless and lossy coders is image dependent. While this simple finding is very intuitive, to date no thorough analysis and investigation have been done to see if there is a clear link between the characteristics of an image and the coding performance, so that the "most appropriate" wavelets can be used based on the image being coded. This can also be viewed from a different angle.
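All the comparisons above are in terms of PSNR; for reference, a standard definition for 8-bit images (a sketch, not tied to any particular coder):

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB, as used for 8-bit test images."""
    err = np.asarray(original, dtype=float) - np.asarray(reconstructed, dtype=float)
    mse = np.mean(err ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

On this scale, a 1 dB gain corresponds to a reduction of the mean squared error by a factor of 10^(1/10), i.e. roughly 21% lower MSE.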
Using a linear phase biorthogonal filterbank, say the popular CDF-9/7 pair, when we compress a set of different images using an embedded quantization scheme and an adaptive arithmetic encoder, at the same compression ratio, say 16:1, we get widely varying PSNR values, ranging from as low as 25 dB to more than 40 dB depending on the image being coded. This large variation in PSNR can only be attributed to the nature and inherent characteristics of the image, since everything else is fixed. But we do not know for sure what exactly in the image characteristics makes so much of a difference. Is it the high-frequency content of the image, the brightness, the dynamic range of the pixel values, or something else?

In general, images from the different categories mentioned earlier have different inherent characteristics. For example, it is a common observation that most natural images are continuous tone, whereas most synthetic images are discrete tone; that is, the dynamic range of the pixel bit depth is underutilized. Such images generally have numerical structures that are not well represented by smooth basis functions. Many medical images, like MRI or CT scans, contain significant low-intensity (black) regions along the image boundaries. Compound images with a significant amount of text are a mixture of binary and continuous tone data. Even within a particular category, images vary in many ways, with widely varying first- and second-order Markov statistics. Whereas some are relatively flat, others are very busy, having more edges and contours; some are darker and others have more sharpness. So an analysis of these images shows different spatial-domain characteristics such as mean, median, variance and histogram. In [13], we analyze different images to obtain the various spatial-domain characteristics stated above. We use the spectral flatness measure (SFM) to determine the overall image activity.
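As a rough sketch of one such activity measure (an illustration of the idea, not the exact statistic used in [13]), the SFM can be computed as the ratio of the geometric to the arithmetic mean of the image's power spectrum:

```python
import numpy as np

def sfm(img):
    """Spectral flatness measure: geometric mean / arithmetic mean of the
    power spectrum. Close to 1 for noise-like (busy) data, close to 0 for
    highly correlated (smooth) data. Zero-power bins are excluded to
    avoid log(0)."""
    power = np.abs(np.fft.fft2(np.asarray(img, dtype=float))) ** 2
    power = power.ravel()
    power = power[power > 0]
    geometric = np.exp(np.mean(np.log(power)))
    return float(geometric / np.mean(power))

rng = np.random.default_rng(0)
busy = sfm(rng.normal(size=(64, 64)))                    # noise-like: high SFM
smooth = sfm(np.outer(np.hanning(64), np.hanning(64)))   # smooth: low SFM
```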
We also analyze the transform coefficients of the various subbands (before any quantization and entropy encoding is performed) using different wavelet filters, to see the effect of the various wavelets on an image, including the transform coding gain (TCG), which measures the energy compaction of the transform for the different filters used. These factors should help to determine the proper filters to be used for compressing a given image for better performance. Our belief is that, both for lossy and lossless compression, the "most appropriate" wavelets should be chosen dynamically, depending on the nature of the image being coded, to achieve better compression. This is important because, when the performance of the wavelet filter is poor in the first place, context modeling of the transform coefficients cannot provide significant enough gain. Hence, the importance of searching for and using "good" wavelet filters in most coding schemes cannot be overemphasized, and the factors mentioned above should help us determine the "right" filter to be used for compressing a given image for best performance. In summary, for better performance of both lossless and lossy coding schemes, the filters used to perform the subband decomposition should adapt to the statistical nature of the input image.

Acknowledgement: We would like to thank Mark Duchanieu of LLNL for various discussions on this work.

V. REFERENCES

[1] J. Liu, V. R. Algazi, and R. R. Estes, "A comparative study of wavelet image coders", Optical Engineering, 35(9), Sep. 1996.
[2] M. Vetterli and J. Kovacevic, Wavelets and Subband Coding, Prentice Hall, Englewood Cliffs, NJ, 1995.
[3] R. Calderbank, I. Daubechies, W. Sweldens, and B. L. Yeo, "Lossless image compression using integer to integer wavelet transforms", in Proc. IEEE International Conference on Image Processing '97, Oct. 1997.
[4] M. Zandi, M. Boliek, E. L. Schwartz, and A. Keith, "CREW second evaluation", ISO/IEC/JTC1/SC29/WG1, CRC-TR-9534, Nov. 1995.
[5] I. Daubechies, "Orthonormal bases of compactly supported wavelets", Comm. Pure and Applied Math., Vol. 41, pp. 909-996, Nov. 1988.
[6] S. G. Mallat, "A theory for multiresolution signal decomposition: the wavelet representation", IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 11, No. 7, pp. 674-693, July 1989.
[7] A. Cohen, I. Daubechies, and J. C. Feauveau, "Biorthogonal bases of compactly supported wavelets", Comm. on Pure and Applied Mathematics, Vol. XLV, pp. 485-560, 1992.
[8] O. Rioul, "On the choice of wavelet filters for still image compression", in Proc. IEEE Data Compression Conference '93, pp. 550-553, Mar. 1993.
[9] J. D. Villasenor, B. Belzer, and J. Liao, "Wavelet filter evaluation for image compression", IEEE Trans. on Image Processing, Vol. 4, No. 8, pp. 1053-1060, Aug. 1995.
[10] E. A. B. da Silva and M. Ghanbari, "On the performance of linear phase wavelet transforms in low-bit rate image coding", IEEE Trans. on Image Processing, Vol. 5, No. 5, pp. 689-704, May 1996.
[11] D. Taubman and A. Zakhor, "Multirate 3-D subband coding of video", IEEE Trans. Image Processing, Vol. 3, No. 5, Sep. 1994.
[12] T. C. Bell, J. G. Cleary, and I. H. Witten, Text Compression, Prentice Hall, Englewood Cliffs, NJ, 1990.
[13] S. Saha and R. Vemuri, "Analysis Based Coding of Images", in preparation.

Table 2. PSNR values (in dB) for different images at a compression ratio of 16:1. For the biorthogonal filter pairs, the second value in each cell is obtained with the analysis and synthesis filterbanks exchanged (the reversed pair named in parentheses).

Image statistics:

    Image       Lena  Barbara  Baboon  Peppers  Airplane  Aerial  Teradata  Bengali  Camera  Couple  Seagull  MRI   Nervecell
    Mean          99      117     129      120       179     181       111      213     119      33       87    57        117
    Median        97      117     130      121       200     192       113      255     144      24       53    44         98
    Std. Dev.     53       55      42       54        46      39        25       78      62      32       66    54         84
    Variance    2796     2982    1789     2894      2154    1556       621     6116    3886     998     4304  2900       7001

PSNR (dB):

    Wavelet                   Lena         Barbara      Baboon       Peppers      Airplane     Aerial       Teradata     Bengali      Camera       Couple       Seagull      MRI          Nervecell
    Haar                      31.51        27.23        24.19        33.04        32.51        25.99        36.05        27.30        29.30        33.18        27.38        29.42        30.66
    Daub2 (4 coeff)           33.03        28.32        24.77        34.37        34.02        27.20        39.63        24.92        29.45        34.08        27.94        31.59        32.40
    Daub4 (8 coeff)           33.48        29.10        25.04        34.74        34.65        27.63        41.35        24.25        29.02        34.24        27.88        32.31        32.73
    Daub8 (16 coeff)          33.65        29.64        25.10        34.38        34.45        27.64        42.15        22.77        28.76        33.96        27.59        32.15        33.04
    Adelson (9 coeff)         33.93        29.51        25.00        34.78        34.96        27.70        41.34        24.66        29.89        34.89        28.29        32.38        33.60
    CDF-9/7 (CDF-7/9)         34.28/33.10  29.54/28.76  25.05/24.47  35.19/34.36  35.34/33.97  27.89/26.79  42.40/39.59  24.71/24.58  30.19/29.20  35.26/34.21  28.50/27.77  32.77/31.39  33.96/32.83
    CDF-9/11                  33.97        29.80        24.68        34.68        34.73        27.72        41.72        24.21        29.63        34.62        28.16        32.28        33.46
    Odegard-9/7               34.30        30.16        24.98        35.08        35.27        28.11        42.41        24.62        30.09        35.04        28.42        32.73        33.90
    Brislawn-10/10            33.68        29.07        24.25        34.53        35.07        27.62        41.86        23.07        29.10        34.18        28.23        32.63        33.71
    Villasenor-10/18 (18/10)  34.15/33.41  30.09/29.20  25.33/24.65  35.19/34.39  35.55/33.85  28.23/27.12  42.83/40.42  23.21/24.36  29.79/29.13  34.95/33.74  28.48/27.85  32.85/31.67  33.96/32.85
    Villasenor-13/11 (11/13)  34.28/33.82  30.18/29.39  24.86/24.75  34.99/34.57  35.27/34.80  28.01/27.52  42.42/41.05  24.47/24.60  29.88/29.73  34.89/34.79  28.32/28.06  32.67/32.11  33.66/33.49
    Villasenor-6/10 (10/6)    34.04/33.14  29.57/28.75  24.57/24.50  35.16/34.15  35.03/33.40  27.90/26.84  42.32/39.61  23.85/24.84  29.89/28.96  34.86/33.53  28.46/27.60  32.73/31.26  33.72/32.61
    CDF-13/3 (CDF-3/13)       33.65/32.98  28.81/28.54  24.78/23.82  34.69/33.85  34.93/33.28  27.55/26.89  41.34/39.59  24.51/23.75  29.46/28.60  35.13/33.61  28.21/27.40  32.30/31.04  33.58/32.24