Published in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP'97), Vol. 4, pp. 2721-2724, 1997, which should be used for any reference to this work.

ADAPTIVE BLOCK-SIZE TRANSFORM CODING FOR IMAGE COMPRESSION

Javier Bracamonte*, Michael Ansorge and Fausto Pellandini
Institute of Microtechnology, University of Neuchâtel
Rue A.-L. Breguet 2, CH-2000 Neuchâtel, Switzerland
Email: bracamonte@imt.unine.ch

*His work was supported in part by the Laboratory of Microtechnology (LMT EPFL), common to the Swiss Federal Institute of Technology, Lausanne, and the Institute of Microtechnology, University of Neuchâtel.

ABSTRACT

In this paper we report the results of an adaptive block-size transform coding scheme based on the sequential JPEG algorithm. This minimum-information-overhead method relies on a transform coding technique with two different block sizes: N × N and 2N × 2N pixels. The input image is divided into blocks of 2N × 2N pixels, and each of these blocks is classified according to its image activity. Depending on this classification, either four N-point or a single 2N-point 2-D DCT is applied to the block. The purpose of the algorithm is to take advantage of large uniform regions that can be coded as a single large unit instead of four small units, as is done by a fixed block-size scheme. For the same reconstruction quality, the results of the adaptive algorithm show a significant improvement of the compression ratio with respect to the non-adaptive scheme.

1. INTRODUCTION

Transform-based image coding algorithms have been the object of intense research during the last twenty years. They have eventually been selected as the main data compression mechanism in the definition of digital image and video coding standards: JPEG, MPEG-1, MPEG-2, H.261 and H.263 all rely on an algorithm based on the Discrete Cosine Transform (DCT).

A transform-based image coding method involves subdividing the original image into smaller N × N blocks of pixels and applying a unitary transform, such as the DCT, to each of these blocks. A plethora of methods have been proposed regarding the different kinds of processing executed after the transform operation. In general, once the value of N has been selected for a particular algorithm, it remains fixed. In JPEG, for instance, the value of N is 8, and thus the input image is divided exclusively into blocks of 8x8 pixels.

Several adaptive transform-based methods are described in [1, 2, 3]. Most of these methods are fixed block-size schemes in which an adaptive quantization of the transform coefficients is made. A variable block-size scheme has been reported in [4], where it is claimed to be one of the first departures from traditional fixed block-size schemes towards variable block-size coding. In that approach the transform coefficients are processed by vector quantization.

In this paper an adaptive block-size transform coding scheme based on the JPEG algorithm is reported, and results are presented showing the improvement in compression efficiency with respect to the non-adaptive scheme. The remainder of this paper is organized as follows. The adaptive scheme is described in Section 2. The selection of the classification parameters is discussed in Section 3. Computational complexity issues are analyzed in Section 4. The performance of the algorithm is reported in Section 5, and finally the conclusions are stated in Section 6.
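To make the fixed block-size baseline concrete, the following Python/NumPy sketch (our illustration, not code from the paper; all function and variable names are assumptions) applies an orthonormal 8-point 2-D DCT to every 8x8 block of a grayscale image, which is the generic transform step described above.

```python
# Minimal sketch of blockwise 2-D DCT coding (illustration only, not the
# authors' implementation).
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)      # frequency index
    i = np.arange(n).reshape(1, -1)      # sample index
    c = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] *= 1.0 / np.sqrt(2.0)
    return c * np.sqrt(2.0 / n)

def blockwise_dct(image: np.ndarray, n: int = 8) -> np.ndarray:
    """Apply an n-point 2-D DCT independently to each n x n block.
    The image is assumed grayscale with dimensions that are multiples of n."""
    c = dct_matrix(n)
    out = np.empty_like(image, dtype=np.float64)
    h, w = image.shape
    for r in range(0, h, n):
        for s in range(0, w, n):
            block = image[r:r + n, s:s + n].astype(np.float64)
            out[r:r + n, s:s + n] = c @ block @ c.T   # separable 2-D DCT
    return out
```

A call such as coeffs = blockwise_dct(img, 8) yields, block by block, the coefficients that a JPEG-like coder would then quantize and entropy code.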
2. ADAPTIVE BLOCK-SIZE SCHEME

A block diagram of the adaptive block-size scheme is shown in Figure 1. It is based on the sequential JPEG algorithm [5] and is described in the following paragraphs.

The input image is divided into blocks of 16x16 pixels (referred to as B blocks in this paper). On each of these blocks a metric is evaluated in order to determine its degree of image activity. If the resulting measure is below a predefined threshold, the block is classified as a 0 block. Otherwise, it is classified as a 1 block.

An adaptive DCT module receives the B blocks along with their classification bit. The DCT module executes a 16-point 2-D DCT on those B blocks that have been classified as 0 blocks. The B blocks classified as 1 blocks are further divided into four subblocks of 8x8 pixels each, and an 8-point 2-D DCT is then applied to each of the four subblocks.

Figure 1: Adaptive JPEG-based algorithm (block diagram: block classification of the 16x16 B blocks, adaptive 8- or 16-point 2-D DCT, quantizer, DC-DPCM and run-length coding, Huffman coder; output: header data and compressed image data).

The 2-D DCT coefficients are quantized and then zigzag reordered in order to increase the lengths of the runs of zero-valued coefficients. The DC coefficients are coded with a DPCM scheme, whereas the AC coefficients are run-length coded. Finally, the resulting symbols are Huffman coded.

The purpose of the adaptive scheme is to take advantage of large uniform regions within the image to be coded: instead of coding these 16 × 16-pixel image regions as four blocks of 8 × 8 pixels (as would be done by a fixed block-size scheme), the algorithm codes them as a single block of 16 × 16 pixels.

Before describing the selection of the classification parameters, it is useful to remark that two different types of applications are possible for the adaptive scheme: either it is used to code natural still images, or it is embedded in a video coding system in order to code prediction error frames. The need for an application-oriented definition comes from the fact that natural still images and prediction error frames are image data of different structure. The classification algorithm and its corresponding parameters can then be better selected to match the input data, in search of a higher coding efficiency. In this paper, we report the results of addressing the video application.
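As an illustration of the scheme of Figure 1, the following sketch (again an assumption on our part, not the authors' implementation) classifies each 16x16 B block with the standard-deviation metric and the threshold of 3.5 given in Section 3 below, and applies either a single 16-point 2-D DCT or four 8-point 2-D DCTs. It reuses the hypothetical dct_matrix() helper from the previous sketch.

```python
# Sketch of the adaptive classification and transform stage (assumed
# implementation; reuses dct_matrix() from the previous sketch).
import numpy as np

def adaptive_dct(image: np.ndarray, threshold: float = 3.5):
    """Return one classification bit and one 16x16 coefficient block per
    B block.  Image dimensions are assumed to be multiples of 16."""
    c8, c16 = dct_matrix(8), dct_matrix(16)
    bits, coeffs = [], []
    h, w = image.shape
    for r in range(0, h, 16):
        for s in range(0, w, 16):
            b = image[r:r + 16, s:s + 16].astype(np.float64)
            if b.std() < threshold:                 # low activity -> "0" block
                bits.append(0)
                coeffs.append(c16 @ b @ c16.T)      # single 16-point 2-D DCT
            else:                                   # high activity -> "1" block
                bits.append(1)
                coeffs.append(np.block([[c8 @ b[i:i + 8, j:j + 8] @ c8.T
                                         for j in (0, 8)] for i in (0, 8)]))
    return bits, coeffs
```

Note that the returned classification bits are the only per-block side information: one bit per 16x16 B block, i.e. 1/256 ≈ 0.004 bit/pixel, matching the overhead figure quoted in Section 3.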
3. SELECTION OF THE CLASSIFICATION PARAMETERS

One sensitive drawback of most adaptive image coding algorithms is the generation of overhead information. In general, the higher the degree of adaptability of a coding algorithm, the larger the amount of overhead data that needs to be sent to the decoder. In an adaptive block-size scheme, the overhead information is closely related to the size of the blocks and to the number and segmentation structure of the different classes.

For our adaptive scheme, it was found that for coding prediction error frames, a two-class scheme with two different transform block sizes and a largest block size of 16x16 pixels was a good compromise between minimum information overhead (0.004 bit/pixel) and good compression efficiency. The standard deviation was used as the metric to classify the B blocks. Experimentally, it was determined that setting the classification threshold to a value of 3.5 leads to decoded frames of the same quality as those obtained by the non-adaptive scheme.

4. COMPUTATIONAL COMPLEXITY

By analyzing the spectrum of the transform of the 0 blocks, it was noticed that only the first four low-frequency coefficients have a significant magnitude. The heavy computational load of a 16-point 2-D DCT is then reduced considerably, since only these four coefficients need to be calculated. Thus, for each classified 0 block a saving of about 85% of the operations is achieved with respect to the non-adaptive scheme.

Since the proportion of selected 0 blocks is typically high (e.g., an average of 56% and 78% for the first 150 prediction error frames of Miss America and Claire, respectively), the global reduction of operations, taking into account the computational overhead of the classification stage, is 38% for Miss America and 55% for Claire. This important saving of operations is even higher at the decoder, where the classification overhead is not present.
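The saving described above can be illustrated with the following sketch. The paper does not specify which four low-frequency coefficients are kept, so taking them to be the 2x2 lowest-frequency corner is an assumption on our part; the sketch again reuses the hypothetical dct_matrix() helper.

```python
# Sketch of a reduced 16-point 2-D DCT for "0" blocks (assumed variant: only
# the 2x2 lowest-frequency corner of the coefficient block is computed).
import numpy as np

def reduced_dct16(block: np.ndarray, keep: int = 2) -> np.ndarray:
    """Compute only the keep x keep lowest-frequency coefficients of the
    16-point 2-D DCT of a 16x16 block."""
    c = dct_matrix(16)[:keep, :]          # only the first `keep` basis rows
    return c @ block.astype(np.float64) @ c.T
```

With this naive matrix-product formulation the reduced transform costs 16·16·2 + 2·16·2 = 576 multiply-accumulates per block against 2·16·16·16 = 8192 for the full transform, i.e. roughly 7%. These counts only illustrate the order of magnitude of the saving; the 85% figure reported above refers to the authors' own DCT implementation.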
5. RESULTS

The success of the adaptive algorithm in improving the compression ratio depends on the number of 0 blocks that are selected during the classification stage. The higher the number of selected 0 blocks, the higher the improvement of the compression ratio with respect to the non-adaptive scheme.

Coding the image Lena (for high reconstruction quality) with the proposed scheme gives only a modest improvement of the compression ratio. This is due to the very small number of 0 blocks that are selected by the classification algorithm, as shown in Figure 2. This was nevertheless the expected result, since the classification parameters were not chosen to match the statistics of natural still images.

Figure 2: Lena: block classification.

On the other hand, for prediction error frames the number of 0 blocks selected during the block classification is very high. For the sequence Miss America, the percentage of selected 0 blocks per frame is shown in Figure 3. As can be noticed from this curve, for most of the frames more than half of the B blocks are classified as 0 blocks, which has a strong effect on the compression efficiency.

Figure 3: Percentage of selected 0 blocks (sequence Miss America; x-axis: frame number, y-axis: % of selected 0 blocks).

The coding performance of the adaptive algorithm on video sequences has been studied by using the coding scheme shown in Figure 4. This is a coder based on the H.261 standard [7], without motion compensation.

Figure 4: Adaptive H.261-based algorithm (block classification, adaptive 8- or 16-point 2-D DCT, quantizer, variable length coding, inter/intra mode switch, inverse quantizer and inverse transform, frame memory).

Curves showing the compression ratio of the adaptive versus the non-adaptive method are shown in Figure 5(a) for the sequence Miss America. The quality of the reconstructed frames, from a subjective point of view, is the same for both the adaptive and non-adaptive schemes. The PSNR curve of the adaptive scheme is, on average, 0.081 dB below the corresponding curve of the non-adaptive scheme: a difference that is not perceptible by the human visual system [6]. Figure 6 shows the original and the reconstructed frame No. 13 of Miss America. This is the frame with the highest percentage of selected 0 blocks.

The results of the compression ratio for the sequence Claire are shown in Figure 5(b). The conclusions regarding the quality of the reconstructed frames are the same as those pointed out for the sequence Miss America.

Figure 5: Compression ratio, adaptive versus fixed block-size scheme (x-axis: frame number; top curve: adaptive block-size, bottom curve: fixed block-size): (a) Miss America, (b) Claire.

Figure 6: Miss America, frame No. 13: (a) original, (b) reconstructed.

It is worth noting that the curves in Figure 5 do not include motion estimation/compensation. It is expected that by using a motion compensation technique the compression ratio will be further improved, since a better prediction would increase the percentage of selected 0 blocks in the prediction error frames.

6. CONCLUSIONS

An adaptive block-size transform coding scheme for image compression was presented. The method features a minimum information overhead, a significant reduction of the computational complexity, and a coding efficiency that largely outperforms its non-adaptive counterpart. The combination of these advantages may contribute to the development of effective solutions in various application fields, in particular for low-power portable video communication systems.

7. ACKNOWLEDGEMENTS

This work was supported by the Swiss National Science Foundation under Grant FN 2000-47'185.96, and by the Laboratory of Microtechnology (LMT EPFL). The latter entity is common to the Swiss Federal Institute of Technology, Lausanne, and the Institute of Microtechnology, University of Neuchâtel.

8. REFERENCES

[1] R. J. Clarke, Transform Coding of Images. Academic Press, London, UK, 1985.
[2] P. M. Farrelle, Recursive Block Coding for Image Data Compression. Springer-Verlag, New York, USA, 1990.
[3] K. R. Rao and P. Yip, Discrete Cosine Transform: Algorithms, Advantages, Applications. Academic Press, Boston, MA, USA, 1990.
[4] J. Vaisey and A. Gersho, "Image Compression with Variable Block Size Segmentation", IEEE Transactions on Signal Processing, Vol. 40, No. 8, August 1992, pp. 2040-2060.
[5] W. B. Pennebaker and J. L. Mitchell, JPEG Still Image Data Compression Standard. Van Nostrand Reinhold, New York, USA, 1993.
[6] V. Bhaskaran and K. Konstantinides, Image and Video Compression Standards: Algorithms and Architectures. Kluwer Academic Publishers, Boston, MA, USA, 1995.
[7] ITU-T Recommendation H.261, "Video codec for audiovisual services at p × 64 kbit/s", March 1993.