Published in Proceedings of the International Conference on
Acoustics, Speech, and Signal Processing (ICASSP97) 4, 2721-2724, 1997
which should be used for any reference to this work
ADAPTIVE BLOCK-SIZE TRANSFORM CODING FOR IMAGE
COMPRESSION
Javier Bracamonte*, Michael Ansorge and Fausto Pellandini
Institute of Microtechnology, University of Neuchâtel
Rue A.-L. Breguet 2, CH-2000 Neuchâtel, Switzerland
Email: bracamonte@imt.unine.ch
ABSTRACT
In this paper we report the results of an adaptive block-size transform coding scheme that is based on the sequential JPEG algorithm. This minimum information-overhead method involves a transform coding technique with two different block sizes: N × N and 2N × 2N pixels. The input image is divided into blocks of 2N × 2N pixels and each of these blocks is classified according to its image activity. Depending on this classification, either four N-point or a single 2N-point 2-D DCT is applied on the block. The purpose of the algorithm is to take advantage of large uniform regions that can be coded as a single large unit instead of four small units, as is done by a fixed block-size scheme. For the same reconstruction quality, the results of the adaptive algorithm show a significant improvement of the compression ratio with respect to the non-adaptive scheme.
1. INTRODUCTION
Transform-based image coding algorithms have been
the object of intense research during the last twenty
years. Eventually they have been selected as the main
mechanism of data compression in the definition of digital image and video coding standards. For example, JPEG, MPEG-1, MPEG-2, H.261 and H.263 all rely on an algorithm based on the Discrete Cosine Transform (DCT).
A transform-based image coding method involves
subdividing the original image into smaller N × N
blocks of pixels and applying a unitary transform, such
as the DCT, on each of these blocks. A plethora of
methods have been proposed regarding different kinds
of processing executed after the transform operation.
In general, once the value of N has been selected for a particular algorithm, it remains fixed. In JPEG, for instance, the value of N is 8, and thus the input image is divided into blocks of 8x8 pixels exclusively.
*His work was supported in part by the Laboratory of Microtechnology (LMT EPFL) common to the Swiss Federal Institute of Technology, Lausanne, and the Institute of Microtechnology, University of Neuchâtel.
Several adaptive transform-based methods are described in [1, 2, 3]. Most of these methods are fixed
block-size schemes in which an adaptive quantization
of the transform coefficients is made. A variable block-size scheme has been reported in [4], in which it is also
claimed that it was one of the first departures from traditional fixed to variable block-size schemes. In their
approach the transform coefficients are processed by
vector quantization.
In this paper an adaptive block-size transform coding scheme based on the JPEG algorithm is reported,
and some results are presented showing the improvement of the compression efficiency with respect to the
non-adaptive scheme.
The remainder of this paper is organized as follows.
The adaptive scheme is described in Section 2. The selection of the classification parameters is discussed in
Section 3. Computational complexity issues are analyzed in Section 4. The performance of the algorithm
is reported in Section 5, and finally the conclusions are
stated in Section 6.
2. ADAPTIVE BLOCK-SIZE SCHEME
A block diagram of the adaptive block-size scheme is
shown in Figure 1. It is based on the sequential JPEG
algorithm [5], and it is described in the following paragraphs.
The input image is divided into blocks of 16x16 pixels (referred to as B blocks in this paper). On each of
these blocks a metric is evaluated in order to determine
its degree of image activity. If the resulting measure
is below a predefined threshold, then the block is classified as a 0 block. Otherwise, it is classified as a 1
block.
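The classification step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the metric (standard deviation) and the threshold value (3.5) are those reported in Section 3 for prediction error frames, and the use of the population standard deviation is an assumption, since the paper does not say which estimator was used.

```python
import math

def block_std(block):
    """Population standard deviation of all pixel values in a 2-D block (assumed estimator)."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))

def classify_blocks(image, size=16, threshold=3.5):
    """Map the top-left corner of each size x size block to its classification bit.

    Bit 0 marks a low-activity block (metric below the threshold), bit 1 otherwise.
    Partial blocks at the right and bottom borders are ignored in this sketch.
    """
    height, width = len(image), len(image[0])
    bits = {}
    for r in range(0, height - size + 1, size):
        for c in range(0, width - size + 1, size):
            block = [row[c:c + size] for row in image[r:r + size]]
            bits[(r, c)] = 0 if block_std(block) < threshold else 1
    return bits

# Usage: a 16x32 test image whose left B block is flat and whose right B block
# is a checkerboard of 0 and 255.
flat = [[128] * 16 for _ in range(16)]
busy = [[255 if (x + y) % 2 else 0 for x in range(16)] for y in range(16)]
image = [flat[y] + busy[y] for y in range(16)]
bits = classify_blocks(image)
```

In this toy image the flat block falls below the threshold and the checkerboard block does not, so the two B blocks receive different classification bits.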
Figure 1: Adaptive JPEG-based algorithm (block diagram: Original Image, Blocks of 16x16 pixels (B Blocks), Block Classification, Adaptive 8- or 16-point 2-D DCT, Quantizer, DC-DPCM, Run-length coding, Huffman Coder, Header Data, Compressed image data)
An adaptive DCT module receives the B blocks along with their classification bit. The DCT module executes a 16-point 2-D DCT on those B blocks that have been classified as 0 blocks. The B blocks classified as 1 blocks are further divided into 4 subblocks of 8x8 pixels each, and then an 8-point 2-D DCT is applied on each of the 4 subblocks. The 2-D DCT coefficients are quantized and then zigzag reordered in order to lengthen the runs of zero-valued coefficients. The DC coefficients are coded with a DPCM scheme, whereas the AC coefficients are run-length coded. Finally, the resulting symbols are Huffman coded.
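The switching between the two transform sizes can be illustrated with a short sketch. It is written under stated assumptions: the function names are hypothetical, the DCT uses orthonormal scaling (which for N = 8 coincides with the JPEG normalization), and it is computed by direct matrix multiplication rather than by a fast algorithm. Quantization, zigzag reordering and entropy coding are left out.

```python
import math

def dct_matrix(n):
    """Orthonormal n-point DCT-II matrix; row k holds the k-th basis function."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def dct2(block):
    """Separable 2-D DCT of a square block: C * X * C^T."""
    c = dct_matrix(len(block))
    c_t = [list(col) for col in zip(*c)]
    return matmul(matmul(c, block), c_t)

def adaptive_dct(b_block, bit):
    """Transform one 16x16 B block according to its classification bit.

    bit 0: a single 16-point 2-D DCT (one 16x16 coefficient block is returned).
    bit 1: four 8-point 2-D DCTs, one per 8x8 subblock, in raster order.
    """
    if bit == 0:
        return [dct2(b_block)]
    return [dct2([row[c:c + 8] for row in b_block[r:r + 8]])
            for r in (0, 8) for c in (0, 8)]

# Usage: a uniform block concentrates all its energy in the DC coefficient.
uniform = [[10.0] * 16 for _ in range(16)]
as_zero_block = adaptive_dct(uniform, 0)   # one 16x16 coefficient block
as_one_block = adaptive_dct(uniform, 1)    # four 8x8 coefficient blocks
```

For a uniform region, the 0-block path yields one non-zero DC coefficient where the 1-block path yields four, which is the redundancy the adaptive scheme aims to avoid.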
The purpose of the adaptive scheme is to take advantage of large uniform regions within the image to be coded. Instead of coding these 16 × 16-pixel image regions as four blocks of 8 × 8 pixels (as would be done by a fixed block-size scheme), the algorithm codes them as a single block of 16 × 16 pixels.
Before describing the selection of the classification
parameters, it is useful to remark that two different
types of applications are possible for the adaptive
scheme: either it is used to code natural still images,
or it is embedded in a video coding system in order to
code prediction error frames. The need for an application-oriented definition comes from the fact that natural still images and prediction error frames are image data of different structure. The classification algorithm
and its corresponding parameters can then be better
selected to match the input data, in search of a higher
coding efficiency. In this paper, we report the results
of addressing the video application.
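Because each B block carries a single classification bit, the side information added by the scheme described in this section can be estimated directly. The sketch below does this arithmetic; one bit per 16x16 block gives 1/256 bit/pixel, which is consistent with the overhead of 0.004 bit/pixel quoted in Section 3. The function names are hypothetical, and the 176x144 frame size is only an example, since the paper does not state the frame dimensions of the test sequences.

```python
def overhead_bits_per_pixel(block_size=16, bits_per_block=1):
    """Side information per pixel when every block carries a fixed number of bits."""
    return bits_per_block / (block_size * block_size)

def overhead_bits_per_frame(width, height, block_size=16, bits_per_block=1):
    """Total classification bits for a frame whose sides are multiples of block_size."""
    blocks = (width // block_size) * (height // block_size)
    return blocks * bits_per_block

rate = overhead_bits_per_pixel()                 # 1 / 256 bit per pixel
frame_bits = overhead_bits_per_frame(176, 144)   # assumed frame size: 11 * 9 B blocks
```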
3. SELECTION OF THE CLASSIFICATION PARAMETERS
One sensitive drawback of most adaptive image coding algorithms is the generation of overhead information. In general, the higher the degree of adaptability of a coding algorithm, the larger the amount of overhead data that needs to be sent to the decoder.
In an adaptive block-size scheme, the overhead information is closely related to the size of the blocks and to the number and segmentation structure of the different classes.
For our adaptive scheme, it was found that for coding prediction error frames, a two-class scheme with two different block transform sizes and a largest block size of 16x16 pixels was a good compromise between minimum information overhead (0.004 bit/pixel) and good compression efficiency.
The standard deviation was used as the metric to classify the B blocks. Experimentally, it was determined that setting the classification threshold at a value of 3.5 leads to decoded frames of the same quality as those obtained by the non-adaptive scheme.
4. COMPUTATIONAL COMPLEXITY
By analyzing the spectrum of the transform of the 0 blocks, it was noticed that only the first four low-frequency coefficients have a significant magnitude. The heavy computational overhead of a 16-point 2-D DCT is then reduced considerably, since only these four coefficients need to be calculated. Thus, for each block classified as a 0 block, a saving of about 85% of the operations is achieved with respect to the non-adaptive scheme. Since the proportion of selected 0 blocks is typically high (e.g., an average of 56% and 78% for the first 150 prediction error frames of Miss America and Claire, respectively), the global reduction of operations, taking into account the computational overhead of the classification stage, is 38% for Miss America and 55% for Claire. This important saving of operations is even higher at the decoder, where the classification overhead is not present.
5. RESULTS
The success of the adaptive algorithm in improving the
compression ratio depends on the number of 0 blocks
that are selected during the classification stage. The
higher the number of selected 0 blocks, the higher the
improvement of the compression ratio with respect to
the non-adaptive scheme.
Coding the image Lena (for high reconstruction quality) with the proposed scheme gives only a modest improvement of the compression ratio. This is due to the very small number of 0 blocks that are selected by the classification algorithm, as shown in Figure 2. This was nevertheless the expected result, since the classification parameters were not chosen to match the statistics of natural still images.
Figure 2: Lena: Block Classification
On the other hand, for prediction error frames the number of 0 blocks selected during the block classification is very high. For the sequence Miss America, the percentage of selected 0 blocks per frame is shown in Figure 3. As can be seen from this curve, for most of the frames more than half of the B blocks are classified as 0 blocks, which has a strong effect on the compression efficiency.
The coding performance of the adaptive algorithm in video sequences has been studied by using the coding scheme shown in Figure 4. This is a coder based on the H.261 standard [7], without motion compensation.
Curves showing the compression ratio of the adaptive versus the non-adaptive method are shown in Figure 5(a) for the sequence Miss America. The quality of the reconstructed frames, from a subjective point of view, is the same for both the adaptive and non-adaptive schemes. The PSNR curve of the adaptive scheme is, on average, 0.081 dB below the corresponding curve of the non-adaptive scheme: a difference that is not perceptible by the human visual system [6].
Figure 6 shows the original and the reconstructed frame No. 13 of Miss America. This is the frame with the highest percentage of selected 0 blocks.
The results of the compression ratio for the sequence Claire are shown in Figure 5(b). The conclusions regarding the quality of the reconstructed frames are the same as those pointed out for the sequence Miss America.
It is worth noting that the curves in Figure 5 do not include motion estimation/compensation. It is expected that by using a motion compensation technique, the compression ratio will be further improved, since a better prediction would increase the percentage of selected 0 blocks in the prediction error frames.
6. CONCLUSIONS
An adaptive block-size transform coding scheme for image compression was presented. The method features a minimum information overhead, a significant reduction of the computational complexity and a coding efficiency that largely outperforms its non-adaptive counterpart.
A combination of all these advantages may contribute to the development of effective solutions in various application fields, in particular for low-power portable video communication systems.
7. ACKNOWLEDGEMENTS
This work was supported by the Swiss National Science Foundation under Grant FN 2000-47’185.96, and
by the Laboratory of Microtechnology (LMT EPFL).
The latter entity is common to the Swiss Federal Institute of Technology, Lausanne, and the Institute of
Microtechnology, University of Neuchâtel.
8. REFERENCES
[1] R. J. Clarke, Transform Coding of Images. Academic Press, London, UK, 1985.
[2] P. F. Farrelle, Recursive Block Coding for Image Data Compression. Springer-Verlag, New York, USA, 1990.
[3] K. R. Rao and P. Yip, Discrete Cosine Transform: Algorithms, Advantages, Applications. Academic Press, Boston, MA, USA, 1990.
[4] J. Vaisey and A. Gersho, “Image Compression with
Variable Block Size Segmentation”, IEEE Transactions on Signal Processing, Vol. 40, No. 8, August
1992, pp. 2040–2060.
[5] W. B. Pennebaker and J. L. Mitchell, JPEG Still
Image Data Compression Standard. Van Nostrand
Reinhold, New York, USA, 1993.
[6] V. Bhaskaran and K. Konstantinides, Image and
Video Compression Standards. Algorithms and Architectures. Kluwer Academic Publishers, Boston,
MA, USA, 1995.
[7] ITU-T Recommendation H.261, “Video codec for audiovisual services at p × 64 kbit/s”, March 1993.
Figure 3: Percentage of selected 0 blocks (sequence Miss America; horizontal axis: Frame Number, vertical axis: % of selected 0 blocks)
Figure 4: Adaptive H.261-based algorithm (block diagram: Input, Block Classification (BC), Inter / Intra Mode, Adaptive (8- or 16-point) 2-D DCT (T), Quantizer (Q), Variable Length Coding (VLC), Q-1, T-1, Frame Memory, Data)
Figure 5: Compression Ratio, (a) Miss America, (b) Claire (horizontal axis: Frame Number, vertical axis: Compression Ratio; top curve: adaptive block-size, bottom curve: fixed block-size)
Figure 6: Miss America: Frame No. 13, (a) Original, (b) Reconstructed