
Chapter-5

DATA COMPRESSION
What is Data Compression?
• Data compression is the process of converting an input data stream or
the source stream or the original raw data into another data stream
that has a smaller size.
• Examples include text compression, image compression, audio compression and video compression.
Why Data Compression?
• Uncompressed graphics, audio and video data require considerable
storage capacity , so data compression helps to save storage space.
• The same is true for multimedia communications: compressed multimedia data requires less bandwidth and transfer time than uncompressed data.
• Data compression thus provides feasible and cost-effective solutions to both problems.
Types of Compression:
1. Lossy Compression:
• Lossy compression algorithms normally do not reproduce an exact copy of the source information after decompression, but rather a version of it that is perceived by the recipient as a true copy.
• In lossy compression some information is lost during processing: the image data is sorted into important and unimportant data, and the system then discards the unimportant data.
• It provides much higher compression rates, but there is some loss of information compared to the original source file.
• The main advantage is that the loss is not visible to the eye, i.e. the result is visually lossless.
• Visually lossless compression is based on knowledge about color images and human perception.
Contd…
2. Lossless Compression:
• In this type of compression no information is lost during the compression and decompression processes.
• Here the reconstructed image is mathematically and visually identical to the original one.
• It typically achieves only about a 2:1 compression ratio.
• This type of compression technique looks for patterns in strings of bits and
then expresses them more concisely.
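As an illustration of lossless pattern-based compression, the following sketch uses Python's standard zlib module (which implements DEFLATE, a dictionary-plus-Huffman scheme not covered in this chapter) to compress a repetitive byte string and verify exact reconstruction:

```python
import zlib

# Repetitive data compresses well; random data would not.
data = b"AB" + b"C" * 100 + b"DEF" + b"G" * 50

compressed = zlib.compress(data)
restored = zlib.decompress(compressed)

# Lossless: the reconstruction is bit-for-bit identical to the input.
assert restored == data
assert len(compressed) < len(data)
print(len(data), "->", len(compressed), "bytes")
```

The achievable ratio depends entirely on how much repetition the input contains, which is why lossless schemes typically manage only about 2:1 on general data.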
Coding Requirements:
• Let us consider the general requirements imposed on most multimedia systems:
• Storage :
• Images require much more storage space than simple text; audio and video have even more
demanding properties for data storage.
• For example:
• a full-screen true-color image is 640 × 480 × 3 = 921,600 bytes
• the size of one second of uncompressed CD-quality stereo audio is 44.1 kHz × 2 channels × 2 bytes = 176,400 bytes
• the size of one second of uncompressed PAL video is 384 × 288 × 3 bytes × 25 frames = 8,294,400 bytes
• Throughput:
• continuous media require very large throughput
• For example :
• an uncompressed CD-quality stereo audio stream needs 176,400 bytes/sec. A raw digitized PAL TV signal needs (13.5 MHz + 6.75 MHz + 6.75 MHz) × 8 bits
• = 216 × 10^6 bits/sec = 27 × 10^6 bytes/sec
Calculation
• To determine the storage requirements for the PAL standard, we assume the same image resolution of 640 × 480 pixels and 3 bytes/pixel to encode the luminance and chrominance components. Hence the storage requirement for one image frame is 640 × 480 × 3 bytes = 921,600 bytes, or 7,372,800 bits. To store 25 frames/second, the storage requirement is 230.4 × 10^5 bytes or 184.32 × 10^6 bits per second.
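The storage and throughput figures above can be reproduced with a few lines of arithmetic (a sketch; the resolutions and sample rates are those assumed in the slides):

```python
# Storage for one true-color frame at the assumed 640 x 480 resolution
frame_bytes = 640 * 480 * 3                 # 921,600 bytes per frame

# 25 frames per second of video
video_bytes_per_s = frame_bytes * 25        # 23,040,000 bytes = 230.4 x 10^5
video_bits_per_s = video_bytes_per_s * 8    # 184,320,000 bits = 184.32 x 10^6

# One second of uncompressed CD-quality stereo audio:
# 44,100 samples/s x 2 channels x 2 bytes/sample
audio_bytes_per_s = 44_100 * 2 * 2          # 176,400 bytes

# Raw digitized PAL TV signal throughput
pal_bits_per_s = int((13.5e6 + 6.75e6 + 6.75e6) * 8)  # 216 x 10^6 bits/s
pal_bytes_per_s = pal_bits_per_s // 8                 # 27 x 10^6 bytes/s

print(frame_bytes, video_bits_per_s, audio_bytes_per_s, pal_bytes_per_s)
```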
Certain Constraints of Compression
Technique
• The quality of the coded, and later on, decoded data should be as
good as possible.
• To make cost-effective implementation possible, the complexity of the
technique used should be minimal.
• The processing of the algorithm must not exceed a certain time span.
Techniques of Data Compression
• There are different important techniques of data compression:
Entropy Encoding
• Is a lossless process
• Is used regardless of the media’s specific characteristics.
• The data stream to be compressed is considered to be a simple digital sequence, and the semantics of the data are ignored.
• It is lossless because the decompression process regenerates the data completely.
• Example: Run-length coding, Huffman coding, Arithmetic coding
Source Encoding
• Is a lossy process
• It takes into account the semantics of the data.
• degree of compression depends on data content.
• E.g. content prediction technique - DPCM, delta modulation
Hybrid Coding
• Most multimedia systems use hybrid techniques.
• combine entropy with source encoding
• E.g. JPEG, H.263, DVI (RTV & PLV), MPEG-1, MPEG-2, MPEG-4
Major Steps of Data Compression
Contd…
1. Picture Preparation:
• Preparation includes analog to digital conversion and generating an appropriate digital
representation of the information.
• An image is divided into blocks of 8x8 pixels, and represented by a fixed number of bits per
pixel.
2. Picture Processing:
• Processing is actually the first step of the compression process which makes use of
sophisticated algorithms.
• A transformation from the time to the frequency domain can be performed using DCT.
• In the case of motion video compression, interframe coding uses a motion vector for each 8×8 block.
• Motion estimation is computed for digital video.
• This step can be repeated iteratively several times.
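The time-to-frequency transformation mentioned above can be sketched with a plain one-dimensional DCT-II (JPEG applies it in two dimensions over 8×8 blocks; this unscaled version is an illustration, not the normalized transform of any particular codec):

```python
import math

def dct_1d(x):
    """Unscaled 1D DCT-II: X[k] = sum_n x[n] * cos(pi/N * (n + 0.5) * k)."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi / N * (n + 0.5) * k) for n in range(N))
            for k in range(N)]

# A constant block has no variation, so all energy lands in the
# DC coefficient X[0]; every AC coefficient is (numerically) zero.
block = [5.0] * 8
coeffs = dct_1d(block)
assert abs(coeffs[0] - 40.0) < 1e-9
assert all(abs(c) < 1e-9 for c in coeffs[1:])
```

Concentrating the energy of smooth image regions into a few low-frequency coefficients is what makes the subsequent quantization step effective.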
Contd…
3. Quantization:
• Quantization processes the results of the previous step.
• It specifies the granularity of the mapping of real numbers onto integers.
• This process results in a reduction of precision.
• For example, the coefficients could be quantized using a different number of bits per coefficient, e.g. 12 bits for real values and 8 bits for integer values.
• This step can be repeated iteratively several times.
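A minimal sketch of the uniform quantization described above: real-valued coefficients are mapped to integers at a chosen granularity (the step size here is an arbitrary example, not a value from any standard):

```python
def quantize(value, step):
    """Map a real number to an integer index; precision is lost."""
    return round(value / step)

def dequantize(index, step):
    """Reconstruct an approximation of the original value."""
    return index * step

step = 0.5
original = 3.7
q = quantize(original, step)        # round(3.7 / 0.5) = 7
restored = dequantize(q, step)      # 7 * 0.5 = 3.5

# The reconstruction error is bounded by half the step size.
assert abs(original - restored) <= step / 2
```

A larger step size yields smaller integer indices (fewer bits) at the cost of a larger reconstruction error; that trade-off is the essence of this step.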
4. Entropy Encoding:
• Entropy encoding is usually the last step.
• It compresses a sequential digit data stream without loss.
• For example, a sequence of zeroes in a data stream can be compressed by specifying
the number of occurrences followed by the zero itself.
Some Compression Technique:
Run Length Encoding:
• Sampled images, audio and video data streams often contain sequences of the
same bytes.
• By replacing these repeated byte sequences with the number of occurrences,
a substantial reduction of data can be achieved. This is called run length
encoding.
• To illustrate byte-stuffing, we define the exclamation mark “!” as a special flag.
• A single occurrence of this exclamation flag is interpreted as a special flag during decompression.
• Two consecutive exclamation flags are interpreted as a single exclamation mark occurring within the data.
Contd…
• In this method, if a byte occurs at least four consecutive times, the
number of occurrences is counted.
• In the following example, the character “C” occurs 8 consecutive
times and is “compressed” to 3 characters “C!8”:

• Uncompressed data: ABCCCCCCCCDEFGGG

• Run-length coded: ABC!8DEFGGG
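The scheme above — a run of at least four identical characters becomes character, "!" flag, count, and a literal "!" is doubled — can be sketched as follows (a sketch only; corner cases such as a run followed by digit characters are ignored):

```python
def rle_encode(data: str) -> str:
    """Run-length encode with '!' as the flag character (byte-stuffing)."""
    out = []
    i = 0
    while i < len(data):
        ch = data[i]
        run = 1
        while i + run < len(data) and data[i + run] == ch:
            run += 1
        if ch == "!":
            out.append("!!" * run)            # stuff literal exclamation marks
        elif run >= 4:
            out.append(ch + "!" + str(run))   # e.g. "CCCCCCCC" -> "C!8"
        else:
            out.append(ch * run)              # short runs are cheaper verbatim
        i += run
    return "".join(out)

assert rle_encode("ABCCCCCCCCDEFGGG") == "ABC!8DEFGGG"
```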


Contd…
• Determine Run length encoding of following:

• Data: AAVVDDEEFFFFG12S4444
Contd
Diatomic encoding:
• It is a variation of run-length encoding based on a combination of two data
bytes.
• This technique determines the most frequently occurring pairs of bytes.
• According to an analysis of the English language, the most frequently occurring pairs are the following:
• “e” and “t”, “a” and “s”, “t” and “h”, “r” and “e”, “i” and “n”, “h” and “e”, etc.
• This leads to a data reduction of more than 10%.
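A sketch of the idea behind diatomic encoding: count adjacent byte pairs, then replace the most frequent pair with a single unused code (the placeholder byte "\x01" is an arbitrary choice for illustration, not part of any standard):

```python
from collections import Counter

def diatomic_encode(text: str, code: str = "\x01"):
    """Replace the most frequent adjacent pair with a one-byte code."""
    pairs = Counter(text[i:i + 2] for i in range(len(text) - 1))
    best, _ = pairs.most_common(1)[0]
    return text.replace(best, code), best

def diatomic_decode(encoded: str, pair: str, code: str = "\x01") -> str:
    """Expand the one-byte code back into the original pair."""
    return encoded.replace(code, pair)

sample = "in the heat of the night the theory held"
encoded, pair = diatomic_encode(sample)
assert len(encoded) < len(sample)          # one byte saved per occurrence
assert diatomic_decode(encoded, pair) == sample
```

A production scheme would assign codes to many frequent pairs at once rather than just one, which is how the cited reduction of more than 10% is reached.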
Contd…
• Huffman Encoding:
• Different characters do not have to be coded with a fixed number of bits.
• Frequently occurring characters are coded with shorter strings, and seldom-occurring characters are coded with longer strings.
• Huffman coding is based on this statistical method.
• Given the characters that must be encoded, together with the probability of
their occurrences, the Huffman coding algorithm determines the optimal
code using the minimum number of bits.
• In text, the shortest code is assigned to those characters that occur most
frequently.
• To determine a Huffman code, it is useful to construct a binary tree.
Contd
• The leaves (nodes) of this tree represent the characters that are to be encoded.
• Every node contains the occurrence probability of one of the
characters belonging to this subtree.
• 0 and 1 are assigned to the branches of the tree
• Question: Find Huffman coding for following:

Node:        A     B     C     D     E
Probability: 0.16  0.51  0.09  0.13  0.11
Contd…
• We have:
• p(A)=0.16, p(B)=0.51, p(C)=0.09, p(D)=0.13, p(E)=0.11
• Characters with the lowest probabilities are combined in the first
binary tree, thus C and E are the leaves. The combined probability of
their root node CE is 0.20.
• The edge from node CE to C is assigned a 1 and the edge from CE to E
becomes a 0. This is an arbitrary assignment.
Contd…
• The following nodes remain:
• p(A)=0.16, p(B)=0.51, p(CE)=0.20, p(D)=0.13
• Again two nodes with the lowest probabilities are combined into a binary
tree:
• Nodes A and D and p(AD)=0.29
• Remaining nodes :
• P(AD)=0.29, p(B)=0.51, p(CE)=0.20
• Again:
• Nodes p(AD) and p(CE) with p(ADCE)=0.49
• Remaining nodes:
• P(ADCE)=0.49 and p(B)=0.51
Contd..
• Finally the root node p(ADCEB)=1.0
Contd..
• And the Huffman code is:
• w(A)=001, w(B)=1, w(C)=011, w(D)=000, w(E)=010
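The construction above can be sketched with Python's standard heapq module. Tie-breaking among equal probabilities is arbitrary, so the exact bit patterns may differ from the slides, but the code lengths (1 bit for B, 3 bits for the rest) and the prefix property are the same:

```python
import heapq

def huffman_codes(probs):
    """Build Huffman codes by repeatedly merging the two least probable nodes."""
    # Heap entries: (probability, tie-breaker, {symbol: code-so-far})
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(sorted(probs.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # least probable subtree
        p2, _, c2 = heapq.heappop(heap)   # second least probable
        merged = {s: "0" + c for s, c in c1.items()}   # 0-branch
        merged.update({s: "1" + c for s, c in c2.items()})  # 1-branch
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

codes = huffman_codes({"A": 0.16, "B": 0.51, "C": 0.09, "D": 0.13, "E": 0.11})
assert len(codes["B"]) == 1
assert all(len(codes[s]) == 3 for s in "ACDE")
```

The expected code length is 0.51 × 1 + (0.16 + 0.09 + 0.13 + 0.11) × 3 = 1.98 bits per character, versus 3 bits for a fixed-length code over five symbols.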
JPEG(Joint Photographic Experts Group)
• JPEG is an image compression standard that was developed by the “Joint
Photographic Experts Group”.
• JPEG is a commonly used method of lossy compression for digital photography
(image).
• It employs a transform coding method using the DCT (Discrete Cosine Transform)
• The JPEG image compression standard became an international standard in 1992
• JPEG can be applied to color or grayscale images.
• A fast coding and decoding of still images is also used for video sequences known
as Motion JPEG.
Requirements:
• The JPEG implementation should be independent of image size.
• The JPEG implementation should be applicable to any image and pixel
aspect ratio.
• Color representation itself should be independent of the special
implementation.
• Image content may be of any complexity, with any statistical characteristics.
• The JPEG standard specification should be state-of-the-art (or near)
regarding the compression factor and achieved image quality.
Four Variants of image compression of JPEG
1. The lossy sequential DCT-based mode must be supported by every
JPEG implementation.
2. The expanded lossy DCT-based mode provides a set of further enhancements to the baseline process.
3. The lossless mode has a low compression ratio that allows perfect
reconstruction of the original image.
4. The hierarchical mode accommodates images of different
resolutions and selects its algorithms from the three modes defined
above.
Main Step of JPEG Compression Process
Assignment #3

• Explain the Lossy Sequential DCT-based Mode, Expanded Lossy DCT-based Mode, Lossless Mode and Hierarchical Mode of JPEG compression.
DVI(Digital Video Interactive)
• It is a technology whose fundamental components include coding algorithms, a VLSI chip set for the video subsystem, specified data formats for audio and video files, an application user interface to the audio-visual kernel, and compression as well as decompression algorithms.
• DVI can process data, text, graphics , still images, video and audio.
History of DVI
• DVI started as a project at the David Sarnoff Research Center of the RCA Company in Princeton in 1984.
• At that time, the major goals were to compress video and audio at a data rate appropriate for a CD, and to decompress it in real time.
• In 1986, the first draft of a DVI-specific chip using a silicon compiler was developed.
• In 1986, General Electric (GE) took over this technology.
• The DVI development team became employees of GE and the project continued.
• The first public presentation took place at the second Microsoft CD-ROM conference in Seattle in March
1987.
• For the first time, the real time play-back of video stored on a CD-ROM was demonstrated.
• In 1989 at the fourth Microsoft CD-ROM conference, IBM and Intel announced their cooperation
concerning DVI.
• The DVI team was later taken over by Intel.
• In 1993, the software-only decoder became available as the product Indeo.
DVI video processing
• Concerning audio, the demand for a hardware solution that can be implemented at a
reasonable price is met by using a standard signal processor.
• Processing of still images and video is performed by a video processor.
• Video processor consists of two VLSI chips containing more than 265000 transistors
each.
1. VDP1:
• pixel processor
• It processes bitmaps and is programmed in microcode
2. VDP2:
• display processor
• Generates analog RGB signals out of the different bitmap formats and its configuration is also programmable.
• The coupling of the processors is carried out by the Video-Ram(VRAM).
Contd…
Audio and Still Image Encoding
• Audio signals are digitized using 16 bits per sample and are either PCM-encoded or
compressed using the Adaptive Differential Pulse Code Modulation(ADPCM) technique.
• Thereby, a reduction to about four bits per sample is achieved at a quality
corresponding to stereo broadcasting.
• Different sampling frequencies are supported (11025 Hz, 22050 Hz etc.)
• When encoding still images, different video input formats can be used.
• These can be composite, as well as component video signals like RGB.
• In the case of an RGB signal, the color of each pixel is split into portions of the three
colors of the spectrum- red, green and blue and each color is processed separately.
• For image preparation, DVI assumes an internal digital YUV format, i.e. any video input
signal must first be transformed into this format.
Contd…
• The color of each pixel is split into a luminance component Y and the two chrominance
components U and V.
• Starting with an RGB signal, DVI computes the YUV signal using the following relationships:
• Y= 0.30R+0.59G+0.11B
• U=B-Y
• V= R-Y
• DVI always combines all chrominance components of 4X4 pixel blocks into a single value.
• The chrominance component of the top left pixel of such a block is used as the reference
value for the 16 pixels.
• To increase image quality during presentation, an interpolation technique is applied to
the chrominance values of adjacent blocks.
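The luminance/chrominance split and DVI's 4×4 chrominance reduction can be sketched as follows (the interpolation step mentioned above is omitted; this is an illustration of the scheme, not DVI's actual implementation):

```python
def rgb_to_yuv(r, g, b):
    """Split a pixel into luminance Y and chrominance U, V (DVI relationship)."""
    y = 0.30 * r + 0.59 * g + 0.11 * b
    return y, b - y, r - y   # U = B - Y, V = R - Y

def encode_block(block):
    """Keep full-resolution luminance; use the top-left pixel's chrominance
    as the single reference value for the whole 4x4 block."""
    y_plane = [[rgb_to_yuv(*px)[0] for px in row] for row in block]
    _, u_ref, v_ref = rgb_to_yuv(*block[0][0])
    return y_plane, u_ref, v_ref

# A uniform gray block: luminance equals the gray level, chrominance is zero.
gray = [[(100, 100, 100)] * 4 for _ in range(4)]
y_plane, u_ref, v_ref = encode_block(gray)
assert all(abs(y - 100.0) < 1e-9 for row in y_plane for y in row)
assert abs(u_ref) < 1e-9 and abs(v_ref) < 1e-9
```

Storing two chrominance values per 16 pixels instead of per pixel reduces the chrominance data by a factor of 16, exploiting the eye's lower sensitivity to color detail.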
Video Encoding
• Two technique:
1. Presentation- Level Video(PLV):
• It is characterized by its better quality.
• This is achieved at the expense of a very time consuming asymmetric compression
performed by specialized compression facilities.
• PLV is suitable for applications distributed on CD-ROM.
Contd…
Contd..
2. Real-time Video(RTV):
• This is a symmetric compression technique that works with hardware and
software and can be performed in real-time.
• If an algorithm takes the same time to compress a data archive as it does to
decompress it, it is considered symmetrical.
• In previous versions of DVI, RTV was known as Edit Level Video (ELV).
• ELV was conceived to enable the developers of DVI applications to see their
video sequences with reduced quality during the construction phase.
• Afterwards, they would send the videotapes to a DVI compression facility to
be compressed in the PLV mode and would get in return compressed video
sequences of higher quality than RTV.
Finding Minimal SSD
Finding Minimal SAD
Contd…
• Motion Computation for P-frames:
• Look for match window within search window :
• Match window corresponds to macro-block
• Search window corresponds to arbitrary window size
• The larger the search window, the better the motion estimation
• Matching methods:
• SSD (Sum of Squared Differences)
• SSD = Σ_i (x_i − y_i)^2, computed in steps:
• subtract pixel by pixel
• square the residuals
• sum them
• find the minimal SSD among the matching windows
• SAD (Sum of Absolute Differences)
• SAD = Σ_i |x_i − y_i|
• the cost is the sum of the absolute values of the residuals
• Outliers: in SSD, one bad point gives a large difference, which skews the decision of picking the correct match window.
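The two matching criteria can be sketched over one-dimensional pixel rows (real codecs compare 2D macro-blocks against a 2D search window, but the arithmetic is the same):

```python
def ssd(a, b):
    """Sum of squared differences: subtract, square, sum."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sad(a, b):
    """Sum of absolute differences: outliers are penalized less than in SSD."""
    return sum(abs(x - y) for x, y in zip(a, b))

def best_offset(match, search, metric):
    """Slide the match window across the search window and return the
    offset with the minimal cost under the given metric."""
    n = len(match)
    return min(range(len(search) - n + 1),
               key=lambda o: metric(match, search[o:o + n]))

search = [0, 0, 5, 6, 7, 0, 0, 0]
match = [5, 6, 7]
assert best_offset(match, search, ssd) == 2
assert best_offset(match, search, sad) == 2
```

The winning offset becomes the motion vector for that block; SAD is often preferred in practice precisely because a single outlier pixel distorts its cost less than the squared term in SSD.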
MPEG(Moving Picture Expert Group)
• The MPEG standard was developed by ISO/IEC JTC1/SC 29/WG11 to
cover motion video as well as audio coding according to the ISO/IEC
standardization process.
• MPEG strives for a data stream compression rate of about 1.2
Mbits/second, which is today’s typical CD-ROM data transfer rate.
• It can deliver a data rate of at most 1,856,000 bits/second.
• Data rates for audio are between 32 and 448 kbits/second.
• In 1993, MPEG was accepted as the International Standard and the
first commercially available MPEG products entered the market.
Contd..
• MPEG standard explicitly considers functionalities of following standard:
• JPEG:
• JPEG standard development was always ahead of the MPEG standard.
• The MPEG standard makes use of JPEG.
• H.261:
• Since H.261 standard was already available during the work on the MPEG standard, the
working group strived for compatibility with this standard.
• However MPEG is the more advanced technique.
• MPEG is suitable for symmetric as well as asymmetric compression.
• Asymmetric compression requires more effort for coding than for decoding; compression is performed once, while decompression is performed many times.
Video Encoding
• Image preparation phase of MPEG is similar to H.261
• Each image consists of three components.
• The luminance component has twice as many samples in the horizontal
and vertical axes as the other two components.
• The resolution of the luminance component should not exceed 768 × 576 pixels; for each component, a pixel is coded with eight bits.
• The MPEG data stream includes more information than a data stream compressed according to the JPEG standard. For example, it defines 14 different pixel aspect ratios and 8 image refresh frequencies.
Contd…
• MPEG stream includes 4 types of image coding for video processing:
1. I-frames :
• Intra-coded frames
2. P-frames:
• Predictive-coded frames
3. B-frames:
• Bi-directionally predictive coded frames
4. D-frames:
• DC coded frames
I-frames(Intra-coded images)
• I-frames are self contained i.e. coded without any reference to other images.
• An I-frame is treated as a still image.
• MPEG makes use of JPEG for I-frames.
• Compression rate of I-frames is the lowest within MPEG.
• However, contrary to JPEG, compression must often be executed in real-time.
• I-frames use 8X8 blocks defined within a macro block, on which a DCT is
performed.
• The DC-coefficients are then DPCM coded (differences of successive blocks of
one component are computed and transformed using variable-length coding)
P-frames(Predictive-coded frames)
• P-frames require information from the previous I-frame and/or all previous P-frames for encoding and decoding.
• The coding of P-frames is based on the fact that, in successive images, areas often do not change at all; instead, the whole area is shifted.
• To exploit temporal redundancy, the last P- or I-frame that is most similar to the block under consideration is determined.
Macro-blocks in P frames
MPEG/Video Processing of P frames
• Apply 2D DCT to macro-blocks not reduced by motion compensation.
• The motion vectors of adjacent macro-blocks often differ only slightly, hence DPCM is applied (differences of successive blocks of one component are computed and transformed using variable-length coding).
• P-frames consist of I-frame macro blocks and 6 predictive macro-
blocks.
• P-frames are quantized and entropy encoded using run length
encoding.
B-frames(Bi-directionally predictive-coded
frames)
• B-frames require information from the previous and the following I- and/or P-frames for encoding and decoding.
• The highest compression ratio is attainable by using these frames.
• A B-frame is defined as the difference of a prediction of the past
image and the following P- or I- frame.
• B-frames can never be directly accessed in a random fashion.
D-frames(DC-coded frames)
• They are intra frame encoded.
• They can be used for fast forward or fast rewind modes.
• The DC parameters are DCT-coded.
• D-frames consist only of the lowest frequencies of an image.
• They only use one type of macro block and only the DC-coefficients
are encoded.
