PPT 1 – Detailed Theory Notes
1. Multimedia (Slide 5)
• Definition: Multimedia means presenting information using a combination of text, images, audio, video,
and graphics.
• Features:
1. Multiple Modalities: Different types of media together (e.g., text + sound).
2. Interactivity: User can interact (pause, play, click, navigate).
• Applications:
o Video Conferencing: Google Meet, Zoom (used for meetings).
o Telemedicine: Doctor consulting a patient over video call.
o E-learning: Online platforms like Byju’s, Coursera.
• Real Life Example: WhatsApp → we use text messages, voice notes, pictures, and videos all together
= multimedia.
2. Historical Perspective (Slide 6)
• Timeline of Digital Media:
o 1975–1980: Digital Sound (music stored in digital form).
o 1980–1985: Digital Images (scanners, digital photos).
o 1985–1990: Digital Video (camcorders, CDs).
o 1990 onwards: Digital Graphics/Animation (3D models, games).
• Real Life Example:
o Earlier → Black & White Analog TV.
o Now → Smart TV with Netflix, YouTube in HD = Digital revolution.
3. Audio (Slides 7–14)
• Definition: Audio is a continuous (analog) wave signal, so it must be digitized (sampled and quantized) before a computer can store it.
• Key Concepts:
1. Amplitude: Loudness of sound (measured in decibels).
2. Frequency: Pitch of sound (measured in Hz → cycles/sec).
3. Sampling: Taking values of wave at regular intervals.
▪ Telephone → 8000 samples/sec.
▪ CD quality → 44100 samples/sec.
4. Quantization: Converting each sampled value into bits.
▪ More bits = better quality, larger storage.
• Formats: WAV, MP3, WMA, MIDI.
• Tools: Adobe Audition, Sound Forge, Pro Tools.
• Real Life Example:
o A phone call sounds low quality (8 kHz sample rate).
o A music CD sounds clear (44.1 kHz sample rate).
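• Code sketch (not from the slides): a minimal Python calculation of how sample rate and bit depth determine uncompressed audio size; the one-minute duration, bit depths, and channel counts are assumed values for illustration.
```python
# Uncompressed (PCM) audio size = sample rate x bits per sample x channels x duration
def audio_size_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    return sample_rate_hz * bits_per_sample * channels * seconds // 8

# Assumed values: one minute of mono telephone speech vs. stereo CD audio
phone = audio_size_bytes(8_000, 8, 1, 60)      # 8 kHz, 8-bit, mono
cd = audio_size_bytes(44_100, 16, 2, 60)       # 44.1 kHz, 16-bit, stereo

print(f"Phone quality: {phone / 1e6:.2f} MB")  # ~0.48 MB
print(f"CD quality:    {cd / 1e6:.2f} MB")     # ~10.58 MB
```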
4. Image (Slides 15–23)
• Definition: Image is a 2D function f(x,y) where each point gives intensity.
• Key Concepts:
1. Pixel (Picture Element): Smallest unit of an image.
2. Resolution: Number of pixels (higher resolution = sharper image).
3. Quantization: Number of bits per pixel (e.g., 8-bit pixel → 256 levels).
4. Image Size Formula: Width × Height × Bits per pixel (gives size in bits; divide by 8 for bytes; see the sketch after this section).
▪ Example: 256 × 256 image with 8-bit pixels = 256 × 256 × 8 = 524,288 bits = 65,536 bytes.
• Formats: BMP, GIF, TIFF, JPEG.
• Tools: Photoshop, Illustrator.
• Real Life Example:
o If you zoom into a photo, you can see tiny squares (pixels).
o High-resolution DSLR photo = more pixels = better detail.
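• Code sketch (not from the slides): the image-size formula above, worked in Python; the 24-megapixel RGB figure is an assumed example for a DSLR photo.
```python
# Image size = width x height x bits per pixel (in bits); divide by 8 for bytes
def image_size_bytes(width, height, bits_per_pixel):
    return width * height * bits_per_pixel // 8

print(image_size_bytes(256, 256, 8))     # 65,536 bytes, matching the slide example
print(image_size_bytes(6000, 4000, 24))  # 72,000,000 bytes for an uncompressed 24 MP RGB photo
```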
5. Video (Slides 24–26)
• Definition: Video = sequence of images shown quickly to create motion.
• Key Concepts:
1. Frame Rate:
▪ Movies → 24 frames/sec.
▪ TV (PAL) → 25 fps.
▪ TV (NTSC) → 30 fps.
2. Bandwidth Requirement: Uncompressed video data rate = image (frame) size × frame rate (see the sketch after this section).
3. Editing Tools: Adobe Premiere, After Effects, Final Cut Pro.
• Real Life Example:
o YouTube videos commonly play at 30 fps (or higher) for smooth motion.
o Old cartoons shot at low frame rates look choppy.
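• Code sketch (not from the slides): the bandwidth-requirement formula above; the 720×576 frame size and 24-bit colour are assumed, PAL-style values.
```python
# Uncompressed video data rate = frame size (bits) x frame rate
def video_bitrate_mbps(width, height, bits_per_pixel, fps):
    return width * height * bits_per_pixel * fps / 1e6

# Assumed example: 720x576 frames, 24-bit colour, 25 fps (PAL-style)
print(f"{video_bitrate_mbps(720, 576, 24, 25):.1f} Mbit/s")  # ~248.8 Mbit/s before compression
```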
6. Graphics (Slides 27–29)
• Definition: Graphics are visual images created using points, lines, shapes.
• Key Concepts:
1. Meshes: Combination of points and lines used to form 3D objects.
2. APIs: OpenGL, DirectX, Java3D.
3. Software Tools: 3ds Max, Maya (used for movies & games).
• Real Life Example:
o PUBG game characters → made using 3D meshes.
o Cartoon movies like Toy Story → created using 3D graphics.
7. Multimedia Communication (Slides 30–32)
• Definition: Transfer of multimedia (audio, video, images, text) from one device to another.
• Steps:
1. Sender: Captures → Compresses → Synchronizes → Transmits.
2. Receiver: Receives → Decompresses → Plays back.
• Challenges:
o Bandwidth limitations.
o Synchronization problems (audio not matching video).
o Real-time delay.
• Real Life Example:
o During Zoom or Google Meet → sometimes audio comes late or video freezes because of low internet bandwidth.
PPT 2 – Image Compression (Intro)
1. Recap (Slide 2)
• Covered in last lecture:
o Digital representation of Audio, Image, Video, Geometry.
o Need of compression → storage & transmission efficiency.
• Why compression?
o Multimedia files (images/videos) are very large.
o Storing or sending without compression = slow & costly.
• Real Life Example:
o Raw photo from DSLR (20 MB) → after JPEG compression becomes 2 MB (easy to send on
WhatsApp).
3. Fidelity Criteria (Slide 4)
• Definition: Measures the quality of compressed image compared to the original.
• Types:
1. MSE (Mean Square Error): Measures average error between original and compressed image.
2. SNR (Signal-to-Noise Ratio): Higher value = better quality.
3. Subjective Voting: Human users judge quality (MOS – Mean Opinion Score).
• Real Life Example:
o When you compress a photo on WhatsApp, if quality is still “good” to the eye → fidelity is
acceptable.
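• Code sketch (not from the slides): computing MSE and SNR between an "original" and a "compressed" image, using toy NumPy arrays in place of real images.
```python
import numpy as np

def mse(original, compressed):
    """Mean Square Error: average squared difference between the two images."""
    diff = original.astype(np.float64) - compressed.astype(np.float64)
    return np.mean(diff ** 2)

def snr_db(original, compressed):
    """Signal-to-Noise Ratio in dB: signal power divided by error power."""
    signal_power = np.mean(original.astype(np.float64) ** 2)
    return 10 * np.log10(signal_power / mse(original, compressed))

# Toy data standing in for an original image and its lossy-compressed version
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(64, 64))
compressed = np.clip(original + rng.normal(0, 5, size=(64, 64)), 0, 255)

print(f"MSE: {mse(original, compressed):.2f}")
print(f"SNR: {snr_db(original, compressed):.2f} dB")  # higher = closer to the original
```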
4. Compression Techniques – Lossless vs Lossy (Slide 5)
• Lossless Compression:
o No information loss, original can be fully recovered.
o Example: PNG, ZIP files.
• Lossy Compression:
o Some information is lost to save space, original cannot be fully recovered.
o Example: JPEG, MP3.
• Real Life Example:
o ZIP a folder (lossless) → when unzipped, exactly same files come back.
o WhatsApp compresses photos (lossy) → size smaller but quality reduces.
5. Compression Techniques – Symmetric vs Asymmetric (Slide 6)
• Symmetric Compression:
o Compression and decompression take similar time.
o Used for interactive applications (video calls).
• Asymmetric Compression:
o Compression is slow but decompression is very fast.
o Used in retrieval/storage applications (streaming movies).
• Real Life Example:
o Live Zoom call → symmetric (fast both ways).
o Netflix movies → asymmetric (compressed once on server, but decompressed quickly on
millions of devices).
6. Data Redundancy (Slide 7)
• Types of redundancy in images:
1. Coding Redundancy:
▪ Use shorter codes for frequently used symbols.
▪ Example: In English text, letter “e” is most common → assign shorter code.
2. Interpixel Redundancy:
▪ Neighboring pixels often similar.
▪ Example: Sky in a photo → many blue pixels are almost same.
3. Psychovisual Redundancy:
▪ Human eye cannot notice small changes.
▪ Example: Removing fine background details of an image without being noticed.
7. Coding Redundancy (Slide 8)
• Fixed-length coding: All symbols use same number of bits.
• Variable-length coding: Frequent symbols use fewer bits → saves space.
• Example:
o If a fixed-length code needs 3 bits per symbol but a variable-length code averages 2.7 bits per symbol, compression is achieved (see the sketch after this list).
• Real Life Example: Morse code – “E” has single dot (short code, because it’s frequent).
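• Code sketch (not from the slides): comparing average code length for fixed- vs variable-length coding; the 4-symbol probabilities and code lengths are assumed, not from the slide.
```python
# Average code length = sum over symbols of (probability x code length)
def average_length(probabilities, code_lengths):
    return sum(p * l for p, l in zip(probabilities, code_lengths))

probs = [0.5, 0.25, 0.125, 0.125]   # assumed 4-symbol source
fixed = [2, 2, 2, 2]                # fixed-length coding: 2 bits for every symbol
varlen = [1, 2, 3, 3]               # variable-length code, e.g. 0, 10, 110, 111

print(average_length(probs, fixed))   # 2.0 bits/symbol
print(average_length(probs, varlen))  # 1.75 bits/symbol -> compression achieved
```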
8. Interpixel Redundancy (Slides 9–10)
• Definition: Similarity between neighboring pixels reduces storage need.
• Histogram: Shows frequency of pixel intensity values.
• If pixels are highly correlated, data can be compressed.
• Real Life Example: A scanned page of text → background (white pixels) repeats a lot → compressible.
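• Code sketch (not from the slides): a toy "scanned page" showing why a peaked histogram and similar neighbouring pixels make an image compressible.
```python
import numpy as np

# A toy scanned page: mostly white (255) background with a dark band standing in for text
page = np.full((100, 100), 255, dtype=np.uint8)
page[40:60, 10:90] = 0

# Histogram of intensities: almost all pixels fall into one bin -> highly compressible
hist, _ = np.histogram(page, bins=256, range=(0, 256))
print("White pixels:", hist[255], "of", page.size)

# Interpixel redundancy: most pixels equal their left neighbour
same_as_left = np.mean(page[:, 1:] == page[:, :-1])
print(f"Fraction of pixels equal to their left neighbour: {same_as_left:.3f}")
```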
9. Psychovisual Redundancy (Slide 11)
• Definition: Remove details not visible to human eye.
• Example: Reduce 256 intensity levels to 16 levels, still looks similar to human eye.
• Real Life Example: JPEG compression reduces color shades in the background (like sky), but we still
see image as normal.
10. Lossless Compression Techniques (Slide 12)
• Common techniques:
1. Variable-length coding (for coding redundancy).
2. Run-length coding (for interpixel redundancy).
3. Predictive coding (predict next pixel from previous ones).
• Real Life Example: Fax machines and PDFs use run-length coding to compress text documents.
PPT 3 – Image Compression (Huffman Coding)
1. Recap (Slide 2)
• Previous lecture covered:
o Compression ratio, Fidelity measures.
o Data redundancy (Coding, Interpixel, Psychovisual).
o Compression techniques (Lossless vs Lossy, Symmetric vs Asymmetric).
• This lecture's focus = Huffman Coding (Variable-Length Coding).
• Why Huffman?
o To reduce coding redundancy by giving shorter codes to frequent symbols.
2. Huffman Coding – Introduction (Slide 3)
• Definition: A method of variable-length coding used for lossless compression.
• Steps:
1. Start with a set of symbols and their probabilities.
2. Pick two lowest probability symbols.
3. Combine them into one node.
4. Repeat until one tree is formed.
5. Assign binary codes (0/1) from root to leaves.
• Real Life Example: In text messages, “space” and “e” appear very often, so Huffman coding gives them
shorter codes.
3. Huffman Coding – Example (Slides 4–7)
• Suppose we have symbols and probabilities:
o a1 = 0.2, a2 = 0.4, a3 = 0.2, a4 = 0.1, a5 = 0.1.
• Step 1: Sort symbols by probability.
• Step 2: Combine two smallest (a4 + a5 = 0.2).
• Step 3: Continue combining until tree is complete.
• Result: Most frequent symbol (a2) gets shortest code.
• Real Life Example: In a book, “the” occurs very frequently → give it short code like 01, while rare words
get longer codes.
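• Code sketch (not from the slides): a minimal Huffman coder following the steps above, run on the a1–a5 probabilities; because of tie-breaking it may assign different code lengths than the slide (e.g. 2/2/2/3/3 instead of 1/2/3/4/4), but the average length (2.2 bits/symbol) is the same.
```python
import heapq
from itertools import count

def huffman_codes(probabilities):
    """Build a Huffman code (symbol -> bit string) from symbol probabilities."""
    tiebreak = count()  # keeps heap comparisons well-defined when probabilities are equal
    heap = [(p, next(tiebreak), sym) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)   # two lowest-probability nodes...
        p2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(tiebreak), (left, right)))  # ...merged into one

    codes = {}
    def assign(node, prefix):
        if isinstance(node, tuple):         # internal node: 0 = left branch, 1 = right branch
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"
    assign(heap[0][2], "")
    return codes

probs = {"a1": 0.2, "a2": 0.4, "a3": 0.2, "a4": 0.1, "a5": 0.1}
codes = huffman_codes(probs)
print(codes)
print(sum(probs[s] * len(c) for s, c in codes.items()))  # ~2.2 bits/symbol on average
```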
4. Huffman Tree Formation (Slides 8–12)
• At each step:
o Pick two nodes with least probabilities.
o Merge them into one node.
o Assign 0 to left branch, 1 to right branch.
• Final tree gives codes to each symbol.
• Real Life Example: WhatsApp compresses your chat text using similar techniques → frequent words
like “ok”, “yes”, “hi” are stored in fewer bits.
5. Code Assignment (Slide 13)
• Example codes may look like:
o a2 → 0
o a1 → 10
o a3 → 110
o a4 → 1110
o a5 → 1111
• Observation: Higher probability = shorter code length.
• Real Life Example: Similar to phone keypad speed-dial → you assign “Mom” as number 1 (shortest)
because you call often, and rare contacts have longer numbers.
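• Quick check (not on the slide): average code length = 0.4×1 + 0.2×2 + 0.2×3 + 0.1×4 + 0.1×4 = 2.2 bits/symbol, versus 3 bits/symbol for a fixed-length code over 5 symbols → compression ratio ≈ 3/2.2 ≈ 1.36.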
8. Decoding (Slides 16–18)
• Method: Start at tree root and follow bits (0 = left, 1 = right) until a symbol is found.
• Example:
o Input bits = 00111010001
o Traverse tree → sequence of symbols decoded.
• Important: Huffman coding is prefix-free → no code is prefix of another, so decoding is unambiguous.
• Real Life Example: Barcodes on products → scanned using similar prefix-free codes to decode
quickly.
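• Code sketch (not from the slides): prefix-free decoding with the example codes from the Code Assignment section; the bitstream here is built by encoding a sample sequence, not the slide's input.
```python
codes = {"a2": "0", "a1": "10", "a3": "110", "a4": "1110", "a5": "1111"}
decode_table = {bits: sym for sym, bits in codes.items()}

def encode(symbols):
    return "".join(codes[s] for s in symbols)

def decode(bitstream):
    symbols, current = [], ""
    for bit in bitstream:               # read bit by bit, like walking the tree (0 = left, 1 = right)
        current += bit
        if current in decode_table:     # a leaf is reached: emit the symbol and restart
            symbols.append(decode_table[current])
            current = ""
    return symbols

bits = encode(["a2", "a2", "a4", "a1", "a3"])
print(bits)          # 00111010110
print(decode(bits))  # ['a2', 'a2', 'a4', 'a1', 'a3'] -- unambiguous because the code is prefix-free
```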
9. Summary (Slide 19)
• Huffman coding is efficient for text and image compression.
• Works best when symbol probabilities are uneven (some frequent, some rare).
• Used in many standards: JPEG, MP3, MPEG.
• Real Life Example: When you save a JPEG image, Huffman coding is used in the background to reduce
file size.
PPT 4 – Image Compression (Run Length, Predictive & Lossy)
1. Recap (Slide 2)
• Last lecture covered:
o Lossless compression: Huffman coding, Entropy.
• Current lecture moves to other lossless techniques (Run Length, Predictive) and Lossy compression
basics.
2. Run Length Coding (Slide 3)
• Definition: Store sequences of same symbol as symbol + count.
• Example:
o Input: AAABBCCCCCCCCCAA
o Output: A3B2C9A2
o Compression ratio = 16/8 = 2.
• Best for: Images or data with long runs of same values (like blank spaces, solid backgrounds).
• Real life Example:
o Fax machines use run length coding to compress black-and-white documents.
o Simple example: Instead of saying “ha ha ha ha ha” → say “ha × 5”.
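• Code sketch (not from the slides): run-length encoding and decoding of the slide's example string; single-character symbols and single-digit run counts are assumed to keep the decoder simple.
```python
from itertools import groupby

def rle_encode(text):
    """Each run of a repeated symbol becomes symbol + count."""
    return "".join(f"{symbol}{len(list(run))}" for symbol, run in groupby(text))

def rle_decode(encoded):
    """Invert the encoding (assumes single-character symbols and single-digit counts)."""
    return "".join(encoded[i] * int(encoded[i + 1]) for i in range(0, len(encoded), 2))

data = "AAABBCCCCCCCCCAA"
packed = rle_encode(data)
print(packed)                      # A3B2C9A2
print(rle_decode(packed) == data)  # True -> lossless
print(len(data) / len(packed))     # 2.0 -> compression ratio 16/8
```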
4. Lossy Compression Basics (Slides 7–11)
• Definition: Remove less important details that human eye/ear cannot detect.
• Concepts:
1. Psychovisual Redundancy: Remove details not noticed by humans.
2. Trade-off: Higher compression → more quality loss.
3. Quantization: Represent range of values by fewer levels (irreversible).
• Example Images: Show same photo at compression ratios 7.7, 12.3, 33.9 → higher compression =
blurrier image.
• Real Life Example:
o WhatsApp reduces photo size before sending. You still see the subject clearly, but background
details are lost.
o MP3 music removes very high/low frequencies inaudible to humans.
5. Quantization in Lossy Compression (Slide 12)
• Definition: Approximation of values to a limited set of discrete levels.
• Effect: Causes permanent information loss (irreversible).
• Real Life Example:
o Instead of storing 256 brightness levels, reduce to 16 → file smaller, but slight quality drop.
o Like rounding money values: ₹152 → ₹150.
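• Code sketch (not from the slides): reducing 256 grey levels to 16, the irreversible step described above; the sample pixel values are assumed.
```python
import numpy as np

levels = 16
step = 256 // levels                  # 16 input values map onto each output level

pixels = np.array([0, 37, 128, 152, 255], dtype=np.uint8)
quantized = (pixels // step) * step   # round down to the nearest level (irreversible)
print(quantized)                      # [  0  32 128 144 240]

# Only 4 bits/pixel are now needed instead of 8 -> half the storage,
# but the exact original values (37, 152, ...) cannot be recovered
```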
6. Predictive Coding (Lossy) (Slides 13–15)
• Concept: Similar to lossless predictive but allows small error tolerance.
• Delta Modulation:
o Encode whether next value is higher or lower than previous.
o Very efficient but less accurate.
• Real Life Example:
o Old voice recording devices used delta modulation → saved space by just storing up/down
changes in audio wave.
o Like stock market news → instead of full value, they just say “+2 points” or “–3 points”.
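• Code sketch (not from the slides): 1-bit delta modulation of a toy sine wave; the step size and the signal are assumed for illustration.
```python
import numpy as np

def delta_modulate(signal, step=0.8):
    """Store only whether each sample is above or below the running approximation."""
    bits, approx = [], 0.0
    for sample in signal:
        bit = 1 if sample > approx else 0   # 1 = step up, 0 = step down
        approx += step if bit else -step    # the staircase follows the wave
        bits.append(bit)
    return bits

def delta_demodulate(bits, step=0.8):
    approx, out = 0.0, []
    for bit in bits:
        approx += step if bit else -step
        out.append(approx)
    return out

t = np.linspace(0, 1, 50)
signal = 5 * np.sin(2 * np.pi * t)          # a toy audio wave
bits = delta_modulate(signal)
recon = delta_demodulate(bits)
print(bits[:10])                            # only 1 bit per sample is stored/transmitted
print([round(v, 1) for v in recon[:5]])     # staircase approximation of the wave
```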
PPT 5 – Image Compression (Transform Coding)
1. Recap (Slide 2)
• Previous lecture: Predictive Coding (Lossless + Lossy).
• Now: Transform Coding – a very important lossy compression technique.
• Reference book: Digital Image Processing (Gonzalez & Woods).
2. Transform Coding – Concept (Slide 3)
• Definition: Represents image data in another space (transform domain) to reduce redundancy.
• Steps:
1. Transform image into another domain (frequency domain).
2. Identify & remove redundancy.
3. Quantize coefficients → irreversible info loss.
4. Apply inverse transform → approximate original image.
• Real life Example:
o JPEG images use DCT transform coding to compress photos.
o Like translating a paragraph into shorthand → smaller but still understandable.
3. Fourier Transform (Review – Slides 4–5)
• Definition: Represents a signal as sum of sines & cosines.
• Equations:
o Forward transform: Converts time/space → frequency.
o Inverse transform: Converts frequency → time/space.
• DFT (Discrete Fourier Transform): Digital version.
• FFT (Fast Fourier Transform): Efficient algorithm to compute DFT.
• Real life Example:
o Equalizer in music player → adjusts low bass, mid, high treble frequencies = Fourier analysis
in action.
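• Code sketch (not from the slides): using the FFT to find the dominant frequency of a sampled tone; the 440 Hz tone and 8 kHz sampling rate are assumed values.
```python
import numpy as np

fs = 8000                                  # sampling rate (Hz)
t = np.arange(fs) / fs                     # one second of samples
x = np.sin(2 * np.pi * 440 * t)            # a pure 440 Hz tone

spectrum = np.fft.rfft(x)                  # FFT of a real-valued signal
freqs = np.fft.rfftfreq(len(x), d=1 / fs)  # frequency (Hz) of each FFT bin
print(freqs[np.argmax(np.abs(spectrum))])  # 440.0 -> dominant frequency recovered
```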
4. Discrete Cosine Transform (DCT) (Slides 6–8)
• Why DCT?
o More efficient than Fourier for compression.
o Compacts energy in few coefficients.
o Used in JPEG standard.
• Equations:
o Forward DCT: Converts pixels → frequency components.
o Inverse DCT: Converts frequency → pixels again.
• Energy Compaction:
o Most important image details stored in low-frequency components.
o High-frequency components (edges, fine details) can be discarded.
• Real life Example:
o In JPEG compression, background smooth areas are kept with few coefficients, while edges
use more.
o Like summarizing a movie: keep important scenes (low frequency) and skip minor details (high
frequency).
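• Code sketch (not from the slides): an orthonormal 8×8 DCT built directly from its cosine basis (NumPy only), showing energy compaction on a smooth block; the gradient block is an assumed stand-in for a patch of sky.
```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix: row k holds the k-th cosine basis vector."""
    k = np.arange(n)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

C = dct_matrix(8)

# A smooth 8x8 block (a gentle gradient, like a patch of sky)
x, y = np.meshgrid(np.arange(8), np.arange(8))
block = 100 + 5 * x + 3 * y

coeffs = C @ block @ C.T   # forward 2D DCT: pixels -> frequency coefficients
recon = C.T @ coeffs @ C   # inverse 2D DCT: frequency coefficients -> pixels

energy = coeffs ** 2
print(f"energy in the top-left 2x2 (low frequencies): {energy[:2, :2].sum() / energy.sum():.4f}")
print(f"max reconstruction error: {np.abs(recon - block).max():.2e}")  # ~0, the transform itself is lossless
```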
5. Transform Coding Pipeline (Slide 9)
• Compression:
1. Transform image into frequency domain (DCT).
2. Quantize coefficients (remove less important ones).
3. Encode remaining coefficients.
• Decompression:
1. Decode coefficients.
2. Apply inverse transform → get back approximate image.
• Real life Example:
o Sending a selfie on WhatsApp → phone compresses photo with DCT + quantization → receiver
gets similar but lighter image.
6. Why Sub-image Blocks (Slide 10–11)
• Reason:
o Large image divided into smaller blocks (e.g., 8×8 pixels).
o Makes computation faster and localized.
o Error is small & less visible.
• Common sizes: 8×8, 16×16.
• Real life Example:
o JPEG images often show “blocky artifacts” when highly compressed → because each block is
compressed separately.
7. Which Transform is Best? (Slide 11)
• Criteria:
o Low error for same number of coefficients.
o Computationally efficient.
• Preferred Transform: Discrete Cosine Transform (DCT).
• Real life Example:
o Almost all JPEG images worldwide use DCT → proof of its efficiency.
8. Quantization Schemes (Slide 12)
• Purpose: Decide how many coefficients to keep.
• Types:
1. Global thresholding: One threshold for entire image.
2. Local thresholding: Different thresholds for different blocks.
3. Block-based quantization: Retain M out of N coefficients.
• Real life Example:
o In JPEG compression, DC coefficient (main brightness) is always kept, while many high-
frequency AC coefficients are dropped.
o Like in exams → you keep main points, drop extra details due to time limit.
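• Code sketch (not from the slides): the "retain M out of N coefficients" idea on one 8×8 block, with assumed toy data; real JPEG quantization uses standardized tables rather than this simple threshold.
```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix (same construction as the earlier sketch)."""
    k = np.arange(n)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def compress_block(block, keep=8):
    """Keep only the `keep` largest-magnitude DCT coefficients of one block."""
    C = dct_matrix(block.shape[0])
    coeffs = C @ block @ C.T
    threshold = np.sort(np.abs(coeffs), axis=None)[-keep]  # magnitude of the M-th largest coefficient
    coeffs[np.abs(coeffs) < threshold] = 0                 # drop the rest (irreversible)
    return C.T @ coeffs @ C                                # reconstruct from the kept coefficients

rng = np.random.default_rng(1)
x, y = np.meshgrid(np.arange(8), np.arange(8))
block = 128 + 5 * x + 3 * y + rng.standard_normal((8, 8))  # a toy, mostly smooth 8x8 image block
recon = compress_block(block, keep=8)                      # 8 of 64 coefficients retained
# Error stays small because most of the block's energy sits in a few low-frequency coefficients
print(f"mean absolute error: {np.abs(recon - block).mean():.2f} grey levels")
```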
9. Final Pipeline (Slide 13–14)
• Compression Flow: Image → Transform (DCT) → Quantization → Encoding.
• Decompression Flow: Decoding → Inverse DCT → Approximate Image.
• Real life Example:
o Taking a ZIP of lecture notes → compress → send → unzip → exactly the same files come back (ZIP is lossless); JPEG is lossy, so the decompressed image is only approximately the original.