Image Formation and Representation
Yao Wang
Tandon School of Engineering, New York University
Yao Wang, 2021 ECE-GY 6123 1
What will you learn
• How do we perceive color?
• How to capture, display, print in color?
• What are primary colors?
• How are digital images specified?
– Color resolution vs. spatial resolution vs. temporal resolution.
• What are some standard image/video formats?
Outline
• Color perception
• Color production using primary colors
• Color specification
• Color image representation
• Image capture and display
• 3D->2D Projection
• Video format (SD, HD, UHD)
Color Perception and Specification
• Light -> color perception
• Human perception of color
• Types of light sources
• Trichromatic color mixing theory
• Specification of color
– Tristimulus representation
– Luminance/Chrominance representation
• Color coordinate conversion
Light is part of the EM spectrum
from [Gonzalez2008]
Eye Anatomy
From http://www.stlukeseye.com/Anatomy.asp
Eye vs. Camera
Camera component | Eye component
Lens | Lens, cornea
Shutter | Iris, pupil
Film | Retina
Cable to transfer images | Optic nerve (sends the information to the brain)
Photo Receptors in the Retina
• Rods: perceive brightness only, extremely sensitive even at night
• Cones: perceive color tone
– Red (560-580nm), green (530-540nm), and blue (420-440nm) cones
– Different cones have different frequency responses (each cone acts like a filter!)
– Tri-receptor theory of color vision [Young1802]
• More rods (120 million) than cones (6 million)
From http://www.macula.org/anatomy/retinaframe.html
Frequency Responses of Cones and the Luminous Efficiency Function
$C_i = \int C(\lambda)\, a_i(\lambda)\, d\lambda, \quad i = r, g, b, y$
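The response integral above can be approximated numerically by sampling the wavelength axis. The following is a minimal sketch under loud assumptions: the Gaussian sensitivity curves, their centers (570, 535, 430 nm, picked from the cone ranges listed earlier), and the 40 nm width are hypothetical stand-ins, not measured cone responses, and the luminous efficiency function (i = y) is left out.

```python
import math

def gaussian(lam, center, width):
    """Unnormalized Gaussian bump, used here as a toy spectral curve."""
    return math.exp(-0.5 * ((lam - center) / width) ** 2)

# Hypothetical cone sensitivity centers in nm (illustrative only).
CONE_CENTERS = {"r": 570.0, "g": 535.0, "b": 430.0}
CONE_WIDTH_NM = 40.0

def cone_responses(spectrum, lo=380.0, hi=780.0, step=1.0):
    """Approximate C_i = integral of C(lambda) * a_i(lambda) d(lambda)
    with a midpoint Riemann sum over [lo, hi] nm."""
    n = int(round((hi - lo) / step))
    responses = {}
    for name, center in CONE_CENTERS.items():
        total = 0.0
        for k in range(n):
            lam = lo + (k + 0.5) * step
            total += spectrum(lam) * gaussian(lam, center, CONE_WIDTH_NM) * step
        responses[name] = total
    return responses

# A flat (equal-energy) spectrum and a narrow-band reddish light around 620 nm.
flat = cone_responses(lambda lam: 1.0)
reddish = cone_responses(lambda lam: gaussian(lam, 620.0, 20.0))
```

With these toy curves the reddish light excites the "r" cone most and the "b" cone least, which is the filter-like behavior the slide describes.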
Three Attributes of Color
• Luminance (brightness)
• Chrominance
– Hue (color tone) and Saturation (color purity)
• Represented by a “color cone”
Illuminating and Reflecting Light
• Illuminating sources:
– emit light (e.g. the sun, light bulb, TV monitors)
– perceived color depends on the emitted freq.
– follows additive rule
• R+G+B=White
• Reflecting sources (secondary light):
– reflect incoming light (e.g. color dye, matte surface, cloth)
– perceived color depends on the reflected freq. (= emitted freq. - absorbed freq.)
– follows subtractive rule
• R+G+B=Black
Trichromatic Color Mixing
• Trichromatic color mixing theory
– Any (actually most) color can be obtained by mixing three primary colors in the right proportion:
$C = \sum_{k=1,2,3} T_k C_k$, where $T_k$ are the tristimulus values
• Primary colors for illuminating sources:
– Red, Green, Blue (RGB)
– A CRT color monitor works by exciting red, green, and blue phosphors using separate electron guns
• Primary colors for reflecting sources:
– Cyan, Magenta, Yellow (CMY)
– Color printer works by using cyan, magenta, yellow and black (CMYK)
dyes
• More advanced printers use more primary colors to produce a wider range
of colors
RGB vs CMY
Magenta = Red + Blue | Magenta = White - Green
Cyan = Blue + Green | Cyan = White - Red
Yellow = Green + Red | Yellow = White - Blue
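The two columns above say the same thing in additive and subtractive terms. A small sketch, assuming colors are stored as (R, G, B) triples normalized to [0, 1] with white = (1, 1, 1); the helper names are illustrative only.

```python
WHITE = (1.0, 1.0, 1.0)
RED = (1.0, 0.0, 0.0)
GREEN = (0.0, 1.0, 0.0)
BLUE = (0.0, 0.0, 1.0)

def add(c1, c2):
    """Additive mixing of two lights, clipped to the displayable range."""
    return tuple(min(1.0, a + b) for a, b in zip(c1, c2))

def subtract(c1, c2):
    """Remove one light from another, clipped at zero."""
    return tuple(max(0.0, a - b) for a, b in zip(c1, c2))

def rgb_to_cmy(rgb):
    """Each CMY component is the complement of the matching RGB component."""
    return tuple(1.0 - v for v in rgb)

magenta = add(RED, BLUE)    # same triple as subtract(WHITE, GREEN)
cyan = add(BLUE, GREEN)     # same triple as subtract(WHITE, RED)
yellow = add(GREEN, RED)    # same triple as subtract(WHITE, BLUE)
```

In this normalized form, converting pure red to CMY gives (0, 1, 1): no cyan ink, full magenta and yellow.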
[Figure: Red, Green, and Blue panels]
CIE 1931 RGB Primaries
from [Szeliski2010]
• Monochromatic primaries: Red (700nm), Green (546.1nm), Blue
(435.8nm)
• Color matching functions: specify the amount of each primary to match a
particular monochromatic color
• Need “negative amount” of red to match some colors: add this amount of
red to the test color to match the sum of remaining primaries
CIE 1931 XYZ Primaries
from [Szeliski2010]
• XYZ do not correspond to real color primaries (Y = luminance, Z = blue, X: mixed). They are imaginary primary colors.
• However, with these color matching functions, any single spectral color can be represented using non-negative tristimulus values.
Tristimulus Values
• Tristimulus values
– The amounts of three primary colors needed to form any
particular color are called the tristimulus values, denoted by X,
Y, Z (or any other three symbols).
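– For any color with spectrum $S(\lambda)$:
$X = \int S(\lambda)\,\bar{x}(\lambda)\,d\lambda, \quad Y = \int S(\lambda)\,\bar{y}(\lambda)\,d\lambda, \quad Z = \int S(\lambda)\,\bar{z}(\lambda)\,d\lambda$
where $\bar{x}(\lambda), \bar{y}(\lambda), \bar{z}(\lambda)$ are the color matching functions of the XYZ primaries.
• Trichromatic (or chromaticity) coefficients
$x = \frac{X}{X+Y+Z}, \quad y = \frac{Y}{X+Y+Z}, \quad z = \frac{Z}{X+Y+Z}$
– Only two chromaticity coefficients are necessary to specify the chrominance of a light, since $x + y + z = 1$.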
Conversion between CIE 1931 RGB and XYZ
from [Szeliski2010]
The three columns in the matrix are the tristimulus values of the R,G,B primaries
defined in terms of CIE XYZ primaries. From these, you could derive the
chromaticity coefficients of RGB primaries and correspondingly locate them in
the CIE chromaticity diagram.
Ex: Red: x=0.49/(0.49+0.17697+0)=0.735, y=0.17697/(0.49+0.17697+0)=0.265.
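The worked example for red can be repeated for all three primaries. This sketch assumes the matrix on the slide is the standard CIE 1931 RGB-to-XYZ matrix; only the red column (0.49, 0.17697, 0) is confirmed by the example above, and the remaining entries are filled in from that standard matrix as an assumption. The common scale factor 1/0.17697 is omitted because it cancels in the chromaticity ratios.

```python
# Assumed CIE 1931 RGB -> XYZ matrix (unscaled); columns are the XYZ
# tristimulus values of the R, G, B primaries.
CIE_RGB_TO_XYZ = [
    [0.49,    0.31,    0.20],
    [0.17697, 0.81240, 0.01063],
    [0.00,    0.01,    0.99],
]

def primary_chromaticity(matrix, col):
    """Return (x, y) for the primary stored in the given column."""
    X, Y, Z = (matrix[row][col] for row in range(3))
    total = X + Y + Z
    return X / total, Y / total

red_xy = primary_chromaticity(CIE_RGB_TO_XYZ, 0)
green_xy = primary_chromaticity(CIE_RGB_TO_XYZ, 1)
blue_xy = primary_chromaticity(CIE_RGB_TO_XYZ, 2)
```

The three (x, y) pairs are the vertices of the CIE RGB gamut triangle in the chromaticity diagram.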
CIE 1931 XYZ Chromaticity Diagram
• Shows all colors visible to humans.
• Colors on the boundary: spectrum colors, highest saturation.
• Mixing any three colors in the visible range can generate the colors in the triangle formed by these three points.
• Not all visible colors can be reproduced by the RGB primaries used for display, or the CMY primaries used for printing.
• The triangle in the figure shows the color gamut of the CIE RGB primaries.
• The printing gamut using CMY is smaller than the display gamut using RGB.
from [https://commons.wikimedia.org/wiki/File:CIE1931xy_CIERGB.svg]
RGB Color Primary Defined by ITU Rec. 709 for Digital TV (BT. 601)
$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 3.240479 & -1.537150 & -0.498535 \\ -0.969256 & 1.875992 & 0.041556 \\ 0.055648 & -0.204043 & 1.057311 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$
$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} 0.412453 & 0.357580 & 0.180423 \\ 0.212671 & 0.715160 & 0.072169 \\ 0.019334 & 0.119193 & 0.950277 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$
The three columns in the RGB-> XYZ conversion are the tristimulus values of the
R,G,B primaries defined in terms of CIE XYZ primaries. From these, you could
derive the chromaticity coefficients of RGB primaries and correspondingly locate
them in the CIE diagram (HW!)
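As a sanity check on the two matrices, the sketch below verifies that they are (numerically) inverses of each other and computes where equal R = G = B lands in the chromaticity diagram. It is written in plain Python to stay self-contained; the helper names are illustrative, and the same `chromaticity` helper could be applied to individual matrix columns.

```python
# BT.709 matrices as listed on the slide (linear RGB <-> CIE XYZ).
XYZ_TO_RGB = [
    [3.240479, -1.537150, -0.498535],
    [-0.969256, 1.875992, 0.041556],
    [0.055648, -0.204043, 1.057311],
]
RGB_TO_XYZ = [
    [0.412453, 0.357580, 0.180423],
    [0.212671, 0.715160, 0.072169],
    [0.019334, 0.119193, 0.950277],
]

def matvec(m, v):
    """3x3 matrix times 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def matmul(a, b):
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def chromaticity(xyz):
    """(x, y) chromaticity coefficients of an XYZ triple."""
    total = sum(xyz)
    return xyz[0] / total, xyz[1] / total

# Should be close to the identity, up to rounding of the printed entries.
product = matmul(XYZ_TO_RGB, RGB_TO_XYZ)

# Equal RGB maps to the reference white of this primary set.
white_xyz = matvec(RGB_TO_XYZ, [1.0, 1.0, 1.0])
white_xy = chromaticity(white_xyz)
```

The white point comes out near (0.3127, 0.3290), the chromaticity of the D65 daylight illuminant.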
BT.709 (left) vs. CIE 1931 (right) Color Gamut
https://en.wikipedia.org/wiki/Rec._709#/media/File:CIExy1931_Rec_709.svg
Color Representation Models
• Specify the tristimulus values associated with the three primary
colors
– RGB, CMY, XYZ
• Specify the luminance and chrominance
– CIELAB (nonlinear transformation of XYZ, so that perceived differences in luminance and chrominance are more uniformly spaced in LAB coordinates)
– HSI (Hue, saturation, intensity)
– YIQ (used in analog NTSC color TV)
– YCbCr (used in digital color TV)
• Amplitude specification (Standard Dynamic Range):
– 8 bits for each color component, or 24 bits total for each pixel
– Total of 16 million colors
– A true RGB color display of size 1Kx1K requires a display buffer
memory size of 3 MB
• High dynamic range (HDR) Image: up to 16 bits/component
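The SDR numbers above follow from simple arithmetic. A short sketch, assuming "1K" means 1024 and that the buffer stores one byte per color component with no padding:

```python
BITS_PER_COMPONENT = 8
COMPONENTS_PER_PIXEL = 3  # R, G, B

bits_per_pixel = BITS_PER_COMPONENT * COMPONENTS_PER_PIXEL  # 24
num_colors = 2 ** bits_per_pixel                            # about 16.8 million

# Display buffer for a 1K x 1K true-color image (1K assumed to be 1024).
width = height = 1024
buffer_bytes = width * height * bits_per_pixel // 8
buffer_mib = buffer_bytes / (1024 * 1024)
```

The exact color count is 2^24 = 16,777,216, which the slide rounds to 16 million.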
HSI Color Model
• Hue represents dominant color as perceived by an
observer. It is an attribute associated with the dominant
wavelength.
• Saturation refers to the relative purity or the amount of
white light mixed with a hue. The pure spectrum colors
are fully saturated. Pink and lavender are less saturated.
• Intensity reflects the brightness.
Conversion Between RGB and HSI
• Converting from RGB to HSI
$H = \begin{cases} \theta & \text{if } B \le G \\ 360 - \theta & \text{if } B > G \end{cases} \quad \text{with } \theta = \cos^{-1}\left\{ \frac{\frac{1}{2}[(R-G)+(R-B)]}{[(R-G)^2 + (R-B)(G-B)]^{1/2}} \right\}$
$S = 1 - \frac{3}{R+G+B}\,\min(R, G, B)$
$I = \frac{1}{3}(R+G+B)$
• Converting from HSI to RGB
RG sector ($0 \le H < 120$): $B = I(1-S)$, $R = I\left[1 + \frac{S\cos H}{\cos(60 - H)}\right]$, $G = 3I - (R+B)$
GB sector ($120 \le H < 240$): $R = I(1-S)$, $G = I\left[1 + \frac{S\cos(H-120)}{\cos(60 - (H-120))}\right]$, $B = 3I - (R+G)$
BR sector ($240 \le H < 360$): $G = I(1-S)$, $B = I\left[1 + \frac{S\cos(H-240)}{\cos(60 - (H-240))}\right]$, $R = 3I - (G+B)$
The last equation in each sector follows from $I = (R+G+B)/3$.
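The conversion in both directions can be sketched as follows, assuming R, G, B are normalized to [0, 1] and H is in degrees. The third component in each sector is recovered as 3I minus the other two, which follows from I = (R + G + B)/3. Setting H = 0 for gray pixels, where hue is undefined, is a convention chosen here, not something the slide specifies.

```python
import math

def rgb_to_hsi(r, g, b):
    """RGB in [0, 1] -> (H in degrees, S, I)."""
    i = (r + g + b) / 3.0
    if i == 0.0:
        return 0.0, 0.0, 0.0           # black: hue and saturation undefined
    s = 1.0 - min(r, g, b) / i         # same as 1 - 3*min/(R+G+B)
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0.0:
        return 0.0, s, i               # gray: hue undefined, use 0 by convention
    ratio = max(-1.0, min(1.0, num / den))  # guard against rounding
    theta = math.degrees(math.acos(ratio))
    h = theta if b <= g else 360.0 - theta
    return h, s, i

def hsi_to_rgb(h, s, i):
    """(H in degrees, S, I) -> RGB, handled sector by sector."""
    h = h % 360.0

    def boosted(angle):
        # I * [1 + S cos(angle) / cos(60 - angle)], angle in [0, 120)
        return i * (1.0 + s * math.cos(math.radians(angle))
                    / math.cos(math.radians(60.0 - angle)))

    low = i * (1.0 - s)
    if h < 120.0:                      # RG sector
        b, r = low, boosted(h)
        g = 3.0 * i - (r + b)
    elif h < 240.0:                    # GB sector
        r, g = low, boosted(h - 120.0)
        b = 3.0 * i - (r + g)
    else:                              # BR sector
        g, b = low, boosted(h - 240.0)
        r = 3.0 * i - (g + b)
    return r, g, b

hsi_red = rgb_to_hsi(1.0, 0.0, 0.0)
round_trip = hsi_to_rgb(*rgb_to_hsi(0.1, 0.3, 0.8))
```

Pure red maps to H = 0, S = 1, I = 1/3, and a round trip through HSI should return the original RGB triple up to floating-point error.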
YIQ Color Coordinate System
• YIQ is defined by the National Television System Committee (NTSC) for the US analog color TV system
– Y describes the luminance; I and Q describe the chrominance.
– A more compact representation of the color.
– YUV plays a similar role in PAL and SECAM.
• Conversion between RGB (analog) and YIQ (analog)
$\begin{bmatrix} Y \\ I \\ Q \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ 0.596 & -0.274 & -0.322 \\ 0.211 & -0.523 & 0.311 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}, \quad \begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1.0 & 0.956 & 0.621 \\ 1.0 & -0.272 & -0.649 \\ 1.0 & -1.106 & 1.703 \end{bmatrix} \begin{bmatrix} Y \\ I \\ Q \end{bmatrix}$
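A minimal sketch of the two conversions, using the matrix entries exactly as listed above and assuming R, G, B are normalized to [0, 1]. Because the entries are rounded to three decimals, a round trip reproduces the input only to within a few thousandths.

```python
RGB_TO_YIQ = [
    [0.299, 0.587, 0.114],
    [0.596, -0.274, -0.322],
    [0.211, -0.523, 0.311],
]
YIQ_TO_RGB = [
    [1.0, 0.956, 0.621],
    [1.0, -0.272, -0.649],
    [1.0, -1.106, 1.703],
]

def apply(matrix, vec):
    """3x3 matrix times 3-vector."""
    return tuple(sum(matrix[i][k] * vec[k] for k in range(3)) for i in range(3))

def rgb_to_yiq(r, g, b):
    return apply(RGB_TO_YIQ, (r, g, b))

def yiq_to_rgb(y, i, q):
    return apply(YIQ_TO_RGB, (y, i, q))

# A mid-gray pixel: all of the signal ends up in Y, and I, Q are near zero.
gray_yiq = rgb_to_yiq(0.5, 0.5, 0.5)
round_trip = yiq_to_rgb(*rgb_to_yiq(0.2, 0.6, 0.4))
```

The gray example illustrates the point of the luminance/chrominance split: colorless content carries (almost) no chrominance signal.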
YUV/YCbCr Coordinate
• YUV is the color coordinate used for color TV in the PAL system, somewhat different from YIQ.
• YCbCr is the digital equivalent of YUV, used for digital TV, with 8 bits for each component, in the range 0-255
Comparison of Different Color Spaces
Color Coordinate Conversion
• Conversions between different primary sets are linear (3x3 matrix)
• Conversions between a primary set and XYZ/YIQ/YUV are also linear
• Conversions to HSI/Lab are nonlinear
– In HSI and Lab coordinates, the Euclidean distance between coordinates is proportional to the actual color difference
• Conversion formulae between many color coordinates
can be found in [Gonzalez2008]
• Color space of HDTV by Recommendation 709 can be
found in [Woods2012]
Grayscale Image Specification
• Each pixel value represents the brightness of the pixel. In an 8-bit image, each pixel value is in the range 0-255
• Matrix representation: An image of MxN pixels is represented by an MxN array, each element being an unsigned 8-bit integer
$M = \begin{bmatrix} 160 & 162 & \cdots & 166 & 154 \\ 162 & 158 & \cdots & 122 & 69 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 60 & 55 & \cdots & 79 & 94 \\ 58 & 55 & \cdots & 99 & 109 \end{bmatrix}$
Color Image Specification
• Three components
– M = {R, G, B}
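$R = \begin{bmatrix} 73 & \cdots & 87 \\ \vdots & \ddots & \vdots \\ 27 & \cdots & 17 \end{bmatrix}, \quad G = \begin{bmatrix} 66 & \cdots & 98 \\ \vdots & \ddots & \vdots \\ 36 & \cdots & 13 \end{bmatrix}, \quad B = \begin{bmatrix} 31 & \cdots & 61 \\ \vdots & \ddots & \vdots \\ 36 & \cdots & 14 \end{bmatrix}$
[Figure: R, G, and B component images. Annotations: "Red nose is brightest!", "Blue cheek is brightest"]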
Image Capture and Display
• Light reflection physics
• Imaging operator
• Color capture
• Color display
Image and Video Capture
Courtesy of Onur Guleryuz
More on Video Capture
• Camera absorption function
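$\bar{\psi}(\mathbf{X}, t) = \int C(\mathbf{X}, t, \lambda)\, a_c(\lambda)\, d\lambda$
• Projection from 3-D to 2-D camera plane
$\mathbf{X} \xrightarrow{P} \mathbf{x}$
$\psi(P(\mathbf{X}), t) = \bar{\psi}(\mathbf{X}, t)$ or $\psi(\mathbf{x}, t) = \bar{\psi}(P^{-1}(\mathbf{x}), t)$
where $\bar{\psi}$ is defined over 3-D positions $\mathbf{X}$ and $\psi$ over 2-D image-plane positions $\mathbf{x}$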
• The projection operator is non-linear
– Perspective projection
– Orthographic projection
– More on this later
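The two projection types named above can be sketched with their textbook forms. The formulas are not given on this slide, so treat them as assumptions: a pinhole camera with focal length `F` looking along the Z axis, giving x = F·X/Z, y = F·Y/Z for perspective projection, and x = X, y = Y for orthographic projection.

```python
def perspective_project(point, focal_length=1.0):
    """Assumed pinhole model: image coordinates shrink with depth Z."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return focal_length * X / Z, focal_length * Y / Z

def orthographic_project(point):
    """Assumed orthographic model: depth is simply dropped."""
    X, Y, _ = point
    return X, Y

near = perspective_project((2.0, 4.0, 10.0))
far = perspective_project((2.0, 4.0, 20.0))   # same X, Y, twice the depth
ortho = orthographic_project((2.0, 4.0, 20.0))
```

Under perspective projection, doubling the depth halves the image coordinates, which is where the division by Z makes the mapping non-linear in the 3-D coordinates; the orthographic result ignores depth entirely.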
Gray and Color Image Capture
• Gray images are captured by a single sensor,
sensitive to the entire visible spectrum, similar
to the rods
• Color images are captured by having three
sensors, each sensitive to a different primary
color, similar to the cones
– Alternatively, using a single sensor preceded by different color filters
• Each sensor or filter has its own frequency response, which may differ from the cones' responses in the human retina.
Color Imaging Using Color Filter Arrays
• Single sensor array, with different color filters to
separate the RGB primary components
• Sensors: CCD, CMOS
• Bayer RGB pattern: each 2x2 pixel block contains 2 green, 1 red, and 1 blue
• The human eye is more sensitive to high frequencies in the luminance (mostly determined by green)
From http://en.wikipedia.org/wiki/Bayer_filter
From http://en.wikipedia.org/wiki/Bayer_filter
1: original scene
2: output of a 120x80 pixel sensor with a Bayer filter
3: output color coded
4: reconstructed image after interpolating missing colors (demosaicing)
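Step 2 above (what the sensor records behind the color filter array) can be simulated in a few lines. This sketch assumes an RGGB layout, i.e. red at even row/even column, blue at odd row/odd column, and green elsewhere; actual sensors may use a shifted variant. It only produces the mosaic and does not attempt the demosaicing interpolation of step 4.

```python
def bayer_channel(row, col):
    """Which color filter sits over pixel (row, col) in the assumed RGGB layout."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"

def bayer_mosaic(image):
    """image: list of rows, each row a list of (r, g, b) tuples.
    Returns a 2-D list keeping one color sample per pixel."""
    index = {"R": 0, "G": 1, "B": 2}
    return [
        [pixel[index[bayer_channel(i, j)]] for j, pixel in enumerate(row)]
        for i, row in enumerate(image)
    ]

# A uniform 2x2 test image: every pixel is (10, 20, 30).
uniform = [[(10, 20, 30)] * 2 for _ in range(2)]
mosaic = bayer_mosaic(uniform)

# Count filters of each color over a 4x4 block.
counts = {"R": 0, "G": 0, "B": 0}
for i in range(4):
    for j in range(4):
        counts[bayer_channel(i, j)] += 1
```

Each pixel keeps only one of its three components, so two thirds of the color data must be interpolated afterwards.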
Gray and Color Image Display/Printing
• Gray images are displayed by a single light-emitting element per pixel, with intensity proportional to the gray level.
• Color monitors display color images by having three light-emitting elements at slightly shifted positions near each pixel (phosphors in a CRT, color-filtered subpixels in an LCD), each generating a different primary color (red, green, blue)
– Our eye does the interpolation!
• Color images are printed using three color inks (cyan, magenta, yellow). High-end printers use more inks to produce a larger color gamut and more vivid colors.
Captured and Displayed Color
• The original lit scene has a spectral distribution (in terms of wavelength) at a particular space and time (x,y,z,t). Call it $C(\lambda)$.
• If a human observes it directly, the cones' responses (each with frequency response $a_i(\lambda)$) will be $A_i = \int C(\lambda)\, a_i(\lambda)\, d\lambda, \; i = r, g, b, y$
• If the scene is captured by a camera with three sensors, each with frequency response $b_i(\lambda)$, the captured RGB values will be
– $C_i = \int C(\lambda)\, b_i(\lambda)\, d\lambda, \; i = r, g, b$
• The display will use three phosphors, each with a different spectrum $d_i(\lambda)$, and the displayed spectrum $D(\lambda)$ is the weighted sum of the three spectra
• The displayed spectrum is again viewed by the human, generating three other responses $B_i = \int D(\lambda)\, a_i(\lambda)\, d\lambda, \; i = r, g, b, y$
• $b_i(\lambda)$ and $d_i(\lambda)$ should be chosen so that $B_i$ is close to $A_i$, for the perceived scene to be similar to the original scene
Color Calibration
• Each camera/display may use somewhat different R, G, B primaries
• Captured images also depend on scene lighting
• Using a reference white, rescale the RGB values so that “white” will have equal R, G, B values
– Important when the images are captured under non-white illumination
• BT. 709 uses daylight illuminant D65 as the standard white
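The rescaling step can be sketched as a per-channel gain. This is one simple way to do it, under assumptions not stated on the slide: the reference white is an RGB triple measured from a known white or gray patch in the scene, and the gains are chosen so that the patch keeps its mean level. The function name and example values are hypothetical.

```python
def white_balance(pixel, reference_white):
    """Scale each channel so that reference_white maps to equal R, G, B.

    The common target level is the mean of the reference components
    (an arbitrary choice made for this sketch)."""
    target = sum(reference_white) / 3.0
    gains = [target / w for w in reference_white]
    return tuple(p * g for p, g in zip(pixel, gains))

# Hypothetical white patch captured under yellowish light: blue is too low.
reference = (200.0, 180.0, 120.0)
balanced_white = white_balance(reference, reference)
balanced_pixel = white_balance((100.0, 90.0, 60.0), reference)
```

After balancing, the reference patch and any pixel proportional to it come out with equal R, G, B values, i.e. as neutral gray or white.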
Gamma Correction
• Displayed light intensity is nonlinearly related to the actual intensity, following the gamma rule
$g = a f^{\gamma}$
• Gamma correction precompensates for this nonlinearity inside the camera
$h = f^{1/\gamma}$
$g = a h^{\gamma} = a f$
• The gamma value depends on the display device, typically $\gamma \approx 2.2$
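The effect of precompensation can be checked numerically. A minimal sketch, assuming intensities normalized to [0, 1], a display gain a = 1, and an exponent of 2.2:

```python
GAMMA = 2.2

def display_response(signal, gamma=GAMMA, gain=1.0):
    """Light emitted by the display for a given input signal: gain * signal**gamma."""
    return gain * signal ** gamma

def gamma_correct(intensity, gamma=GAMMA):
    """Camera-side precompensation: intensity**(1/gamma)."""
    return intensity ** (1.0 / gamma)

scene = 0.5
uncorrected = display_response(scene)                 # darker than the scene
corrected = display_response(gamma_correct(scene))    # back to the scene value
```

Without correction a mid-level intensity of 0.5 would be shown at roughly 0.22; with correction the displayed value matches the original intensity.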
Digital Video Formats
• Standard Definition (SD) video (BT. 601)
– Resolution: 720x480, 25-30 Hz
– Standard Dynamic Range (SDR): luminance range of 0.0002 to 100 cd/m2 (nits), represented in 8 bits/color
– Color spec: BT. 709
• Narrow color range: 33.5% of visible range
• High Definition (HD) video (BT. 709)
– Resolution increases to 1920x1080, up to 60 Hz (2K video)
– Color spec: BT. 709
• Ultra High Definition (UHD) video
– Resolution further increases to 3840x2160 (4K video) or higher, up to 120 Hz
– High dynamic range (HDR): 0.00005 to 1000 cd/m2, represented by 16 bits/color (10-12 bpp for distribution, 14-16 bpp for production)
– Wide color gamut: DCI-P3 (41.8%) and BT.2020 (57.3%).
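To get a feel for how these formats scale, the sketch below computes uncompressed bit rates. The specific combinations are assumptions chosen for illustration: three full-resolution color components per pixel (no chroma subsampling), SD at 30 Hz with 8 bits, HD at 60 Hz with 8 bits, and UHD at 120 Hz with 10 bits per component.

```python
def raw_bit_rate(width, height, frame_rate, bits_per_component, components=3):
    """Uncompressed bits per second for a video format."""
    return width * height * frame_rate * bits_per_component * components

sd_bps = raw_bit_rate(720, 480, 30, 8)
hd_bps = raw_bit_rate(1920, 1080, 60, 8)
uhd_bps = raw_bit_rate(3840, 2160, 120, 10)

# Express in megabits per second for easier comparison.
rates_mbps = {name: bps / 1e6 for name, bps in
              [("SD", sd_bps), ("HD", hd_bps), ("UHD", uhd_bps)]}
```

Under these assumptions the raw rate grows from about 249 Mbps (SD) to about 3 Gbps (HD) and about 30 Gbps (UHD), which is why compression matters more with each generation.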
UHD Alliance Definition of UHD TV
• Resolution: 3840x2160 for content, distribution, and
playback displays.
• Color bit depth: 10 bits minimum for content and
distribution, 10 bits for playback displays.
• Color representation: BT.2020 for content,
distribution, and playback displays.
• Mastering display: 0.03-1,000 nits
• Playback display: 0.05-1,000 nits, or 0.0005-540 nits
• UHD content backwards compatible with SDR displays
Color Gamut Specifications
• BT.709: Used in HDTV and SDTV
• DCI-P3: Used in cinema presentation
• BT.2020: Specified for UHD, covers 99.9% of Pointer’s
gamut
• BT.2020 primaries are maximum saturation pure colors,
created by extremely narrow spectral slices of light
energy: Red (630nm), Green (532 nm), Blue (467nm)
• Recall CIE RGB primaries are Red (700nm), Green
(546.1nm), Blue (435.8nm)
New Color Space for UHD: BT.2020
From [BT2020]
[Figure labels: BT 2020, BT 709. From [Schulte2016]]
[Figure from [Schulte2016]]
SDR vs. HDR
From: http://files.spectracal.com/Documents/White%20Papers/HDR_Demystified.pdf
HDR TV: captured using an HDR camera and displayed by an HDR display
HDR photography: HDR image computed from multiple images captured using SDR cameras
What should you know?
• What is light? What is the difference between illuminating and reflecting light? What attributes are used to describe color?
• How do humans perceive color? What are the functions of the cones and rods in the retina?
• How to produce different colors? What are primary colors? How to judge the “goodness” of color primaries?
• How to represent a color? Different color models? Meaning of chromaticity?
• How to capture a color image?
• How to display a color image?
• How to print a color image?
• How to specify a color image digitally?
• What are some standard image formats?
• You should be able to answer all the questions in this slide as a
review of this lecture. Similarly for following lectures.
Reading Assignments
• [Szeliski2021] Richard Szeliski, Computer Vision: Algorithms and Applications.
Ch. 2. (Sec. 2.2.1, 2.3)
• Optional: [Schulte2016] Tom Schulte, Joel Barsotti, “HDR Demystified –
Emerging UHDTV Systems,” Mar. 2016.
http://files.spectracal.com/Documents/White%20Papers/HDR_Demystified.pdf
• Other reference:
• [Wang2002] Wang et al, Digital video processing and communications. Chap 1
(Sec. 1.3-1.4 optional)
• [Gonzalez2008] Gonzalez & Woods, “Digital Image Processing”, Prentice Hall,
2008, 3rd ed. Chapter 3 (Section 6.1 – 6.2)
• [BT2020] "BT.2020 : Parameter values for ultra-high definition television
systems for production and international programme exchange". International
Telecommunication Union. 2012-08-23. Retrieved 2014-08-31.