Image Representation
Sunday, March 20, 2016
10:17 AM
Image Model
Digital Image
- Discrete representation of an image
- Discretized elements:
    Spatial resolution (x, y-coordinates)
    Intensity (quantization)
- f(x, y) = i(x, y) r(x, y)
    f(x, y): Intensity
    i(x, y): Illumination
        Amount of source illumination incident on the viewed scene
        0 ≤ i(x, y) < +∞
    r(x, y): Reflectance
        Amount of source illumination reflected by objects in the scene
        0 ≤ r(x, y) ≤ 1
        0 = Total absorption, 1 = Total reflectance
- i, r are continuous functions, so f(x, y) can be continuous
- Storage requirement: M × N × k bits
    M = Image width
    N = Image height
    k = No. of bits required to represent a grey lvl
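The storage formula above can be checked with a quick sketch (the 1024×1024, 8-bit image is an assumed example, not from the notes):

```python
# Storage requirement M * N * k bits for a digital image.
M, N, k = 1024, 1024, 8       # width, height, bits per grey level (assumed)
bits = M * N * k
print(bits, bits // 8)        # 8388608 bits = 1048576 bytes (1 MiB)
```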
COMP 4421 Page 1
Image Enhancement in Spatial Domain
Sunday, March 20, 2016
10:58 AM
Spatial Domain Enhancement
g(x, y) = T[f(x, y)]
- f(x, y) = Input image
- g(x, y) = Output image
- T = Operation on f (Intensity transformation function)
    Defined over a neighborhood: pixel, 3x3 square, rectangular sub-area
    Operates on a single image or a set of images:
    g(x, y) = T[f1(x, y), f2(x, y), …, fn(x, y)]
Gray-Level Transformation
Sunday, March 20, 2016
11:22 AM
Typical transforms
General properties
- Neighborhood: 1x1 (pixel)
- Notation: s = T(r)
    r = Gray lvl of f(x, y) at each (x, y)
    s = Gray lvl of g(x, y) at each (x, y)
    c = Scale factor: rescale intensity to the range [0, L - 1]
- Types:
    Inverse: s = (L - 1) - r
    Log: s = c log(1 + r)
    Power (Gamma correction): s = c r^γ
    Window/Gray-level slicing: Operate on a gray window [ri, ri+x]
    Contrast stretching: For (rmin, rmax):
        r < rmin ⇒ s = 0
        r > rmax ⇒ s = L - 1
        rmin ≤ r ≤ rmax: Linearly map to the full range [0, L - 1]
        (Linear mapping: [a1, b1] → [a2, b2])
Image Histogram
Sunday, March 20, 2016
11:38 AM
Usage
- Gray lvl distribution (can be modelled by a statistical
  distribution; helps identify intensity thresholds)
- Brightness
- Contrast
Construction
- Input: Digital image with gray lvls in [0, L - 1]
- Output: p(rk) = nk / N (Prob. of occurrence of gray lvl rk)
    rk = kth gray lvl
    nk = No. of pixels with the kth gray lvl
    N = Total no. of pixels
    k = 0, 1, …, (L - 1)
Histogram Equalization
- Purpose:
    Approx. flat histogram (uniform distribution)
    Equally many pixels at every gray lvl
- Advantage:
    Significantly improves image appearance (performs contrast
    stretching + compressing simultaneously)
    Automatic, good pre-processing step
- Method:
    sk = Tk(L - 1) = (p(r1) + p(r2) + … + p(rk))(L - 1)
    Tk = p(r1) + p(r2) + … + p(rk)
    sk: New gray lvl for the kth gray lvl
    (L - 1): Max gray lvl
- Example (N = 4096, max gray lvl = 7):
    nk:    300    700    800    900    500    400    196    300
    p(rk): 0.0732 0.1709 0.1953 0.2197 0.1221 0.0977 0.0479 0.0732
    Tk:    0.0732 0.2441 0.4394 0.6591 0.7812 0.8789 0.9268 1
Global vs Local Histogram Equalization
- Global
    Pros:
        Runs fast
    Cons:
        Might greatly enhance noise
        Might fail to recover detailed structures due to low contrast
- Local
    Pros:
        Reveals more detailed structure (adaptive to local statistics)
        Suitable for images with varying illumination
    Cons:
        Slow
        Does not produce a histogram as flat as global
        Might cause discontinuities (2 pixels with initially the same
        intensity might be transformed differently)
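The method and the worked example above can be sketched in a few lines (rounding the output to integer levels is an assumption; the notes do not specify the rounding rule):

```python
import numpy as np

# Histogram equalization on the notes' example: 8 gray levels (L = 8),
# N = 4096 pixels, counts nk per level.
nk = np.array([300, 700, 800, 900, 500, 400, 196, 300])
p = nk / nk.sum()          # p(rk) = nk / N
T = np.cumsum(p)           # Tk = p(r1) + ... + p(rk)
s = np.round(T * 7)        # sk = Tk (L - 1), rounded to integer levels
print(s.astype(int))       # [1 2 3 5 5 6 6 7]
```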
Local Enhancement
Sunday, March 20, 2016
12:16 PM
Linear Filtering Using a Mask
- Usage: Apply mask on pixel z5 (Convolution)
    Put the mask on the image such that z5 is at the center of the mask
    New intensity for z5: weighted sum Σ wi zi of the covered pixels
- Design principle:
    Smoothing: All wi > 0; scale factor = (Σ wi)^(-1)
    Sharpening: Near center: positive wi; near outer: negative wi;
    Σ wi = 0
    NOTE: Result might be < 0 ⇒ Scale/clip gray lvls back to [0, L - 1]
- Common filters:
    Smoothing/Low-pass:
        Blurring: Remove small image details
        Noise reduction
        Freq. filtering: Stop high, allow low freq.
    Sharpening/High-pass:
        Highlight fine details
        Enhance blurred details
        Freq. filtering: Stop low, allow high freq.
Image Averaging
- gi(x, y) = ith observed noisy image, i = 1, 2, …, K
    gi(x, y) = f(x, y) + ηi(x, y)
    f(x, y) = True image
    ηi(x, y) = Zero-mean, random Gaussian noise
- Properties: Averaging ḡ = (1/K) Σ gi with K large ⇒ ḡ close to f,
  noise variance small
Non-linear Filter: Median Filter
- Operation: On pixel z:
    Put a window such that its center is at z
    Set z = Median of the windowed neighboring pixels
    Example: Neighbor sort = {2, 2, 3, 3, 4, 4, 8, 9, 10}
    ⇒ zNew = Median = 4
- Property: Convergence (performed many times ⇒ the image eventually
  will not change)
- Usage:
    Remove impulse noise
    Avoid excessive smoothing
    Preserve edges
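A minimal 3x3 median filter, reproducing the sorted-neighbors example above (border handling by edge replication is an assumption):

```python
import numpy as np

# 3x3 median filter; borders handled by edge replication.
def median_filter(img):
    padded = np.pad(img, 1, mode="edge")
    out = np.empty_like(img)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            out[y, x] = np.median(padded[y:y + 3, x:x + 3])
    return out

# Window whose sorted neighbors are {2,2,3,3,4,4,8,9,10} -> median 4
win = np.array([[2, 3, 8], [2, 9, 3], [4, 10, 4]])
print(median_filter(win)[1, 1])   # 4
```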
First Order Derivatives
Sunday, March 20, 2016
6:49 PM
Gradient
- Function of x, y
- Represents intensity change over the x- and y-axes
- Magnitude:
    Precise: mag(∇f) = √(Gx² + Gy²)
    Fast approx: mag(∇f) ≈ |Gx| + |Gy|
Magnitude calculation using a 3x3 mask
- Basic method: Magnitude at z5, with the neighborhood labelled:
    z1 z2 z3
    z4 z5 z6
    z7 z8 z9
- Cross-difference (Roberts) method: uses diagonal differences, e.g.
    mag(∇f) ≈ |z9 - z5| + |z8 - z6|
- Linear masks:
    Prewitt operator:
    mag(∇f) ≈ |Gx| + |Gy|
    = |(z3 + z6 + z9) - (z1 + z4 + z7)| + |(z7 + z8 + z9) - (z1 + z2 + z3)|
    Sobel operator: like Prewitt, but with weight 2 on the
    center-row/column pixels (z2, z4, z6, z8)
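The Sobel version of the fast |Gx| + |Gy| approximation can be sketched directly from the labelling above:

```python
import numpy as np

# Sobel gradient magnitude (fast |Gx| + |Gy| approximation) at one pixel.
def sobel_mag(z):
    # z: 3x3 neighborhood, z[0,0]..z[2,2] = z1..z9 in the notes' labelling
    gx = (z[0, 2] + 2 * z[1, 2] + z[2, 2]) - (z[0, 0] + 2 * z[1, 0] + z[2, 0])
    gy = (z[2, 0] + 2 * z[2, 1] + z[2, 2]) - (z[0, 0] + 2 * z[0, 1] + z[0, 2])
    return abs(gx) + abs(gy)

# Vertical step edge: left columns dark, right column bright
z = np.array([[0, 0, 100], [0, 0, 100], [0, 0, 100]])
print(sobel_mag(z))   # 400 (gx = 400, gy = 0)
```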
Image Enhancement in Frequency Domain
Sunday, March 20, 2016
6:50 PM
1-D Discrete Fourier Transform (DFT)
Fourier Transform
- Any continuous function = Sum of sine functions of different freq:
    f(t) = g1(t) + g2(t) + g3(t) + …
    gk(t): Sine function with (Freq, Ampl, Phase diff) = (fk, Ak, φk)
- Let H(f) = Function such that H(fk) can represent (Ak, φk)
    Fourier Transform: Given f(t), find such an H(f)
    H(fk) = bk (bk is complex)
    |bk| = Ak
    arg(bk) = φk
- 1-D DFT:
    Input: f(x), x = 0, 1, 2, …, (N - 1)
    Output: F(u) = Σ_{x=0}^{N-1} f(x) e^{-j2πux/N}
    u = 0, 1, 2, …, (N - 1) (u = Freq)
- 1-D Inverse DFT:
    f(x) = (1/N) Σ_{u=0}^{N-1} F(u) e^{j2πux/N}
2-D Discrete Fourier Transform
- DFT:
    F(u, v) = Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} f(x, y) e^{-j2π(ux/M + vy/N)}
    u = 0, 1, …, M - 1
    v = 0, 1, …, N - 1
- Inverse:
    f(x, y) = (1/MN) Σ_{u=0}^{M-1} Σ_{v=0}^{N-1} F(u, v) e^{j2π(ux/M + vy/N)}
- Separability of the 2D DFT: 1-D DFT on rows → 1-D DFT on columns
Filtering in Frequency Domain
1. Pre-processing of the input image f(x, y)
2. Fourier transform:
    f(x, y) → F(u, v)
3. Apply the filter function:
    G(u, v) = F(u, v)H(u, v)
    (element-by-element multiplication)
    H(u, v) = Filter in the freq. domain
4. Inverse Fourier transform:
    G(u, v) → g(x, y)
5. Post-processing ⇒ Output image g(x, y)
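Steps 2-4 can be sketched with NumPy's FFT; the Gaussian low-pass H(u, v) and the cutoff D0 = 30 are assumed for the demo:

```python
import numpy as np

# Frequency-domain filtering: FFT, multiply by a Gaussian low-pass
# H(u, v) = exp(-D^2 / 2 D0^2), inverse FFT.
def gaussian_lowpass(img, d0=30.0):
    M, N = img.shape
    F = np.fft.fftshift(np.fft.fft2(img))      # center the spectrum
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2     # D(u, v)^2 from the center
    H = np.exp(-D2 / (2 * d0 ** 2))
    G = F * H                                  # element-by-element product
    return np.real(np.fft.ifft2(np.fft.ifftshift(G)))

img = np.random.rand(64, 64)
out = gaussian_lowpass(img)                    # smoothed: variance decreases
```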
Smoothing Frequency-Domain Filters
Sunday, March 20, 2016
11:39 PM
Ideal Low-Pass Filter (ILPF)
- Formula: H(u, v) = 1 if D(u, v) ≤ D0; 0 if D(u, v) > D0
    D(u, v): Distance from (u, v) to the origin of the freq. plane
    D0 > 0: Cutoff freq.
- Image power:
    P = Σ Σ |F(u, v)|² = Σ Σ [R²(u, v) + I²(u, v)]
    R(u, v): Real part of the DFT; I(u, v): Imaginary part of the DFT
    ILPF with radius D0 encloses α% of the power:
    α = 100 · [Σ of P(u, v) over points inside the circle centered at
    (0, 0) with radius D0] / P
- Limitation: Sharp cutoff of freq. ⇒ Ringing artifact

Butterworth Low-Pass Filter (BLPF)
- Formula: H(u, v) = 1 / [1 + (D(u, v)/D0)^(2n)]
    Cutoff freq.: at D = D0, H = 0.5
- No sharp discontinuity of freq.:
    n = 1: No ringing artifact
    1 < n ≤ 2: Ringing artifact hardly visible
- Slight variation: Limit the min. value of H(u, v) ⇒ Smoother image
  (although allowing more high freq.)

Gaussian Low-Pass Filter (GLPF)
- Formula: H(u, v) = e^(-D²(u, v) / 2D0²)
    Commonly σ = D0 (the cutoff freq.)
- Most popular low-pass filter
- No ringing artifact

Illustration of the ILPF
Left: Input image. Right: Its Fourier spectrum.
From outside to inside: circles of radius 5, 15, 30, 80, 230 enclosing
92.0, 94.6, 96.4, 98, 99.5% of the image power respectively
Sharpening Frequency-Domain Filters
Sunday, March 20, 2016
11:39 PM
Low-Pass vs. High-Pass Filter
- Low-pass: Suppress high freq. ⇒ Blur the image
- High-pass: Suppress low freq. ⇒ Highlight details
Hhp(u, v) = 1 - Hlp(u, v)
Common Sharpening/High-Pass Filters

Ideal High-Pass Filter (IHPF)
- Formula: H(u, v) = 0 if D(u, v) ≤ D0; 1 if D(u, v) > D0
    D(u, v): Distance from (u, v) to the origin of the freq. plane
    D0 > 0
- Ringing artifact

Butterworth High-Pass Filter (BHPF)
- Main version: H(u, v) = 1 / [1 + (D0/D(u, v))^(2n)]
    Cutoff freq. (limits max. H(u, v)): at D = D0, H = 0.5
- Smaller n ⇒ Less ringing artifact

Gaussian High-Pass Filter (GHPF)
- Formula: H(u, v) = 1 - e^(-D²(u, v) / 2D0²)
- No ringing artifact

- Slight variation: High-freq. emphasis: Preserve low freq. but also
  amplify high freq.:
    H_hfe(u, v) = a + b·H_hp(u, v)
    0 ≤ a ≤ 1, b > a

Illustration of High-Pass Filters
- GHPF: No ringing artifact
- BHPF with high-freq. emphasis
Linear Image Restoration & Filtering
Sunday, March 20, 2016
6:50 PM
Noise
- Source:
CCD-Camera: Light lvl, sensor temp,
Magnetic resonance: Magnetic field strength
Corruption in transmission channel
- Noise modelling assumptions:
    Independent of spatial coordinates
    Uncorrelated with the image
    Use a statistical model:
        Noise signal is random
        Described by a Probability Density Function (PDF)
        PDF identical for all pixels
Statistical Noise Models
- Gaussian:
    p(z) = (1/(√(2π)σ)) e^(-(z - μ)² / 2σ²)
    μ = Mean, σ = sd
- Rayleigh
- Gamma
- Exponential
- Uniform
- Impulse (Salt & Pepper): For b > a:
    Gray lvl b with prob. Pb: Light dots (Salt noise)
    Gray lvl a with prob. Pa: Dark dots (Pepper noise)
    If Pa = 0 OR Pb = 0: Unipolar noise
    If Pa ≠ 0 AND Pb ≠ 0, especially Pa ≈ Pb:
    Salt & pepper granules (small pieces) randomly distributed all
    over the image
Noise Model Parameter Estimation
- Choose a region of interest (ROI) S from the image
- Construct the histogram of the ROI
    Obtain the zi's (gray lvls) and p(zi)
- Param. calculation (mean & variance), summed over zi in S:
    μ = Σ zi p(zi)
    σ² = Σ (zi - μ)² p(zi)
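The histogram-based estimates above can be sketched on a synthetic flat ROI corrupted by Gaussian noise (the μ = 100, σ = 15 values are assumed for the demo):

```python
import numpy as np

# Estimate mu and sigma from an ROI histogram:
# mu = sum(z_i p(z_i)), sigma^2 = sum((z_i - mu)^2 p(z_i)).
rng = np.random.default_rng(0)
roi = rng.normal(100, 15, size=(64, 64))       # flat ROI + Gaussian noise

z = np.arange(256)
hist, _ = np.histogram(roi, bins=256, range=(0, 256))
p = hist / hist.sum()                          # p(z_i)
mu = np.sum(z * p)
sigma = np.sqrt(np.sum((z - mu) ** 2 * p))
print(round(mu), round(sigma))                 # close to (100, 15)
```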
Mean Filters
Monday, March 21, 2016
10:38 AM
Notation:
    SXY = Mask area (m × n, centered at the operating pixel)
    g(x, y) = Corrupted image
    f̂(x, y) = Restored image

Arithmetic Mean
- Formula: f̂(x, y) = (1/mn) Σ g(s, t), (s, t) in SXY
- Similar to using a mask (n rows, m cols)
- Works well for random noise

Geometric Mean
- Formula: f̂(x, y) = [Π g(s, t)]^(1/mn), (s, t) in SXY
- Compared to the Arithmetic Mean Filter:
    Smoothing effect: Similar
    Tends to lose less image detail
- Works well for random noise

Harmonic Mean
- Formula: f̂(x, y) = mn / Σ [1/g(s, t)], (s, t) in SXY
- Salt noise: Works well
- Pepper noise: Fails
- Other noises: Works well

Contraharmonic Mean
- Formula: f̂(x, y) = Σ g(s, t)^(Q+1) / Σ g(s, t)^Q, (s, t) in SXY
    Q = Order of the filter
- Q > 0 ⇒ Eliminates pepper noise
- Q < 0 ⇒ Eliminates salt noise
- Q = 0 ⇒ Arithmetic Mean Filter
- Q = -1 ⇒ Harmonic Mean Filter
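The effect of Q can be sketched on a single window (the window values are assumed; the pepper pixel is set to 1 rather than 0 so g^Q stays finite for Q < 0):

```python
import numpy as np

# Contraharmonic mean over one window: f = sum(g^(Q+1)) / sum(g^Q).
# Q > 0 pulls the result toward bright values (removes pepper);
# Q < 0 pulls it toward dark values (removes salt).
def contraharmonic(window, Q):
    g = window.astype(float)
    return np.sum(g ** (Q + 1)) / np.sum(g ** Q)

w = np.array([[100, 100, 100], [100, 1, 100], [100, 100, 100]])
print(round(contraharmonic(w, 1.5)))   # 100: pepper pixel suppressed
print(round(contraharmonic(w, 0)))     # 89: plain arithmetic mean
```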
Order-Statistics Filters
Monday, March 21, 2016
10:39 AM
Median
- Formula: f̂(x, y) = median{g(s, t), (s, t) in SXY}
- Excellent noise reduction
- Considerably less blurring than linear filters of the same size
- Preserves edges
- Convergence

Max
- Formula: f̂(x, y) = max{g(s, t), (s, t) in SXY}
- Finds the brightest point in SXY

Min
- Formula: f̂(x, y) = min{g(s, t), (s, t) in SXY}
- Finds the darkest point in SXY

Midpoint
- Formula: f̂(x, y) = ½ [max{g(s, t)} + min{g(s, t)}], (s, t) in SXY
- Works best for randomly distributed noise
    Gaussian, uniform noise

Alpha-trimmed Mean
- Delete the d/2 lowest & d/2 highest grey lvl values in SXY
    gr(s, t) = Undeleted grey lvls
- Formula: f̂(x, y) = (1/(mn - d)) Σ gr(s, t)
- Works best when multiple types of noise are combined in a single
  image
    Gaussian + Salt & Pepper, …
- d = mn - 1 ⇒ Median filter
Adaptive Filters
Monday, March 21, 2016
10:40 AM
Adaptive, Local Noise Reduction Filter
- Calculate means & variances:
    m_L, σ_L² = Mean & variance over SXY (neighbouring pixels in the IMAGE)
    σ_η² = Noise variance, estimated over S'XY (pixels in a pure NOISE REGION)
- Filter:
    f̂(x, y) = g(x, y) - (σ_η² / σ_L²)[g(x, y) - m_L]
- Properties:
    σ_η² = 0 ⇒ No filtering (f̂ = g)
    σ_L² ≈ σ_η² ⇒ Behaves like the Arithmetic Mean Filter
    σ_L² >> σ_η² ⇒ f̂ close to g(x, y) ⇒ Preserves edges, boundaries

Adaptive Median Filter
- Within the neighbouring area SXY:
    Zxy = Grey lvl at (x, y)
    Zmin = Min grey lvl, Zmax = Max grey lvl
    Zmed = Median grey lvl
    Smax = Max. allowed size of SXY
- Algorithm:
    Start with the min. window size
    Lvl A:
        A1 = Zmed - Zmin, A2 = Zmed - Zmax
        If A1 > 0 & A2 < 0: Go to Lvl B
        Else: Increase the window size
            If window size ≤ Smax: Repeat Lvl A
            Else: Output Zxy
    Lvl B:
        B1 = Zxy - Zmin, B2 = Zxy - Zmax
        If B1 > 0 & B2 < 0: Output Zxy
        Else: Output Zmed
- Properties:
    Lvl A: Determines whether Zmed is an impulse (noise):
        Zmin < Zmed < Zmax (A1 > 0, A2 < 0) ⇒ NOT an impulse
        Otherwise: Increase the window size
    Lvl B: Determines whether Zxy is an impulse (noise):
        Zmin < Zxy < Zmax (B1 > 0, B2 < 0) ⇒ Not an impulse ⇒ Output Zxy
        Otherwise: Output Zmed
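The Lvl A / Lvl B logic can be sketched for a single pixel (windows clipped at the image border; the test image is assumed):

```python
import numpy as np

# Adaptive median filter at one pixel (x, y).
def adaptive_median(img, x, y, s_max=7):
    s = 3
    while True:
        h = s // 2
        win = img[max(0, y - h):y + h + 1, max(0, x - h):x + h + 1]
        zmin, zmax, zmed = win.min(), win.max(), np.median(win)
        zxy = img[y, x]
        if zmin < zmed < zmax:                            # Lvl A: zmed OK
            return zxy if zmin < zxy < zmax else zmed     # Lvl B
        s += 2                                            # grow the window
        if s > s_max:
            return zxy

img = np.tile(np.arange(9), (9, 1)) + 100   # gentle horizontal gradient
img[4, 4] = 255                             # one salt impulse
print(adaptive_median(img, 4, 4))           # 104.0: replaced by the median
```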
Periodic Noise Reduction by Frequency Domain
Filtering
Sunday, March 20, 2016
6:50 PM
Noise-Reduction Filters
- Band-Reject:
    Operation: Remove a band of freq. about the origin of the
    Fourier spectrum
    Ideal formula: H(u, v) = 0 if D0 - W/2 ≤ D(u, v) ≤ D0 + W/2;
    1 otherwise
        W = Width of the band
        D0 = Radial center of the band
    Butterworth & Gaussian versions: smooth transitions instead of a
    sharp cutoff
- Notch Reject:
    Operation: Remove freq. in a predefined neighborhood about a
    center freq. (u0, v0)
Image Restoration Process
Monday, March 21, 2016
10:41 AM
Model of Image Restoration Process

Term                  Spatial Domain           Frequency Domain
Input image           f(x, y)                  F(u, v)
Degradation function  h(x, y)                  H(u, v)
Additive noise        η(x, y)                  N(u, v)
Degraded image        g(x, y)                  G(u, v)
                      = h(x, y) * f(x, y)      = H(u, v)F(u, v)
                        + η(x, y)                + N(u, v)
Restored image        f̂(x, y)                  F̂(u, v)

* NOTE:
    h(x, y) * f(x, y): Matrix convolution (using a mask)
    H(u, v)F(u, v): Element-by-element multiplication
    H*(u, v) = Conjugate of H(u, v)
    H(u, v) = a + bi ⇒ H*(u, v) = a - bi

Modelling steps:
1. Input image
2. Apply degradation function + noise
3. Apply restoration filter
4. Output image

Minimum Mean Square Error (Wiener) Filtering
- Aim: Minimize e² = E{(f - f̂)²}
- Assumptions:
    η(x, y) & f(x, y) statistically uncorrelated
    Noise has zero mean
    Noise power spectrum known: Sη(u, v) = |N(u, v)|²
    H: linear & known
    Power spectrum of F(u, v) known: Sf(u, v) = |F(u, v)|²
- Estimation:
    N & F known (⇒ Sη & Sf known):
    F̂(u, v) = [ (1/H) · |H|² / (|H|² + Sη/Sf) ] G(u, v)
    N & F unknown: replace Sη/Sf with a constant K:
    F̂(u, v) = [ (1/H) · |H|² / (|H|² + K) ] G(u, v)
- Properties: Incorporates H & the noise into the restoration process
    Does not suffer from zero values of H(u, v)

Inverse Filter
- Assumption: H known
- Estimation: F̂(u, v) = G(u, v) / H(u, v) = F(u, v) + N(u, v)/H(u, v)
- Properties:
    If N not known exactly ⇒ F not recovered exactly
    If H(u, v) is very small ⇒ N/H very large, dominates the
    calculation ⇒ Bad estimation
    Solutions:
        Limit to freq. near the origin ⇒ Try to avoid very small H(u, v)
        Use a constant in the denominator

Geometric Mean Filter
    F̂(u, v) = [H*/|H|²]^α · [ H* / (|H|² + β·Sη/Sf) ]^(1-α) · G(u, v)
    α = 1 ⇒ Inverse filter
    α = 0, β = 1 ⇒ Standard Wiener filter
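The parametric Wiener estimate with constant K can be sketched in 1-D (the Gaussian blur kernel and the value of K are assumed for the demo; the signal is noise-free, so a tiny K recovers it almost exactly):

```python
import numpy as np

# Parametric Wiener filter: F_hat = [(1/H) |H|^2 / (|H|^2 + K)] G.
def wiener(G, H, K):
    return np.conj(H) * G / (np.abs(H) ** 2 + K)   # same as (1/H)|H|^2/(|H|^2+K) G

n = 64
f = np.zeros(n); f[20:40] = 1.0                    # true signal (a pulse)
h = np.exp(-0.5 * (np.arange(n) - n // 2) ** 2)    # narrow Gaussian blur kernel
h /= h.sum()
H = np.fft.fft(np.fft.ifftshift(h))                # zero-phase transfer function
G = H * np.fft.fft(f)                              # degraded spectrum (no noise)
f_hat = np.real(np.fft.ifft(wiener(G, H, K=1e-8)))
print(np.max(np.abs(f_hat - f)) < 0.01)            # True: near-perfect restoration
```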
Geometric Transformations
Monday, March 21, 2016
10:41 AM
Spatial Transformation
- x' = r(x, y) = c1x + c2y + c3xy + c4
- y' = s(x, y) = c5x + c6y + c7xy + c8
- Need 4 pairs (x, y) ↔ (x', y') to estimate all the coefficients
Gray-Level Assignment
1. Distortion Correction:
    f(x, y) unknown, g(x', y') known
    Integral (x, y) in f ↔ Non-integral (x', y') in g
    Grey lvl of f(x, y) = ?
    - Solution 1: Nearest neighbor approach:
        Map the non-integral g(x', y') → Nearest integral g(x'', y'')
        Set f(x, y) = g(x'', y'')
    - Solution 2: Bilinear interpolation:
        In g: Get the 4 integral points nearest to g(x', y') and
        interpolate their grey lvls
2. Distortion:
    f(x, y) known, g(x', y') unknown
    Integral (x, y) in f ↔ Non-integral (x', y') in g
    Grey lvl of g(x', y') = ?
    Solution: Nearest neighbor approach:
        Map the non-integral g(x', y') → Nearest integral g(x'', y'')
        Set g(x'', y'') = f(x, y)
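Solution 2 can be sketched directly (non-negative coordinates assumed, so `int()` acts as floor):

```python
# Bilinear interpolation of a grey level at a non-integral (x, y)
# from the 4 nearest integral neighbours; img indexed [row, col].
def bilinear(img, x, y):
    x0, y0 = int(x), int(y)
    dx, dy = x - x0, y - y0
    return ((1 - dx) * (1 - dy) * img[y0][x0]
            + dx * (1 - dy) * img[y0][x0 + 1]
            + (1 - dx) * dy * img[y0 + 1][x0]
            + dx * dy * img[y0 + 1][x0 + 1])

img = [[0, 10], [20, 30]]
print(bilinear(img, 0.5, 0.5))   # 15.0: average of the 4 neighbours
```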
Non-Linear Filtering
Tuesday, March 22, 2016
1:12 AM
Concept of Diffusion (Fick's Laws)
- 1st Law:
    Flux ∝ Concentration gradient (change in concentration):
    j = -D ∇u
- 2nd Law:
    Rate of concentration accumulation within a volume
    = Change of the local concentration gradient:
    ∂u/∂t = -div(j)
- Diffusion equation (Heat equation, 1st + 2nd laws combined):
    ∂u/∂t = div(D ∇u)
    D = Diffusivity (const or function)
    1D: ∂u/∂t = ∂/∂x (D ∂u/∂x)
    2D: ∂u/∂t = ∂/∂x (D ∂u/∂x) + ∂/∂y (D ∂u/∂y)
Linear Diffusion Filtering
- Assumption: Intensity diffuses isotropically, D = const
    Initially: u(x, y, 0) = input image I(x, y)
- If D is const: Equivalent to a low-pass Gaussian filter
- Limitation:
    Blurs important features (edges, …)
    Noise reduced as t ↑ ⇒ Signal smoother, but edges dislocated
Non-Linear Isotropic Diffusion Filtering
- Aim: Only smooth WITHIN regions, NOT ACROSS boundaries
- How: Diffusivity function of the intensity gradient:
    Zero on boundaries (large |∇u|)
    Non-zero inside regions (small |∇u|)
Non-Linear Anisotropic Diffusion Filtering
- Concept:
    Smoothing/Noise reduction ALONG edges, not only within each
    region
- How: Diffuse along the gradient & tangent vectors of the
  Gaussian-smoothed image u_σ:
    u_σ = G_σ * u
    D: Matrix (diffusion tensor) with the grad. & tangent vectors as
    eigenvectors, and eigenvalues λ1, λ2
    At boundaries, λ1 → 0 (no diffusion across the edge);
    λ2 very large, λ1 very small ⇒ Smoothing along the edge achieved
    * NOTE: σ → 0 ⇒ anisotropic → isotropic diffusion
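A minimal explicit finite-difference step of the linear (constant-D) diffusion equation, with periodic boundaries assumed; the step size dt ≤ 0.25/D keeps the scheme stable:

```python
import numpy as np

# Explicit scheme for u_t = D (u_xx + u_yy) on a unit grid.
def diffuse(u, steps=10, D=1.0, dt=0.2):
    u = u.astype(float)
    for _ in range(steps):
        lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0)
               + np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u)
        u = u + dt * D * lap
    return u

img = np.zeros((16, 16)); img[8, 8] = 1.0
out = diffuse(img)
print(out.max() < 1.0, abs(out.sum() - 1.0) < 1e-9)  # smoothed, mass conserved
```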
Morphological Image Processing
Sunday, March 20, 2016
6:50 PM
Basic Binary Morphological Operations
Set Theory Basis
A, B ⊆ Z²
- Empty set: ∅
- Union: A ∪ B = {w | w ∈ A or w ∈ B}
- Intersection: A ∩ B = {w | w ∈ A and w ∈ B}
- Subset: A ⊆ B ⇔ ∀w ∈ A: w ∈ B
- Complement: A^C = {w | w ∉ A}
- Difference: A - B = {w | w ∈ A, w ∉ B} = A ∩ B^C
- Reflection: B̂ = {w | w = -b, for b ∈ B}
- Translation by z: (A)z = {c | c = a + z, for a ∈ A}

Dilation
- Def: A ⊕ B = {z | (B̂)z ∩ A ≠ ∅}
    Set of all displacements z such that B̂ (translated by z) and A
    have overlapping elements
- How to use:
    Put B̂ on top of A
    If they overlap ⇒ Mark the CENTER of B̂
- Application: Bridging gaps in broken characters

Erosion
- Def: A ⊖ B = {z | (B)z ⊆ A}
    Set of all displacements z such that B (translated by z) is
    contained in A
- How to use:
    Put B on top of A
    If B is totally inside A ⇒ Mark the CENTER of B
- Application: Eliminating irrelevant details in an image
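The two definitions can be sketched for a 3x3 square structuring element (borders zero-padded; for this symmetric B, B̂ = B):

```python
import numpy as np

# Binary dilation and erosion with a 3x3 square structuring element.
def dilate(A):
    P = np.pad(A, 1)
    return sum(P[1 + dy:1 + dy + A.shape[0], 1 + dx:1 + dx + A.shape[1]]
               for dy in (-1, 0, 1) for dx in (-1, 0, 1)) > 0

def erode(A):
    P = np.pad(A, 1)
    return sum(P[1 + dy:1 + dy + A.shape[0], 1 + dx:1 + dx + A.shape[1]]
               for dy in (-1, 0, 1) for dx in (-1, 0, 1)) == 9

A = np.zeros((7, 7), dtype=int)
A[2:5, 2:5] = 1                            # 3x3 square object
print(dilate(A).sum(), erode(A).sum())     # 25 1: grows to 5x5, shrinks to 1 px
```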
Advance Binary Morphological Operations
Tuesday, March 22, 2016
10:18 PM
Opening
- Def: A ∘ B = (A ⊖ B) ⊕ B
    (Erosion → Dilation)
- Alternative: A ∘ B = ∪{(B)z | (B)z ⊆ A}
    (Union of all translates of B fitting into A)
- Effect: Smooths the object contour from INSIDE
    (Analogy: Imagine rolling a ball inside the object)
- Properties:
    Convergence: (A ∘ B) ∘ B = A ∘ B
    Subimage: A ∘ B ⊆ A

Closing
- Def: A • B = (A ⊕ B) ⊖ B
    (Dilation → Erosion)
- Effect: Smooths the object contour from OUTSIDE
    (Analogy: Imagine rolling a ball outside the object)
- Properties:
    Convergence: (A • B) • B = A • B
    Subimage: A ⊆ A • B
    Dual: (A • B)^C = A^C ∘ B̂

Boundary Extraction
- β(A) = A - (A ⊖ B)
    (Difference between the object & the eroded object)

Region Filling
- Effect: Fill the inside of a bounded region with 1
- Procedure: Begin with a point p inside the boundary:
    X0 = p
    Xk = (Xk-1 ⊕ B) ∩ A^C
    Stop when Xk = Xk-1
- Why A^C: Avoid flood-filling OUTSIDE the boundary

Connected Components
- Effect: Begin with a point p inside the object ⇒ Fill out the
  entire object with 1
- Procedure:
    X0 = p
    Xk = (Xk-1 ⊕ B) ∩ A
    Stop when Xk = Xk-1
- Why A: Avoid going outside the object
- In the illustration: Black region = 1, NOT yet found by the
  algorithm; striped region = 1, FOUND by the algorithm

Skeleton
1. Definition: Skeleton S(A) of a set A:
    a. If:
        i. z ∈ S(A)
        ii. (D)z ⊆ A: Largest disk centered at z
    b. Then:
        i. ∄D': (D)z ⊂ D' ⊆ A, D' a disk (not necessarily centered
        at z) larger than (D)z
        ii. (D)z = Max. disk
        iii. (D)z touches A's boundary at ≥ 2 places
2. Skeleton construction: B = Structuring element
    Sk(A) = (A ⊖ kB) - ((A ⊖ kB) ∘ B)
    A ⊖ kB = ((…(A ⊖ B) ⊖ B) ⊖ …) ⊖ B
    (k successive erosions of A by B)
    K = max{k | A ⊖ kB ≠ ∅}
    (Last iterative step before A erodes to the empty set)
    S(A) = ∪_{k=0}^{K} Sk(A)
3. Image reconstruction: Starting with the Sk(A), K, B:
    A = ∪_{k=0}^{K} (Sk(A) ⊕ kB)
    (Sk(A) ⊕ kB) = ((…(Sk(A) ⊕ B) ⊕ B) ⊕ …) ⊕ B
    (k successive dilations of Sk(A) by B)
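The region-filling iteration Xk = (Xk-1 ⊕ B) ∩ A^C can be sketched with B = the 3x3 square (the square boundary test image is assumed):

```python
import numpy as np

# Region filling by iterated dilation intersected with A^c.
def fill(A, seed):
    Ac = (A == 0)
    X = np.zeros_like(A, dtype=bool)
    X[seed] = True
    while True:
        P = np.pad(X, 1)
        dil = np.zeros_like(X)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                dil |= P[1 + dy:1 + dy + X.shape[0], 1 + dx:1 + dx + X.shape[1]]
        Xn = dil & Ac                       # stay inside the boundary
        if (Xn == X).all():
            return X | (A == 1)             # filled interior + boundary
        X = Xn

A = np.zeros((7, 7), dtype=int)
A[1, 1:6] = A[5, 1:6] = A[1:6, 1] = A[1:6, 5] = 1   # closed square boundary
print(fill(A, (3, 3)).sum())                # 25: the filled 5x5 square
```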
Gray-scale Morphological Image Processing
Tuesday, March 22, 2016
11:54 PM
Operations
f(x, y) = Input image, domain Df
b(x, y) = Structuring element, domain Db

Dilation
- 1-D: (f ⊕ b)(s) = max{f(s - x) + b(x) | (s - x) ∈ Df ; x ∈ Db}
- 2-D: (f ⊕ b)(s, t) = max{f(s - x, t - y) + b(x, y) |
    (s - x, t - y) ∈ Df ; (x, y) ∈ Db}
- Effect (if b(x, y) > 0 ∀x, y):
    Brighter output image
    Dark, small details reduced

Erosion
- 1-D: (f ⊖ b)(s) = min{f(s + x) - b(x) | (s + x) ∈ Df ; x ∈ Db}
- 2-D: (f ⊖ b)(s, t) = min{f(s + x, t + y) - b(x, y) |
    (s + x, t + y) ∈ Df ; (x, y) ∈ Db}
- Effect (if b(x, y) > 0 ∀x, y):
    Darker output image
    Bright, small details reduced

- Dilation-Erosion relation (duality):
    (f ⊖ b)^c(s, t) = (f^c ⊕ b̂)(s, t)
    f^c(x, y) = -f(x, y)
    b̂(x, y) = b(-x, -y)

Opening
- f ∘ b = (f ⊖ b) ⊕ b
- Removes small light details
    (Analogy: Roll a ball UNDER the rough surface)
- Leaves overall gray lvls & large bright features undisturbed

Closing
- f • b = (f ⊕ b) ⊖ b
- Removes small dark details
    (Analogy: Roll a ball ABOVE the rough surface)
- Leaves overall gray lvls & large bright features undisturbed
- Duality: (f • b)^c(s, t) = (f^c ∘ b̂)(s, t)

1-D grayscale morphological operation illustration:
Structuring element b(x) = 1 for x ∈ [-1, 1]
Applications
- Morphological smoothing:
    g = (f ∘ b) • b
    (Opening → Closing)
    Effect: Removes/attenuates bright + dark artifacts, noise
- Morphological gradient:
    g = (f ⊕ b) - (f ⊖ b)
    Effect: Highlights sharp gray lvl transitions
- Granulometry:
    Do opening with structuring elements of gradually increasing size
    Compute the difference between the image in the previous & new
    step
    Ending: Differences normalized ⇒ Used to construct a histogram
    of the particle-size distribution
    Effect: Determines the size distribution of particles in an image
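A 1-D grayscale opening can be sketched with a flat structuring element b(x) = 0 on [-1, 1] (a flat b turns dilation/erosion into local max/min; the flat element and the test signal are assumptions):

```python
# 1-D grayscale dilation and erosion with a flat b on [-1, 1].
def gs_dilate(f, r=1):
    n = len(f)
    return [max(f[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]

def gs_erode(f, r=1):
    n = len(f)
    return [min(f[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]

f = [3, 3, 8, 3, 3, 0, 3]            # one bright spike (8), one dark spike (0)
opened = gs_dilate(gs_erode(f))      # f o b: erosion then dilation
print(opened)                        # [3, 3, 3, 3, 3, 0, 0]: light spike removed
```

Note how the opening removes the small light detail (the 8) while the dark detail survives, matching the "removes small light detail" property above.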
Image Segmentation
Sunday, March 20, 2016
6:50 PM
Gradient (Discontinuity) Based Segmentation
- Principle: Intensity discontinuities detected based on the mask
  response within a pre-defined window
    f = Input image
    3x3 window of weights:
        w1 w2 w3
        w4 w5 w6
        w7 w8 w9
    Mask response (at the pixel under w5): R = Σ wi zi
- |R| ≥ T ⇒ Intensity discontinuity detected
    T = Pre-defined threshold

Point Detection: Laplacian mask
- Principle: Second derivative
- Mask design: Sum of all weights = Zero ⇒ No response in flat
  regions
- 3x3 Laplacian mask: center weight 8, all others -1
    |R| = |∇²f(x, y)| ≥ T ⇒ Point detected

Line Detection
- Principle: ≥ 1 aligned, connected points detected ⇒ Line detected
- Masks: one 3x3 mask per direction (horizontal, vertical, +45°, -45°)
- Mask usage: For a line in a
    Specific direction: Use the single mask for that direction
    All directions: Combine all masks:
    R = max(|Rhoriz|, |Rvert|, |R+45|, |R-45|)
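The zero-sum design can be checked directly (the flat value 7 and the point value 100 are assumed test inputs):

```python
import numpy as np

# Point detection with the 3x3 Laplacian mask (center 8, neighbours -1):
# |R| >= T flags an isolated point; R = 0 on any flat region.
mask = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]])

def response(z):
    return np.sum(mask * z)

flat = np.full((3, 3), 7)
point = flat.copy(); point[1, 1] = 100
print(response(flat), response(point))   # 0 744
```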
Edge Detection
Thursday, March 24, 2016
5:54 PM
Edge models
- Step:
    Edge = Collection of connected pixels on a region boundary
    Width = 1 pixel
- Roof:
    Edge = Line cutting through a region
    Width ∝ Thickness & sharpness of the line
    Represents thin features: roads, lines, drawings, …
- Ramp:
    Edge point: Any point contained in the ramp
    Width = Length of the ramp
    Intensity profile (left → right):
        1st derivative: (+) at the transition in, (-) at the
        transition out
        2nd derivative: (+) on the dark side, (-) on the bright side
* NOTE: Derivatives approximated by finite differences

Edge Detection Strategy
- Strategies:
    ∇f(x, y): Detect the presence of an edge
    Sign of ∇²f(x, y): Detect whether an edge pixel is on the
    dark/bright side
    Zero-crossing property of ∇²f(x, y): Locate the center of a
    thick edge
- Limitation: Noise
    Solution: Smooth the image first

Gradient operator
- Usage: 1st derivative ⇒ Detect edge presence
- Computation: Use masks
    Prewitt:
        Gx (horiz):    Gy (vert):
        -1  0  1       -1 -1 -1
        -1  0  1        0  0  0
        -1  0  1        1  1  1
    Sobel:
        Gx (horiz):    Gy (vert):
        -1  0  1       -1 -2 -1
        -2  0  2        0  0  0
        -1  0  1        1  2  1
    (+45° and -45° diagonal masks defined analogously)

Laplacian of Gaussian (LoG) operator
- Usage: Detect zero-crossing values ⇒ Center of a thick edge
- Properties:
    Laplacian: Pure 2nd derivative ⇒ Unacceptably sensitive to noise
    LoG: Laplacian smoothed by a Gaussian ⇒ Noise ↓
    ⇒ Reliably detects zero-crossings
- Def:
    LoG(f) = ∇²(f * h) = f * ∇²(h), h = Gaussian
- Computation: 5x5 mask:
     0  0 -1  0  0
     0 -1 -2 -1  0
    -1 -2 16 -2 -1
     0 -1 -2 -1  0
     0  0 -1  0  0
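The zero-sum and zero-crossing behaviour of the 5x5 LoG mask can be checked on a synthetic step edge (the step image is assumed):

```python
import numpy as np

# The LoG mask sums to zero (no response on flat regions); its response
# changes sign across a step edge, giving a zero-crossing at the edge.
log_mask = np.array([[ 0,  0, -1,  0,  0],
                     [ 0, -1, -2, -1,  0],
                     [-1, -2, 16, -2, -1],
                     [ 0, -1, -2, -1,  0],
                     [ 0,  0, -1,  0,  0]])

def resp(patch):
    return int(np.sum(log_mask * patch))

step = np.zeros((5, 9)); step[:, 5:] = 10   # dark left, bright right
left = resp(step[:, 2:7])                   # centered on the dark side
right = resp(step[:, 3:8])                  # centered on the bright side
print(log_mask.sum(), left, right)          # 0 -50 50: opposite signs
```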
Edge Point Linking & Boundary Detection
Thursday, March 24, 2016
5:52 PM
Local processing
- Idea: Similar edge points are connected
- Criteria for similarity:
    (x0, y0) = Current pixel, (x, y) = Neighbouring pixel
    Grad. magnitude: |∇f(x, y) - ∇f(x0, y0)| ≤ E
    Grad. direction: |α(x, y) - α(x0, y0)| ≤ A
    Distance: ||(x, y) - (x0, y0)|| ≤ D
    E, A, D = Thresholds

Global processing: Hough transform
- Concept: Each line can be represented by 2 params (ρ, θ):
    ρ: Distance from the origin
    θ: Angle of the line's normal
    Given an edge point (x0, y0) and θ, find ρ:
    ρ = x0·cosθ + y0·sinθ
- Param ranges:
    -D ≤ ρ ≤ D (D = Max. image dimension)
    -90° ≤ θ ≤ 90°
- Why use (ρ, θ) but NOT (a, b) (as in y = ax + b):
    ρ, θ have a LIMITED range
    a, b have an INFINITE range (e.g., a → ∞ for vertical lines)
- Edge-linking procedure:
    Initialize the accumulator array:
        A[ρ, θ] = 0, ρ ∈ [-D, D], θ ∈ [-90°, 90°]
    Iterate through every edge point (xi, yi):
        For each θj from -90° to 90°, calculate ρj:
        ρj = xi·cosθj + yi·sinθj
        (A curve in (ρ, θ)-param space)
        Set A[ρj, θj] = A[ρj, θj] + 1
    Pick the highest A[ρj, θj]'s (based on pre-defined criteria)
    ⇒ Lines detected
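The procedure above can be sketched with a small accumulator (1° sampling of θ and integer binning of ρ are assumed discretizations):

```python
import numpy as np

# Minimal Hough transform accumulator over (rho, theta).
def hough(points, D):
    thetas = np.deg2rad(np.arange(-90, 90))
    acc = np.zeros((2 * D + 1, len(thetas)), dtype=int)
    for x, y in points:
        rho = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rho + D, np.arange(len(thetas))] += 1
    return acc, thetas

# 100 collinear points on the vertical line x = 3 (rho = 3, theta = 0)
pts = [(3, y) for y in range(100)]
acc, thetas = hough(pts, D=128)
r, t = np.unravel_index(acc.argmax(), acc.shape)
print(r - 128, round(np.rad2deg(thetas[t])))   # 3 0: the line is recovered
```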
Global Intensity-based Segmentation
Thursday, March 24, 2016
5:53 PM
Global Thresholding Problem
- Assumption:
    Image f(x, y): Light object + dark background
    Histogram: 2 groups of intensity
- Objective: Select a threshold T that clearly separates these groups

Automatic Thresholding Methods
1. Iterative approach:
    - Initial estimate T
    - Segment the image using T
        Regions: G1 (grey lvl ≥ T) + G2 (grey lvl < T)
    - Compute the avg grey lvls μ1, μ2 of G1, G2
    - Compute a new T = ½(μ1 + μ2)
    - Repeat until T converges (changes in T small enough)
    - Extension: Multi-lvl thresholding
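The iterative approach can be sketched in a few lines (the bimodal test image and the convergence tolerance are assumed):

```python
import numpy as np

# Iterative global threshold: T <- (mu1 + mu2)/2 until convergence.
def iterative_threshold(img, eps=0.5):
    T = img.mean()                       # initial estimate
    while True:
        g1, g2 = img[img >= T], img[img < T]
        T_new = 0.5 * (g1.mean() + g2.mean())
        if abs(T_new - T) < eps:
            return T_new
        T = T_new

# Bimodal test image: dark background ~20, light object ~200
img = np.concatenate([np.full(900, 20.0), np.full(100, 200.0)])
print(iterative_threshold(img))   # 110.0: midway between the two modes
```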
2. Statistical, minimum-error approach:
    - Assumption:
        Histogram is a mixture of 2 distributions:
        p(z) = w1·p(z | 1) + w2·p(z | 2)
        p(z | k) = Prob. of occurrence of gray lvl z in region k
        wk = Prob. of occurrence of region k in the image
        w1 + w2 = 1; 0 ≤ w1, w2 ≤ 1
    - Error: E(T) min ⇔ w1·p(T | 1) = w2·p(T | 2)
        If the p(z | k) are Gaussian with the same sd σ:
        T = (μ1 + μ2)/2 + [σ²/(μ1 - μ2)]·ln(w2/w1)

Expectation-Maximisation (EM) Method
1. Problem:
    Given a histogram, how to model it using multiple Gaussian
    distributions
2. Relations:
    - n different gray lvls: z1, z2, …, zn
        p(zk) = fk / N: Prob. of occurrence of gray lvl zk
        fk = No. of pixels with gray lvl zk
        N = f1 + f2 + … + fn: Total no. of pixels
    - M Gaussian models (μ1, σ1), (μ2, σ2), …, (μM, σM)
        p(z | j) = Prob. of occurrence of gray lvl z in region j
        p(j) = Prob. of occurrence of region j in the image
        0 ≤ p(j) ≤ 1
        p(j | z) = Prob. of occurrence of region j, given gray lvl z
        (Posterior prob.)
3. Objective: Max. the likelihood L ⇔ Min. the expected neg.
   log-likelihood E = -ln L
4. Method:
    - Initialize estimates of μj, σj, p(j)
        Based on observation
        Easy way: p(j) = 1 / M
    - Estimate p_old(j | z) ∀j, z
    - Compute new μj, σj, p(j)
    - Iterate until convergence
Watershed Segmentation
Wednesday, March 30, 2016
10:14 AM
Concept: Region-based Segmentation
- Gradually pour water into the catchment basins
    (Gradually increase the water (grey) lvl n, starting from min)
- When 2 basins start merging ⇒ Build a dam/watershed/ridge in
  between to prevent merging
- When no more merges are detected ⇒ Water has reached the highest
  possible lvl ⇒ Stop
- Dam lines = Object boundaries

Watershed Segmentation Algorithm
1. Definitions:
    - g(x, y): Gradient image
    - min = min{g(x, y)}, max = max{g(x, y)}
    - M1, M2, …, MR: Regional minima of g(x, y)
    - Cn(Mi) = Set of POINTS in the catchment basin of Mi that are
    flooded at step n
    - Tn = {(s, t) | g(s, t) < n}: Set of POINTS with grad. mag. < n
    - Qn = Set of CONNECTED COMPONENTS in Tn
    - B: 3x3 structuring element
2. Algorithm:
    - Initialization: Cmin(M1), Cmin(M2), …, Cmin(MR)
    - Run n = (min + 1) → (max + 1). At step n, for each q ∈ Qn:
        Case 1: q ∩ Cn-1 = ∅:
            New regional minimum detected
            q → MR+1, R → R + 1
            Cn(MR+1) = q
        Case 2: q ∩ Cn-1 contains 1 Cn-1(Mi):
            Cn-1(Mi) grows into Cn(Mi):
            Continuously dilate q ∩ Cn-1 with B, limiting the
            dilation INSIDE q
        Case 3: q ∩ Cn-1 contains ≥ 2 Cn-1(Mi):
            Watershed/Ridge detected ⇒ Build a dam here:
            Continuously dilate q ∩ Cn-1 with B, limiting the
            dilation INSIDE q
            After each dilation, mark the newly obtained points
            reached by ≥ 2 Cn-1(Mi) as DAM POINTS
* NOTE: Watershed segmentation is often applied on the gradient
  image
    Regional minima ⇔ Small grad. values (⇔ Small grey lvl changes)
    ⇔ Inside the objects of interest
* NOTE: The dilations used here are BINARY morphological dilations
Use of Motion in Segmentation
Wednesday, March 30, 2016
10:14 AM
Basic Approach: Direct differences
- Idea:
    Compare 2 images acquired at different times, pixel by pixel
    Subsequent image subtracted from a reference image:
    dij(x, y) = 1 if |f(x, y, ti) - f(x, y, tj)| > T; 0 otherwise
    dij: Difference image
    f(x, y, ti): Reference frame, taken at time ti
    f(x, y, tj): Subsequent frame, taken at time tj
    Stationary elements cancelled
    Non-stationary elements = Non-zero entries
- Assumptions:
    f(x, y, ti), f(x, y, tj): Same scene
    Rigid-body, in-plane motion
- Issues of dij:
    Noise sensitive:
        Incorrect T ⇒ Noise not cancelled
        Solution - Island removal: Remove all 4-/8-connected regions
        of 1's having fewer than a predefined no. of 1's
    Gives limited info about motion: Motion direction + speed not
    easily estimated

Accumulative Difference Approach
- Idea: Accumulative Difference Image (ADI)
    Differences between the reference & subsequent frames are
    accumulated: a difference occurring at a pixel ⇒ that pixel's
    counter ↑
    Noise sensitivity ↓
    Motion direction + speed can be estimated
- Implementation:
    f(x, y, t1): Reference frame
    f(x, y, t2), …, f(x, y, tn): Subsequent frames
    Absolute ADI: count when |f(x, y, t1) - f(x, y, tk)| > T
    Positive ADI: count when f(x, y, t1) - f(x, y, tk) > T
    Negative ADI: count when f(x, y, t1) - f(x, y, tk) < -T
- Observations:
    Positive ADI: Non-zero area
        Size = Size of the moving object
        Location = Location of the moving object in the reference
        frame
    When the moving object has displaced completely with respect to
    its position in the reference frame ⇒ The no. of non-zero
    entries stops increasing
    Absolute ADI = Positive ADI ∪ Negative ADI
    Direction + speed: Determined from the entries in the Absolute &
    Negative ADI
Snake: Active Contour Model
Wednesday, March 30, 2016
10:15 AM
Parameterize the Position of the 2D Contour (Snake)
- v(s) = (x(s), y(s)), 0 ≤ s ≤ 1
    s: Param. that increases as we traverse from one point of the
    contour to another
    x(s), y(s): Positions of the contour points, functions of s
    Contour can be open/closed

Control the Shape of the Contour (Snake) with an Energy Function
- Internal energy:
    E_int = ½ ∫ [α(s)|v'(s)|² + β(s)|v''(s)|²] ds
    First term: Controls the "tension" of the contour
        Tension of the "spring" that connects 2 consecutive contour
        points
    Second term: Controls the "rigidity" of the contour
        How easily the contour can bend itself
- External energy:
    E_ext = ∫ P(v(s)) ds
    P: Scalar potential function, defined on the image plane
        Describes the image features the snake will be attracted to
    To make the snake seek image edges:
        P(x, y) = -c |∇(G_σ * I)(x, y)|
        c > 0: const
        G_σ * I: Image convolved with a Gaussian smoothing filter
        of sd σ
- Theorem: The final shape of the contour must satisfy the
  Euler-Lagrange equation

Dynamic Deformable Model of Snake
- v(s, t): Time-varying contour
- Evolution equation: The contour evolves to reduce its energy;
  at equilibrium, a stationary contour with min. internal +
  external energy is obtained
- Discrete formulation using a finite-difference scheme, in matrix
  form:
    h = Step size
    τ = Time step size
    A: Matrix in terms of α, β
    Solve iteratively
- With appropriate α(s), β(s), P, the snake will adjust itself to
  seek the boundary of the shape correctly despite a noisy
  background
Image Registration
Tuesday, May 3, 2016
11:20 AM
Image Registration
- Definition: The process of:
    Combining multiple images into a single one
    Transforming an image to make it match a reference
    image/coordinate system
- Applications:
    Panorama photos
    Medical imaging
        Accommodate tissue deformation in image-guided surgery
    Satellite image repairing

Affine Transformation Functions
Transformations that preserve collinearity (all points lying on a
line initially still lie on a line after the transformation)
1. Rigid-body transformation (Object size unchanged)
    a. Translation: (x, y, z) → (x + dx, y + dy, z + dz)
    b. Rotation by θ about an axis (x-, y-, or z-axis):
        Rotation is measured clockwise while looking along the
        positive axis direction
    c. Translation + Rotation:
        6 degrees of freedom (DOF): θx, θy, θz, dx, dy, dz
        Combined into a single homogeneous matrix
2. Non-rigid-body transformation (Object size changed)
    a. Scaling
    b. Shearing:
        kxy ≠ 0 ⇒ Stretch along the x-axis on the xy-plane
* NOTE: Combining all transformation forms into a single matrix:
    12 unknowns ⇒ 4 pairs of corresponding points (before + after
    the transformation) needed

Objective Functions
1. External markers
    Stereotactic frames, fiducial markers, …
    - Seen & located in the images as tiny 3D volumes, represented
    by points
    - Used to guide/evaluate the transformation ⇒ Pinpoints to find
    the transformation matrix parameters
    - Registration error: Euclidean distance between corresponding
    points
2. Internal markers
    a. Surface-based
        Segmentation: Corresponding surfaces delineated & used to
        guide the transformation
        Minimize the distance F between the 2 surfaces
        For a rigid-body transformation:
        Find T = (θx, θy, θz, tx, ty, tz) such that F is min.
    b. Edge-based
        Edges extracted based on a gradient function ⇒ Guide the
        transformation
        Gradient magnitude computed on a Gaussian-smoothed version
        of the input image
    c. Voxel/Pixel intensity
        - Sum of Squared intensity Differences (SSD):
            SSD = Σ_D [u(x) - v(x)]²
            D: Domain of interest (overlap region between u & v, the
            entire domain, …)
            Widely used
            Very sensitive to outliers (pixels with large intensity
            differences)
        - Sum of Absolute Differences (SAD):
            SAD = Σ_D |u(x) - v(x)|
            Reduces the outlier effect
        - Correlation Coefficient (CC):
            CC = Σ(u - ū)(v - v̄) / √[Σ(u - ū)² · Σ(v - v̄)²]
            ū, v̄: Means
            -1 ≤ CC ≤ 1
            Assumption: Linear relationship between the intensity
            values in the images

Optimization Method
1. Problem:
    Transformation parameters a1, a2, …, an
    Find the (a1, …, an) minimizing the objective function F
2. Powell's method:
    - Initialization:
        n basis vectors
        Starting point
    - Repeat the following cycle until the improvement in F becomes
    insignificantly small:
        For i = 1, 2, …, n: 1D line minimization:
            Find λi such that F along direction i is min.
            Move to the new point
        Set an updated direction set for the NEXT cycle
    - Search strategies to achieve global optimization:
        Multi-resolution (coarse-to-fine) approach
        Multi-start approach
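Two of the intensity-based objective functions can be sketched directly (the small test vectors are assumed):

```python
import numpy as np

# SSD and correlation coefficient between two intensity arrays.
def ssd(u, v):
    return np.sum((u - v) ** 2)

def cc(u, v):
    uc, vc = u - u.mean(), v - v.mean()
    return np.sum(uc * vc) / np.sqrt(np.sum(uc ** 2) * np.sum(vc ** 2))

u = np.array([1.0, 2.0, 3.0, 4.0])
v = 2 * u + 5                        # linearly related intensities
print(ssd(u, u), round(cc(u, v)))    # 0.0 1: CC = 1 under a linear relationship
```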
Image Compression
Tuesday, May 3, 2016
Data redundancy
11:20 AM
Basic Concepts
- Lossless: Image compressed & decompressed
without losing information
- Lossy: Change/reduce some image data when
compressing → Higher data reduction, but
imperfect reproduction of original image
- Compression ratio:
Original dataset X1, size = n1
Compressed dataset X2, size = n2
CR = n1 / n2
- Relative data redundancy of X2 to X1:
RD = 1 - 1/CR
n2 = n1 (CR = 1, RD = 0): No redundant data
n2 << n1 (CR >> 1, RD → 1): Highly
redundant data, significant compression
n2 >> n1 (CR → 0, RD → -∞): X2 has much
more data than original X1

1. Coding redundancy:
- Different symbols can be encoded using different no. of bits
- Related measurements:
r = Discrete rand var in [0, 1]
kth gray lvl rk (k = 0, 1, 2, …, (L - 1)):
Occurrence = nk, pr(rk) = nk / n, Symbol bit length = l(rk)
Avg no of bits: Lavg = Σk l(rk) pr(rk)
Compression ratio:
Original image: Lavg1
Compressed image: Lavg2
CR = Lavg1 / Lavg2

rk        pr(rk)   Code scheme 1   Code scheme 2
                   (Fix-length)    (Variable length)
                   l1(rk)          l2(rk)
r0 = 0/7  0.19     000             11
r1 = 1/7  0.25     001             01
r2 = 2/7  0.21     010             10
r3 = 3/7  0.16     011             001
r4 = 4/7  0.08     100             0001
r5 = 5/7  0.06     101             00001
r6 = 6/7  0.03     110             000001
r7 = 7/7  0.02     111             000000
Lavg               3 bits          2.7 bits
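The Lavg figures for the two code schemes can be checked directly (a small sketch using the probabilities and code lengths from the table above):

```python
# Average code length Lavg = sum(l(rk) * pr(rk)) and CR = Lavg1 / Lavg2
p  = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]
l1 = [3] * 8                    # code scheme 1 (fixed length)
l2 = [2, 2, 2, 3, 4, 5, 6, 6]   # code scheme 2 (variable length)

lavg1 = sum(li * pi for li, pi in zip(l1, p))
lavg2 = sum(li * pi for li, pi in zip(l2, p))
print(round(lavg1, 2), round(lavg2, 2))  # 3.0 2.7 (bits)
print(round(lavg1 / lavg2, 2))           # 1.11 (compression ratio)
```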
2. Interpixel redundancy:
- Adjacent pixel values can be reasonably predicted from neighboring pixels
- Related measurements: Autocorrelation coefficients:
Δn = Pixel distance
N = No. of pixels in image
Higher γ(Δn) → Higher pixel correlation

Fidelity Criteria
f(x, y) = Original image
f^(x, y) = Reconstructed image from compression
- Objective fidelity criterion:
Root-mean-square error:
e_rms = [ (1/(MN)) Σx Σy (f^(x, y) - f(x, y))² ]^(1/2)
Mean square signal-to-noise ratio:
SNR_ms = Σx Σy f^(x, y)² / Σx Σy (f^(x, y) - f(x, y))²
- Subjective fidelity criterion: Based on human's subjective
rating
Rating scale of the Television Allocations Study Organization:
Value  Rating     Description
1      Excellent  An image of extremely high quality, as
                  good as you could desire.
2      Fine       An image of high quality, providing
                  enjoyable viewing. Interference is not
                  objectionable.
3      Passable   An image of acceptable quality.
                  Interference is not objectionable.
4      Marginal   An image of poor quality. You wish you
                  could improve it. Interference is
                  somewhat objectionable.
5      Inferior   A very poor image, but you could watch
                  it. Objectionable interference is definitely
                  present.
6      Unusable   An image so bad that you could not
                  watch it.
COMP 4421 Page 46
2 images, same histogram, different autocorrelation coef profile
- Mapping: Transform 2D image to more efficient format → ↓Pixel
redundancy
Example: Binary image, size 1024 × 343
Natural scheme: 1 pixel = 1 bit (black/white)
n1 = 1024 × 343 × 1 = 351232 bits
Run-length coding scheme:
Each scan line = Alternating chain of black/white pixels
→ Sequence of pairs (g1, w1), (g2, w2), …
gi = ith grey lvl (0/1)
wi = run length
gi needs 1 bit, wi needs 11 bits (max(wi) = 1024)
Total 12166 pairs (gi, wi)
n2 = 12166 × 11 = 133826 bits
CR = 351232 / 133826 ≈ 2.62
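A minimal run-length encoder for one binary scan line, illustrating the (gi, wi) pairs above (the sample line is made up):

```python
def run_length_encode(line):
    # each maximal run of equal pixels becomes one (gray lvl, run length) pair
    pairs = []
    i = 0
    while i < len(line):
        j = i
        while j < len(line) and line[j] == line[i]:
            j += 1
        pairs.append((line[i], j - i))
        i = j
    return pairs

line = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1]
print(run_length_encode(line))  # [(0, 3), (1, 2), (0, 1), (1, 4)]
```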
3. Psychovisual redundancy:
- Certain info relatively less important than other info to human vision
→ Can be removed, but irreversible
Human eye sensitive to distinguishing features (edges, patterns, …) but
not detailed quantitative info of image
- Quantization:
Map intensity values broad → narrow range
↑CR, but ↓Image quality (false contour, …)
- Improved grayscale (IGS) quantization:
Compensate for degraded visual impact due to quantization
Observation: Low-order bits of pixel intensity are noisy &
nearly random
Mechanism:
Add some amount of locally-dependent randomness to pixel's gray lvl
before quantization
How to create locally-dependent randomness:
Use lower bits of neighboring pixel → Generate pseudorandomness
Procedures:
Initialization: Sum = 0
For each pixel:
New Sum = Pixel grey lvl + Lower-half bits of previous Sum
IGS Code = Higher-half bits of New Sum
If higher-half bits of grey lvl = 1111, add nothing (New Sum = grey lvl)

Image Compression Model
- Source encoder: Remove redundancies
Mapper: Transform input data into interpixel-redundancy-reduced format
Quantizer:
Reduce accuracy of mapper's output based on
pre-defined fidelity criterion → ↓Psychovisual
redundancy
Irreversible info loss
Symbol Encoder: Create fix/variable-length code to
represent quantizer output
- Channel encoder: ↑Noise immunity of source encoder's
output
kth Bitplane: Image of kth bit of pixel intensities in image
(7,4)-Hamming code (Channel encoding)
- 3 redundant bits added to 4-bit word → All single-bit
errors (in 7-bit combined word) detected & corrected
- Implementation:
Input: b3b2b1b0
Output: h1h2h3h4h5h6h7
h1 = b3 ⊕ b2 ⊕ b0
h2 = b3 ⊕ b1 ⊕ b0
h3 = b3
h4 = b2 ⊕ b1 ⊕ b0
h5 = b2
h6 = b1
h7 = b0
COMP 4421 Page 47

IGS quantization example:
Pixel  Gray lvl   Sum                                    IGS Code
i-1    n/a        0000 0000                              n/a
i      0110 1100  0110 1100 = 0110 1100 + 0000           0110
i+1    1000 1011  1001 0111 = 1000 1011 + 1100           1001
i+2    1000 0111  1000 1110 = 1000 0111 + 0111           1000
i+3    1111 0100  1111 0100 (high half 1111 → add 0000)  1111
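A sketch of the IGS procedure that reproduces the worked rows above (8-bit gray levels, 4-bit IGS codes; the 1111 high-half check is the no-addition rule):

```python
def igs_quantize(grays):
    codes, s = [], 0
    for g in grays:
        if (g >> 4) == 0b1111:   # higher-half bits all 1 → add nothing
            s = g
        else:
            s = g + (s & 0x0F)   # add lower-half bits of previous Sum
        codes.append(s >> 4)     # IGS code = higher-half bits of Sum
    return codes

pixels = [0b01101100, 0b10001011, 0b10000111, 0b11110100]
print([format(c, '04b') for c in igs_quantize(pixels)])
# ['0110', '1001', '1000', '1111']
```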
Error detection & correction:
c4 = h1 ⊕ h3 ⊕ h5 ⊕ h7
c2 = h2 ⊕ h3 ⊕ h6 ⊕ h7
c1 = h4 ⊕ h5 ⊕ h6 ⊕ h7
x = c4c2c1(2)
x ≠ 0 → hx corrupted, needs flipping
Example:
Input b3b2b1b0 = 1010
Output h1h2h3h4h5h6h7 = 1011010
h1 = b3 ⊕ b2 ⊕ b0 = 1 ⊕ 0 ⊕ 0 = 1
h2 = b3 ⊕ b1 ⊕ b0 = 1 ⊕ 1 ⊕ 0 = 0
h4 = b2 ⊕ b1 ⊕ b0 = 0 ⊕ 1 ⊕ 0 = 1
Receiver input h1h2h3h4h5h6h7 = 1011110
c4 = h1 ⊕ h3 ⊕ h5 ⊕ h7 = 1 ⊕ 1 ⊕ 1 ⊕ 0 = 1
c2 = h2 ⊕ h3 ⊕ h6 ⊕ h7 = 0 ⊕ 1 ⊕ 1 ⊕ 0 = 0
c1 = h4 ⊕ h5 ⊕ h6 ⊕ h7 = 1 ⊕ 1 ⊕ 1 ⊕ 0 = 1
x = c4c2c1(2) = 101(2) = 5
→ h5 = 1 corrupted, SHOULD BE h5 = 0
* NOTE: X = t1 ⊕ t2 ⊕ … ⊕ tn: XOR operation
X = 1 if ODD no of 1's in t1, t2, …, tn
X = 0 if EVEN no of 1's in t1, t2, …, tn
X called even-parity-bit: Set {t1, t2, …, tn} ∪ {X} has
EVEN no of 1's
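The encoding and syndrome equations above, run end to end on the worked example:

```python
def hamming_encode(b3, b2, b1, b0):
    h1 = b3 ^ b2 ^ b0
    h2 = b3 ^ b1 ^ b0
    h4 = b2 ^ b1 ^ b0
    return [h1, h2, b3, h4, b2, b1, b0]   # [h1, h2, h3, h4, h5, h6, h7]

def syndrome(h):
    h1, h2, h3, h4, h5, h6, h7 = h
    c4 = h1 ^ h3 ^ h5 ^ h7
    c2 = h2 ^ h3 ^ h6 ^ h7
    c1 = h4 ^ h5 ^ h6 ^ h7
    return 4 * c4 + 2 * c2 + c1           # x = c4 c2 c1 in binary

word = hamming_encode(1, 0, 1, 0)
print(word)            # [1, 0, 1, 1, 0, 1, 0]  i.e. 1011010
word[4] ^= 1           # corrupt h5 in transit
print(syndrome(word))  # 5 → h5 needs flipping
```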
COMP 4421 Page 48
Lossless Compression
Tuesday, May 17, 2016
4:42 PM

Huffman Encoding
- Variable-symbol encoding: Assign longer codes to symbols
with smaller occurrence probability
- Implementation:
Order probs of symbols in descending order
Combine 2 lowest probs into a single one for the next
source reduction
- Decoding example: 010100111100 → 01010, 0111, 00 → a3, a1, a2, a2, a6
- Produces optimal codes
- Need to store symbol table with compressed data

Lossless Predictive Coding
- Features:
↓Interpixel redundancies of closely spaced pixels by coding only
new info of each pixel
New info of pixel = Actual value - Predicted value
Predictor:
f'n = Predicted intensity value
fn = Input intensity value
n = Pixel index (1, …, N)
- Encoder:
Input: Image f1, f2, …, fN
Output: f1, f2, …, fm, em+1, em+2, …, eN
en = fn - f'n (Prediction error)
No quantization of en → Lossless
- Decoder:
Input: f1, f2, …, fm, em+1, em+2, …, eN
Output: f1, f2, …, fm, fm+1, fm+2, …, fN
fm+1 recovered from em+1 (f1, …, fm known)
fm+2 recovered from em+2 (f2, …, fm+1 known)
- Effect: Variance of en < Variance of fn → Fewer bits required to encode → ↑CR
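A minimal lossless predictive coding round trip, assuming the simplest previous-pixel predictor f'n = fn-1 (the sample intensities are made up):

```python
def pred_encode(f):
    # transmit f1 plus prediction errors en = fn - f'n, with f'n = f(n-1)
    return [f[0]] + [f[n] - f[n - 1] for n in range(1, len(f))]

def pred_decode(stream):
    f = [stream[0]]
    for e in stream[1:]:
        f.append(f[-1] + e)   # fn = f'n + en
    return f

f = [100, 102, 105, 105, 104, 101]
stream = pred_encode(f)
print(stream)                    # [100, 2, 3, 0, -1, -3]
print(pred_decode(stream) == f)  # True: exact reconstruction (lossless)
# Errors cluster near 0 → smaller variance than f → fewer bits to encode
```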
COMP 4421 Page 49
Lossy Compression
Tuesday, May 17, 2016
11:02 PM

Lossy Predictive Coding
- Based on Lossless Predictive Coding scheme, except
prediction error quantized to limited range
- Encoder:
Input: Image f1, f2, …, fN
Output: f1, f2, …, fm, e'm+1, e'm+2, …, e'N
e'n = Quantize(en)
- Decoder:
Input: f1, f2, …, fm, e'm+1, e'm+2, …, e'N
Output: f1, f2, …, fm, f''m+1, f''m+2, …, f''N
X1 = f1, …, Xm = fm
X1 = f2, X2 = f3, …, Xm-1 = fm, Xm = f''m+1
X1 = f3, X2 = f4, …, Xm-2 = fm, Xm-1 = f''m+1, Xm = f''m+2
* NOTE: f''n ≠ fn in general (lossy)
- Delta Modulation (simple lossy coding):
Encoder:
f''1 = f1
f'n = f''n-1, en = fn - f'n, f''n = f'n + e'n
Quantization: e'n = +ζ if en > 0, else -ζ
Send out f''1, e'2, …, e'N
Decoder: f''1 available, f'n = f''n-1, f''n = f'n + e'n

Transform Coding
- Features:
Transform pixels to new space → Compact representation of info
Compression = Quantization of transformed coeffs
- Discrete Cosine Transform (DCT):
Features: Like DFT, but no imaginary comp, better power packing
Formula:
Forward DCT:
Inverse DCT:
- Encoding procedures:
Sub-image size n:
Should be n = 2^k → Apply fast transform algorithm O(n log n)
↑n → ↑Compression, but ↑Computational complexity
Ideal (empirical testing): n = 8, 16
COMP 4421 Page 50

JPEG Coding Scheme
- Encoding:
- Decoding:
- JPEG Quality: Quantifies compression
↓Quality → ↑Values in Quantization table → ↑Blurring + ↑CR
100% quality → T(u, v) = 1 ∀ u, v
- JPEG Color Image Compression:
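A naive 2D DCT-II on one sub-image illustrates the power-packing idea behind transform coding; this is an O(n⁴) teaching sketch, not the fast algorithm or the full JPEG pipeline:

```python
import math

def dct2(block):
    # 2D DCT-II with orthonormal scaling alpha(k)
    n = len(block)
    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

block = [[100, 100, 100, 100]] * 4   # flat 4x4 sub-image
coeffs = dct2(block)
# Energy packs into the DC term; the other coefficients are ~0, so a
# quantizer can discard them with little visual loss.
print(round(coeffs[0][0]))  # 400
```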
COMP 4421 Page 51
Face Detection & Recognition
Tuesday, May 3, 2016
11:20 AM
Face Detection using Image pyramid & Neural networks
(Rowley, Baluja, Kanade, 1998)
- Classifier trained directly using preprocessed & normalized
face & nonface training sub-windows
- Neural network:
Input:
Preprocessed 20×20 sub-windows extracted from
image pyramid (multi-scale representation of same
image)
Why use image pyramid:
Vary sub-window size → Detect faces of
different sizes
Fix sub-window size & vary image size
→ Keep design of neural network simple
Output: Face/Non-face
- False-detection error minimization:
Partially-trained system applied to random scenery images
not containing any faces
Regions detected as faces added into negative training set
- Issues:
Undetectable situations: blurry, highly-rotated, partially-
appearing faces
System trained only on real faces, yet hand-drawn faces
sometimes detected
- Detecting rotated faces:
Train router network → Preprocess each input window →
Estimate orientation
Adjust window back to upright, frontal position →
Preprocess → Feed to main neural network

Face Detection using Integral image & AdaBoost learning
(Viola, Jones, 2004)
- Concepts:
Integral image ii:
Sum of pixel intensity values above & left of (x, y), inclusive
Any rectangular sum can be computed efficiently using the
integral image:
D = (A + B + C + D) - (A + C) - (A + B) + A
= ii(x1, y1) - ii(x1, y2 - 1) - ii(x2 - 1, y1) + ii(x2 - 1, y2 - 1)
Rectangular feature (Haar-like function):
Value = Σ Pixels in black region - Σ Pixels in white region
3 types: 2-rectangle, 3-rectangle, 4-rectangle
- Machine learning strategy analysis:
Training data: Set of 24×24 face/non-face images
Need to select rectangular features to represent each image:
Problem: 160000 potential features (different type + size) → Too
many
How to select a SINGLE feature + threshold → Best separate
positive (face) & negative (non-face) training examples?
Solution: Combine several weak classifiers (each with high
classification error) → Stronger classifier
- AdaBoost:
Algorithm to construct strong classifier as linear combination of
selected weak classifiers
Weak classifier:
xi: Training image
hj: Classification function utilizing a rectangular feature
Strong classifier:
sgn(x): sign function
Input:
(xi, yi): Training data (i = 1, 2, …, N)
hj: Weak classifier (j = 1, 2, …, P)
Training procedures:
Initialization:
COMP 4421 Page 52
M = No. of face images in training data
L = No. of nonface images in training data
For t = 1, 2, …, T:
Normalize all weights:
For each classifier hk (k = 1, 2, …, P):
Train → Minimize classification error
Select weak classifier ht with minimum error:
Set weight for ht:
Reweight each example:
Final strong classifier:
N = 108, P = 180000: T can be 200

Face Recognition
- 1D vector representative of mn-size face image: can reduce the
dimension (mn) to much lower → Perform Principal Component
Analysis (PCA) → ↓dimen
- Eigenfaces:
Given set of N mn-size images, each represented by 1D
vector of dimen mn
Mean image:
Scatter matrix:
Corresponding t eigenvectors with non-zero
eigenvalues, λ1 ≥ λ2 ≥ … ≥ λt
→ PCA transform to new space of lower
dimen (as t << mn)
Near Infrared-image for Face Recognition
- Infrared imaging system can produce face images in good
condition regardless of visible light in environment → Better
than normal use of color camera

- Face recognition using Eigenfaces:
Process image database with PCA → Eigenfaces
Given new image to be recognized:
Calculate coefs of the new image
Face detection: Small distance to face space → Face
If it is a face, find closest labeled face based on nearest
neighbor in t-dimen space
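Eigenfaces in miniature: a sketch with tiny 2D "images", using power iteration on the scatter matrix instead of a full eigendecomposition (data and dimensions are purely illustrative):

```python
import math

imgs = [[2.0, 0.0], [4.0, 1.0], [6.0, 2.0], [8.0, 3.0]]   # 4 "images", dim 2
n = len(imgs[0])
mean = [sum(v[i] for v in imgs) / len(imgs) for i in range(n)]
centered = [[v[i] - mean[i] for i in range(n)] for v in imgs]

# Scatter matrix: sum of outer products of the mean-centered vectors
S = [[sum(c[i] * c[j] for c in centered) for j in range(n)] for i in range(n)]

# Power iteration → dominant eigenvector = first "eigenface"
u = [1.0, 1.0]
for _ in range(50):
    u = [sum(S[i][j] * u[j] for j in range(n)) for i in range(n)]
    norm = math.sqrt(sum(x * x for x in u))
    u = [x / norm for x in u]

# Each face is then represented by its coefficient along u (dim 2 → dim 1)
coeffs = [sum(c[i] * u[i] for i in range(n)) for c in centered]
print([round(x, 2) for x in u])  # [0.89, 0.45]
```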
COMP 4421 Page 53
Iris Recognition
Thursday, May 19, 2016
5:30 PM

Daugman system (steps, objectives & methods)

Step 1: Iris Image Acquisition
Objectives:
- Capture high-quality image of iris:
Sufficient resolution + sharpness
Good contrast in interior iris pattern
Iris must be centered
- Non-invasive to human operator:
Lvl of illumination reflected to iris should not be
annoying
Capturing device should not be constraining
Method:
- LED-based point light source, carefully positioned to avoid reflection
off eyeglasses
- Live video feedback via tiny LCD
- Challenge: Iris is small (diameter ~1 cm)

Step 2: Iris Localization
Objective: Localize portion of acquired image corresponding to iris
Method:
- Step 1: Gradient-based edge detection (Obtain gradient magnitude +
Thresholding) → Convert image into binary edge map
Gradient magnitude |∇G(x, y) * I(x, y)|
∇: Del operator
G: 2D Gaussian
- Step 2: Circle Hough transform → Localize boundaries of limbic &
pupillary regions
(xj, yj): Edge point (j = 1, …, n)
g(xj, yj, xc, yc, r) = (xj - xc)² + (yj - yc)² - r²
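The circle Hough transform can be sketched as voting over (xc, yc, r) triples: each edge point votes for every candidate circle with g = 0, and the true circle collects the most votes. The edge points here are synthetic lattice points on a known circle:

```python
from collections import Counter

# 12 exact lattice points on the circle (xc, yc, r) = (10, 10, 5)
edges = [(15, 10), (14, 13), (13, 14), (10, 15), (7, 14), (6, 13),
         (5, 10), (6, 7), (7, 6), (10, 5), (13, 6), (14, 7)]

votes = Counter()
for (xj, yj) in edges:
    for xc in range(5, 16):
        for yc in range(5, 16):
            for r in range(3, 9):
                # vote when g(xj, yj, xc, yc, r) = 0
                if (xj - xc) ** 2 + (yj - yc) ** 2 == r ** 2:
                    votes[(xc, yc, r)] += 1

print(votes.most_common(1)[0])  # ((10, 10, 5), 12): the true circle wins
```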
Step 3: Pattern matching: Alignment
Objective: Bring newly acquired iris pattern into spatial alignment
with candidate database entry
Method:
- Map: Cartesian image coordinates (x, y) → Dimensionless polar image
coordinates (r, θ)
(Cut circle at θ = 0 and spread it to rectangular form)
x(r, θ) = (1 - r) xp(0, θ) + r xl(1, θ)
y(r, θ) = (1 - r) yp(0, θ) + r yl(1, θ)
r ∈ [0, 1]
θ ∈ [0, 2π)
(xp(0, θ), yp(0, θ)): Coordinates of pupillary boundary in
direction θ
(xl(1, θ), yl(1, θ)): Coordinates of limbic boundary in direction θ
- Why use this alignment: Invariant to shift, scaling & rotation
COMP 4421 Page 54
Shift (Offsets of eye parallel to camera's sensor): Hough transform
independent of iris position in whole image
Scaling (Offsets along camera's optical axis): Solved by transformation
to common representation (polar coordinates (r, θ))
Rotation (Angular deviation about optical axis): Just need to shift the
polar representation left/right
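A sketch of the Cartesian-to-polar "rubber sheet" mapping above, assuming concentric circular pupillary and limbic boundaries (centers and radii are made up):

```python
import math

def unwrap(cx, cy, r_pupil, r_limbic, n_r=4, n_theta=8):
    # sample (r, theta) on a grid and map each sample to the Cartesian
    # point x = (1-r)*xp + r*xl (likewise for y)
    grid = []
    for i in range(n_r):
        r = i / (n_r - 1)                      # r in [0, 1]
        row = []
        for j in range(n_theta):
            th = 2 * math.pi * j / n_theta     # theta in [0, 2*pi)
            xp = cx + r_pupil * math.cos(th)   # pupillary boundary point
            yp = cy + r_pupil * math.sin(th)
            xl = cx + r_limbic * math.cos(th)  # limbic boundary point
            yl = cy + r_limbic * math.sin(th)
            row.append(((1 - r) * xp + r * xl, (1 - r) * yp + r * yl))
        grid.append(row)
    return grid

g = unwrap(0.0, 0.0, r_pupil=2.0, r_limbic=6.0)
print(g[0][0])   # (2.0, 0.0): r = 0 lies on the pupil boundary
print(g[3][0])   # (6.0, 0.0): r = 1 lies on the limbic boundary
```

A rotation of the eye about the optical axis simply shifts the columns of this grid left or right, which is why the representation is rotation-friendly.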
Step 4: Pattern matching: Representation
Objective: Choose a representation of aligned iris patterns that makes
their distinctive patterns apparent
Method: Laplacian pyramid + Laplacian of Gaussian filter
Laplacian of Gaussian (LoG) filter:
LoG = -(1/(πσ⁴)) (1 - ρ²/(2σ²)) e^(-ρ²/(2σ²))
ρ: Radial dist of a point from filter's center
σ: sd
Laplacian pyramid:

Step 5: Pattern matching: Goodness of Match
Objective: Evaluate goodness of match between newly acquired &
database representations
Method: Normalized correlation
p1(i, j), p2(i, j): 2 images of size n×m
For each image pk: compute mean μk & sd σk
Normalized correlation between p1 & p2:
NC = Σi Σj (p1(i, j) - μ1)(p2(i, j) - μ2) / (n m σ1 σ2)
- With representations of acquired iris image & iris in database → 4 NCs,
used for identification
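A minimal normalized correlation between two small patches (the patch values are made up):

```python
import math

def norm_corr(p1, p2):
    # flatten, subtract each patch's mean, divide by n*m and the two sds
    v1 = [x for row in p1 for x in row]
    v2 = [x for row in p2 for x in row]
    n = len(v1)
    m1, m2 = sum(v1) / n, sum(v2) / n
    s1 = math.sqrt(sum((x - m1) ** 2 for x in v1) / n)
    s2 = math.sqrt(sum((x - m2) ** 2 for x in v2) / n)
    return sum((a - m1) * (b - m2) for a, b in zip(v1, v2)) / (n * s1 * s2)

a = [[1, 2], [3, 4]]
b = [[11, 12], [13, 14]]   # same pattern under a brightness offset
print(norm_corr(a, b))     # ≈ 1.0: NC is invariant to local gain/offset
```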
Step 6: Pattern matching: Decision
Objective: Decide if newly acquired data & database entry derived
from same iris, based on goodness of match
Method:
- Strategy formulation:
Training data: Extracted from database
Authentic class (A): Database entry derived from same iris as
the newly acquired
Imposter class (I): Database entry derived from different iris than
the newly acquired
Imagine you can make a binary decision along a line:
Apply to arbitrary sets of points: Draw a line ("weight vector") and
project all points onto that line
- Method: Find weight vector w such that for the 2 projected class
samples (I & A):
Variance within class minimized
COMP 4421 Page 55
Variance between different classes maximized
Within-class variance:
Total within-class variance:
Between-class variance:
* NOTE: the scatter matrices are 4×4
Optimization:
Projected values above the decision point → Class A
Below this point → Class I
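A sketch of the decision step as a Fisher linear discriminant on 2D toy match scores: project onto w = Sw⁻¹(mA - mI), which minimizes within-class variance and maximizes between-class variance (the real system works on the 4 NC values; the data here is made up):

```python
def mean(pts):
    n = len(pts)
    return [sum(p[i] for p in pts) / n for i in (0, 1)]

def within_scatter(pts, m):
    # sum of outer products of deviations from the class mean
    S = [[0.0, 0.0], [0.0, 0.0]]
    for p in pts:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in (0, 1):
            for j in (0, 1):
                S[i][j] += d[i] * d[j]
    return S

A = [(0.9, 0.8), (0.8, 0.9), (0.95, 0.9)]   # authentic-class scores
I = [(0.2, 0.3), (0.3, 0.2), (0.2, 0.2)]    # imposter-class scores
mA, mI = mean(A), mean(I)
Sw = [[a + b for a, b in zip(ra, rb)]
      for ra, rb in zip(within_scatter(A, mA), within_scatter(I, mI))]

# w = Sw^-1 (mA - mI), via the closed-form 2x2 inverse
det = Sw[0][0] * Sw[1][1] - Sw[0][1] * Sw[1][0]
d = [mA[0] - mI[0], mA[1] - mI[1]]
w = [(Sw[1][1] * d[0] - Sw[0][1] * d[1]) / det,
     (Sw[0][0] * d[1] - Sw[1][0] * d[0]) / det]

proj = lambda p: w[0] * p[0] + w[1] * p[1]
threshold = (proj(mA) + proj(mI)) / 2        # decision point on the line
print(all(proj(p) > threshold for p in A))   # True: A above the cut
print(all(proj(p) < threshold for p in I))   # True: I below it
```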
COMP 4421 Page 56