Convolution and Filtering
COS 429: Computer Vision
Figure credits: S. Lazebnik, S. Seitz, K. Grauman, and M. Hebert
Local Neighborhoods
• Hard to tell anything from a single pixel
– Example: you see a reddish pixel. Is this the object’s
color? Illumination? Noise?
• The next step in order of complexity is to look at
local neighborhood of a pixel
Linear Filters
• Given an image In(x,y) generate a
new image Out(x,y):
– For each pixel (x,y), Out(x,y) is a specific linear
combination of pixels in the neighborhood of In(x,y)
• This algorithm is
– Linear in input values (intensities)
– Shift invariant
Discrete Convolution
• This is the discrete analogue of convolution
∞
𝑓 𝑥 ∗ 𝑔 𝑥 = � 𝑓 𝑡 𝑔 𝑥 − 𝑡 𝑑𝑑
−∞
• Pattern of weights = “filter kernel”
• Will be useful in smoothing, edge detection
Example: Smoothing
Original: Mandrill Smoothed with
Gaussian kernel
Gaussian Filters
• One-dimensional Gaussian
1 𝑥2
− 2
𝐺1 𝑥 = 𝑒 2𝜎
2𝜋𝜎2
• Two-dimensional Gaussian
1 𝑥2 +𝑦2
−
𝐺2 𝑥 = 𝑒 2𝜎2
2𝜋𝜎2
Gaussian Filters
Gaussian Filters
Gaussian Filters
• Gaussians are used because:
– Smooth
– Decay to zero rapidly
– Simple analytic formula
– Central limit theorem: limit of applying (most) filters
multiple times is some Gaussian
– Separable:
𝐺2 𝑥, 𝑦 = 𝐺1 𝑥 𝐺1 𝑦
Computing Discrete Convolutions
𝑖𝑚𝑚𝑚 𝑗𝑚𝑚𝑚
Out 𝑥, 𝑦 = � � 𝑓 𝑖, 𝑗 In(𝑥 − 𝑖, 𝑦 − 𝑗)
𝑖=𝑖𝑚𝑚𝑚 𝑗=𝑗𝑚𝑚𝑚
• What happens near edges of image?
– Ignore (Out is smaller than In)
– Pad with zeros (edges get dark)
– Replicate edge pixels
– Wrap around
– Reflect
– Change filter
Computing Discrete Convolutions
𝑖𝑚𝑚𝑚 𝑗𝑚𝑚𝑚
Out 𝑥, 𝑦 = � � 𝑓 𝑖, 𝑗 In(𝑥 − 𝑖, 𝑦 − 𝑗)
𝑖=𝑖𝑚𝑚𝑚 𝑗=𝑗𝑚𝑚𝑚
• What happens if kernel is infinite?
– Truncate when filter falls off to near zero
– For Gaussian, typical support between 2σ and 3σ
Source: K. Grauman
Computing Discrete Convolutions
𝑖𝑚𝑚𝑚 𝑗𝑚𝑚𝑚
Out 𝑥, 𝑦 = � � 𝑓 𝑖, 𝑗 In(𝑥 − 𝑖, 𝑦 − 𝑗)
𝑖=𝑖𝑚𝑚𝑚 𝑗=𝑗𝑚𝑚𝑚
• How long does it take?
– If In is n×n, f is m×m, naive computation takes time
O(m2n2)
– OK for small filter kernels, bad for large ones
Fourier Transforms
• Define Fourier transform of function f as
∞
1
𝐹 𝜔 =F 𝑓 𝑥 = � 𝑓 𝑥 𝑒𝑖𝜔𝜔 𝑑𝑑
2𝜋 −∞
• F is a function of frequency – describes how much
of each frequency is contained in f
• Fourier transform is invertible
Fourier Transform and Convolution
• Fourier transform turns convolution
into multiplication:
F 𝑓 𝑥 ∗𝑔 𝑥 =F 𝑓 𝑥 F 𝑔 𝑥
Fourier Transform and Convolution
• Useful application #1: Use frequency space to
understand effects of filters
– Example: Fourier transform of a Gaussian
is a Gaussian
– Thus: attenuates high frequencies
Amplitude
Amplitude
Amplitude
× =
Frequency Frequency Frequency
Fourier Transform and Convolution
• Useful application #2: Efficient computation
– Fast Fourier Transform (FFT) takes time
O(n log n)
– Thus, convolution can be performed in time
O(n log n + m log m)
– Greatest efficiency gains for large filters (m ~ n)
Alternative: Median Filtering
• A median filter operates over a window by selecting
the median intensity in the window
Credit: K. Grauman
Median Filter
Salt-and-pepper noise Median filtered
Credit: M. Hebert
Gaussian vs. Median filtering
3x3 5x5 7x7
Gaussian
Median
Edge Detection
Winter in Kraków photographed by Marcin Ryczek
Origin of Edges
• Edges are caused by a variety of factors:
surface normal discontinuity
depth discontinuity
surface color discontinuity
illumination discontinuity
Credit: Steve Seitz
Edge Detection
• Intuitively, much of semantic
and shape information is
available in the edges
• Ideal: artist’s line drawing
(but artist is also using
object-level knowledge)
• But what, mathematically,
is an edge?
Credit: D. Lowe
What is an Edge?
Edge easy to find
What is an Edge?
Where is edge? Single pixel wide or multiple pixels?
What is an Edge?
Noise: have to distinguish noise from actual edge
What is an Edge?
Is this one edge or two?
What is an Edge?
Texture discontinuity
Formalizing Edge Detection
• Look for strong step edges
𝑑𝑑
>𝜏
𝑑𝑑
• One pixel wide: look for maxima in dI / dx
• Noise rejection: smooth (with a Gaussian)
over a neighborhood of size σ
Canny Edge Detector
• Smooth
• Find derivative
• Find maxima
• Threshold
Canny Edge Detector
• First, smooth with a Gaussian of some width σ
Canny Edge Detector
• Next, find “derivative”
• What is derivative in 2D? Gradient:
𝜕𝜕 𝜕𝜕
∇𝑓 𝑥, 𝑦 = ,
𝜕𝜕 𝜕𝜕
Canny Edge Detector
• Useful fact #1: differentiation
“commutes” with convolution
𝑑𝑑 𝑑 𝑑𝑑
∗𝑔= 𝑓∗𝑔 =𝑓∗
𝑑𝑑 𝑑𝑑 𝑑𝑑
• Useful fact #2: Gaussian is
separable:
𝐺2 𝑥, 𝑦 = 𝐺1 𝑥 𝐺1 𝑦
Canny Edge Detector
• Thus, combine first two stages of Canny:
𝑓 𝑥, 𝑦 ∗ 𝐺′1 𝑥 𝐺1 𝑦
∇ 𝑓 𝑥, 𝑦 ∗ 𝐺2 𝑥, 𝑦 =
𝑓 𝑥, 𝑦 ∗ 𝐺1 𝑥 𝐺′1 𝑦
𝑓 𝑥, 𝑦 ∗ 𝐺′1 𝑥 ∗ 𝐺1 𝑦
=
𝑓 𝑥, 𝑦 ∗ 𝐺1 𝑥 ∗ 𝐺′1 𝑦
Canny Edge Detector
Original Image Smoothed Gradient Magnitude
Canny Edge Detector
• Nonmaximum suppression
– Eliminate all but local maxima in gradient magnitude
(sqrt of sum of squares of x and y components)
– At each pixel p look along direction of gradient:
if either neighbor is bigger, set p to zero
– In practice, quantize direction to
horizontal, vertical, and two diagonals
– Result: “thinned edge image”
Canny Edge Detector
• Final stage: thresholding
• Simplest: use a single threshold
• Better: use two thresholds
– Find chains of touching edge pixels, all ≥ τ low
– Each chain must contain at least one pixel ≥ τ high
– Helps eliminate dropouts in chains, without being too
susceptible to noise
– “Thresholding with hysteresis”
Canny Edge Detector
Original Image Edges
Faster Edge Detectors
• Can build simpler, faster edge detector by omitting
some steps:
– No nonmaximum suppression
– No hysteresis in thresholding
– Simpler filters (approx. to gradient of Gaussian)
1 0 −1 1 2 1
• Sobel: 2 0 −2 0 0 0
1 0 −1 −1 −2 −1
• Roberts:
1 0 0 −1
0 −1 1 0
Second-Derivative-Based Edge Detectors
• To find local maxima in derivative, look for zeros in
second derivative
• Analogue in 2D: Laplacian
𝜕2𝑓 𝜕2𝑓
∇2𝑓 𝑥, 𝑦 = 2 + 2
𝜕𝜕 𝜕𝜕
• Marr-Hildreth edge detector
LOG
• As before, combine Laplacian with Gaussian
smoothing: Laplacian of Gaussian (LOG)
LOG
• As before, combine Laplacian with Gaussian
smoothing: Laplacian of Gaussian (LOG)
LOG vs. DOG
• Laplacian of Gaussian sometimes approximated by
Difference of Gaussians
∇2𝐺1(𝑥, 𝜎)
2 �𝐺1 𝑥, 𝜎 2 −
𝐺1 𝑥, 𝜎/ 2 �
Problems with Laplacian Edge Detectors
• Distinguishing local minimum vs. maximum
• Symmetric – poor performance near corners
• Sensitive to noise
– Higher-order derivatives = greater noise sensitivity
– Combines information along edge, not just perpendicular
Image gradients vs. meaningful contours
• Berkeley segmentation database:
http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/segbench/
image human segmentation gradient magnitude
Data-Driven Edge Detection
Input images
Ground truth
Training data
Output
P. Dollar and L. Zitnick, Structured forests for fast edge detection, ICCV 2013