CSE 483: Computer Vision
Image Segmentation
Prof. Mahmoud Khalil
Spring 2022
Segmentation Goals
Goal: identify groups of pixels that go together
Image Segmentation – Point Detection
This is not a mathematically proven filter; it is just made for the context of the current situation. It matches a point that is white and surrounded by all black. If the region is uniform, all weights fire and we get 8*v + 8*(-1)*v = 0 for gray level v. Otherwise the response is nonzero (assuming the usual mask with 8 at the center and -1 around it, an isolated white pixel on black gives 8 * 255).
This filter gives its highest response only for the pattern I want to detect, and smaller responses otherwise, so I can just threshold the result to eliminate them. The hyper-parameters (HPs) are the threshold and the filter size.
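The point-detection idea can be sketched in a few lines of plain Python. This is only an illustration: the mask (8 at the center, -1 around it), the helper names `correlate3x3` and `detect_points`, and the tiny test images are assumptions, not taken from the slides.

```python
# Assumed mask: 8 at the center, -1 for the eight neighbors.
POINT_MASK = [[-1, -1, -1],
              [-1,  8, -1],
              [-1, -1, -1]]

def correlate3x3(img, mask):
    """Response of a 3x3 mask at every interior pixel (borders stay 0)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(mask[j][i] * img[y + j - 1][x + i - 1]
                            for j in range(3) for i in range(3))
    return out

def detect_points(img, threshold):
    """Coordinates (row, col) whose mask response reaches the threshold."""
    resp = correlate3x3(img, POINT_MASK)
    return [(y, x) for y, row in enumerate(resp)
            for x, r in enumerate(row) if r >= threshold]

# One white pixel on a black background, and a uniform white image.
isolated = [[0] * 5 for _ in range(5)]
isolated[2][2] = 255
flat = [[255] * 5 for _ in range(5)]

print(detect_points(isolated, 8 * 255))  # only the isolated pixel survives
print(detect_points(flat, 8 * 255))      # uniform region: response is 0 everywhere
```

The neighbors of the white pixel do respond (with -255), which is exactly the kind of smaller response the threshold is meant to remove.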
Image Segmentation – Line Detection
Max response (the mask sits exactly on a line) = 6 * 255, i.e. three line pixels weighted 2 each on a black background, so let's apply a threshold on that value.
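A small sketch of where the 6 * 255 comes from, assuming the usual horizontal line mask (2 along the middle row, -1 elsewhere). The mask, the function name `line_response`, and the test image are illustrative assumptions.

```python
# Assumed horizontal line mask: the middle row is weighted 2, the rest -1.
H_LINE_MASK = [[-1, -1, -1],
               [ 2,  2,  2],
               [-1, -1, -1]]

def line_response(img, y, x, mask=H_LINE_MASK):
    """Mask response centered at interior pixel (y, x)."""
    return sum(mask[j][i] * img[y + j - 1][x + i - 1]
               for j in range(3) for i in range(3))

# A one-pixel-wide white horizontal line on a black background.
img = [[0] * 5 for _ in range(5)]
img[2] = [255] * 5

on_line = line_response(img, 2, 2)    # 3 pixels * 2 * 255
off_line = line_response(img, 1, 2)   # the line falls under a row of -1 weights

threshold = 6 * 255
hits = [(y, x) for y in range(1, 4) for x in range(1, 4)
        if line_response(img, y, x) >= threshold]
print(on_line, off_line, hits)
```

Only the pixels lying on the line reach the maximum, so thresholding at (or slightly below) that value keeps the line and drops everything else.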
Image Segmentation – Edge Detection
Apply a threshold to the 1st-order derivative to detect edges. But the response stays constant along the whole ramp; are these multiple edges?
The edge is easier to locate with the 2nd-order derivative: it produces two impulses (one at each end of the ramp), and the zero-crossing of the line connecting them gives the center of the edge. The catch is that the 2nd-order derivative is very sensitive to noise.
We can deal with the noise problem by doing average blurring as a pre-processing step before edge detection. Obviously you won't see the effect of the blurring with your eyes, since you didn't even see the noise in the first place. This is machine-level stuff.
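The ramp behavior described above is easy to check on a 1D signal. The sample values and the simple forward-difference helper `diff` below are assumptions chosen for illustration.

```python
def diff(signal):
    """Forward difference: d[i] = signal[i + 1] - signal[i]."""
    return [b - a for a, b in zip(signal, signal[1:])]

# A dark-to-bright ramp edge (hypothetical sample values).
ramp = [0, 0, 0, 50, 100, 150, 200, 200, 200]

first = diff(ramp)     # nonzero and constant along the whole ramp
second = diff(first)   # one positive impulse, one negative impulse

print(first)   # [0, 0, 50, 50, 50, 50, 0, 0]
print(second)  # [0, 50, 0, 0, 0, -50, 0]

# The zero-crossing lies midway between the two impulses.
# second[i] is centered on ramp[i + 1], hence the + 1.
pos = second.index(max(second))
neg = second.index(min(second))
center = (pos + neg) // 2 + 1
print(center, ramp[center])  # index of the edge center and its gray level
```

Thresholding `first` marks four samples as "edge", while the zero-crossing of `second` picks out a single center position.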
Edge Detection
(Gradient Operators)
They all come in pairs because we use two perpendicular directions to detect edges: one filter alone may only detect one direction, and with the responses of both we can also work out the edge's direction. Roberts works along +45 and -45 degrees; the other two work along the vertical and horizontal directions.
Partial Derivatives of an Image
(-1, 1) implies that when you see a transition from 0 to 255 the result is white (positive), which means the beginning of an edge will be white, and the end will be black... (2nd order)
Image Gradient
The triangle (∇) means 1st order; a superscript means a higher order (so a superscript of 2 means 2nd order, 3 = 3rd, etc.).
Designing an Edge Detector
This kernel is that of a smoothing filter.
The result of applying convolution with this filter is elimination of noise (good) and widening of the gradient (semi-bad). It's semi-bad because the gradient is wider, but we can still get the edge with the zero-crossing in second-order detection.
After this step (1st-order derivative), as mentioned before, we should apply a threshold so we can get useful information.
Derivative Theorem of Convolution
In this slide we see that, since the smoothing filter and the 1st-order derivative (1st O.D.) are both linear operations, merging them into one kernel gives the same result.
This is the 1st O.D. of the Gaussian (smoothing) kernel.
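A quick numerical check of this merging property in 1D. The signal, the 3-tap smoothing kernel (a stand-in for a Gaussian), and the helper `convolve` are illustrative assumptions.

```python
def convolve(f, h):
    """Full 1D discrete convolution of two lists."""
    out = [0.0] * (len(f) + len(h) - 1)
    for i, fv in enumerate(f):
        for j, hv in enumerate(h):
            out[i + j] += fv * hv
    return out

signal = [0, 0, 10, 12, 9, 60, 58, 61, 60]  # hypothetical noisy step
smooth = [0.25, 0.5, 0.25]                   # small smoothing kernel
deriv = [1, -1]                              # finite-difference derivative

# Smooth first, then differentiate (two passes over the signal)...
two_passes = convolve(convolve(signal, smooth), deriv)
# ...versus differentiating the kernel once and filtering in a single pass.
merged_kernel = convolve(smooth, deriv)
one_pass = convolve(signal, merged_kernel)

print(merged_kernel)
print(max(abs(a - b) for a, b in zip(two_passes, one_pass)))
```

The merged kernel only has to be computed once, which saves one full pass over the image.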
Laplacian of Gaussian
As I was saying, the 2nd O.D.'s zero-crossing will counteract the widening of the ramp, and since the noise was removed by blurring, it's not much of an issue to actually use the 2nd O.D. An additional bonus is that we can get the direction of the edge from the slope at the zero-crossing. With the 1st O.D., the response would've been negative or positive based on direction (which means the threshold in that scenario should've been applied on absolute values).
This is the 2nd O.D. of the Gaussian (smoothing) kernel.
2D Edge Detection Filters
Source: Steve Seitz
Edge by Derivative of Gaussian
Smoothing with a Gaussian
σ is equivalent to the amount of smoothing.
Source: Kristen Grauman, UT-Austin
Effect of σ on derivatives
Source: Kristen Grauman, UT-Austin
Edge Detection
Generally speaking, edge detection is a pre-processing step.
• Criteria for optimal edge detection (Canny 86):
  - Good detection:
    • minimize the probability of false positives (detecting spurious edges caused by noise) and of false negatives (missing real edges)
  - Good localization:
    • edges must be detected as close as possible to the true edges
  - Single response constraint:
    • minimize the number of local maxima around the true edge (i.e. the detector must return a single point for each true edge point)
Edge Detection
• Examples:
This one is the result of a small threshold or too much smoothing.
CannyBoii doesn't do anything new, this is
what we've been doing for the past 20 slides
or so... Smoothen-->Deriv.-->Calc. Mag. + Dir.
Canny Edge Detector
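• Smooth by Gaussian
  $S = G_\sigma * I$, where $G_\sigma(x, y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2 + y^2}{2\sigma^2}}$
• Compute x and y derivatives
  $\nabla S = \begin{bmatrix} \frac{\partial S}{\partial x} & \frac{\partial S}{\partial y} \end{bmatrix}^T = \begin{bmatrix} S_x & S_y \end{bmatrix}^T$
• Compute gradient magnitude and orientation
  $|\nabla S| = \sqrt{S_x^2 + S_y^2}$, $\theta = \tan^{-1} \frac{S_y}{S_x}$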
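A minimal sketch of the magnitude and orientation step, assuming the image has already been smoothed. Central differences stand in for the derivative filters here, and the function name `gradient` and the test image are illustrative. `math.atan2` is used instead of a plain arctangent so that S_x = 0 causes no division by zero and the quadrant is preserved.

```python
import math

def gradient(img):
    """Central-difference S_x, S_y at interior pixels, then magnitude and angle."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    theta = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            sx = (img[y][x + 1] - img[y][x - 1]) / 2   # horizontal change
            sy = (img[y + 1][x] - img[y - 1][x]) / 2   # vertical change
            mag[y][x] = math.hypot(sx, sy)             # sqrt(sx^2 + sy^2)
            theta[y][x] = math.atan2(sy, sx)           # orientation in radians
    return mag, theta

# Vertical step edge: dark on the left, bright on the right.
step = [[0, 0, 255, 255] for _ in range(3)]
mag, theta = gradient(step)
print(mag[1][1], theta[1][1])  # gradient points along +x, across the edge
```

The orientation comes out as 0 radians, i.e. the gradient is perpendicular to the vertical edge.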
Canny Edge Operator
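$\nabla S = \nabla (G_\sigma * I) = \nabla G_\sigma * I$
$\nabla G_\sigma = \begin{bmatrix} \frac{\partial G_\sigma}{\partial x} & \frac{\partial G_\sigma}{\partial y} \end{bmatrix}^T$
$\nabla S = \begin{bmatrix} \frac{\partial G_\sigma}{\partial x} * I & \frac{\partial G_\sigma}{\partial y} * I \end{bmatrix}^T$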
Non‐maximum suppression for each orientation
At q, we have a maximum if the value is larger than those at both p and at r. Interpolate to get these values.
Now that's what's new about CannyBoii. After detecting the edge (step 3), he gets the gradient direction at each point (i.e. the perpendicular to the edge at that point) and applies non-maximum suppression along it, which simply means "if you're not the max. value among your neighbors, you get suppressed". This basically fixes the problem of "too many responses".
Source: D. Forsyth
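A simplified sketch of non-maximum suppression. Instead of interpolating the values at p and r as the slide suggests, it rounds the gradient direction to the nearest of four axes and compares against the two pixel neighbors on that axis; this shortcut, the function name, and the test data are assumptions.

```python
import math

def non_max_suppression(mag, theta):
    """Keep a pixel only if it is >= both neighbors along its gradient axis.

    mag, theta: 2D lists (gradient magnitude, orientation in radians).
    Border pixels are left at 0.
    """
    h, w = len(mag), len(mag[0])
    out = [[0.0] * w for _ in range(h)]
    # (dy, dx) steps for the 0, 45, 90 and 135 degree axes (y grows downward).
    steps = [(0, 1), (1, 1), (1, 0), (1, -1)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            k = round(theta[y][x] / (math.pi / 4)) % 4  # nearest axis
            dy, dx = steps[k]
            p = mag[y - dy][x - dx]
            r = mag[y + dy][x + dx]
            if mag[y][x] >= p and mag[y][x] >= r:
                out[y][x] = mag[y][x]
    return out

# A 3-pixel-wide response to a vertical edge; the gradient points along x.
mag = [[0, 1, 3, 1, 0] for _ in range(3)]
theta = [[0.0] * 5 for _ in range(3)]
thin = non_max_suppression(mag, theta)
print(thin[1])  # only the ridge center keeps its value
```

Using `>=` rather than a strict comparison keeps flat-topped ridges from disappearing entirely, at the cost of leaving plateaus more than one pixel wide.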
Edge linking
Assume the marked point is an edge point. Then we construct the tangent to the edge curve (which is normal to the gradient at that point) and use this to predict the next points (here either r or s).
And so after you're done thinning at that point, to get to the following point simply move in the direction of the edge, which is the tangent... (keep reading)
We use it to predict which of the already-existing points to go to next, i.e. the one that is expected to be part of the thinned edge.
Source: D. Forsyth
Non‐Maximum Suppression Examples
Source: D. Forsyth
The second additional step that CannyBoii did
was this "Hysteresis Thresholding" thing...
Thresholding
[Figure panels: High Threshold (Strong Edges); Small Threshold (Weak Edges)]
Source: D. Forsyth
Hysteresis thresholding
• Use a high threshold to start edge curves and a low threshold to continue them.
[Figure labels: "Start"; "Connects strong edges"]
Source: S. Seitz
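Hysteresis thresholding can be sketched as a flood fill: pixels above the high threshold seed the curves, and the fill continues through 8-connected neighbors that are above the low threshold. The function name, the breadth-first formulation, and the sample values are illustrative assumptions.

```python
from collections import deque

def hysteresis(mag, low, high):
    """Boolean edge map: strong pixels plus weak pixels connected to them."""
    h, w = len(mag), len(mag[0])
    edge = [[False] * w for _ in range(h)]
    queue = deque()
    # The high threshold starts edge curves.
    for y in range(h):
        for x in range(w):
            if mag[y][x] >= high:
                edge[y][x] = True
                queue.append((y, x))
    # The low threshold continues them through 8-connected neighbors.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edge[ny][nx] and mag[ny][nx] >= low):
                    edge[ny][nx] = True
                    queue.append((ny, nx))
    return edge

# One strong pixel (90), weak pixels attached to it (40, 35),
# and an isolated weak pixel (38) that should be dropped.
mag = [[0, 90, 40, 35, 0, 38]]
edges = hysteresis(mag, low=30, high=80)
print(edges[0])
```

The isolated weak pixel passes the low threshold but is never reached from a strong one, so it is discarded.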
Hysteresis Thresholding
Source: L. Fei-Fei
Canny Edge Detector
[Figure: derivative images S_x and S_y]
Canny edge detector
0. Pre-processing: Smoothen w/ Gaussian
1. Filter image with x, y derivatives of Gaussian
2. Find magnitude and orientation of gradient
3. Non‐maximum suppression:
• Thin multi‐pixel wide “ridges” down to single pixel width
4. Thresholding and linking (hysteresis):
• Define two thresholds: low and high
• Use the high threshold to start edge curves and the low threshold to continue them
• MATLAB: edge(image, 'canny')
Effect of σ (Gaussian kernel spread/size)
[Figure panels: original; Canny at two different values of σ]
The choice of σ depends on desired behavior
• large σ detects large scale edges
• small σ detects fine features
Low‐level Edges vs. Perceived Contours
Humans implicitly segment things in their incredible brains and then re-draw based on the semantics they conclude. Machines can't do that; they must analyze the image first and only then get to the semantics.
Source: L. Lazebnik
Thresholding
Global Thresholding
The Role of Illumination
Global thresholding has no notion of localization. To it, there are a bunch of grays and a bunch of blacks; it can't tell which is background and which is foreground.
Adaptive Thresholding
Global thresholding is bound to fail here. Just cutting off at a certain gray level, no matter which level you choose, will take out part of the object, because the object shares part of the intensity spectrum with the background.
Yay, more hyper-parameters :)! The new one is the size of the grid cells within which you do your histogram thresholding. The purpose of the grid is to localize the threshold (adapt it within the image). It's still failing though, so let's try and fix it up.
[Figure labels: cells with a decent amount of background w.r.t. foreground; cells with little background w.r.t. foreground]
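A minimal sketch of grid-based adaptive thresholding. The per-cell threshold here is simply the cell mean, which is an assumption standing in for whatever histogram-based rule is applied per cell; the function name and the test image are also illustrative.

```python
def adaptive_threshold(img, cell):
    """Binarize each cell x cell block with its own threshold."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y0 in range(0, h, cell):
        for x0 in range(0, w, cell):
            ys = range(y0, min(y0 + cell, h))
            xs = range(x0, min(x0 + cell, w))
            values = [img[y][x] for y in ys for x in xs]
            t = sum(values) / len(values)  # local threshold: cell mean (assumption)
            for y in ys:
                for x in xs:
                    out[y][x] = 1 if img[y][x] > t else 0
    return out

# Illumination gets brighter to the right; in each 2x2 cell the right
# column is the (relatively) bright foreground.
img = [[10, 50, 100, 140],
       [10, 50, 100, 140]]

local = adaptive_threshold(img, cell=2)
global_t = [[1 if v > 75 else 0 for v in row] for row in img]
print(local)     # foreground found in both cells
print(global_t)  # a single cutoff just splits the image into left and right
```

A cell that contains only background (or only foreground) still gets split by its own mean, which is one way the grid approach keeps failing when the background is small relative to the foreground.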
Optimal Global Adaptive Thresholding
Given a histogram plot with overlapping regions p1 and p2, where would the threshold be? So far we've only been analyzing the case where the regions are separated (in fact that's how we detected that they are separate regions in the first place...).
We could keep trying to place T at different points on the z-axis, and see how large the two regions would be if they were cut off at that position (so for p1, its right end would be T, and for p2, its left end would be T). Mathematically, this can be represented by:
$e(T) = \int_{-\infty}^{T} p_1(z)\,dz + \int_{T}^{\infty} p_2(z)\,dz$
So, we want to maximize this function. We call it e(T) because it is a function of T (not z, because z is only the integration variable!). Simple maxima and minima: set the derivative with respect to T to 0, which gives p1(T) = p2(T), and voila.
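The search over T can be sketched numerically. The example below assumes, purely for illustration, that p1 and p2 are Gaussians with known means and spreads, and scores each T by the mass of p1 to its left plus the mass of p2 to its right; the function names and parameter values are hypothetical.

```python
import math

def gaussian_cdf(z, mu, sigma):
    """Area under a Gaussian density to the left of z."""
    return 0.5 * (1.0 + math.erf((z - mu) / (sigma * math.sqrt(2.0))))

def separated_mass(t, mu1, s1, mu2, s2):
    """Mass of p1 left of t plus mass of p2 right of t."""
    return gaussian_cdf(t, mu1, s1) + (1.0 - gaussian_cdf(t, mu2, s2))

def best_threshold(mu1, s1, mu2, s2, levels=range(256)):
    """Try every gray level and keep the one with the largest score."""
    return max(levels, key=lambda t: separated_mass(t, mu1, s1, mu2, s2))

# Dark region centered at 80, bright region centered at 160, equal spread.
t_star = best_threshold(80, 20, 160, 20)
print(t_star)
```

With equal spreads the two densities cross halfway between the means, so the brute-force search lands on the same T that setting the derivative to zero would give.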
All good, but in practice, how do we know which region is p1 and which region is p2? ArTiFiCiAL iNTeLLiGeNcE.
Basically, segment a ton of similar images by hand, and then use those human segmentations to guide your algorithm so it can "guess" better.