Markov Random Fields
Erkut Erdem
• The data term Edata(u) expresses our goal that the optimal model u be consistent with the measurements.
• The smoothness energy Esmoothness(u) is derived from our prior knowledge about plausible solutions.
Sample Vision Tasks
• Denoising: Given a noisy image Î(x, y), where some measurements may be missing, recover the original image I(x, y), which is typically assumed to be smooth.
• Stereo Disparity: Given two images of a scene, find the binocular disparity.
• Surface Reconstruction
• …
Markov Random Fields

A Markov Random Field (MRF) is a graph G = (V, E).

• V = {1, 2, ..., N} is the set of nodes, each of which is associated with a random variable (RV), u_j, for j = 1...N.

• The neighbourhood of node i, denoted N_i, is the set of nodes to which i is adjacent; i.e., j ∈ N_i if and only if (i, j) ∈ E.

• The Markov Random Field satisfies

    p(u_i | {u_j}_{j ∈ V\i}) = p(u_i | {u_j}_{j ∈ N_i}) .    (1)

N_i is often called the Markov blanket of node i.

Bayesian filtering (see the tracking notes) used a special class of MRFs for which the graph was a chain. The joint distribution over the RVs of a first-order Markov chain can be factored into a product of conditional distributions.

Markov Random Fields (cont'd)

The distribution over an MRF (i.e., over the RVs u = (u_1, ..., u_N)) that satisfies (1) can be expressed as the product of (positive) potential functions defined on maximal cliques of G [Hammersley-Clifford Thm].

Such distributions are often expressed in terms of an energy function E and clique potentials Ψ_c:

    p(u) = (1/Z) exp(−E(u, θ)) ,  where  E(u, θ) = Σ_{c ∈ C} Ψ_c(ū_c, θ_c) .    (2)

Here,

• C is the set of maximal cliques of the graph (i.e., maximal subgraphs of G that are fully connected),

• The clique potential Ψ_c, c ∈ C, is a non-negative function defined on the RVs in clique ū_c, parameterized by θ_c,

• Z, the partition function, ensures the distribution sums to 1:

    Z = Σ_{u_1,...,u_N} Π_{c ∈ C} exp(−Ψ_c(ū_c, θ_c)) .

The partition function is important for learning as it's a function of the parameters θ = {θ_c}_{c ∈ C}. But often it is not critical for inference. Inference with MRFs is challenging, as useful factorizations of the joint distribution are generally not available.

Image Denoising

• Image restoration: Given a noisy image v, perhaps with missing pixels, recover an image u that is both smooth and close to v.

• We let each pixel be a node in a graph G = (V, E), with 4-connected neighbourhoods. The maximal cliques are pairs of nodes, and the pairwise potentials V(u_i, u_j), summed over the edges (i, j) ∈ E, provide a definition of smoothness between pixels and their neighbours.

• Classical techniques:
– Linear filtering (e.g. Gaussian filtering)
– Median filtering
– Wiener filtering
– PDE-based techniques
– Wavelet techniques

• Modern techniques:
– Non-local methods
– Denoising/smoothing techniques that preserve edges in images
– MRF-based techniques
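Before denoising is formulated probabilistically in the next section, here is a minimal sketch, not from the notes, of the Gibbs form (2) on a tiny chain MRF: the binary node values and the particular pairwise potential are illustrative choices, and Z is computed by brute-force enumeration.

```python
# A minimal sketch (illustrative potential and binary labels, not from the notes):
# the Gibbs form (2) on a 4-node chain MRF, with Z computed by enumeration.
import itertools
import math

def psi(ui, uj):
    """Pairwise clique potential: penalize neighbouring nodes that disagree."""
    return 0.0 if ui == uj else 1.0

def energy(u):
    """E(u) = sum of clique potentials; the maximal cliques of a chain are its edges."""
    return sum(psi(u[i], u[i + 1]) for i in range(len(u) - 1))

states = list(itertools.product([0, 1], repeat=4))      # all configurations
Z = sum(math.exp(-energy(u)) for u in states)           # partition function
p = {u: math.exp(-energy(u)) / Z for u in states}       # p(u) = exp(-E(u)) / Z
print(p[(0, 0, 0, 0)] > p[(0, 1, 0, 1)])                # smoother configurations are more probable
```

Enumerating all configurations is only feasible for toy graphs, which is one reason the partition function matters for learning but is usually avoided during inference.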
Denoising as Probabilistic Inference

• Perform maximum a posteriori (MAP) estimation by maximizing the a posteriori distribution:

    p(true image | noisy image) = p(u | v)

• By Bayes' theorem:

    p(u | v) = p(v | u) p(u) / p(v) ,

where p(v | u) is the likelihood of the noisy image given the true image, p(u) is the image prior, and p(v) is a normalization term.

• If we take the logarithm, log p(u | v) = log p(v | u) + log p(u) − log p(v), so maximizing the posterior amounts to maximizing the log likelihood plus the log prior; the normalization term does not depend on u.
Modeling the Likelihood

• We assume that the noise at one pixel is independent of the others:

    p(v | u) = Π_{i,j} p(v_{i,j} | u_{i,j})

(Figure: example noise patterns that are probable vs. improbable under this independence assumption.)
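As a small sketch (with an assumed noise standard deviation), the factorized likelihood above makes the negative log likelihood a per-pixel sum; for Gaussian noise it reduces to the quadratic data term that reappears in equation (5) below.

```python
# A small sketch: under IID Gaussian noise, -log p(v | u) is a per-pixel sum and,
# up to an additive constant, a quadratic data term. sigma is an assumed value.
import numpy as np

def neg_log_likelihood(u, v, sigma=1.0):
    """-log p(v | u) for v_ij = u_ij + N(0, sigma^2), dropping constant terms."""
    return np.sum((v - u) ** 2) / (2.0 * sigma ** 2)
```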
Natural Images

• What distinguishes "natural" images from "fake" ones?
• We can take a large database of natural images and study them.
Image Denoising

(Figure: the MRF graph for denoising. Hidden nodes u_j are connected to the observed pixels v_{i,j} of the noisy image; the edges between u and v represent the likelihood, assuming conditional independence of the observations, and the edges between neighbouring u nodes represent the prior.)

Accordingly, the energy function is given by a sum of likelihood terms, one per observed pixel, plus prior terms defined on neighbouring pairs of hidden nodes.
• Several options exist for the MAP estimation process, which minimizes E(u) and thereby maximizes p(u | v), since, up to a constant, E is equal to the negative log posterior:
– Gradient techniques (see the sketch after this list)
– Gibbs sampling
– Simulated annealing
– Belief propagation
– Graph cut
– …
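A minimal sketch of the first option: gradient descent on the 1D quadratic energy E(u) of equation (5) in the next section. The step size, λ, and iteration count below are illustrative values.

```python
# A minimal sketch of the "gradient techniques" option: gradient descent on the
# 1D quadratic energy E(u) = sum_n (u_n - v_n)^2 + lam * sum_n (u_{n+1} - u_n)^2.
# The step size, lam, and iteration count are illustrative values.
import numpy as np

def grad_E(u, v, lam):
    g = 2.0 * (u - v)          # derivative of the data term
    d = np.diff(u)             # u_{n+1} - u_n
    g[:-1] -= 2.0 * lam * d    # each difference term touches two nodes
    g[1:] += 2.0 * lam * d
    return g

def map_by_gradient_descent(v, lam=1.0, step=0.05, iters=1000):
    u = v.astype(float)        # start from the noisy observations
    for _ in range(iters):
        u = u - step * grad_E(u, v, lam)
    return u
```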
Quadratic Potentials in 1D

Let v be the sum of a smooth 1D signal u and IID Gaussian noise e:

    v = u + e ,    (4)

where u = (u_1, ..., u_N), v = (v_1, ..., v_N), and e = (e_1, ..., e_N).

With Gaussian IID noise, the negative log likelihood provides a quadratic data term. If we let the smoothness term be quadratic as well, then up to a constant, the negative log posterior is

    E(u) = Σ_{n=1}^{N} (u_n − v_n)² + λ Σ_{n=1}^{N−1} (u_{n+1} − u_n)² .    (5)

A good solution u* should be close to v, and adjacent nodes on the grid should have similar values. The constant λ > 0 controls the tradeoff between smoothness and data fit.

To find the optimal u*, we take derivatives of E(u) with respect to u_n:

    ∂E(u)/∂u_n = 2 (u_n − v_n) + 2λ (−u_{n−1} + 2u_n − u_{n+1}) ,

and therefore the necessary condition for a critical point is

    u_n + λ (−u_{n−1} + 2u_n − u_{n+1}) = v_n .    (6)

Equation (6) does not hold at the endpoints, n = 1 and n = N, as they have only one neighbour. For the endpoints we obtain different equations:

    u_1 + λ (u_1 − u_2) = v_1 ,
    u_N + λ (u_N − u_{N−1}) = v_N .

Together these give N linear equations in the N unknowns.

Missing Measurements

The solution is easily extended to missing measurements (i.e., to handle interpolation and smoothing). Suppose our measurements exist at a subset of positions, denoted P. Then we can write the energy function as

    E(u) = Σ_{n ∈ P} (u_n − v_n)² + λ Σ_{all n} (u_{n+1} − u_n)² .    (17)

At locations n where no measurement exists, the derivative of E w.r.t. u_n yields the condition

    −u_{n−1} + 2u_n − u_{n+1} = 0 .    (18)

The solution is still a large matrix equation, as in (7). Rows of the matrix with measurements are unchanged. But those for which measurements are missing have the form [0 ... −1 2 −1 ... 0], with zeros substituted for the corresponding v_n on the right-hand side.

The Jacobi update equation in this case becomes

    u_n^(t+1) = (v_n + λ u_{n−1}^(t) + λ u_{n+1}^(t)) / (1 + 2λ)    for n ∈ P ,
    u_n^(t+1) = (u_{n−1}^(t) + u_{n+1}^(t)) / 2                     otherwise .    (19)

The equations that govern the endpoints can be expressed in an analogous manner.

2D Image Smoothing

For 2D images, the analogous energy we want to minimize becomes

    E(u) = Σ_{(n,m) ∈ P} (u[n, m] − v[n, m])²
           + λ Σ_{all n,m} ( (u[n+1, m] − u[n, m])² + (u[n, m+1] − u[n, m])² ) ,    (20)

where P is a subset of pixels where the measurements v are available. Taking derivatives with respect to u[n, m] and setting them equal to zero yields a linear system of equations that has the same form as (9). The only difference is that the linear filter g is now 2D.
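The Jacobi update (19) translates directly into code. Below is a minimal sketch for the 1D case with missing measurements; the mask, λ, and iteration count are illustrative, and the endpoints are handled here by simple reflection rather than by the separate endpoint equations derived above. The analogous update for the 2D energy (20) averages the four neighbours and uses a 1/(1 + 4λ) factor at observed pixels.

```python
# A minimal sketch of the Jacobi update (19) for 1D denoising and interpolation.
# `observed` is a boolean mask for the measurement set P; lam and iters are
# illustrative values. Endpoints are handled by reflection, a simplification of
# the endpoint equations derived above.
import numpy as np

def jacobi_denoise_1d(v, observed, lam=1.0, iters=2000):
    u = np.where(observed, v, 0.0).astype(float)   # initialize missing entries to 0
    for _ in range(iters):
        left = np.empty_like(u)
        left[1:] = u[:-1]
        left[0] = u[1]          # reflected neighbour at the left endpoint
        right = np.empty_like(u)
        right[:-1] = u[1:]
        right[-1] = u[-2]       # reflected neighbour at the right endpoint
        u_obs = (v + lam * (left + right)) / (1.0 + 2.0 * lam)   # n in P
        u_miss = 0.5 * (left + right)                            # n not in P
        u = np.where(observed, u_obs, u_miss)
    return u
```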
Robust Potentials

• Quadratic potentials are not robust to outliers, and hence they over-smooth edges. These effects will propagate throughout the graph.

• Instead of quadratic potentials, we could use a robust error function ρ:

    E(u) = Σ_{n=1}^{N} ρ(u_n − v_n, σ_d) + λ Σ_{n=1}^{N−1} ρ(u_{n+1} − u_n, σ_s) ,    (22)

where σ_d and σ_s are scale parameters. For example, the Lorentzian error function is given by

    ρ(z, σ) = log( 1 + (1/2) (z/σ)² ) ,    ρ′(z, σ) = 2z / (2σ² + z²) .    (23)

(Figure: the Lorentzian ρ(z, σ) and its derivative ρ′(z, σ), plotted for z between −6 and 6.)

(Figure: 1D smoothing results plotted over samples 1 to 64.) This smoother uses a quadratic data potential and a Lorentzian smoothness potential to encourage an approximately piecewise constant result.

Higher-order potentials can also be used. (Figure, from Kohli et al., Int J Comput Vis (2009): "Fig. 15 Qualitative results of our method. (a) Original images. (b) Segmentation result obtained using the pairwise CRF (explained in Sect. 2). (c) Results obtained by incorporating the robust P^n higher order potential (13) defined on segments. (d) Hand labelled result used as ground truth.")
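A small sketch of the Lorentzian potential (23) and the robust energy (22); the values of σ_d, σ_s, and λ below are illustrative, not prescribed by the notes.

```python
# A small sketch of the Lorentzian robust potential (23) and the robust
# energy (22). sigma_d, sigma_s, and lam are illustrative values.
import numpy as np

def lorentzian(z, sigma):
    """rho(z, sigma) = log(1 + (1/2) * (z / sigma)^2)."""
    return np.log1p(0.5 * (z / sigma) ** 2)

def lorentzian_deriv(z, sigma):
    """rho'(z, sigma) = 2 z / (2 sigma^2 + z^2); it saturates for large |z|."""
    return 2.0 * z / (2.0 * sigma ** 2 + z ** 2)

def robust_energy(u, v, lam=1.0, sigma_d=1.0, sigma_s=0.1):
    data = np.sum(lorentzian(u - v, sigma_d))
    smoothness = np.sum(lorentzian(np.diff(u), sigma_s))
    return data + lam * smoothness
```

Because ρ′ saturates, a large difference across an edge contributes a bounded penalty, which is why the result tends toward piecewise constant rather than being over-smoothed.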
Modeling the Potentials

• Could the potentials (image priors) be learned from natural images?

(Figure: empirical log-histograms of x-derivative and y-derivative responses of natural images, with densities spanning several orders of magnitude.)

• Sharp peak at zero: Neighboring pixels most often have identical intensities.
• Heavy tails: Sometimes there are strong intensity differences, due to discontinuities in the image.
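A sketch of how such derivative statistics can be gathered; the image source and the histogram settings below are placeholders.

```python
# A sketch of measuring derivative statistics of natural images. `images` is a
# placeholder for a list of grayscale arrays; the bin settings are illustrative.
import numpy as np

def log_histogram_of_x_derivatives(images, bins=101):
    dx = np.concatenate([np.diff(im.astype(float), axis=1).ravel() for im in images])
    hist, edges = np.histogram(dx, bins=bins, range=(-255, 255), density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, np.log(hist + 1e-12)   # sharp peak at 0, heavy tails
```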
Modeling the Potentials

• We face a similar challenge in modeling the potentials (or compatibility functions), e.g. f_H(T_{i,j}, T_{i+1,j}).

• Gaussian distributions are inappropriate:
– They do not match the statistics of natural images well.
– They would lead to blurred discontinuities.

• We need discontinuity-preserving potentials.

• One possibility: the Student-t distribution.

(Figure: log-density of the Student-t potential as a function of the difference T_{i,j} − T_{i+1,j}.)
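The Student-t expression on the slide is only partially legible, so the form below is an assumption: a common parameterization of a Student-t style, discontinuity-preserving potential on the difference of neighbouring values, with α and σ as placeholder parameters.

```python
# A hedged sketch of a Student-t style, discontinuity-preserving potential on the
# difference of neighbouring values. The exact expression on the slide is not
# fully legible; this parameterization, and alpha/sigma, are assumptions.
import numpy as np

def student_t_potential(diff, alpha=1.0, sigma=10.0):
    """phi(d) = (1 + d^2 / (2 sigma^2))^(-alpha)."""
    return (1.0 + (diff ** 2) / (2.0 * sigma ** 2)) ** (-alpha)

def log_density(diff, alpha=1.0, sigma=10.0):
    """log phi(d): peaked at zero with heavy tails, unlike a Gaussian."""
    return -alpha * np.log1p((diff ** 2) / (2.0 * sigma ** 2))
```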