MA5232 Modeling and Numerical Simulations: Iterative Methods for Mixture-Model Segmentation (8 Apr 2015)

This document summarizes a lecture on iterative methods for mixture-model segmentation and subspace clustering. It reviews K-means clustering and the Expectation-Maximization (EM) algorithm for central clustering, then discusses modeling data with a mixture of subspaces and formulates the K-subspaces and EM algorithms for subspace segmentation. The EM algorithm estimates subspace membership probabilities and model parameters iteratively through E and M steps. Key differences between K-subspaces and EM are noted, and homework is assigned from exercises in the lecture handout.



MA5232 Modeling and Numerical Simulations
Lecture 2: Iterative Methods for Mixture-Model Segmentation
8 Apr 2015
National University of Singapore

Last time
PCA reduces the dimensionality of a data set while retaining as much of the data variation as possible.
Statistical view: the leading PCs are given by the leading eigenvectors of the covariance matrix.
Geometric view: fitting a d-dimensional subspace model via the SVD.

Extensions of PCA
Probabilistic PCA via MLE
Kernel PCA via kernel functions and kernel matrices

This lecture
Review basic iterative algorithms for central clustering
Formulation of the subspace segmentation problem


Segmentation by Clustering

From: Object Recognition as Machine Translation, Duygulu, Barnard, de Freitas, Forsyth, ECCV02


Example 4.1
Euclidean distance-based clustering is not invariant to linear transformations.
The distance metric needs to be adjusted after a linear transformation.


Central Clustering
Assume the data are sampled from a mixture of Gaussians.
The classical distance metric between a sample and the mean of the j-th cluster is the Mahalanobis distance.
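In standard notation (a sketch consistent with the usual definition, not necessarily the handout's symbols), the Mahalanobis distance between a sample $\boldsymbol{x}$ and the mean $\boldsymbol{\mu}_j$ of the $j$-th cluster with covariance $\Sigma_j$ is

$$ d_{\Sigma_j}(\boldsymbol{x}, \boldsymbol{\mu}_j) = \sqrt{(\boldsymbol{x} - \boldsymbol{\mu}_j)^\top \Sigma_j^{-1} (\boldsymbol{x} - \boldsymbol{\mu}_j)}, $$

which reduces to the Euclidean distance when $\Sigma_j = I$; this is exactly the adjustment under a linear transformation mentioned in Example 4.1.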


Central Clustering: K-Means
Assume a map function provides each i-th sample with a cluster label.
An optimal clustering minimizes the within-cluster scatter, i.e., the average distance of all samples to their respective cluster means.
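With labels $c(i) \in \{1, \dots, K\}$ and cluster means $\boldsymbol{\mu}_j$, the within-cluster scatter takes the standard form (a sketch of the usual objective):

$$ \min_{\{\boldsymbol{\mu}_j\},\, c} \; \frac{1}{n} \sum_{i=1}^{n} \big\| \boldsymbol{x}_i - \boldsymbol{\mu}_{c(i)} \big\|^2 . $$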

Central Clustering: K-Means
However, since K is user-defined, the within-cluster scatter can always be driven to zero by letting each point become a cluster by itself (K = n).
In this chapter, we assume the true K is known.

Algorithm
A chicken-and-egg view


Two-Step Iteration
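A minimal NumPy sketch of the standard two-step iteration (assignment, then re-estimation); the function and variable names are illustrative, not the handout's notation:

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """Minimal K-means sketch. X is an (n, D) data matrix, K the number of clusters."""
    rng = np.random.default_rng(seed)
    # Initialize the cluster means with K distinct samples chosen at random.
    means = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(n_iters):
        # Segmentation step: assign each sample to its nearest cluster mean.
        dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Estimation step: recompute each mean from the samples assigned to it.
        new_means = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                              else means[j] for j in range(K)])
        if np.allclose(new_means, means):   # stop when the means no longer move
            break
        means = new_means
    return labels, means
```

For example, `labels, means = kmeans(X, K=3)` segments an (n, D) array `X` into three clusters.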



Example
http://util.io/k-means


Feature Space

Source: K. Grauman


Results of K-Means Clustering
[Figure: original image; clusters on intensity; clusters on color]
K-means clustering using intensity alone and color alone.
* From Marc Pollefeys, COMP 256, 2003
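Segmentations like these can be reproduced, up to the choice of K and features, by running K-means directly on the pixel values. A hedged sketch, reusing the `kmeans` function sketched earlier and assuming `img` is an (H, W, 3) RGB array loaded elsewhere (hypothetical):

```python
pixels = img.reshape(-1, 3).astype(float)     # one 3-D color feature per pixel
labels, means = kmeans(pixels, K=4)           # cluster on color alone
segmented = means[labels].reshape(img.shape)  # replace each pixel by its cluster mean
```

For clustering on intensity alone, one would instead use a single gray-level feature per pixel.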


A bad local optimum


Characteristics of K-Means
It is a greedy algorithm and is not guaranteed to converge to the global optimum.
Given fixed initial clusters/Gaussian models, the iterative process is deterministic.
Results may be improved by running K-means multiple times with different starting conditions (see the sketch below).
The segmentation-estimation process can be treated as a generalized expectation-maximization algorithm.
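A brief sketch of the multiple-restart heuristic, reusing the `kmeans` sketch above and scoring each run by its within-cluster scatter (names are illustrative):

```python
def kmeans_restarts(X, K, n_restarts=10):
    best = None
    for s in range(n_restarts):
        labels, means = kmeans(X, K, seed=s)
        scatter = np.sum((X - means[labels]) ** 2)   # within-cluster scatter of this run
        if best is None or scatter < best[0]:
            best = (scatter, labels, means)
    return best[1], best[2]
```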


EM Algorithm [Dempster-Laird-Rubin 1977]
Expectation Maximization (EM) estimates the model parameters and the segmentation in a maximum-likelihood (ML) sense.
Assume the samples are independently drawn from a mixture distribution, with the component of each sample indicated by a hidden discrete variable z.
The conditional distribution given z can be Gaussian.

The Maximum-Likelihood Estimation
The unknown parameters are the mixing weights and the component (e.g., Gaussian) parameters.
The likelihood function is the product of the mixture densities of the independent samples.
The optimal solution maximizes the log-likelihood.
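With mixing weights $\pi_j$ and component densities $p(\boldsymbol{x} \mid \theta_j)$, the log-likelihood of the $n$ independent samples has the standard mixture form (a sketch in common notation):

$$ \ell(\theta) = \sum_{i=1}^{n} \log p(\boldsymbol{x}_i \mid \theta) = \sum_{i=1}^{n} \log \sum_{j=1}^{K} \pi_j \, p(\boldsymbol{x}_i \mid \theta_j). $$

The log of a sum couples all the components, which is what makes direct maximization difficult.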




The Maximum-Likelihood Estimation
Directly maximizing the log-likelihood function is a high-dimensional nonlinear optimization problem.


Define a new function:
The first term is called the expected complete log-likelihood function;
the second term is the conditional entropy.
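In the usual EM notation (a sketch; the handout's symbols may differ), with membership weights $w_{ij}$ over the hidden labels $z_i$, the new function can be written as

$$ g(\theta, w) \;=\; \sum_{i=1}^{n}\sum_{j=1}^{K} w_{ij} \log\!\big(\pi_j\, p(\boldsymbol{x}_i \mid \theta_j)\big) \;-\; \sum_{i=1}^{n}\sum_{j=1}^{K} w_{ij} \log w_{ij}, $$

where the first sum is the expected complete log-likelihood and the second term is the conditional entropy of the hidden labels. This $g$ lower-bounds the log-likelihood, with equality when $w_{ij}$ equals the posterior $p(z_i = j \mid \boldsymbol{x}_i, \theta)$.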


Observation:


The Maximum-Likelihood Estimation
Regard the (incomplete) log-likelihood as a function of two variables.
Maximize g iteratively (an E step followed by an M step).



The iteration converges to a stationary point.


Prop 4.2: Update



Update
Recall the definition of g above.
Assume the membership probabilities (from the E step) are fixed; then maximize the expected complete log-likelihood.


To maximize the expected log-likelihood, assume as an example that each cluster is an isotropic normal distribution.
Eliminate the constant term in the objective.



Exercise 4.2
Compared to K-means, EM assigns the samples softly to each cluster according to a set of probabilities.
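Concretely (a sketch for the isotropic-Gaussian model above), the E step assigns sample $\boldsymbol{x}_i$ to cluster $j$ with probability

$$ w_{ij} = \frac{\pi_j \, \mathcal{N}(\boldsymbol{x}_i \mid \boldsymbol{\mu}_j, \sigma_j^2 I)}{\sum_{l=1}^{K} \pi_l \, \mathcal{N}(\boldsymbol{x}_i \mid \boldsymbol{\mu}_l, \sigma_l^2 I)}, $$

whereas K-means replaces this soft weight by a hard 0/1 assignment to the nearest mean.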


EM Algorithm
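A minimal NumPy sketch of the resulting EM iteration for an isotropic Gaussian mixture (an assumed model consistent with the example above; names are illustrative, not the handout's code):

```python
import numpy as np

def em_isotropic_gmm(X, K, n_iters=100, seed=0):
    """Minimal EM sketch for a K-component isotropic Gaussian mixture."""
    n, D = X.shape
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(n, size=K, replace=False)]   # component means
    var = np.full(K, X.var())                      # isotropic variances sigma_j^2
    pi = np.full(K, 1.0 / K)                       # mixing weights
    for _ in range(n_iters):
        # E step: posterior membership probabilities w_ij (soft assignments).
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)            # (n, K)
        logp = -0.5 * sq / var - 0.5 * D * np.log(2 * np.pi * var) + np.log(pi)
        logp -= logp.max(axis=1, keepdims=True)                              # numerical stability
        w = np.exp(logp)
        w /= w.sum(axis=1, keepdims=True)
        # M step: re-estimate weights, means, and variances from the soft assignments.
        Nj = w.sum(axis=0) + 1e-12
        pi = Nj / n
        mu = (w.T @ X) / Nj[:, None]
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        var = (w * sq).sum(axis=0) / (D * Nj)
    return pi, mu, var, w
```

Convergence can be monitored via the log-likelihood, which is non-decreasing across iterations.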



Example 4.3: The global maximum may not exist


Alternative view of EM: Coordinate ascent
[Figure sequence: successive coordinate-ascent steps over the variables w1 and w2 of w]

Visual example of EM


Potential Problems
Incorrect number of Mixture Components

Singularities

Incorrect Number of Gaussians


Incorrect Number of Gaussians

Singularities
A minority of the data can have a disproportionate effect on the model likelihood.
For example:


GMM example

Singularities
When a mixture component collapses onto a given point, its mean becomes that point and its variance goes to zero.
Consider the likelihood function as the covariance goes to zero: the likelihood approaches infinity.
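For instance, if one component's mean sits exactly on a data point $\boldsymbol{x}_i$ and its covariance is $\sigma^2 I$ in $D$ dimensions, that point's contribution to the likelihood is

$$ \mathcal{N}(\boldsymbol{x}_i \mid \boldsymbol{x}_i, \sigma^2 I) = (2\pi\sigma^2)^{-D/2} \;\longrightarrow\; \infty \quad \text{as } \sigma^2 \to 0, $$

so the likelihood is unbounded and a global maximum does not exist (cf. Example 4.3).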


K-means vs. EM
K-means clustering and EM clustering on an artificial data set ("mouse"). The tendency of K-means to produce equal-sized clusters leads to bad results, while EM benefits from the Gaussian distributions present in the data set.


So far
K-means
Expectation Maximization



Next up
Multiple-Subspace Segmentation
K-subspaces
EM for Subspaces


Multiple-Subspace Segmentation
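In the standard formulation (a sketch; details are in the handout), the data are assumed to lie close to a union of $K$ linear subspaces with orthonormal bases $U_j$:

$$ \boldsymbol{x}_i \approx U_{j}\, \boldsymbol{y}_i, \qquad U_j \in \mathbb{R}^{D \times d_j}, \quad j \in \{1, \dots, K\}, $$

and the segmentation task is to recover both the bases $U_j$ and the assignment of each sample to a subspace.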




K-subspaces
With noise, we minimize the total squared distance from the samples to their assigned subspaces.
Unfortunately, unlike PCA, there is no constructive solution to this minimization problem. The main difficulty is that the objective is hybrid: it is a combination of minimization over the continuous variables {Uj} and the discrete assignment variable j.
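In symbols (a sketch consistent with the description above, assuming linear subspaces with orthonormal bases $U_j \in \mathbb{R}^{D \times d_j}$), the objective is

$$ \min_{\{U_j\}} \; \sum_{i=1}^{n} \; \min_{j \in \{1,\dots,K\}} \; \big\| \boldsymbol{x}_i - U_j U_j^\top \boldsymbol{x}_i \big\|^2 , $$

which mixes continuous optimization over the bases $\{U_j\}$ with a discrete choice of $j$ for each sample.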



K-subspaces
Given the segmentation, estimating each subspace basis is exactly the same as in PCA.
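A minimal NumPy sketch of the K-subspaces iteration (assuming linear subspaces through the origin and a known common dimension d; illustrative, not the handout's algorithm listing):

```python
import numpy as np

def k_subspaces(X, K, d, n_iters=50, seed=0):
    """Minimal K-subspaces sketch. X is (n, D); each subspace has dimension d."""
    n, D = X.shape
    rng = np.random.default_rng(seed)
    # Initialize each basis with a random orthonormal D x d matrix.
    U = [np.linalg.qr(rng.standard_normal((D, d)))[0] for _ in range(K)]
    for _ in range(n_iters):
        # Segmentation step: assign each sample to the subspace with the
        # smallest residual ||x - U_j U_j^T x||.
        resid = np.stack([np.linalg.norm(X - X @ Uj @ Uj.T, axis=1) for Uj in U], axis=1)
        labels = resid.argmin(axis=1)
        # Estimation step: refit each subspace by PCA/SVD of its assigned samples.
        for j in range(K):
            Xj = X[labels == j]
            if len(Xj) >= d:
                _, _, Vt = np.linalg.svd(Xj, full_matrices=False)
                U[j] = Vt[:d].T
    return labels, U
```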




EM for Subspaces
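One common way to instantiate EM here (a hedged sketch; the handout's model may differ in detail) treats each subspace as a degenerate Gaussian with isotropic noise of variance $\sigma_j^2$ in the directions orthogonal to the subspace. The E step then computes membership probabilities from the residuals to each subspace:

$$ w_{ij} \;\propto\; \pi_j\,(2\pi\sigma_j^2)^{-(D-d_j)/2} \exp\!\Big(-\tfrac{1}{2\sigma_j^2}\,\big\|\boldsymbol{x}_i - U_j U_j^\top \boldsymbol{x}_i\big\|^2\Big), \qquad \textstyle\sum_{j} w_{ij} = 1 . $$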

EM for Subspaces
In the M step
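Under the same degenerate-Gaussian sketch, the M step reduces to a membership-weighted PCA per subspace (again a sketch in standard notation, not necessarily the handout's exact updates):

$$ U_j \leftarrow \text{top } d_j \text{ eigenvectors of } \sum_{i} w_{ij}\,\boldsymbol{x}_i\boldsymbol{x}_i^\top, \qquad \sigma_j^2 \leftarrow \frac{\sum_{i} w_{ij}\,\big\|\boldsymbol{x}_i - U_j U_j^\top\boldsymbol{x}_i\big\|^2}{(D-d_j)\sum_{i} w_{ij}}, \qquad \pi_j \leftarrow \frac{1}{n}\sum_{i} w_{ij} . $$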


Relationship between K-subspaces and EM
At each iteration,
the K-subspaces algorithm gives a definite assignment of every data point to one of the subspaces;
the EM algorithm views the membership as a random variable and uses its expected value to give a probabilistic assignment of each data point.

Homework
Read the handout, Chapter 4, "Iterative Methods for Multiple-Subspace Segmentation."
Complete Exercise 4.2 (page 111) of the handout.

