Class 4 Part 2
Overview
Principal Component Analysis
• Purpose
• Derivation
• Examples
• Literature
[Figure: 2-D data cloud with candidate projection directions $\mathbf{u}_A, \mathbf{u}_B$ and principal directions $\mathbf{u}_1, \mathbf{u}_2$]

Dimensionality reduction: $\mathbf{y} = U\mathbf{x}, \quad U : p \times N, \; p < N$

Find the vector $\mathbf{u}_1$ such that the variance of the data along this direction is maximized:

$$\sigma_{u_1}^2 = E\{(\mathbf{u}_1^T\mathbf{x})^2\}, \quad E\{\mathbf{x}\} = 0, \quad \|\mathbf{u}_1\| = 1$$
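As a numerical sketch of this criterion (data, dimensions, and names are invented for illustration), the maximizing direction is the unit eigenvector of $C = E\{\mathbf{x}\mathbf{x}^T\}$ with the largest eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(0)

# Zero-mean 2-D data with one elongated direction (illustrative).
X = rng.multivariate_normal([0, 0], [[3.0, 1.2], [1.2, 1.0]], size=5000).T

C = (X @ X.T) / X.shape[1]           # sample covariance, E{x} = 0 assumed
eigvals, eigvecs = np.linalg.eigh(C)
u1 = eigvecs[:, -1]                  # unit eigenvector, largest eigenvalue

# The variance along u1 matches the largest eigenvalue (up to sampling)
# and exceeds the variance along any other unit direction.
print(np.var(u1 @ X), eigvals[-1])
```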
Equivalently, choose $p$ orthonormal basis vectors that minimize the mean squared error of the reconstruction:

$$\mathrm{mse} = E\big\{\|\mathbf{x} - \hat{\mathbf{x}}\|^2\big\}, \quad \hat{\mathbf{x}} = \sum_{k=1}^{p} \mathbf{u}_k(\mathbf{x}^T\mathbf{u}_k), \quad \mathbf{u}_k^T\mathbf{u}_m = \delta_{k,m}$$

[Figure: orthogonal projection of $\mathbf{x}$ onto $\mathbf{u}_k$, giving $\hat{\mathbf{x}}$]
$$\mathrm{mse} = E\big\{\|\mathbf{x} - \hat{\mathbf{x}}\|^2\big\} = E\Big\{\big\|\mathbf{x} - \sum_{k=1}^{p}\mathbf{u}_k(\mathbf{x}^T\mathbf{u}_k)\big\|^2\Big\}$$

$$= E\big\{\|\mathbf{x}\|^2\big\} - 2E\Big\{\sum_{k=1}^{p}\mathbf{x}^T\mathbf{u}_k(\mathbf{x}^T\mathbf{u}_k)\Big\} + E\Big\{\sum_{n=1}^{p}(\mathbf{x}^T\mathbf{u}_n)\mathbf{u}_n^T \sum_{k=1}^{p}\mathbf{u}_k(\mathbf{x}^T\mathbf{u}_k)\Big\} \qquad (\mathbf{x}^T\mathbf{u}_k \text{ is a scalar})$$

$$= E\big\{\|\mathbf{x}\|^2\big\} - 2E\Big\{\sum_{k=1}^{p}(\mathbf{x}^T\mathbf{u}_k)^2\Big\} + E\Big\{\sum_{k=1}^{p}(\mathbf{x}^T\mathbf{u}_k)^2\Big\} \qquad (\text{using } \mathbf{u}_n^T\mathbf{u}_k = \delta_{n,k})$$

$$= E\big\{\|\mathbf{x}\|^2\big\} - E\Big\{\sum_{k=1}^{p}(\mathbf{x}^T\mathbf{u}_k)^2\Big\}$$

$$= \mathrm{trace}(C) - \sum_{k=1}^{p}\mathbf{u}_k^T C\,\mathbf{u}_k, \quad C = E\{\mathbf{x}\mathbf{x}^T\}$$

To minimize the mse, maximize $\sum_{k=1}^{p}\mathbf{u}_k^T C\,\mathbf{u}_k$ subject to $\mathbf{u}_k^T\mathbf{u}_k = 1$. Setting the derivative of the constrained objective to zero gives $C\mathbf{u}_k = \lambda_k\mathbf{u}_k$: the optimal $\mathbf{u}_k$ are the eigenvectors of $C$ with the $p$ largest eigenvalues.
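A quick numerical check of the last identity and of the eigenvector solution (a sketch; the data and dimensions are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
N, p, n = 5, 2, 20000

# Zero-mean data with an anisotropic covariance (illustrative).
A = rng.standard_normal((N, N))
X = A @ rng.standard_normal((N, n))
C = (X @ X.T) / n

def mse(U):
    Xhat = U @ (U.T @ X)                  # projection onto span{u_1..u_p}
    return np.mean(np.sum((X - Xhat) ** 2, axis=0))

U_pca = np.linalg.eigh(C)[1][:, -p:]      # top-p eigenvectors of C
U_rand = np.linalg.qr(rng.standard_normal((N, p)))[0]  # random orthonormal basis

print(mse(U_pca), np.trace(C) - np.trace(U_pca.T @ C @ U_pca))  # equal
print(mse(U_pca) < mse(U_rand))                                 # True
```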
Examples
Training:
- Calculate the eigenspace for all faces in the training database
- Project each face into the eigenspace → feature reduction
Classification:
- Project new face into eigenspace
- Nearest neighbor in the eigenspace
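A minimal sketch of this pipeline (matrix shapes, function names, and the plain eigendecomposition are illustrative choices, not the slides' code):

```python
import numpy as np

def train_eigenspace(faces, p):
    # faces: N x M matrix, one vectorized face image per column.
    mean = faces.mean(axis=1, keepdims=True)
    Xc = faces - mean
    # Eigenvectors of the N x N covariance; for large N the usual
    # shortcut is the M x M "snapshot" trick on Xc.T @ Xc.
    C = (Xc @ Xc.T) / Xc.shape[1]
    U = np.linalg.eigh(C)[1][:, -p:]      # top-p eigenvectors
    coeffs = U.T @ Xc                     # p x M training features
    return mean, U, coeffs

def classify(face, mean, U, coeffs, labels):
    y = U.T @ (face.reshape(-1, 1) - mean)   # project new face
    d = np.linalg.norm(coeffs - y, axis=0)   # distances in eigenspace
    return labels[int(np.argmin(d))]         # nearest neighbor
```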
Feature reduction/extraction
[Figure: original face image and its reconstruction with 20 principal components; face image from http://www.nist.gov/]
Generative model

$$\mathbf{x} = A\mathbf{s}, \quad A = \begin{pmatrix} a_{1,1} & \cdots & a_{1,N} \\ \vdots & \ddots & \vdots \\ a_{N,1} & \cdots & a_{N,N} \end{pmatrix} \; \text{unknown mixing matrix}, \quad \mathbf{a}_i = (a_{1,i}, \ldots, a_{N,i})^T$$
[Figure: three source signals $s_1, s_2, s_3$ and their observed mixtures $x_1, x_2, x_3$]

Blind source separation: separate the three original signals given only the observed mixtures.
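A minimal sketch of the generative model (the source signals and the mixing matrix are invented for illustration):

```python
import numpy as np

t = np.linspace(0, 1, 1000)
# Three made-up source signals s_1, s_2, s_3.
S = np.vstack([np.sin(2 * np.pi * 5 * t),            # sinusoid
               np.sign(np.sin(2 * np.pi * 3 * t)),   # square wave
               2 * (t % 0.25) - 0.25])               # sawtooth

A = np.array([[0.8, 0.3, 0.2],    # "unknown" mixing matrix (made up here)
              [0.2, 0.7, 0.4],
              [0.3, 0.3, 0.9]])

X = A @ S   # observed mixtures x_1, x_2, x_3; BSS must recover S from X alone
```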
Restrictions
The sources must be statistically independent: for all functions $g_1, \ldots, g_N$,

$$E\{g_1(s_1)\,g_2(s_2)\cdots g_N(s_N)\} = E\{g_1(s_1)\}\,E\{g_2(s_2)\}\cdots E\{g_N(s_N)\}$$

Independence implies uncorrelatedness: for $i \neq n$,

$$E\{(s_i - \mu_i)(s_n - \mu_n)\} = E\{s_i - \mu_i\}\,E\{s_n - \mu_n\} = 0$$
Ambiguities

1.) Scale

$$\mathbf{x} = \sum_{i=1}^{N} \mathbf{a}_i s_i = \sum_{i=1}^{N} \Big(\tfrac{1}{\alpha_i}\mathbf{a}_i\Big)(\alpha_i s_i)$$

Equivalently, $\mathbf{x} = A\mathbf{s} = (AS^{-1})(S\mathbf{s})$ for any invertible diagonal scaling matrix $S$, so the sources are only defined up to scale. Reduce the ambiguity by enforcing $E\{s_i^2\} = 1$.

2.) Order

We cannot determine an order of the independent components.
Computing ICA

a) Minimizing mutual information:

$$\hat{\mathbf{s}} = \hat{A}^{-1}\mathbf{x}$$

Mutual information: $I(\hat{\mathbf{s}}) = \sum_{i=1}^{N} H(\hat{s}_i) - H(\hat{\mathbf{s}})$

$I(\hat{\mathbf{s}}) \geq 0$, with equality exactly when the components $\hat{s}_i$ are independent, so the demixing matrix $\hat{A}^{-1}$ is chosen to minimize $I(\hat{\mathbf{s}})$.
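As a rough illustration (not from the slides), mutual information can be estimated with plug-in histogram entropies: near zero for independent components, clearly positive after mixing. The data and bin count here are invented:

```python
import numpy as np

def entropy_hist(samples, bins=30):
    # Plug-in entropy (nats) of the discretized variables.
    counts, _ = np.histogramdd(samples, bins=bins)
    p = counts.ravel() / counts.sum()
    p = p[p > 0]
    return -(p * np.log(p)).sum()

rng = np.random.default_rng(1)
S = rng.uniform(-1, 1, size=(10000, 2))          # independent components
X = S @ np.array([[1.0, 0.8], [0.2, 1.0]]).T     # mixed components

for name, D in [("independent", S), ("mixed", X)]:
    marg = sum(entropy_hist(D[:, [i]]) for i in range(2))
    joint = entropy_hist(D)
    print(name, "I ~", marg - joint)             # near 0 vs. clearly > 0
```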
b) Maximizing nongaussianity:

$$\mathbf{s} = A^{-1}\mathbf{x}$$

Introduce $y$ and $\mathbf{b}$: $y = \mathbf{b}^T\mathbf{x} = \mathbf{b}^T A\mathbf{s} = \mathbf{q}^T\mathbf{s}$

From the central limit theorem: $y = \mathbf{q}^T\mathbf{s}$ is more Gaussian than any of the $s_i$ and becomes least Gaussian if $y = s_i$.

Iteratively modify $\mathbf{b}^T$ such that the "Gaussianity" of $y$ is minimized. When a local minimum is reached, $\mathbf{b}^T$ is a row vector of $A^{-1}$.
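A common concrete instance of this scheme is the one-unit, kurtosis-based fixed-point iteration (the FastICA rule) applied to whitened data; here kurtosis stands in for "Gaussianity", and the sources and mixing matrix are made up for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two nongaussian (uniform) unit-variance sources, mixed by a made-up A.
n = 10000
S = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(2, n))
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
X = A @ S

# Whiten the mixtures: Z = V X with E{z z^T} = I.
X = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(X))
V = E @ np.diag(d ** -0.5) @ E.T
Z = V @ X

# One-unit kurtosis fixed point: w <- E{z (w^T z)^3} - 3w, renormalize.
w = rng.standard_normal(2)
w /= np.linalg.norm(w)
for _ in range(200):
    y = w @ Z
    w_new = (Z * y ** 3).mean(axis=1) - 3.0 * w
    w_new /= np.linalg.norm(w_new)
    if abs(w_new @ w) > 1.0 - 1e-12:   # converged (up to sign)
        w = w_new
        break
    w = w_new

# w @ Z recovers one source up to sign/scale; w @ V corresponds to
# a row of A^{-1} after the whitening change of coordinates.
print(np.corrcoef(w @ Z, S[0])[0, 1], np.corrcoef(w @ Z, S[1])[0, 1])
```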
PCA Applied to Faces

$$X = \begin{pmatrix} x_{1,1} & \cdots & x_{1,M} \\ \vdots & & \vdots \\ x_{N,1} & \cdots & x_{N,M} \end{pmatrix}, \quad \text{columns } \mathbf{x}_1, \ldots, \mathbf{x}_M \text{ are the vectorized face images}$$

Each pixel is a feature; each face image is a point in the feature space. The dimension of the feature vector is given by the size of the image.

The $\mathbf{u}_i$ are the eigenvectors, which can be represented as pixel images in the original coordinate system $x_1 \ldots x_N$.

[Figure: eigenvectors $\mathbf{u}_1, \mathbf{u}_2$ drawn in the pixel coordinate system $x_1, x_2, x_3$]
ICA Applied to Faces
$$X = \begin{pmatrix} x_{1,1} & \cdots & x_{N,1} \\ \vdots & & \vdots \\ x_{1,M} & \cdots & x_{N,M} \end{pmatrix}, \quad \text{each row is one vectorized face image}$$

Here the data matrix is the transpose of the PCA case: the face images are the rows, and each image is modeled as a mixture of independent source images $s_1, s_2, \ldots$
Image removed due to copyright considerations. See Figure 1 in: Baek, Kyungim, et al. "PCA vs. ICA: A Comparison on the FERET Data Set." International Conference on Computer Vision, Pattern Recognition, and Image Processing, in conjunction with the 6th JCIS, Durham, NC, March 8-14, 2002.