Advanced Digital Signal Processing (18-792)
Spring Semester, 2012
Department of Electrical and Computer Engineering
Notes on Adaptive Filter Search Methods and Algorithms
Note: This handout is intended to summarize the major equations presented during the videotaped lecture
of March 28, 2012. The video camera was malfunctioning, with some artifacts evident in the video coding. It
is hoped that these notes will make the lecture easier to follow.
I. Introduction
[Figure: block diagram of the adaptive filter — the input $x_k$ drives an adjustable filter whose output $y_k$ is subtracted from the desired signal $d_k$ to form the error $\varepsilon_k = d_k - y_k$.]
In the previous lecture on adaptive filtering we introduced the adaptive filter structure above. The input $x_k$
is passed through an FIR filter (at least for now) with variable coefficients $w_k$, producing the output $y_k$. The
output is compared to the desired signal $d_k$, and the error signal $\varepsilon_k$ is the difference between the two. The
coefficients are manipulated with the goal of minimizing $\xi = E[\varepsilon_k^2]$. These interactions are described by
the equations
$$y_k = X_k^T W_k = W_k^T X_k,$$
$$R = E[X_k X_k^T], \qquad P = E[d_k X_k],$$
$$\xi = E[d_k^2] + W^T R W - 2P^T W .$$
If $x_k$, $d_k$, and $\varepsilon_k$ are all wide-sense stationary, the optimal solution for the coefficients is
$$W^* = R^{-1}P, \quad \text{which implies} \quad \xi_{\min} = E[d_k^2] - P^T W^* .$$
(Note that the asterisk in $W^*$ denotes the optimal coefficients, not complex conjugation.)
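To make these relationships concrete, here is a minimal NumPy sketch (not from the lecture) that estimates $R$ and $P$ from synthetic data and solves for $W^*$ and $\xi_{\min}$; the two-tap system, noise level, and sample count are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 10000, 2                          # samples and number of taps (assumed for the example)

x = rng.standard_normal(N)               # input x_k: white, unit variance (an assumption)
w_true = np.array([0.8, -0.4])           # hypothetical system that generates the desired signal
d = np.convolve(x, w_true)[:N] + 0.05 * rng.standard_normal(N)   # desired signal d_k plus noise

# Tap-input vectors X_k = [x_k, x_{k-1}]^T, stacked as rows; estimate R and P by sample averages
X = np.stack([x, np.concatenate(([0.0], x[:-1]))], axis=1)
R = X.T @ X / N                          # R = E[X_k X_k^T]
P = X.T @ d / N                          # P = E[d_k X_k]

W_star = np.linalg.solve(R, P)           # W* = R^{-1} P
xi_min = np.mean(d**2) - P @ W_star      # xi_min = E[d_k^2] - P^T W*

print(W_star)                            # close to w_true
print(xi_min)                            # close to the added noise power, 0.05**2
```

Because the example input is white with unit variance, $R$ is close to the identity matrix and $W^*$ essentially matches the coefficients that generated $d_k$.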
II. Newton’s method
We define the gradient $\nabla$ as
$$\nabla = \frac{d\xi}{dW} = \begin{bmatrix} \partial\xi/\partial w_1 \\ \partial\xi/\partial w_2 \\ \vdots \end{bmatrix} = 2RW - 2P .$$
Pre-multiplying by $\tfrac{1}{2}R^{-1}$ we obtain
$$\tfrac{1}{2}R^{-1}\nabla = \tfrac{1}{2}R^{-1}(2RW - 2P) = W - R^{-1}P = W - W^*,$$
or
$$W^* = W - \tfrac{1}{2}R^{-1}\nabla .$$
The problem is that this result assumes that we know $R$ and $P$ a priori, which normally we do not. One
quick fix is to approach the solution iteratively, stepping only a fraction of the distance at each iteration:
$$W_{k+1} = W_k - u R^{-1}\nabla_k, \quad \text{where } u \text{ is a small step-size constant.}$$
Since $\nabla_k = 2RW_k - 2P$, we can write
$$W_{k+1} = W_k - uR^{-1}(2RW_k - 2P), \quad\text{or}\quad W_{k+1} = (1 - 2u)W_k + 2uW^* .$$
Does this all converge to $W^*$? Convergence is obtained if $|1 - 2u| < 1$, i.e. if $0 < u < 1$, and if $u = 1/2$, convergence is achieved in a single step.
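A minimal sketch of this iteration (my own illustration; the $2\times 2$ $R$, the $P$ vector, and the starting point are hypothetical) shows the geometric convergence for $0 < u < 1$ and the one-step convergence for $u = 1/2$:

```python
import numpy as np

# Hypothetical correlation quantities for a two-tap problem (chosen only for illustration)
R = np.array([[1.0, 0.5],
              [0.5, 1.0]])               # R = E[X_k X_k^T]
P = np.array([0.6, -0.1])                # P = E[d_k X_k]
W_star = np.linalg.solve(R, P)           # Wiener solution W* = R^{-1} P
R_inv = np.linalg.inv(R)

def newton_iterate(u, n_steps, W0):
    """Iterate W_{k+1} = W_k - u R^{-1} (2 R W_k - 2 P) = (1 - 2u) W_k + 2u W*."""
    W = W0.copy()
    for _ in range(n_steps):
        grad = 2 * R @ W - 2 * P         # true gradient of xi at W_k
        W = W - u * (R_inv @ grad)
    return W

W0 = np.zeros(2)
print(newton_iterate(0.1, 50, W0))       # approaches W* geometrically, with ratio (1 - 2u) = 0.8
print(newton_iterate(0.5, 1, W0))        # u = 1/2: lands exactly on W* in a single step
print(W_star)
```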
III. Performance of adaptive filters
A. Speed of convergence
It is shown that the time constant for the coefficients is approximately $\tau_w \approx 1/(2u)$, provided that
$0 < u \ll 1/2$.
B. Mean-square error (MSE)
Letting $V = W - W^*$, it is shown that
$$\xi_k = \xi_{\min} + (1 - 2u)^{2k}\, V_0^T R V_0 .$$
Hence the time constant of the mean-square error (MSE) learning curve is half the time constant of the coefficients; equivalently, the MSE converges twice as fast:
$$\tau_{MSE} = \tau_w / 2 .$$
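As a quick numerical illustration: with $u = 0.05$, the coefficient errors decay as $(1 - 2u)^k = (0.9)^k$, so $\tau_w \approx 1/(2u) = 10$ iterations, while the excess MSE decays as $(0.9)^{2k} = (0.81)^k$, giving $\tau_{MSE} \approx 5$ iterations.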
IV. The least-mean-square (LMS) algorithm
The Newton algorithm has the disadvantage that you need to know $R^{-1}$ in advance for it to work. Our
solution to this problem is twofold:
1. We ignore the factor $R^{-1}$
2. We approximate the gradient operator by something simpler
Specifically, we first replace $W_{k+1} = W_k - uR^{-1}\nabla_k$ by $W_{k+1} = W_k - \mu\nabla_k$.
This leaves us with a gradient-descent algorithm. The parameter $\mu$ is the normalized step size; it behaves
similarly to the step-size parameter $u$ that was previously used, but it is not quite the same.
Now what about the gradient $\nabla_k$?
We can write
$$\nabla_k = \frac{d\xi}{dW} = 2RW_k - 2P = 2E[X_k X_k^T W_k - X_k d_k] = -2E[X_k\varepsilon_k] .$$
So we can define the stochastic gradient $\hat{\nabla}_k = -2X_k\varepsilon_k$.
Note that we are replacing the expected value of $X_k\varepsilon_k$ by its instantaneous value. This enables us to write
the final LMS adaptation equation:
$$W_{k+1} = W_k + 2\mu X_k\varepsilon_k = W_k + \frac{2\alpha\varepsilon_k}{L\sigma^2}\, X_k ,$$
where $\sigma^2$ is the power of the input signal, $L$ is the number of filter coefficients, and $\alpha$ is a dimensionless step-size constant (so that $\mu = \alpha/(L\sigma^2)$).
We will discuss this extremely simple adaptation algorithm and others in the next class.
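A minimal sketch of the resulting LMS loop (my own illustration; the two-tap system, noise level, and step size are assumptions chosen only so the example runs):

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 5000, 2                           # samples and number of taps (assumed)
mu = 0.01                                # step size, assumed small enough for stable adaptation

x = rng.standard_normal(N)               # input x_k
w_true = np.array([0.8, -0.4])           # hypothetical system generating the desired signal
d = np.convolve(x, w_true)[:N] + 0.05 * rng.standard_normal(N)

W = np.zeros(L)                          # coefficient vector W_k, initialized to zero
x_buf = np.zeros(L)                      # tap-input vector X_k = [x_k, x_{k-1}]^T

for k in range(N):
    x_buf = np.concatenate(([x[k]], x_buf[:-1]))   # shift the newest sample into X_k
    y = W @ x_buf                        # filter output y_k = W_k^T X_k
    err = d[k] - y                       # error eps_k = d_k - y_k
    W = W + 2 * mu * err * x_buf         # LMS update: W_{k+1} = W_k + 2 mu eps_k X_k

print(W)                                 # should end up close to w_true
```

Note that the loop never forms $R$, $P$, or any expectation; each update uses only the current $X_k$ and $\varepsilon_k$, which is what makes the algorithm so simple.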