The Expectation-Maximization Algorithm
A common task in signal processing is the estimation of the parameters of a probability distribution function. Perhaps the most frequently encountered estimation problem is the estimation of the mean of a signal in noise. In many parameter estimation problems the situation is more complicated because direct access to the data necessary to estimate the parameters is impossible, or some of the data are missing. Such difficulties arise when an outcome is a result of an accumulation of simpler outcomes, or when outcomes are clumped together, for example, in a binning or histogram operation. There may also be data dropouts or clustering in such a way that the number of underlying data points is unknown (censoring and/or truncation). The EM (expectation-maximization) algorithm is ideally suited to problems of this sort, in that it produces maximum-likelihood (ML) estimates of parameters when there is a many-to-one mapping from an underlying distribution to the distribution governing the observation. In this article, the EM algorithm is presented at a level suitable for signal processing practitioners who have had some exposure to estimation theory. (A brief summary of ML estimation is provided in Box 1 for review.)

The EM algorithm consists of two major steps: an expectation step, followed by a maximization step. The expectation is with respect to the unknown underlying variables, using the current estimate of the parameters and conditioned upon the observations. The maximization step then provides a new estimate of the parameters. These two steps are iterated until convergence. The concept is illustrated in Fig. 1.
The EM algorithm was discovered and employed inde-
pendently by several different researchers until Dempster [1]
brought their ideas together, proved convergence, and coined
the term “EM algorithm.” Since that seminal work, hundreds
of papers employing the EM algorithm in many areas have
been published. A large list of references is found at [2]. A
typical application area of the EM algorithm is in genetics,
where the observed data (the phenotype) is a function of the
underlying, unobserved gene pattern (the genotype), e.g. [3].
Another area is estimating parameters of mixture distribu-
tions, e.g. [4]. The EM algorithm has also been widely used
in econometric, clinical, and sociological studies that have
unknown factors affecting the outcomes [5]. Some applica-
tions to the theory of statistical methods are found in [6].
In the area of signal processing applications, the largest
area of interest in the EM algorithm is in maximum likelihood
tomographic image reconstruction, e.g. [7, 8]. Another com-
monly cited application is training of hidden Markov models,
especially for speech recognition, e.g. [9]. The books [10, 11]
have chapters with extensive development on hidden Markov
models (HMMs).
Other signal processing and engineering applications began appearing in about 1985. These include: parameter estimation [12, 13]; ARMA modeling [14, 15]; image modeling, reconstruction, and processing [16, 17]; simultaneous detection and estimation [18, 19, 20]; pattern recognition and neural network training [21, 22, 23]; direction finding [24]; noise suppression [25]; spectroscopy [27]; signal and sequence detection [28]; time-delay estimation [29]; and specialized developments of the EM algorithm itself [30]. The EM algorithm has been the subject for multiprocessing algorithm development [31]. The EM algorithm is also related to algorithms used in information theory to compute channel capacity and rate-distortion functions [32, 33], since the expectation step in the EM algorithm produces a result similar to entropy. The EM algorithm is philosophically similar to ML detection in the presence of unknown phase (incoherent detection) or other unknown parameters: the likelihood function is averaged with respect to the unknown quantity (i.e., the expected value of the likelihood function is computed) before detection, which is a maximization step (see, e.g., [34, Chap. 5]).

Ector's Problem: An Introductory Example

The image-processing example introduced by Ector and Hatter (see the "Tale of Two Distributions" sidebar), although somewhat contrived, illustrates most of the principles of the EM algorithm as well as the notational conventions of this article. In many aspects it is similar to a problem that is of practical interest - the emission tomography (ET) problem discussed later in this article.

Suppose that in an image pattern-recognition problem there are two general classes to be distinguished: a class of dark objects and a class of light objects. The class of dark objects may be further subdivided into two shapes: round and square. Using a pattern recognizer, it is desired to determine the probability of a dark object. For the sake of the example, assume that the objects are known to be trinomially distributed. Let the random variable X_1 represent the number of round dark objects, X_2 represent the number of square dark objects, and X_3 represent the number of light objects, and let [x_1, x_2, x_3]^T = x be the vector of values the random variables take for some image. (In this article the convention is that vectors are printed in bold font and scalars are printed in math italic. All vectors by convention are taken as column vectors. Uppercase letters are random variables.) Assume further that enough is known about the probabilities of the different classes so that the probability may be written as in Eq. (1), where p is an unknown parameter of the distribution and n = x_1 + x_2 + x_3. The notation f(x_1, x_2, x_3 | p) is typical throughout the article; it is used to indicate the probability function, which may be either a probability density function (pdf) or a probability mass function (pmf).
A feature extractor is employed that can distinguish which objects are light and which are dark, but cannot distinguish shape. Let [y_1, y_2]^T = y be the number of dark objects and the number of light objects detected, respectively, so that y_1 = x_1 + x_2 and y_2 = x_3, and let the corresponding random variables be Y_1 and Y_2. There is a many-to-one mapping between {x_1, x_2} and y_1. For example, if y_1 = 3, there is no way to tell from the measurements whether x_1 = 1 and x_2 = 2 or x_1 = 2 and x_2 = 1. The EM algorithm is specifically designed for problems with such many-to-one mappings. Then (see Box 2), the expectation step imputes the unobserved counts by their conditional expectations given the observation and the current parameter estimate p^[k]: x_1^[k+1] = E[X_1 | y_1, p^[k]] and x_2^[k+1] = E[X_2 | y_1, p^[k]].
Maximization Step (M-step). Use the data from the expectation step as if it were actually measured data to determine an ML estimate of the parameter. This estimated data is sometimes called "imputed" data. In this example, with x_1^[k+1] and x_2^[k+1] imputed and x_3 available, the ML estimate of the parameter is obtained by taking the derivative of log f(x_1^[k+1], x_2^[k+1], x_3 | p) with respect to p, equating it to zero, and solving for p; the result is the updated estimate p^[k+1] of Eq. (7).

As a numerical example, suppose that the true parameter is p = 0.5 and n = 100 samples are drawn, with y_1 = 63. (The true values of x_1 and x_2 are 25 and 38, respectively, but the algorithm does not know this.) Table 1 illustrates the result of the algorithm starting from p^[0] = 0. The final estimate, p = 0.52, is in fact the ML estimate of p that would have been obtained by maximizing Eq. (1) with respect to p, had the x data been available.
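As a concrete sketch of the iteration just described: the trinomial class probabilities (1/4, (1+p)/4, (2-p)/4) used below are an assumption made for illustration, but they are consistent with the numbers quoted in this example (y_1 = 63 and a final estimate of 0.52), and the function name is illustrative.

```python
def em_trinomial(y1, y2, p=0.0, n_iter=20):
    """EM sketch when only y1 = x1 + x2 (dark) and y2 = x3 (light) are observed.
    Assumed class probabilities: (1/4, (1+p)/4, (2-p)/4)."""
    x3 = y2
    for _ in range(n_iter):
        # E-step: impute x1 and x2 by their conditional means given y1 and p
        # (X1 given Y1 = y1 is binomial with probability (1/4) / (1/4 + (1+p)/4)).
        x1 = y1 * (1.0 / 4) / (1.0 / 4 + (1.0 + p) / 4)
        x2 = y1 - x1
        # M-step: ML estimate of p from the imputed complete data, obtained by
        # setting d/dp [x2*log((1+p)/4) + x3*log((2-p)/4)] = 0.
        p = (2 * x2 - x3) / (x2 + x3)
    return p

print(em_trinomial(y1=63, y2=37))   # converges to about 0.52
```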
In general, let y denote the observed (incomplete) data, let x denote the complete data, and let f(x | θ) denote the likelihood of the complete data as a function of the parameter vector θ. Let

Q(θ | θ^[k]) = E[ log f(x | θ) | y, θ^[k] ],    (9)

where θ^[k] is the estimate of the parameters after k iterations. The second argument in Q(θ | θ^[k]) is the conditioning argument to the expectation and is regarded as fixed and known at every E-step. The first argument conditions the likelihood of the complete data.

For the M-step, let θ^[k+1] be that value of θ which maximizes Q(θ | θ^[k]):

θ^[k+1] = arg max_θ Q(θ | θ^[k]).    (10)

It is important to note that the maximization is with respect to the first argument of the Q function, the conditioner of the complete data likelihood. The EM algorithm consists of choosing an initial θ^[0], then performing the E-step and the M-step successively until convergence. Convergence may be determined by examining when the parameters quit changing, i.e., stop when ||θ^[k] - θ^[k-1]|| < ε for some ε and some appropriate distance measure ||·||.

The general form of the EM algorithm as stated in Eqs. (9) and (10) may be specialized and simplified somewhat by restriction to distributions in the exponential family. These are pdfs (or pmfs) of the form

f(x | θ) = b(x) exp[ c(θ)^T t(x) ] / a(θ),    (11)

where θ is a vector of parameters for the family [35, 36]. The function t(x) is called the sufficient statistic of the family (a statistic is sufficient if it provides all of the information necessary to estimate the parameters of the distribution from the data [35, 36]). Members of the exponential family include most distributions of engineering interest, including Gaussian, Poisson, binomial, uniform, Rayleigh, and others. For exponential families, the E-step can be written in terms of the sufficient statistic: let t^[k+1] = E[ t(x) | y, θ^[k] ]. As a conditional expectation is an estimator, t^[k+1] is an estimate of the sufficient statistic. (The EM algorithm is sometimes called the estimation/maximization algorithm because, for exponential families, the first step is an estimator. It has also been called the expectation/modification algorithm [9].) In light of the fact that the M-step will be maximizing

E[ log b(x) | y, θ^[k] ] + c(θ)^T t^[k+1] - log a(θ)

with respect to θ, and that E[ log b(x) | y, θ^[k] ] does not depend upon θ, it is sufficient to write:

E-step: Compute t^[k+1] = E[ t(x) | y, θ^[k] ].    (12)
M-step: Determine θ^[k+1] to maximize c(θ)^T t^[k+1] - log a(θ).    (13)
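In outline, the general iteration of Eqs. (9) and (10) is a simple loop. The following Python sketch uses hypothetical problem-specific callables e_step and m_step, with the parameter-change stopping rule described above.

```python
import numpy as np

def em(y, theta0, e_step, m_step, eps=1e-8, max_iter=200):
    """Generic EM loop: alternate E- and M-steps (Eqs. (9) and (10)) until
    the parameter estimate stops changing.  e_step/m_step are user-supplied."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        stats = e_step(y, theta)                                # E-step: expectations given y and theta
        theta_new = np.asarray(m_step(y, stats), dtype=float)   # M-step: maximize Q
        if np.linalg.norm(theta_new - theta) < eps:             # stop when ||theta[k] - theta[k-1]|| < eps
            return theta_new
        theta = theta_new
    return theta
```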
Convergence of the EM Algorithm

For every iterative algorithm, the question of convergence needs to be addressed: does the algorithm come finally to a solution, or does it iterate ad nauseam, ever learning but never coming to a knowledge of the truth? For the EM algorithm, the convergence may be stated simply: at every iteration of the EM algorithm, a value of the parameter is computed so that the likelihood function does not decrease. That is, at every iteration the estimated parameter provides an increase in the likelihood function, until a local maximum is achieved, at which point the likelihood function cannot increase further (but it will not decrease). Box 3 contains a more precise statement of this convergence for the general EM algorithm.

Despite the convergence theorem in Box 3, there is no guarantee that the convergence will be to a global maximum. For likelihood functions with multiple maxima, convergence will be to a local maximum which depends on the initial starting point θ^[0].

The convergence rate of the EM algorithm is also of interest. Based on mathematical and empirical examinations, it has been determined that the convergence rate is usually slower than the quadratic convergence typically available with a Newton's-type method [4].
However, as observed by Dempster [1], the convergence near the maximum (at least for exponential families) depends upon the eigenvalues of the Hessian of the update function M, so that rapid convergence may be possible. In any event, even with potentially slow convergence there are advantages to EM algorithms over Newton's algorithms. In the first place, no Hessian needs to be computed. Also, there is no chance of "overshooting" the target or diverging away from the maximum. The EM algorithm is guaranteed to be stable and to converge to an ML estimate. Further discussion of convergence appears in [37, 38].
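A quick numerical check of this monotonicity property on the introductory trinomial example (same assumed class probabilities as in the earlier sketch): the observed-data log-likelihood never decreases from one iteration to the next.

```python
import math

def loglik_observed(p, y1, y2):
    # Observed-data log-likelihood (up to constants): Y1 ~ Binomial(n, 1/2 + p/4).
    return y1 * math.log(0.5 + p / 4) + y2 * math.log(0.5 - p / 4)

p, y1, y2 = 0.0, 63, 37
prev = loglik_observed(p, y1, y2)
for k in range(10):
    x1 = y1 * 0.25 / (0.25 + (1 + p) / 4)      # E-step
    x2 = y1 - x1
    p = (2 * x2 - y2) / (x2 + y2)              # M-step
    cur = loglik_observed(p, y1, y2)
    assert cur >= prev - 1e-12                 # the likelihood does not decrease
    prev = cur
print(p)   # about 0.52
```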
Box 2: Combinations and Conditional Expectations of Multinomials

Let X_1, X_2, X_3 have a multinomial distribution with class probabilities (p_1, p_2, p_3), so that

P(X_1 = x_1, X_2 = x_2, X_3 = x_3) = [(x_1 + x_2 + x_3)! / (x_1! x_2! x_3!)] p_1^{x_1} p_2^{x_2} p_3^{x_3}.

Let Y = X_1 + X_2. Summing P(X_1 = i, X_2 = y - i, X_3 = x_3) over i and applying the binomial theorem shows that (X_1 + X_2) and X_3 are binomially distributed with class probabilities (p_1 + p_2, p_3). To compute E[X_1 | Y = y], it is first necessary to find the conditional probability P(X_1 = x_1 | Y = y). This probability can be shown to be binomial with y trials and success probability p_1/(p_1 + p_2), so that

E[X_1 | Y = y] = y p_1 / (p_1 + p_2).
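A brief Monte Carlo check of this conditional-expectation result, with arbitrary illustrative probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)
p1, p2, p3, n = 0.25, 0.375, 0.375, 100
x = rng.multinomial(n, [p1, p2, p3], size=200_000)
y = x[:, 0] + x[:, 1]                            # Y = X1 + X2
mask = y == 60                                   # condition on a particular value Y = 60
print(x[mask, 0].mean(), 60 * p1 / (p1 + p2))    # both approximately 24.0
```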
ET Image Reconstruction

In ET [7], tissues within a body are stimulated to emit photons. These photons are detected by detectors surrounding the tissue. For purposes of computation the body is divided into B boxes. The number of photons generated in each box is denoted by n(b), b = 1, 2, ..., B. The number of photons detected in each detector is denoted by y(d), d = 1, 2, ..., D, as shown in Fig. 3. Let y = [y(1), y(2), ..., y(D)] denote the vector of observations. The generation of the photons from box b can be described as a Poisson process with mean λ(b). Based upon the geometry of the sensors and the body it is possible to determine p(b,d), the probability that a photon generated in box b is detected at detector d. The detector variables y(d) are Poisson distributed,

f(y | λ(d)) = P(y(d) = y) = e^{-λ(d)} λ(d)^y / y!,

and it can be shown that

λ(d) = E[y(d)] = Σ_{b=1}^{B} λ(b) p(b,d).

Assuming that each box generates independently of every other box and that the detectors operate independently, the likelihood function of the complete data x(b,d), the number of photons emitted in box b and detected at detector d, is a product of Poisson probabilities. The E-step imputes x^[k+1](b,d) = E[x(b,d) | y, λ^[k]]; the resulting Q function contains terms of the form x^[k+1](b,d) log p(b,d) - log x^[k+1](b,d)!, summed over d = 1, ..., D, and the M-step update is

λ^[k+1](b) = Σ_{d=1}^{D} x^[k+1](b,d) p(b,d).
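As an illustration, the following Python sketch implements the iteration in its standard Shepp-Vardi form [7]; it shows the E- and M-step structure and may differ in detail from the equations above.

```python
import numpy as np

def em_et(y, P, n_iter=100):
    """ML-EM sketch for emission tomography: y(d) is Poisson with mean
    sum_b lambda(b) p(b,d).  P has shape (B, D)."""
    B, D = P.shape
    lam = np.ones(B)                                   # initial intensity estimate
    for _ in range(n_iter):
        lam_d = lam @ P                                # predicted detector means, shape (D,)
        # E-step: expected counts x(b,d) given y(d) and the current intensities
        x = (lam[:, None] * P) * (y / np.maximum(lam_d, 1e-12))[None, :]
        # M-step: ML estimate of lambda(b) from the imputed counts
        lam = x.sum(axis=1) / P.sum(axis=1)
    return lam
```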
In the single-microphone active noise cancellation (ANC) application (Fig. 4 shows the single-microphone ANC system and Fig. 5 the processor block diagram; see [41, 49]), the observed signal is

y(t) = s(t) + σ_v v(t),

where s(t) is the desired signal, modeled as an AR(p) process, and v(t) is unit-variance noise. The signal samples, including the p initial values s(1-p), s(2-p), ..., are collected into the vector s. The complete data set is x = [y^T, s^T]^T. If we knew s, estimation of the AR parameters would be straightforward using familiar spectrum estimation techniques. The likelihood function for the complete data is Gaussian. Then the model may be written in state-space form,

s_p(t) = Φ s_p(t-1) + g u(t)
y(t) = h^T s_p(t) + σ_v v(t),

where s_p(t) is the state vector of the p most recent signal samples, Φ is the companion matrix built from the AR coefficients, g routes the driving noise u(t) into the state, and h picks off the current signal sample. The expectations in Eqs. (20), (21), and (22) are first and second moments of Gaussians, conditioned upon the observations.
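The state-space form above is the usual companion-matrix realization of an AR(p) model. As a small illustrative helper (the function name and structure are a sketch, not taken from the article), the matrices Φ, g, and h can be built as follows.

```python
import numpy as np

def ar_state_space(phi, g_scale=1.0):
    """Companion-form matrices for s_p(t) = Phi s_p(t-1) + g u(t),
    y(t) = h^T s_p(t) + sigma_v v(t); phi = [phi_1, ..., phi_p]."""
    p = len(phi)
    Phi = np.zeros((p, p))
    Phi[0, :] = phi                   # first row carries the AR coefficients
    if p > 1:
        Phi[1:, :-1] = np.eye(p - 1)  # remaining rows shift the state downward
    g = np.zeros(p); g[0] = g_scale   # driving noise enters the first state only
    h = np.zeros(p); h[0] = 1.0       # observation picks off the current sample s(t)
    return Phi, g, h
```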
Let the elements of the HMM be parameterized by θ; i.e., there is a mapping θ → (A(θ), π(θ), f(y|s, θ)). The mapping is assumed to be appropriately smooth. In practice, the initial probability and transition probabilities are some of the elements of θ. The parameter estimation problem for an HMM is this: given a sequence of observations, y = (y_1, y_2, ..., y_T), determine the parameter θ which maximizes the likelihood function.
Since the expectation is conditioned upon the observations, the only random component comes from the state variable, and the E-step can thus be written as a sum over the possible state sequences, weighted by their conditional probabilities given the observations and the current parameter estimate. The conditional probability of the state sequence is computed from the forward and backward probabilities of the HMM, and maximizing the resulting Q function subject to the probability constraints (with λ a Lagrange multiplier) leads to the re-estimation formulas for the HMM parameters.
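As a sketch of what the E- and M-steps amount to for a discrete-output HMM, the following is the standard Baum-Welch recursion (see [9, 45]); the variable names, random initialization, and fixed iteration count are illustrative only.

```python
import numpy as np

def baum_welch(y, K, M, n_iter=50, seed=0):
    """EM (Baum-Welch) re-estimation for a discrete-output HMM.
    y: observation symbols in {0,...,M-1}; K: number of states."""
    y = np.asarray(y)
    T = len(y)
    rng = np.random.default_rng(seed)
    pi = rng.random(K); pi /= pi.sum()                          # initial-state probabilities
    A = rng.random((K, K)); A /= A.sum(axis=1, keepdims=True)   # transition probabilities
    B = rng.random((K, M)); B /= B.sum(axis=1, keepdims=True)   # output probabilities

    for _ in range(n_iter):
        # E-step: scaled forward-backward recursions.
        alpha = np.zeros((T, K)); beta = np.zeros((T, K)); c = np.zeros(T)
        alpha[0] = pi * B[:, y[0]]; c[0] = alpha[0].sum(); alpha[0] /= c[0]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, y[t]]
            c[t] = alpha[t].sum(); alpha[t] /= c[t]
        beta[T - 1] = 1.0
        for t in range(T - 2, -1, -1):
            beta[t] = (A @ (B[:, y[t + 1]] * beta[t + 1])) / c[t + 1]
        gamma = alpha * beta                                    # P(state at time t | y, theta)
        gamma /= gamma.sum(axis=1, keepdims=True)
        xi = np.zeros((K, K))                                   # expected transition counts
        for t in range(T - 1):
            xi += (alpha[t][:, None] * A) * (B[:, y[t + 1]] * beta[t + 1])[None, :] / c[t + 1]
        # M-step: re-estimate the parameters from the expected counts.
        pi = gamma[0]
        A = xi / gamma[:-1].sum(axis=0)[:, None]
        for m in range(M):
            B[:, m] = gamma[y == m].sum(axis=0)
        B /= gamma.sum(axis=0)[:, None]
    return pi, A, B
```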
For the multiuser detection problem, the received signal r(t) is passed through a bank of matched filters, one per user; the filter outputs at symbol interval i are collected into the vector y(i) = [y_1(i), y_2(i), ..., y_K(i)]^T, and a signal detection algorithm produces the bit decisions b_1(i), b_2(i), ..., b_K(i). Because the interference among the users is similar to intersymbol interference, optimal detection requires dealing with the entire sequence of matched filter vectors. The likelihood f(y | b, a) involves quantities R(b) and S(b) that depend upon the bits and correlations, and a constant c that makes the density integrate to 1. Note that even though the noise is Gaussian, which is in the exponential family, the overall likelihood function is not Gaussian because of the presence of the random bits - it is actually a mixture of Gaussians. For the special case of only a single user the likelihood function simplifies considerably.

The E-step requires the likelihood of the unobserved data. From Eq. (31), f(y | b, a) is Gaussian. To compute the E-step,

E[ log f(x | a) | y, a^[k] ] = Σ_{b ∈ {+1,-1}^{(M+1)K}} f(b | y, a^[k]) log f(x | a).    (33)

The conditional probability f(b | y, a^[k]) required for the expectation is obtained from f(y | b, a^[k]) and the prior probabilities of the bits.
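For intuition about the single-user case, here is a hedged sketch under the assumed model y(i) = a b(i) + noise, with the bits b(i) in {-1, +1} unobserved; it illustrates the mixture-of-Gaussians structure noted above and is not the article's multiuser formulation.

```python
import numpy as np

def em_single_user(y, sigma2, a=0.1, n_iter=50):
    """EM amplitude estimation for y(i) = a*b(i) + noise, with b(i) in {-1,+1} unknown."""
    y = np.asarray(y, dtype=float)
    for _ in range(n_iter):
        b_hat = np.tanh(a * y / sigma2)   # E-step: posterior mean of each bit
        a = np.mean(y * b_hat)            # M-step: maximize expected log-likelihood (b^2 = 1)
    return a

rng = np.random.default_rng(1)
bits = rng.choice([-1.0, 1.0], size=2000)
obs = 0.8 * bits + rng.normal(scale=0.5, size=2000)
print(em_single_user(obs, sigma2=0.25))   # close to the true amplitude 0.8
```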
When the complete-data distribution is in the exponential family, the algorithm may be specialized as in Eqs. (12) and (13). Otherwise, it will be necessary to use the general statement of the EM algorithm (Eqs. (9) and (10)). In many cases, the type of conditioning exhibited in Eqs. (19), (24) or (32) may be used: the observed data is conditioned upon data not observed so that the likelihood function may be computed. In general, if the complete data set is x = (y, z) for some unobserved z, then

E[ log f(x | θ) | y, θ^[k] ] = ∫ f(z | y, θ^[k]) log f(x | θ) dz,

since, conditioned upon y, the only random component of x is z.

Analytically, the most difficult portion of the EM algorithm is the E-step. This is also often the most difficult computational step; for the general EM algorithm, the expectation must be computed over all values of the unobserved variables. There may be, as in the case of the HMM, efficient algorithms to ease the computation, but even these cannot completely eliminate the computational burden.

In most instances where the EM algorithm applies, there are other algorithms that also apply, such as gradient descent (see, e.g., [49]). As already observed, however, these algorithms may have problems of their own, such as requiring derivatives or setting of convergence-rate parameters. Because of its generality and the guaranteed convergence, the EM algorithm is a good choice to consider for many estimation problems. Future work will include application in new and different areas, as well as developments to improve convergence speed and computational structure.

Todd K. Moon is Associate Professor at the Electrical and Computer Engineering Department and Center for Self-Organizing Intelligent Systems at Utah State University.
References

1. A.P. Dempster, N.M. Laird, and D.B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Royal Statistical Soc., Ser. B, vol. 39, no. 1, pp. 1-38, 1977.
2. For an extensive list of references to papers describing applications of the EM algorithm, see http://www.engineering.usu.edu/Departments/ece/Publications/Moon on the World-Wide Web.
3. C. Jiang, "The use of mixture models to detect effects of major genes on quantitative characteristics in a plant-breeding experiment," Genetics, vol. 136, no. 1, pp. 383-394, 1994.
4. R. Redner and H.F. Walker, "Mixture densities, maximum-likelihood estimation and the EM algorithm (review)," SIAM Rev., vol. 26, no. 2, pp. 195-237, 1984.
5. J. Schmee and G.J. Hahn, "Simple method for regression analysis with censored data," Technometrics, vol. 21, no. 4, pp. 417-432, 1979.
6. R. Little and D. Rubin, "On jointly estimating parameters and missing data by maximizing the complete-data likelihood," Am. Statistician, vol. 37, no. 3, pp. 218-220, 1983.
7. L.A. Shepp and Y. Vardi, "Maximum likelihood reconstruction for emission tomography," IEEE Trans. Med. Imaging, vol. 1, pp. 113-122, October 1982.
8. D.L. Snyder and D.G. Politte, "Image reconstruction from list-mode data in an emission tomography system having time-of-flight measurements," IEEE Trans. Nucl. Sci., vol. 30, no. 3, pp. 1843-1849, 1983.
9. L. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-286, 1989.
10. L. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition. Prentice-Hall, 1993.
11. J.R. Deller, J.G. Proakis, and J.H.L. Hansen, Discrete-Time Processing of Speech Signals. Macmillan, 1993.
12. M. Segal and E. Weinstein, "Parameter estimation of continuous dynamical linear systems given discrete time observations," Proc. IEEE, vol. 75, no. 5, pp. 727-729, 1987.
13. S. Zabin and H. Poor, "Efficient estimation of class-A noise parameters via the EM algorithm," IEEE Trans. Inform. Theory, vol. 37, no. 1, pp. 60-72, 1991.
14. A. Isaksson, "Identification of ARX models subject to missing data," IEEE Trans. Automat. Contr., vol. 38, no. 5, pp. 813-819, 1993.
15. I. Ziskind and D. Hertz, "Maximum likelihood localization of narrowband autoregressive sources via the EM algorithm," IEEE Trans. Sig. Proc., vol. 41, no. 8, pp. 2719-2724, 1993.
16. R. Lagendijk, J. Biemond, and D. Boekee, "Identification and restoration of noisy blurred images using the expectation-maximization algorithm," IEEE Trans. ASSP, vol. 38, no. 7, pp. 1180-1191, 1990.
17. A. Katsaggelos and K. Lay, "Maximum likelihood blur identification and image restoration using the EM algorithm," IEEE Trans. Sig. Proc., vol. 39, no. 3, pp. 729-733, 1991.
18. A. Ansari and R. Viswanathan, "Application of EM algorithm to the detection of direct sequence signal in pulsed noise jamming," IEEE Trans. Commun., vol. 41, no. 8, pp. 1151-1154, 1993.
19. M. Feder, "Parameter estimation and extraction of helicopter signals observed with a wide-band interference," IEEE Trans. Sig. Proc., vol. 41, no. 1, pp. 232-244, 1993.
20. G. Kaleh, "Joint parameter estimation and symbol detection for linear and nonlinear unknown channels," IEEE Trans. Commun., vol. 42, no. 7, pp. 2506-2513, 1994.
21. W. Byrne, "Alternating minimization and Boltzmann machine learning," IEEE Trans. Neural Networks, vol. 3, no. 4, pp. 612-620, 1992.
22. M. Jordan and R. Jacobs, "Hierarchical mixtures of experts and the EM algorithm," Neural Comp., vol. 6, no. 2, pp. 181-214, 1994.
23. R. Streit and T. Luginbuhl, "ML training of probabilistic neural networks," IEEE Trans. Neural Networks, vol. 5, no. 5, pp. 764-783, 1994.
24. M. Miller and D. Fuhrmann, "Maximum likelihood narrow-band direction finding and the EM algorithm," IEEE Trans. ASSP, vol. 38, no. 9, pp. 1560-1577, 1990.
25. S. Vaseghi and P. Rayner, "Detection and suppression of impulsive noise in speech communication systems," IEE Proc.-I, vol. 137, no. 1, pp. 38-46, 1990.
26. E. Weinstein, A. Oppenheim, M. Feder, and J. Buck, "Iterative and sequential algorithms for multisensor signal enhancement," IEEE Trans. Sig. Proc., vol. 42, no. 4, pp. 846-859, 1994.
27. S.E. Bialkowski, "Expectation-maximization (EM) algorithm for regression, deconvolution, and smoothing of shot-noise limited data," Journal of Chemometrics, 1991.
28. C. Georghiades and D. Snyder, "The EM algorithm for symbol unsynchronized sequence detection," IEEE Trans. Commun., vol. 39, no. 1, pp. 54-61, 1991.
29. N. Antoniadis and A. Hero, "Time-delay estimation for filtered Poisson processes using an EM-type algorithm," IEEE Trans. Sig. Proc., vol. 42, no. 8, pp. 2112-2123, 1994.
30. M. Segal and E. Weinstein, "The cascade EM algorithm," Proc. IEEE, vol. 76, no. 10, pp. 1388-1390, 1988.
31. C. Gyulai, S. Bialkowski, G.S. Stiles, and L. Powers, "A comparison of three multi-platform message-passing interfaces on an expectation-maximization algorithm," in Proceedings of the 1993 World Conference on Transputers, 1993.
32. R.E. Blahut, "Computation of channel capacity and rate-distortion functions," IEEE Trans. Inform. Theory, vol. 18, pp. 460-473, July 1972.
33. I. Csiszar and G. Tusnady, "Information geometry and alternating minimization procedures," Statistics and Decisions, Supplement Issue 1, 1984.
34. J.G. Proakis, Digital Communications. McGraw-Hill, 3rd ed., 1995.
35. R.O. Duda and P.E. Hart, Pattern Classification and Scene Analysis. Wiley, 1973.
36. P.J. Bickel and K.A. Doksum, Mathematical Statistics. Holden-Day, 1977.
37. C. Wu, "On the convergence properties of the EM algorithm," Ann. Statist., vol. 11, no. 1, pp. 95-103, 1983.
38. R.A. Boyles, "On the convergence of the EM algorithm," J. Roy. Statist. Soc. B, vol. 45, no. 1, pp. 47-50, 1983.
39. B. Widrow and S.D. Stearns, Adaptive Signal Processing. Prentice-Hall, 1985.
40. J.C. Stevens and K.K. Ahuja, "Recent advances in active noise control," AIAA Journal, vol. 29, no. 7, pp. 1058-1067, 1991.
41. M. Feder, A. Oppenheim, and E. Weinstein, "Maximum likelihood noise cancellation using the EM algorithm," IEEE Trans. ASSP, vol. 37, no. 2, pp. 204-216, 1989.
42. S.M. Kay, Modern Spectral Estimation. Prentice-Hall, 1988.
43. Y. Singer, "Dynamical encoding of cursive handwriting," Biol. Cybern., vol. 71, no. 3, pp. 227-237, 1994.
44. J. Picone, "Continuous speech recognition using hidden Markov models," IEEE Signal Processing Magazine, vol. 7, p. 41, July 1990.
45. L.E. Baum, T. Petrie, G. Soules, and N. Weiss, "A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains," Ann. Math. Stat., vol. 41, no. 1, pp. 164-171, 1970.
46. S. Verdu, "Optimum multiuser asymptotic efficiency," IEEE Trans. Commun., vol. COM-34, no. 9, pp. 890-896, September 1986.
47. H.V. Poor, "On parameter estimation in DS/SSMA formats," in Proceedings of the International Conference on Advances in Communications and Control Systems, 1988.
48. R. Lupas and S. Verdu, "Near-far resistance of multiuser detectors in asynchronous channels," IEEE Trans. Commun., vol. 38, pp. 496-508, April 1990.
49. A.V. Oppenheim, E. Weinstein, K.C. Zangi, M. Feder, and D. Gauger, "Single-sensor active noise cancellation based on the EM algorithm," ICASSP, 1992.
50. L. Scharf, Statistical Signal Processing: Detection, Estimation, and Time Series Analysis. Addison-Wesley, 1991.
51. H.L. Van Trees, Detection, Estimation, and Modulation Theory, Part I. New York: John Wiley and Sons, 1968.