Dynamic Linear Models With Switching
Taylor & Francis, Ltd. and American Statistical Association are collaborating with JSTOR to
digitize, preserve and extend access to Journal of the American Statistical Association
The problem of modeling change in a vector time series is studied using a dynamic linear model with measurement matrices
that switch according to a time-varying independent random process. We derive filtered estimators for the usual state vectors
and also for the state occupancy probabilities of the underlying nonstationary measurement process. A maximum likelihood
estimation procedure is given that uses a pseudo-expectation-maximization algorithm in the initial stages and nonlinear optimization in the final stages. We relate the models to those considered previously in the literature and give an application involving the tracking of multiple targets.
KEY WORDS: Change points; Expectation-maximization algorithm; Nonlinear models; State-space; Target tracking.
in the defining model (1), it is clear that all sensors are observing target x_t1, whereas for the measurement matrix

    A_t = [ 0  1  0 ]
          [ 1  0  0 ]
          [ 0  0  1 ],

the first sensor observes the second target, the second sensor observes the first target, and the third sensor observes the third target.

for j = 1, ..., m and t = 1, 2, ..., n, independent of past measurement matrices A_1, ..., A_{t-1} and of past data y_1, y_2, ..., y_{t-1}. Important information about the current state of the measurement process is given by the filtered probabilities for being in state j, defined as the conditional probabilities π_j(t | t) = Pr(A_t = M_j | Y_t).
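The one-step recursion that these filtered probabilities and the state estimates satisfy (Equations (12)-(18)) can be sketched in NumPy as follows. The weighted-update forms are an outline of the section's recursions; the function and variable names are our own illustration, not the paper's notation.

```python
import numpy as np

def switching_filter_step(x, P, y, Ms, pi_t, Phi, Q, R):
    """One step of the switching filter outlined in Section 3.

    Ms   : list of candidate measurement matrices M_1, ..., M_m
    pi_t : a priori probabilities pi_j(t) of each matrix at time t
    Names and the exact update forms are an illustrative sketch.
    """
    # State prediction (Equations (14)-(15)).
    x_pred = Phi @ x
    P_pred = Phi @ P @ Phi.T + Q

    q = len(y)
    dens = np.empty(len(Ms))
    gains, innovs = [], []
    for j, M in enumerate(Ms):
        eps = y - M @ x_pred                     # innovation under M_j
        S = M @ P_pred @ M.T + R                 # innovation covariance
        K = P_pred @ M.T @ np.linalg.inv(S)      # Kalman gain for M_j
        # Multivariate normal density f_j(t | t-1) of the innovation.
        dens[j] = np.exp(-0.5 * eps @ np.linalg.solve(S, eps)) \
            / np.sqrt((2 * np.pi) ** q * np.linalg.det(S))
        gains.append(K)
        innovs.append(eps)

    # Filtered probabilities: pi_j(t|t) proportional to pi_j(t) f_j(t|t-1).
    pi_filt = pi_t * dens
    loglik_term = np.log(pi_filt.sum())          # time-t term of the
    pi_filt = pi_filt / pi_filt.sum()            # innovations likelihood

    # Filtered state: weighted combination of gain-adjusted innovations;
    # filtered covariance: weighted combination of conventional updates.
    x_filt = x_pred + sum(p * (K @ e)
                          for p, K, e in zip(pi_filt, gains, innovs))
    eye = np.eye(len(x))
    P_filt = sum(p * ((eye - K @ M) @ P_pred)
                 for p, K, M in zip(pi_filt, gains, Ms))
    return x_filt, P_filt, pi_filt, loglik_term
```

Summing `loglik_term` over t gives the innovations form of the log-likelihood that Section 4 maximizes.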
π_i(t) = Σ_{j=1}^m π_{ij} π_j(t - 1 | t - 1),    (13)

where the nonnegative weights π_{ij} are chosen so that Σ_{i=1}^m π_{ij} = 1. If the process A_t were Markov with transition probabilities π_{ij}, then (13) would be the update for the filter probability, as has been shown in Lindgren (1978) or Kitagawa (1987) (see also Rabiner and Juang 1986). The difficulty in extending the approach here to the Markov-dependent case is the dependence in y_t, which makes it necessary to enumerate over the possible histories to derive Equation (12). Equation (13) has π_i(t) as a function of the past observations Y_{t-1} and hence is inconsistent with model assumption (10). Nevertheless, this seems to be a reasonable compromise that allows the data to modify the probabilities π_i(t).

The filtered estimators x_t^{t-1} and x_t^t = E(x_t | Y_t) are given by

x_t^{t-1} = Φ x_{t-1}^{t-1},    (14)

P_t^{t-1} = Φ P_{t-1}^{t-1} Φ' + Q.    (15)

To summarize, we derive exact equations for the filtered probabilities and state vectors in Section 3. The estimators (16) show that the filtered estimators involve weighted combinations of the gain-adjusted innovations. The filtered covariances P_t^t again involve weighted combinations of the conventional estimators.

4. MAXIMUM LIKELIHOOD ESTIMATION

To develop a procedure for maximum likelihood estimation, note first of all that the innovations form of the log-likelihood function is proportional to

ln L_Y(Θ) = Σ_{t=1}^n ln Σ_{j=1}^m π_j(t) f_j(t | t - 1),    (19)

where the f_j(t | t - 1) are the multivariate normal densities defined earlier, with means M_j x_t^{t-1} and covariance matrices given in (10). We may consider maximizing (19) directly as a function of the parameters Θ = (μ, Φ, Q, R), or we may consider applying the EM algorithm to the complete-
data log-likelihood. In general, the EM algorithm converges nicely in the initial stages and more slowly in the final stages, where it is advantageous to switch to a standard nonlinear optimization procedure.

To apply the EM algorithm, note in the Appendix that the log-likelihood of the complete data, x_0, x_1, ..., x_n, A_1, A_2, ..., A_n, and y_1, ..., y_n, can be written as in Shumway and Stoffer (1982) or Shumway (1988), with an additional term corresponding to the unknown probabilities π_j(t). This leads to the same regression equations for the transition matrix Φ and the error covariance Q, expressed in terms of the same smoothers x_t^n = E(x_t | Y_n) and smoother covariances P_t^n and P_{t,t-1}^n. We note only the modification (A.16) for the last smoothed covariance. The equation for updating R given in (A.8) involves smoothed probabilities π_j(t | n) that have not been computed. The backwards recursions for the smoothed probabilities involve integrating over mixtures of normal distributions and are excessively complicated. Monte Carlo integration techniques such as the Gibbs sampler (Carlin, Polson, and Stoffer 1990) may be useful, but here we apply Equation (A.8) in the EM algorithm assuming that the smoothed probabilities can be approximated by the filtered probabilities π_j(t | t). The resulting "pseudo-EM" algorithm works quite well, as is evident from the example considered in the following section.

Since the pseudo-EM algorithm is quite slow in later iterations and does not necessarily increase the incomplete-data likelihood at every step, we switch to the standard nonlinear optimization procedure in the final iterations.

Table 1. Definition of Sensor-Target Associations Used to Determine m = 10 Possible Measurement Matrices in the Example

Measurement      Measurements and observed tracks
matrix           y_t1     y_t2     y_t3
M1               x_t1     x_t1     x_t1
M2               x_t1     x_t1     x_t2
M3               x_t1     x_t2     x_t1
M4               x_t2     x_t1     x_t1
M5               x_t1     x_t3     x_t2
M6               x_t1     x_t2     x_t3
M7               x_t2     x_t1     x_t3
M8               x_t2     x_t3     x_t1
M9               x_t3     x_t2     x_t1
M10              x_t3     x_t1     x_t2

5. AN EXAMPLE

To illustrate the procedures of the preceding sections, we return to the problem of tracking multiple targets, as introduced in Section 1. The example given here uses a set of m = 10 possible measurement matrices, defined in Table 1. The matrices Q and R were taken to be diag(.0025, .0025, .0025) and diag(.0625, .0625, .0625), respectively, and the initial mean was μ = (5, 5, 5)' with n = 100 points. The measurement matrices were switched three times during the first 100 points, leading to the observed data shown in Figure 1. Since all tracks started at the same mean, the observed data corresponds roughly to three one-dimensional targets originating from a common launch point. To simplify further, we have not exhibited the nonlinear dynamics of the measurement matrices, which would certainly be present in any realistic application. It is fairly clear from the observed tracks where the transition points lie and which series (staying level, decreasing, increasing) we are observing. We have chosen a fairly straightforward set of realizations to test the method.

[Figure 1: panels "First Observed Series" and "Second Observed Series" of the observed data, plotted over time in points, 0-100.]

Time requirements were 6 min., 3 min., and 1 min., respectively, for 8086-, 80286-, and 80386-based microcomputers. It is clear that tremendous reductions in computing time are available through parallel processing, which can be done in Equations (12), (16), and (17) using parallel paths for each possible measurement matrix. Hence the multiple path filters would require no more time than a single Kalman filter and smoother.
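The per-matrix quantities in (12), (16), and (17) do not depend on one another, so the m candidate paths can be evaluated simultaneously. A sketch of the idea, vectorizing over j with stacked arrays rather than literal parallel hardware (the function name and array shapes are our illustration):

```python
import numpy as np

def all_candidate_paths(y, x_pred, P_pred, Ms, R):
    """Innovations, innovation covariances, and gains for every candidate
    measurement matrix at once; Ms is stacked with shape (m, q, p)."""
    eps = y - np.einsum('jqp,p->jq', Ms, x_pred)           # (m, q)
    S = np.einsum('jqp,pr,jsr->jqs', Ms, P_pred, Ms) + R   # (m, q, q)
    K = np.einsum('pr,jqr,jqs->jps', P_pred, Ms, np.linalg.inv(S))
    return eps, S, K
```

Each slice `eps[j]`, `S[j]`, `K[j]` reproduces the single-matrix computation for M_j, so the candidate filters can also be farmed out to separate processors with no change in the results.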
There is the question in any application as to whether the time-varying probabilities or the parameters in the dynamic linear model are more important, or whether the two are equally important. Often the dynamics of the tracks are assumed known, so that those parameters are known. In that case adjustment of the initial probabilities might be very useful for effectively pruning highly improbable measurement matrices. In other cases the dynamics of the signal or track are unknown, and the initial probabilities simply may not matter that much. In this case parameter estimation might proceed for some fixed reasonable a priori set of probabilities, such as π_j(t) = m^{-1}. The filtered probabilities (12) can serve as the important indicator of whether or not we are in state j at time t. In the present example there are 10 initial probabilities. Because of the large number of states that are common in examples involving tracking, we chose to hold all of these probabilities fixed at reasonable starting values and identify the tracks by varying the parameters in the state-space model. The m = 10 initial probabilities were all taken as .1, corresponding to assuming that each configuration is equally likely.

A summary of parameter estimates using 50 iterations of the EM algorithm and several variable-metric (Fletcher-Powell-Davidon) iterations is shown in Table 2. It is clear that the log-likelihood (19) increases at each step and that the transition matrix parameters converge rather quickly, say after 5 iterations. Estimating variances requires somewhat more time, and these may be assumed known to increase processing speed. One iteration requires running the filters and smoothers for both the probabilities and the signal tracks, as detailed in Section 2.

Figure 2 shows 4 of the 10 filtered probabilities, computed using the forward recursions (12) and (14)-(18). These are states exhibiting nonzero probabilities over a significant portion of the range. It is clear that after an initial period of indecisiveness (t < 20), the filter indicates M10 for 20 ≤ t ≤ 29, M6 for 30 ≤ t ≤ 49, M9 for 50 ≤ t ≤ 69, and M10 again for 70 ≤ t ≤ 100.

Figure 3 shows the signal tracks, estimated using Equations (14)-(18), and we note that they compare well to the input series mixed to determine the observed realizations in Figure 1. In fact, we did not plot the original tracks because they would have overlayed the estimated tracks almost exactly.

6. DISCUSSION

We have developed an approach to modeling change in a vector time series that uses a model for switching different from that considered by Gordon and Smith (1988, 1990) or by Peña and Guttman (1988). In general, restricting the switching to the measurement matrices simplifies the Kalman filtering recursions considerably, since the equations are exact and it is not necessary to approximate mixtures of normals as in the Gordon and Smith approach.

The model is still general enough to include most structural models useful in modeling and monitoring changes of regime in vector time series. Even changes of the kind considered by Gordon and Smith can be accommodated, as long as one is willing to assume that they occur as abrupt changes in the configuration of elements observed in the state vector.
Table 2. Summary of Parameter Estimates

Iteration
no.     φ11    φ22    φ33    q11    q22    q33    r11    r22    r33    ln L_Y

Pseudo-EM
0      1.000  1.000  1.000  .1000  .1000  .1000  .1000  .1000  .1000  -316.67
5      1.006   .990   .999  .0272  .0283  .0332  .0658  .0653  .0744  -252.50
10     1.006   .990   .999  .0118  .0129  .0160  .0516  .0533  .0643  -232.53
15     1.006   .991   .999  .0070  .0081  .0102  .0509  .0522  .0664  -226.84
20     1.006   .991   .999  .0047  .0059  .0073  .0512  .0519  .0682  -223.93
50     1.006   .991   .999  .0016  .0026  .0025  .0534  .0516  .0724  -219.79

Variable-metric
51     1.006   .991   .999  .0013  .0025  .0023  .0533  .0516  .0732  -219.70
52     1.006   .991   .999  .0010  .0029  .0019  .0533  .0516  .0723  -219.51
53     1.006   .991   .999  .0011  .0022  .0015  .0533  .0515  .0723  -219.41
54     1.006   .991   .999  .0011  .0021  .0015  .0533  .0515  .0722  -219.38
55     1.006   .991   .999  .0011  .0020  .0015  .0533  .0514  .0722  -219.38
56     1.006   .991   .999  .0011  .0023  .0012  .0536  .0466  .0711  -219.36
57     1.006   .991   .999  .0011  .0024  .0011  .0536  .0461  .0709  -219.33
58     1.006   .991   .999  .0011  .0024  .0012  .0533  .0460  .0704  -219.32
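The regression updates behind the pseudo-EM iterations in Table 2 can be sketched as follows. The Φ and Q formulas are the standard M-step regressions in terms of the smoothers, as described in Section 4; the R update shown, with the filtered probabilities π_j(t | t) standing in for the smoothed ones, is our own reconstruction patterned on the usual EM update, not the paper's (A.8).

```python
import numpy as np

def m_step(xs, Ps, Pcc, ys, Ms, pi_filt):
    """One pseudo-EM M-step sketch (names and R form are illustrative).

    xs      : smoothers x_t^n, t = 0..n, shape (n+1, p)
    Ps      : smoother covariances P_t^n, shape (n+1, p, p)
    Pcc     : lag-one covariances P_{t,t-1}^n, t = 1..n, shape (n, p, p)
    ys      : observations y_1..y_n; Ms: candidate measurement matrices
    pi_filt : filtered probabilities pi_j(t|t), shape (n, m)
    """
    n = len(ys)
    A = sum(np.outer(xs[t - 1], xs[t - 1]) + Ps[t - 1] for t in range(1, n + 1))
    B = sum(np.outer(xs[t], xs[t - 1]) + Pcc[t - 1] for t in range(1, n + 1))
    C = sum(np.outer(xs[t], xs[t]) + Ps[t] for t in range(1, n + 1))
    Phi = B @ np.linalg.inv(A)          # regression update for Phi
    Q = (C - Phi @ B.T) / n             # residual covariance update
    # Approximate R update: smoothed probabilities replaced by filtered.
    q = len(ys[0])
    R = np.zeros((q, q))
    for t in range(n):
        for j, M in enumerate(Ms):
            e = ys[t] - M @ xs[t + 1]
            R += pi_filt[t, j] * (np.outer(e, e) + M @ Ps[t + 1] @ M.T)
    return Phi, Q, R / n
```

Each call uses the smoothers from the previous pass of the filters and smoothers, so one iteration costs one forward-backward sweep plus these sums.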
[Figure 2: four panels of filtered probabilities plotted against time in points (0-100); vertical axis 0 to 1.]
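Reading off the most probable measurement matrix at each time, as done informally from Figure 2, can be automated by collapsing the argmax of the filtered probabilities into runs. This small utility is our own illustration, not part of the paper's procedure:

```python
import numpy as np

def regime_segments(pi_filt, labels):
    """Collapse the per-time argmax of the filtered probabilities
    (rows of pi_filt, one row per time point) into labeled runs
    (label, first time index, last time index)."""
    best = pi_filt.argmax(axis=1)
    segments, start = [], 0
    for t in range(1, len(best) + 1):
        if t == len(best) or best[t] != best[start]:
            segments.append((labels[best[start]], start, t - 1))
            start = t
    return segments
```

Applied to the example's filtered probabilities, this would return runs such as ('M10', 20, 29).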
Modeling of abrupt changes in the observation error, however, is precluded under our formulation, making our switching structure of relatively little use in robust detection of outliers as considered by Peña and Guttman.

We mention also that our approach allows for simultaneous estimation of parameters in the dynamic linear model by maximum likelihood. This means that the sequential model selection approach and parameter estimation can be accomplished simultaneously.

It should be noted in closing that more general models [...] possible dependent paths must be laboriously traced back through the chain, as in Hamilton (1989), who analyzed a simple [...] obtained for independent data (see Lindgren 1978) do not carry over easily to the dependent data case given here.

APPENDIX: PSEUDO-EM ALGORITHM

We give details for the pseudo-EM algorithm used in the first stages of the procedure for maximizing the incomplete-data log-likelihood (19). If we imagine the unobserved and observed components of the model (1) and (2) generated in the order x_0, x_1, A_1, y_1, x_2, A_2, y_2, ..., the complete-data log-likelihood can be written as

... + Σ_{t=1}^n Σ_{j=1}^m I(A_t = M_j) ln π_j(t) + ...
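The truncated display above is consistent with the standard complete-data log-likelihood of a Gaussian state-space model, with the indicator term appended for the switching probabilities. Under that assumption (our reconstruction, with Σ_0 denoting the initial state covariance; this is not the paper's display), the full expression would read, up to a constant:

```latex
\ln L_{X,A,Y}(\Theta)
  = -\tfrac{1}{2}\ln|\Sigma_0|
    - \tfrac{1}{2}(x_0-\mu)'\Sigma_0^{-1}(x_0-\mu)
    - \tfrac{n}{2}\ln|Q|
    - \tfrac{1}{2}\sum_{t=1}^{n}(x_t-\Phi x_{t-1})'Q^{-1}(x_t-\Phi x_{t-1})
    + \sum_{t=1}^{n}\sum_{j=1}^{m} I(A_t = M_j)\ln\pi_j(t)
    - \tfrac{n}{2}\ln|R|
    - \tfrac{1}{2}\sum_{t=1}^{n}\sum_{j=1}^{m} I(A_t = M_j)
      \,(y_t - M_j x_t)'R^{-1}(y_t - M_j x_t)
```

Taking conditional expectations of each term given the data, with the smoothed probabilities approximated by the filtered ones, yields the pseudo-EM updates described in Section 4.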