Abstract
We show how to improve the efficiency of Markov Chain Monte Carlo (MCMC) simulations in dynamic mixture models by block-sampling the discrete latent variables. Two algorithms are proposed: the first is a multi-move extension of the single-move Gibbs sampler devised by Gerlach, Carter and Kohn (in J. Am. Stat. Assoc. 95, 819–828, 2000); the second is an adaptive Metropolis-Hastings scheme that performs well even when the number of discrete states is large. Three empirical examples illustrate the gain in efficiency achieved. We also show that visual inspection of the sample partial autocorrelations of the discrete latent variables helps to anticipate whether blocking will be effective.
References
Andrieu, C., Moulines, E.: On the ergodicity properties of some adaptive MCMC algorithms. Ann. Appl. Probab. 16(3), 1462–1505 (2006)
Atchadé, Y., Rosenthal, J.: On adaptive Markov chain Monte Carlo algorithms. Bernoulli 11(5), 815–828 (2005)
Bauwens, L., Lubrano, M., Richard, J.: Bayesian Inference in Dynamic Econometric Models. Oxford University Press, Oxford (1999)
Besag, J., Green, P.J., Higdon, D., Mengersen, K.: Bayesian computation and stochastic systems (with discussion). Stat. Sci. 10, 3–66 (1995)
Carter, C., Kohn, R.: On Gibbs sampling for state space models. Biometrika 81, 541–553 (1994)
Carter, C., Kohn, R.: Semiparametric Bayesian inference for time series with mixed spectra. J. R. Stat. Soc. B 59(1), 255–268 (1997)
Casella, G., George, E.I.: Explaining the Gibbs sampler. Am. Stat. 46(3), 167–174 (1992)
Chib, S.: Calculating posterior distributions and modal estimates in Markov mixture models. J. Econom. 75, 79–97 (1996)
Engel, C., Kim, C.-J.: The long-run US/UK real exchange rate. J. Money Credit Bank. 31(3), 335–356 (1999)
Fama, E.F., French, K.R.: Common risk factors in the returns on stocks and bonds. J. Financ. Econ. 33(1), 3–56 (1993)
Fiorentini, G., Sentana, E., Shephard, N.: Likelihood-based estimation of latent generalized ARCH structures. Econometrica 72(5), 1481–1517 (2004)
Frühwirth-Schnatter, S.: Data augmentation and dynamic linear models. J. Time Ser. Anal. 15(2), 183–202 (1994)
Frühwirth-Schnatter, S.: Finite Mixture and Markov Switching Models. Springer, New York (2006)
Gerlach, R., Carter, C., Kohn, R.: Efficient Bayesian inference for dynamic mixture models. J. Am. Stat. Assoc. 95, 819–828 (2000)
Giordani, P., Kohn, R.: Efficient Bayesian inference for multiple change-point and mixture innovation models. J. Bus. Econ. Stat. 26(1), 66–77 (2008)
Giordani, P., Kohn, R.: Adaptive independent Metropolis-Hastings by fast estimation of mixtures of normals. J. Comput. Graph. Stat. 19(2), 243–259 (2010)
Giordani, P., Kohn, R., van Dijk, D.: A unified approach to nonlinearity, structural change, and outliers. J. Econom. 137, 112–133 (2007)
Haario, H., Saksman, E., Tamminen, J.: An adaptive Metropolis algorithm. Bernoulli 7, 223–242 (2001)
Hamilton, J.D.: A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57(2), 357–384 (1989)
Kim, C.-J.: Dynamic linear models with Markov-switching. J. Econom. 60, 1–22 (1994)
Kim, C.-J., Nelson, C.R.: State-Space Models with Regime Switching: Classical and Gibbs Sampling Approaches with Applications. MIT Press, Cambridge (1999)
Kim, C.-J., Shephard, N., Chib, S.: Stochastic volatility: likelihood inference and comparison with ARCH models. Rev. Econ. Stud. 65(3), 361–393 (1998)
Koopman, S.: Exact initial Kalman filtering and smoothing for nonstationary time series models. J. Am. Stat. Assoc. 92(440), 1630–1638 (1997)
Liu, J.S., Wong, W.H., Kong, A.: Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika 81(1), 27–40 (1994)
Nott, D.J., Kohn, R.: Adaptive sampling for Bayesian model selection. Biometrika 92(4), 747–763 (2005)
Omori, Y., Chib, S., Shephard, N., Nakajima, J.: Stochastic volatility with leverage: fast and efficient likelihood inference. J. Econom. 140, 425–449 (2007)
Roberts, G.O., Rosenthal, J.: Coupling and ergodicity of adaptive MCMC. J. Appl. Probab. 44, 458–475 (2007)
Roberts, G.O., Rosenthal, J.: Examples of adaptive MCMC. J. Comput. Graph. Stat. 18(2), 349–367 (2009)
Roberts, G.O., Sahu, S.K.: Updating schemes, correlation structure, blocking and parameterization for the Gibbs sampler. J. R. Stat. Soc. B 59(2), 291–337 (1997)
Scott, S.L.: Bayesian methods for hidden Markov models: recursive computing in the 21st century. J. Am. Stat. Assoc. 97(457), 337–351 (2002)
Seewald, W.: Discussion on Parameterization issues in Bayesian inference (by S.E. Hills and A.F.M. Smith). In: Bernardo, J.M., Berger, J.O., Dawid, A.P., Smith, A.F.M. (eds.) Bayesian Statistics, vol. 4, pp. 241–243. Oxford University Press, Oxford (1992)
Shephard, N., Pitt, M.: Likelihood analysis of non-Gaussian measurement time series. Biometrika 84(3), 653–667 (1997)
Tierney, L.: Markov chains for exploring posterior distributions. Ann. Stat. 22(4), 1701–1762 (1994)
Timmermann, A.: Moments of Markov switching models. J. Econom. 96, 75–111 (2000)
Yang, M.: Some properties of vector autoregressive processes with Markov-switching coefficients. Econom. Theory 16, 23–43 (2000)
Acknowledgements
The authors are grateful to the participants of the Bayesian Econometrics workshop of the Rimini Centre for Economic Analysis, and to two anonymous referees for helpful comments. The ideas expressed here are those of the authors and do not necessarily reflect the position of the European Commission.
Appendix: Convergence of the adaptive MH algorithm
As above we denote by \(\mathbf{S}_{t,t+h-1}=(S_t,\ldots,S_{t+h-1})\) the random vector defined on the finite space ℵ that contains all possible paths of length h. Our aim is to show that the distribution of \(\{\mathbf{S}_{t,t+h-1}^{(n)}; n \ge 1 \}\) generated by the adaptive MH algorithm given in Sect. 2.4 with transition kernel:

\[
T_n(\mathbf{s}_{t,t+h-1},\overline{\mathbf{s}}_{t,t+h-1})
= \alpha_n(\mathbf{s}_{t,t+h-1},\overline{\mathbf{s}}_{t,t+h-1})\,\tilde{q}_n(\overline{\mathbf{s}}_{t,t+h-1})
+ \mathbb{I}(\mathbf{s}_{t,t+h-1}=\overline{\mathbf{s}}_{t,t+h-1})
\Bigl[1-\sum_{\mathbf{s}'_{t,t+h-1}\in\,\aleph}\alpha_n(\mathbf{s}_{t,t+h-1},\mathbf{s}'_{t,t+h-1})\,\tilde{q}_n(\mathbf{s}'_{t,t+h-1})\Bigr],
\qquad
\alpha_n(\mathbf{s}_{t,t+h-1},\overline{\mathbf{s}}_{t,t+h-1})
= \min\Bigl\{1,\ \frac{\pi(\overline{\mathbf{s}}_{t,t+h-1})\,\tilde{q}_n(\mathbf{s}_{t,t+h-1})}{\pi(\mathbf{s}_{t,t+h-1})\,\tilde{q}_n(\overline{\mathbf{s}}_{t,t+h-1})}\Bigr\},
\]
converges to the correct target distribution, say \(\pi_{\mathbf{w}}(\mathbf{s}_{t,t+h-1})=\pi(\mathbf{s}_{t,t+h-1})\), where the subscript w indicates the conditioning on \(\mathbf{w} \equiv (\mathbf{s}_{1,t-1},\mathbf{s}_{t+h}^{T},\boldsymbol{\theta},\mathbf{y})\), which is henceforth omitted. Here we focus on a generic block, since convergence in distribution of the full sequence \(\{\mathbf{S}^{(n)}; n\ge 1\}\) to \(\pi(\mathbf{S})\) is ensured by the results in Tierney (1994).
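One sweep of such an independent MH kernel (the proposal does not depend on the current path) can be sketched as follows. This is a purely illustrative toy, not the authors' code: the three-path target `pi` and the uniform proposal `q_tilde` are our own choices.

```python
# Illustrative sketch of an independent Metropolis-Hastings step for a
# block of discrete states: propose a path from q_tilde, accept with
# probability min(1, [pi(prop) q_tilde(cur)] / [pi(cur) q_tilde(prop)]).
import numpy as np

def independent_mh_step(current, pi, q_tilde, rng):
    proposal = rng.choice(len(q_tilde), p=q_tilde)
    ratio = (pi[proposal] * q_tilde[current]) / (pi[current] * q_tilde[proposal])
    if rng.random() < min(1.0, ratio):
        return proposal  # accept the proposed path
    return current       # reject: keep the current path

rng = np.random.default_rng(0)
pi = np.array([0.5, 0.3, 0.2])        # toy target over 3 possible paths
q_tilde = np.array([1/3, 1/3, 1/3])   # uniform proposal over the paths
state, counts = 0, np.zeros(3)
for _ in range(20000):
    state = independent_mh_step(state, pi, q_tilde, rng)
    counts[state] += 1
print(counts / counts.sum())  # empirical frequencies approach pi
```

The point of the adaptation below is to replace the fixed `q_tilde` by a sequence \(\tilde{q}_n\) that learns a good proposal while still converging to the same target.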
The adaptive proposal \(\tilde{q}_{n}(\mathbf{s}_{t,t+h-1})\) is such that:

\[
\tilde{q}_n(\mathbf{s}_{t,t+h-1}) = \delta\, q_0(\mathbf{s}_{t,t+h-1}) + (1-\delta)\, q_n(\mathbf{s}_{t,t+h-1}),
\]

where 0<δ<1, \(q_0(\cdot)\) is the uniform distribution on ℵ, and the adaptive component \(q_n(\mathbf{s}_{t,t+h-1})\) is detailed in (2.6), (2.8), and (2.10) with the recursions (2.14)–(2.15) for the marginal and transition probabilities. The proof consists of verifying that this adaptive scheme satisfies the sufficient conditions given in Giordani and Kohn (2010):

GK1: \(\pi(\mathbf{s}_{t,t+h-1})/\tilde{q}_n(\mathbf{s}_{t,t+h-1}) \le K\) and \(q_n(\mathbf{s}_{t,t+h-1})/\tilde{q}_n(\mathbf{s}_{t,t+h-1}) \le K\) for all n and all \(\mathbf{s}_{t,t+h-1}\in\aleph\);

GK2: \(\vert\tilde{q}_n(\mathbf{s}_{t,t+h-1}) - \tilde{q}_{n+1}(\mathbf{s}_{t,t+h-1})\vert \le K n^{-r}\) for all \(\mathbf{s}_{t,t+h-1}\in\aleph\),

for some constants K>0 and r>0.
Condition GK1 is immediate: since the distributions \(\pi(\mathbf{s}_{t,t+h-1})\) and \(q_n(\mathbf{s}_{t,t+h-1})\) are defined on the discrete space ℵ, they are bounded above by 1. Furthermore, the constant term \(q_0(\mathbf{s}_{t,t+h-1})\) equals \(1/N^h\), so \(\tilde{q}_n(\mathbf{s}_{t,t+h-1}) \ge \delta/N^h\) and the two ratios in GK1 are bounded above by \(N^h/\delta\).
Condition GK2 needs more attention. The weight that the adaptive component attaches to a given block is a function of the marginal and transition probabilities and can be written as:

\[
q_n(\mathbf{s}_{t,t+h-1})
= \frac{q_n(s_t)\prod_{k=1}^{h-1} q_n(s_{t+k}\mid s_{t+k-1})}
       {\sum_{\overline{\mathbf{s}}_{t,t+h-1}\in\,\aleph} q_n(\overline{s}_t)\prod_{k=1}^{h-1} q_n(\overline{s}_{t+k}\mid \overline{s}_{t+k-1})}.
\tag{A.1}
\]
We first show that the marginal and transition probabilities \(q_n(s_t)\) and \(q_n(s_{t+1}\mid s_t)\) involved in Eq. (A.1) settle down asymptotically. The recursion for the marginal probabilities (2.14) yields:

\[
q_n(s_t) - q_{n+1}(s_t) = O(n^{-1}),
\]

whereas the one for the transition probabilities (2.15) implies:

\[
q_n(s_{t+1}\mid s_t) - q_{n+1}(s_{t+1}\mid s_t) = \frac{O(n^{-1})}{q_{n+1}(s_t)}.
\]
Giordani and Kohn (2010, Lemma 1 in the Appendix) show that Condition GK1 implies uniform ergodicity, i.e. \(T_{n}(\mathbf{s}_{t,t+h-1},\overline{\mathbf{s}}_{t,t+h-1}) \geq \epsilon\, \pi(\overline{\mathbf{s}}_{t,t+h-1})\) for some ϵ>0. This ensures that \(q_n(s_t)\) is strictly positive asymptotically, and thus that \(q_n(s_{t+1}\mid s_t) - q_{n+1}(s_{t+1}\mid s_t) = O(n^{-1})\).
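The diminishing-adaptation behaviour of such recursions can be illustrated numerically. The running-average update below is a generic stand-in for recursions (2.14)–(2.15), which are not reproduced here; what it shares with them is the \(O(n^{-1})\) step size, which forces successive probability estimates to differ by at most \(1/(n+1)\).

```python
# Illustration (our own, with a generic running-average update standing
# in for recursions (2.14)-(2.15)): each new draw moves the estimated
# marginal by a weight of order 1/n, so successive estimates differ by
# O(1/n) and the adaptation diminishes.
import numpy as np

rng = np.random.default_rng(2)
N = 4                       # number of discrete states
q = np.full(N, 1.0 / N)     # initial marginal estimate q_1(s_t)
diffs = []
for n in range(1, 5001):
    draw = rng.integers(N)                  # stand-in for the sampled state
    indicator = np.eye(N)[draw]
    q_next = q + (indicator - q) / (n + 1)  # running-average update
    diffs.append(np.abs(q_next - q).max())
    q = q_next
# n * |q_n - q_{n+1}| stays bounded, i.e. the differences are O(1/n)
scaled = [n * d for n, d in enumerate(diffs, start=1)]
print(max(scaled) <= 1.0)
# prints: True
```

The bound follows because each component of \(|{\rm indicator} - q|\) is at most 1, so the step is at most \(1/(n+1)\).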
Let us write \(q_n(s_{t+k}\mid s_{t+k-1}) - q_{n+1}(s_{t+k}\mid s_{t+k-1}) = \epsilon_k\) with \(\epsilon_k = O(n^{-1})\), and let \(C_n\) denote the numerator of \(q_n(\mathbf{s}_{t,t+h-1})\) in Eq. (A.1). We have:

\[
C_n = q_n(s_t)\prod_{k=1}^{h-1} q_n(s_{t+k}\mid s_{t+k-1})
    = \bigl[q_{n+1}(s_t) + O(n^{-1})\bigr]\prod_{k=1}^{h-1}\bigl[q_{n+1}(s_{t+k}\mid s_{t+k-1}) + \epsilon_k\bigr],
\]

so \(C_n - C_{n+1} = \epsilon_C = O(n^{-1})\), since all factors lie in [0,1]. Let \(D_n\) denote the denominator of \(q_n(\mathbf{s}_{t,t+h-1})\) in (A.1). Since \(D_n\) is a finite sum over ℵ of terms of the same product form,

\[
D_n - D_{n+1} = \epsilon_D = O(n^{-1})
\]

as well. Hence:

\[
q_n(\mathbf{s}_{t,t+h-1}) - q_{n+1}(\mathbf{s}_{t,t+h-1})
= \frac{C_n}{D_n} - \frac{C_{n+1}}{D_{n+1}}
= \frac{C_n D_{n+1} - C_{n+1} D_n}{D_n D_{n+1}}
= O(n^{-1}),
\]

which proves Condition GK2.
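The final step, that \(O(n^{-1})\) perturbations of numerator and denominator translate into an \(O(n^{-1})\) change of the ratio, can be demonstrated with a deterministic toy sequence (our illustration; the constants and the alternating-sign drift are arbitrary, with the denominator kept away from zero):

```python
# Illustration (ours) of the last step of the proof: if C_n and D_n each
# move by O(1/n) per iteration and D_n stays bounded away from zero,
# then the ratio C_n/D_n also moves by O(1/n).
C, D = 1.0, 2.0
scaled = []
for n in range(1, 10001):
    dC = (-1) ** n * 0.3 / n   # C_n - C_{n+1} = O(1/n)
    dD = (-1) ** n * 0.7 / n   # D_n - D_{n+1} = O(1/n)
    C1, D1 = C + dC, D + dD
    scaled.append(n * abs(C / D - C1 / D1))
    C, D = C1, D1
print(max(scaled) < 1.0)  # n * |C_n/D_n - C_{n+1}/D_{n+1}| stays bounded
# prints: True
```

Boundedness of `scaled` is exactly the \(O(n^{-1})\) rate required by Condition GK2.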
Fiorentini, G., Planas, C. & Rossi, A. Efficient MCMC sampling in dynamic mixture models. Stat Comput 24, 77–89 (2014). https://doi.org/10.1007/s11222-012-9354-4