Forward-reverse expectation-maximization algorithm for Markov chains: convergence and numerical analysis

Published online by Cambridge University Press:  26 July 2018

Christian Bayer*
Affiliation:
Weierstrass Institute for Applied Analysis and Stochastics
Hilmar Mai**
Affiliation:
Deutsche Bank AG
John Schoenmakers*
Affiliation:
Weierstrass Institute for Applied Analysis and Stochastics
* Postal address: Weierstrass Institute for Applied Analysis and Stochastics, Mohrenstrasse 39, 10117 Berlin, Germany.
** Postal address: Deutsche Bank AG, Otto-Suhr-Allee 16, 10585 Berlin, Germany.

Abstract

We develop a forward-reverse expectation-maximization (FREM) algorithm for estimating parameters of a discrete-time Markov chain evolving through a certain measurable state-space. For the construction of the FREM method, we develop forward-reverse representations for Markov chains conditioned on a certain terminal state. We prove almost sure convergence of our algorithm for a Markov chain model with curved exponential family structure. On the numerical side, we carry out a complexity analysis of the forward-reverse algorithm by deriving its expected cost. Two application examples are discussed.

Type
Original Article
Copyright
Copyright © Applied Probability Trust 2018 
