Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction
- Part I Stochastic Models and Bayesian Filtering
- Part II Partially Observed Markov Decision Processes: Models and Applications
- Part III Partially Observed Markov Decision Processes: Structural Results
- Part IV Stochastic Approximation and Reinforcement Learning
- Appendix A Short primer on stochastic simulation
- Appendix B Continuous-time HMM filters
- Appendix C Markov processes
- Appendix D Some limit theorems
- References
- Index
Appendix A - Short primer on stochastic simulation
Published online by Cambridge University Press: 05 April 2016
Summary
The main use of stochastic simulation in this book was in Parts II, III and IV for simulation-based gradient estimation and stochastic optimization. Simulation was also used in Chapter 3 for particle filters. This appendix presents some elementary background material in stochastic simulation that is of relevance to filtering, POMDPs and reinforcement learning. Our coverage is necessarily incomplete and only scratches the surface of a vast and growing area. The books [284, 280] are accessible treatments of stochastic simulation for an engineering audience.
Assume that uniformly distributed pseudo random numbers u ∼ U[0, 1] can be generated efficiently, where U[0, 1] denotes the uniform probability density function with support from 0 to 1 (“pseudo random” since a computer is a deterministic device). For example, the Matlab command rand(n) generates an n×n matrix where each element is U[0, 1] and statistically independent of other elements. Starting with U[0, 1] random numbers, the aim is to generate samples of random variables and random processes with specified distributions.
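As a sketch of the starting point above, the NumPy analogue of the Matlab `rand(n)` call generates the same kind of n × n matrix of independent U[0, 1] pseudo-random numbers (the seed value here is an illustrative choice, not from the text):

```python
import numpy as np

# NumPy analogue of Matlab's rand(n): an n-by-n matrix whose entries
# are independent U[0, 1] pseudo-random numbers.
rng = np.random.default_rng(seed=0)  # fixed seed for reproducibility
n = 3
u = rng.random((n, n))               # each entry is u ~ U[0, 1]
```

Seeding the generator makes the "pseudo random" nature of the stream explicit: rerunning with the same seed reproduces the same matrix.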
An important motivation for stochastic simulation stems from computing multidimensional integrals efficiently via Monte Carlo methods. Given a function ϕ : ℝ^X → ℝ and a pdf p(·) having support on ℝ^X, the multi-dimensional integral can be expressed as

∫_{ℝ^X} ϕ(x) p(x) dx = E_p{ϕ(x)},

where E_p denotes expectation with respect to p. By simulating independent and identically distributed (i.i.d.) samples {x_k}, k = 1, …, N, from the pdf p(·), classical Monte Carlo methods compute the above integral approximately as

(1/N) ∑_{k=1}^{N} ϕ(x_k).
By the strong law of large numbers (see Appendix D), for large N, one would expect the approximation to be accurate. The logic is that direct computation of the integral via deterministic methods can be difficult, whereas the Monte Carlo method can be implemented efficiently by proper choice of p(·).
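A minimal sketch of the classical Monte Carlo estimator: the choices ϕ(x) = x² and p = standard normal are illustrative assumptions (not from the text), for which the true value of the integral is Var(x) = 1.

```python
import numpy as np

# Monte Carlo estimate of E_p{phi(x)} = integral of phi(x) p(x) dx.
# Illustrative choices: phi(x) = x^2, p = standard normal pdf,
# so the exact answer is 1.
rng = np.random.default_rng(seed=1)
N = 200_000
x = rng.standard_normal(N)    # i.i.d. samples x_k ~ p
estimate = np.mean(x ** 2)    # (1/N) * sum_k phi(x_k)
```

By the strong law of large numbers the estimate concentrates around 1 as N grows; its standard deviation shrinks at the rate O(1/√N).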
In classical Monte Carlo methods, the samples {xk} are generated i.i.d. In the last 25 years, there have been significant advances in Markov chain Monte Carlo (MCMC) methods where, in order to evaluate the above integral, {xk} is generated according to a geometrically ergodic Markov chain whose stationary distribution is p(·).
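One widely used MCMC variant is the random-walk Metropolis algorithm; the sketch below is a minimal illustration, with the target density, step size and burn-in length all illustrative assumptions rather than choices from the text. The chain's stationary distribution is the target pdf p.

```python
import numpy as np

def metropolis(log_p, x0, n_samples, step=1.0, seed=2):
    """Random-walk Metropolis sampler for a scalar target log-density."""
    rng = np.random.default_rng(seed)
    x, lp = x0, log_p(x0)
    samples = []
    for _ in range(n_samples):
        prop = x + step * rng.standard_normal()   # symmetric proposal
        lp_prop = log_p(prop)
        # Accept with probability min(1, p(prop) / p(x)).
        if np.log(rng.random()) < lp_prop - lp:
            x, lp = prop, lp_prop
        samples.append(x)
    return np.array(samples)

# Illustrative target: standard normal, log p(x) = -x^2/2 + const.
chain = metropolis(lambda x: -0.5 * x * x, x0=0.0, n_samples=50_000)
burned = chain[5_000:]   # discard burn-in before averaging
```

Averaging ϕ(x_k) over the (dependent) chain samples still approximates E_p{ϕ(x)}, by the ergodic theorem for geometrically ergodic chains.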
Simulation of random variables
Assuming an algorithm is available for generating uniform U[0, 1] random numbers, we describe below three elementary methods for simulating random variables, namely, the inverse transform method, the acceptance rejection method and the composition method.
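The first of these, the inverse transform method, rests on the fact that if u ∼ U[0, 1] and F is a cdf with inverse F⁻¹, then x = F⁻¹(u) has cdf F. A minimal sketch, where the exponential target with rate 2 is an illustrative assumption: for Exp(rate), F(x) = 1 − e^{−rate·x}, so F⁻¹(u) = −ln(1 − u)/rate.

```python
import numpy as np

# Inverse transform method: x = F^{-1}(u) with u ~ U[0, 1] has cdf F.
# Illustrative target: Exponential(rate), F^{-1}(u) = -ln(1 - u) / rate.
rng = np.random.default_rng(seed=3)
rate = 2.0
u = rng.random(100_000)        # u ~ U[0, 1]
x = -np.log1p(-u) / rate       # x ~ Exponential(rate); log1p(-u) = ln(1 - u)
```

The sample mean of x should be close to the true mean 1/rate; the method applies whenever F⁻¹ is available in closed form or is cheap to evaluate numerically.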
Partially Observed Markov Decision Processes: From Filtering to Controlled Sensing, pp. 428–441. Publisher: Cambridge University Press. Print publication year: 2016.