Gradient estimation for smooth stopping criteria

Bernd Heidergott; Yijie Peng

doi:10.1017/apr.2022.7



Published online by Cambridge University Press: 15 June 2022


Bernd Heidergott*
Affiliation: Vrije Universiteit Amsterdam
Yijie Peng**
Affiliation: Peking University
*Postal address: Department of Operations Analytics, De Boelelaan 1105, 1081 HV Amsterdam. Email address: b.f.heidergott@vu.nl
**Postal address: Guanghua School of Management, 52 Haidian Rd, Beijing. Email address: pengyijie@pku.edu.cn


Abstract

We establish sufficient conditions for differentiability of the expected cost collected over a discrete-time Markov chain until it enters a given set. The parameter with respect to which differentiability is analysed may simultaneously affect the Markov chain and the set defining the stopping criterion. The general statements on differentiability lead to unbiased gradient estimators.
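
As a rough formalization of the quantity described in the abstract, the sketch below writes the expected cost and the unbiasedness property in symbols. The notation is an assumption made here for illustration and is not taken from the paper: $\theta$ is the parameter, $P_\theta$ the transition kernel of the chain $(X_n)$, $A_\theta$ the stopping set, $c$ a cost function, and $\tau_\theta$ the entrance time. Whether the cost at the entrance time itself is included in the sum is also an assumption.

```latex
% Hypothetical notation, introduced only to illustrate the abstract.
% Entrance time of the chain into the parameter-dependent set A_\theta:
\tau_\theta = \inf\{\, n \ge 0 : X_n \in A_\theta \,\}

% Expected cost collected until the chain enters A_\theta,
% where (X_n) evolves according to the kernel P_\theta:
J(\theta) = \mathbb{E}_\theta\!\left[ \sum_{n=0}^{\tau_\theta - 1} c(X_n) \right]

% An estimator D(\theta) of the gradient is unbiased if
\mathbb{E}\big[ D(\theta) \big] = \nabla_\theta J(\theta)
```

In this notation $\theta$ enters twice, through the law of the chain ($P_\theta$) and through the stopping set ($A_\theta$), which is the simultaneous dependence the abstract refers to.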

Keywords

Sensitivity analysis; Monte Carlo simulation; gradient estimation

MSC classification

Primary: 28A15: Abstract differentiation theory, differentiation of set functions

Secondary: 60J20: Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.)

Type: Original Article
Information: Advances in Applied Probability, Volume 55, Issue 1, March 2023, pp. 29-55

DOI: https://doi.org/10.1017/apr.2022.7
Copyright: © The Author(s), 2022. Published by Cambridge University Press on behalf of Applied Probability Trust


