Bias and Overtaking Optimality for Continuous-Time Jump Markov Decision Processes in Polish Spaces

Quanxin Zhu; Tomás Prieto-Rumeau

doi:10.1239/jap/1214950357


Part of: Stochastic systems and control

Published online by Cambridge University Press: 14 July 2016


Quanxin Zhu*, Affiliation: South China Normal University
Tomás Prieto-Rumeau*, Affiliation: Universidad Nacional de Educación a Distancia
* Research partially supported by the Natural Science Foundation of China (10626021), the Natural Science Foundation of Guangdong Province (06300957), and CONACYT grant 45693-F.


Abstract


In this paper we study the bias and the overtaking optimality criteria for continuous-time jump Markov decision processes in general state and action spaces. The corresponding transition rates are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. Under appropriate hypotheses, we prove the existence of solutions to the bias optimality equations, the existence of bias optimal policies, and an equivalence relation between bias and overtaking optimality.
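
For orientation, the criteria named in the abstract can be written out as follows. This is a sketch using the standard definitions from the continuous-time Markov decision process literature; the notation ($r$ for the reward rate, $x_t$ for the state process, $J_T$, $g$, $h_f$) is assumed here and is not taken from the paper, whose exact hypotheses and policy classes may differ.

```latex
% Requires amsmath. Assumed notation (not from the paper):
% r = reward rate, x_t = state at time t,
% E_x^\pi = expectation under policy \pi starting from state x.
\begin{align*}
  J_T(x,\pi) &= E_x^{\pi}\!\left[\int_0^T r(x_t,\pi_t)\,dt\right]
    && \text{(expected reward over } [0,T]) \\
  J(x,\pi) &= \liminf_{T\to\infty} \frac{1}{T}\,J_T(x,\pi)
    && \text{(expected average reward)} \\
  h_f(x) &= \int_0^{\infty} \bigl( E_x^{f}[\,r(x_t,f)\,] - g(f) \bigr)\,dt
    && \text{(bias of a stationary policy } f \text{ with constant gain } g(f))
\end{align*}
% f^* is bias optimal if it is average optimal and h_{f^*}(x) >= h_f(x)
% for every average optimal stationary policy f and every state x.
% \pi^* is overtaking optimal if, for every policy \pi and every state x,
\[
  \liminf_{T\to\infty} \bigl[ J_T(x,\pi^*) - J_T(x,\pi) \bigr] \ge 0 .
\]
```

Under suitable ergodicity conditions one expects $J_T(x,f) = T\,g(f) + h_f(x) + o(1)$ as $T \to \infty$, which is the usual heuristic for why maximizing the bias among average optimal policies is linked to overtaking optimality. The precise statement proved in the paper is not reproduced on this page.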

Keywords

Continuous-time jump Markov decision process; expected average reward criterion; general state space; bias optimality; overtaking optimality

MSC classification

Secondary: 90C40 (Markov and semi-Markov decision processes); 93E20 (Optimal stochastic control)

Type: Research Article
Information: Journal of Applied Probability, Volume 45, Issue 2, June 2008, pp. 417–429

DOI: https://doi.org/10.1239/jap/1214950357
Copyright: © Applied Probability Trust 2008

