
Average Optimality for Continuous-Time Markov Decision Processes Under Weak Continuity Conditions

  • Yi Zhang (a1)

Abstract

This paper considers average optimality for a continuous-time Markov decision process with Borel state and action spaces and an arbitrarily unbounded nonnegative cost rate. The existence of a deterministic stationary optimal policy is proved under conditions that allow the following: the controlled process may be explosive, the transition rates are only weakly continuous, and the multifunction defining the admissible action spaces need be neither compact-valued nor upper semicontinuous.
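For readers unfamiliar with the criterion, the long-run expected average cost can be written as follows; the notation below is a standard formulation and is not quoted from the paper itself.

```latex
% Long-run expected average cost of a policy \pi starting from state x.
% Here c(x,a) \ge 0 is the cost rate and (x(t), a(t)) denotes the
% controlled state-action process; this notation is standard, not the
% paper's own.
\[
  \bar{V}(x,\pi) \;=\; \limsup_{T\to\infty} \frac{1}{T}\,
  \mathbb{E}_x^{\pi}\!\left[ \int_0^T c\bigl(x(t),a(t)\bigr)\,dt \right].
\]
% A deterministic stationary policy \pi^* is average optimal if
\[
  \bar{V}(x,\pi^*) \;=\; \inf_{\pi}\, \bar{V}(x,\pi)
  \quad \text{for every initial state } x.
\]
```

The paper's contribution is to establish the existence of such a policy $\pi^*$ without assuming bounded transition rates, compact action sets, or upper semicontinuity of the admissible-action multifunction.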


Corresponding author

Postal address: Department of Mathematical Sciences, University of Liverpool, Liverpool L69 7ZL, UK. Email address: yi.zhang@liv.ac.uk

