Countable state Markov decision processes with unbounded jump rates and discounted cost: optimality equation and approximations

  • H. Blok (a1) and F. M. Spieksma (a1)

Abstract

This paper considers Markov decision processes (MDPs) whose jump rates are unbounded as a function of the state. We are especially interested in structural properties of optimal policies and of the value function. A common method for deriving such properties is value iteration applied to the uniformised MDP. However, because the rates are unbounded, uniformisation is not possible, and so value iteration cannot be applied in the way we need. To circumvent this, one can perturb the MDP. We then need two results for the perturbed sequence of MDPs: (1) there exists a unique solution to the discounted cost optimality equation for each perturbation as well as for the original MDP; (2) if the perturbed sequence of MDPs converges in a suitable manner, then the associated optimal policies and value functions converge as well. We can model both the MDP and the perturbed MDPs as a collection of parametrised Markov processes. Both results are then essentially implied by certain continuity properties of the process as a function of the parameter. In this paper we deduce tight verifiable conditions that imply the necessary continuity properties. The most important of these are drift conditions that are strongly related to nonexplosiveness.
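To make the perturb-then-uniformise idea concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption and is not taken from the paper: the model (a single queue with arrival rate `lam`, an optional server with rate `mu`, and abandonments at rate `beta * x`, which is unbounded in the state `x`), the cost parameters, and the perturbation itself, which is a plain truncation of the state space to `0..N` and not necessarily the perturbation the authors analyse. After truncation the total jump rate is bounded, so the process can be uniformised and ordinary discounted value iteration applies.

```python
def truncated_value_iteration(N, lam=1.0, mu=1.5, beta=0.5, alpha=0.1,
                              hold=1.0, serve_cost=0.5,
                              tol=1e-10, max_iter=100_000):
    """Discounted value iteration on a hypothetical queue truncated to 0..N."""
    # Uniformisation constant: an upper bound on the total jump rate.
    # It is finite only because the state space has been truncated at N.
    Lam = lam + mu + beta * N
    gamma = Lam / (alpha + Lam)  # discount factor of the uniformised chain
    V = [0.0] * (N + 1)
    for _ in range(max_iter):
        V_new = []
        for x in range(N + 1):
            best = float("inf")
            for a in (0, 1):  # a = 1: server on, a = 0: server off
                up = lam if x < N else 0.0           # arrivals blocked at N
                down = beta * x + (mu * a if x > 0 else 0.0)
                stay = Lam - up - down               # fictitious self-jump
                cost = hold * x + serve_cost * a     # cost rate in state x
                expected = (up * V[min(x + 1, N)]
                            + down * V[max(x - 1, 0)]
                            + stay * V[x]) / Lam
                best = min(best, cost / (alpha + Lam) + gamma * expected)
            V_new.append(best)
        diff = max(abs(u - v) for u, v in zip(V_new, V))
        V = V_new
        if diff < tol:
            break
    return V


# Compare two truncation levels at the empty state.
V_small = truncated_value_iteration(20)
V_large = truncated_value_iteration(40)
print(abs(V_small[0] - V_large[0]))
```

In this toy model the value at the empty state barely changes between the two truncation levels, which is the kind of convergence of perturbed value functions the abstract refers to. The sketch only illustrates the behaviour numerically; it does not verify the drift conditions under which such convergence is guaranteed.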

Corresponding author

Postal address: Mathematisch Instituut, Leiden University, Postbus 9512, 2300 RA Leiden, The Netherlands.
Email address: blokh1@math.leidenuniv.nl
Email address: spieksma@math.leidenuniv.nl
