
The Expected Total Cost Criterion for Markov Decision Processes under Constraints

  • François Dufour∗ and A. B. Piunovskiy∗∗

Abstract

In this work, we study discrete-time Markov decision processes (MDPs) with constraints when all the objectives have the same form of expected total cost over the infinite time horizon. Our aim is to analyze this problem using the linear programming approach. Under some technical hypotheses, it is shown that if there exists an optimal solution for the associated linear program, then there exists a randomized stationary policy which is optimal for the MDP, and that the optimal value of the linear program coincides with the optimal value of the constrained control problem. A second important result states that the set of randomized stationary policies provides a sufficient set for solving this MDP. It is important to note that, in contrast with the classical results in the literature, we do not assume the MDP to be transient or absorbing. More importantly, we do not require the cost functions to be nonnegative or bounded below. Several examples are presented to illustrate our results.
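The linear programming approach mentioned in the abstract can be illustrated on a toy finite constrained MDP (a minimal sketch; the state space, costs, and constraint bound below are invented for illustration and are not taken from the paper). The LP variables x(s, a) are expected occupation measures, and a randomized stationary policy is recovered as π(a | s) ∝ x(s, a):

```python
# Hedged sketch: occupation-measure LP for a tiny constrained MDP.
# One transient state s0 with two actions, each leading to an absorbing state:
#   action "fast": cost 1, constraint-cost 2
#   action "slow": cost 3, constraint-cost 0
# Constraint: expected total constraint-cost <= 1.
# Variables x = (x_fast, x_slow) are expected occupation measures of (s0, a).
from scipy.optimize import linprog

c = [1.0, 3.0]        # objective cost per (state, action) pair
A_ub = [[2.0, 0.0]]   # constraint-cost row: 2 * x_fast <= 1
b_ub = [1.0]
A_eq = [[1.0, 1.0]]   # flow balance at s0: total occupation mass = initial mass 1
b_eq = [1.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None), (0, None)])

x_fast, x_slow = res.x
total = x_fast + x_slow
# Recover the randomized stationary policy from the occupation measure.
policy = {"fast": x_fast / total, "slow": x_slow / total}
print(res.fun)   # optimal value 2.0
print(policy)    # {'fast': 0.5, 'slow': 0.5}
```

Note that the optimal occupation measure splits its mass between both actions, so the optimal policy is genuinely randomized, consistent with the abstract's statement that randomized stationary policies form a sufficient class for this problem.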



Corresponding author

∗ Postal address: Team CQFD, INRIA Bordeaux Sud-Ouest, 200 Avenue de la Vieille Tour, 33405 Talence cedex, France. Email address: dufour@math.u-bordeaux1.fr
∗∗ Postal address: Department of Mathematical Sciences, University of Liverpool, Liverpool, L69 7ZL, UK. Email address: piunov@liverpool.ac.uk

