The Expected Total Cost Criterion for Markov Decision Processes under Constraints

Published online by Cambridge University Press: 04 January 2016

François Dufour*
Affiliation:
Université Bordeaux, IMB and INRIA Bordeaux Sud-Ouest
A. B. Piunovskiy**
Affiliation:
University of Liverpool
* Postal address: Team CQFD, INRIA Bordeaux Sud-Ouest, 200 Avenue de la Vieille Tour, 33405 Talence cedex, France. Email address: dufour@math.u-bordeaux1.fr
** Postal address: Department of Mathematical Sciences, University of Liverpool, Liverpool L69 7ZL, UK. Email address: piunov@liverpool.ac.uk

Abstract

In this work we study discrete-time Markov decision processes (MDPs) with constraints, in which all the objectives take the same form: an expected total cost over the infinite time horizon. We analyze this problem using the linear programming approach. Under some technical hypotheses, we show that if the associated linear program admits an optimal solution, then there exists a randomized stationary policy that is optimal for the MDP, and that the optimal value of the linear program coincides with the optimal value of the constrained control problem. A second important result states that the set of randomized stationary policies is a sufficient set for solving this MDP. In contrast with the classical results in the literature, we do not assume the MDP to be transient or absorbing and, more importantly, we do not require the cost functions to be nonnegative or bounded below. Several examples are presented to illustrate our results.
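For orientation, the following is a minimal sketch of the occupation-measure linear program classically associated with a constrained total-cost MDP. The notation (state space X, action space A, transition kernel P, initial distribution γ, cost functions c_0, …, c_N, constraint bounds d_1, …, d_N, occupation measure μ) is generic and is not taken from the paper; the abstract indicates that the paper's contribution is precisely to justify this type of program without the usual transience, absorption, or nonnegativity assumptions.

\begin{align*}
\text{minimize over measures } \mu \ge 0: \quad & \int_{X \times A} c_0 \,\mathrm{d}\mu \\
\text{subject to} \quad & \int_{X \times A} c_k \,\mathrm{d}\mu \le d_k, \qquad k = 1, \dots, N, \\
& \mu(\Gamma \times A) = \gamma(\Gamma) + \int_{X \times A} P(\Gamma \mid x, a)\, \mu(\mathrm{d}x, \mathrm{d}a) \quad \text{for every measurable } \Gamma \subseteq X.
\end{align*}

If this program has an optimal solution μ*, disintegrating it as μ*(dx, da) = \hat{\mu}*(dx) π*(da | x) yields a stochastic kernel π*, and the abstract's first result asserts that the corresponding randomized stationary policy is optimal for the constrained MDP, with the same value as the linear program.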

Type
General Applied Probability
Copyright
© Applied Probability Trust 
