Discounted Cost Markov Decision Processes with a Constraint

Kazuyoshi Wakuta

doi:10.1017/S0269964800005131

Published online by Cambridge University Press: 27 July 2009

Affiliation: Nagaoka Technical College, 888 Nishikatakai, Nagaoka, Niigata 940, Japan

Abstract

We consider a discounted cost Markov decision process with a constraint. Relating this to a vector-valued Markov decision process, we prove that there exists a constrained optimal randomized semistationary policy if there exists at least one policy satisfying the constraint. Moreover, we present an algorithm that either finds a constrained optimal randomized semistationary policy or determines that no policy satisfies the given constraint.
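
For orientation, one common way to write a constrained discounted-cost problem of this kind is sketched below. The notation is an assumption and is not taken from the paper: $c$ and $d$ are the objective and constraint cost functions, $\beta \in (0,1)$ is the discount factor, $\mu$ is an initial distribution, $\alpha$ is the constraint bound, and $\Pi$ is the set of all policies.

```latex
\begin{aligned}
\text{minimize}\quad & C(\pi) = \mathbb{E}^{\pi}_{\mu}\Bigl[\sum_{t=0}^{\infty} \beta^{t}\, c(x_t, a_t)\Bigr] \\
\text{subject to}\quad & D(\pi) = \mathbb{E}^{\pi}_{\mu}\Bigl[\sum_{t=0}^{\infty} \beta^{t}\, d(x_t, a_t)\Bigr] \le \alpha, \qquad \pi \in \Pi .
\end{aligned}
```

Under this reading, the pair $(C(\pi), D(\pi))$ can be viewed as the value of a two-component vector-valued Markov decision process, which is presumably the connection the abstract refers to.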

Type: Research Article
Information: Probability in the Engineering and Informational Sciences, Volume 12, Issue 2, April 1998, pp. 177-187

DOI: https://doi.org/10.1017/S0269964800005131
Copyright: © Cambridge University Press 1998
