
Risk-sensitive semi-Markov decision processes with general utilities and multiple criteria

  • Yonghui Huang (a1), Zhaotong Lian (a2) and Xianping Guo (a1)

Abstract

In this paper we investigate risk-sensitive semi-Markov decision processes with a Borel state space, unbounded cost rates, and general utility functions. The performance criteria are multiple expected utilities of the total cost over a finite horizon. Our analysis is based on a type of finite-horizon occupation measure: for each policy we express the distribution of the finite-horizon cost in terms of the occupation measure, with no discounting required. For both unconstrained and constrained problems, we establish the existence of optimal policies and show how they can be computed. In particular, for the constrained problem we develop a linear program and its dual program, and we establish strong duality between the two. Finally, we present two special cases of our results, one concerning the discrete-time model and the other the chance-constrained problem.
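
To make the occupation-measure idea concrete, below is a minimal sketch, not taken from the paper, of the classical finite-horizon occupation-measure linear program for a finite-state, finite-action constrained MDP with a risk-neutral expected-cost objective, solved with scipy.optimize.linprog. The paper's program instead works with occupation measures on a Borel state space and with general utilities of the total cost; the toy model, the cost arrays c and d, the budget D, and all variable names are illustrative assumptions, and the code only illustrates optimizing over occupation measures subject to flow constraints.

import numpy as np
from scipy.optimize import linprog

S, A, T = 3, 2, 4                            # toy sizes: states, actions, horizon
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a, :] = transition distribution
c = rng.uniform(0.0, 1.0, size=(S, A))       # cost to minimize (in expectation)
d = rng.uniform(0.0, 1.0, size=(S, A))       # constraint cost
D = 3.0                                      # loose budget so the toy LP is feasible
mu0 = np.full(S, 1.0 / S)                    # initial state distribution

n = T * S * A                                # LP variables x[t, s, a]

def idx(t, s, a):
    return (t * S + s) * A + a

# Flow constraints characterizing finite-horizon occupation measures:
#   sum_a x[0, s, a] = mu0(s),
#   sum_a x[t, s', a] = sum_{s, a} P(s' | s, a) x[t-1, s, a]   for t >= 1.
A_eq = np.zeros((T * S, n))
b_eq = np.zeros(T * S)
for s in range(S):
    for a in range(A):
        A_eq[s, idx(0, s, a)] = 1.0
    b_eq[s] = mu0[s]
for t in range(1, T):
    for s2 in range(S):
        row = t * S + s2
        for a in range(A):
            A_eq[row, idx(t, s2, a)] = 1.0
        for s in range(S):
            for a in range(A):
                A_eq[row, idx(t - 1, s, a)] -= P[s, a, s2]

# Objective and single constraint are both linear in the occupation measure.
obj = np.array([c[s, a] for t in range(T) for s in range(S) for a in range(A)])
A_ub = np.array([[d[s, a] for t in range(T) for s in range(S) for a in range(A)]])
b_ub = np.array([D])

res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
print("LP status:", res.status, "optimal expected total cost:", res.fun)

In this finite, risk-neutral special case, strong duality between the program and its dual is just standard linear-programming duality; the paper's contribution is the analogous duality result in the general Borel, utility-based constrained setting.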

Corresponding author

* Postal address: School of Mathematics, Sun Yat-Sen University, Guangzhou, 510275, China.
** Email address: hyongh5@mail.sysu.edu.cn
*** Postal address: Faculty of Business Administration, University of Macau, Macau, China. Email address: lianzt@umac.mo
**** Email address: mcsgxp@mail.sysu.edu.cn

