
Strongly consistent estimation in a controlled Markov renewal model

Published online by Cambridge University Press:  14 July 2016

Michael Kolonko*
Affiliation:
Universität Karlsruhe
*
Postal address: Institut für Mathematische Statistik, Universität Karlsruhe, Englerstrasse 2, 7500 Karlsruhe, W. Germany.

Abstract

The optimal control of dynamic models which are not completely known to the controller often requires some kind of estimation of the unknown parameters. We present conditions under which a minimum contrast estimator will be strongly consistent independently of the control used. This kind of estimator is appropriate for the adaptive or 'estimation and control' approach in dynamic programming under uncertainty. We consider a countable-state Markov renewal model and we impose bounding and recurrence conditions of the so-called Liapunov type.
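The 'estimation and control' idea can be sketched with a toy example. The sketch below is not the paper's countable-state Markov renewal model: it uses a hypothetical two-state controlled Markov chain whose transition law depends on an unknown parameter theta through an assumed logistic link, and takes the negative log-likelihood as the contrast function (one standard choice of minimum contrast estimator). All function names and the transition model are illustrative.

```python
import math
import random


def trans_prob(theta, state, action):
    """Probability of moving to state 1 (hypothetical logistic transition law)."""
    z = theta + 0.5 * action - 0.5 * state
    return 1.0 / (1.0 + math.exp(-z))


def simulate(theta_true, policy, n_steps, seed=0):
    """Run the controlled chain; record (state, action, next_state) transitions."""
    rng = random.Random(seed)
    state, path = 0, []
    for _ in range(n_steps):
        action = policy(state)
        nxt = 1 if rng.random() < trans_prob(theta_true, state, action) else 0
        path.append((state, action, nxt))
        state = nxt
    return path


def contrast(theta, path):
    """Negative log-likelihood of the observed path under parameter theta."""
    total = 0.0
    for s, a, s_next in path:
        p1 = trans_prob(theta, s, a)
        p = p1 if s_next == 1 else 1.0 - p1
        total -= math.log(max(p, 1e-12))
    return total


def minimum_contrast_estimate(path, grid):
    """Minimise the contrast over a parameter grid; the path, not the policy, is the input."""
    return min(grid, key=lambda th: contrast(th, path))
```

The point of the abstract's consistency result mirrors what the sketch makes visible: the estimator is a function of the observed path alone, so the same procedure can be applied under any control; the paper's Liapunov-type bounding and recurrence conditions are what guarantee the chain is observed richly enough for the estimate to converge regardless of which control generated the data.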

Type
Research Papers
Copyright
Copyright © Applied Probability Trust 1982 

