
Policy Improvement and the Newton-Raphson Algorithm

Published online by Cambridge University Press:  27 July 2009

P. Whittle
Affiliation:
Statistical Laboratory, University of Cambridge
N. Komarova
Affiliation:
All-Union Correspondence Polytechnic Institute, Moscow, USSR

Abstract

We show that the calculation of the infinite-horizon value function for a linear/quadratic Markov decision process by policy improvement is exactly equivalent to solution of the equilibrium Riccati equation by the Newton-Raphson method. The assertion extends to risk-sensitive and non-Markov formulations, and thus shows, for example, that the Newton-Raphson method provides an iterative algorithm for the canonical factorization of operators which exhibits second-order convergence and has a variational basis.
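As a concrete illustration (not taken from the paper, which treats the general risk-sensitive and non-Markov settings), the sketch below works in the standard discrete-time linear/quadratic regulator case, where the equivalence is easy to observe numerically: evaluating a linear policy u = -Kx amounts to solving a Lyapunov equation, improving the policy re-minimizes against the resulting value matrix, and the sequence of value matrices converges to the solution of the discrete algebraic Riccati equation. The system matrices A, B, Q, R below are arbitrary illustrative choices, not from the paper.

```python
# Minimal sketch: policy improvement for a discrete-time LQ regulator,
# whose value-matrix iterates coincide with Newton-Raphson applied to the
# discrete algebraic Riccati equation.  Matrices are illustrative assumptions.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov, solve_discrete_are

A = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.8, 0.2],
              [0.1, 0.0, 0.7]])      # stable dynamics, so K = 0 is a valid starting policy
B = np.array([[1.0], [0.0], [0.5]])
Q = np.eye(3)                        # state cost
R = np.eye(1)                        # control cost

P_star = solve_discrete_are(A, B, Q, R)   # fixed point of the Riccati equation

K = np.zeros((1, 3))                      # initial stabilizing policy u = -K x
for it in range(8):
    F = A - B @ K
    # Policy evaluation: P solves P = F' P F + Q + K' R K (a Lyapunov equation).
    P = solve_discrete_lyapunov(F.T, Q + K.T @ R @ K)
    print(f"iteration {it}: ||P - P*|| = {np.linalg.norm(P - P_star):.3e}")
    # Policy improvement: minimizing gain against the current value matrix P.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
```

Printing the distance to the Riccati fixed point at each step makes the second-order convergence visible: once the iterates are close, the error is roughly squared at every iteration, as expected of a Newton-Raphson scheme.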

Type
Articles
Copyright
Copyright © Cambridge University Press 1988

