Sequential Decision Problems and Neural Networks

Part of Advances in Neural Information Processing Systems 2 (NIPS 1989)

Bibtex Metadata Paper

Authors

A. G. Barto, R. S. Sutton, C. J. C. H. Watkins

Abstract

Decision making tasks that involve delayed consequences are very common yet difficult to address with supervised learning methods. If there is an accurate model of the underlying dynamical system, then these tasks can be formulated as sequential decision problems and solved by Dynamic Programming. This paper discusses rein(cid:173) forcement learning in terms of the sequential decision framework and shows how a learning algorithm similar to the one implemented by the Adaptive Critic Element used in the pole-balancer of Barto, Sutton, and Anderson (1983), and further developed by Sutton (1984), fits into this framework. Adaptive neural networks can play significant roles as modules for approximating the functions required for solving sequential decision problems.