Gerald Tesauro, Terrence J. Sejnowski
We describe a class of connectionist networks that have learned to play backgammon at an intermediate-to-advanced level. The networks were trained by a supervised learning procedure on a large set of sample positions evaluated by a human expert. In actual match play against humans and conventional computer programs, the networks demonstrate substantial ability to generalize on the basis of expert knowledge. Our study touches on some of the most important issues in network learning theory, including the development of efficient coding schemes and training procedures, scaling, generalization, the use of real-valued inputs and outputs, and techniques for escaping from local minima. Practical applications in games and other domains are also discussed.