Fast Learning with Predictive Forward Models

Part of Advances in Neural Information Processing Systems 4 (NIPS 1991)

Carlos Brody


A method for transforming performance evaluation signals distal both in space and time into proximal signals usable by supervised learning algorithms, presented in [Jordan & Jacobs 90], is examined. A simple observation concerning differentiation through models trained with redundant inputs (as one of their networks is) explains a weakness in the original architecture and suggests a modification: an internal world model that encodes action-space exploration and, crucially, cancels input redundancy to the forward model is added. Learning time on an example task, cart-pole balancing, is thereby reduced about 50 to 100 times.
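
The observation about differentiating through a model trained with redundant inputs can be illustrated with a toy sketch. It is not the paper's architecture or its cart-pole experiment. Everything in it is an assumption made for illustration: a hypothetical linear plant y = 2x + 3u, a deterministic controller u = 0.5x, and a two-weight linear forward model fitted by gradient descent (the names `true_next_state`, `make_samples`, and `fit_forward_model` are invented here). When the action is always a function of the state, the model can fit the data perfectly while its derivative with respect to the action is wrong. Adding exploration to the action removes the redundancy, and the derivative becomes usable.

```python
import random

def true_next_state(x, u):
    # Hypothetical plant, chosen only for illustration: dy/du is exactly 3.
    return 2.0 * x + 3.0 * u

def make_samples(explore, n=100, seed=0):
    """Collect (state, action, outcome) triples under a fixed controller."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        u = 0.5 * x  # deterministic controller: u is redundant given x
        if explore:
            u += rng.uniform(-1.0, 1.0)  # exploration breaks the redundancy
        samples.append((x, u, true_next_state(x, u)))
    return samples

def fit_forward_model(samples, steps=1000, lr=0.5):
    """Fit y ~ w_x*x + w_u*u by batch gradient descent from zero weights."""
    w_x, w_u = 0.0, 0.0
    n = len(samples)
    for _ in range(steps):
        g_x = g_u = 0.0
        for x, u, y in samples:
            err = w_x * x + w_u * u - y
            g_x += err * x / n
            g_u += err * u / n
        w_x -= lr * g_x
        w_u -= lr * g_u
    return w_x, w_u

def max_abs_error(weights, samples):
    """Largest prediction error of the fitted model on its training data."""
    w_x, w_u = weights
    return max(abs(w_x * x + w_u * u - y) for x, u, y in samples)

if __name__ == "__main__":
    for explore in (False, True):
        data = make_samples(explore)
        w = fit_forward_model(data)
        # w[1] is the model's estimate of dy/du, the quantity that
        # backpropagation through the forward model hands to the controller.
        print("explore =", explore,
              "| fit error =", max_abs_error(w, data),
              "| model dy/du =", w[1], "| true dy/du = 3.0")
```

In the redundant case the fit error goes to zero, yet the model's dy/du settles near 1.4 rather than 3: any weights satisfying w_x + 0.5 w_u = 3.5 explain the data, and gradient descent from zero picks the smallest such pair. With exploration the same procedure recovers a derivative close to 3.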