Part of Advances in Neural Information Processing Systems 1 (NIPS 1988)
David Ackley
ALVIS is a reinforcement-based connectionist architecture that learns associative maps in continuous multidimensional environments. The discovered locations of positive and negative reinforcements are recorded in "do be" and "don't be" subnetworks, respectively. The outputs of the subnetworks relevant to the current goal are combined and compared with the current location to produce an error vector. This vector is backpropagated through a motor-perceptual mapping network to produce an action vector that leads the system towards do-be locations and away from don't-be locations. ALVIS is demonstrated with a simulated robot posed a target-seeking task.
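The abstract's pipeline (combine subnetwork outputs, compare with the current location, backpropagate the error to obtain an action) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the subnetworks are replaced by plain lists of remembered points, the combination rule and the `repulsion` parameter are invented, and the motor-perceptual mapping network is stood in for by a fixed linear map `weights`, so that backpropagation reduces to multiplying the error by the transpose of that map.

```python
# Hypothetical sketch: names, formulas, and parameters are illustrative
# assumptions, not the ALVIS implementation.

def combine(location, do_be, dont_be, repulsion=0.5):
    """Blend remembered points into a desired location (assumed rule).

    The desired location is the mean of the do-be points, shifted
    further away from the mean of the don't-be points.
    """
    dim = len(location)

    def mean(points):
        return [sum(p[i] for p in points) / len(points) for i in range(dim)]

    attract = mean(do_be) if do_be else list(location)
    if not dont_be:
        return attract
    avoid = mean(dont_be)
    return [a + repulsion * (a - v) for a, v in zip(attract, avoid)]


def error_vector(desired, location):
    """Difference between where the system should be and where it is."""
    return [d - x for d, x in zip(desired, location)]


def backpropagate(error, weights):
    """Map a perceptual error back to motor space.

    Assumes a linear forward model: location_change = weights @ action,
    with one row per perceptual dimension and one column per motor
    dimension. The backpropagated signal is then weights^T @ error.
    """
    n_motor = len(weights[0])
    return [
        sum(weights[i][j] * error[i] for i in range(len(error)))
        for j in range(n_motor)
    ]


def act(location, do_be, dont_be, weights, repulsion=0.5):
    """One control step: combine, compare, backpropagate."""
    desired = combine(location, do_be, dont_be, repulsion)
    return backpropagate(error_vector(desired, location), weights)


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    # One rewarded point at (1, 0), one punished point at (0, 1).
    print(act([0.0, 0.0], [[1.0, 0.0]], [[0.0, 1.0]], identity))
```

In this toy setting the resulting action points toward the rewarded point and away from the punished one. A learned, nonlinear mapping network would replace the fixed `weights` matrix, with the action obtained from the gradient of the error with respect to the network's motor inputs.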