Fredrik Bissmarck, Hiroyuki Nakahara, Kenji Doya, Okihide Hikosaka
Motor control depends on sensory feedback in multiple modalities with different latencies. In this paper we consider within the framework of re- inforcement learning how different sensory modalities can be combined and selected for real-time, optimal movement control. We propose an actor-critic architecture with multiple modules, whose output are com- bined using a softmax function. We tested our architecture in a simu- lation of a sequential reaching task. Reaching was initially guided by visual feedback with a long latency. Our learning scheme allowed the agent to utilize the somatosensory feedback with shorter latency when the hand is near the experienced trajectory. In simulations with different latencies for visual and somatosensory feedback, we found that the agent depended more on feedback with shorter latency.