Trevor Darrell, Alex Pentland
We have developed a foveated gesture recognition system that runs in an unconstrained office environment with an active camera. Using vision routines previously implemented for an interactive environment, we determine the spatial location of salient body parts of a user and guide an active camera to obtain images of gestures or expressions. A hidden-state reinforcement learning paradigm is used to implement visual attention. The attention module selects targets to foveate based on the goal of successful recognition, and uses a new multiple-model Q-learning formulation. Given a set of target and distractor gestures, our system can learn where to foveate to maximally discriminate a particular gesture.
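As a rough illustration of how a learned attention policy might choose foveation targets, the sketch below uses the standard one-step tabular Q-learning update. It is not the multiple-model formulation referred to above, and it ignores hidden state entirely. The action names (`foveate_face`, `foveate_hand`), the single `start` state, and the reward rule (a reward of 1 when the foveated part is the one that discriminates the gesture) are hypothetical choices made only for this example.

```python
import random

# Hypothetical foveation targets; not taken from the paper.
ACTIONS = ["foveate_face", "foveate_hand"]


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """Standard one-step tabular Q-learning update.

    q maps (state, action) pairs to values; missing entries count as 0.
    next_state=None marks the end of an episode.
    """
    if next_state is None:
        best_next = 0.0
    else:
        best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def train(episodes=200, epsilon=0.2, seed=0):
    """Learn which body part to foveate in a toy one-step task.

    Assumption for this sketch: the target gesture differs from the
    distractors only in the hand, so foveating the hand earns reward 1.
    """
    rng = random.Random(seed)
    q = {}
    state = "start"
    for _ in range(episodes):
        # Epsilon-greedy choice between exploring and exploiting.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        reward = 1.0 if action == "foveate_hand" else 0.0
        q_update(q, state, action, reward, None, ACTIONS)
    return q


def greedy_action(q, state="start"):
    """Return the action with the highest learned value in a state."""
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
```

After training, `greedy_action(train())` selects the target that the toy reward marks as discriminative. A fuller treatment would have to account for state that is not directly observable, which this single-state sketch does not attempt.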