All Learning is Local: Multi-agent Learning in Global Reward Games

Yu-han Chang, Tracey Ho, Leslie P. Kaelbling

Advances in Neural Information Processing Systems 16 (NIPS 2003)

In large multiagent games, partial observability, coordination, and credit assignment persistently plague attempts to design good learning algorithms. We provide a simple and efficient algorithm that in part uses a linear system to model the world from a single agent’s limited perspective, and takes advantage of Kalman filtering to allow an agent to construct a good training signal and learn an effective policy.
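
The abstract does not spell out the linear system, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that the observed global reward is the agent's own state-dependent reward plus a single random-walk term standing in for everything else in the world, and that a standard Kalman filter tracks both. The function name `kalman_reward_filter`, its parameters (`drift_var`, `obs_var`, `prior_var`), and the toy data are all hypothetical. Under this assumed model, only differences between the per-state reward estimates are identifiable, because a constant can be shifted between the rewards and the drift term.

```python
import numpy as np

def kalman_reward_filter(states, global_rewards, n_states,
                         drift_var=1.0, obs_var=1e-2, prior_var=10.0):
    """Hypothetical sketch: estimate per-state personal rewards from a global reward.

    Assumed model (not taken from the abstract):
        hidden state x = [r(0), ..., r(n_states - 1), b]
        b follows a random walk with variance drift_var per step
        observed global reward g_t = r(s_t) + b_t (+ small noise obs_var)
    """
    dim = n_states + 1
    x = np.zeros(dim)                  # current estimate of [r, b]
    P = prior_var * np.eye(dim)        # estimate covariance
    Q = np.zeros((dim, dim))
    Q[-1, -1] = drift_var              # only the drift term b changes over time
    I = np.eye(dim)
    signals = []
    for s, g in zip(states, global_rewards):
        # Predict: rewards are static, b diffuses.
        P = P + Q
        # Observation row selects r(s) and b.
        C = np.zeros(dim)
        C[s] = 1.0
        C[-1] = 1.0
        S = C @ P @ C + obs_var        # innovation variance (scalar)
        K = P @ C / S                  # Kalman gain (vector)
        x = x + K * (g - C @ x)
        # Joseph-form covariance update for numerical stability.
        A = I - np.outer(K, C)
        P = A @ P @ A.T + obs_var * np.outer(K, K)
        # Training signal: global reward with the estimated drift removed.
        signals.append(g - x[-1])
    return x, np.array(signals)

# Toy data (fabricated purely for illustration).
rng = np.random.default_rng(0)
true_r = np.array([0.0, 1.0, 5.0])
T = 3000
states = rng.integers(0, 3, size=T)
b = np.cumsum(rng.normal(0.0, 1.0, size=T))   # random-walk "rest of the world"
g = true_r[states] + b
x_hat, signals = kalman_reward_filter(states, g, n_states=3)
```

In this sketch the filtered quantity `g - x[-1]` could serve as the reward input to an ordinary single-agent learner such as Q-learning; that pairing is an assumption here, since the abstract says only that the filter helps the agent construct a good training signal.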