Bayes-Adaptive Simulation-based Search with Value Function Approximation

Part of: Advances in Neural Information Processing Systems 27 (NIPS 2014)


Conference Event Type: Poster


Bayes-adaptive planning offers a principled solution to the exploration-exploitation trade-off under model uncertainty. It finds the optimal policy in belief space, which explicitly accounts for the expected effect on future rewards of reductions in uncertainty. However, the Bayes-adaptive solution is typically intractable in domains with large or continuous state spaces. We present a tractable method for approximating the Bayes-adaptive solution by combining simulation-based search with a novel value function approximation technique that generalises over belief space. Our method outperforms prior approaches in both discrete bandit tasks and simple continuous navigation and control tasks.
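
To make the notion of planning in belief space concrete, the following is a minimal sketch that computes the exact Bayes-adaptive value of a tiny Bernoulli bandit by recursing over Beta posterior beliefs. It is not the method of the paper, which relies on simulation-based search with value function approximation; the function name `bayes_adaptive_value`, the Beta(1, 1) priors, and the undiscounted finite horizon are all assumptions chosen for illustration. The number of reachable beliefs grows quickly with the horizon and the number of arms, which hints at why the exact solution becomes intractable in larger domains.

```python
from functools import lru_cache


def bayes_adaptive_value(horizon, prior=((1, 1), (1, 1))):
    """Expected total reward of the Bayes-optimal policy on a Bernoulli bandit.

    `prior` holds one (alpha, beta) pair of Beta parameters per arm
    (an illustrative assumption). The belief state is this tuple of counts.
    """

    @lru_cache(maxsize=None)
    def value(belief, steps_left):
        if steps_left == 0:
            return 0.0
        best = 0.0
        for arm, (a, b) in enumerate(belief):
            p = a / (a + b)  # posterior mean probability of a reward
            # Beliefs reached after observing a reward or no reward on this arm.
            win = belief[:arm] + ((a + 1, b),) + belief[arm + 1:]
            lose = belief[:arm] + ((a, b + 1),) + belief[arm + 1:]
            # Immediate reward plus the value of acting optimally from the
            # updated belief; the second part is what rewards exploration.
            q = p * (1.0 + value(win, steps_left - 1)) \
                + (1.0 - p) * value(lose, steps_left - 1)
            best = max(best, q)
        return best

    return value(tuple(tuple(ab) for ab in prior), horizon)


if __name__ == "__main__":
    for h in (1, 2, 5, 10):
        print(h, bayes_adaptive_value(h))
```

With two arms and uniform priors, a horizon of 2 yields a value of 13/12, which exceeds the 1.0 that an agent would expect if it ignored what the first pull teaches it. The gap is the value of the reduction in uncertainty that the belief-space policy accounts for.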