Provably efficient multi-task reinforcement learning with model transfer

Part of Advances in Neural Information Processing Systems 34 pre-proceedings (NeurIPS 2021)

Paper Supplemental

Bibtek download is not available in the pre-proceeding


Chicheng Zhang, Zhi Wang


We study multi-task reinforcement learning (RL) in tabular episodic Markov decision processes (MDPs). We formulate a heterogeneous multi-player RL problem, in which a group of players concurrently face similar but not necessarily identical MDPs, with a goal of improving their collective performance through inter-player information sharing. We design and analyze a model-based algorithm, and provide gap-dependent and gap-independent regret upper and lower bounds that characterize the intrinsic complexity of the problem.