Robust exploration in linear quadratic reinforcement learning

Umenberger, Jack; Ferizbegovic, Mina; Schön, Thomas B.; Hjalmarsson, Håkan

Advances in Neural Information Processing Systems 32 (NeurIPS 2019)

Abstract

Learning to make decisions in an uncertain and dynamic environment is a task of fundamental importance in a number of domains. This paper concerns the problem of learning control policies for an unknown linear dynamical system so as to minimize a quadratic cost function. We present a method, based on convex optimization, that accomplishes this task ‘robustly’, i.e., it minimizes the worst-case cost, accounting for the system uncertainty that remains given the observed data. The method balances exploitation and exploration, exciting the system in such a way as to reduce uncertainty in the model parameters to which the worst-case cost is most sensitive. Numerical simulations and an application to a hardware-in-the-loop servo-mechanism demonstrate the approach, with appreciable performance and robustness gains over alternative methods observed in both.
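To make the problem setting concrete, the following is a minimal sketch of the non-robust, certainty-equivalent baseline for linear quadratic control: estimate the dynamics by least squares, then design an LQR gain for the estimate as if it were exact. It is not the method of the paper, which instead minimizes the worst-case cost over the model uncertainty via convex optimization. The system matrices, noise level, horizon, and all function names below are assumptions chosen purely for illustration.

```python
import numpy as np


def simulate(A, B, T, rng, noise_std=0.01):
    """Roll out x_{t+1} = A x_t + B u_t + w_t under random exploratory inputs."""
    n, m = B.shape
    X = np.zeros((T + 1, n))
    U = rng.normal(size=(T, m))
    for t in range(T):
        w = noise_std * rng.normal(size=n)
        X[t + 1] = A @ X[t] + B @ U[t] + w
    return X, U


def least_squares_estimate(X, U):
    """Fit [A B] by ordinary least squares on the observed transitions."""
    n = X.shape[1]
    Z = np.hstack([X[:-1], U])  # regressors [x_t, u_t]
    Theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)  # X[1:] ~ Z @ Theta
    Theta = Theta.T  # shape (n, n + m)
    return Theta[:, :n], Theta[:, n:]


def lqr_gain(A, B, Q, R, iters=500):
    """Iterate the discrete-time Riccati recursion; return K for u = -K x."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K


# Hypothetical "true" system (a discretized double integrator), unknown to the learner.
A_true = np.array([[1.0, 0.1], [0.0, 1.0]])
B_true = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.eye(1)

rng = np.random.default_rng(0)
X, U = simulate(A_true, B_true, T=500, rng=rng)
A_hat, B_hat = least_squares_estimate(X, U)

# Certainty equivalence: treat the estimate as exact when designing the controller.
K_hat = lqr_gain(A_hat, B_hat, Q, R)

# Spectral radius of the true closed loop under the learned gain (< 1 means stable).
rho = max(abs(np.linalg.eigvals(A_true - B_true @ K_hat)))
```

Because this baseline ignores the estimation error in `A_hat` and `B_hat`, it offers no guarantee when data are scarce; that gap is what a worst-case (robust) formulation is meant to address.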
