Security Analysis of Safe and Seldonian Reinforcement Learning Algorithms

Part of Advances in Neural Information Processing Systems 33 (NeurIPS 2020)



Pinar Ozisik, Philip S. Thomas


We analyze the extent to which existing methods rely on accurate training data for a specific class of reinforcement learning (RL) algorithms, known as Safe and Seldonian RL. We introduce a new measure of security that quantifies susceptibility to perturbations in the training data using an attacker model representing a worst-case analysis, and show that a couple of Seldonian RL methods are extremely sensitive to even a few data corruptions. We then introduce a new algorithm that is more robust against data corruptions, and demonstrate its use in practice on RL problems including a grid-world and a diabetes treatment simulation.
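To make the worst-case attacker idea concrete, below is a minimal, hypothetical sketch (not the paper's actual algorithm or measure): a simple high-confidence safety test based on a Student's t lower bound over importance-weighted returns, and an attacker that overwrites just a few data points with fabricated extreme values to flip the test's decision. All function names, the t-based bound, and the numeric values are illustrative assumptions.

```python
"""Illustrative sketch only: how a few corrupted data points can flip a
high-confidence safety test. This is NOT the paper's method; the test,
attacker, and parameters below are assumptions made for illustration."""
import numpy as np
from scipy import stats


def safety_test(weighted_returns, baseline, delta=0.05):
    """Accept the candidate policy only if a (1 - delta) lower confidence
    bound on its importance-weighted return exceeds the baseline."""
    n = len(weighted_returns)
    mean = np.mean(weighted_returns)
    sem = np.std(weighted_returns, ddof=1) / np.sqrt(n)
    lower_bound = mean - stats.t.ppf(1 - delta, df=n - 1) * sem
    return lower_bound > baseline


def worst_case_attack(weighted_returns, baseline, k, corrupt_value):
    """Worst-case attacker: replace k data points with a fabricated extreme
    importance-weighted return and report the test's decision before/after."""
    before = safety_test(weighted_returns, baseline)
    corrupted = np.array(weighted_returns, dtype=float)
    corrupted[:k] = corrupt_value  # adversarially overwrite k points
    after = safety_test(corrupted, baseline)
    return before, after


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Honest data: the candidate policy's true performance is below the baseline,
    # so the safety test should (correctly) reject it.
    data = rng.normal(loc=0.9, scale=0.3, size=200)
    before, after = worst_case_attack(data, baseline=1.0, k=3, corrupt_value=1000.0)
    print(f"passes before attack: {before}; after corrupting 3 of 200 points: {after}")
```

Because importance weights in off-policy evaluation can be extremely large, a handful of fabricated trajectories with huge weighted returns can dominate the empirical mean and cause the test to wrongly certify an unsafe policy, which is the kind of sensitivity this sketch is meant to illustrate.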