A Bandit Framework for Strategic Regression

Part of Advances in Neural Information Processing Systems 29 (NIPS 2016)

Bibtex Metadata Paper Reviews Supplemental


Yang Liu, Yiling Chen


We consider a learner's problem of acquiring data dynamically for training a regression model, where the training data are collected from strategic data sources. A fundamental challenge is to incentivize data holders to exert effort to improve the quality of their reported data, despite that the quality is not directly verifiable by the learner. In this work, we study a dynamic data acquisition process where data holders can contribute multiple times. Using a bandit framework, we leverage on the long-term incentive of future job opportunities to incentivize high-quality contributions. We propose a Strategic Regression-Upper Confidence Bound (SR-UCB) framework, an UCB-style index combined with a simple payment rule, where the index of a worker approximates the quality of his past contributions and is used by the learner to determine whether the worker receives future work. For linear regression and certain family of non-linear regression problems, we show that SR-UCB enables a $O(\sqrt{\log T/T})$-Bayesian Nash Equilibrium (BNE) where each worker exerting a target effort level that the learner has chosen, with $T$ being the number of data acquisition stages. The SR-UCB framework also has some other desirable properties: (1) The indexes can be updated in an online fashion (hence computationally light). (2) A slight variant, namely Private SR-UCB (PSR-UCB), is able to preserve $(O(\log^{-1} T), O(\log^{-1} T))$-differential privacy for workers' data, with only a small compromise on incentives (achieving $O(\log^{6} T/\sqrt{T})$-BNE).