Truncated Linear Regression in High Dimensions

Part of Advances in Neural Information Processing Systems 33 (NeurIPS 2020)

AuthorFeedback »Bibtex »MetaReview »Paper »Review »Supplemental »


Constantinos Daskalakis, Dhruv Rohatgi, Emmanouil Zampetakis


<p>As in standard linear regression, in truncated linear regression, we are given access to observations (A<em>i, y</em>i)<em>i whose dependent variable equals y</em>i= A<em>i^{\rm T} \cdot x^* + \eta</em>i, where x^* is some fixed unknown vector of interest and \eta<em>i is independent noise; except we are only given an observation if its dependent variable y</em>i lies in some "truncation set" S \subset \mathbb{R}. The goal is to recover x^* under some favorable conditions on the A<em>i's and the noise distribution. We prove that there exists a computationally and statistically efficient method for recovering k-sparse n-dimensional vectors x^* from m truncated samples, which attains an optimal \ell</em>2 reconstruction error of O(\sqrt{(k \log n)/m}). As a corollary, our guarantees imply a computationally efficient and information-theoretically optimal algorithm for compressed sensing with truncation, such as that which may arise from measurement saturation effects. Our result follows from a statistical and computational analysis of the Stochastic Gradient Descent (SGD) algorithm for solving a natural adaption of the LASSO optimization problem that accommodates truncation. This generalizes the works of both: (1) [Daskalakis et al. 2018], where no regularization is needed due to the low dimensionality of the data, and (2) [Wainright 2009], where the objective function is simple due to the absence of truncation. In order to deal with both truncation and high-dimensionality at the same time, we develop new techniques that not only generalize the existing ones but we believe are of independent interest.</p>