Sparse Features for PCA-Like Linear Regression

Part of Advances in Neural Information Processing Systems 24 (NIPS 2011)

Christos Boutsidis, Petros Drineas, Malik Magdon-Ismail


Principal Components Analysis~(PCA) is often used as a feature extraction procedure. Given a matrix $X \in \mathbb{R}^{n \times d}$, whose rows represent $n$ data points with respect to $d$ features, the top $k$ right singular vectors of $X$ (the so-called \textit{eigenfeatures}) are arbitrary linear combinations of all available features. The eigenfeatures are very useful in data analysis, including the regularization of linear regression. Enforcing sparsity on the eigenfeatures, i.e., forcing them to be linear combinations of only a \textit{small} number of actual features (as opposed to all available features), can lower generalization error and improve the interpretability of the eigenfeatures. We present deterministic and randomized algorithms that construct such sparse eigenfeatures while \emph{provably} achieving in-sample performance comparable to regularized linear regression. Our algorithms are relatively simple and practically efficient, and we demonstrate their performance on several data sets.
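
To make the setting concrete, the following is a minimal sketch of the dense baseline the abstract describes: regression on the top $k$ eigenfeatures of $X$ (often called principal components regression). It is not the paper's algorithm. The function names, the synthetic data, and in particular the `naive_sparsify` thresholding step are assumptions introduced only for illustration; the thresholding is a simple stand-in that shows what a sparse eigenfeature looks like and carries none of the guarantees claimed above.

```python
import numpy as np

def top_k_eigenfeatures(X, k):
    """Return the top-k right singular vectors of X as a d x k matrix."""
    # np.linalg.svd returns singular values in descending order,
    # so the first k rows of Vt are the top-k right singular vectors.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T

def regress_on_features(X, y, F):
    """Least-squares fit of y on the projected data X @ F.

    Returns the weight vector mapped back to the original d features.
    """
    Z = X @ F                                   # n x k projected data
    w_k, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return F @ w_k                              # d-dimensional weights

def naive_sparsify(V_k, r):
    """Hypothetical baseline (NOT the paper's method): keep only the r
    largest-magnitude entries in each eigenfeature, zeroing the rest."""
    S = np.zeros_like(V_k)
    for j in range(V_k.shape[1]):
        keep = np.argsort(np.abs(V_k[:, j]))[-r:]
        S[keep, j] = V_k[keep, j]
    return S

# Synthetic data, chosen arbitrarily for the illustration.
rng = np.random.default_rng(0)
n, d, k, r = 50, 20, 3, 5
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

V_k = top_k_eigenfeatures(X, k)       # dense: each column uses all d features
S_k = naive_sparsify(V_k, r)          # sparse: each column uses at most r features

w_dense = regress_on_features(X, y, V_k)
w_sparse = regress_on_features(X, y, S_k)
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# In-sample residual norms for comparison.
err_ols = np.linalg.norm(y - X @ w_ols)
err_dense = np.linalg.norm(y - X @ w_dense)
err_sparse = np.linalg.norm(y - X @ w_sparse)
print(err_ols, err_dense, err_sparse)
```

Each column of `V_k` is generally nonzero in all $d$ coordinates, whereas each column of `S_k` touches at most `r` original features. Comparing `err_dense` and `err_sparse` gives a rough sense of how much in-sample accuracy a crude sparsification gives up, which is the gap the paper's algorithms are designed to control.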