A Generalization of Principal Components Analysis to the Exponential Family

Part of Advances in Neural Information Processing Systems 14 (NIPS 2001)

Bibtex Metadata Paper


Michael Collins, S. Dasgupta, Robert E. Schapire


Principal component analysis (PCA) is a commonly applied technique for dimensionality reduction. PCA implicitly minimizes a squared loss function, which may be inappropriate for data that is not real-valued, such as binary-valued data. This paper draws on ideas from the Exponen- tial family, Generalized linear models, and Bregman distances, to give a generalization of PCA to loss functions that we argue are better suited to other data types. We describe algorithms for minimizing the loss func- tions, and give examples on simulated data.