Part of Advances in Neural Information Processing Systems 14 (NIPS 2001)
Michael Collins, S. Dasgupta, Robert E. Schapire
Principal component analysis (PCA) is a commonly applied technique for dimensionality reduction. PCA implicitly minimizes a squared loss function, which may be inappropriate for data that is not real-valued, such as binary-valued data. This paper draws on ideas from the exponential family, generalized linear models, and Bregman distances to give a generalization of PCA to loss functions that we argue are better suited to other data types. We describe algorithms for minimizing the loss functions, and give examples on simulated data.
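To make the contrast between losses concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It assumes binary data `X`, a low-rank matrix of natural parameters `Theta = A @ V`, and the Bernoulli negative log-likelihood as the replacement for squared loss. The function names (`squared_loss`, `bernoulli_loss`, `fit_low_rank_bernoulli`), the step size, and the use of plain gradient descent are all assumptions made for illustration; the paper describes its own minimization procedures.

```python
import numpy as np


def squared_loss(X, Theta):
    """Squared loss that ordinary PCA implicitly minimizes."""
    return 0.5 * np.sum((X - Theta) ** 2)


def bernoulli_loss(X, Theta):
    """Bernoulli negative log-likelihood for X in {0, 1}.

    Theta holds natural (log-odds) parameters. Each entry contributes
    log(1 + exp(theta)) - x * theta, computed stably with logaddexp.
    """
    return np.sum(np.logaddexp(0.0, Theta) - X * Theta)


def fit_low_rank_bernoulli(X, rank, steps=300, lr=0.02, seed=0):
    """Hypothetical helper: fit Theta = A @ V by plain gradient descent.

    This is a stand-in optimizer chosen for brevity, not the paper's method.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    A = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((rank, d))
    for _ in range(steps):
        Theta = A @ V
        # Gradient of the Bernoulli loss w.r.t. Theta is sigmoid(Theta) - X.
        G = np.exp(-np.logaddexp(0.0, -Theta)) - X
        A_new = A - lr * (G @ V.T)
        V = V - lr * (A.T @ G)
        A = A_new
    return A, V
```

Under these assumptions, calling `fit_low_rank_bernoulli(X, rank=1)` on a binary matrix returns factors whose product can be passed through the logistic function to obtain per-entry probabilities, whereas a squared-loss fit treats the 0/1 entries as real values.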