Limiting Form of the Sample Covariance Eigenspectrum in PCA and Kernel PCA

Part of Advances in Neural Information Processing Systems 16 (NIPS 2003)


Authors

David Hoyle, Magnus Rattray

Abstract

We derive the limiting form of the eigenvalue spectrum for sample covariance matrices produced from non-isotropic data. For the analysis of standard PCA we study the case where the data has increased variance along a small number of symmetry-breaking directions. The spectrum depends on the strength of the symmetry-breaking signals and on a parameter α which is the ratio of sample size to data dimension. Results are derived in the limit of large data dimension while keeping α fixed. As α increases there are transitions in which delta functions emerge from the upper end of the bulk spectrum, corresponding to the symmetry-breaking directions in the data, and we calculate the bias in the corresponding eigenvalues. For kernel PCA the covariance matrix in feature space may contain symmetry-breaking structure even when the data components are independently distributed with equal variance. We show examples of phase-transition behaviour analogous to the PCA results in this case.
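
The transition described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' derivation. It assumes Gaussian data with unit variance in every direction except one, where the variance is 1 + A. The function names, the parameter `signal_variance` (standing for A), and the closed-form expressions are assumptions taken from the standard spiked covariance model rather than from the abstract: a bulk upper edge at (1 + 1/√α)², and, for α > 1/A², an isolated top eigenvalue near (1 + A)(1 + 1/(αA)).

```python
import numpy as np


def top_sample_eigenvalue(dim, alpha, signal_variance, rng):
    """Largest sample-covariance eigenvalue for one simulated data set.

    Assumed model (hypothetical, for illustration): zero-mean Gaussian data
    with unit variance everywhere except the first coordinate axis, whose
    variance is 1 + signal_variance.
    """
    n_samples = int(alpha * dim)  # alpha = sample size / data dimension
    data = rng.standard_normal((n_samples, dim))
    data[:, 0] *= np.sqrt(1.0 + signal_variance)
    # The data are zero-mean by construction, so no centering is applied.
    cov = data.T @ data / n_samples
    return np.linalg.eigvalsh(cov)[-1]  # eigvalsh returns ascending order


def bulk_upper_edge(alpha):
    """Upper edge of the bulk spectrum for isotropic unit-variance data."""
    return (1.0 + 1.0 / np.sqrt(alpha)) ** 2


def predicted_spike(alpha, signal_variance):
    """Limiting position of the isolated eigenvalue.

    Only meaningful above the transition, alpha > 1 / signal_variance**2.
    """
    return (1.0 + signal_variance) * (1.0 + 1.0 / (alpha * signal_variance))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim, signal_variance = 400, 2.0  # transition expected near alpha = 0.25
    for alpha in (0.1, 0.25, 1.0, 4.0):
        top = top_sample_eigenvalue(dim, alpha, signal_variance, rng)
        print(f"alpha={alpha:5.2f}  top={top:7.3f}  "
              f"bulk edge={bulk_upper_edge(alpha):7.3f}")
```

Under these assumptions, the top eigenvalue should sit close to the bulk edge for small α and detach from it once α passes 1/A². Above the transition the isolated eigenvalue exceeds the population value 1 + A, which is one way to see the eigenvalue bias mentioned in the abstract. At finite dimension the agreement is only approximate.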