Part of Advances in Neural Information Processing Systems 15 (NIPS 2002)
Kwokleung Chan, Te-Won Lee, Terrence J. Sejnowski
Missing data are common in real-world datasets and are a problem for many estimation techniques. We have developed a variational Bayesian method to perform Independent Component Analysis (ICA) on high-dimensional data containing missing entries. Missing data are handled naturally in the Bayesian framework by integrating the generative density model. Modeling the distributions of the independent sources with mixtures of Gaussians allows sources to be estimated with different kurtosis and skewness. The variational Bayesian method automatically determines the dimensionality of the data and yields an accurate density model for the observed data without overfitting problems. This allows direct probability estimation of missing values in the high-dimensional space and avoids dimension-reduction preprocessing, which is not feasible with missing data.
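The idea of handling missing entries by integrating a density model can be illustrated with a much simpler stand-in. The sketch below assumes a single multivariate Gaussian rather than the mixture-of-Gaussians source model and variational Bayesian procedure described above, so it is not the authors' method. The function names `observed_log_density` and `conditional_mean_of_missing` are hypothetical, and missing entries are assumed to be marked with NaN.

```python
import numpy as np

def observed_log_density(x, mean, cov):
    """Log-density of the observed entries of x under N(mean, cov).

    Missing entries are NaN. Integrating them out of a Gaussian
    amounts to dropping their rows and columns from mean and cov.
    """
    obs = ~np.isnan(x)
    diff = x[obs] - mean[obs]
    cov_oo = cov[np.ix_(obs, obs)]
    _, logdet = np.linalg.slogdet(cov_oo)
    quad = diff @ np.linalg.solve(cov_oo, diff)
    return -0.5 * (obs.sum() * np.log(2.0 * np.pi) + logdet + quad)

def conditional_mean_of_missing(x, mean, cov):
    """Expected value of the missing entries given the observed ones."""
    obs = ~np.isnan(x)
    mis = ~obs
    cov_oo = cov[np.ix_(obs, obs)]
    cov_mo = cov[np.ix_(mis, obs)]
    return mean[mis] + cov_mo @ np.linalg.solve(cov_oo, x[obs] - mean[obs])

# Illustrative values only (not from the paper).
mean = np.zeros(2)
cov = np.array([[1.0, 0.5],
                [0.5, 1.0]])
x = np.array([1.0, np.nan])  # second coordinate is missing

log_p = observed_log_density(x, mean, cov)
x_missing_hat = conditional_mean_of_missing(x, mean, cov)
```

Because the marginal is computed directly in the original space, no dimension-reduction step over incomplete vectors is needed, which mirrors the motivation stated in the abstract.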