Loading [MathJax]/jax/output/CommonHTML/jax.js

Unsupervised Feature Selection for the k-means Clustering Problem

Part of Advances in Neural Information Processing Systems 22 (NIPS 2009)

Bibtex Metadata Paper

Authors

Christos Boutsidis, Petros Drineas, Michael W. Mahoney

Abstract

We present a novel feature selection algorithm for the k-means clustering problem. Our algorithm is randomized and, assuming an accuracy parameter ϵ(0,1), selects and appropriately rescales in an unsupervised manner Θ(klog(k/ϵ)/ϵ2) features from a dataset of arbitrary dimensions. We prove that, if we run any γ-approximate k-means algorithm (γ1) on the features selected using our method, we can find a (1+(1+ϵ)γ)-approximate partition with high probability.