Part of Advances in Neural Information Processing Systems 29 (NIPS 2016)
Dangna Li, Kun Yang, Wing Hung Wong
Given iid observations from an unknown continuous distribution defined on some domain Ω, we propose a nonparametric method to learn a piecewise constant function that approximates the underlying probability density function. Our density estimate is a piecewise constant function defined on a binary partition of Ω. The key ingredient of the algorithm is the use of discrepancy, a concept that originates in quasi-Monte Carlo analysis, to control the partition process. The resulting algorithm is simple, efficient, and has a provable convergence rate. We demonstrate empirically its efficiency as a density estimation method. We also show how it can be used to find good initializations for k-means.
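
To make the idea of a piecewise constant density on a binary partition concrete, the following is a minimal sketch and not the authors' algorithm. It assumes Ω is an axis-aligned box and replaces the discrepancy criterion with a much cruder, hypothetical uniformity proxy: the largest gap between the fraction of points and the fraction of volume to the left of a candidate cut. The function name `partition_density` and the parameters `n_cuts`, `tol`, `min_pts`, and `max_depth` are all illustrative assumptions.

```python
import random


def partition_density(points, lo, hi, n_total,
                      n_cuts=4, tol=0.1, min_pts=20, max_depth=12):
    """Return leaves (lo, hi, density) of a binary partition of the box [lo, hi].

    Illustrative only: the split rule is a simple uniformity proxy,
    not the discrepancy-based rule described in the abstract.
    """
    n, d = len(points), len(lo)
    volume = 1.0
    for a, b in zip(lo, hi):
        volume *= (b - a)
    # Piecewise constant estimate: share of all points divided by box volume.
    leaf = [(lo, hi, n / (n_total * volume))]
    if n < min_pts or max_depth == 0:
        return leaf

    # Find the cut where the points look least uniform within the box.
    best_gap, best_dim, best_cut = 0.0, None, None
    for j in range(d):
        for k in range(1, n_cuts):
            frac = k / n_cuts
            cut = lo[j] + frac * (hi[j] - lo[j])
            left = sum(1 for p in points if p[j] < cut)
            gap = abs(left / n - frac)
            if gap > best_gap:
                best_gap, best_dim, best_cut = gap, j, cut

    # Points already look roughly uniform here: stop splitting.
    if best_gap <= tol:
        return leaf

    # Binary split of the box, then recurse on both halves.
    left_pts = [p for p in points if p[best_dim] < best_cut]
    right_pts = [p for p in points if p[best_dim] >= best_cut]
    hi_left = hi[:best_dim] + [best_cut] + hi[best_dim + 1:]
    lo_right = lo[:best_dim] + [best_cut] + lo[best_dim + 1:]
    kwargs = dict(n_cuts=n_cuts, tol=tol, min_pts=min_pts,
                  max_depth=max_depth - 1)
    return (partition_density(left_pts, lo, hi_left, n_total, **kwargs)
            + partition_density(right_pts, lo_right, hi, n_total, **kwargs))


# Usage on synthetic, non-uniform data in the unit square.
random.seed(0)
pts = [[random.betavariate(2, 5), random.betavariate(2, 5)]
       for _ in range(2000)]
leaves = partition_density(pts, [0.0, 0.0], [1.0, 1.0], len(pts))

# The estimate should integrate to 1 over the domain.
mass = 0.0
for box_lo, box_hi, dens in leaves:
    vol = 1.0
    for a, b in zip(box_lo, box_hi):
        vol *= (b - a)
    mass += dens * vol
```

Each leaf carries a constant density equal to its share of the sample divided by its volume, so the estimate integrates to one by construction. Swapping the proxy for a true discrepancy measure would change only the stopping and splitting test, not the overall structure.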