Near-Optimal Density Estimation in Near-Linear Time Using Variable-Width Histograms

Part of Advances in Neural Information Processing Systems 27 (NIPS 2014)


Authors

Siu On Chan, Ilias Diakonikolas, Rocco A. Servedio, Xiaorui Sun

Abstract

Let $p$ be an unknown and arbitrary probability distribution over $[0,1)$. We consider the problem of \emph{density estimation}, in which a learning algorithm is given i.i.d. draws from $p$ and must (with high probability) output a hypothesis distribution that is close to $p$. The main contribution of this paper is a highly efficient density estimation algorithm that learns via a variable-width histogram, i.e., a hypothesis distribution with a piecewise constant probability density function. In more detail, for any $k$ and $\epsilon$, we give an algorithm that makes $\tilde{O}(k/\epsilon^2)$ draws from $p$, runs in $\tilde{O}(k/\epsilon^2)$ time, and outputs a hypothesis distribution $h$ that is piecewise constant with $O(k \log^2(1/\epsilon))$ pieces. With high probability the hypothesis $h$ satisfies $d_{\mathrm{TV}}(p,h) \leq C \cdot \mathrm{opt}_k(p) + \epsilon$, where $d_{\mathrm{TV}}$ denotes the total variation distance (statistical distance), $C$ is a universal constant, and $\mathrm{opt}_k(p)$ is the smallest total variation distance between $p$ and any $k$-piecewise constant distribution. The sample size and running time of our algorithm are both optimal up to logarithmic factors. The "approximation factor" $C$ in our result is inherent in the problem, as we prove that no algorithm with sample size bounded in terms of $k$ and $\epsilon$ can achieve $C < 2$, regardless of what kind of hypothesis distribution it uses.
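To make the objects in the abstract concrete, here is a minimal sketch (not the paper's algorithm) of the hypothesis class and the error metric: a variable-width histogram is a piecewise constant density on $[0,1)$, and $d_{\mathrm{TV}}$ between two such densities can be computed exactly by refining to a common partition, since the difference of the densities is constant on each refined interval. All function names and the example breakpoints below are illustrative assumptions.

```python
import numpy as np

def piecewise_constant_pdf(breakpoints, heights):
    """Return a pdf f on [0,1) equal to heights[i] on
    [breakpoints[i], breakpoints[i+1]); the heights must integrate to 1."""
    breakpoints = np.asarray(breakpoints, dtype=float)
    heights = np.asarray(heights, dtype=float)
    widths = np.diff(breakpoints)
    assert np.isclose(np.dot(widths, heights), 1.0), "density must integrate to 1"
    def f(x):
        # Index of the interval containing each point of x.
        idx = np.searchsorted(breakpoints, x, side="right") - 1
        return heights[np.clip(idx, 0, len(heights) - 1)]
    return f

def tv_distance_piecewise(bp_p, h_p, bp_q, h_q):
    """d_TV(p, q) = (1/2) * integral of |p - q|, computed exactly for two
    piecewise constant densities by merging their breakpoints."""
    cuts = np.union1d(bp_p, bp_q)        # common refinement of both partitions
    mids = (cuts[:-1] + cuts[1:]) / 2    # one evaluation point per interval
    p = piecewise_constant_pdf(bp_p, h_p)
    q = piecewise_constant_pdf(bp_q, h_q)
    return 0.5 * np.sum(np.abs(p(mids) - q(mids)) * np.diff(cuts))

# Example: a 2-piece (k = 2) density vs. the uniform density on [0,1).
bp_p, h_p = [0.0, 0.25, 1.0], [2.0, 2.0 / 3.0]   # integrates to 1
bp_q, h_q = [0.0, 1.0], [1.0]                    # uniform
print(tv_distance_piecewise(bp_p, h_p, bp_q, h_q))  # prints 0.25
```

In this notation, $\mathrm{opt}_k(p)$ is the infimum of `tv_distance_piecewise`-style distances between $p$ and all densities expressible with $k$ pieces, and the paper's guarantee bounds $d_{\mathrm{TV}}(p,h)$ by $C \cdot \mathrm{opt}_k(p) + \epsilon$ for a hypothesis $h$ with $O(k \log^2(1/\epsilon))$ pieces.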