Near-Optimal Density Estimation in Near-Linear Time Using Variable-Width Histograms

Chan, Siu-On; Diakonikolas, Ilias; Servedio, Rocco A.; Sun, Xiaorui

Near-Optimal Density Estimation in Near-Linear Time Using Variable-Width Histograms

Part of Advances in Neural Information Processing Systems 27 (NIPS 2014)

Bibtex Metadata Paper Reviews Supplemental

Authors

Siu-On Chan, Ilias Diakonikolas, Rocco A. Servedio, Xiaorui Sun

Abstract

Let $p$ be an unknown and arbitrary probability distribution over $[0 ,1)$ . We consider the problem of \emph{density estimation}, in which a learning algorithm is given i.i.d. draws from $p$ and must (with high probability) output a hypothesis distribution that is close to $p$ . The main contribution of this paper is a highly efficient density estimation algorithm for learning using a variable-width histogram, i.e., a hypothesis distribution with a piecewise constant probability density function. In more detail, for any $k$ and $\eps$ , we give an algorithm that makes $\tilde{O}(k/\eps^2)$ draws from $p$ , runs in $\tilde{O}(k/\eps^2)$ time, and outputs a hypothesis distribution $h$ that is piecewise constant with $O(k \log^2(1/\eps))$ pieces. With high probability the hypothesis $h$ satisfies $\dtv(p,h) \leq C \cdot \opt_k(p) + \eps$ , where $\dtv$ denotes the total variation distance (statistical distance), $C$ is a universal constant, and $\opt_k(p)$ is the smallest total variation distance between $p$ and any $k$ -piecewise constant distribution. The sample size and running time of our algorithm are both optimal up to logarithmic factors. The approximation factor'' $C$ that is present in our result is inherent in the problem, as we prove that no algorithm with sample size bounded in terms of $k$ and $\eps$ can achieve $C < 2$ regardless of what kind of hypothesis distribution it uses.

Near-Optimal Density Estimation in Near-Linear Time Using Variable-Width Histograms

Authors

Abstract

Name Change Policy