Sharp Bounds for Generalized Uniformity Testing

Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart

Advances in Neural Information Processing Systems 31 (NeurIPS 2018)

We study the problem of generalized uniformity testing of a discrete probability distribution: Given samples from a probability distribution p over an unknown size discrete domain Ω, we want to distinguish, with probability at least 2/3, between the case that p is uniform on some subset of Ω versus ε-far, in total variation distance, from any such uniform distribution. We establish tight bounds on the sample complexity of generalized uniformity testing. In more detail, we present a computationally efficient tester whose sample complexity is optimal, within constant factors, and a matching worst-case information-theoretic lower bound. Specifically, we show that the sample complexity of generalized uniformity testing is Θ(1/(ε^(4/3) ||p||3) + 1/(ε^2 ||p||2 )).