Part of Advances in Neural Information Processing Systems 35 (NeurIPS 2022) Main Conference Track
Dat Do, Nhat Ho, XuanLong Nguyen
As we collect additional samples from a data population for which a known density function estimate may have been previously obtained by a black-box method, the increased complexity of the data set may result in the true density deviating from the known estimate by a mixture distribution. To model this phenomenon, we consider the \emph{deviating mixture model} $(1-\lambda^{*}) h_0 + \lambda^{*} \left(\sum_{i=1}^{k} p_i^{*} f(x|\theta_i^{*})\right)$, where $h_0$ is a known density function, while the deviated proportion $\lambda^{*}$ and the latent mixing measure $G^{*} = \sum_{i=1}^{k} p_i^{*} \delta_{\theta_i^{*}}$ associated with the mixture distribution are unknown. Via a novel notion of distinguishability between the known density $h_0$ and the deviated mixture distribution, we establish rates of convergence for the maximum likelihood estimates of $\lambda^{*}$ and $G^{*}$ under the Wasserstein metric. Simulation studies are carried out to illustrate the theory.
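To make the model definition concrete, the following is a minimal sketch of the deviating mixture density and a sampler for it. It is an illustration under stated assumptions, not the authors' code: it assumes $h_0$ is the standard normal density and $f(x|\theta)$ is a unit-variance Gaussian location kernel, and the function names (`normal_pdf`, `deviated_mixture_pdf`, `sample_deviated_mixture`) are hypothetical. It does not implement the maximum likelihood estimator or the convergence analysis.

```python
import numpy as np


def normal_pdf(x, mean=0.0, std=1.0):
    """Gaussian density; stands in for both h0 and the kernel f (an assumption)."""
    z = (np.asarray(x, dtype=float) - mean) / std
    return np.exp(-0.5 * z**2) / (std * np.sqrt(2.0 * np.pi))


def deviated_mixture_pdf(x, lam, weights, thetas, h0=normal_pdf, f=normal_pdf):
    """Evaluate (1 - lam) * h0(x) + lam * sum_i p_i * f(x | theta_i)."""
    x = np.asarray(x, dtype=float)
    mixture = sum(p * f(x, theta) for p, theta in zip(weights, thetas))
    return (1.0 - lam) * h0(x) + lam * mixture


def sample_deviated_mixture(n, lam, weights, thetas, rng):
    """Draw n points: with probability lam from the mixture, otherwise from h0.

    Assumes h0 = N(0, 1) and f(. | theta) = N(theta, 1).
    """
    thetas = np.asarray(thetas, dtype=float)
    from_mixture = rng.random(n) < lam                    # deviation indicator
    components = rng.choice(len(thetas), size=n, p=weights)  # mixture component labels
    means = np.where(from_mixture, thetas[components], 0.0)
    return rng.normal(loc=means, scale=1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xs = sample_deviated_mixture(1000, lam=0.3, weights=[0.5, 0.5],
                                 thetas=[-2.0, 3.0], rng=rng)
    # Average log-likelihood of the sample under the generating parameters.
    print(np.mean(np.log(deviated_mixture_pdf(xs, 0.3, [0.5, 0.5], [-2.0, 3.0]))))
```

Setting `lam=0` recovers $h_0$ exactly, which is the case where the previously obtained estimate needs no correction.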