Kevin Swersky, Brendan J. Frey, Daniel Tarlow, Richard Zemel, Ryan P. Adams
In categorical data there is often structure in the number of variables that take on each label. For example, the total number of objects in an image and the number of highly relevant documents per query in web search both tend to follow a structured distribution. In this paper, we study a probabilistic model that explicitly includes a prior distribution over such counts, along with a count-conditional likelihood that deﬁnes probabilities over all subsets of a given size. When labels are binary and the prior over counts is a Poisson-Binomial distribution, a standard logistic regression model is recovered, but for other count distributions, such priors induce global dependencies and combinatorics that appear to complicate learning and inference. However, we demonstrate that simple, efﬁcient learning procedures can be derived for more general forms of this model. We illustrate the utility of the formulation by exploring applications to multi-object classiﬁcation, learning to rank, and top-K classiﬁcation.