Xuerui Wang, Natasha Mohanty, Andrew McCallum
We present a probabilistic generative model of entity relationships and their attributes that simultaneously discovers groups among the entities and topics among the corresponding textual attributes. Block-models of relationship data have been studied in social network analysis for some time. Here we cluster in several modalities at once, incorporating the attributes (here, words) associated with certain relationships. Significantly, joint inference allows the discovery of topics to be guided by the emerging groups, and vice versa. We present experimental results on two large data sets: sixteen years of bills put before the U.S. Senate, comprising their corresponding text and voting records, and thirteen years of similar data from the United Nations. We show that, in comparison with traditional, separate latent-variable models for words or Blockstructures for votes, the Group-Topic model’s joint inference discovers more cohesive groups and improved topics.