Part of Advances in Neural Information Processing Systems 17 (NIPS 2004)
Andrew McCallum, Ben Wellner
Coreference analysis, also known as record linkage or identity uncertainty, is a difficult and important problem in natural language processing, databases, citation matching and many other tasks. This paper introduces several discriminative, conditional-probability models for coreference analysis, all examples of undirected graphical models. Unlike many historical approaches to coreference, the models presented here are relational: they do not assume that pairwise coreference decisions should be made independently from each other. Unlike other relational models of coreference that are generative, the conditional model here can incorporate a great variety of features of the input without having to be concerned about their dependencies, paralleling the advantages of conditional random fields over hidden Markov models. We present positive results on noun phrase coreference in two standard text data sets.