Conditional Models of Identity Uncertainty with Application to Noun Coreference

Part of Advances in Neural Information Processing Systems 17 (NIPS 2004)


Authors

Andrew McCallum, Ben Wellner

Abstract

Coreference analysis, also known as record linkage or identity uncertainty, is a difficult and important problem in natural language processing, databases, citation matching and many other tasks. This paper introduces several discriminative, conditional-probability models for coreference analysis, all examples of undirected graphical models. Unlike many historical approaches to coreference, the models presented here are relational: they do not assume that pairwise coreference decisions should be made independently from each other. Unlike other relational models of coreference that are generative, the conditional model here can incorporate a great variety of features of the input without having to be concerned about their dependencies, paralleling the advantages of conditional random fields over hidden Markov models. We present positive results on noun phrase coreference in two standard text data sets.