Semi-Markov Conditional Random Fields for Information Extraction

Part of Advances in Neural Information Processing Systems 17 (NIPS 2004)


Authors

Sunita Sarawagi, William W. Cohen

Abstract

We describe semi-Markov conditional random fields (semi-CRFs), a conditionally trained version of semi-Markov chains. Intuitively, a semi-CRF on an input sequence x outputs a “segmentation” of x, in which labels are assigned to segments (i.e., subsequences) of x rather than to individual elements x_i of x. Importantly, features for semi-CRFs can measure properties of segments, and transitions within a segment can be non-Markovian. In spite of this additional power, exact learning and inference algorithms for semi-CRFs are polynomial-time, often only a small constant factor slower than conventional CRFs. In experiments on five named entity recognition problems, semi-CRFs generally outperform conventional CRFs.
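The segment-level decoding described in the abstract can be illustrated with a minimal sketch of a Viterbi-style search over segmentations. This is not the authors' implementation: the function name semi_markov_viterbi, the score callback, the max_len bound on segment length, and the toy scoring rules are all assumptions made for illustration. The score function sees a whole candidate segment at once, which is what allows features to measure properties of segments rather than of single tokens.

```python
from typing import Callable, List, Optional, Tuple

Segment = Tuple[int, int, str]  # (start, end_exclusive, label)


def semi_markov_viterbi(
    n: int,
    labels: List[str],
    max_len: int,
    score: Callable[[Optional[str], str, int, int], float],
) -> Tuple[List[Segment], float]:
    """Return the highest-scoring segmentation of positions 0..n-1.

    score(prev_label, label, start, end) is a hypothetical callback giving
    the score of labelling x[start:end] as one segment, given the label of
    the preceding segment (None for the first segment).
    """
    if n == 0:
        return [], 0.0
    neg_inf = float("-inf")
    # best[i][y]: best score of a segmentation of x[0:i] whose last segment has label y
    best = [{y: neg_inf for y in labels} for _ in range(n + 1)]
    back = [{y: None for y in labels} for _ in range(n + 1)]

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            for y in labels:
                if start == 0:
                    # First segment: there is no previous label.
                    cand = score(None, y, 0, end)
                    if cand > best[end][y]:
                        best[end][y] = cand
                        back[end][y] = (0, None)
                    continue
                for y_prev in labels:
                    if best[start][y_prev] == neg_inf:
                        continue
                    cand = best[start][y_prev] + score(y_prev, y, start, end)
                    if cand > best[end][y]:
                        best[end][y] = cand
                        back[end][y] = (start, y_prev)

    # Follow back-pointers from the best final label to recover the segments.
    y = max(labels, key=lambda lab: best[n][lab])
    total = best[n][y]
    segments: List[Segment] = []
    end = n
    while end > 0:
        start, y_prev = back[end][y]
        segments.append((start, end, y))
        end, y = start, y_prev
    segments.reverse()
    return segments, total


# Toy example (made-up scores, not learned weights).
tokens = ["met", "William", "Cohen", "today"]


def toy_score(prev_label: Optional[str], label: str, start: int, end: int) -> float:
    seg = tokens[start:end]
    if label == "PER":
        if not all(tok[0].isupper() for tok in seg):
            return -5.0
        # Segment-level feature: bonus for a multi-token capitalized span.
        return 1.0 * len(seg) + (1.0 if len(seg) >= 2 else 0.0)
    # Label "O": penalize each capitalized token placed outside an entity.
    return sum(-1.0 if tok[0].isupper() else 0.0 for tok in seg)


segments, total = semi_markov_viterbi(len(tokens), ["O", "PER"], 3, toy_score)
print(segments, total)
```

In this sketch the search costs on the order of n × max_len × |labels|² score evaluations, compared with n × |labels|² for token-level Viterbi, so a bounded segment length adds only a constant factor. On the toy input, the length bonus makes the single segment "William Cohen" score higher than two separate one-token PER segments, something a purely token-level feature set could not express directly.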