Hidden Markov Models for Human Genes

Part of Advances in Neural Information Processing Systems 6 (NIPS 1993)

Pierre Baldi, Søren Brunak, Yves Chauvin, Jacob Engelbrecht, Anders Krogh


Human genes are not continuous but rather consist of short coding regions (exons) interspersed with highly variable non-coding regions (introns). We apply HMMs to the problem of modeling exons, introns and detecting splice sites in the human genome. Our most interesting result so far is the detection of particular oscillatory patterns, with a minimal period of roughly 10 nucleotides, that seem to be characteristic of exon regions and may have significant biological implications.
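
To make concrete what applying an HMM to a nucleotide sequence involves, the following is a minimal sketch of the standard scaled forward algorithm, which computes the log-likelihood of a DNA string under a given model. It is not the architecture used in this work: the function name `forward_log_likelihood` and the two-state toy parameters (`TOY_START`, `TOY_TRANS`, `TOY_EMIT`) are hypothetical and chosen only for illustration.

```python
import math


def forward_log_likelihood(seq, start, trans, emit):
    """Return log P(seq) under an HMM, using the scaled forward algorithm.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][c]  : probability that state i emits nucleotide c
    """
    if not seq:
        return 0.0  # the empty sequence has probability 1
    n_states = len(start)
    # Forward variables for the first symbol.
    alpha = [start[i] * emit[i][seq[0]] for i in range(n_states)]
    log_lik = 0.0
    for t in range(len(seq)):
        if t > 0:
            # Propagate one step through the transitions, then emit seq[t].
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states))
                * emit[j][seq[t]]
                for j in range(n_states)
            ]
        # Rescale to avoid numerical underflow on long sequences.
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under the model
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik


# Hypothetical toy model (illustrative values only, not fitted to any data):
# state 0 slightly favours C/G, state 1 slightly favours A/T.
TOY_START = [0.5, 0.5]
TOY_TRANS = [[0.9, 0.1], [0.1, 0.9]]
TOY_EMIT = [
    {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
]

if __name__ == "__main__":
    print(forward_log_likelihood("ACGTGCGC", TOY_START, TOY_TRANS, TOY_EMIT))
```

In practice, log-likelihoods of this kind are used to compare how well competing models (for example, one trained on exons and one on introns) explain a given sequence; the parameters would be estimated from data rather than set by hand as they are here.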

* and Division of Biology, California Institute of Technology. † and Department of Psychology, Stanford University.