Michael C. Mozer, Richard Zemel, Marlene Behrmann
Despite the fact that complex visual scenes contain multiple, overlapping objects, people perform object recognition with ease and accuracy. One operation that facilitates recognition is an early segmentation process in which features of objects are grouped and labeled according to the object to which they belong. Current computational systems that perform this operation are based on predefined grouping heuristics. We describe a system called MAGIC that learns how to group features based on a set of presegmented examples. In many cases, MAGIC discovers grouping heuristics similar to those previously proposed, but it also has the capability of finding nonintuitive structural regularities in images. Grouping is performed by a relaxation network that attempts to dynamically bind related features. Features transmit a complex-valued signal (amplitude and phase) to one another; binding can thus be represented by phase locking related features. MAGIC's training procedure is a generalization of recurrent back propagation to complex-valued units.
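To make the phase-based representation of binding concrete, the following minimal sketch shows how a grouping could be read out from complex-valued feature activities once related features share a phase. It is an illustration only, not MAGIC's relaxation dynamics or training procedure: the feature names, the activity values, the tolerance `tol`, and the greedy read-out in `group_by_phase` are all assumptions introduced here.

```python
import cmath
import math

# Hypothetical toy activities: amplitude stands for confidence that the
# feature is present, phase stands for the object label it is bound to.
features = {
    "edge_a": cmath.rect(1.0, 0.10),
    "edge_b": cmath.rect(0.9, 0.15),
    "edge_c": cmath.rect(1.0, math.pi),
    "edge_d": cmath.rect(0.8, math.pi - 0.05),
}

def phase_difference(z1, z2):
    """Smallest absolute angle (radians) between two complex activities."""
    return abs(cmath.phase(z1 * z2.conjugate()))

def group_by_phase(feats, tol=0.5):
    """Greedy read-out (an assumption, not the paper's method): a feature
    joins the first group whose first member lies within tol radians."""
    groups = []
    for name, z in feats.items():
        for group in groups:
            if phase_difference(z, feats[group[0]]) < tol:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

groups = group_by_phase(features)
print(groups)
```

In this toy case the two features near phase 0 fall into one group and the two near phase π into another, which is the sense in which phase-locked features can be said to be bound together.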
When a visual image contains multiple, overlapping objects, recognition is difficult because features in the image are not grouped according to the object to which they belong. Without the capability to form such groupings, it would be necessary to perform a massive search through all subsets of image features. For this reason, most machine vision recognition systems include a component that performs feature grouping or image segmentation (e.g., Guzman, 1968; Lowe, 1985; Marr, 1982).