Abhinav Gupta, Jianbo Shi, Larry S. Davis
Integrating semantic and syntactic analysis is essential for document analysis. Using an analogous reasoning, we present an approach that combines bag-of-words and spatial models to perform semantic and syntactic analysis for recognition of an object based on its internal appearance and its context. We argue that while object recognition requires modeling relative spatial locations of image features within the object, a bag-of-word is sufficient for representing context. Learning such a model from weakly labeled data involves labeling of features into two classes: foreground(object) or ''informative'' background(context). labeling. We present a ''shape-aware'' model which utilizes contour information for efficient and accurate labeling of features in the image. Our approach iterates between an MCMC-based labeling and contour based labeling of features to integrate co-occurrence of features and shape similarity.