Part of Advances in Neural Information Processing Systems 13 (NIPS 2000)
James Coughlan, Alan L. Yuille
Preliminary work by the authors made use of the so-called "Manhattan world" assumption about the scene statistics of city and indoor scenes. This assumption stated that such scenes were built on a Cartesian grid, which led to regularities in the image edge gradient statistics. In this paper we explore the general applicability of this assumption and show that, surprisingly, it holds in a large variety of less structured environments including rural scenes. This enables us, from a single image, to determine the orientation of the viewer relative to the scene structure and also to detect target objects which are not aligned with the grid. These inferences are performed using a Bayesian model with probability distributions (e.g. on the image gradient statistics) learnt from real data.