This paper is about describing scenes as programs: given a formal grammar, it infers the program that generated a scene. It builds heavily on previous work on program induction for scene generation [11, 23, 24], and it is concerned exclusively with box-shaped scenes (buildings, indoor spaces, etc.). The inferred program can then be used for image manipulation such as inpainting. As R4 points out, "the contribution here is that the 3D structure of the scene is modeled in addition to the appearance. Whereas previous methods operated in the image plane, the current method assumes that the scene is constituted by a box with perpendicular flat surfaces, either viewed from the inside or the outside. Another assumption is that there is some regularity to the pattern on each surface. This strong prior allows scaling up the program reasoning, making it possible to address much more complex scenes. Experiments show that the proposed method greatly outperforms the state-of-the-art both in plane segmentation and inpainting for scenes with box structure and regular patterns on the surfaces." I think this description is very accurate. The paper was well received. The initial concern about how the method would perform on non-regular scenes, unlike the ones assumed in the paper, was addressed by the authors in the rebuttal phase. Despite the narrow scope of the paper (box scenes), reviewers liked the idea of program induction that involves both 3D structure and 2D observations, and they recommend acceptance. I agree with this assessment and suggest a poster.