{"title": "Logic and MRF Circuitry for Labeling Occluding and Thinline Visual Contours", "book": "Advances in Neural Information Processing Systems", "page_first": 1153, "page_last": 1159, "abstract": "", "full_text": "Logic and MRF Circuitry for Labeling\nOccluding and Thinline Visual Contours\n\nEric Saund\n\nPalo Alto Research Center\n\n3333 Coyote Hill Rd.\nPalo Alto, CA 94304\nsaund@parc.com\n\nAbstract\n\nThis paper presents representation and logic for labeling contrast edges\nand ridges in visual scenes in terms of both surface occlusion (border\nownership) and thinline objects. In natural scenes, thinline objects in-\nclude sticks and wires, while in human graphical communication thin-\nlines include connectors, dividers, and other abstract devices. Our analy-\nsis is directed at both natural and graphical domains. The basic problem\nis to formulate the logic of the interactions among local image events,\nspeci\ufb01cally contrast edges, ridges, junctions, and alignment relations,\nsuch as to encode the natural constraints among these events in visual\nscenes. In a sparse heterogeneous Markov Random Field framework, we\nde\ufb01ne a set of interpretation nodes and energy/potential functions among\nthem. The minimum energy con\ufb01guration found by Loopy Belief Prop-\nagation is shown to correspond to preferred human interpretation across\na wide range of prototypical examples including important illusory con-\ntour \ufb01gures such as the Kanizsa Triangle, as well as more dif\ufb01cult ex-\namples. In practical terms, the approach delivers correct interpretations\nof inherently ambiguous hand-drawn box-and-connector diagrams at low\ncomputational cost.\n\n1 Introduction\n\nA great deal of attention has been paid to the curious phenomenon of illusory contours in\nvisual scenes [5]. The most famous example is the Kanizsa Triangle (Figure 1). 
Although\na number of explanations have been proposed, computational accounts have converged on\nthe understanding that illusory contours are an outcome of the more general problem of\nlabeling scene contours in terms of causal events such as surface overlap. Illusory con-\ntours are the visual system\u2019s way of expressing belief in an occlusion relation between\ntwo surfaces having the same lightness and therefore lacking a visible contrast edge. The\nphenomena are interesting in their revelation of interactions among multiple factors com-\nprising the visual system\u2019s prior assumptions about what constitutes likely interpretations\nof ambiguous input.\n\nSeveral computational models for this process have generated interpretations of Kanizsa-\nlike \ufb01gures corresponding to human perception. Williams[9] formulated an integer-linear\n\n\fFigure 1: a. Original Kanizsa Triangle. b. Solid surface version. c. Human preferred\ninterpretation. d, e. Other valid interpretations.\n\noptimization problem with hard constraints originating from the topology of contours and\njunctions, and soft constraints representing \ufb01gural biases for non-accidental interpretations\nand \ufb01gural closure. Heitger and von der Heydt[2] implemented a series of nonlinear \ufb01l-\ntering operations that enacted interactions among line terminations and junctions to infer\nmodal completions corresponding to illusory contours. Geiger[1] used a dense Markov\nRandom Field to represent surface depths explicitly and propagated local evidence through\na diffusion process. 
Saund[6] enumerated possible generic and non-generic interpretations\nof T- and L-junctions to set up an optimization problem solved by deterministic annealing.\nLiu and Wang[4] set up a network of contours traversing the boundaries of segmented re-\ngions, which interact to propagate local information through an iterative updating scheme.\n\nThis paper expands this body of previous work in the following ways:\n\n\u2022 The computational model is expressed in terms of a sparse heterogeneous Markov\nRandom Field whose solution is accessible to fast techniques such as Loopy Belief\nPropagation.\n\n\u2022 We introduce interpretations of thinlines in addition to solid surfaces, adding a\nsigni\ufb01cant layer of richness and complexity.\n\n\u2022 The model infers occlusion relations of surfaces depicted by line drawings of their\nborders, as well as solid graphics depictions.\n\n\u2022 We devise MRF energy functions that implement circuitry for sophisticated\nlogical constraints of the domain.\n\nThe result is a formulation that is both fast and effective at correctly interpreting a greater\nrange of psychophysical and near-practical contour con\ufb01guration examples than has heretofore\nbeen demonstrated. The model exposes aspects of fundamental ambiguity to be re-\nsolved by the incorporation of additional constraints and domain-speci\ufb01c knowledge.\n\n2 Interpretation Nodes and Relations\n\n2.1 Visible Contours and Contour Ends\n\nEarly vision studies commonly distinguish several models for visible contour creation and\nmeasurement, including contrast edges, lines or ridges, ramps, color and texture edges, etc.\nLet us idealize to consider only contrast edges and ridges (also known as \u201cbars\u201d), mea-\nsured at a single scale. We include in our domain of interest human-generated graphical\n\n\fFigure 2: a. Sample image region. b. 
Spatial relation categories characterizing links in\nthe MRF among Contour End nodes: Corner, Near Alignment, Far Alignment, Lateral. c.\nResulting MRF including nodes of type Visible Contour, Contour End, Corner Tie, and\nCorner Tie Mediator.\n\n\ufb01gures. Contrast edges arise from distinct regions or surfaces, while ridges may represent\neither a boundary between regions or else a \u201cthinline\u201d, i.e. a physical or graphical object\nwhose shape is essentially de\ufb01ned by a one-dimensional path at our scale of measurement.\nExamples of thinlines in photographic imagery include twigs, sidewalk cracks, and tele-\nphone wires, while in graphical images thinlines include separators, connectors, and arrow\nshafts. Figure 7e shows a hand-drawn sketch in which some lines (measured as ridges) are\nintended to de\ufb01ne boxes and therefore represent region boundaries, while others are con-\nnectors between boxes. We take the contour interpretation problem to include the analysis\nof this type of scene in addition to classical illusory contour \ufb01gures.\n\nFor any input data, we may construct a Markov Random Field consisting of four types of\nnodes derived from measured contrast edge and ridge contours. An interpretation is an\nassignment of states to nodes. Local potentials and the potential matrices associated with\npairwise links between nodes encode constraints and biases among interpretation states\nbased on the spatial relations among the visible contours. Figure 2 illustrates MRF node\ntypes and links for a simple example input image, as explained below.\n\nLet us assume that contours de\ufb01ning region boundaries are assigned an occlusion direction,\nequivalent to relative surface depth and hence boundary ownership. Figure 3 shows the pos-\nsible mappings between visible image contours measured as contrast edges or ridges, and\ntheir interpretation in terms of direction of surface overlap or else thinline object. 
Contrast\nedges always correspond to surface occlusion, while ridges may represent either a surface\nboundary or a thinline object. Correspondingly, the simplest MRF node type is the Visible\nContour node, which has state dimension 3 corresponding to two possible overlap directions\nand one thinline interpretation.\n\nMost of the interesting evidence and interaction occurs at terminations and junctions of\nvisible contours. Contour End nodes are given the job of explaining why a smooth visible\nedge or ridge contour has terminated visibility, and hence they will encode the bulk of the\nmodal (illusory) and amodal (occluded) completion information of a computed interpreta-\ntion. Smooth visible contours may terminate in four ways:\n\n\fFigure 3: Permissible mappings between visible edge and ridge contours and interpreta-\ntions. Wedges indicate direction of surface overlap: white (FG) surface occludes shaded\n(BG) surface.\n\n1. The surface boundary contour or thinline object changes direction (turns a corner).\n\n2. The contour becomes modal because the background surface lacks a visible edge\nwith the foreground surface.\n\n3. The contour becomes amodal because it becomes occluded by another surface.\n\n4. The contour simply terminates when a surface overlap meets the end of a fold,\nor when a thin object or graphic stops.\n\nContour Ends therefore have 3x4 = 12 interpretation states as shown in Figure 4.\n\nFigure 4: Contour End nodes have state dimension 12 indicating contour overlap\ntype/direction (overlap or thinline) and one of four explanations for termination of the\nvisible contour.\n\nEvery Visible Contour node is linked to its two corresponding Contour End nodes through\nenergy matrices (or equivalently, potential matrices, using Potential \u03c8 = exp(\u2212E))\nrepresenting simple compatibility among overlap direction/thinline interpretation states. 
Additional\nlinks in the network are created based on spatial relations among Contour Ends as described\nnext.\n\n\fFigure 5: a. Corner Tie nodes have state dimension 6 indicating the causal relationship\nbetween the Contour End nodes they link. b. Energy matrix linking the Left Contour End\nof a pair of corner-relation Contour Ends to their Corner Tie. X indicates high energy\nprohibiting the state combination. EA refers to a low penalty for Accidental Coincidence\nof the Contour Ends. EDC refers to a (typically low) penalty for two Contour Ends failing\nto meet the ideal geometrical constraints of meeting at a corner. The subscripts refer to\nnecessary Near-Alignment Relations on the Contour Ends. The energy matrix linking the\nRight Contour End to the Corner Tie swaps the 5th and 6th columns.\n\n2.2 Contour Ends Relation Links\n\nLet us consider \ufb01ve classes of pairwise geometric relations among observed contour ends:\nCorner, Near-Alignment, Far-Alignment, Lateral, and Unrelated. Mathematical expres-\nsions forming the bases for these relations may be engineered as measures of distance and\nsmooth continuation such as used by Saund [6]. The Corner relation depends only on\nproximity; Near-Alignment depends on proximity and alignment; Far-Alignment omits the\nproximity requirement.\n\nWithin this framework a further re\ufb01nement distinguishes ridge Contour Ends from those\narising from contrast edges. Namely, ridge ends are permitted to form Lateral relation links\nwhich correspond to potential modal contours. Contrast edge Contour Ends are excluded\nfrom this link type because they terminate at junctions which distribute modal and amodal\ncompletion roles to their participating Contour Ends. 
Contour End nodes from ridge con-\ntours may participate in Far-Alignment links but their local energies are set to preclude\nthem from taking states representing modal completions.\n\nIn this way the present model \ufb01xes the topology of related ends in the process of setting up\nthe Markov Graph. An important problem for future research is to formulate the Markov\nGraph to include all plausible Contour End pairings and have the actual pairings sort them-\nselves out at solution time.\n\nBiases about preferred and less-preferred interpretations are represented through the terms\nin the energy matrices linking related Contour Ends. In accordance with prior work, we\nbias energy terms associated with curved Visible Contours and junctions of Contour Ends\nin favor of convex object interpretations. Space limitations preclude presenting the energy\nmatrices in detail, but we discuss the main novel and signi\ufb01cant considerations.\n\nThe simplest case is pairs of Contour Ends sharing a Near-Alignment or Far-Alignment\nrelation. These energy matrices are constructed to trade off priors regarding accidental\nalignment versus amodal or modal invisible contour completion interpretations. For Con-\n\n\fFigure 6: The Corner Tie Mediator node restricts border ownership of occluding contours\nto physically consistent interpretations. The energy matrix shown in e links the Corner Tie\nMediator to the Left Corner Tie of a pair sharing a Contour End. 
X indicates high energy.\nThe energy matrix for the link to the Right Corner Tie swaps the second and third columns.\n\ntour End pairs that are relatively near and well aligned, energy terms corresponding to\ncausally unrelated interpretations (CE states 0,1,2) are large, while terms corresponding to\namodal completion with compatible overlap/thinline property (CE states 6,7,8) are small.\nActual energy values for the matrices are assigned by straightforward formulas derived\nfrom the Proximity and Smooth Continuation terms mentioned above. Per Kanizsa, modal\ncompletion interpretations (CE states 3,4,5) are somewhat more expensive than amodal\ninterpretations, by a constant factor. Energy terms shift their relative weights in favor of\ncausally unrelated interpretations (CE corner states 0,1,2) as the Contour Ends become\nmore distant and less aligned.\n\nContour Ends sharing a Corner relation can be related in one of three ways:\nthey can\nbe causally unrelated and unordered in depth; they can represent a turning of a surface\nboundary or thinline object; they can represent overlap of one contour above the other. In\norder to exploit the geometry of Contour Ends as local evidence, these alternatives must be\narticulated and entered into the MRF node graph. To do this we therefore introduce a third\ntype of node, the Corner Tie node, possessing six states as illustrated in Figure 5a.\n\nThe energy matrix relating Contour End nodes and Corner Tie nodes is shown in Figure\n5b. It contains low energy terms representing the Corner Tie\u2019s belief that the Contour End\ntermination is due to direction change (turning a corner). It also contains low energy terms\nrepresenting the conditions of one Contour End\u2019s owning surface overlapping the other\ncontour, i.e. 
the relative depth relation between these contours in the scene.\n\n2.3 Constraints on Overlaps and Thinlines at Junctions\n\nPhysical considerations impose hard constraints on the interpretations of End Pairs meeting\nat a junction. Consider the T-junction in Figure 6a. One preferred interpretation for a\nT-junction is occlusion (6b). A less-preferred but possible interpretation is a change of\ndirection (corner) by one surface, with accidental alignment by another contour (6c). What\nis impossible is for a surface boundary to bifurcate and \u201cbelong\u201d to both sides of the T (6d).\n\nThis type of constraint cannot be enforced by the purely pairwise Corner Tie node. We\ntherefore introduce a fourth node type, the Corner Tie Mediator. This node governs the\nnumber of Corner Ties that any Contour End can claim to form a direction change (corner\nturn) relation with. The energy matrix for the Corner Tie Mediator node is shown in Figure\n6e: multiple Corner-Ties in the overlap direction-turn states (CT states 1 & 2) are excluded\n(solid arrows). But note that the matrix contains a low energy term (dashed arrow) for\nthe formation of multiple direction-turn Corner-Ties provided they are in the Thinline state\n(CT state 3); branching of thinline objects is physically permissible.\n\n\f3 Experiments and Conclusion\n\nLoopy Belief Propagation under the Max-Product algorithm seeks the MAP con\ufb01gura-\ntion which is equivalent to the minimum-energy assignment of states [8]. We have not\nencountered a failure of LBP to converge, and it is quite rare to encounter a lower-energy\nassignment of states than the algorithm delivers starting from an initial uniform distribution\nover states. However, multiple stable \ufb01xed points can exist. 
For some ambiguous \ufb01gures\nsuch as Figure 7e in which qualitatively different interpretations have similar energies, one\nmay clamp one or more nodes to alternative states, leading to LBP solutions which persist\nonce the clamping is removed. This invites the exploration of N-best con\ufb01guration solution\ntechniques [10].\n\nFigure 7 demonstrates MAP assignments corresponding to preferred human interpretations\nof the classic Kanizsa illusory contour \ufb01gure and others containing both aligning L-junction\nand ridge termination evidence for modal contours, amodal completions, and thinline ob-\njects. Note that the MRF correctly predicts that outline drawings of surface boundaries do\nnot induce illusory contours.\n\nFigure 7g borrows from experiments by Szummer and Cowans[7] toward a practical appli-\ncation in line drawing interpretation, in which closed boxes de\ufb01ne regions while connectors\nremain interpreted as thinline objects. For this scene containing 369 nodes and 417 links,\nthe entire process of forming the MRF and performing 100 iterations of LBP takes less\nthan a second. The major pressures operating in these situations are a \ufb01gural bias toward\ninterpreting closed paths as convex regions, and a preference to interpret ridge contours\nparticipating in T- and X- junctions as thinline objects.\n\nWe have shown how explicit consideration of ridge features and thinline interpretations\nbrings new complexity to the logic of sorting out depth relations in visual scenes. This\ninvestigation suggests that a sparse heterogeneous Markov Random Field approach may\nprovide a suitable basis for such models.\n\nReferences\n\n[1] Geiger, D., Kumaran, K, & Parida, L. (1996) Visual organization for \ufb01gure/ground separation. in\nProc. IEEE CVPR pp. 155-160.\n[2] Heitger, F., & von der Heydt, R. (1993) A Computational Model of Neural Contour Processing:\nFigure-Ground Segregation and Illusory Contours. Proc. ICCV \u201993.\n[3] Kanizsa, G. 
(1979) Organization in Vision, Praeger, New York.\n[4] Liu, X., Wang, D. (2000) Perceptual Organization Based on Temporal Dynamics. in S.A. Solla,\nT.K. Leen, K.-R. Muller (eds.), Advances in Neural Information Processing Systems 12, pp. 38-44.\nMIT Press.\n[5] Petry, S., & Meyer, G. (eds.) (1987) The Perception of Illusory Contours, Springer-Verlag, New\nYork.\n[6] Saund, E. (1999) Perceptual Organization of Occluding Contours of Opaque Surfaces, CVIU V.\n76, No. 1, pp. 70-82.\n[7] Szummer, M., & Cowans, P. (2004) Incorporating Context and User Feedback in Pen-Based\nInterfaces. AAAI TR FS-04-06 (Papers from the 2004 AAAI Fall Symposium.)\n[8] Weiss, Y., and Freeman, W.T. (2001) On the optimality of solutions of the max-product belief\npropagation algorithm in arbitrary graphs, IEEE Trans. Inf. Theory 47:2, pp. 723-735.\n[9] Williams, L. (1990) Perceptual Organization of Occluding Contours. Proc. ICCV \u201990. pp. 639-\n649.\n[10] Yanover, C. and Weiss, Y. (2003) Finding the M Most Probable Con\ufb01gurations Using Loopy\nBelief Propagation. in S. Thrun, L. Saul and B. Sch\u00f6lkopf, eds., Advances in Neural Information\nProcessing Systems 16, MIT Press.\n", "award": [], "sourceid": 2892, "authors": [{"given_name": "Eric", "family_name": "Saund", "institution": null}]}