{"title": "Multiscale Random Fields with Application to Contour Grouping", "book": "Advances in Neural Information Processing Systems", "page_first": 913, "page_last": 920, "abstract": "We introduce a new interpretation of multiscale random fields (MSRFs) that admits efficient optimization in the framework of regular (single level) random fields (RFs). It is based on a new operator, called append, that combines sets of random variables (RVs) to single RVs. We assume that a MSRF can be decomposed into disjoint trees that link RVs at different pyramid levels. The append operator is then applied to map RVs in each tree structure to a single RV. We demonstrate the usefulness of the proposed approach on a challenging task involving grouping contours of target shapes in images. MSRFs provide a natural representation of multiscale contour models, which are needed in order to cope with unstable contour decompositions. The append operator allows us to find optimal image labels using the classical framework of relaxation labeling. Alternative methods like Markov Chain Monte Carlo (MCMC) could also be used.", "full_text": "Multiscale Random Fields with Application to Contour Grouping

Longin Jan Latecki, Dept. of Computer and Info. Sciences, Temple University, Philadelphia, USA, latecki@temple.edu
Marc Sobel, Statistics Dept., Temple University, Philadelphia, USA, marc.sobel@temple.edu
ChengEn Lu, Dept. of Electronics and Info. Eng., Huazhong Univ. of Sci. and Tech., China, luchengen@gmail.com
Xiang Bai, Dept. of Electronics and Info. Eng., Huazhong Univ. of Sci. and Tech., China, xiang.bai@gmail.com

Abstract

We introduce a new interpretation of multiscale random fields (MSRFs) that admits efficient optimization in the framework of regular (single level) random fields (RFs). It is based on a new operator, called append, that combines sets of random variables (RVs) to single RVs. 
We assume that a MSRF can be decomposed into disjoint trees that link RVs at different pyramid levels. The append operator is then applied to map RVs in each tree structure to a single RV. We demonstrate the usefulness of the proposed approach on a challenging task involving grouping contours of target shapes in images. It provides a natural representation of multiscale contour models, which is needed in order to cope with unstable contour decompositions. The append operator allows us to find optimal image segment labels using the classical framework of relaxation labeling. Alternative methods like Markov Chain Monte Carlo (MCMC) could also be used.

1 Introduction

Random Fields (RFs) have played an increasingly important role in the fields of image denoising, texture discrimination, image segmentation and many other important problems in computer vision. The images analyzed for these purposes typically have significant fractal properties which preclude the use of models operating at a single resolution level. Such models, which aim to minimize mean-squared estimation error, use only second-order image statistics, which fail to accurately characterize the images of interest. Multiscale random fields (MSRFs) resolve this problem by using information at many different resolution levels [2, 15, 5]. In [6], a probabilistic model of multiscale conditional random fields (mCRF) was proposed to segment images by labeling pixels using a predefined set of class labels.

The main difference between MSRFs or mCRFs as known in the literature, e.g., [2, 15, 6, 5], and the proposed MSRF is the interpretation of the connections between different scales (levels). 
In the proposed approach, the random variables (RVs) linked by a tree substructure across different levels compete for their label assignments, while in the existing approaches the goal is to cooperate in the label assignments, which is usually achieved by averaging. In other words, the label assignment of a parent node is usually enforced to be compatible with the label assignments of its children by averaging. In contrast, in the proposed approach the parent node and all its children compete for the best possible label assignment.

Contour grouping is one of the key approaches to object detection and recognition, which is a fundamental goal of computer vision. We introduce a novel MSRF interpretation, and show its benefits in solving the contour grouping problem. The MSRF allows us to cast contour grouping as contour matching. Detection and grouping by shape has been investigated in earlier work. The basic idea common to all methods is to define distance measures between shapes, and then accurately label and/or classify shapes using these measures. Classical methods of this type, such as shape contexts [1] and chamfer matching [13], cannot cope well with clutter and shape deformations. Some researchers described the shape of the entire object using deformable contour fragments and their relative positions [10, 12], but their detection results are often cluttered, grassy contour edges. Deformable template matching techniques often require either good initial positions or clean images (or both) to avoid (false) local minima [14, 9]. Recently, Ferrari et al. [4] have used the sophisticated edge detection methods of [8]; the resulting edges are linked to a network of connected contour segments by closing small gaps. Wu et al. 
[16] proposed an active basis model that provides a deformable template consisting of a small number of Gabor wavelet elements allowed to slightly perturb their locations and orientations.

Our grouping is also based on the edge detection of [8], but we do not perform edge linking directly for purposes of grouping. Instead, we match a given contour model to edge segments in images. This allows us to perform grouping and detection at the same time. Our method differs from former sampled-points-based matching methods [14, 3]; we match the contour segments from the given contour to segments in edge images directly.

We decompose a given closed contour of a model shape into a group of contour segments, and match the resulting contour segments to edge segments in a given image. Our model contour decomposition is flexible and admits a hierarchical structure, e.g., a parent contour segment is decomposed into two or more child segments. In this way, our model can adapt to different configurations of contour parts in edge images. The proposed MSRF interpretation allows us to formulate the problem of contour grouping as a soft label assignment problem. Since in our approach a parent node and all its children compete for the best possible label assignment, allowing us to examine multiple composite hypotheses of model segments in the image, a successful contour grouping of edge segments is possible even if significant contour parts are missing or distorted. The competition is made possible by the proposed append operator, which appends the random variables (RVs) representing the parent and all its children nodes to a single new RV. Since the connectivity relation between each pair of model segments is known, the soft label assignment and the competition for the best labels make accurate grouping results in real images possible.

We also want to stress that our grouping approach is based on matching of contour segments. 
The advantages of segment matching over alternative techniques based on point matching are at least twofold: 1) it permits deformable matching (i.e., the global shape will not be changed even when some segments shift or rotate a little); 2) it is more stable than point matching, since contour segments are more informative than points as shape cues.

2 Multiscale Random Fields

Given a set of data points X = {x_1, ..., x_n}, the goal of random fields is to find a label assignment f that maximizes the posterior probability p(f|X) (of that assignment):

f̂ = argmax_f p(f|X)    (1)

Thus, we want to select the label assignment with the largest possible probability given the observed data. Although the proposed method is quite general, for clarity of presentation, we focus on an application of interest to us: contour grouping based on contour part correspondence.

We take the contour of an example shape to be our shape model S. We assume that the model is composed of several contour segments s_1, ..., s_m. In our application, the data points X = {x_1, ..., x_n} are contour segments extracted by some low level process in a given image. The random field is defined by a sequence of random variables F = (F_1, ..., F_m) associated with the nodes s_i of the model graph. F represents the mapping of the nodes (model segments) S = {s_1, ..., s_m} to the data points X = {x_1, ..., x_n} (i.e., F : S → X). We write F_i = x_j to denote the event that the model segment s_i is assigned the image segment x_j by the map F. (Observe that usually the assignment is defined in the reverse direction, i.e., from an image to the model.)

Our goal is to find a label assignment f = (f_1, ..., f_m) ∈ X^m that maximizes the probability p(f|X) = p(F_1 = f_1, ..., F_m = f_m|X), i.e.,

f̂ = (f̂_1, ..., f̂_m) = argmax_{(f_1,...,f_m)} p(F_1 = f_1, ..., F_m = f_m|X)    (2)

However, the object contour in the given image (which is composed of some subset of segments in X = {x_1, ..., x_n}) may have a different decomposition into contour segments than is the case for the model s_1, ..., s_m. This is the case, for example, if some parts of the true contour are missing, i.e., some s_i may not correspond to parts in X. Therefore, a shape model is needed that can provide robust detection and recognition under these conditions. We introduce such a model by imposing a multiscale structure on contour segments of the model shape. Let the lowest level zero represent the finest subdivision of a given model contour S into the segments S^0 = {s^0_1, ..., s^0_{m_0}}. The level-α partition subdivides the contour into the segments S^α = {s^α_1, ..., s^α_{m_α}} for α = 1, ..., β, where β denotes the highest (i.e., most coarse) pyramid level. For each pyramid level α, the segments S^α partition the model contour S, i.e., S = s^α_1 ∪ ... ∪ s^α_{m_α}. The segments S^α in level α refine the segments S^{α+1} in level α+1, i.e., segments in the level α+1 are unions of one or more consecutive segments in the level α. On each level α we have a graph structure G^α = (S^α, E^α), where E^α is the set of edges governing the relations between segments in S^α, and we have a forest composed of trees that link nodes at different levels. The number of the trees corresponds to the number of nodes on the highest level s^β_1, ..., s^β_{m_β}, since each of these nodes is the root of one tree. We denote these trees with T_1, ..., T_{m_β}. For example, in Fig. 1 we have eight segments on the level zero s^0_1, ..., s^0_8, and four segments on the level one

s^1_1 = s^0_1 ∪ s^0_2, s^1_2 = s^0_3 ∪ s^0_4, s^1_3 = s^0_5 ∪ s^0_6, s^1_4 = s^0_7 ∪ s^0_8.

This construction leads to a tree structure relation among segments at different levels. For example, T_1 is a tree with s^1_1 (segment 1) as a parent node and with two children s^0_1, s^0_2 (segments 5 and 6).

Figure 1: An example of a multiscale random field structure. (The figure shows the model contour with segments 1-4 on level S^1 and segments 5-12 on level S^0, linked by the trees T_1, ..., T_4.)

We associate a random variable F^α_i with each segment s^α_i. The range of each random variable F^α_i is the set of contour segments X = {x_1, ..., x_n} extracted in a given image. The random variables inherit the tree structure from the corresponding model segments. Thus, we obtain a multiscale random field with random variables (RVs)

F = (F^0_1, ..., F^0_{m_0}, ..., F^β_1, ..., F^β_{m_β}),    (3)

the relational structure (RS) G^α = (S^α, E^α), and trees T_1, ..., T_{m_β}. Our goal remains the same as stated in (2), but the graph structure of the underlying RF is significantly more complicated by the introduction of the multiscale tree relations. Therefore, the maximization in (2) is significantly more complicated as well. Usually, the computation in multiscale random fields is based on modeling the dependencies between the random variables related by the (aforementioned) tree structures.

In the proposed approach, we do not explicitly model these tree structure dependencies. Instead, we build relations between them using the construction of a new random variable that explicitly relates all random variables in each given tree. We introduce a new operator acting on random variables, called the append operator. The operator combines a given set of random variables Y = {Y_1, ..., Y_k} into a single random variable denoted

⊕Y = Y_1 ⊕ ... ⊕ Y_k.    (4)

For simplicity, we assume, in the definition below, that {Y_1, ..., Y_k} are discrete random variables taking values in the set X = {x_1, ..., x_n}. Our definition can be easily generalized to continuous random variables. The append random variable ⊕Y, with distribution defined below, takes values in the set of pairs {1, ..., k} × X. The distribution of the random variable ⊕Y is given by

p(⊕Y = (i, x_j)) = (1/k) · p(Y_i = x_j),    (5)

where index i is over the RVs and index j is over the labels. The intuition behind this construction can be explained by the following simple example. Let Y_1, Y_2 be two discrete random variables with distributions

(p(Y_1 = 1), p(Y_1 = 2), p(Y_1 = 3)) and (p(Y_2 = 1), p(Y_2 = 2), p(Y_2 = 3)),    (6)

then the distribution of Y_1 ⊕ Y_2 is simply given by the vector

1/2 · (p(Y_1 = 1), p(Y_1 = 2), p(Y_1 = 3), p(Y_2 = 1), p(Y_2 = 2), p(Y_2 = 3)).    (7)

Armed with this construction, we return to our multiscale RF with RVs in (3). Recall that the RVs representing the nodes on the highest level F^β_1, ..., F^β_{m_β} are the roots of the trees T_1, ..., T_{m_β}. By slightly abusing our notation, we define ⊕T_i as the append of all random variables that are nodes of tree T_i. This construction allows us to reduce the multiscale RF with RVs in (3) to a RF with RVs

T = (⊕T_1, ..., ⊕T_{m_β}).    (8)

The graph structure of this new RF is defined by the graph G = (T, E) such that

(⊕T_i, ⊕T_j) ∈ E iff ∃α ∃a,b: (F^α_a, F^α_b) ∈ E^α and F^α_a ∈ ⊕T_i and F^α_b ∈ ⊕T_j.    (9)

In simple words, ⊕T_i and ⊕T_j are related in G iff on some level α both trees have related random variables.

The construction in (8) and (9) maps a multiscale RF to a single level RF, i.e., to a random field with a simple graph structure G. The intuition is that we collapse all graphs G^α = (S^α, E^α) for α = 1, ..., β to a single graph G = (T, E) by gluing all RVs in each tree T_i to a single RV ⊕T_i. Consequently, any existing RF optimization method can be applied to compute

t̂ = (t̂_1, ..., t̂_{m_β}) = argmax_{(t_1,...,t_{m_β})} p(⊕T_1 = t_1, ..., ⊕T_{m_β} = t_{m_β}|X).    (10)

We observe that when optimizing the new RF in (10), we can simply perform separate optimizations on each level, i.e., on each level α we optimize (8) with respect to the graph structure G^α. Hence at each level α we choose the maximum a posteriori estimate associated with the random field at that level. Our key contribution is the fact that these optimizing estimators are linked by the internal structure of the RVs ⊕T_i.

After optimizing a regular RF in (10) that contains append RVs, we obtain as the solution updated distributions of the append RVs. From them, we can easily reconstruct the updated distributions of the original RVs from the multiscale RF in (2) by the construction of the append RVs. For example, if we obtain (1/10, 3/5, 1/10, 0, 1/10, 1/10) as the updated distribution of some RV Y_1 ⊕ Y_2, then we can easily derive the updated distributions of Y_1, Y_2 as

(p(Y_1 = 1) = 1/8, p(Y_1 = 2) = 3/4, p(Y_1 = 3) = 1/8) and (p(Y_2 = 1) = 0, p(Y_2 = 2) = 1/2, p(Y_2 = 3) = 1/2).

To obtain the distributions of the compound RVs Y_1, Y_2, we only need to ensure that both distributions of Y_1 and Y_2 sum to one. Since we are usually interested in selecting a variable assignment with maximum posterior probability (10), we do not need to derive these distributions. Consequently, in this example, it is sufficient for us to determine that the assignment of Y_1 to label 2 maximizes Y_1 ⊕ Y_2.

Going back to our application in contour grouping, the RV ⊕T_2 is an append of three RVs representing segments 2, 7, 8 in Fig. 1. We observe that the RVs appended to ⊕T_2 compete in the label assignment. For example, if a given assignment of RV ⊕T_2 to an image segment, say x_5, maximizes ⊕T_2, then, by the position in the discrete distribution of ⊕T_2, we can clearly identify which RV is the winner, i.e., which of the model segments 2, 7, 8 is assigned to image segment x_5. We can also make this competition soft (with more than one winner) if we select local maxima of the discrete distribution of ⊕T_2, which may lead to assigning more than one of the model segments 2, 7, 8 to image segments. In the computation model presented in the next section, we focus on finding a global maximum for each RV ⊕T_i.

3 Computing the label assignment with relaxation labeling

There exist several approaches to compute the assignment f that optimizes the relational structure of a given RF [7], i.e., approaches that solve Eq. (10), which is our formulation of the general RF Eq. (2). 
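The append construction (5) and the recovery of per-variable distributions illustrated above can be sketched as follows. This is a minimal illustration with our own helper names and data layout (each compound RV is stored as one block of the append distribution), not the authors' implementation:

```python
# Sketch of the append operator (Eqs. 4-5) and marginal recovery.
# A distribution over labels is a plain list of probabilities; an append
# distribution is a list of such blocks, one block per compound RV.

def append(dists):
    """Combine k label distributions into one append distribution over
    {1..k} x labels, weighting every entry by 1/k as in Eq. (5)."""
    k = len(dists)
    return [[p / k for p in d] for d in dists]

def recover(joint):
    """Recover the per-variable distributions from an (updated) append
    distribution by renormalizing each block so it sums to one."""
    out = []
    for block in joint:
        s = sum(block)
        out.append([p / s if s > 0 else 0.0 for p in block])
    return out
```

On the worked example from the text, `recover([[1/10, 3/5, 1/10], [0, 1/10, 1/10]])` yields the distributions (1/8, 3/4, 1/8) for Y_1 and (0, 1/2, 1/2) for Y_2.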
In our implementation, we use a particularly simple approach of relaxation labeling introduced by Rosenfeld et al. in [11]. However, a more powerful class of MCMC methods could also be used [7]. In this section, we briefly describe the relaxation labeling (RL) method, and how it fits into our framework.

We recall that our goal is to find a label assignment t = (t_1, ..., t_m) that maximizes the probability p(t|X) = p(⊕T_1 = t_1, ..., ⊕T_m = t_m|X) in Eq. (10), where we have shortened m = m_β. One of the key ideas of using RL is to decompose p(t|X) into individual probabilities p(⊕T_a = (i_a, x_j)), where index a = 1, ..., m ranges over the RVs of the RF, index j = 1, ..., n ranges over the possible labels, which in our case are the contour segments X = {x_1, ..., x_n} extracted from a given image, and index i_a ranges over the RVs that are appended to ⊕T_a, which we denote with i_a ∈ a. For brevity, we use the notation

p_a(i_a, x_j) = p(⊕T_a = (i_a, x_j)).

Going back to our example in Fig. 1, p_2(7, x_5) denotes the probability that contour segment 7 is assigned to image segment x_5, and 2 is the index of RV ⊕T_2. We recall that ⊕T_2 is an append of three RVs representing segments 2, 7, 8 in Fig. 1. In Section 5, p_2(7, x_5) is modeled as a Gaussian of the shape dissimilarity between model contour segment 7 and image contour segment x_5.

As is usually the case for RFs, we also consider binary relations between RVs that are adjacent in the underlying graph structure G = (T, E), which represent conditional probabilities p(⊕T_a = (i_a, x_j) | ⊕T_b = (i_b, x_k)). They express the compatibility of these label assignments. 
Again for brevity, we use the notation

C_{a,b}((i_a, x_j), (i_b, x_k)) = p(⊕T_a = (i_a, x_j) | ⊕T_b = (i_b, x_k)).

For example, C_{2,3}((7, x_5), (9, x_8)) models the compatibility of the assignment of model segment 7 (part of model tree 2) to image segment x_5 with the assignment of model segment 9 (part of model tree 3) to image segment x_8. This compatibility is a function of geometric relations between the segments. Since segment 9 is above segment 7 in the model contour, it is reasonable to assign high compatibility only if the same holds for the image segments, i.e., x_8 is above x_5.

The RL algorithm iteratively estimates the change in the probability p_a(i_a, x_j) by:

δp_a(i_a, x_j) = Σ_{b=1,...,m: b≠a} Σ_{i_b ∈ b} Σ_{x_k ∈ X: x_k ≠ x_j} C_{a,b}((i_a, x_j), (i_b, x_k)) · p_b(i_b, x_k),    (11)

where b varies over all append random variables ⊕T_b different from ⊕T_a and i_b varies over all compound RVs that are combined by append to ⊕T_b. Then the probability is updated by

p_a(i_a, x_j) = p_a(i_a, x_j)[1 + δp_a(i_a, x_j)] / Σ_{i_a ∈ a} Σ_{x_k ∈ X} p_a(i_a, x_k)[1 + δp_a(i_a, x_k)].    (12)

The double sum in the denominator simply normalizes the distribution of ⊕T_a so that it sums to one. The RL algorithm in our framework iterates steps (11) and (12) for all a = 1, ..., m (append RVs), all i_a ∈ a, and all labels x_j ∈ X. It can be shown that the RL algorithm is guaranteed to converge, but not necessarily to a global maximum [7].

4 A contour grouping example

We provide a simple but real example to illustrate how our multiscale RF framework solves a concrete contour grouping instance. We use the contour model presented in Fig. 1. Let F_i be a RV corresponding to model contour segment s_i for i = 1, ..., 12. We have two levels S^0 = {F_5, ..., F_12} and S^1 = {F_1, ..., F_4}. Both graph structures G^0 and G^1 are complete graphs. 
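Before continuing with the example, the relaxation labeling sweep of Section 3 can be sketched as follows. This is a minimal illustration under our own assumptions (probabilities stored as `p[a][ia][j]`, the compatibility supplied as a hypothetical callable `C(a, ia, j, b, ib, k)`), not the authors' implementation:

```python
# One sweep of the relaxation-labeling updates, Eqs. (11) and (12).
# p[a][ia][j]: probability that compound RV ia of append RV a takes label j.

def rl_sweep(p, C):
    m = len(p)
    new_p = []
    for a in range(m):
        n = len(p[a][0])
        # Eq. (11): support for assignment (ia, j) gathered from all other
        # append RVs b, their compound RVs ib, and all labels k != j.
        delta = [[sum(C(a, ia, j, b, ib, k) * p[b][ib][k]
                      for b in range(m) if b != a
                      for ib in range(len(p[b]))
                      for k in range(n) if k != j)
                  for j in range(n)]
                 for ia in range(len(p[a]))]
        # Eq. (12): multiplicative update, normalized over the whole
        # append distribution of RV a so that it sums to one.
        upd = [[p[a][ia][j] * (1.0 + delta[ia][j]) for j in range(n)]
               for ia in range(len(p[a]))]
        z = sum(sum(row) for row in upd)
        new_p.append([[v / z for v in row] for row in upd])
    return new_p
```

Iterating `rl_sweep` keeps each append distribution normalized while amplifying mutually compatible assignments; a fixed iteration count (the paper uses 200) serves as the stopping rule.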
As described in Section 2, we have a MSRF with four trees. The append RVs determined by these trees are:

⊕T_1 = F_1 ⊕ F_5 ⊕ F_6, ⊕T_2 = F_2 ⊕ F_7 ⊕ F_8, ⊕T_3 = F_3 ⊕ F_9 ⊕ F_10, ⊕T_4 = F_4 ⊕ F_11 ⊕ F_12.

We obtain a regular (single level) RF with the four append RVs, T = (⊕T_1, ⊕T_2, ⊕T_3, ⊕T_4), and with the graph structure G = (T, E) determined by Eq. (9).

Given an image as in Fig. 2(a), we first compute its edge map shown in Fig. 2(b), and use low level edge linking to obtain the edge segments in Fig. 2(c). The 16 edge segments in Fig. 2(c) form our label set X = {x_1, x_2, ..., x_16}. Our goal is to find the label assignment to RVs ⊕T_a for a = 1, 2, 3, 4 with maximum posterior probability (10). However, the label set of each append RV is different, e.g., the label set of ⊕T_1 is equal to {1, 5, 6} × X, where ⊕T_1 = (1, x_5) denotes the assignment F_1 = x_5 representing the mapping of model segment 1 to image segment 5. Hence p_1(i_a, x_j) = p(⊕T_1 = (i_a, x_j)) for i_a = 1, j = 5 denotes the probability of mapping model segment 1 to image segment 5.

As described in Section 3, we use relaxation labeling to compute the maximum posterior probability (10). Initially, all probabilities p_a(i_a, x_j) are set based on shape similarity between the involved model and image segments. The assignment compatibilities are determined using the geometric relations described in Section 5. After 200 iterations, RL finds the best assignment for each RV ⊕T_a, as Fig. 2(d) illustrates. They are presented in the format RV: model segment → edge segment:

⊕T_1: 1 → x_12; ⊕T_2: 5 → x_10; ⊕T_3: 8 → x_7; ⊕T_4: 4 → x_5.

Observe that many model segments remained unmatched, since they do not have any corresponding segments in the image 2(c). 
This very desirable property results from the label assignment competition within each append RV ⊕T_a for a = 1, 2, 3, 4. This fact demonstrates one of the main benefits of the proposed approach. We stress that we do not use any penalties for non-matching, which are usually used in classical RFs (e.g., nil variables in [7]), but are very hard to set in real applications.

Figure 2: (c) The 16 edge segments form our label set X = {x_1, x_2, ..., x_16}. (d) The numbers and colors indicate the assignment of the model segments from Fig. 1.

5 Geometric contour relations

In this section, we provide a brief description of the contour segment relations used to assign labels for contour grouping. Two kinds of relations are defined. First, the probability p_a(i_a, x_j) is set to be a Gaussian of the shape dissimilarity between model segment i_a and image segment x_j. The shape dissimilarity is computed by matching sequences of tangent directions at their sample points. To make our matching scale invariant, we sample each model and image segment with the same number of sample points. We also consider four binary relations to measure the compatibility between a pair of model segments and a pair of image segments: d^(1)(i, i′), the maximum distance between the end-points of two contour segments i and i′; d^(2)(i, i′), the minimum distance between the end-points of two contour segments i and i′; d^(3)(i, i′), the direction from the mid-point of i to the mid-point of i′; d^(4)(i, i′), the distance between the mid-points of i and i′. To make our relations scale invariant, all distances are normalized by the sum of the lengths of segments i and i′. Then the compatibility between a pair of model segments i_a, i_b and a pair of image segments x_j, x_k is given by a mixture of Gaussians:

C_{a,b}((i_a, x_j), (i_b, x_k)) = (1/4) Σ_{r=1}^{4} N(d^(r)(i_a, i_b) − d^(r)(x_j, x_k), σ^(r)).    (13)

6 Experimental results

We begin with a comparison between the proposed append MSRF and a single level RF. Given an edge map in Fig. 3(b) extracted by the edge detector of [8], we employ a low level edge linking method to obtain edge segments as shown in Fig. 3(c), where the 27 edge segments form our label set X = {x_1, ..., x_27}. Fig. 3(d) illustrates our shape contour model and its two level multiscale structure of 10 contour segments. Fig. 3(e) shows the result of contour grouping obtained in the framework of the proposed append MSRF. The numbers and colors indicate the assignment of the model segments. The benefits of the flexible multiscale model structure are clearly visible. Out of 10 model segments, only 4 have corresponding edge segments in the image, and our approach correctly determined a label assignment reflecting this fact.

In contrast, this is not the case for a single level RF. Fig. 3(f) shows a model with a fixed single level structure, and its contour grouping result computed with classical RL can be found in Fig. 3(g). We observe that model segment 2 on the giraffe's head has no matching contour in the image, but is nevertheless incorrectly assigned. This wrong assignment influences model contour 4, and leads to another wrong assignment. In the proposed approach, model contours 2 and 3 in Fig. 3(d) compete for label assignments. Since contour 3 finds a good match in the image, we correctly obtain (through our append RV structure) that there is no match for segment 2.

Figure 3: (d-g) Comparison of results obtained by the proposed MSRF to a single level RF.

By mapping the model segments to the image segments, we enforce the existence of a solution. Even if no target shape is present in a given image, our approach will "hallucinate" a matching configuration of edge segments in the image. A standard alternative in the framework of random fields is to use a penalty for non-matching (dummy or null nodes). However, this requires several constants, and it is a highly nontrivial problem to determine their values. In our approach, we can easily distinguish hallucinated contours from true contours, since when the RF optimization is completed, we obtain the assignment of contour segments, i.e., we know a global correspondence between model segments and image segments. Based on this correspondence, we compute a global shape similarity, and discard solutions with low global similarity to the model contour. This requires only one threshold on global shape similarity, which is relatively easy to set, and our experimental results verify this fact. In Figs. 4 and 5, we show several examples of contour grouping obtained by the proposed MSRF method on the ETHZ data set [4]. We only use two contour models, the swan model (Fig. 1) and the giraffe model (Fig. 3(d)). Their original images are included as shape models in the ETHZ data set. Model contours are decomposed into segments by introducing break points at high curvature points. 
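As a concrete illustration of the segment relations d^(1)-d^(4) and the compatibility (13) from Section 5, the following is a hedged sketch. Segments are assumed to be given by their two end-points, and the function names and σ values are our own; the paper does not specify its σ^(r) settings:

```python
import math

# Sketch of the four scale-invariant segment relations d(1)-d(4) and the
# mixture-of-Gaussians compatibility of Eq. (13).

def relations(seg_i, seg_ip):
    """seg_*: ((x1, y1), (x2, y2)) end-points of a contour segment."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])
    def mid(s):
        return ((s[0][0] + s[1][0]) / 2.0, (s[0][1] + s[1][1]) / 2.0)
    ends = [dist(p, q) for p in seg_i for q in seg_ip]
    length = dist(*seg_i) + dist(*seg_ip)   # normalizer for scale invariance
    mi, mp = mid(seg_i), mid(seg_ip)
    d3 = math.atan2(mp[1] - mi[1], mp[0] - mi[0])  # direction, mid to mid
    return (max(ends) / length, min(ends) / length, d3, dist(mi, mp) / length)

def compatibility(model_pair, image_pair, sigmas=(1.0, 1.0, 1.0, 1.0)):
    """Average of four Gaussians of the relation differences, Eq. (13)."""
    dm, di = relations(*model_pair), relations(*image_pair)
    def gauss(x, s):
        return math.exp(-x * x / (2 * s * s)) / (s * math.sqrt(2 * math.pi))
    return sum(gauss(a - b, s) for a, b, s in zip(dm, di, sigmas)) / 4.0
```

Identical model and image pairs score highest, since all four relation differences are zero; any geometric discrepancy lowers the average Gaussian value.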
Edge contour segments in the test images have been automatically computed by a low level edge linking process. Noise and shape variations cause the edge segments to vary a lot from image to image. We also observe that the grouped contours contain internal edge structures.

7 Conclusions

Since edges, and consequently contour parts, vary significantly in real images, it is necessary to make the decomposition of model contours into segments flexible. The proposed multiscale construction permits us to have a very flexible decomposition that can adapt to different configurations of contour parts in the image. We introduce a novel multiscale random field interpretation based on the append operator that leads to efficient optimization. We applied the new algorithm to the ETHZ data set to illustrate the application potential of the proposed method.

Figure 4: ETHZ data set grouping results for the giraffe model.

Figure 5: ETHZ data set grouping results for the swan model.

Acknowledgments

This work was supported in part by the NSF Grants IIS-0534929, IIS-0812118 in the Robust Intelligence Cluster and by the DOE Grant DE-FG52-06NA27508.

References

[1] S. Belongie, J. Malik, and J. Puzicha. Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Analysis and Machine Intelligence, 24(4):509-522, 2002.
[2] C. A. Bouman and M. Shapiro. A multiscale random field model for Bayesian image segmentation. IEEE Trans. on Image Processing, 3(2):162-177, 1994.
[3] H. Chui and A. Rangarajan. A new algorithm for non-rigid point matching. In CVPR, 2000.
[4] V. Ferrari, L. Fevrier, F. Jurie, and C. Schmid. Groups of adjacent contour segments for object detection. IEEE Trans. PAMI, 2008.
[5] A. R. Ferreira and H. K. H. Lee. Multiscale Modeling: A Bayesian Perspective. Springer-Verlag, Springer Series in Statistics, 2007.
[6] X. He, R. S. Zemel, and M. A. Carreira-Perpinan. Multiscale conditional random fields for image labeling. In CVPR, volume 2, pages 695-702, 2004.
[7] S. Z. Li. Markov Random Field Modeling in Image Analysis. Springer-Verlag, Tokyo, 2001.
[8] D. Martin, C. Fowlkes, and J. Malik. Learning to detect natural image boundaries using local brightness, colour and texture cues. IEEE Trans. PAMI, 26:530-549, 2004.
[9] G. McNeill and S. Vijayakumar. Part-based probabilistic point matching using equivalence constraints. In NIPS, 2006.
[10] A. Opelt, A. Pinz, and A. Zisserman. A boundary-fragment-model for object detection. In ECCV, 2006.
[11] A. Rosenfeld, R. Hummel, and S. Zucker. Scene labeling by relaxation operations. IEEE Trans. on Systems, Man and Cybernetics, 6:420-433, 1976.
[12] J. Shotton, A. Blake, and R. Cipolla. Contour-based learning for object detection. In ICCV, 2005.
[13] A. Thayananthan, B. Stenger, P. H. S. Torr, and R. Cipolla. Shape context and chamfer matching in cluttered scenes. In CVPR, 2003.
[14] Z. Tu and A. L. Yuille. Shape matching and recognition using generative models and informative features. In ECCV, 2004.
[15] A. S. Willsky. Multiresolution Markov models for signal and image processing. Proceedings of the IEEE, 90:1396-1458, 2002.
[16] Y. N. Wu, Z. Si, C. Fleming, and S.-C. Zhu. Deformable template as active basis. In ICCV, 2007.
", "award": [], "sourceid": 335, "authors": [{"given_name": "Longin", "family_name": "Latecki", "institution": null}, {"given_name": "Chengen", "family_name": "Lu", "institution": null}, {"given_name": "Marc", "family_name": "Sobel", "institution": null}, {"given_name": "Xiang", "family_name": "Bai", "institution": null}]}