{"title": "Very loopy belief propagation for unwrapping phase images", "book": "Advances in Neural Information Processing Systems", "page_first": 737, "page_last": 743, "abstract": null, "full_text": "Very loopy belief propagation for unwrapping phase images

Brendan J. Frey1, Ralf Koetter2, Nemanja Petrovic1,2

1 Probabilistic and Statistical Inference Group, University of Toronto
http://www.psi.toronto.edu
2 Electrical and Computer Engineering, University of Illinois at Urbana

Abstract

Since the discovery that the best error-correcting decoding algorithm can be viewed as belief propagation in a cycle-bound graph, researchers have been trying to determine under what circumstances "loopy belief propagation" is effective for probabilistic inference. Despite several theoretical advances in our understanding of loopy belief propagation, to our knowledge, the only problem that has been solved using loopy belief propagation is error-correcting decoding on Gaussian channels. We propose a new representation for the two-dimensional phase unwrapping problem, and we show that loopy belief propagation produces results that are superior to existing techniques. This is an important result, since many imaging techniques, including magnetic resonance imaging and interferometric synthetic aperture radar, produce phase-wrapped images. Interestingly, the graph that we use has a very large number of very short cycles, supporting evidence that a large minimum cycle length is not needed for excellent results using belief propagation.

1 Introduction

Phase unwrapping is an easily stated, fundamental problem in image processing (Ghiglia and Pritt 1998). Each real-valued observation on a 1- or 2-dimensional grid is measured modulo a known wavelength, which we take to be 1 without loss of generality. Fig.
1b shows the wrapped, 1-dimensional waveform obtained from the original waveform shown in Fig. 1a. Every time the original waveform goes above 1 or below 0, it is wrapped to 0 or 1, respectively. The goal of phase unwrapping is to infer the original, unwrapped curve from the wrapped measurements, using knowledge about which signals are more probable a priori.

In two dimensions, exact phase unwrapping is exponentially more difficult than 1-dimensional phase unwrapping and has been shown to be NP-hard in general (Chen and Zebker 2000). Fig. 1c shows the wrapped output of a magnetic resonance imaging device, courtesy of Z.-P. Liang. Notice the "fringe lines" - boundaries across which wrappings have occurred. Fig. 1d shows the wrapped terrain height measurements from an interferometric synthetic aperture radar, courtesy of Sandia National Laboratories, New Mexico.

Figure 1: (a) A waveform measured on a 1-dimensional grid. (b) The phase-wrapped version of the waveform in (a), where the wavelength is 1. (c) A wrapped intensity map from a magnetic resonance imaging device, measured on a 2-dimensional grid (courtesy of Z.-P. Liang). (d) A wrapped topographic map measured on a 2-dimensional grid (courtesy of Sandia National Laboratories, New Mexico).

A sensible goal in phase unwrapping is to infer the gradient field of the original surface. The surface can then be reconstructed by integration. Equivalently, the goal is to infer the number of relative wrappings, or integer "shifts", between every pair of neighboring measurements. Positive shifts correspond to an increase in the number of wrappings in the direction of the x or y coordinate, whereas negative shifts correspond to a decrease in the number of wrappings in the direction of the x or y coordinate.
After arbitrarily assigning an absolute number of wrappings to one point, the absolute number of wrappings at any other point can be determined by summing the shifts along a path connecting the two points. To account for direction, when taking a step against the direction of the coordinate, the shift should be subtracted.

When neighboring measurements are more likely to be close together than far apart a priori, 1-dimensional waveforms can be unwrapped optimally in time that is linear in the waveform length. For every pair of neighboring measurements, the shift that makes the unwrapped values as close together as possible is chosen. For example, the shift between 0.4 and 0.5 would be 0, whereas the shift between 0.9 and 0.0 would be -1.

For 2-dimensional surfaces and images, there are many possible 1-dimensional paths between any two points. These paths should be examined in combination, since the sum of the shifts along every such path should be equal. Viewing the shifts as state variables, the cut-set between any two points is exponential in the size of the grid, making exact inference for general priors NP-hard (Chen and Zebker 2000).

The two leading fully-automated techniques for phase unwrapping are the least squares method and the branch cut technique (Ghiglia and Pritt 1998). (Some other techniques perform better in some circumstances, but need additional information or require hand-tweaking.) The least squares method begins by making a greedy guess at the gradient between every pair of neighboring points. The resulting vector field is not the gradient field of a surface, since in a valid gradient field the sum of the gradients around every closed loop must be zero (that is, the curl must be 0). For example, the 2 x 2 loop of measurements 0.0, 0.3, 0.6, 0.9 will lead to gradients of 0.3, 0.3, 0.3, 0.1 around the loop, which do not sum to 0.
The least squares method proceeds by projecting the vector field onto the linear subspace of gradient fields. The result is integrated to produce the surface. The branch cut technique also begins with greedy decisions for the gradients and then identifies untrustworthy regions of the image whose gradients should not be used during integration. As shown in our results section, both of these techniques are suboptimal.

Previously, we attempted to use a relaxed mean field technique to solve this problem (Achan, Frey and Koetter 2001). Here, we take a new approach that works better and is motivated by the impressive results of belief propagation in cycle-bound graphs for error-correcting decoding (Wiberg, Loeliger and Koetter 1995; MacKay and Neal 1995; Frey and Kschischang 1996; Kschischang and Frey 1998; McEliece, MacKay and Cheng 1998). In contrast to other work (Ghiglia and Pritt 1998; Chen and Zebker 2000; Koetter et al. 2001), we introduce a new framework for quantitative evaluation, which places belief propagation much closer to the theoretical limit than other leading methods.

It is well known that belief propagation (a.k.a. the sum-product algorithm, probability propagation) is exact in graphs that are trees (Pearl 1988), but it has been discovered only recently that it can produce excellent results in graphs with many cycles. Impressive results have been obtained using loopy belief propagation for super-resolution (Freeman and Pasztor 1999) and for inferring layered representations of scenes (Frey 2000). However, despite several theoretical advances in our understanding of loopy belief propagation (cf. Weiss and Freeman 2001) and proposals for modifications to the algorithm (cf. Yedidia, Freeman and Weiss 2001), to our knowledge, the only problem that has been solved by loopy belief propagation is error-correcting decoding on Gaussian channels.
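The optimal 1-dimensional procedure described above (for each neighboring pair, pick the shift that makes the unwrapped values as close together as possible) can be sketched in a few lines. This is our own illustrative sketch using NumPy; the function names and the sign convention (the shift is the rounded wrapped difference, matching the 0.4/0.5 and 0.9/0.0 examples in the text) are assumptions, not code from the paper:

```python
import numpy as np

def greedy_shifts(phi):
    # Shift between each neighboring pair: the round of the wrapped
    # difference, e.g. (0.4, 0.5) -> 0 and (0.9, 0.0) -> -1 as in the text.
    return np.round(np.diff(phi)).astype(int)

def unwrap_1d(phi):
    # Subtracting the cumulative shifts makes every neighboring pair of
    # unwrapped values as close together as possible (wavelength taken as 1).
    phi = np.asarray(phi, dtype=float)
    s = greedy_shifts(phi)
    return phi - np.concatenate(([0], np.cumsum(s)))
```

Applied to a wrapped ramp such as `np.linspace(0, 3, 13) % 1`, this recovers the original ramp exactly (up to the arbitrary anchor at the first sample), and it runs in time linear in the waveform length, as the text states.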
\n\nWe conjecture that although phase unwrapping is generally NP-hard, there exists a \nnear-optimal phase unwrapping algorithm for Gaussian process priors. Further, we \nbelieve that algorithm to be loopy belief propagation. \n\n2 Loopy Belief Propagation for Phase Unwrapping \nAs described above, the goal is to infer the number of relative wrappings, or integer \n\"shifts\" , between every pair of neighboring measurements. Denote the x-direction \nshift at (x,y) by a(x , y) and the y-direction shift at (x , y) by b(x, y), as shown in \nFig.2a. If the sum of the shifts around every short loop of 4 shifts (e.g., a(x,y) + \nb(x + l,y) - a(x , y + 1) - b(x,y) in Fig. 2a) is zero, then perturbing a path will \nnot change the sum of the shifts along the path. So, a valid set of shifts S = \n{a(x,y) , b(x, y) : x = 1, ... , N -1;y = 1, .. . , M -I} in an N x M image must \nsatisfy the constraint \n\na(x,y) + b(x + l,y) - a(x,y + 1) - b(x,y) = 0, \n\n(1) \n\nfor x = 1, ... , N -1, Y = 1, ... , M -1. Since a(x, y) +b(x+ 1, y) -a(x, y+ 1) -b(x, y) \nis a measure of curl at (x, y), we refer to (1) as a \"zero-curl constraint\", reflecting \nthe fact that the curl of a gradient field is O. In this way, phase unwrapping is \nformulated as the problem of inferring the most probable set of shifts subject to \nsatisfying all zero-curl constraints. \n\nWe assume that given the set of shifts, the unwrapped surface is described by a low(cid:173)\norder Gaussian process. The joint distribution over the shifts S = {a(x, y), b(x, y) : \nI} and the wrapped measurements * = {\u00a2(x, y) : \nx = 1, ... , N - 1; Y = 1, ... , M -\n\n\f(a) \n\n(b) \n\nx-direction shifts (' a's) \n\n(x, y + l ) X a(x,y+ l ) X (x+ l ,y+ l ) \n\n-7 \n\nb(x, y ) I \n\nI b(x+ l ,y) \n\nX -7 X \n\na(x,y) \n\n(x,y) \n\n(x+ l ,y) \n\nVi' t.: \n~-r1 X \n\n. 
\n\nX \n\nX \n\nX \n\nX \n\nX \n\n[f X \n\nX \n\nX \n\nX \n\nX \n\n) X \n\nX \n\n) X \n\nX \n\nX \n\nX \n\nX \n\nX \n\nX \n\nX \n\nX \n\nX \n\nX \n\nX \n\nX \n\nX \n\nX \n\nX \n\nX \n\nX \n\nX \n\nX \n\nX \n\n(d) \n\na(x, y) \n\n~ it2 t \nit l t \n\nFigure 2: (a) Positive x-direction shifts (arrows labeled a) and positive y-direction shifts \n(arrows labeled b) between neighboring measurements in a 2 X 2 patch of points (marked by \nX 's) , (b) A graphical model that describes the zero-curl constraints (black discs) between \nneighboring shift variables (white discs), 3-element probability vectors (J-L's) on the relative \nshifts between neighboring variables (-1, 0, or +1) are propagated across the network: (c) \nConstraint-to-shift vectors are computed from incoming shift-to-constraint vectors; (d) Shift(cid:173)\nto-constraint vectors are computed from incoming constraint-to-shift vectors; (d) Estimates of \nthe marginal probabilities of the shifts given the data are computed by combining incoming \nconstra i nt-to-sh ift vectors, \n0 :::; r/J(x, y) < 1, x = 1, .. . , N; y = 1, . . . , M } can be expressed in the form \nex: II II 5(a(x,y) +b(x +1 ,y) - a(x,y +1 ) -b(x,y)) \n\nP(S , **(x+l,y)-c/>(x,y)-a(x,y))2/2u2 II II e-(C/>(x,y+1)-c/>(x,y)-b(x,y))2/ 2u2 . (2) \n\nN M-l \n\nx= l y=l \n\nx= l y=l \n\nThe zero-curl constraints are enforced by 5 (.), which evaluates to 1 if its argument is \no and evaluates to 0 otherwise. We assume t he slope of the surface is limited so that \nthe unknown shifts take on the values -1 , 0 and 1. a 2 is the variance between two \nneighboring measurements in the unwrapped image, but we find t hat in practice it \ncan be estimated directly from t he wrapped image. \nPhase unwrapping consists of making inferences about the a's and b's in the above \nprobability model. For example, the marginal probability t hat t he x-direction shift \nat (x,y) is k given an observed wrapped image ** , is \n\nP (a(x,y) = kl**., on a log-scale. 
\n(The plot for the mean squared error in the surface heights looks similar.) As >. -+ 0, \nunwrapping becomes impossible and as >. -+ 00, unwrapping becomes trivial (since \nno wrappings occur), so algorithms have waterfall-shaped curves. \n\n\fThe belief propagation algorithm clearly obtains significantly lower reconstruction \nerrors. Viewed another way, belief propagation can tolerate much lower wrapping \nwavelengths for a given reconstruction error. Also, it turns out that for this surface, \nit is impossible for an algorithm that infers relative shifts of -1,0 and 1 to obtain a \nreconstruction error of 0, unless A ::::: 12.97. Belief propagation obtains a zero-error \nwavelength that is significantly closer to this limit than the least squares method \nand the branch cuts technique. \n\n4 Conclusions \nPhase unwrapping is a fundamental problem in image processing and although it \nhas been shown to be NP-hard for general priors (Chen and Zebker 2000), we \nconjecture there exists a near-optimal phase unwrapping algorithm for Gaussian \nprocess priors. Further, we believe that algorithm to be loopy belief propagation. \nOur experimental results show that loopy belief propagation obtains significantly \nlower reconstruction errors compared to the least squares method and the branch \ncuts technique (Ghiglia and Pritt 1998) , and performs close to the theoretical limit \nfor techniques that infer relative wrappings of -1, 0 and + 1. The belief propagation \nalgorithm runs in about the same time as the other techniques. \nReferences \nAchan, K. , Frey, B. J. , and Koetter, R. 2001. A factorized variational technique for phase \nunwrapping in Markov random fields. In Uncertainty in Artificial Intelligence 2001. \nSeattle, Washington. \n\nChen, C. W. and Zebker, H. A. 2000. Network approaches to two-dimensional phase \nunwrapping: intractability and two new algorithms. Journal of the Optical Society \nof America A, 17(3):401- 414. \n\nFreeman, W. and Pasztor, E. 
1999. Learning low-level vision. In Proceedings of the International Conference on Computer Vision, pages 1182-1189.

Frey, B. J. 2000. Filling in scenes by propagating probabilities through layers and into appearance models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

Frey, B. J. and Kschischang, F. R. 1996. Probability propagation and iterative decoding. In Proceedings of the 34th Allerton Conference on Communication, Control and Computing 1996.

Ghiglia, D. C. and Pritt, M. D. 1998. Two-Dimensional Phase Unwrapping: Theory, Algorithms and Software. John Wiley & Sons.

Koetter, R., Frey, B. J., Petrovic, N., and Munson, Jr., D. C. 2001. Unwrapping phase images by propagating probabilities across graphs. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing. IEEE Press.

Kschischang, F. R. and Frey, B. J. 1998. Iterative decoding of compound codes by probability propagation in graphical models. IEEE Journal on Selected Areas in Communications, 16(2):219-230.

MacKay, D. J. C. and Neal, R. M. 1995. Good codes based on very sparse matrices. In Boyd, C., editor, Cryptography and Coding: 5th IMA Conference, number 1025 in Lecture Notes in Computer Science, pages 100-111. Springer, Berlin, Germany.

McEliece, R. J., MacKay, D. J. C., and Cheng, J. F. 1998. Turbo-decoding as an instance of Pearl's 'belief propagation' algorithm. IEEE Journal on Selected Areas in Communications, 16.

Pearl, J. 1988. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, San Mateo, CA.

Weiss, Y. and Freeman, W. 2001. On the optimality of solutions of the max-product belief propagation algorithm in arbitrary graphs. IEEE Transactions on Information Theory, Special Issue on Codes on Graphs and Iterative Algorithms, 47(2):736-744.

Wiberg, N., Loeliger, H.-A., and Koetter, R. 1995.
Codes and iterative decoding on general graphs. European Transactions on Telecommunications, 6:513-525.

Yedidia, J., Freeman, W. T., and Weiss, Y. 2001. Generalized belief propagation. In Advances in Neural Information Processing Systems 13. MIT Press, Cambridge, MA.
", "award": [], "sourceid": 2126, "authors": [{"given_name": "Brendan", "family_name": "Frey", "institution": null}, {"given_name": "Ralf", "family_name": "Koetter", "institution": null}, {"given_name": "Nemanja", "family_name": "Petrovic", "institution": null}]}