{"title": "Sparse Signal Recovery Using Markov Random Fields", "book": "Advances in Neural Information Processing Systems", "page_first": 257, "page_last": 264, "abstract": "Compressive Sensing (CS) combines sampling and compression into a single sub-Nyquist linear measurement process for sparse and compressible signals. In this paper, we extend the theory of CS to include signals that are concisely represented in terms of a graphical model. In particular, we use Markov Random Fields (MRFs) to represent sparse signals whose nonzero coefficients are clustered. Our new model-based reconstruction algorithm, dubbed Lattice Matching Pursuit (LaMP), stably recovers MRF-modeled signals using many fewer measurements and computations than the current state-of-the-art algorithms.", "full_text": "Sparse Signal Recovery Using Markov Random Fields\n\nVolkan Cevher\nRice University\n\nvolkan@rice.edu\n\nChinmay Hegde\nRice University\n\nchinmay@rice.edu\n\nMarco F. Duarte\nRice University\n\nduarte@rice.edu\n\nRichard G. Baraniuk\n\nRice University\n\nrichb@rice.edu\n\nAbstract\n\nCompressive Sensing (CS) combines sampling and compression into a single sub-\nNyquist linear measurement process for sparse and compressible signals. In this\npaper, we extend the theory of CS to include signals that are concisely repre-\nsented in terms of a graphical model. In particular, we use Markov Random Fields\n(MRFs) to represent sparse signals whose nonzero coef\ufb01cients are clustered. Our\nnew model-based recovery algorithm, dubbed Lattice Matching Pursuit (LaMP),\nstably recovers MRF-modeled signals using many fewer measurements and com-\nputations than the current state-of-the-art algorithms.\n\n1 Introduction\nThe Shannon/Nyquist sampling theorem tells us that in order to preserve information when uni-\nformly sampling a signal we must sample at least two times faster than its bandwidth. 
In many important and emerging applications, the resulting Nyquist rate can be so high that we end up with too many samples and must compress in order to store or transmit them. In other applications, including imaging systems and high-speed analog-to-digital converters, increasing the sampling rate or density beyond the current state-of-the-art is very expensive. A transform compression system reduces the effective dimensionality of an N-dimensional signal by re-representing it in terms of a sparse expansion in some basis (for example, the discrete cosine transform for JPEG). By sparse we mean that only K ≪ N of the basis coefficients are nonzero.\nThe new theory of compressive sensing (CS) combines sampling and compression into a single sub-Nyquist linear measurement process for sparse signals [1, 2]. In CS, we measure not periodic signal samples but rather inner products with M < N known measurement vectors; random measurement vectors play a starring role. We then recover the signal by searching for the sparsest signal that agrees with the measurements. Research in CS to date has focused on reducing both the number of measurements M (as a function of N and K) and the computational complexity of the recovery algorithm. Today's state-of-the-art CS systems can recover K-sparse and more general compressible signals using M = O(K log(N/K)) measurements using polynomial-time linear programming or greedy algorithms.\n\nWhile such sub-Nyquist measurement rates are impressive, our contention in this paper is that for CS to truly live up to its name it must more fully leverage concepts from state-of-the-art compression algorithms. In virtually all such algorithms, the key ingredient is a signal model that goes beyond simple sparsity by providing a model for the basis coefficient structure. For instance, JPEG does not only use the fact that most of the DCT coefficients of a natural image are small. 
Rather, it also exploits the fact that the values and locations of the large coefficients have a particular structure that is characteristic of natural images. Coding this structure using an appropriate model enables JPEG and other similar algorithms to compress images close to the maximum amount possible, and significantly better than a naive coder that just assigns bits to each large coefficient independently.\n\nIn this paper, we extend the theory of CS to include signals that are concisely represented in terms of a graphical model [3]. We use Markov Random Fields (MRFs) to represent sparse signals whose nonzero coefficients also cluster together. Our new model-based recovery algorithm, dubbed Lattice Matching Pursuit (LaMP), performs rapid and numerically stable recovery of MRF-modeled signals using far fewer measurements than standard algorithms.\n\nThe organization of the paper is as follows. In Sections 2 and 3, we briefly review the CS and MRF theories. We develop LaMP in Section 4 and present experimental results in Section 5 using both simulated and real-world data. We conclude by offering our perspective on the future directions of model-based CS research in Section 6.\n2 Compressive sensing: From sparsity to structured sparsity\nSparse signal recovery. Any signal x ∈ R^N can be represented in terms of N coefficients {θi} in a basis {ψi}; stacking the ψi as columns into the N × N matrix Ψ, we can write succinctly that x = Ψθ. We say that x has a sparse representation if only K ≪ N entries of θ are nonzero, and we denote by ΩK the set of (N choose K) possible supports for such K-sparse signals. We say that x is compressible if the sorted magnitudes of the entries of θ decay rapidly enough that it can be well approximated as K-sparse.\nIn Compressive Sensing (CS), the signal is not acquired by measuring x or θ directly. 
Rather, we measure the M < N linear projections y = Φx = ΦΨθ using the M × N matrix Φ. In the sequel, without loss of generality, we focus on two-dimensional image data and assume that Ψ = I (the N × N identity matrix) so that x = θ. The most commonly used criterion for evaluating the quality of a CS measurement matrix is the restricted isometry property (RIP). A matrix Φ satisfies the K-RIP if there exists a constant δK > 0 such that for all K-sparse vectors x,\n\n(1 − δK) ||x||_2^2 ≤ ||Φx||_2^2 ≤ (1 + δK) ||x||_2^2.   (1)\n\nThe recovery of the set of significant coefficients θi is achieved using optimization: we search for the sparsest θ that agrees with the measurements y. While in principle recovery is possible using a matrix that has the 2K-RIP with δ2K < 1, such an optimization is combinatorially complex (NP-complete) and numerically unstable. If we instead use a matrix that has the 3K-RIP with δ3K < 1/2, then numerically stable recovery is possible in polynomial time using either a linear program [1, 2] or a greedy algorithm [4]. Intriguingly, a random Gaussian or Bernoulli matrix works with high probability, leading to a randomized acquisition protocol instead of uniform sampling.\nStructured sparsity. While many natural and manmade signals and images can be described to first order as sparse or compressible, their sparse supports (sets of nonzero coefficients) often have an underlying order. This order plays a central role in the transform compression literature, but it has barely been explored in the CS context [5, 6]. The theme of this paper is that by exploiting a priori information on coefficient structure in addition to signal sparsity, we can make CS better, stronger, and faster.\n\nFigure 1 illustrates a real-world example of structured sparse support in a computer vision application. 
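The measurement model and the near-isometry property (1) can be illustrated with a short numerical sketch. This is illustrative only, not code from the paper; the sizes N, K, M and the random seed are arbitrary choices:

```python
import numpy as np

# Illustrative sketch (not the paper's code): acquire a K-sparse signal
# with M < N random Gaussian measurements and check the RIP-style
# energy preservation of (1). All sizes here are arbitrary choices.
rng = np.random.default_rng(0)
N, K, M = 1024, 16, 160

# K-sparse signal x (we take Psi = I, as in the paper, so x = theta)
x = np.zeros(N)
support = rng.choice(N, size=K, replace=False)
x[support] = rng.standard_normal(K)

# Random Gaussian measurement matrix, scaled so E[||Phi x||^2] = ||x||^2
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
y = Phi @ x  # the M compressive measurements

# ||Phi x||_2^2 / ||x||_2^2 should lie in [1 - delta_K, 1 + delta_K]
ratio = (np.linalg.norm(y) / np.linalg.norm(x)) ** 2
print(f"M/N = {M/N:.2f}, energy ratio = {ratio:.3f}")
```

For a single fixed x the energy ratio concentrates around 1; the RIP demands this uniformly over all K-sparse x, which is why random matrices need on the order of K log(N/K) rows.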
Figure 1(b) is a background subtracted image computed from a video sequence of a parking lot with two moving people (one image frame is shown in Figure 1(a)). The moving people form the foreground (white in (b)), while the rest of the scene forms the background (black in (b)). The background subtraction was computed from CS measurements of the video sequence. Background subtracted images play a fundamental role in making inferences about objects and activities in a scene and, by nature, they have structured spatial sparsity corresponding to the foreground innovations. In other words, compared to the scale of the scene, the foreground innovations are usually not only sparse but also clustered in a distinct way, e.g., corresponding to the silhouettes of humans and vehicles. Nevertheless, this clustering property is not exploited by current CS recovery algorithms.\nProbabilistic RIP. The RIP treats all possible K-sparse supports equally. However, if we incorporate a probabilistic model on our signal supports and consider only the signal supports with the highest likelihoods, then we can potentially do much better in terms of the number of measurements required for stable recovery.\n\nWe say that Φ satisfies the (K, ε)-probabilistic RIP (PRIP) if there exists a constant δK > 0 such that for a K-sparse signal x generated by a specified probabilistic signal model, (1) holds with probability at least 1 − ε over the signal probability space. We propose a preliminary result on the\n\nFigure 1: A camera surveillance image (a) with the background subtracted image (b) recovered using compressive measurements of the scene. The background subtracted image has resolution N = 240 × 320 and sparsity K = 390. (c) A random K = 390 sparse image in N = 240 × 320 dimensions. 
The probability of image (b) under the Ising model is approximately 10^856 times greater than the probability of image (c).\n\nnumber of random measurements needed under this new criterion; this is a direct consequence of Theorem 5.2 of [8]. (See also [9] for related results.)\nLemma 1. Suppose that M, N, and δ ∈ [0, 1] are given and that the signal x is generated by a known probabilistic model. Let ΩK,ε ⊆ ΩK denote the smallest set of supports for which the probability that a K-sparse signal x has supp(x) ∉ ΩK,ε is less than ε, and denote D = |ΩK,ε|. If Φ is a matrix with normalized i.i.d. Gaussian or Bernoulli/Rademacher (±1) random entries, then Φ has the (K, ε)-PRIP with probability at least 1 − e^{−c2 M} if M ≥ c1 (K + log(D)), where c1, c2 > 0 depend only on the PRIP constant δK.\n\nTo illustrate the significance of the above lemma, consider the following probabilistic model for an N-dimensional, K-sparse signal. We assume that the locations of the non-zeros follow a homogeneous Poisson process with rate λ = − log(ε/K) N^{−α}, where α ≪ 1. Thus, a particular non-zero coefficient occurs within a distance of N^α of its predecessor with probability 1 − ε/K. We determine the size of the likely K-sparse support set ΩK,ε under this particular signal model using a simple counting argument. The location of the first non-zero coefficient is among the first N^α indices with probability 1 − ε/K. After fixing the location of the first coefficient, the location of the second coefficient is among the next N^α indices immediately following the first location with probability 1 − ε/K. 
Proceeding this way, after the locations of the first j − 1 coefficients have been fixed, the jth non-zero coefficient is among N^α candidate locations with probability 1 − ε/K. In this way, we obtain a set of supports ΩK,ε of size N^{αK} that will occur with probability (1 − ε/K)^K > 1 − ε. Thus for the (K, ε)-PRIP to hold for a random matrix, the matrix must have M = cK(1 + α log N) rows, as compared to the cK log(N/K) rows required for the standard K-RIP to hold. When α is on the order of (log N)^{−1}, the number of measurements required and the complexity of the solution method grow essentially linearly in K, which is a considerable improvement over the best possible M = O(K log(N/K)) measurements required without such a priori information.\n3 Graphical models for compressive sensing\nClustering of the nonzero coefficients in a sparse signal representation can be realistically captured by a probabilistic graphical model such as a Markov random field (MRF); in this paper we focus for concreteness on the classical Ising model [10].\nSupport model. We begin with an Ising model for the signal support. Suppose we have a K-sparse signal x ∈ R^N whose support is represented by s ∈ {−1, 1}^N such that si = −1 when xi = 0 and si = 1 when xi ≠ 0. The probability density function (PDF) of the signal support can be modeled using a graph Gs = (Vs, Es), where Vs = {1, . . . , N} denotes a set of N vertices – one for each of the support indices – and Es denotes the set of edges connecting support indices that are spatial neighbors (see Figure 2(a)). The contribution of the interaction between two elements {si, sj} in the support of x is controlled by the coefficient λij > 0. 
The contribution of each element si is controlled by a coefficient λi, resulting in the following PDF for the sparse support s:\n\np(s; λ) = exp{ Σ_{(i,j)∈Es} λij si sj + Σ_{i∈Vs} λi si − Zs(λ) },   (2)\n\nwhere Zs(λ) is a strictly convex partition function with respect to λ that normalizes the distribution so that it integrates to one. The parameter vector λ quantifies our prior knowledge regarding the signal support s and consists of the edge interaction parameters λij and the vertex bias parameters λi. These parameters can be learned from data using ℓ1-minimization techniques [11].\n\nFigure 2: Example graphical models: (a) Ising model for the support, (b) Markov random field model for the resulting coefficients, (c) Markov random field with CS measurements.\n\nThe Ising model enforces coefficient clustering. For example, compare the clustered sparsity of the real background subtracted image in Figure 1(b) with the dispersed “independent” sparsity of the random image in Figure 1(c). While both images (b) and (c) are equally sparse, under a trained Ising model (λij = 0.45 and λi = 0), the image (b) is approximately 10^856 times more likely than the image (c).\nSignal model. Without loss of generality, we focus on 2D images that are sparse in the space domain, as in Figure 1(b). Leveraging the Ising support model from above, we apply the MRF graphical model in Figure 2(b) for the pixel coefficient values. Under this model, the support is controlled by an Ising model, and the signal values are independent given the support. 
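The clustering preference encoded by the Ising support model can be seen in a small numerical sketch. This is illustrative, not the paper's code: the grid size and the two supports are invented, and only the unnormalized edge terms of (2) are compared, with λij = 0.45 and λi = 0 as quoted above:

```python
import numpy as np

# Illustrative sketch: unnormalized Ising log-probability of a support,
# i.e. the sum over 4-connected lattice edges of lambda_ij * s_i * s_j,
# with lambda_ij = 0.45 and lambda_i = 0 (the trained values quoted above).
def ising_edge_score(s, lam=0.45):
    """s: 2D array of +/-1 spins on a lattice."""
    horiz = np.sum(s[:, :-1] * s[:, 1:])  # horizontal edge terms
    vert = np.sum(s[:-1, :] * s[1:, :])   # vertical edge terms
    return lam * (horiz + vert)

n, k = 20, 25  # invented grid size and sparsity
rng = np.random.default_rng(1)

clustered = -np.ones((n, n))  # one 5x5 blob of nonzeros (k = 25)
clustered[5:10, 5:10] = 1

scattered = -np.ones((n, n))  # same sparsity, random locations
scattered.flat[rng.choice(n * n, size=k, replace=False)] = 1

# The clustered support pays for far fewer disagreeing edges, so its
# unnormalized log-probability is much larger.
print(ising_edge_score(clustered) - ising_edge_score(scattered))
```

The difference in edge scores, exponentiated, is exactly the kind of likelihood ratio behind the comparison of Figures 1(b) and 1(c).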
We now develop a joint PDF for the image pixel values x, the support labels s, and the CS measurements y.\n\nWe begin with the support PDF p(s) from (2) and assume that we are equipped with a sparsity-promoting PDF p(x|s) for x given s. The most commonly used PDF is the Laplacian density (which is related to the ℓ1-norm of x); however, other reference priors, such as generalized Gaussians that are related to the ℓp-norm of x, p < 1, can be more effective [12]. We assume that the measurements y are corrupted by i.i.d. Gaussian noise, i.e., p(y|x) = N(y | Φx, σ^2 I), where σ^2 is the unknown noise variance.\n\nFrom Figure 2(c), it is easy to show that, given the signal x, the signal support s and the compressive measurements y are independent using the D-separation property of graphs [13]. Hence, the joint distribution of the vertices in the graph in Figure 2(b) can be written as\n\np(z) = p(s, x, y) = p(s, x) p(y|s, x) = p(s) p(x|s) p(y|x),   (3)\n\nwhere z = [s^T, x^T, y^T]^T. Then, (3) can be explicitly written as\n\np(z) ∝ exp{ Σ_{(i,j)∈Es} λij si sj + Σ_{i∈Vs} [λi si + log p(xi|si)] − (1/(2σ^2)) ||y − Φx||_2^2 }.   (4)\n\n4 Lattice matching pursuit\nUsing the coefficient graphical model from Section 3, we are now equipped to develop a new model-based CS signal recovery algorithm. Lattice Matching Pursuit (LaMP) is a greedy algorithm for signals on 2D lattices (images) in which the likelihood of the signal support is iteratively evaluated and optimized under an Ising model. 
By enforcing a graphical model, (i) partial knowledge of the sparse signal support greatly decreases the ambiguity and thus the size of the search space for the remaining unknown part, accelerating the algorithm; and (ii) signal supports of the same size but different structures result in different likelihoods (recall Figures 1(b) and (c)), decreasing the required number of CS measurements and increasing the numerical stability.\nAlgorithm. The LaMP pseudocode is given in Algorithm 1. Similar to other greedy recovery algorithms such as matching pursuit and CoSaMP [4], each iteration of LaMP starts by estimating a data residual r{k} given the current estimate of the signal x{k−1} (Step 1). After calculating the residual, LaMP calculates a temporary signal estimate (Step 2), denoted by x{k}_t. This signal estimate is the sum of the previous estimate x{k−1} and Φ′r{k}, accounting for the current residual. Using this temporary signal estimate as a starting point, LaMP then maximizes the likelihood (4) over the support via optimization (Step 3). This can be efficiently solved using graph cuts with O(N) complexity [14].\n\nAlgorithm 1: LaMP – Lattice Matching Pursuit\nInput: y, Φ, x{0} = 0, s{0} = −1, and K̃ (desired sparsity).\nOutput: A K̃-sparse approximation x of the acquired signal.\nAlgorithm:\nrepeat {Matching Pursuit Iterations}\nStep 1. Calculate data residual: r{k} = y − Φ x{k−1};\nStep 2. Propose a temporary target signal estimate: x{k}_t = Φ′ r{k} + x{k−1};\nStep 3. Determine MAP estimate of the support using graph cuts: s{k} = arg max_{s ∈ {−1,+1}^N} Σ_{(i,j)∈Es} λij si sj + Σ_{i∈Vs} [λi si + log p([x{k}_t]i | si)];\nStep 4. Estimate target signal: t = 0; t[s{k} = 1] = Φ†[:, s{k} = 1] y; x{k} = Prune{t; K̃};\nStep 5. Iterate: k = k + 1;\nuntil Maximum iterations or ||r{k}|| < threshold;\nReturn x = x{k}.\n\nFigure 3: Geometrical approximations of log p(xi|si = −1) and log p(xi|si = +1).\n\nIn particular, for planar Ising models, the global minimum of the problem can be obtained. Once a likely signal support s{k} is obtained in Step 3, LaMP obtains an updated signal estimate x{k} using least squares with the selected columns of the measurement matrix Φ[:, s{k} = 1] and pruning back to the largest K̃ signal coefficients (Step 4). Hence, the parameter K̃ controls the sparsity of the approximation. In Step 4, a conjugate gradient method is used to efficiently perform the product by a pseudoinverse. If the graphical model includes dependencies between the signal values xi, we replace the pseudoinverse product by a belief propagation algorithm to efficiently solve for the signal values x{k} within Step 4.\nSignal log-likelihood log p(x|s). The correct signal PDF to use given the support, p(x|s), is problem-dependent. Here, we provide one approximation that mimics ℓ0 minimization for CS recovery for the signal graphical model in Figure 2(c); we also use this in our experiments in Section 5. The state si = 1 represents a nonzero coefficient; thus, all nonzero values of xi should have equal probability, and the value xi = 0 should have zero probability. 
Similarly, the state si = −1 represents a zero-valued coefficient; thus, the mass of its probability function is concentrated at zero. Hence, we use the following approximations for xi ∈ [−L, L], a restricted dynamic range: p(xi|si = −1) = δ(xi) and p(xi|si = 1) = (1 − δ(xi))/(2L). However, the optimization over the joint PDF in (4) requires a “smoothing” of these PDFs for two reasons: (i) to obtain robustness against noise and numerical issues; and (ii) to extend the usage of the algorithm from sparse to compressible signals.\nWe approximate log p(xi|si = ±1) using the parametric form illustrated in Figure 3. Here, the constant τ is a slack parameter to separate large and small signal coefficients, and ε1, ε2, and ε3 are chosen according to τ and L to normalize each PDF. We also denote a = ε3 L, with a ≈ 1. Using the normalization constraints, it is possible to show that as the dynamic range increases,\n\nlim_{L→∞} −(log ε2 / log ε1) → 1/(τ a)   and   lim_{L→∞} −(log ε3 / log ε1) → 0.\n\nHence, we approximate the likelihoods using utility functions U_{si}(x; τ) that follow this form. The optimization problem used by Step 3 of LaMP to determine the support is then approximately equivalent to the following problem:\n\ns{k+1} = arg max_{s ∈ {−1,+1}^N} Σ_{(i,j)∈Es} λ̃ij si sj + Σ_{i∈Vs} [λ̃i si + U_{si}([x{k+1}_t]i; τ)],   (5)\n\nwhere λ̃ = λ / log ε1. If the signal values are known to be positive, then the definitions of U_{si} can be changed to enforce the positivity during estimation. The choice of λ̃ij is related to the desired sparseness on the lattice structure. To enforce a desired sparsity K̃ on the lattice structure, we apply statistical mechanics results on the 2D Ising model and choose λ̃ij = 0.5 arcsinh((1 − m^8)^{−1/4}), where m is called the average magnetization. In our recovery problem, the average magnetization and the desired signal sparsity have a simple relationship: m = [(+1) × K̃ + (−1) × (N − K̃)]/N. We set λ̃i = 0 unless there is prior information on the signal support. The threshold τ is chosen at each iteration adaptively by sorting the magnitudes of the temporary target signal estimate coefficients and determining the 5K̃ threshold; this gives preference to the largest 5K̃ coefficients that attain states si = 1, unless the cost incurred by enforcing the lattice structure is too large. The pruning operation in Step 4 of LaMP then enforces the desired sparsity K̃.\n5 Experiments\nWe now use several numerical simulations to demonstrate that for spatially clustered sparse signals, which have high likelihood under our MRF model, LaMP requires far fewer measurements and computations for robust signal recovery than state-of-the-art greedy and optimization techniques.1\nExperiment 1: Shepp-Logan phantom. Figure 4 (top left) shows the classical N = 100 × 100 Shepp-Logan phantom image. Its sparsity in the space domain is K = 1740. We obtained compressive measurements of this image, which were then immersed in additive white Gaussian noise to an SNR of 10dB. The top row of Figure 4 illustrates the iterative image estimates obtained using LaMP from just M = 2K = 3480 random Gaussian measurements of the noisy target. Within 3 iterations, the support of the image is accurately determined; convergence occurs at the 5th iteration.\n\nFigure 4 (bottom) compares LaMP to CoSaMP [4], a state-of-the-art greedy recovery algorithm, and fixed-point continuation (FPC) [17], a state-of-the-art ℓ1-norm minimization recovery algorithm using the same set of measurements. 
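The matching-pursuit skeleton of Algorithm 1 can be sketched in a few lines. This is an illustrative stand-in, not the paper's implementation: the graph-cut MAP support step (Step 3) is replaced by plain hard thresholding of the proxy, so the Ising prior is ignored entirely, and all problem sizes, amplitudes, and seeds are arbitrary choices:

```python
import numpy as np

# Illustrative stand-in for the skeleton of Algorithm 1 (LaMP): the
# graph-cut MAP support step (Step 3) is replaced here by plain hard
# thresholding, so this ignores the Ising prior entirely. All problem
# sizes, amplitudes, and seeds are arbitrary choices, not the paper's.
rng = np.random.default_rng(2)
N, K, M = 256, 8, 100

x_true = np.zeros(N)
supp = rng.choice(N, size=K, replace=False)
x_true[supp] = rng.uniform(1.0, 3.0, K) * rng.choice([-1.0, 1.0], K)

Phi = rng.standard_normal((M, N)) / np.sqrt(M)
y = Phi @ x_true  # noiseless measurements

x = np.zeros(N)
for k in range(15):
    r = y - Phi @ x                  # Step 1: data residual
    if np.linalg.norm(r) < 1e-8:     # stopping rule
        break
    xt = x + Phi.T @ r               # Step 2: temporary estimate
    s = np.argsort(np.abs(xt))[-K:]  # Step 3 stand-in: thresholding
    t = np.zeros(N)                  # Step 4: least squares on the
    t[s] = np.linalg.lstsq(Phi[:, s], y, rcond=None)[0]  # selected columns
    x = t                            # already K-sparse, so no extra pruning

print(np.count_nonzero(x) <= K, np.linalg.norm(y - Phi @ x) < np.linalg.norm(y))
```

With the Ising term included, Step 3 would instead solve (5) with graph cuts, which is what lets LaMP keep clustered supports and reject scattered ones.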
Despite the presence of high noise (10dB SNR), LaMP perfectly recovers the signal support from only a small number of measurements. It also outperforms both CoSaMP and FPC in terms of speed.\nExperiment 2: Numerical stability. We demonstrate LaMP's stability in the face of substantial measurement noise. We tested both LaMP and FPC with a number of measurements that gave close to perfect recovery of the Shepp-Logan phantom in the presence of a small amount of noise; for LaMP, setting M = 1.7K suffices, while FPC requires M = 4K. We then studied the degradation of the recovery quality as a function of the noise level for both algorithms. For reference, a value of σ = 20 corresponds to a measurement-to-noise ratio of just 6dB. The results in Figure 5(a) demonstrate that LaMP is stable for a wide range of measurement noise levels. Indeed, the rate of increase of the LaMP recovery error as a function of the noise variance σ (a measure of the stability to noise) is comparable to that of FPC, while using far fewer measurements.\nExperiment 3: Performance on real background subtracted images. We test the recovery algorithms over a set of background subtraction images. The images were obtained from a test video sequence, one image frame of which is shown in Figure 1, by choosing at random two frames from the video and subtracting them in a pixel-wise fashion. The large-valued pixels in the resulting images are spatially clustered and thus are well-modeled by the MRF enforced by LaMP. We created 100 different test images; for each image, we define the sparsity K as the number of coefficients that contain 97% of the image energy. We then performed recovery of the image using the LaMP, CoSaMP, and FPC algorithms under a varying number of measurements M, from 0.5K to 5K. An example recovery is shown in Figure 6.\n\n1We use the GCOptimization package [14–16] to solve the support recovery problem in Step 3 of Algorithm 1 in our implementation of LaMP.\n\nFigure 4: Top: LaMP recovery of the Shepp-Logan phantom (N = 100 × 100, K = 1740, SNR = 10dB) from M = 2K = 3480 noisy measurements; panels show the noise-free target and LaMP iterations #1–#5 (0.9s). Bottom: Recoveries from LaMP, CoSaMP (6.2s), and FPC (6.5s), including running times on the same computer.\n\nFigure 5: Performance of LaMP. (a) Maximum recovery error over 1000 noise iterations as a function of the input noise variance, for LaMP with M = 1.7K and FPC with M = 4K and M = 5K. LaMP has the same robustness to noise as the FPC algorithm. (b) Average normalized error magnitude as a function of M/K over a background subtraction dataset of 100 images. LaMP achieves the best performance at M ≈ 2.5K, while both FPC and CoSaMP require M > 5K to achieve the same performance.\n\nFor each test and algorithm, we measured the magnitude of the estimation error normalized by the magnitude of the original image. Figure 5(b) shows the means and standard deviations of the normalized error magnitudes of the three algorithms. LaMP's graphical model reduces the number of measurements necessary for acceptable recovery quality to M ≈ 2.5K, while the standard algorithms require M ≥ 5K measurements to achieve the same quality.\n\n6 Conclusions\nWe have presented an initial study of model-based CS signal recovery using an MRF model to capture the structure of the signal's sparse coefficients. 
As demonstrated in our numerical simulations, for signals conforming to our model, the resulting LaMP algorithm requires significantly fewer CS measurements, has lower computational complexity, and has numerical stability equivalent to the current state-of-the-art algorithms. We view this as an initial step toward harnessing the power of modern compression and data modeling methods for CS reconstruction.\n\nMuch work needs to be done, however. We are working to precisely quantify the reduction in the required number of measurements (our numerical experiments suggest that M = O(K) is sufficient for stable recovery) and computations. We also assert that probabilistic signal models hold the key to formulating inference problems in the compressive measurement domain, since in many signal processing applications signals are acquired merely for the purpose of making an inference such as a detection or classification decision.\n\nFigure 6: Example recoveries for background subtraction images (panels: target, LaMP, CoSaMP, FPC), using M = 3K for each image.\n\nAcknowledgements. We thank Wotao Yin for helpful discussions, and Aswin Sankaranarayanan for the data used in Experiment 3. This work was supported by grants NSF CCF-0431150 and CCF-0728867, DARPA/ONR N66001-08-1-2065, ONR N00014-07-1-0936 and N00014-08-1-1112, AFOSR FA9550-07-1-0301, ARO MURI W311NF-07-1-0185, and the TI Leadership Program.\nReferences\n[1] D. L. Donoho. Compressed sensing. IEEE Trans. Info. Theory, 52(4):1289–1306, Sept. 2006.\n[2] E. J. Candès. Compressive sampling. In Proc. International Congress of Mathematicians, volume 3, pages 1433–1452, Madrid, Spain, 2006.\n[3] S. L. Lauritzen. Graphical Models. Oxford University Press, 1996.\n[4] D. Needell and J. Tropp. CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. Applied and Computational Harmonic Analysis, June 2008. To appear.\n[5] C. La and M. N. Do. 
Tree-based orthogonal matching pursuit algorithm for signal reconstruction. In IEEE Int. Conf. Image Processing (ICIP), pages 1277–1280, Atlanta, GA, Oct. 2006.\n[6] M. F. Duarte, M. B. Wakin, and R. G. Baraniuk. Wavelet-domain compressive signal reconstruction using a hidden Markov tree model. In ICASSP, pages 5137–5140, Las Vegas, NV, April 2008.\n[7] V. Cevher, A. Sankaranarayanan, M. F. Duarte, D. Reddy, R. G. Baraniuk, and R. Chellappa. Compressive sensing for background subtraction. In ECCV, Marseille, France, Oct. 2008.\n[8] R. G. Baraniuk, M. Davenport, R. A. DeVore, and M. B. Wakin. A simple proof of the restricted isometry property for random matrices. 2006. To appear in Const. Approx.\n[9] T. Blumensath and M. E. Davies. Sampling theorems for signals from the union of linear subspaces. 2007. Preprint.\n[10] B. M. McCoy and T. T. Wu. The Two-Dimensional Ising Model. Harvard Univ. Press, 1973.\n[11] M. J. Wainwright, P. Ravikumar, and J. D. Lafferty. High-dimensional graphical model selection using ℓ1-regularized logistic regression. In Proc. of Advances in NIPS, 2006.\n[12] D. P. Wipf and B. D. Rao. Sparse Bayesian learning for basis selection. IEEE Trans. Sig. Proc., 52(8):2153–2164, August 2004.\n[13] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers, 1988.\n[14] V. Kolmogorov and R. Zabih. What energy functions can be minimized via graph cuts? IEEE Trans. on Pattern Anal. and Mach. Int., 26(2):147–159, 2004.\n[15] Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy minimization via graph cuts. IEEE Trans. on Pattern Anal. and Mach. Int., 23(11):1222–1239, Nov. 2001.\n[16] Y. Boykov and V. Kolmogorov. An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE Trans. on Pattern Anal. and Mach. Int., 26(9):1124–1137, Sept. 2004.\n[17] E. T. Hale, W. Yin, and Y. Zhang. A fixed-point continuation method for ℓ1-regularized minimization with applications to compressed sensing. Technical Report TR07-07, Rice University, CAAM Dept., 2007.\n", "award": [], "sourceid": 981, "authors": [{"given_name": "Volkan", "family_name": "Cevher", "institution": null}, {"given_name": "Marco", "family_name": "Duarte", "institution": null}, {"given_name": "Chinmay", "family_name": "Hegde", "institution": null}, {"given_name": "Richard", "family_name": "Baraniuk", "institution": null}]}