{"title": "PEWA: Patch-based Exponentially Weighted Aggregation for image denoising", "book": "Advances in Neural Information Processing Systems", "page_first": 2150, "page_last": 2158, "abstract": "Patch-based methods have been widely used for noise reduction in recent years. In this paper, we propose a general statistical aggregation method which combines image patches denoised with several commonly-used algorithms. We show that weakly denoised versions of the input image, obtained with standard methods, can serve to compute an efficient patch-based aggregated estimator. We evaluate Stein's Unbiased Risk Estimator (SURE) of each denoised candidate patch and use this information to compute the exponentially weighted aggregation (EWA) estimator. The resulting approach (PEWA) is based on MCMC sampling and has a sound statistical foundation while producing denoising results that are comparable to the current state-of-the-art. We demonstrate the performance of the denoising algorithm on real images and compare the results to several competitive methods.", "full_text": "PEWA: Patch-based Exponentially Weighted Aggregation for image denoising\n\nCharles Kervrann\nInria Rennes - Bretagne Atlantique, Serpico Project-Team\nCampus Universitaire de Beaulieu, 35042 Rennes Cedex, France\ncharles.kervrann@inria.fr\n\nAbstract\n\nPatch-based methods have been widely used for noise reduction in recent years. In this paper, we propose a general statistical aggregation method which combines image patches denoised with several commonly-used algorithms. We show that weakly denoised versions of the input image, obtained with standard methods, can serve to compute an efficient patch-based aggregated estimator. In our approach, we evaluate Stein\u2019s Unbiased Risk Estimator (SURE) of each denoised candidate image patch and use this information to compute the exponentially weighted aggregation (EWA) estimator. The aggregation method is flexible enough to combine any standard denoising algorithm and has an interpretation in terms of Gibbs distributions. 
The denoising algorithm (PEWA) is based on MCMC sampling and is able to produce results that are comparable to the current state-of-the-art.\n\n1 Introduction\n\nSeveral methods have been proposed to solve the image denoising problem, including anisotropic diffusion [15], frequency-based methods [26], Bayesian and Markov Random Field methods [20], locally adaptive kernel-based methods [17] and sparse representations [10]. The objective is to estimate a clean image generally assumed to be corrupted with additive white Gaussian (AWG) noise. In recent years, state-of-the-art results have been considerably improved and the theoretical limits of denoising algorithms are currently discussed in the literature [4, 14]. The most competitive methods are mostly patch-based, such as BM3D [6], LSSC [16], EPLL [28] and NL-Bayes [12], inspired by the N(on)L(ocal)-means [2]. In the NL-means method, each patch is replaced by a weighted mean of the most similar patches found in the noisy input image. BM3D combines clustering of noisy patches, a DCT-based transform and a shrinkage operation to achieve the current state-of-the-art results [6]. PLOW [5], S-PLE [24] and NL-Bayes [12], falling in the same category of so-called internal methods, are able to produce very comparable results. Unlike BM3D, the covariance matrices of clustered noisy patches are empirically estimated to compute a Maximum A Posteriori (MAP) or a Minimum Mean Squared Error (MMSE) estimate. The aforementioned algorithms need two iterations [6, 12, 18] and their performances are surprisingly close to the state-of-the-art on average, while the motivations and modeling frameworks are quite different. In this paper, the proposed Patch-based Exponentially Weighted Aggregation (PEWA) algorithm, requiring no patch clustering, also achieves state-of-the-art results.\nA second category of patch-based external methods (e.g. FoE [20], EPLL [28], MLP [3]) has also been investigated. 
The principle is to approximate the noisy patches using a set of patches from an external learned dictionary. The statistics of a noise-free training set of image patches serve as priors for denoising. EPLL computes a prior from a mixture of Gaussians trained on a database of clean image patches [28]; denoising is then performed by maximizing the so-called Expected Patch Log Likelihood (EPLL) criterion using an optimization algorithm. In this line of work, a multi-layer perceptron (MLP) procedure exploiting a training set of noisy and noise-free patches was able to achieve state-of-the-art performance [3]. Nevertheless, the training procedure is dedicated to a fixed noise level and the denoising method is not flexible enough, especially for real applications where the signal-to-noise ratio is not known.\nRecently, the similarity of patch pairs extracted from the input noisy image and from a clean patch dataset has been studied in [27]. The authors observed that more repetitions are found in the same noisy image than in a clean image patch database of natural images; also, it is not necessary to examine patches far from the current patch to find good matches. While external methods are attractive, computation is not always feasible since a very large collection of clean patches is required to denoise all patches in the input image. Other authors have previously proposed to learn a dictionary on the noisy image [10] or to combine internal and external information (LSSC) [16]. In this paper, we focus on internal methods since they are more flexible for real applications than external methods, less computationally demanding, and remain the most competitive.\nOur approach consists in estimating an image patch from \u201cweakly\u201d denoised image patches in the input image. 
We consider the general problem of combining multiple basic estimators to achieve an estimation accuracy not much worse than that of the \u201cbest\u201d single estimator in some sense. This problem is important for practical applications because single estimators often do not perform as well as their combinations. The most important and widely studied aggregation method that achieves the optimal average risk is the Exponentially Weighted Aggregation (EWA) procedure [13, 7, 19]. Salmon & Le Pennec have already interpreted the NL-means as a special case of the EWA procedure, but the results of the extended version described in [21] were similar to [2].\nOur estimator combination is achieved through a two-step procedure, where multiple estimators are first computed and then combined in a second, separate computing step. We shall see that the proposed method can be thought of as a boosting procedure [22], since the performance of the pre-computed estimators involved in the first step is rather poor, both visually and in terms of peak signal-to-noise ratio (PSNR). Our contributions are the following:\n\n1. We show that \u201cweakly\u201d denoised versions of the input noisy image can be combined to get a boosted estimator.\n2. A spatial Bayesian prior and a Gibbs energy enable the selection of good candidate patches.\n3. We propose a dedicated Markov Chain Monte Carlo (MCMC) sampling procedure to compute the PEWA estimator efficiently.\n\nThe experimental results are comparable to BM3D [6] and the method can be implemented efficiently since all patches can be processed independently.\n\n2 Patch-based image representation and SURE estimation\n\nFormally, we represent an n-dimensional image patch at location x \u2208 X \u2282 R^2 as a vector f(x) \u2208 R^n. We define the observation patch v(x) \u2208 R^n as v(x) = f(x) + \u03b5(x), where \u03b5(x) \u223c N(0, \u03c3^2 I_{n\u00d7n}) represents the errors. 
We are interested in an estimator \u02c6f(x) of f(x), assumed to be independent of v(x), that achieves a small L2 risk in the mean-square-error sense, E[R(\u02c6f(x))] = E[||f(x) \u2212 \u02c6f(x)||^2_n] (E denotes the mathematical expectation). We consider Stein\u2019s Unbiased Risk Estimator (SURE)\n\nR(\u02c6f(x)) = ||v(x) \u2212 \u02c6f(x)||^2_n \u2212 n\u03c3^2.\n\nSURE has already been investigated for image denoising using NL-means [23, 9, 22, 24] and for image deconvolution [25].\n\n3 Aggregation by exponential weights\n\nAssume a family {f\u03bb(x), \u03bb \u2208 \u039b} of functions such that the mapping \u03bb \u2192 f\u03bb(x) is measurable and \u039b = {1, \u00b7\u00b7\u00b7, M}. The functions f\u03bb(x) can be viewed as pre-computed estimators of f(x) or \u201cweak\u201d denoisers, independent of the observations v(x) and considered as frozen in the following. The set of M estimators is assumed to be very large, i.e. composed of several hundreds of thousands of candidates. 
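As a quick illustration, the risk estimate above reduces to a one-line computation. The helper below is a minimal sketch (function and variable names are ours, not from the paper's code), valid under the stated assumption that the candidate patch is frozen, i.e. independent of v(x):

```python
import numpy as np

def sure_risk(v_patch, f_hat_patch, sigma):
    """SURE of the equation above: R(f^) = ||v - f^||^2 - n * sigma^2, where n
    is the number of pixels in the patch. This is unbiased for E||f - f^||^2
    only when f^ is independent of the observation v (the 'frozen' assumption)."""
    v = np.asarray(v_patch, dtype=float).ravel()
    f_hat = np.asarray(f_hat_patch, dtype=float).ravel()
    n = v.size
    return float(np.sum((v - f_hat) ** 2) - n * sigma ** 2)
```

Note that plugging f^ = v would violate the independence assumption and yield the degenerate value \u2212n\u03c3^2; in PEWA the candidates come from the weakly denoised images instead.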
In this paper, we consider aggregates that are weighted averages of the functions in the set {f\u03bb(x), \u03bb \u2208 \u039b} with some data-dependent weights:\n\n\u02c6f(x) = \u2211_{\u03bb=1}^{M} w\u03bb(x) f\u03bb(x)   such that   w\u03bb(x) \u2265 0 and \u2211_{\u03bb=1}^{M} w\u03bb(x) = 1.   (1)\n\nAs suggested in [19], we can associate two probability measures w(x) = {w1(x), \u00b7\u00b7\u00b7, wM(x)} and \u03c0(x) = {\u03c01(x), \u00b7\u00b7\u00b7, \u03c0M(x)} on {1, \u00b7\u00b7\u00b7, M} and define the Kullback-Leibler divergence as:\n\nDKL(w(x), \u03c0(x)) = \u2211_{\u03bb=1}^{M} w\u03bb(x) log(w\u03bb(x)/\u03c0\u03bb(x)).   (2)\n\nThe exponential weights are obtained as the solution of the following optimization problem:\n\n\u02c6w(x) = arg min_{w(x) \u2208 R^M} { \u2211_{\u03bb=1}^{M} w\u03bb(x) \u03c6(R(f\u03bb(x))) + \u03b2 DKL(w(x), \u03c0(x)) }   subject to (1),   (3)\n\nwhere \u03b2 > 0 and \u03c6 : R \u2192 R (e.g. \u03c6(z) = z, see [19]). From the Karush-Kuhn-Tucker conditions, the unique closed-form solution is\n\n\u02c6w\u03bb(x) = exp(\u2212\u03c6(R(f\u03bb(x)))/\u03b2) \u03c0\u03bb(x) / \u2211_{\u03bb'=1}^{M} exp(\u2212\u03c6(R(f\u03bb'(x)))/\u03b2) \u03c0\u03bb'(x),   (4)\n\nand \u03b2 can be interpreted as a \u201ctemperature\u201d parameter. This estimator satisfies oracle inequalities of the following form [7]:\n\nE[R(\u02c6f(x))] \u2264 min_{w(x) \u2208 R^M} { \u2211_{\u03bb=1}^{M} w\u03bb(x) \u03c6(R(f\u03bb(x))) + \u03b2 DKL(w(x), \u03c0(x)) }.   (5)\n\nThe role of the distribution \u03c0 is to put a prior weight on the functions in the set. When there is no preference, the uniform prior is a common choice, but other choices are possible (see [7]).\nIn the proposed approach, we define the set of estimators as the set of patches taken in denoised versions of the input image v. 
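A numerically stable way to evaluate the closed-form weights of (4) is the usual log-domain softmax. The sketch below (our naming, uniform prior by default) aggregates a small family of candidate estimators with \u03c6(z) = z:

```python
import numpy as np

def ewa_weights(risks, beta, prior=None):
    """Exponential weights of Eq. (4): w_lam proportional to
    exp(-phi(R_lam)/beta) * pi_lam, with phi(z) = z.
    Computed in the log domain for numerical stability."""
    risks = np.asarray(risks, dtype=float)
    if prior is None:
        prior = np.full(risks.shape, 1.0 / risks.size)  # uniform prior pi
    log_w = -risks / beta + np.log(prior)
    log_w -= log_w.max()            # shift before exponentiating
    w = np.exp(log_w)
    return w / w.sum()

def ewa_aggregate(candidates, risks, beta, prior=None):
    """Weighted average of Eq. (1) using the exponential weights."""
    w = ewa_weights(risks, beta, prior)
    return np.tensordot(w, np.asarray(candidates, dtype=float), axes=1)
```

With equal risks and a uniform prior the weights are uniform; lowering one risk shifts mass toward that candidate, with \u03b2 controlling how sharply.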
The next question is how to efficiently compute the sum in (1), since the collection can be very large. For a typical image of N = 512 \u00d7 512 pixels, we could potentially consider M = L \u00d7 N pre-computed estimators if we apply L denoisers to the input image v.\n\n4 PEWA: Patch-based EWA estimator\n\nSuppose that we are given a large collection of M competing estimators. These basic estimators can be chosen arbitrarily among the researcher\u2019s favorite denoising algorithms: Gaussian, Bilateral, Wiener, Discrete Cosine Transform or other transform-based filterings. Let us emphasize that the number of estimators M is not expected to grow and is typically very large (M is on the order of several hundreds of thousands). In addition, the essential idea is that these estimators need only slightly improve the PSNR values, by a few dBs.\nLet u\u2113, \u2113 = 1, \u00b7\u00b7\u00b7, L, be denoised versions of v. A given pre-computed patch estimator f\u03bb(x) is then an n-dimensional patch taken in a denoised image u\u2113 at any location y \u2208 X, in the spirit of the NL-means algorithm, which considers only the noisy input patches for denoising. The proposed estimator is more general since a set of denoised patches at a given location is used. Our estimator is then of the following form if we choose \u03c6(z) = |z|:\n\n\u02c6f(x) = (1/Z(x)) \u2211_{\u2113=1}^{L} \u2211_{y\u2208X} e^{\u2212|R(u\u2113(y))|/\u03b2} \u03c0\u2113(y) u\u2113(y),   Z(x) = \u2211_{\u2113'=1}^{L} \u2211_{y'\u2208X} e^{\u2212|R(u\u2113'(y'))|/\u03b2} \u03c0\u2113'(y'),   (6)\n\nwhere Z(x) is a normalization constant and u\u2113(y) denotes the patch of u\u2113 at location y. Instead of considering a uniform prior over the set of denoised patches taken in the whole image, it is appropriate to encourage patches located in the spatial neighborhood of x [27]. 
This can be achieved by introducing a spatial Gaussian prior G\u03c4(z) \u221d e^{\u2212z^2/(2\u03c4^2)} in the definition as\n\n\u02c6fPEWA(x) = (1/Z(x)) \u2211_{\u2113=1}^{L} \u2211_{y\u2208X} e^{\u2212|R(u\u2113(y))|/\u03b2} G\u03c4(x \u2212 y) u\u2113(y).   (7)\n\nThe Gaussian prior has a significant impact on the performance of the EWA estimator. Moreover, the practical performance of the estimator strongly relies on an appropriate choice of \u03b2. This important question has been thoroughly discussed in [13], where the choice \u03b2 = 4\u03c3^2 is motivated. Finally, our patch-based EWA (PEWA) estimator can be written in terms of energies and Gibbs distributions as:\n\n\u02c6fPEWA(x) = (1/Z(x)) \u2211_{\u2113=1}^{L} \u2211_{y\u2208X} e^{\u2212E(u\u2113(y))} u\u2113(y),   Z(x) = \u2211_{\u2113'=1}^{L} \u2211_{y'\u2208X} e^{\u2212E(u\u2113'(y'))},\n\nE(u\u2113(y)) = | ||v(x) \u2212 u\u2113(y)||^2_n \u2212 n\u03c3^2 | / (4\u03c3^2) + ||x \u2212 y||^2_2 / (2\u03c4^2).   (8)\n\nThe sums in (8) cannot be computed exhaustively, especially when we consider a large collection of estimators. In that sense, the approach differs from the NL-means methods [2, 11, 23, 9], which exploit patches generally taken in a neighborhood of fixed size. Instead, we propose a Monte-Carlo sampling method to approximately compute such an EWA when the number of aggregated estimators is large [1, 19].\n\n4.1 Monte-Carlo simulations and computational issues\n\nBecause of the high dimensionality of the problem, we need efficient computational algorithms, and we therefore suggest a stochastic approach to compute the PEWA estimator. Let us consider a random process (F_n(x))_{n\u22650} initialized with the noisy patch F_0(x) = v(x). 
The proposed Monte-Carlo procedure to compute the estimator is based on the following Metropolis-Hastings algorithm. Draw a candidate patch by a two-stage procedure:\n\n\u2022 draw uniformly a value \u2113 in the set {1, 2, \u00b7\u00b7\u00b7, L};\n\u2022 draw a pixel y = yc + \u03b3, y \u2208 X, with \u03b3 \u223c N(0, \u03c4^2 I_{2\u00d72}), where yc is the position of the current patch (at initialization, yc = x).\n\nThen define F_{n+1}(x) as:\n\nF_{n+1}(x) = u\u2113(y) if \u03b1 \u2264 e^{\u2212\u2206E(u\u2113(y), F_n(x))}, and F_{n+1}(x) = F_n(x) otherwise,   (9)\n\nwhere \u03b1 \u223c U[0, 1] is a random variable and \u2206E(u\u2113(y), F_n(x)) := E(u\u2113(y)) \u2212 E(F_n(x)). If we assume the Markov chain is ergodic, homogeneous, irreducible, reversible and stationary, then for any F_0(x) we have, almost surely,\n\nlim_{T\u2192+\u221e} (1/(T \u2212 T_b)) \u2211_{n=T_b}^{T} F_n(x) \u2248 \u02c6fPEWA(x),   (10)\n\nwhere T is the maximum number of samples of the Monte-Carlo procedure. It is also recommended to introduce a burn-in phase to get a more satisfying estimator; hence the first T_b samples are discarded from the average. The Metropolis-Hastings rule ensures reversibility and then stationarity of the Markov chain. The chain is irreducible since it is possible to reach any patch in the set of considered patches. Convergence is ensured as T tends to infinity. In practice, T is assumed to be high enough to get a reasonable approximation of \u02c6fPEWA(x). In our implementation, we set T \u2248 1000 and T_b = 250 to produce fast and satisfying results. To improve convergence speed, we can use several chains instead of only one [21].\nIn the Metropolis-Hastings dynamics, some patches are more frequently selected than others at a given location. The number of occurrences of a particular candidate patch can then be evaluated. 
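The sampling procedure above can be sketched for a single patch location x as follows. This is a simplified illustration under our own conventions (top-left patch coordinates, plain clipping at image borders), not the author's implementation:

```python
import numpy as np

def get_patch(img, y, p):
    # p x p patch whose top-left corner is y = (row, col)
    return img[y[0]:y[0] + p, y[1]:y[1] + p]

def gibbs_energy(v_patch, cand_patch, x, y, sigma, tau):
    # Energy of Eq. (8): |SURE| data term + spatial Gaussian prior term
    n = v_patch.size
    data = abs(np.sum((v_patch - cand_patch) ** 2) - n * sigma ** 2) / (4.0 * sigma ** 2)
    spatial = ((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2) / (2.0 * tau ** 2)
    return data + spatial

def pewa_patch(v_img, denoised, x, p, sigma, tau, T=1000, Tb=250, seed=0):
    """Metropolis-Hastings estimate of f^_PEWA(x), Eqs. (9)-(10).
    'denoised' is the list of weakly denoised images u_1..u_L."""
    rng = np.random.default_rng(seed)
    H, W = v_img.shape
    v_patch = get_patch(v_img, x, p).astype(float)
    cur, cur_pos = v_patch.copy(), x                 # F_0(x) = v(x)
    cur_e = gibbs_energy(v_patch, cur, x, cur_pos, sigma, tau)
    acc, count = np.zeros_like(cur), 0
    for it in range(T):
        l = rng.integers(len(denoised))              # uniform draw of u_l
        y = (int(np.clip(round(cur_pos[0] + rng.normal(0, tau)), 0, H - p)),
             int(np.clip(round(cur_pos[1] + rng.normal(0, tau)), 0, W - p)))
        cand = get_patch(denoised[l], y, p).astype(float)
        cand_e = gibbs_energy(v_patch, cand, x, y, sigma, tau)
        if rng.random() <= np.exp(min(0.0, cur_e - cand_e)):  # accept w.p. e^{-dE}
            cur, cur_pos, cur_e = cand, y, cand_e
        if it >= Tb:                                 # discard the burn-in phase
            acc += cur
            count += 1
    return acc / count
```

The running average over the post-burn-in samples approximates the Gibbs-weighted aggregate of (8) without ever enumerating the full candidate set.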
In constant image areas, there is probably no preference for any one patch over another, while a low number of candidate patches is expected along image contours and discontinuities.\n\n4.2 Patch overlapping and iterations\n\nThe next step is to extend the PEWA procedure to every position of the entire image. To avoid block effects at the patch boundaries, we overlap the patches. As a result, for the pixels lying in the overlapping regions, we obtain multiple EWA estimates. These competing estimates must be fused or aggregated into a single final estimate. The final aggregation can be performed by a weighted average of the multiple EWA estimates, as suggested in [21, 5, 22]. The simplest method of aggregating such multiple estimates is to average them using equal weights. Such uniform averaging provided the best results in our experiments and amounts to fusing n independent Markov chains.\nThe proposed implementation proceeds in two identical iterations. At the first iteration, the estimation is performed using several denoised versions of the noisy image. At the second iteration, the first estimator is used as an additional denoised image in the procedure to locally improve the estimation, as in [6, 12]. The second iteration improves the PSNR values in the range of 0.2 to 0.5 dB, as demonstrated by the experiments presented in the next section. Note that the first iteration is able to produce very satisfying results for low and medium levels of noise. In practical imaging, we use the method described in [11] to estimate the noise variance \u03c3^2 for real-world noisy images.\n\n5 Experimental results\n\nWe evaluated the PEWA algorithm on 25 images showing natural, man-made, indoor and outdoor scenes (see Fig. 1). Each original image was corrupted with white Gaussian noise with zero mean and variance \u03c3^2. 
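The equal-weight fusion of overlapping patch estimates can be sketched as follows (our own helper, assuming top-left patch coordinates; not the paper's code):

```python
import numpy as np

def aggregate_overlapping(patches, positions, shape, p):
    """Fuse overlapping p x p patch estimates into one image by uniform
    averaging, i.e. the equal-weight aggregation described above.
    patches[i] is the patch estimate whose top-left corner is positions[i]."""
    acc = np.zeros(shape)
    cnt = np.zeros(shape)
    for patch, (r, c) in zip(patches, positions):
        acc[r:r + p, c:c + p] += patch
        cnt[r:r + p, c:c + p] += 1.0
    cnt[cnt == 0] = 1.0          # leave uncovered pixels at zero
    return acc / cnt
```

Each pixel simply receives the mean of all patch estimates that cover it, which is the uniform-weight fusion that worked best in the experiments.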
In our experiments, the best results are obtained with n = 7 \u00d7 7 patches and L = 4 images u\u2113 denoised with the DCT-based transform [26]; we consider three different DCT shrinkage thresholds, 1.25\u03c3, 1.5\u03c3 and 1.75\u03c3, which improve the PSNR by 1 to 6 dB at most, depending on \u03c3 and the image (see Figs. 2-3). The fourth image is the noisy input image itself. We evaluated the algorithm with a larger number L of denoised images and the quality drops by 0.1 dB to 0.3 dB, which is visually imperceptible. Increasing L also suggests considering more than 1000 samples, since the space of candidate patches is larger. The prior neighborhood size corresponds to a disk of radius \u03c4 = 7 pixels, but it can be smaller.\nThe performances of PEWA and other methods are quantified in terms of PSNR values for several noise levels (see Tables 1-3). Table 1 reports the results obtained with PEWA on each individual image for different values of the noise standard deviation. Table 2 compares the average PSNR values on these 25 images obtained by PEWA (after 1 and 2 iterations) and two state-of-the-art denoising methods [6, 12]. We used the implementations provided by the authors: BM3D (http://www.cs.tut.fi/~foi/GCF-BM3D/) and NL-Bayes (www.ipol.im). The best PSNR values are in bold and the results are quantitatively quite comparable, except for very high levels of noise. We compared PEWA to the baseline NL-means [2] and DCT [26] (using the implementations of www.ipol.im), since they form the core of PEWA. The PSNR values increase by 1.5 dB and 1.35 dB on average over NL-means and DCT, respectively. Finally, we compared the results to the recent S-PLE method, which uses SURE to guide the probabilistic patch-based filtering described in [24]. Figure 2 shows the denoising results on the noisy Valldemossa (\u03c3 = 15), Man (\u03c3 = 20) and Castle (\u03c3 = 25) images denoised with BM3D, NL-Bayes and PEWA. 
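A weak DCT denoiser of the kind used here can be sketched block-wise: an orthonormal 2-D DCT, hard thresholding of the AC coefficients at t\u00b7\u03c3, and the inverse transform. This is a simplified, numpy-only illustration in the spirit of [26]; the block-aggregation details of the actual method are omitted:

```python
import numpy as np

def dct_matrix(p):
    # Orthonormal DCT-II matrix: C[k, j] = s_k * cos(pi * (2j + 1) * k / (2p))
    k = np.arange(p)[:, None]
    j = np.arange(p)[None, :]
    C = np.sqrt(2.0 / p) * np.cos(np.pi * (2 * j + 1) * k / (2 * p))
    C[0, :] = np.sqrt(1.0 / p)
    return C

def dct_hard_threshold(block, sigma, t=1.5):
    """Shrink a square block by zeroing small DCT coefficients (|c| < t * sigma).
    The DC coefficient is kept so that flat areas pass through unchanged."""
    C = dct_matrix(block.shape[0])
    coef = C @ block @ C.T                  # separable 2-D DCT
    dc = coef[0, 0]
    coef = np.where(np.abs(coef) >= t * sigma, coef, 0.0)
    coef[0, 0] = dc
    return C.T @ coef @ C                   # inverse 2-D DCT
```

Varying t (e.g. 1.25, 1.5, 1.75) yields the family of weak denoisers; each removes some noise energy while leaving structure largely intact.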
The visual quality of the methods is comparable.\nTable 3 presents the denoising results with PEWA when the pre-computed estimators are obtained with a Wiener filtering (spatial domain\u00b9) and the DCT-based transform [26]. The results of PEWA with 5 \u00d7 5 or 7 \u00d7 7 patches are also given in Table 3, for one and two iterations. Note that NL-means can be considered as a special case of the proposed method in which the original noisy patches constitute the set of \u201cweak\u201d estimators. The MCMC-based procedure can then be considered as an alternative to the usual implementation of NL-means that accelerates the summation. Accordingly, in Table 3 we added a fair comparison (7 \u00d7 7 patches) with the implementation of the NL-means algorithm (IPOL (ipol.im)), which restricts the search of similar patches to a neighborhood of 21 \u00d7 21 pixels. In these experiments, \u201cPEWA Basic\u201d (1 iteration) produced better results, especially for \u03c3 \u2265 10. Finally, we compared these results with the most popular and competitive methods on the same images; the PSNR values are taken from the cited publications. LSSC and BM3D are the most performant, but PEWA is able to produce better results on several piecewise smooth images, while BM3D is more appropriate for textured images.\nIn terms of computational complexity, denoising a 512 \u00d7 512 grayscale image with an unoptimized C++ implementation of our method takes about 2 minutes (Intel Core i7 64-bit CPU, 2.4 GHz).\n\n\u00b9 u\u2113(x) = mean(v(x)) + max(0, (var(v(x)) \u2212 a\u2113\u03c3^2)/var(v(x))) \u00d7 (v(x) \u2212 mean(v(x))), where \u2113 = {1, 2, 3} and a1 = 0.15, a2 = 0.20, a3 = 0.25.\n\nFigure 1: Set of 25 tested images (cameraman, peppers, house, Lena, barbara, boat, man, couple, hill, maya, asia, aircraft, panther, alley, computer, dice, flowers, girl, traffic, trees, valldemossa, castle, young man, tiger, man on wall picture). Top left: images from the BM3D website (cs.tut.fi/~foi/GCF-BM3D/); bottom left: images from IPOL (ipol.im); right: images from the Berkeley segmentation database (eecs.berkeley.edu/Research/Projects/CS/vision/bsds/).\n\n
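The spatial-domain Wiener-type weak denoiser of footnote 1 shrinks a patch toward its mean according to the local variance. The helper below is our transcription of that formula (variable names are ours):

```python
import numpy as np

def wiener_weak(v_patch, sigma, a):
    """Spatial-domain Wiener-type shrinkage toward the patch mean, as in the
    footnote: u(x) = mean + max(0, (var - a*sigma^2)/var) * (v - mean).
    a in {0.15, 0.20, 0.25} gives the three weak denoisers."""
    v = np.asarray(v_patch, dtype=float)
    m, var = v.mean(), v.var()
    if var == 0.0:
        return np.full_like(v, m)
    gain = max(0.0, (var - a * sigma ** 2) / var)
    return m + gain * (v - m)
```

When the empirical variance is dominated by noise (var \u2264 a\u03c3^2) the patch is flattened to its mean; when the signal dominates, the patch passes through almost unchanged.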
Recently, PEWA has been implemented in parallel, since every patch can be processed independently, and the computation time drops to a few seconds.\n\n6 Conclusion\n\nWe presented a new general two-step denoising algorithm, based on non-local image statistics and patch repetition, that combines ideas from the popular NL-means [2] and BM3D [6] algorithms and theoretical results from the statistical literature on Exponentially Weighted Aggregation [7, 21]. The first step of PEWA involves the computation of denoised images obtained with a collection of multiple denoisers (Wiener, DCT, ...) applied to the input image. In the second step, the set of denoised image patches is selectively exploited to compute an aggregated estimator. We showed that the estimator can be computed in reasonable time using a Markov Chain Monte Carlo (MCMC) sampling procedure. If we consider the DCT-based transform [26] in the first step, the results are comparable on average to the state-of-the-art. The PEWA method generalizes the NL-means algorithm in some sense, but also shares common features with BM3D (e.g. DCT transform, two-stage collaborative filtering), while requiring no clustering of patches, contrary to NL-Bayes and BM3D. For future work, wavelet-based transforms, multiple image patch sizes, robust statistics and sparse priors will be investigated to improve the results of the flexible PEWA method.\n\nFigure 2: Comparison of algorithms on the Valldemossa image corrupted with white Gaussian noise (\u03c3 = 15): noisy (PSNR = 24.61), PEWA (PSNR = 29.25), BM3D [6] (PSNR = 29.19), NL-Bayes [12] (PSNR = 29.22). 
The PSNR values of the three images denoised with the DCT-based transform [26] and combined with PEWA are 27.78, 27.04 and 26.26.\n\nFigure 3: Comparison of algorithms. First row: Castle image corrupted with white Gaussian noise (\u03c3 = 25): noisy (PSNR = 20.18), PEWA (PSNR = 29.49), BM3D [6] (PSNR = 29.36), NL-Bayes [12] (PSNR = 29.48); the PSNR values of the three images denoised with the DCT-based transform [26] and combined with PEWA are 25.77, 24.26 and 22.85. Second row: Man image corrupted with white Gaussian noise (\u03c3 = 20): noisy (PSNR = 22.11), PEWA (PSNR = 30.50), BM3D [6] (PSNR = 30.59), NL-Bayes [12] (PSNR = 30.60); the PSNR values of the three images denoised with the DCT-based transform [26] and combined with PEWA are 27.42, 26.00 and 24.67.\n\n
Image | \u03c3=5 | \u03c3=10 | \u03c3=15 | \u03c3=20 | \u03c3=25 | \u03c3=50 | \u03c3=100\n
Cameraman | 38.20 | 34.23 | 31.98 | 30.60 | 29.48 | 26.25 | 22.81\n
Peppers | 38.00 | 34.68 | 32.75 | 31.40 | 30.30 | 26.69 | 22.84\n
House | 39.56 | 36.40 | 34.86 | 33.72 | 32.77 | 29.29 | 25.35\n
Lena | 38.57 | 35.78 | 34.12 | 32.90 | 31.89 | 28.83 | 25.65\n
Barbara | 38.09 | 34.73 | 32.86 | 31.43 | 30.28 | 26.58 | 22.95\n
Boat | 37.12 | 33.75 | 31.94 | 30.64 | 29.65 | 26.64 | 23.63\n
Man | 37.68 | 33.93 | 31.93 | 30.50 | 29.50 | 26.67 | 24.15\n
Couple | 37.35 | 33.91 | 31.98 | 30.57 | 29.48 | 26.02 | 23.27\n
Hill | 37.01 | 33.52 | 31.69 | 30.50 | 29.56 | 26.92 | 24.49\n
Alley | 36.29 | 32.20 | 29.98 | 28.54 | 27.46 | 24.13 | 21.37\n
Computer | 39.04 | 35.13 | 32.81 | 31.23 | 30.01 | 26.38 | 23.27\n
Dice | 46.82 | 43.87 | 42.05 | 40.58 | 39.36 | 35.33 | 30.82\n
Flowers | 43.48 | 39.67 | 37.47 | 35.90 | 34.55 | 30.81 | 27.53\n
Girl | 43.95 | 41.22 | 39.52 | 38.27 | 37.33 | 34.14 | 30.50\n
Traffic | 37.85 | 33.54 | 31.13 | 29.58 | 28.48 | 25.50 | 22.90\n
Trees | 34.88 | 29.93 | 27.49 | 25.86 | 24.69 | 21.78 | 20.03\n
Valldemossa | 36.65 | 31.79 | 29.25 | 27.59 | 26.37 | 23.18 | 20.71\n
Man Picture | 37.59 | 34.62 | 33.00 | 31.75 | 30.72 | 27.68 | 24.99\n
Aircraft | 38.67 | 34.46 | 32.25 | 30.73 | 29.60 | 26.63 | 24.32\n
Asia | 38.06 | 34.13 | 32.02 | 30.56 | 29.49 | 26.15 | 23.09\n
Castle | 37.78 | 33.58 | 31.27 | 29.73 | 28.44 | 24.65 | 21.50\n
Maya | 34.72 | 29.64 | 27.17 | 25.42 | 24.28 | 22.85 | 18.17\n
Panther | 38.53 | 33.91 | 31.56 | 30.02 | 28.83 | 25.59 | 22.75\n
Tiger | 36.92 | 32.85 | 30.63 | 29.13 | 27.99 | 24.63 | 21.90\n
Young man | 40.79 | 37.36 | 35.58 | 34.30 | 33.25 | 29.59 | 25.20\n
Average | 38.54 | 34.75 | 32.67 | 31.26 | 30.15 | 26.95 | 23.76\n\n
Table 1: Denoising results (PSNR, dB) on the 25 tested images for several values of \u03c3. The PSNR values are averaged over 3 experiments corresponding to 3 different noise realizations.\n\n
Method | \u03c3=5 | \u03c3=10 | \u03c3=15 | \u03c3=20 | \u03c3=25 | \u03c3=50 | \u03c3=100\n
PEWA 1 | 38.27 | 34.39 | 32.26 | 30.76 | 29.62 | 26.00 | 22.35\n
PEWA 2 | 38.54 | 34.75 | 32.67 | 31.26 | 30.15 | 26.95 | 23.76\n
BM3D [6] | 38.64 | 34.78 | 32.68 | 31.25 | 30.19 | 26.97 | 24.08\n
NL-Bayes [12] | 38.60 | 34.75 | 32.48 | 31.22 | 30.12 | 26.90 | 23.65\n
S-PLE [24] | 38.17 | 34.38 | 32.35 | 30.67 | 29.77 | 26.46 | 23.21\n
NL-means [2] | 37.44 | 33.35 | 31.00 | 30.16 | 28.96 | 25.53 | 22.29\n
DCT [26] | 37.81 | 33.57 | 31.87 | 29.95 | 28.97 | 25.91 | 23.08\n\n
Table 2: Average denoising results (PSNR, dB) over the 25 tested images for several values of \u03c3. The experiments with NL-Bayes [12], S-PLE [24], NL-means [2] and DCT [26] have been performed using the implementations of IPOL (ipol.im). 
The best PSNR values are in bold.\n\n
Peppers (256 \u00d7 256):\n
Method | \u03c3=5 | \u03c3=15 | \u03c3=25 | \u03c3=50\n
PEWA 1 (W) (5\u00d75) | 36.69 | 30.58 | 27.50 | 22.85\n
PEWA 2 (W) (5\u00d75) | 37.45 | 32.20 | 29.72 | 26.09\n
PEWA 1 (W) (7\u00d77) | 36.72 | 30.60 | 27.60 | 22.82\n
PEWA 2 (W) (7\u00d77) | 37.34 | 32.34 | 30.11 | 26.53\n
PEWA 1 (D) (5\u00d75) | 37.70 | 32.45 | 29.83 | 26.01\n
PEWA 2 (D) (5\u00d75) | 37.95 | 32.80 | 30.20 | 26.66\n
PEWA 1 (D) (7\u00d77) | 37.71 | 32.43 | 29.87 | 26.00\n
PEWA 2 (D) (7\u00d77) | 38.00 | 32.75 | 30.30 | 26.69\n
PEWA Basic (7\u00d77) | 36.88 | 31.34 | 29.47 | 26.02\n
NL-means [2] (7\u00d77) | 36.77 | 30.93 | 28.76 | 24.24\n
BM3D [6] | 38.12 | 32.70 | 30.16 | 26.68\n
NL-Bayes [12] | 38.09 | 32.26 | 29.79 | 26.10\n
ND-SAFIR [11] | 37.34 | 32.13 | 29.73 | 25.29\n
K-SVD [10] | 37.80 | 32.23 | 29.81 | 26.24\n
LSSC [16] | 38.18 | 32.82 | 30.21 | 26.62\n
PLOW [5] | 37.69 | 31.82 | 29.53 | 26.32\n
SOP [18] | 37.63 | 32.40 | 30.01 | 26.75\n\n
House (256 \u00d7 256):\n
Method | \u03c3=5 | \u03c3=15 | \u03c3=25 | \u03c3=50\n
PEWA 1 (W) (5\u00d75) | 37.89 | 31.88 | 28.55 | 23.49\n
PEWA 2 (W) (5\u00d75) | 38.98 | 34.27 | 32.13 | 28.35\n
PEWA 1 (W) (7\u00d77) | 37.90 | 31.90 | 28.59 | 23.52\n
PEWA 2 (W) (7\u00d77) | 39.00 | 34.57 | 32.51 | 29.04\n
PEWA 1 (D) (5\u00d75) | 39.28 | 34.23 | 31.79 | 27.72\n
PEWA 2 (D) (5\u00d75) | 39.46 | 34.74 | 31.67 | 29.15\n
PEWA 1 (D) (7\u00d77) | 39.27 | 34.26 | 31.79 | 27.71\n
PEWA 2 (D) (7\u00d77) | 39.56 | 34.83 | 32.77 | 29.29\n
PEWA Basic (7\u00d77) | 37.88 | 34.13 | 32.14 | 28.25\n
NL-means [2] (7\u00d77) | 37.75 | 32.36 | 31.11 | 27.54\n
BM3D [6] | 39.83 | 34.94 | 32.86 | 29.69\n
NL-Bayes [12] | 39.39 | 33.77 | 31.36 | 27.62\n
ND-SAFIR [11] | 37.62 | 34.08 | 32.22 | 28.67\n
K-SVD [10] | 39.33 | 34.19 | 31.97 | 28.01\n
LSSC [16] | 39.93 | 35.35 | 33.15 | 30.04\n
PLOW [5] | 39.52 | 34.72 | 32.70 | 29.08\n
SOP [18] | 38.76 | 34.35 | 32.54 | 29.64\n\n
Lena (512 \u00d7 512):\n
Method | \u03c3=5 | \u03c3=15 | \u03c3=25 | \u03c3=50\n
PEWA 1 (W) (5\u00d75) | 37.27 | 31.43 | 28.30 | 23.45\n
PEWA 2 (W) (5\u00d75) | 38.05 | 33.40 | 31.11 | 27.80\n
PEWA 1 (W) (7\u00d77) | 37.26 | 31.45 | 28.33 | 23.45\n
PEWA 2 (W) (7\u00d77) | 38.00 | 33.65 | 31.56 | 28.40\n
PEWA 1 (D) (5\u00d75) | 38.46 | 33.72 | 31.33 | 27.59\n
PEWA 2 (D) (5\u00d75) | 38.57 | 33.96 | 31.81 | 28.43\n
PEWA 1 (D) (7\u00d77) | 38.45 | 33.72 | 31.25 | 27.62\n
PEWA 2 (D) (7\u00d77) | 38.58 | 34.12 | 31.89 | 28.83\n
PEWA Basic (7\u00d77) | 37.39 | 33.26 | 31.20 | 27.92\n
NL-means [2] (7\u00d77) | 36.65 | 32.00 | 30.45 | 27.32\n
BM3D [6] | 38.72 | 34.27 | 32.08 | 29.05\n
NL-Bayes [12] | 38.75 | 33.51 | 31.16 | 27.62\n
ND-SAFIR [11] | 37.91 | 33.70 | 31.73 | 28.38\n
K-SVD [10] | 38.63 | 33.76 | 31.35 | 27.85\n
LSSC [16] | 38.69 | 34.15 | 31.87 | 28.87\n
PLOW [5] | 38.66 | 33.90 | 31.92 | 28.32\n
SOP [18] | 38.31 | 33.84 | 31.80 | 28.96\n\n
Barbara (512 \u00d7 512):\n
Method | \u03c3=5 | \u03c3=15 | \u03c3=25 | \u03c3=50\n
PEWA 1 (W) (5\u00d75) | 36.39 | 30.18 | 29.31 | 22.71\n
PEWA 2 (W) (5\u00d75) | 37.13 | 31.94 | 29.47 | 25.58\n
PEWA 1 (W) (7\u00d77) | 36.40 | 30.18 | 27.32 | 22.71\n
PEWA 2 (W) (7\u00d77) | 37.00 | 32.10 | 30.00 | 26.20\n
PEWA 1 (D) (5\u00d75) | 37.71 | 32.20 | 29.55 | 25.58\n
PEWA 2 (D) (5\u00d75) | 38.03 | 32.70 | 30.03 | 26.01\n
PEWA 1 (D) (7\u00d77) | 37.70 | 32.30 | 29.84 | 26.20\n
PEWA 2 (D) (7\u00d77) | 38.09 | 32.86 | 30.28 | 26.58\n
PEWA Basic (7\u00d77) | 36.80 | 31.89 | 29.76 | 25.83\n
NL-means [2] (7\u00d77) | 36.79 | 30.65 | 28.99 | 25.63\n
BM3D [6] | 38.31 | 33.11 | 30.72 | 27.23\n
NL-Bayes [12] | 38.38 | 32.47 | 30.02 | 26.45\n
ND-SAFIR [11] | 37.12 | 31.80 | 29.24 | 24.09\n
K-SVD [10] | 38.08 | 32.33 | 29.54 | 25.43\n
LSSC [16] | 38.48 | 33.00 | 30.47 | 27.06\n
PLOW [5] | 37.98 | 21.17 | 30.20 | 26.29\n
SOP [18] | 37.74 | 32.65 | 30.37 | 27.35\n\n
Table 3: Comparison of several versions of PEWA (W = Wiener, D = DCT, Basic) and competitive methods on a few standard images corrupted with white Gaussian noise. The best PSNR values are in bold (PSNR values from publications cited in the literature).\n\n
References\n\n
[1] Alquier, P. & Lounici, K. (2011) PAC-Bayesian bounds for sparse regression estimation with exponential weights. Electronic Journal of Statistics 5:127-145.\n
[2] Buades, A., Coll, B. & Morel, J.-M. (2005) A review of image denoising algorithms, with a new one. SIAM J. Multiscale Modeling & Simulation 4(2):490-530.\n
[3] Burger, H., Schuler, C. & Harmeling, S. (2012) Image denoising: can plain neural networks compete with BM3D? In IEEE Conf. Comp. Vis. Patt. Recogn. (CVPR\u201912), pp. 2392-2399, Providence, Rhode Island.\n
[4] Chatterjee, P. & Milanfar, P. (2010) Is denoising dead? IEEE Transactions on Image Processing 19(4):895-911.\n
[5] Chatterjee, P. & Milanfar, P. (2012) Patch-based near-optimal image denoising. IEEE Transactions on Image Processing 21(4):1635-1649.\n
[6] Dabov, K., Foi, A., Katkovnik, V. & Egiazarian, K. (2007) Image denoising by sparse 3D transform-domain collaborative filtering. IEEE Transactions on Image Processing 16(8):2080-2095.\n
[7] Dalalyan, A.S. & Tsybakov, A.B. (2008) Aggregation by exponential weighting, sharp PAC-Bayesian bounds and sparsity. Machine Learning 72:39-61.\n
[8] Dalalyan, A.S. & Tsybakov, A.B. (2009) Sparse regression learning by aggregation and Langevin Monte Carlo. Available at arXiv:0903.1223.\n
[9] Deledalle, C.-A., Duval, V. & Salmon, J. 
(2012) Non-local methods with shape-adaptive patches (NLM-SAP). J. Mathematical Imaging and Vision 43:103-120.\n
[10] Elad, M. & Aharon, M. (2006) Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image Processing 15(12):3736-3745.\n
[11] Kervrann, C. & Boulanger, J. (2006) Optimal spatial adaptation for patch-based image denoising. IEEE Transactions on Image Processing 15(10):2866-2878.\n
[12] Lebrun, M., Buades, A. & Morel, J.-M. (2013) Implementation of the \u201cNon-Local Bayes\u201d (NL-Bayes) image denoising algorithm. Image Processing On Line 3:1-42. http://dx.doi.org/10.5201/ipol.2013.16\n
[13] Leung, G. & Barron, A.R. (2006) Information theory and mixing least-squares regressions. IEEE Transactions on Information Theory 52:3396-3410.\n
[14] Levin, A., Nadler, B., Durand, F. & Freeman, W.T. (2012) Patch complexity, finite pixel correlations and optimal denoising. In Europ. Conf. Comp. Vis. (ECCV\u201912), pp. 73-86, Firenze, Italy.\n
[15] Louchet, C. & Moisan, L. (2011) Total Variation as a local filter. SIAM J. Imaging Sciences 4(2):651-694.\n
[16] Mairal, J., Bach, F., Ponce, J., Sapiro, G. & Zisserman, A. (2009) Non-local sparse models for image restoration. In IEEE Int. Conf. Comp. Vis. (ICCV\u201909), pp. 2272-2279, Tokyo, Japan.\n
[17] Milanfar, P. (2013) A tour of modern image filtering. IEEE Signal Processing Magazine 30(1):106-128.\n
[18] Ram, I., Elad, M. & Cohen, I. (2013) Image processing using smooth ordering of its patches. IEEE Transactions on Image Processing 22(7):2764-2774.\n
[19] Rigollet, P. & Tsybakov, A.B. (2012) Sparse estimation by exponential weighting. Statistical Science 27(4):558-575.\n
[20] Roth, S. & Black, M.J. (2005) Fields of experts: a framework for learning image priors. In IEEE Conf. Comp. Vis. Patt. Recogn. (CVPR\u201905), vol. 2, pp. 860-867, San Diego, CA.\n
[21] Salmon, J. & Le Pennec, E. 
(2009) NL-Means and aggregation procedures. In IEEE Int. Conf. Image Process. (ICIP\u201909), pp. 2977-2980, Cairo, Egypt.\n
[22] Talebi, H., Zhu, X. & Milanfar, P. (2013) How to SAIF-ly boost denoising performance. IEEE Transactions on Image Processing 22(4):1470-1485.\n
[23] Van De Ville, D. & Kocher, M. (2009) SURE-based non-local means. IEEE Signal Processing Letters 16(11):973-976.\n
[24] Wang, Y.-Q. & Morel, J.-M. (2013) SURE guided Gaussian mixture image denoising. SIAM J. Imaging Sciences 6(2):999-1034.\n
[25] Xue, F., Luisier, F. & Blu, T. (2013) Multi-Wiener SURE-LET deconvolution. IEEE Transactions on Image Processing 22(5):1954-1968.\n
[26] Yu, G. & Sapiro, G. (2011) DCT image denoising: a simple and effective image denoising algorithm. Image Processing On Line. http://dx.doi.org/10.5201/ipol.2011.ys-dct\n
[27] Zontak, M. & Irani, M. (2011) Internal statistics of a single natural image. In IEEE Conf. Comp. Vis. Patt. Recogn. (CVPR\u201911), pp. 977-984, Colorado Springs, CO.\n
[28] Zoran, D. & Weiss, Y. (2011) From learning models of natural image patches to whole image restoration. In IEEE Int. Conf. Comp. Vis. (ICCV\u201911), pp. 479-486, Barcelona, Spain.", "award": [], "sourceid": 1139, "authors": [{"given_name": "Charles", "family_name": "Kervrann", "institution": "Inria"}]}