Paper ID: 835

Title: Extending Stein's unbiased risk estimator to train deep denoisers with correlated pairs of noisy images

Overall, the proposed work is interesting, but it is not clearly presented. My questions are:

1. The first experiment showed better results for eSURE than for SURE. How is SURE trained using "twice more data"? Is it by doubling the batch size? It would be good to see results with twice the batch size: 1) with two realizations of the same image; 2) with different images.
2. For the second experiment with imperfect ground truth, what sigma is used in training? eSURE requires sigma in Eq. (10). Is that sigma_noisy in Tables 2 and 3? Does eSURE need to know sigma_gt?
3. From Line 244, it seems that one network is trained per noise level, so seven DnCNN-eSURE networks were trained for Table 2 or 3? This is not practical: normally sigma_gt is unknown and may vary within the same dataset. It would be more reasonable to train one network for all noise levels, as in Table 1 for blind denoising.
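To make question 2 concrete, the following toy sketch (all names and noise levels are illustrative, not taken from the paper) shows why the noise in an imperfect ground truth and in its further-degraded copy is correlated, which is the setting eSURE addresses:

```python
import numpy as np

# Hypothetical setup: an "imperfect ground truth" y_gt = x + n_gt and a
# further-degraded input z = y_gt + n_extra share the component n_gt,
# so their noise realizations are correlated; plain Noise2Noise assumes
# independent noise, while eSURE covers this correlated case.
rng = np.random.default_rng(0)
x = rng.uniform(size=100_000)            # clean signal (placeholder)
sigma_gt, sigma_extra = 0.05, 0.10       # illustrative noise levels
n_gt = sigma_gt * rng.standard_normal(x.shape)
y_gt = x + n_gt                          # noisy "ground truth"
z = y_gt + sigma_extra * rng.standard_normal(x.shape)

# Empirical correlation between the two noise realizations:
rho = np.corrcoef(n_gt, z - x)[0, 1]     # clearly nonzero
```

Here rho is roughly sigma_gt / sqrt(sigma_gt**2 + sigma_extra**2), i.e. the correlation vanishes only when the ground truth is perfectly clean (sigma_gt = 0).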

Summary: The authors demonstrate that SURE and Noise2Noise, two methods that allow training a denoiser without clean ground truth, can be viewed within the same theoretical framework. They show theoretically that Noise2Noise is not generally applicable when the noise in the two images is correlated, as is the case, e.g., when training is done with ground truth created by averaging noisy images. They present a novel method (eSURE) as a solution and show that Noise2Noise is a special case of eSURE. While eSURE is superior to Noise2Noise in that it can be applied with correlated noise, it also yields slightly better results than vanilla SURE. The method is evaluated on the standard denoising datasets and compared to sensible baselines (Noise2Noise, vanilla SURE, and standard supervised training). Its performance is in agreement with the theory.

Originality:
+ The paper presents a highly original theory and derives a novel training method from it.

Quality:
+ The paper seems theoretically and technically sound.

Clarity:
+ The paper is well structured.
- The clarity of the theory could be improved by being more explicit in some explanations. E.g., Theorem 3 would be easier to understand if it were explicitly stated (at least this is how I read it) that y_1 could correspond to a ground truth with remaining noise and y_2 to an input image with correlated noise. Similarly, for Eq. 7, if the vectors (y-x) and (z-x) were introduced with an explicit statement of what they correspond to, the theory might be easier to grasp.

Significance: I believe the paper is highly significant and can further our understanding of machine learning from noisy data. The findings are fundamental and important.

Final Recommendation: Considering the principled, original approach and the high significance, I firmly recommend accepting the paper.
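For context on the SURE baseline discussed above, here is a minimal Monte-Carlo SURE sketch using the standard random-probe divergence estimate (a generic illustration, not the paper's implementation; the denoiser, probe, and step size eps are placeholders):

```python
import numpy as np

def mc_sure(denoiser, y, sigma, eps=1e-3, seed=None):
    """Monte-Carlo SURE: unbiased estimate of the per-pixel MSE with
    respect to the unknown clean image, computed from the noisy input
    y alone (AWGN with known sigma). The divergence of the denoiser is
    approximated with a single Gaussian probe vector b."""
    rng = np.random.default_rng(seed)
    n = y.size
    f_y = denoiser(y)
    b = rng.standard_normal(y.shape)
    # Finite-difference estimate of div f(y) = trace of the Jacobian
    div = (b * (denoiser(y + eps * b) - f_y)).sum() / eps
    # SURE = ||f(y) - y||^2 / n - sigma^2 + 2 sigma^2 div / n
    return ((f_y - y) ** 2).sum() / n - sigma**2 + 2 * sigma**2 * div / n

# Sanity check: for the identity "denoiser" f(y) = y, div f(y) ~ n, so
# the estimate reduces to sigma^2, the MSE of doing nothing.
rng = np.random.default_rng(0)
sigma = 0.1
y = sigma * rng.standard_normal(10_000)   # clean image x = 0
est = mc_sure(lambda v: v, y, sigma, seed=1)
```

This finite-difference step size eps is the parameter whose sensitivity reviewer 3 asks to ablate.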
------------------------------------------ Post Rebuttal: I feel that my concerns and suggestions have been adequately addressed and I will stick with my initial rating.

This paper is well written and the contribution is clearly described. My main concern is whether the assumption of correlated pairs of noisy images is helpful for real-world image denoising; I expect the authors can give some examples and initial results to illustrate this point for real application tasks. My second concern is that the proposed method is derived under an AWGN assumption, while real noise may be far from this assumption and can be both signal-dependent and spatially variant [1, 2]. [1] Toward convolutional blind denoising of real photographs, CVPR 2019. [2] When AWGN-based Denoiser Meets Real Noises, arXiv 2019. It also seems that the proposed method is sensitive to the parameter \epsilon (see Lines 202-203), so an ablation study on the effect of \epsilon is suggested.