NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID: 114
Title: Blind Super-Resolution Kernel Estimation using an Internal-GAN

Reviewer 1

This paper introduces a blind super-resolution (SR) technique, i.e. a method that increases the resolution of an image without knowing the downscaling kernel.

Clarity: Clarity is fairly good. However, some statements raise questions that are only answered later in the paper. It would be better to warn the reader that an explanation is coming.

Originality: The proposed GAN for estimating the SR kernel is new and tailored to the blind SR problem.

Quality: The authors have presented their approach rigorously, and the paper is technically sound. The only issue I see is that the performance differences reported in the experimental section should be shown to be statistically significant.

Significance: The results are significant (see the previous comments on the contributions).

Remarks:
Line 101: The benefit of using a D-map as compared to a pixelwise output is not explained in detail.
Line 117: Having a 7x7 receptive field is also possible because the authors use 1x1 convolutions in the other layers. This should be pointed out in the text.
Line 130: The authors explain that a single-layer generator does not work because its optimization is entwined with that of the discriminator, which is non-convex. It should be mentioned here that an "over-parametrized" learned G can nevertheless be compacted into a single-layer kernel, as is done in Section 4.2.
Table 1: Please provide evidence that the reported PSNR differences are statistically significant. At this point I did not understand which synthetic dataset is used (it is explained in the next subsection).

Reproducibility: Data and code will be made available, but when, and under what circumstances?

Runtime: I don't understand the comment that "runtime is independent of image size". If the image is larger, there are surely more patches to learn from, and thus training epochs are probably longer.

UPDATE AFTER REBUTTAL: I thank the authors for the relevant feedback they provided.
Concerning statistical significance, I agree that the indicators reported in the initial submission are the usual practice in the SR (or image processing) literature, but I think the authors would also agree with me that this is bad practice. I also agree that the empirical standard deviation is not very informative, because what we would like to know is how concentrated the distribution is around the empirical mean. I suggest bootstrap confidence intervals. Figure 2 of the feedback is also interesting. About the D-map: why is it computationally more efficient? Is there parallelization to be exploited? I am confident that the authors can easily address the other suggestions I made. I maintain my score because it is already fairly high.
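
To illustrate the bootstrap suggestion, here is a minimal sketch of a percentile bootstrap confidence interval for the mean per-image PSNR gain of one method over another. The PSNR values are synthetic placeholders, and the function name `bootstrap_ci` and its parameters are assumptions made for this example; none of them come from the paper or the rebuttal.

```python
import numpy as np

def bootstrap_ci(diffs, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `diffs`."""
    diffs = np.asarray(diffs, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample images with replacement; each row is one bootstrap replicate.
    idx = rng.integers(0, diffs.size, size=(n_resamples, diffs.size))
    means = diffs[idx].mean(axis=1)
    low, high = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return diffs.mean(), low, high

# Hypothetical per-image PSNR values (dB) for two methods on the same images.
rng = np.random.default_rng(1)
psnr_a = rng.normal(26.0, 1.5, size=100)
psnr_b = psnr_a + rng.normal(0.8, 0.5, size=100)

# Paired design: bootstrap the per-image differences, not each method alone.
mean_diff, low, high = bootstrap_ci(psnr_b - psnr_a)
print(f"mean gain {mean_diff:.2f} dB, 95% CI [{low:.2f}, {high:.2f}]")
```

If the interval excludes zero, the reported gain is unlikely to be an artifact of which images happened to be in the test set. Resampling the paired differences keeps the per-image correlation between the two methods.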

Reviewer 2

This paper addresses the single-image SR problem based on zero-shot learning. Unlike conventional learning-based SR methods, which assume a known SR kernel and use an external database to train the network, this work assumes an unknown SR kernel and therefore estimates the kernel blindly at test time. The estimated kernels are then integrated with conventional SR methods (e.g., [29]), and the accurately estimated SR kernel improves restoration quality by a large margin. Overall, this work is a natural extension of previous work [23] using learning techniques, and the manuscript is well organized and easy to read. Here are some questions and minor comments.
a. The discriminator D is a binary classifier in this task, but it outputs a 2D map rather than a single value. Is there a specific reason for choosing this network architecture?
b. I believe the proposed method can estimate the SR kernel even when the input images are down-scaled by bicubic interpolation. From this point of view (assuming the GT kernel is bicubic), it would be great if the authors provided quantitative comparisons on conventional datasets (e.g., Set5, Set14, ...).
c. Isn't it possible to train the SR network on training datasets whose LR images are down-scaled with various SR kernels as well as bicubic kernels? (The SR kernels might be generated by modifying Levin et al.'s blur kernel generation technique.)
d. How do you handle the estimated SR kernel when integrating it with ZSSR [29], which is implemented in a coarse-to-fine (multi-scale) manner?
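
As an illustration of point c, the sketch below synthesizes LR images with randomly drawn downscaling kernels. It uses random anisotropic Gaussian kernels as a simple stand-in; this is not Levin et al.'s kernel generation technique, and the kernel size, the range of standard deviations, and the function names are all assumptions made for the example.

```python
import numpy as np

def random_gaussian_kernel(size=13, rng=None):
    """Random anisotropic Gaussian kernel, normalized to sum to 1."""
    rng = np.random.default_rng() if rng is None else rng
    sx, sy = rng.uniform(0.6, 3.0, size=2)   # std devs along the principal axes
    theta = rng.uniform(0.0, np.pi)          # orientation of the principal axes
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    inv_cov = np.linalg.inv(rot @ np.diag([sx**2, sy**2]) @ rot.T)
    ax = np.arange(size) - (size - 1) / 2    # coordinates centered on the kernel
    xx, yy = np.meshgrid(ax, ax)
    pts = np.stack([xx, yy], axis=-1)        # shape (size, size, 2)
    expo = np.einsum("...i,ij,...j->...", pts, inv_cov, pts)
    k = np.exp(-0.5 * expo)
    return k / k.sum()

def degrade(img, kernel, scale=2):
    """Blur a 2D image with an odd-sized `kernel` (edge padding), then subsample."""
    kh, kw = kernel.shape
    h, w = img.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros((h, w))
    # Sliding-window sum; for these point-symmetric kernels, correlation
    # and convolution give the same result.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out[::scale, ::scale]

rng = np.random.default_rng(0)
hr = rng.random((64, 64))                    # placeholder HR image
lr = degrade(hr, random_gaussian_kernel(rng=rng), scale=2)
print(lr.shape)
```

Training pairs built this way cover a family of kernels, but only the family the generator can produce, whereas a test-time estimator is not tied to a predefined kernel family.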

Reviewer 3

This paper investigates a kernel estimation method for blind super-resolution. I think this paper has the following strong points:
1) The idea of using the distribution of small patches to guide the training of the kernel estimator is interesting.
2) The authors propose using deep linear networks to estimate the kernel, and provide a good justification for using such a special network architecture.
3) The authors use some prior information to regularize the training of the kernel estimator.
4) The proposed algorithm achieves good results.
Despite the above strong points, I think this paper also has the following drawbacks:
1) Lack of experimental justification. Although the authors have discussed the reasons for their design choices, it would be better if they provided experimental results showing the effect of the kernel constraints, of a kernel estimator with/without activation functions, ...
2) More implementation details should be provided to make this work reproducible, for example training details, hyper-parameters, ... I think there must be many tricks in selecting patches for training the discriminator.
3) Is there a non-negativity constraint on the estimated kernel?
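
The property behind the deep linear network is that a stack of convolution layers without activation functions is equivalent to a single convolution whose kernel is the convolution of all layer kernels. The sketch below checks this numerically. It is a simplified, single-channel illustration with hypothetical filter sizes and random weights, not the authors' architecture or code.

```python
import numpy as np

def conv2d_full(a, b):
    """Full 2D convolution of two arrays (output size = sum of sizes - 1)."""
    out = np.zeros((a.shape[0] + b.shape[0] - 1, a.shape[1] + b.shape[1] - 1))
    for i in range(b.shape[0]):
        for j in range(b.shape[1]):
            out[i:i + a.shape[0], j:j + a.shape[1]] += b[i, j] * a
    return out

def conv2d_valid(img, k):
    """'Valid' 2D convolution: no padding, kernel flipped."""
    kh, kw = k.shape
    kf = k[::-1, ::-1]
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(kh):
        for j in range(kw):
            out += kf[i, j] * img[i:i + oh, j:j + ow]
    return out

# Hypothetical layer sizes; the weights are random, not learned.
sizes = [7, 5, 3, 1, 1, 1]
rng = np.random.default_rng(0)
kernels = [rng.normal(size=(s, s)) for s in sizes]

# Collapse the stack: convolve all layer kernels into one effective kernel.
k_eff = np.ones((1, 1))
for k in kernels:
    k_eff = conv2d_full(k_eff, k)

# Compare applying the layers one by one with applying the collapsed kernel.
img = rng.normal(size=(32, 32))
out_layers = img
for k in kernels:
    out_layers = conv2d_valid(out_layers, k)
out_single = conv2d_valid(img, k_eff)

print(k_eff.shape, np.allclose(out_layers, out_single))
```

The side length of the collapsed kernel is 1 plus the sum of (k - 1) over the layers, which is also the receptive field of the stack, so 1x1 layers add parameters without enlarging it. Inserting an activation function between layers breaks this equivalence, which is why the with/without-activation comparison requested in drawback 1 matters. With unconstrained weights the collapsed kernel can contain negative entries, which is the situation question 3 asks about.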