NeurIPS 2019
Sun Dec 8th through Sat Dec 14th, 2019, at the Vancouver Convention Center
Paper ID: 332
Title: ETNet: Error Transition Network for Arbitrary Style Transfer

Reviewer 1

-- I am impressed with the improvement in visual quality over previous approaches. The visual examples contain rich detail and better preserve high-level content structure.
-- The idea is intuitive, and I am not aware of similar prior approaches.

Reviewer 2

(1) Originality: This paper is the first to introduce an error-correction and diffusion mechanism into the style transfer literature, which separates it from existing works. That said, style transfer via iterative refinement itself is not a novel idea, as it has been applied by the WCT method [17].

(2) Quality: The paper provides both qualitative and quantitative experiments to show the superiority of the proposed style transfer method. However, there are several concerns:

a. My main concern is the meaning of equation (3). Equation (2) combines the style and content errors via a learnable weight W to obtain a full error feature. In equation (3), the affinity is calculated between this full error feature and the stylized image features. As far as I can see, these two features have different meanings and are not comparable to each other. What is the motivation for calculating the affinity between them? And how would a further multiplication between this affinity and the full error feature diffuse the error to the whole image?

b. In equation (4), why not simply concatenate the error feature of layer i with the error feature of layer i-1? Why use a fusion layer with learnable weights?

c. On performance: first, I do not consider 0.5680 seconds per image (Table 2) to be real-time (as stated in the introduction). Second, this work has to train each error transition network separately, which increases the training burden compared to methods such as AdaIN or WCT. That said, the proposed algorithm does achieve a lower style loss, and the results look better than those of other state-of-the-art methods.

(3) Clarity: In general, the paper is easy to follow but contains some spelling mistakes, e.g., Line 123: "inputted" -> "input"; Lines 109, 124: "outputted" -> "output". There is also some confusion in the equations: the '*' symbol in equation (3) appears to be an element-wise multiplication, which differs from the matrix multiplication in equation (2); the authors should make this clear.

(4) Significance: This paper aims to improve existing style transfer methods via iterative error correction. The framework is novel compared to other style transfer methods.

Post rebuttal: The rebuttal does not clearly answer my question. However, after revisiting the details of the paper and drawing on the comments of the other reviewers, I can see the insight: loosely speaking, the method is a feed-forward way of optimizing the transferred image, with an attention mechanism involved. I have therefore raised my rating.
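For concreteness, this is how I read equations (2)-(3): a non-local, attention-style diffusion step. The following is my own numpy sketch, not the authors' implementation; the shapes (C channels, N = H*W flattened positions) and variable names are made up for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical shapes: C channels, N = H*W spatial positions.
C, N = 4, 6
rng = np.random.default_rng(0)
E = rng.standard_normal((C, N))  # full error feature from eq. (2), flattened
F = rng.standard_normal((C, N))  # stylized image feature, flattened

# Affinity between every pair of spatial positions (my reading of eq. (3)):
# each row is a distribution over positions of the error feature.
A = softmax(F.T @ E, axis=-1)    # shape (N, N)

# Multiplying the affinity back into the error spreads each local error
# over all positions, weighted by feature similarity.
E_diffused = E @ A.T             # shape (C, N)
```

Under this reading, the affinity acts like a cross-attention map between "where the image is" (F) and "what needs fixing" (E), which would explain how a local error is propagated globally; whether this matches the authors' intent is exactly my question.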

Reviewer 3

The idea of self error correction is smart. It requires the computation of style/content features at multiple resolutions. It would be nice to discuss the relation to iterative back-propagation.
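The iterative baseline in question is Gatys-style optimization, which refines the stylized image by back-propagating content and style losses. A toy numpy sketch of that loop, with a small random linear map standing in for the VGG encoder; all shapes, names, and constants here are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)
C0, C, N = 3, 8, 16
W = rng.standard_normal((C, C0)) * 0.1   # fixed toy "encoder" (stand-in for VGG)

content = rng.standard_normal((C0, N))   # content image, flattened
style = rng.standard_normal((C0, N))     # style image, flattened

def feats(x):
    return W @ x                         # (C, N) feature map

def gram(F):
    return F @ F.T / N                   # (C, C) Gram matrix for style

Fc, Gs = feats(content), gram(feats(style))

def loss_and_grad(x, alpha=1.0, beta=1.0):
    """Content + style loss and its analytic gradient w.r.t. the image x."""
    F = feats(x)
    dG = gram(F) - Gs
    loss = alpha * np.sum((F - Fc) ** 2) + beta * np.sum(dG ** 2)
    # d/dF of the content term is 2(F - Fc); of the style term, (4/N) dG F.
    dF = alpha * 2.0 * (F - Fc) + beta * (4.0 / N) * dG @ F
    return loss, W.T @ dF                # chain rule through F = W x

# Iterative refinement by gradient descent, starting from the content image.
x = content.copy()
losses = []
for _ in range(200):
    l, g = loss_and_grad(x)
    x -= 0.05 * g
    losses.append(l)
```

The contrast I would like discussed: this loop recomputes gradients per image at test time, whereas the proposed error transition networks amortize the correction into a fixed number of feed-forward refinement passes.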