NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID: 5992 Levenshtein Transformer

### Reviewer 1

[Update] Thanks for the revision and clarification! I have revised my review accordingly.

---

This submission introduces the Levenshtein Transformer, a non-autoregressive model for text generation and post-editing. Instead of generating tokens left to right, it repeats a delete-and-insert procedure: starting from an initial string, it repeatedly deletes tokens from or inserts new tokens into the output until convergence. The model is trained with imitation learning, where an expert policy derived from gold data or from a pretrained autoregressive teacher model is explored. Experiments on text summarization, machine translation, and post-editing show that the proposed model outperforms Transformer baselines in both accuracy and efficiency. Overall, I think this is interesting work, but I have some questions about both the technical and the experimental parts.
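To make the delete-and-insert procedure described above concrete, here is a minimal sketch of the refinement loop. The policy functions are toy stand-ins, not the authors' model: the real Levenshtein Transformer uses learned Transformer heads to predict deletions, insertion-slot counts, and inserted tokens, whereas the names and rules below are purely illustrative.

```python
def delete_policy(tokens):
    # Toy stand-in for the learned deletion head: drop tokens
    # marked as noise ("<bad>"). A real model scores each token.
    return [t for t in tokens if t != "<bad>"]

def insert_policy(tokens):
    # Toy stand-in for the learned insertion heads: a trailing "+"
    # marks a slot where one token should be inserted. A real model
    # first predicts how many placeholders to add at each slot,
    # then predicts the tokens that fill them.
    out = []
    for t in tokens:
        if t.endswith("+"):
            out.extend([t.rstrip("+"), "world"])
        else:
            out.append(t)
    return out

def refine(tokens, max_iters=10):
    """Alternate delete and insert until the sequence stops changing."""
    for _ in range(max_iters):
        new_tokens = insert_policy(delete_policy(tokens))
        if new_tokens == tokens:  # convergence: no edits were applied
            break
        tokens = new_tokens
    return tokens

print(refine(["hello+", "<bad>"]))  # ['hello', 'world']
```

The key property the review refers to is that, unlike left-to-right decoding, each iteration can revise the whole sequence in parallel, and decoding stops as soon as neither policy changes the output.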

### Reviewer 2

Originality: This is interesting work that casts the sequence generation task as two iterative tasks, insertion and deletion. I think the formulation is novel, and it is coupled with a training procedure based on imitation learning with two policies, i.e., deletion and insertion.

Quality: The proposed model and its training procedure seem apt and well designed. Experiments are carried out carefully, with consistent gains over the SOTA, i.e., the Transformer, along with faster inference speed.

Clarity: The paper is clearly written, though I have a couple of minor questions regarding technical details; see the "Improvements" section.

Significance: Given the inference efficiency and the reasonable quality improvements, I feel this work has the potential to impact future research.

Other comment: line 89: "we our policy for one iteration is" -> "{our, the}(?) policy for ..."