This paper proposes a semi-autoregressive neural text generation method based on a cascaded transformer, in which candidate outputs are pruned during decoding by scoring them with CRF models of increasing context length. The method supports parallel computation at inference time, which leads to a 7x speedup over the autoregressive method with some loss in quality, a trade-off that is better than that of most existing non-autoregressive methods. The reviewers all agree that the idea is novel and well developed. The experiments are extensive and show significant benefits over existing autoregressive and non-autoregressive methods. It is a very nice paper.