Title: On the Convergence Rate of Training Recurrent Neural Networks

This paper proves, for the first time, poly-time convergence of SGD/GD in over-parametrized RNNs. Given that there are not many theoretical results in this space, all reviewers find this result to be significant progress. Therefore, I recommend acceptance.