Convergence for BPTT grads:
1.5078654e-06
-1.9676609
-1.9679229
-1.9679229
-1.9679227
-1.9679227
-1.9679229
-1.9679224
-1.9679227
-1.9679229
-1.9679222
-1.9679227
-1.9679216
-1.9679229
-1.9679233
-1.9679222
-1.9679229
-1.9679229
-1.9679233
-1.967923
-1.9679227
-1.9679224
-1.967923
-1.9679233
-1.9679221
-1.9679229
-1.9679221
-1.967923
-1.9679222
-1.967923
-1.967923
-1.9679222
-1.9679224
-1.967923
-1.967923
-1.9679227
-1.9679229
-1.9679227
-1.9679224
-1.9679229
-1.9679229
Final cosine similarity with grad: 0.9964593052864075
Final distance with grad: 0.14849257469177246
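
For readers who want to reproduce the final cosine and distance figures above, here is a minimal sketch of how two such metrics could be computed between a BPTT gradient and a reference gradient. It assumes that "dist" means Euclidean (L2) distance and that both gradients have been flattened to equal-length vectors. The log does not confirm either point. The function names and the sample vectors are hypothetical and are not taken from the run that produced this log.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors.

    Undefined (division by zero) if either vector has zero norm.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """L2 distance between two equal-length vectors (assumed meaning of 'dist')."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical flattened gradients, chosen only to illustrate the calls.
bptt_grad = [3.0, 4.0]
reference_grad = [4.0, 3.0]

print("cosine:", cosine_similarity(bptt_grad, reference_grad))
print("dist:", euclidean_distance(bptt_grad, reference_grad))
```

A cosine close to 1 means the two gradients point in nearly the same direction. The distance also reflects any difference in magnitude, so the two numbers are best read together.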
