Convergence for bptt grads:
1.5078654e-06
-5.08868
-5.0875187
-5.0905437
-5.0881658
-5.0879436
-5.089059
-5.089189
-5.086409
-5.087255
-5.088713
-5.0873876
-5.0897884
-5.089011
-5.088446
-5.0894647
-5.0882387
-5.0878577
-5.0882444
-5.0884814
-5.0893717
-5.0893016
-5.0888624
-5.0887427
-5.0883026
-5.0888166
-5.0883245
-5.0878553
-5.088594
-5.0881352
-5.090726
-5.088081
-5.0890536
-5.087842
-5.0874767
-5.089555
-5.089337
-5.088247
-5.0894394
-5.088837
-5.089529
Final cosine with grad:	 1.0
Final dist with grad:	 9.569244139129296e-05
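
The log does not show how the two final metrics were computed. As a minimal sketch, assuming "cosine" means the cosine similarity between the flattened BPTT gradient and a reference gradient, and "dist" means the Euclidean (L2) distance between them, they could be reproduced as below. The function names and sample vectors are hypothetical and not taken from the program that wrote this log.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two gradients, flattened to 1-D."""
    a = np.ravel(a).astype(np.float64)
    b = np.ravel(b).astype(np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def l2_distance(a, b):
    """Euclidean distance between two gradients, flattened to 1-D."""
    a = np.ravel(a).astype(np.float64)
    b = np.ravel(b).astype(np.float64)
    return float(np.linalg.norm(a - b))


# Hypothetical stand-ins for the reference and BPTT gradients.
reference_grad = np.array([1.0, -2.0, 3.0])
bptt_grad = reference_grad + 1e-5

print("Final cosine with grad:\t", cosine_similarity(bptt_grad, reference_grad))
print("Final dist with grad:\t", l2_distance(bptt_grad, reference_grad))
```

Under these assumptions, a cosine of 1.0 together with a distance on the order of 1e-4 would mean the two gradients point in the same direction and differ only by a small residual.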
