Convergence for bptt grads:
1.5078654e-06
-3.4644954
-3.4644167
-3.4644315
-3.464468
-3.4644647
-3.4644377
-3.4644852
-3.4644947
-3.4644802
-3.4644756
-3.4644678
-3.4644752
-3.4644554
-3.4644594
-3.4644692
-3.4644585
-3.4644775
-3.4644618
-3.4644926
-3.4644594
-3.4644632
-3.4644892
-3.4644353
-3.4644449
-3.464493
-3.4644852
-3.4644585
-3.4644985
-3.4644582
-3.4644947
-3.4644837
-3.4644232
-3.4644465
-3.4644933
-3.4644215
-3.4644704
-3.4644513
-3.4644158
-3.4644868
-3.4644537
Final cosine with grad:	 0.9999980330467224
Final dist with grad:	 0.00342211383394897
