NIPS Proceedings

Qi Cai

2 Papers

  • Neural Temporal-Difference Learning Converges to Global Optima (2019)
  • Neural Trust Region/Proximal Policy Optimization Attains Globally Optimal Policy (2019)