NIPS Proceedings

Csaba Szepesvári

14 Papers

  • Deep Representations and Codes for Image Auto-Annotation (2012)
  • Improved Algorithms for Linear Stochastic Bandits (2011)
  • Error Propagation for Approximate Policy and Value Iteration (2010)
  • Estimation of Rényi Entropy and Mutual Information Based on Generalized Nearest-Neighbor Graphs (2010)
  • Online Markov Decision Processes under Bandit Feedback (2010)
  • Parametric Bandits: The Generalized Linear Case (2010)
  • A General Projection Property for Distribution Families (2009)
  • Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation (2009)
  • Multi-Step Dyna Planning for Policy Evaluation and Control (2009)
  • A Convergent $O(n)$ Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation (2008)
  • Online Optimization in X-Armed Bandits (2008)
  • Regularized Policy Iteration (2008)
  • Fitted Q-iteration in continuous action-space MDPs (2007)
  • The Asymptotic Convergence-Rate of Q-learning (1997)