Reviews: Characterizing the Exact Behaviors of Temporal Difference Learning Algorithms Using Markov Jump Linear System Theory

This paper analyzes Temporal-Difference (TD) learning algorithms using the theory of Markov Jump Linear Systems (MJLS). Its main contribution is to establish exact dynamics for the first- and second-order moments of TD learning with linear function approximation, expressed as Linear Time-Invariant (LTI) systems. Reviewers found the technical contributions to be very strong and potentially of important significance for the study of TD learning, a central object in RL. The main point of contention is the current presentation, which is notation-heavy, contains page-long theorem statements, and, most importantly, lacks sufficient discussion of how these results relate to existing work on the convergence analysis of TD learning. The AC shares these concerns. However, after careful discussion with the reviewers and having read the author feedback (which does promise to improve readability), the AC considers that the positive contributions outweigh the risk of poor readability and recommends acceptance, urging the authors to address the concerns raised by the reviewers and the AC.

Paper ID:	4586
Title:	Characterizing the Exact Behaviors of Temporal Difference Learning Algorithms Using Markov Jump Linear System Theory