Part of Advances in Neural Information Processing Systems 15 (NIPS 2002)
There are several reinforcement learning algorithms that yield approximate solutions for the problem of policy evaluation when the value function is represented with a linear function approximator. In this paper we show that each of the solutions is optimal with respect to a specific objective function. Moreover, we characterise the different solutions as images of the optimal exact value function under different projection operations. The results presented here will be useful for comparing the algorithms in terms of the error they achieve relative to the error of the optimal approximate solution.
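As a concrete illustration of the setting, the sketch below computes two well-known approximate solutions for a toy Markov reward process: the TD(0) fixed point and the Bellman-residual least-squares solution. The MRP (transition matrix `P`, rewards `r`, discount `gamma`, features `Phi`, and state weighting `D`) is entirely hypothetical and not taken from the paper; the point is only that different objective functions yield different linear approximations of the same exact value function.

```python
import numpy as np

# Hypothetical 3-state Markov reward process (illustrative only).
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])   # transition matrix
r = np.array([1.0, 0.0, 2.0])    # expected rewards
gamma = 0.9                      # discount factor

# Exact value function: V = (I - gamma * P)^{-1} r
V = np.linalg.solve(np.eye(3) - gamma * P, r)

# Linear approximation V ~ Phi @ w with two features per state.
Phi = np.array([[1.0, 0.0],
                [1.0, 1.0],
                [0.0, 1.0]])
D = np.diag([1/3, 1/3, 1/3])     # state-weighting matrix (uniform here)

# TD(0) fixed point: Phi^T D (Phi - gamma * P @ Phi) w = Phi^T D r
A = Phi.T @ D @ (Phi - gamma * P @ Phi)
b = Phi.T @ D @ r
w_td = np.linalg.solve(A, b)

# Bellman-residual minimisation: least squares on (Phi - gamma*P@Phi) w = r
M = Phi - gamma * P @ Phi
w_br = np.linalg.solve(M.T @ D @ M, M.T @ D @ r)

print("exact V:      ", V)
print("TD solution:  ", Phi @ w_td)
print("BR solution:  ", Phi @ w_br)
```

Each weight vector solves a different weighted least-squares problem, so each approximate value function `Phi @ w` is the image of the exact `V` under a different (generally oblique) projection, which is the relationship the abstract refers to.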