David Wingate, Satinder Baveja
In order to represent state in controlled, partially observable, stochastic dynamical systems, some sort of sufﬁcient statistic for history is necessary. Predictive repre- sentations of state (PSRs) capture state as statistics of the future. We introduce a new model of such systems called the “Exponential family PSR,” which deﬁnes as state the time-varying parameters of an exponential family distribution which models n sequential observations in the future. This choice of state representation explicitly connects PSRs to state-of-the-art probabilistic modeling, which allows us to take advantage of current efforts in high-dimensional density estimation, and in particular, graphical models and maximum entropy models. We present a pa- rameter learning algorithm based on maximum likelihood, and we show how a variety of current approximate inference methods apply. We evaluate the qual- ity of our model with reinforcement learning by directly evaluating the control performance of the model.