Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
It is not clear to me what advantage this model offers over the many previous works using self-organizing maps, predictive coding, or slow feature analysis. There is also a lot of recent work on unsupervised learning of depth from images.
1. I wonder how the behavior changes if the morphology of the agent (its motor configuration) is made more complex. If I understand correctly, the agent has only 3 degrees of freedom (Figure 3 (left))? What would happen if the agent were much more complex, say with many more degrees of freedom? Would we still see the same behavior as in the current experiments in the paper?
Originality: This paper is fairly original, as it formalises the problem of sensorimotor prediction and evaluation. Self-localization is a common topic that is often addressed in RL and neuroscience; however, it was nice to see a proper analysis of the problem together with the invariants and how to measure them. Quality: Overall I think this paper is of high quality and is nicely self-contained. The authors motivate the problem well, give a formal statement of the problem setup, define the invariant properties that we are looking for in agents as proof of sensorimotor awareness, present the loss that is supposed to make this behaviour emerge, present ways of measuring sensorimotor awareness via invariants, and finally run experiments and show positive results in 3 different environment setups. Clarity: Even though the paper is well written, I think it could be clearer. I feel that the authors complicate the problem by introducing a lot of different notation and a very generic definition of the problem. A running example throughout the manuscript might have helped the reader understand the concepts faster. The paper boils down to this: the hypothesis is that performing next-step prediction of some carefully chosen values makes self-localization emerge under the right environment properties. I think this message needs to be stated simply, as it is easy for the reader to get lost in the details. A diagram for the two metrics would have been useful too. Significance: Overall I feel that this is a positive, sensible result; however, I am not sure how significant it is. It seems natural and obvious that such properties would emerge in the given setup, but it is also nice to have work that formally presents the claim and shows that it is actually the case.
I do not know the literature well enough to judge whether this has been shown before. However, I would have liked the related work section to justify why this paper is unique and what it brings compared to other works.
Originality: Moderate. This is an area of research that has received a significant amount of attention, as the authors indicate. But the specific investigation in this work regarding the sufficient conditions for certain structural properties to emerge is useful and novel. Quality: Good. The approach appears to be technically sound. The claims are supported by thorough variation of the environmental properties involved. Clarity: Good. The paper is clearly written and very well organized. One moderate weakness in the exposition is the treatment of Condition II regarding metric structure. If I understand correctly, the environmental conditions that are sufficient for this property to emerge amount to "dataset completeness" in the form of the agent observing transitions under a wide variety of environmental configurations. The structure learned under violation of this condition can be seen as "overfitting" to the particular environmental configuration (as it is static) and therefore is only topologically similar to the true state. This is somewhat subtle and could use further exposition. Significance: Moderate. This is important work, and is likely to provide a basis for future researchers in this area to analyze and develop environments to investigate and encourage the emergence of spatial representations. The architecture proposed in this work is not a significantly useful contribution and the results are not immediately illuminating about any particular application domain, but the lessons learned will be valuable to future researchers.