This is a solid paper that definitely warrants acceptance. The paper is clearly written and makes multiple substantive contributions to training and analyzing RNNs at the boundary of ML and neuroscience. The reviewers identified some issues, primarily relating to clarity and requests for additional details, and there is reasonable confidence that updates by the authors will satisfy these requests.

There is an emerging field of research in which neural networks are trained to solve tasks used in neuroscience experiments, allowing comparison between the representations and dynamics learned by artificial systems and those observed in real neural recordings. While this particular paper does not deal with real neural data, it describes findings that will be of interest to this growing community. In addition, by selecting a task of interest to the IBL, it will be relevant to that subset of the neuroscience community.

One additional comment in connection with the existing literature: while the task investigated in this work is on the simpler end of the spectrum, thereby permitting comparison with the tractable Bayes-optimal model, there is other work in a similar neuroscience-adjacent context that emphasizes, for example, how recurrent networks can solve multiple tasks [Yang et al. 2019] as well as higher-dimensional control problems [Merel et al. 2020]. At present there seems to me to be a gap between the regime of a single, relatively simple task on one end and more complex models trained to solve multiple, more complex problems on the other. I'm curious whether the authors have thoughts on how well their analysis techniques and model compression approach would scale to settings with higher "task" complexity, such as in the examples above.

References:
Yang et al. 2019, "Task representations in neural networks trained to perform many cognitive tasks"
Merel et al. 2020, "Deep neuroethology of a virtual rodent"