The reviewers' opinions on this paper vary. Reviewer 4 feels it needs to be better distinguished from the literature. Ablation analyses that reveal which aspects are necessary would also strengthen the paper. Reviewer 3 raises a valid concern about limiting assumptions in the model: "an instantaneous mixture of latents is not the sort of dependence that a state-space model would capture". Reviewers 1 and 2 are more positive about the paper, and most of their earlier concerns were addressed in the rebuttal. This paper is close to the boundary for acceptance.