Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
The paper proposes an interesting strategy for active learning in which the model must be learned from little or no data. This is a useful problem to study in practice, for which the authors provide examples. As reviewers 2 and 4 note, I recommend polishing the related work, particularly the connections to the BELGAM model. Even the original DLGM paper of Rezende et al. (2014) uses priors for the network parameters, although it performs MAP estimation.