Review for NeurIPS paper: VAEM: a Deep Generative Model for Heterogeneous Mixed Type Data

NeurIPS 2020

Meta Review

The paper proposes modelling vectors whose dimensions have different types (real-valued and categorical) using a two-stage VAE approach. First, a separate VAE with a 1D latent is trained for each input dimension to standardize the data. Then a "dependency" VAE is trained on top of the resulting latents to capture the dependence between them.

Pros:
- The approach is interesting and novel.
- The idea is simple and seems effective, so it might be widely adopted.
- The paper is well written.
- VAEM outperforms sensible baselines at generative modelling and on a sequential information acquisition task.

Cons:
- It is not explained why the two-stage training approach is a good idea. The fact that joint training tends to perform worse than two-stage training, as reported in the rebuttal, is an important observation that should be discussed and, ideally, explained in the paper.
- The UCI datasets used for the evaluation are very small.
- The paper does not explore whether the benefits come from better inference, better generation, or both.
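
For reference, the two-stage scheme summarized at the start of this review can be sketched as two standard evidence lower bounds. The notation here is an assumption made for illustration and is not taken from the paper: $x_d$ is the $d$-th input dimension, $z_d$ its 1D latent, $\mathbf{z} = (z_1, \dots, z_D)$ the stacked stage-one latents, and $h$ the latent of the dependency VAE. The paper's exact objective may differ, for example in what the second-stage encoder conditions on.

```latex
% Stage 1 (assumed form): one VAE per input dimension d, each with a 1D latent z_d
\mathcal{L}_d(\theta_d, \phi_d)
  = \mathbb{E}_{q_{\phi_d}(z_d \mid x_d)}\!\left[ \log p_{\theta_d}(x_d \mid z_d) \right]
  - \mathrm{KL}\!\left( q_{\phi_d}(z_d \mid x_d) \,\|\, p(z_d) \right),
  \qquad d = 1, \dots, D

% Stage 2 (assumed form): stage-one parameters are frozen; a dependency VAE
% is fit to the stacked latents z = (z_1, ..., z_D)
\mathcal{L}_{\mathrm{dep}}(\psi, \lambda)
  = \mathbb{E}_{q_{\lambda}(h \mid \mathbf{z})}\!\left[ \log p_{\psi}(\mathbf{z} \mid h) \right]
  - \mathrm{KL}\!\left( q_{\lambda}(h \mid \mathbf{z}) \,\|\, p(h) \right)
```

Under this reading, joint training would optimize all parameters against a single combined bound, which is the variant the rebuttal reports as performing worse.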