NeurIPS 2019
Sun Dec 8th through Sat Dec 14th, 2019, Vancouver Convention Center
Paper ID: 4229 Stein Variational Gradient Descent With Matrix-Valued Kernels

### Reviewer 1

The paper is clearly written: both precise and easy to read. The main idea is very simple: remove the assumption that we equip the vector space of vector-valued functions H^d with the standard inner product (or, equivalently, the standard kernel), i.e., take into account that the vector components could be "correlated". While this is simple, it does add clarity to the theoretical setting of SVGD, in which it is assumed, but not stated, that the components are independent. As the authors show, it is important to note that H^d is a vector-valued RKHS, since coordinate transformations (which should not affect experiments) induce non-standard matrix kernels. While this provides the natural theoretical framework, it raises the question of which matrix kernel to choose, and I think the authors do not really answer this important (but probably hard) question. Indeed, it seems to me that the natural way to choose the matrix kernel would be to use some intrinsic geometric information (coordinate independent), but the authors themselves explain that this leads to expensive computation. It is not clear to me why the "mixture preconditioning kernel" would work better in general.
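For concreteness, here is a minimal sketch (not the authors' code) of one SVGD update with a matrix-valued kernel of the simplest kind discussed in the paper: a constant-preconditioner kernel K(x, x') = Q^{-1} k(x, x') with a scalar RBF base kernel k. The names (`svgd_step`, `Q`, `grad_logp`, the bandwidth `h`) are illustrative assumptions, not from the submission; setting Q to the identity recovers standard SVGD.

```python
# Minimal sketch, assuming K(x, x') = Q^{-1} k(x, x') with k an RBF kernel.
# Not the authors' implementation; names are illustrative.
import numpy as np

def svgd_step(X, grad_logp, Q, h=1.0, step=1e-2):
    """One SVGD update.
    X: (n, d) particles; grad_logp: (n, d) scores at X; Q: (d, d) preconditioner."""
    n, d = X.shape
    Q_inv = np.linalg.inv(Q)
    diff = X[:, None, :] - X[None, :, :]            # (n, n, d), x_i - x_j
    sq = np.sum(diff ** 2, axis=-1)                 # pairwise squared distances
    k = np.exp(-sq / (2 * h))                       # scalar RBF base kernel k(x_i, x_j)
    # Driving term: (1/n) sum_j K(x_i, x_j) grad_logp(x_j)
    drive = (k @ grad_logp) @ Q_inv.T / n
    # Repulsive term: (1/n) sum_j div_{x_j} K(x_i, x_j) = Q^{-1} sum_j grad_{x_j} k(x_i, x_j) / n
    grad_k = np.einsum('ij,ijd->id', k, diff) / h
    repulse = grad_k @ Q_inv.T / n
    return X + step * (drive + repulse)
```

My question about the choice of kernel remains in this framing: the update depends on Q (or, for the mixture preconditioning kernel, on a point-dependent combination of such preconditioners), and it is unclear in general which choice is best.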