All reviewers agreed that this is a clearly novel contribution: the paper proposes the Quantile Propagation (QP) algorithm, which operates similarly to Expectation Propagation (EP). Whereas EP minimises the forward KL divergence KL(p||q) for each local factor, QP minimises the L2-Wasserstein distance. The authors present theoretical results showing that QP can provide smaller variances than EP, which could often be beneficial, as seen in the reported experiments on Gaussian process classification. The use of the Wasserstein distance for inference in GPs is interesting, and the paper can be accepted on the basis of novelty. However, the claims of improved performance compared to EP are not well supported experimentally; as the reviewers pointed out, the new algorithm does not provide significantly better results than EP. The authors should therefore down-weight their claims about "replacing EP". It would also be useful to add a clear and illustrative example highlighting the different behaviours of QP and EP.