NIPS 2018
Sun Dec 2nd through Sat the 8th, 2018 at Palais des Congrès de Montréal
Paper ID: 6582 Predictive Approximate Bayesian Computation via Saddle Points

### Reviewer 1

The paper proposes a framework for approximate Bayesian computation in an optimisation setting. While the idea is interesting, the presentation is somewhat limited: there are many English errors, and one key point is not well explained. From equation (5) onward, the loss function appears to take either negative or positive values, which would make the minimisation problem meaningless.

### Reviewer 2

Thank you for your response, which answers my main concerns. I am still voting for accepting the paper.

---

This paper considers the problem of performing Bayesian inference with an intractable likelihood. Approximate Bayesian Computation (ABC) is now a well-established method for handling complex likelihoods that do not admit a closed form or are too computationally expensive to evaluate. Nevertheless, such methods have limitations:

- they are delicate to tune, as they require a non-one-to-one function of the data (summary statistics) and the definition of a neighbourhood of the data with respect to some metric,
- they may be difficult to scale up.

To overcome such issues, the authors propose to move from the initial ABC paradigm towards a saddle point formulation of the problem. They suggest using the ability to sample from complex models (as in ABC) to solve an optimisation problem that provides an approximation of the posterior of interest. The authors compare the performance of their approach with some particular examples of ABC algorithms on synthetic and real datasets.

The motivation of the paper and the advocated solution are well explained. The existing literature on ABC is rather dense, but the authors managed to give a broad overview of the different solutions. I have only a few comments about existing literature that is not mentioned in the paper:

- to be completely fair, the authors could have mentioned ABC random forests for Bayesian parameter inference (Raynal et al. 2016), as it is a solution which avoids tuning the neighbourhood and reduces the need for summary statistic selection,
- regarding the formulation as an optimisation problem, it would have been interesting to provide a discussion, a comparison or a mention of the work of Moreno et al. (2016, Automatic Variational ABC) and Fasiolo et al. (2017, An Extended Empirical Saddlepoint Approximation for Intractable Likelihoods). Do you have any input or comment on this?

The paper is overall properly written and easy to follow. I have just a few minor comments about readability:

- the definition of the representation of $\theta$ for the posterior distribution in Section 3.1 (l.199 - l.204) is redundant, as it is already given in l.97 - l.104. I would suggest moving the text of l.199 - l.204 to just before Section 2.1 (sticking to the word "representation" rather than "reparametrization"), as it is used in Equations (4), (5) and so on, and removing it from Section 3.1,
- Equation (7) and Equation (10) are the same. In Section 3.1, the authors refer alternately to (10) or (7). I would suggest removing Equation (10), introducing the definition of $\Phi$ in Equation (7), and then specifying $f(Y,\i)$ and $u(\theta,Y)$ when needed.

The idea is original and very interesting, as it brings some efficient machine learning techniques to settings where existing ABC methods may fail or struggle. The results seem promising, though it would have been interesting to have a more comprehensive comparison with existing methods.

The main limitation I see in this approach is the inability to reach the optimal solution of the saddle point formulation (7), as in some variational approaches. This would lead to rough or wrong approximations of $f^*$ and $u^*$, and hence to a misspecified posterior approximation. Do you have any comments on this?

I spotted a few typos, listed below:

- l.101 modelstive?
- l.102 mdoels -> models
- l.108 $p(Y|\theta)p(\theta)$ -> $p(Y|\theta)\pi(\theta)$
- l.112 Shouldn't the expectation in Equation (4) be taken with respect to $q(\theta|Y)p(Y)$ instead of $p(\theta|Y)p(Y)$?
- l.134 experessive -> expressive
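
To make concrete the tuning issue raised in this review (standard ABC needs a summary statistic and a neighbourhood size), here is a minimal sketch of plain rejection ABC. It is not the paper's saddle point method. The Gaussian toy model, the prior, the choice of the sample mean as summary statistic, the tolerance `eps`, and the function name `rejection_abc` are all assumptions made purely for illustration.

```python
import random
import statistics


def rejection_abc(y_obs, prior_sample, simulate, summary, eps, n_draws, rng):
    """Plain rejection ABC: keep prior draws whose simulated summary
    lies within eps of the observed summary."""
    s_obs = summary(y_obs)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)        # draw a parameter from the prior
        y_sim = simulate(theta, rng)     # simulate data; no likelihood evaluation
        if abs(summary(y_sim) - s_obs) <= eps:   # neighbourhood test
            accepted.append(theta)
    return accepted


# Hypothetical toy problem: unknown mean of a unit-variance Gaussian.
rng = random.Random(0)
y_obs = [rng.gauss(1.0, 1.0) for _ in range(50)]

samples = rejection_abc(
    y_obs,
    prior_sample=lambda r: r.gauss(0.0, 3.0),                      # assumed prior
    simulate=lambda th, r: [r.gauss(th, 1.0) for _ in range(50)],  # assumed simulator
    summary=statistics.fmean,                                      # assumed summary statistic
    eps=0.1,                                                       # assumed tolerance
    n_draws=5000,
    rng=rng,
)
print(len(samples), "accepted draws out of 5000")
```

Both `summary` and `eps` must be chosen by hand here: a smaller `eps` gives a sharper approximation but fewer accepted draws, which illustrates the tuning and scaling difficulties mentioned in the review.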