Paper ID: 1691
Title: Reflection, Refraction, and Hamiltonian Monte Carlo
Current Reviews

Submitted by Assigned_Reviewer_1

Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)


SUMMARY

Hamiltonian MCMC methods sample from a probability

distribution by treating its log as a "potential energy"

function over the state space, augmenting the space with extra

"momentum variables" and their associated "kinetic energy",

and evolving the state of the Markov process by integrating

the physical Hamiltonian equations of motion of the system.

Each step of the Markov chain is accomplished by numerically

integrating the Hamiltonian equations forward in time.

However, if the energy function is non-differentiable, the

integral is not well-defined.

The rejection step that is used

to counteract numerical inaccuracies in the integration also

accounts for such non-differentiable regions, but at the cost

of slowing down the mixing rate of the Markov chain.

This

paper suggests physically-inspired "reflections" and

"refractions" of the trajectory of the system that occur

whenever the state crosses a discontinuity in the energy

function.

It applies to target distributions that are

differentiable everywhere except on the boundaries of certain

polytopes; the reflection or refraction occurs whenever the

trajectory of the system crosses such a boundary.

Whether a reflection or refraction occurs depends on if the

momentum component of the state is sufficient to "climb" the

gap in energy.

The authors show that the necessary

volume-preservation properties are satisfied by these

reflection and refraction procedures, so that the Markov

chains converges to the required target distribution.

QUALITY

The presented modification to Hamiltonian MCMC is useful and

the case of piecewise-differentiable target distributions is

quite applicable in practice.

It would, of course, be useful

to have boundaries that are not affine hyperplanes.

CLARITY

The algorithm, its motivation, and the proof of the volume

preservation property is clearly presented.

Some more

(qualitative and/or quantitative) discussion about how the

reflections and refractions affect rejection rates would be

useful.

ORIGINALITY AND SIGNIFICANCE

The idea of applying physically-based reflection and

refraction to Hamiltonian MCMC methods appears to be novel and

useful.

While the present work proves its correctness under

certain conditions, it is an interesting avenue of research to

extend it further.

Q2: Please summarize your review in 1-2 sentences
A modification of Hamiltonian MCMC is presented to

handle probability distributions that are not differentiable on

the boundaries of polytopes.

The proposed solution is for the

trajectory of the system to be reflected or refracted when it

encounters such a boundary, thereby reducing rejection rates and

improving mixing speed.

Submitted by Assigned_Reviewer_2

Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
This paper presents and verifies (correctness and performance through simulation) a sensible variation on HMC for pdfs with discontinuities or on a bounded domain.

The method relies on simulating refraction and reflection that result from hamiltonian dynamics, so leads to better proposals around discontinuities (and higher acceptance rates => higher ess).

One thing that would be nice is to see an example of a pdf when it's not worth computing intersections.

What types of high dimensional pdfs is this method not suited for?
Q2: Please summarize your review in 1-2 sentences
This paper presents and verifies

Submitted by Assigned_Reviewer_3

Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
This is an interesting extension of the standard case and the proof of volume conservation presented is simple but important and was lacking in previous works. The writing is clear.

Typos: l.161:

",Then" l.376: "parameters, are tuned"
Q2: Please summarize your review in 1-2 sentences
This paper extends the leapfrog discretization used in Hamiltonian Monte Carlo to piecewise smooth distributions. The algorithm is correct and is an interesting extension of the smooth case.

Submitted by Assigned_Reviewer_4

Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
UPDATES AFTER REBUTTAL: Thank you for addressing my questions. The extra numerical results in the supplement are a welcome addition to the paper. I think that this is a good paper, but I will keep my score at 7.

ORIGINAL REVIEW: The paper proposes a modified leapfrog integrator for Hamiltonian Monte Carlo for sampling from piecewise continuous probability distributions, with affine discontinuity boundaries. The simulator makes use of reflection and refraction of the simulated trajectory whenever it hits a boundary. Similar ideas have previously been used for simulating from multivariate Gaussian distributions with constraints, but as far as I can tell this is the first application of this approach to "general" distributions (although, I'm not an expert on HMC). The idea is simple but intuitively appealing (most good ideas are simple!) and I think that this method can be very useful for a restricted class of problems. The numerical evaluation is not very extensive and the authors should consider extending it for a later (re)submission. Furthermore, there are several typos that need to be corrected.

Finally, can you comment on the possibility of using non-affine discontinuity boundaries? Apart from the computational issues associated with computing intersections, would it be possible to extend the approach to allow for more general boundaries?

Some detailed comments/questions:

- In section 3: in think that it could be useful to mention that K(p') = K(p) - \Delta U to explain the form of the refracted momentum. - In Equation (3): aren't the upper triangle (excluding the four elements in the upper left corner) also zero? - Line 363: Missing normalisation by k in the definition of WMAE. - Footnote 2 on page 7: I think that it would be good if you could elaborate on the sensitivity of the tuning parameters (L and eps) of your method to convince the reader. Perhaps include additional numerical result in a supplementary material?
Q2: Please summarize your review in 1-2 sentences
The paper proposes a modified leapfrog integrator for Hamiltonian Monte Carlo for sampling from piecewise continuous probability distributions, with affine discontinuity boundaries. The idea is simple but potentially very useful for a restricted class of problems. The numerical evaluation is not very extensive.

Author Feedback
Author Feedback
Q1:Author rebuttal: Please respond to any concerns raised in the reviews. There are no constraints on how you want to argue your case, except for the fact that your text should be limited to a maximum of 5000 characters. Note however, that reviewers and area chairs are busy and may not read long vague rebuttals. It is in your own interest to be concise and to the point.
We would like to thank the reviewers for their helpful and constructive comments.

R1>
1. RE More quantitative/qualitative discussions of the effect of reflections and refractions on the rejection rate:

We have provided more numerical results accessible by the following anonymous link:

https://drive.google.com/file/d/0B9SfUIl9GRZHWWRGOWRwTkx1UmM/view?usp=sharing

or in case it does not work, by the following anonymous link:

http://www.keepandshare.com/doc2/85678/supplimentary-pdf-4-7-meg?da=y

In the last pages of the uploaded file there are three tables that represent the rate of reflections and refractions (as long as rejections) with respect to several different tunings. It also contains a brief discussion on the key role that reflection plays in the sampler's performance in high dimensions.

***

R2>
1. We will fix the typos. Thanks for mentioning them.

***

R3>
1. RE more numerical evaluation:

In the anonymous link mentioned in R1.1 we have provided more numerical evaluations (as suggested).

2. RE elaborating on the sensitivity of the tuning parameters (L and eps) of our method.

In the uploaded supplementary material, our method and the baselines are tested on all combinations of parameter L being in {20, 50, 100, 200}; epsilon being in {0.05, 0.1, 0.2, 0.4} and the model dimension be in {2, 10, 50}.
It is already well known that basic HMC is quite sensitive to tunings (see e.g. the paper's citation [7]).
This is verified by the provided supplementary plots. In contrast, reflective HMC is not so: On the tested parameters, our method is never worse than HMC and in most cases (particularly, in high dimensions) it performs considerably better.

3. RE the possibility of using non-affine discontinuity boundaries:

In this case, it can be shown that analogous to curved mirrors, the volume preservation may be violated (due to magnification). However, despite this, we hope that (with some modifications) we can design a reflection/refraction based HMC that even in the mentioned case, converges to the correct stationary distribution. We cannot be sure whether that is possible but do consider the prospects for extending the work beyond affine borders.

4. In Equation (3): aren't the upper triangle (excluding the four elements in the upper left corner) also zero?

Not all of them. For instance, in the 3rd row and 4th column:

(d q'_2) /(dp_2) = tau.

5. Thanks for the other detailed comments/typos. We will implement/fix them.


***

R4>
1. RE "the definition of the time period is not intuitive":

The total leapfrog full-step time is epsilon. We divide it into arbitrary smaller periods,

Epsilon = tau1 + tau2 +...
such that in each period tau_i, at most one reflection or refraction occurs. According to lemmas 1 or 2 (or in case there is no collision, just a shear transformation) in each period, tau_i, the volume is preserved. Therefore it is preserved for the total time, epsilon, also.

2. RE "it is questionable that the total energy is preserved during the refraction":

Consider a collision to a hyperplane q1 = c (since any collision can be transformed to this case via rotation).
In this case, only the momentum element associated with the first dimension changes.
Let p1' and p1'' be such momentum elements just before and after collision.
In order to preserve the total energy we equate the Hamiltonian
(that is, H(q,p) = U(q) + K(p) where k(p) = (p^2)/2) right before and after the collision:

0.5*(p1')^2 = (\Delta U) + 0.5*(p1'')^2

Therefore in order to preserve the total energy, the momentum just after the collision should be:

(p1'') = SQRT{p1' - 2(\Delta U)}

As used in the paper.
We did not provide the proof in the paper since it is already stated in Pakman's papers [10] and [11]. However, we will either mention it in the paper or provide the above proof explicitly.

***

R5>
1. What types of high-dimensional pdfs is this method not suited for?

According to the discussion provided in the anonymous link (mentioned in R1.1), we believe that if the high dimensional density is truncated or has external boundaries, our method is always suitable.
But if the pdf does not have a finite support (i.e. is not surrounded by a 0-probability space) and it has a large enough number of internal boundaries and the potential gaps at the boundaries are sufficiently small, It would eventually be better off just ignoring them using regular HMC.