NeurIPS 2020

Linear Time Sinkhorn Divergences using Positive Features

Meta Review

This paper proposes the combination of Sinkhorn's algorithm with cost functions of the form -log <\phi(x), \phi(y)> where \phi is an embedding into nonnegative vectors. The motivation is that when the embedding vectors \phi(x) live in R^r and r is small, the Sinkhorn iteration can be computed very quickly (b/c the relevant matrix is rank r). There are some issues with practical applications given that the actual quantitative bounds for approximating a cost function using nonnegative features. For e.g., even for Gaussian kernels, the rank turns out to be exponential in the dimension so it's not clear if the methods are practical. Despite this, the reviewers found the research direction important and were convinced of the importance of the algorithm proposed in the work. I am pleased to recommend acceptance.