Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Originality: the main sources of originality here appear to be Prop 2, as well as the use of properties of the Wasserstein distance to get sharper results (across all the contributions); I have some more comments on this point, later on.

Quality: the quality seems good. I skimmed the proofs and they made sense.

Clarity: the paper is reasonably well-written; I have some more comments on this point, later on.

Significance: the paper seems like a good step forward in terms of giving concentration inequalities for spectral risk measures. The main result(s) are not all that surprising, and the math involved behind-the-scenes doesn't appear to be exceptionally difficult, but the results still seem valuable/worthwhile.
This paper is clearly written and easy to read. However, I have some questions about its contribution. I wonder why evaluating the estimation error via the Wasserstein distance is a significant problem. While CVaR is a very important measure, it has no clear connection to the characteristics of the Wasserstein distance. The authors merely mention that "it is interesting to know" (line 38), which is insufficient to establish the importance of their goal. The technical contribution also seems limited: given the known formulation of CVaR in (6), the error bound follows from a simple application of the concentration inequality for sub-Gaussian random variables. To strengthen the technical contribution, the authors should investigate theoretical aspects of the error bound, such as the convergence rate, adaptivity, and so on.

====== UPDATED AFTER REBUTTAL ======

Thanks to the authors' rebuttal, I now understand that there are connections between the risk measure and the Wasserstein distance, so I have updated my score from 5 to 6. However, I feel it is not an easy task for readers to grasp this connection from the main text alone. Although it is always difficult to cover everything in the short space of a conference paper, I would be glad to see a more intuitive description.
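To make the point about the "known formulation of CVaR" concrete: I am assuming here that (6) is the standard Rockafellar-Uryasev representation, CVaR_alpha(X) = min_c { c + E[(X - c)_+] / (1 - alpha) }, whose empirical plug-in version is a simple average over the tail of the sorted samples. A minimal sketch (not the paper's exact estimator) of this plug-in estimate:

```python
import numpy as np

def empirical_cvar(samples, alpha):
    """Plug-in CVaR estimate via the Rockafellar-Uryasev formulation:
    CVaR_alpha(X) = min_c { c + E[(X - c)_+] / (1 - alpha) }.
    For the empirical distribution the minimizing c is the empirical
    alpha-quantile (VaR), so the estimate averages the tail beyond it."""
    x = np.asarray(samples, dtype=float)
    var = np.quantile(x, alpha)  # empirical VaR, the minimizing c
    return var + np.mean(np.maximum(x - var, 0.0)) / (1.0 - alpha)

rng = np.random.default_rng(0)
samples = rng.normal(size=100_000)  # sub-Gaussian losses, as in the paper's setting
est = empirical_cvar(samples, 0.95)  # roughly 2.06 for a standard normal
```

Since `(X - c)_+` of a sub-Gaussian variable is itself sub-Gaussian, the concentration of `est` around the population CVaR indeed follows from standard sub-Gaussian tail bounds, which is why I would like to see what goes beyond that.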
The key insight of the proposed method is to relate the estimation error to the distance between the empirical and the true distribution, where the distance is the Wasserstein distance. This is a fairly original approach, as far as this reviewer knows. The page limit of NeurIPS puts highly theoretical papers such as this one at a serious disadvantage. As a result, it is quite hard for someone who is not an expert in the area to grasp what is going on in the paper, because very little guidance is given (as there is no space for it), and very little intuition is successfully communicated to the reader. While this is very hard to achieve given the space limitation, perhaps some nuggets of intuition could be made evident here and there. The contribution of the paper seems quite significant, and the technique quite novel, although there appears to be a very strong reliance on existing concentration bounds. It would be helpful to clarify how much of the proposed results go beyond an application of existing bounds (the work would still be an interesting contribution either way, but the clarification would help in better assessing its significance).
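To sketch the intuition the reviewers are asking for (this is a generic fact about CVaR and W1, not necessarily the paper's exact argument): CVaR_alpha is 1/(1-alpha)-Lipschitz with respect to the 1-Wasserstein distance, so any concentration bound on W1 between the empirical and true distributions transfers directly to the risk estimate. In one dimension, W1 between two equal-size samples is just the mean absolute difference of their order statistics, so the bound can be checked numerically:

```python
import numpy as np

def w1_empirical(x, y):
    """1-Wasserstein distance between two equal-size empirical
    distributions in 1D: the optimal coupling matches order
    statistics, so W1 is the mean absolute sorted difference."""
    return np.mean(np.abs(np.sort(x) - np.sort(y)))

def empirical_cvar(samples, alpha):
    # Plug-in CVaR: empirical VaR plus the normalized mean exceedance.
    x = np.asarray(samples, dtype=float)
    var = np.quantile(x, alpha)
    return var + np.mean(np.maximum(x - var, 0.0)) / (1.0 - alpha)

rng = np.random.default_rng(1)
x = rng.normal(size=10_000)           # "true" sample
y = rng.normal(loc=0.1, size=10_000)  # shifted "empirical" sample
alpha = 0.95
gap = abs(empirical_cvar(x, alpha) - empirical_cvar(y, alpha))
bound = w1_empirical(x, y) / (1.0 - alpha)
# gap should never exceed the Lipschitz bound W1 / (1 - alpha)
```

This also illustrates the reviewers' shared concern: once the Lipschitz relation is in place, the heavy lifting is done by existing W1 concentration results, so the paper's own contribution should be delineated against them.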