Sun Dec 8th through Sat Dec 14th, 2019, at the Vancouver Convention Center
The paper is very well written, making the sometimes dry topic of losses easy to follow and engaging. I especially want to mention that the authors did a great job of providing intuitive interpretations of most definitions, which many readers will surely appreciate. However, I think there is room for improvement in further clarifying terms like "reports" and "properties", which will most likely be new to a significant part of the audience. In terms of technical contribution, the paper provides a fresh look at the construction of surrogate losses from the point of view of embeddings, proving the interesting result that every discrete loss can be embedded into a polyhedral loss, as well as the converse. Beyond its conceptual appeal, the power of this idea is exemplified by reworking existing results and deriving new ones on the consistency and calibration of surrogate losses. The proofs are solid, and overall the paper is a valuable addition to the literature on losses and surrogates.
This work considers the relationship between convex surrogate losses and learning problems such as classification and ranking. The authors embed each of the finitely many predictions (e.g., classes) as a point in R^d, assign the original loss values to these points, and convexify the loss in between to obtain a surrogate. The authors prove that this approach is equivalent, in a strong sense, to working with polyhedral (piecewise-linear convex) losses, and give a construction of a link function through which L is a consistent surrogate for the loss it embeds. Some examples are presented to verify the theoretical analysis. This is an interesting direction in learning theory, but I have some concerns:
1) What is the motivation for polyhedral losses? The authors should present some real applications and show their importance, especially for new learning problems and settings.
2) It would be better if the authors focused on a specific learning setting such as classification, AUC, or classification with rejection. The submission covers various settings, and a deep treatment of any one specific problem is lacking.
3) It would be better to present some intuitive explanation of Definitions 1-3 to help readers grasp their meaning.
4) There are some known results in Section 5; the authors should present some new real applications of surrogate losses.
5) It would be better to present some regret bounds relating the surrogate losses and the polyhedral losses.
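The embedding construction summarized above can be made concrete with the classic binary case: the hinge loss is polyhedral and, up to a factor of 2, agrees with the 0-1 loss at the embedded prediction points ±1 in R, with the sign function acting as the link back to a discrete prediction. The sketch below is my own illustration of this idea (the function names and the factor-of-2 scaling are mine, not taken from the paper under review):

```python
# Illustration: the hinge loss max(0, 1 - u*y) is polyhedral
# (piecewise-linear convex) and matches twice the 0-1 loss at the
# embedded points u = +1 and u = -1, with sign(u) as the link.

def hinge(u, y):
    """Polyhedral surrogate; real report u, outcome y in {-1, +1}."""
    return max(0.0, 1.0 - u * y)

def zero_one(r, y):
    """Discrete target loss: 1 if the predicted class r differs from y."""
    return 0.0 if r == y else 1.0

def embed(r):
    """Embed the discrete prediction r in {-1, +1} as the real point r."""
    return float(r)

def link(u):
    """Map a real-valued report back to a discrete prediction."""
    return 1 if u >= 0 else -1

# At the embedded points, the surrogate equals 2 * (0-1 loss) for
# every outcome, which is the embedding condition (up to scaling).
for r in (-1, 1):
    for y in (-1, 1):
        assert hinge(embed(r), y) == 2 * zero_one(r, y)

# The link recovers each discrete prediction from its embedded point.
assert link(embed(1)) == 1 and link(embed(-1)) == -1
```

The same pattern (embed the discrete reports, match the loss values there, convexify in between, and link back) is what the paper develops in general dimension.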
******** After author response ********
I thank the authors for answering my questions. I keep my evaluation and vote for accepting the paper.
***************************************

Originality: This work provides a novel approach for designing convex surrogates for general discrete losses and analyzing their consistency. Using the new analysis framework, the authors give a negative answer to the open question of the consistency of the Lovász hinge. They also offer an original understanding of the top-k loss, showing that the convex loss of [LHS15] is consistent with a discrete loss that differs slightly from the discrete top-k loss.

Quality: I have not gone through the appendix, but the proofs and arguments in the main paper are sound and clear.

Clarity: The paper is very clearly written and easy to understand.

Significance: Convex surrogate losses are broadly used in machine learning, so understanding how to design such surrogates systematically is meaningful. The analysis of the top-k losses and the Lovász hinge helps practitioners understand these losses better and gives them insight for choosing the right loss.

[LHS15] Lapin, M., Hein, M., and Schiele, B. Top-k Multiclass SVM. NIPS 2015.