NeurIPS 2020

### Review 1

Summary and Contributions: The paper proposes TreeRep, an algorithm that constructs a tree-metric approximation of a Gromov hyperbolic metric. This algorithm runs thousands of times faster than existing approaches for embedding into hyperbolic manifolds.

Strengths: The algorithm is interesting and a (I think) novel approach for metric learning in this space. The paper is cleanly written and well motivated. The algorithm is relatively easy to understand and comes with some nice theoretical guarantees.

Weaknesses: The relationship of the method to hyperbolic space is not obvious. It seems to me that recovering a tree from a metric isn't inherently a hyperbolic-space thing. Sure, if you then proceeded to embed the tree into hyperbolic space using Sarkar's construction, that would make the relationship clear...but as far as I can tell this wasn't done. The paper would be a stronger contender in this category if the method were used to produce hyperbolic embeddings that were then used for some downstream task (such as MAP) rather than just evaluating distortion.

Correctness: As far as I can tell, the claims are correct and the method is valid. The empirical methodology looks to be the correct one, although it is limited to only looking at distortion (rather than also evaluating other downstream task metrics for an embedding).

Clarity: The paper is fairly clear, although I think it would be better to cut down on the definitions on page 3 (right now it reads as a list-of-definitions section). E.g. definition 4 could be cut as it appears to not really be used. Or, definition 7 is well-known and we can just assume people know what a geodesic is. You could then spend the space on having an actual algorithm block for TreeRep, which would make it easier to follow what you are doing.

Relation to Prior Work: The related work section seems complete and detailed. I have some concerns about the comparison to Sala et al [36]. Sala et al [36] presents an algorithm, h-MDS (their Algorithm 2) which gives guaranteed-exact recovery of a metric embeddable exactly in hyperbolic space. The fact that [36] gives such an algorithm with guarantees makes the description in this paper "They do not, however, come with rigorous geometric guarantees of the optimal solutions" seem kinda misleading as applied to [36]. More worryingly, although the empirical experiments in this paper seem to compare to [36], they do not appear to try h-MDS at all: even though h-MDS should get zero distortion on its random-points-on-a-hyperbolic-manifold test. It is not clear why this comparison was not done.

Reproducibility: Yes

Additional Feedback: Your graph legend formatting in Figure 2 is messed up.

### Review 2

Summary and Contributions: In this paper, the authors preset a new method for learning hyperbolic representations. The authors present an algorithm, TreeRep, to learn a tree structure on the data as an intermediate step before determining the hyperbolic embedding.

Strengths: The paper and and its appendix include theoretical guarantees that their algorithm does return a tree and several structural lemmas. Empirically, their algorithm also has lower average distortion and higher mean average precision than previous algorithms.

Weaknesses: My only concern is that I'm not sure if it will be something the larger NeurIPS community might be interested in.

Correctness: I read the relevant proofs of Theorem 1 and it seemed correct to me.

Clarity: The paper is well-written. The definitions and problems are clearly stated as well as the theorem and the proof. My only comment is that I did not like the sentence "Furthermore the algorithm is embarrassingly parallelizable" as part of the theorem. It probably should be a remark when the implementation is discussed.

Relation to Prior Work: Yes.

Reproducibility: Yes

### Review 3

Summary and Contributions: The authors propose an efficient algorithm to compute low-dimensional hyperbolic embedding of the data via a tree structure.

Strengths: The authors provide extensive experiments to evaluate their method.

Weaknesses: The logic of writing and typos make the paper somehow confusing. A clearer problem statement and motivations are expected.

Correctness: The problem itself is worth studying and most of the statements are correct.

Clarity: I have to say some parts of the paper is really confusing to follow. I like the topic and the solid algorithm work, but it is better if the authors can refine this paper and give a clearer explanation and comparisons.

Relation to Prior Work: Many related work is mentioned. But I cannot say they are discussed in a clear way.

Reproducibility: Yes

Correctness: In Definition 5 (p.3, line 97-99), I guess $d(( e, t_1 ), ( e, t_2 )) = w(e) | t_1 - t_2 |$, since in the current definition $d(( e, t ), ( e, t ))$ becomes infinite. Also, when constructing the metric graph X, you should quotient as X = E\times[0,1]/\sim$, where (e_1, 0) ~ (e_2, 0) if e_1 = (v, v_1) and e_2 = (v, v_2) (and similar for (e_1, 0) ~ (e_2, 1), (e_1, 1) ~ (e_2, 0), (e_1, 1) ~ (e_2, 1) as well). And you should also define how the distance should be defined between ( e_1, t_1 ) and ( e_2, t_2 ). Also in Supplement, p.5, line 534: "Let$x, y \in X$and let$r \in g(x, y)$" -> "Let$x, y \in X$, then$r \in g(x, y)$" Clarity: I think the paper is overall clearly written, except that some definitions and statements are not fully clearly written as I mentioned in Correctness. Relation to Prior Work: I think the comparison to related work is well discussed in Section 1. Reproducibility: Yes Additional Feedback: Minor typos: p.2, line 59: "take as an input graphs, not metrics" -> "take as input graphs, not metrics" p.3, line 95: "a a weighted tree" -> "a weighted tree" p.3, line 116: "a hyperbolic representations" -> "hyperbolic representations" p.4, line 130: "that that" -> "that" p.4, line 155: "let us defining" -> "let us define" p.7, line 254: "the we" -> "we" Supplement, p.4, line 525: "let us defining" -> "let us define" Supplement, p.17, line 617: "$\pi y, \pi z$" -> "$\pi y, \pi z\$." I read the authors' feedback and I agree with their comments, so I increase the overall score to 6. As I have suggested, I think it would be helpful to highlight how the suggested method differs from and is better than existing approaches.