__ Summary and Contributions__: The paper proposes TreeRep, an algorithm that constructs a tree-metric approximation of a Gromov hyperbolic metric. This algorithm runs thousands of times faster than existing approaches for embedding into hyperbolic manifolds.

__ Strengths__: The algorithm is interesting and a (I think) novel approach for metric learning in this space. The paper is cleanly written and well motivated. The algorithm is relatively easy to understand and comes with some nice theoretical guarantees.

__ Weaknesses__: The relationship of the method to hyperbolic space is not obvious. It seems to me that recovering a tree from a metric isn't inherently a hyperbolic-space thing. Sure, if you then proceeded to embed the tree into hyperbolic space using Sarkar's construction, that would make the relationship clear...but as far as I can tell this wasn't done. The paper would be a stronger contender in this category if the method were used to produce hyperbolic embeddings that were then used for some downstream task (such as MAP) rather than just evaluating distortion.

__ Correctness__: As far as I can tell, the claims are correct and the method is valid. The empirical methodology looks to be the correct one, although it is limited to only looking at distortion (rather than also evaluating other downstream task metrics for an embedding).

__ Clarity__: The paper is fairly clear, although I think it would be better to cut down on the definitions on page 3 (right now it reads as a list-of-definitions section). E.g. definition 4 could be cut as it appears to not really be used. Or, definition 7 is well-known and we can just assume people know what a geodesic is. You could then spend the space on having an actual algorithm block for TreeRep, which would make it easier to follow what you are doing.

__ Relation to Prior Work__: The related work section seems complete and detailed.
I have some concerns about the comparison to Sala et al [36]. Sala et al [36] presents an algorithm, h-MDS (their Algorithm 2) which gives guaranteed-exact recovery of a metric embeddable exactly in hyperbolic space. The fact that [36] gives such an algorithm with guarantees makes the description in this paper "They do not, however, come with rigorous geometric guarantees of the optimal solutions" seem kinda misleading as applied to [36]. More worryingly, although the empirical experiments in this paper seem to compare to [36], they do not appear to try h-MDS at all: even though h-MDS should get zero distortion on its random-points-on-a-hyperbolic-manifold test. It is not clear why this comparison was not done.

__ Reproducibility__: Yes

__ Additional Feedback__: Your graph legend formatting in Figure 2 is messed up.

__ Summary and Contributions__: In this paper, the authors preset a new method for learning hyperbolic representations. The authors present an algorithm, TreeRep, to learn a tree structure on the data as an intermediate step before determining the hyperbolic embedding.

__ Strengths__: The paper and and its appendix include theoretical guarantees that their algorithm does return a tree and several structural lemmas. Empirically, their algorithm also has lower average distortion and higher mean average precision than previous algorithms.

__ Weaknesses__: My only concern is that I'm not sure if it will be something the larger NeurIPS community might be interested in.

__ Correctness__: I read the relevant proofs of Theorem 1 and it seemed correct to me.

__ Clarity__: The paper is well-written. The definitions and problems are clearly stated as well as the theorem and the proof. My only comment is that I did not like the sentence "Furthermore the algorithm is embarrassingly parallelizable" as part of the theorem. It probably should be a remark when the implementation is discussed.

__ Relation to Prior Work__: Yes.

__ Reproducibility__: Yes

__ Additional Feedback__:

__ Summary and Contributions__: The authors propose an efficient algorithm to compute low-dimensional hyperbolic embedding of the data via a tree structure.

__ Strengths__: The authors provide extensive experiments to evaluate their method.

__ Weaknesses__: The logic of writing and typos make the paper somehow confusing. A clearer problem statement and motivations are expected.

__ Correctness__: The problem itself is worth studying and most of the statements are correct.

__ Clarity__: I have to say some parts of the paper is really confusing to follow. I like the topic and the solid algorithm work, but it is better if the authors can refine this paper and give a clearer explanation and comparisons.

__ Relation to Prior Work__: Many related work is mentioned. But I cannot say they are discussed in a clear way.

__ Reproducibility__: Yes

__ Additional Feedback__: (1) In Abstract, you mentioned "rather than determining the low-dimensional hyperbolic embedding directly, we learn a tree structure...", but the title of your paper is "Tree! I am no Tree! I am a low dimensional hyperbolic embedding". This is confusing, please clarify what do you want to express.
(2) In Introduction, you mentioned your method is "extremely fast algorithm" in learning a tree structure to approximate the metric. I see Table 1, 2 and 3 contain the results of testing computational efficiency. But it seems the efficiency is not that impressive. Please give explanations on that statement in Introduction.
(3) I notice you bring up Yu and De Sa's work in NeurlPS2019, I wonder if a comparison of their method and yours can be made, since you are towards solving a similar issue.
(4) what are T hat and dThat mean in Lemma 2? "Then we can extend..." This sentence on line 152 is confusing.
(5) Definition 10 is confusing. "Let us defining the following two zone types", but three items are listed. please clearly define zone1 and zone2
########################
Thanks the feedback from the author. I like what the authors want to deliver in this paper, but I would not change the score. Logically, this submission required to be further clarified about the problem statement, motivation and contributions. I was confused by several expressions in the paper (eg. my comment 4 and 5), especially the introduction part, but still I think it is unclear to well follow the paper after reading the feedback. I understand the limitation of feedback pages would force the authors to give up answering some questions, but I also believe more work is supposed to be done before submitting it officially. Besides the logic and expression of the paper, I don't think the authors have well discusses about the prior work. Discussions about Yu and De Sa’s work in feedback does not make sense to me. Since the authors have pointed out the location of the code, I changed the reproducibility to Yes.

__ Summary and Contributions__: This paper suggests an algorithm TreeRep that, take a delta-hyperbolic metric space as an input and learns a tree structure with low distortion. For the case when delta = 0, TreeRep has a theoretical guarantee that the output tree recovers the input metric. Also, it is empirically shown that TreeRep is faster and has lower distortion than previous algorithms.

__ Strengths__: As far as I checked, the theoretical claims in this paper are solid, and also experimentally verified. I think practitioners will find TreeRep useful as a method to embed data into trees.

__ Weaknesses__: TreeRep provides a guarantee that when tree is given as an input then the output recovers the tree metric, but for general delta-hyperbolic metric space as an input, Proposition 1 provides a bound of the distortion given 3 points but not giving the global bound on the distortion. I think it should be clearly discussed or mentioned that how the result in Proposition 1 is different from the global bound on the distortion and how it can be possibly extended to give a bound for the global bound on the distortion. Also, in the experiments, NJ outperforms TreeRep in the hyperbolic manifold dataset and TreeRep outperforms NJ in the biological data in terms of the distortion. So I think it would be helpful if the authors can give some intuition on under what situations TreeRep outperforms other tree construction methods.

__ Correctness__: In Definition 5 (p.3, line 97-99), I guess $d(( e, t_1 ), ( e, t_2 )) = w(e) | t_1 - t_2 |$, since in the current definition $d(( e, t ), ( e, t ))$ becomes infinite. Also, when constructing the metric graph X, you should quotient as X = E\times[0,1]/\sim$, where (e_1, 0) ~ (e_2, 0) if e_1 = (v, v_1) and e_2 = (v, v_2) (and similar for (e_1, 0) ~ (e_2, 1), (e_1, 1) ~ (e_2, 0), (e_1, 1) ~ (e_2, 1) as well). And you should also define how the distance should be defined between ( e_1, t_1 ) and ( e_2, t_2 ).
Also in Supplement, p.5, line 534: "Let $x, y \in X$ and let $r \in g(x, y)$" -> "Let $x, y \in X$, then $r \in g(x, y)$"

__ Clarity__: I think the paper is overall clearly written, except that some definitions and statements are not fully clearly written as I mentioned in Correctness.

__ Relation to Prior Work__: I think the comparison to related work is well discussed in Section 1.

__ Reproducibility__: Yes

__ Additional Feedback__: Minor typos:
p.2, line 59: "take as an input graphs, not metrics" -> "take as input graphs, not metrics"
p.3, line 95: "a a weighted tree" -> "a weighted tree"
p.3, line 116: "a hyperbolic representations" -> "hyperbolic representations"
p.4, line 130: "that that" -> "that"
p.4, line 155: "let us defining" -> "let us define"
p.7, line 254: "the we" -> "we"
Supplement, p.4, line 525: "let us defining" -> "let us define"
Supplement, p.17, line 617: "$\pi y, \pi z$" -> "$\pi y, \pi z$."
I read the authors' feedback and I agree with their comments, so I increase the overall score to 6. As I have suggested, I think it would be helpful to highlight how the suggested method differs from and is better than existing approaches.