NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID: 2695 Spectral Modification of Graphs for Improved Spectral Clustering

### Reviewer 1

Summary: The paper proposes to modify graphs in order to improve the quality of spectral partitioning on them. The quality is governed by the graph spectrum, and the proposed approach is to change the spectrum while approximately preserving cut values. This is achieved in two steps: first use Racke's embedding of the graph into a cut-preserving tree with extra nodes (this step apparently improves the spectrum), then remove the extra nodes by computing the graph Schur complement, an operation which fully preserves the spectrum including the cut values. The resulting graph has approximately the same cut values and improved spectrum. Since both operations are too costly to be considered practical, the paper proceeds to suggest a heuristic variant of the method. Finally several small experimental visualizations are shown to display the advantage of the proposed method. Evaluation: Ultimately the main idea paper is a direct "black-box" invocation of two known techniques (Racke's tree cut sparsifiers and vertex sparsification by Schur complementation), but it seems potentially useful, and the examples provided in the paper give a clear and crisp proof-of-concept for its advantage. The main weakness of the paper is the lack of empirical evaluation. Given that the paper concedes that the theoretical variant of the method is impractical, and suggests an alternative heuristic version, it seems unfortunate that there is no principled experimental evaluation on even moderate-sized graphs. The few examples given in the paper are small, qualitative and somewhat arbitrary (as opposed to using standard benchmark input graph, comparison to baselines and other methods, principled choice of parameters, numerical evaluation of performance, etc). Put simply, it is not quite clear what to make of an algorithm that has neither firm theoretical guarantees nor certified empirical advantage. In conclusion, the paper suggests a nice and simple way to enhance the quality of spectral partitioning and makes a case for its advantage, then discusses practical considerations, but stops short of substantiating the empirical usefulness of its method. Update post-rebuttal: I thank the authors for their detailed response. I do not find it adequately addresses the concerns raised about empirical evaluation. It argues that their theoretical analysis renders thorough experiments unnecessary; however, as previously noted, their primary algorithm is at present supported by neither proofs (only intuition) nor experiments (only proof-of-concept). As a result, I maintain my score.

### Reviewer 2

Overall, I feel the paper is written in a rush and there are many problems in notations and presentation. More importantly, there seem to be a few flaws or statements without sufficient justification, which makes the paper theoretically unsolid. • \phi is used in the introduction, but it is not defined until section 2 • Could you please use [] instead of () for references? The formula are also indexed by (), which is confusing. • Definition 3.2 should be revised a bit. It has two “for all $S \subset V$” • Definition 3.3 should also be slightly revised: there are grammar mistakes and you never specify which quantity is the maximizer, though I guess it should be H. • Since the spectral maximizer depends on the cut approximator, so I think it should be defined to be \alpha-spectral maximizer” with a corresponding parameter \alpha, correct? But seems H is never associated with such a parameter. Please clarify. • What do mean by “not even using a subset of V” while requiring “two graphs on the same vertex set” on line 131 • Theorem 3.2 is confusing. What is S? • Section 3.3 is titled as “Cheeger Inequalities for Spectral Maximizers”. But in theorem 3.3, you only show the existence of such H, but there is no statement whether the H is the spectral maximizer of G. And the Corollary 3.1 depends on Theorem 3.3. So at the end, I don’t see the connection between the results in Section 3.3 and the spectral maximizer in Section 3.2 • It is discussed at the end of page 4, that we should assume the diameter of T to be \tilde{O}(1). I don’t see a rigorous justification of this statement, unless I missed something. • The algorithm does not have theoretical guarantee for claimed properties. The authors addressed most of my questions. However, the correspondence between the existing H and the spectral maximizer is still not clear to me. Perhaps after the authors add clarification, this can be resolved. The theoretical guarantee for the algorithm is not available still, which seems to me is important. Given the current result, I will change my evaluation to 6.

### Reviewer 3

In this work the authors study the following problem: given an input graph G, define as Cut(G) the set of all graphs with the same cut structure. The goal is to find a graph H within Cut(G) that dominates spectrally G. The idea is that applying spectral clustering on H will result in better outputs, and will avoid the pathological performance of spectral clustering which actually occurs in real-world datasets. The author(s) prove that Racke's trees serve this purpose. Interestingly, sparse spectral maximizers also exist; just sparsify if needed the maximizer. The algorithm is simple to implement, and can potentially have broader impact. The weakest part in this work are the experiments. It would be interesting to demonstrate better the impact of the method.