Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Post-rebuttal: Thanks for the response. Would be great to include the rebuttal discussion in the final version of this work. >>>>> This paper proposes a new framework for certified robustness of graph neural networks. The task under consideration is node classification and the threat model allows for addition and removal of edges under local (node-specific) and global upper bounds on the permissible perturbations. Under this threat model, the paper derives efficient algorithms for obtaining robustness certificates using a suitably designed MDP. The margin formulation can further be used to derive robust losses for training. The problem is well-motivated and significant, the writing is clear for most parts, and the theoretical contributions are excellent! The robust training algorithm and experimental validation further lends support to the proposed framework. I haven't checked the proofs in great detail, but overall I enjoyed reading the paper and believe it is a significant contribution to the field. Some comments/questions for the authors: - The theoretical exposition crucially hinges on the fact that the GNN logits for the pre-final layer (and consequently the margin) is linear in the propagating vector (e.g., PageRank scores). This is not true for most GNNs used (e.g., GATs, GCNs). Could the authors shed light on the difficulties in analyzing robustness of non-linear propagation schemes under similar threat models? - The experiments for robust training are done in a transductive setting, i.e., when the full graph is visible. How would the robust training results look for inductive semi-supervised node classification (test graph contains more nodes and edges than train graph)?
The authors develop a framework for adversarial robustness in graph neural networks where the adversary can change a budgeted number of edges. The adversary is constrained such that they cannot change more than a budgeted number of edges as well as a certain number of edges from the same node. The authors then want to certify the robustness of the graph neural network from robustness under this threat model of [Klicpera et al., 2019]. This model uses the "predict then propagate" framework: node features are learned in isolation then they are "propagated" by weighting the local features by the personalized PageRank. Thus the graph structure is used in a simpler way than other graph neural networks. When there is only the "local" budget, the authors develop an exact algorithm. Their approach is based on Markov decision processes similar to (Fercoq et al., 2010). When there is a global budget, the authors write the problem as a quadratic program and relax the problem using the "Reformulation Linearization Technique". They show that many constraints in the relaxed problem can be simplified to a single linear inequality. Finally they experimentally verify their methods. They show that certain aspects improve robustness, such as increasing the teleport probability in personal PageRank. The authors mention that [Klicpera et al., 2019] do not use personalized PageRank, however I believe that they do. To improve clarity, the authors should move the description of the reformulation linearization technique in the appendix to the main paper. The authors write "the SDP-relaxation  is not suitable for our problem since the constraints are trivially fulfilled". It would be interesting and helpful to explain why. How long does the method based on the reformulation linearization technique take to run? Further, how is the certificate quality when there are no local constraints, only global constraints? Is it possible to show that adding the global constraints makes the problem NP-hard? It might be possible to also solve this problem optimally. *** Post Rebuttal *** Thank you for including the NP-hardness reduction in the rebuttal.
Originality: The problem the paper study is innovative and interesting. However, the techniques the paper relies are variation and combination of existing methods. Quality: The paper is technically sound with proofs for all the claims. The paper is reproducible as the authors provide code and detailed description of experiments. Clarity: The paper is well written and very easy to follow. Significance: The applicability of the method is kind of limited. The method and analysis are limited to a very specific type of GNN. On the contrary, the most widely applied GNN method are based on message-passing and convolution like GraphSage. The techniques in this paper is very hard to generalize beyond the PPNP model.