NeurIPS 2020

Uncertainty Aware Semi-Supervised Learning on Graph Data


Review 1

Summary and Contributions: This paper proposes a multi-source uncertainty framework for GNNs that models various types of uncertainty. The authors study multiple types of uncertainty from the deep learning and belief/evidence theory domains for node classification predictions. They validate the presented model against existing benchmarks on six real network datasets for OOD detection and misclassification detection.

Strengths: Pre-existing work on GNNs has not considered the concept of uncertainty associated with predicted probabilities, which can reduce the risk of misclassification. One of the good contributions of the paper is that the authors theoretically analyze the relationships between different types of uncertainty.

Weaknesses: Even though this is an interesting setting and the technical solutions presented in the paper look reasonable, the idea seems fairly incremental, as it stacks multiple existing techniques without much innovation. For me, it is a bit hard to advocate acceptance yet.

Correctness: The described technical details seem to be correct, yet the proposed methodology is somewhat hard to understand and needs to be clarified.

Clarity: This paper is written well in general.

Relation to Prior Work: This paper thoroughly analyzes the differences among the multiple types of uncertainty introduced in pre-existing work.

Reproducibility: Yes

Additional Feedback: Can the authors provide a more detailed rationale for using a Dirichlet distribution to model the distribution over class probabilities? [Additional comments] I carefully read through the authors’ feedback and came to realize that the proposed framework can be significantly useful for modeling various types of uncertainty, and that the experimental results and analysis are reasonable. So, I increased my score from “marginally below the acceptance threshold” to “marginally above the acceptance threshold”.
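
For context, a minimal sketch of the usual evidential rationale for this modeling choice (my reading, not necessarily the authors’ argument): the Dirichlet is the conjugate prior of the categorical likelihood, so its parameters can be read as pseudo-counts of class evidence, and its concentration encodes how much evidence backs the predicted probabilities.

```latex
% Dirichlet over the class simplex, with evidence e_k >= 0 per class:
\[
\mathrm{Dir}(\mathbf{p}\mid\boldsymbol{\alpha})
  = \frac{\Gamma(S)}{\prod_{k=1}^{K}\Gamma(\alpha_k)}
    \prod_{k=1}^{K} p_k^{\alpha_k-1},
\qquad \alpha_k = e_k + 1,
\qquad S = \sum_{k=1}^{K}\alpha_k,
\]
\[
\mathbb{E}[p_k] = \frac{\alpha_k}{S},
\qquad
u\ (\text{vacuity}) = \frac{K}{S}.
\]
```

A point prediction of class probabilities cannot distinguish “confidently uniform” from “no evidence at all”, whereas a Dirichlet with small S makes the latter explicit.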


Review 2

Summary and Contributions: The paper formulates GNNs to output the parameters of a Dirichlet distribution over class probabilities instead of directly predicting class probabilities. The goal is to compute additional uncertainty measures from the belief theory domain, namely dissonance and vacuity. The training is further guided by a Dirichlet prior constructed via graph-based kernel estimation, and by a GNN trained with only the classification loss. Experiments on semi-supervised node classification on six datasets show that dissonance and vacuity can be good scores for detecting misclassification and OoD, respectively.
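
For readers unfamiliar with these measures, here is a minimal sketch of how vacuity and dissonance are typically computed from Dirichlet parameters in subjective logic (my illustration of the standard formulas; the function and variable names are mine, not the paper’s):

```python
import numpy as np

def vacuity_and_dissonance(alpha):
    """Subjective-logic uncertainty measures from Dirichlet parameters.

    alpha: shape (K,), Dirichlet parameters alpha_k = evidence_k + 1.
    """
    alpha = np.asarray(alpha, dtype=float)
    K = alpha.size
    S = alpha.sum()                     # Dirichlet strength (total evidence + K)
    belief = (alpha - 1.0) / S          # belief mass per class
    vacuity = K / S                     # high when total evidence is low

    # Dissonance: evidence is plentiful but spread over conflicting classes.
    dissonance = 0.0
    for k in range(K):
        others = np.delete(belief, k)
        if others.sum() == 0:
            continue
        # Relative balance Bal(b_j, b_k) = 1 - |b_j - b_k| / (b_j + b_k)
        bal = 1.0 - np.abs(others - belief[k]) / (others + belief[k] + 1e-12)
        dissonance += belief[k] * (others * bal).sum() / others.sum()
    return vacuity, dissonance

# An OoD-like node (little evidence) vs. a conflicting in-distribution node:
print(vacuity_and_dissonance([1.1, 1.1, 1.1]))    # high vacuity, low dissonance
print(vacuity_and_dissonance([51.0, 51.0, 1.0]))  # low vacuity, high dissonance
```

This is why the two scores separate the two failure modes: vacuity flags a lack of evidence (OoD), while dissonance flags conflicting evidence (likely misclassification).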

Strengths: The paper tries to extend the notion of uncertainty estimates in semi-supervised learning on graph data and proposes using dissonance and vacuity as additional uncertainty measures. It models the parameters of a Dirichlet distribution with GNNs so that dissonance and vacuity scores can be employed. Dissonance and vacuity show advantages for in-distribution misclassification detection and OoD detection on several datasets. The paper shows the connection between entropy, epistemic uncertainty, aleatoric uncertainty, dissonance, and vacuity under its framework. As the training is guided by multiple components, including the teacher GNN and the Dirichlet prior constructed from the shortest paths to the labeled nodes in the graph, the ablation studies appropriately examine the effect of each component of the framework.
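
To make the prior construction concrete, here is a rough sketch of my understanding of GKDE (the Gaussian kernel form and the evidence-plus-one parameterization are my assumptions, and the gkde_prior function is hypothetical, not the authors’ code): each labeled node contributes evidence for its own class to every node, discounted by a kernel of the shortest-path distance.

```python
import numpy as np
import networkx as nx

def gkde_prior(graph, labels, num_classes, sigma=1.0):
    """Graph-based Kernel Dirichlet distribution Estimation (sketch).

    graph:  networkx.Graph over the nodes.
    labels: dict {node: class_index} for the labeled training nodes only.
    Returns prior Dirichlet parameters alpha_hat of shape (N, num_classes).
    """
    nodes = list(graph.nodes())
    evidence = np.zeros((len(nodes), num_classes))
    for src, y in labels.items():
        # Shortest-path distances from this labeled node to all reachable nodes.
        dist = nx.single_source_shortest_path_length(graph, src)
        for i, node in enumerate(nodes):
            d = dist.get(node)
            if d is not None:
                # Gaussian kernel: nearby nodes receive more class-y evidence.
                evidence[i, y] += np.exp(-d**2 / (2.0 * sigma**2))
    return evidence + 1.0  # alpha_hat = evidence + 1 (uniform base rate)

G = nx.karate_club_graph()
alpha_hat = gkde_prior(G, labels={0: 0, 33: 1}, num_classes=2)
print(alpha_hat[:3])
```

Nodes far from all labeled nodes receive little evidence, so the prior assigns them high vacuity, which is what makes the prior useful for OoD detection.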

Weaknesses: The scope of the paper is limited to the semi-supervised setting. In previous works in computer vision, epistemic uncertainty or entropy has been used for uncertainty estimation/OoD detection. This paper suggests that these uncertainty estimates would not work in semi-supervised settings and provides an ablation study on image classification to demonstrate the claim. On the other hand, the paper does not investigate how performance changes with different numbers of labeled nodes on graph data, which is the focus of the paper and may be more relevant. Some stronger (Bayesian) GCN baselines could be investigated in terms of their uncertainty quantification, such as Dropout+DropEdge in Rong et al., 2019.
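
To make the suggested baseline concrete, here is a minimal numpy sketch of Monte-Carlo uncertainty estimation with dropout plus DropEdge on a toy one-layer GCN (my illustration, not Rong et al.’s implementation; edge dropping is kept asymmetric for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mc_uncertainty(A, X, W, T=50, p_edge=0.2, p_drop=0.5):
    """Total and epistemic uncertainty from T stochastic forward passes.

    A: (N, N) adjacency, X: (N, D) features, W: (D, K) weights.
    """
    probs = []
    for _ in range(T):
        A_t = A * (rng.random(A.shape) > p_edge)                # DropEdge
        H = (A_t + np.eye(len(A))) @ X                          # propagate (self-loops)
        H = H * (rng.random(H.shape) > p_drop) / (1 - p_drop)   # MC dropout
        probs.append(softmax(H @ W))
    probs = np.stack(probs)                                     # (T, N, K)
    mean = probs.mean(axis=0)
    total = -(mean * np.log(mean + 1e-12)).sum(-1)              # predictive entropy
    aleatoric = -(probs * np.log(probs + 1e-12)).sum(-1).mean(0)
    return total, total - aleatoric                             # epistemic = mutual info

N, D, K = 5, 4, 3
A = (rng.random((N, N)) > 0.5).astype(float)
total, epistemic = mc_uncertainty(A, rng.random((N, D)), rng.random((D, K)))
print(total, epistemic)
```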

Correctness: Overall, the method and empirical comparisons seem sound. I think the loss should include an additional KL term for theta because of the Bayesian treatment of it, which is not mentioned in the paper.
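
For reference, the variational objective that such a KL term would come from (a sketch of the usual Bayesian treatment, not the paper’s exact loss):

```latex
\[
\mathcal{L}(q) \;=\;
  \mathbb{E}_{q(\theta)}\!\big[-\log P(\mathbf{y}\mid\mathbf{x};\theta)\big]
  \;+\;
  \mathrm{KL}\big(q(\theta)\,\Vert\,p(\theta)\big).
\]
```

With a Gaussian prior p(theta) = N(0, sigma^2 I), the KL term reduces (up to constants and variance terms) to an L2 penalty on theta, which is the same point raised in the Additional Feedback below.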

Clarity: The paper is generally well-written, but there are some minor typos and grammatical issues.

Relation to Prior Work: The paper discusses related works on uncertainty estimation in BNNs and uncertainty reasoning in the belief theory domain, motivates its contributions, and relates them to previous works. The paper may want to discuss and differentiate between the different approaches to uncertainty estimation on graphs in more detail, including dropout-based techniques and methods that consider uncertainty in the graph structure, and to explain why the latter have not been investigated in this work.

Reproducibility: Yes

Additional Feedback: --Update-- Based on the encouraging additional results in the rebuttal, specifically the comparison with DropEdge and the reported performance change with different numbers of labeled nodes, I have increased my score. However, I still believe that instead of a semi-supervised image classification experiment, the paper needs to investigate a supervised graph experiment setup to demonstrate the claims in Section 6.3 about performance differences in supervised vs. semi-supervised settings. Specifically, epistemic uncertainty being worse than all other uncertainty measures (for a fixed method) for OOD detection seems to contradict previous papers in the OOD detection field. Also, I think the loss is missing an L2 regularization on theta that should appear from the KL term of its Bayesian treatment.


Review 3

Summary and Contributions: The authors propose an uncertainty framework for GNNs that incorporates several components of uncertainty in data. They provide a theoretical analysis of these sources of uncertainty and relate them to each other. They also develop the Graph-based Kernel Dirichlet distribution Estimation (GKDE) model. They train this model and compare it to other popular GNNs, such as GCN, across many experiments.

Strengths: Figure 1 is an excellent illustration of the different types of scenarios. There are a lot of definitions and variables to digest, so this figure really helps link the variables to the reader’s intuition. The math and explanations are very straightforward and clear. The authors did an excellent job presenting their work. The questions and answers in the experiment section are very helpful and useful to read before jumping into the tables. The experiments are thorough and cover many datasets and models.

Weaknesses: The results in Table 1 are good but not amazing. GCN still performs relatively well compared to the authors’ methods. Otherwise, the paper is very solid.

Correctness: Everything seems correct. There are a few things I was confused about, which I mention below.

Clarity: Overall the paper is very clear. The authors did an excellent job.

Equation 5 - I am confused about a few things. The notation P(y|x; theta) is confusing because the semicolon implies that theta is a fixed vector and not a random vector; however, the conditional distribution P(theta|G) is also given. So what is the point of the semicolon? Also, I think there is a typo in Equation 5, because the entropy term is not defined correctly. It should be H(E_{P(theta|G)}[y] | x; theta) if I understand correctly. It does not make sense to take the expectation (w.r.t. P(theta|G)) of y|x; theta; the conditioning belongs to the entropy and not inside the expectation.

Theorem 1 - I am not sure that Part 2(b) benefits from the approximate notation the authors use. I looked in the supplementary material, and the authors derive these relations asymptotically. Stating the asymptotic results directly may be more beneficial and clearer than how they are stated in Theorem 1.
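
For comparison, here is the standard total/aleatoric/epistemic decomposition from the Bayesian deep learning literature, which may be the form Equation 5 intends (my reconstruction, not the paper’s statement):

```latex
\[
\underbrace{\mathcal{H}\!\Big[\,\mathbb{E}_{P(\theta\mid G)}\,P(y\mid x;\theta)\Big]}_{\text{total uncertainty (entropy)}}
\;=\;
\underbrace{\mathcal{I}\big(y;\,\theta \mid x, G\big)}_{\text{epistemic}}
\;+\;
\underbrace{\mathbb{E}_{P(\theta\mid G)}\,\mathcal{H}\big[P(y\mid x;\theta)\big]}_{\text{aleatoric}}.
\]
```

Under this form, the expectation over P(theta|G) is taken over the predictive distribution P(y|x; theta) itself, with the entropy applied outside, which matches the concern above.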

Relation to Prior Work: Yes prior work is very well presented.

Reproducibility: Yes

Additional Feedback: The authors did a great job!