Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
1. Most existing graph-based SSL methods assume that nodes are labeled at random. The authors instead claim that the probability of missingness may depend on the unobserved data, conditioned on the observed data. This claim should be better motivated; at least to me, it was not entirely clear why it is important. It would help readers if the authors provided some scenarios that illustrate the claim.
2. Overall, the proposed model is novel and is supported by strong theory. However, the experimental analysis can be improved. One of the baselines the authors consider is "SM", but they do not cite the paper in which it was proposed. The authors show results for only one real-world dataset, Cora. They should report results on multiple real-world datasets.
3. Going by the motivation of the work, incorporating missing responses is important for graph-based SSL. However, the authors do not compare the proposed model with state-of-the-art graph-based SSL methods such as GAT (Velickovic et al., ICLR 2018).
[Velickovic et al., ICLR 2018] Graph Attention Networks
4. Minor points:
-- "vertexes" -> "vertices"
-- I am not sure that using gradient descent qualifies as a contribution.
[Comments after reading the author response] I thank the authors for providing additional results. I am satisfied with the author response and I vote for acceptance of the paper.
I think this paper is well prepared. I have the following comments. (1) I think one important contribution is the theoretical one. Nevertheless, I cannot judge whether the proposed results build on previous work or were developed by the authors themselves. Thus, I suggest that the authors highlight their unique contributions. (2) I am concerned about the computational complexity of the proposed algorithm. Thus, I suggest that the authors analyze it. (3) The paper assumes that r_i follows a Bernoulli distribution; how about other distributions?
Inference under non-ignorable missing response variables is considered in the graph embedding setting. Ignoring these missing variables leads to biased estimation under MNAR (missing not at random), so it is important to model the missingness mechanism explicitly. This paper focuses on missingness of the response variables only, so missingness in the graph embedding is not considered. In fact, the graph embedding is implemented as an ordinary GCN in the experiments. Therefore, the novelty lies essentially in handling the missingness of the response variables only, and such a method is not quite novel in statistics.
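As a rough illustration of the bias claim above, the following Python sketch simulates a scalar response rather than the paper's graph model. The function name, the standard normal responses, and the logistic missingness mechanism are assumptions made for this example only; none of them come from the submission.

```python
import math
import random


def simulate_observed_means(n=100_000, seed=0):
    """Compare the mean of observed responses under MCAR and MNAR.

    Hypothetical toy setup (not the paper's model):
      - responses y_i are drawn from a standard normal, so the true mean is 0;
      - under MCAR each y_i is observed with probability 0.5;
      - under MNAR each y_i is observed with probability sigmoid(2 * y_i),
        so the chance of being observed depends on the response itself.
    """
    rng = random.Random(seed)
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # MCAR: the observation indicator is independent of y.
    mcar_observed = [v for v in y if rng.random() < 0.5]

    # MNAR: the observation indicator depends on the (possibly unobserved) y.
    mnar_observed = [
        v for v in y if rng.random() < 1.0 / (1.0 + math.exp(-2.0 * v))
    ]

    def mean(values):
        return sum(values) / len(values)

    return mean(y), mean(mcar_observed), mean(mnar_observed)


if __name__ == "__main__":
    full, mcar, mnar = simulate_observed_means()
    print(f"mean of all responses:          {full:.3f}")
    print(f"mean of observed (MCAR) subset: {mcar:.3f}")
    print(f"mean of observed (MNAR) subset: {mnar:.3f}")
```

Under these assumptions, the naive mean of the observed responses stays close to the true mean in the MCAR case but is shifted upward in the MNAR case, because larger responses are more likely to be observed.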