NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Reviewer 1
Originality: Encouraging inter-neuron communication within the same layer is novel, but the approach is essentially similar to the SE block or other attention modules. Quality: The authors conduct many experiments. Clarity: The paper is easy to read. Significance: The method is essentially similar to the SE block or other attention modules.
Reviewer 2
The authors propose an approach to increase the representation power of neural networks by introducing communication between the neurons in the same layer. To this end, a neural communication (NC) block is introduced. It first encodes the feature map of each neuron to reduce its dimensionality by a factor of 8. Then an attention-based GCN is used to propagate information between the neurons via a fully-connected graph. In practice, a weighted sum of the neuron encodings is computed for each node, where the weights are determined by the similarity of the nodes' features. Finally, the updated representation is decoded to the original resolution and added to the original features. Importantly, this model applies the same operations to every neuron, so the number of parameters is independent of the feature dimensionality but dependent on the spatial size of the feature map. Several such blocks can be added at different layers of the network.

In experiments on CIFAR and ImageNet, the proposed module outperforms both the baseline and SE-Net across a variety of architectures. At the same time, it uses fewer parameters than SE-Net. That said, the absolute improvement over the baseline is within 1%. Additionally, improvements are demonstrated on the tasks of semantic segmentation and object detection. However, the method is only compared to the baseline there, and the improvements are also within 1%. A qualitative analysis of the learned representations is provided. It shows that the NC module leads to learning neurons that are less correlated (more diverse).

The paper is relatively well written and easy to follow. Overall, I find the idea to be reasonable and well presented. However, the proposed approach achieves only marginal improvements in practice, so I doubt it will have a significant impact on the community. Moreover, the experimental evaluation is incomplete (see Improvements).
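For readers who want a concrete picture of the block summarized in this review, the following is a minimal NumPy sketch of the encode, attend, decode, and residual-add steps. It is an illustration under stated assumptions, not the authors' implementation: the encoder and decoder are assumed to be plain linear maps, the similarity is assumed to be a scaled dot product followed by a softmax, the weights are random rather than trained, and the names `nc_block`, `w_enc`, and `w_dec` are hypothetical.

```python
import numpy as np


def softmax(a, axis=-1):
    """Numerically stable softmax along the given axis."""
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)


def nc_block(x, w_enc, w_dec):
    """Sketch of a neural-communication-style block (assumed form).

    x     : (C, H, W) feature maps, one H x W map per neuron (channel)
    w_enc : (H*W, d) assumed linear encoder, shared by all neurons
    w_dec : (d, H*W) assumed linear decoder, shared by all neurons
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)            # one row per neuron
    z = flat @ w_enc                      # encode each neuron: (C, d)
    # Fully-connected graph over neurons: edge weights come from the
    # similarity of the encodings (scaled dot product is an assumption).
    attn = softmax(z @ z.T / np.sqrt(z.shape[1]), axis=1)  # (C, C)
    z_new = attn @ z                      # weighted sum of neuron encodings
    out = flat + z_new @ w_dec            # decode, then residual add
    return out.reshape(c, h, w), attn


rng = np.random.default_rng(0)
C, H, W = 16, 8, 8
d = (H * W) // 8                          # reduce dimensionality by a factor of 8
x = rng.standard_normal((C, H, W))
w_enc = rng.standard_normal((H * W, d)) * 0.1
w_dec = rng.standard_normal((d, H * W)) * 0.1

y, attn = nc_block(x, w_enc, w_dec)
n_params = w_enc.size + w_dec.size        # depends on H*W and d, not on C
```

In this sketch the same `w_enc` and `w_dec` can be applied to an input with any number of channels, which mirrors the review's remark that the parameter count depends on the spatial size of the feature map rather than on the number of neurons.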
The authors have partially addressed my concerns; however, I still find the improvements to be marginal (especially over SE-Nets) and the experimental evaluation to be incomplete (a comparison to non-local networks is not reported). Overall, the paper is of slightly above-average quality, but it will not have a significant impact on the community. I will increase my score to 6.
Reviewer 3
The paper is modestly original, high-quality in the experiments and analyses performed, written clearly enough (some improvements are suggested below), and of modest significance. The NC block is an improvement over the SE block, and was likely developed with the SE block as the main motivation. As such, a more thorough explanation of the SE block's design decisions, and of how the NC block's decisions contrast with them, would be worth including and would make the theoretical contribution clearer. The "algorithmic" novelty of the NC block feels modest, but its presentation could be improved with a better explanation of the motivations behind its development and the specific choices made. The experiments and analyses are sufficient and of high quality. What they show is that the NC block is usually, but not always, a modest improvement over the SE block. That does not feel very significant to practitioners.