Paper ID: | 526 |
---|---|

Title: | Powerset Convolutional Neural Networks |

The paper is clear, well motivated, well written and overall very pleasant to read. The proposed method is new and interesting and will probably remain as a baseline for future work in the topic of deep learning on set functions. Some remarks: - l95 are defined cut and association score but they are not used in the following of the paper. - What is the norm on s:2^N -> R (used e.g. l 99-100)? - l194: the sentence "hypergraphs are represented by their weight set function (with unit weights)" is unclear - l232: two different models are proposed and benchmarked, a discussion about their difference would be very welcomed - l232: it would nice to have the number of parameters for the different proposed models - l302: "(i.e. Fourier sparse)" instead of "(= Fourier sparse)". - In the case of set functions, the complexity of the related algorithm may become an issue, how does the proposed algorithm compared time wise with the other benchmark methods? An in-depth discussion about applicability of the algorithm would be welcomed. * Bibliography: many references are incomplete, e.g. [4] is COLT 2012, [12] is ICLR 2018, etc.

The paper considers an interesting and relatively novel machine learning problem, learning on set functions. It proposes a new architecture called powerset convolutional NN tailed for this problem. The proposed model is sted from solid theoretical analysis and gets reasonable good experiment results. In every dimension of "originality, quality, clarity, and significance", I think this paper reaches the bar of NeurIPS. Section 2 is my favorite part of the paper. It is very clear and constructs the powerset convolutions by analyzing the shift-invariant and from easy one to more complex ones. I also like Table 1 a lot. It is really good to see the corresponds between convolutions and their shifts. The author did an interesting analysis of convolutional pattern matching. I am wondering since powerset convolution and graph convolution has different pattern matching behavior comparing to 1D or 2D convolution, does it finally cause some behavior difference between CNN and GCN/Powerset CNN. The paper also has limitations. First, its set function representation is very costive. It uses R^{2^N} vector represents a set function on a set with N elements. Thus it is hard to apply this method to a large set. It may restrict its use in real applications. Second, the experiments conducted on relatively small datasets. For example, even in the real dataset, results are about classification on subhypergraph of size 10. While graph neural network can handle graph with a much larger size.

The authors built their work on top of "A discrete signal processing framework for set functions" where powerset convolutions were defined, adding powerset pooling operations and defining powerset convolutional neural networks that can be used to classify set functions. The authors provided a detailed analysis of the kind of patterns that powerset convolutions are sensitive to from a pattern matching perspective, and defined their implementation. The authors recognize the exponential growth of complexity O(n2^n) and that to scale their approach to larger ground sets, which limits the applicability of the current method. The empirical results show that the powerset CNNs perform similarly to the baselines on both the synthetic and real datasets, maybe the tasks chosen are too small or well suited to showcase the proposed powerset CNNs. The authors recognizes the lack of datasets containing set functions well suited for their method, however the current set of experiments weakens the argument than powerset CNNs can handle set functions better than graph-convolutional baselines. If the proposed powerset CNNs could be applied to theorem proving or to transformations of boolean formulas, then it could show their relevance and applicability. The paper provides clear definition of all the concepts used to define powerset CNNs, including the different shifts and invariances of the different convolutions and poolings operations. Minor comments: - 19 according parameter -> appropriate parameter - 307 utilizing novel powerset convolutions -> utilizing recent powerset convolutions ------------- POST FEEDBACK --------------- After reviewing the author feedback, I've updated my score to 5, I think the work could be accepted but is not clear how applicable it would be given current restrictions ground-set size = 30, and lack of clear examples where powerset CNNs are better than graph CNNS. The way the contributions of the paper are described need to be updated to clearly reflects the contributions of the paper and what is previous work.