Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
All the reviewers thought that generalizing the structured prediction energy network (SPEN) to incorporate factored potentials (following graph structure), together with the proposed approximate inference schemes for structured prediction, makes a nice contribution to NeurIPS. The extensive experiments were lauded, but concerns were expressed about the theoretical backing of the methods. After discussion and looking at the paper, the AC agrees with R2 that the paper makes an interesting practical contribution, and that the theory could be clarified in follow-up work. The authors should include their timing results as well as the additional clarifications from the rebuttal in their camera-ready version.

Additional side notes:

- [*] from the rebuttal should be mentioned in the main paper as a way to handle the entropy term over the marginal polytope in a principled manner with Frank-Wolfe (FW). Note that the authors of [*] use line search in their FW algorithm, which pushes the iterates closer to the boundary and thus might lead to slow convergence; I suspect such issues were not observed in this submission because it looks like a fixed step-size scheme was used.
- Note that the FW method can fail to converge to an optimal point when the objective is non-differentiable (see Example 1 of Nesterov, Math. Prog. 2018, "Complexity bounds for primal-dual methods minimizing the model of objective function", which holds for *any step-size* of FW). Given that this submission mentions ReLU activations (which are non-smooth), this caveat should also be mentioned in the paper.
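For readers unfamiliar with the step-size distinction raised in the side notes, here is a minimal sketch of Frank-Wolfe with a fixed (predetermined) step-size schedule. It is illustrative only and is not the submission's method: the feasible set is assumed to be the probability simplex rather than the marginal polytope, the objective is assumed to be a smooth toy quadratic rather than the paper's entropy-regularized energy, and the names `frank_wolfe_simplex` and `target` are hypothetical.

```python
import numpy as np


def frank_wolfe_simplex(grad, x0, num_iters=2000):
    """Frank-Wolfe over the probability simplex with the standard
    predetermined step size gamma_t = 2 / (t + 2) (no line search)."""
    x = np.asarray(x0, dtype=float).copy()
    for t in range(num_iters):
        g = grad(x)
        # Linear minimization oracle: over the simplex, the minimizer of
        # <g, s> is the vertex at the coordinate with the smallest gradient.
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0
        # Fixed schedule; a line search would instead pick gamma per iterate.
        gamma = 2.0 / (t + 2.0)
        # Convex combination keeps the iterate inside the simplex.
        x = (1.0 - gamma) * x + gamma * s
    return x


# Toy smooth objective (an assumption): f(x) = 0.5 * ||x - target||^2,
# whose minimizer over the simplex is `target` itself.
target = np.array([0.2, 0.5, 0.3])
x_hat = frank_wolfe_simplex(lambda x: x - target, np.array([1.0, 0.0, 0.0]))
```

The sketch relies on the gradient being well defined everywhere. For a non-differentiable objective (as with ReLU activations), the non-convergence caveat from Nesterov's example applies regardless of how `gamma` is chosen.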