NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID: 4455 Fast Decomposable Submodular Function Minimization using Constrained Total Variation

### Reviewer 1

After reading the rebuttal and other reviews, my score remains the same. ======== Overall the submission presents a novel, simple-yet-non-obvious change to existing approaches that can yield substantial improvements. The change is well-motivated and accompanied by appropriate supporting theory, which is correct to my knowledge. The experiments are appropriate: the authors show improved performance, and also study the sensitivity of performance to the choice of the constraint \epsilon (showing of course that there is a tradeoff to optimize). On the whole it is a solid paper and I recommend acceptance. Some small comments: - There is a minor typo in the displayed equation after line 376 around the $s$. - It would be good to explain the identity in line 378. - Please be specific about what part of [31] is being referenced in line 166.

### Reviewer 2

Originality: It is a neat idea to introduce a different proximal term of the continuous variable w, which may help to improve the efficiency of the algorithm. This idea in a high level works in a similar way as that of screening elements and keeping a small active set, e.g. the one adopted in reference [25]. Other analysis and discussion seem to be a natural generalization of previous works, e.g., [15]. Therefore, I think the originality is around the borderline. Quality: The submission is technically sound. Regarding the novel term on w, the authors provide a relatively complete discussion on how to handle it, although some techniques are overlapping with those adopted in the previous work [15]. Clarity: The clarity of this work is not very good. As a reader that is very familiar with the important references, I have to read through Section 3 several times to get a clear outline of how the overall algorithm works and where the potential improvement comes from. The problem is caused by the fact that there are too many formulations including discrete, continuous ones and different algorithms for different setting proposed by the authors. I think it would be difficult for a non-expert/practitioner to follow the idea clearly. Significance: I think the significance of the work is not extraordinary but okay. The idea is good. Although it is a small idea and kind of overlapping other ideas at a high level, it is good to see in this particular setting and indeed improves the algorithms. ---After reading the rebuttal --- I agree with other reviewers that the contribution of this work is solid although the change is slight. However, there is one additional minor comment: Since the RBCD methods proposed in [17,18] hold theoretical characterization of the number of iterations needed while the cyclic BCD used here does not hold the similar result (also questioned by Reviewer 4), why not directly use RBCD? A table to list the whole algorithm can be rather helpful. Please provide the evaluation over the cases with more complexity potential functions as promised.

### Reviewer 3

Comments and Questions: Originality: The paper presents a constrained version of the \psi function for minimizing the sum of submodular set functions. As far as I know, I did not see this technique before. Quality: Experiments support the fast convergence of proposed algorithms. For the theoretical results, can you get a faster rate than previous algorithms, say, [15] and [17]? Meanwhile, is it possible to give oracle complexity in terms of either discrete minimization or function value evaluations? This would be a good way to theoretical justify the advantage of your algorithm. Clarity: Overall, the paper is well-presented. Significance: Submodular minimization is well applied in segmentation and constraint satisfaction, this technique would be a good way to accelerate the SFM process. - The choice of \epsilon depends on diameter of the base polytope D. Is there an efficient method to estimate it? - From the experiments you observe a tradeoff of choosing \epsilon, do you have some heuristic method to choose it. - Do you see a hope of applying this technique to minimizing submodular functions over continuous domains? Minor comments: - line 68: loop to the solve -> loop to solve -line 196: less than less -> less than