NIPS 2018
Sun Dec 2nd through Sat the 8th, 2018 at Palais des Congrès de Montréal
Paper ID: 1901 Lipschitz regularity of deep neural networks: analysis and efficient estimation

### Reviewer 1

In this paper, the authors present two methods for estimating the Lipschitz regularity of deep neural networks, with the aim of understanding their sensitivity to input perturbations. A generic algorithm, AutoLip, is proposed that is applicable to any automatically differentiable function. An improved algorithm, SeqLip, is proposed for sequential neural networks. Some experimental results are given to illustrate the effectiveness of these methods.

(1) While it is claimed that AutoLip is applicable to any automatically differentiable function, its effectiveness seems to be tested only on specific sequential neural networks. It would be helpful to present experiments with more general neural networks.

(2) The SeqLip algorithm requires an SVD of each layer's weight matrix. I am concerned that this computation becomes expensive for large $n_i$.

(3) In the first line of eq. (11), it seems that the right-hand side should be $$\|\Sigma_1U_1^\top\operatorname{diag}(\sigma_1)V_2\Sigma_2U_2^\top\operatorname{diag}(\sigma_2)\cdots V_{K-1}\Sigma_{K-1}U_{K-1}^\top\operatorname{diag}(\sigma_{K-1})V_K\Sigma_K\|_2.$$ I am not sure whether this is equal to the expression given in the paper.

(4) The experimental results are not quite convincing. For the experiment with the MLP, the dimension is too small. For the experiment with the CNN, the network is rather shallow. For the experiment with AlexNet, the Lipschitz constant obtained by Greedy SeqLip is still too large.

(5) I did not understand the meaning of "a brute force combinatorial approach" in Section 6.1. As far as I can see, no combinatorial optimization problem is involved.

Minor Comments:

(1) In eqs. (11) and (12), $k$ should be $K$.

(2) The conclusion states that SeqLip outperforms AutoLip by a factor of $10$. However, I can only see an improvement of up to a factor of $6$ in the experiments.
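
To make the concerns in points (2) and (4) more concrete, here is a minimal NumPy sketch of the kind of check that could accompany the experiments. It is not the authors' AutoLip or SeqLip implementation. It assumes a bias-free ReLU MLP, and the helper names `spectral_norm`, `product_upper_bound`, and `sampled_lower_bound` are hypothetical. The upper bound is the plain product of layer spectral norms (one SVD per layer, which is where the cost for large $n_i$ comes from), and the lower bound is the largest Jacobian spectral norm found at random inputs.

```python
import numpy as np

def spectral_norm(W):
    # Largest singular value of W; singular values are returned in descending order.
    return np.linalg.svd(W, compute_uv=False)[0]

def product_upper_bound(weights):
    # Product of layer spectral norms: an upper bound on the L2 Lipschitz
    # constant of an MLP whose activations are 1-Lipschitz (e.g. ReLU).
    return float(np.prod([spectral_norm(W) for W in weights]))

def sampled_lower_bound(weights, n_samples=200, seed=0):
    # Largest Jacobian spectral norm over random inputs: a lower bound on the
    # Lipschitz constant of a bias-free ReLU MLP (assumption of this sketch).
    rng = np.random.default_rng(seed)
    n_in = weights[0].shape[1]
    best = 0.0
    for _ in range(n_samples):
        h = rng.standard_normal(n_in)
        J = np.eye(n_in)
        for i, W in enumerate(weights):
            h = W @ h
            J = W @ J
            if i < len(weights) - 1:
                # ReLU acts as a 0/1 diagonal matrix on the active units.
                mask = (h > 0).astype(float)
                h = h * mask
                J = mask[:, None] * J
        best = max(best, float(np.linalg.norm(J, 2)))
    return best

# Hypothetical small network, chosen only for illustration.
rng = np.random.default_rng(0)
weights = [
    rng.standard_normal((16, 8)),
    rng.standard_normal((16, 16)),
    rng.standard_normal((1, 16)),
]
upper = product_upper_bound(weights)
lower = sampled_lower_bound(weights)
print(f"sampled lower bound: {lower:.3f}, product upper bound: {upper:.3f}")
```

The gap between the two numbers indicates how loose the product bound can be on a given network. Reporting such a gap for wider and deeper models would help judge how much an improved estimate actually gains.
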

---------------------------

After rebuttal and discussions: I agree with Reviewer #1 and Reviewer #2 that the proposed SeqLip is interesting for bounding the Lipschitz constant of an MLP. I vote for its acceptance.