NeurIPS 2020

Consistent feature selection for analytic deep neural networks

Meta Review

This paper shows that the adaptive group lasso feature selection method, more specifically a combined strategy called GL+AGL, is selection consistent for a very general class of deep neural networks (DNNs), provided that the DNN interacts with the input through a finite set of linear units. This is an important property, since it guarantees that, with enough training examples, GL+AGL will effectively identify the set of relevant inputs, making the DNN more interpretable. The general structure of the proof follows the analysis of high-dimensional linear models, but new technical elements are introduced to tackle the difficulties that arise when the linear transformation of the first layer is followed by the sequence of non-linear transformations typically used in DNNs. Finally, the numerical experiments provide evidence that the popular group lasso method might be an inefficient feature selection method for DNNs.
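For readers unfamiliar with the two penalties discussed above, the following is a minimal sketch of how a group lasso (GL) penalty and an adaptive group lasso (AGL) penalty are commonly defined on first-layer weights, with one group per input feature. It is an illustration under stated assumptions, not the authors' implementation: the function names, the exponent `gamma`, the stabilizer `eps`, and the threshold `tol` are hypothetical choices, and the exact estimator and tuning used in the paper may differ.

```python
import numpy as np

def group_norms(W):
    """Euclidean norm of each input feature's first-layer weights.

    W has shape (n_features, n_hidden); row k is the group of weights
    connecting input feature k to the first-layer linear units.
    """
    return np.linalg.norm(W, axis=1)

def group_lasso_penalty(W, lam):
    """GL penalty: lam times the sum of the group norms."""
    return lam * group_norms(W).sum()

def adaptive_group_lasso_penalty(W, W_init, lam, gamma=2.0, eps=1e-12):
    """AGL penalty: each group norm is reweighted by the inverse norm
    (raised to gamma) of an initial estimate W_init, for example a GL fit.

    gamma and eps are illustrative assumptions, not values from the paper.
    """
    weights = 1.0 / (group_norms(W_init) + eps) ** gamma
    return lam * (weights * group_norms(W)).sum()

def selected_features(W, tol=1e-8):
    """Indices of features whose weight group is not (numerically) zero."""
    return np.flatnonzero(group_norms(W) > tol)
```

In a two-stage scheme of the GL+AGL kind, the first stage would fit the network with `group_lasso_penalty`, and the second stage would refit with `adaptive_group_lasso_penalty` using the first-stage weights as `W_init`, so that features with small initial groups are penalized more heavily.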