NIPS 2017
Mon Dec 4th through Sat the 9th, 2017 at Long Beach Convention Center
Paper ID: 3095 Estimating High-dimensional Non-Gaussian Multiple Index Models via Stein’s Lemma

### Reviewer 1

This paper studies estimation of sparse single and multiple index models. The goal is to construct good estimators under weaker assumptions on the covariate distribution and the link function. The paper is poorly written.

In the introduction, the author(s) emphasize the generality of their methods, for instance that they would work for a wide range of covariate distributions and link functions. But in the main methodology section (Section 3), the development focuses only on covariate distributions with independent, identically distributed components, which is far more restrictive than almost all existing work. Equation (3.2) holds only for the phase-retrieval link function. So the entirety of Section 3 is restricted to a very particular covariate distribution and a single link function, and it is entirely unclear how the method works in other scenarios.

The introduction mentions repeatedly that the proposed method needs to assume that the covariate distribution is known, yet it is unclear how this assumption is used in the methodological and theoretical developments. This is a very strong assumption: in high-dimensional linear regression, for example, if the covariate distribution is known, then the problem becomes almost trivial.

The author(s) also claim to make a contribution to heavy-tailed sparse PCA, which is an overstatement. The only contribution in Section 4 is the use of a robust estimator of the input matrix $\bar\Sigma$, and the robust estimator itself is not new. The SPCA formulation in eqs. (3.7) and (4.1) is standard and well known (d'Aspremont et al., SIAM Review 2007; Vu et al., NIPS 2013), which the authors were aware of but did not cite in Sections 3 and 4. Given the existing results on sparse PCA analysis and the truncated-mean estimator, the consistency of heavy-tailed sparse PCA is trivial: the input matrix is entry-wise consistent, and hence $\ell_1$-penalized sparse PCA is consistent.
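For reference, the formulation in question is (up to notation) the standard $\ell_1$-penalized semidefinite relaxation of sparse PCA. A sketch, in my own notation rather than the paper's, with $\hat\Sigma$ the input matrix and $\lambda > 0$ a sparsity parameter:

$$\max_{Z \succeq 0,\; \mathrm{tr}(Z) = 1} \; \langle \hat\Sigma, Z \rangle - \lambda \|Z\|_{1,1}, \qquad \|Z\|_{1,1} = \sum_{i,j} |Z_{ij}|,$$

which is exactly the program of d'Aspremont et al. (2007) and the Fantope-constrained variant of Vu et al. (2013).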
Typos:
- line 44: "statistical optimal" --> "statistically optimal"
- line 52: "is rather restrictive" --> "are rather restrictive"
- Equation (2.1): remove the second "=".
- First line of Eq. (3.3) (also Eqs. (3.7) and (4.1)): "+" --> "-", unless a negative value of $\lambda$ is intended.

### Response after author feedback

Thanks for the feedback. Yes, I agree that there was some misunderstanding in my first review. But Section 4.1 is still simply a special case of the standard analysis of sparse PCA, which assumes only that the input matrix is entry-wise consistent. This point has been illustrated, for example, from the perspective of robust optimization in d'Aspremont et al. and in later developments (e.g., the proof used in Wang et al., arXiv:1307.0164).
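A minimal sketch of the entry-wise consistency point made above, assuming an element-wise truncated-mean estimator of the covariance (the function name and truncation level are illustrative, not taken from the paper): clipping each product $X_i X_j$ before averaging keeps every entry of the estimate close to the truth even when fourth moments are heavy-tailed.

```python
import numpy as np

def truncated_covariance(X, tau):
    """Element-wise truncated-mean estimate of E[X X^T].

    Each product X[:, i] * X[:, j] is clipped to [-tau, tau] before
    averaging, which bounds the influence of heavy tails on every entry.
    """
    n, d = X.shape
    S = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            S[i, j] = np.clip(X[:, i] * X[:, j], -tau, tau).mean()
    return S

rng = np.random.default_rng(0)
n, d = 50_000, 5
# Heavy-tailed covariates: Student-t with 4 degrees of freedom,
# rescaled to unit variance (Var = df / (df - 2) = 2 for df = 4),
# so the true covariance matrix is the identity.
X = rng.standard_t(df=4, size=(n, d)) / np.sqrt(2.0)
S_hat = truncated_covariance(X, tau=50.0)
err = np.max(np.abs(S_hat - np.eye(d)))  # entry-wise (sup-norm) error
```

Once every entry of the input matrix is within $\delta$ of the truth, a standard perturbation argument bounds how far the optimum of the $\ell_1$-penalized program can move, which is all the consistency proof needs.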