NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center

### Reviewer 1

In this paper, the authors propose CLaR (Concomitant Lasso with Repetitions), an approach to solving Lasso problems with multiple realizations and heteroscedastic noise. Overall this is a reasonably good paper, although I found parts of it not very clearly explained. The authors provide a theoretical justification for their approach, an algorithm to solve their problem, and some applications to simulated and real MEG data. Technically, this is a reasonably sound paper. The objective function combines a novel loss function with the $\ell_{2,1}$ norm to promote sparsity.

My main comment is that the motivation for the approach is hard to follow, because the authors introduce many different concepts (heteroscedasticity, concomitant estimation of coefficients and noise, and M/EEG data) without explaining the connections between them well. For example, why does increasing the number of observations lead to heteroscedasticity (line 26)? What are the repetitions the authors are combining (e.g. in an M/EEG context)? I would suggest a modest rewriting of the introduction to make this clearer. After a couple of readings, it seems that the approach was specifically designed with M/EEG in mind. If that is the case, I would make that case explicitly in the introduction, and please also provide a more detailed explanation of the M/EEG problem setup in the main text of the paper; currently this is not very clear. Moreover, the authors' approach is also couched in multi-task learning (MTL) terminology without this connection being made fully explicit.

In the real data experiments, the authors set the regularization parameter on the basis of their expectation to see only two sources in the data. This may be reasonable for the data they are evaluating, but this knowledge is unlikely to be available in general. Perhaps the authors can comment on that. How would they set the regularization parameter in other settings?
Other comments:

- The authors seem to abuse their notation. On line 65, the authors define $\|\cdot\|_{p1}$ to mean the $\ell_{p1}$ norm for any $p \in [1, \infty]$, but then they use this notation to mean the norm induced by the estimated precision matrix $S^{-1}$ (e.g. in eq. 2). Is this a misunderstanding?
- Line 45: this should probably read "reduces noise variance inversely proportional to the number of repetitions".
- _Why_ does the homoscedastic solver fail after whitening (line 222)?
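To make the wording suggested for line 45 concrete, here is a minimal simulation sketch. It is not taken from the paper or its code: the function name `averaged_noise_variance` and its parameters are hypothetical, and it assumes i.i.d. zero-mean Gaussian noise with the same variance in every repetition, which is simpler than the heteroscedastic setting the paper targets. Under that assumption, averaging $r$ repetitions should leave a noise variance of roughly $\sigma^2 / r$.

```python
import random
import statistics


def averaged_noise_variance(n_repetitions, n_samples=20000, sigma=1.0, seed=0):
    """Empirical variance of the mean of `n_repetitions` i.i.d. Gaussian draws.

    Hypothetical helper for illustration only; assumes every repetition
    has the same noise level `sigma`.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n_repetitions))
        for _ in range(n_samples)
    ]
    return statistics.pvariance(means)


# Compare the simulated variance with the 1 / r prediction (sigma = 1).
for r in (1, 4, 16):
    print(r, averaged_noise_variance(r), 1.0 / r)
```

The simulated values should track $1/r$ closely, which is the sense in which averaging "reduces noise variance inversely proportional to the number of repetitions".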

### Reviewer 2

The paper is well written and the approach is interesting; however, the paper contributions listed above overlap heavily with previous work [Massias 2018a AISTATS]. The difference between the two works is that the previous approach averaged the noisy observations, whereas this work slightly modifies the previous solver to minimize a data fidelity term that sums over all repetitions. This is shown empirically to have some advantages over the averaging approach (a point also emphasised in part B.8 of the supplement).

### Reviewer 3

Originality: The paper seems original compared to the set of alternatives that the authors have provided and compared against. Its potential for neuroimaging applications such as M/EEG is also interesting.

Quality: The paper is overall well written. Reproducibility is much appreciated, and the supplement preemptively answered some of my questions related to the experiments.

Clarity: For the most part, the paper was written clearly, so I was able to follow the crucial points given my limited knowledge of the area. The provided code helped me a lot to digest some technical pieces of the method. However, the neuroimaging-related explanations were less clear than the other parts, which I will describe below.

Significance: Given my understanding of the paper, I was able to appreciate the contribution of the method towards some real data applications. Thanks to the provided code, the impact of the paper could be immediate and more likely.