NeurIPS 2019
Sun Dec 8th through Sat Dec 14th, 2019, Vancouver Convention Center
Reviewer 1
This paper defines a weighting scheme that allows for the isolation of outlier points that might bias principal subspace detection, even in the context of using robust PCA estimators. The main idea is that when columns of the data matrix fall completely outside of the main subspace spanned by the other columns, this can be detected, allowing the "sparse" part of robust PCA to contain complete columns instead of entries at random locations anywhere in the data matrix.

The model itself (and to some extent the idea of weighting) is similar in spirit to this un-cited paper:

- A. S. Charles, A. Ahmed, A. Joshi, S. Conover, C. Turnes, and M. A. Davenport, "Cleaning up toxic waste: Removing nefarious contributions to recommendation systems," Proceedings of ICASSP, Vancouver, Canada, May 2013.

However, the authors provide conditions on the correctness of their algorithm, which differs from the re-weighted l1 scheme used in the citation above.

Overall I thought this paper was good. One piece that was unclear to me was how the innovation coefficients in Figure 1 were so well behaved. There was a large amount of variation in the single-vector (A^T c) calculations in the left two panels of Figure 1, yet the variation was not as large in the right panel. I did not see where any averaging over coefficients computed using different c vectors was done, which I assume would be needed. (A sketch of how I read these coefficients follows at the end of this review.)

A small comment on "Data Model 1": there is a comma missing in [B (A+N)]T; it should be [B, (A+N)]T.

I would recommend that the authors change the name of the algorithm. "iSearch" is already used by many different search engines, and a more unique name would help. Perhaps Innovations for Nixing Outliers (In N' Out)? Maybe that one isn't good, but the idea remains.
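To make my question about the A^T c coefficients concrete, here is a minimal numpy sketch of how I understand them. The paper optimizes an l1 objective; I substitute a closed-form least-squares surrogate purely for illustration, so the surrogate, the dimensions, and the synthetic data are my own assumptions, not the authors' method, and the numbers will not match Figure 1.

    # Sketch (my own l2 surrogate, not the authors' algorithm): for each
    # column x of X = [inliers, outliers], take the closed-form minimizer of
    #   min_c ||X^T c||_2  s.t.  c^T x = 1
    # and inspect the innovation coefficients X^T c.
    import numpy as np

    rng = np.random.default_rng(0)
    r, m, n_in, n_out = 5, 50, 100, 10                 # hypothetical sizes
    U = np.linalg.qr(rng.standard_normal((m, r)))[0]   # inlier subspace basis
    A = U @ rng.standard_normal((r, n_in))             # inliers
    B = rng.standard_normal((m, n_out))                # outliers
    X = np.hstack([A, B])

    G = np.linalg.pinv(X @ X.T)                        # (X X^T)^+

    def innovation_direction(x):
        # Closed form of min_c ||X^T c||_2 subject to c^T x = 1.
        c = G @ x
        return c / (x @ c)

    # l1 norm of the coefficients: large for inliers (c must agree with the
    # many columns sharing the subspace), small for outliers (c can be chosen
    # nearly orthogonal to the inlier subspace). In this toy setup the two
    # groups separate cleanly.
    scores = np.array([np.linalg.norm(X.T @ innovation_direction(X[:, j]), 1)
                       for j in range(X.shape[1])])
    print("mean inlier score :", scores[:n_in].mean())
    print("mean outlier score:", scores[n_in:].mean())

Even in this surrogate, the per-column coefficients X^T c vary considerably across choices of the constraint column, which is why I expected some averaging to be needed for the right panel of Figure 1.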
Reviewer 2
There is a lot of literature on this extensively studied problem, so there is no novelty in the problem setting. The authors present a new algorithm that is shown to be effective in challenging generative models where the outlier columns are very close to each other, or very close to the span of the inliers.

In the Data Model 1 definition, "m" is a typo; it should be M_1. The notation [B (A+N)]T is not very clear, and potentially confusing, until you stare at the Notation section that comes later. This is an example of poor writing.

The ideas in the iSearch algorithm seem reasonable, but nothing strikes me as groundbreaking. If the inliers lie in a low-dimensional linear subspace and we compute a direction that agrees maximally with any one such inlier, that direction will not agree nearly as much with outliers that do not lie in this subspace. So it does seem like a relatively simple observation (a toy sanity check appears below).

The theoretical work is possibly what may make this paper acceptable to readers with a theoretical bent. That said, while the theorems appear to be analytical guarantees, it is difficult to see what they mean practically. If the guarantees kick in only under conditions that lead to easy instances, then the usefulness of such guarantees is limited. So I am mildly positive on the paper and recommend the score below.
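To show why I call this a simple observation, here is a toy numerical check; the synthetic data, dimensions, and unit-norm normalization are hypothetical choices of mine, not the paper's setup.

    # Toy check (my own synthetic setup, not the paper's): with unit-norm
    # columns, a direction aligned with one inlier agrees substantially with
    # the other inliers (which share its r-dimensional subspace) but only
    # weakly with outliers spread over all m dimensions.
    import numpy as np

    rng = np.random.default_rng(1)
    r, m = 5, 50                                       # hypothetical sizes
    U = np.linalg.qr(rng.standard_normal((m, r)))[0]   # inlier subspace basis

    def unit_cols(M):
        return M / np.linalg.norm(M, axis=0)

    inliers = unit_cols(U @ rng.standard_normal((r, 100)))
    outliers = unit_cols(rng.standard_normal((m, 10)))

    c = inliers[:, 0]   # direction agreeing maximally with one inlier
    print(np.abs(c @ inliers).mean())    # roughly 1/sqrt(r): strong agreement
    print(np.abs(c @ outliers).mean())   # roughly 1/sqrt(m): weak agreement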
Reviewer 3
This work presents a provable algorithm for detecting outliers based on robust PCA. I am impressed by the analysis and the results on the different distributions of the outliers. The theoretical proof is solid. I only have a few concerns:

a) In the introduction, the authors fail to mention many other robust PCA works.

b) Can the authors say more about Assumption 1 and Assumption 2? Why are they valid, and how do they relate to practical applications?