NIPS 2017
Mon Dec 4th through Sat the 9th, 2017 at Long Beach Convention Center
Paper ID: 2107 Conic Scan-and-Cover algorithms for nonparametric topic modeling

### Reviewer 1

This paper presents a geometric algorithm for parameter estimation in LDA when the number of topics is unknown. The method obtains impressive results: its predictive performance on synthetic data is comparable to that of a Gibbs sampler for which the true number of topics is fixed. Moreover, the algorithm appears to scale very well with the number of documents (like all spectral methods). Unfortunately, the meat of the paper is dense and difficult to parse. The paper attempts to provide some geometric intuition, but the lack of a clean and well-described picture makes the scattered references to the geometric story (e.g., "conic scans") hard to follow. The caption for Figure 1 does not say enough to clearly explain what is being depicted. I don’t have much familiarity with the relevant background literature on geometric algorithms for LDA inference and will defer to other reviewers with stronger background. But I think the general NIPS reader will have too much difficulty reading this paper.

### Reviewer 2

The paper proposes a nonparametric topic model based on geometric properties of topics. The convex geometric perspective on topic models has given rise to some interesting questions. This paper tackles the problem of finding an appropriate number of topics with a topic simplex covering algorithm called the conic scan-and-cover (CSC) approach. Overall, the paper is clearly written and easy to follow. The theoretical justifications of CSC support the main claim of the paper as well. However, unlike the CSC algorithm, the document CSC algorithm contains some arbitrary steps, such as the mean shifting and spherical k-means. The justification of these steps raises further questions. For example, is the mean-shifting result always close to the initial direction? If not, is the initial direction in step 4 of Algorithm 2 necessary? A random selection of the initial direction would be enough, since it would also cover the topic simplex eventually. A similar question arises from the spherical k-means: the resulting topics are no longer vertices of the topic simplex. What if the variance of the original topic vertex before these steps is lower than that of the others? (Perhaps a long document could not be a topic vertex?) Do we still need the mean shifting and k-means in this case, or is it better to keep these vertices? The paper would be more interesting with some (perhaps empirical) discussion of the variance of the point estimates and of the proposed steps.
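
To make the spherical k-means question concrete, here is a minimal sketch of generic spherical k-means (clustering unit vectors by cosine similarity). It is not the paper's implementation: the function name `spherical_kmeans`, its parameters, and the toy `candidates` array are all assumptions chosen for illustration. It shows that each centroid is a renormalized average of its cluster members, so it generally lies strictly inside the cluster rather than at the most extreme direction.

```python
import numpy as np

def spherical_kmeans(X, k, n_iter=20, seed=0):
    """Generic spherical k-means (illustrative, not the paper's code).

    Assumes rows of X are nonzero and that cluster sums do not cancel
    to zero (true for nonnegative data such as the toy example below).
    Returns unit-norm centroids and a label per row.
    """
    rng = np.random.default_rng(seed)
    # Project every row onto the unit sphere.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Initialize centroids from k distinct rows (fancy indexing copies).
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each row to the centroid with the largest cosine similarity.
        labels = np.argmax(X @ centroids.T, axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                # New centroid: sum of members, renormalized to unit length.
                m = members.sum(axis=0)
                centroids[j] = m / np.linalg.norm(m)
    labels = np.argmax(X @ centroids.T, axis=1)
    return centroids, labels

# Hypothetical candidate directions scattered around one vertex e_1.
candidates = np.array([
    [1.0, 0.0, 0.0],   # the extreme direction itself
    [0.9, 0.1, 0.0],
    [0.9, 0.0, 0.1],
])
centroids, labels = spherical_kmeans(candidates, k=1)
print(centroids[0])  # close to, but not equal to, [1, 0, 0]
```

In this toy case the single centroid is pulled away from the extreme direction `[1, 0, 0]` toward the other two candidates, which is the averaging effect the question above is about.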