NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center

### Reviewer 1

This paper proposes a new re-randomized densification strategy for minhash, together with a bin-wise CWS variant. The proposed method achieves the smallest variance among all densification schemes. Densification of one permutation hashing is a very important problem, and this work makes real progress on it. The idea is quite interesting to me, and I like both the analysis and the experiments. Overall, this is a great, if incremental, advance in densification for one permutation hashing, so I give the paper a positive vote.

### Reviewer 2

In this paper, the authors continue the line of work that replaces k independent minhashes with a single sampled permutation, from which k samples are extracted so as to obtain an unbiased estimator of the Jaccard similarity. The authors present a minor variant of the algorithm from [23] in which more randomness is injected. They analyze the variance of the estimator, conclude that it is smaller than for the scheme from [23], and also evaluate it experimentally. I like the result, since it makes progress on an important problem (estimators for the Jaccard similarity are heavily used in practice). However, the result is a bit incremental, and not really about learning (sure, minhash and its variants can be used for kernel methods, but that is orthogonal to the contribution of the present paper). These considerations put this paper in the "slightly above borderline" pile.
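To make the baseline concrete, here is a minimal sketch of the one-permutation scheme the reviewer describes: a single permutation is applied, the permuted universe is split into k bins, the minimum in each bin is kept, and Jaccard similarity is estimated from bin-wise collisions. All function names and the use of a seeded `random.Random` in place of real hash functions are illustrative assumptions, not the paper's code.

```python
import random

def one_permutation_hash(s, universe_size, k, seed=0):
    """One permutation hashing sketch: apply a single random permutation
    (simulated by a seeded shuffle) to the universe, split it into k
    equal-width bins, and keep the smallest permuted value landing in
    each bin (None marks an empty bin)."""
    rng = random.Random(seed)
    perm = list(range(universe_size))
    rng.shuffle(perm)
    width = universe_size // k
    bins = [None] * k
    for x in s:
        p = perm[x]
        b = min(p // width, k - 1)  # clamp the last partial bin
        if bins[b] is None or p < bins[b]:
            bins[b] = p
    return bins

def jaccard_estimate(h1, h2):
    """Standard OPH estimator: matches over usable bins, where bins that
    are simultaneously empty for both sets carry no information."""
    matches = usable = 0
    for a, b in zip(h1, h2):
        if a is None and b is None:
            continue
        usable += 1
        if a is not None and a == b:
            matches += 1
    return matches / usable if usable else 0.0
```

Usage: hashing two sets with the same seed and comparing the resulting k-length sketches gives an estimate of their Jaccard similarity; identical sets collide in every usable bin.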

### Reviewer 3

The authors propose that the optimal densification for OPH can actually be further optimized. In usual OPH, we apply one permutation to the sparse vector and break it into K equal-sized bins. In the usual Consistent Weighted Sampling (CWS) approach, we sample non-empty bins from these K bins and retrieve a fixed hash code for each. In this new approach, the authors instead treat each of the K bins as a separate sparse vector and perform MinHash on it to obtain a hash code, rather than retrieving one directly. This is called re-randomization. The authors theoretically prove that this re-randomization achieves the smallest variance among densification schemes (which are used to fill hash codes for empty buckets). They also extend the idea to weighted non-negative sparse vectors, by a method called bin-wise CWS.

The paper is a subtle improvement over prior work. It is fairly well written, barring some typos (listed below). The observation is significant but not entirely new.

Typos:
1. "well behavored" ---> "well behaved" (in multiple places)
2. In Section 2, the sentence "we use J(S,T) to denote Jaccard similarity and J(S,T) to denote generalized Jaccard similarity" sounds weird. Are both denoted by J(S,T)?
3. In Section 3, I_s1 = {1,2,3} is wrong. It should be I_s1 = {2,3,4}.

**Post author response:** I have read the author response. I am still positive about the submission and retain my score.
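The re-randomization step described above can be sketched as follows. This is my own illustrative reconstruction, not the authors' exact algorithm: elements are grouped into K bins by one shared permutation, an empty bin borrows the contents of a data-independently chosen donor bin, and every bin's code is drawn by a fresh MinHash over the bin's elements (keyed by the destination bin index) rather than by copying an existing code. Seeded `random.Random` objects stand in for the independent hash functions.

```python
import random

def bin_elements(s, universe_size, k, seed=0):
    """Group each element of s into one of k equal-width bins of the
    permuted universe, using one shared permutation."""
    rng = random.Random(seed)
    perm = list(range(universe_size))
    rng.shuffle(perm)
    width = universe_size // k
    bins = [[] for _ in range(k)]
    for x in s:
        bins[min(perm[x] // width, k - 1)].append(x)
    return bins

def rerandomized_codes(bins, k, seed=0):
    """For each bin, first resolve a source: the bin itself if non-empty,
    otherwise a donor found by data-independent random probing (so both
    sets probe the same donor sequence). Then draw the code with an
    independent minwise hash over the source's elements -- the
    re-randomization -- instead of reusing the donor's own code."""
    codes = []
    for i in range(k):
        source, t = bins[i], 0
        while not source:
            # donor index depends only on (seed, i, t), never on the data
            j = random.Random((seed, i, t)).randrange(k)
            source, t = bins[j], t + 1
        # fresh minwise hash keyed by the destination bin index i
        codes.append(min(source, key=lambda x: random.Random((seed, i, x)).random()))
    return codes
```

Collisions between the code vectors of two sets (built with the same seeds) can then be averaged to estimate Jaccard similarity, as in ordinary minhash.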