NeurIPS 2020

The Dilemma of TriHard Loss and an Element-Weighted TriHard Loss for Person Re-Identification

Review 1

Summary and Contributions: The paper proves that there is a dilemma in the TriHard loss: hard negative samples share certain characteristics with the anchor and hard positive samples, and the feature-vector elements encoding these shared characteristics are pushed apart by negative pairs but pulled together by positive pairs, which harms feature clustering within classes. To mitigate this dilemma, the authors propose several strategies and an Element-weighted TriHard loss. Extensive experiments are conducted on the Market1501 and DukeMTMC-reID datasets, and the method achieves state-of-the-art results.

Strengths: 1. The paper proves the dilemma of the TriHard loss from the perspective of gradient optimization. 2. The paper proposes targeted optimizations based on the dilemma, such as the half TriHard loss, the TriHard loss with feature normalization, and the half TriHard loss with averaged negative samples; the element-weighted TriHard loss finally resolves the dilemma.

Weaknesses: The paper applies element weighting during training, but at test time no weights are applied to novel classes, so features are weighted only during training. This inconsistency may limit accuracy.

Correctness: The claims and method are correct. The empirical methodology is correct, but with some weaknesses; see the comments under Weaknesses.

Clarity: The writing is clear and easy to understand.

Relation to Prior Work: Yes

Reproducibility: Yes

Additional Feedback: After reading the rebuttal, I keep my original rating.

Review 2

Summary and Contributions: This paper aims to alleviate the dilemma of the triplet loss when facing hard negative samples. The "dilemma" mentioned in this paper means that the similarity between the anchor and the positive sample is useful for better representation, while the similarity between the anchor and the negative sample should be repelled; handling the similarities among the anchor, positive sample, and hard negative sample therefore poses a dilemma for the triplet loss. To solve this problem, an Element-weighted TriHard loss function is designed in this paper. The main idea is to weight the feature vectors of the anchor and negative sample before calculating their distance, so as to find discriminative elements of their feature vectors. Meanwhile, three mitigation strategies for the TriHard loss dilemma are discussed. Experimental results on two datasets show that the proposed method can alleviate the dilemma when combined with other strategies. The problems studied in this paper are meaningful for improving the stability of the triplet loss, and the discussed strategies and proposed method may bring some insights to the reader.
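For concreteness, the mechanism summarized above can be sketched as follows: a minimal batch-hard (TriHard) triplet loss, plus a hypothetical element-weighted variant in which elements where the anchor and its hardest negative differ most are emphasized before computing their distance. The softmax weighting here is an illustrative assumption, not necessarily the paper's exact formulation.

```python
import numpy as np

def pairwise_dist(X):
    """Euclidean distance matrix between rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0.0)
    return np.sqrt(d2)

def trihard_loss(features, labels, margin=0.3):
    """Standard batch-hard (TriHard) triplet loss: for each anchor, use the
    farthest same-ID sample (hardest positive) and the closest
    different-ID sample (hardest negative) in the batch."""
    dist = pairwise_dist(features)
    same = labels[:, None] == labels[None, :]
    pos = np.where(same, dist, -np.inf).max(axis=1)   # hardest positive distance
    neg = np.where(same, np.inf, dist).min(axis=1)    # hardest negative distance
    return np.maximum(pos - neg + margin, 0.0).mean()

def element_weighted_trihard_loss(features, labels, margin=0.3):
    """Hypothetical element-weighted variant: down-weight feature elements
    that the anchor and its hardest negative share, so shared
    characteristics are not repelled wholesale."""
    dist = pairwise_dist(features)
    same = labels[:, None] == labels[None, :]
    pos = np.where(same, dist, -np.inf).max(axis=1)
    hard_neg = np.where(same, np.inf, dist).argmin(axis=1)
    diff = features - features[hard_neg]              # anchor minus hardest negative
    a = np.abs(diff)
    w = np.exp(a - a.max(axis=1, keepdims=True))      # numerically stable softmax
    w /= w.sum(axis=1, keepdims=True)
    neg = np.sqrt((w * diff ** 2).sum(axis=1))        # element-weighted distance
    return np.maximum(pos - neg + margin, 0.0).mean()
```

Both losses take an (N, d) feature matrix and an (N,) integer label vector; a batch must contain at least two identities for the hardest-negative term to be finite.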

Strengths: + Both the motivation and the proposed method of this paper are introduced clearly. + The discussion of several strategies that mitigate the influence of hard negative samples on the triplet loss is helpful for other researchers. + The experiments on TH, TH+FH, HTH, HNTH and EWTH, as well as their combinations, will provide reference baselines for related research.

Weaknesses: - As shown in Table 1, the proposed Element-weighted TriHard loss does not actually perform better than several existing strategies. Although combining EWTH with HTH or HNTH yields improved results, the effectiveness of the designed method is incremental. - The best value of t varies across methods and datasets (supplementary materials), so it is too sensitive to select. - Some mathematical symbols in the Method section are problematic. For example, x_k under equation (5) is not mentioned earlier. Because there are so many formulas, it would be best to define all symbols clearly in the paper.

Correctness: I think the proposed method and conducted experiments are correct and implementable.

Clarity: The paper is well written, but some typos should be corrected.

Relation to Prior Work: Yes, related work is well summarized.

Reproducibility: Yes

Additional Feedback: Post rebuttal: The authors addressed my concerns. I would like to keep my rating.

Review 3

Summary and Contributions: This paper investigates the dilemma of the TriHard loss, a problem that can cause unstable training. Through qualitative analysis, the authors argue that this problem is caused by repelling the whole feature vectors of the anchor and hard samples even when they share common elements. Hence, the authors introduce three simple strategies to alleviate this problem. They also propose an Element-weighted TriHard loss function. By putting the loss function and the strategies together, this paper achieves satisfactory results.

Strengths: 1. The goal is clear --- to investigate and alleviate the dilemma of the TriHard loss; 2. The performance is good compared with other loss functions; 3. There are ablation studies; 4. The implementation code is attached, which makes the work easy to follow.

Weaknesses: 1. No evidence is provided to support the stated problem and some claims (such as L91-L93). 2. No training dynamics are presented to explain why the two proposed techniques boost performance. 3. The proposed method is simple, like a combination of tricks. 4. Some parts of this paper are not clear, which makes reading difficult. It is not clear how Figure 2 is plotted using D_{g1,g2}.

Correctness: I am not sure whether the claims and method are correct. First, some claims lack evidence. Second, I did not carefully check the formulations and source code.

Clarity: No, some parts of this paper are not clear, which makes reading difficult.

Relation to Prior Work: No

Reproducibility: Yes

Additional Feedback: Post rebuttal: After reading the rebuttal and the other reviews, I am inclined to change my rating to positive. I believe the authors can strengthen this paper by providing more rigorous evidence for some claims and correcting the typos, as mentioned in the rebuttal. But considering its incremental novelty, my final rating is 6, marginally above the acceptance threshold.

Review 4

Summary and Contributions: This paper analyses the weaknesses of the triplet loss with batch-hard mining, both theoretically and experimentally. Based on the analysis, the paper proposes a new method and shows it to be very effective on the Market1501 and DukeMTMC-reID datasets.

Strengths: 1. The experimental results are very exciting. 2. The analysis is reasonable. 3. Code was provided as supplementary material.

Weaknesses: 1. The writing needs to be carefully polished. 2. It would be better if some larger datasets were evaluated, such as MSMT17.

Correctness: Yes

Clarity: Not very well

Relation to Prior Work: Yes

Reproducibility: Yes

Additional Feedback: Post rebuttal: After reading the rebuttal, I keep my original rating.