__ Summary and Contributions__: This paper offers a method to train individually fair representations. The process is as follows: (1) A data regulator defines a set of similarity notions (e.g., individuals are similar if certain categorical features match and certain numerical features differ by at most a given threshold). (2) A data producer trains a representation in which individuals considered similar are mapped within a specified distance of each other in the latent space. For training, they use DL2, a method for training neural networks with logical constraints --- here, the logical constraints are supplied by the data regulator. (3) A data consumer trains a model on the representation. The data consumer must ensure local robustness of the classifier in order for it to be individually fair: while the representation maps similar individuals "close together" in the latent space, an arbitrary classifier on top of it may not be individually fair, so the trained classifier must be locally robust. The authors provide a comprehensive evaluation on a variety of data sets and problem settings, finding that their method produces classifiers with high degrees of individual fairness.
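To make step (2) concrete, the DL2-style translation of a logical similarity constraint into a differentiable loss can be sketched as below. This is an illustrative reconstruction, not the paper's implementation: the toy encoder weights, the L-infinity distance, and the threshold `delta` are all assumptions.

```python
import numpy as np

def leq_loss(a, b):
    # DL2-style translation: the logical constraint a <= b becomes the
    # non-negative penalty max(a - b, 0), which is zero iff it holds.
    return np.maximum(a - b, 0.0)

def similarity_loss(encode, x, x_sim, delta=0.1):
    # Penalize pairs the regulator deems similar whose latent
    # representations end up more than delta apart.
    z, z_sim = encode(x), encode(x_sim)
    dist = np.max(np.abs(z - z_sim), axis=-1)  # L-infinity distance
    return leq_loss(dist, delta).mean()

# Toy linear "encoder" (weights illustrative, not trained).
W = np.array([[0.5, -0.2], [0.1, 0.3], [0.0, 0.4], [0.2, 0.1]])
encode = lambda x: x @ W

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
x_sim = x + 0.01 * rng.normal(size=(8, 4))  # slightly perturbed "similar" pairs
loss = similarity_loss(encode, x, x_sim)
```

Minimizing such a loss alongside the task loss drives similar individuals to within `delta` of each other in latent space, which is the property a locally robust downstream classifier then exploits.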

__ Strengths__: This paper presents an interesting and novel method to train individually fair representations. Representational fairness has been of interest at NeurIPS, so this work is relevant to the NeurIPS community. Training the fair representation through logical constraints imposed by the data regulator is particularly compelling, as it offers a route for non- or semi-technical data regulators to articulate key individual fairness constraints. The intuitiveness of specifying logical constraints could open up a number of new approaches to representational fairness, which makes the method significant. Further, the empirical results are convincing: the method costs little accuracy while producing significant gains in individual fairness. Last, it's worth noting that the computational overhead imposed by the method is minimal, which is positive.

__ Weaknesses__: One of the advantages of group fair representation learning is that the data consumer doesn't have to train the classifier in any particular fashion. Here, the data consumer needs to be fairness aware and capable of training the classifier to satisfy the needed local robustness property. In this way, the method isn't as widely applicable as related group fair representation learning strategies. Still, the work provides a good start towards training individually fair representations.
Empirically, it would be interesting to see the accuracy and individual fairness of a logistic regression trained on the representation without training for local robustness. While the theoretical necessity of this step is clear, such an experiment would help readers understand its necessity empirically.

__ Correctness__: Claims and methodology seem correct.

__ Clarity__: This paper is well written and easy to read.

__ Relation to Prior Work__: Yes, prior work is well situated.

__ Reproducibility__: Yes

__ Additional Feedback__: Update: Thanks for the responses.

__ Summary and Contributions__: The paper presents an end-to-end framework for learning representations with certified individual fairness. The paper follows the framework setting in [10] that contains a data regulator, a data producer, and a data consumer. The data producer and the data consumer are independent. The main contributions of this paper include: 1) individual fairness notions defined via interpretable logical constraints, 2) the first use of training ML models with logical constraints [15] and the constraint satisfaction checking method [16] for learning certified individually fair models, and 3) an open source tool together with extensive empirical evaluation.

__ Strengths__: + The introduced individual fairness notions based on logical constraints are new and can be adopted in many applications, as they provide interpretable and domain-specific fairness notions.
+ The proposed framework builds on several recently developed techniques: training ML models with logical constraints via a translation from logical constraints to a differentiable loss function [15], new methods for proving that constraints are satisfied, and modeling the encoder as a mixed-integer linear program. While no single one of these techniques can be considered this paper's contribution, putting them together for learning certified individually fair representations and demonstrating feasibility and effectiveness is indeed this paper's contribution.
+ The open-source tool, together with the thorough empirical evaluation, is another strength.

__ Weaknesses__: I do not identify any significant weakness of this work. Some minor comments are below.
Some discussion of comparisons with counterfactual individual fairness and indirect fairness (via redlining attributes) would help readers better understand the applicability of the proposed notions.
The balance parameter \gamma is important. While the paper specifies a different \gamma for each dataset and compares against the baseline with \gamma = 0, it would be better to pick one dataset (e.g., law school) and vary \gamma. Readers could then better understand the accuracy-fairness tradeoff across balance values.

__ Correctness__: The claims and the developed methods are correct. The empirical methodology is also correct.

__ Clarity__: The paper is very well written and can be easily followed. This paper is also a good mix of theoretical analysis and empirical evaluation.

__ Relation to Prior Work__: The claimed contributions and descriptions of relation to prior work are generally clear. The comparisons with [10], from both the setting and approach perspectives, could be clearer. I even wondered whether empirical comparisons with [10] should be conducted.

__ Reproducibility__: Yes

__ Additional Feedback__: The authors may consider moving fair transfer learning from Appendix G to the main body, as this could be another major contribution.
Thanks to the authors for their rebuttal feedback, in particular on the comparisons with counterfactual individual fairness and indirect fairness, and on the balance parameter \gamma.

__ Summary and Contributions__: The paper introduces a framework for learning representations of data with a certified individual fairness guarantee. The framework is composed of two independent components that operate in sequence: (1) a data producer who learns an encoder that maps the data into a latent space, with the goal that similar individuals remain close to each other after the transformation; and (2) a data consumer who learns a certified robust classifier on the transformed data. The composition of the two ensures certified individual fairness. Both the encoder and the robust classifier can be formulated as min-max optimization problems and solved efficiently using existing algorithms.
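As a concrete instance of the consumer-side step, a linear classifier on the latent space admits a closed-form local-robustness certificate via Holder's inequality. The sketch below is illustrative only; the L-infinity ball and the specific weights are assumptions, not the paper's setup.

```python
import numpy as np

def certify_linear(w, b, z, eps):
    # For a linear classifier sign(w.z + b), the prediction is constant
    # on the L-infinity ball of radius eps around z iff the margin
    # |w.z + b| exceeds eps * ||w||_1, since by Holder's inequality
    # |w.delta| <= ||w||_1 * ||delta||_inf for any perturbation delta.
    margin = abs(float(np.dot(w, z)) + b)
    return margin > eps * float(np.abs(w).sum())

# Hypothetical classifier and latent point: w.z + b = 1.6 and
# ||w||_1 = 3, so the prediction is certified for any eps < 1.6 / 3.
w = np.array([1.0, -2.0])
b = 0.5
z = np.array([0.3, -0.4])
```

If the encoder guarantees that all individuals similar to the one mapped to `z` land within the eps-ball around `z`, a certificate of this kind implies they all receive the same prediction, which is exactly the composition argument above.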

__ Strengths__: 1. Learning data representation with a certified fairness guarantee is an important research topic in machine learning. The fair data representation can be used for various downstream tasks in a variety of applications.
2. The certified individual fairness is achieved in two steps, by the data producer and the data consumer; the two components work independently and are not constrained by each other. The proposed method is NOT domain/application-specific: no constraint is imposed on the similarity metric in the fairness criterion.
3. The framework proposed in the paper is flexible and can be realized efficiently using existing tools and algorithms. For example, mixed-integer linear program solvers can be used to find a convex relaxation of the latent set (radius epsilon), and algorithms from robust machine learning can be used by the data consumer to certify individual fairness.

__ Weaknesses__: UPDATE: I appreciate the authors' clarification on the comparison with local DP, the notations \phi and \mu, and the additional experiments examining the effect of the loss balancing factor. As an additional suggestion, it would be better if the authors included more related work on learning robust representations, such as "Zhu, Sicheng, Xiao Zhang, and David Evans. Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization", "Garg, Shivam, et al. A spectral view of adversarially robust features.", and "Pensia, Ankit, Varun Jog, and Po-Ling Loh. Extracting robust and accurate features via a robust information bottleneck". While I believe these studies differ from this paper, the topic is highly relevant to learning fair representations.
1. The performance depends heavily on the choice of classifiers/parameters (e.g., the loss balancing factor) and datasets, and neither a theoretical guarantee nor guidance on selecting classifiers/parameters is provided; this is one of the main reasons limiting my score. The performance is only compared with the BASE case. I wonder if the authors could compare (at least empirically) with other fair representation learning methods.
2. The individual fairness studied in the paper (treating similar people similarly) is highly related to differential privacy and robust ML. In essence, all try to guarantee that a small perturbation of the input cannot change the output significantly. Their relations have been studied extensively in the literature, so it is NOT surprising to see robust learning used to provide certificates for individual fairness. Moreover, most of the methods in the paper are adopted from existing work (e.g., the use of logical constraints, the formulation of min-max optimization, etc.). I am concerned that the work doesn't have significant novelty and contribution. It would be helpful if the authors could explain how their framework differs fundamentally from existing work, especially work on robust learning and (local) differential privacy.
3. When a fairness constraint is imposed in machine learning, there is typically a tradeoff between accuracy and fairness. I believe such a tradeoff also exists in the proposed framework; it is an important criterion but has not been discussed in the paper. However, as shown in the experiments, both accuracy and fairness improve simultaneously in many cases (e.g., the HEALTH dataset), so fairness seems attainable "for free" without losing accuracy. I wonder if the authors can explain why this happens. I suggest the authors conduct more experiments examining the impact of the loss balancing factor on performance, as I believe it plays a critical role in balancing the fairness-accuracy tradeoff.

__ Correctness__: Yes

__ Clarity__: The paper is well-organized and well-written in general. Some parts could be improved. For example, it would be good if the authors stated explicitly, with more detail in Sec. 2, that the similarity measures \phi and \mu are logical expressions taking value either 0 or 1.
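For instance, a 0/1 similarity expression of the kind described could look like the following sketch (the feature names and the income threshold are hypothetical, chosen only to illustrate the form such a predicate takes):

```python
def phi(x1, x2):
    # A logical similarity predicate evaluating to 0 or 1: two
    # individuals are similar if the categorical feature matches and
    # the numerical feature differs by at most a threshold.
    same_category = x1["gender"] == x2["gender"]
    close_income = abs(x1["income"] - x2["income"]) <= 5000
    return int(same_category and close_income)

a = {"gender": "F", "income": 42000}
b = {"gender": "F", "income": 44000}   # similar to a
c = {"gender": "M", "income": 44000}   # not similar to a
```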

__ Relation to Prior Work__: The paper introduces some related work, but the discussion is not sufficient.
As the authors mention in the related work, other works have also studied individually fair representation learning. Although those works may focus on similarity metrics different from the current paper's, their methods should be compared against and the differences highlighted.
As I mentioned in the "Weaknesses" section, individual fairness is highly related to differential privacy and robustness; comparisons with existing works in these domains are needed.

__ Reproducibility__: Yes

__ Additional Feedback__: In sec 5.2, it seems that $h^{(y)}_{\psi}(z)$ is used directly without definition. How does it differ from $h_{\psi}(z)$? My understanding is that $h^{(y)}_{\psi}(z)$ denotes the likelihood of $z$'s label being predicted as $y$ by classifier $h_{\psi}(z)$, please correct me if I am wrong.
Please see "weakness" section for additional suggestions and questions.