Part of Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
Jaewoong Cho, Gyeongjo Hwang, Changho Suh
As machine learning becomes prevalent in a widening array of sensitive applications such as job hiring and criminal justice, one critical aspect that machine learning classifiers should respect is to ensure fairness: guaranteeing the irrelevancy of a prediction output to sensitive attributes such as gender and race. In this work, we develop a kernel density estimation trick to quantify fairness measures that capture the degree of the irrelevancy. A key feature of our approach is that quantified fairness measures can be expressed as differentiable functions w.r.t. classifier model parameters. This then allows us to enjoy prominent gradient descent to readily solve an interested optimization problem that fully respects fairness constraints. We focus on a binary classification setting and two well-known definitions of group fairness: Demographic Parity (DP) and Equalized Odds (EO). Our experiments both on synthetic and benchmark real datasets demonstrate that our algorithm outperforms prior fair classifiers in accuracy-fairness tradeoff performance both w.r.t. DP and EO.