Part of Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
Elfarouk Harb, Ho Shan Lam
In this paper, we study the problem of fair clustering on the $k-$center objective. In fair clustering, the input is $N$ points, each belonging to at least one of $l$ protected groups, e.g. male, female, Asian, Hispanic. The objective is to cluster the $N$ points into $k$ clusters to minimize a classical clustering objective function. However, there is an additional constraint that each cluster needs to be fair, under some notion of fairness. This ensures that no group is either ``over-represented'' or ``under-represented'' in any cluster. Our work builds on the work of Chierichetti et al. (NIPS 2017), Bera et al. (NeurIPS 2019), Ahmadian et al. (KDD 2019), and Bercea et al. (APPROX 2019). We obtain a randomized $3-$approximation algorithm for the $k-$center objective function, beating the previous state of the art ($4-$approximation). We test our algorithm on real datasets, and show that our algorithm is effective in finding good clusters without over-representation or under-representation, surpassing the current state of the art in runtime speed, clustering cost, while achieving similar fairness violations.