{"title": "Optimized Pre-Processing for Discrimination Prevention", "book": "Advances in Neural Information Processing Systems", "page_first": 3992, "page_last": 4001, "abstract": "Non-discrimination is a recognized objective in algorithmic decision making. In this paper, we introduce a novel probabilistic formulation of data pre-processing for reducing discrimination. We propose a convex optimization for learning a data transformation with three goals: controlling discrimination, limiting distortion in individual data samples, and preserving utility. We characterize the impact of limited sample size in accomplishing this objective. Two instances of the proposed optimization are applied to datasets, including one on real-world criminal recidivism. Results show that discrimination can be greatly reduced at a small cost in classification accuracy.", "full_text": "Optimized Pre-Processing for Discrimination\n\nPrevention\n\nFlavio P. Calmon\nHarvard University\n\nflavio@seas.harvard.edu\n\nDennis Wei\n\nIBM Research AI\ndwei@us.ibm.com\n\nBhanukiran Vinzamuri\n\nIBM Research AI\n\nbhanu.vinzamuri@ibm.com\n\nKarthikeyan Natesan Ramamurthy\n\nIBM Research AI\n\nknatesa@us.ibm.com\n\nKush R. Varshney\nIBM Research AI\n\nkrvarshn@us.ibm.com\n\nAbstract\n\nNon-discrimination is a recognized objective in algorithmic decision making. In\nthis paper, we introduce a novel probabilistic formulation of data pre-processing\nfor reducing discrimination. We propose a convex optimization for learning a data\ntransformation with three goals: controlling discrimination, limiting distortion\nin individual data samples, and preserving utility. We characterize the impact\nof limited sample size in accomplishing this objective. Two instances of the\nproposed optimization are applied to datasets, including one on real-world criminal\nrecidivism. 
Results show that discrimination can be greatly reduced at a small cost in classification accuracy.

1 Introduction

Discrimination is the prejudicial treatment of an individual based on membership in a legally protected group such as a race or gender. Direct discrimination occurs when protected attributes are used explicitly in making decisions, also known as disparate treatment. More pervasive nowadays is indirect discrimination, in which protected attributes are not used but reliance on variables correlated with them leads to significantly different outcomes for different groups. The latter phenomenon is termed disparate impact. Indirect discrimination may be intentional, as in the historical practice of "redlining" in the U.S., in which home mortgages were denied in zip codes populated primarily by minorities. However, the doctrine of disparate impact applies regardless of actual intent.

Supervised learning algorithms, increasingly used for decision making in applications of consequence, may at first be presumed to be fair and devoid of inherent bias, but in fact inherit any bias or discrimination present in the data on which they are trained [Calders and Žliobaitė, 2013]. Furthermore, simply removing protected variables from the data is not enough, since it does nothing to address indirect discrimination and may in fact conceal it. The need for more sophisticated tools has made discrimination discovery and prevention an important research area [Pedreschi et al., 2008].

Algorithmic discrimination prevention involves modifying one or more of the following to ensure that decisions made by supervised learning methods are less biased: (a) the training data, (b) the learning algorithm, and (c) the ensuing decisions themselves.
These are respectively classified as pre-processing [Hajian, 2013], in-processing [Fish et al., 2016, Zafar et al., 2016, Kamishima et al., 2011], and post-processing approaches [Hardt et al., 2016]. In this paper, we focus on pre-processing since it is the most flexible in terms of the data science pipeline: it is independent of the modeling algorithm and can be integrated with data release and publishing mechanisms.

Researchers have also studied several notions of discrimination and fairness. Disparate impact is addressed by the principles of statistical parity and group fairness [Feldman et al., 2015], which seek similar outcomes for all groups. In contrast, individual fairness [Dwork et al., 2012] mandates that similar individuals be treated similarly irrespective of group membership. For classifiers and other predictive models, equal error rates for different groups are a desirable property [Hardt et al., 2016], as is calibration or lack of predictive bias in the predictions [Zhang and Neill, 2016]. The tension between the last two notions is described by Kleinberg et al. [2017] and Chouldechova [2016]; the work of Friedler et al. [2016] is in a similar vein. Corbett-Davies et al. [2017] discuss the trade-offs in satisfying prevailing notions of algorithmic fairness from a public safety standpoint. Since the present work pertains to pre-processing and not modeling, balanced error rates and predictive bias are less relevant criteria.

31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.

[Figure 1: The proposed pipeline for predictive learning with discrimination prevention. Learn mode applies with training data and apply mode with novel test data. Note that test data also requires transformation before predictions can be obtained.]
Instead, we focus primarily on achieving group fairness while also accounting for individual fairness through a distortion constraint.

Existing pre-processing approaches include sampling or re-weighting the data to neutralize discriminatory effects [Kamiran and Calders, 2012], changing the individual data records [Hajian and Domingo-Ferrer, 2013], and using t-closeness [Li et al., 2007] for discrimination control [Ruggieri, 2014]. A common theme is the importance of balancing discrimination control against utility of the processed data. However, this prior work neither presents general and principled optimization frameworks for trading off these two criteria, nor allows connections to be made to the broader statistical learning and information theory literature via probabilistic descriptions. Another shortcoming is that individual distortion or fairness is not made explicit.

In this work, we (i) introduce a probabilistic framework for discrimination-preventing pre-processing in supervised learning, (ii) formulate an optimization problem for producing pre-processing transformations that trade off discrimination control, data utility, and individual distortion, (iii) characterize theoretical properties of the optimization approach (e.g., convexity, robustness to limited samples), and (iv) benchmark the ensuing pre-processing transformations on real-world datasets. Our aim in part is to work toward a more unified view of existing pre-processing concepts and methods, which may help to suggest refinements. While discrimination and utility are defined at the level of probability distributions, distortion is controlled on a per-sample basis, thereby limiting the effect of the transformation on individuals and ensuring a degree of individual fairness. Figure 1 illustrates the supervised learning pipeline that includes our proposed discrimination-preventing pre-processing.

The work of Zemel et al.
[2013] is closest to ours in also presenting a framework with three criteria related to discrimination control (group fairness), individual fairness, and utility. However, the criteria are manifested less directly than in our proposal. Discrimination control is posed in terms of intermediate features rather than outcomes, individual distortion does not take outcomes into account (being an ℓ2-norm between original and transformed features), and utility is specific to a particular classifier. Our formulation more naturally and generally encodes these fairness and utility desiderata.

Given the novelty of our formulation, we devote more effort than usual to discussing its motivations and potential variations. We state conditions under which the proposed optimization problem is convex. The optimization assumes as input an estimate of the distribution of the data which, in practice, can be imprecise due to limited sample size. Accordingly, we characterize the possible degradation in discrimination and utility guarantees at test time in terms of the training sample size. To demonstrate our framework, we apply specific instances of it to a prison recidivism dataset [ProPublica, 2017] and the UCI Adult dataset [Lichman, 2013]. We show that discrimination, distortion, and utility loss can be controlled simultaneously with real data. We also show that the pre-processed data reduces discrimination when training standard classifiers, particularly when compared to the original data with and without removing protected variables. In the Supplementary Material (SM), we describe in more detail the resulting transformations and the demographic patterns that they reveal.

2 General Formulation

We are given a dataset consisting of n i.i.d. samples {(D_i, X_i, Y_i)}_{i=1}^n from a joint distribution p_{D,X,Y} with domain D × X × Y.
Here, D denotes one or more protected (discriminatory) variables such as gender and race, X denotes other non-protected variables used for decision making, and Y is an outcome random variable. We use the term 'discriminatory' interchangeably with 'protected,' and not in the usual statistical sense. For instance, Y_i could represent a loan approval decision for individual i based on demographic information D_i and credit score X_i. We focus in this paper on discrete (or discretized) and finite domains D and X and binary outcomes, i.e. Y = {0, 1}. There is no restriction on the dimensions of D and X.

Our goal is to determine a randomized mapping p_{X̂,Ŷ|X,Y,D} that (i) transforms the given dataset into a new dataset {(D_i, X̂_i, Ŷ_i)}_{i=1}^n which may be used to train a model, and (ii) similarly transforms data to which the model is applied, i.e. test data. Each (X̂_i, Ŷ_i) is drawn independently from the same domain X × Y as X, Y by applying p_{X̂,Ŷ|X,Y,D} to the corresponding triplet (D_i, X_i, Y_i). Since D_i is retained as-is, we do not include it in the mapping to be determined. Motivation for retaining D is discussed later in Section 3. For test samples, Y_i is not available at the input while Ŷ_i may not be needed at the output. In this case, a reduced mapping p_{X̂|X,D} is used, as given later in (9).

It is assumed that p_{D,X,Y} is known along with its marginals and conditionals. This assumption is often satisfied using the empirical distribution of {(D_i, X_i, Y_i)}_{i=1}^n.
In Section 3, we state a result ensuring that discrimination and utility loss continue to be controlled if the distribution used to determine p_{X̂,Ŷ|X,Y,D} differs from the distribution of test samples.

We propose that the mapping p_{X̂,Ŷ|X,Y,D} satisfy the three following properties.

I. Discrimination Control. The first objective is to limit the dependence of the transformed outcome Ŷ on the protected variables D. We propose two alternative formulations. The first requires the conditional distribution p_{Ŷ|D} to be close to a target distribution p_{Y_T} for all values of D:

    J( p_{Ŷ|D}(y|d), p_{Y_T}(y) ) ≤ ε_{y,d}   ∀ d ∈ D, y ∈ {0, 1},   (1)

where J(·,·) denotes some distance function. In the second formulation, we constrain the conditional probability p_{Ŷ|D} to be similar for any two values of D:

    J( p_{Ŷ|D}(y|d1), p_{Ŷ|D}(y|d2) ) ≤ ε_{y,d1,d2}   ∀ d1, d2 ∈ D, y ∈ {0, 1}.   (2)

Note that the number of such constraints is O(|D|²), as opposed to O(|D|) constraints in (1). The choice of p_{Y_T} in (1), and of J and ε in (1) and (2), should be informed by societal aspects, consultations with domain experts and stakeholders, and legal considerations such as the "80% rule" [EEOC, 1979]. For this work, we choose J to be the following probability ratio measure:

    J(p, q) = | p/q − 1 |.   (3)

This metric is motivated by the "80% rule." The combination of (3) and (1) generalizes the extended lift criterion proposed in the literature [Pedreschi et al., 2012], while the combination of (3) and (2) generalizes selective and contrastive lift. The latter combination, (2) with (3), is used in the numerical results in Section 4.
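The ratio measure (3) and the pairwise constraint (2) are straightforward to compute from conditional outcome probabilities; the following is a minimal sketch in which the conditional distributions, group labels, and the threshold are illustrative toy values rather than quantities from the paper.

```python
# Sketch of the probability-ratio measure J in (3) and the pairwise
# discrimination constraint (2). The conditional distribution p_yhat_given_d
# below is a toy example, not taken from the paper.

def J(p, q):
    """Probability ratio measure J(p, q) = |p/q - 1|, motivated by the 80% rule."""
    return abs(p / q - 1.0)

def max_pairwise_discrimination(p_yhat_given_d, y_values=(0, 1)):
    """Largest J(p_Yhat|D(y|d1), p_Yhat|D(y|d2)) over outcomes and group pairs,
    as constrained in (2)."""
    groups = list(p_yhat_given_d)
    return max(
        J(p_yhat_given_d[d1][y], p_yhat_given_d[d2][y])
        for y in y_values
        for d1 in groups
        for d2 in groups
        if d1 != d2
    )

# Toy conditional distributions p_Yhat|D(y|d) for two protected groups.
p_yhat_given_d = {
    "group_a": {0: 0.60, 1: 0.40},
    "group_b": {0: 0.55, 1: 0.45},
}

disc = max_pairwise_discrimination(p_yhat_given_d)
# Constraint (2) with a uniform (hypothetical) threshold eps = 0.15.
satisfies = disc <= 0.15
```

Note that J is asymmetric in (p, q), which is why both orderings of each group pair are examined above.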
We note that the selection of a 'fair' target distribution p_{Y_T} in (1) is not straightforward; see Žliobaitė et al. [2011] for one such proposal. Despite its practical motivation, we alert the reader that (3) may be unnecessarily restrictive when q is low.

In (1) and (2), discrimination control is imposed jointly with respect to all protected variables, e.g. all combinations of gender and race if D consists of those two variables. An alternative is to take the protected variables one at a time and impose univariate discrimination control. In this work, we opt for the more stringent joint discrimination control, although legal formulations tend to be of the univariate type.

Formulations (1) and (2) control discrimination at the level of the overall population in the dataset. It is also possible to control discrimination within segments of the population by conditioning on additional variables B, where B is a subset of X and X is a collection of features. Constraint (1) would then generalize to J( p_{Ŷ|D,B}(y|d, b), p_{Y_T|B}(y|b) ) ≤ ε_{y,d,b} for all d ∈ D, y ∈ {0, 1}, and b ∈ B. Similar conditioning or 'context' for discrimination has been explored before in Hajian and Domingo-Ferrer [2013] in the setting of association rule mining. For example, B could represent the fraction of a pool of applicants that applied to a certain department, which enables the metric to avoid statistical traps such as Simpson's paradox [Pearl, 2014]. One may wish to control for such variables in determining the presence of discrimination, while ensuring that population segments created by conditioning are large enough to derive statistically valid inferences. Moreover, we note that there may exist inaccessible latent variables that drive discrimination, and the metrics used here are inherently limited by the available data.
Recent definitions of fairness that seek to mitigate this issue include [Johnson et al., 2016] and [Kusner et al., 2017]. We defer further investigation of causality and conditional discrimination to future work.

II. Distortion Control. The mapping p_{X̂,Ŷ|X,Y,D} should satisfy distortion constraints with respect to the domain X × Y. These constraints restrict the mapping to reduce or avoid altogether certain large changes (e.g. a very low credit score being mapped to a very high credit score). Given a distortion metric δ : (X × Y)² → ℝ₊, we constrain the conditional expectation of the distortion as

    E[ δ((x, y), (X̂, Ŷ)) | D = d, X = x, Y = y ] ≤ c_{d,x,y}   ∀ (d, x, y) ∈ D × X × Y.   (4)

We assume that δ((x, y), (x, y)) = 0 for all (x, y) ∈ X × Y. Constraint (4) is formulated with pointwise conditioning on (D, X, Y) = (d, x, y) in order to promote individual fairness. It ensures that distortion is controlled for every combination of (d, x, y), i.e. every individual in the original dataset, and more importantly, every individual to which a model is later applied. By way of contrast, an average-case measure in which an expectation is also taken over D, X, Y may result in high distortion for certain (d, x, y), likely those with low probability. Equation (4) also allows the level of control c_{d,x,y} to depend on (d, x, y) if desired. We also note that (4) is a property of the mapping p_{X̂,Ŷ|D,X,Y}, and does not depend on the assumed distribution p_{D,X,Y}.

The expectation over X̂, Ŷ in (4) encompasses several cases depending on the choices of the metric δ and thresholds c_{d,x,y}. If c_{d,x,y} = 0, then no mappings with nonzero distortion are allowed for individuals with original values (d, x, y). If c_{d,x,y} > 0, then certain mappings may still be disallowed by assigning them infinite distortion.
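Constraint (4) can be checked directly for any candidate conditional mapping; the following minimal sketch uses a toy mapping, a toy distortion metric, and an illustrative budget, none of which come from the paper.

```python
# Direct check of the expected-distortion constraint (4) for a candidate
# randomized mapping. The mapping, distortion values, and threshold below are
# illustrative toy values, not the paper's.

def expected_distortion(mapping, delta, x, y):
    """E[ delta((x, y), (Xhat, Yhat)) | D=d, X=x, Y=y ] for one (d, x, y),
    where `mapping` is p_Xhat,Yhat|X,Y,D(., .|x, y, d) as {(xhat, yhat): prob}."""
    return sum(p * delta((x, y), xy_hat) for xy_hat, p in mapping.items())

# Toy distortion metric: 0 for no change in x, 1 per unit change in x, and a
# large "effectively infinite" penalty for flipping the outcome y.
def delta(orig, new):
    (x, y), (xhat, yhat) = orig, new
    if yhat != y:
        return 1e4
    return abs(xhat - x)

# Candidate conditional mapping for one individual with (x, y) = (2, 0):
# stay at x = 2 with probability 0.8, move to x = 3 with probability 0.2.
mapping = {(2, 0): 0.8, (3, 0): 0.2}

ed = expected_distortion(mapping, delta, x=2, y=0)
# Constraint (4): expected distortion within a (hypothetical) budget c_{d,x,y} = 0.3.
within_budget = ed <= 0.3
```

Assigning a prohibitively large penalty inside delta plays the role of "infinite distortion" above: any mapping that flips the outcome would blow through any finite budget.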
Mappings with finite distortion are permissible subject to the budget c_{d,x,y}. Lastly, if δ is binary-valued (perhaps achieved by thresholding a multi-valued distortion function), it can be seen as classifying mappings into desirable (δ = 0) and undesirable ones (δ = 1). Here, (4) reduces to a bound on the conditional probability of an undesirable mapping, i.e.,

    Pr( δ((x, y), (X̂, Ŷ)) = 1 | D = d, X = x, Y = y ) ≤ c_{d,x,y}.   (5)

III. Utility Preservation. In addition to constraints on individual distortions, we also require that the distribution of (X̂, Ŷ) be statistically close to the distribution of (X, Y). This is to ensure that a model learned from the transformed dataset (when averaged over the protected variables D) is not too different from one learned from the original dataset, e.g. a bank's existing policy for approving loans. For a given dissimilarity measure Δ between probability distributions (e.g. KL-divergence), we require that Δ( p_{X̂,Ŷ}, p_{X,Y} ) be small.

Optimization Formulation. Putting together the considerations from the three previous subsections, we arrive at the optimization problem below for determining a randomized transformation p_{X̂,Ŷ|X,Y,D} mapping each sample (D_i, X_i, Y_i) to (X̂_i, Ŷ_i):

    min_{p_{X̂,Ŷ|X,Y,D}}  Δ( p_{X̂,Ŷ}, p_{X,Y} )
    s.t.  J( p_{Ŷ|D}(y|d), p_{Y_T}(y) ) ≤ ε_{y,d}  ∀ d ∈ D, y ∈ {0, 1},
          E[ δ((x, y), (X̂, Ŷ)) | D = d, X = x, Y = y ] ≤ c_{d,x,y}  ∀ (d, x, y) ∈ D × X × Y, and
          p_{X̂,Ŷ|X,Y,D} is a valid distribution.   (6)

We choose to minimize the utility loss Δ subject to constraints on individual distortion (4) and discrimination (we use (1) for concreteness, but (2) can be used instead), since it is more natural to place bounds on the latter two.

The distortion constraints (4) are an essential component of the problem formulation (6). Without (4), and assuming that p_{Y_T} = p_Y, it is possible to achieve perfect utility and non-discrimination simply by sampling (X̂_i, Ŷ_i) from the original distribution p_{X,Y} independently of any inputs, i.e. p_{X̂,Ŷ|X,Y,D}(x̂, ŷ|x, y, d) = p_{X̂,Ŷ}(x̂, ŷ) = p_{X,Y}(x̂, ŷ). Then Δ(p_{X̂,Ŷ}, p_{X,Y}) = 0, and p_{Ŷ|D}(y|d) = p_Ŷ(y) = p_Y(y) = p_{Y_T}(y) for all d ∈ D. Clearly, this solution is objectionable from the viewpoint of individual fairness, especially for individuals to whom a subsequent model is applied, since it amounts to discarding an individual's data and replacing it with a random sample from the population p_{X,Y}. Constraint (4) seeks to prevent such gross deviations from occurring. The distortion constraints may, however, render the optimization infeasible, as illustrated in the SM.

3 Theoretical Properties

I. Convexity. We show conditions under which (6) is a convex or quasiconvex optimization problem, and can thus be solved to optimality. The proof is presented in the SM.

Proposition 1.
Problem (6) is a (quasi)convex optimization if Δ(·,·) is (quasi)convex and J(·,·) is quasiconvex in their respective first arguments (with the second arguments fixed). If discrimination constraint (2) is used in place of (1), then the condition on J is that it be jointly quasiconvex in both arguments.

II. Generalizability of Discrimination Control. We now discuss the generalizability of discrimination guarantees (1) and (2) to unseen individuals, i.e. those to whom a model is applied. Recall from Section 2 that the proposed transformation retains the protected variables D. We first consider the case where models trained on the transformed data to predict Ŷ are allowed to depend on D. While such models may qualify as disparate treatment, the intent and effect is to better mitigate disparate impact resulting from the model. In this respect our proposal shares the same spirit with 'fair' affirmative action in Dwork et al. [2012] (fairer on account of distortion constraint (4)).

Assuming that predictive models for Ŷ can depend on D, let Ỹ be the output of such a model based on D and X̂. To remove the separate issue of model accuracy, suppose for simplicity that the model provides a good approximation to the conditional distribution of Ŷ, i.e. p_{Ỹ|X̂,D}(ỹ|x̂, d) ≈ p_{Ŷ|X̂,D}(ỹ|x̂, d). Then for individuals in a protected group D = d, the conditional distribution of Ỹ is given by

    p_{Ỹ|D}(ỹ|d) = Σ_{x̂} p_{Ỹ|X̂,D}(ỹ|x̂, d) p_{X̂|D}(x̂|d) ≈ Σ_{x̂} p_{Ŷ|X̂,D}(ỹ|x̂, d) p_{X̂|D}(x̂|d) = p_{Ŷ|D}(ỹ|d).   (7)

Hence the model output p_{Ỹ|D} can also be controlled by (1) or (2).

On the other hand, if D must be suppressed from the transformed data, perhaps to comply with legal requirements regarding its non-use, then a predictive model can depend only on X̂ and approximate p_{Ŷ|X̂}, i.e. p_{Ỹ|X̂,D}(ỹ|x̂, d) = p_{Ỹ|X̂}(ỹ|x̂) ≈ p_{Ŷ|X̂}(ỹ|x̂). In this case we have

    p_{Ỹ|D}(ỹ|d) = Σ_{x̂} p_{Ỹ|X̂,D}(ỹ|x̂, d) p_{X̂|D}(x̂|d) ≈ Σ_{x̂} p_{Ŷ|X̂}(ỹ|x̂) p_{X̂|D}(x̂|d),   (8)

which in general is not equal to p_{Ŷ|D}(ỹ|d) in (7). The quantity on the right-hand side of (8) is less straightforward to control. We address this question in the SM.

III. Training and Application Considerations. The proposed optimization framework has two modes of operation (Fig. 1): train and apply. In train mode, the optimization problem (6) is solved in order to determine a mapping p_{X̂,Ŷ|X,Y,D} for randomizing the training set. The randomized training set, in turn, is used to fit a classification model f_θ(X̂, D) that approximates p_{Ŷ|X̂,D}, where θ are the parameters of the model. At apply time, a new data point (X, D) is received and transformed into (X̂, D) through a randomized mapping p_{X̂|X,D}. The mapping p_{X̂|D,X} is given by marginalizing over Y, Ŷ:

    p_{X̂|D,X}(x̂|d, x) = Σ_{y,ŷ} p_{X̂,Ŷ|X,Y,D}(x̂, ŷ|x, y, d) p_{Y|X,D}(y|x, d).   (9)

Assuming that the variable D is not suppressed, and that the marginals are known, the utility and discrimination guarantees set during train time still hold during apply time, as discussed above. However, the distortion control will inevitably change, since the mapping has been marginalized over Y. More specifically, the bound on the expected distortion for each sample becomes

    E[ E[ δ((x, Y), (X̂, Ŷ)) | D = d, X = x, Y ] | D = d, X = x ] ≤ Σ_{y∈Y} p_{Y|X,D}(y|x, d) c_{x,y,d} ≜ c_{x,d}.   (10)

If the distortion control values c_{x,y,d} are independent of y, then the upper bound on distortion set during training time still holds during apply time. Otherwise, (10) provides a bound on individual distortion at apply time. The same guarantee holds for the case when D is suppressed.

IV. Robustness to Mismatched Prior Distribution Estimation. We may also consider the case where the distribution p_{D,X,Y} used to determine the transformation differs from the distribution q_{D,X,Y} of test samples. This occurs, for example, when p_{D,X,Y} is the empirical distribution computed from n i.i.d. samples from an unknown distribution q_{D,X,Y}. In this situation, discrimination control and utility are still guaranteed for samples drawn from q_{D,X,Y} that are transformed using p_{Ŷ,X̂|X,Y,D}, where the latter is obtained by solving (6) with p_{D,X,Y}. In particular, denoting by q_{Ŷ|D} and q_{X̂,Ŷ} the corresponding distributions for Ŷ, X̂ and D when q_{D,X,Y} is transformed using p_{Ŷ,X̂|X,Y,D}, we have J( q_{Ŷ|D}(y|d), p_{Y_T}(y) ) → J( p_{Ŷ|D}(y|d), p_{Y_T}(y) ) and Δ( q_{X,Y}, q_{X̂,Ŷ} ) → Δ( p_{X,Y}, p_{X̂,Ŷ} ) for n sufficiently large (the distortion control constraints (4) only depend on p_{Ŷ,X̂|X,Y,D}). The next proposition provides an estimate of the rate of this convergence in terms of n, assuming p_{Y,D}(y, d) is fixed and bounded away from zero. Its proof can be found in the SM.

Proposition 2. Let p_{D,X,Y} be the empirical distribution obtained from n i.i.d. samples that is used to determine the mapping p_{Ŷ,X̂|X,Y,D}, and q_{D,X,Y} be the true distribution of the data, with support size m ≜ |X × Y × D|. In addition, denote by q_{D,X̂,Ŷ} the joint distribution after applying p_{Ŷ,X̂|X,Y,D} to samples from q_{D,X,Y}. If for all y ∈ Y, d ∈ D we have p_{Y,D}(y, d) > 0, J( p_{Ŷ|D}(y|d), p_{Y_T}(y) ) ≤ ε, where J is given in (3), and

    Δ( p_{X,Y}, p_{X̂,Ŷ} ) = Σ_{x,y} | p_{X,Y}(x, y) − p_{X̂,Ŷ}(x, y) | ≤ μ,

then, with probability 1 − β,

    max{ J( q_{Ŷ|D}(y|d), p_{Y_T}(y) ) − ε, Δ( q_{X,Y}, q_{X̂,Ŷ} ) − μ } ≲ √( (m/n)(1 + log(n/m)) − (1/n) log β ).   (11)

Proposition 2 guarantees that, as long as n is sufficiently large, the utility and discrimination control guarantees will approximately hold when p_{X̂,Ŷ|Y,X,D} is applied to fresh samples drawn from q_{D,X,Y}. In particular, the utility and discrimination guarantees will converge to the ones used as parameters in the optimization at a rate that is at least √((1/n) log n). The distortion control guarantees (4) are a property of the mapping p_{X̂,Ŷ|Y,X,D}, and do not depend on the distribution of the data. The convergence rate is tied to the support size, and for large m a dimensionality reduction step may be required to assuage generalization issues. The same upper bound on convergence rate holds for discrimination constraints of the form (2).

4 Experimental Results

This section provides a numerical demonstration of running the data processing pipeline in Fig. 1. Our focus here is on the discrimination-accuracy trade-off obtained when the pre-processed data is used to train standard prediction algorithms.
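Before turning to the experiments, note that the apply-time reduction in (9) and the marginalized distortion budget in (10) are both simple marginalizations over Y; the following is a minimal pure-Python sketch on toy distributions (all probabilities, budgets, and the group label are illustrative, not quantities from the datasets).

```python
# Sketch of apply-time behavior: the reduced mapping p_Xhat|D,X in (9) obtained
# by marginalizing the learned p_Xhat,Yhat|X,Y,D over (y, yhat), and the
# marginalized distortion budget c_{x,d} in (10). All numbers are toy values.

def reduced_mapping(p_map, p_y_given_xd, d, x, xhat_values, y_values, yhat_values):
    """Equation (9): p_Xhat|D,X(xhat|d,x) =
    sum_{y,yhat} p_Xhat,Yhat|X,Y,D(xhat,yhat|x,y,d) * p_Y|X,D(y|x,d)."""
    return {
        xhat: sum(
            p_map.get((xhat, yhat, x, y, d), 0.0) * p_y_given_xd[(y, x, d)]
            for y in y_values for yhat in yhat_values
        )
        for xhat in xhat_values
    }

def apply_time_budget(p_y_given_xd, c, d, x, y_values):
    """Equation (10): c_{x,d} = sum_y p_Y|X,D(y|x,d) * c_{x,y,d}."""
    return sum(p_y_given_xd[(y, x, d)] * c[(x, y, d)] for y in y_values)

# Toy learned mapping p_Xhat,Yhat|X,Y,D, keyed (xhat, yhat, x, y, d);
# missing entries are zero probability.
p_map = {
    (0, 0, 0, 0, "a"): 0.9, (1, 0, 0, 0, "a"): 0.1,  # (x,y)=(0,0): mostly unchanged
    (0, 1, 0, 1, "a"): 1.0,                          # (x,y)=(0,1): kept as-is
}
p_y_given_xd = {(0, 0, "a"): 0.5, (1, 0, "a"): 0.5}
c = {(0, 0, "a"): 0.4, (0, 1, "a"): 0.2}

p_xhat = reduced_mapping(p_map, p_y_given_xd, d="a", x=0,
                         xhat_values=(0, 1), y_values=(0, 1), yhat_values=(0, 1))
c_xd = apply_time_budget(p_y_given_xd, c, d="a", x=0, y_values=(0, 1))
```

Because each conditional slice of the learned mapping sums to one and p_Y|X,D is a distribution, the reduced mapping is again a valid conditional distribution over x̂.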
The SM presents additional results on the trade-off between discrimination control ε and utility Δ as well as an analysis of the optimized data transformations. We apply the pipeline to ProPublica's COMPAS recidivism data [ProPublica, 2017] and the UCI Adult dataset [Lichman, 2013]. From the COMPAS dataset (7214 instances), we select severity of charge, number of prior crimes, and age category to be the decision variables (X). The outcome variable (Y) is a binary indicator of whether the individual recidivated (re-offended), and race is set to be the protected variable (D). The encoding of categorical variables is described in the SM. For the Adult dataset (32561 instances), the features were categorized as protected variables (D): gender (male, female); decision variables (X): age (quantized to decades) and education (quantized to years); and response variable (Y): income (binary).

Our proposed approach is benchmarked against two baselines: leaving the dataset as-is, and suppressing the protected variable D during training and testing. We also compare against the learning fair representations (LFR) algorithm from Zemel et al. [2013]. As discussed in the introduction, LFR has fundamental differences from the proposed framework. In particular, LFR only considers binary-valued D, and consequently, we restrict D to be binary in the experiments presented here. However, our method is not restricted to D being binary or univariate. Illustrations of our method on non-binary D are provided in the SM.

The details of applying our method to the datasets are as follows.
For each train/test split, we approximate p_{D,X,Y} using the empirical distribution of (D, X, Y) in the training set and solve (6) using a standard convex solver [Diamond and Boyd, 2016]. For both datasets the utility metric Δ is the total variation distance, i.e. Δ( p_{X,Y}, p_{X̂,Ŷ} ) = (1/2) Σ_{x,y} | p_{X,Y}(x, y) − p_{X̂,Ŷ}(x, y) |, the discrimination constraint is the combination of (2) and (3), and two levels of discrimination control are used, ε ∈ {0.05, 0.1}. The distortion function δ is chosen differently for the two datasets as described below, based on the differing semantics of the variables in the two applications. The specific values were chosen for demonstration purposes to be reasonable to our judgment and can easily be tuned according to the desires of a practitioner. We emphasize that the distortion values were not selected to optimize the results presented here. All experiments run in minutes on a standard laptop.

Distortion function for COMPAS: We use the expected distortion constraint in (4) with c_{d,x,y} = 0.4, 0.3 for d being respectively African-American and Caucasian. The distortion function δ has the following behavior. Jumps of more than one category in age and prior counts are heavily discouraged by a high distortion penalty (10⁴) for such transformations. We impose the same penalty on increases in recidivism (change of Y from 0 to 1). Both these choices are made in the interest of individual fairness. Furthermore, for every jump to an adjacent category for age and prior counts, a penalty of 1 is assessed, and a similar jump in charge degree incurs a penalty of 2. Reduction in recidivism (1 to 0) has a penalty of 2. The total distortion for each individual is the sum of squares of distortions for each attribute of X.

Distortion function for Adult: We use three conditional probability constraints of the form in (5).
In constraint i, the distortion function returns 1 in case (i) below and 0 otherwise: (1) if income is decreased, age is not changed, and education is increased by at most 1 year; (2) if age is changed by a decade and education is increased by at most 1 year, regardless of the change in income; (3) if age is changed by more than a decade, or education is lowered by any amount or increased by more than 1 year. The corresponding probability bounds c_{d,x,y} are 0.1, 0.05, 0 (no dependence on d, x, y). As a consequence, and in the same broad spirit as for COMPAS, decreases in income, small changes in age, and small increases in education (events (1), (2)) are permitted with small probabilities, while larger changes in age and education (event (3)) are not allowed at all.

Once the optimized randomized mapping p_{X̂,Ŷ|D,X,Y} is determined, we apply it to the training set to obtain a new perturbed training set, which is then used to fit two classifiers: logistic regression (LR) and random forest (RF). For the test set, we first compute the test-time mapping p_{X̂|D,X} in (9) using p_{X̂,Ŷ|D,X,Y} and p_{D,X,Y} estimated from the training set. We then independently randomize each test sample (d_i, x_i) using p_{X̂|D,X}, preserving the protected variable D, i.e. (d_i, x_i) → (d_i, x̂_i). Each trained classifier f is applied to the transformed test samples, obtaining an estimate ỹ_i = f(d_i, x̂_i), which is evaluated against y_i. These estimates induce an empirical posterior distribution given by p_{Ỹ|D}(1|d) = (1/n_d) Σ_{{x̂_i, d_i} : d_i = d} f(d_i, x̂_i), where n_d is the number of samples with d_i = d.

For the two baselines, the above procedure is repeated without data transformation, except for dropping D throughout for the second baseline (D is still used to compute the discrimination of the resulting classifier).
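The COMPAS distortion function described above can be written out directly; the following minimal sketch uses an illustrative integer encoding of the categories, and the exact way the outcome penalty combines with the squared attribute distortions follows the SM, so those details are assumptions here.

```python
# Sketch of the COMPAS distortion: category jumps of more than one step in age
# or prior counts, and increases in recidivism, get a prohibitive penalty
# (1e4); adjacent-category jumps in age/priors cost 1, a charge-degree change
# costs 2, and a recidivism reduction costs 2. Per-attribute distortions on X
# are squared and summed. The integer category encoding is illustrative.

def attribute_distortion(jump, step_cost):
    """Per-attribute penalty for a categorical jump of `jump` steps."""
    if jump == 0:
        return 0.0
    if jump == 1:
        return step_cost
    return 1e4  # jumps of more than one category are heavily discouraged

def compas_distortion(orig, new):
    """delta((x, y), (xhat, yhat)) with x = (age_cat, priors_cat, charge_cat)."""
    (age, priors, charge, y), (age2, priors2, charge2, y2) = orig, new
    d_x = [
        attribute_distortion(abs(age2 - age), step_cost=1.0),
        attribute_distortion(abs(priors2 - priors), step_cost=1.0),
        2.0 if charge2 != charge else 0.0,  # charge-degree change costs 2
    ]
    if y == 0 and y2 == 1:
        d_y = 1e4   # increasing recidivism is heavily penalized
    elif y == 1 and y2 == 0:
        d_y = 2.0   # reducing recidivism costs 2
    else:
        d_y = 0.0
    # Sum of squares over the attributes of X; the outcome penalty is added
    # directly (the exact combination follows the SM).
    return sum(v ** 2 for v in d_x) + d_y
```

Plugging such a function into the expected-distortion constraint (4), with the per-group budgets c_{d,x,y} above, reproduces the kind of individual-level control the experiments rely on.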
Due to the lack of available code, we implemented LFR ourselves in Python and solved the associated optimization problem using the SciPy package. The parameters for LFR were set as recommended in Zemel et al. [2013]: A_z = 50 (group fairness), A_x = 0.01 (individual fairness), and A_y = 1 (prediction accuracy). The results did not change significantly under reasonable variation of these three parameters.

Figure 2: Discrimination-AUC plots for two different classifiers. Top row is for the COMPAS dataset, and bottom row for the UCI Adult dataset. First column is logistic regression (LR), and second column is random forests (RF).

Results. We report the trade-off between two metrics: (i) the empirical discrimination of the classifier on the test set, given by max_{d,d'∈D} J(p_{Ỹ|D}(1|d), p_{Ỹ|D}(1|d')), and (ii) the empirical accuracy, measured by the area under the ROC curve (AUC) of ỹ_i = f(d_i, x̂_i) compared to y_i, using 5-fold cross-validation. Fig. 2 presents the operating points achieved by each procedure in the discrimination-accuracy space as measured by these metrics. For the COMPAS dataset, there is significant discrimination in the original dataset, which is reflected by both LR and RF when the data is not transformed. Dropping the D variable reduces discrimination with a negligible impact on classification. However, discrimination is far from removed, since the features X are correlated with D, i.e. there is indirect discrimination. LFR with the recommended parameters is successful in further reducing discrimination while still achieving high prediction performance for the task.
Our proposed optimized pre-processing approach successfully decreases the empirical discrimination close to the target ε values (x-axis).
Deviations are expected due to the approximation of Ŷ, the output of the transformation, by Ỹ, the output of each classifier, and also due to the randomized nature of the method. The decreased discrimination comes at an accuracy cost, which is greater in this case than for LFR. A possible explanation is that LFR is free to search across different representations, whereas our method is restricted by the chosen distortion metric and must preserve the domain of the original variables. For example, for COMPAS we heavily penalize increases in recidivism from 0 to 1 as well as large changes in prior counts and age. When combined with the other constraints in the optimization, this may alter the joint distribution after perturbation and, by extension, the classifier output. Increased accuracy could be obtained by relaxing the distortion constraint, as long as this is acceptable to the practitioner. We highlight again that our distortion metric was not chosen to explicitly optimize performance on this task, and should be guided by the practitioner. Nevertheless, we do successfully obtain a controlled reduction of discrimination while avoiding unwanted deviations in the randomized mapping.
For the Adult dataset, dropping the protected variable does significantly reduce discrimination, in contrast with COMPAS. Our method further reduces discrimination towards the target ε values. The loss of prediction performance is again due to satisfying the distortion and discrimination constraints. On the other hand, LFR with the recommended parameters provides only a small reduction in discrimination. We note that this does not contradict the results in Zemel et al. [2013], since here we have adopted a multiplicative discrimination metric (3), whereas Zemel et al. [2013] used an additive metric. Moreover, we reduced the Adult dataset to 31 binary features, which differs from Zemel et al.
[2013], where the test dataset for Adult (12,661 instances) was additionally considered and 103 binary features were created. By varying the LFR parameters, it is possible to attain low empirical discrimination but with a large loss in prediction performance (below the plotted range). Thus, we do not claim that our method outperforms LFR, since different operating points can be achieved by adjusting parameters in either approach. In our approach, however, individual fairness is explicitly maintained through the design of the distortion metric and discrimination is controlled directly by a single parameter ε, whereas the relationship is less clear with LFR.
5 Conclusions
We proposed a flexible, data-driven optimization framework for probabilistically transforming data in order to reduce algorithmic discrimination, and applied it to two datasets. When used to train standard classifiers, the transformed dataset led to a fairer classification when compared to the original dataset. The reduction in discrimination comes at an accuracy penalty due to the restrictions imposed on the randomized mapping. Moreover, our method is competitive with others in the literature, with the added benefit of enabling explicit control of individual fairness and the possibility of multivariate, non-binary protected variables.
The \ufb02exibility of the approach allows numerous extensions using\ndifferent measures and constraints for utility preservation, discrimination, and individual distortion\ncontrol. Investigating such extensions, developing theoretical characterizations based on the proposed\nframework, and quantifying the impact of the transformations on additional supervised learning tasks\nwill be pursued in future work.\n\nReferences\nT. Calders and I. \u017dliobait\u02d9e. Why unbiased computational processes can lead to discriminative decision\nprocedures. In Discrimination and Privacy in the Information Society, pages 43\u201357. Springer,\n2013.\n\nA. Chouldechova. Fair prediction with disparate impact: A study of bias in recidivism prediction\n\ninstruments. arXiv preprint arXiv:1610.07524, 2016.\n\nS. Corbett-Davies, E. Pierson, A. Feller, S. Goel, and A. Huq. Algorithmic decision making and the\n\ncost of fairness. arXiv preprint arXiv:1701.08230, 2017.\n\nS. Diamond and S. Boyd. CVXPY: A Python-embedded modeling language for convex optimization.\n\nJournal of Machine Learning Research, 17(83):1\u20135, 2016.\n\nC. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel. Fairness through awareness.\n\nIn\nProceedings of the 3rd Innovations in Theoretical Computer Science Conference, pages 214\u2013226.\nACM, 2012.\n\nT. U. EEOC. Uniform guidelines on employee selection procedures. https://www.eeoc.gov/\n\npolicy/docs/qanda_clarify_procedures.html, Mar. 1979.\n\nM. Feldman, S. A. Friedler, J. Moeller, C. Scheidegger, and S. Venkatasubramanian. Certifying and\nremoving disparate impact. In Proc. ACM SIGKDD Int. Conf. Knowl. Disc. Data Min., pages\n259\u2013268, 2015.\n\nB. Fish, J. Kun, and \u00c1. D. Lelkes. A con\ufb01dence-based approach for balancing fairness and accuracy.\nIn Proceedings of the SIAM International Conference on Data Mining, pages 144\u2013152. SIAM,\n2016.\n\nS. A. Friedler, C. Scheidegger, and S. Venkatasubramanian. 
On the (im)possibility of fairness. arXiv preprint arXiv:1609.07236, 2016.

S. Hajian. Simultaneous Discrimination Prevention and Privacy Protection in Data Publishing and Mining. PhD thesis, Universitat Rovira i Virgili, 2013. Available online: https://arxiv.org/abs/1306.6805.

S. Hajian and J. Domingo-Ferrer. A methodology for direct and indirect discrimination prevention in data mining. IEEE Trans. Knowl. Data Eng., 25(7):1445–1459, 2013.

M. Hardt, E. Price, and N. Srebro. Equality of opportunity in supervised learning. In Adv. Neur. Inf. Process. Syst. 29, pages 3315–3323, 2016.

K. D. Johnson, D. P. Foster, and R. A. Stine. Impartial predictive modeling: Ensuring fairness in arbitrary models. arXiv preprint arXiv:1608.00528, 2016.

F. Kamiran and T. Calders. Data preprocessing techniques for classification without discrimination. Knowledge and Information Systems, 33(1):1–33, 2012.

T. Kamishima, S. Akaho, and J. Sakuma. Fairness-aware learning through regularization approach. In Data Mining Workshops (ICDMW), IEEE 11th International Conference on, pages 643–650. IEEE, 2011.

J. Kleinberg, S. Mullainathan, and M. Raghavan. Inherent trade-offs in the fair determination of risk scores. In Proc. Innov. Theoret. Comp. Sci., 2017.

M. J. Kusner, J. R. Loftus, C. Russell, and R. Silva. Counterfactual fairness. arXiv preprint arXiv:1703.06856, 2017.

N. Li, T. Li, and S. Venkatasubramanian. t-closeness: Privacy beyond k-anonymity and l-diversity. In IEEE 23rd International Conference on Data Engineering, pages 106–115. IEEE, 2007.

M. Lichman. UCI machine learning repository, 2013. URL http://archive.ics.uci.edu/ml.

J. Pearl. Comment: Understanding Simpson's paradox. The American Statistician, 68(1):8–13, 2014.

D. Pedreschi, S. Ruggieri, and F. Turini. Discrimination-aware data mining. In Proc. ACM SIGKDD Int. Conf. Knowl. Disc. Data Min., pages 560–568.
ACM, 2008.

D. Pedreschi, S. Ruggieri, and F. Turini. A study of top-k measures for discrimination discovery. In Proc. ACM Symp. Applied Comput., pages 126–131, 2012.

ProPublica. COMPAS Recidivism Risk Score Data and Analysis. https://www.propublica.org/datastore/dataset/compas-recidivism-risk-score-data-and-analysis, 2017.

S. Ruggieri. Using t-closeness anonymity to control for non-discrimination. Trans. Data Privacy, 7(2):99–129, 2014.

M. B. Zafar, I. Valera, M. G. Rodriguez, and K. P. Gummadi. Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. arXiv preprint arXiv:1610.08452, 2016.

R. Zemel, Y. L. Wu, K. Swersky, T. Pitassi, and C. Dwork. Learning fair representations. In Proc. Int. Conf. Mach. Learn., pages 325–333, 2013.

Z. Zhang and D. B. Neill. Identifying significant predictive bias in classifiers. In Proceedings of the NIPS Workshop on Interpretable Machine Learning in Complex Systems, 2016. Available online: https://arxiv.org/abs/1611.08292.

I. Žliobaitė, F. Kamiran, and T. Calders. Handling conditional discrimination. In Proc. IEEE Int. Conf. Data Mining, pages 992–1001, 2011.