{"title": "Discrimination in Online Markets: Effects of Social Bias on Learning from Reviews and Policy Design", "book": "Advances in Neural Information Processing Systems", "page_first": 2145, "page_last": 2155, "abstract": "The increasing popularity of online two-sided markets such as ride-sharing, accommodation and freelance labor platforms, goes hand in hand with new socioeconomic challenges. One major issue remains the existence of bias and discrimination against certain social groups. We study this problem using a two-sided large market model with employers and workers mediated by a platform. Employers who seek to hire workers face uncertainty about a candidate worker's skill level. Therefore, they base their hiring decision on learning from past reviews about an individual worker as well as on their (possibly misspecified) prior beliefs about the ability level of the social group the worker belongs to. Drawing upon the social learning literature with bounded rationality and limited information, uncertainty combined with social bias leads to unequal hiring opportunities between workers of different social groups. Although the effect of social bias decreases as the number of reviews increases (consistent with empirical findings), minority workers still receive lower expected payoffs. Finally, we consider a simple directed matching policy (DM), which combines learning and matching to make better matching decisions for minority workers. 
Under this policy, there exists a steady-state equilibrium, in which DM reduces the discrimination gap.", "full_text": "Discrimination in Online Markets: Effects of Social Bias on Learning from Reviews and Policy Design

Faidra Monachou
Stanford University
monachou@stanford.edu

Itai Ashlagi
Stanford University
iashlagi@stanford.edu

Abstract

The increasing popularity of online two-sided markets such as ride-sharing, accommodation and freelance labor platforms, goes hand in hand with new socioeconomic challenges. One major issue remains the existence of bias and discrimination against certain social groups. We study this problem using a two-sided large market model with employers and workers mediated by a platform. Employers who seek to hire workers face uncertainty about a candidate worker's skill level. Therefore, they base their hiring decision on learning from past reviews about an individual worker as well as on their (possibly misspecified) prior beliefs about the ability level of the social group the worker belongs to. Drawing upon the social learning literature with bounded rationality and limited information, uncertainty combined with social bias leads to unequal hiring opportunities between workers of different social groups. Although the effect of social bias decreases as the number of reviews increases (consistent with empirical findings), minority workers still receive lower expected payoffs. Finally, we consider a simple directed matching policy (DM), which combines learning and matching to make better matching decisions for minority workers. Under this policy, there exists a steady-state equilibrium, in which DM reduces the discrimination gap.

1 Introduction

Online markets such as ride-sharing, accommodation, and freelance labor platforms have grown rapidly over the past few years, thus shaping the future of work.
However, a major issue, which is common in traditional markets, is the existence of bias and discrimination against certain social groups. Indeed, several empirical studies document the existence of racial, gender and other forms of discrimination on popular online platforms. In experiments on Airbnb, an online accommodation-sharing platform, Edelman et al. [18] and Cui et al. [16] find that accommodation applications from guests with distinctively African-American names are about 16%-19% less likely to be accepted relative to identical guests with distinctively white-sounding names. A study by Ge et al. [25] confirms analogous results for race discrimination on the ridesharing platform Uber, while Ameri et al. [5] document discrimination against travelers with disabilities on Airbnb. Hannák et al. [27] examine racial and gender discrimination in two freelance labor markets, TaskRabbit and Fiverr; on both platforms, workers perceived to be black get worse ratings than similarly qualified workers perceived to be white, while on TaskRabbit women receive fewer reviews than men with equivalent work experience.

Understanding the effects of social bias on learning from reviews will help design successful interventions that reduce the existing discrimination. Towards this goal, we consider a two-sided large market model of employers and workers mediated by a platform. Workers belong to one of two different social groups (minority or majority1) and can be either high-skilled or low-skilled; employers may or may not be biased against minority workers.

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.
Employers are matched randomly\nwith candidate workers and decide whether to hire them or not, but, due to the uncertainty about a\nworker\u2019s skill level, base their decision (i) on past reviews about the individual worker, (ii) on their\nprivate prior beliefs about the skill level of the social group that the worker belongs to, and (iii) on\npersonal preferences. Employers in the model may be biased against the minority group of workers.\nWe study the dynamics of employers\u2019 beliefs under social bias. Although social bias decreases with\nadditional reviews, the welfare of minority workers is lower than majority workers at the steady-state\nequilibrium of the market (Theorem 1). We also design a simple algorithmic policy to decrease\ndiscrimination; our proposed DM (Directed Matching) policy uses a combination of learning and\nmatching [30] to make better matching decisions for minority workers and improve welfare. DM\nlearns (within a small error probability) the group of an employer and then matches employers to\nmajority or minority workers based on the employer\u2019s identi\ufb01ed group. The policy aims at protecting\nminority workers from matching with discriminating employers and results in Pareto improvement\nover the benchmark uniformly random matching (UM) algorithm that ignores social bias (Theorem\n3). The DM policy can explain ubiquitous policies that boost new workers to the top of search results\nin online labor platforms.\nThe behavioral assumptions in this paper are motivated by the rich empirical and theoretical literature\nwhich has identi\ufb01ed two potential sources of discrimination: belief-based (mostly known as statistical)\nand taste-based. Regarding the former, a long line of research in the statistical discrimination literature\nassumes that group differences do exist and are exogenous [22]. Thus, employers hold correct beliefs\nabout aggregate group differences (e.g. [6, 36, 4, 13, 28, 34]). 
This case is not considered here since\nwe assume an equally distributed skill level between the two social groups of workers. Instead, we\nconsider an alternative source of belief-based bias that recent research has demonstrated: incorrect\nprior beliefs [11, 39, 12, 24, 23, 16]. Without being aware of their own bias (see [37] about the bias\nblindspot effect), some employers may hold misspeci\ufb01ed models of group differences which, in the\nabsence of perfect information, lead to false judgment of an individual\u2019s abilities. The combination\nof both uncertainty and misspeci\ufb01cation results in discrimination.\nThe key difference between belief-based and taste-based social bias is the effect of information.\nTaste-based discrimination models [8] assume that the differential treatment of minority groups\nis driven purely by preferences; thus, discrimination persists even with perfect information about\nan individual\u2019s true skills. In sharp contrast to taste-based bias, the presence of more information\ngradually reduces (and asymptotically eliminates) the effect of belief-based bias. The existing\nempirical research on online platforms indicates the existence of belief-based bias. For example,\nCui et al. [16] \ufb01nd that a positive review can signi\ufb01cantly alleviate racial discrimination on Airbnb\nwhile self-claimed quality information by guests themselves cannot. Using Airbnb observational data,\nAbrahao et al. [1] also \ufb01nd that users with higher reputation scores are considered more trustworthy\nregardless of their demographic characteristics. Finally, several papers such as [38] and [35] suggest\nthat rating systems can be utilized towards combating discrimination in online markets. 
Motivated by these findings, we focus mainly on belief-based bias and show that the model is consistent with the empirical findings (Theorem 2); nevertheless, we also discuss taste-based bias (Supplementary Material, Section E.1).

From a technical point of view, we draw upon a variety of tools to prove our results. First, employers engage in a naïve social learning process from reviews to learn about the true quality of each worker in the market. In contrast to the existing literature [40, 10, 7, 9, 41, 15, 29, 2, 17, 26, 20, 19, 21, 33], we also assume that employers may have different (misspecified) priors. We use a stochastic approximation analysis [9] to represent the dynamics of the social learning process. Second, we adopt a continuum model and extend the setting in [30] by including agent histories on both sides of the market as well as by incorporating a social learning component with employer incentives (in their hiring and review decisions) into the evolution of the system. These two factors, along with a different objective and the presence of social bias, make the dynamical system in our paper take a non-linear form, and differentiate (both technically and conceptually) our model and policy from [30]. Finally, for the learning algorithm used in our DM policy, we also use results by Agrawal et al. [3] on a variation of the stochastic multi-armed bandit (MAB) problem.

1We do not make any assumption on the size of each social group. Alternatively, one may use the terms privileged and unprivileged.

2 Model

We consider an online labor market of workers and employers2 mediated by a platform. Time is discrete, t = 1, 2, . . ., and a mass of workers and employers arrives at each period t.
We assume that each arriving worker and employer stays for K periods; later, we also consider the limit K → ∞. The market is initially empty, and no workers or employers have entered the market before time t = 1. At each time t, each worker is paired with one prospective employer who decides whether to hire the worker or not. Throughout the paper, we use t to denote the absolute time in the market, k = 1, . . . , K to denote the relevant period during a worker's lifetime and n = 1, . . . , K to denote the relevant period during an employer's lifetime.

Agents. At any time t, a mass λ_B > 0 of minority (B) and a mass λ_A > 0 of majority (A) workers arrive in the market. Regardless of his social group c ∈ {A, B}, each worker may be high-skilled or low-skilled; let Q ∈ {H, L} denote the skill level of the worker and let the true fraction of high-skilled workers within a social group c be q_0 ∈ (0, 1). The social group of each worker is publicly observed by the platform and the agents, but only nature knows the true skill level of a worker.

Each employer belongs to one of two groups e ∈ {N, D} based on her prior belief: non-discriminating (N) employers and discriminating (D) employers, the latter biased against minority workers. At each time period t, a mass λ_N > 0 of N employers and a mass λ_D > 0 of D employers arrive in the market.

Matching, Actions and Utility. At each period t, employers and workers who are still present in the market are paired by the platform; initially we assume uniformly random pairing. For simplicity, we assume that workers always accept the incoming offers of employers. If hired in a period, the worker receives payoff 1 for that period; otherwise he gets zero payoff. Each employer has two actions at each period: she can either hire (m = 1) or reject (m = 0) the candidate worker.
By slightly abusing notation, we call employer k the k-th employer that a certain worker meets over his lifetime. If employer k hires the worker, she receives utility

U_k = A_k + 1{Q=H} + P_k,    (1)

where A_k is an ex ante idiosyncratic term and P_k is an ex post3 idiosyncratic term. Otherwise, she receives zero utility. The random variables A_k and P_k are independent; they are also i.i.d. with known continuous CDFs F_A and F_P and bounded supports [a̲, ā] and [p̲, p̄], respectively. Let μ_P = E P_k. Finally, note that the exact values of A_k and P_k are privately revealed only to the k-th employer of the worker.4

Rating system. Given a positive hiring decision, there is a probability η > 0 that the employer leaves a review. A review r_k can be either good (g) or bad (b), while r_k = ∅ denotes that no review was left. Reviews are imperfect as they are determined by the realized utility U_k of each employer k. In particular, r_k = g if U_k ≥ 0 and r_k = b if U_k < 0. Upon meeting a candidate worker, employer k observes the worker's social group c, her private value A_k, and some information on the worker's history provided by the platform. The platform observes the full history of a worker (consisting of hiring decisions and reviews) but the employers see only the statistics of good and bad reviews. Specifically, let G_k and B_k denote the number of good and bad reviews before period k. Hence, in any period k, employers only observe G_k and B_k. Before the worker enters the market (k = 1), no reviews are available, thus B_1 = G_1 = 0.

Belief-based social bias. The platform has the correct prior belief q_0 about minority and majority workers' skill level. All employers also share the same, correct prior belief q_0 ≜ G_0/N_0 ∈ (0, 1) about majority workers. Regarding minority workers, non-discriminating (N) employers have prior belief G^N_0/N_0 = q_0 = G_0/N_0, but discriminating (D) employers use a misspecified prior belief G^D_0/N_0 = βq_0 = βG_0/N_0. In this case, we say that discriminating employers have social bias level β ∈ (0, 1) against minority workers.

2We refer to each worker as he and each employer as she.
3The ex ante idiosyncratic term A_k is realized when employer k meets the worker; the ex post term P_k is realized after the employer hires the worker. Both A_k and P_k are independent of Q.
4This is the standard utility model in the related social learning literature (see e.g. [29, 9, 2]).

Naïve learning from reviews. We consider that employers have limited computational ability and naively use the fraction of good reviews (adjusted by their prior belief)

q^e_k = (G_k + G^e_0) / (G_k + B_k + N_0)    (2)

as a proxy for the probability that the worker is high-skilled. The fictitious reviews G_0, N_0 ∈ N may be interpreted as the weight that employers assign to the prior belief q_0 (see (2)). The smaller the number of reviews, the more employers rely on their private belief. Hence, as the number of reviews increases, employers start relying more on the external information (see also Lemma 2). If the total number of reviews N_k ≜ G_k + B_k = 0, q^e_k is equal to employer k's prior belief G^e_0/N_0.

Worker welfare. At the steady-state equilibrium of the market5, we define the worker's welfare as

W^c_Q(K) = E( ∑_{k=1}^{K} δ^k m_k | Q, c ),    (3)

where δ ∈ (0, 1) is a known discount factor and m_k is the hiring decision of the k-th employer. We say that there exists discrimination against minority (resp. majority) workers if W^B_Q(K) < W^A_Q(K) (resp. W^A_Q(K) < W^B_Q(K)) for all Q ∈ {H, L}. For lifetime K, we define the discrimination gap d(Q, K) among workers of skill level Q as d(Q, K) = W^A_Q(K) − W^B_Q(K).

Assumptions. We make the following two technical assumptions.

Assumption 1 (Richness) The supports [a̲, ā] and [p̲, p̄] of the random variables A and P are such that

P(A + μ_P ≥ 0) = P(A ≥ −μ_P) > 0,    (4)

i.e. ā + μ_P ≥ 0, as well as a̲ + μ_P + q_0 ≤ 0. Furthermore,

ā + p̄ > 0 and ā + 1 + p̲ < 0.    (5)

Assumption 2 (Balanced market) The market is perfectly balanced, i.e. λ_D + λ_N = λ_A + λ_B.

Assumption 1 is important in establishing almost sure convergence. Specifically, ā + μ_P ≥ 0 ensures that, regardless of the current belief q^e_k of group-e employers, there is always a positive probability that the worker is hired; a̲ + μ_P + q_0 ≤ 0 implies that for belief q_0 or smaller, there is a positive probability that the worker is not hired. Finally, (5) guarantees that, conditional on hiring, good and bad reviews happen with positive probability. However, we also examine the case where Assumption 1 does not hold. For simplicity, we also assume that the market is balanced (Assumption 2). This implies that no employer or worker stays unmatched in any period. We discuss the case of unbalanced markets in Section 5.

3 Effects of Belief-Based Social Bias

In this section, we analyze the dynamics of belief updating during the lifetime k = 1, . . . , K of a worker in the market. The worker of social group c and unknown (but fixed) skill level Q meets one employer per period k. By comparing workers of different social groups but the same skill level, we study how social bias affects worker welfare at the steady-state equilibrium of the market as well as asymptotic learning of worker quality.

Employer's hiring decision problem.
At period k, the candidate worker's past reviews G_k and B_k, coupled with employer k's prior about group c ∈ {A, B}, induce a belief q^e_k, e ∈ {N, D}, regarding his skill level. Upon meeting the candidate worker, employer k's decision problem is simply given by

m_k = arg max_{m∈{0,1}} 1{m = 1}(A_k + q^e_k + μ_P).    (6)

In turn, the employer accepts the current worker if and only if her expected utility for that worker is non-negative, that is q^e_k ≥ −A_k − μ_P.

As a warm-up, we prove the following intuitive properties. Their proofs are straightforward and can be found in the Supplementary Material (together with all other proofs).

5See Appendix A.1 for a formal description of system dynamics.

Lemma 1 The difference q^N_k − q^D_k between the beliefs of group N and group D employers about the same minority worker is positive for any period k and weakly decreases with k.

The following lemma is an immediate corollary.

Lemma 2 Fix period k and review statistics G_k, B_k. Under uniformly random matching, the probability that a minority worker is hired at period k equals

(λ_D/(λ_D + λ_N)) (1 − F_A(−q^D_k − μ_P)) + (λ_N/(λ_D + λ_N)) (1 − F_A(−q^N_k − μ_P)),    (7)

which is smaller than the probability 1 − F_A(−q^N_k − μ_P) that a majority worker with the same review statistics is hired at period k.

In practice, Lemma 1 and Lemma 2 suggest that the difference in hiring probabilities of minority and majority workers is large for workers with few reviews.

Discrimination and worker inequality. Under the social learning dynamics described here, the large market - which we model as a discrete-time dynamical system - always reaches a unique steady-state equilibrium (see Lemma A.1 in Appendix A).
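The belief dynamics above can be illustrated with a short simulation. The sketch below implements the naive update rule (2), the hiring rule (6) and the utility-driven review process (1) for a single minority worker. The uniform distributions for A and P, the values of G_0, N_0 and η, and the equal arrival rates of N and D employers are illustrative assumptions (chosen so that Assumption 1 holds), not values from the paper.

```python
import random

def simulate_minority_worker(beta, G0=2, N0=4, K=200, eta=0.8, seed=1):
    """Sketch of naive learning from reviews for one minority worker.

    N employers use the correct prior G0/N0 = q0; D employers use the
    misspecified prior beta*G0/N0 (social bias level beta in (0, 1)).
    Returns the per-period belief pairs (q^N_k, q^D_k).
    """
    rng = random.Random(seed)
    q0 = G0 / N0
    high_skilled = rng.random() < q0       # true skill level Q
    mu_P = -1.0                            # E[P] for P ~ U[-3, 1]; illustrative
                                           # choice satisfying Assumption 1
    G = B = 0                              # good / bad review counts
    path = []
    for _ in range(K):
        q_N = (G + G0) / (G + B + N0)          # update rule (2), e = N
        q_D = (G + beta * G0) / (G + B + N0)   # update rule (2), e = D
        path.append((q_N, q_D))
        # Draw this period's employer group (lambda_D = lambda_N assumed).
        q_e = q_D if rng.random() < 0.5 else q_N
        A = rng.uniform(-2.0, 1.0)             # ex ante idiosyncratic term
        if A + q_e + mu_P >= 0:                # hiring rule (6)
            P = rng.uniform(-3.0, 1.0)         # ex post idiosyncratic term
            U = A + (1.0 if high_skilled else 0.0) + P   # realized utility (1)
            if rng.random() < eta:             # a review is left w.p. eta
                if U >= 0:
                    G += 1                     # good review
                else:
                    B += 1                     # bad review
    return path
```

Since both employer groups observe the same review counts, the belief gap q^N_k − q^D_k = (1 − β)G_0/(G_k + B_k + N_0) is positive and weakly shrinks as reviews accumulate, matching Lemma 1; the initial hiring disadvantage nevertheless depresses the discounted welfare (3) of minority workers, as in Theorem 1.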
In the following theorem, we quantify the effect of social bias on worker welfare at the steady-state equilibrium, and show the existence of discrimination against minority workers.

Theorem 1 (Discrimination under belief-based social bias) At the steady-state equilibrium of the market, there exists discrimination against minority workers, i.e. minority workers have lower expected welfare W^B_Q(K) < W^A_Q(K) than majority workers of the same skill level Q ∈ {H, L}.

Asymptotic learning under belief-based social bias. In the baseline model, we have assumed that workers (and employers) stay for a limited time of K periods. Next we verify that learning occurs in the limit K → ∞. Given the "naive" learning rule in (2), we prove that the skill level estimate q^e_t of employer group e about a worker asymptotically converges to an estimate q^e_∞. Similar results are also established in [9] and [41]; the main difference lies in the existence of employer groups with contradicting prior beliefs in the case of minority workers.6 Theorem 2 below shows that the asymptotic estimates for minority workers do not differ between discriminating and non-discriminating employers. It also shows that, despite the naiveté of the social learning rule and employers' different prior beliefs, the employers correctly estimate a higher skill level for high-skilled workers compared to their estimated level for low-skilled workers. Hence, they are able to distinguish between high-skilled and low-skilled workers.

Theorem 2 (Asymptotic Learning under Belief-Based Social Bias) Fix a worker of social group c ∈ {A, B} and true skill level Q ∈ {H, L}. Then, as K → ∞, q^e_t → q^e_∞(Q) almost surely. The limit q^e_∞(Q) depends on the true skill level Q of the worker but not on his social group c, i.e. q^N_∞(Q) = q^D_∞(Q) ≡ q_∞(Q) for workers of the same Q. Specifically, q_∞(Q) is the unique solution to

P[A + 1{Q=H} + P ≥ 0 | A + q + μ_P ≥ 0, Q] − q = 0,    (8)

which does not depend on the employers' prior beliefs. Furthermore, employers are able to distinguish between high-skilled and low-skilled workers, i.e. q_∞(H) > q_∞(L).

The proof can be found in Appendix B. An important technical observation is that, when employers do not hire the worker or do not leave a review, N_k = G_k + B_k remains unchanged. Formally, let τ_j denote the time indices of employers who hire the worker and leave a review, i.e. τ_1 = min(k | m_k = 1, r_k ≠ ∅) and τ_j = min(k | k > τ_{j−1}, m_k = 1, r_k ≠ ∅). Since τ_{N_k} < k denotes the last time a review was left before period k, the belief q^e_t has the same value in the time between periods τ_{N_k} + 1 and τ_{N_k+1}, i.e.

q^e_k = q^e_{τ_{N_k}+1}    (9)

for any group e ∈ {N, D}.

6In comparison to the models considered in [9] and [41], there are also several technical differences in the utility function and the review structure.

Therefore, for the case of majority workers, where both D and N employers share the same belief q_k at any time k, the dynamics of q_k at times τ_1 + 1, . . . , τ_{N_k} + 1 can be described by a stochastic approximation (Robbins-Monro) algorithm. A known result (Lemma D.2) about the almost sure convergence of Robbins-Monro algorithms guarantees the almost sure convergence of q_t. On the other hand, the case of minority workers is more complicated. Employers of groups D and N co-exist in the market. Due to their different priors, the beliefs of D and N employers do not follow the same dynamics, although the probability of getting hired at time t depends on both q^D_t and q^N_t. By using a generalized Robbins-Monro argument, we prove that q^D_t and q^N_t converge almost surely to the same limit as the belief about majority workers. Intuitively, as time t grows, Lemma 1 implies that the difference between q^N_t and q^D_t decreases; thus both N and D employers gradually forget their prior beliefs and behave in a similar way.

Given the context of belief-based discrimination, the results in Theorem 2 are not surprising. Belief-based discrimination occurs because (possibly misspecified) prior beliefs about group characteristics fill in the gap of perfect information about an individual worker. Already in Lemma 1, we have shown that the difference q^N_k − q^D_k decreases as the number of reviews increases; thus, if it were possible to collect an infinite number of reviews about a certain worker, uncertainty would eventually disappear and employers would be able to perfectly distinguish between workers of high and low skill level. Despite this fact, note that, for δ ∈ (0, 1), the discrimination gap in worker welfare (as shown in Theorem 1) would still exist in the limit K → ∞: minority workers eventually have equal hiring opportunities, but this is not enough to make up for the initial social bias in the first periods of their lifetime. Hence, discrimination (in terms of discounted total welfare) persists even as K → ∞.

Finally, we examine the case where Assumption 1 does not hold and find that, with positive probability, hirings stop and convergence to the limit q_∞(Q) does not occur. Specifically, we have that7:

Lemma 3 Fix Q ∈ {H, L}. Suppose that condition (4) of Assumption 1 does not hold, i.e. ā + μ_P < 0. Then, with positive probability, q^e_t does not converge to q_∞(Q).

Practically speaking, the presence of a worker in a real online market could indeed be very short. Regarding online labor markets, Hannák et al.
[27] find that perceived gender and race have significant negative correlations with search rank and that the number of completed tasks is positively correlated with the number of reviews. Consequently, minority workers may not even have the chance to receive enough reviews or to stay long on the platform due to the tough competition. Lemma 3 partially captures this situation, showing that learning stops with positive probability. Nevertheless, considerations about the effect of market congestion (due to competition) and algorithmic bias on the exit rate of minority workers are out of the scope of the current paper but are definitely an interesting direction for future research.

4 Policy Design

Under a uniform matching (UM) policy, a minority worker's welfare is smaller than a majority worker's welfare (Theorem 1). We are interested in designing a matching policy that reduces the discrimination gap d(Q, K) for each skill level Q so that it Pareto-dominates uniform matching in terms of worker welfare. A combination of employer type learning and improved matching is used to achieve this goal.

4.1 Learning employer types

Given a known social learning model and the history of employers' hiring decisions, the platform can learn - within a reasonable error probability - the group that each employer belongs to. Intuitively, if the platform observes that a certain employer rejects minority workers more often than a non-discriminating employer would do, then this employer probably belongs to group D.

Preliminaries. We introduce the following definitions, which can also be found in Appendix A.1. Let Ω_k, k = 1, . . . , K denote the worker history of length k − 1. Formally, the worker history Ω_k = {ω_1, . . . 
, ω_{k−1}} consists of all the past hiring decisions m_k ∈ {0, 1} and reviews r_k ∈ {∅, g, b} for that worker, that is ω_k = (m_k, r_k). Let H_n = {h_1, . . . , h_{n−1}}, n = 1, . . . , K, also denote the employer history of length n − 1. Each h_n consists of the hiring decision m_n ∈ {0, 1} made by that employer about a worker with history Ω_k^8 and social group c ∈ {A, B}, i.e. h_n = (m_n, Ω_k, c). Initially, H_1 = ∅ and Ω_1 = ∅.

7For a related result about a Bayesian social learning model, see Theorem 1 in [2].

Given the employer's history H_n of length n − 1, we define l_n to be the log-likelihood ratio

l_n = log( P(g = D | H_n) / P(g = N | H_n) ),    (10)

where l_1 = log(λ_D/λ_N). Given thresholds θ_N, θ_D > 0, an employer is said to be D-identified if l_n > θ_D and N-identified if l_n < −θ_N. We also say that the employers who have not been identified yet but match to minority workers are in the learning pool. Hence, the learning pool consists of all the minority workers and the unidentified employers matched to them.

Learning in finite expected time. For simplicity, suppose that we use a common threshold θ for both types N and D. It turns out that, even for a very large threshold θ, we can learn the type of each employer in finite expected time during their lifetime. On top of that, a high threshold θ also gives very high accuracy to the types assigned to employers. To avoid introducing additional notation, we provide below an informal version of this technical result. The proof is based on an application of Lemma 4.3 in [3] (Lemma D.5 in the Appendix).

Lemma 4 (Informal - Lemma C.1 in Appendix C) Suppose that a fixed employer is paired to a minority worker according to a known distribution of worker histories, i.i.d. for each period n = 1, . . . , K of the employer's lifetime. Then, for large enough K and θ, the expected time until an employer of group e ∈ {N, D} gets e-identified is at most K, i.e.

E(inf{n > 0 : l_n ≥ θ} | e = D) < K and E(inf{n > 0 : l_n ≤ −θ} | e = N) < K.    (11)

4.2 Directed Matching (DM) policy

The platform can use the information provided by the learning algorithm in various ways. This paper offers a simple directed matching (DM) policy that takes advantage of the learning algorithm in order to reduce the discrimination gap between minority and majority workers. The policy extends ideas found in [30] and works as follows. In the learning pool, the platform learns the type of each employer by observing her past decisions about minority workers. To protect minority workers, an employer who has been identified as discriminating should not be matched with minority workers, as long as the capacity constraints under arrival rates λ_i, i ∈ {A, B, N, D} allow it. Therefore, as soon as a mass of employers exits the learning pool, an equal mass of minority workers is matched with employers who have been identified as non-discriminating. On the other hand, D-identified employers are matched to majority workers. The idea is simple but, as Theorem 3 shows, it can reduce the discrimination gap.9

At time t = T_DM, when the DM policy is introduced to the market, the market is at the steady-state equilibrium under the initial uniform matching (UM) policy (see Lemma A.1). Then, DM proceeds as follows for some thresholds θ_N and θ_D.

At each time t = T_DM, . . . repeat:

1. Learning.

1. Check each employer of history length n, for all n = 1, . . . , K:

(a) If l_n > θ_D, identify the employer as D.
(b) If l_n < −θ_N, identify the employer as N.
(c) Otherwise, the employer remains in the learning pool.

2. 
All the employers who have just been identified as N or D exit the learning pool.

2. Matching.

8Note that Ωk can be of any length between 0 and K − 1.
9However, the optimality of the DM policy remains an open and challenging question.

1. Match the mass of D-identified employers to an equal mass of majority workers (uniformly at random).

2. Match the mass of N-identified employers to an equal mass of minority workers (uniformly at random). Prioritize workers who have already been matched with N-identified employers in the past. If necessary, select (uniformly at random) a mass of minority workers to exit the learning pool and match them with the remaining N-identified employers.

3. Uniformly at random select a mass of newly arrived employers to replace a mass of the employers who have exited either the learning pool or the market, so that the total masses of employers and workers in the learning pool are equal. If the workers in the learning pool outnumber the employers in the learning pool, add an appropriate mass of non-identified employers to the learning pool (selected uniformly at random).

4. Match the minority workers in the learning pool to an equal mass of employers in the learning pool (uniformly at random).

5. In any of the previous steps, match any remaining unmatched workers and employers uniformly at random.

Theorem 3 For large enough K, θN and θD, there exists a steady-state equilibrium such that the DM policy Pareto-dominates the UM policy. That is, d_{DM}(Q, K) < d_{UM}(Q, K), while W^A_{Q,UM}(K) = W^A_{Q,DM}(K) and W^B_{Q,UM}(K) < W^B_{Q,DM}(K).

The intuition behind Theorem 3 is as follows. At the steady-state equilibrium of the market, an incoming minority worker either enters the learning pool (step 2c), gets matched with N-identified employers (step 2b), or matches uniformly at random with employers as in UM (step 2e). For large enough, appropriately chosen θN and θD, the fraction of D employers in the learning pool at every time t remains at most λD/(λD + λN). The error probability of the learning algorithm also becomes negligible (Lemma C.3 in the Supplementary Material), meaning that almost all N-identified employers are indeed N employers. Hence, any incoming minority worker matches in expectation with fewer D employers than under the UM policy. This improves minority workers' total welfare. Majority workers are not affected by the employer type and thus earn the same expected welfare.

Finally, observe that Theorem 3 is silent about employers' welfare. An interesting question is whether one can design a matching policy that, in addition to decreasing the discrimination gap, does not harm workers or non-discriminating employers.

5 Conclusion and Open Questions

The framework studied in this paper, albeit stylized, provides simple insights into the underlying discrimination mechanism in online two-sided markets. We assumed a behavioral model of naïve agents with belief-based bias, but other behavioral assumptions are also interesting to study. Beyond Bayesian agents and taste-based bias, one can assume that agents update their (potentially misspecified) beliefs based on their own past experiences. Furthermore, the proposed DM policy is not a panacea, but it demonstrates useful insights toward eliminating discrimination. The policy exploits the online nature of platforms and the wealth of available data to protect against discrimination. It thus serves as an example of how platforms can exercise their control toward a less discriminatory environment.

Identifying discriminating employers could also be useful for other platform interventions. 
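As a rough illustration of how such identification could operate, the threshold rule of the learning phase can be simulated under a hypothetical signal model. The Python sketch below assumes, purely for concreteness, that a D employer rejects a minority worker with probability p_d and an N employer with probability p_n (both parameters are illustrative and not part of the model in this paper), and accumulates the log-likelihood ratio of observed decisions in the spirit of a sequential probability ratio test: the employer is D-identified once the statistic exceeds θD and N-identified once it falls below −θN, mirroring steps 1(a)–(c) above.

```python
import math
import random

def identify(employer_is_d, theta_n, theta_d, p_d=0.7, p_n=0.3,
             max_n=10_000, rng=random):
    """Sequential threshold identification (a sketch of steps 1(a)-(c)).

    Hypothetical Bernoulli signal model: a D employer rejects a minority
    worker w.p. p_d, an N employer w.p. p_n.  The statistic l accumulates
    the log-likelihood ratio of the observed decisions.
    """
    llr_reject = math.log(p_d / p_n)            # > 0: rejection favors D
    llr_accept = math.log((1 - p_d) / (1 - p_n))  # < 0: acceptance favors N
    l = 0.0
    p = p_d if employer_is_d else p_n
    for n in range(1, max_n + 1):
        l += llr_reject if rng.random() < p else llr_accept
        if l > theta_d:
            return "D", n   # D-identified after n observations
        if l < -theta_n:
            return "N", n   # N-identified after n observations
    return None, max_n      # stayed in the learning pool

# Simulate many D employers: almost all should be D-identified,
# and the identification time should stay bounded, as in (11).
rng = random.Random(0)
runs = [identify(True, theta_n=5, theta_d=5, rng=rng) for _ in range(2000)]
labels = [lab for lab, _ in runs]
print(labels.count("D") / len(labels))
```

With well-separated signal probabilities, the fraction of misidentified employers is driven down exponentially by the thresholds, which is the mechanism behind the negligible-error claim of Lemma C.3.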
The platform could send warnings to or even ban D-identified users.10 Learning which users are possibly discriminating could also be useful when the platform designs information campaigns that target those users. If discrimination is belief-based, such a measure might be very effective at helping people correct their misspecified prior beliefs. Nevertheless, legal constraints and ethical considerations should also be taken into account, since no existing law clearly regulates discrimination on platforms [35, 38]. Note that, in our framework, discrimination occurs because in expectation D employers reject minority workers more often than they reject majority workers. This does not necessarily imply that the platform is able to immediately take legal action against the D-identified users. In this case, alternative operational measures have to be taken.

10Banning employers can possibly harm workers. Moreover, with more heterogeneity, other employers may appear to be discriminating.

This paper offers some insights about discrimination, but also sheds light on even bigger questions that remain open for future research. Reviews play an important role in many decisions that online platforms make. The rating system creates an information loop: the platform relies on user-generated data to learn about workers, while employers' decisions and feedback are partially based on information provided by the platform. For example, the platform often relies on ratings to evaluate the performance of workers as well as to qualify them for higher-paid work; thus, employment decisions may inherit consumers' biases [38]. Do algorithmic decisions such as search, ranking, and matching determined by users' ratings also run the risk of reproducing human bias (see, e.g., [31, 32, 14] for related work in this direction)? What role does the design of online rating systems play in amplifying the effect of social bias? 
How does the amount and form of information provided by the platform about a worker affect hiring decisions? Furthermore, in a non-balanced market with more workers than employers, the DM policy would also look different; intuitively, workers with a lower probability of being high-skilled would stay unmatched for long periods of time. If employers outnumbered workers, D-identified employers could remain unmatched by the platform. In unbalanced markets, how does the effect of social bias change? And, most generally, is there any platform policy that permanently eliminates discrimination?

Acknowledgments

The authors would like to thank Daniela Saban and Ramesh Johari for helpful comments and discussions.

References

[1] Bruno Abrahao, Paolo Parigi, Alok Gupta, and Karen S Cook. 2017. Reputation offsets trust judgments based on social biases among Airbnb users. Proceedings of the National Academy of Sciences 114, 37 (2017), 9848–9853.

[2] Daron Acemoglu, Ali Makhdoumi, Azarakhsh Malekian, and Asuman Ozdaglar. 2017. Fast and slow learning from reviews. Technical Report. National Bureau of Economic Research.

[3] Rajeev Agrawal, Demosthenis Teneketzis, and Venkatachalam Anantharam. 1989. Asymptotically efficient adaptive allocation schemes for controlled iid processes: Finite parameter space. IEEE Trans. Automat. Control 34, 3 (1989).

[4] Joseph G Altonji and Charles R Pierret. 2001. Employer learning and statistical discrimination. The Quarterly Journal of Economics 116, 1 (2001), 313–350.

[5] Mason Ameri, Sean Rogers, Lisa Schur, and Douglas Kruse. 2017. No room at the inn? Disability access in the new sharing economy. (2017).

[6] Kenneth Arrow et al. 1973. The theory of discrimination. Discrimination in Labor Markets 3, 10 (1973), 3–33.

[7] Abhijit V Banerjee. 1992. A simple model of herd behavior. The Quarterly Journal of Economics 107, 3 (1992), 797–817.

[8] Gary S Becker. 1957. 
The Economics of Discrimination. University of Chicago Press (1957).

[9] Omar Besbes and Marco Scarsini. 2018. On information distortions in online ratings. Operations Research 66, 3 (2018), 597–610.

[10] Sushil Bikhchandani, David Hirshleifer, and Ivo Welch. 1992. A theory of fads, fashion, custom, and cultural change as informational cascades. Journal of Political Economy 100, 5 (1992), 992–1026.

[11] J Aislinn Bohren, Alex Imas, and Michael Rosenberg. 2019. The Dynamics of Discrimination: Theory and Evidence. To appear in American Economic Review (2019).

[12] Pedro Bordalo, Katherine B Coffman, Nicola Gennaioli, and Andrei Shleifer. 2016. Beliefs about gender. Technical Report. National Bureau of Economic Research.

[13] Stephen Coate and Glenn C Loury. 1993. Will affirmative-action policies eliminate negative stereotypes? The American Economic Review (1993), 1220–1240.

[14] Sam Corbett-Davies, Emma Pierson, Avi Feller, Sharad Goel, and Aziz Huq. 2017. Algorithmic decision making and the cost of fairness. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 797–806.

[15] Davide Crapis, Bar Ifrach, Costis Maglaras, and Marco Scarsini. 2016. Monopoly pricing in the presence of social learning. Management Science 63, 11 (2016), 3586–3608.

[16] Ruomeng Cui, Jun Li, and Dennis J Zhang. 2016. Discrimination with incomplete information in the sharing economy: Field evidence from Airbnb. (2016).

[17] Morris H DeGroot. 1974. Reaching a consensus. J. Amer. Statist. Assoc. 69, 345 (1974), 118–121.

[18] Benjamin Edelman, Michael Luca, and Dan Svirsky. 2017. Racial discrimination in the sharing economy: Evidence from a field experiment. American Economic Journal: Applied Economics 9, 2 (2017), 1–22.

[19] Glenn Ellison and Drew Fudenberg. 1993. Rules of thumb for social learning. 
Journal of Political Economy 101, 4 (1993), 612–643.

[20] Glenn Ellison and Drew Fudenberg. 1995. Word-of-mouth communication and social learning. The Quarterly Journal of Economics 110, 1 (1995), 93–125.

[21] Larry G Epstein, Jawwad Noor, and Alvaro Sandroni. 2010. Non-Bayesian learning. The BE Journal of Theoretical Economics 10, 1 (2010).

[22] Hanming Fang and Andrea Moro. 2011. Theories of statistical discrimination and affirmative action: A survey. In Handbook of Social Economics. Vol. 1. Elsevier, 133–200.

[23] Roland Fryer and Matthew O Jackson. 2008. A categorical model of cognition and biased decision making. The BE Journal of Theoretical Economics 8, 1 (2008).

[24] Roland G Fryer Jr. 2007. Belief flipping in a dynamic model of statistical discrimination. Journal of Public Economics 91, 5-6 (2007), 1151–1166.

[25] Yanbo Ge, Christopher R Knittel, Don MacKenzie, and Stephen Zoepf. 2016. Racial and gender discrimination in transportation network companies. Technical Report. National Bureau of Economic Research.

[26] Benjamin Golub and Matthew O Jackson. 2010. Naive learning in social networks and the wisdom of crowds. American Economic Journal: Microeconomics 2, 1 (2010), 112–149.

[27] Anikó Hannák, Claudia Wagner, David Garcia, Alan Mislove, Markus Strohmaier, and Christo Wilson. 2017. Bias in Online Freelance Marketplaces: Evidence from TaskRabbit and Fiverr. In CSCW. 1914–1933.

[28] Lily Hu and Yiling Chen. 2018. A Short-term Intervention for Long-term Fairness in the Labor Market. In Proceedings of the 2018 World Wide Web Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 1389–1398.

[29] Bar Ifrach, Costis Maglaras, and Marco Scarsini. 2014. Bayesian social learning with consumer reviews. 
ACM SIGMETRICS Performance Evaluation Review 41, 4 (2014), 28–28.

[30] Ramesh Johari, Vijay Kamble, and Yash Kanoria. 2017. Matching while Learning. In Proceedings of the 2017 ACM Conference on Economics and Computation. ACM, 119–119.

[31] Matthew Joseph, Michael Kearns, Jamie H Morgenstern, and Aaron Roth. 2016. Fairness in learning: Classic and contextual bandits. In Advances in Neural Information Processing Systems. 325–333.

[32] Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan. 2016. Inherent trade-offs in the fair determination of risk scores. arXiv preprint arXiv:1609.05807 (2016).

[33] Tho Ngoc Le, Vijay G Subramanian, and Randall A Berry. 2016. Are imperfect reviews helpful in social learning? In 2016 IEEE International Symposium on Information Theory (ISIT). IEEE, 2089–2093.

[34] Jonathan Levin. 2009. The dynamics of collective reputation. The BE Journal of Theoretical Economics 9, 1 (2009).

[35] Karen Levy and Solon Barocas. 2017. Designing Against Discrimination in Online Markets. (2017).

[36] Edmund S Phelps. 1972. The statistical theory of racism and sexism. The American Economic Review 62, 4 (1972), 659–661.

[37] Emily Pronin, Daniel Y Lin, and Lee Ross. 2002. The bias blind spot: Perceptions of bias in self versus others. Personality and Social Psychology Bulletin 28, 3 (2002), 369–381.

[38] Alex Rosenblat, Karen EC Levy, Solon Barocas, and Tim Hwang. 2017. Discriminating Tastes: Uber's Customer Ratings as Vehicles for Workplace Discrimination. Policy & Internet 9, 3 (2017), 256–279.

[39] Joshua Schwartzstein. 2014. Selective attention and learning. Journal of the European Economic Association 12, 6 (2014), 1423–1452.

[40] Lones Smith and Peter Sørensen. 2000. Pathological outcomes of observational learning. Econometrica 68, 2 (2000), 371–398.

[41] Stefano Vaccari, Costis Maglaras, and Marco Scarsini. 2018. 
Social Learning from Online Reviews with Product Choice. (2018).