TY - GEN
T1 - Imperfect Inferences
T2 - 5th ACM Conference on Fairness, Accountability, and Transparency, FAccT 2022
AU - Rieke, Aaron
AU - Southerland, Vincent
AU - Svirsky, Dan
AU - Hsu, Mingwei
N1 - Publisher Copyright: © 2022 Owner/Author.
PY - 2022/6/21
Y1 - 2022/6/21
N2 - Measuring racial disparities is challenging, especially when demographic labels are unavailable. Recently, some researchers and advocates have argued that companies should infer race and other demographic factors to help them understand and address discrimination. Others have been more skeptical, emphasizing the inaccuracy of racial inferences, critiquing the conceptualization of demographic categories themselves, and arguing that the use of demographic data might encourage algorithmic tweaks where more radical interventions are needed. We conduct a novel empirical analysis that informs this debate, using a dataset of self-reported demographic information provided by users of the ride-hailing service Uber who consented to share this information for research purposes. As a threshold matter, we show how this data reflects the enduring power of racism in society. We find differences by race across a range of outcomes. For example, among self-reported African-American riders, we see racial differences on factors from iOS use to local pollution levels. We then turn to a practical assessment of racial inference methodologies and offer two key findings. First, every inference method we tested has significant errors, miscategorizing people relative to their self-reports (even as the self-reports themselves suffer from selection bias). Second, and most importantly, we found that the inference methods worked: they reliably confirmed directional racial disparities that we knew were reflected in our dataset. Our analysis also suggests that the choice of inference methods should be informed by the measurement task. For example, disparities that are geographic in nature might be best captured by inferences that rely on geography; discrimination based on a person's name might be best detected by inferences that rely on names. 
In conclusion, our analysis shows that common racial inference methods have real and practical utility in shedding light on aggregate, directional disparities, despite their imperfections. While the recent literature has identified notable challenges regarding the collection and use of this data, these challenges should not be seen as dispositive.
KW - civil rights
KW - demographics
KW - discrimination
KW - fairness
KW - inference
KW - race
UR - http://www.scopus.com/inward/record.url?scp=85132979733&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85132979733&partnerID=8YFLogxK
U2 - 10.1145/3531146.3533140
DO - 10.1145/3531146.3533140
M3 - Conference contribution
AN - SCOPUS:85132979733
T3 - ACM International Conference Proceeding Series
SP - 767
EP - 777
BT - Proceedings of 2022 5th ACM Conference on Fairness, Accountability, and Transparency, FAccT 2022
PB - Association for Computing Machinery
Y2 - 21 June 2022 through 24 June 2022
ER -