TY - GEN
T1 - Identifying and reducing gender bias in word-level language models
AU - Bordia, Shikha
AU - Bowman, Samuel R.
N1 - Funding Information:
We are grateful to Yu Wang and Jason Cramer for helping to initiate this project, to Nishant Subramani for helpful discussion, and to our reviewers for their thoughtful feedback. Bowman acknowledges support from Samsung Research.
Publisher Copyright:
© 2019 Association for Computational Linguistics.
PY - 2019
Y1 - 2019
AB - Many text corpora exhibit socially problematic biases, which can be propagated or amplified in the models trained on such data. For example, doctor co-occurs more frequently with male pronouns than with female pronouns. In this study we (i) propose a metric to measure gender bias; (ii) measure bias in a text corpus and in the text generated by a recurrent neural network language model trained on that corpus; (iii) propose a regularization loss term for the language model that minimizes the projection of encoder-trained embeddings onto an embedding subspace that encodes gender; and (iv) evaluate the efficacy of our proposed method in reducing gender bias. We find this regularization method to be effective in reducing gender bias up to an optimal weight assigned to the loss term, beyond which the model becomes unstable as the perplexity increases. We replicate this study on three training corpora (Penn Treebank, WikiText-2, and CNN/Daily Mail), reaching similar conclusions in each case.
UR - http://www.scopus.com/inward/record.url?scp=85084298290&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85084298290&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85084298290
T3 - NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Student Research Workshop
SP - 7
EP - 15
BT - NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics
PB - Association for Computational Linguistics (ACL)
T2 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2019 - Student Research Workshop, SRW 2019
Y2 - 3 June 2019 through 5 June 2019
ER -
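
The regularization term summarized in the abstract lends itself to a short sketch. The following is a minimal illustration, not the authors' released code: the abstract confirms only that a penalty on the projection of encoder-trained embeddings onto a gender-encoding subspace is added to the language-model loss with a tunable weight. The subspace estimation via SVD over difference vectors of gender-opposite word pairs, and the names gender_subspace, bias_regularizer, pair_ids, and lam, are assumptions made here for illustration.

    import torch

    def gender_subspace(emb, pair_ids, k=1):
        # Assumption: the gender subspace is spanned by the top-k principal
        # directions of difference vectors between gender-opposite word
        # pairs (e.g. he/she), obtained via SVD.
        diffs = torch.stack([emb[m] - emb[f] for m, f in pair_ids])
        _, _, vt = torch.linalg.svd(diffs, full_matrices=False)
        return vt[:k]  # (k, d) orthonormal basis of the gender subspace

    def bias_regularizer(emb, basis, lam=0.1):
        # Squared norm of the embeddings' coordinates in the gender
        # subspace; minimizing it shrinks the gender component of each
        # embedding, scaled by the hypothetical loss weight lam.
        proj = emb @ basis.T  # (V, k) projection coordinates
        return lam * proj.pow(2).sum()

    # Hypothetical use inside a training step, added to the usual LM loss:
    #   basis = gender_subspace(model.encoder.weight, pair_ids)
    #   loss = cross_entropy(logits, targets)
    #   loss = loss + bias_regularizer(model.encoder.weight, basis, lam)

As the abstract notes, the weight on this term trades bias reduction against stability: beyond some value the model's perplexity degrades, so lam would be tuned on held-out data.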