TY - GEN
T1 - Entities as experts
T2 - 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020
AU - Févry, Thibault
AU - Soares, Livio Baldini
AU - FitzGerald, Nicholas
AU - Choi, Eunsol
AU - Kwiatkowski, Tom
N1 - Publisher Copyright: © 2020 Association for Computational Linguistics
PY - 2020
Y1 - 2020
N2 - We focus on the problem of capturing declarative knowledge about entities in the learned parameters of a language model. We introduce a new model, Entities as Experts (EAE), that can access distinct memories of the entities mentioned in a piece of text. Unlike previous efforts to integrate entity knowledge into sequence models, EAE's entity representations are learned directly from text. We show that EAE's learned representations capture sufficient knowledge to answer TriviaQA questions such as “Which Dr. Who villain has been played by Roger Delgado, Anthony Ainley, Eric Roberts?”, outperforming an encoder-generator Transformer model with 10× the parameters. According to the LAMA knowledge probes, EAE contains more factual knowledge than a similarly sized BERT, as well as previous approaches that integrate external sources of entity knowledge. Because EAE associates parameters with specific entities, it only needs to access a fraction of its parameters at inference time, and we show that the correct identification and representation of entities is essential to EAE's performance.
UR - http://www.scopus.com/inward/record.url?scp=85102065976&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85102065976&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85102065976
T3 - EMNLP 2020 - 2020 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
SP - 4937
EP - 4951
BT - EMNLP 2020 - 2020 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
PB - Association for Computational Linguistics (ACL)
Y2 - 16 November 2020 through 20 November 2020
ER -