TY - JOUR
T1 - Decontextualization: Making sentences stand-alone
AU - Choi, Eunsol
AU - Palomaki, Jennimaria
AU - Lamm, Matthew
AU - Kwiatkowski, Tom
AU - Das, Dipanjan
AU - Collins, Michael
N1 - Publisher Copyright: © 2021, MIT Press Journals. All rights reserved.
PY - 2021/2/1
Y1 - 2021/2/1
N2 - Models for question answering, dialogue agents, and summarization often interpret the meaning of a sentence in a rich context and use that meaning in a new context. Taking excerpts of text can be problematic, as key pieces may not be explicit in a local window. We isolate and define the problem of sentence decontextualization: taking a sentence together with its context and rewriting it to be interpretable out of context, while preserving its meaning. We describe an annotation procedure, collect data on the Wikipedia corpus, and use the data to train models to automatically decontextualize sentences. We present preliminary studies that show the value of sentence decontextualization in a user-facing task, and as preprocessing for systems that perform document understanding. We argue that decontextualization is an important subtask in many downstream applications, and that the definitions and resources provided can benefit tasks that operate on sentences that occur in a richer context.
UR - http://www.scopus.com/inward/record.url?scp=85106167515&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85106167515&partnerID=8YFLogxK
U2 - 10.1162/tacl_a_00377
DO - 10.1162/tacl_a_00377
M3 - Article
AN - SCOPUS:85106167515
SN - 2307-387X
VL - 9
SP - 447
EP - 461
JO - Transactions of the Association for Computational Linguistics
JF - Transactions of the Association for Computational Linguistics
ER -