TY - JOUR
T1 - Augmented Language Models: a Survey
AU - Mialon, Grégoire
AU - Dessì, Roberto
AU - Lomeli, Maria
AU - Nalmpantis, Christoforos
AU - Pasunuru, Ram
AU - Raileanu, Roberta
AU - Rozière, Baptiste
AU - Schick, Timo
AU - Dwivedi-Yu, Jane
AU - Celikyilmaz, Asli
AU - Grave, Edouard
AU - LeCun, Yann
AU - Scialom, Thomas
N1 - Publisher Copyright:
© 2023, Transactions on Machine Learning Research. All rights reserved.
PY - 2023
Y1 - 2023
AB - This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools. The former is defined as decomposing a potentially complex task into simpler subtasks, while the latter consists of calling external modules such as a code interpreter. LMs can leverage these augmentations separately or in combination via heuristics, or learn to do so from demonstrations. While adhering to a standard missing-token prediction objective, such augmented LMs can use various, possibly non-parametric external modules to expand their context processing ability, thus departing from the pure language modeling paradigm. We therefore refer to them as Augmented Language Models (ALMs). The missing-token objective allows ALMs to learn to reason, use tools, and even act, while still performing standard natural language tasks and even outperforming most regular LMs on several benchmarks. In this work, after reviewing current advances in ALMs, we conclude that this new research direction has the potential to address common limitations of traditional LMs such as interpretability, consistency, and scalability issues.
UR - http://www.scopus.com/inward/record.url?scp=86000562801&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=86000562801&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:86000562801
SN - 2835-8856
VL - 2023
JO - Transactions on Machine Learning Research
JF - Transactions on Machine Learning Research
ER -