TY - JOUR
T1 - Using natural language processing to analyse text data in behavioural science
AU - Feuerriegel, Stefan
AU - Maarouf, Abdurahman
AU - Bär, Dominik
AU - Geissler, Dominique
AU - Schweisthal, Jonas
AU - Pröllochs, Nicolas
AU - Robertson, Claire E.
AU - Rathje, Steve
AU - Hartmann, Jochen
AU - Mohammad, Saif M.
AU - Netzer, Oded
AU - Siegel, Alexandra A.
AU - Plank, Barbara
AU - Van Bavel, Jay J.
N1 - Publisher Copyright: © Springer Nature America, Inc. 2025.
PY - 2025/2
Y1 - 2025/2
N2 - Language is a uniquely human trait at the core of human interactions. The language people use often reflects their personality, intentions and state of mind. With the integration of the Internet and social media into everyday life, much of human communication is documented as written text. These online forms of communication (for example, blogs, reviews, social media posts and emails) provide a window into human behaviour and therefore present abundant research opportunities for behavioural science. In this Review, we describe how natural language processing (NLP) can be used to analyse text data in behavioural science. First, we review applications of text data in behavioural science. Second, we describe the NLP pipeline and explain the underlying modelling approaches (for example, dictionary-based approaches and large language models). We discuss the advantages and disadvantages of these methods for behavioural science, in particular with respect to the trade-off between interpretability and accuracy. Finally, we provide actionable recommendations for using NLP to ensure rigour and reproducibility.
UR - http://www.scopus.com/inward/record.url?scp=85213878177&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85213878177&partnerID=8YFLogxK
U2 - 10.1038/s44159-024-00392-z
DO - 10.1038/s44159-024-00392-z
M3 - Review article
AN - SCOPUS:85213878177
SN - 2731-0574
VL - 4
SP - 96
EP - 111
JO - Nature Reviews Psychology
JF - Nature Reviews Psychology
IS - 2
ER -