TY - GEN
T1 - Concept-Guided Chain-of-Thought Prompting for Pairwise Comparison Scoring of Texts with Large Language Models
AU - Wu, Patrick Y.
AU - Nagler, Jonathan
AU - Tucker, Joshua
AU - Messing, Solomon
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Existing text scoring methods require a large corpus, struggle with short texts, or require hand-labeled data. We develop a text scoring framework that leverages generative large language models (LLMs) to (1) set texts against the backdrop of information from the near-totality of the web and digitized media, and (2) effectively transform pairwise text comparisons from a reasoning problem to a pattern recognition task. Our approach, concept-guided chain-of-thought (CGCoT), utilizes a chain of researcher-designed prompts with an LLM to generate a concept-specific breakdown for each text, akin to guidance provided to human coders. We then pairwise compare breakdowns using an LLM and aggregate answers into a score using a probability model. We apply this approach to better understand speech reflecting aversion to specific political parties on Twitter, a topic that has commanded increasing interest because of its potential contributions to democratic backsliding. We achieve stronger correlations with human judgments than widely used unsupervised text scoring methods like Wordfish. In a supervised setting, besides a small pilot dataset to develop CGCoT prompts, our measures require no additional hand-labeled data and produce predictions on par with RoBERTa-Large fine-tuned on thousands of hand-labeled tweets. This project showcases the potential of combining human expertise and LLMs for scoring tasks.
AB - Existing text scoring methods require a large corpus, struggle with short texts, or require hand-labeled data. We develop a text scoring framework that leverages generative large language models (LLMs) to (1) set texts against the backdrop of information from the near-totality of the web and digitized media, and (2) effectively transform pairwise text comparisons from a reasoning problem to a pattern recognition task. Our approach, concept-guided chain-of-thought (CGCoT), utilizes a chain of researcher-designed prompts with an LLM to generate a concept-specific breakdown for each text, akin to guidance provided to human coders. We then pairwise compare breakdowns using an LLM and aggregate answers into a score using a probability model. We apply this approach to better understand speech reflecting aversion to specific political parties on Twitter, a topic that has commanded increasing interest because of its potential contributions to democratic backsliding. We achieve stronger correlations with human judgments than widely used unsupervised text scoring methods like Wordfish. In a supervised setting, besides a small pilot dataset to develop CGCoT prompts, our measures require no additional hand-labeled data and produce predictions on par with RoBERTa-Large fine-tuned on thousands of hand-labeled tweets. This project showcases the potential of combining human expertise and LLMs for scoring tasks.
KW - social science methods or tools
KW - text analysis
KW - text mining
UR - http://www.scopus.com/inward/record.url?scp=85218074659&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85218074659&partnerID=8YFLogxK
U2 - 10.1109/BigData62323.2024.10825235
DO - 10.1109/BigData62323.2024.10825235
M3 - Conference contribution
AN - SCOPUS:85218074659
T3 - Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024
SP - 7232
EP - 7241
BT - Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024
A2 - Ding, Wei
A2 - Lu, Chang-Tien
A2 - Wang, Fusheng
A2 - Di, Liping
A2 - Wu, Kesheng
A2 - Huan, Jun
A2 - Nambiar, Raghu
A2 - Li, Jundong
A2 - Ilievski, Filip
A2 - Baeza-Yates, Ricardo
A2 - Hu, Xiaohua
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE International Conference on Big Data, BigData 2024
Y2 - 15 December 2024 through 18 December 2024
ER -