TY - GEN
T1 - System-Level Natural Language Feedback
AU - Yuan, Weizhe
AU - Cho, Kyunghyun
AU - Weston, Jason
N1 - Publisher Copyright:
© 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
N2 - Natural language (NL) feedback offers rich insights into user experience. While existing studies focus on an instance-level approach, where feedback is used to refine specific examples, we introduce a framework for system-level use of NL feedback. We show how to use feedback to formalize system-level design decisions in a human-in-the-loop-process - in order to produce better models. In particular this is done through: (i) metric design for tasks; and (ii) language model prompt design for refining model responses. We conduct two case studies of this approach for improving search query and dialog response generation, demonstrating the effectiveness of system-level feedback. We show the combination of system-level and instance-level feedback brings further gains, and that human written instance-level feedback results in more grounded refinements than GPT-3.5 written ones, underlying the importance of human feedback for building systems. We release our code and data at https://github.com/yyy-Apple/Sys-NL-Feedback.
AB - Natural language (NL) feedback offers rich insights into user experience. While existing studies focus on an instance-level approach, where feedback is used to refine specific examples, we introduce a framework for system-level use of NL feedback. We show how to use feedback to formalize system-level design decisions in a human-in-the-loop-process - in order to produce better models. In particular this is done through: (i) metric design for tasks; and (ii) language model prompt design for refining model responses. We conduct two case studies of this approach for improving search query and dialog response generation, demonstrating the effectiveness of system-level feedback. We show the combination of system-level and instance-level feedback brings further gains, and that human written instance-level feedback results in more grounded refinements than GPT-3.5 written ones, underlying the importance of human feedback for building systems. We release our code and data at https://github.com/yyy-Apple/Sys-NL-Feedback.
UR - http://www.scopus.com/inward/record.url?scp=85189940700&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85189940700&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85189940700
T3 - EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
SP - 2773
EP - 2789
BT - EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
A2 - Graham, Yvette
A2 - Purver, Matthew
A2 - Purver, Matthew
PB - Association for Computational Linguistics (ACL)
T2 - 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2024
Y2 - 17 March 2024 through 22 March 2024
ER -