Leveraging Implicit Feedback from Deployment Data in Dialogue

Richard Yuanzhe Pang, Stephen Roller, Kyunghyun Cho, He He, Jason Weston

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We study improving social conversational agents by learning from natural dialogue between users and a deployed model, without extra annotations. To implicitly measure the quality of a machine-generated utterance, we leverage signals like user response length, sentiment and reaction of the future human utterances in the collected dialogue episodes. Our experiments use the publicly released deployment data from BlenderBot (Xu et al., 2023). Human evaluation indicates improvements in our new models over baseline responses; however, we find that some proxy signals can lead to more generations with undesirable properties as well. For example, optimizing for conversation length can lead to more controversial or unfriendly generations compared to the baseline, whereas optimizing for positive sentiment or reaction can decrease these behaviors.

Original language: English (US)
Title of host publication: EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
Editors: Yvette Graham, Matthew Purver
Publisher: Association for Computational Linguistics (ACL)
Pages: 60-75
Number of pages: 16
ISBN (Electronic): 9798891760899
State: Published - 2024
Event: 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2024 - St. Julian's, Malta
Duration: Mar 17 2024 → Mar 22 2024

Publication series

Name: EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
Volume: 2

Conference

Conference: 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2024
Country/Territory: Malta
City: St. Julian's
Period: 3/17/24 → 3/22/24

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
