TY - CONF
T1 - Emergent communication in a multi-modal, multi-step referential game
AU - Evtimova, Katrina
AU - Drozdov, Andrew
AU - Kiela, Douwe
AU - Cho, Kyunghyun
N1 - Funding Information:
We thank Brenden Lake and Alex Cohen for valuable discussion. We also thank Maximilian Nickel, Y-Lan Boureau, Jason Weston, Dhruv Batra, and Devi Parikh for helpful suggestions. KC thanks AdeptMind, Tencent, eBay, NVIDIA, and CIFAR for their support. AD thanks the NVIDIA Corporation for their donation of a Titan X Pascal. This work was done by KE as a part of the course DS-GA 1010-001 Independent Study in Data Science at the Center for Data Science, New York University. A part of Fig. 1 is licensed from EmmyMik/CC BY 2.0/https://www.flickr.com/photos/emmymik/8206632393/.
Publisher Copyright:
© Learning Representations, ICLR 2018 - Conference Track Proceedings. All rights reserved.
PY - 2018/1/1
Y1 - 2018/1/1
AB - Inspired by previous work on emergent communication in referential games, we propose a novel multi-modal, multi-step referential game, where the sender and receiver have access to distinct modalities of an object, and their information exchange is bidirectional and of arbitrary duration. The multi-modal multi-step setting allows agents to develop an internal communication significantly closer to natural language, in that they share a single set of messages, and that the length of the conversation may vary according to the difficulty of the task. We examine these properties empirically using a dataset consisting of images and textual descriptions of mammals, where the agents are tasked with identifying the correct object. Our experiments indicate that a robust and efficient communication protocol emerges, where gradual information exchange informs better predictions and higher communication bandwidth improves generalization.
UR - http://www.scopus.com/inward/record.url?scp=85071195027&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85071195027&partnerID=8YFLogxK
M3 - Paper
T2 - 6th International Conference on Learning Representations, ICLR 2018
Y2 - 30 April 2018 through 3 May 2018
ER -