SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection

Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Giovanni Puccetti, Thomas Arnold, Chenxi Whitehouse, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present the results and the main findings of SemEval-2024 Task 8: Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection. The task featured three subtasks. Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine. This subtask has two tracks: a monolingual track focused solely on English texts and a multilingual track. Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM. Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine. The task attracted a large number of participants: subtask A monolingual (126), subtask A multilingual (59), subtask B (70), and subtask C (30). In this paper, we present the task, analyze the results, and discuss the system submissions and the methods they used. For all subtasks, the best systems used LLMs.

Original language: English (US)
Title of host publication: SemEval 2024 - 18th International Workshop on Semantic Evaluation, Proceedings of the Workshop
Editors: Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Publisher: Association for Computational Linguistics (ACL)
Pages: 2057-2079
Number of pages: 23
ISBN (Electronic): 9798891761070
State: Published - 2024
Event: 18th International Workshop on Semantic Evaluation, SemEval 2024, co-located with the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL 2024 - Hybrid, Mexico City, Mexico
Duration: Jun 20 2024 to Jun 21 2024

Publication series

Name: SemEval 2024 - 18th International Workshop on Semantic Evaluation, Proceedings of the Workshop

Conference

Conference: 18th International Workshop on Semantic Evaluation, SemEval 2024, co-located with the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL 2024
Country/Territory: Mexico
City: Hybrid, Mexico City
Period: 6/20/24 to 6/21/24

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Theoretical Computer Science
