TY - GEN
T1 - LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection
T2 - 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, EMNLP 2024
AU - Abassy, Mervat
AU - Elozeiri, Kareem
AU - Aziz, Alexander
AU - Ta, Minh Ngoc
AU - Tomar, Raj Vardhan
AU - Adhikari, Bimarsha
AU - El Dine Ahmed, Saad
AU - Wang, Yuxia
AU - Afzal, Osama Mohammed
AU - Xie, Zhuohan
AU - Mansurov, Jonibek
AU - Artemova, Ekaterina
AU - Mikhailov, Vladislav
AU - Xing, Rui
AU - Geng, Jiahui
AU - Iqbal, Hasan
AU - Mujahid, Zain Muhammad
AU - Mahmoud, Tarek
AU - Tsvigun, Akim
AU - Aji, Alham Fikri
AU - Shelmanov, Artem
AU - Habash, Nizar
AU - Gurevych, Iryna
AU - Nakov, Preslav
N1 - Publisher Copyright:
© 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
AB - The ease of access to large language models (LLMs) has enabled the widespread use of machine-generated texts, and now it is often hard to tell whether a piece of text was human-written or machine-generated. This raises concerns about potential misuse, particularly within educational and academic domains. Thus, it is important to develop practical systems that can automate the process. Here, we present one such system, LLM-DetectAIve, designed for fine-grained detection. Unlike most previous work on machine-generated text detection, which focused on binary classification, LLM-DetectAIve supports four categories: (i) human-written, (ii) machine-generated, (iii) machine-written, then machine-humanized, and (iv) human-written, then machine-polished. Category (iii) aims to detect attempts to obfuscate the fact that a text was machine-generated, while category (iv) looks for cases where an LLM was used to polish a human-written text, which is typically acceptable in academic writing, but not in education. Our experiments show that LLM-DetectAIve can effectively identify the above four categories, which makes it a potentially useful tool in education, academia, and other domains. LLM-DetectAIve is publicly accessible at https://github.com/mbzuai-nlp/LLM-DetectAIve. The video describing our system is available at https://youtu.be/E8eT_bE7k8c.
UR - http://www.scopus.com/inward/record.url?scp=85216276004&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85216276004&partnerID=8YFLogxK
U2 - 10.18653/v1/2024.emnlp-demo.35
DO - 10.18653/v1/2024.emnlp-demo.35
M3 - Conference contribution
AN - SCOPUS:85216276004
T3 - EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of System Demonstrations
SP - 336
EP - 343
BT - EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of System Demonstrations
A2 - Farias, Delia Irazu Hernandez
A2 - Hope, Tom
A2 - Li, Manling
PB - Association for Computational Linguistics (ACL)
Y2 - 12 November 2024 through 16 November 2024
ER -