TY - GEN
T1 - DLoRA: Distributed Parameter-Efficient Fine-Tuning Solution for Large Language Model
T2 - 2024 Findings of the Association for Computational Linguistics, EMNLP 2024
AU - Gao, Chao
AU - Zhang, Sai Qian
N1 - Publisher Copyright:
© 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
N2 - To enhance the performance of large language models (LLMs) on downstream tasks, one solution is to fine-tune certain LLM parameters so that they better align with the characteristics of the training dataset. This process is commonly known as parameter-efficient fine-tuning (PEFT). Due to the scale of LLMs, PEFT operations are usually executed in a public environment (e.g., a cloud server). This necessitates sharing sensitive user data across public environments, thereby raising potential privacy concerns. To tackle these challenges, we propose a distributed PEFT framework called DLoRA. DLoRA enables scalable PEFT operations to be performed collaboratively between the cloud and user devices. Our evaluation demonstrates that DLoRA, coupled with the proposed Halt and Proceed algorithm, can significantly reduce the computation and communication workload on user devices while achieving superior accuracy and privacy protection. The source code can be accessed through the provided link.
AB - To enhance the performance of large language models (LLMs) on downstream tasks, one solution is to fine-tune certain LLM parameters so that they better align with the characteristics of the training dataset. This process is commonly known as parameter-efficient fine-tuning (PEFT). Due to the scale of LLMs, PEFT operations are usually executed in a public environment (e.g., a cloud server). This necessitates sharing sensitive user data across public environments, thereby raising potential privacy concerns. To tackle these challenges, we propose a distributed PEFT framework called DLoRA. DLoRA enables scalable PEFT operations to be performed collaboratively between the cloud and user devices. Our evaluation demonstrates that DLoRA, coupled with the proposed Halt and Proceed algorithm, can significantly reduce the computation and communication workload on user devices while achieving superior accuracy and privacy protection. The source code can be accessed through the provided link.
UR - http://www.scopus.com/inward/record.url?scp=85217617651&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85217617651&partnerID=8YFLogxK
U2 - 10.18653/v1/2024.findings-emnlp.802
DO - 10.18653/v1/2024.findings-emnlp.802
M3 - Conference contribution
AN - SCOPUS:85217617651
T3 - EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024
SP - 13703
EP - 13714
BT - EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2024
A2 - Al-Onaizan, Yaser
A2 - Bansal, Mohit
A2 - Chen, Yun-Nung
PB - Association for Computational Linguistics (ACL)
Y2 - 12 November 2024 through 16 November 2024
ER -