TY - GEN
T1 - HAAN
T2 - 2025 Design, Automation and Test in Europe Conference, DATE 2025
AU - Peng, Tianfan
AU - Xia, Tianhua
AU - Qin, Jiajun
AU - Zhang, Sai Qian
N1 - Publisher Copyright:
© 2025 EDAA.
PY - 2025
Y1 - 2025
N2 - Large language models (LLMs) have revolutionized natural language processing (NLP) tasks by achieving state-of-the-art performance across a range of benchmarks. Central to the success of these models is the integration of sophisticated architectural components aimed at improving training stability, convergence speed, and generalization capabilities. Among these components, normalization operation, such as layer normalization (LayerNorm), emerges as a pivotal technique, offering substantial benefits to the overall model performance. However, previous studies have indicated that normalization operations can substantially elevate processing latency and energy usage. In this work, we adopt the principles of algorithm and hardware co-design, introducing a holistic normalization accelerating method named HAAN. The evaluation results demonstrate that HAAN can achieve significantly better hardware performance compared to state-of-the-art solutions.
AB - Large language models (LLMs) have revolutionized natural language processing (NLP) tasks by achieving state-of-the-art performance across a range of benchmarks. Central to the success of these models is the integration of sophisticated architectural components aimed at improving training stability, convergence speed, and generalization capabilities. Among these components, normalization operation, such as layer normalization (LayerNorm), emerges as a pivotal technique, offering substantial benefits to the overall model performance. However, previous studies have indicated that normalization operations can substantially elevate processing latency and energy usage. In this work, we adopt the principles of algorithm and hardware co-design, introducing a holistic normalization accelerating method named HAAN. The evaluation results demonstrate that HAAN can achieve significantly better hardware performance compared to state-of-the-art solutions.
UR - http://www.scopus.com/inward/record.url?scp=105006929952&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=105006929952&partnerID=8YFLogxK
U2 - 10.23919/DATE64628.2025.10993005
DO - 10.23919/DATE64628.2025.10993005
M3 - Conference contribution
AN - SCOPUS:105006929952
T3 - Proceedings -Design, Automation and Test in Europe, DATE
BT - 2025 Design, Automation and Test in Europe Conference, DATE 2025 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 31 March 2025 through 2 April 2025
ER -