TY - GEN
T1 - EDA Corpus
T2 - 2024 IEEE International LLM-Aided Design Workshop, LAD 2024
AU - Wu, Bing Yue
AU - Sharma, Utsav
AU - Kankipati, Sai Rahul Dhanvi
AU - Yadav, Ajay
AU - George, Bintu Kappil
AU - Guntupalli, Sai Ritish
AU - Rovinski, Austin
AU - Chhabria, Vidya A.
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Large language models (LLMs) serve as powerful tools for design, providing capabilities for both task automation and design assistance. Recent advancements have shown tremendous potential for facilitating LLM integration into the chip design process; however, many of these works rely on data which are not publicly available and/or not permissively licensed for use in LLM training and distribution. In this paper, we present a solution aimed at bridging this gap by introducing an open-source dataset tailored for OpenROAD, a widely adopted open-source EDA toolchain. The dataset features over 1500 data points and is structured in two formats: (i) a pairwise set comprised of question prompts with prose answers, and (ii) a pairwise set comprised of code prompts and their corresponding OpenROAD scripts. By providing this dataset, we aim to facilitate LLM-focused research within the EDA domain. The dataset is available at https://github.com/OpenROAD-Assistant/EDA-Corpus.
UR - http://www.scopus.com/inward/record.url?scp=85206666352&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85206666352&partnerID=8YFLogxK
U2 - 10.1109/LAD62341.2024.10691774
DO - 10.1109/LAD62341.2024.10691774
M3 - Conference contribution
AN - SCOPUS:85206666352
T3 - 2024 IEEE LLM Aided Design Workshop, LAD 2024
BT - 2024 IEEE LLM Aided Design Workshop, LAD 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 28 June 2024 through 29 June 2024
ER -