TY - GEN
T1 - SwapAdvisor
T2 - 25th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2020
AU - Huang, Chien Chin
AU - Jin, Gu
AU - Li, Jinyang
N1 - Publisher Copyright: © 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2020/3/9
Y1 - 2020/3/9
N2 - It is known that deeper and wider neural networks can achieve better accuracy. But it is difficult to continue the trend to increase model size due to limited GPU memory. One promising solution is to support swapping between GPU and CPU memory. However, existing work on swapping handles only certain models and does not achieve satisfactory performance. Deep learning computation is commonly expressed as a dataflow graph which can be analyzed to improve swapping. We propose SwapAdvisor, which performs joint optimization along 3 dimensions based on a given dataflow graph: operator scheduling, memory allocation, and swap decisions. SwapAdvisor explores the vast search space using a custom-designed genetic algorithm. Evaluations using a variety of large models show that SwapAdvisor can train models up to 12 times the GPU memory limit while achieving 53-99% of the throughput of a hypothetical baseline with infinite GPU memory.
AB - It is known that deeper and wider neural networks can achieve better accuracy. But it is difficult to continue the trend to increase model size due to limited GPU memory. One promising solution is to support swapping between GPU and CPU memory. However, existing work on swapping handles only certain models and does not achieve satisfactory performance. Deep learning computation is commonly expressed as a dataflow graph which can be analyzed to improve swapping. We propose SwapAdvisor, which performs joint optimization along 3 dimensions based on a given dataflow graph: operator scheduling, memory allocation, and swap decisions. SwapAdvisor explores the vast search space using a custom-designed genetic algorithm. Evaluations using a variety of large models show that SwapAdvisor can train models up to 12 times the GPU memory limit while achieving 53-99% of the throughput of a hypothetical baseline with infinite GPU memory.
UR - http://www.scopus.com/inward/record.url?scp=85082386791&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85082386791&partnerID=8YFLogxK
U2 - 10.1145/3373376.3378530
DO - 10.1145/3373376.3378530
M3 - Conference contribution
AN - SCOPUS:85082386791
T3 - International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
SP - 1341
EP - 1355
BT - ASPLOS 2020 - 25th International Conference on Architectural Support for Programming Languages and Operating Systems
PB - Association for Computing Machinery
Y2 - 16 March 2020 through 20 March 2020
ER -