TY - GEN
T1 - Adaptive Distributed Convolutional Neural Network Inference at the Network Edge with ADCNN
AU - Zhang, Sai Qian
AU - Lin, Jieyu
AU - Zhang, Qi
N1 - Publisher Copyright:
© 2020 ACM.
PY - 2020/8/17
Y1 - 2020/8/17
AB - The emergence of the Internet of Things (IoT) has led to a remarkable increase in the volume of data generated at the network edge. To support real-time smart IoT applications, massive amounts of data generated by edge devices need to be processed with low latency using methods such as deep neural networks (DNNs). To improve application performance and minimize resource cost, enterprises have begun to adopt edge computing, a computation paradigm that advocates processing input data locally at the network edge. However, because edge nodes are often resource-constrained, running data-intensive DNN inference tasks on a single edge node often incurs high latency, which seriously limits the practicality and effectiveness of this model. In this paper, we study the problem of distributed execution of inference tasks for Convolutional Neural Networks (CNNs), one of the most prominent DNN models, on edge clusters. Unlike previous work, we present Fully Decomposable Spatial Partition (FDSP), which naturally supports resource heterogeneity and dynamicity in edge computing environments. We then present a compression technique that further reduces network communication overhead. Our system, called ADCNN, provides up to a 2.8× speedup over state-of-the-art approaches while achieving competitive inference accuracy.
KW - Convolutional neural networks
KW - Distributed inference
KW - Edge computing
UR - http://www.scopus.com/inward/record.url?scp=85090595123&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090595123&partnerID=8YFLogxK
U2 - 10.1145/3404397.3404473
DO - 10.1145/3404397.3404473
M3 - Conference contribution
AN - SCOPUS:85090595123
T3 - ACM International Conference Proceeding Series
BT - Proceedings of the 49th International Conference on Parallel Processing, ICPP 2020
PB - Association for Computing Machinery
T2 - 49th International Conference on Parallel Processing, ICPP 2020
Y2 - 17 August 2020 through 20 August 2020
ER -