TY - GEN
T1 - CapsAcc
T2 - 22nd Design, Automation and Test in Europe Conference and Exhibition, DATE 2019
AU - Marchisio, Alberto
AU - Hanif, Muhammad Abdullah
AU - Shafique, Muhammad
N1 - Publisher Copyright:
© 2019 EDAA.
Copyright:
Copyright 2019 Elsevier B.V., All rights reserved.
PY - 2019/5/14
Y1 - 2019/5/14
N2 - Recently, CapsuleNets have overtaken traditional Deep Convolutional Neural Networks (CNNs), because of their improved generalization ability due to the multi-dimensional capsules, in contrast to the single-dimensional neurons. Consequently, CapsuleNets also require extremely intense matrix computations, making it a gigantic challenge to achieve high performance. In this paper, we propose CapsAcc, the first specialized CMOS-based hardware architecture to perform CapsuleNets inference with high performance and energy efficiency. State-of-the- art convolutional CNN accelerators would not work efficiently for CapsuleNets, as their designs do not account for unique processing nature of CapsuleNets involving multi-dimensional matrix processing, squashing and dynamic routing. Our architecture exploits the massive parallelism by flexibly feeding the data to a specialized systolic array according to the operations required in different layers. It also avoids extensive load and store operations on the on-chip memory, by reusing the data when possible. We synthesized the complete CapsAcc architecture in a 32nm CMOS technology using Synopsys design tools, and evaluated it for the MNIST benchmark (as also done by the original CapsuleNet paper) to ensure consistent and fair comparisons. This work enables highly-efficient CapsuleNets inference on embedded platforms.
AB - Recently, CapsuleNets have overtaken traditional Deep Convolutional Neural Networks (CNNs), because of their improved generalization ability due to the multi-dimensional capsules, in contrast to the single-dimensional neurons. Consequently, CapsuleNets also require extremely intense matrix computations, making it a gigantic challenge to achieve high performance. In this paper, we propose CapsAcc, the first specialized CMOS-based hardware architecture to perform CapsuleNets inference with high performance and energy efficiency. State-of-the- art convolutional CNN accelerators would not work efficiently for CapsuleNets, as their designs do not account for unique processing nature of CapsuleNets involving multi-dimensional matrix processing, squashing and dynamic routing. Our architecture exploits the massive parallelism by flexibly feeding the data to a specialized systolic array according to the operations required in different layers. It also avoids extensive load and store operations on the on-chip memory, by reusing the data when possible. We synthesized the complete CapsAcc architecture in a 32nm CMOS technology using Synopsys design tools, and evaluated it for the MNIST benchmark (as also done by the original CapsuleNet paper) to ensure consistent and fair comparisons. This work enables highly-efficient CapsuleNets inference on embedded platforms.
UR - http://www.scopus.com/inward/record.url?scp=85066634482&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85066634482&partnerID=8YFLogxK
U2 - 10.23919/DATE.2019.8714922
DO - 10.23919/DATE.2019.8714922
M3 - Conference contribution
AN - SCOPUS:85066634482
T3 - Proceedings of the 2019 Design, Automation and Test in Europe Conference and Exhibition, DATE 2019
SP - 964
EP - 967
BT - Proceedings of the 2019 Design, Automation and Test in Europe Conference and Exhibition, DATE 2019
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 25 March 2019 through 29 March 2019
ER -