TY - GEN
T1 - MemGANs
T2 - 2019 IEEE/ACM International Symposium on Low Power Electronics and Design, ISLPED 2019
AU - Hanif, Muhammad Abdullah
AU - Zuhaib Akbar, Muhammad
AU - Ahmed, Rehan
AU - Rehman, Semeen
AU - Jantsch, Axel
AU - Shafique, Muhammad
N1 - Funding Information:
This work was partially supported by the Erasmus+ International Credit Mobility (KA107).
Publisher Copyright:
© 2019 IEEE.
Copyright:
Copyright 2019 Elsevier B.V., All rights reserved.
PY - 2019/7
Y1 - 2019/7
N2 - Generative Adversarial Networks (GANs) have gained importance because of their tremendous unsupervised learning capability and numerous applications in data generation, for example, text-to-image synthesis, synthetic medical data generation, video generation, and artwork generation. Hardware acceleration for GANs becomes challenging due to their intrinsically complex computational phases, which require efficient data management during training and inference. In this work, we propose a distributed on-chip memory architecture that aims at efficiently handling the data for the complex computations involved in GANs, such as strided convolution and transposed convolution. We also propose a controller that improves computational efficiency by pre-arranging the data from either the off-chip memory or the computational units before storing it in the on-chip memory. Our architectural enhancement achieves a 3.65x performance improvement over the state of the art, and reduces the number of read and write accesses by 85% and 75%, respectively.
AB - Generative Adversarial Networks (GANs) have gained importance because of their tremendous unsupervised learning capability and numerous applications in data generation, for example, text-to-image synthesis, synthetic medical data generation, video generation, and artwork generation. Hardware acceleration for GANs becomes challenging due to their intrinsically complex computational phases, which require efficient data management during training and inference. In this work, we propose a distributed on-chip memory architecture that aims at efficiently handling the data for the complex computations involved in GANs, such as strided convolution and transposed convolution. We also propose a controller that improves computational efficiency by pre-arranging the data from either the off-chip memory or the computational units before storing it in the on-chip memory. Our architectural enhancement achieves a 3.65x performance improvement over the state of the art, and reduces the number of read and write accesses by 85% and 75%, respectively.
KW - DCGAN
KW - DNN
KW - GAN
KW - Generative Adversarial Networks
KW - Hardware Accelerator
KW - Memory Architecture
UR - http://www.scopus.com/inward/record.url?scp=85072664153&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85072664153&partnerID=8YFLogxK
U2 - 10.1109/ISLPED.2019.8824833
DO - 10.1109/ISLPED.2019.8824833
M3 - Conference contribution
AN - SCOPUS:85072664153
T3 - Proceedings of the International Symposium on Low Power Electronics and Design
BT - International Symposium on Low Power Electronics and Design, ISLPED 2019
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 29 July 2019 through 31 July 2019
ER -