TY - GEN
T1 - PruNet
T2 - 2018 International Joint Conference on Neural Networks, IJCNN 2018
AU - Marchisio, Alberto
AU - Hanif, Muhammad Abdullah
AU - Martina, Maurizio
AU - Shafique, Muhammad
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/10/10
Y1 - 2018/10/10
N2 - DNNs are highly memory- and compute-intensive, which makes them infeasible to deploy in real-time or mobile applications, where power and memory resources are scarce. Introducing sparsity into the network is one way to reduce these requirements. However, systematically employing pruning under given accuracy requirements is a challenging problem. We propose a novel methodology that iteratively applies magnitude-based Class-Blind pruning to compress a DNN into a sparse model. The methodology is generic and can be applied to different types of DNNs. We demonstrate that retraining after pruning is essential to restore the accuracy of the network. Experimental results show that our methodology reduces the model size by around two orders of magnitude without noticeably affecting accuracy. It requires several iterations of pruning and retraining, but achieves a Memory Saving Ratio of up to 190x (for LeNet on the MNIST dataset) compared to the baseline model. Similar results are obtained for more complex networks, e.g., 91x for VGG-16 on the CIFAR100 dataset. Combining this work with an efficient encoding for sparse networks, such as Compressed Sparse Column (CSC) or Compressed Sparse Row (CSR), further reduces the memory footprint. Our methodology can be complemented by other compression techniques, such as weight sharing, quantization, or fixed-point conversion, to further reduce memory and computation.
AB - DNNs are highly memory- and compute-intensive, which makes them infeasible to deploy in real-time or mobile applications, where power and memory resources are scarce. Introducing sparsity into the network is one way to reduce these requirements. However, systematically employing pruning under given accuracy requirements is a challenging problem. We propose a novel methodology that iteratively applies magnitude-based Class-Blind pruning to compress a DNN into a sparse model. The methodology is generic and can be applied to different types of DNNs. We demonstrate that retraining after pruning is essential to restore the accuracy of the network. Experimental results show that our methodology reduces the model size by around two orders of magnitude without noticeably affecting accuracy. It requires several iterations of pruning and retraining, but achieves a Memory Saving Ratio of up to 190x (for LeNet on the MNIST dataset) compared to the baseline model. Similar results are obtained for more complex networks, e.g., 91x for VGG-16 on the CIFAR100 dataset. Combining this work with an efficient encoding for sparse networks, such as Compressed Sparse Column (CSC) or Compressed Sparse Row (CSR), further reduces the memory footprint. Our methodology can be complemented by other compression techniques, such as weight sharing, quantization, or fixed-point conversion, to further reduce memory and computation.
UR - http://www.scopus.com/inward/record.url?scp=85056558929&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85056558929&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2018.8489764
DO - 10.1109/IJCNN.2018.8489764
M3 - Conference contribution
AN - SCOPUS:85056558929
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2018 International Joint Conference on Neural Networks, IJCNN 2018 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 8 July 2018 through 13 July 2018
ER -
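
As a rough illustration of the magnitude-based Class-Blind pruning step described in the abstract above, here is a minimal NumPy/SciPy sketch. The layer names, pruning ratio, and the retraining placeholder are illustrative assumptions, not the authors' implementation; "Class-Blind" is realized here by computing a single magnitude threshold over all layers at once.

# Minimal sketch of magnitude-based Class-Blind pruning (illustrative only).
import numpy as np
from scipy.sparse import csr_matrix

def class_blind_prune(weights, prune_ratio):
    """Zero out the globally smallest-magnitude weights across ALL layers.

    Class-Blind: one threshold is computed over the concatenation of every
    layer's weights, rather than a separate threshold per layer.
    """
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights.values()])
    threshold = np.quantile(all_mags, prune_ratio)
    return {name: np.where(np.abs(w) < threshold, 0.0, w)
            for name, w in weights.items()}

# Toy example with two randomly initialized "layers" (hypothetical names).
weights = {"fc1": np.random.randn(256, 784), "fc2": np.random.randn(10, 256)}

# One pruning step; in the paper's methodology this is iterated, with
# retraining after each pruning step to restore accuracy.
weights = class_blind_prune(weights, prune_ratio=0.5)

# The resulting sparse matrices can be stored in CSR (or CSC) form to
# reduce the memory footprint.
fc1_csr = csr_matrix(weights["fc1"])
print(f"fc1 non-zeros after pruning: {fc1_csr.nnz} / {weights['fc1'].size}")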