TY - JOUR
T1 - TiQSA
T2 - Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation
AU - Sabir, Dilshad
AU - Hanif, Muhammad Abdullah
AU - Hassan, Ali
AU - Rehman, Saad
AU - Shafique, Muhammad
N1 - Funding Information:
This work was supported by the National University of Sciences and Technology, Islamabad.
Publisher Copyright:
© 2013 IEEE.
PY - 2021
Y1 - 2021
N2 - Convolutional Neural Networks (CNNs) in Internet-of-Things (IoT)-based applications face stringent constraints, such as limited memory capacity and energy resources, owing to the many computations in convolution layers. To reduce the computational workload in these layers, this paper proposes a hybrid convolution method in conjunction with a Particle of Swarm Convolution Layer Optimization (PSCLO) algorithm. The hybrid convolution is an approximation that exploits the inherent symmetry of filters, termed symmetry approximation, and the structure of the Winograd algorithm, termed tile quantization approximation. PSCLO optimizes the balance between workload reduction and accuracy degradation for each convolution layer by selecting fine-tuned thresholds that control the intensity of each approximation. The proposed methods have been evaluated on the ImageNet, MNIST, Fashion-MNIST, SVHN, and CIFAR-10 datasets. The proposed techniques achieved a 5.28x multiplicative workload reduction without significant accuracy degradation (<0.1%) for ImageNet on ResNet-18, which is 1.08x less multiplicative workload than state-of-the-art Winograd CNN pruning. For LeNet, the multiplicative workload reduction was 3.87x and 3.93x for the MNIST and Fashion-MNIST datasets, respectively. The additive workload reduction was 2.5x and 2.56x for the respective datasets. There was no significant accuracy loss for the MNIST and Fashion-MNIST datasets.
AB - Convolutional Neural Networks (CNNs) in Internet-of-Things (IoT)-based applications face stringent constraints, such as limited memory capacity and energy resources, owing to the many computations in convolution layers. To reduce the computational workload in these layers, this paper proposes a hybrid convolution method in conjunction with a Particle of Swarm Convolution Layer Optimization (PSCLO) algorithm. The hybrid convolution is an approximation that exploits the inherent symmetry of filters, termed symmetry approximation, and the structure of the Winograd algorithm, termed tile quantization approximation. PSCLO optimizes the balance between workload reduction and accuracy degradation for each convolution layer by selecting fine-tuned thresholds that control the intensity of each approximation. The proposed methods have been evaluated on the ImageNet, MNIST, Fashion-MNIST, SVHN, and CIFAR-10 datasets. The proposed techniques achieved a 5.28x multiplicative workload reduction without significant accuracy degradation (<0.1%) for ImageNet on ResNet-18, which is 1.08x less multiplicative workload than state-of-the-art Winograd CNN pruning. For LeNet, the multiplicative workload reduction was 3.87x and 3.93x for the MNIST and Fashion-MNIST datasets, respectively. The additive workload reduction was 2.5x and 2.56x for the respective datasets. There was no significant accuracy loss for the MNIST and Fashion-MNIST datasets.
KW - CNN
KW - Convolutional neural network
KW - DNN
KW - particle of swarm convolution layer optimization
KW - reduced workload
KW - symmetry approximation
KW - tile quantization approximation
KW - Winograd transform
UR - http://www.scopus.com/inward/record.url?scp=85103755552&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85103755552&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2021.3069906
DO - 10.1109/ACCESS.2021.3069906
M3 - Article
AN - SCOPUS:85103755552
SN - 2169-3536
VL - 9
SP - 53647
EP - 53668
JO - IEEE Access
JF - IEEE Access
M1 - 9389774
ER -