Abstract
With the evolution of Smart Cyber–Physical Systems (CPS) and the Internet of Things (IoT), the number of connected (intelligent) devices is increasing at an exponential rate, and so is the amount of data they produce. To process this enormous volume of data, algorithms are being proposed that can extract information from it with little human intervention. Deep Learning (DL) is one of the fields that has recently shown significant potential in this direction, as it allows computational models (deep neural networks) composed of multiple processing layers to learn autonomously to interpret data with the help of the back-propagation algorithm. However, state-of-the-art deep neural networks (DNNs) are highly resource-hungry. To enable the use of DNNs on resource-constrained mobile devices, several optimization techniques have been proposed at different abstraction layers of the computing stack. In this paper, we highlight the main categories of these optimization techniques and present a cross-layer methodology for developing performance- and energy-efficient DNN-based systems. At the software level, we employ a structured pruning technique along with quantization of inputs and network parameters to reduce the computational complexity and memory requirements of the DNN inference process. At the hardware level, exploiting the error-resilience characteristics of DNNs, we make use of functional approximations in the arithmetic modules to further improve the efficiency of DNN-based systems.
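As a rough illustration of the three techniques named in the abstract, the following Python sketch shows one possible form of each: structured pruning that removes whole filters, uniform quantization of parameters, and a functionally approximate multiplier. The function names, the L1-norm ranking criterion, the symmetric 8-bit scheme, and the bit-truncating multiplier are all assumptions chosen for illustration. The abstract does not say which pruning criterion, bit widths, or approximate arithmetic units the paper uses, so this should not be read as the authors' method.

```python
# Illustrative sketch only; every name and design choice here is hypothetical
# and is not taken from the paper.

def prune_filters(filters, keep_ratio):
    """Structured pruning: drop whole filters with the smallest L1 norm.

    `filters` is a list of flat weight lists, one per filter.
    Returns the surviving filters and their original indices.
    """
    norms = [sum(abs(w) for w in f) for f in filters]
    n_keep = max(1, round(len(filters) * keep_ratio))
    # Rank filters by norm (largest first), keep the top n_keep,
    # then restore the original ordering of the survivors.
    ranked = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    keep = sorted(ranked[:n_keep])
    return [filters[i] for i in keep], keep


def quantize(values, bits=8):
    """Symmetric uniform quantization to signed integers of width `bits`.

    Returns the integer codes and the scale needed to dequantize them
    (value is approximately code * scale).
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / qmax
    return [round(v / scale) for v in values], scale


def approx_mul(a, b, dropped_bits=2):
    """Toy functional approximation of an integer multiplier.

    Zeroes the `dropped_bits` least significant bits of each operand's
    magnitude before multiplying, trading accuracy for simpler logic.
    """
    sign = -1 if (a < 0) != (b < 0) else 1
    mask = ~((1 << dropped_bits) - 1)
    return sign * ((abs(a) & mask) * (abs(b) & mask))
```

In this sketch, a layer would first be passed through `prune_filters`, the surviving weights through `quantize`, and the resulting integer codes multiplied with `approx_mul` in place of an exact product. Because whole filters are removed, the pruned layer stays dense and needs no sparse indexing, which is the usual reason structured pruning is preferred on constrained hardware.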
Original language | English (US) |
---|---|
Article number | 103609 |
Journal | Microprocessors and Microsystems |
Volume | 88 |
DOIs | |
State | Published - Feb 2022 |
Keywords
- Approximate computing
- CNN
- Convolutional neural networks
- Cross-layer
- DNN
- Deep Learning
- Deep neural networks
- Edge computing
- Machine learning
- Optimization
- Pruning
- Quantization
ASJC Scopus subject areas
- Software
- Hardware and Architecture
- Computer Networks and Communications
- Artificial Intelligence