Abstract
The ability to learn patterns directly and accurately from raw data has enabled the use of deep learning (DL) in application domains such as image and video processing, natural language processing, autonomous driving, robotics, smart healthcare, and predictive maintenance. However, state-of-the-art DL models are highly complex and require large amounts of compute and memory resources to produce highly accurate results. A number of deep neural network (DNN) optimization techniques have been proposed to reduce the complexity of these models and enable their reliable and efficient deployment in resource-constrained embedded scenarios. This chapter first presents an overview of DNN optimization and approximation techniques designed to improve the area, performance, and energy efficiency of DL systems. It then covers methodologies that synergistically integrate multiple optimization techniques in a cross-layer flow. Finally, the chapter discusses work on end-to-end system-level optimizations and approximations that enable ultra-efficient DNN inference.
Original language | English (US) |
---|---|
Title of host publication | Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing |
Subtitle of host publication | Software Optimizations and Hardware/Software Codesign |
Publisher | Springer Nature |
Pages | 225-248 |
Number of pages | 24 |
ISBN (Electronic) | 9783031399329 |
ISBN (Print) | 9783031399312 |
DOIs | |
State | Published - Jan 1 2023 |
Keywords
- Cross-layer optimization
- Deep neural networks
- DNN accelerators
- DNN optimization
- Hardware approximations
- Pruning
- Quantization
- Self-healing
ASJC Scopus subject areas
- General Computer Science
- General Engineering
- General Social Sciences