Abstract
Recent research in Non-Volatile Memory (NVM) and Processing-in-Memory (PIM) technologies has proposed low-energy PIM-based system designs for high-performance neural network inference. Simultaneously, neural network architecture research has gained tremendous momentum, aimed primarily at task-specific accuracy improvements. Despite the enormous potential of the PIM compute paradigm, most hardware proposals adopt a one-accelerator-fits-all-networks approach, sacrificing performance across all application domains. The overarching goal of this work is to improve the throughput and power efficiency of convolutional neural networks on resistive crossbar-based microarchitectures. To this end, we demonstrate why, how, and where to prune contemporary neural networks for superior exploitation of the crossbar's underlying parallelism model. Further, we present the first crossbar-aware neural network design principles for discovering novel crossbar-amenable network architectures. Our third contribution is a set of simple yet effective hardware optimizations that boost energy and area efficiency for modern deep neural networks and ensembles. Finally, we combine these ideas in our fourth contribution, CrossNet, a novel network architecture family that improves computational efficiency by 19.06× and power efficiency by 4.16× over state-of-the-art designs.
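To make the first contribution concrete, the sketch below illustrates (in Python) why pruning granularity matters on crossbar hardware: conv-layer weights are tiled onto fixed-size resistive arrays via the usual im2col-style flattening, so only structured pruning that empties whole rows or columns reduces the crossbar count. This is a minimal illustration under assumed parameters, not the paper's actual algorithm; the 128×128 crossbar size, the `crossbars_needed` and `prune_to_crossbar_granularity` helpers, and the L1 channel-scoring rule are all hypothetical choices for exposition.

```python
import numpy as np

# Hypothetical crossbar dimensions; real ReRAM designs vary (128x128 is a
# common size in the literature, assumed here for illustration only).
XBAR_ROWS, XBAR_COLS = 128, 128

def crossbars_needed(conv_weights: np.ndarray) -> int:
    """Count the fixed-size crossbars required to map one conv layer.

    The 4-D kernel (out_ch, in_ch, kh, kw) is flattened into a 2-D matrix
    with in_ch*kh*kw rows and out_ch columns (im2col-style mapping), then
    tiled onto XBAR_ROWS x XBAR_COLS arrays.
    """
    out_ch, in_ch, kh, kw = conv_weights.shape
    rows, cols = in_ch * kh * kw, out_ch
    return int(np.ceil(rows / XBAR_ROWS) * np.ceil(cols / XBAR_COLS))

def prune_to_crossbar_granularity(w: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Structured pruning sketch: drop whole input channels (entire blocks
    of crossbar rows) with the smallest L1 norm, so savings translate into
    fewer occupied crossbars rather than scattered zero cells."""
    out_ch, in_ch, kh, kw = w.shape
    keep = max(1, int(round(in_ch * keep_ratio)))
    scores = np.abs(w).sum(axis=(0, 2, 3))      # L1 norm per input channel
    kept = np.sort(np.argsort(scores)[-keep:])  # indices of channels to keep
    return w[:, kept, :, :]

if __name__ == "__main__":
    w = np.random.randn(256, 256, 3, 3)         # a mid-network conv layer
    print("dense :", crossbars_needed(w), "crossbars")
    pruned = prune_to_crossbar_granularity(w, 0.5)
    print("pruned:", crossbars_needed(pruned), "crossbars at 50% keep ratio")
```

Unstructured (element-wise) pruning at the same sparsity would leave every crossbar partially occupied and save nothing here, which is the intuition behind aligning the pruning pattern to the crossbar's parallelism model.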
Original language | English (US)
---|---
Article number | 9294014
Pages (from-to) | 229066-229085
Number of pages | 20
Journal | IEEE Access
Volume | 8
DOIs |
State | Published - 2020
Keywords
- CNN
- convolutional neural networks
- crossbar
- CrossNet
- deep neural networks
- design
- DNN
- efficiency
- energy consumption
- ensemble
- in-memory computing
- memristor
- neural network
- neuromorphic computing
- optimization
- performance
- principles
- Processing-in-memory
- pruning
- ReRAM
- resistive RAM
ASJC Scopus subject areas
- General Computer Science
- General Materials Science
- General Engineering