Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights

Shail Dave, Riyadh Baghdadi, Tony Nowatzki, Sasikanth Avancha, Aviral Shrivastava, Baoxin Li

Research output: Contribution to journal; Review article; peer-reviewed

Abstract

Machine learning (ML) models are widely used in many important domains. For efficiently processing these computational- and memory-intensive applications, tensors of these overparameterized models are compressed by leveraging sparsity, size reduction, and quantization of tensors. Unstructured sparsity and tensors with varying dimensions yield irregular computation, communication, and memory access patterns; processing them on hardware accelerators in a conventional manner does not inherently leverage acceleration opportunities. This article provides a comprehensive survey on the efficient execution of sparse and irregular tensor computations of ML models on hardware accelerators. In particular, it discusses enhancement modules in the architecture design and the software support, categorizes different hardware designs and acceleration techniques, analyzes them in terms of hardware and execution costs, analyzes achievable accelerations for recent DNNs, and highlights further opportunities in terms of hardware/software/model codesign optimizations (inter/intramodule). The takeaways from this article include the following: understanding the key challenges in accelerating sparse, irregular shaped, and quantized tensors; understanding enhancements in accelerator systems for supporting their efficient computations; analyzing tradeoffs in opting for a specific design choice for encoding, storing, extracting, communicating, computing, and load-balancing the nonzeros; understanding how structured sparsity can improve storage efficiency and balance computations; understanding how to compile and map models with sparse tensors on the accelerators; and understanding recent design trends for efficient accelerations and further opportunities.
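As a minimal sketch of two points the abstract raises (encoding the nonzeros, and how structured sparsity can balance computations), the following Python snippet encodes a small matrix in compressed sparse row (CSR) form and compares the nonzeros per row for an unstructured pattern and for a pattern with exactly two nonzeros in every group of four. The function names, the example matrices, and the choice of CSR and a 2-of-4 pattern are illustrative assumptions, not taken from the article.

```python
# Illustrative sketch only: names and data are hypothetical, not from the survey.

def to_csr(dense):
    """Encode a dense 2-D list as CSR: (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)      # store only the nonzero value
                col_idx.append(j)     # and the column it came from
        row_ptr.append(len(values))   # where the next row starts in `values`
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR matrix by a dense vector, visiting only nonzeros."""
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def nnz_per_row(row_ptr):
    """Nonzeros per row; uneven counts mean uneven work per row."""
    return [row_ptr[r + 1] - row_ptr[r] for r in range(len(row_ptr) - 1)]


# Unstructured sparsity: nonzeros land anywhere, so rows differ in work.
unstructured = [
    [5, 0, 0, 0],
    [1, 2, 3, 4],
    [0, 0, 0, 0],
    [0, 7, 0, 0],
]

# Assumed 2-of-4 structured pattern: every row of four keeps two nonzeros.
structured_2_of_4 = [
    [5, 0, 0, 1],
    [0, 2, 3, 0],
    [4, 0, 6, 0],
    [0, 7, 0, 8],
]

x = [1, 1, 1, 1]

u_vals, u_cols, u_ptr = to_csr(unstructured)
s_vals, s_cols, s_ptr = to_csr(structured_2_of_4)

print("unstructured nnz/row:", nnz_per_row(u_ptr))
print("structured   nnz/row:", nnz_per_row(s_ptr))
print("unstructured y:", csr_matvec(u_vals, u_cols, u_ptr, x))
print("structured   y:", csr_matvec(s_vals, s_cols, s_ptr, x))
```

In this toy case the unstructured rows hold 1, 4, 0, and 1 nonzeros, so processing elements assigned one row each would finish at different times, while the structured rows hold 2 each. With a fixed per-group count, the positional metadata can also be made fixed-width, which is one way structured patterns may reduce storage overhead.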

Original language: English (US)
Pages (from-to): 1706-1752
Number of pages: 47
Journal: Proceedings of the IEEE
Volume: 109
Issue number: 10
State: Published - Oct 2021

Keywords

  • Compact models
  • VLSI
  • compiler optimizations
  • dataflow
  • deep learning
  • deep neural networks (DNNs)
  • dimension reduction
  • energy efficiency
  • hardware/software/model codesign
  • machine learning (ML)
  • pruning
  • quantization
  • reconfigurable computing
  • sparsity
  • spatial architecture
  • tensor decomposition

ASJC Scopus subject areas

  • Computer Science (all)
  • Electrical and Electronic Engineering
