MASR: A Modular Accelerator for Sparse RNNs

Udit Gupta, Brandon Reagen, Lillian Pentecost, Marco Donato, Thierry Tambe, Alexander M. Rush, Gu Yeon Wei, David Brooks

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Recurrent neural networks (RNNs) are becoming the de-facto solution for speech recognition. RNNs exploit long-term temporal relationships in data by applying repeated, learned transformations. Unlike fully-connected (FC) layers with single vector-matrix operations, RNN layers consist of hundreds of such operations chained over time. This poses challenges unique to RNNs that are not found in convolutional neural networks (CNNs) or FC models, namely large dynamic activations. In this paper we present MASR, a principled and modular architecture that accelerates bidirectional RNNs for on-chip ASR. MASR is designed to exploit sparsity in both dynamic activations and static weights. The architecture is enhanced by a series of dynamic activation optimizations that enable compact storage, ensure no energy is wasted computing null operations, and maintain high MAC utilization for highly parallel accelerator designs. In comparison to current state-of-the-art sparse neural network accelerators (e.g., EIE), MASR provides 2× area, 3× energy, and 1.6× performance benefits. The modular nature of MASR enables designs that efficiently scale from resource-constrained low-power IoT applications to large-scale, highly parallel datacenter deployments.
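The abstract's central idea — skipping null operations that arise from zero weights (static sparsity) and zero activations (dynamic sparsity) — can be illustrated with a small software analogy. The sketch below is a hypothetical illustration, not MASR's actual hardware design: it performs a sparse vector-matrix product in which only nonzero weights are stored and multiplications against zero activations are never issued.

```python
# Minimal sketch (illustrative only, not the MASR architecture): a sparse
# matrix-vector product that avoids work for both zero weights and zero
# activations, mirroring the "no energy wasted on null operations" idea.

def sparse_mv(weights_csr, activations):
    """Compute y = W @ x exploiting sparsity in both operands.

    weights_csr: dict mapping row index -> list of (col, value) pairs,
                 storing only the nonzero weights (static sparsity).
    activations: dense list of input activations.
    """
    # Index the nonzero activations once per timestep (dynamic sparsity),
    # so every zero activation is skipped across all rows.
    nz_act = {i: a for i, a in enumerate(activations) if a != 0.0}
    out = {}
    for row, entries in weights_csr.items():
        acc = 0.0
        for col, w in entries:
            a = nz_act.get(col)
            if a is not None:       # skip multiply-accumulates with zero inputs
                acc += w * a
        out[row] = acc
    return out

# Example: row 0 accumulates 2.0*1.0 + 1.0*4.0; row 1's only weight hits a
# zero activation, so no MAC is performed for it.
result = sparse_mv({0: [(0, 2.0), (2, 1.0)], 1: [(1, 3.0)]}, [1.0, 0.0, 4.0])
print(result)  # {0: 6.0, 1: 0.0}
```

In hardware, the same principle requires compact encodings for the nonzero indices and load balancing to keep MAC units busy, which is what the dynamic activation optimizations in the paper address.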

Original language: English (US)
Title of host publication: Proceedings - 2019 28th International Conference on Parallel Architectures and Compilation Techniques, PACT 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1-14
Number of pages: 14
ISBN (Electronic): 9781728136134
DOIs
State: Published - Sep 2019
Event: 28th International Conference on Parallel Architectures and Compilation Techniques, PACT 2019 - Seattle, United States
Duration: Sep 21 2019 - Sep 25 2019

Publication series

Name: Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
Volume: 2019-September
ISSN (Print): 1089-795X

Conference

Conference: 28th International Conference on Parallel Architectures and Compilation Techniques, PACT 2019
Country/Territory: United States
City: Seattle
Period: 9/21/19 - 9/25/19

Keywords

  • Accelerator
  • Recurrent neural networks
  • Automatic speech recognition
  • Deep neural network
  • Sparsity

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
