Mapping Systolic Arrays onto 3D Circuit Structures: Accelerating Convolutional Neural Network Inference

H. T. Kung, Bradley McDanel, Sai Qian Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In recent years, numerous designs have used systolic arrays to accelerate convolutional neural network (CNN) inference. In this work, we demonstrate that we can further speed up CNN inference and lower its power consumption by mapping systolic arrays onto 3D circuit structures as opposed to conventional 2D structures. Specifically, by operating in 3D space, a wide systolic array consisting of a number of subarrays can efficiently implement the wide convolutional layers prevalent in state-of-the-art CNNs. Additionally, by accumulating intermediate results along the third dimension, systolic arrays can process partitioned data channels in parallel with reduced data skew, lowering inference latency. We present a building block design using through-silicon vias (TSVs) for the 3D realization of systolic subarrays. We validate the 3D scheme using a 2.5D FPGA design and demonstrate that, when mapped onto 3D structures, wide systolic arrays can scale up in size without increasing the wiring length needed to interconnect subarrays. Further, by taking full advantage of 3D structures, we are able to pipeline inference across multiple layers of a CNN over a series of systolic arrays, dramatically reducing the inference time per input sample. These improvements lead to significantly reduced inference latency, which is especially important for real-time applications where it is common to process samples one at a time.
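The idea of partitioning data channels across subarrays and accumulating intermediate results along a third dimension can be illustrated functionally. The sketch below is not the authors' design: it models only the arithmetic (each hypothetical "subarray" multiplies one slice of the input channels, and the partial results are summed), and it says nothing about timing, data skew, TSVs, or wiring. The function names `matmul` and `partitioned_matmul` and the parameter `num_subarrays` are assumptions introduced for illustration.

```python
def matmul(weights, data):
    """Plain matrix product: weights is M x K, data is K x N."""
    return [
        [sum(w_row[k] * data[k][n] for k in range(len(data)))
         for n in range(len(data[0]))]
        for w_row in weights
    ]


def partitioned_matmul(weights, data, num_subarrays):
    """Split the K input channels into slices, one per hypothetical subarray,
    and accumulate the partial products (the 'third dimension' sum)."""
    k_total = len(data)
    chunk = -(-k_total // num_subarrays)  # ceiling division
    m, n = len(weights), len(data[0])
    acc = [[0] * n for _ in range(m)]
    for start in range(0, k_total, chunk):
        # Each slice could be handled by a separate subarray in parallel.
        w_part = [row[start:start + chunk] for row in weights]
        d_part = data[start:start + chunk]
        partial = matmul(w_part, d_part)
        # Accumulate this subarray's partial result into the running sum.
        for i in range(m):
            for j in range(n):
                acc[i][j] += partial[i][j]
    return acc


if __name__ == "__main__":
    weights = [[1, 2, 3, 4], [5, 6, 7, 8]]    # 2 filters x 4 input channels
    data = [[1, 0], [0, 1], [2, 2], [3, 1]]   # 4 input channels x 2 positions
    print(partitioned_matmul(weights, data, 2) == matmul(weights, data))
```

Because the partial products are independent until the final accumulation, the slices can in principle be computed concurrently, which is the property the abstract attributes to processing partitioned channels in parallel.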

Original language: English (US)
Title of host publication: Proceedings of the IEEE Workshop on Signal Processing Systems, SiPS 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 330-336
Number of pages: 7
ISBN (Electronic): 9781538663189
State: Published - Dec 31 2018
Event: 2018 IEEE Workshop on Signal Processing Systems, SiPS 2018 - Cape Town, South Africa
Duration: Oct 21 2018 to Oct 24 2018

Publication series

Name: IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation
Volume: 2018-October
ISSN (Print): 1520-6130

Conference

Conference: 2018 IEEE Workshop on Signal Processing Systems, SiPS 2018
Country/Territory: South Africa
City: Cape Town
Period: 10/21/18 to 10/24/18

Keywords

  • 3D-IC implementation
  • accelerator
  • convolutional neural network (CNN)
  • deep learning
  • FPGA
  • inference latency
  • power consumption
  • systolic array
  • wiring length

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Signal Processing
  • Applied Mathematics
  • Hardware and Architecture
