CPU-Accelerator Co-Scheduling for CNN Acceleration at the Edge

Yeongmin Kim, Joonho Kong, Arslan Munir

Research output: Contribution to journal › Article › peer-review


Convolutional neural networks (CNNs) are widely deployed for many artificial intelligence (AI) applications, such as object detection and image classification. Due to the burgeoning revolution in edge AI, CNN hardware accelerators are also being employed in resource-constrained edge devices to achieve better performance and energy efficiency at the edge. Although CNN accelerators enable fast and energy-efficient CNN inference at the edge, the remaining hardware resources on the edge device, apart from the CNN accelerator, sit idle, even though they could otherwise be utilized to attain even better performance and energy efficiency for CNN inference. In this paper, we propose a CPU-accelerator co-scheduling technique for convolution (CONV) layer operations of CNN inferences on resource-constrained edge devices. Our proposed co-scheduling technique exploits the inherent parallelism across CNN output channels: the operations that generate different output channels of a CONV layer can be executed in parallel. For load balancing between the CPU and the CNN accelerator, we also propose a simple yet accurate latency model for CONV layer operations on the CPU and the accelerator. Based on the latency estimates provided by our model, we distribute the CONV layer operations to the CPU and the CNN accelerator in a load-balanced manner, minimizing the idle periods of both the CPU and the CNN accelerator. We implement our proposed hardware/software (HW/SW) co-scheduling technique on various field-programmable gate array system-on-chip (FPGA-SoC) platforms as a proof of concept. Experimental results indicate that our proposed co-scheduling technique improves system performance by 1.18×–2.00× with an energy reduction of 14.9%–49.7% as compared to accelerator-only execution.
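The core idea of the abstract can be illustrated with a minimal sketch. Assuming (as a simplification, not from the paper) that the latency model reduces to a fixed per-output-channel cost on each device, load balancing amounts to choosing how many of a CONV layer's output channels to assign to the CPU so that both devices finish at roughly the same time. The function name and the linear per-channel cost model below are illustrative assumptions:

```python
def balance_output_channels(num_out_channels, t_cpu_per_ch, t_acc_per_ch):
    """Split CONV output channels between CPU and accelerator.

    Illustrative sketch: assumes each device's latency is linear in the
    number of output channels it computes (t_*_per_ch is the estimated
    per-channel latency from some latency model). Returns the number of
    channels to run on the CPU that minimizes the makespan, i.e. the
    time when the slower of the two devices finishes.
    """
    return min(
        range(num_out_channels + 1),
        key=lambda k: max(
            k * t_cpu_per_ch,                        # CPU finish time
            (num_out_channels - k) * t_acc_per_ch,   # accelerator finish time
        ),
    )

# Example: 64 output channels, CPU 3x slower per channel than the
# accelerator -> the CPU gets the smaller share of the work.
cpu_channels = balance_output_channels(64, 3.0, 1.0)
print(cpu_channels)  # 16: CPU does 16*3.0 = 48, accelerator 48*1.0 = 48
```

With both devices finishing at time 48, neither sits idle during the CONV layer, which is the load-balancing goal the abstract describes; a real implementation would replace the linear cost model with the paper's latency model and dispatch the two channel ranges concurrently.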

Original language: English (US)
Article number: 9264125
Pages (from-to): 211422-211433
Number of pages: 12
Journal: IEEE Access
State: Published - 2020


Keywords

  • co-scheduling
  • Convolutional neural networks
  • latency model
  • load balancing
  • resource-constrained edge devices

ASJC Scopus subject areas

  • General Computer Science
  • General Materials Science
  • General Engineering


