An Off-Chip Memory Access Optimization for Embedded Deep Learning Systems

Rachmad Vidya Wicaksana Putra, Muhammad Abdullah Hanif, Muhammad Shafique

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Implementations of Deep Neural Networks (DNNs) or Deep Learning (DL) for embedded applications may improve users' quality of life, as DL has become a prominent solution for many machine learning (ML) tasks, such as personalized healthcare assistance. Such implementations require high energy efficiency, since embedded applications usually operate under tight constraints, such as small memory and a low power/energy budget. Therefore, specialized hardware accelerators are typically employed to expedite DL inference. However, previous works have shown that DL accelerators still suffer from the high energy consumption of DRAM-based off-chip memory accesses, which hinders embedded DL implementations. In this chapter, we discuss our design methodology for optimizing the energy consumption of DRAM accesses in DL accelerators targeting embedded applications. Our design methodology employs an exploration technique to find the data partitioning and scheduling that incur the minimum number of DRAM accesses for a given DNN model, and exploits low-latency DRAMs to perform these data accesses with minimum DRAM access energy.
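
As a rough illustration of the exploration step the abstract describes, the following Python sketch enumerates candidate tilings (data partitions) of one convolutional layer and keeps the one with the lowest estimated DRAM traffic. This is a minimal sketch, not the authors' methodology: the layer dimensions, the on-chip buffer size, the cost model in dram_accesses(), and all names are illustrative assumptions.

from itertools import product

# Hypothetical layer: H x W outputs, C input channels, K output channels,
# R x S kernels; buffer size in elements. All values are assumptions.
H, W, C, K, R, S = 56, 56, 64, 128, 3, 3
ON_CHIP_BUFFER = 128 * 1024

def dram_accesses(th, tw, tk):
    """Estimate DRAM element transfers for an output tile of th x tw x tk.

    Simple model: each tile fetches its input patch and weights once and
    writes its outputs once; tilings that do not fit on chip are rejected.
    """
    in_tile = (th + R - 1) * (tw + S - 1) * C   # input patch per tile
    w_tile = tk * C * R * S                     # weights per tile
    out_tile = th * tw * tk                     # outputs per tile
    if in_tile + w_tile + out_tile > ON_CHIP_BUFFER:
        return None                             # tile exceeds on-chip buffer
    n_tiles = -(-H // th) * -(-W // tw) * -(-K // tk)  # ceiling divisions
    return n_tiles * (in_tile + w_tile + out_tile)

best = None
for th, tw, tk in product([7, 14, 28, 56], [7, 14, 28, 56], [16, 32, 64, 128]):
    cost = dram_accesses(th, tw, tk)
    if cost is not None and (best is None or cost < best[0]):
        best = (cost, (th, tw, tk))

print("best tiling (th, tw, tk):", best[1], "estimated DRAM accesses:", best[0])

A real exploration would additionally rank scheduling orders and weigh each access by its DRAM energy cost (e.g., row-buffer hits versus misses), which is where the low-latency DRAM exploitation mentioned in the abstract comes in.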

Original language: English (US)
Title of host publication: Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing
Subtitle of host publication: Hardware Architectures
Publisher: Springer International Publishing
Pages: 175-198
Number of pages: 24
ISBN (Electronic): 9783031195686
ISBN (Print): 9783031195679
DOIs
State: Published - Jan 1 2023

ASJC Scopus subject areas

  • General Computer Science
  • General Engineering
  • General Social Sciences
