Toward an analytical performance model to select between GPU and CPU Execution

Artem Chikin, Jose Nelson Amaral, Karim Ali, Ettore Tiotto

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Automating the device selection in heterogeneous computing platforms requires the modelling of performance both on CPUs and on accelerators. This work argues for the use of a hybrid analytical performance modelling approach is a practical way to build fast and efficient methods to select an appropriate target for a given computation kernel. The target selection problem has been addressed in the literature, however there has been a strong emphasis on building empirical models with machine learning techniques. We argue that the applicability of such solutions is often limited in production systems. This paper focus on the issue of building a selector to decide if an OpenMP loop nest should be executed in a CPU or in a GPU. To this end, it offers a comprehensive comparison evaluation of the difference in GPU kernel performance on devices of multiple generations of architectures. The idea is to underscore the need for accurate analytical performance models and to provide insights in the evolution of GPU accelerators. This work also highlights a drawback of existing approaches to modelling GPU performance - accurate modelling of memory coalescing characteristics. To that end, we examine a novel application of an inter-thread difference analysis that can further improve analytical models. Finally, this work presents an initial study of an OpenMP runtime framework for target-offloading target selection.

Original languageEnglish (US)
Title of host publicationProceedings - 2019 IEEE 33rd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2019
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages353-362
Number of pages10
ISBN (Electronic)9781728135106
DOIs
StatePublished - May 2019
Event33rd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2019 - Rio de Janeiro, Brazil
Duration: May 20 2019May 24 2019

Publication series

NameProceedings - 2019 IEEE 33rd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2019

Conference

Conference33rd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2019
Country/TerritoryBrazil
CityRio de Janeiro
Period5/20/195/24/19

Keywords

  • GPGPU
  • Heterogeneous Computing
  • Hybrid Analysis
  • MIC
  • OpenMP
  • Performance Model
  • Static Analysis

ASJC Scopus subject areas

  • Information Systems and Management
  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Control and Optimization

Fingerprint

Dive into the research topics of 'Toward an analytical performance model to select between GPU and CPU Execution'. Together they form a unique fingerprint.

Cite this