Fast four-way parallel radix sorting on GPUs

Linh Ha, Jens Krüger, Cláudio T. Silva

Research output: Contribution to journalArticle

Abstract

Efficient sorting is a key requirement for many computer science algorithms. Acceleration of existing techniques as well as developing new sorting approaches is crucial for many real-time graphics scenarios, database systems, and numerical simulations to name just a few. It is one of the most fundamental operations to organize and filter the ever growing massive amounts of data gathered on a daily basis. While optimal sorting models for serial execution on a single processor exist, efficient parallel sorting remains a challenge. In this paper, we present a hardware-optimized parallel implementation of the radix sort algorithm that results in a significant speed up over existing sorting implementations. We outperform all known General Processing Unit (GPU) based sorting systems by about a factor of two and eliminate restrictions on the sorting key space. This makes our algorithm not only the fastest, but also the first general GPU sorting solution.

Original languageEnglish (US)
Pages (from-to)2368-2378
Number of pages11
JournalComputer Graphics Forum
Volume28
Issue number8
DOIs
StatePublished - Dec 2009

Keywords

  • Collision detection
  • GPGPU
  • GPU sorting
  • HPC
  • Parallel sorting

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design

Fingerprint Dive into the research topics of 'Fast four-way parallel radix sorting on GPUs'. Together they form a unique fingerprint.

  • Cite this