Efficient parallelization of the Discrete Wavelet Transform algorithm using memory-oblivious optimizations

Anastasis Keliris, Vasilis Dimitsas, Olympia Kremmyda, Dimitris Gizopoulos, Michail Maniatakos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

As the rate of single-thread CPU performance improvement per generation has diminished due to lower transistor-speed scaling and energy-related issues, researchers and industry have shifted their interest towards multi-core and many-core architectures for improving performance. Comparisons between optimized applications for parallel architectures have been quantified many times in the literature, but contradictory results have been reported, mainly due to biased methods of evaluating and comparing these architectures. In this paper, we present memory-oblivious optimizations of the widely used Discrete Wavelet Transform (DWT), and provide detailed comparisons of the algorithm on Intel and AMD multi-core CPUs, Nvidia many-core GPUs, as well as Intel's Xeon Phi many-core coprocessor. Our results indicate that, compared to their respective non-optimized single-thread implementations, memory-oblivious optimization delivers up to 17.9× to 197.2× performance improvement, depending on the architecture examined. Furthermore, compared to the state-of-the-art, the presented CPU and GPU memory-oblivious implementations are 2.6× and 1.3× faster, respectively, than the fastest implementations of DWT currently available in the literature. No comparison to the state-of-the-art can be made for the Xeon Phi, as, to the best of our knowledge, this is the first study that optimizes the DWT for this newfangled architecture.

Original language: English (US)
Title of host publication: Proceedings - 2015 25th International Workshop on Power and Timing Modeling, Optimization and Simulation, PATMOS 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 25-32
Number of pages: 8
ISBN (Electronic): 9781467394192
DOIs: https://doi.org/10.1109/PATMOS.2015.7347583
State: Published - Dec 4 2015
Event: 25th International Workshop on Power and Timing Modeling, Optimization and Simulation, PATMOS 2015 - Salvador, Bahia, Brazil
Duration: Sep 1 2015 → Sep 4 2015

Publication series

Name: Proceedings - 2015 25th International Workshop on Power and Timing Modeling, Optimization and Simulation, PATMOS 2015

Other

Other: 25th International Workshop on Power and Timing Modeling, Optimization and Simulation, PATMOS 2015
Country: Brazil
City: Salvador, Bahia
Period: 9/1/15 → 9/4/15

ASJC Scopus subject areas

  • Hardware and Architecture
  • Electrical and Electronic Engineering

  • Cite this

    Keliris, A., Dimitsas, V., Kremmyda, O., Gizopoulos, D., & Maniatakos, M. (2015). Efficient parallelization of the Discrete Wavelet Transform algorithm using memory-oblivious optimizations. In Proceedings - 2015 25th International Workshop on Power and Timing Modeling, Optimization and Simulation, PATMOS 2015 (pp. 25-32). [7347583] (Proceedings - 2015 25th International Workshop on Power and Timing Modeling, Optimization and Simulation, PATMOS 2015). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/PATMOS.2015.7347583