TY - GEN
T1 - PENCIL
T2 - 24th International Conference on Parallel Architecture and Compilation, PACT 2015
AU - Baghdadi, Riyadh
AU - Beaugnon, Ulysse
AU - Cohen, Albert
AU - Grosser, Tobias
AU - Kruse, Michael
AU - Reddy, Chandan
AU - Verdoolaege, Sven
AU - Betts, Adam
AU - Donaldson, Alastair F.
AU - Ketema, Jeroen
AU - Absar, Javed
AU  - van Haastregt, Sven
AU - Kravets, A.
AU - Lokhmotov, Anton
AU - David, Robert
AU - Hajiyev, Elnar
N1  - Publisher Copyright: © 2015 IEEE.
PY - 2016/3/8
Y1 - 2016/3/8
N2  - Programming accelerators such as GPUs with low-level APIs and languages such as OpenCL and CUDA is difficult, error-prone, and not performance-portable. Automatic parallelization and domain specific languages (DSLs) have been proposed to hide complexity and regain performance portability. We present PENCIL, a rigorously-defined subset of GNU C99 - enriched with additional language constructs - that enables compilers to exploit parallelism and produce highly optimized code when targeting accelerators. PENCIL aims to serve both as a portable implementation language for libraries, and as a target language for DSL compilers. We implemented a PENCIL-to-OpenCL backend using a state-of-the-art polyhedral compiler. The polyhedral compiler, extended to handle data-dependent control flow and non-affine array accesses, generates optimized OpenCL code. To demonstrate the potential and performance portability of PENCIL and the PENCIL-to-OpenCL compiler, we consider a number of image processing kernels, a set of benchmarks from the Rodinia and SHOC suites, and DSL embedding scenarios for linear algebra (BLAS) and signal processing radar applications (SpearDE), and present experimental results for four GPU platforms: AMD Radeon HD 5670 and R9 285, NVIDIA GTX 470, and ARM Mali-T604.
AB  - Programming accelerators such as GPUs with low-level APIs and languages such as OpenCL and CUDA is difficult, error-prone, and not performance-portable. Automatic parallelization and domain specific languages (DSLs) have been proposed to hide complexity and regain performance portability. We present PENCIL, a rigorously-defined subset of GNU C99 - enriched with additional language constructs - that enables compilers to exploit parallelism and produce highly optimized code when targeting accelerators. PENCIL aims to serve both as a portable implementation language for libraries, and as a target language for DSL compilers. We implemented a PENCIL-to-OpenCL backend using a state-of-the-art polyhedral compiler. The polyhedral compiler, extended to handle data-dependent control flow and non-affine array accesses, generates optimized OpenCL code. To demonstrate the potential and performance portability of PENCIL and the PENCIL-to-OpenCL compiler, we consider a number of image processing kernels, a set of benchmarks from the Rodinia and SHOC suites, and DSL embedding scenarios for linear algebra (BLAS) and signal processing radar applications (SpearDE), and present experimental results for four GPU platforms: AMD Radeon HD 5670 and R9 285, NVIDIA GTX 470, and ARM Mali-T604.
KW - Automatic optimization
KW - Domain specific languages
KW - Intermediate language
KW - OpenCL
KW - Polyhedral model
UR - http://www.scopus.com/inward/record.url?scp=84975490487&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84975490487&partnerID=8YFLogxK
U2 - 10.1109/PACT.2015.17
DO - 10.1109/PACT.2015.17
M3 - Conference contribution
AN - SCOPUS:84975490487
T3 - Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
SP - 138
EP - 149
BT - Proceedings - 24th International Conference on Parallel Architecture and Compilation, PACT 2015
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 18 October 2015 through 21 October 2015
ER -