Automatic Scan Parallelization in OpenMP

Author(s):
Zegarra, Maicol ; Pereira, Marcio ; Martorell, Xavier ; Araujo, Guido ; IEEE
Total Authors: 5
Document type: Journal article
Source: 2017 INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING WORKSHOPS (SBAC-PADW); v. N/A, p. 6-pg., 2017-01-01.
Abstract

Prefix scan (or simply scan) is an operator that computes all the partial sums of a vector. A scan operation results in a vector where each element is the sum of the preceding elements in the original vector up to the corresponding position. Scan is a key operation in many relevant problems, such as sorting, lexical analysis, string comparison, and image filtering, among others. Although there are libraries that provide hand-parallelized implementations of scan in CUDA and OpenCL, no automatic parallelization solution exists for this operator in OpenMP. This paper proposes a new clause for OpenMP which enables the automatic synthesis of the parallel scan. By using the proposed clause, a programmer can considerably reduce the complexity of designing scan-based algorithms and can thus focus on the problem rather than on learning new parallel programming models or languages. Scan was designed in AClang, an open-source LLVM/Clang compiler framework that implements the recently released OpenMP 4.X Accelerator Programming Model. Experiments running a set of typical scan-based algorithms on NVIDIA, Intel, and ARM GPUs reveal that the performance of the proposed OpenMP clause is equivalent to that achieved when using OpenCL library calls, with the advantage of lower programming complexity. (AU)
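
To make the operator concrete, the following is a minimal sketch in plain C, reading the definition above as an inclusive sum scan (each output element includes the input at its own position). It does not use or reproduce the clause proposed in the paper, whose syntax the abstract does not give. The function names `inclusive_scan` and `blocked_scan`, and the caller-provided scratch buffer `block_sums`, are assumptions chosen for illustration. The second function shows one common three-phase decomposition that makes scan amenable to parallel execution; it may differ from what AClang generates.

```c
#include <assert.h>
#include <stddef.h>

/* Sequential inclusive scan: out[i] = in[0] + in[1] + ... + in[i]. */
void inclusive_scan(const int *in, int *out, size_t n) {
    int running = 0;
    for (size_t i = 0; i < n; i++) {
        running += in[i];
        out[i] = running;
    }
}

/* Blocked scan (illustrative, hypothetical helper).
 * block_sums is scratch space with room for num_blocks ints.
 * Phases 1 and 3 have no dependences between blocks, so each block
 * could be handled by a different thread or work-group. */
void blocked_scan(const int *in, int *out, size_t n,
                  size_t num_blocks, int *block_sums) {
    assert(num_blocks > 0);
    size_t block = (n + num_blocks - 1) / num_blocks; /* ceiling division */

    /* Phase 1: scan each block independently and record its total. */
    for (size_t b = 0; b < num_blocks; b++) {
        size_t lo = b * block;
        size_t hi = (lo + block < n) ? lo + block : n;
        int running = 0;
        for (size_t i = lo; i < hi; i++) {
            running += in[i];
            out[i] = running;
        }
        block_sums[b] = running;
    }

    /* Phase 2: exclusive scan of the block totals, giving each block
     * the sum of everything that precedes it. */
    int offset = 0;
    for (size_t b = 0; b < num_blocks; b++) {
        int total = block_sums[b];
        block_sums[b] = offset;
        offset += total;
    }

    /* Phase 3: add each block's offset to its partial results. */
    for (size_t b = 0; b < num_blocks; b++) {
        size_t lo = b * block;
        size_t hi = (lo + block < n) ? lo + block : n;
        for (size_t i = lo; i < hi; i++) {
            out[i] += block_sums[b];
        }
    }
}
```

For the input {3, 1, 4, 1, 5, 9, 2, 6}, both functions fill the output with {3, 4, 8, 9, 14, 23, 25, 31}. The blocked version performs roughly twice as many additions as the sequential one, which is the usual price paid for removing the loop-carried dependence.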

FAPESP's process: 13/08293-7 - CCES - Center for Computational Engineering and Sciences
Grantee: Munir Salomao Skaf
Support Opportunities: Research Grants - Research, Innovation and Dissemination Centers - RIDC