Roy Spliet, Lee Howes, Benedict R. Gaster, and Ana Lucia Varbanescu
To appear in the Proceedings of the 7th Annual Workshop on General Purpose Processing with Graphics Processing Units (GPGPU 7). March 2014.
OpenCL is becoming a popular choice for the parallel programming of both multi-core CPUs and GPGPUs. One of the features missing in OpenCL, yet commonly needed by irregular parallel applications, is dynamic memory allocation. In this paper, we propose KMA, a first dynamic memory allocator for OpenCL. KMA’s design is based on a thorough analysis of a set of 11 algorithms, which shows that dynamic memory allocation is a necessary commodity, typically used for implementing complex data structures (arrays, lists, or trees) that need constant restructuring at run time. Taking into account both the survey findings and the OpenCL challenges, we design KMA as a two-layer memory manager that exploits the usage patterns identified in the survey: its basic layer provides generic malloc() and free() APIs, while the higher layer provides support for building and efficiently managing dynamic data structures. Our experiments focus on the performance and usability of KMA, for both micro-benchmarks and a real-life case study, and our results show that when dynamic allocation is mandatory, KMA is a competitive allocator. We conclude that embedding dynamic memory allocation in OpenCL is feasible, but that it is a complex, delicate task due to the massive parallelism of the platform and the requirement for portability.