Formalizing Address Spaces with application to Cuda, OpenCL, and beyond

Benedict R. Gaster and Lee Howes

In the Proceedings of the 6th Annual Workshop on General Purpose Processing with Graphics Processing Units (GPGPU 6). March 2013.

Abstract

Modern GPUs, such as AMD’s Graphics Core Next and Nvidia’s Fermi, support an additional generic address space that dynamically determines which disjoint address space an address belongs to, dispatching the correct load/store operation to the corresponding memory subsystem. Generic address spaces allow dynamic casting between generic and non-generic address spaces, similar to the dynamic subtyping found in object-oriented languages. The advantage of the generic address space is that it simplifies the programming model, but sometimes at the cost of decreased performance, both at run time and in the optimizations a compiler can safely perform.

This paper describes a new type system for inferring Cuda- and OpenCL-style address spaces, and we show that these address spaces can be inferred automatically. We extend this base system with a notion of generic address space, including dynamic casting, and show that a static translation exists to architectures without support for generic address spaces, albeit at a potential performance cost. This performance cost can be reclaimed when an architecture directly supports a generic address space.