Abstract
Heterogeneous and multi-device nodes are increasingly common in high-performance computing and data centers, yet existing programming models often lack simple, transparent, and portable support for these diverse architectures. The main contribution of this work is a set of novel SEER capabilities that address this challenge through a descriptive programming model, which allows applications to seamlessly leverage heterogeneous nodes across various device types. SEER manages memory efficiently and selects the appropriate device(s) according to the computational cost of the application. This is completely transparent to the programmer, which makes for a highly productive programming environment. Integrating extreme heterogeneity into the SEER library, demonstrated here by using NVIDIA and AMD GPUs simultaneously, allows it to exploit a wider range of performance opportunities. Our analysis, based on the well-known Conjugate Gradient algorithm, shows speedups above 1.5× on the computationally demanding steps of the algorithm when both architectures are used simultaneously.