SC21 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

In-Depth Analyses of Unified Virtual Memory System for GPU Accelerated Computing


Authors: Tyler Allen and Rong Ge (Clemson University)

Abstract: The abstraction of a shared memory space over separate CPU and GPU memory domains has eased the burden of portability for many HPC codebases, at the cost of moderate-to-high performance overhead. NVIDIA Unified Virtual Memory (UVM) is the primary real-world implementation of such an abstraction and offers a testbed for a novel in-depth performance study of both UVM and future Linux Heterogeneous Memory Management (HMM) compatible systems.

In this paper, we take a deep dive into the UVM system architecture. We reveal specific GPU hardware limitations using targeted benchmarks to uncover driver functionality as a real-time system. We further provide a quantitative evaluation of fault handling for various applications and scenarios. We find that the driver workload depends on the interactions among application access patterns, GPU hardware constraints, and host OS components. We determine that the cost of host OS components is significant and present across implementations, warranting close attention.




