Authors: Yehia Arafa (New Mexico State University); Abdel-Hameed Badawy (New Mexico State University, Los Alamos National Laboratory); Ammar ElWazir and Atanu Barai (New Mexico State University); Ali Eker (State University of New York at Binghamton); Gopinath Chennupati (Amazon Alexa); and Nandakishore Santhi and Stephan Eidenbenz (Los Alamos National Laboratory)
Abstract: In this paper, we present PPT-GPU, a scalable performance prediction toolkit for GPUs. PPT-GPU achieves scalability through a hybrid high-level modeling approach in which some computations are extrapolated and multiple parts of the model are parallelized. The tool's primary prediction models use pre-collected memory and instruction traces of the workloads to accurately capture the dynamic behavior of the kernels.
PPT-GPU accurately reports an extensive array of GPU performance metrics and is easily extensible. We use a broad set of benchmarks to verify prediction accuracy. We compare the results against hardware metrics collected using vendor profiling tools and against cycle-accurate simulators. The results show that the performance predictions are highly correlated with the actual hardware (MAPE < 16% and correlation > 0.98). Moreover, PPT-GPU is orders of magnitude faster than cycle-accurate simulators. The comprehensiveness of the collected metrics can guide architects and developers in performing design space explorations.
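For readers unfamiliar with the two accuracy figures quoted above, the following is a minimal sketch of how a mean absolute percentage error (MAPE) and a correlation coefficient can be computed from predicted and measured values. It assumes the standard textbook definitions (MAPE relative to the measured value, Pearson correlation); the abstract does not state the exact formulas used, and the function names and sample numbers here are hypothetical, not taken from PPT-GPU.

```python
import math


def mape(predicted, measured):
    """Mean absolute percentage error (in percent), relative to measured values.

    Assumes the standard definition; the paper's exact formula is not given
    in the abstract.
    """
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - m) / abs(m) for p, m in zip(predicted, measured))
    return 100.0 * total / len(measured)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical kernel cycle counts, for illustration only (not real results).
measured_cycles = [1000.0, 2000.0, 4000.0]
predicted_cycles = [1100.0, 1900.0, 4200.0]

print(f"MAPE: {mape(predicted_cycles, measured_cycles):.2f}%")
print(f"Correlation: {pearson(predicted_cycles, measured_cycles):.4f}")
```

A low MAPE indicates that individual predictions are close to the measured values, while a high correlation indicates that the predictions track the trend across kernels; the two are complementary, which is presumably why both are reported.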