PPoPP 2022
Sat 2 - Wed 6 April 2022
Mon 4 Apr 2022 10:20 - 10:35 - Session 1 Chair(s): Tongping Liu

Modern computing platforms tend to deploy multiple GPUs on a single node to boost performance. GPUs are an expensive resource, and increasing their utilization without causing performance degradation of individual workloads is an important and challenging problem. Although services such as NVIDIA's MPS allow multiple cooperative kernels to run simultaneously on a single device, they do not solve the co-execution problem for uncooperative, independent kernels on such a multi-GPU system. To tackle this problem, we propose CASE, a fully automated compiler-assisted scheduling framework. During the compilation of an application, CASE constructs GPU tasks from CUDA programs and instruments a probe before each one. At runtime, each probe conveys information about its task's resource requirements, such as memory and compute, to a user-level scheduler. The scheduler then places each task onto a suitable device by employing a policy appropriate to the system. In our prototype, a throughput-oriented scheduling policy is implemented to evaluate our resource-aware scheduling framework. The Rodinia benchmark suite and the Darknet neural network framework were used in our evaluation. The results show that, compared to existing state-of-the-art methods, CASE improves throughput by up to 2.5X for Rodinia and up to 2.7X for Darknet on modern NVIDIA GPU platforms, mainly because it improves average system utilization by up to 3.36X and job turnaround time by up to 4.9X.
Meanwhile, it limits individual kernel performance degradation to within 2.5%. CASE achieved peak system utilization of 78% for Rodinia and 80% for Darknet on a 4XV100 system.
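The probe-and-schedule mechanism described in the abstract can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the names (`Device`, `TaskProbe`, `place_task`) and the specific greedy least-loaded policy are assumptions standing in for CASE's actual throughput-oriented policy.

```python
# Hypothetical sketch of CASE-style resource-aware placement: each
# compiler-inserted probe reports its task's memory and compute needs,
# and a user-level scheduler places the task on a device with enough
# free memory, preferring the least-loaded one.

from dataclasses import dataclass

@dataclass
class Device:
    dev_id: int
    total_mem: int          # bytes of device memory
    free_mem: int           # bytes currently unallocated
    load: int = 0           # number of resident tasks (proxy for compute load)

@dataclass
class TaskProbe:
    name: str
    mem_req: int            # memory footprint reported by the probe
    compute_req: int        # e.g., thread blocks requested

def place_task(devices, probe):
    """Throughput-oriented policy: among devices with enough free memory,
    pick the least-loaded one (ties broken by most free memory).
    Returns the chosen device id, or None if no device currently fits."""
    candidates = [d for d in devices if d.free_mem >= probe.mem_req]
    if not candidates:
        return None         # a real scheduler would queue the task instead
    best = min(candidates, key=lambda d: (d.load, -d.free_mem))
    best.free_mem -= probe.mem_req
    best.load += 1
    return best.dev_id

# Usage: two 16 GiB devices (e.g., V100-class), three independent tasks.
GiB = 1 << 30
devs = [Device(0, 16 * GiB, 16 * GiB), Device(1, 16 * GiB, 16 * GiB)]
a = place_task(devs, TaskProbe("matmul", 6 * GiB, 1024))   # -> device 0
b = place_task(devs, TaskProbe("conv", 6 * GiB, 2048))     # -> device 1
c = place_task(devs, TaskProbe("big", 12 * GiB, 512))      # -> None (no fit)
print(a, b, c)
```

The design point illustrated: because the probe runs before the task launches, the scheduler can reject or redirect a task whose memory footprint would not fit, rather than letting independent kernels oversubscribe a device and degrade each other.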

Mon 4 Apr

Displayed time zone: Eastern Time (US & Canada)

10:20 - 11:20
Session 1: Main Conference
Chair(s): Tongping Liu University of Massachusetts at Amherst
10:20
15m
Talk
CASE: A Compiler-Assisted SchEduling Framework for Multi-GPU Systems
Main Conference
Chao Chen Amazon Web Services, Chris Porter Georgia Institute of Technology, USA, Santosh Pande Georgia Institute of Technology
10:35
15m
Talk
Dopia: Online Parallelism Management for Integrated CPU/GPU Architectures
Main Conference
Younghyun Cho University of California, Berkeley, Jiyeon Park Seoul National University, Florian Negele ETH Zurich, Changyeon Jo Seoul National University, Thomas Gross ETH Zurich, Bernhard Egger Seoul National University
10:50
15m
Talk
Mashup: Making Serverless Computing Useful for HPC Workflows via Hybrid Execution
Main Conference
Rohan Basu Roy Northeastern University, Tirthak Patel Northeastern University, Vijay Gadepally MIT Lincoln Laboratory, Devesh Tiwari Northeastern University
11:05
15m
Talk
Stream Processing with Dependency-Guided Synchronization
Main Conference
Konstantinos Kallas University of Pennsylvania, Filip Niksic Google, Caleb Stanford University of Pennsylvania, Rajeev Alur University of Pennsylvania