PPoPP 2022
Sat 2 - Wed 6 April 2022

PPoPP is the premier forum for leading work on all aspects of parallel programming, including theoretical foundations, techniques, languages, compilers, runtime systems, tools, and practical experience. In the context of the symposium, “parallel programming” encompasses work on concurrent and parallel systems (multicore, multi-threaded, heterogeneous, clustered, and distributed systems; grids; datacenters; clouds; and large scale machines). Given the rise of parallel architectures in the consumer market (desktops, laptops, and mobile devices) and data centers, PPoPP is particularly interested in work that addresses new parallel workloads and issues that arise out of extreme-scale applications or cloud platforms, as well as techniques and tools that improve the productivity of parallel programming or work towards improved synergy with such emerging architectures.

Dates

Mon 4 Apr

Displayed time zone: Eastern Time (US & Canada)

08:45 - 09:00
10:00 - 10:20
10:20 - 11:20
Session 1 (Main Conference)
Chair(s): Tongping Liu University of Massachusetts at Amherst
10:20
15m
Talk
CASE: A Compiler-Assisted SchEduling Framework for Multi-GPU Systems
Main Conference
Chao Chen Amazon Web Service, Chris Porter Georgia Institute of Technology, USA, Santosh Pande Georgia Institute of Technology
10:35
15m
Talk
Dopia: Online Parallelism Management for Integrated CPU/GPU Architectures
Main Conference
Younghyun Cho University of California, Berkeley, Jiyeon Park Seoul National University, Florian Negele ETH Zurich, Changyeon Jo Seoul National University, Thomas Gross ETH Zurich, Bernhard Egger Seoul National University
10:50
15m
Talk
Mashup: Making Serverless Computing Useful for HPC Workflows via Hybrid Execution
Main Conference
Rohan Basu Roy Northeastern University, Tirthak Patel Northeastern University, Vijay Gadepally MIT Lincoln Laboratory, Devesh Tiwari Northeastern University
11:05
15m
Talk
Stream Processing with Dependency-Guided Synchronization
Main Conference
Konstantinos Kallas University of Pennsylvania, Filip Niksic Google, Caleb Stanford University of Pennsylvania, Rajeev Alur University of Pennsylvania
11:20 - 11:40
11:40 - 12:25
Session 2 (Main Conference)
Chair(s): Ang Li Pacific Northwest National Laboratory
11:40
15m
Talk
Parallel Block-Delayed Sequences
Main Conference
Sam Westrick Carnegie Mellon University, Mike Rainey Carnegie Mellon University, Daniel Anderson Carnegie Mellon University, Guy E. Blelloch Carnegie Mellon University, USA
11:55
15m
Talk
RTNN: Accelerating Neighbor Search Using Hardware Ray Tracing
Main Conference
Yuhao Zhu University of Rochester
12:10
15m
Talk
TileSpGEMM: A Tiled Algorithm for Parallel Sparse General Matrix-Matrix Multiplication on GPUs
Main Conference
Yuyao Niu China University of Petroleum-Beijing, Zhengyang Lu China University of Petroleum-Beijing, Haonan Ji China University of Petroleum-Beijing, Shuhui Song China University of Petroleum-Beijing, Zhou Jin China University of Petroleum-Beijing, Weifeng Liu China University of Petroleum-Beijing
12:25 - 12:50
12:50 - 13:35
Session 3 (Main Conference)
Chair(s): Bin Ren Pacific Northwest National Laboratories
12:50
15m
Talk
QGTC: Accelerating Quantized Graph Neural Networks via GPU Tensor Core
Main Conference
Yuke Wang UC Santa Barbara, Boyuan Feng University of California Santa Barbara, Yufei Ding University of California at Santa Barbara
13:05
15m
Talk
FasterMoE: Modeling and Optimizing Training of Large-Scale Dynamic Pre-Trained Models
Main Conference
Jiaao He Tsinghua University, China, Jidong Zhai Tsinghua University, Tiago Antunes Tsinghua University, Haojie Wang Tsinghua University, Fuwen Luo Tsinghua University, Shangfeng Shi Tsinghua University, Qin Li Tsinghua University
13:20
15m
Talk
Near-Optimal Sparse Allreduce for Distributed Deep Learning
Main Conference
Shigang Li ETH Zurich, Torsten Hoefler ETH Zurich
13:35 - 13:45
13:45 - 14:45
Business Meeting (Main Conference)
13:45
60m
Meeting
Business Meeting
Main Conference

Tue 5 Apr

Displayed time zone: Eastern Time (US & Canada)

10:00 - 10:20
10:20 - 11:20
Session 4 (Main Conference)
Chair(s): Kenjiro Taura The University of Tokyo
10:20
15m
Talk
BAGUALU: Targeting Brain Scale Pretrained Models with over 37 Million Cores
Main Conference
Zixuan Ma Tsinghua University, Jiaao He Tsinghua University, China, Jiezhong Qiu Tsinghua University and Beijing Academy of Artificial Intelligence, Huanqi Cao Tsinghua University, Yuanwei Wang Tsinghua University, Zhenbo Sun Tsinghua University, Liyan Zheng Tsinghua University, Haojie Wang Tsinghua University, Shizhi Tang Tsinghua University, Tianyu Zheng Zhejiang Lab, Junyang Lin DAMO Academy, Alibaba Group, Guanyu Feng Tsinghua University, Zeqiang Huang Zhejiang Lab, Jie Gao Zhejiang Lab, Aohan Zeng Tsinghua University and Beijing Academy of Artificial Intelligence, Jianwei Zhang DAMO Academy, Alibaba Group, Runxin Zhong Tsinghua University, Tianhui Shi Tsinghua University, Sha Liu Zhejiang Lab, Weimin Zheng Tsinghua University, Jie Tang Tsinghua University and Beijing Academy of Artificial Intelligence, Hongxia Yang DAMO Academy, Alibaba Group, Xin Liu Zhejiang Lab, Jidong Zhai Tsinghua University, Wenguang Chen Tsinghua University
10:35
15m
Talk
Extending the limit of molecular dynamics with ab initio accuracy to 10 billion atoms
Main Conference
Zhuoqiang Guo Institute of Computing Technology, Chinese Academy of Sciences, Denghui Lu HEDPS, CAPT, College of Engineering, Peking University, Yujin Yan Institute of Computing Technology, Chinese Academy of Sciences, Siyu Hu Institute of Computing Technology, Chinese Academy of Sciences, Rongrong Liu Institute of Computing Technology, Chinese Academy of Sciences, Guangming Tan Chinese Academy of Sciences(CAS), Ninghui Sun State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Wanrun Jiang AI for Science Institute, Lijun Liu Osaka University, Yixiao Chen Princeton University, Linfeng Zhang DP Technology, Mohan Chen HEDPS, CAPT, College of Engineering, Peking University, Han Wang Laboratory of Computational Physics, Institute of Applied Physics and Computational Mathematics, Weile Jia Institute of Computing Technology, Chinese Academy of Sciences
10:50
15m
Talk
LOTUS: Locality Optimizing Triangle Counting
Main Conference
Mohsen Koohi Esfahani Queen's University Belfast, Peter Kilpatrick Queen's University Belfast, Hans Vandierendonck Queen's University Belfast
11:05
15m
Talk
Scaling Graph Traversal to 281 Trillion Edges with 40 Million Cores
Main Conference
Huanqi Cao Tsinghua University, Yuanwei Wang Tsinghua University, Haojie Wang Tsinghua University, Heng Lin Peking University, Zixuan Ma Tsinghua University, Wanwang Yin National Supercomputing Center in Wuxi, Wenguang Chen Tsinghua University
11:20 - 11:40
11:40 - 12:25
Session 5 (Main Conference)
Chair(s): Wenwen Wang University of Georgia
11:40
15m
Talk
Vapro: Performance Variance Detection and Diagnosis for Production-Run Parallel Applications
Main Conference
Liyan Zheng Tsinghua University, Jidong Zhai Tsinghua University, Xiongchao Tang Sangfor Technologies Inc. and Tsinghua University, Haojie Wang Tsinghua University, Teng Yu Tsinghua University, Yuyang Jin Tsinghua University, Shuaiwen Leon Song University of Sydney, Wenguang Chen Tsinghua University
11:55
15m
Talk
Interference Relation-Guided SMT Solving for Multi-Threaded Program Verification
Main Conference
Hongyu Fan Tsinghua University, Weiting Liu Tsinghua University, Fei He Tsinghua University
12:10
15m
Talk
PerFlow: A Domain Specific Framework for Automatic Performance Analysis of Parallel Applications
Main Conference
Yuyang Jin Tsinghua University, Haojie Wang Tsinghua University, Runxin Zhong Tsinghua University, Chen Zhang Tsinghua University, Jidong Zhai Tsinghua University
12:25 - 12:50
12:50 - 13:35
Session 6 (Main Conference)
Chair(s): Stefan K. Muller Illinois Institute of Technology
12:50
15m
Talk
FliT: A Library for Simple and Efficient Persistent Algorithms
Main Conference
Yuanhao Wei Carnegie Mellon University, USA, Naama Ben-David VMware, Michal Friedman Technion, Israel, Guy E. Blelloch Carnegie Mellon University, USA, Erez Petrank Technion
13:05
15m
Talk
The Performance Power of Software Combining in Persistence
Main Conference
Panagiota Fatourou FORTH ICS and University of Crete, Greece, Nikolaos Kallimanis Institute of Computer Science, Foundation for Research & Technology - Hellas, Eleftherios Kosmas Department of Computer Science, University of Crete, Greece
13:20
15m
Talk
Understanding and Detecting Deep Memory Persistency Bugs in NVM Programs with DeepMC
Main Conference
Benjamin Reidys UIUC, Jian Huang University of Illinois at Urbana-Champaign
13:35 - 14:00
14:00 - 15:25
Poster Session (Main Conference)
Chair(s): Yan Gu UC Riverside
14:00
5m
Talk
POSTER: Automatic Synthesis of Parallel Unix Commands and Pipelines with KumQuat
Main Conference
Jiasi Shen Massachusetts Institute of Technology, Martin C. Rinard Massachusetts Institute of Technology, Nikos Vasilakis Massachusetts Institute of Technology
14:05
5m
Talk
POSTER: Towards OmpSs-2 and OpenACC Interoperation
Main Conference
Orestis Korakitis Barcelona Supercomputing Center (BSC), Simon Garcia De Gonzalo Barcelona Supercomputing Center (BSC), Nicolas Guidotti INESC-ID, Instituto Superior Técnico, University of Lisbon, João Barreto INESC-ID, José C. Monteiro INESC-ID, Instituto Superior Técnico, University of Lisbon, Antonio J. Peña Barcelona Supercomputing Center (BSC)
14:10
5m
Talk
POSTER: LB-HM: Load Balance-Aware Data Placement on Heterogeneous Memory for Task-Parallel HPC Applications
Main Conference
Zhen Xie University of California, Merced, Jie Liu University of California, Merced, Sam Ma College of William & Mary, Jiajia Li William & Mary, Pacific Northwest National Laboratory, Dong Li University of California, Merced
14:15
5m
Talk
POSTER: Hardening Selective Protection across Multiple Program Inputs for HPC Applications
Main Conference
Yafan Huang University of Iowa, Shengjian Guo Baidu USA, Sheng Di Argonne National Laboratory, Guanpeng Li University of Iowa, Franck Cappello Argonne National Laboratory
14:20
5m
Talk
POSTER: A Parallel Branch-and-Bound Algorithm with History-Based Domination
Main Conference
Taspon Gonggiatgul California State University, Sacramento, Ghassan Shobaki California State University, Sacramento, Pınar Muyan-Özçelik California State University, Sacramento
14:25
5m
Talk
POSTER: Remote OpenMP Offloading
Main Conference
Atmn Patel University of Waterloo, Johannes Doerfert Argonne National Laboratory
14:30
5m
Talk
POSTER: High Performance GPU Concurrent B+tree
Main Conference
Weihua Zhang Fudan University, Chuanlei Zhao Fudan University, Lu Peng Louisiana State University, Yuzhe Lin Fudan University, Fengzhe Zhang Fudan University, Jinhu Jiang Fudan University
14:35
5m
Talk
POSTER: The Problem-Based Benchmark Suite (PBBS), V2
Main Conference
Daniel Anderson Carnegie Mellon University, Guy E. Blelloch Carnegie Mellon University, USA, Laxman Dhulipala University of Maryland, College Park, Magdalen Dobson Carnegie Mellon University, Yihan Sun University of California, Riverside
14:40
5m
Talk
POSTER: An LLVM-based Open-Source Compiler for NVIDIA GPUs
Main Conference
Da Yan Hong Kong University of Science and Technology, Wei Wang Hong Kong University of Science and Technology, Xiaowen Chu Data Science and Analytics Thrust, HKUST(GZ)
14:45
5m
Talk
POSTER: ParGeo: A Library for Parallel Computational Geometry
Main Conference
Yiqiu Wang Massachusetts Institute of Technology, Shangdi Yu Massachusetts Institute of Technology, Laxman Dhulipala University of Maryland, College Park, Yan Gu UC Riverside, Julian Shun MIT
14:50
5m
Talk
POSTER: Parallel Algorithms for Masked Sparse Matrix-Matrix Products
Main Conference
Srđan Milaković Rice University, Oguz Selvitopi Lawrence Berkeley National Laboratory, Israt Nisa AWS AI, Zoran Budimlić Rice University, Aydin Buluc Lawrence Berkeley National Laboratory
14:55
5m
Talk
POSTER: Rethinking Graph Data Placement for Graph Neural Network Training on Multiple GPUs
Main Conference
Shihui Song The University of Iowa, Peng Jiang The University of Iowa
15:00
5m
Talk
POSTER: Optimizing Consistency for Partially Replicated Data Stores
Main Conference
Ivan Kuraj MIT CSAIL, USA, Armando Solar-Lezama Massachusetts Institute of Technology, Nadia Polikarpova University of California at San Diego
15:05
5m
Talk
POSTER: Optimizing Sparse Computations Jointly
Main Conference
Kazem Cheshmi University of Toronto, Michelle Strout University of Arizona, Maryam Mehri Dehnavi University of Toronto
15:10
5m
Talk
POSTER: wCQ: A Fast Wait-Free Queue with Bounded Memory Usage
Main Conference
Ruslan Nikolaev The Pennsylvania State University, Binoy Ravindran Virginia Tech
15:15
5m
Talk
POSTER: Automatic Differentiation of Parallel Loops with Formal Methods
Main Conference
Jan Hueckelheim Argonne National Laboratory, Laurent Hascoet Inria
15:20
5m
Talk
POSTER: A W-cycle Algorithm for Efficient Batched SVD on GPUs
Main Conference
Junmin Xiao Institute of Computing Technology of Chinese Academy of Sciences, Qing Xue Institute of Computing Technology, Chinese Academy of Sciences, Hui Ma Institute of Computing Technology, Chinese Academy of Sciences, Xiaoyang Zhang Institute of Computing Technology, Chinese Academy of Sciences, Guangming Tan Chinese Academy of Sciences(CAS)

Wed 6 Apr

Displayed time zone: Eastern Time (US & Canada)

10:00 - 10:20
10:20 - 11:20
Session 7 (Main Conference)
Chair(s): Vitaly Aksenov Inria & ITMO University
10:20
15m
Talk
Deadlock-Free Asynchronous Message Reordering in Rust with Multiparty Session Types
Main Conference
Zak Cutner Imperial College London, Nobuko Yoshida Imperial College London, Martin Vassor Imperial College London
10:35
15m
Talk
Detectable Recovery of Lock-Free Data Structures
Main Conference
Hagit Attiya Technion, Ohad Ben-Baruch Ben-Gurion University of the Negev, Panagiota Fatourou FORTH ICS and University of Crete, Greece, Danny Hendler BGU, Eleftherios Kosmas Department of Computer Science, University of Crete, Greece
10:50
15m
Talk
Lock-Free Locks Revisited
Main Conference
Naama Ben-David VMware, Guy E. Blelloch Carnegie Mellon University, USA, Yuanhao Wei Carnegie Mellon University, USA
11:05
15m
Talk
Asymmetry-aware Scalable Locking
Main Conference
Nian Liu Shanghai Jiao Tong University, Jinyu Gu Shanghai Jiao Tong University, Dahai Tang Hunan University, Kenli Li National Supercomputing Center in Changsha, Hunan University, Binyu Zang Shanghai Jiao Tong University, Haibo Chen Shanghai Jiao Tong University
11:20 - 12:00
12:00 - 13:15
Session 8 (Main Conference)
Chair(s): Naama Ben-David VMware
12:00
15m
Talk
Bundling Linked Data Structures for Linearizable Range Queries
Main Conference
Jacob Nelson Lehigh University, Ahmed Hassan Lehigh University, Roberto Palmieri Lehigh University
12:15
15m
Talk
Elimination (a,b)-trees with fast, durable updates
Main Conference
Anubhav Srivastava University of Waterloo, Trevor Brown University of Waterloo
12:30
15m
Talk
Jiffy: A Lock-free Skip List with Batch Updates and Snapshots
Main Conference
Tadeusz Kobus Poznan University of Technology, Maciej Kokociński Poznan University of Technology, Paweł T. Wojciechowski Poznan University of Technology
12:45
15m
Talk
Multi-Queues Can Be State-of-the-Art Priority Schedulers
Main Conference
Anastasiia Postnikova ITMO University, Nikita Koval JetBrains, Giorgi Nadiradze IST Austria, Dan Alistarh IST Austria
13:00
15m
Talk
PathCAS: An Efficient Middle Ground for Concurrent Search Data Structures
Main Conference
Trevor Brown University of Waterloo, William Sigouin University of Waterloo, Dan Alistarh IST Austria
13:15 - 13:25
Closing Remarks (Main Conference)

Unscheduled Events

Not scheduled
Break (Main Conference)

Accepted Papers

Asymmetry-aware Scalable Locking
Main Conference
BAGUALU: Targeting Brain Scale Pretrained Models with over 37 Million Cores
Main Conference
Bundling Linked Data Structures for Linearizable Range Queries
Main Conference
CASE: A Compiler-Assisted SchEduling Framework for Multi-GPU Systems
Main Conference
Deadlock-Free Asynchronous Message Reordering in Rust with Multiparty Session Types
Main Conference
Detectable Recovery of Lock-Free Data Structures
Main Conference
Dopia: Online Parallelism Management for Integrated CPU/GPU Architectures
Main Conference
Elimination (a,b)-trees with fast, durable updates
Main Conference
Extending the limit of molecular dynamics with ab initio accuracy to 10 billion atoms
Main Conference
FasterMoE: Modeling and Optimizing Training of Large-Scale Dynamic Pre-Trained Models
Main Conference
FliT: A Library for Simple and Efficient Persistent Algorithms
Main Conference
Interference Relation-Guided SMT Solving for Multi-Threaded Program Verification
Main Conference
Jiffy: A Lock-free Skip List with Batch Updates and Snapshots
Main Conference
Lock-Free Locks Revisited
Main Conference
LOTUS: Locality Optimizing Triangle Counting
Main Conference
Mashup: Making Serverless Computing Useful for HPC Workflows via Hybrid Execution
Main Conference
Multi-Queues Can Be State-of-the-Art Priority Schedulers
Main Conference
Near-Optimal Sparse Allreduce for Distributed Deep Learning
Main Conference
Parallel Block-Delayed Sequences
Main Conference
PathCAS: An Efficient Middle Ground for Concurrent Search Data Structures
Main Conference
PerFlow: A Domain Specific Framework for Automatic Performance Analysis of Parallel Applications
Main Conference
POSTER: An LLVM-based Open-Source Compiler for NVIDIA GPUs
Main Conference
POSTER: A Parallel Branch-and-Bound Algorithm with History-Based Domination
Main Conference
POSTER: Automatic Differentiation of Parallel Loops with Formal Methods
Main Conference
POSTER: Automatic Synthesis of Parallel Unix Commands and Pipelines with KumQuat
Main Conference
POSTER: A W-cycle Algorithm for Efficient Batched SVD on GPUs
Main Conference
POSTER: Hardening Selective Protection across Multiple Program Inputs for HPC Applications
Main Conference
POSTER: High Performance GPU Concurrent B+tree
Main Conference
POSTER: LB-HM: Load Balance-Aware Data Placement on Heterogeneous Memory for Task-Parallel HPC Applications
Main Conference
POSTER: Optimizing Consistency for Partially Replicated Data Stores
Main Conference
POSTER: Optimizing Sparse Computations Jointly
Main Conference
POSTER: Parallel Algorithms for Masked Sparse Matrix-Matrix Products
Main Conference
POSTER: ParGeo: A Library for Parallel Computational Geometry
Main Conference
POSTER: Remote OpenMP Offloading
Main Conference
POSTER: Rethinking Graph Data Placement for Graph Neural Network Training on Multiple GPUs
Main Conference
POSTER: The Problem-Based Benchmark Suite (PBBS), V2
Main Conference
POSTER: Towards OmpSs-2 and OpenACC Interoperation
Main Conference
POSTER: wCQ: A Fast Wait-Free Queue with Bounded Memory Usage
Main Conference
QGTC: Accelerating Quantized Graph Neural Networks via GPU Tensor Core
Main Conference
RTNN: Accelerating Neighbor Search Using Hardware Ray Tracing
Main Conference
Scaling Graph Traversal to 281 Trillion Edges with 40 Million Cores
Main Conference
Stream Processing with Dependency-Guided Synchronization
Main Conference
The Performance Power of Software Combining in Persistence
Main Conference
TileSpGEMM: A Tiled Algorithm for Parallel Sparse General Matrix-Matrix Multiplication on GPUs
Main Conference
Understanding and Detecting Deep Memory Persistency Bugs in NVM Programs with DeepMC
Main Conference
Vapro: Performance Variance Detection and Diagnosis for Production-Run Parallel Applications
Main Conference

Call for Papers

PPoPP 2022: The 27th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming

Seoul, S. Korea (co-located with CC-2022, HPCA-2022, and CGO-2022). Dates: Apr 2 – Apr 6, 2022. https://ppopp22.sigplan.org

Submission URL: https://ppopp22.hotcrp.com



Important dates:

  • Paper registration and abstract submission: August 9, 2021
  • Full paper submission: August 13, 2021
  • Early notification for papers not passing the first stage: October 11, 2021
  • Workshop/tutorial proposal submission deadline: October 15, 2021
  • Workshop/tutorial proposal acceptance notification: October 25, 2021
  • Author response period: November 1 – November 3, 2021
  • Author Notification: November 16, 2021
  • Artifact submission to AE committee: November 28, 2021
  • Artifact notification by AE committee: December 26, 2021
  • Final paper due: December 27, 2021

All deadlines are at midnight anywhere on earth (AoE) and are firm.



Scope:

PPoPP is the premier forum for leading work on all aspects of parallel programming, including theoretical foundations, techniques, languages, compilers, runtime systems, tools, and practical experience. In the context of the symposium, “parallel programming” encompasses work on concurrent and parallel systems (multicore, multi-threaded, heterogeneous, clustered, and distributed systems; grids; data centers; clouds; and large scale machines). Given the rise of parallel architectures in the consumer market (desktops, laptops, and mobile devices) and data centers, PPoPP is particularly interested in work that addresses new parallel workloads and issues that arise out of extreme-scale applications or cloud platforms, as well as techniques and tools that improve the productivity of parallel programming or work towards improved synergy with such emerging architectures.

Specific topics of interest include (but are not limited to):

  • Compilers and runtime systems for parallel and heterogeneous systems
  • Concurrent data structures
  • Development, analysis, or management tools
  • Fault tolerance for parallel systems
  • Formal analysis and verification
  • High-performance / scientific computing
  • Libraries for parallel computing
  • Middleware for parallel systems
  • Parallel algorithms
  • Parallel applications and frameworks
  • Parallel programming for deep memory hierarchies, including nonvolatile memory
  • Parallel programming languages
  • Parallel programming theory and models
  • Parallelism in non-scientific workloads: web, search, analytics, cloud, machine learning
  • Performance analysis, debugging, and optimization
  • Programming tools for parallel and heterogeneous systems
  • Software engineering for parallel programs
  • Software for heterogeneous architectures
  • Software productivity for parallel programming
  • Synchronization and concurrency control

Papers should report on original research relevant to parallel programming and contain enough background material to make them accessible to the entire parallel programming research community. Papers describing experience should indicate how they illustrate general principles or lead to new insights. PPoPP submissions will be evaluated based on their technical merit and accessibility. Submissions should clearly motivate the importance of the problem being addressed, compare it to the existing body of work on the topic, and explicitly and precisely state the paper's key contributions and results towards addressing the problem. Submissions should strive to be accessible both to a broad audience and to experts in the area. Authors of papers that do not pass the first round of reviewing will receive a notification so that they can start working as early as possible on revising their papers and resubmitting them to other conferences or journals.



Paper Submission:

Conference submission site: https://ppopp22.hotcrp.com.

All submissions must be made electronically through the conference website and include an abstract (100–400 words), author contact information, the full list of authors, and their affiliations. Full paper submissions must be in PDF format and printable on both A4 and US letter size paper.

All papers must be prepared in ACM Conference Format using the two-column acmart format: use the SIGPLAN proceedings template acmart-sigplanproc-template.tex for LaTeX, and interim-layout.docx for Word. You may also want to consult the official ACM information on the Master Article Template and related tools. Important note: the Word template (interim-layout.docx) on the ACM website uses 9pt font; you need to increase it to 10pt.

Papers should contain a maximum of 10 pages of text (in a typeface no smaller than 10 points) or figures, NOT INCLUDING references. There is no page limit for references. References must include the names of all authors (not "et al."). Appendices are not allowed, but the authors may submit supplementary material, such as proofs or source code; all supplementary material must be in PDF format. Looking at supplementary material is at the discretion of the reviewers.

Submission is double-blind, and authors will need to identify any potential conflicts of interest with PC and Extended Review Committee members, as defined here: http://www.sigplan.org/Resources/Policies/Review/ (ACM SIGPLAN policy).

PPoPP 2022 will employ a double-blind reviewing process. To facilitate this process, submissions should not reveal the identity of the authors in any way. Authors should leave out author names and affiliations from the body of their submission and the supplementary material. They should also ensure that any references to their own related work are in the third person (e.g., not "We build on our previous work …" but rather "We build on the work of …"). The purpose of this process is to help the PC and external reviewers come to an initial judgment about the paper without bias, not to make it impossible for them to discover the authors if they were to try. Nothing should be done in the name of anonymity that weakens the submission or makes reviewing the paper more difficult. In particular, important background references should not be omitted or anonymized. In addition, authors should feel free to disseminate their ideas or draft versions of their papers as they normally would. For instance, authors may post drafts of their papers on the web or give talks on their research ideas. Authors with further questions on double-blind reviewing are encouraged to contact the Program Co-Chairs by email.

Submissions should be in PDF and printable on both US Letter and A4 paper. Papers may be resubmitted to the submission site multiple times until the deadline, but the last version submitted before the deadline will be the version reviewed. Papers that exceed the length requirement, deviate from the expected format, or are submitted late will be rejected.

All submissions that are not accepted for regular presentations will be automatically considered for posters. Two-page summaries of accepted posters will be included in the conference proceedings.

To allow reproducibility, we encourage authors of accepted papers to submit their papers for Artifact Evaluation (AE). The AE process begins after the acceptance notification and is run by a separate committee whose task is to assess how the artifacts support the work described in the papers. Artifact evaluation is voluntary and will not affect paper acceptance but will be taken into consideration when selecting papers for awards. Papers that go through the AE process successfully will receive one or several of the ACM reproducibility badges printed on the papers themselves. More information will be posted on the AE website.



Workshop/Tutorial Proposals:

We are soliciting proposals for workshops and tutorials within the general scope of PPoPP. Members of the community are encouraged to submit proposals for workshops/tutorials that bring together researchers and practitioners to share their tools, technologies, and latest results, and to discuss work in progress and new directions. Workshops and tutorials will be held April 2-3, 2022.

Submit your proposals by October 15, 2021, to Bernhard Egger (bernhard@csap.snu.ac.kr).

Artifact Evaluation


Due Time: 11:59pm 11/28/2021 (AoE) (extended from the original deadline of 11/26/2021)

Call for Artifacts

Authors of accepted PPoPP 2022 papers/posters are invited to formally submit their supporting materials to the Artifact Evaluation (AE) process. The Artifact Evaluation Committee attempts to reproduce experiments (in broad strokes) and assess whether the submitted artifact supports the claims made in the paper/poster. Submission is voluntary and does not influence the final decision regarding paper/poster acceptance.

We invite every author of an accepted PPoPP paper/poster to consider submitting an artifact. It is good for the community as a whole. At PPoPP, we follow ACM's artifact reviewing and badging policy. ACM describes a research artifact as follows:

"By "artifact" we mean a digital object that was either created by the authors to be used as part of the study or generated by the experiment itself. For example, artifacts can be software systems, scripts used to run experiments, input datasets, raw data collected in the experiment, or scripts used to analyze results."

Submission Site

The submission site is located at https://ppopp22ae.hotcrp.com/.

Evaluation Process

Artifact evaluation is single-blind. Please take precautions (e.g., turning off analytics or logging) to help prevent accidentally learning the identities of reviewers. Each submitted artifact is evaluated by at least two members of the artifact evaluation committee.

During the process, authors and evaluators are allowed to communicate anonymously with each other to overcome technical difficulties. Ideally, we hope to see all submitted artifacts successfully pass artifact evaluation.

The evaluators are asked to evaluate the artifact based on the following criteria, which are defined by ACM.

ACM recommends awarding three different types of badges to communicate how the artifact has been evaluated. A single paper can receive up to three badges — one badge of each type.



The green Artifacts Available badge indicates that an artifact is publicly accessible in an archival repository. For this badge to be awarded, the paper does not have to be independently evaluated. ACM requires that a qualified archival repository is used, for example Zenodo, figshare, or Dryad. Personal webpages, GitHub repositories, and the like are not sufficient, as their content can be changed after the submission deadline.
The red Artifacts Evaluated badges indicate that a research artifact has successfully completed an independent audit: a reviewer has verified that the artifact is documented, complete, consistent, exercisable, and includes appropriate evidence of verification and validation. Two levels are distinguished:

The lighter red Artifacts Evaluated — Functional badge indicates a basic level of functionality. The darker red Artifacts Evaluated — Reusable badge indicates a higher-quality artifact that significantly exceeds minimal functionality, so that reuse and repurposing are facilitated.

Artifacts need not be made publicly available to be considered for one of these badges. However, they do need to be made available to reviewers.
The blue Results Validated badges indicate that the main results of the paper have been successfully obtained by an independent reviewer. Two levels are distinguished:

The darker blue Results Reproduced badge indicates that the main results of the paper have been successfully obtained using the provided artifact. The lighter blue Results Replicated badge indicates that the main results of the paper have been independently obtained without using the author-provided research artifact.

Artifacts need not be made publicly available to be considered for one of these badges. However, they do need to be made available to reviewers.



At PPoPP, the artifact evaluation committee awards each successfully evaluated paper one of the two red Artifacts Evaluated badges as well as the darker blue Results Reproduced badge. We do not award the lighter blue Results Replicated badge in this artifact evaluation process. The green Artifacts Available badge does not require the formal audit and is therefore awarded directly by the publisher, provided the authors supply a link to the deposited artifact.

Note that variation of empirical and numerical results is tolerated. In fact, it is often unavoidable in computer systems research; see "how to report and compare empirical results?" in the AE FAQ on ctuning.org.
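
For instance, a reviewer comparing a reproduced measurement against the number reported in a paper might accept it within some relative tolerance rather than demand an exact match. A minimal Python sketch of such a check (the numbers and the 10% tolerance are illustrative assumptions, not AE policy):

    import math

    # Reported in the (hypothetical) paper vs. obtained by the reviewer.
    reported_speedup = 3.8
    reproduced_speedup = 3.6

    # Accept up to 10% relative deviation; systems results rarely match exactly.
    TOLERANCE = 0.10

    if math.isclose(reported_speedup, reproduced_speedup, rel_tol=TOLERANCE):
        print("Result reproduced within tolerance.")
    else:
        print("Deviation exceeds tolerance; discuss with the authors.")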

Packaging and Instructions

Your submission should consist of three pieces:

  1. The submission version of your paper/poster.
  2. A README file (PDF or plaintext format) that explains your artifact (details below).
  3. The artifact itself, packaged as a single archive file. Artifacts less than 600MB can be directly uploaded to the HotCRP submission site; for larger archives, please provide a URL pointing to the artifact; the URL must protect the anonymity of the reviewers. Please use a widely available compressed archive format such as ZIP (.zip), tar and gzip (.tgz), or tar and bzip2 (.tbz2), and ensure the file has the suffix indicating its format. Those seeking the "Available" badge must additionally follow the instructions recommended by ACM on uploading the archive to a publicly available, immutable location.
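
To illustrate the packaging step, the following minimal Python sketch bundles an artifact directory into a gzipped tar archive and warns when it exceeds the direct-upload limit; the directory and archive names are placeholders, not names prescribed by the AE process:

    import os
    import tarfile

    ARTIFACT_DIR = "my-artifact"      # placeholder: your artifact directory
    ARCHIVE_NAME = "my-artifact.tgz"  # tar + gzip, one of the suggested formats
    UPLOAD_LIMIT = 600 * 1024 * 1024  # 600MB direct-upload limit on HotCRP

    # Pack the whole directory into a single compressed archive.
    with tarfile.open(ARCHIVE_NAME, "w:gz") as tar:
        tar.add(ARTIFACT_DIR, arcname=os.path.basename(ARTIFACT_DIR))

    size = os.path.getsize(ARCHIVE_NAME)
    if size > UPLOAD_LIMIT:
        print(f"{ARCHIVE_NAME} is {size / 2**20:.0f} MB; host it at an "
              "anonymity-preserving URL instead of uploading it directly.")
    else:
        print(f"{ARCHIVE_NAME} is {size / 2**20:.0f} MB; upload it to HotCRP.")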

The README file should consist of two parts:

  1. a Getting Started Guide and
  2. Step-by-Step Instructions for how you propose to evaluate your artifact (with appropriate connections to the relevant sections of your paper);

The Getting Started Guide should contain setup instructions (including, for example, a pointer to the VM player software, its version, passwords if needed, etc.) and basic testing of your artifact that you expect a reviewer to be able to complete in 30 minutes. Reviewers will follow all the steps in the guide during an initial kick-the-tires phase. The Getting Started Guide should be as simple as possible, and yet it should stress the key elements of your artifact. Anyone who has followed the Getting Started Guide should have no technical difficulties with the rest of your artifact. In this step, you may want to include a single high-level "runme.sh" script that automatically compiles your artifact, runs it (printing some interesting events to the console), collects data (e.g., performance data), and produces files such as graphs or charts similar to the ones used in your paper.
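
As a concrete illustration of such a top-level script, here is a minimal Python analogue of the suggested "runme.sh" (the build command, benchmark binary, and "time:" output format are hypothetical placeholders to be replaced with your artifact's own):

    import csv
    import subprocess

    # Placeholder build and run commands; replace with your artifact's own.
    BUILD_CMD = ["make", "-j4"]
    RUN_CMD = ["./benchmark", "--input", "data/small.txt"]

    def main():
        # 1. Compile the artifact, failing loudly if the build breaks.
        subprocess.run(BUILD_CMD, check=True)

        # 2. Run the experiment and echo interesting events to the console.
        result = subprocess.run(RUN_CMD, check=True, capture_output=True, text=True)
        print(result.stdout)

        # 3. Collect a performance number into a CSV for later plotting,
        #    assuming the benchmark prints a line of the form "time: <seconds>".
        for line in result.stdout.splitlines():
            if line.startswith("time:"):
                with open("results.csv", "a", newline="") as f:
                    csv.writer(f).writerow(["small.txt", line.split(":")[1].strip()])

    if __name__ == "__main__":
        main()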

The Step-by-Step Instructions explain how to reproduce any experiments or other activities that support the conclusions in your paper. Write them for readers who have a deep interest in your work and are studying it to improve it or compare against it. If your artifact runs for more than a few minutes, point this out and explain how to run it on smaller inputs.

Where appropriate, include descriptions of and links to files (included in the archive) that represent expected outputs (e.g., the speedup comparison chart expected to be generated by your tool on the given inputs); if there are warnings that are safe to ignore, explain which ones they are.

The artifact's documentation should include the following:

  • A list of claims from the paper supported by the artifact, and how/why.
  • A list of claims from the paper not supported by the artifact, and how/why. Examples: performance claims cannot be reproduced in a VM, the authors are not allowed to redistribute specific benchmarks, etc. Artifact reviewers can then center their evaluation on these specific claims.

If you are seeking a "reusable" badge, your documentation should include which aspects of the artifact you suggest the reviewer exercise in a different setting. For example, you may want to point out which script to modify so that the reviewer can run your tool on a benchmark not used in the paper, or suggest where to edit a script to change the number of CPU cores used for evaluation.
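
One concrete way to support this is to surface such knobs as command-line flags, so a reviewer edits an invocation rather than the code. A minimal Python sketch (the flag names and defaults are illustrative assumptions):

    import argparse
    import multiprocessing

    # Expose the knobs a reviewer is likely to change (core count, benchmark)
    # as command-line flags instead of burying them inside the script.
    parser = argparse.ArgumentParser(description="Run artifact experiments")
    parser.add_argument("--cores", type=int, default=multiprocessing.cpu_count(),
                        help="number of CPU cores to use")
    parser.add_argument("--benchmark", default="benchmarks/default.txt",
                        help="input benchmark; point this at a new file to reuse the tool")
    args = parser.parse_args()

    print(f"Running {args.benchmark} on {args.cores} cores")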

Submission Guidelines

1. Think carefully about which badges you want.

In your HotCRP submission, be upfront about which badge(s) you are seeking.

  1. If making your code public is all you want to do, seek only the "available" badge. The reviewers will not exercise the artifact for its functionality or validate the claims.
  2. If you do not plan on making the artifact available to the public, do not seek the "available" badge.
  3. If you only plan to reproduce the claims without making your artifact documented, consistent, complete, and exercisable, do not seek the "functional" badge.

2. Minimize the artifact setup overhead

A well-packaged artifact is easily usable by the reviewers, saving them time and frustration, and more clearly conveying the value of your work during evaluation. A great way to package an artifact is as a Docker image or in a virtual machine that runs "out of the box" with very little system-specific configuration. Using a virtual machine provides a way to make an easily reproducible environment — it is less susceptible to bit rot. It also helps the AEC have confidence that errors or other problems cannot cause harm to their machines.

Giving AE reviewers remote access to your machines with preinstalled (proprietary) software is also possible.

The submission of an artifact is not the same as making it public. AEC members will be instructed that they may not publicize any part of your artifact during or after completing evaluation, nor retain any part of it after evaluation. Thus, you are free to include models, data files, proprietary binaries, and similar items in your artifact.

After preparing your artifact, download and test it on at least one fresh machine where you did not prepare the artifact; this will help you fix missing dependencies, if any.

3. Think carefully about how your artifact will work on a reviewer's machine

The reviewers will not have access to any special hardware or software beyond what their university or research team provides for their own research needs. More tips on preparing a submission are available on the ctuning website.

If you have an unusual experimental setup that requires specific hardware (e.g., custom hardware, oscilloscopes for measurements, etc.) or proprietary software, please contact the artifact evaluation chairs before submission.

Discussion with Reviewers

Throughout the review period, reviews will be submitted to HotCRP and will be (approximately) continuously visible to authors. AEC reviewers will be able to continuously interact (anonymously) with authors for clarifications, system-specific patches, and other logistics to help ensure that the artifact can be evaluated. The goal of continuous interaction is to prevent rejecting artifacts for "wrong library version" types of problems.

For questions, please contact AE co-chairs, Milind Chabbi (milind@uber.com) and Karthik Murthy (karthik.s.m@gmail.com).
