


PPoPP 2024: Edinburgh, UK
- Michel Steuwer, I-Ting Angelina Lee, Milind Chabbi: Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, PPoPP 2024, Edinburgh, United Kingdom, March 2-6, 2024. ACM 2024
Keynote
- Nir Shavit: Sparsity in Deep Neural Nets (Keynote). 1
Synchronization and Concurrency Control I
- Pedro Ramalhete, Andreia Correia: Scaling Up Transactions with Slower Clocks. 2-16
- Jonggyu Park, Young Ik Eom: Locks as a Resource: Fairly Scheduling Lock Occupation with CFL. 17-29
- Daewoo Kim, Trevor Brown, Ajay Singh: Are Your Epochs Too Epic? Batch Free Can Be Harmful. 30-41
Compilers and Runtimes for Parallel Systems
- Jiangsu Du, Jinhui Wei, Jiazhi Jiang, Shenggan Cheng, Dan Huang, Zhiguang Chen, Yutong Lu: Liger: Interleaving Intra- and Inter-Operator Parallelism for Distributed Large Model Inference. 42-54
- Jinchen Xu, Guanghui Song, Bei Zhou, Fei Li, Jiangwei Hao, Jie Zhao: A Holistic Approach to Automatic Mixed-Precision Code Generation and Tuning for Affine Programs. 55-67
- Stefan K. Muller: Language-Agnostic Static Deadlock Detection for Futures. 68-79
- Akshay Bhosale, Rudolf Eigenmann: Recurrence Analysis for Automatic Parallelization of Subscripted Subscripts. 80-93
High Performance Computing
- Kasra Jamshidi, Keval Vora: OsirisBFT: Say No to Task Replication for Scalable Byzantine Fault Tolerant Analytics. 94-108
- Haozhong Qiu, Chuanfu Xu, Jianbin Fang, Liang Deng, Jian Zhang, Qingsong Wang, Yue Ding, Zhe Dai, Yonggang Che, Shizhao Chen, Jie Liu: Towards Scalable Unstructured Mesh Computations on Shared Memory Many-Cores. 109-119
- Jiabin Xie, Guangnan Feng, Han Huang, Junxuan Feng, Zhiguang Chen, Yutong Lu: Extreme-scale Direct Numerical Simulation of Incompressible Turbulence on the Heterogeneous Many-core System. 120-132
- James Psota, Armando Solar-Lezama: Pure: Evolving Message Passing To Better Leverage Shared Memory Within Nodes. 133-146
Graph Processing
- Sungwoo Park, Seyeon Oh, Min-Soo Kim: INFINEL: An efficient GPU-based processing method for unpredictable large output graph queries. 147-159
- Xinbiao Gan, Guang Wu, Shenghao Qiu, Feng Xiong, Jiaqi Si, Jianbin Fang, Dezun Dong, Chunye Gong, Tiejun Li, Zheng Wang: GraphCube: Interconnection Hierarchy-aware Graph Processing. 160-174
- Zhiheng Lin, Ke Meng, Chaoyang Shui, Kewei Zhang, Junmin Xiao, Guangming Tan: Exploiting Fine-Grained Redundancy in Set-Centric Graph Pattern Mining. 175-187
Synchronization and Concurrency Control II
- Vitaly Aksenov, Nikita Koval, Petr Kuznetsov, Anton Paramonov: Memory Bounds for Concurrent Bounded Queues. 188-199
- Guy E. Blelloch, Yuanhao Wei: VERLIB: Concurrent Versioned Pointers. 200-214
- Mohammad Khalaji, Trevor Brown, Khuzaima Daudjee, Vitaly Aksenov: Practical Hardware Transactional vEB Trees. 215-228
ML Workloads
- Xiaoyan Liu, Xuegui Zheng, Hailong Yang, Zhongzhi Luan, Depei Qian: Tetris: Accelerating Sparse Convolution by Exploiting Memory Reuse on GPU. 229-242
- Ismet Dagli, Mehmet E. Belviranli: Shared Memory-contention-aware Concurrent DNN Execution for Diversely Heterogeneous System-on-Chips. 243-256
- Siyu Hu, Tong Zhao, Qiuchen Sha, Enji Li, Xiangyu Meng, Liping Liu, Lin-Wang Wang, Guangming Tan, Weile Jia: Training one DeePMD Model in Minutes: a Step towards Online Learning. 257-269
Parallel Algorithms
- Magdalen Dobson Manohar, Zheqi Shen, Guy E. Blelloch, Laxman Dhulipala, Yan Gu, Harsha Vardhan Simhadri, Yihan Sun: ParlayANN: Scalable and Deterministic Parallel Graph-Based Approximate Nearest Neighbor Search Algorithms. 270-285
- Quanquan C. Liu, Julian Shun, Igor Zablotchi: Parallel k-Core Decomposition with Batched Updates and Asynchronous Reads. 286-300
- Xiaojun Dong, Laxman Dhulipala, Yan Gu, Yihan Sun: Parallel Integer Sort: Theory and Practice. 301-315
- Zafar Ahmad, Reilly Browne, Rezaul Chowdhury, Rathish Das, Yushen Huang, Yimin Zhu: Fast American Option Pricing using Nonlinear Stencils. 316-332
Optimizing for Memory
- Yuetao Chen, Kun Li, Yuhao Wang, Donglin Bai, Lei Wang, Lingxiao Ma, Liang Yuan, Yunquan Zhang, Ting Cao, Mao Yang: ConvStencil: Transform Stencil Computation to Matrix Multiplication on Tensor Cores. 333-347
- Brian Wheatman, Randal C. Burns, Aydin Buluç, Helen Xu: CPMA: An Efficient Batch-Parallel Compressed Set Without Pointers. 348-363
- Hunter McCoy, Prashant Pandey: Gallatin: A General-Purpose GPU Memory Manager. 364-376
Linear Algebra
- Meng Pang, Xiang Fei, Peng Qu, Youhui Zhang, Zhaolin Li: A Row Decomposition-based Approach for Sparse Matrix Multiplication on GPUs. 377-389
- Abhinav Jangda, Mohit Yadav: Fast Kronecker Matrix-Matrix Multiplication on GPUs. 390-403
- Lukas Gianinazzi, Alexandros Nikolaos Ziogas, Langwen Huang, Piotr Luczynski, Saleh Ashkboosh, Florian Scheidl, Armon Carigiet, Chio Ge, Nabil Abubaker, Maciej Besta, Tal Ben-Nun, Torsten Hoefler: Arrow Matrix Decomposition: A Novel Approach for Communication-Efficient Sparse Matrix Multiplication. 404-416
Applications
- Shenggan Cheng, Xuanlei Zhao, Guangyang Lu, Jiarui Fang, Tian Zheng, Ruidong Wu, Xiwen Zhang, Jian Peng, Yang You: FastFold: Optimizing AlphaFold Training and Inference on GPU Clusters. 417-430
- Seongyeon Park, Junguk Hong, Jaeyong Song, Hajin Kim, Youngsok Kim, Jinho Lee: AGAThA: Fast and Efficient GPU Acceleration of Guided Sequence Alignment for Long Read Mapping. 431-444
POSTER SESSION: Posters
- Zhuoran Ji, Zhaorui Zhang, Jiming Xu, Lei Ju: POSTER: Accelerating High-Precision Integer Multiplication used in Cryptosystems with GPUs. 445-447
- Zhichen Feng, Jialin Li, Yaqian Gao, Shaobo Tian, Huang Ye, Jian Zhang: POSTER: Enabling Extreme-Scale Phase Field Simulation with In-situ Feature Extraction. 448-450
- Lixian Ma, Haoruo Chen, En Shao, Leping Wang, Quan Chen, Guangming Tan: POSTER: FineCo: Fine-grained Heterogeneous Resource Management for Concurrent DNN Inferences. 451-453
- Jiajun Huang, Sheng Di, Xiaodong Yu, Yujia Zhai, Jinyang Liu, Yafan Huang, Ken Raffenetti, Hui Zhou, Kai Zhao, Zizhong Chen, Franck Cappello, Yanfei Guo, Rajeev Thakur: POSTER: Optimizing Collective Communications with Error-bounded Lossy Compression for GPU Clusters. 454-456
- Guofeng Feng, Weile Jia, Ninghui Sun, Guangming Tan, Jiajia Li: POSTER: Optimizing Sparse Tensor Contraction with Revisiting Hash Table Design. 457-459
- Juntao Zhao, Borui Wan, Chuan Wu, Yanghua Peng, Haibin Lin: POSTER: LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization. 460-462
- dePaul Miller, Henry F. Korth, Roberto Palmieri: POSTER: OCToPus: Semantic-aware Concurrency Control for Blockchain Transactions. 463-465
- Jiaao He, Shengqi Chen, Jidong Zhai: POSTER: Pattern-Aware Sparse Communication for Scalable Recommendation Model Training. 466-468
- Shunde Li, Junyu Gu, Jue Wang, Tiechui Yao, Zhiqiang Liang, Yumeng Shi, Shigang Li, Weiting Xi, Shushen Li, Chunbao Zhou, Yangang Wang, Xuebin Chi: POSTER: ParGNN: Efficient Training for Large-Scale Graph Neural Network on GPU Clusters. 469-471
- Yifei Li, Bole Zhou, Jiejing Zhang, Xuechao Wei, Yinghan Li, Yingda Chen: POSTER: RadiK: Scalable Radix Top-K Selection on GPUs. 472-474
- Almog Zur, Nachshon Cohen, Michal Friedman, Erez Petrank: POSTER: RELAX: Durable Data Structures with Swift Recovery. 475-476
- Yi Zong, Xinliang Wang, Haopeng Huang, Chensong Zhang, Xiaowen Xu, Jian Sun, Bowen Yan, Qin Wang, Sicong Li, Zhaohui Ding, Wei Xue: POSTER: StructMG: A Fast and Scalable Structured Multigrid. 478-480
