Parla: A Python Orchestration System for Heterogeneous Architectures

Published in Proceedings of the IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2022

Recommended citation: Hochan Lee, William Ruys, Ian Henriksen, Arthur Peters, Yineng Yan, Sean Stephens, Bozhi You, Henrique Fingler, Martin Burtscher, Milos Gligoric, Karl Schulz, Keshav Pingali, Christopher J. Rossbach, Mattan Erez, and George Biros. (2022). "Parla: A Python Orchestration System for Heterogeneous Architectures." Proceedings of the IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, November 2022.

Download paper here | Git repository

Python’s ease of use and rich collection of numeric libraries make it an excellent choice for rapidly developing scientific applications. However, composing these libraries to take advantage of complex heterogeneous nodes is still difficult. To simplify writing multi-device code, we created Parla, a heterogeneous task-based programming framework that fully supports Python’s scientific programming stack. Parla’s API is based on Python decorators and allows users to wrap code in Parla tasks for parallel execution. Parla arrays enable automatic movement of data between devices. The Parla runtime handles resource-aware mapping, scheduling, and execution of tasks. Compared to other Python tasking systems, Parla is unique in its parallelization of tasks within a single process, its GPU context and resource-aware runtime, and its design around gradual adoption, which eases both migrating existing Python applications and integrating Parla into them. We show that Parla can achieve performance competitive with hand-optimized code while improving ease of development.
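To make the decorator-based tasking idea in the abstract concrete, here is a minimal sketch using only Python's standard library. It is not Parla's API: the `spawn` decorator, its `dependencies` parameter, the `_pool` executor, and `run_demo` are all hypothetical names chosen for illustration. It shows only the general pattern of wrapping code in tasks that run in parallel inside a single process; devices, Parla arrays, and resource-aware scheduling are not modelled.

```python
from concurrent.futures import ThreadPoolExecutor, wait

# Hypothetical stand-in for a task runtime; this is NOT Parla's API.
_pool = ThreadPoolExecutor(max_workers=4)


def spawn(dependencies=()):
    """Decorator that submits the wrapped function as a task immediately.

    The task body starts only after every future in `dependencies` has
    finished. The decorated name is rebound to the task's Future.
    """
    deps = list(dependencies)

    def decorator(fn):
        def body():
            wait(deps)  # block this worker until prerequisite tasks finish
            return fn()

        return _pool.submit(body)

    return decorator


def run_demo():
    data = [1, 2, 3, 4]

    @spawn()
    def left():
        # Independent task: sums the first half.
        return sum(data[:2])

    @spawn()
    def right():
        # Independent task: sums the second half.
        return sum(data[2:])

    @spawn(dependencies=[left, right])
    def total():
        # Runs after both halves are done and combines their results.
        return left.result() + right.result()

    return total.result()
```

Calling `run_demo()` launches the two independent tasks on worker threads and returns the combined sum once the dependent task completes. Because the decorator only wraps existing function bodies, the surrounding sequential code can stay unchanged, which is the kind of gradual adoption the abstract describes.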