Lightning is a framework for data processing using GPUs on distributed platforms. The framework allows distributed multi-GPU execution of compute kernels written in CUDA in a way that is similar to programming a single GPU, without worrying about low-level details such as network communication, memory management, and data transfers. This enables scaling existing GPU kernels to much larger problem sizes, far beyond the memory capacity of a single GPU. Lightning distributes the work and data across GPUs and maximizes efficiency by overlapping scheduling, data movement, and computation when possible.
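To give a feel for what "distributing work across GPUs" means, here is a minimal sketch of block partitioning: a 1D problem of n elements is split into contiguous chunks, one per GPU, with sizes differing by at most one. This only illustrates the general idea. The function name `partition` and its signature are hypothetical and are not part of Lightning's API, and the framework's actual distribution policies may differ.

```rust
use std::ops::Range;

/// Split the index range `0..n` into `num_gpus` contiguous chunks whose
/// sizes differ by at most one element.
/// Hypothetical helper for illustration only; not Lightning's API.
fn partition(n: usize, num_gpus: usize) -> Vec<Range<usize>> {
    assert!(num_gpus > 0, "need at least one GPU");
    let base = n / num_gpus; // minimum chunk size
    let extra = n % num_gpus; // the first `extra` chunks get one more element
    let mut chunks = Vec::with_capacity(num_gpus);
    let mut start = 0;
    for gpu in 0..num_gpus {
        let len = base + if gpu < extra { 1 } else { 0 };
        chunks.push(start..start + len);
        start += len;
    }
    chunks
}

fn main() {
    // Assumed example sizes: 10 elements spread over 4 GPUs.
    for (gpu, chunk) in partition(10, 4).iter().enumerate() {
        println!("GPU {} processes elements {}..{}", gpu, chunk.start, chunk.end);
    }
}
```

In a real distributed setting each chunk would also need its data resident on the corresponding GPU, which is the bookkeeping the framework is described as handling for the user.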
The project is written in Rust and has been tested with Rust 1.56.
To build the project, use cargo, which is included with the Rust installation:
cargo build --release
Apache 2.0. See LICENSE.
S. Heldens, P. Hijma, B. van Werkhoven, J. Maassen, R.V. van Nieuwpoort, "Lightning: Scaling the GPU Programming Model Beyond a Single GPU", in IEEE IPDPS, 2022