Reproducibility setup assessing whether the existing MPI correctness benchmarks, MPI-CorrBench and the MPI Bugs Initiative, mirror the MPI usage of real-world HPC codes. Our analysis combines static source text analysis and dynamic MPI call tracing to generate a database for comparison.
To generate the coverage data for each correctness benchmark, we need to (1) build our PMPI-based interceptor library to trace MPI usage, (2) modify MPI-CorrBench and (3) the MPI Bugs Initiative so that their tests are executed and analyzed with our tooling, producing the static & dynamic traces, and (4) use our Python-based MPI usage analysis tool to merge the resulting traces with the statically collected MPI usage database and run it again to generate the final coverage data w.r.t. the HPC data set.
The MPI call tracer is a shared library that needs to be preloaded before running the target MPI code; a conceptual sketch of the PMPI interposition pattern it relies on follows the list below.
- Call init.sh to clone & compile the code.
  - This requires Clang (v12), llvm-symbolizer, and Python 3 for the PMPI wrapper.
- Source bashrc to set the required environment (path) variables for later steps.
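The tracer follows the usual PMPI interposition pattern: each wrapper records the intercepted MPI call (and its arguments) and then forwards it to the corresponding PMPI_ entry point of the MPI library. Since the wrapper code is generated with Python, the sketch below shows a tiny, purely illustrative generator that emits such C wrappers; the function list, logging format, and generator structure are assumptions and do not reflect the actual mpi-arg-tracer implementation.

```python
# Purely illustrative sketch: emit C wrappers following the PMPI interposition
# pattern (log the intercepted call, then forward to the PMPI_ entry point).
# Function list and logging format are assumptions, not the real mpi-arg-tracer.
MPI_FUNCS = {
    "MPI_Send": "const void *buf, int count, MPI_Datatype dt, int dest, int tag, MPI_Comm comm",
    "MPI_Barrier": "MPI_Comm comm",
}

TEMPLATE = """int {name}({params}) {{
  fprintf(stderr, "[trace] {name}\\n"); /* a real tracer also records the arguments */
  return P{name}({args});               /* forward to the MPI library via PMPI */
}}
"""

def emit_wrappers() -> str:
    chunks = ["#include <mpi.h>", "#include <stdio.h>", ""]
    for name, params in MPI_FUNCS.items():
        args = ", ".join(p.split()[-1].lstrip("*") for p in params.split(", "))
        chunks.append(TEMPLATE.format(name=name, params=params, args=args))
    return "\n".join(chunks)

if __name__ == "__main__":
    # Compile the printed C code into a shared library and preload it before
    # launching the target MPI application.
    print(emit_wrappers())
```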
Requires mpi-arg-tracer & mpi-arg-usage (the latter is used for static collection in step 2.1).
- Source bashrc to set the required environment (path) variables.
- Call init.sh to clone & format the MPI test cases and install the MPI tracer tool.
  - format.py formats the code using clang-format.
- Run execute.sh to evaluate the benchmark with our MPI tracer and generate dynamic MPI usage data (a hypothetical sketch of aggregating such data follows below).
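To give an idea of what the dynamic usage data boils down to, the following hypothetical sketch aggregates a plain-text trace (assuming one intercepted call per line) into per-function call counts; the file name and trace format are assumptions, not the actual mpi-arg-tracer output.

```python
# Hypothetical sketch only: aggregate a dynamic trace into per-function call
# counts, assuming one "MPI_<Func> ..." line per intercepted call. The actual
# trace format produced by mpi-arg-tracer differs.
from collections import Counter

def aggregate_trace(trace_path: str) -> Counter:
    usage = Counter()
    with open(trace_path) as trace:
        for line in trace:
            if line.startswith("MPI_"):
                usage[line.split()[0]] += 1  # first token: name of the intercepted MPI call
    return usage

if __name__ == "__main__":
    print(aggregate_trace("corrbench-trace.txt").most_common(5))  # file name is illustrative
```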
This step generates the static MPI usage data (to be merged with the dynamic data). It relies only on the mpi-arg-usage tool & the benchmark's test sources.
- Call collect_static_data.sh to run our MPI usage analysis tool mpi-arg-usage on the test cases (a conceptual sketch of such a source scan follows below).
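Conceptually, the static collection scans the benchmark's test sources for MPI calls and records how they are used. The sketch below is not the actual mpi-arg-usage implementation (which also analyzes argument usage); it only counts call sites per MPI function, and the paths and file extensions are assumptions.

```python
# Conceptual sketch (not the actual mpi-arg-usage tool): statically count MPI
# call sites in the benchmark's test sources. The real tool also records how
# arguments are used, not just which functions appear.
import re
from collections import Counter
from pathlib import Path

MPI_CALL = re.compile(r"\b(MPI_[A-Z][A-Za-z0-9_]*)\s*\(")

def collect_static_usage(src_dir: str) -> Counter:
    usage = Counter()
    for path in Path(src_dir).rglob("*"):
        if path.is_file() and path.suffix.lower() in {".c", ".cpp", ".h", ".f90"}:
            usage.update(MPI_CALL.findall(path.read_text(errors="ignore")))
    return usage

if __name__ == "__main__":
    print(collect_static_usage("MPI-CorrBench").most_common(10))  # path is illustrative
```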
Requires mpi-arg-tracer & mpi-arg-usage (the latter is used for static collection in step 3.1).
- Source bashrc to set the required environment (path) variables.
- Call init.sh to clone the project and install the MPI tracer tool.
  - A patch file is applied to add our tracing tool to MBI.
- Run execute.sh to evaluate the benchmark with our MPI tracer and generate dynamic MPI usage data.
This step generates the static MPI usage data (to be merged with the dynamic data). It relies only on the mpi-arg-usage tool & the benchmark's test sources.
- Call collect_static_data.sh to run our MPI usage analysis tool mpi-arg-usage on the test cases.
Requires previous steps 1-3.
- Source bashrc to set the required environment (path) variables.
- Call init.sh to clone the project and install the Python package requirements with conda.
- Run execute.sh to evaluate the benchmarks' MPI usage w.r.t. the HPC data set. This
  - initializes the HPC data set by extracting the existing tar.gz (refer to mpi-arg-usage),
  - merges the static and dynamic data of the correctness benchmarks (see the sketch after this list), and
  - generates the final plots.
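As a rough illustration of the merging and scoring steps (the real Python analysis compares detailed MPI usage patterns, not just function names; all data below is made up):

```python
# Conceptual sketch (made-up data, simplified scoring): merge the static and
# dynamic MPI usage of a correctness benchmark and score how much of the MPI
# usage seen in the HPC data set it covers.
from collections import Counter

def merge_usage(static: Counter, dynamic: Counter) -> Counter:
    merged = Counter(static)
    merged.update(dynamic)  # combine call counts from both collection methods
    return merged

def coverage(benchmark: Counter, hpc: Counter) -> float:
    """Fraction of MPI functions used in the HPC data set that the benchmark also exercises."""
    hpc_funcs = set(hpc)
    return len(hpc_funcs & set(benchmark)) / len(hpc_funcs) if hpc_funcs else 0.0

if __name__ == "__main__":
    static = Counter({"MPI_Send": 4, "MPI_Recv": 4})
    dynamic = Counter({"MPI_Send": 128, "MPI_Allreduce": 2})
    hpc = Counter({"MPI_Send": 900, "MPI_Recv": 900, "MPI_Allreduce": 50, "MPI_Isend": 400})
    print(f"coverage: {coverage(merge_usage(static, dynamic), hpc):.0%}")  # -> 75%
```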
Our tool chain automatically generates the data sets and plots for publishing purposes; see the CI file. For each (successful) run, artifacts are attached, see All Workflows:
- corrbench-trace: Dynamic MPI trace of MPI-CorrBench.
- mbi-trace: Dynamic MPI trace of the MPI Bugs Initiative (with a 1s timeout, e.g., for deadlocks, to save time).
- plots: Visualization of the scored MPI usage patterns w.r.t. the HPC data set.