GraphNet is a large-scale dataset of deep learning computation graphs, built as a standard benchmark for tensor compiler optimization. It provides over 2.7K computation graphs extracted from state-of-the-art deep learning models spanning diverse tasks and ML frameworks. With standardized formats and rich metadata, GraphNet enables fair comparison and reproducible evaluation of the general optimization capabilities of tensor compilers, thereby supporting advanced research such as AI-for-Systems work on compilers.
- [2025-10-14] ✨ Our technical report is out: a detailed study of dataset construction and compiler benchmarking, introducing the novel performance metrics Speedup Score S(t) and Error-aware Speedup Score ES(t). 📘 GraphNet: A Large-Scale Computational Graph Dataset for Tensor Compiler Research
- [2025-8-20] 🚀 The second round of open contribution tasks was released. (completed ✅)
- [2025-7-30] 🚀 The first round of open contribution tasks was released. (completed ✅)
We evaluate two representative tensor compiler backends, CINN (PaddlePaddle) and TorchInductor (PyTorch), on GraphNet's NLP and CV subsets. The evaluation adopts two quantitative metrics proposed in the Technical Report:
- Speedup Score S(t) — evaluates compiler performance under varying numerical tolerance levels.
- Error-aware Speedup Score ES(t) — further accounts for runtime and compilation errors.
This section shows how to evaluate tensor compilers and reproduce benchmark results (for compiler users and developers), as well as how to contribute new computation graphs (for GraphNet contributors).
Step 1: Benchmark
Use graph_net.torch.test_compiler to benchmark GraphNet samples with specific batch and logging configurations:
# Set your benchmark directory
export GRAPH_NET_BENCHMARK_PATH=/home/yourname/graphnet_benchmark/
# Run benchmark
python -m graph_net.torch.test_compiler \
--model-path $GRAPH_NET_EXTRACT_WORKSPACE/model_name/ \
--compiler /custom/or/builtin/compiler/ \
--device /device/to/execute/ \
--warmup /times/to/warmup/ \
--trials /times/to/test/ \
> $GRAPH_NET_BENCHMARK_PATH/log.log 2>&1
# Note: If --compiler is omitted, PyTorch’s built-in compiler is used by default.
When run, graph_net.torch.test_compiler will:
- Run the original model in eager mode to record a baseline.
- Compile the model with the specified backend (e.g., CINN, TVM, Inductor, TensorRT, XLA, BladeDISC).
- Execute the compiled model and collect its runtime and outputs.
- Compute the speedup by comparing the compiled results against the baseline (if no execution failure occurs).
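The benchmarking flow described above can be illustrated with a minimal, framework-free sketch. It is not the implementation of graph_net.torch.test_compiler: the helper names (`time_callable`, `outputs_match`, `benchmark`), the tolerance defaults, and the record fields are all hypothetical, and plain Python functions stand in for the eager and compiled models.

```python
import math
import time


def time_callable(fn, args, warmup=3, trials=10):
    """Run fn(*args) `warmup` times untimed, then time `trials` runs.

    Returns (middle timing in seconds, output of the last run).
    """
    for _ in range(warmup):
        fn(*args)
    durations = []
    output = None
    for _ in range(trials):
        start = time.perf_counter()
        output = fn(*args)
        durations.append(time.perf_counter() - start)
    durations.sort()
    return durations[len(durations) // 2], output


def outputs_match(expected, actual, rtol=1e-5, atol=1e-8):
    """Element-wise comparison of two flat sequences within a tolerance."""
    return len(expected) == len(actual) and all(
        math.isclose(e, a, rel_tol=rtol, abs_tol=atol)
        for e, a in zip(expected, actual)
    )


def benchmark(baseline_fn, compiled_fn, args, warmup=3, trials=10):
    """Time the baseline, then the compiled stand-in, and compare them."""
    base_time, base_out = time_callable(baseline_fn, args, warmup, trials)
    try:
        comp_time, comp_out = time_callable(compiled_fn, args, warmup, trials)
    except Exception as exc:  # a failure yields no speedup
        return {"status": "failed", "error": repr(exc), "speedup": None}
    speedup = base_time / comp_time if comp_time > 0 else float("inf")
    return {
        "status": "ok",
        "correct": outputs_match(base_out, comp_out),
        "speedup": speedup,
    }


# Hypothetical stand-ins for an eager model, a compiled model, and a failure.
def eager_model(xs):
    return [x * x + 1.0 for x in xs]


def compiled_model(xs):
    return list(map(lambda x: x * x + 1.0, xs))


def broken_model(xs):
    raise RuntimeError("compilation failed")


inputs = ([float(i) for i in range(1000)],)
ok_record = benchmark(eager_model, compiled_model, inputs)
failed_record = benchmark(eager_model, broken_model, inputs)
print(ok_record["status"], ok_record["correct"])
print(failed_record["status"], failed_record["error"])
```

The `warmup` and `trials` arguments play the same role as the `--warmup` and `--trials` flags: warmup runs are discarded so that one-time costs do not distort the timed runs.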
Step 2: Generate JSON Record
Extract runtime, correctness, and failure information from benchmark logs:
python -m graph_net.log2json \
--log-file $GRAPH_NET_BENCHMARK_PATH/log.log \
--output-dir $GRAPH_NET_BENCHMARK_PATH/JSON_results/
Step 3: Analysis
Use the three scripts graph_net.plot_St, graph_net.plot_ESt, and graph_net.plot_violin to generate the St plot, ESt plot, and violin plot based on the JSON results.
python -m graph_net.plot_St \
--benchmark-path $GRAPH_NET_BENCHMARK_PATH/JSON_results/ \
--output-dir $GRAPH_NET_BENCHMARK_PATH \
--negative-speedup-penalty penalty/power/for/negative/speedup \
--fpdb base/penalty/for/severe/errors
python -m graph_net.plot_ESt \
--benchmark-path $GRAPH_NET_BENCHMARK_PATH/JSON_results/ \
--output-dir $GRAPH_NET_BENCHMARK_PATH \
--negative-speedup-penalty penalty/power/for/negative/speedup \
--fpdb base/penalty/for/severe/errors
# Note: If --negative-speedup-penalty is omitted, p=0 is used by default.
# If --fpdb is omitted, b=0.1 is used by default.
python -m graph_net.plot_violin \
--benchmark-path $GRAPH_NET_BENCHMARK_PATH/JSON_results/ \
--output-dir $GRAPH_NET_BENCHMARK_PATH
The scripts expect a directory layout of the form /benchmark_path/category_name/, and items on the x-axis are identified by the names of the sub-directories. After execution, several summary plots of the results by category (model tasks, libraries, ...) are exported to $GRAPH_NET_BENCHMARK_PATH.
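To make the expected layout concrete, here is a small sketch of how a directory of the form /benchmark_path/category_name/ could be scanned so that each sub-directory name becomes one x-axis label. This is not the plotting scripts' actual code: the function `collect_records`, the assumption that each category directory holds `*.json` files, and the `speedup` field are hypothetical choices made for illustration.

```python
import json
import tempfile
from pathlib import Path


def collect_records(benchmark_path):
    """Map each category sub-directory name to the JSON records inside it."""
    records = {}
    for category_dir in sorted(Path(benchmark_path).iterdir()):
        if not category_dir.is_dir():
            continue  # only sub-directories count as categories
        records[category_dir.name] = [
            json.loads(path.read_text(encoding="utf-8"))
            for path in sorted(category_dir.glob("*.json"))
        ]
    return records


# Build a throwaway layout with two hypothetical categories and scan it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for category, speedups in {"cv": [1.4, 0.9], "nlp": [1.1]}.items():
        (root / category).mkdir()
        for index, speedup in enumerate(speedups):
            sample = root / category / f"sample_{index}.json"
            sample.write_text(json.dumps({"speedup": speedup}), encoding="utf-8")
    (root / "README.txt").write_text("ignored", encoding="utf-8")
    records = collect_records(root)

x_axis_labels = list(records)
print(x_axis_labels)  # ['cv', 'nlp']
```

Grouping by sub-directory name means that reorganizing results into different categories only requires moving files, not changing any script arguments.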
Want to understand how GraphNet is built or contribute new samples? Check out the Construction Guide for details on the extraction and validation workflow.
- Scale GraphNet to 10K+ graphs.
- Annotate GraphNet samples with more granular sub-categories.
- Extract samples from multi-GPU scenarios to support benchmarking and optimization for large-scale, distributed computing.
- Enable splitting full graphs into independently optimized subgraphs and operator sequences.
Vision: GraphNet aims to lay the foundation for AI for Compiler by enabling large-scale, systematic evaluation of tensor compiler optimizations, and providing a dataset for models to learn and transfer optimization strategies.
You can join our community via the following group chats. You are welcome to ask any questions about using and building GraphNet.
Channel is also available.
GraphNet is released under the MIT License.
If you find this project helpful, please cite:
@article{li2025graphnet,
title = {GraphNet: A Large-Scale Computational Graph Dataset for Tensor Compiler Research},
author = {Xinqi Li and Yiqun Liu and Shan Jiang and Enrong Zheng and Huaijin Zheng and Wenhao Dai and Haodong Deng and Dianhai Yu and Yanjun Ma},
year = {2025},
url = {https://github.com/PaddlePaddle/GraphNet/blob/develop/GraphNet_technical_report.pdf}
}