Profile Guided Optimization (PGO) benchmark report #24

zamazan4ik · 2024-11-14T01:35:35Z

Hi!

As I have done many times before, I tested Profile-Guided Optimization (PGO) to see how it affects the library's performance. For reference, results for other projects are available at https://github.com/zamazan4ik/awesome-pgo . Since PGO has helped many other libraries, I applied it to osmgraph to check whether a performance win (or loss) can be achieved. Here are my benchmark results.

This information may be interesting for anyone who wants to get more performance out of the library in their use cases.

Test environment

Fedora 41
Linux kernel 6.11.5
AMD Ryzen 9 5900X
48 GiB RAM
SSD Samsung 980 Pro 2 TiB
Compiler - rustc 1.82.0
osmgraph version: master branch, commit ee46e2ef9c762e182c7f177b41d4244351f29ae9
Turbo Boost disabled

Benchmark

For PGO I use the cargo-pgo tool. The Release results were obtained with taskset -c 0 cargo bench. The PGO training (instrumentation) phase was run with taskset -c 0 cargo pgo bench, and the PGO optimization phase with taskset -c 0 cargo pgo optimize bench.

taskset -c 0 is used to reduce the OS scheduler's influence on the results. All measurements are done on the same machine, with the same background "noise" (as much as I can guarantee).
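For anyone who wants to repeat the measurements, the three phases above can be sketched as a small script. This is only an illustration and not part of the osmgraph repository: the PGO_DRY_RUN switch and the run and pgo_phases helpers are hypothetical names introduced here, and the setup commands in the comments assume cargo-pgo is installed the usual way.

```shell
#!/bin/sh
# Sketch of the benchmark phases from this report.
# Assumed one-time setup (not run by this script):
#   cargo install cargo-pgo
#   rustup component add llvm-tools-preview

# PGO_DRY_RUN is a hypothetical switch for this example: with the default
# value 1 the commands are only printed; set it to 0 to actually run them.
PGO_DRY_RUN="${PGO_DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$PGO_DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Every phase is pinned to CPU core 0 with taskset, as in the report.
pgo_phases() {
    run taskset -c 0 cargo bench               # Release baseline
    run taskset -c 0 cargo pgo bench           # PGO training (instrumented build)
    run taskset -c 0 cargo pgo optimize bench  # PGO-optimized build
}

pgo_phases
```

Running the phases in this order matters: the optimize step relies on the profiles collected during the instrumented bench run.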

Results

I got the following results:

Release: https://gist.github.com/zamazan4ik/9421cfcb12fba9981d8d8bbccd27c5be
PGO optimized compared to Release: https://gist.github.com/zamazan4ik/6c3286fef32b8f6f2bf31c1fd93d31a4
(just for reference) PGO instrumented compared to Release: https://gist.github.com/zamazan4ik/88ea481a6e7118147e52649e9b31f9ab

According to the results (at least in this benchmark suite), PGO measurably improves the library's performance.

Further steps

I understand that the steps above can be time-consuming and hard to implement in practice. At the very least, the library's users can find this performance report and decide to enable PGO for their applications if they care about the library's performance in their workloads. Maybe a small note somewhere in the documentation (the README file?) will be enough to raise awareness about this work.

Please don't treat the issue like an actual issue - it's just a benchmark report (since Discussions are disabled for the repo).

Thank you.
