Profile-Guided Optimization (PGO) benchmark report #6500

zamazan4ik · 2024-10-02T18:26:30Z

Hi!

I tested Profile-Guided Optimization (PGO) to see whether it improves the library's performance. For reference, results for other projects are available at https://github.com/zamazan4ik/awesome-pgo . Since PGO has helped many other projects (Apache DataFusion, for example), I applied it to arrow-rs to see whether a performance win (or loss) can be achieved. Here are my benchmark results.

These results may be useful to anyone who wants more performance from the library in their own use cases.

Test environment

Fedora 40
Linux kernel 6.10.11
AMD Ryzen 9 5900X
48 GiB RAM
SSD Samsung 980 Pro 2 TiB
Compiler: rustc 1.81.0
arrow-rs version: master branch, commit 3293a8c2f9062fca93bee2210d540a1d25155bf5
Turbo boost disabled

Benchmark

For PGO I use the cargo-pgo tool. The Release bench results were collected with the command taskset -c 0 cargo bench. The PGO training phase was run with taskset -c 0 cargo pgo bench, and the PGO optimization phase with taskset -c 0 cargo pgo optimize bench.

taskset -c 0 is used to reduce the OS scheduler's influence on the results. All measurements are done on the same machine, with the same background "noise" (as much as I can guarantee).
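
The three runs described above can be sketched as a small shell script. This is only an illustration of the sequence, not the exact script used for the report: the two setup commands (installing cargo-pgo and the llvm-tools-preview component it relies on), the run helper, and the CORE and DRY_RUN variables are assumptions added here.

```shell
#!/bin/sh
# Sketch of the benchmark sequence: Release baseline, PGO training, PGO-optimized run.
# Assumptions (not from the report): the setup commands, the helper names,
# and the CORE / DRY_RUN variables.

CORE="${CORE:-0}"        # CPU core that taskset pins the benchmarks to
DRY_RUN="${DRY_RUN:-0}"  # set to 1 to print the commands without running them

# Print a command, then execute it unless DRY_RUN=1.
run() {
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

pgo_bench_workflow() {
    # Setup (assumed): cargo-pgo itself and the LLVM tools it uses to merge profiles.
    run cargo install cargo-pgo &&
    run rustup component add llvm-tools-preview &&
    # 1. Release baseline.
    run taskset -c "$CORE" cargo bench &&
    # 2. PGO training: instrumented benchmarks write the profiles.
    run taskset -c "$CORE" cargo pgo bench &&
    # 3. PGO optimization: rebuild with the collected profiles and benchmark again.
    run taskset -c "$CORE" cargo pgo optimize bench
}
```

To use it, source the file from the root of an arrow-rs checkout and call pgo_bench_workflow; with DRY_RUN=1 it only prints the five commands, which is a cheap way to check the sequence before starting a long benchmark session.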

Results

I got the following results:

Release: https://gist.github.com/zamazan4ik/519c90524c798dafcf15a511cda6dac7
PGO optimized compared to Release: https://gist.github.com/zamazan4ik/02d168cde26938872e93d74c6db26128
(just for reference) PGO instrumented compared to Release: https://gist.github.com/zamazan4ik/79460174baf702f68c29c7e38711642a

The results show that performance improved measurably in many cases. However, some cases regressed. This is expected: the training dataset tries to cover all possible cases, and the compiler cannot always find a single set of optimization decisions that benefits every one of them.

It's still a good result, since it shows that in some scenarios users can achieve better performance for arrow-rs with PGO.

Further steps

At the very least, the library's users can find this performance report and decide to enable PGO for their applications if they care about the library's performance in their workloads. Maybe a small note somewhere in the documentation (the README file?) will be enough to raise awareness about this possible performance improvement.
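
For a user who wants to try this on their own application, the cargo-pgo flow could look roughly like the sketch below. It is an assumption-laden illustration, not a recipe from this report: my-app is a hypothetical binary name, the target triple default simply matches the x86-64 Linux machine used here, and the workload arguments are whatever best represents real usage.

```shell
#!/bin/sh
# Sketch: applying PGO to an application that depends on arrow-rs.
# Assumptions (not from the report): the binary name, the target triple default,
# and the helper names.

APP="${APP:-my-app}"                          # hypothetical binary name
TRIPLE="${TRIPLE:-x86_64-unknown-linux-gnu}"  # target triple of the build host
DRY_RUN="${DRY_RUN:-0}"                       # set to 1 to print the commands only

# Print a command, then execute it unless DRY_RUN=1.
run() {
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# Arguments are passed through to the instrumented binary as the training workload.
pgo_app_workflow() {
    # 1. Build an instrumented binary.
    run cargo pgo build &&
    # 2. Run it on a representative workload so it writes profile data.
    run "./target/$TRIPLE/release/$APP" "$@" &&
    # 3. Rebuild using the collected profiles.
    run cargo pgo optimize
}
```

The quality of the result depends on step 2: the closer the training workload is to production usage, the more likely the optimized build is to help rather than regress, so it is worth measuring both builds on the real workload before adopting PGO.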

Thank you.
