Add a section to the documentation explaining that PGO can help substantially (25%) and maybe offer some tips for users on how to use it? #9561
Labels: documentation (Improvements or additions to documentation)
This ticket tracks adding a profile-guided optimization (PGO) section to the documentation and linking it to #9507.
Many thanks to @zamazan4ik for this wonderful content.
Yes, it would be a great option. It requires almost no resources to maintain (write once and link to this discussion for the results). In this case, users who are interested in optimizing arrow-datafusion further will be able to use this information as an additional optimization opportunity. I have several examples of how such documentation can be written (they are for applications, but for a library it should look similar):
Unfortunately, this way is a bit trickier in practice. Pre-gathered PGO profiles have multiple issues, e.g. incompatibilities between different compiler versions and profile skew (when a PGO profile was gathered for an older version of the code). As time passes, pre-gathered PGO profiles become less and less effective, so some kind of regular PGO profile regeneration is required.
I could suggest another, similar way: integrate a way to build the library with PGO enabled into the build scripts (based on some workload like TPCH, Clickbench, any other target workload, or any combination of them; that is up for discussion). On the one hand, users will be able to build a PGO-optimized version of the library. On the other hand, you won't spend maintenance resources on keeping pre-gathered PGO profiles up to date (although that process can be simplified with CI).
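To make the idea of a PGO-enabled build script more concrete, here is a minimal sketch of the four usual steps (instrumented build, workload run, profile merge, optimized build), written as a small Python helper that only assembles the commands. It relies on rustc's `-Cprofile-generate` and `-Cprofile-use` flags and on `llvm-profdata merge`. The function names, the profile directory, and the `./run_tpch.sh` workload script are hypothetical placeholders, not anything that exists in the repository, and `llvm-profdata` is assumed to be on `PATH` with a version matching the compiler.

```python
import shlex


def pgo_build_steps(profile_dir, workload_cmd):
    """Return the PGO build as an ordered list of (extra_env, argv) steps.

    profile_dir: directory for raw and merged profiles (ideally absolute).
    workload_cmd: argv of a representative workload (hypothetical here).
    """
    merged = f"{profile_dir}/merged.profdata"
    return [
        # 1. Instrumented build: the binary writes .profraw files when run.
        ({"RUSTFLAGS": f"-Cprofile-generate={profile_dir}"},
         ["cargo", "build", "--release"]),
        # 2. Run the representative workload against the instrumented build.
        ({}, list(workload_cmd)),
        # 3. Merge the raw profiles into a single .profdata file.
        ({}, ["llvm-profdata", "merge", "-o", merged, profile_dir]),
        # 4. Optimized build that consumes the merged profile.
        ({"RUSTFLAGS": f"-Cprofile-use={merged}"},
         ["cargo", "build", "--release"]),
    ]


def render(steps):
    """Render each step as a shell-style command line for inspection."""
    lines = []
    for env, argv in steps:
        prefix = "".join(f"{k}={shlex.quote(v)} " for k, v in env.items())
        lines.append(prefix + shlex.join(argv))
    return lines


if __name__ == "__main__":
    # "./run_tpch.sh" is a placeholder for whatever workload is agreed on.
    for line in render(pgo_build_steps("/tmp/pgo-data", ["./run_tpch.sh"])):
        print(line)
```

Because the profile is regenerated on every such build, it always matches the current compiler and code, which sidesteps the version incompatibility and profile skew issues of pre-gathered profiles at the cost of a longer build.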
Some examples of PGO build integration into the build scripts:
configure script.
If you have some prebuilt versions of the library (e.g. a Python wheel), you can think about pre-optimizing these prebuilt binaries with PGO (based on TPCH, Clickbench, etc.). As an example, see Pydantic-core: GitHub PR.
Originally posted by @zamazan4ik in #9507 (reply in thread)