Apply Profile-guided optimization to improve performance #9412

Open
FilipAndersson245 opened this issue Jun 26, 2021 · 14 comments
Labels
A-infra CI and workflow issues A-perf performance issues C-enhancement Category: enhancement

Comments

@FilipAndersson245

Profile-guided optimization (PGO) shows great promise in improving the speed of software. Last year, tests were made applying it to the Rust compiler itself, improving build times by ~15%.
Would it be feasible to do something similar for rust-analyzer to improve its speed?
As I understand it, the difficulties would be:

  • gathering runtime statistics
  • longer compilation times
@matklad
Member

matklad commented Jun 27, 2021

It would be feasible, but the tradeoff between the additional perf boost and the additional burden of maintaining a more complex build process is not worth it at this stage. It’s more impactful to spend the effort on making rust-analyzer more performant directly. The primary blocker for that work is understanding rust-analyzer’s heap structure: #9309

@lnicola
Member

lnicola commented Jun 28, 2021

I just tried this; it's an ~8% improvement in analysis-stats, mostly in type inference:

baseline:

Database loaded:     651.08ms, 278minstr
  crates: 36, mods: 715, decls: 14906, fns: 11073
Item Collection:     9.72s, 74ginstr
  exprs: 301267, ??ty: 515 (0%), ?ty: 582 (0%), !ty: 220
Inference:           15.01s, 110ginstr
Total:               24.73s, 185ginstr

pgo:

Database loaded:     638.59ms, 273minstr
  crates: 36, mods: 715, decls: 14906, fns: 11073
Item Collection:     9.43s, 73ginstr
  exprs: 301267, ??ty: 515 (0%), ?ty: 582 (0%), !ty: 220
Inference:           13.28s, 107ginstr
Total:               22.71s, 180ginstr

Steps for posterity, since they weren't obvious:

RUSTFLAGS="-C profile-generate" cargo build --release
target/release/rust-analyzer analysis-stats .
llvm-profdata merge *.profraw --output merged.profdata
RUSTFLAGS="-C profile-use=$PWD/merged.profdata" cargo build --release

Probably not worth the hassle for now.
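
For anyone reproducing this, here is a minimal sketch of the same four steps with an explicit profile directory, so the .profraw files do not land in the working directory. `PGO_DIR`, `DRY_RUN`, `run` and `pgo_build` are names made up for this sketch; the rustc flags and `llvm-profdata merge` are the ones used above, with the path form of `-C profile-generate`. It assumes `llvm-profdata` is on `PATH` and matches the LLVM version of the compiler.

```shell
# Sketch only: PGO build with an explicit, absolute profile directory.
# PGO_DIR, DRY_RUN, run and pgo_build are hypothetical names for this example.
PGO_DIR="${PGO_DIR:-$PWD/pgo-data}"

# Print each command; execute it only when DRY_RUN is not set to 1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

pgo_build() {
    # 1. Instrumented build: raw profiles will be written into $PGO_DIR.
    run env RUSTFLAGS="-Cprofile-generate=$PGO_DIR" cargo build --release || return 1
    # 2. Training run on the same workload as in the measurements above.
    run target/release/rust-analyzer analysis-stats . || return 1
    # 3. Merge every raw profile found in $PGO_DIR into a single file.
    run llvm-profdata merge -o "$PGO_DIR/merged.profdata" "$PGO_DIR" || return 1
    # 4. Optimized build using the merged profile.
    run env RUSTFLAGS="-Cprofile-use=$PGO_DIR/merged.profdata" cargo build --release
}
```

Running `DRY_RUN=1 pgo_build` only prints the four commands; calling `pgo_build` from the repository root executes them.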

@Veykril added the A-perf (performance issues) label on May 28, 2022
@FilipAndersson245
Author

It has been some time since this issue was opened, and RA is maturing quite nicely. Maybe we should look at this again?
I saw that Kobzol made https://github.com/Kobzol/cargo-pgo to simplify the process of getting a PGO binary.
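
A rough sketch of what that workflow looks like for PGO without BOLT, as I understand it from the cargo-pgo README. `cargo_pgo_build` is a made-up wrapper name, and the assumption that the instrumented binary ends up under `target/<host triple>/release` should be checked against your cargo-pgo version.

```shell
# Sketch only: cargo-pgo workflow (PGO, no BOLT).
# cargo_pgo_build is a hypothetical wrapper; run it from the repository root.
cargo_pgo_build() {
    local host
    # One-time setup: the cargo subcommand plus the LLVM tools it uses to merge profiles.
    cargo install cargo-pgo || return 1
    rustup component add llvm-tools-preview || return 1
    # Instrumented build. cargo-pgo builds for an explicit target triple
    # (assumption: the host triple), so the binary is under target/<triple>/release.
    cargo pgo build || return 1
    host="$(rustc -vV | sed -n 's/^host: //p')"
    # Training run to gather profiles.
    "target/$host/release/rust-analyzer" analysis-stats . || return 1
    # Rebuild using the gathered profiles.
    cargo pgo optimize
}
```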

@lnicola added the A-infra (CI and workflow issues) label on Nov 1, 2022
@lnicola
Member

lnicola commented Nov 1, 2022

cargo pgo bolt build --with-pgo appears to crash BOLT on my system (LLVM 14.0.6, BOLT 14.0.6), and BOLT doesn't seem to be packaged with LLVM 14 in the distros I've tried, making it a bit hard to acquire. I still don't think it's worth bothering with that. As for plain PGO, on self:

# baseline
Database loaded:     508.15ms, 256minstr (metadata 317.41ms, 23minstr; build 102.87ms, 9210kinstr)
  crates: 43, mods: 869, decls: 19387, fns: 14479
Item Collection:     10.34s, 89ginstr
  exprs: 411733, ??ty: 45 (0%), ?ty: 130 (0%), !ty: 1
Inference:           37.58s, 265ginstr
Total:               47.92s, 355ginstr

# pgo
Database loaded:     489.57ms, 248minstr (metadata 304.50ms, 21minstr; build 104.55ms, 8241kinstr)
  crates: 43, mods: 869, decls: 19387, fns: 14479
Item Collection:     8.99s, 76ginstr
  exprs: 411733, ??ty: 45 (0%), ?ty: 130 (0%), !ty: 1
Inference:           29.43s, 227ginstr
Total:               38.42s, 304ginstr

for a speed-up of 20%.

For sysroot:

# baseline
Database loaded:     570.04ms, 132minstr (metadata 250.14ms, 2907kinstr; build 280.13ms, 31minstr)
  crates: 35, mods: 1267, decls: 39546, fns: 26480
Item Collection:     4.28s, 46ginstr
  exprs: 421064, ??ty: 42223 (10%), ?ty: 18705 (4%), !ty: 265
Inference:           29.26s, 213ginstr
Total:               33.54s, 259ginstr

# pgo
Database loaded:     547.65ms, 126minstr (metadata 253.65ms, 2687kinstr; build 255.81ms, 26minstr)
  crates: 35, mods: 1267, decls: 39546, fns: 26480
Item Collection:     3.71s, 41ginstr
  exprs: 421064, ??ty: 42223 (10%), ?ty: 18705 (4%), !ty: 265
Inference:           23.42s, 177ginstr
Total:               27.13s, 219ginstr

(19% faster)

My steps in #9412 (comment) still work, but seem to yield a smaller improvement. I'm not sure if cargo pgo is doing some extra magic, or it's just measurement noise.

In any case, 15-20% is a decent improvement.
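
For reference, the percentages quoted in this thread are reductions in total wall-clock time, computed as (baseline - pgo) / baseline. A small sketch that reproduces them from the "Total" lines; `pct_reduction` is a made-up helper name.

```shell
# Percentage reduction from a baseline value to a PGO value, to one decimal place.
# pct_reduction is a hypothetical helper used only for this sketch.
pct_reduction() {
    awk -v base="$1" -v pgo="$2" 'BEGIN { printf "%.1f\n", (base - pgo) / base * 100 }'
}

# "Total" wall-clock times in seconds, copied from the measurements in this thread.
pct_reduction 24.73 22.71   # 2021 run      -> 8.2
pct_reduction 47.92 38.42   # 2022, self    -> 19.8
pct_reduction 33.54 27.13   # 2022, sysroot -> 19.1
```

The same helper works for the instruction counts, for example `pct_reduction 355 304` for the run on self.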

@FilipAndersson245
Author

Great! This looks awesome, 15-20% is a huge improvement.

@lnicola
Member

lnicola commented Nov 1, 2022

15-20% is a drop in the ocean compared to the algorithmic improvements (that nobody had time/managed/knew how to make) 😅.

@Veykril added the C-enhancement (Category: enhancement) label on Feb 9, 2023
@zamazan4ik

Just for the record: I've applied PGO (without BOLT) to Clangd (a project similar to rust-analyzer, but for C++) and got nice improvements as well: link.

@FilipAndersson245
Author

FilipAndersson245 commented Apr 15, 2024

@lnicola It has been quite some time since PGO was last evaluated. Is rust-analyzer now in a better state, where it may be suitable to distribute PGO-optimized builds?

@ofek

ofek commented Dec 1, 2024

Are PGO builds now distributed?

@ChayimFriedman2
Contributor

> Are PGO builds now distributed?

No. There are still bigger wins to be gained (which nobody is currently working on, AFAIK).

@ofek

ofek commented Dec 1, 2024

Are you referring to this? #9309

@ChayimFriedman2
Contributor

@ofek Performance work is tracked in #17491.

@berkus

berkus commented Dec 2, 2024

@ChayimFriedman2 JFYI, I use rust-analyzer every day, all day. Every single percent shaved off this tool's runtime is a huge win for my workday. I urge you not to ignore a 15% win that basically comes for free.

@FilipAndersson245
Author

I think that right now there is a bug in the Rust compiler with PGO + LTO that prevents both from being used together, so any exploration of the eventual performance would probably be best done after that is fixed:
rust-lang/rust#115344


8 participants