Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Firefox compile time regression #43211

Closed
glandium opened this issue Jul 13, 2017 · 35 comments
Closed

Firefox compile time regression #43211

glandium opened this issue Jul 13, 2017 · 35 comments
Labels
C-bug Category: This is a bug. I-compiletime Issue: Problems and improvements with respect to compile times. P-medium Medium priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@glandium
Copy link
Contributor

glandium commented Jul 13, 2017

Preliminary note: this may or may not be related to #43141, but just in case, I file this separately instead of piling up there.

I'm sorry I don't have much more fine-grained data at the moment, but -Ztime-passes, when multiple things are built in parallel (while building Firefox), is not entirely helpful, because it doesn't tell to which crate the interleaving numbers belong to.

Rust build times on Firefox are already bad with 1.18. Some of the most time consuming crates have build times that look like the following:

  • 74.992975322 webrender
  • 99.960699075 geckoservo
  • 259.393332042 gkrust
  • 259.639402886 gkrust_gtest
  • 363.456949809 style

With the latest nightly as of writing (1.20.0-nightly (f85579d 2017-07-12)), those times become:

  • 85.71714646 webrender
  • 124.097890719 geckoservo
  • 384.619852505 gkrust_gtest
  • 396.106775997 gkrust
  • 455.281318295 style

I'm trying to narrow it down to older nightlies, but it started badly, because I got an ICE building the style crate on the last 1.18.0 nightly (2017-04-29)

Those are times for --release with lto enabled (I have a separate bug to file about lto, but I don't think that's relevant to this regression).

CC: @froydnj, @rillian

@glandium
Copy link
Contributor Author

Mmmmm with 1.19.0-nightly (75b0568 2017-05-15):

  • 60.430645418 webrender
  • 116.540179527 0 geckoservo
  • 370.565763687 0 gkrust_gtest
  • 372.559032193 0 gkrust
  • 433.088609909 0 style

With 1.19.0-nightly (d3abc80 2017-05-09):

  • 63.26174462 webrender
  • 117.30302075 geckoservo
  • 319.482715603 gkrust
  • 320.997348082 gkrust_gtest
  • 387.128239057 style

There would seem to be different regression points for different things, and early in the 1.19 development, at that (I tried 75b0568 because of range given in #43141 )

Also note that the order in which the mentioned crates are built is webrender, style, geckoservo, gkrust, gkrust_gtest (the last two are interchangeable depending on timing), and are all happening sequentially because, to simplify things, they are a dependency chain (except webrender, which may build in parallel of style, but it's not like it would matter much).

Also note that this long tail of builds takes so long that by the time geckoservo starts to build after style, there's only that happening on the build machine (everything C++ that doesn't depend on rust is done), and in fact, style has been the only thing happening on the build machine for a while too...

@sanxiyn sanxiyn added the I-compiletime Issue: Problems and improvements with respect to compile times. label Jul 13, 2017
@glandium
Copy link
Contributor Author

Note, I'm not excluding that as a counterpart, the resulting builds might have better code, but I haven't verified that (and I don't know which specific tests in the Firefox test suites would tell me that about the rust code)

@alexcrichton alexcrichton added the regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. label Jul 13, 2017
@alexcrichton
Copy link
Member

cc @rust-lang/compiler

@arielb1
Copy link
Contributor

arielb1 commented Jul 13, 2017

@glandium

I remember that LLVM 4.0 introduced a 10% regression in LLVM passes time, but that was #40123 (merged Apr 24).

@brson brson added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. I-nominated labels Jul 13, 2017
@brson
Copy link
Contributor

brson commented Jul 13, 2017

P-?

@michaelwoerister
Copy link
Member

It would be interesting to know if turning off LTO makes a difference here.

@nikomatsakis
Copy link
Contributor

More data points would also be useful -- how hard are they to gather?

@glandium
Copy link
Contributor Author

What kind of data points? Different versions? Different configurations?

@michaelwoerister
Copy link
Member

@glandium Different Rust versions, yes, and if possible, timings for specific passes.

@michaelwoerister
Copy link
Member

michaelwoerister commented Jul 17, 2017

Some more questions:

  • Can you give some instructions on how to build Firefox with Stylo enabled? Is it just ./mach build?
  • Does FF use the system provided Rust? I.e. can I just switch the Rust version via rustup and expect the FF build system to pick that up?
  • Where does the Rust code in FF live? (I tried changing files in the servo subdirectory but that did not seem to do anything)

@jdm
Copy link
Contributor

jdm commented Jul 17, 2017

  • To build with stylo, create a .mozconfig file in the root directory and add ac_add_options --enable-stylo.
  • The build system uses what ever rustc is in your PATH.
  • servo/components/{style,selectors} and servo/ports/geckolib are the main places.

@michaelwoerister
Copy link
Member

Thanks, @jdm! Do you know about the other questions?

@jdm
Copy link
Contributor

jdm commented Jul 17, 2017

I edited my previous comment to add answers.

@michaelwoerister
Copy link
Member

Thanks, @jdm! I'll try to reproduce those build times locally and collect some data.

Will the build system respect CARGO_BUILD_JOBS=1? I seem to remember that that didn't work when building servo, but it might help getting more readable -Ztime-passes output.

@jdm
Copy link
Contributor

jdm commented Jul 17, 2017

I do not know the answer to that, unfortunately.

@michaelwoerister
Copy link
Member

./mach -j1 seems to do the trick.

@michaelwoerister
Copy link
Member

So, I've gathered some data with the latest nightly (rustc 1.20.0-nightly (2652ce677 2017-07-17)):

                TOTAL        LLVM
--------------------------------------                
webrender         63s         53s
style            262s        196s
geckoservo       117s        106s
gkrust           241s        237s

This is with LTO enabled. The full output can be found here:
https://gist.github.com/michaelwoerister/c0e34a3f69221dcad6581ddedec79e3a

None of these crates spends a lot of time before LLVM. style is a bit of an exception here (with a surprising amount of time spent in name resolution) but overall (unless I messed up in my measurements somewhere) we can rule out that #43141 plays a role here.

@michaelwoerister
Copy link
Member

@glandium It's interesting that your timings of style and gkrust seem a lot more extreme than mine. Is it possible that this is a memory/swapping issue too?

@michaelwoerister
Copy link
Member

Some more data: Turning off LTO makes the compile time of the final crate practically disappear. This is to be expected as that crate is almost empty. The rest stays pretty much unchanged.

                TOTAL        LLVM
--------------------------------------                
webrender         59s         51s
style            251s        187s
geckoservo       118s        107s
gkrust           < 1s        < 1s

@michaelwoerister
Copy link
Member

Here are my timings with the nightly of the original post (1.20.0-nightly (f85579d 2017-07-12)):

                TOTAL        LLVM
--------------------------------------                
webrender         61s         51s
style            262s        198s
geckoservo       114s        103s
gkrust           239s        234s

They don't look quite as bad.

@glandium
Copy link
Contributor Author

Do you have numbers with 1.18?

My numbers come from Firefox CI, which happens on Amazon EC2, where the CPUs are Xeons that probably are slower than your CPU?

@michaelwoerister
Copy link
Member

Here are timings with 1.18 stable:

                TOTAL        LLVM
--------------------------------------                
webrender         43s         36s
style            178s        140s
geckoservo        87s         79s
gkrust           160s        156s

Things are quite a bit slower compared to the latest nightly. But it's not an entirely valid comparison, since nightlies have debug assertions (and maybe even LLVM assertions) enabled in the compiler, so they are expected to be slower. I'll try to find a nightly that roughly corresponds to 1.18 stable, then we'll see how much that accounts for.

@SimonSapin
Copy link
Contributor

Rust Nightly does enable LLVM assertions, whereas other channels do not. The impact on compile times is significant, especially when doing optimizations.

Rust CI does make builds available without LLVM assertions, but they’re not easy to use: #39754. (More discussion of this: https://internals.rust-lang.org/t/disabling-llvm-assertions-in-nightly-builds/5388)

@michaelwoerister
Copy link
Member

See here are timings from rustc 1.18.0-nightly (63c77214c 2017-04-24), i.e. a nightly version roughly equivalent to stable 1.18, but with all assertions enabled:

                TOTAL        LLVM
--------------------------------------                
webrender         60s         52s
style            262s        217s
geckoservo       120s        111s
gkrust           206s        201s

These times are roughly the same as the current nightly, except for gkrust, where the current nightly spends 31 seconds more in LLVM.

The full -Z time-passes output of everything above can be found at:
https://gist.github.com/michaelwoerister/c0e34a3f69221dcad6581ddedec79e3a

@glandium
Copy link
Contributor Author

So, I did some builds with 1.18, 1.19 (now released) and 1.20 beta (presumably without the LLVM asserts). I forced them to be -j1 both for make and cargo, avoiding things running in parallel and altering the times of each other (which, they appear to do, a lot).

As in the first comment, I'm going to list the times taken by the slowest crates:

                1.18     1.19     1.20
webrender      40.06s   39.61s   44.87s
clap           40.08s   53.27s   57.75s
bindgen        44.31s   44.47s   44.46s
geckoservo     77.14s   77.32s   84.46s
syntex_syntax  105.02s  103.19s  96.80s
gkrust         128.17s  132.96s  140.84s
gkrust_gtest   128.72s  134.81s  141.59s
style          187.41s  188.26s  206.54s

So while not as bad as in the first comment, it's not entirely good either. At least 1.19 is not too bad, except, somehow, for clap.

@glandium
Copy link
Contributor Author

https://bugzilla.mozilla.org/show_bug.cgi?id=1382743#c11 suggests something did go bad when upgrading from 1.18 to 1.19. -j1 might be hiding some issues.

@michaelwoerister
Copy link
Member

-j1 might be hiding some issues.

If the issues are related to the job server then that's quite likely.

@Mark-Simulacrum Mark-Simulacrum added the P-high High priority label Jul 27, 2017
@Mark-Simulacrum Mark-Simulacrum added the C-bug Category: This is a bug. label Jul 28, 2017
@arielb1
Copy link
Contributor

arielb1 commented Aug 9, 2017

1.20 is now beta

@arielb1 arielb1 added regression-from-stable-to-beta Performance or correctness regression from stable to beta. and removed regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. labels Aug 9, 2017
@arielb1
Copy link
Contributor

arielb1 commented Aug 10, 2017

There appears to be a 5% "unexplained" regression between 1.19 and 1.20. Allocators are the prime suspect - is anyone looking at it?

@alexcrichton
Copy link
Member

@arielb1 I haven't had much time personally to look into the allocator-related regressions yet. Where'd you find the 5% number from, though? Is there perhaps a suspect commit on perf.rust-lang.org?

@arielb1
Copy link
Contributor

arielb1 commented Aug 13, 2017

@alexcrichton

That's from @glandium's numbers

                1.18     1.19     1.20
webrender      40.06s   39.61s   44.87s
clap           40.08s   53.27s   57.75s
bindgen        44.31s   44.47s   44.46s
geckoservo     77.14s   77.32s   84.46s
syntex_syntax  105.02s  103.19s  96.80s
gkrust         128.17s  132.96s  140.84s
gkrust_gtest   128.72s  134.81s  141.59s
style          187.41s  188.26s  206.54s

style was affected by #43572, but for the other crates it looks like about 5%

@nikomatsakis
Copy link
Contributor

triage: P-medium

Discussed this in the @rust-lang/compiler meeting. We decided to downgrade this to P-medium, since the immediate problem has been resolved, though we will want to revisit (and perhaps upgrade priority again) in the future.

@rust-highfive rust-highfive added P-medium Medium priority and removed P-high High priority labels Aug 17, 2017
@alexcrichton alexcrichton modified the milestone: 1.20 Aug 23, 2017
@alexcrichton alexcrichton added regression-from-stable-to-stable Performance or correctness regression from one stable version to another. and removed regression-from-stable-to-beta Performance or correctness regression from stable to beta. labels Aug 28, 2017
@alexcrichton
Copy link
Member

1.20 will soon be stable

@michaelwoerister michaelwoerister removed their assignment Oct 31, 2017
@cyplo
Copy link
Contributor

cyplo commented Apr 15, 2018

Heya !
Some perf improvements landed in 2018, including incremental compilation, I wonder if they helped in this case ?

@Enselic
Copy link
Member

Enselic commented Sep 28, 2023

Triage: No one said it did not help, so let's assume it did! Closing.

@Enselic Enselic closed this as not planned Won't fix, can't repro, duplicate, stale Sep 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. I-compiletime Issue: Problems and improvements with respect to compile times. P-medium Medium priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests