Lazily check incomplete lockfile to improve performance #5546

Merged
deivid-rodriguez merged 7 commits into master from lazily-check-incomplete-lockfile
Jul 18, 2022

@deivid-rodriguez
Contributor

What was the end-user or developer problem that led to this PR?

We have some code that checks whether the lockfile has incomplete specs for the current platform, i.e., cases where the current platform is locked but the lockfile is still missing some specs for it.

I think this check was introduced because of a Bundler bug that generated some incomplete lockfiles, but it should be a rare edge case. However, we check this every time bundler/setup is required, so all usages have to pay the cost of trying to gracefully handle this edge case.

What is your fix for the problem, implemented in this PR?

Checking this edge case involves actually resolving the locked specs for the current platform, which is something we need to do later anyways. So my approach is to assume this edge case does not happen, and when going ahead and materializing the actual set of specifications, check whether it actually happened. If that's the case, then go ahead and re-resolve.

This should reduce the number of times bundler/setup calls Bundler::SpecSet#for from 3 to 2.
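
The idea described above can be sketched roughly as follows. This is a toy model, not Bundler's actual code: `LazySetup`, `FAKE_RESOLVER`, and the Hash-based "lockfile" keyed by `[name, platform]` are hypothetical stand-ins chosen for illustration. It only shows the control flow: materialize first, and re-resolve only if materialization finds missing specs.

```ruby
# Hypothetical model: a "lockfile" is a Hash of [name, platform] => version.
# None of these names exist in Bundler; they only illustrate the control flow.
class LazySetup
  attr_reader :resolves

  def initialize(lockfile, resolver)
    @lockfile = lockfile
    @resolver = resolver # callable that returns a complete lockfile Hash
    @resolves = 0        # how many times the expensive re-resolve ran
  end

  # Assume the lockfile is complete. Only re-resolve if materializing
  # shows that some spec is missing for the requested platform.
  def materialize(names, platform)
    specs, missing = lookup(@lockfile, names, platform)
    return specs if missing.empty?

    @resolves += 1
    @lockfile = @resolver.call(names, platform)
    lookup(@lockfile, names, platform).first
  end

  private

  # Returns [found_specs, missing_names] in a single pass over the names.
  def lookup(lockfile, names, platform)
    found = []
    missing = []
    names.each do |name|
      version = lockfile[[name, platform]]
      if version
        found << "#{name}-#{version}-#{platform}"
      else
        missing << name
      end
    end
    [found, missing]
  end
end

# Stand-in resolver that pretends every gem resolves to version 1.0.0.
FAKE_RESOLVER = ->(names, platform) { names.to_h { |n| [[n, platform], "1.0.0"] } }

complete = LazySetup.new({ ["rake", "ruby"] => "13.0.6" }, FAKE_RESOLVER)
complete.materialize(["rake"], "ruby")
puts "complete lockfile, resolves: #{complete.resolves}"

incomplete = LazySetup.new({ ["rake", "ruby"] => "13.0.6" }, FAKE_RESOLVER)
incomplete.materialize(["rake", "rack"], "ruby")
puts "incomplete lockfile, resolves: #{incomplete.resolves}"
```

In this model the complete lockfile never touches the resolver, which is the happy path the PR aims to make cheaper; only the incomplete one pays for the extra resolve.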

The benefit on performance is unfortunately more moderate than I was expecting, about 1% on a fresh new rails application and about 2% on rails/rails repository Gemfile. But I would expect it to be better for bigger Gemfiles.

Make sure the following tasks are checked

@deivid-rodriguez deivid-rodriguez force-pushed the lazily-check-incomplete-lockfile branch from f10a6e8 to ee0c989 Compare May 17, 2022 22:56
@deivid-rodriguez
Contributor Author

I also tried this against the big Gemfile published by @technicalpickles at #5545, and I only get a 2% speed up. Quite disappointing but better than nothing I guess.

@technicalpickles
Contributor

technicalpickles commented May 17, 2022

I tested on our project's Gemfile and got an 8% improvement 😁

# master
❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):      1.850 s ±  0.024 s    [User: 1.605 s, System: 0.183 s]
  Range (min … max):    1.830 s …  1.914 s    10 runs

# lazily-check-incomplete-lockfile 
❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):      1.702 s ±  0.012 s    [User: 1.465 s, System: 0.181 s]
  Range (min … max):    1.686 s …  1.725 s    10 runs

I was curious how it stacks with the other PRs we have going, and it seems they end up making things worse 🙊

# lazily-check-incomplete-lockfile + memoize-dep-proxy-name + memoize-lazy-specification-hash + specset-for-hash-lookup-of-handled-deps
❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):      1.841 s ±  0.036 s    [User: 1.556 s, System: 0.211 s]
  Range (min … max):    1.809 s …  1.924 s    10 runs

edit: corrected observation and data in the last block after I realized I was running against the wrong branch

@deivid-rodriguez
Contributor Author

deivid-rodriguez commented May 18, 2022

@technicalpickles Good news on that 8% speed up. I don't understand, though, how the other patches could ever harm performance, and it's not what I observed.

#5537 needs to be adapted to this PR to use [name, platform] tuples as keys, and just "true" as values, like this:

diff --git a/bundler/lib/bundler/spec_set.rb b/bundler/lib/bundler/spec_set.rb
index f0f9d093a0..b05084bd4c 100644
--- a/bundler/lib/bundler/spec_set.rb
+++ b/bundler/lib/bundler/spec_set.rb
@@ -12,15 +12,15 @@ def initialize(specs)
     end
 
     def for(dependencies, check = false, platforms = [nil])
-      handled = ["bundler"].product(platforms)
+      handled = ["bundler"].product(platforms).map {|k| [k, true] }.to_h
       deps = dependencies.map(&:name).product(platforms)
       specs = []
 
       loop do
         break unless dep = deps.shift
-        next if handled.include?(dep)
+        next if handled.key?(dep)
 
-        handled << dep
+        handled[dep] = true
 
         specs_for_dep = spec_for_dependency(*dep)
         if specs_for_dep.any?

With that on top, I get the following results on Ruby 2.7.5 (this PR + using a hash to track handled items in SpecSet#for is 14% faster than just this PR and 18% faster than master):

➜  big-gemfile git:(main) ✗ ruby -v
ruby 2.7.5p203 (2021-11-24 revision f69aeb8314) [arm64-darwin21]

➜  big-gemfile git:(main) ✗ hyperfine 'BUNDLER_VERSION=2.4.0.lazyhash ruby -rbundler/setup -e1' 'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1' 'BUNDLER_VERSION=2.4.0.lazy ruby -rbundler/setup -e1'
Benchmark 1: BUNDLER_VERSION=2.4.0.lazyhash ruby -rbundler/setup -e1
  Time (mean ± σ):     161.6 ms ±   0.9 ms    [User: 125.0 ms, System: 29.8 ms]
  Range (min … max):   160.5 ms … 163.1 ms    18 runs
 
Benchmark 2: BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1
  Time (mean ± σ):     190.3 ms ±   0.8 ms    [User: 153.0 ms, System: 29.9 ms]
  Range (min … max):   188.8 ms … 191.4 ms    15 runs
 
Benchmark 3: BUNDLER_VERSION=2.4.0.lazy ruby -rbundler/setup -e1
  Time (mean ± σ):     185.0 ms ±   0.6 ms    [User: 148.1 ms, System: 29.9 ms]
  Range (min … max):   184.1 ms … 186.1 ms    15 runs
 
Summary
  'BUNDLER_VERSION=2.4.0.lazyhash ruby -rbundler/setup -e1' ran
    1.14 ± 0.01 times faster than 'BUNDLER_VERSION=2.4.0.lazy ruby -rbundler/setup -e1'
    1.18 ± 0.01 times faster than 'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1'

Something very interesting is that I get a bigger speed up over master on Ruby 3.1.2:

➜  big-gemfile git:(main) ✗ ruby -v
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [arm64-darwin21]

➜  big-gemfile git:(main) ✗ hyperfine 'BUNDLER_VERSION=2.4.0.lazyhash ruby -rbundler/setup -e1' 'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1' 'BUNDLER_VERSION=2.4.0.lazy ruby -rbundler/setup -e1' 
Benchmark 1: BUNDLER_VERSION=2.4.0.lazyhash ruby -rbundler/setup -e1
  Time (mean ± σ):     183.5 ms ±   1.0 ms    [User: 140.9 ms, System: 36.0 ms]
  Range (min … max):   181.3 ms … 185.3 ms    16 runs
 
Benchmark 2: BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1
  Time (mean ± σ):     224.8 ms ±   0.9 ms    [User: 181.5 ms, System: 36.0 ms]
  Range (min … max):   223.1 ms … 226.1 ms    13 runs
 
Benchmark 3: BUNDLER_VERSION=2.4.0.lazy ruby -rbundler/setup -e1
  Time (mean ± σ):     209.8 ms ±   0.9 ms    [User: 166.6 ms, System: 36.0 ms]
  Range (min … max):   208.5 ms … 212.0 ms    14 runs
 
Summary
  'BUNDLER_VERSION=2.4.0.lazyhash ruby -rbundler/setup -e1' ran
    1.14 ± 0.01 times faster than 'BUNDLER_VERSION=2.4.0.lazy ruby -rbundler/setup -e1'
    1.23 ± 0.01 times faster than 'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1'

So another interesting finding here is that Ruby 2.7 is significantly faster for us than Ruby 3.1.
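
For readers who want to see why the `handled` change in the diff earlier in this comment matters, here is a small standalone sketch. It is not Bundler's `SpecSet#for`; the workload (500 made-up gem names across two platforms, listed twice) and the `walk_with_array` / `walk_with_hash` helpers are assumptions for illustration. It contrasts `Array#include?`, which scans linearly, with `Hash#key?` on `[name, platform]` tuples.

```ruby
require "benchmark"

# Hypothetical workload: 500 gem names across two platforms, listed twice
# so that the "already handled" branch is actually exercised.
PLATFORMS = [nil, "arm64-darwin-21"].freeze
DEPS = ((1..500).map { |i| "gem#{i}" }.product(PLATFORMS) * 2).freeze

# Tracks handled [name, platform] tuples in an Array (linear scan per check).
def walk_with_array(deps, platforms)
  handled = ["bundler"].product(platforms)
  deps.each do |dep|
    next if handled.include?(dep)

    handled << dep
  end
  handled.size
end

# Tracks handled [name, platform] tuples as Hash keys (hashed lookup per check).
def walk_with_hash(deps, platforms)
  handled = ["bundler"].product(platforms).map { |k| [k, true] }.to_h
  deps.each do |dep|
    next if handled.key?(dep)

    handled[dep] = true
  end
  handled.size
end

array_time = Benchmark.realtime { walk_with_array(DEPS, PLATFORMS) }
hash_time  = Benchmark.realtime { walk_with_hash(DEPS, PLATFORMS) }

puts format("Array#include?: %.4fs", array_time)
puts format("Hash#key?:      %.4fs", hash_time)
```

Both helpers end up with the same set of handled tuples; only the cost of each membership check differs. Actual timings depend on the machine and Ruby version, so none are shown here.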

@deivid-rodriguez
Contributor Author

And for completeness, Ruby 3.0 behaves exactly the same as Ruby 3.1, so it seems like some change in the 2.7 -> 3.0 transition.

@deivid-rodriguez
Contributor Author

@technicalpickles Are you planning to dig further on why you observed different results when stacking PRs, or did you already find an explanation for it?

@technicalpickles
Contributor

technicalpickles commented May 29, 2022

It took me a bit to recover from RailsConf 😅 I ran it against our repo with master, #5537, #5546, and then those two combined. I only tested Ruby 2.7.5, since we aren't on 3.0 yet.

For big-gemfile:

❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 # master
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):     380.0 ms ±   8.8 ms    [User: 205.3 ms, System: 128.2 ms]
  Range (min … max):   361.2 ms … 393.5 ms    10 runs

❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 # lazily-check-incomplete-lockfile
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):     403.7 ms ±  50.5 ms    [User: 199.1 ms, System: 135.8 ms]
  Range (min … max):   368.0 ms … 521.3 ms    10 runs

❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 # specset-for-hash-lookup-of-handled-deps
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):     351.3 ms ±  13.0 ms    [User: 178.9 ms, System: 127.8 ms]
  Range (min … max):   338.7 ms … 380.5 ms    10 runs

❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 # lazily…-check-incomplete-lockfile+specset-for-hash-lookup-of-handled-deps
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):     373.0 ms ±  19.0 ms    [User: 177.5 ms, System: 136.5 ms]
  Range (min … max):   338.4 ms … 398.4 ms    10 runs

For our app:

❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 # master
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):      1.285 s ±  0.012 s    [User: 0.994 s, System: 0.239 s]
  Range (min … max):    1.265 s …  1.305 s    10 runs

❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 # lazily-check-incomplete-lockfile
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):      1.900 s ±  0.053 s    [User: 1.571 s, System: 0.235 s]
  Range (min … max):    1.807 s …  1.963 s    10 runs

❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 # specset-for-hash-lookup-of-handled-deps
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):     538.6 ms ±  10.1 ms    [User: 300.0 ms, System: 186.8 ms]
  Range (min … max):   526.7 ms … 553.7 ms    10 runs

❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 # lazily-check-incomplete-lockfile+specset-for-hash-lookup-of-handled-deps
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):     540.6 ms ±  11.2 ms    [User: 292.8 ms, System: 196.5 ms]
  Range (min … max):   532.1 ms … 568.1 ms    10 runs

Parsing those out to a table...

| Branch | big-gemfile | our app |
| --- | --- | --- |
| master | 380.0 ms ± 8.8 ms | 1.285 s ± 0.012 s |
| lazily-check-incomplete-lockfile | 403.7 ms ± 50.5 ms | 1.900 s ± 0.053 s |
| specset-for-hash-lookup-of-handled-deps | 351.3 ms ± 13.0 ms | 538.6 ms ± 10.1 ms |
| lazily-check-incomplete-lockfile+specset-for-hash-lookup-of-handled-deps | 373.0 ms ± 19.0 ms | 540.6 ms ± 11.2 ms |

If I were to draw a conclusion from that, lazily-check-incomplete-lockfile is slightly slower (6.2%) on the big-gemfile I created, but significantly slower on our app (47%). specset-for-hash-lookup-of-handled-deps was the fastest branch, with 7.5% improvement for big-gemfile, and 58% improvement for our app. Combining the approaches was very slightly slower than using specset-for-hash-lookup-of-handled-deps on its own.
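
The percentages above can be re-derived from the reported mean times with a few lines of Ruby. This is only a sketch of the arithmetic: `MEANS` and `percent_change` are names made up here, and the values are the hyperfine means from this comment converted to seconds.

```ruby
# Mean wall-clock times in seconds, copied from the hyperfine runs above.
MEANS = {
  "master" => { big_gemfile: 0.3800, app: 1.285 },
  "lazily-check-incomplete-lockfile" => { big_gemfile: 0.4037, app: 1.900 },
  "specset-for-hash-lookup-of-handled-deps" => { big_gemfile: 0.3513, app: 0.5386 },
  "lazily-check-incomplete-lockfile+specset-for-hash-lookup-of-handled-deps" =>
    { big_gemfile: 0.3730, app: 0.5406 },
}.freeze

# Percentage change in mean time relative to the baseline.
# Positive means slower than the baseline, negative means faster.
def percent_change(baseline, candidate)
  ((candidate - baseline) / baseline * 100).round(1)
end

baseline = MEANS.fetch("master")
MEANS.each do |branch, times|
  next if branch == "master"

  big = percent_change(baseline[:big_gemfile], times[:big_gemfile])
  app = percent_change(baseline[:app], times[:app])
  puts format("%-75s big-gemfile %+6.1f%%   our app %+6.1f%%", branch, big, app)
end
```

The results may differ from the figures quoted in the prose by a fraction of a percent, depending on how the original numbers were rounded. The standard deviations are ignored here, which matters for the noisier runs.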

I wanted to call out that I'm specifically testing this bundler/setup workflow, and not anything else, so I don't know what else this branch would impact.

PS: I tried to do a few rake install with custom versions like your previous benchmarks suggest, but I was getting inconsistent results... like, it seemed to only really respect one 2.4.0 version installed.

PPS: I accidentally hit command-enter and commented before finishing this.

@deivid-rodriguez
Contributor Author

deivid-rodriguez commented May 30, 2022

No worries @technicalpickles!

I'm super confused with your results, since now they seem to contradict your initial testing where you reported an 8% improvement, no?

To clarify how I test this: I edit lib/bundler/version.rb and give it a distinct prerelease-like name. Then I run bin/rake install. I do this for every branch I want to test.

Then I let hyperfine compare the different branches by setting BUNDLER_VERSION for each command. Like I reported before:

➜ hyperfine 'BUNDLER_VERSION=2.4.0.lazyhash ruby -rbundler/setup -e1' 'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1' 'BUNDLER_VERSION=2.4.0.lazy ruby -rbundler/setup -e1'

@technicalpickles
Contributor

Let me try clearing out my various dev gems, and try again.

I also realized that the run.rb may trigger actually resolving the dependencies because it displays a count of gems. I'll try again with the simpler one liner you used.

@deivid-rodriguez deivid-rodriguez force-pushed the lazily-check-incomplete-lockfile branch 2 times, most recently from 8906a1a to 31adc0c Compare June 13, 2022 18:35
@deivid-rodriguez
Contributor Author

@technicalpickles Did you find time to try this? I'm really curious to clarify this and I'd like to ship this improvement :)

@deivid-rodriguez
Contributor Author

Alright, something we can do while you try this out is go ahead and ship #5537, since that's the biggest speed up here and clearly can't ever make things worse. Once we have that, I can rebase this PR and we can evaluate its performance more easily. I'll work on that!

@deivid-rodriguez deivid-rodriguez force-pushed the lazily-check-incomplete-lockfile branch from 31adc0c to 203136f Compare June 19, 2022 17:50
@deivid-rodriguez deivid-rodriguez force-pushed the lazily-check-incomplete-lockfile branch from 203136f to bc04bb6 Compare June 23, 2022 09:26
@deivid-rodriguez
Contributor Author

This PR is now rebased and ready to be tried out :)

@deivid-rodriguez deivid-rodriguez force-pushed the lazily-check-incomplete-lockfile branch 2 times, most recently from 0ba2c9b to 3c375fa Compare July 1, 2022 08:14
@deivid-rodriguez
Contributor Author

These are the updated numbers on my computer after other speedups were released:

$ hyperfine 'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1' 'BUNDLER_VERSION=2.3.17 ruby -rbundler/setup -e1'  
Benchmark 1: BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1
  Time (mean ± σ):     145.6 ms ±   1.1 ms    [User: 122.3 ms, System: 21.9 ms]
  Range (min … max):   144.3 ms … 148.5 ms    20 runs
 
Benchmark 2: BUNDLER_VERSION=2.3.17 ruby -rbundler/setup -e1
  Time (mean ± σ):     149.1 ms ±   0.8 ms    [User: 126.2 ms, System: 21.5 ms]
  Range (min … max):   147.5 ms … 150.1 ms    19 runs
 
Summary
  'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1' ran
    1.02 ± 0.01 times faster than 'BUNDLER_VERSION=2.3.17 ruby -rbundler/setup -e1'

So only 2-3% speed up now.

But I'm working on a different approach where the speed up goes up to 5-6% and that might make this one unnecessary. I'll propose it separately.

@deivid-rodriguez
Contributor Author

And here is the alternative PR: #5695.

By the way @technicalpickles, I think I understood why you were getting different results. Are you running your tests on Linux by any chance? Since the Gemfile.lock in your repo only includes arm64-darwin-21, bundling on Linux will actually re-resolve from scratch for the new platform. That might explain your numbers, since this PR intends to speed up the "no re-resolve needed" happy path.
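
As a rough illustration of the platform point above, the sketch below reads the PLATFORMS section of a lockfile and checks whether a given platform string is listed. It is deliberately simplified: `locked_platforms`, `needs_re_resolve?`, and `SAMPLE_LOCKFILE` are hypothetical, the comparison is an exact string match, and Bundler's real platform matching (via `Gem::Platform`) is more nuanced than this.

```ruby
# A made-up lockfile that locks only one platform.
SAMPLE_LOCKFILE = <<~LOCK
  GEM
    remote: https://rubygems.org/
    specs:
      rake (13.0.6)

  PLATFORMS
    arm64-darwin-21

  DEPENDENCIES
    rake
LOCK

# Extracts the entries listed under the PLATFORMS header of lockfile text.
def locked_platforms(lockfile_text)
  in_section = false
  lockfile_text.each_line.with_object([]) do |line, platforms|
    if line.start_with?("PLATFORMS")
      in_section = true
    elsif in_section && line =~ /\A\s+(\S+)/
      platforms << Regexp.last_match(1)
    elsif in_section
      in_section = false # a blank or unindented line ends the section
    end
  end
end

# Simplified assumption: a platform absent from PLATFORMS forces a re-resolve.
def needs_re_resolve?(lockfile_text, current_platform)
  !locked_platforms(lockfile_text).include?(current_platform)
end

puts "arm64-darwin-21 needs re-resolve: #{needs_re_resolve?(SAMPLE_LOCKFILE, 'arm64-darwin-21')}"
puts "x86_64-linux needs re-resolve:    #{needs_re_resolve?(SAMPLE_LOCKFILE, 'x86_64-linux')}"
```

Under this simplified model, benchmarking on a platform that is not in the lockfile measures the re-resolve path rather than the happy path this PR targets.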

@deivid-rodriguez deivid-rodriguez force-pushed the lazily-check-incomplete-lockfile branch 2 times, most recently from 211e608 to 744bd5e Compare July 16, 2022 13:42
@deivid-rodriguez
Contributor Author

Good news, after rebasing this on top of other improvements, it still provides a 5-6% speed up on the big gemfile we've been using for testing! 🎉

$ hyperfine 'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1' 'BUNDLER_VERSION=2.3.18 ruby -rbundler/setup -e1'  
Benchmark 1: BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1
  Time (mean ± σ):     141.4 ms ±   1.5 ms    [User: 117.5 ms, System: 22.5 ms]
  Range (min … max):   139.2 ms … 146.1 ms    20 runs
 
Benchmark 2: BUNDLER_VERSION=2.3.18 ruby -rbundler/setup -e1
  Time (mean ± σ):     149.4 ms ±   0.7 ms    [User: 125.5 ms, System: 22.6 ms]
  Range (min … max):   148.3 ms … 150.8 ms    19 runs
 
Summary
  'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1' ran
    1.06 ± 0.01 times faster than 'BUNDLER_VERSION=2.3.18 ruby -rbundler/setup -e1'

I'm setting this as ready and I'll be merging in a couple of days if I don't get reviews.

@deivid-rodriguez deivid-rodriguez marked this pull request as ready for review July 16, 2022 13:44
So that it deals with [name, platform] tuples consistently instead of
dealing with `Gem::Dependency` or `Bundler::DepProxy` instances
inconsistently.

The resolve might be against locally available gems, or remote gems,
depending on the situation.

This is a very weird edge case, not all Bundler invocations should pay
its cost. Instead, assume it's not going to happen, detect the situation
at materialization time, and re-resolve.

Since the improvement to lazily check whether the lockfile has missing
platform specific dependencies, special handling on "bundler" inside
`Bundler::SpecSet#for` is no longer covered by specs, because even in
the case where missing platform specific dependencies are found, we
still show the "Found no changes" message initially.

This extra assertion makes the code covered again.
@deivid-rodriguez deivid-rodriguez force-pushed the lazily-check-incomplete-lockfile branch from 744bd5e to 8ce54de Compare July 18, 2022 10:08
@deivid-rodriguez deivid-rodriguez force-pushed the lazily-check-incomplete-lockfile branch from 8ce54de to 5281e51 Compare July 18, 2022 11:28
@deivid-rodriguez deivid-rodriguez merged commit f847e60 into master Jul 18, 2022
@deivid-rodriguez deivid-rodriguez deleted the lazily-check-incomplete-lockfile branch July 18, 2022 14:53
deivid-rodriguez added a commit that referenced this pull request Jul 27, 2022
Lazily check incomplete lockfile to improve performance

(cherry picked from commit f847e60)