Lazily check incomplete lockfile to improve performance by deivid-rodriguez · Pull Request #5546 · ruby/rubygems

deivid-rodriguez · 2022-05-17T21:08:35Z

What was the end-user or developer problem that led to this PR?

We have some code that checks whether the lockfile has incomplete specs for the current platform, i.e., cases where the current platform is locked but the lockfile is still missing some specs for it.

I think this check was introduced due to some bug in Bundler that generated incomplete lockfiles, but it should be a rare edge case. However, we check this every time bundler/setup is required, so all usages have to pay the cost of trying to gracefully handle this edge case.

What is your fix for the problem, implemented in this PR?

Checking this edge case involves actually resolving the locked specs for the current platform, which is something we need to do later anyways. So my approach is to assume this edge case does not happen, and when going ahead and materializing the actual set of specifications, check whether it actually happened. If that's the case, then go ahead and re-resolve.
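A minimal sketch of the "assume the happy path, detect at materialization time, then re-resolve" idea described above. This is a toy model and not Bundler's code: `LockedSpec`, `ToyLockfile`, `materialize`, `re_resolve`, and `specs_for` are hypothetical names invented for illustration.

```ruby
# Hypothetical, simplified model; none of these names are Bundler's real API.
LockedSpec = Struct.new(:name, :platform)

class ToyLockfile
  attr_reader :resolve_count

  def initialize(specs, required_names)
    @specs = specs
    @required_names = required_names
    @resolve_count = 0
  end

  # Materialize the specs for a platform and report which names are missing.
  def materialize(platform)
    found = @specs.select { |spec| spec.platform == platform }
    missing = @required_names - found.map(&:name)
    [found, missing]
  end

  # Stand-in for an expensive re-resolution that fills in the gaps.
  def re_resolve(platform, missing)
    @resolve_count += 1
    missing.each { |name| @specs << LockedSpec.new(name, platform) }
  end

  # Lazy strategy: no upfront completeness check. Only when materialization
  # reveals missing specs do we pay for a re-resolve.
  def specs_for(platform)
    found, missing = materialize(platform)
    return found if missing.empty?

    re_resolve(platform, missing)
    materialize(platform).first
  end
end

complete = ToyLockfile.new(
  [LockedSpec.new("rake", "ruby"), LockedSpec.new("rack", "ruby")],
  %w[rake rack]
)
complete.specs_for("ruby") # happy path: nothing missing, no re-resolve

incomplete = ToyLockfile.new([LockedSpec.new("rake", "ruby")], %w[rake rack])
incomplete.specs_for("ruby") # gap found while materializing: re-resolves once
```

In this toy model the common case does a single materialization pass, and only the incomplete lockfile pays for the extra work.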

This should reduce the number of times bundler/setup calls Bundler::SpecSet#for from 3 to 2.

The performance benefit is unfortunately more moderate than I was expecting: about 1% on a fresh new Rails application and about 2% on the rails/rails repository Gemfile. But I would expect it to be better for bigger Gemfiles.

Make sure the following tasks are checked

Describe the problem / feature
Write tests for features and bug fixes
Write code to solve the problem
Make sure you follow the current code style and write meaningful commit messages without tags

deivid-rodriguez · 2022-05-17T23:04:43Z

I also tried this against the big Gemfile published by @technicalpickles at #5545, and I only get a 2% speedup. Quite disappointing but better than nothing I guess.

technicalpickles · 2022-05-17T23:42:50Z

I tested on our project's Gemfile and got an 8% improvement 😁

# master
❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):      1.850 s ±  0.024 s    [User: 1.605 s, System: 0.183 s]
  Range (min … max):    1.830 s …  1.914 s    10 runs

# lazily-check-incomplete-lockfile 
❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):      1.702 s ±  0.012 s    [User: 1.465 s, System: 0.181 s]
  Range (min … max):    1.686 s …  1.725 s    10 runs

I was curious how it stacks with the other PRs we have going, and it seems they end up making things worse 🙊

# lazily-check-incomplete-lockfile + memoize-dep-proxy-name + memoize-lazy-specification-hash + specset-for-hash-lookup-of-handled-deps
❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):      1.841 s ±  0.036 s    [User: 1.556 s, System: 0.211 s]
  Range (min … max):    1.809 s …  1.924 s    10 runs

edit: corrected observation and data in the last block after I realized I was running against the wrong branch

deivid-rodriguez · 2022-05-18T07:20:38Z

@technicalpickles Good news that 8% speed up. I don't understand though how the other patches could ever harm performance, and it's not what I observed.

#5537 needs to be adapted to this PR to use [name, platform] tuples as keys, and just "true" as values, like this:

diff --git a/bundler/lib/bundler/spec_set.rb b/bundler/lib/bundler/spec_set.rb
index f0f9d093a0..b05084bd4c 100644
--- a/bundler/lib/bundler/spec_set.rb
+++ b/bundler/lib/bundler/spec_set.rb
@@ -12,15 +12,15 @@ def initialize(specs)
     end
 
     def for(dependencies, check = false, platforms = [nil])
-      handled = ["bundler"].product(platforms)
+      handled = ["bundler"].product(platforms).map {|k| [k, true] }.to_h
       deps = dependencies.map(&:name).product(platforms)
       specs = []
 
       loop do
         break unless dep = deps.shift
-        next if handled.include?(dep)
+        next if handled.key?(dep)
 
-        handled << dep
+        handled[dep] = true
 
         specs_for_dep = spec_for_dependency(*dep)
         if specs_for_dep.any?

With that on top, I get the following results on Ruby 2.7.5 (this PR + using a hash to track handled items in SpecSet#for is 14% faster than just this PR and 18% faster than master):

➜  big-gemfile git:(main) ✗ ruby -v
ruby 2.7.5p203 (2021-11-24 revision f69aeb8314) [arm64-darwin21]

➜  big-gemfile git:(main) ✗ hyperfine 'BUNDLER_VERSION=2.4.0.lazyhash ruby -rbundler/setup -e1' 'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1' 'BUNDLER_VERSION=2.4.0.lazy ruby -rbundler/setup -e1'
Benchmark 1: BUNDLER_VERSION=2.4.0.lazyhash ruby -rbundler/setup -e1
  Time (mean ± σ):     161.6 ms ±   0.9 ms    [User: 125.0 ms, System: 29.8 ms]
  Range (min … max):   160.5 ms … 163.1 ms    18 runs
 
Benchmark 2: BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1
  Time (mean ± σ):     190.3 ms ±   0.8 ms    [User: 153.0 ms, System: 29.9 ms]
  Range (min … max):   188.8 ms … 191.4 ms    15 runs
 
Benchmark 3: BUNDLER_VERSION=2.4.0.lazy ruby -rbundler/setup -e1
  Time (mean ± σ):     185.0 ms ±   0.6 ms    [User: 148.1 ms, System: 29.9 ms]
  Range (min … max):   184.1 ms … 186.1 ms    15 runs
 
Summary
  'BUNDLER_VERSION=2.4.0.lazyhash ruby -rbundler/setup -e1' ran
    1.14 ± 0.01 times faster than 'BUNDLER_VERSION=2.4.0.lazy ruby -rbundler/setup -e1'
    1.18 ± 0.01 times faster than 'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1'
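For reference, the hash-based bookkeeping from the diff above can be sketched in isolation roughly like this. It is a toy traversal and not Bundler's `SpecSet#for`: the `DEPS` graph and the `collect` method are hypothetical, and only the `handled` handling mirrors the patch.

```ruby
# Hypothetical dependency graph used only for this illustration.
DEPS = {
  "rails" => %w[activesupport actionpack],
  "actionpack" => %w[activesupport rack],
  "activesupport" => [],
  "rack" => [],
}.freeze

def collect(names, platforms = [nil])
  # [name, platform] tuples are the keys and `true` is the value, so the
  # "already handled?" check is a hash lookup instead of an Array#include? scan.
  handled = ["bundler"].product(platforms).map { |key| [key, true] }.to_h
  queue = names.product(platforms)
  result = []

  while (dep = queue.shift)
    next if handled.key?(dep)

    handled[dep] = true
    name, platform = dep
    result << dep
    # Enqueue transitive dependencies for the same platform.
    queue.concat(DEPS.fetch(name, []).product([platform]))
  end

  result
end

visited = collect(%w[rails bundler])
puts visited.map(&:first).inspect
```

`Array#include?` scans the whole list on every check, while `Hash#key?` is a constant-time lookup on average, which is presumably why the difference grows with the number of dependencies.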

Something very interesting is that I get a much bigger speedup on Ruby 3.1.2:

➜  big-gemfile git:(main) ✗ ruby -v
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [arm64-darwin21]

➜  big-gemfile git:(main) ✗ hyperfine 'BUNDLER_VERSION=2.4.0.lazyhash ruby -rbundler/setup -e1' 'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1' 'BUNDLER_VERSION=2.4.0.lazy ruby -rbundler/setup -e1' 
Benchmark 1: BUNDLER_VERSION=2.4.0.lazyhash ruby -rbundler/setup -e1
  Time (mean ± σ):     183.5 ms ±   1.0 ms    [User: 140.9 ms, System: 36.0 ms]
  Range (min … max):   181.3 ms … 185.3 ms    16 runs
 
Benchmark 2: BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1
  Time (mean ± σ):     224.8 ms ±   0.9 ms    [User: 181.5 ms, System: 36.0 ms]
  Range (min … max):   223.1 ms … 226.1 ms    13 runs
 
Benchmark 3: BUNDLER_VERSION=2.4.0.lazy ruby -rbundler/setup -e1
  Time (mean ± σ):     209.8 ms ±   0.9 ms    [User: 166.6 ms, System: 36.0 ms]
  Range (min … max):   208.5 ms … 212.0 ms    14 runs
 
Summary
  'BUNDLER_VERSION=2.4.0.lazyhash ruby -rbundler/setup -e1' ran
    1.14 ± 0.01 times faster than 'BUNDLER_VERSION=2.4.0.lazy ruby -rbundler/setup -e1'
    1.23 ± 0.01 times faster than 'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1'

So another interesting finding here is that Ruby 2.7 is significantly faster for us than Ruby 3.1.

deivid-rodriguez · 2022-05-18T09:21:56Z

And for completeness, Ruby 3.0 behaves exactly the same as Ruby 3.1, so it seems like some change in the 2.7 -> 3.0 transition.

deivid-rodriguez · 2022-05-24T09:05:58Z

@technicalpickles Are you planning to dig further on why you observed different results when stacking PRs, or did you already find an explanation for it?

technicalpickles · 2022-05-29T22:41:45Z

It took me a bit to recover from RailsConf 😅 I ran it against our repo with master, #5537, #5546, and then those two combined. I only tested Ruby 2.7.5, since we aren't on 3.0 yet.

For big-gemfile:

❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 # master
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):     380.0 ms ±   8.8 ms    [User: 205.3 ms, System: 128.2 ms]
  Range (min … max):   361.2 ms … 393.5 ms    10 runs

❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 # lazily-check-incomplete-lockfile
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):     403.7 ms ±  50.5 ms    [User: 199.1 ms, System: 135.8 ms]
  Range (min … max):   368.0 ms … 521.3 ms    10 runs

❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 # specset-for-hash-lookup-of-handled-deps
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):     351.3 ms ±  13.0 ms    [User: 178.9 ms, System: 127.8 ms]
  Range (min … max):   338.7 ms … 380.5 ms    10 runs

❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 # lazily…-check-incomplete-lockfile+specset-for-hash-lookup-of-handled-deps
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):     373.0 ms ±  19.0 ms    [User: 177.5 ms, System: 136.5 ms]
  Range (min … max):   338.4 ms … 398.4 ms    10 runs

For our app:

❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 # master
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):      1.285 s ±  0.012 s    [User: 0.994 s, System: 0.239 s]
  Range (min … max):    1.265 s …  1.305 s    10 runs

❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 # lazily-check-incomplete-lockfile
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):      1.900 s ±  0.053 s    [User: 1.571 s, System: 0.235 s]
  Range (min … max):    1.807 s …  1.963 s    10 runs

❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 # specset-for-hash-lookup-of-handled-deps
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):     538.6 ms ±  10.1 ms    [User: 300.0 ms, System: 186.8 ms]
  Range (min … max):   526.7 ms … 553.7 ms    10 runs

❯ hyperfine "ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev" --warmup 3 # lazily-check-incomplete-lockfile+specset-for-hash-lookup-of-handled-deps
Benchmark 1: ruby ~/workspace/big-gemfile/run.rb 2.4.0.dev
  Time (mean ± σ):     540.6 ms ±  11.2 ms    [User: 292.8 ms, System: 196.5 ms]
  Range (min … max):   532.1 ms … 568.1 ms    10 runs

Parsing those out to a table...

| Branch | big-gemfile | our app |
| --- | --- | --- |
| master | 380.0 ms ± 8.8 ms | 1.285 s ± 0.012 s |
| lazily-check-incomplete-lockfile | 403.7 ms ± 50.5 ms | 1.900 s ± 0.053 s |
| specset-for-hash-lookup-of-handled-deps | 351.3 ms ± 13.0 ms | 538.6 ms ± 10.1 ms |
| lazily-check-incomplete-lockfile+specset-for-hash-lookup-of-handled-deps | 373.0 ms ± 19.0 ms | 540.6 ms ± 11.2 ms |

If I were to draw a conclusion from that, lazily-check-incomplete-lockfile is slightly slower (6.2%) on the big-gemfile I created, but significantly slower on our app (47%). specset-for-hash-lookup-of-handled-deps was the fastest branch, with 7.5% improvement for big-gemfile, and 58% improvement for our app. Combining the approaches was very slightly slower than using specset-for-hash-lookup-of-handled-deps on its own.
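The percentages above are relative changes against master. A small sketch of that arithmetic, using only the mean timings from the table (converted to milliseconds); the `TIMINGS` constant and `percent_change` helper are names made up for this example, and the standard deviations are ignored here.

```ruby
# Mean timings in milliseconds, copied from the table above.
TIMINGS = {
  "master" => { big_gemfile: 380.0, our_app: 1285.0 },
  "lazily-check-incomplete-lockfile" => { big_gemfile: 403.7, our_app: 1900.0 },
  "specset-for-hash-lookup-of-handled-deps" => { big_gemfile: 351.3, our_app: 538.6 },
  "lazily-check-incomplete-lockfile+specset-for-hash-lookup-of-handled-deps" =>
    { big_gemfile: 373.0, our_app: 540.6 },
}.freeze

# Percent change of a branch's mean relative to master.
# Positive means slower than master, negative means faster.
def percent_change(branch, project)
  baseline = TIMINGS.fetch("master").fetch(project)
  measured = TIMINGS.fetch(branch).fetch(project)
  ((measured - baseline) / baseline * 100).round(1)
end

TIMINGS.each_key do |branch|
  next if branch == "master"

  puts format(
    "%s: big-gemfile %+.1f%%, our app %+.1f%%",
    branch,
    percent_change(branch, :big_gemfile),
    percent_change(branch, :our_app)
  )
end
```

Given the spread reported for some runs (for example ± 50.5 ms on a roughly 400 ms mean), small differences between branches on big-gemfile are probably within the noise.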

I wanted to call out that I'm specifically testing this bundler/setup workflow, and not anything else, so I don't know what else this branch would impact.

PS: I tried to do a few rake install with custom versions like your previous benchmarks suggest, but I was getting inconsistent results... like, it seemed to only really respect one 2.4.0 version installed.

PPS: I accidentally hit command-enter and commented before finishing this.

deivid-rodriguez · 2022-05-30T08:39:10Z

No worries @technicalpickles!

I'm super confused with your results, since now they seem to contradict your initial testing where you reported an 8% improvement, no?

To clarify how I test this: I edit lib/bundler/version.rb and give it a distinct prerelease-like name. Then I run bin/rake install. I do this for every branch I want to test.

Then I let hyperfine compare the different branches by setting BUNDLER_VERSION for each command. Like I reported before:

➜ hyperfine 'BUNDLER_VERSION=2.4.0.lazyhash ruby -rbundler/setup -e1' 'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1' 'BUNDLER_VERSION=2.4.0.lazy ruby -rbundler/setup -e1'
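As a convenience, the command above could be generated for any list of locally installed versions with something like the following sketch. The `hyperfine_command` helper is hypothetical, and it assumes the version names contain no quotes or whitespace.

```ruby
# Hypothetical helper: builds the hyperfine command line for a set of
# Bundler versions that were each installed locally via `bin/rake install`.
# Assumes version names contain no quotes or whitespace.
def hyperfine_command(versions)
  benchmarks = versions.map do |version|
    "'BUNDLER_VERSION=#{version} ruby -rbundler/setup -e1'"
  end
  (["hyperfine"] + benchmarks).join(" ")
end

command = hyperfine_command(%w[2.4.0.lazyhash 2.4.0.dev 2.4.0.lazy])
puts command
```

Each version name has to match what was written into lib/bundler/version.rb before the corresponding install, otherwise BUNDLER_VERSION will not select the intended build.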

technicalpickles · 2022-05-31T15:35:21Z

Let me try clearing out my various dev gems, and try again.

I also realized that the run.rb may trigger actually resolving the dependencies because it displays a count of gems. I'll try again with the simpler one liner you used.

deivid-rodriguez · 2022-06-14T08:00:51Z

@technicalpickles Did you find time to try this? I'm really curious to clarify this and I'd like to ship this improvement :)

deivid-rodriguez · 2022-06-18T19:47:56Z

Alright, something we can do while you try this out is go ahead and ship #5537, since that's the biggest speedup here and clearly can't ever make things worse. Once we have that, I can rebase this PR and we can evaluate its performance more easily. I'll work on that!

deivid-rodriguez · 2022-06-23T09:27:05Z

This PR is now rebased and ready to be tried out :)

deivid-rodriguez · 2022-07-01T10:18:51Z

These are updated numbers on my computer after other speedups were released:

$ hyperfine 'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1' 'BUNDLER_VERSION=2.3.17 ruby -rbundler/setup -e1'  
Benchmark 1: BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1
  Time (mean ± σ):     145.6 ms ±   1.1 ms    [User: 122.3 ms, System: 21.9 ms]
  Range (min … max):   144.3 ms … 148.5 ms    20 runs
 
Benchmark 2: BUNDLER_VERSION=2.3.17 ruby -rbundler/setup -e1
  Time (mean ± σ):     149.1 ms ±   0.8 ms    [User: 126.2 ms, System: 21.5 ms]
  Range (min … max):   147.5 ms … 150.1 ms    19 runs
 
Summary
  'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1' ran
    1.02 ± 0.01 times faster than 'BUNDLER_VERSION=2.3.17 ruby -rbundler/setup -e1'

So only 2-3% speed up now.

But I'm working on a different approach where the speed up goes up to 5-6% and that might make this one unnecessary. I'll propose it separately.

deivid-rodriguez · 2022-07-07T12:20:38Z

And here is the alternative PR: #5695.

By the way @technicalpickles, I think I understood why you were getting different results. Are you running your tests on Linux by any chance? Since the Gemfile.lock in your repo only includes arm64-darwin-21, bundling on Linux will actually re-resolve from scratch for the new platform. That might explain your numbers, since this PR intends to speed up the "no re-resolve needed" happy path.
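A rough way to picture the platform mismatch described above. This sketch hand-parses the PLATFORMS section of a made-up lockfile and compares plain strings; it is not how Bundler reads lockfiles or matches platforms (the real matching is more nuanced), and `SAMPLE_LOCKFILE`, `locked_platforms`, and `platform_locked?` are hypothetical names.

```ruby
# Made-up lockfile content used only for this illustration.
SAMPLE_LOCKFILE = <<~LOCK
  GEM
    remote: https://rubygems.org/
    specs:
      rake (13.0.6)

  PLATFORMS
    arm64-darwin-21

  DEPENDENCIES
    rake
LOCK

# Collect the entries listed under the PLATFORMS heading.
def locked_platforms(lockfile_text)
  platforms = []
  in_section = false

  lockfile_text.each_line do |line|
    if line.strip == "PLATFORMS"
      in_section = true
    elsif in_section
      break if line.strip.empty? # a blank line ends the section

      platforms << line.strip
    end
  end

  platforms
end

# Simplification: plain string comparison instead of real platform matching.
def platform_locked?(lockfile_text, platform)
  locked_platforms(lockfile_text).include?(platform)
end

puts platform_locked?(SAMPLE_LOCKFILE, "arm64-darwin-21") # the locked platform
puts platform_locked?(SAMPLE_LOCKFILE, "x86_64-linux")    # a platform not in the lockfile
```

When the running platform is not among the locked ones, a full re-resolve is expected regardless of this PR, so benchmarks taken in that situation would not exercise the happy path.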

deivid-rodriguez · 2022-07-16T13:44:23Z

Good news, after rebasing this on top of other improvements, it still provides a 5-6% speed up on the big gemfile we've been using for testing! 🎉

$ hyperfine 'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1' 'BUNDLER_VERSION=2.3.18 ruby -rbundler/setup -e1'  
Benchmark 1: BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1
  Time (mean ± σ):     141.4 ms ±   1.5 ms    [User: 117.5 ms, System: 22.5 ms]
  Range (min … max):   139.2 ms … 146.1 ms    20 runs
 
Benchmark 2: BUNDLER_VERSION=2.3.18 ruby -rbundler/setup -e1
  Time (mean ± σ):     149.4 ms ±   0.7 ms    [User: 125.5 ms, System: 22.6 ms]
  Range (min … max):   148.3 ms … 150.8 ms    19 runs
 
Summary
  'BUNDLER_VERSION=2.4.0.dev ruby -rbundler/setup -e1' ran
    1.06 ± 0.01 times faster than 'BUNDLER_VERSION=2.3.18 ruby -rbundler/setup -e1'

I'm setting this as ready and I'll be merging in a couple of days if I don't get reviews.

Since the improvement to lazily check whether the lockfile has missing platform specific dependencies, special handling on "bundler" inside `Bundler::SpecSet#for` is no longer covered by specs, because even in the case where missing platform specific dependencies are found, we still show the "Found no changes" message initially. This extra assertion makes the code covered again.

deivid-rodriguez force-pushed the lazily-check-incomplete-lockfile branch from f10a6e8 to ee0c989 Compare May 17, 2022 22:56

technicalpickles mentioned this pull request May 29, 2022

Lazily check incomplete lockfile+specset for hash lookup of handled deps #5582

Closed

4 tasks

deivid-rodriguez force-pushed the lazily-check-incomplete-lockfile branch 2 times, most recently from 8906a1a to 31adc0c Compare June 13, 2022 18:35

deivid-rodriguez force-pushed the lazily-check-incomplete-lockfile branch from 31adc0c to 203136f Compare June 19, 2022 17:50

deivid-rodriguez added the bundler: performance label Jun 19, 2022

deivid-rodriguez force-pushed the lazily-check-incomplete-lockfile branch from 203136f to bc04bb6 Compare June 23, 2022 09:26

deivid-rodriguez added the status: feedback required label Jun 25, 2022

deivid-rodriguez force-pushed the lazily-check-incomplete-lockfile branch 2 times, most recently from 0ba2c9b to 3c375fa Compare July 1, 2022 08:14

deivid-rodriguez force-pushed the lazily-check-incomplete-lockfile branch 2 times, most recently from 211e608 to 744bd5e Compare July 16, 2022 13:42

deivid-rodriguez removed the status: feedback required label Jul 16, 2022

deivid-rodriguez marked this pull request as ready for review July 16, 2022 13:44

deivid-rodriguez added 2 commits July 18, 2022 12:08

Deduplication is only necessary for materialization

2d466be

Refactor SpecSet#for

520f59a

So that it deals with [name, platform] tuples consistently instead of dealing with `Gem::Dependency` or `Bundler::DepProxy` instances inconsistently.

deivid-rodriguez added 4 commits July 18, 2022 12:08

Remove comment that does not really hold

fb87392

The resolve might be against locally available gems, or remote gems, depending on the situation.

Delay SpecSet#for for checking incomplete platform specs

3de25c7

This is a very weird edge case; not all Bundler invocations should pay its cost. Instead, assume it's not going to happen, detect the situation at materialization time, and re-resolve.

Check message signaling re-resolve due to missing platform specific gems

393be98

deivid-rodriguez force-pushed the lazily-check-incomplete-lockfile branch from 744bd5e to 8ce54de Compare July 18, 2022 10:08

deivid-rodriguez enabled auto-merge July 18, 2022 10:08

deivid-rodriguez disabled auto-merge July 18, 2022 11:24

Set timeout for truffleruby CI

5281e51

deivid-rodriguez force-pushed the lazily-check-incomplete-lockfile branch from 8ce54de to 5281e51 Compare July 18, 2022 11:28

deivid-rodriguez enabled auto-merge July 18, 2022 11:28

deivid-rodriguez merged commit f847e60 into master Jul 18, 2022

deivid-rodriguez deleted the lazily-check-incomplete-lockfile branch July 18, 2022 14:53

deivid-rodriguez added a commit that referenced this pull request Jul 27, 2022

Merge pull request #5546 from rubygems/lazily-check-incomplete-lockfile

ea744f5

Lazily check incomplete lockfile to improve performance (cherry picked from commit f847e60)

deivid-rodriguez mentioned this pull request Aug 8, 2022

Add RubyGems Updates July 2022 rubygems/rubygems.github.io#122

Merged
