Improve bundler/setup performance again by not deduplicating intermediate results#5533
Merged
deivid-rodriguez merged 2 commits intomasterfrom May 16, 2022
Merged
Improve bundler/setup performance again by not deduplicating intermediate results#5533deivid-rodriguez merged 2 commits intomasterfrom
bundler/setup performance again by not deduplicating intermediate results#5533deivid-rodriguez merged 2 commits intomasterfrom
Conversation
On a different patch, it was noticed Ngam Pham that we are calling `LazySpecification#hash` many times, and simply memoizing that led to a very considerable performance improvement in his app. I noticed though that we shouldn't be calling `LazySpecification#hash` that many times, and I located the culprit at `SpecSet#for` where we were deduplicating the partial aggregated result on every iteration. It is enough to do it just once at the end. This leads on a 12% speedup on Rails repository Gemfile vs the previous 8% I was getting from memoizing `LazySpecification#hash`. Also, after this patch memoizing `LazySpecification#hash` has no effect in performance anymore. Co-authored-by: Ngan Pham <ngan@users.noreply.github.com>
4 tasks
On `rails/rails` repository Gemfile, running the following script ``` # script.rb require "bundler/setup" ``` #### Before ``` ➜ rails git:(main) ✗ BUNDLER_VERSION=2.4.0.dev ruby-memory-profiler --pretty --no-detailed --allocated-strings=0 --retained-strings=0 script.rb Total allocated: 24.37 MB (207937 objects) Total retained: 2.98 MB (34152 objects) ``` #### After ``` ➜ rails git:(main) ✗ BUNDLER_VERSION=2.4.0.dev ruby-memory-profiler --pretty --no-detailed --allocated-strings=0 --retained-strings=0 script.rb Total allocated: 22.27 MB (206856 objects) Total retained: 2.98 MB (34152 objects) ``` Co-authored-by: Josh Nichols <josh.nichols@gusto.com>
Contributor
|
Compared this to 2.3.9 and 2.3.13 with our Gemfile: So, looks good! |
technicalpickles
added a commit
to technicalpickles/rubygems
that referenced
this pull request
May 13, 2022
Looking over flamegraphs for `require 'bundler/setup'`, I'm seeing `Bundler::DepProxy#name` show up quite often. The method itself is really simple, delegating to the dependency's proxy. I suspect it is getting called enough that memoizing the value improves will improve performance by saving a method call, in exchange for saving the value in memory. When testing with this patch plus ruby#5533 I saw time go from 0.92s to 0.75s.
4 tasks
technicalpickles
added a commit
to technicalpickles/rubygems
that referenced
this pull request
May 13, 2022
…andled deps
I was looking at (yet another) flamegraph in speedscope, and used the
'left hand heavy' and was shocked to realize that 0.5s of the 1.7s
is spent in DepProxy#name. This method _only_ delegates the name to an
underlying spec, so it's not complex at all.
It seems to be of how often this line ends up calling it:
next if handled.any?{|d| d.name == dep.name && (match_current_platform || d.__platform == dep.__platform) } || dep.name == "bundler"
The `handled` array is built up as dependencies are handled, so this get
slower as more dependencies are installed.
This change changes how `handled` is track. Instead of just an array, I've
tried using a Hash, with the key being a dep's name, and the value being
a list of deps with that name. This means it's constant time to find
the dependencies with the same name.
I saw a drop from 1.7s to 1.0s against master, and from 0.95s to 0.24s
when used with ruby#5533
4 tasks
deivid-rodriguez
pushed a commit
to technicalpickles/rubygems
that referenced
this pull request
May 16, 2022
…andled deps
I was looking at (yet another) flamegraph in speedscope, and used the
'left hand heavy' and was shocked to realize that 0.5s of the 1.7s
is spent in DepProxy#name. This method _only_ delegates the name to an
underlying spec, so it's not complex at all.
It seems to be of how often this line ends up calling it:
next if handled.any?{|d| d.name == dep.name && (match_current_platform || d.__platform == dep.__platform) } || dep.name == "bundler"
The `handled` array is built up as dependencies are handled, so this get
slower as more dependencies are installed.
This change changes how `handled` is track. Instead of just an array, I've
tried using a Hash, with the key being a dep's name, and the value being
a list of deps with that name. This means it's constant time to find
the dependencies with the same name.
I saw a drop from 1.7s to 1.0s against master, and from 0.95s to 0.24s
when used with ruby#5533
bundler/setup performance againbundler/setup performance again by not deduplicating intermediate results
deivid-rodriguez
added a commit
that referenced
this pull request
May 18, 2022
Improve `bundler/setup` performance again (cherry picked from commit b0958db)
technicalpickles
added a commit
to technicalpickles/rubygems
that referenced
this pull request
May 29, 2022
…andled deps
I was looking at (yet another) flamegraph in speedscope, and used the
'left hand heavy' and was shocked to realize that 0.5s of the 1.7s
is spent in DepProxy#name. This method _only_ delegates the name to an
underlying spec, so it's not complex at all.
It seems to be of how often this line ends up calling it:
next if handled.any?{|d| d.name == dep.name && (match_current_platform || d.__platform == dep.__platform) } || dep.name == "bundler"
The `handled` array is built up as dependencies are handled, so this get
slower as more dependencies are installed.
This change changes how `handled` is track. Instead of just an array, I've
tried using a Hash, with the key being a dep's name, and the value being
a list of deps with that name. This means it's constant time to find
the dependencies with the same name.
I saw a drop from 1.7s to 1.0s against master, and from 0.95s to 0.24s
when used with ruby#5533
deivid-rodriguez
pushed a commit
to technicalpickles/rubygems
that referenced
this pull request
Jun 18, 2022
…andled deps
I was looking at (yet another) flamegraph in speedscope, and used the
'left hand heavy' and was shocked to realize that 0.5s of the 1.7s
is spent in DepProxy#name. This method _only_ delegates the name to an
underlying spec, so it's not complex at all.
It seems to be of how often this line ends up calling it:
next if handled.any?{|d| d.name == dep.name && (match_current_platform || d.__platform == dep.__platform) } || dep.name == "bundler"
The `handled` array is built up as dependencies are handled, so this get
slower as more dependencies are installed.
This change changes how `handled` is track. Instead of just an array, I've
tried using a Hash, with the key being a dep's name, and the value being
a list of deps with that name. This means it's constant time to find
the dependencies with the same name.
I saw a drop from 1.7s to 1.0s against master, and from 0.95s to 0.24s
when used with ruby#5533
matzbot
pushed a commit
to ruby/ruby
that referenced
this pull request
Jun 19, 2022
…ing hash lookup of handled deps
I was looking at (yet another) flamegraph in speedscope, and used the
'left hand heavy' and was shocked to realize that 0.5s of the 1.7s
is spent in DepProxy#name. This method _only_ delegates the name to an
underlying spec, so it's not complex at all.
It seems to be of how often this line ends up calling it:
next if handled.any?{|d| d.name == dep.name && (match_current_platform || d.__platform == dep.__platform) } || dep.name == "bundler"
The `handled` array is built up as dependencies are handled, so this get
slower as more dependencies are installed.
This change changes how `handled` is track. Instead of just an array, I've
tried using a Hash, with the key being a dep's name, and the value being
a list of deps with that name. This means it's constant time to find
the dependencies with the same name.
I saw a drop from 1.7s to 1.0s against master, and from 0.95s to 0.24s
when used with ruby/rubygems#5533
ruby/rubygems@844dac30d4
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What was the end-user or developer problem that led to this PR?
The performance of
bundler/setupis still slow.What is your fix for the problem, implemented in this PR?
On a different patch, it was noticed by @ngan that we are calling
LazySpecification#hashmany times, and simply memoizing that led to a very considerable performance improvement in his app.I noticed though that we shouldn't be calling
LazySpecification#hashthat many times, and I located the culprit atSpecSet#forwhere we were deduplicating the partial aggregated result on every iteration. It is enough to do it just once at the end.This leads on a 12% speedup on Rails repository Gemfile vs the previous 8% I was getting from memoizing
LazySpecification#hash. Also, after this patch memoizingLazySpecification#hashhas no effect in performance anymore.Make sure the following tasks are checked