
Introduces autoloading strategy for service gems #3098

Closed
wants to merge 7 commits

Conversation

@Schwad commented Sep 6, 2024

What are we trying to accomplish?

This PR introduces an autoloading strategy for the top-level service modules in the AWS SDK for Ruby. Specifically, we're updating the service_class template, which generates the main entry point for each AWS service module. Our goal is to significantly improve performance by reducing memory footprint and speeding up gem require times.

We also ensure eager loading is still possible for production environments.

Key Changes:

  1. service_class.rb and service_class.mustache are updated to use autoload instead of require_relative; the other templates explicitly loaded by service_class.mustache are updated as well (a rough sketch follows this list).
  2. Codegen is updated to require customizations directly in the generated resources when the customization file exists.
  3. Eager loading is optionally supported when the gem is used within a Rails application and eager loading is enabled.
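
As a rough illustration of items 1 and 3 (the Aws::Foo module and file names here are hypothetical, not the actual generated output), the generated entry point changes roughly like this:

```ruby
# lib/aws-sdk-foo.rb (hypothetical generated entry point)

# Before: every component was loaded as soon as the gem was required, e.g.
#   require_relative 'aws-sdk-foo/types'
#   require_relative 'aws-sdk-foo/client'

# After: constants are registered lazily and loaded only on first reference.
module Aws
  module Foo
    autoload :Types, 'aws-sdk-foo/types'
    autoload :ClientApi, 'aws-sdk-foo/client_api'
    autoload :Client, 'aws-sdk-foo/client'
    autoload :Errors, 'aws-sdk-foo/errors'
    autoload :Resource, 'aws-sdk-foo/resource'
  end
end
```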

The PR is structured in six commits to facilitate review. The resulting code generation is the final commit.

Benchmarks

aws-sdk-core is used in these examples.

Require Time

| Metric | Before | After | Improvement |
| --- | --- | --- | --- |
| `require 'aws-sdk-core'` | 346.2 ms | 69.6 ms | ~5x faster |

Benchmark script:

```ruby
require 'benchmark'

Benchmark.bm do |x|
  x.report "require 'aws-sdk-core'" do
    require 'aws-sdk-core'
  end
end
```

Memory Allocation

| Metric | Before | After | Improvement |
| --- | --- | --- | --- |
| Allocated memory | 7,358,751 | 102,755 | 98.6% reduction |
| Allocated objects | 65,949 | 978 | 98.5% reduction |

Memory profiler script:

```ruby
require 'memory_profiler'

report = MemoryProfiler.report do
  require 'aws-sdk-core'
end

puts report.pretty_print
```

Real Life Boot Time

This is what we see for our application's boot time when pointed at this branch.

~250ms average improvement on boot time. Could be much more for applications that rely on a lot of gems.

Before:

~3.69s, GC ~465ms


After:

~3.42s, GC ~435ms


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@Schwad force-pushed the schwad_autoload branch 2 times, most recently from dbcd6e0 to 3585360, on September 6, 2024 at 11:27
@mullermp (Contributor) commented Sep 6, 2024

Thanks for opening a pull request on this! We were looking to do this to sso, ssoidc, and sts gems that are stuck inside of core, but we ran into some compatibility problems across versions. For example, we need to consider cases where the service gem is older but the core gem is newer.

Could you please revert the code generated changes so the PR is more manageable?

@Schwad (Author) commented Sep 6, 2024

> Thanks for opening a pull request on this!

My pleasure! I have very much enjoyed working on this.

> For example, we need to consider cases where the service gem is older but the core gem is newer.

Sure, that makes sense.

> Could you please revert the code generated changes so the PR is more manageable?

Sure! I did this as a vanilla revert commit for now so you can inspect the codegen changes if needed. If you'd prefer I pop both commits off altogether, I can do that.

[This commit](https://github.com//pull/3098/commits/44fd76db259a1fcf0360cfbf1380c29f96ec794b) still has a fair few files that had to be edited by hand to support production eager loading. I did not regenerate the code for that because it shouldn't change.

@mullermp (Contributor) commented Sep 6, 2024

Great. This will take some time to review and test and I can scope some time out early next week. I saw some Rails specific stuff, please ensure that's removed too.

@Schwad (Author) commented Sep 9, 2024

Thanks

> Great. This will take some time to review and test and I can scope some time out early next week. I saw some Rails specific stuff, please ensure that's removed too.

This code only runs if the gem is used within a Rails application (i.e. if `defined?(Rails::Railtie)` is true). It supports eager loading of all constants, so applications can still choose to eager load in production.

@gmcgibbon do you know if there's a novel way to achieve eager-loading on the application-end without having the Railtie logic upstream? Or could we roll our own?
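
For reference, here is a minimal sketch of the kind of Rails-only hook described above (the Aws::Foo module and its eager_load! helper are illustrative assumptions, not the exact code in this PR):

```ruby
module Aws
  module Foo
    # Hypothetical helper: referencing each constant forces its autoload to resolve.
    def self.eager_load!
      constants.each { |const| const_get(const) }
    end
  end
end

if defined?(Rails::Railtie)
  class AwsSdkEagerLoadRailtie < Rails::Railtie
    config.after_initialize do
      # Pay the full load cost up front only when the host app itself eager
      # loads (the typical production setting); development keeps lazy autoload.
      Aws::Foo.eager_load! if Rails.application.config.eager_load
    end
  end
end
```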

@alextwoods (Contributor) commented

The Rails-specific logic could potentially be moved to https://github.com/aws/aws-sdk-rails.

In general I like the idea, though we had to revert our attempts at making bundled services in core autoload (see #3088) - complex load_paths in some customer cases ultimately made it impossible to support safely for all existing users.

My overall concern with this method is that it moves the cost of loading the sdk from require time to first use (that is, when creating a service client for the first time). Considering a typical service case, this means that time could end up impacting a user request rather than just service startup. Do you have any benchmarking on client creation?

@Schwad (Author) commented Sep 10, 2024

Thanks for the reply @alextwoods, much appreciated. 🙇

> The Rails-specific logic could potentially be moved to https://github.com/aws/aws-sdk-rails.

Okay, I have no problem with doing that.

> My overall concern with this method is that it moves the cost of loading the sdk from require time to first use (that is, when creating a service client for the first time).

Yes. Right now, requiring aws-sdk-* is fairly expensive at boot time. The script below is Rails-specific, but most applications that run something like it are likely to see these gems take up most of the top 10 longest require times on boot.

Script:

```ruby
# typed: true
# frozen_string_literal: true

require "benchmark"

original_require = Kernel.method(:require)

$require_times = {}

Kernel.define_method(:require) do |name|
  result = nil
  time = Benchmark.measure do
    result = original_require.call(name)
  end
  $require_times[name] = time.real * 1000 if result
  result
end

require_relative "../config/environment"

puts "slowest requires:"
$require_times.sort_by { |_, time| -time }.first(20).each do |name, time|
  puts format("%-60s %10.4f", name, time)
end
```

You are right that autoloading by default defers part of this load time to the first time the user hits, say, Aws::Foo::Client. The eager loading (which we may move to aws-sdk-rails) gives the best of both worlds: development can autoload lazily, while production can eager load everything to keep the user-facing experience as fast as possible.

> Do you have any benchmarking on client creation?

For users of these gems who don't have eager load capability, it is probably good to have this benchmark. I don't have this code in front of me right now, but let me circle back to you with that. 👍

@alextwoods (Contributor) commented Sep 13, 2024

@Schwad - I've got a few fixes (see this branch) for all of the failing tests, but can't push updates to the PR. Can you give me access?

@alextwoods (Contributor) commented

So far this change also seems to be passing in our internal build system (with a few of the fixes in the branch I linked), so that's looking good. I believe it may avoid some of the complex load path issues that I ran into with #3083 (though I haven't completed a full 100% test of it).

I've been doing some initial benchmarking on client initialization for S3 (without the eager load). It definitely increases the first client initialize time (since all of the requires are deferred until then), but there is still a net win when adding the require + initialize time together (since we avoid requiring other unused parts of the SDK). While I am still a bit concerned about this, I think it's a reasonable tradeoff.

| Metric | With autoload | Current (no autoload) |
| --- | --- | --- |
| S3 require time | 120.2 ms | 225.531 ms |
| S3 client initialize time | 65.2 ms | 13.84 ms |
| Require allocated | 11,450 KB | 30,835 KB |
| Require retained | 1,527 KB | 4,521 KB |
| Client initialize allocated | 13,457 KB | 3,778 KB |
| Client initialize retained | 4,380 KB | 2,690 KB |
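
A rough way to reproduce this kind of measurement (a sketch only, not the exact script behind the numbers above; `stub_responses: true` just avoids needing real credentials):

```ruby
require 'benchmark'

# Measure cold require time and first-client-initialization time separately.
require_time = Benchmark.realtime { require 'aws-sdk-s3' }
init_time    = Benchmark.realtime { Aws::S3::Client.new(stub_responses: true) }

puts format('S3 require time: %.1f ms', require_time * 1000)
puts format('S3 client initialize time: %.1f ms', init_time * 1000)
```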

@Schwad (Author) commented Sep 16, 2024

@alextwoods thanks for this, you beat me to those benchmarks! It does seem that time-to-first-client might still be an improvement in production even with an autoload-only setup.

> Can you give me access?

I'm working on doing this right now, I'll circle back when access is working. Thank you!

@Schwad (Author) commented Sep 16, 2024

@alextwoods adding you as a collaborator on a fork in our org proved trickier than it should be. 😅 I got some good advice that it would be much simpler to use my own fork, where I have full admin privileges, and add you as a collaborator there.

I've pushed this to my personal fork and invited you, you should be able to do whatever you want with the branch now I believe. Any issues let me know.

New PR/branch: #3105


@alextwoods (Contributor) commented

Thanks! Accepted and I'll see if I can get my changes merged into that PR and then clean up these others.

@alextwoods (Contributor) commented

Closing in favor of #3105

@alextwoods closed this on Sep 16, 2024