Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce memory usage from aws-partitions #3122

Merged
merged 24 commits into from
Oct 17, 2024
Merged

Conversation

alextwoods
Copy link
Contributor

@alextwoods alextwoods commented Oct 7, 2024

The endpoints (partitions.json) loaded by aws-partitions account for a large amount of memory required when creating/using service clients and post Endpoints2.0, are required only for legacy features.

Benchmarking
code being benchmarked:

require 'aws-sdk-lambda'
client = Aws::Lambda::Client.new
client.list_functions; nil
  • Before: memory retained: 8010 kb (69488 objects)
  • After: memory retained: 3732 kb (30878 objects)

A decrease of more than 50%.

This PR does the following to reduce memory usage:

  • Adds a new partitions metadata module that loads only the partition metadata.
  • Change the aws_partitions Endpoints matcher to use the new, metadata only methods.
  • Make client.config.endpoint set using an after_initailize and using the configured endpoint resolver

Additional endpoints changes:
Added a create class method to EndpointParameters which takes values from config. Operation endpoint parameter classes now are only generated when there is operation specific values and use the create method to handle setting values from config. Example generated code for S3:

# EndpointParameters#create
    def self.create(config, options={})
      new({
        region: config.region,
        use_fips: config.use_fips_endpoint,
        endpoint: config.regional_endpoint ? nil : config.endpoint.to_s,
        force_path_style: config.force_path_style,
        use_global_endpoint: config.s3_us_east_1_regional_endpoint == 'legacy',
        disable_multi_region_access_points: config.s3_disable_multiregion_access_points,
        use_arn_region: config.s3_use_arn_region,
        disable_s3_express_session_auth: config.disable_s3_express_session_auth,
      }.merge(options))
    end

# Endpoint module
    class ListObjectsV2
      def self.build(context)
        Aws::S3::EndpointParameters.create(
          context.config,
          bucket: context.params[:bucket],
          use_dual_stack: context[:use_dualstack_endpoint],
          accelerate: context[:use_accelerate_endpoint],
          prefix: context.params[:prefix],
        )
      end
    end

Alternatives
Instead of using define_singleton_method to override the endpoint method on the config struct, we could instead:

  1. resolve with the new endpoints resolver from config (fallback to legacy if we can’t infer parameters). This is fairly complex code and requires us to make some assumptions about how parameters are bound (but reasonable within all existing AWS services). See demo implementation - this demo does NOT work for S3, but works for most other services.
  2. (Slight breaking change) Use a simple templated/pattern resolver instead of the legacy resolver. Requires writing the basic template code, but is relatively simple. It will produce "incorrect" answers in some cases, but these are NOT used when making requests (but may be used by customers).
  3. (Breaking change) Set endpoint to nil. Customers that were relying on client.config.endpoint may be broken.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

  1. To make sure we include your contribution in the release notes, please make sure to add description entry for your changes in the "unreleased changes" section of the CHANGELOG.md file (at corresponding gem). For the description entry, please make sure it lives in one line and starts with Feature or Issue in the correct format.

  2. For generated code changes, please checkout below instructions first:
    https://github.com/aws/aws-sdk-ruby/blob/version-3/CONTRIBUTING.md

Thank you for your contribution!

@alextwoods alextwoods marked this pull request as ready for review October 9, 2024 16:52
Copy link
Contributor

@mullermp mullermp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. I like this approach better. Just needs some clean up I think.


def initialize_default_endpoint(client)
client_module = Object.const_get(client.class.name.rpartition('::').first)
if client.config.respond_to?(:endpoint_provider) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of nesting here. Can we break this up a bit with either methods and/or guard clauses?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, just rescuing exceptions cleans this up a ton.

Copy link

github-actions bot commented Oct 17, 2024

Detected 1 possible performance regressions:

  • aws-sdk-cloudwatchlogs.filter_log_events_large_ms - z-score regression: 24.96 -> 42.73. Z-score: 22.23

@alextwoods alextwoods merged commit 76412b0 into version-3 Oct 17, 2024
31 checks passed
@alextwoods alextwoods deleted the optimize_partitions branch October 17, 2024 17:39
alextwoods added a commit that referenced this pull request Oct 17, 2024
alextwoods added a commit that referenced this pull request Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants