Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clients cache open sockets for up to 20 seconds after last request #632

Closed
Tracked by #1692
stevepryde opened this issue Oct 10, 2022 · 9 comments
Closed
Tracked by #1692
Assignees
Labels
bug This issue is a bug. p3 This is a minor priority issue

Comments

@stevepryde
Copy link

Describe the bug

Each instance of aws_sdk_*::Client seems to cache potentially hundreds of open sockets that stay around for up to 20 seconds after use. This becomes a major problem on Lambda where your app has a file handle limit of 1024, especially when using a number of different AWS clients to do lots of parallel operations in the same app (each with their own cache).

This can be mitigated slightly by dropping clients and recreating as needed but that's not always feasible and is probably also less efficient.

Expected Behavior

Either sockets should always be closed after every request, or the caller should be able to customize this behaviour.
Or perhaps a way to share the socket pool between multiple clients.

Current Behavior

After performing 50x S3 HeadObject requests concurrently, the S3 Client struct keeps all 50 file handles (sockets) open for up to 20 seconds, even though the requests complete almost instantly. I assume all other Client structs behave similarly.

Reproduction Steps

I can reproduce this by spawning concurrent S3 HeadObject tasks and then polling the number of file handles used.
I assume any Client and any operation will do similar.

let s3 = aws_sdk_s3::Client::new(&config);
let mut futs = Vec::new();
for _ in 0..50 {
    futs.push(tokio::spawn(
        s3.head_object().bucket(bucket).key(key).send(),
    ));
}

for fut in futs {
    fut.await.unwrap().unwrap();
}

for countdown in 0..30 {
    tokio::time::sleep(Duration::from_secs(1)).await;
    let fd_count = procfs::process::Process::myself()
        .unwrap()
        .fd_count()
        .unwrap();
    info!("fd_count = {fd_count} after {countdown} seconds");
    if fd_count < 20 {
        break;
    }
}

Possible Solution

I haven't taken a deep enough dive into the client yet but something seems to be caching open sockets for up to 20 seconds. These do get reused if using the same client, but it would be good to have a way to avoid caching them at all.

Additional Information/Context

No response

Version

├── aws-config v0.49.0
│   │   ├── aws-http v0.49.0
│   │   │   ├── aws-smithy-http v0.49.0
│   │   │   │   ├── aws-smithy-eventstream v0.49.0
│   │   │   │   │   ├── aws-smithy-types v0.49.0
│   │   │   │   ├── aws-smithy-types v0.49.0 (*)
│   │   │   ├── aws-smithy-types v0.49.0 (*)
│   │   │   ├── aws-types v0.49.0
│   │   │   │   ├── aws-smithy-async v0.49.0
│   │   │   │   ├── aws-smithy-client v0.49.0
│   │   │   │   │   ├── aws-smithy-async v0.49.0 (*)
│   │   │   │   │   ├── aws-smithy-http v0.49.0 (*)
│   │   │   │   │   ├── aws-smithy-http-tower v0.49.0
│   │   │   │   │   │   ├── aws-smithy-http v0.49.0 (*)
│   │   │   │   │   ├── aws-smithy-types v0.49.0 (*)
│   │   │   │   ├── aws-smithy-http v0.49.0 (*)
│   │   │   │   ├── aws-smithy-types v0.49.0 (*)
│   ├── aws-sdk-s3 v0.19.0
│   │   ├── aws-endpoint v0.49.0 (*)
│   │   ├── aws-http v0.49.0 (*)
│   │   ├── aws-sig-auth v0.49.0 (*)
│   │   ├── aws-sigv4 v0.49.0 (*)
│   │   ├── aws-smithy-async v0.49.0 (*)
│   │   ├── aws-smithy-checksums v0.49.0
│   │   │   ├── aws-smithy-http v0.49.0 (*)
│   │   │   ├── aws-smithy-types v0.49.0 (*)
│   │   ├── aws-smithy-client v0.49.0 (*)
│   │   ├── aws-smithy-eventstream v0.49.0 (*)
│   │   ├── aws-smithy-http v0.49.0 (*)
│   │   ├── aws-smithy-http-tower v0.49.0 (*)
│   │   ├── aws-smithy-types v0.49.0 (*)
│   │   ├── aws-smithy-xml v0.49.0 (*)
│   │   ├── aws-types v0.49.0 (*)
│   ├── aws-types v0.49.0 (*)

Environment details (OS name and version, etc.)

AWS Lambda (Amazon Linux 2 x86_64) and also Mac OS 12.6 M1/32G

Logs

No response

@stevepryde stevepryde added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Oct 10, 2022
@jdisanti
Copy link
Contributor

The SDK is using the hyper defaults for connection pooling right now, and it looks like by default, hyper doesn't have an upper bound on idle connections.

I'm not sure what the correct fix is, but its definitely seems like something the SDK needs solve.

It is possible to work around this by replacing the default connector so that you can change hyper's defaults, but this is definitely not easy, and we don't have a good example for it currently.

@jdisanti jdisanti removed the needs-triage This issue or PR still needs to be triaged. label Oct 10, 2022
@jdisanti
Copy link
Contributor

Adding this to our Production Readiness tracking issue.

@jdisanti
Copy link
Contributor

It looks like the AWS SDK for Java V1 defaults to 50 idle connections with a max idle time of 60 seconds, and V2 is consistent.

@jdisanti
Copy link
Contributor

@jdisanti
Copy link
Contributor

I think you can workaround this issue with the following connector customization for now:

aws-config = "0.49.0"
aws-sdk-dynamodb = "0.19.0"
aws-sdk-s3 = "0.19.0"
aws-smithy-client = "0.49.0"
hyper = { version = "0.14.20", features = ["full"] }
tokio = { version = "1.21.2", features = ["full"] }
use aws_sdk_dynamodb as dynamodb;
use aws_sdk_s3 as s3;
use aws_smithy_client::erase::DynConnector;
use aws_smithy_client::hyper_ext::Adapter as HyperAdapter;

fn create_smithy_conn() -> DynConnector {
    // The `DynConnector` results in an allocation and dynamic dispatch for all client calls,
    // so you may desire to type out the full type name for the return value instead.
    DynConnector::new(
        // The `HyperAdapter` converts a hyper connector into a Smithy connector that can be used with the SDK
        HyperAdapter::builder()
            .hyper_builder({
                // Tell the `HyperAdapter` to set max idle connections on the underlying hyper client
                let mut hyper_builder = hyper::Client::builder();
                hyper_builder.pool_max_idle_per_host(50);
                hyper_builder
            })
            // Use the default/built-in HTTPS connector
            .build(aws_smithy_client::conns::https()),
    )
}

#[tokio::main]
async fn main() {
    let sdk_config = aws_config::load_from_env().await;

    // Construct clients with the customized connectors
    let s3_client = s3::Client::from_conf_conn((&sdk_config).into(), create_smithy_conn());
    let dynamodb_client =
        dynamodb::Client::from_conf_conn((&sdk_config).into(), create_smithy_conn());

    println!("{:?}", s3_client.list_buckets().send().await);
    println!("{:?}", dynamodb_client.list_tables().send().await);
}

If you try that, let me know if it improves things.

@stevepryde
Copy link
Author

That works nicely. Thank you 😄

@jmklix jmklix added the p3 This is a minor priority issue label Nov 14, 2022
@jdisanti jdisanti self-assigned this Dec 5, 2022
@jdisanti jdisanti added the pending-release This issue will be fixed by an approved PR that hasn't been released yet. label Dec 8, 2022
@jdisanti
Copy link
Contributor

jdisanti commented Dec 8, 2022

A fix for this will go out in the next release.

@jdisanti
Copy link
Contributor

This was included in release-2022-12-14.

@jdisanti jdisanti removed the pending-release This issue will be fixed by an approved PR that hasn't been released yet. label Dec 21, 2022
@github-actions
Copy link

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue is a bug. p3 This is a minor priority issue
Projects
Archived in project
Development

No branches or pull requests

3 participants