S3 - retrying fetches can corrupt the downloaded file? #2692

Closed
jdelStrother opened this issue Apr 21, 2022 · 2 comments · Fixed by #2693
Labels: bug, investigating

Comments

@jdelStrother
Describe the bug

We've been seeing some very occasional file corruption when downloading files from S3 using the SDK. We first observed this on 2nd March, and have had 6 occurrences since (out of, say, 20,000 successful downloads).

The affected files contain a repeated chunk. For example, if the original file is 5MB, the downloaded file might be 5.5MB, with the first 0.5MB appearing twice.

s3 = ::Aws::S3::Resource.new(region: s3_config[:region], credentials: s3_credentials)
bucket = s3.bucket(bucket_name)
bucket.object(key).get(response_target: local_path, checksum_mode: "ENABLED")

After downloading, the file at local_path should match the S3 object. However, the download is still occasionally corrupted, silently, even with checksum_mode enabled.

Expected Behavior

Downloaded files exactly match the file on S3. Or, at least, a checksum error is raised if they don't match.
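Until the SDK raises such an error itself, a caller-side check along these lines could catch the mismatch. This is only a minimal sketch: `verify_download!` is a hypothetical helper, not part of aws-sdk-s3, and it assumes the caller already knows the expected byte size (for example from `content_length` on the `get_object` response) and, optionally, a SHA-256 of the object obtained from some trusted source.

```ruby
require "digest"

# Hypothetical helper (not part of aws-sdk-s3): raises if the file at `path`
# does not have the expected byte size or, when given, the expected SHA-256.
def verify_download!(path, expected_size:, expected_sha256: nil)
  actual_size = File.size(path)
  unless actual_size == expected_size
    raise "size mismatch for #{path}: expected #{expected_size} bytes, got #{actual_size}"
  end

  # The digest check is optional; skip it when no reference digest is known.
  return true if expected_sha256.nil?

  actual_sha256 = Digest::SHA256.file(path).hexdigest
  unless actual_sha256 == expected_sha256
    raise "SHA-256 mismatch for #{path}"
  end

  true
end
```

The size comparison alone would have flagged the corrupted downloads described in this issue, since they are longer than the S3 object.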

Current Behavior

Whenever we've seen this fail, the file takes longer than usual to download, and the SDK logger reports that a retry was used:

[Aws::S3::Client 200 75.615772 1 retries] get_object(checksum_mode:"ENABLED",bucket:"XXXXXX",key:"YYYY")  

Here are the S3 access logs for that request:

36f9b0673892ff12e83df3b8d5de28c2ce25d3f004d8d9789e008a752773f3ec BUCKET [20/Apr/2022:17:49:20 +0000] MY_IP MY_ROLE FS9KQCP2K686GR6H REST.GET.OBJECT KEY "GET /BUCKET/KEY HTTP/1.1" 200 - 18216449 35600463 91928 130 "-" "aws-sdk-ruby3/3.130.0 ruby/3.0.3 x86_64-linux aws-sdk-s3/1.113.0" - gEh8Wuxn1GM4Az2bZSKFZntpUrV0SjBCmUKAK0HDIII625xheiSywECi1K26a5PoxVv3YcM2SCc= SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader s3.amazonaws.com TLSv1.2 -

36f9b0673892ff12e83df3b8d5de28c2ce25d3f004d8d9789e008a752773f3ec BUCKET [20/Apr/2022:17:50:33 +0000] MY_IP MY_ROLE 2KZRM5108Z74T3BD REST.GET.OBJECT KEY "GET /BUCKET/KEY HTTP/1.1" 200 - 35600463 35600463 906 92 "-" "aws-sdk-ruby3/3.130.0 ruby/3.0.3 x86_64-linux aws-sdk-s3/1.113.0" - PyjzB9K05cmGg2NbJnwRJcEJbFlIHr05lpfww/O7qOe6Izj7GHz/yt9bfZHXDzjlVqMHLdxYL4M= SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader s3.amazonaws.com TLSv1.2 -

In this example, the S3 file is 35,600,463 bytes long, but the corrupted local file is 36,036,229 bytes long, with a duplicated 435,766-byte section at the start.
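The arithmetic here (36,036,229 - 35,600,463 = 435,766) suggests a simple diagnostic for this particular pattern. The following is a sketch only: `duplicated_prefix_length` is a hypothetical helper, and it assumes the corruption always takes the form seen above, where the surplus bytes are a copy of the start of the file placed immediately before the real content. It does not verify the rest of the file.

```ruby
# Hypothetical diagnostic: returns the length of the duplicated leading chunk,
# or nil if the file does not match the "repeated start" pattern.
def duplicated_prefix_length(path, expected_size)
  # Surplus bytes relative to the size reported by S3.
  extra = File.size(path) - expected_size
  return nil unless extra.positive? && extra <= expected_size

  File.open(path, "rb") do |f|
    first = f.read(extra)   # bytes [0, extra)
    second = f.read(extra)  # bytes [extra, 2 * extra)
    first == second ? extra : nil
  end
end
```

For the file described above, a call such as `duplicated_prefix_length(local_path, 35_600_463)` would be expected to return 435766 if the pattern holds.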

Reproduction Steps

I'm open to better ideas to reproduce this but have tried for a while without success. Sometimes weeks go by without us seeing the issue.

Possible Solution

No response

Additional Information/Context

This is also open as AWS support ticket 9950331191, with unredacted info, if you have access to that.

Gem name ('aws-sdk', 'aws-sdk-resources' or service gems like 'aws-sdk-s3') and its version

aws-sdk-s3 1.113.0 (aws-sdk-core 3.130.0, per the user agent in the logs above)

Environment details (Version of Ruby, OS environment)

Ruby 3.0.3, Ubuntu 18

@jdelStrother jdelStrother added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Apr 21, 2022
@mullermp
Contributor

Thanks for opening. We're taking a look.

@alextwoods alextwoods added investigating Issue is being investigated and removed needs-triage This issue or PR still needs to be triaged. labels Apr 22, 2022
@alextwoods alextwoods self-assigned this Apr 22, 2022
@github-actions
⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.
