Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

S3 Store's bytes per upload part calculation logic doesn't make sense #1425

Open
aleksdmladenovic opened this issue Oct 21, 2024 · 0 comments

Comments

@aleksdmladenovic
Copy link
Contributor

aleksdmladenovic commented Oct 21, 2024

This is the code of S3 Store's bytes per upload part calculation.

// S3 requires us to upload in parts if the size is greater than 5GB. The part size must be at least
// 5mb (except last part) and can have up to 10,000 parts.
let bytes_per_upload_part =
(max_size / (MIN_MULTIPART_SIZE - 1)).clamp(MIN_MULTIPART_SIZE, MAX_MULTIPART_SIZE);

We've declared constant variables as this.

// S3 parts cannot be smaller than this number. See:
// https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.html
const MIN_MULTIPART_SIZE: u64 = 5 * 1024 * 1024; // 5MB.
// S3 parts cannot be larger than this number. See:
// https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.html
const MAX_MULTIPART_SIZE: u64 = 5 * 1024 * 1024 * 1024; // 5GB.
// S3 parts cannot be more than this number. See:
// https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.html
const MAX_UPLOAD_PARTS: usize = 10_000;

It makes sense to use clamp to make the value to keep in range.

But it doesn't make sense to divide the max_size with MIN_MULTIPART_SIZE.
Since we are calculating how many bytes are required for the part not how many part counts are required, why do we divide max bytes with minimum mulitpart bytes?

If that part was to keep the parts count to be less than MAX_UPLOAD_PARTS, shouldn't the code be like this?

 let bytes_per_upload_part = 
     (max_size / (MAX_UPLOAD_PARTS as u64 - 1)).clamp(MIN_MULTIPART_SIZE, MAX_MULTIPART_SIZE); 

I've checked this line's history and it had been there from old times.
Frankly, I'm not sure if this is a real issue and I misunderstood something.
Hope you to check this issue's validity. I will raise a PR if my solution is correct. Thank you.

cc: @allada, @MarcusSorealheis

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant