Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add basic support for Multipart Upload #2

Merged
merged 18 commits into from
Oct 20, 2024
Merged

Conversation

ffigiel
Copy link
Contributor

@ffigiel ffigiel commented Oct 5, 2024

This PR adds support for CreateMultipartUpload, UploadPart and CompleteMultipartUpload S3 APIs. This lets the users of this library upload large files more quickly, by uploading multiple chunks in parallel.

There are other Multipart Upload APIs that could be implemented in the future:

  • AbortMultipartUpload
  • ListMultipartUploads
  • ListParts
  • UploadPartCopy

The full list of these APIs can be found on MinIO's docs: https://min.io/docs/minio/linux/reference/s3-api-compatibility.html#multipart-uploads

test/multipart_test.gleam Outdated Show resolved Hide resolved
Comment on lines 40 to 43
|> list.map(fn(pair) {
let #(body, part_number) = pair
let assert Ok(res) =
upload_part.request(bucket:, key:, upload_id:, part_number:, body:)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should I actually perform these uploads in parallel so that we have an example use case?

@ffigiel ffigiel marked this pull request as ready for review October 5, 2024 18:20
Copy link
Owner

@lpil lpil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I've left some questions inline

})

// Complete the multipart upload
// NOTE: The Parts must be sorted in ascending order for this operation
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's too confusing. I'm going to remove this note, and perhaps make complete_multipart_upload.request() sort the parts argument to prevent the error case I'm alluding to in this note.

The reason for sorting is that when calling complete_multipart_upload.request(), the parts argument must be ordered by the part_number, e.g. [Part(.., part_number: 1), Part(.., part_number: 2), Part(.., part_number: 3)], otherwise the S3 server would respond to such request with an error.

Copy link
Contributor Author

@ffigiel ffigiel Oct 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in 71ba290, but maybe I should skip sorting and let the request fail when the user gives us incorrect parts?

src/bucket/complete_multipart_upload.gleam Show resolved Hide resolved
src/bucket/complete_multipart_upload.gleam Show resolved Hide resolved
bucket: String,
key: String,
upload_id: String,
part_number: Int,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a part number?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From UploadPart docs (https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html):

Part numbers can be any number from 1 to 10,000, inclusive. A part number uniquely identifies a part and also defines its position within the object being created. If you upload a new part using the same part number that was used with a previous part, the previously uploaded part is overwritten.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add this a doc comment please 🙏

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in fd7e3c6

import gleam/list
import gleam/string
import gleeunit/should
import helpers
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add tests for the various possible errors

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in 3be6630

Comment on lines 62 to 66
let assert Ok(res) =
get_object.request(bucket, key)
|> get_object.build(helpers.creds)
|> httpc.send_bits
let assert Ok(get_object.Found(contents)) = get_object.response(res)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to a function in helpers.gleam please

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean. Should I make a helper that takes a bucket, key, creds, and a list of parts as BitArrays, and verifies the object?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah! helper.get_object or something which returns the contents

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in 57a99e4

should.not_equal(res.upload_id, "")

// Upload some parts (they can be sent in parallel)
// NOTE: The minimum Part size for multipart upload is 5MiB (hardcoded in minio)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like part 3 is smaller than that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not a problem. I've updated the note in b0d5139

@lpil lpil marked this pull request as draft October 8, 2024 11:35
@ffigiel ffigiel marked this pull request as ready for review October 10, 2024 19:37
Copy link
Owner

@lpil lpil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautiful! One tiny thing inline, and could you also update the changelog please. Thanks

@ffigiel
Copy link
Contributor Author

ffigiel commented Oct 20, 2024

Done, I also updated the readme

@ffigiel ffigiel requested a review from lpil October 20, 2024 12:50
Copy link
Owner

@lpil lpil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@lpil lpil merged commit 824f9d1 into lpil:main Oct 20, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants