Support multithreaded uploads for faster transfers #178

Open
stanhu opened this issue Nov 18, 2020 · 1 comment
Labels: Feature Request (this issue requests a feature that is currently not supported)

Comments


stanhu commented Nov 18, 2020

For large transfers, parallelizing the put_blob_block calls could speed up uploads significantly. This method, which currently uploads its blocks one at a time, would likely benefit:

def create_block_blob_multiple_put(container, blob, content, size, options = {})
  content_type = get_or_apply_content_type(content, options[:content_type])
  content = StringIO.new(content) if content.is_a? String
  block_size = get_block_size(size)
  # Get the number of blocks
  block_count = (Float(size) / Float(block_size)).ceil
  block_list = []
  for block_id in 0...block_count
    id = block_id.to_s.rjust(6, "0")
    put_blob_block(container, blob, id, content.read(block_size), timeout: options[:timeout], lease_id: options[:lease_id])
    block_list.push([id])
  end
  # Commit the blocks put
  commit_options = {}
  commit_options[:content_type] = content_type
  commit_options[:content_encoding] = options[:content_encoding] if options[:content_encoding]
  commit_options[:content_language] = options[:content_language] if options[:content_language]
  commit_options[:content_md5] = options[:content_md5] if options[:content_md5]
  commit_options[:cache_control] = options[:cache_control] if options[:cache_control]
  commit_options[:content_disposition] = options[:content_disposition] if options[:content_disposition]
  commit_options[:metadata] = options[:metadata] if options[:metadata]
  commit_options[:timeout] = options[:timeout] if options[:timeout]
  commit_options[:request_id] = options[:request_id] if options[:request_id]
  commit_options[:lease_id] = options[:lease_id] if options[:lease_id]
  commit_blob_blocks(container, blob, block_list, commit_options)
  get_properties_options = {}
  get_properties_options[:lease_id] = options[:lease_id] if options[:lease_id]
  # Get the blob properties
  get_blob_properties(container, blob, get_properties_options)
end

The Amazon SDK does something similar: https://github.com/aws/aws-sdk-ruby/blob/d2dc0213758da0aee100e485f2dfb090ffa7dbd5/gems/aws-sdk-s3/lib/aws-sdk-s3/multipart_file_uploader.rb#L133-L163
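
To make the request concrete, here is a minimal sketch of how the sequential loop in the method above might be replaced by a small pool of worker threads. It is an illustration only, not code from this gem or from the AWS SDK: the helper name put_blocks_in_parallel, the thread_count parameter, and the client argument (any object that responds to put_blob_block with the arguments used above) are all hypothetical, and it assumes the underlying client can safely be called from several threads at once, which would need to be verified.

```ruby
require "stringio"

# Hypothetical helper, NOT part of the azure-storage-blob gem.
# `client` is assumed to respond to put_blob_block(container, blob, id, data, options)
# and to be safe to call from multiple threads (unverified assumption).
# Returns the block list in upload order, ready to pass to commit_blob_blocks.
def put_blocks_in_parallel(client, container, blob, content, size, block_size,
                           thread_count: 4, options: {})
  content = StringIO.new(content) if content.is_a?(String)
  block_count = (Float(size) / Float(block_size)).ceil
  block_ids = Array.new(block_count) { |i| i.to_s.rjust(6, "0") }

  # Bounded queue: at most thread_count * 2 blocks wait in memory at once.
  queue = SizedQueue.new(thread_count * 2)
  errors = Queue.new

  workers = Array.new(thread_count) do
    Thread.new do
      # Queue#pop returns nil once the queue is closed and empty.
      while (job = queue.pop)
        next unless errors.empty? # after a failure, drain without uploading
        id, data = job
        begin
          client.put_blob_block(container, blob, id, data,
                                timeout: options[:timeout], lease_id: options[:lease_id])
        rescue StandardError => e
          errors << e
        end
      end
    end
  end

  # Read the source sequentially in the calling thread; only uploads run in parallel.
  block_ids.each { |id| queue.push([id, content.read(block_size)]) }
  queue.close
  workers.each(&:join)
  raise errors.pop unless errors.empty?

  # The commit order is fixed by block_ids, not by upload completion order.
  block_ids.map { |id| [id] }
end
```

With something like this in place, the rest of the method (building commit_options, calling commit_blob_blocks with the returned list, then get_blob_properties) could stay unchanged. Reading from the IO stays in one thread because content.read is inherently sequential; the gain would come from overlapping the network calls, since Ruby threads can run concurrently while blocked on I/O.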

katmsft self-assigned this Jan 6, 2021
katmsft added the Feature Request label Jan 6, 2021
Member

katmsft commented Jan 6, 2021

Thanks for reporting. This repository also welcomes community contributions.
