buffer size for multipart s3 downloads #691

Closed
mheilman opened this issue Jun 22, 2016 · 11 comments
Labels
response-requested Waiting on additional information or feedback.

Comments

@mheilman

I noticed recently that for a large download, the awscli (aws s3 cp s3://...) was faster than using boto3.s3.transfer.MultipartDownloader.

After running a few tests of downloading an 8GB file, it looks like the size of the I/O buffer here may have something to do with it. I don't understand why, but making that buffer size larger (e.g., 256KB or 1024KB instead of the current 16KB) seems to improve download speeds consistently for me.

Perhaps that buffer size should be increased, or maybe just made configurable? I don't understand the pros and cons other than that making it larger seems to help for my use case.

Times for downloading an 8GB file from S3 to a g2.2xlarge instance (I just changed the number in the line of code mentioned above):

  • 100 seconds with 1024KB buffer
  • 106 seconds with 256KB buffer
  • 118 seconds with 16KB buffer (current boto3 code)
  • 256 seconds with 4KB buffer

Code for testing:

import time
import logging

import boto3

t0 = time.time()

# Debug logging for the transfer internals; keep botocore itself at INFO to cut noise.
logging.basicConfig(level='DEBUG')
logging.getLogger('botocore').setLevel('INFO')
client = boto3.client('s3')

config = boto3.s3.transfer.TransferConfig(
    multipart_threshold=64 * 1024 * 1024,
    max_concurrency=10,
    num_download_attempts=10,
    multipart_chunksize=16 * 1024 * 1024,
    max_io_queue=10000
)

# config = boto3.s3.transfer.TransferConfig()  # swap in to benchmark the defaults instead

transfer = boto3.s3.transfer.MultipartDownloader(client, config, boto3.s3.transfer.OSUtils())
# Arguments: bucket, key, local filename, object size in bytes, extra args.
transfer.download_file('bucket-name', 'path/to/big/file/foo.npy', 'foo2.npy', 8000000000, {})
print("TIME: {} SECONDS".format(time.time() - t0))

I previously mentioned this here.

@mheilman
Author

mheilman commented Jun 22, 2016

In the benchmarks above, I downloaded the file to an EBS volume, which is perhaps less than ideal since that depends on the network connection too, if I understand correctly. However, I've seen similar differences in performance between boto3 and awscli on local storage on a d2.8xlarge instance. IIRC, the difference was even more pronounced in that case, perhaps because of the 10 Gbps networking of the d2.8xlarge.

@kyleknap
Contributor

This is definitely something you may see if the configurations are not appropriate for the manager. I would really recommend reading this thread and the comment on a similar implementation explaining why this is the case: boto/s3transfer#13 (comment). Based on that discussion, we may need to update the defaults in boto3.

If you do not want to mess with the source code, I would recommend setting a multipart_chunksize such that the following is true:

multipart_chunksize * max_concurrency < 16 KB (the default io chunksize) * max_io_queue

When you bumped the io chunksize to 1024KB, did that make the performance more comparable to the CLI? That is the io chunksize the CLI uses.
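
For concreteness, a quick sanity check of that inequality against a given set of transfer settings might look like the sketch below (the helper name is made up for illustration, and the 16 KB io chunksize is just the default mentioned above):

# Sketch: check the rule of thumb above for a given set of transfer settings.
KB = 1024
MB = 1024 * KB

def io_queue_can_keep_up(multipart_chunksize, max_concurrency,
                         io_chunksize, max_io_queue):
    """True if the IO queue can buffer everything the downloaders can
    have in flight at once (multipart_chunksize * max_concurrency)."""
    return multipart_chunksize * max_concurrency < io_chunksize * max_io_queue

# Settings from the test script at the top of this thread:
print(io_queue_can_keep_up(multipart_chunksize=16 * MB, max_concurrency=10,
                           io_chunksize=16 * KB, max_io_queue=10000))
# 160 MB in flight vs. roughly 156 MB of queue capacity -> False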

@kyleknap kyleknap added the response-requested Waiting on additional information or feedback. label Jun 22, 2016
@mheilman
Author

mheilman commented Jun 23, 2016

Ah, I forgot to mention above how long awscli takes. awscli takes about 90 seconds on the same machine for the same file, so it's still a bit faster than boto3 with the io chunksize bumped to 1024KB, but not by much.

I'll try fiddling around with the multipart_chunksize and/or max_io_queue next...

@mheilman
Author

mheilman commented Jun 23, 2016

Another data point: it took 113 seconds to download the 8GB file with the following settings, where I just bumped up the IO queue size to be way larger than necessary to satisfy the inequality above.

  • max_concurrency=10
  • max_io_queue=1000000000
  • multipart_chunksize=16MB
  • io chunksize=16KB (boto3 default)

@mheilman
Author

I also just tried the following settings, which are the same as the ones I used for the tests at the top of this thread except with a smaller multipart_chunksize, and it took 118 seconds.

  • max_concurrency=10
  • max_io_queue=10000
  • multipart_chunksize=1MB
  • io chunksize=16KB (boto3 default)

@gisjedi

gisjedi commented Jun 23, 2016

I'm encountering an identical problem: I was getting nearly 3 times the performance using the AWS CLI as opposed to boto3. The AWS CLI (aws-cli/1.10.33 botocore/1.4.23) is using the out-of-the-box defaults. I'm using boto3 1.3.1 with all default settings for my TransferConfig. I played with the max_io_queue setting as @mheilman did, with little effect: a 5GiB file downloads in roughly 44 seconds.

Tested as follows:

  • aws-cli - default settings: 15s
  • boto3 - default TransferConfig: 44s
  • boto3 - default TransferConfig and boto3 source file s3/transfer.py buffer_size variable set to 1024 * 256: 16s

I tried all the settings suggested above, focusing on max_io_queue. Even setting it into the tens of millions made no appreciable difference... maybe a second or two. Changing buffer_size in the boto3 source seemed to be the only change that actually made the results consistent with the AWS CLI. I tried buffer sizes from 16KiB all the way up to 64MiB, but settled on 256KiB, as performance deteriorated on both sides of that value.

All my testing was done on an m4.10xlarge instance running the Amazon Linux AMI.
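
For context on what that buffer controls: the downloader reads each part from the S3 streaming response in fixed-size chunks before handing them to the IO queue. A minimal sketch of that kind of read loop, using a plain single-stream get_object call rather than boto3's internal multipart downloader (bucket, key, and filenames are placeholders):

import boto3

# Sketch: read an S3 object in fixed-size chunks, analogous to the
# buffer_size / io chunksize knob discussed in this thread.
client = boto3.client('s3')
chunk_size = 256 * 1024  # 256 KiB, the value settled on above

response = client.get_object(Bucket='bucket-name', Key='path/to/big/file/foo.npy')
body = response['Body']  # botocore StreamingBody

with open('foo2.npy', 'wb') as f:
    while True:
        chunk = body.read(chunk_size)
        if not chunk:
            break
        f.write(chunk)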

@mheilman
Author

I was experiencing nearly 3 times the performance using the AWS CLI as opposed to boto.

@gisjedi, thanks for adding your observations. It's good to know I'm not the only one seeing this. While the differences I've posted above were smaller, I've also seen a similar 3x speed difference between boto3 and awscli on a d2.8xlarge instance with 10 Gbps networking (the g2.2xlarge instance I used for the tests above has maybe 1 Gbps).

@mheilman
Author

fwiw, I tried commenting out the lines that queue up IO chunks here (and putting a pass there), and downloading the 8GB file took about the same amount of time as awscli (86 seconds).

@kyleknap
Contributor

Hmm, it sounds like the theory that the slowness has to do with the io queue is correct.

@jamesls
Member

jamesls commented Jul 28, 2016

Also relevant: #737

@kyleknap
Contributor

kyleknap commented Aug 3, 2016

With the release of boto3 1.4.0, you now have the option to configure both io_chunksize and max_io_queue, so for environments where the network speed is much faster than the io speed you can configure the transfer so that io stops being the bottleneck: https://boto3.readthedocs.io/en/latest/reference/customizations/s3.html#boto3.s3.transfer.TransferConfig

It is important to note that the new defaults should be suitable. The io_chunksize is now 256KB, which seems to be a good default value based on my own testing and testing from others such as @gisjedi. For me, with the current default configuration, boto3 achieves the same download speed as the CLI for large downloads on larger instances.

Closing out the issue, as the defaults should now result in better performance, and the io-related configuration parameters are now exposed so they can be tweaked to make downloads faster if the results with the defaults are still not as desired.
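
A minimal usage sketch of those exposed settings on boto3 >= 1.4.0 (bucket, key, and filenames are placeholders; the values simply mirror the ones discussed in this thread rather than being recommendations):

import boto3
from boto3.s3.transfer import TransferConfig

# Sketch for boto3 >= 1.4.0: tune the io-related settings directly
# instead of editing the library source.
config = TransferConfig(
    multipart_chunksize=16 * 1024 * 1024,  # 16 MB parts
    max_concurrency=10,
    io_chunksize=256 * 1024,               # 256 KB io chunks, per the discussion above
    max_io_queue=100,                      # IO queue depth; raise if io is still the bottleneck
)

s3 = boto3.client('s3')
s3.download_file('bucket-name', 'path/to/big/file/foo.npy', 'foo2.npy', Config=config)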
