
Performance: small blobs


It is inherently harder to achieve high throughput with small blobs (in the KB range), because the fixed per-transaction overhead is large relative to the amount of data each request transfers. The AzCopy team is actively working to improve the user experience in this scenario. This article discusses some ways to tune AzCopy to increase throughput.

Upload and download

Depending on how powerful your machine is, ramp up the AZCOPY_CONCURRENCY_VALUE setting as high as possible without destabilizing your machine or starving other workloads of network bandwidth.
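For example, here is a minimal sketch of raising the concurrency for a small-blob upload; the value shown, the local path, and the account/container/SAS placeholders are illustrative, not recommendations:

    # Raise the number of concurrent connections; the default is derived
    # from the CPU count, so tune this value to your own machine and network.
    export AZCOPY_CONCURRENCY_VALUE=512

    # Hypothetical source directory and destination container.
    azcopy copy "/data/small-blobs" \
        "https://<account>.blob.core.windows.net/<container>?<SAS-token>" \
        --recursive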

To squeeze out more performance, you can also lower the logging level to ERROR with --log-level, which minimizes the time AzCopy spends logging each request and response.
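Continuing the hypothetical upload above, the flag is passed like this:

    # Log only errors to reduce logging overhead during the transfer.
    azcopy copy "/data/small-blobs" \
        "https://<account>.blob.core.windows.net/<container>?<SAS-token>" \
        --recursive \
        --log-level=ERROR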

On some Linux systems, scanning speed can become the bottleneck: the source is not enumerated fast enough to keep all the parallel network connections busy. In that case you can raise AZCOPY_CONCURRENT_SCAN. Please refer to the help message in azcopy env.
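A sketch of checking the variable's description and then raising it (the value 64 is purely illustrative):

    # List the environment variables AzCopy recognizes, with descriptions.
    azcopy env

    # Increase scanning parallelism so enumeration keeps up with transfers.
    export AZCOPY_CONCURRENT_SCAN=64
    azcopy copy "/data/small-blobs" \
        "https://<account>.blob.core.windows.net/<container>?<SAS-token>" \
        --recursive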

Copy

Copying blobs is done on the service side: AzCopy only coordinates the chunks, while the destination Storage service reads the data directly from the source Storage service. Because very little work happens on the client, you can be a lot more aggressive with AZCOPY_CONCURRENCY_VALUE and try setting it to >1000.
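As a sketch, with hypothetical source and destination accounts:

    # Service-side copy: the client only coordinates, so concurrency can be
    # pushed much higher than for uploads or downloads.
    export AZCOPY_CONCURRENCY_VALUE=2000

    azcopy copy \
        "https://<src-account>.blob.core.windows.net/<src-container>?<SAS-token>" \
        "https://<dst-account>.blob.core.windows.net/<dst-container>?<SAS-token>" \
        --recursive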

You can also break the job down and spin up AzCopy on more than one machine/VM, which has proven effective to some extent as well; one way to partition the work is sketched below.
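One possible partitioning scheme, assuming the blobs are organized under virtual directory prefixes (the prefixes here are hypothetical), is to give each machine a disjoint set of prefixes via --include-path:

    # Machine 1: copy only the blobs under the dir1/ and dir2/ prefixes.
    azcopy copy \
        "https://<src-account>.blob.core.windows.net/<src-container>?<SAS-token>" \
        "https://<dst-account>.blob.core.windows.net/<dst-container>?<SAS-token>" \
        --recursive --include-path "dir1;dir2"

    # Machine 2: copy the dir3/ and dir4/ prefixes, and so on.
    azcopy copy \
        "https://<src-account>.blob.core.windows.net/<src-container>?<SAS-token>" \
        "https://<dst-account>.blob.core.windows.net/<dst-container>?<SAS-token>" \
        --recursive --include-path "dir3;dir4"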