High number of calls to GetBlobProperties when running azcopy copy #2737

Open
Pluggi opened this issue Jun 25, 2024 · 3 comments

Pluggi commented Jun 25, 2024

Which version of AzCopy was used?

Note: The version is visible when running AzCopy without any argument
AzCopy 10.22.1

Which platform are you using? (ex: Windows, Mac, Linux)

Linux

What command did you run?

Note: Please remove the SAS to avoid exposing your credentials. If you cannot remember the exact command, please retrieve it from the beginning of the log file.
azcopy copy "https://${SRC}.blob.core.windows.net?${SAS_TOKEN}" "https://${DST}.blob.core.windows.net?${SAS_TOKEN}"

What problem was encountered?

We are seeing a large number of GetBlobProperties calls on our destination Storage Account whenever we run the command.

[Image: 2024-06-25T13-27-54]

Two processes were started at midnight, with one finishing at 6:21AM and the other at 10:38AM.
We would like to understand what these calls are used for and whether they can be avoided, as they incur high costs (we have 15 storage accounts with the same pattern, costing us $100 per day).

Have you found a mitigation/solution?

I suspect the --s2s-preserve-properties flag could be the culprit.
I am going to try disabling it and see what happens.

Author

Pluggi commented Jun 25, 2024

Setting --s2s-preserve-properties=false does not seem to change anything.
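
For anyone reproducing this, a minimal sketch of what that attempt presumably looked like. The account names and token below are placeholders (the report redacts the real values), and the exact invocation used is an assumption. The command is assembled and printed rather than executed, so it can be checked first.

```shell
# Placeholder values (hypothetical); the report redacts the real account names and SAS token.
SRC="${SRC:-sourceaccount}"
DST="${DST:-destaccount}"
SAS_TOKEN="${SAS_TOKEN:-REDACTED}"

# Assemble the arguments first so the command can be inspected before it runs.
set -- azcopy copy \
  "https://${SRC}.blob.core.windows.net?${SAS_TOKEN}" \
  "https://${DST}.blob.core.windows.net?${SAS_TOKEN}" \
  --s2s-preserve-properties=false

# Dry run: print the command shape without echoing the SAS token.
printf '%s %s <source> <destination> %s\n' "$1" "$2" "$5"

# Uncomment to execute for real:
# "$@"
```

Keeping the URLs out of the printed line avoids leaking the SAS token into terminal history or CI logs.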

Collaborator

ashruti-msft commented Jun 26, 2024

Hi, this is the default behaviour of AzCopy, and there is no option to reduce the GetBlobProperties calls.

By default AzCopy uses parallel hierarchical listing for the Blob endpoint in order to speed up the listing process.

To reduce the IOs/cost or optimize for a flat structure, you can choose to disable parallel hierarchical listing by setting the environment variable AZCOPY_DISABLE_HIERARCHICAL_SCAN to true.
You can refer to this for more information.

Please note that doing this will impact performance. If performance is one of your priorities, this is NOT desirable; if you prioritize saving on costs, it can be an option.
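
A minimal sketch of setting the variable on Linux, assuming a POSIX shell. The variable name and value come from the comment above; the sanity check is only an illustration, not an AzCopy feature.

```shell
# Export the variable so child processes started from this shell inherit it.
export AZCOPY_DISABLE_HIERARCHICAL_SCAN=true

# Check that a child process actually sees the value before starting a long copy.
sh -c 'echo "AZCOPY_DISABLE_HIERARCHICAL_SCAN=${AZCOPY_DISABLE_HIERARCHICAL_SCAN}"'

# Then run the copy from the same shell, e.g. the command from the report:
# azcopy copy "https://${SRC}.blob.core.windows.net?${SAS_TOKEN}" "https://${DST}.blob.core.windows.net?${SAS_TOKEN}"
```

If the copy is launched by a scheduler or service rather than an interactive shell, the variable has to be set in that job's environment instead, since an export in one shell does not carry over to unrelated processes.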

ashruti-msft self-assigned this on Jun 26, 2024
Pluggi changed the title from "High calls to GetBlobProperties when running azcopy copy" to "High number of calls to GetBlobProperties when running azcopy copy" on Jun 28, 2024
Author

Pluggi commented Jun 28, 2024

Unfortunately, setting AZCOPY_DISABLE_HIERARCHICAL_SCAN does not seem to have made much of a difference.

I have resorted to using azcopy sync for now, even though it uses a lot more memory.
