azcopy url-decodes slashes from directory names #2753

Open
frank-siebert-tracetronic opened this issue Jul 22, 2024 · 12 comments

@frank-siebert-tracetronic

Which version of the AzCopy was used?

10.24.0

Note: The version is visible when running AzCopy without any argument

Which platform are you using? (ex: Windows, Mac, Linux)

Windows

What command did you run?

azcopy copy "C:/Users/XXX/testproject/test_folder_%5C_something_wrong" "https://xxx.blob.core.windows.net/zzz?sp=racw&st=2024-07-22T10:00:55Z&se=2024-07-22T18:00:55Z&spr=https&sv=2022-11-02&sr=c&sig=abc" --recursive

Note: Please remove the SAS to avoid exposing your credentials. If you cannot remember the exact command, please retrieve it from the beginning of the log file.

What problem was encountered?

Some of my directories have backslashes in their names (they originally come from Unix systems), so the names are URL-encoded on my disk, hence the %5C in the directory name.
I want to upload them exactly as they are, with no decoding, so that the literal "%5C" stays in the directory name.
Instead, the %5C is replaced with a slash, which is then interpreted as a path separator, so in the storage I get a subdirectory named "_something_wrong".

Expected: storageSource/test_folder_%5C_something_wrong
Actual: storageSource/test_folder_/_something_wrong
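
The transformation described above can be reproduced outside AzCopy with a few lines of shell. This is only a sketch of the observed effect, assuming the name is first percent-decoded and the resulting backslash is then normalised to a forward slash; it is not AzCopy's actual implementation, and the variable names are made up for illustration.

```shell
# Illustration only: mimics the reported effect, not AzCopy's real code path.
name='test_folder_%5C_something_wrong'

# Step 1 (assumed): percent-decode %5C into a literal backslash.
decoded=$(printf '%s' "$name" | sed 's/%5C/\\/g')

# Step 2 (assumed): treat the backslash as a path separator and
# normalise it to the forward slash used in blob names.
blob_path=$(printf '%s' "$decoded" | tr '\\' '/')

printf '%s\n' "$blob_path"
```

Under these assumptions the single directory name ends up split into "test_folder_" and "_something_wrong", which matches the "Actual" path reported above.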

I also tried double URL encoding by renaming the directory to use %255C, but that does not help: %255C is not decoded at all and is uploaded as-is, so it is not a solution either.

How can we reproduce the problem in the simplest way?

Create a simple folder locally with a %5C in the directory name and transfer it to an arbitrary storage container via azcopy copy.

Have you found a mitigation/solution?

No

@ashruti-msft
Collaborator

Hi, I was not able to reproduce this issue. Test scenario: I created a local folder named "new_%5C_thing" and copied it to storage using azcopy. Storage showed a folder created with the name "new_%5C_thing". Please let me know whether I understood your issue correctly. Thanks.

ashruti-msft self-assigned this Jul 24, 2024

@ashruti-msft
Collaborator

Okay I can reproduce this on windows not on linux. I'll get back to you regarding this.

@frank-siebert-tracetronic
Author

> Okay I can reproduce this on windows not on linux. I'll get back to you regarding this.

Hi, are there any updates on this issue?

@ashruti-msft
Collaborator

Hi, can you please share your debug logs for this run. Thanks!

@frank-siebert-tracetronic
Author

frank-siebert-tracetronic commented Aug 2, 2024

> Hi, can you please share your debug logs for this run. Thanks!

Well, those are corporate specific logs, so this is unfortunately not possible. I thought you could reproduce this. What part of the logs do you need?

@Data-Engineer-Joe

Hello @ashruti-msft, you said you can reproduce this issue. What's the next step here? This behaviour hurts us a lot because it breaks our Hive partitioning. Please follow up on this. I hope there will be a fix. Thanks.

@jschelter-co

Hello @ashruti-msft. I'd also like to ask if you've had the chance to look more closely into this issue. It really is a major blocker on our side. Any feedback is highly appreciated. Thanks a lot.

@Data-Engineer-Joe

Hello @ashruti-msft, could you please leave a comment on your progress regarding this issue?

@ashruti-msft
Collaborator

Hi @Data-Engineer-Joe I'll discuss this issue with the team and get back to you.

@Data-Engineer-Joe

Hello @ashruti-msft and @adreed-msft, do you have any news on the issue? It's really a major problem for us! Please let me know what you plan to do, or whether you can offer a workaround. Thank you.

@ashruti-msft
Collaborator

Hi @Data-Engineer-Joe, we are actively working on this issue and will fix it in our next release. Thanks!

@dphulkar-msft
Collaborator

dphulkar-msft commented Sep 23, 2024

Hi @Data-Engineer-Joe,
Can you please try using this flag while running the command? --disable-auto-decoding=true
The default value is false, which enables automatic decoding of illegal characters on Windows. Set it to true to disable automatic decoding.
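
For reference, the original command with the suggested flag appended might look like the sketch below. It assumes an AzCopy build that includes `--disable-auto-decoding`; the source path and container URL are the placeholders from the original report, `<SAS>` stands in for a real SAS token, and the variable names are only illustrative. The script prints the command for review rather than running it.

```shell
# Placeholders taken from the original report; substitute your own values.
SRC='C:/Users/XXX/testproject/test_folder_%5C_something_wrong'
DST='https://xxx.blob.core.windows.net/zzz?<SAS>'

# Flag suggested in this thread: keep %5C as-is instead of decoding it.
FLAG='--disable-auto-decoding=true'

# Assemble the full command line and print it for inspection.
CMD="azcopy copy \"$SRC\" \"$DST\" --recursive $FLAG"
printf '%s\n' "$CMD"
```

Paste the printed line into a terminal once the placeholders have been replaced.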
