@kylebarron
Member

@kylebarron kylebarron commented Jun 2, 2025

Which issue does this PR close?

Rationale for this change

Is there a spec for each of these URL formats? In my usage of Azure so far, URLs under blob.core.windows.net always seem to have the container as the first element of the URL path. Is that true?

Please review the tests updated in this PR and confirm that you agree with the edited versions.

What changes are included in this PR?

Fixed parsing paths from Azure URLs.

Are there any user-facing changes?

Yes. Whether this counts as a bug fix or a breaking change is open to discussion.

@kylebarron
Member Author

The failing emulator tests seem unrelated?

@alamb
Contributor

alamb commented Jun 5, 2025

Here is a proposed fix for the emulator tests:

Contributor

@alamb alamb left a comment

Thank you @kylebarron

This seems to do what it is designed to do. I don't know much about Azure.

Is the path format documented anywhere that I can double check?

@kylebarron
Member Author

I also don't know much about Azure; I've only started working with it recently.

I don't know where these URL patterns are documented; it's just my understanding that the first part of the path is the container name.

Contributor

@alamb alamb left a comment

Thanks @kylebarron

@kylebarron
Member Author

> Is the path format documented anywhere that I can double check?

Looks like here? https://learn.microsoft.com/en-us/rest/api/storageservices/naming-and-referencing-containers--blobs--and-metadata#resource-uri-syntax

> Each resource has a corresponding base URI, which refers to the resource itself.
>
> For the storage account, the base URI includes the name of the account only:
>
> https://myaccount.blob.core.windows.net
>
> For a container, the base URI includes the name of the account and the name of the container:
>
> https://myaccount.blob.core.windows.net/mycontainer
>
> For a blob, the base URI includes the name of the account, the name of the container, and the name of the blob:
>
> https://myaccount.blob.core.windows.net/mycontainer/myblob

There is also a section about a root container that I'm not familiar with.
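
For reference, here is a minimal sketch of how the resource URI syntax quoted above splits into account, container, and blob name. It is not the code from this PR and not the `object_store` implementation: the `AzureBlobUrl` type and `split_azure_blob_url` function are hypothetical names made up for illustration, it uses only the standard library, and it assumes an `https://<account>.blob.core.windows.net/...` URL with no query string, no percent-decoding, and no root container handling.

```rust
/// Parts of an Azure Blob Storage URL, per the resource URI syntax quoted above.
/// Hypothetical type for illustration only; not part of `object_store`.
#[derive(Debug, PartialEq)]
pub struct AzureBlobUrl {
    pub account: String,
    pub container: String,
    /// Blob name within the container; empty for a container-level URL.
    pub blob: String,
}

/// Splits `https://<account>.blob.core.windows.net/<container>/<blob>`.
///
/// Returns `None` for any other shape, including account-only URLs.
/// Assumptions: no query string, no percent-decoding, no root container.
pub fn split_azure_blob_url(url: &str) -> Option<AzureBlobUrl> {
    let rest = url.strip_prefix("https://")?;

    // The host is everything up to the first '/'; the path is what follows.
    let (host, path) = match rest.split_once('/') {
        Some((host, path)) => (host, path),
        None => (rest, ""),
    };

    // The account name is the host label in front of the blob endpoint suffix.
    let account = host.strip_suffix(".blob.core.windows.net")?;
    if account.is_empty() {
        return None;
    }

    // The first path segment is the container; the remainder is the blob name.
    let (container, blob) = match path.split_once('/') {
        Some((container, blob)) => (container, blob),
        None => (path, ""),
    };
    if container.is_empty() {
        return None;
    }

    Some(AzureBlobUrl {
        account: account.to_string(),
        container: container.to_string(),
        blob: blob.to_string(),
    })
}
```

Under these assumptions, `split_azure_blob_url("https://myaccount.blob.core.windows.net/mycontainer/myblob")` yields account `myaccount`, container `mycontainer`, and blob `myblob`, so the path relative to the container no longer includes the container name.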

@alamb alamb merged commit f422dce into apache:main Jun 16, 2025
8 checks passed
@alamb
Contributor

alamb commented Jun 16, 2025

Thanks again @kylebarron

Development

Successfully merging this pull request may close these issues.

Incorrect prefix in ObjectStoreScheme::parse for Azure HTTP urls

2 participants