Use reconstructed ListBlobs marker to provide list offset support in MicrosoftAzure store #6174
base: master
Are there any official SDKs that implement this? That would give some measure of confidence that this is both correct and likely to remain well supported. I also wonder which of Azure's three blob storage "flavors" support this; it would be very out of character for them to actually be consistent... FYI @alexwilcoxson-rel, who filed the related #5653 but was unable to get it to work.
@tustvold Well, to be fair, I was not even aware that Hadoop was also relying on this (thanks for the pointer), so no, I have no knowledge of official SDKs that implement this hack haha, and you are probably right that this breaks under hierarchical namespaces. That is why my goal was to introduce it under an experimental/unstable flag, to shift the responsibility of relying on this to the user (in this case: me) while still allowing for the functionality. This is something I would keep as a patch on my side, but I thought that others could potentially benefit from it too if they know what they are doing : ) I can totally understand if this is deemed too hacky (even under a flag).
Yeah, before we went down the road of an impl in object_store, we tried doing what the Hadoop code is doing via Python against the different blob APIs (blob.core.windows.net and dfs.core.windows.net). Also, the Hadoop code does differentiate between hierarchical namespace and not. Furthermore, we have been in talks with Microsoft and the Azure Storage team about this. Will follow up once they get back to us about whether this will be documented and supported rather than requiring this type of workaround.
That will be great for the entire community. Thanks in advance.
Hey, one of my customers believes this could potentially unblock a few challenges on their side. May I know whom you are talking to within MS, on both the SDK and Storage engineering sides?
I'm going to mark this as a draft whilst we wait to hear back from MS about first-party support for this.
Last I heard, they are still working on making the private preview with this feature available.
Latest update: I have tested list with a startFrom parameter against a private preview storage account Microsoft provided us. However, it won't be generally available until sometime next year. In the meantime, they once again directed us towards what Hadoop is doing. This workaround only works against dfs.core.windows.net and associated endpoints. I tested it against a patched version of OpenDAL's azdls (which has an impl of the ObjectStore trait) with success: apache/opendal#5242
That's really nice. Happy to know.
Thank you for this.
Which issue does this PR close?
Closes #6173.
Rationale for this change
The opaque token provided as a marker to ListBlobs is a trivial encoding of the key to start listing from (see the marker_for_offset code comment). In this PR we propose an experimental feature flag that implements offset behavior for listing by relying on this fact.
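
To make the intended offset behavior concrete, here is a minimal sketch of the two strategies in plain Rust, with no dependency on object_store. Both function names are hypothetical and the key list stands in for a real listing. It assumes an exclusive offset (only keys sorting strictly after it are returned) and that the service returns keys in lexicographic order. It does not reproduce the marker encoding itself, which is described only in the marker_for_offset code comment.

```rust
/// Hypothetical helper (not an object_store API): client-side offset.
/// Every key is still fetched from the store; keys at or before `offset`
/// are simply dropped afterwards.
fn list_after_offset<'a>(keys: &[&'a str], offset: &str) -> Vec<&'a str> {
    keys.iter().copied().filter(|key| *key > offset).collect()
}

/// Hypothetical helper (not an object_store API): models a listing that
/// resumes right after `offset`, which is what a reconstructed marker is
/// meant to achieve on the server side. Assumes `sorted_keys` is in
/// lexicographic order.
fn resume_sorted_listing<'a>(sorted_keys: &'a [&'a str], offset: &str) -> &'a [&'a str] {
    // Index of the first key that sorts strictly after `offset`.
    let start = sorted_keys.partition_point(|key| *key <= offset);
    &sorted_keys[start..]
}
```

Both helpers return the same keys for a sorted input; the difference is that the first one still pays for listing everything before the offset, which is the cost a marker-based offset would avoid.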