HADOOP-19187: [ABFS][FNSOverBlob] AbfsClient Refactoring to Support Multiple Implementation of Clients. #6879
...tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystemStore.java
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
@anujmodi2021 Please update the PR subject line and description to reflect your changes. Thanks!
Thank you @rakeshadr for reviewing the PR.
steveloughran
left a comment
Commented. Nothing major, as this is still the initial PR where you have pushed methods down into AbfsDfsClient. As usual, I'd like Javadocs on all the stuff which is now visible; this is to help the people that come after us.
Assuming this work is going to be a series of PRs and will not be ready for use at all until they're all in, what do you think about me creating a feature branch for the HADOOP-19179 work?
Long-lived feature branches can be a problem on their own, but because they're isolated they can be rebased. It just seems to me that right now this configuration is incomplete.
If you do want to merge immediately, then add a test to see what happens when you try to instantiate a client where fs.azure.fns.account.service.type = blob. I expect things to fail right now, so make sure that it fails meaningfully, with a test to prove it.
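A minimal sketch of what "failing meaningfully" could look like, assuming a plain map stands in for the Hadoop configuration. `ServiceTypeCheck`, the choice of `UnsupportedOperationException`, and the `dfs` default are hypothetical and not taken from the patch; only the configuration key comes from the comment above.

```java
import java.util.Map;

// Hypothetical helper, not part of hadoop-azure; it only illustrates
// failing fast with a clear message when the blob service type is requested.
final class ServiceTypeCheck {
    // Configuration key quoted in the review comment.
    static final String FNS_SERVICE_TYPE_KEY = "fs.azure.fns.account.service.type";

    // Throws if the blob service type is configured.
    // Defaulting to "dfs" when the key is unset is an assumption.
    static void validate(Map<String, String> conf) {
        String type = conf.getOrDefault(FNS_SERVICE_TYPE_KEY, "dfs");
        if ("blob".equalsIgnoreCase(type.trim())) {
            throw new UnsupportedOperationException(
                "Blob endpoint is not supported yet: "
                    + FNS_SERVICE_TYPE_KEY + "=" + type);
        }
    }

    // Test-style helper: returns the failure message, or null if validation passed.
    static String failureMessage(Map<String, String> conf) {
        try {
            validate(conf);
            return null;
        } catch (UnsupportedOperationException e) {
            return e.getMessage();
        }
    }
}
```

A test along these lines would assert that the message names the offending key, so a user who sets the blob service type too early sees why initialization was rejected.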
...ools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/AbfsServiceType.java
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
| 1. Account Type: Must be set to `false` to indicate FNS Account | ||
| ```xml | ||
| <property> | ||
| <name>fs.azure.account.hns.enabled</name> |
How do you mix service types on a single client? Please give an example.
We won't be mixing service types for a single client. In case we need to move traffic from one service type to another, we will switch the AbfsClient used for making REST operation calls. The plan is to have both AbfsDfsClient and AbfsBlobClient initialised in AbfsClientHandler and to use whichever one is needed at that point in the code flow.
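The handler arrangement described in this reply can be sketched roughly as follows. All names here (`SketchServiceType`, `SketchClient`, `SketchClientHandler`, and so on) are hypothetical stand-ins rather than the real hadoop-azure classes, and the method shapes are assumptions made for illustration only.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-in for the service type; not the real AbfsServiceType.
enum SketchServiceType { DFS, BLOB }

// Hypothetical stand-in for the abstract client: one operation per endpoint flavour.
abstract class SketchClient {
    abstract SketchServiceType serviceType();
    abstract String createPath(String path);
}

final class SketchDfsClient extends SketchClient {
    @Override SketchServiceType serviceType() { return SketchServiceType.DFS; }
    @Override String createPath(String path) { return "DFS create " + path; }
}

final class SketchBlobClient extends SketchClient {
    @Override SketchServiceType serviceType() { return SketchServiceType.BLOB; }
    @Override String createPath(String path) { return "BLOB create " + path; }
}

// Holds one client per service type and hands out the one needed by the caller.
final class SketchClientHandler {
    private final Map<SketchServiceType, SketchClient> clients =
        new EnumMap<>(SketchServiceType.class);
    private final SketchServiceType defaultType;

    SketchClientHandler(SketchServiceType defaultType) {
        this.defaultType = defaultType;
        clients.put(SketchServiceType.DFS, new SketchDfsClient());
        clients.put(SketchServiceType.BLOB, new SketchBlobClient());
    }

    // Client for the configured service type.
    SketchClient getClient() { return clients.get(defaultType); }

    // Client for an explicitly requested service type, e.g. when switching traffic.
    SketchClient getClient(SketchServiceType type) { return clients.get(type); }
}
```

In this sketch each client speaks to exactly one service type; callers that need the other endpoint ask the handler for a different client instead of reconfiguring the one they hold.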
...azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemInitAndCreate.java
| HNS Enabled accounts will still use DFS Endpoint which continues to be the | ||
| recommended stack based on performance and feature capabilities. | ||
| ## Configuring ABFS Driver for FNS Accounts |
Add a section for explicit configuration of HNS accounts.
That's a good point to improve our docs, but I was thinking this site page is supposed to talk about FNS accounts only.
Do you suggest we add another page for HNS accounts? Something like hns_dfs.md and talk about how to optimally configure HNS accounts with ABFS driver?
rakeshadr
left a comment
+1 LGTM. Thanks @anujmodi2021 for the contribution!
Hi @steveloughran
steveloughran
left a comment
+1 for merging into trunk.
I would still like to see a good story for running, say, distcp or Spark where the storage types are different for different URLs. distcp would be a good one to test, as it can be done within this module.
+1 on this... We have planned both scale and functional testing of this new support for FNS accounts. We will also update our test script to run the whole test suite on different combinations of account type and endpoint. This includes distcp as well... Once we have the different modules checked in, we will share a testing report before enabling users to configure the blob endpoint. Thanks for merging this PR.
…ultiple Implementation of Clients. (apache#6879) Refactor AbfsClient into DFS and Blob Client. Contributed by Anuj Modi
Description of PR
This is the first PR in the series of work done under Parent Jira: HADOOP-19179
Jira for this Patch: https://issues.apache.org/jira/browse/HADOOP-19187
The scope of this task is to refactor AbfsClient so that multiple implementations of AbfsClient can exist and ABFSStore can choose which client to interact with based on the endpoint configured by the user.
The blob endpoint support will remain "Unsupported" until the whole code is checked-in and well tested.
Production Code Changes
AbfsConfiguration.java: New configurations are defined which will allow users to choose the endpoint they want the filesystem to interact with. Details on how to use these configurations are added in fnsBlob.md.
AzureBlobFileSystem.java will have additional checks to verify that the configured endpoint is valid for the account type used by the user. While the Blob Endpoint support is under development, it won't allow users to configure the Blob Endpoint for any account type.
AbfsClient.java will now become an abstract class, with all the current REST operation implementations moved to its child AbfsDfsClient, which implements the APIs of the DFS endpoint.
AbfsClientHandler.java will hold both AbfsDfsClient and AbfsBlobClient instances and will return one of them based on the service type configured or required for any fallback scenario.
AzureBlobFileSystemStore.java will have the capability to identify the service type configured by the user, and based on that it can choose the type of AbfsClient to use. Instead of a client, it will hold on to an AbfsClientHandler object. Wherever needed, it will rely on AbfsClientHandler to return the correct client to be used for a file system operation.
Test Code Changes
AbfsBlobClient itself at this point. Follow-up PRs on FnsOverBlob support will add new tests once the whole implementation is complete. Until then, users won't be allowed to configure the blob endpoint. A test to verify that FS init will fail with the Blob Endpoint has been added.
How was this patch tested?
The existing test suite was run on the DFS Endpoint only. All the failures reported are known.
Metric-related tests are fixed in #6847