HDFS-16283. RBF: reducing the load of renewLease() RPC #4524
ayushtkn merged 6 commits into apache:trunk
|
💔 -1 overall
This message was automatically generated. |
ayushtkn
left a comment
Makes sense to me to have calls only to specified namespaces. Had a quick look, have dropped some comments.
| public void renewLease(String clientName) throws IOException { | ||
| public void renewLease(String clientName, String nsIdentifies) | ||
| throws IOException { | ||
| checkNNStartup(); | ||
| // just ignore nsIdentifies |
Better to check that it is null, so that a user cannot accidentally pass a value to the NameNode and believe it is being honoured.
| } | ||
|
|
||
| /** | ||
| * Get all nsIdentifies of DFSOutputStreams. |
"Identifies" in the method names and arguments doesn't make sense. Can we change it to NsIdentifiers? I am also fine with just namespaces/namespace.
| if (nsIdentify != null && !nsIdentify.isEmpty()) { | ||
| allNSIdentifies.add(nsIdentify); |
In which case can it be null or empty?
One case I can think of is when the Router is at an older version than the client, i.e. the Router doesn't have this change but the client has been upgraded.
I think that scenario should be handled: if any of the identifiers is null or empty, pass null or similar to the Router and make sure the old behaviour of sending the RPC to all namespaces stays intact.
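The fallback asked for in this comment could look roughly like the following on the client side. This is a minimal sketch, not the patch's code: `RenewLeaseNamespaces`, `collectNamespaces` and its input list are hypothetical names, standing in for whatever the client records for each open output stream. If any stream has no namespace (for example, because an older Router did not return one), the helper returns null so the Router keeps its old behaviour of renewing on all namespaces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class RenewLeaseNamespaces {
  private RenewLeaseNamespaces() {}

  /**
   * Hypothetical helper: gathers the namespace of every open stream.
   * Returns null as soon as one namespace is unknown, which signals
   * "renew on all namespaces" (the pre-patch behaviour).
   */
  static List<String> collectNamespaces(List<String> perStreamNamespaces) {
    Set<String> result = new LinkedHashSet<>();
    for (String ns : perStreamNamespaces) {
      if (ns == null || ns.isEmpty()) {
        return null; // unknown namespace: fall back to the full fan-out
      }
      result.add(ns); // the set removes duplicates across streams
    }
    return new ArrayList<>(result);
  }

  public static void main(String[] args) {
    System.out.println(collectNamespaces(Arrays.asList("ns0", "ns1", "ns0")));
    System.out.println(collectNamespaces(Arrays.asList("ns0", "")));
  }
}
```

Returning null rather than a partial list is the conservative choice here: a partial list would silently skip the namespace of the stream whose identifier is missing.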
| */ | ||
| @Idempotent | ||
| void renewLease(String clientName) throws IOException; | ||
| void renewLease(String clientName, String allNSIdentifies) throws IOException; |
Add detail about the new argument in the javadoc as well
| if (nsIdentifies == null || nsIdentifies.isEmpty()) { | ||
| return new ArrayList<>(namenodeResolver.getNamespaces()); | ||
| } | ||
| String[] nsIdList = nsIdentifies.split(","); |
First we join the strings on the client, then we split them here. Can't we pass an array/set/list, whichever is possible, and get rid of the join and split overhead during the call?
| namespaceInfo = new FederationNamespaceInfo("", "", nsId); | ||
| nsNameSpaceInfoCache.put(nsId, namespaceInfo); |
I didn't follow the logic of creating a new FederationNamespaceInfo here. You have a cached map, which is empty. You do a get, it returns null, you enter the if block and explicitly create this. Why aren't we initialising the cached map from namenodeResolver.getNamespaces()? Or, if we don't find the entry in the cached map, why don't we try to find it from namenodeResolver.getNamespaces()?
| if (nss.size() == 1) { | ||
| rpcClient.invokeSingle(nss.get(0).getNameserviceId(), method); |
nsId is passed from the client. If we get an array or similar, you can figure out up front whether there is only one entry, so you can get rid of getRewLeaseNSs(nsIdentifies) completely in that case?
| fsDataOutputStream0.close(); | ||
| fsDataOutputStream1.close(); |
Use either finally or try-with-resources for close.
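As an illustration of the try-with-resources option, here is a minimal sketch. `TrackedStream` and `CloseStreamsExample` are hypothetical stand-ins for the test's FSDataOutputStream instances, used only so the example runs without a cluster; the point is that both streams are closed even when a write throws.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical stand-in for an output stream; records whether it was closed. */
final class TrackedStream implements Closeable {
  boolean closed;

  void write(String data) throws IOException {
    if (data == null) {
      throw new IOException("nothing to write");
    }
  }

  @Override
  public void close() {
    closed = true;
  }
}

final class CloseStreamsExample {
  private CloseStreamsExample() {}

  static void writeBoth(TrackedStream s0, TrackedStream s1, String data)
      throws IOException {
    // Both resources are closed (in reverse order) when the block exits,
    // whether it completes normally or a write throws.
    try (TrackedStream out0 = s0; TrackedStream out1 = s1) {
      out0.write(data);
      out1.write(data);
    }
  }

  public static void main(String[] args) throws IOException {
    TrackedStream a = new TrackedStream();
    TrackedStream b = new TrackedStream();
    writeBoth(a, b, "test");
    System.out.println(a.closed && b.closed);
  }
}
```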
| FSDataOutputStream fsDataOutputStream0 = routerFS.create(newTestPath0); | ||
| FSDataOutputStream fsDataOutputStream1 = routerFS.create(newTestPath1); |
Does this code also cover the append flow?
| dfsRouterFS.getClient().getLeaseRenewer().interruptAndJoin(); | ||
|
|
||
| Path testPath = new Path("/testRenewLease0/test.txt"); | ||
| FSDataOutputStream fsDataOutputStream = routerFS.create(testPath); |
Test with both replicated and erasure-coded files.
|
Thanks @ayushtkn for your review, I learned a lot from it. Thank you again. Because the |
Thanks @ZanderXu for the update. I have dropped some comments, please check them. There may be some checkstyle warnings from Jenkins as well.
The last build also shows some test failures, and I think they look related, so please check those too.
Apart from that, things look good.
| * the last call to renewLease(), the NameNode assumes the | ||
| * client has died. | ||
| * | ||
| * @param namespaces The full Namespace list that the release rpc |
Seems like a typo: release -> renewLease.
| throws IOException { | ||
| if (namespaces != null && namespaces.size() > 0) { | ||
| LOG.warn("namespaces({}) should be null or empty " | ||
| + "on NameNode side, please check it.", namespaces); |
Throw an exception here. We don't expect namespaces here, and we don't want to silently ignore such an occurrence either.
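A minimal sketch of the stricter check requested here, replacing the warning with a thrown exception. `NameNodeRenewLeaseCheck` and `checkNamespaces` are hypothetical names, and the choice of IOException is an assumption; the merged patch may use a different exception type or message.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

final class NameNodeRenewLeaseCheck {
  private NameNodeRenewLeaseCheck() {}

  /**
   * Hypothetical NameNode-side guard: a NameNode serves a single namespace,
   * so a non-empty namespace list means the caller is misusing the argument.
   */
  static void checkNamespaces(List<String> namespaces) throws IOException {
    if (namespaces != null && !namespaces.isEmpty()) {
      throw new IOException(
          "namespaces(" + namespaces + ") should be null or empty");
    }
  }

  public static void main(String[] args) {
    try {
      checkNamespaces(Arrays.asList("ns0"));
    } catch (IOException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```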
| } | ||
|
|
||
| @Test | ||
| public void testRewnewLease() throws Exception { |
This test has become a little big. Can we split create and append into separate tests? The common parts can be extracted into a util method and reused.
| if (ret instanceof LastBlockWithStatus) { | ||
| ((LastBlockWithStatus) ret).getFileStatus().setNamespace(ns); | ||
| } |
Is this for append? If so, I don't think we should do this for every other API; we should restrict our changes to the append code only.
Check whether changing the append code in RouterClientProtocol helps:
@Override
public LastBlockWithStatus append(String src, final String clientName,
    final EnumSetWritable<CreateFlag> flag) throws IOException {
  rpcServer.checkOperation(NameNode.OperationCategory.WRITE);
  List<RemoteLocation> locations = rpcServer.getLocationsForPath(src, true);
  RemoteMethod method = new RemoteMethod("append",
      new Class<?>[] {String.class, String.class, EnumSetWritable.class},
      new RemoteParam(), clientName, flag);
  RemoteResult result = rpcClient
      .invokeSequential(method, locations, LastBlockWithStatus.class, null);
  LastBlockWithStatus lbws = (LastBlockWithStatus) result.getResult();
  lbws.getFileStatus().setNamespace(result.getLocation().getNameserviceId());
  return lbws;
}
| Map<String, FederationNamespaceInfo> allAvailableNamespaces = | ||
| getAvailableNamespaces(); |
We should have some caching here, for example:
Initialise availableNamespace up front and check against it on every call. If an entry isn't found in the stored/cached availableNamespace, call getAvailableNamespaces() and update availableNamespace.
If we still don't find the entry after that, we can return all the namespaces, as we do now.
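The caching idea in this comment can be sketched as follows. This is only an illustration under stated assumptions: `NamespaceCache` and `resolve` are hypothetical names, namespaces are reduced to plain strings, and the `Supplier` stands in for the Router's getAvailableNamespaces() call.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

final class NamespaceCache {
  // Stands in for getAvailableNamespaces(); assumed to be the expensive call.
  private final Supplier<Set<String>> loader;
  private volatile Set<String> available;

  NamespaceCache(Supplier<Set<String>> loader) {
    this.loader = loader;
    this.available = new HashSet<>(loader.get()); // initialise once up front
  }

  /** Returns the namespaces a renewLease call should be forwarded to. */
  List<String> resolve(List<String> requested) {
    if (requested != null && !requested.isEmpty()) {
      if (available.containsAll(requested)) {
        return new ArrayList<>(requested); // cache hit, no reload
      }
      available = new HashSet<>(loader.get()); // miss: refresh the cache once
      if (available.containsAll(requested)) {
        return new ArrayList<>(requested);
      }
    }
    // Nothing requested, or still unknown after the refresh: use all namespaces.
    return new ArrayList<>(available);
  }
}
```

In this sketch the null/empty case is served from the cached set, so a real implementation would also need to decide how that set is kept fresh when namespaces are added or removed.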
|
💔 -1 overall
This message was automatically generated. |
|
Thanks @ayushtkn for your good idea; I have updated the patch. I would appreciate your help with the caching, thanks. |
|
Hmm, the caching could perhaps be a follow-up after this; it might be tricky but doable. You missed a couple of comments; the rest looks almost good to me. @goiri / @Hexiaoqiao, mind giving it an additional check? |
|
Thanks @ayushtkn for your review and ideas. |
|
💔 -1 overall
This message was automatically generated. |
|
@ZanderXu @ayushtkn, thanks for your great work here. After a quick glance, this seems to be one solution to improve renewLease for RBF. |
|
Thanks @Hexiaoqiao for your solution. In the beginning, we tried carrying the paths being written to RBF to fix this issue. After running it for a while, I found some cases that also need to be fixed:
Also, the number of renewLease requests between the client and RBF will increase, depending on the number of files being written at the same time. |
|
Thanks for the quick response.
In my experience, the cost of splitting renewLease by path stays under control even for long-running applications, such as Flink applications. (I have not observed that many files being written concurrently; it would be helpful if anyone could offer such cases.)
For both create and renewLease (with file path), I think they will apply the same MountTableResolver for the same file, so it does not seem to be an issue for renewLease. Maybe there is some corner case I did not catch. Please correct me if I missed something.
Yes, that is true. I totally agree. Based on my internal production cluster, it will be less than a 5% increase. |
Although they all use MountTableResolver, the create and append RPCs can get the NS that the file belongs to, whereas renewLease can only obtain the full set of NSs on which the file is mounted. So in this case, the renewLease RPC is always forwarded to some unnecessary nameservices. |
|
🎊 +1 overall
This message was automatically generated. |
Exactly true. For MultipleDestinationMount, it could be forwarded to a different NS when the request carries only the file path, especially for DestinationOrder.RANDOM and related orders. cc @ayushtkn @goiri Any more feedback here? Thanks. |
|
@Hexiaoqiao I am OK with using the path, but if I understand correctly, the only gain from using the path is that we won't be exposing the namespaces to the end client? In exchange, with namespaces we save, I think, a bunch of RPCs, especially in the case of multi-destination mount points. So from the performance point of view, it might be better with namespaces (the present approach). But I don't have any strong objections if you feel we shouldn't expose the namespaces to the end client. If there is a particular use case where we shouldn't expose namespaces to the end client, we may hide this change behind a config, and this optimisation won't work in that case. In general, though, ViewFs also knows about all namespaces, and a lot of clients usually have these namespaces defined in their configs too, so that is not a big secret. This namespace info will also be there in the back end, and I don't think exposing it via this route can cause any security issue. But I am OK with whichever approach you folks feel is better. |
|
@ayushtkn My proposal to use the path as a parameter of renewLease is not related to any security issue. In my opinion, having both namespaces and the router name on the client side would be confusing and hard to read, without other strong supporting points.
As mentioned above, for MultipleDestinationMount it will be difficult to reduce requests to NameNode at Router side. (I am limited by my internal case where no MultipleDestination with DestinationOrder.RANDOM hash configured.) |
ayushtkn
left a comment
Thanx @Hexiaoqiao for the details. Makes sense :-)
Changes LGTM.
Will hold for @Hexiaoqiao to have a final look before we conclude this.
| + ") should be null or empty"); | ||
| } | ||
| checkNNStartup(); | ||
| // just ignore nsIdentifies |
remove this line or change it to // Ignore the namespaces.
copy, I will fix it.
Hexiaoqiao
left a comment
@ZanderXu It almost looks good to me. I just left some nit comments, FYI. I will give my +1 once they are fixed. Thanks again.
| /** | ||
| * Try to get a list of FederationNamespaceInfo for renewLease RPC. | ||
| */ | ||
| private List<FederationNamespaceInfo> getRewLeaseNSs(List<String> namespaces) |
This method name should be getRenewLeaseNSs?
| getAvailableNamespaces(); | ||
| for (String namespace : namespaces) { | ||
| if (!allAvailableNamespaces.containsKey(namespace)) { | ||
| return new ArrayList<>(namenodeResolver.getNamespaces()); |
We should use result directly rather than create another ArrayList again here?
namenodeResolver.getNamespaces() is a HashSet; I want a List so that we can use the invokeSingle method to forward this RPC when there is only one namespace.
List<FederationNamespaceInfo> nss = getRenewLeaseNSs(namespaces);
if (nss.size() == 1) {
  rpcClient.invokeSingle(nss.get(0).getNameserviceId(), method);
} else {
  rpcClient.invokeConcurrent(nss, method, false, false);
}
Of course, Set can also achieve this goal.
Got it. Makes sense to me.
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
@ayushtkn @Hexiaoqiao Thanks for your discussion and review. I will continue to work hard to submit more patches to the community. |
… Contributed by ZanderXu. Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org> Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
…3.2-bzl-hdfs-merge' HDFS-16283. RBF: reducing the load of renewLease() RPC (apache#4524). See merge request dap/hadoop!79
… Contributed by ZanderXu. Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org> Signed-off-by: Ayush Saxena <ayushsaxena@apache.org> With part of HDFS-15535. ACLOVERRIDE
Description of PR
HDFS-16283: RBF: improve renewLease() to call only a specific NameNode rather than make fan-out calls
Currently RBF forwards the renewLease() RPC to all available nameservices, so forwarding efficiency is affected by unhealthy downstream nameservices. As more NSs are monitored by RBF, this problem becomes more and more serious.
In our prod cluster there are 70+ nameservices, and the renewLease() RPC is often blocked.
This patch fixes this problem and works well on our cluster. The main ideas are: