Skip to content

Conversation

@runzhiwang
Copy link
Contributor

@runzhiwang runzhiwang commented Jun 6, 2020

What changes were proposed in this pull request?

What's the problem ?
I start a ozone cluster with 1000 datanodes and 10 s3gateway, and run two weeks with heavy workload, and perf om and scm.

  1. From om perf, getServiceList cost 63.75% cpu.
    image

2.From scm perf, queryNode come from om::getServiceList cost 33.20% cpu
image

What's the reason ?
Now s3g create a client for each request (we should improve it in the future). when create each RpcClient, s3g will call ozoneManagerClient.getServiceInfo(), getServiceInfo will call getServiceList. getServiceList is a very heavy call, because it need to get DatanodeDetails of 1000 Datanodes, so om and scm are very busy with getServiceList.

But s3g does not use the List(ServiceInfo) which got from getServiceList at all.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-3745

How was this patch tested?

Existed tests.

@runzhiwang runzhiwang force-pushed the getServiceList branch 2 times, most recently from 5757fd1 to 5e11bb4 Compare June 6, 2020 07:13
@runzhiwang runzhiwang changed the title HDDS-3745. Improve OM and SCM performance with 50% by avoid call getServiceInfo in s3g HDDS-3745. Improve OM and SCM performance with 63% by avoid call getServiceInfo in s3g Jun 6, 2020
@runzhiwang runzhiwang changed the title HDDS-3745. Improve OM and SCM performance with 63% by avoid call getServiceInfo in s3g HDDS-3745. Improve OM and SCM performance with 64% by avoid call getServiceInfo in s3g Jun 6, 2020
@runzhiwang
Copy link
Contributor Author

@elek Could you help review this patch ? Thank you very much.

@elek
Copy link
Member

elek commented Jun 10, 2020

Thanks the patch @runzhiwang

It looks correct me, but it's also a question about the long-term usage of getServiceInfo. Originally it was introduced (as far as I remember) to get the address of the SCM. But over the time the client is improved to avoid all the direct calls to the SCM.

I agree, that long-term we should use proxy users for S3 and pooling the connections.

Short-term this patch looks good to me, but why don't we use the getServiceInfo in case of secure clusters? Do we need to replace it with something simple.

I would be interested about the opinion of @nandakumar131. As far as I remember he worked on the original implementation.

Personally I think a generic getServiceClient can be useful. For example active storage-class-es can be downloaded from the server at the beginning of the connection. But that's a long term plan and this patch can help short term.

Let's wait for more opinions.

@nandakumar131
Copy link
Contributor

Some background
gerServiceInfo call was introduced as part of Service Discovery API (HDFS-12868).
The idea behind Service Discovery API was to not worry about setting multiple properties at client to connect to Ozone.
Point the client to any one of the OzoneManager (even in case of HA), the client should be able to get all the required configuration and be able to connect to Ozone.
This will make the client configuration simple.

When gerServiceInfo was introduced, Ozone client just needed OM's IP and port to connect, nothing else. Over time we made lot of code change and never revisited Service Discovery API.
After OM HA, it is no longer true that we can discover other services and required configurations just with single OM's IP/Port. We need ozone-site.xml at client and set of properties that will allow us to interact with OM HA.

Now I'm not even sure where and why gerServiceInfo is getting used.

@bharatviswa504
Copy link
Contributor

bharatviswa504 commented Jun 11, 2020

Some background
gerServiceInfo call was introduced as part of Service Discovery API (HDFS-12868).
The idea behind Service Discovery API was to not worry about setting multiple properties at client to connect to Ozone.
Point the client to any one of the OzoneManager (even in case of HA), the client should be able to get all the required configuration and be able to connect to Ozone.
This will make the client configuration simple.

When gerServiceInfo was introduced, Ozone client just needed OM's IP and port to connect, nothing else. Over time we made lot of code change and never revisited Service Discovery API.
After OM HA, it is no longer true that we can discover other services and required configurations just with single OM's IP/Port. We need ozone-site.xml at client and set of properties that will allow us to interact with OM HA.

Now I'm not even sure where and why gerServiceInfo is getting used.

For OM HA, the reason for not using the getServiceInfo was if client knows one of the OzoneManager addresses, and if that is down, client will not be able to talk to the ozone cluster.
Even right now getServiceInfo, will now return quorum of OM address with each role, but this information cannot be used for the above reason, so the client still needs a quorum of OM address.

@nandakumar131
Copy link
Contributor

@bharatviswa504 I understand the reason behind it.

We should revisit Service Discovery API and make a decision on whether we can make it work or should we remove it completely.

@bharatviswa504
Copy link
Contributor

bharatviswa504 commented Jun 11, 2020

From the code point of I see the usage of getServiceInfo used to get CaCert and used by XceiverClientRatis and Grpc.

    ServiceInfoEx serviceInfoEx = ozoneManagerClient.getServiceInfo();
    String caCertPem = null;
    if (OzoneSecurityUtil.isSecurityEnabled(conf)) {
      caCertPem = serviceInfoEx.getCaCertificate();
    }

Looking into the PR code now it has eliminated using this getServiceInfo, and added a new API for getcaCert when security is enabled.

@bharatviswa504
Copy link
Contributor

@bharatviswa504 I understand the reason behind it.

We should revisit Service Discovery API and make a decision on whether we can make it work or should we remove it completely.

After OM HA, it is no longer true that we can discover other services and required >configurations just with single OM's IP/Port. We need ozone-site.xml at client and set of .properties that will allow us to interact with OM HA.

Agreed. I just want to give you a context for the above statement mentioned.

@runzhiwang
Copy link
Contributor Author

runzhiwang commented Jun 11, 2020

@elek @nandakumar131 @bharatviswa504 Hi, Thanks for review. The problem of getServiceInfo is that it contains all the datanodes' information which cost too much if the cluster is big. If it only contain OM address, there is no need to replace it with getCaCertificate.
image

@xiaoyuyao
Copy link
Contributor

@runzhiwang the reason the serviceinfo request retrieve all the DNs is for the DN REST service address. After the Ozone REST support was removed a while back, those leftover are not cleaned up completely. Maybe a time to clean that up to make it lighter without introducing a new RPC?

@runzhiwang
Copy link
Contributor Author

@xiaoyuyao Thanks for explanation, I agree and I will clean that up.

@bharatviswa504
Copy link
Contributor

@elek @nandakumar131 @bharatviswa504 Hi, Thanks for review. The problem of getServiceInfo is that it contains all the datanodes' information which cost too much if the cluster is big. If it only contain OM address, there is no need to replace it with getCaCertificate.
image

Thank You @runzhiwang for the explanation of the issue. I like @xiaoyuyao proposal, if DN information is no more required, we can clean that up, and use getServiceInfo instead of a new RPC call.

@runzhiwang runzhiwang changed the title HDDS-3745. Improve OM and SCM performance with 64% by avoid call getServiceInfo in s3g HDDS-3745. Improve OM and SCM performance with 64% by avoid collect datanode information s3g Jun 14, 2020
@runzhiwang runzhiwang changed the title HDDS-3745. Improve OM and SCM performance with 64% by avoid collect datanode information s3g HDDS-3745. Improve OM and SCM performance with 64% by avoid collect datanode information to s3g Jun 14, 2020
@runzhiwang runzhiwang force-pushed the getServiceList branch 3 times, most recently from f333972 to 13fb742 Compare June 15, 2020 00:56
@codecov-commenter
Copy link

Codecov Report

Merging #1031 into master will decrease coverage by 0.05%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #1031      +/-   ##
============================================
- Coverage     70.38%   70.33%   -0.06%     
+ Complexity     9234     9227       -7     
============================================
  Files           961      961              
  Lines         48130    48115      -15     
  Branches       4676     4674       -2     
============================================
- Hits          33877    33842      -35     
- Misses        12001    12020      +19     
- Partials       2252     2253       +1     
Impacted Files Coverage Δ Complexity Δ
.../java/org/apache/hadoop/ozone/om/OzoneManager.java 63.94% <ø> (+0.04%) 184.00 <0.00> (-1.00) ⬆️
.../apache/hadoop/hdds/scm/node/StaleNodeHandler.java 88.88% <0.00%> (-11.12%) 4.00% <0.00%> (ø%)
...er/common/transport/server/GrpcXceiverService.java 70.00% <0.00%> (-10.00%) 3.00% <0.00%> (ø%)
...ntainerLocationProtocolServerSideTranslatorPB.java 38.31% <0.00%> (-5.15%) 17.00% <0.00%> (-2.00%)
...ntainerLocationProtocolClientSideTranslatorPB.java 41.17% <0.00%> (-3.93%) 20.00% <0.00%> (-2.00%)
...adoop/hdds/scm/server/SCMClientProtocolServer.java 47.65% <0.00%> (-3.52%) 23.00% <0.00%> (-4.00%)
...g/apache/hadoop/hdds/protocol/DatanodeDetails.java 88.23% <0.00%> (-1.69%) 29.00% <0.00%> (-1.00%)
.../common/states/endpoint/HeartbeatEndpointTask.java 69.81% <0.00%> (-1.26%) 25.00% <0.00%> (-1.00%)
...p/ozone/container/keyvalue/helpers/ChunkUtils.java 85.45% <0.00%> (-0.91%) 30.00% <0.00%> (-1.00%)
...iner/common/statemachine/DatanodeStateMachine.java 82.87% <0.00%> (-0.56%) 29.00% <0.00%> (-1.00%)
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 23034fb...13fb742. Read the comment docs.

@runzhiwang
Copy link
Contributor Author

@xiaoyuyao @bharatviswa504 Thanks for review. I have updated the patch.

@bharatviswa504
Copy link
Contributor

Thank You for the update.
Can we also make the change in RpcClient.java, calling getServiceInfo only when security is enabled?

@elek
Copy link
Member

elek commented Jun 15, 2020

Can we also make the change in RpcClient.java, calling getServiceInfo only when security is enabled?

I have some fear if the secre/unsecure cluster would have such difference. For example, I should run all the performance tests on secure cluster. We should be careful and execute all the tests on both secure / unsecure cluster. Seems to be safer to keep it as is. But I can be convinced.

@xiaoyuyao
Copy link
Contributor

The getServiceInfo could be used for non-secure configuration from the server in future. I agree with @elek to leave it as-is.

@bharatviswa504
Copy link
Contributor

Can we also make the change in RpcClient.java, calling getServiceInfo only when security is enabled?

I have some fear if the secre/unsecure cluster would have such difference. For example, I should run all the performance tests on secure cluster. We should be careful and execute all the tests on both secure / unsecure cluster. Seems to be safer to keep it as is. But I can be convinced.

From code, I see that we use getServiceInfo value only when security is enabled. So, from S3G perspective, for each request, we can save one Rpc Call.

    ServiceInfoEx serviceInfoEx = ozoneManagerClient.getServiceInfo();
    String caCertPem = null;
    if (OzoneSecurityUtil.isSecurityEnabled(conf)) {
      caCertPem = serviceInfoEx.getCaCertificate();
    }

I have not understood, what is the problem mentioned here, as from code we can see the value from getServiceInfo is only used when security is enabled.

If someone has mistakenly used some of the information from getServiceInfo mistakenly with out initializing it, it will be caught immediately as it will throw NPE. And also we are not initializing serviceinfo, to any class parameter, it is locally used in the RpcClient constructor only.

@xiaoyuyao
Copy link
Contributor

xiaoyuyao commented Jun 16, 2020

I have not understood, what is the problem mentioned here, as from code we can see the value from getServiceInfo is only used when security is enabled.

getServiceInfo is the service discovery API for OM, which is not for security only. If we go with the other route, i.e., a dedicated API for getCaCertificate(), I'm OK with including check for security enabled.

@bharatviswa504
Copy link
Contributor

bharatviswa504 commented Jun 17, 2020

But right now getServiceInfo is used only to getCaCertificate, other than that in current code there is no usage of it. As now the client does not need an SCM address, for OM, we get a quorum of address from config, and also we don't need DN address which this patch has eliminated.

If we can skip getServiceInfo call in a non-secure cluster where we can save one RPC Call. I am fine with reusing this API or with totally a new API.

@elek
Copy link
Member

elek commented Jun 22, 2020

So what's next here? I propose to merge this as is (clean simple patch, just remove the datanode part).

If getServiceInfo() is not required anymore: we can remove it fully in a separated patch. (Not just the usage, but we can deprecate and/or remove the server side parts.

@bharatviswa504, @xiaoyuyao Are you ok with this approach?

@bharatviswa504
Copy link
Contributor

bharatviswa504 commented Jun 24, 2020

Fine with me. In this Jira let's focus on removing datanode part.

I will open a new Jira for removing the usage of getServiceInfo.

+1 LGTM.

@bharatviswa504 bharatviswa504 merged commit 3ca7f5c into apache:master Jun 24, 2020
@bharatviswa504
Copy link
Contributor

Thank You @runzhiwang for the contribution and @elek and @xiaoyuyao for the review.

errose28 added a commit to errose28/ozone that referenced this pull request Jun 25, 2020
* upstream/master: (56 commits)
  HDDS-3264. Fix TestCSMMetrics.java. (apache#1120)
  HDDS-3858. Remove support to start Ozone and HDFS datanodes in the same JVM (apache#1117)
  HDDS-3704. Update all the documentation to use ozonefs-hadoop2/3 instead of legacy/current (apache#1099)
  HDDS-3773. Add OMDBDefinition to define structure of om.db. (apache#1076)
  Revert "HDDS-3263. Fix TestCloseContainerByPipeline.java. (apache#1119)" (apache#1126)
  HDDS-3821. Disable Ozone SPNEGO should not fall back to hadoop.http.a… (apache#1101)
  HDDS-3819. OzoneManager#listVolumeByUser ignores userName parameter when ACL is enabled (apache#1087)
  HDDS-3779. Add csi interface documents to show how to use ozone csi (apache#1059)
  HDDS-3857. Datanode in compose/ozonescripts can't be started (apache#1116)
  HDDS-3430. Enable TestWatchForCommit test cases. (apache#1114)
  HDDS-3263. Fix TestCloseContainerByPipeline.java. (apache#1119)
  HDDS-3512. s3g multi-part-upload saved incorrect content using streaming (apache#1092)
  HDDS-3836. Modify ContainerPlacementPolicyFactory JavaDoc (apache#1097)
  HDDS-3780. Replace the imagePullPolicy from always to IfNotPresent (apache#1055)
  HDDS-3847. Change OMNotLeaderException logging to DEBUG (apache#1118)
  HDDS-3745. Improve OM and SCM performance with 64% by avoid collect datanode information to s3g (apache#1031)
  HDDS-3286. BasicOzoneFileSystem  support batchDelete. (apache#814)
  HDDS-3850. Update the admin document to let user know how to show the status of all rules. (apache#1109)
  HDDS-3848. Add ratis.thirdparty.version in main pom.xml (apache#1108)
  HDDS-3815. Avoid buffer copy in ContainerCommandRequestProto. (apache#1085)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants