HDDS-9345. Add CapacityPipelineChoosePolicy considering datanode storage space #5354

whbing · 2023-09-23T18:59:09Z

What changes were proposed in this pull request?

Add a new pipeline choose policy implementation, CapacityPipelineChoosePolicy.

Consider the following scenario:
Our cluster is often scaled up with new nodes while the old nodes are already nearly full. The balancer is usually slow, so it is important to prefer nodes with lower usage when choosing a pipeline for writes.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-9345

How was this patch tested?

Unit tests passed.
The policy works well in a production cluster. Details follow.

Current datanode storage info:

Usage Information (100 Datanodes)

UUID         : e844c80f-f86a-458e-9d5b-998ccab26f6b
Capacity     : 191985138794496 B (174.61 TB)
Total Used   : 191880094604207 B (174.51 TB)
Total Used % : 99.95%
Ozone Used % : 99.88%
Remaining    : 105044190289 B (97.83 GB)
Remaining %  : 0.05%

UUID         : a80fad73-d92a-4ae3-a38a-627183550213
Capacity     : 95993097879552 B (87.31 TB)
Total Used   : 95881322125358 B (87.20 TB)
Total Used % : 99.88%
Ozone Used   : 95707847367083 B (87.05 TB)
Ozone Used % : 99.70%
Remaining    : 111775754194 B (104.10 GB)
Remaining %  : 0.12%

UUID         : 8a4b3279-c447-4bd1-815b-53f2b9cbbfcd
Capacity     : 191985138794496 B (174.61 TB)
Total Used   : 191672656451807 B (174.33 TB)
Total Used % : 99.84%
Ozone Used   : 191540205531706 B (174.20 TB)
Ozone Used % : 99.77%
Remaining    : 312482342689 B (291.02 GB)
Remaining %  : 0.16%

UUID         : 27604988-5cb8-4aca-99eb-909af2b5e65b
Capacity     : 95993097879552 B (87.31 TB)
Total Used   : 95823788713950 B (87.15 TB)
Total Used % : 99.82%
Ozone Used   : 95749906064930 B (87.08 TB)
Ozone Used % : 99.75%
Remaining    : 169309165602 B (157.68 GB)
Remaining %  : 0.18%
...
(some datanodes omitted)
...
UUID         : bd536fd8-979f-4103-afbc-48927f3d1c7c
Capacity     : 159987615662080 B (145.51 TB)
Total Used   : 25659199434752 B (23.34 TB)
Total Used % : 16.04%
Ozone Used   : 24361891610799 B (22.16 TB)
Ozone Used % : 15.23%
Remaining    : 134328416227328 B (122.17 TB)
Remaining %  : 83.96%

UUID         : 0a07b46c-b4a9-4608-96c5-7312dc80be61
Capacity     : 159987615662080 B (145.51 TB)
Total Used   : 25511656828928 B (23.20 TB)
Total Used % : 15.95%
Ozone Used   : 24156249565246 B (21.97 TB)
Ozone Used % : 15.10%
Remaining    : 134475958833152 B (122.31 TB)
Remaining %  : 84.05%

UUID         : 55c4eb7a-67d7-44b9-9226-16b34cfc9875
Capacity     : 159987615662080 B (145.51 TB)
Total Used   : 25475661058048 B (23.17 TB)
Total Used % : 15.92%
Ozone Used   : 24307327014485 B (22.11 TB)
Ozone Used % : 15.19%
Remaining    : 134511954604032 B (122.34 TB)
Remaining %  : 84.08%

UUID         : 82b96d23-05a0-492f-aae4-c749c5d2e92e
Capacity     : 159987615662080 B (145.51 TB)
Total Used   : 25472433516544 B (23.17 TB)
Total Used % : 15.92%
Ozone Used   : 24204556646585 B (22.01 TB)
Ozone Used % : 15.13%
Remaining    : 134515182145536 B (122.34 TB)
Remaining %  : 84.08%

   <property>
      <name>hdds.scm.pipeline.choose.policy.impl</name>
      <value>org.apache.hadoop.hdds.scm.pipeline.choose.algorithms.CapacityPipelineChoosePolicy</value>
   </property>

The debug log shows that, with this PR, the pipeline whose datanode has the lower storage usage is selected:

2023-09-25 19:40:12,194 [IPC Server handler 94 on 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare the max datanode storage in the two pipelines, 
first : SCMNodeStat{capacity=95993097879552, scmUsed=93070479413248, remaining=2904901464064}, 
second : SCMNodeStat{capacity=159987615662080, scmUsed=51905289928704, remaining=106962467737600}, 
and chosen the second pipeline = Pipeline[ Id: ae66a1e3-5ddf-493c-9751-3998b72182c7, Nodes: b6485c4a-c079-4c04-906d-9087fd785e2f{ip: 10.xxx.xxx.39, host: bigdata-xxx, ports: [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], networkLocation: /default-rack, certSerialId: null, persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 0}61e5e505-c0de-4c33-850f-a21d9b1971be{ip: 10.xxx.xxx.38, host: bigdata-xxx, ports: [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], networkLocation: /default-rack, certSerialId: null, persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 0}e3f99e46-b5f2-46fe-8d13-506498a06ad2{ip: 10.xxx.xxx.39, host: bigdata-xxx, ports: [REPLICATION=9886, RATIS=9858, RATIS_ADMIN=9857, RATIS_SERVER=9856, STANDALONE=9859], networkLocation: /default-rack, certSerialId: null, persistedOpState: IN_SERVICE, persistedOpStateExpiryEpochSec: 0}, ReplicationConfig: RATIS/THREE, State:OPEN, leaderId:e3f99e46-b5f2-46fe-8d13-506498a06ad2, CreationTimestamp2023-09-25T18:59:36.139+08:00[Asia/Shanghai]]
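To make the logged decision easier to follow, here is a minimal sketch that compares the two logged nodes by utilization (scmUsed divided by capacity). It assumes the policy compares utilization ratios rather than absolute bytes, which this thread does not state explicitly. `UsageCompareSketch` and `isMoreUtilized` are hypothetical names, not part of Ozone.

```java
import java.math.BigInteger;

/** Hypothetical helper for illustration; not an Ozone class. */
final class UsageCompareSketch {
  /**
   * Returns true if used1/capacity1 is strictly greater than used2/capacity2.
   * Assumes both capacities are positive. The ratios are compared by
   * cross-multiplication in BigInteger, because the product of two byte
   * counts of this size does not fit in a long.
   */
  static boolean isMoreUtilized(long used1, long capacity1,
      long used2, long capacity2) {
    BigInteger left =
        BigInteger.valueOf(used1).multiply(BigInteger.valueOf(capacity2));
    BigInteger right =
        BigInteger.valueOf(used2).multiply(BigInteger.valueOf(capacity1));
    return left.compareTo(right) > 0;
  }
}
```

With the values from the log above, the first node (about 97% used) is more utilized than the second (about 32% used), which is consistent with the second pipeline being chosen.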

sodonnel · 2023-09-25T08:26:18Z

Please add some description to the PR about how this new policy would work, why it is needed etc.

whbing · 2023-09-25T12:36:58Z

Added at the beginning of this page

siddhantsangwan

The overall idea looks good to me. I have some comments below. Haven't checked the tests yet.

siddhantsangwan · 2023-11-20T13:55:15Z

...java/org/apache/hadoop/hdds/scm/pipeline/choose/algorithms/CapacityPipelineChoosePolicy.java

+        targetPipeline =
+            !metric1.isGreater(metric2.get()) ? pipeline1 : pipeline2;


It's possible that we're checking two pipelines which share a datanode (multi raft), and that datanode is the most used one in both the pipelines. This will result in a tie and we'll choose the first pipeline. I'm wondering if it's better to break the tie by comparing the second most used node in that case.

@siddhantsangwan Thanks for the review! It's a good idea to consider the second node. I'll update the code later.
(Also, I think there is no need to consider a third node, as that would make the logic rather redundant.)

I rebased onto the master branch and added a commit, "add second compare logic". I tested it in a test environment, and the debug log is printed as follows.
@siddhantsangwan If you have time, PTAL again. Thanks!
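The tie-break discussed above can be sketched roughly as follows. This is an illustration only, not the code from the patch: `NodeUsage` and `TieBreakSketch` are hypothetical stand-ins, and comparing by utilization ratio is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a datanode's storage stats; not SCMNodeStat. */
final class NodeUsage {
  final long capacity;
  final long used;

  NodeUsage(long capacity, long used) {
    this.capacity = capacity;
    this.used = used;
  }

  /** Fraction of capacity in use; a zero capacity counts as fully used. */
  double utilization() {
    return capacity == 0 ? 1.0 : (double) used / capacity;
  }
}

final class TieBreakSketch {
  /** Node utilizations of one pipeline, sorted from most to least used. */
  static List<Double> sortedUtilization(List<NodeUsage> pipeline) {
    List<Double> result = new ArrayList<>();
    for (NodeUsage node : pipeline) {
      result.add(node.utilization());
    }
    result.sort(Comparator.reverseOrder());
    return result;
  }

  /**
   * Returns 1 if the first pipeline should be chosen, 2 otherwise.
   * Compares the most used nodes first and, on a tie, the second most
   * used nodes. If both rounds tie, the first pipeline is kept.
   */
  static int choose(List<NodeUsage> first, List<NodeUsage> second) {
    List<Double> a = sortedUtilization(first);
    List<Double> b = sortedUtilization(second);
    int rounds = Math.min(2, Math.min(a.size(), b.size()));
    for (int i = 0; i < rounds; i++) {
      int cmp = Double.compare(a.get(i), b.get(i));
      if (cmp != 0) {
        return cmp < 0 ? 1 : 2;
      }
    }
    return 1;
  }
}
```

Stopping after two rounds mirrors the remark above that a third node is not considered.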

whbing · 2023-11-22T08:21:59Z

debug log as follows:

2023-11-22 15:45:10,855 [IPC Server handler 95 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare scmUsed in pipelines, first : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=623083520, remaining=86085607424}}, SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257377201, remaining=92930936832}}, SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257315761, remaining=93191069696}}], second : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3412590592, remaining=88673959936}}, SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3342024704, remaining=97891262464}}, SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=623083520, remaining=86085607424}}]
2023-11-22 15:45:10,856 [IPC Server handler 95 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Chosen the first pipeline by compared scmUsed
2023-11-22 15:45:16,467 [IPC Server handler 0 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare scmUsed in pipelines, first : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=623083520, remaining=86085607424}}, SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257377201, remaining=92930936832}}, SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257315761, remaining=93191049216}}], second : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3412590592, remaining=88673959936}}, SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3342024704, remaining=97891262464}}, SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3319808000, remaining=93105278976}}]
2023-11-22 15:45:16,468 [IPC Server handler 0 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Chosen the first pipeline by compared scmUsed
2023-11-22 15:45:22,142 [IPC Server handler 14 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare scmUsed in pipelines, first : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3412590592, remaining=88673959936}}, SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3342024704, remaining=97891262464}}, SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=623083520, remaining=86085607424}}], second : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3412590592, remaining=88673959936}}, SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3342024704, remaining=97891262464}}, SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3319808000, remaining=93105278976}}]
2023-11-22 15:45:22,143 [IPC Server handler 14 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Secondary compare because the first round is the same
2023-11-22 15:45:22,143 [IPC Server handler 14 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Chosen the first pipeline by compared scmUsed
2023-11-22 15:45:27,689 [IPC Server handler 95 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare the same pipeline Pipeline[ Id: f1458feb-0472-4d5f-a490-97029b65dcf5, Nodes: c187d45d-e703-4b6d-a7e7-ec125f5d59f6(zk3/10.96.xx.178)67c72d5b-6fff-4f39-8e9d-ca1ad3628bc3(zk2/10.96.xx.24)43dc44df-f27c-4ade-9651-501fd881a8d6(hadoop3/10.190.xx.5), ReplicationConfig: RATIS/THREE, State:OPEN, leaderId:43dc44df-f27c-4ade-9651-501fd881a8d6, CreationTimestamp2023-11-22T15:43:54.022+08:00[Asia/Chongqing]]
2023-11-22 15:45:27,690 [IPC Server handler 95 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Chosen the first pipeline by compared scmUsed
2023-11-22 15:45:33,414 [IPC Server handler 0 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare scmUsed in pipelines, first : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=623083520, remaining=86085607424}}, SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257377201, remaining=92930936832}}, SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257315761, remaining=93191049216}}], second : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3319808000, remaining=93105278976}}, SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257377201, remaining=92930936832}}, SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257315761, remaining=93191049216}}]
2023-11-22 15:45:33,415 [IPC Server handler 0 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Chosen the first pipeline by compared scmUsed
2023-11-22 15:45:39,070 [IPC Server handler 14 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare scmUsed in pipelines, first : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3319808000, remaining=93105278976}}, SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257377201, remaining=92930936832}}, SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257315761, remaining=93191049216}}], second : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=623083520, remaining=86085607424}}, SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257377201, remaining=92930936832}}, SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257315761, remaining=93191049216}}]
2023-11-22 15:45:39,071 [IPC Server handler 14 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Chosen the second pipeline by compared scmUsed
2023-11-22 15:45:44,801 [IPC Server handler 97 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare scmUsed in pipelines, first : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3319808000, remaining=93105278976}}, SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257377201, remaining=92930936832}}, SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257317475, remaining=93190995968}}], second : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=623083520, remaining=86085607424}}, SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257377201, remaining=92930936832}}, SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257317475, remaining=93190995968}}]
2023-11-22 15:45:44,802 [IPC Server handler 97 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Chosen the second pipeline by compared scmUsed
2023-11-22 15:45:51,010 [IPC Server handler 14 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare scmUsed in pipelines, first : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3412590592, remaining=88673959936}}, SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3342024704, remaining=97895714816}}, SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=623083520, remaining=86085607424}}], second : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3412590592, remaining=88673959936}}, SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3342024704, remaining=97895714816}}, SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3319808000, remaining=93105278976}}]
2023-11-22 15:45:51,011 [IPC Server handler 14 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Secondary compare because the first round is the same
2023-11-22 15:45:51,011 [IPC Server handler 14 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Chosen the first pipeline by compared scmUsed

The same log, formatted for easier analysis:

2023-11-22 15:45:10,855 [IPC Server handler 95 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare scmUsed in pipelines, 
first :  [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=623083520, remaining=86085607424}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257377201, remaining=92930936832}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257315761, remaining=93191069696}}], 
second : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3412590592, remaining=88673959936}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3342024704, remaining=97891262464}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=623083520, remaining=86085607424}}]
2023-11-22 15:45:10,856 [IPC Server handler 95 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Chosen the first pipeline by compared scmUsed

2023-11-22 15:45:16,467 [IPC Server handler 0 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare scmUsed in pipelines,
first :  [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=623083520, remaining=86085607424}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257377201, remaining=92930936832}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257315761, remaining=93191049216}}],
second : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3412590592, remaining=88673959936}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3342024704, remaining=97891262464}},
          SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3319808000, remaining=93105278976}}]
2023-11-22 15:45:16,468 [IPC Server handler 0 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Chosen the first pipeline by compared scmUsed

2023-11-22 15:45:22,142 [IPC Server handler 14 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare scmUsed in pipelines,
first :  [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3412590592, remaining=88673959936}},
          SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3342024704, remaining=97891262464}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=623083520, remaining=86085607424}}],
second : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3412590592, remaining=88673959936}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3342024704, remaining=97891262464}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3319808000, remaining=93105278976}}]
2023-11-22 15:45:22,143 [IPC Server handler 14 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Secondary compare because the first round is the same
2023-11-22 15:45:22,143 [IPC Server handler 14 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Chosen the first pipeline by compared scmUsed

2023-11-22 15:45:27,689 [IPC Server handler 95 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare the same pipeline Pipeline[ Id: f1458feb-0472-4d5f-a490-97029b65dcf5, Nodes: c187d45d-e703-4b6d-a7e7-ec125f5d59f6(zk3/10.96.xx.178)67c72d5b-6fff-4f39-8e9d-ca1ad3628bc3(zk2/10.96.xx.24)43dc44df-f27c-4ade-9651-501fd881a8d6(hadoop3/10.190.xx.5), ReplicationConfig: RATIS/THREE, State:OPEN, leaderId:43dc44df-f27c-4ade-9651-501fd881a8d6, CreationTimestamp2023-11-22T15:43:54.022+08:00[Asia/Chongqing]]
2023-11-22 15:45:27,690 [IPC Server handler 95 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Chosen the first pipeline by compared scmUsed

2023-11-22 15:45:33,414 [IPC Server handler 0 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare scmUsed in pipelines,
first :  [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=623083520, remaining=86085607424}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257377201, remaining=92930936832}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257315761, remaining=93191049216}}], 
second : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3319808000, remaining=93105278976}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257377201, remaining=92930936832}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257315761, remaining=93191049216}}]
2023-11-22 15:45:33,415 [IPC Server handler 0 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Chosen the first pipeline by compared scmUsed

2023-11-22 15:45:39,070 [IPC Server handler 14 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare scmUsed in pipelines,
first :  [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3319808000, remaining=93105278976}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257377201, remaining=92930936832}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257315761, remaining=93191049216}}],
second : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=623083520, remaining=86085607424}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257377201, remaining=92930936832}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257315761, remaining=93191049216}}]
2023-11-22 15:45:39,071 [IPC Server handler 14 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Chosen the second pipeline by compared scmUsed

2023-11-22 15:45:44,801 [IPC Server handler 97 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare scmUsed in pipelines,
first :  [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3319808000, remaining=93105278976}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257377201, remaining=92930936832}},
          SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257317475, remaining=93190995968}}],
second : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=623083520, remaining=86085607424}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257377201, remaining=92930936832}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107313369088, scmUsed=257317475, remaining=93190995968}}]
2023-11-22 15:45:44,802 [IPC Server handler 97 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Chosen the second pipeline by compared scmUsed

2023-11-22 15:45:51,010 [IPC Server handler 14 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Compare scmUsed in pipelines,
first :  [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3412590592, remaining=88673959936}},
          SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3342024704, remaining=97895714816}},
          SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=623083520, remaining=86085607424}}],
second : [SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3412590592, remaining=88673959936}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3342024704, remaining=97895714816}}, 
          SCMNodeMetric{SCMNodeStat{capacity=107374182400, scmUsed=3319808000, remaining=93105278976}}]
2023-11-22 15:45:51,011 [IPC Server handler 14 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Secondary compare because the first round is the same
2023-11-22 15:45:51,011 [IPC Server handler 14 on default port 9863] DEBUG org.apache.hadoop.hdds.scm.PipelineChoosePolicy: Chosen the first pipeline by compared scmUsed

The results meet expectations.

xichen01

@whbing The change looks good. Just a few comments for you to consider.

xichen01 · 2023-11-23T11:07:15Z

...java/org/apache/hadoop/hdds/scm/pipeline/choose/algorithms/CapacityPipelineChoosePolicy.java

+  @Override
+  public Pipeline choosePipeline(List<Pipeline> pipelineList,
+      PipelineRequestInformation pri) {
+    Pipeline pipeline1 = healthPolicy.choosePipeline(pipelineList, pri);


In some clusters there may be close to a hundred pipelines, but we compare only two pipelines here.
Does this make it unlikely that the largest (in capacity) pipeline is selected?

A possible solution is to add a configuration that determines what fraction of the pipelines is compared at a time, taking a value in the range [0, 1].

When it is 0, only one pipeline is selected at a time, which is basically equivalent to RandomPipelineChoosePolicy.

When it is 1, all pipelines are compared, and the policy strictly chooses the largest pipeline overall.

PS: Even if this feature is needed, I think it can be done in another PR. Once this PR is merged, the current solution will work in a small cluster.

@xichen01 Thanks for the review! Regarding the selection logic, HDFS-11564 links to the original papers. The algorithm of choosing two random nodes and then placing the container on the node with lower utilization is discussed in great depth in this survey paper:
https://pdfs.semanticscholar.org/3597/66cb47572028eb70c797115e987ff203e83f.pdf
In addition, SCMContainerPlacementCapacity#chooseNode also uses this algorithm. So I wonder whether it is really necessary to find the pipeline with minimum storage usage every time.

@whbing Understood. For a fairly balanced cluster, such as a new one, this strategy can work very well, providing similar load to all DataNodes.
However, for a significantly unbalanced cluster, such as one where new nodes have just been added, this strategy might be limited, especially in larger clusters.
But for the latter case (adding new nodes), we can also rebalance using the balancer.
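The trade-off discussed in this thread can be illustrated with a small sketch that samples a configurable number of candidates and keeps the least utilized one. This is not the code from the patch. `SampledChoiceSketch`, the ratio-to-sample-size mapping, and the precomputed utilization list are all assumptions made for illustration. A sample size of 2 corresponds to the two-random-choices approach, although this sketch samples distinct candidates, while the debug log in this PR shows that the same pipeline can be picked twice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration; not an Ozone class. */
final class SampledChoiceSketch {
  /** Number of candidates to compare for a ratio in [0, 1]; at least one. */
  static int sampleSize(int total, double ratio) {
    if (ratio < 0.0 || ratio > 1.0) {
      throw new IllegalArgumentException("ratio must be in [0, 1]");
    }
    return Math.max(1, (int) Math.ceil(ratio * total));
  }

  /**
   * Samples up to sampleSize distinct pipelines at random and returns the
   * index of the sampled pipeline with the lowest utilization.
   * utilization.get(i) is assumed to hold the utilization of the most
   * used datanode in pipeline i.
   */
  static int choose(List<Double> utilization, int sampleSize, Random random) {
    if (utilization.isEmpty()) {
      throw new IllegalArgumentException("no pipelines to choose from");
    }
    List<Integer> indexes = new ArrayList<>();
    for (int i = 0; i < utilization.size(); i++) {
      indexes.add(i);
    }
    Collections.shuffle(indexes, random);
    int best = indexes.get(0);
    int limit = Math.min(sampleSize, indexes.size());
    for (int i = 1; i < limit; i++) {
      int candidate = indexes.get(i);
      if (utilization.get(candidate) < utilization.get(best)) {
        best = candidate;
      }
    }
    return best;
  }
}
```

With a ratio of 0 the sample size is 1, which behaves like a random pick. With a ratio of 1 every pipeline is compared, and the least utilized one is always returned.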

whbing · 2023-11-23T11:58:22Z

CI passed on my branch: https://github.com/whbing/ozone/actions/runs/6956936286.

sumitagrawl

@whbing Thanks for working on this. IMO this looks like another approach to replace the Random policy. I have a few questions...

sumitagrawl · 2023-11-27T11:39:17Z

...java/org/apache/hadoop/hdds/scm/pipeline/choose/algorithms/CapacityPipelineChoosePolicy.java

+  @Override
+  public Pipeline choosePipeline(List<Pipeline> pipelineList,
+      PipelineRequestInformation pri) {
+    Pipeline pipeline1 = healthPolicy.choosePipeline(pipelineList, pri);


Do we have any test results comparing this algorithm with the random healthy node policy? Just to see the effectiveness of the algorithm.

xichen01 · 2023-11-28T11:42:28Z

@whbing Thanks for your update, LGTM +1

whbing · 2023-11-30T10:25:49Z

I ran a test and got selection counts like these:

pipeline0 selected count: 62
pipeline1 selected count: 205
pipeline2 selected count: 308
pipeline3 selected count: 425

sumitagrawl

@whbing The rest LGTM.
Are we planning to make this policy the default, or is it just an option? We may need to describe it in the docs.

whbing · 2023-12-05T14:18:49Z

@sumitagrawl It is provided as an option and does NOT change the default value. I added a description in ScmConfig.java.

adoroszlai

Thanks @whbing for working on this.

patch updated

adoroszlai

Thanks a lot @whbing for updating the patch, LGTM.

adoroszlai · 2024-01-18T18:25:27Z

hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/PipelineChoosePolicy.java

+  default PipelineChoosePolicy init(final NodeManager nodeManager) {
+    // override if the policy requires nodeManager
+    return this;
+  }
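The default-method pattern in the snippet above can be sketched in isolation as follows. The types here (`NodeManagerStub`, `ChoosePolicySketch`, `FirstPolicy`, `LeastUsedPolicy`) are hypothetical stand-ins, not the Ozone interfaces. The sketch only shows why a no-op default lets existing policies stay unchanged while a capacity-aware policy overrides `init`.

```java
import java.util.List;

/** Hypothetical stand-in for the SCM NodeManager. */
interface NodeManagerStub {
  double utilizationOf(String datanodeId);
}

/** Hypothetical stand-in for a choose policy over datanode IDs. */
interface ChoosePolicySketch {
  String choose(List<String> candidates);

  /** No-op by default, so policies that ignore node stats need no changes. */
  default ChoosePolicySketch init(NodeManagerStub nodeManager) {
    return this;
  }
}

/** Ignores node stats and relies on the default init. */
final class FirstPolicy implements ChoosePolicySketch {
  @Override
  public String choose(List<String> candidates) {
    return candidates.get(0);
  }
}

/** Needs node stats, so it overrides init to keep the node manager. */
final class LeastUsedPolicy implements ChoosePolicySketch {
  private NodeManagerStub nodeManager;

  @Override
  public ChoosePolicySketch init(NodeManagerStub manager) {
    this.nodeManager = manager;
    return this;
  }

  @Override
  public String choose(List<String> candidates) {
    String best = candidates.get(0);
    for (String candidate : candidates) {
      if (nodeManager.utilizationOf(candidate)
          < nodeManager.utilizationOf(best)) {
        best = candidate;
      }
    }
    return best;
  }
}
```

Returning `this` from `init` lets a caller create and initialize a policy in one chained expression.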


adoroszlai · 2024-01-18T18:26:57Z

@siddhantsangwan @sodonnel please take another look

siddhantsangwan

LGTM. Does anyone else have any comments? Let's get this committed, it's been open for a long time now.

adoroszlai · 2024-01-22T17:12:29Z

Thanks @whbing for the patch, @siddhantsangwan, @sodonnel, @sumitagrawl, @xichen01 for the review.

whbing force-pushed the HDDS-9345 branch from 2b73504 to e403c6f Compare September 24, 2023 09:32

adoroszlai requested review from siddhantsangwan and sodonnel October 25, 2023 13:39

adoroszlai added the needs review label Nov 18, 2023

siddhantsangwan reviewed Nov 20, 2023

whbing force-pushed the HDDS-9345 branch from 8a3f295 to feae119 Compare November 22, 2023 05:01

xichen01 reviewed Nov 23, 2023

sumitagrawl reviewed Nov 27, 2023

nandakumar131 added the scm label Nov 30, 2023

adoroszlai requested review from siddhantsangwan and sumitagrawl December 1, 2023 17:34

sumitagrawl reviewed Dec 5, 2023

adoroszlai previously requested changes Jan 17, 2024

wanghongbing added 6 commits January 18, 2024 23:26

HDDS-9345. Add CapacityPipelineChoosePolicy considering datanode storage space

8b9747f

add second compare logic

2354261

change code style and log.

e6c624e

improve compare logic and add test

624c34f

add description

72b48a3

address comments.

61933da

whbing force-pushed the HDDS-9345 branch from 0e45d3c to 61933da Compare January 18, 2024 15:55

whbing requested a review from adoroszlai January 18, 2024 16:44

wanghongbing and others added 2 commits January 19, 2024 00:55

fix test

cef840c

Merge remote-tracking branch 'origin/master' into HDDS-9345

c4d4c36

fix checkstyle

cf55246

adoroszlai reviewed Jan 18, 2024

adoroszlai requested review from sumitagrawl and xichen01 January 18, 2024 18:26

adoroszlai removed the needs review label Jan 22, 2024

siddhantsangwan approved these changes Jan 22, 2024

adoroszlai merged commit 73e6f90 into apache:master Jan 22, 2024

Tejaskriya pushed a commit to Tejaskriya/ozone that referenced this pull request Jan 24, 2024

HDDS-9345. Add CapacityPipelineChoosePolicy considering datanode storage space (apache#5354)

227b64b

k5342 pushed a commit to pfnet/ozone that referenced this pull request Apr 17, 2025

HDDS-9345. Add CapacityPipelineChoosePolicy considering datanode storage space (apache#5354) (cherry picked from commit 73e6f90)

888706f
