Conversation

@slfan1989 (Contributor) commented Oct 3, 2024

What changes were proposed in this pull request?

Currently, we lack tools on the SCM side to track failed disks on DataNodes. DataNodes already report this information; we just need to display it.

In this PR, we display the failed disks reported by each DataNode. The information can be rendered in the default, JSON, or table format.

Default format

Datanode Volume Failures (5 Volumes)

Node         : localhost-62.238.104.185 (de97aaf3-99ad-449d-ad92-2c4f5a744b49) 
Failed Volume: /data0/ozonedata/hdds 
Capacity Lost: 7430477791683 B (6.76 TB) 
Failure Date : Thu Oct 03 09:25:16 +0800 2024 

Node         : localhost-163.120.165.68 (cf40e987-8952-4f7a-88b7-096e6b285243) 
Failed Volume: /data1/ozonedata/hdds 
Capacity Lost: 7430477791683 B (6.76 TB) 
Failure Date : Thu Oct 03 09:25:16 +0800 2024 

Node         : localhost-253.243.206.120 (0cc77921-489d-4cf0-a036-475faa16d443) 
Failed Volume: /data2/ozonedata/hdds 
Capacity Lost: 7430477791683 B (6.76 TB) 
Failure Date : Thu Oct 03 09:25:16 +0800 2024 

Node         : localhost-136.194.243.81 (5cb6430d-0ce5-4204-b265-179ee38fb30e) 
Failed Volume: /data3/ozonedata/hdds 
Capacity Lost: 7430477791683 B (6.76 TB) 
Failure Date : Thu Oct 03 09:25:16 +0800 2024 

Node         : localhost-48.253.209.226 (f99a8374-edb0-419d-9cba-cfab9d9e8a2e) 
Failed Volume: /data4/ozonedata/hdds 
Capacity Lost: 7430477791683 B (6.76 TB) 
Failure Date : Thu Oct 03 09:25:16 +0800 2024 

JSON format

[ {
  "node" : "localhost-161.170.151.131 (155bb574-7ed8-41cd-a868-815f4c2b0d60)",
  "volumeName" : "/data0/ozonedata/hdds",
  "failureDate" : 1727918794694,
  "capacityLost" : 7430477791683
}, {
  "node" : "localhost-67.218.46.23 (520d29eb-8387-4cda-bcb1-8727fdddd451)",
  "volumeName" : "/data1/ozonedata/hdds",
  "failureDate" : 1727918794695,
  "capacityLost" : 7430477791683
}, {
  "node" : "localhost-30.151.88.21 (d66cab50-bbf8-4199-9d7f-82da84a30137)",
  "volumeName" : "/data2/ozonedata/hdds",
  "failureDate" : 1727918794695,
  "capacityLost" : 7430477791683
}, {
  "node" : "localhost-78.50.38.217 (a673f50a-6f74-4e62-8c0c-f7337d5f3ce5)",
  "volumeName" : "/data3/ozonedata/hdds",
  "failureDate" : 1727918794695,
  "capacityLost" : 7430477791683
}, {
  "node" : "localhost-138.205.52.25 (84b7e49a-9bd4-4115-96fa-69f2d259343c)",
  "volumeName" : "/data4/ozonedata/hdds",
  "failureDate" : 1727918794695,
  "capacityLost" : 7430477791683
} ]

Table format

+-------------------------------------------------------------------------------------------------------------------------------------------+
|                                                         Datanode Volume Failures                                                          |
+------------------------------------------------------------------+-----------------------+---------------+--------------------------------+
|                               Node                               |      Volume Name      | Capacity Lost |          Failure Date          |
+------------------------------------------------------------------+-----------------------+---------------+--------------------------------+
|  localhost-83.212.219.28 (8b6addb1-759a-49e9-99fb-0d1a6cfb2d7f)  | /data0/ozonedata/hdds |    6.76 TB    | Sat Oct 05 17:47:47 +0800 2024 |
| localhost-103.199.236.47 (0dbe503a-3382-4753-b95a-447bab5766c4)  | /data1/ozonedata/hdds |    6.76 TB    | Sat Oct 05 17:47:47 +0800 2024 |
|  localhost-178.123.46.32 (2017076a-e763-4f47-abce-78535b5770a3)  | /data2/ozonedata/hdds |    6.76 TB    | Sat Oct 05 17:47:47 +0800 2024 |
| localhost-123.112.235.228 (aaebb6a7-6b62-4160-9934-b16b8fdde65e) | /data3/ozonedata/hdds |    6.76 TB    | Sat Oct 05 17:47:47 +0800 2024 |
| localhost-249.235.216.19 (cbc7c0b5-5ae0-4e40-91b8-1d9c419a007c)  | /data4/ozonedata/hdds |    6.76 TB    | Sat Oct 05 17:47:47 +0800 2024 |
+------------------------------------------------------------------+-----------------------+---------------+--------------------------------+

What is the link to the Apache JIRA

JIRA: HDDS-11463. Track and display failed DataNode storage locations in SCM.

How was this patch tested?

Added JUnit tests and tested in a test environment.

@slfan1989 slfan1989 marked this pull request as ready for review October 5, 2024 11:48
@slfan1989 slfan1989 marked this pull request as draft October 6, 2024 01:46
@slfan1989 slfan1989 closed this Oct 22, 2024
@slfan1989 slfan1989 reopened this Oct 23, 2024
@slfan1989 slfan1989 marked this pull request as ready for review October 23, 2024 04:41
@slfan1989 (Contributor Author)

@errose28 Could you please help review this PR? Thank you very much! We discussed the relevant implementation together in HDDS-11463.

@errose28 (Contributor) left a comment

Thanks for working on this @slfan1989, this looks like a useful addition. I only had time for a quick high level look for now.

* Handler of ozone admin scm volumesfailure command.
*/
@Command(
name = "volumesfailure",
Contributor:

For the CLI, we should probably use something like ozone admin datanode volume list. The datanode subcommand is already used to retrieve information about datanodes from SCM. Splitting the commands so that volume has its own subcommand gives us more options in the future.

To distinguish failed from healthy volumes and to filter by node, we can either add some kind of filter flag, or leave it to grep/jq applied to the output.
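For illustration only, a minimal picocli sketch of that layout; the class names and the --failed flag are hypothetical, not part of this PR:

```java
import java.util.concurrent.Callable;
import picocli.CommandLine.Command;
import picocli.CommandLine.Option;

// Hypothetical sketch: a "volume" group under "ozone admin datanode" with a
// "list" action; an optional --failed flag narrows the output, or the full
// list can be post-processed with grep/jq instead.
@Command(name = "volume",
    description = "Volume-specific operations on datanodes",
    subcommands = {VolumeListSubcommand.class})
class VolumeCommands {
}

@Command(name = "list", description = "List volumes reported by datanodes")
class VolumeListSubcommand implements Callable<Void> {

  @Option(names = {"--failed"}, description = "Only show failed volumes")
  private boolean failedOnly;

  @Override
  public Void call() throws Exception {
    // Query SCM for volume information here and print it.
    return null;
  }
}
```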

Contributor:

This also means we should make the RPC more generic to support pulling all volume information.
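As a rough sketch of what such a generic query could return (interface and type names are illustrative; the fields mirror the JSON output shown earlier in this PR):

```java
import java.io.IOException;
import java.util.List;

// Sketch only: a more generic SCM query that returns every reported volume and
// leaves failed/healthy filtering to the caller.
interface ScmVolumeQuery {

  List<VolumeRecord> getVolumeInfos() throws IOException;

  final class VolumeRecord {
    final String node;        // "hostname (uuid)"
    final String volumeName;  // e.g. /data0/ozonedata/hdds
    final boolean failed;
    final long failureTime;   // epoch millis, 0 if healthy
    final long capacityLost;  // bytes, 0 if healthy

    VolumeRecord(String node, String volumeName, boolean failed,
        long failureTime, long capacityLost) {
      this.node = node;
      this.volumeName = volumeName;
      this.failed = failed;
      this.failureTime = failureTime;
      this.capacityLost = capacityLost;
    }
  }
}
```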

Contributor Author:

Thank you for helping to review this PR! I will continue to improve the relevant code based on your suggestions.

private boolean failedVolume = false;
private String datanodeUuid;
private String clusterID;
private long failureDate;
Contributor:

Let's use failureTime. I'm assuming this is being stored as millis since epoch, so it will have date and time information.

Contributor Author:

I have improved the relevant code.

// Ensure it is set only once,
// which is the time when the failure was first detected.
if (failureDate == 0L) {
  setFailureDate(Time.now());
Contributor:

Let's use Instant.now() per HDDS-7911.
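A minimal sketch of the suggested change, assuming the field is renamed to failureTime and stored as epoch milliseconds; the class and method names are illustrative, not the PR's actual code:

```java
import java.time.Instant;

// Sketch only: record the failure time once, at first detection, using
// Instant (per HDDS-7911) instead of Time.now().
class VolumeHealthTracker {

  private long failureTime; // epoch millis; 0 means no failure recorded yet

  void onVolumeFailed() {
    // Set only once: the time the failure was first detected.
    if (failureTime == 0L) {
      failureTime = Instant.now().toEpochMilli();
    }
  }

  long getFailureTime() {
    return failureTime;
  }
}
```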

Contributor Author:

@errose28 Can you help review this PR again? Thank you very much!

@adoroszlai adoroszlai marked this pull request as draft November 5, 2024 18:30
@adoroszlai (Contributor)

Thanks @slfan1989 for working on this. Converted it to draft because there is a failing test:

[ERROR] org.apache.hadoop.hdds.scm.node.TestSCMNodeManager.tesVolumeInfoFromNodeReport  Time elapsed: 1.105 s  <<< ERROR!
java.lang.UnsupportedOperationException
	at java.base/java.util.AbstractList.add(AbstractList.java:153)
	at java.base/java.util.AbstractList.add(AbstractList.java:111)
	at org.apache.hadoop.hdds.scm.node.DatanodeInfo.updateStorageReports(DatanodeInfo.java:186)
	at org.apache.hadoop.hdds.scm.node.SCMNodeManager.processNodeReport(SCMNodeManager.java:674)
	at org.apache.hadoop.hdds.scm.node.SCMNodeManager.register(SCMNodeManager.java:423)
	at org.apache.hadoop.hdds.scm.node.SCMNodeManager.register(SCMNodeManager.java:360)
	at org.apache.hadoop.hdds.scm.node.TestSCMNodeManager.tesVolumeInfoFromNodeReport(TestSCMNodeManager.java:1591)

https://github.com/slfan1989/ozone/actions/runs/11471452180
https://github.com/slfan1989/ozone/actions/runs/11476535807
https://github.com/slfan1989/ozone/actions/runs/11625983369

@slfan1989 slfan1989 force-pushed the HDDS-11463 branch 2 times, most recently from a83a8f7 to b1df492 Compare November 6, 2024 09:26
@slfan1989 (Contributor Author)

@adoroszlai Thank you for reviewing this PR! I am currently making improvements, and once the changes pass the CI tests in my branch, I will reopen the PR.

cc: @errose28

@slfan1989 slfan1989 force-pushed the HDDS-11463 branch 2 times, most recently from 8bc7ae0 to ff43fac Compare November 7, 2024 08:37
@slfan1989 (Contributor Author)

@adoroszlai Thank you for reviewing this PR! I will also pay closer attention to CI issues in future development. I understand that CI testing resources are valuable.

I have made improvements to the code based on @errose28's suggestions and also fixed the related unit test errors. The CI on my branch has passed (https://github.com/slfan1989/ozone/actions/runs/11719380711), and I have updated the PR status to 'Ready for Review'.

@slfan1989 slfan1989 marked this pull request as ready for review November 7, 2024 23:48
@adoroszlai adoroszlai requested a review from errose28 November 8, 2024 04:59
@slfan1989 (Contributor Author)

@errose28 Could you please help review this PR again? Thank you very much! I’ve made some additional improvements to this PR, as we wanted to print all the disk information. However, since there’s quite a lot of disk data, I’ve added pagination functionality.

@adoroszlai (Contributor)

Temporarily converted to draft and assigned to myself, to resolve conflicts.

@slfan1989 (Contributor Author)

@adoroszlai Thank you for your attention to this PR. I will continue to follow up on it.

@adoroszlai adoroszlai removed their assignment Feb 18, 2025
@adoroszlai (Contributor)

Merged from master. There will be one checkstyle problem:

hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/datanode/package-info.java
 18: Missing a Javadoc comment.
 18: Missing javadoc for package-info.java file.

Previously the license header was a javadoc in this new file, so the problem was hidden.

@slfan1989 (Contributor Author)

@adoroszlai Could you help review this PR? Thank you very much!

I have conducted tests on my own branch, and it currently passes the key CI tests.

https://github.com/slfan1989/ozone/actions/runs/15209585380

@adoroszlai adoroszlai self-requested a review May 23, 2025 15:03

// If startItem is specified, find its position in the volumeInfos list
int startIndex = 0;
if (StringUtils.isNotBlank(startItem)) {
@slfan1989 (Contributor Author) commented May 23, 2025

@adoroszlai I added logic to skip startItem in this part of the code, but after thinking it through, I realized it’s better to use the server’s hostname or UUID as startItem instead of a disk prefix. That’s because many machines name their disks like data0 to data9, and using a disk name could lead to unexpected filtering behavior.
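A rough sketch of that pagination idea, assuming startItem is matched against a node identifier (hostname or UUID) rather than a volume path; the helper class and its signature are hypothetical:

```java
import java.util.List;
import java.util.function.Function;

// Sketch only: skip ahead to the entry whose node id equals startItem, then
// return at most `count` entries from there.
final class VolumePagination {

  static <T> List<T> page(List<T> volumeInfos, String startItem,
      Function<T, String> nodeIdOf, int count) {
    int startIndex = 0;
    if (startItem != null && !startItem.trim().isEmpty()) {
      for (int i = 0; i < volumeInfos.size(); i++) {
        if (nodeIdOf.apply(volumeInfos.get(i)).equals(startItem)) {
          startIndex = i;
          break;
        }
      }
    }
    int endIndex = Math.min(startIndex + count, volumeInfos.size());
    return volumeInfos.subList(startIndex, endIndex);
  }
}
```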

@slfan1989 (Contributor Author)

@adoroszlai Can we move forward with this PR? I would appreciate your advice.

@adoroszlai (Contributor) left a comment

Thanks @slfan1989 for updating the patch.

Comment on lines 44 to 52
private static final Codec<VolumeInfo> CODEC = new DelegatedCodec<>(
    Proto2Codec.get(HddsProtos.VolumeInfoProto.getDefaultInstance()),
    VolumeInfo::fromProtobuf,
    VolumeInfo::getProtobuf,
    VolumeInfo.class);

public static Codec<VolumeInfo> getCodec() {
  return CODEC;
}
Contributor:

Codec is required only for storing in DB, but VolumeInfo does not seem to be persisted by either datanode or SCM. So I think this can be removed.

Contributor Author:

I have removed the CODEC.

}

message VolumeInfoProto {
optional string uuid = 1;
Contributor:

Please use DatanodeIDProto.

*/
public final class VolumeInfo implements Comparable<VolumeInfo> {

private String uuid;
Contributor:

Please use DatanodeID.


@Override
public int compareTo(VolumeInfo that) {
Preconditions.checkNotNull(that);
Contributor:

nit: prefer builtin Objects.requireNonNull
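A small sketch combining the two suggestions above: hold a typed DatanodeID instead of a String UUID, and use the built-in Objects.requireNonNull in compareTo. The DatanodeID package and the comparison key are assumptions, not this PR's code:

```java
import java.util.Objects;
import org.apache.hadoop.hdds.protocol.DatanodeID; // assumed package for the class the reviewer refers to

// Sketch only: VolumeInfo keyed by a typed node id rather than a String uuid.
public final class VolumeInfo implements Comparable<VolumeInfo> {

  private final DatanodeID datanodeID; // was: private String uuid;
  private final String volumeName;

  public VolumeInfo(DatanodeID datanodeID, String volumeName) {
    this.datanodeID = Objects.requireNonNull(datanodeID, "datanodeID == null");
    this.volumeName = Objects.requireNonNull(volumeName, "volumeName == null");
  }

  @Override
  public int compareTo(VolumeInfo that) {
    Objects.requireNonNull(that, "that == null");
    // Illustrative comparison key.
    return this.volumeName.compareTo(that.volumeName);
  }
}
```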

* @throws IOException
* I/O exceptions that may occur during the process of querying the volume.
*/
StorageContainerLocationProtocolProtos.GetVolumeInfosResponseProto getVolumeInfos(
Contributor:

nit: please import GetVolumeInfosResponseProto instead of StorageContainerLocationProtocolProtos.

private String uuid;

// The HostName identifier of the DataNode.
@Option(names = { "--hostName" },
Contributor:

This still applies to the latest patch.

@CommandLine.Mixin
private ListPaginationOptions listOptions;

enum DISPLAYMODE { all, normal, failed }
Contributor:

nit: Enums should be named like other types (classes, interfaces): DisplayMode.

Also, please consider using all-caps for values (ALL, etc.)

Comment on lines 63 to 67
@Option(names = { "--displayMode" },
    defaultValue = "all",
    description = "Display mode for disks: 'failed' shows failed disks, " +
        "'normal' shows healthy disks, 'all' shows all disks.")
private DISPLAYMODE displayMode;
Contributor:

On another look, I think "display mode" is confusing. JSON and Table are display modes. "all/normal/failed" filter the list by volume state.

I suggest renaming to --state, and renaming normal to healthy.

Then description can be simplified to Filter disks by state.
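A sketch of how the renamed filter could look, following the suggestions above (class and enum names are illustrative, not the final code):

```java
import picocli.CommandLine.Option;

// Sketch only: an enum named like other types, ALL-CAPS constants, and a
// --state flag that filters the listed volumes rather than changing the
// output format.
class VolumeFilterOptions {

  enum VolumeState { ALL, HEALTHY, FAILED }

  @Option(names = {"--state"},
      defaultValue = "ALL",
      description = "Filter disks by state.")
  private VolumeState state;

  VolumeState getState() {
    return state;
  }
}
```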


// If displayed in JSON format.
if (json) {
System.out.print(JsonUtils.toJsonStringWithDefaultPrettyPrinter(volumeInfos));
Contributor:

This is still applicable.

Also, please use println, to avoid situations like HDDS-13100.

Comment on lines 57 to 60
@AfterEach
public void tearDown() {
}

Contributor:

nit: unnecessary

@errose28 (Contributor)

Hi @slfan1989 thanks for working on this change. I think there are three attributes being added here which should be reviewed separately:

  1. Adding an SCM RPC to retrieve volume information
  2. Tracking failure time of the volume
  3. Adding a CLI to view the volume information

The RPC to retrieve volume information is definitely required going forward regardless of the other two items to create some sort of CLI to query volume state.

Tracking the failure time of the volume seems like a somewhat invasive change since it spans the datanode, heartbeat, and SCM. Is this necessary, or is it enough to depend on a metrics database to track the timing of cluster events? Of course, we need improvements to our volume metrics as well, as mentioned in #8405.

On the CLI front, I do think we need a dedicated ozone admin datanode info command going forward as outlined in HDDS-13097. This would give all volume information per node. With volume counters added to ozone admin datanode list as proposed in HDDS-13096, we could get all failed volumes in a two step process:

  1. jq filter on ozone admin datanode list to find all nodes with failed volumes.
  2. jq filter on ozone admin datanode info to get specific information about the failed volumes, including their capacity.

Do we need a dedicated ozone admin datanode volume list/info command pairing in addition to this? It may be useful to have such cross-cutting commands to get information in one shot, but on the other hand it may result in duplication at the CLI. For example I could see the request to add node filtering to ozone admin datanode volume list/info at which point it becomes much the same as ozone admin datanode list/info.

@slfan1989 (Contributor Author)

@errose28 Thank you for your message! I'd like to share some thoughts from a different perspective. As it stands, this feature does not conflict with the proposal in #8405. #8405 represents a more innovative and forward-looking design, and although it's still under discussion, it will certainly be valuable if implemented as planned.

At the same time, I believe this feature does not impact HDDS-13096 or HDDS-13097. My comment on #8405 was more about expressing expectations for the system’s future capabilities — I hope Ozone can gradually support such features — rather than raising any objections to #8405 itself.

The design of #7266 is inspired by HDFS's disk failure detection mechanism, with the goal of improving the system's ability to identify and locate failed disks. For users migrating from HDFS to Ozone, using the volume command to directly view failed disks can offer a more intuitive and convenient operational experience.

From my perspective, we all play different roles in this project. Your team focuses on evolving and optimizing the system's architecture, while we, as external users, are more focused on refining specific functional details based on real-world use. Ultimately, however, we share the same goal: to make Ozone more robust, more user-friendly, and more widely adopted.

Naturally, it's not easy to fully align these detail-oriented changes with larger, ongoing feature developments — for example, making #7266 fully consistent with #8405. This is mainly because #8405 is broader in scope, with a longer timeline, whereas #7266 focuses on a very specific aspect. While we fully respect the overall direction, we also hope to move forward with some smaller, incremental improvements to address current practical issues.

In addition to this PR, we're also working on several other enhancements. For instance, we've implemented mechanisms to collect DataNode I/O statistics to more precisely manage container replication. We've also introduced time-based peak/off-peak control logic for various DataNode management operations (such as deletion, container replication, and EC container reconstruction). These improvements are driven by real-world production needs, and from our perspective, they've shown positive results.

However, since many of these PRs have some degree of code coupling with our previous contributions, it's difficult for us to combine everything into a single, unified patch for upstream submission.

Therefore, we hope to proceed with #7266 for now. If #8405 later results in a more complete or improved solution, we’d be happy to continue refining things in that direction. In the meantime, this also gives us a valuable opportunity to participate in the community and contribute to Ozone’s development.

@slfan1989 (Contributor Author)

@adoroszlai Thank you very much for reviewing the code. I will make improvements based on your suggestions. @errose28's comments are essentially not in conflict with #7266, and I'm looking forward to seeing #7266 progress so that we can move forward with the subsequent work.

@slfan1989 (Contributor Author)

@adoroszlai Thank you very much for your detailed suggestions! I've made the changes accordingly. Could you review this PR again? Thank you very much! I respect @errose28's perspective. However, I believe this PR does not conflict with #8405, nor with HDDS-13096 or HDDS-13097 — they can coexist. We've already spent considerable time reviewing this PR together, and I'd like to continue moving it forward.

@slfan1989 (Contributor Author)

@adoroszlai @errose28 Can I still continue to follow up on this PR? I feel that I've put in some effort, but right now I've lost a clear direction on how to proceed. According to @errose28's suggestion, this PR only needs to keep the RPC part, but I'm not sure how to continue working on the related functionality from here.

@adoroszlai (Contributor)

@slfan1989 Thanks for all your efforts on this PR. The concerns/suggestions raised by @errose28 make sense though. Please try to reach agreement.

I won't be able to re-review until next week in any case.

@slfan1989 (Contributor Author)

@adoroszlai Thank you very much for your message and for your continued support and assistance! Since @errose28 is currently planning some new features, I believe this PR could be considered as part of that effort, especially given the amount of work we've already invested. As for which specific features should be retained, it would be helpful if @errose28 could review and provide guidance.

@errose28 (Contributor) commented Jun 4, 2025

Hi @slfan1989, I appreciate your response and the work you've done on these changes. My suggestion to split this PR up was not meant to diminish any of the work that has been put in here, but to speed up incorporating the work into Ozone.

A change like this, approaching 1k lines and 70+ review comments and encompassing multiple items, is going to be large for any reviewer, and I think we could make faster progress by splitting it. For example, I think we could iterate on the volume info RPC in SCM pretty quickly and get that change merged first. I'm OK with adding volume failure time and capacity lost to SCM as well, but it will be easier to review those as their own change. This way most of the work here can be merged while we discuss the CLI.

> The design of #7266 is inspired by HDFS's disk failure detection mechanism, with the goal of improving the system's ability to identify and locate failed disks. For users migrating from HDFS to Ozone, using the volume command to directly view failed disks can offer a more intuitive and convenient operational experience.

Can you add more details about how this is similar to HDFS? I'm familiar with hdfs dfsadmin -report to list failed volumes, but that breaks the information down at the node level with failed volume counters for each node, which seems more similar to an ozone admin datanode list command.

> I believe this feature does not impact HDDS-13096 or HDDS-13097.

Yes, we could have both ozone admin datanode list/info and ozone admin datanode volume list without code conflicts, but we need to build a maintainable and intuitive CLI, which means we should avoid commands that do the same or similar things. In this case I think we should standardize on one way to get volume information from the CLI. I propose keeping this within the datanode info command, because volumes are completely contained within a node, unlike cross-cutting concepts like containers and pipelines, which have their own subcommands. Based on early comments like this, I think we would end up needing node filtering in ozone admin datanode volume list, at which point it becomes very similar to ozone admin datanode list/info.

@slfan1989 (Contributor Author) commented Jun 17, 2025

@errose28 @adoroszlai Thank you for your message! I’m currently reviewing the issues you raised and will continue to follow up on the PR, making necessary adjustments and providing timely feedback. I'm currently working on the code improvements and expect to complete them within 1–2 days.

@slfan1989 slfan1989 changed the title HDDS-11463. Track and display failed DataNode storage locations in SCM. HDDS-11463. Add SCM RPC support for DataNode volume info reporting. Jun 29, 2025
@slfan1989 (Contributor Author)

@errose28 @adoroszlai I have completed the improvements for this PR and kept the RPC interface part. Could you please help review it? Thank you very much!

@errose28 (Contributor)

Sorry for the delay. Can we have this PR be just the SCM <-> client communication for querying volume info? Right now it also contains information for SCM <-> DN communication about the failure time of the volume, which is not directly related and can be added in a follow-up change. The new RPC will also need tests added.

@slfan1989 (Contributor Author)

@errose28 Thank you for your feedback. I will improve this PR based on your suggestions.

@slfan1989 (Contributor Author) commented Aug 11, 2025

@errose28 @adoroszlai Could you please take another look at this PR? Thank you very much! I’ve simplified it to only include the interactions between the Client and SCM. If you think the implementation meets expectations, I’ll add some unit tests.

Apologies for the delay in following up on some Ozone PRs. Most of my time this year has been dedicated to supporting Hadoop on JDK17. That work is now nearing completion, and I will focus on addressing the remaining PRs in Ozone.

@github-actions (bot)

This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.

@github-actions github-actions bot added the stale label Nov 12, 2025
@github-actions (bot)

Thank you for your contribution. This PR is being closed due to inactivity. If needed, feel free to reopen it.

@github-actions github-actions bot closed this Nov 19, 2025