@slfan1989
Contributor

@slfan1989 slfan1989 commented Jul 31, 2024

What changes were proposed in this pull request?

During Ozone EC block recovery, we encountered a situation where some attempted recoveries of EC Stripe Blocks fail to reconstruct properly, resulting in the following error.

java.lang.IllegalArgumentException: The chunk list has 2 entries, but the checksum chunks has 3 entries. 
They should be equal in size.  

After carefully reviewing the EC code, we did not find any issues with the read and write operations. After discussion, we concluded that closing a container (e.g., when a container reaches its write capacity) may affect the writing of EC stripe data blocks. However, this is considered a normal occurrence, so occasional issues with some EC stripe data blocks are expected.

For EC type EC-3-2-1024K:

One of our EC stripes consists of 3 data blocks. Blocks 1 and 2 are correct, but Block 3 has an issue where its BlockGroupLength is shorter than that of Blocks 1 and 2.

If Block 2 is lost at this point, we would need to recover Block 2 using 2 data blocks (DataBlock1, DataBlock3) and 2 parity blocks (Parity1, Parity2).

The relevant code is ECReconstructionCoordinator#calcBlockLocationInfoMap, which calls the ECReconstructionCoordinator#calcEffectiveBlockGroupLen method:

private long calcEffectiveBlockGroupLen(BlockData[] blockGroup,
    int replicaCount) {
  Preconditions.checkState(blockGroup.length == replicaCount);
  long blockGroupLen = Long.MAX_VALUE;
  for (int i = 0; i < replicaCount; i++) {
    if (blockGroup[i] == null) {
      continue;
    }
    long putBlockLen = getBlockDataLength(blockGroup[i]);
    // Use safe length is the minimum of the lengths recorded across the
    // stripe
    blockGroupLen = Math.min(putBlockLen, blockGroupLen);
  }
  return blockGroupLen == Long.MAX_VALUE ? 0 : blockGroupLen;
}

This section of code selects the minimum BlockGroupLength, thus choosing the BlockGroupLength of DataBlock3, leading to the recovery of an incorrect data block.
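To make the effect of the minimum selection concrete, here is a minimal, self-contained sketch. It is not the Ozone code: it replaces BlockData with plain long values, uses a negative value to mark a missing replica (an assumption made only for this illustration), and the class and method names are hypothetical. The lengths are the block group lengths reported in the reconstruction logs quoted later in this conversation.

```java
class MinBlockGroupLenDemo {

  // Same idea as calcEffectiveBlockGroupLen, but on plain longs.
  // A negative entry stands in for a missing (null) replica.
  static long effectiveLen(long[] putBlockLens) {
    long len = Long.MAX_VALUE;
    for (long l : putBlockLens) {
      if (l < 0) {
        continue; // skip missing replicas
      }
      len = Math.min(l, len);
    }
    return len == Long.MAX_VALUE ? 0 : len;
  }

  public static void main(String[] args) {
    // Replicas 1..9 of an rs-6-3-1024k block group; replica 7 is missing
    // and replica 4 reports a shorter block group length.
    long[] lens = {
        18874368L, 18874368L, 18874368L, 12582912L, 18874368L,
        18874368L, -1L, 18874368L, 18874368L
    };
    System.out.println(effectiveLen(lens)); // prints 12582912
  }
}
```

A single replica with a shorter length is enough to determine the result, which is why the length of the short replica ends up being used for the whole stripe.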

ECBlockOutputStream#executePutBlock validates the generated data, so we will receive a notification.

java.lang.IllegalArgumentException: The chunk list has 2 entries, but the checksum chunks has 3 entries. 
They should be equal in size.  

I propose the following improvement:
if enough source data blocks are available, we exclude Block 3 from the recovery.

Solution description

Step1. Improve BlockGroupLength selection.

We will attempt to find the most suitable BlockGroupLength possible. Some special situations during data writes cannot be avoided, and they may leave some BlockGroupLength values inaccurate.

However, in cases like the following, we can reasonably infer which BlockGroupLength is accurate:
when only one DN has a different BlockGroupLength compared to the others, we can assume that the BlockGroupLength of the other DNs is more reasonable.

Step2. Exclude DN whose BlockGroupLength is different from other DNs.

  • Remove the problematic DataNode from the data recovery pipeline.
  • Remove the problematic BlockData from the array of BlockGroups.

What is the link to the Apache JIRA

JIRA: HDDS-10985. EC Reconstruction failed because the size of currentChunks was not equal to checksumBlockDataChunks.

How was this patch tested?

Added JUnit tests.

@slfan1989 slfan1989 marked this pull request as draft July 31, 2024 06:54
@adoroszlai
Contributor

@slfan1989 If you intend to mark the PR as draft, please do so while creating the PR, not as a separate step afterwards. This is to avoid unnecessary CI runs.

Similarly, please fill PR template while creating the PR, not afterwards. (Tweaking the description is fine at any time, just avoid using the unedited template.)

@slfan1989
Contributor Author

slfan1989 commented Jul 31, 2024

> @slfan1989 If you intend to mark the PR as draft, please do so while creating the PR, not as a separate step afterwards. This is to avoid unnecessary CI runs.
>
> Similarly, please fill PR template while creating the PR, not afterwards. (Tweaking the description is fine at any time, just avoid using the unedited template.)

@adoroszlai Thank you for the reminder. I will pay attention to these details when submitting future PRs. HDDS-11243 (#7008) is already deployed in our production environment, and I will update its status to "ready for review" soon. As for HDDS-10985 (#7009), after refining the relevant descriptions, I will also change its status to "ready for review".

@slfan1989 slfan1989 marked this pull request as ready for review July 31, 2024 10:28
@slfan1989
Contributor Author

@sodonnel @adoroszlai This is an improvement regarding EC block recovery. The primary focus of the PR is to eliminate inaccurate DNs caused by BlockGroupLength. If you have time, please assist in reviewing this PR. Thank you very much!

@adoroszlai adoroszlai requested a review from sodonnel August 7, 2024 11:37
@sodonnel
Contributor

sodonnel commented Aug 7, 2024

> This section of code selects the minimum BlockGroupLength, thus choosing the BlockGroupLength of DataBlock3, leading to the recovery of an incorrect data block.

I am not sure this statement is correct.

When a stripe is written, the client must wait for all DNs to ack the write, indicating it was successfully saved. If any of the DNs do not return successfully, the stripe is abandoned by the client, and a new block is requested and the stripe is written to the new block again. This may duplicate some data.

To know for sure if this is happening, you need to look at the block length stored in OM for this block and see if it aligns with 2 stripes (as stated by DN 3) or greater than 3 stripes, indicating that the "failed write handling" on the client did not do the correct thing. Unfortunately, given the block ID, it's not easy to find the key it is associated with. I believe it can be done via Recon.

@slfan1989
Contributor Author

slfan1989 commented Aug 8, 2024

@sodonnel

Thank you very much for your reply!

I think I may not have expressed myself very clearly. I’d like to describe the functionality of this PR again and provide an example to illustrate the entire recovery process.

Background

We discovered some errors related to EC reconstruction online, with detailed descriptions in HDDS-10985.

java.lang.IllegalArgumentException: The chunk list has 2 entries, but the checksum chunks has 3 entries. 
They should be equal in size.  

After encountering these errors, we took the following steps:

  1. We carefully reviewed the read and write code for EC and raised some questions about its implementation. After you provided answers, I understood your approach and concluded that there were no obvious issues with the EC write code.

  2. We added extra auditLog entries for certain DataNodes (DNs) because finding EC reconstruction errors in DN logs is very challenging. I wrote the reconstruction error information into the audit.log. See the relevant PR for details (HDDS-11171. [DN] Add EC Block Recover Audit Log. #6936).

After deploying this PR, I discovered many errors with the reconstruction blocks.

2024-07-25 07:25:15,830 | ERROR | DNAudit | user=null | ip=null | op=RECOVER_EC_BLOCK {blockLocationInfo={blockID={conID: 951772 locID: 113750155032021583 bcsId: 0}, length=25165824, offset=0, token=null, pipeline=Pipeline[ Id: cc205dc9-49a1-4c07-92b9-0c27504268b6, Nodes: 62fa9baf-6002-4da9-bb06-eaf3e9549774(bigdata-ozone1431.online/10.77.218.54)d6b32839-fd22-4ee9-b1ef-0a2f7ae5e8d7(bigdata-ozone1302.online/10.77.213.32)8d7e54be-c94c-4bfd-ba11-4b618a1ca332(bigdata-ozone1718.online/10.77.233.41)e6599058-c6cf-4d3c-b669-4155bedc6631(bigdata-ozone1455.online/10.77.219.38)94bfe561-11b8-40b4-89e7-f3279f6b73e6(bigdata-ozone1474.online/10.77.220.18)f4bd3569-fbb4-40fa-a792-863a34df2cb4(bigdata-ozone1245.online/10.77.211.34)c1e06adb-46d9-4d08-ab65-bd9ef83607f8(bigdata-ozone1330.online/10.77.214.31)0e41e3ce-bc8d-4185-b956-d1d445b25cb9(bigdata-ozone1382.online/10.77.216.56), excludedSet: , ReplicationConfig: EC{rs-6-3-1024k}, State:CLOSED, leaderId:, CreationTimestamp2024-07-25T07:23:12.013617185+08:00[Asia/Shanghai]], createVersion=0, partNumber=0}} | ret=FAILURE | java.lang.IllegalArgumentException: The chunk list has 4 entries, but the checksum chunks has 5 entries. They should be equal in size.
2024-07-25 08:44:13,601 | ERROR | DNAudit | user=null | ip=null | op=RECOVER_EC_BLOCK {blockLocationInfo={blockID={conID: 837314 locID: 113750154862943175 bcsId: 0}, length=100663296, offset=0, token=null, pipeline=Pipeline[ Id: 8aeaf9ec-7537-4ad0-82d3-be7b5e7702aa, Nodes: 6d856eab-b1ce-428e-b91c-b8eb950807ae(bigdata-ozone1354.online/10.77.215.15)1b29d271-631b-4703-9d63-a0b65bf30480(bigdata-ozone1258.online/10.77.211.57)447297bf-f236-42da-babd-6affdff5e845(bigdata-ozone1404.online/10.77.217.53)105617a7-5fa5-41fd-bd51-61b5200c3e1d(bigdata-ozone1695.online/10.77.232.12)dedf7b87-667b-4c84-b98b-94fb8bb8a2bc(bigdata-ozone1295.online/10.77.213.15)f76dda7e-d639-465c-a1ab-e9ef6ec4421c(bigdata-ozone1418.online/10.77.218.31)86e30199-e1b6-41ee-936a-c9c7638af580(bigdata-ozone1476.online/10.77.220.20)df941469-8358-402a-8600-0d3f508f9cda(bigdata-ozone1366.online/10.77.216.18), excludedSet: , ReplicationConfig: EC{rs-6-3-1024k}, State:CLOSED, leaderId:, CreationTimestamp2024-07-25T08:44:04.377586980+08:00[Asia/Shanghai]], createVersion=0, partNumber=0}} | ret=FAILURE | java.lang.IllegalArgumentException: The chunk list has 16 entries, but the checksum chunks has 17 entries. They should be equal in size.
2024-07-25 08:45:07,392 | ERROR | DNAudit | user=null | ip=null | op=RECOVER_EC_BLOCK {blockLocationInfo={blockID={conID: 951070 locID: 113750155030944419 bcsId: 0}, length=56623104, offset=0, token=null, pipeline=Pipeline[ Id: 7d0a854b-dbeb-4958-91fb-16c2dcf2fdfa, Nodes: a2ddcc2d-8aa9-4030-9121-af6f9426ac04(bigdata-ozone1249.online/10.77.211.38)cac84fc4-835b-4c49-9566-10b3b252b44f(bigdata-ozone1408.online/10.77.217.58)e6599058-c6cf-4d3c-b669-4155bedc6631(bigdata-ozone1455.online/10.77.219.38)cffa6746-ae46-4e5f-8f54-c37bebdb36d6(bigdata-ozone1712.online/10.77.234.53)93435586-39df-4e4b-88e6-f25e3d926bba(bigdata-ozone1329.online/10.77.214.20)795e2ef6-4f22-44de-aad0-78bae5a153b3(bigdata-ozone1513.online/10.77.221.52)4f5b5892-63f4-4cf8-b617-2fb26e9e0ef5(bigdata-ozone1480.online/10.77.220.34)7c8f10a6-8027-488c-b187-8e4b3afadce3(bigdata-ozone1316.online/10.77.213.57), excludedSet: , ReplicationConfig: EC{rs-6-3-1024k}, State:CLOSED, leaderId:, CreationTimestamp2024-07-25T08:44:04.377586831+08:00[Asia/Shanghai]], createVersion=0, partNumber=0}} | ret=FAILURE | java.lang.IllegalArgumentException: The chunk list has 9 entries, but the checksum chunks has 10 entries. They should be equal in size.
2024-07-25 08:55:26,974 | ERROR | DNAudit | user=null | ip=null | op=RECOVER_EC_BLOCK {blockLocationInfo={blockID={conID: 932402 locID: 113750154996121646 bcsId: 0}, length=18874368, offset=0, token=null, pipeline=Pipeline[ Id: 9a6bb8f1-8322-4846-990b-81d5b5de78e0, Nodes: 4dc2a747-2be3-4a58-ab7c-3fbe504a1189(bigdata-ozone1502.online/10.77.221.20)fcd1cc61-b679-4853-87f6-be6cbfd94b32(bigdata-ozone1352.online/10.77.215.13)6c2057a9-061a-44c2-bd5e-b7e430f5257d(bigdata-ozone1334.online/10.77.214.35)fe24377f-91a9-403a-b654-1f7c8cd184e4(bigdata-ozone1663.online/10.77.230.37)33c6fb6a-6ed6-4246-b8b9-814309b59750(bigdata-ozone1255.online/10.77.211.54)5b5316b6-0be1-4ff6-8c2b-52c3cb53a702(bigdata-ozone1414.online/10.77.218.17)6ea28f7b-462f-4b35-994a-33d1d7d288de(bigdata-ozone1315.online/10.77.213.56)91662b72-aa0d-4087-a1e9-f71b55c4a64f(bigdata-ozone1468.online/10.77.220.11), excludedSet: , ReplicationConfig: EC{rs-6-3-1024k}, State:CLOSED, leaderId:, CreationTimestamp2024-07-25T08:55:26.562523718+08:00[Asia/Shanghai]], createVersion=0, partNumber=0}} | ret=FAILURE | java.lang.IllegalArgumentException: The chunk list has 3 entries, but the checksum chunks has 4 entries. They should be equal in size.
2024-07-25 09:15:06,692 | ERROR | DNAudit | user=null | ip=null | op=RECOVER_EC_BLOCK {blockLocationInfo={blockID={conID: 925553 locID: 113750154985174863 bcsId: 0}, length=56623104, offset=0, token=null, pipeline=Pipeline[ Id: 819c7cb8-a4f2-43a3-a2c1-c692c15f1c7f, Nodes: 77eaf094-d67b-40cc-a1c0-5eff52292a22(bigdata-ozone1351.online/10.77.215.12)14e3d6be-dd8f-476d-90b9-043d37e8d735(bigdata-ozone1472.online/10.77.220.16)cf38675f-987b-47c2-baa3-e6afdd3884fe(bigdata-ozone1451.online/10.77.219.34)003c3831-ffd7-4d6a-8b8c-f71a4d01bf94(bigdata-ozone1291.online/10.77.213.11)d8f3179c-7629-48f2-9030-45a89de389ab(bigdata-ozone1425.online/10.77.218.38)c31aca19-69cc-4cb6-9776-c995ee3e3132(bigdata-ozone1509.online/10.77.221.38)cde3f346-7a0d-42e1-9c53-2084e71febfd(bigdata-ozone1381.online/10.77.216.55)c27c68ff-966c-4d55-a92c-52ce9214ca40(bigdata-ozone1343.online/10.77.214.54), excludedSet: , ReplicationConfig: EC{rs-6-3-1024k}, State:CLOSED, leaderId:, CreationTimestamp2024-07-25T09:11:21.535920299+08:00[Asia/Shanghai]], createVersion=0, partNumber=0}} | ret=FAILURE | java.lang.IllegalArgumentException: The chunk list has 9 entries, but the checksum chunks has 10 entries. They should be equal in size.
2024-07-25 09:27:51,679 | ERROR | DNAudit | user=null | ip=null | op=RECOVER_EC_BLOCK {blockLocationInfo={blockID={conID: 955780 locID: 113750155037274967 bcsId: 0}, length=12582912, offset=0, token=null, pipeline=Pipeline[ Id: 8431e3cf-7812-49bc-89f9-d43e1e3cdcf0, Nodes: b98a0679-cab6-4f1e-98ee-a04d27f0f1e4(bigdata-ozone1506.online/10.77.221.34)41ba9acc-b97c-4e7a-b0df-d1d05daebefb(bigdata-ozone1807.online/10.77.223.60)19c5d2a7-1fee-4e6c-8395-ee6d09cfd86a(bigdata-ozone1332.online/10.77.214.33)7509ee25-be69-49bb-8d03-38457712f2e9(bigdata-ozone1296.online/10.77.213.16)94bfe561-11b8-40b4-89e7-f3279f6b73e6(bigdata-ozone1474.online/10.77.220.18)60fe53ee-de63-41c3-b5f0-613a6b55444c(bigdata-ozone1423.online/10.77.218.36)fcd1cc61-b679-4853-87f6-be6cbfd94b32(bigdata-ozone1352.online/10.77.215.13)3ccd78c7-047f-4a45-ac88-891936f29dfe(bigdata-ozone1363.online/10.77.216.15), excludedSet: , ReplicationConfig: EC{rs-6-3-1024k}, State:CLOSED, leaderId:, CreationTimestamp2024-07-25T09:27:36.505353695+08:00[Asia/Shanghai]], createVersion=0, partNumber=0}} | ret=FAILURE | java.lang.IllegalArgumentException: The chunk list has 2 entries, but the checksum chunks has 3 entries. They should be equal in size.

Solution description

After reviewing the error logs, we found that most issues were caused by a single block missing the final chunk, leading to reconstructed blocks not meeting expectations. After some discussion, we proposed a recovery strategy: if among the participating blocks, there is exactly one block with a different blockgrouplength from the others (where all other blocks have the same blockgrouplength), we will treat this inconsistent block as a lost block and use the remaining blocks to complete its recovery.

Overall, this process is idempotent, because for EC-6-3-1024k, using 8 blocks to recover 1 lost block is equivalent to using 7 blocks to recover 1 lost block.

Example

We will still use the case described by HDDS-10985 to illustrate the recovery process.

java.lang.IllegalArgumentException: The chunk list has 2 entries, but the checksum chunks has 3 entries. They should be equal in size.
	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:143)
	at org.apache.hadoop.hdds.scm.storage.ECBlockOutputStream.executePutBlock(ECBlockOutputStream.java:147)
	at org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator.reconstructECBlockGroup(ECReconstructionCoordinator.java:338)
	at org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator.reconstructECContainerGroup(ECReconstructionCoordinator.java:181)
	at org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinatorTask.runTask(ECReconstructionCoordinatorTask.java:68)
	at org.apache.hadoop.ozone.container.replication.ReplicationSupervisor$TaskRunner.run(ReplicationSupervisor.java:369)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

2024-06-13 07:59:00,718 [ContainerReplicationThread-6] INFO org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator: 
Block Data for: conID: 1188979 locID: 113750155360100435 bcsId: 0 
replica Index: 1 block length: 3145728 block group length: 18874368 
chunk list:  
chunkNum: 1 length: 1048576 offset: 0  
chunkNum: 2 length: 1048576 offset: 1048576  
chunkNum: 3 length: 1048576 offset: 2097152

2024-06-13 07:59:00,718 [ContainerReplicationThread-6] INFO org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator: 
Block Data for: conID: 1188979 locID: 113750155360100435 bcsId: 0 
replica Index: 2 block length: 3145728 block group length: 18874368 
chunk list:  
chunkNum: 1 length: 1048576 offset: 0  
chunkNum: 2 length: 1048576 offset: 1048576  
chunkNum: 3 length: 1048576 offset: 2097152

2024-06-13 07:59:00,718 [ContainerReplicationThread-6] INFO org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator: 
Block Data for: conID: 1188979 locID: 113750155360100435 bcsId: 0 
replica Index: 3 block length: 3145728 block group length: 18874368 
chunk list:  
chunkNum: 1 length: 1048576 offset: 0  
chunkNum: 2 length: 1048576 offset: 1048576  
chunkNum: 3 length: 1048576 offset: 2097152

2024-06-13 07:59:00,718 [ContainerReplicationThread-6] INFO org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator: 
Block Data for: conID: 1188979 locID: 113750155360100435 bcsId: 0 
replica Index: 4 block length: 2097152 block group length: 12582912 
chunk list:  
chunkNum: 1 length: 1048576 offset: 0  
chunkNum: 2 length: 1048576 offset: 1048576

2024-06-13 07:59:00,718 [ContainerReplicationThread-6] INFO org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator: 
Block Data for: conID: 1188979 locID: 113750155360100435 bcsId: 0 
replica Index: 5 block length: 3145728 block group length: 18874368 
chunk list:  
chunkNum: 1 length: 1048576 offset: 0  
chunkNum: 2 length: 1048576 offset: 1048576  
chunkNum: 3 length: 1048576 offset: 2097152

2024-06-13 07:59:00,718 [ContainerReplicationThread-6] INFO org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator: 
Block Data for: conID: 1188979 locID: 113750155360100435 bcsId: 0 
replica Index: 6 block length: 3145728 block group length: 18874368 
chunk list:  
chunkNum: 1 length: 1048576 offset: 0  
chunkNum: 2 length: 1048576 offset: 1048576  
chunkNum: 3 length: 1048576 offset: 2097152

2024-06-13 07:59:00,718 [ContainerReplicationThread-6] INFO org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator: 
Block Data for: conID: 1188979 locID: 113750155360100435 bcsId: 0 
replica Index: 8 block length: 3145728 block group length: 18874368 
chunk list:  
chunkNum: 1 length: 1048576 offset: 0  
chunkNum: 2 length: 1048576 offset: 1048576  
chunkNum: 3 length: 1048576 offset: 2097152

2024-06-13 07:59:00,718 [ContainerReplicationThread-6] INFO org.apache.hadoop.ozone.container.ec.reconstruction.ECReconstructionCoordinator: 
Block Data for: conID: 1188979 locID: 113750155360100435 bcsId: 0 
replica Index: 9 block length: 3145728 block group length: 18874368 
chunk list:  
chunkNum: 1 length: 1048576 offset: 0  
chunkNum: 2 length: 1048576 offset: 1048576  
chunkNum: 3 length: 1048576 offset: 2097152 

In this case, replica 7 was lost, and SCM decided to use replicas 1, 2, 3, 4, 5, 6, 8, and 9 to recover the missing replica.

Original Code

The actual situation is that replica 4 has fewer chunks and a smaller blockgrouplength compared to the other replicas. The current logic uses the blockgrouplength of replica 4 as the basis for recovery. As a result, the newly recovered replica 7 has one fewer chunk than the other replicas (1, 2, 3, 5, 6, 8, 9), triggering an IllegalArgumentException during the verification process.

Modified Code

Step 1. First, we group the source replicas, resulting in the following:

  • blockgrouplength: 18874368, replicas: 1, 2, 3, 5, 6, 8, 9
  • blockgrouplength: 12582912, replica: 4

Step 2. We observe that the current scenario satisfies two conditions:

  1. Only replica 4 has a blockgrouplength that differs from the others.
  2. The other replicas (1, 2, 3, 5, 6, 8, 9) share a consistent blockgrouplength.

Step 3. We conclude that replica 4 can be excluded, allowing replicas (1, 2, 3, 5, 6, 8, 9) to participate in the recovery process. Consequently, the resulting replica 7 should meet the expected criteria.
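The three steps above can be sketched roughly as follows. This is only an illustration of the grouping rule as described in this comment, not the code in the PR: the class name, the method name, and the use of a plain map from replica index to block group length are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class OutlierReplicaDemo {

  // Returns the replica index to exclude, or -1 when the
  // "exactly one replica differs, all others agree" rule does not apply.
  static int findSingleOutlier(Map<Integer, Long> lenByReplica) {
    // Step 1: group replica indexes by their block group length.
    Map<Long, List<Integer>> groups = new TreeMap<>();
    for (Map.Entry<Integer, Long> e : lenByReplica.entrySet()) {
      groups.computeIfAbsent(e.getValue(), k -> new ArrayList<>())
          .add(e.getKey());
    }
    // Step 2: the rule needs exactly two groups, one with a single member.
    if (groups.size() != 2) {
      return -1;
    }
    int outlier = -1;
    for (List<Integer> members : groups.values()) {
      if (members.size() == 1) {
        if (outlier != -1) {
          return -1; // both groups have one member, so the case is ambiguous
        }
        outlier = members.get(0);
      }
    }
    // Step 3: the caller can drop this replica from the source set.
    return outlier;
  }

  public static void main(String[] args) {
    Map<Integer, Long> lens = new TreeMap<>();
    for (int idx : new int[] {1, 2, 3, 5, 6, 8, 9}) {
      lens.put(idx, 18874368L);
    }
    lens.put(4, 12582912L);
    System.out.println("exclude replica: " + findSingleOutlier(lens));
  }
}
```

With the lengths from the logs above, the sketch reports replica 4. It deliberately leaves out the check that enough source replicas remain after the exclusion, which a real implementation would still need.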

@sodonnel
Contributor

sodonnel commented Aug 8, 2024

In your example, what do you do with EC-3-2, if 2 replicas are missing and one of the remaining 3 has a missing chunk? You cannot recover it.

There is certainly a problem here. If things are working as designed from the client side, and it is abandoning these blocks due to a failed ACK from one of the replicas, then the block length at OM should be the smaller size.

Then the problem is actually that we have truncated the block length to remove the duplicated chunks, but we have not truncated the checksums to match that, and that results in the failed recovery.

For some of the failing blocks, can you check the size of the block on OM to see if it matches the smaller or larger size of the block across the replicas?

@slfan1989
Contributor Author

slfan1989 commented Aug 9, 2024

> In your example, what do you do with EC-3-2, if 2 replicas are missing and one of the remaining 3 has a missing chunk? You cannot recover it.

You are correct. In this scenario, even with the code modifications, we can't recover the block. However, the previous code also could not perform the recovery, because it failed with a java.lang.IllegalArgumentException.

> There is certainly a problem here. If things are working as designed from the client side, and it is abandoning these blocks due to a failed ACK from one of the replicas, then the block length at OM should be the smaller size.

Thank you for your explanation! This information is crucial. I will recheck the lengths of the damaged blocks in OM to further confirm the issue. Once I complete the confirmation, I will get back to you with feedback.

Thanks again for your reply!

@slfan1989
Contributor Author

slfan1989 commented Sep 5, 2024

@sodonnel Thank you very much for your response! Your assessment is correct. There is no issue with the EC data writing in Ozone. The problem lies with the incorrect checksum selection during data recovery.

Here are some analytical approaches:

Analysis from the data writing

When writing data, we sometimes encounter situations like ContainerState is CLOSED, where the data may be written as follows:

  • Block-1

              Data    Data    Data    Parity  Parity
  EC-Stripe1  Chunk1  Chunk1  Chunk1  Chunk1  Chunk1
  EC-Stripe2  Chunk2  Error   Chunk2  Chunk2  Chunk2

  • Block-2

              Data    Data    Data    Parity  Parity
  EC-Stripe2  Chunk2  Chunk2  Chunk2  Chunk2  Chunk2

When the client wrote EC-Stripe2 of Block-1, the DN returned an error. As a result, the client retried the operation and requested a new data block to rewrite EC-Stripe2. At this point, some "dirty data" has already been written to Block-1.

Analysis from the EC data reconstruction

If DN3 data for Block-1 is lost at this point, we would encounter the following situation.

            Data    Data    Data(Missing)    Parity  Parity
EC-Stripe1  Chunk1  Chunk1  Chunk1(Missing)  Chunk1  Chunk1
EC-Stripe2  Chunk2  Error   Chunk2(Missing)  Chunk2  Chunk2

We will use Data1, Data2, Parity1, and Parity2 to recover the Data3.

During the EC data recovery process, we will choose a smaller BlockGroupLength to restore the data.
However, during validation, we still chose the chunk size of the dirty data to verify the recovered data.
We should validate the data according to the chunk size of the data blocks corresponding to the smaller BlockGroupLength.
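As a rough illustration of the idea, the sketch below trims a checksum chunk list so that it has the same number of entries as the chunk list of the recovered block. It is not the code from this PR: the class and method names are hypothetical, and plain strings stand in for the real chunk metadata.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TruncateChecksumChunksDemo {

  // Keeps only the first expectedChunks entries; shorter lists are
  // returned unchanged.
  static <T> List<T> truncate(List<T> checksumChunks, int expectedChunks) {
    if (checksumChunks.size() <= expectedChunks) {
      return checksumChunks;
    }
    return new ArrayList<>(checksumChunks.subList(0, expectedChunks));
  }

  public static void main(String[] args) {
    // The failing case from the error message: 2 recovered chunks,
    // but 3 checksum chunks taken from a replica that holds dirty data.
    List<String> currentChunks = Arrays.asList("chunk1", "chunk2");
    List<String> checksumChunks = Arrays.asList("chunk1", "chunk2", "chunk3");

    List<String> trimmed = truncate(checksumChunks, currentChunks.size());
    System.out.println(trimmed.size() == currentChunks.size()); // prints true
  }
}
```

After trimming, both lists have 2 entries, so an equal-size precondition like the one in the error message would no longer fail.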

We used the following steps to analyze the data:

  1. Parsed the OM metadata, which was written to a Hive table. This metadata includes keyPath, fileSize, ConId, and LocId.
  2. Found the EC recovery logs from a specific DN with errors to identify the ConId and LocId.
  3. Queried the corresponding files using ConId and LocId from the table in step 1.
  4. On OM, used ozone sh key info keyname to find the corresponding BlockGroupLength.
  5. On SCM, used ozone admin container info <containerId> to find the corresponding container.

I’ve modified the code, and this PR truncates the checksums during EC data recovery to ensure that the restored data can pass validation.

cc: @errose28

return executePutBlock(close, force, blockGroupLength);
}

private int calcEffectiveChunkSize(BlockData[] blockGroup,
Contributor

Is this calculation really needed? We know the block size we are recovering (blockGroupLength) which is earlier calculated as the min(put_block_size) across the stripe.

Can we then simply divide blockGroupLength / chunkSize and round up to give the number of chunks, and from there know the number of checksum chunks we need to recover?
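One possible reading of this suggestion is sketched below. It assumes the count is wanted per replica, so it divides by the width of a full data stripe (number of data blocks times chunk size) and rounds up. That divisor, the class name, and the method name are assumptions made for this illustration, not the code that was merged.

```java
class ChunkCountDemo {

  // Number of stripes (rounded up) covered by blockGroupLength. A replica
  // that takes part in every stripe, such as a parity block, holds this
  // many chunks.
  static long stripeCount(long blockGroupLength, int dataBlocks,
      long chunkSize) {
    long stripeSize = dataBlocks * chunkSize;
    return (blockGroupLength + stripeSize - 1) / stripeSize;
  }

  public static void main(String[] args) {
    long chunk = 1048576L; // 1024k chunk size

    // rs-6-3-1024k lengths from the production logs in this thread.
    System.out.println(stripeCount(12582912L, 6, chunk)); // prints 2
    System.out.println(stripeCount(18874368L, 6, chunk)); // prints 3

    // rs-3-2-1024k lengths from the unit-test logs in this thread.
    System.out.println(stripeCount(3145728L, 3, chunk)); // prints 1
    System.out.println(stripeCount(4194304L, 3, chunk)); // prints 2
  }
}
```

These results agree with the chunk lists in the logs: in the rs-6-3 case the replica with block group length 12582912 has 2 chunks and the others have 3, and in the rs-3-2 case the replica at index 5 has 2 chunks under block group length 4194304.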

Contributor Author

Thank you for reviewing the code! The calculation method you suggested is feasible, and the code will become more concise. I will try to make the modifications.

@sodonnel
Contributor

sodonnel commented Sep 6, 2024

@slfan1989 Thanks for digging into this. Your research confirms what I suspected. The client side is working as it was designed, but the checksum calculation is broken.

I had a look at the change and suggested a simplification which might be possible.

It would also be nice if we could get a test to reproduce this and validate the fix.

@slfan1989
Contributor Author

> @slfan1989 Thanks for digging into this. Your research confirms what I suspected. The client side is working as it was designed, but the checksum calculation is broken.
>
> I had a look at the change and suggested a simplification which might be possible.
>
> It would also be nice if we could get a test to reproduce this and validate the fix.

Thank you very much for your guidance and suggestions! I will try to add unit tests to reproduce and test this functionality.

}
}

private void triggerRetryByCloseContainer(OzoneOutputStream out) {
Contributor Author

In the production environment, we encountered a situation where data was being written to a DN while the container on that DN was closed. I designed the following steps to replicate the scenario we see in production: writing data and closing the container concurrently. Since the entire process is asynchronous, we may need to run it multiple times to reproduce the issue seen in production.

Example

2024-09-11 07:46:20,917 [FixedThreadPoolWithAffinityExecutor-1-0] INFO  container.IncrementalContainerReportHandler (AbstractContainerReportHandler.java:updateContainerState(312)) - Moving container #1 to CLOSED state, datanode 4366cc44-4875-4f4d-8afb-5ac7ed9ba40d(bogon/192.168.1.16) reported CLOSED replica with index 4.
07:46:20.914 [4366cc44-4875-4f4d-8afb-5ac7ed9ba40d-ChunkReader-0] ERROR DNAudit - user=null | ip=null | op=PUT_BLOCK {blockData=[blockId=conID: 1 locID: 113750153625600007 bcsId: 0 replicaIndex: 4, size=2097152]} | ret=FAILURE
java.lang.Exception: Requested operation not allowed as ContainerState is CLOSED
	at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:431) ~[classes/:?]
	at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.lambda$dispatch$0(HddsDispatcher.java:197) ~[classes/:?]
	at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:89) [classes/:?]
	at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:196) [classes/:?]
	at org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onNext(GrpcXceiverService.java:112) [classes/:?]
	at org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onNext(GrpcXceiverService.java:105) [classes/:?]
	at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:262) [ratis-thirdparty-misc-1.0.6.jar:1.0.6]
	at org.apache.ratis.thirdparty.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33) [ratis-thirdparty-misc-1.0.6.jar:1.0.6]
	at org.apache.hadoop.hdds.tracing.GrpcServerInterceptor$1.onMessage(GrpcServerInterceptor.java:49) [classes/:?]
	at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:329) [ratis-thirdparty-misc-1.0.6.jar:1.0.6]
	at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:314) [ratis-thirdparty-misc-1.0.6.jar:1.0.6]
	at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:833) [ratis-thirdparty-misc-1.0.6.jar:1.0.6]
	at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) [ratis-thirdparty-misc-1.0.6.jar:1.0.6]
	at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133) [ratis-thirdparty-misc-1.0.6.jar:1.0.6]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_412]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_412]
	at java.lang.Thread.run(Thread.java:750) [?:1.8.0_412]
2024-09-11 07:46:20,929 [client-write-TID-0] WARN  io.KeyOutputStream (ECKeyOutputStream.java:logStreamError(200)) - Put block failed: S S S F S
2024-09-11 07:46:20,929 [client-write-TID-0] WARN  io.KeyOutputStream (ECKeyOutputStream.java:logStreamError(202)) - Failure for replica index: 4, DatanodeDetails: 4366cc44-4875-4f4d-8afb-5ac7ed9ba40d(bogon/192.168.1.16)
java.io.IOException: Unexpected Storage Container Exception: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException: Requested operation not allowed as ContainerState is CLOSED
	at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.setIoException(BlockOutputStream.java:815)
	at org.apache.hadoop.hdds.scm.storage.ECBlockOutputStream.validateResponse(ECBlockOutputStream.java:351)
	at org.apache.hadoop.hdds.scm.storage.ECBlockOutputStream.lambda$executePutBlock$1(ECBlockOutputStream.java:280)
	at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
	at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
	at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hadoop.hdds.scm.container.common.helpers.ContainerNotOpenException: Requested operation not allowed as ContainerState is CLOSED
	at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:787)
	at org.apache.hadoop.hdds.scm.storage.ECBlockOutputStream.validateResponse(ECBlockOutputStream.java:349)
	... 7 more

Reconstruction Result

We can see that during the recovery process, the BlockGroupLength differs across different DNs.

2024-09-11 07:46:21,375 [main] INFO  reconstruction.ECReconstructionCoordinator (ECReconstructionCoordinator.java:logBlockGroupDetails(356)) - Block group details for conID: 1 locID: 113750153625600007 bcsId: 0 replicaIndex: null. Replication Config EC{rs-3-2-1024k}. Calculated safe length: 3145728. 
2024-09-11 07:46:21,375 [main] INFO  reconstruction.ECReconstructionCoordinator (ECReconstructionCoordinator.java:logBlockGroupDetails(387)) - Block Data for: conID: 1 locID: 113750153625600007 bcsId: 0 replicaIndex: 2 replica Index: 2 block length: 1048576 block group length: 4194304 chunk list: 
  chunkNum: 1 length: 1048576 offset: 0
2024-09-11 07:46:21,375 [main] INFO  reconstruction.ECReconstructionCoordinator (ECReconstructionCoordinator.java:logBlockGroupDetails(387)) - Block Data for: conID: 1 locID: 113750153625600007 bcsId: 0 replicaIndex: 3 replica Index: 3 block length: 1048576 block group length: 4194304 chunk list: 
  chunkNum: 1 length: 1048576 offset: 0
2024-09-11 07:46:21,375 [main] INFO  reconstruction.ECReconstructionCoordinator (ECReconstructionCoordinator.java:logBlockGroupDetails(387)) - Block Data for: conID: 1 locID: 113750153625600007 bcsId: 0 replicaIndex: 4 replica Index: 4 block length: 1048576 block group length: 3145728 chunk list: 
  chunkNum: 1 length: 1048576 offset: 0
2024-09-11 07:46:21,375 [main] INFO  reconstruction.ECReconstructionCoordinator (ECReconstructionCoordinator.java:logBlockGroupDetails(387)) - Block Data for: conID: 1 locID: 113750153625600007 bcsId: 0 replicaIndex: 5 replica Index: 5 block length: 2097152 block group length: 4194304 chunk list: 
  chunkNum: 1 length: 1048576 offset: 0
  chunkNum: 2 length: 1048576 offset: 1048576
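
To make the numbers in this log easier to follow, here is a minimal sketch, not the Ozone implementation. It takes the block group lengths reported above, picks the minimum as the safe length, and estimates how long each replica's block should be at that length. The class `SafeLengthSketch` and its methods are hypothetical names. The striping arithmetic assumes plain round-robin cells for rs-3-2-1024k, with each parity block as long as the first data block.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only; names and layout assumptions are not taken from Ozone. */
final class SafeLengthSketch {

  /** Smallest recorded block group length, or 0 if nothing was recorded. */
  static long safeLength(Map<Integer, Long> groupLenByReplica) {
    long min = Long.MAX_VALUE;
    for (long len : groupLenByReplica.values()) {
      min = Math.min(min, len);
    }
    return min == Long.MAX_VALUE ? 0 : min;
  }

  /**
   * Expected length of one replica's block for a given block group length.
   * Replica indexes are 1-based; indexes above dataNum are parity and are
   * assumed to match the length of the first data block.
   */
  static long expectedBlockLength(long groupLen, int replicaIndex,
      int dataNum, long cellSize) {
    long stripeSize = dataNum * cellSize;
    long fullStripes = groupLen / stripeSize;
    long remainder = groupLen % stripeSize;
    int dataPos = replicaIndex > dataNum ? 0 : replicaIndex - 1;
    // Bytes of the trailing partial stripe that land on this replica.
    long partial =
        Math.max(0, Math.min(cellSize, remainder - dataPos * cellSize));
    return fullStripes * cellSize + partial;
  }

  public static void main(String[] args) {
    // Block group lengths per replica index, copied from the log above.
    Map<Integer, Long> groupLen = new LinkedHashMap<>();
    groupLen.put(2, 4194304L);
    groupLen.put(3, 4194304L);
    groupLen.put(4, 3145728L);
    groupLen.put(5, 4194304L);

    long cell = 1024L * 1024L;
    long safe = safeLength(groupLen);
    System.out.println("safe length = " + safe);
    for (int replica : groupLen.keySet()) {
      System.out.println("replica " + replica + " expected block length = "
          + expectedBlockLength(safe, replica, 3, cell));
    }
  }
}
```

Under these assumptions, a safe length of 3145728 corresponds to one 1048576-byte chunk per replica, whereas replica index 5 in the log still records a block length of 2097152 with two chunks.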

@slfan1989
Contributor Author

@sodonnel Can you help review this PR again? Thank you very much!

Contributor

@sodonnel sodonnel left a comment

Thanks for the work and iterations on this @slfan1989. The change LGTM now.

@sodonnel sodonnel merged commit 0915f0b into apache:master Sep 11, 2024
@slfan1989
Contributor Author

@sodonnel Thank you very much for reviewing the code! I appreciate your suggestions and insights while we worked through this issue; they played a crucial role in resolving it.

xichen01 pushed a commit to xichen01/ozone that referenced this pull request Sep 16, 2024
…s was not equal to checksumBlockDataChunks. (apache#7009)

(cherry picked from commit 0915f0b)
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Sep 18, 2024
…s was not equal to checksumBlockDataChunks. (apache#7009)

(cherry picked from commit 0915f0b)
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Sep 28, 2024
…s was not equal to checksumBlockDataChunks. (apache#7009)

(cherry picked from commit 0915f0b)
aswinshakil added a commit to aswinshakil/ozone that referenced this pull request Oct 1, 2024
COMMITS:
360fea5 HDDS-11494. Improve the duration option of freon ombg (apache#7246)
10d3b21 HDDS-11504. Update Ratis to 3.1.1. (apache#7257)
ce46297 HDDS-11162. Improve Disk Usage page UI (apache#7214)
c91f1c7 HDDS-11491. Avoid sharing clientId among deleting services (apache#7250)
b0943d5 HDDS-11501. Improve logging in XceiverServerRatis (apache#7252)
55925ab HDDS-11502. Class path contains multiple SLF4J providers (apache#7255)
d0ad836 HDDS-11472. Avoid recreating external access authorizer on OM state reload (apache#7238)
254db9e HDDS-11500. RootCARotationManager cancelling wrong task in notifyStatusChanged (apache#7251)
1e6e4b3 HDDS-11499. Remove redundant code from ECReconstructionCoordinator. (apache#7248)
adb2821 HDDS-11490. Bump rollup to 3.29.5 (apache#7232)
189a9fe HDDS-11484. Validate javadoc in CI (apache#7245)
64a29c6 HDDS-11497. Bump commons-configuration2 to 2.11.0 (apache#7242)
95cfadd HDDS-11496. Bump maven-install-plugin to 3.1.3 (apache#7244)
0a999cf HDDS-11493. Bump sqlite-jdbc to 3.46.1.3 (apache#7243)
a214a31 HDDS-11329. Update Ozone images to Rocky Linux-based runner (apache#7241)
56ddb85 HDDS-11371. Handle cases where OM does not have getServerDefaults() implemented. (apache#7130)
b5097c7 HDDS-11347. Add rocks_tools_native lib check in Ozone CLI checknative subcommand (apache#7101)
fb0bf77 HDDS-11489. Bump maven-site-plugin to 3.20.0 (apache#7226)
70e6e40 HDDS-11122. Fix javadoc warnings (apache#7234)
acf3fdc HDDS-11458. Selective checks: trigger checkstyle for properties file changes (apache#7196)
6b87207 HDDS-11469. Statistics of Pipeline and Container (apache#7217)
1b8468b HDDS-11411. Snapshot garbage collection should not run when the keys are moved from a deleted snapshot to the next snapshot in the chain (apache#7193)
1f86ce8 HDDS-10617. Unexpected number of files in ITestS3AContractGetFileStatusV1List (apache#7208)
73a3bcc HDDS-11467. Bump vite to 4.5.5 (apache#7212)
d45aa1d HDDS-11460. Bump express to 4.21.0 (apache#7197)
e2e30b8 HDDS-11354. Intermittent failure in TestOzoneManagerSnapshotAcl#testLookupKeyWithNotAllowedUserForPrefixAcl (apache#7205)
0fcb645 HDDS-11477. [doc] Add configuration description for datanode docs (apache#7223)
3598ee3 HDDS-11464. Removed unused constants from OzoneConsts. (apache#7207)
8c0b54e HDDS-11408. Snapshot rename table entries are propagated incorrectly on snapshot deletes (apache#7200)
719bdf9 HDDS-11396. NPE due to empty Handler#clusterId (apache#7145)
40c4001 HDDS-10479. Add ozone admin ratis local raftMetaConf (apache#7170)
45f9138 HDDS-11394. Fix pipeline close --all command (apache#7138)
2b196d1 HDDS-11468. Enabled DB sync button (apache#7216)
d3899d2 Clean up files created after TestKeyValueHandlerWithUnhealthyContainer#testMarkContainerUnhealthyInFailedVolume (apache#7219)
70b8dd5 HDDS-11157. Improve Datanodes page UI (apache#7168)
151709a HDDS-11446. Downgrade picocli to 4.7.5 due to regression (apache#7215)
7a26aff HDDS-11158. Improve Pipelines page UI (apache#7171)
c365aa0 HDDS-11181. Cleanup of unnecessary try-catch blocks (apache#7210)
88dd436 HDDS-11423. Implement equals operation for --filter option to ozone ldb scan (apache#7167)
e0060a8 HDDS-11196. Improve SCM WebUI Display (apache#6960)
22ddfb9 Revert "HDDS-11456. Require successful dependency/licence checks for acceptance/compile/kubernetes (apache#7192)"
9f5bf43 HDDS-11457. Internal error on S3 CompleteMultipartUpload if parts are not specified (apache#7195)
10c47a1 HDDS-11459. Bump develocity-maven-extension to 1.22.1 (apache#7201)
50f2563 HDDS-11419. Fix waitForCheckpointDirectoryExist log message (apache#7199)
a7d7e37 HDDS-11456. Require successful dependency/licence checks for acceptance/compile/kubernetes (apache#7192)
5feb9ea HDDS-11453. OmSnapshotPurge should be in a different ozone manager double buffer batch (apache#7188)
703c4d5 HDDS-10984. Tool to restore SCM certificates from RocksDB. (apache#6781)
d221065 HDDS-11440. Add a lastTransactionInfo field in SnapshotInfo to check for transactions in flight on the snapshot (apache#7179)
e573701 HDDS-11448. Improve documentation in ContainerStateMachine (apache#7183)
0e49f7a HDDS-11449. Remove unnecessary log from client console. (apache#7184)
cd251f2 HDDS-11438. Ensure DataInputBuffer is closed in OMPBHelper#convert (apache#7182)
4b47812 HDDS-11389. Incorrect number of deleted containers shown in Recon UI. (apache#7149)
0915f0b HDDS-10985. EC Reconstruction failed because the size of currentChunks was not equal to checksumBlockDataChunks. (apache#7009)
0f16195 HDDS-11416. refactor ratis submit request avoid code duplicate (apache#7166)
86fe920 HDDS-11376. Improve ReplicationSupervisor to record replication metrics (apache#7140)

CONFLICT:
Merge conflict in hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueHandlerWithUnhealthyContainer.java

MODIFIED:
/Users/abalasubramanian/Documents/ozone/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/DNContainerOperationClient.java
/Users/abalasubramanian/Documents/ozone/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/ReconcileContainerTask.java
aswinshakil added a commit to aswinshakil/ozone that referenced this pull request Oct 1, 2024
aswinshakil added a commit to aswinshakil/ozone that referenced this pull request Oct 2, 2024
aswinshakil added a commit to aswinshakil/ozone that referenced this pull request Oct 2, 2024
sarvekshayr pushed a commit to sarvekshayr/ozone that referenced this pull request Oct 7, 2024
aswinshakil added a commit to aswinshakil/ozone that referenced this pull request Dec 19, 2024
…239-container-reconciliation-merge

Commits:
0066526 HDDS-11869. Enable OM Ratis in TestOzoneDelegationTokenSecretManager (apache#7594)
4fe166d HDDS-11957. Make breadcrumb scrollable for long path names in DiskUsage page (apache#7590)
a523e17 HDDS-11846. [Recon] Recon Schema version_number column is always set as -1. (apache#7554)
f5ff2f0 HDDS-11868. Enable OM Ratis in TestQuotaRepairTask (apache#7593)
3a0e3b5 HDDS-11845. Extract k8s definitions for HttpFS and Recon from getting-started example (apache#7523)
6e0c753 HDDS-11509. logging improvements for directory deletion (apache#7314)
1f29e05 HDDS-11934. Split compat suite to old/new (apache#7578)
bde8cf4 HDDS-11759. Remove LegacyReplicationManager (apache#7580)
a27e4ec HDDS-11779. Add DN metrics to show deletion progress (apache#7552)
976e45f HDDS-11711. Add SCM metrics for delete commands sent and response received per datanode (apache#7522)
c28e16e HDDS-11950. Enable sortpom in dev-support module. (apache#7586)
dae388b HDDS-11907. OzoneSecretKey does not need to implement Writable (apache#7574)
8bb0587 HDDS-11712. Process other DeletedBlocksTransaction before retrying failed one. (apache#7532)
3648b59 HDDS-11906. Add sortpom dependency, sort root POM. (apache#7555)
54f0272 HDDS-11807. Make callId different for each request in openKeyCleanupService (apache#7551)
c523825 HDDS-11926 - Rename bucket name for bucket info/ls for linked buckets (apache#7581)
daf2f9f HDDS-11863. Speed up TestFSORepairTool (apache#7561)
f5e5493 HDDS-11927. Fix flaky TestContainerBalancerStatusInfo.testGetCurrentStatisticsWhileBalancingInProgress (apache#7579)
bef2415 HDDS-11940. Bump jline to 3.28.0 (apache#7576)
1453fd9 HDDS-11935. Bump develocity-maven-extension to 1.23 (apache#7577)
202b0c7 HDDS-11860. Improve BufferUtils.writeFully. (apache#7564)
008f9a6 HDDS-11852. Reduce duplication in some GenericCli subclasses (apache#7553)
7a46080 HDDS-11914. Snapshot diff should not filter SST Files based by reading SST file reader (apache#7563)
1835326 HDDS-11927. Mark testGetCurrentStatisticsWhileBalancingInProgress as flaky
66ccc25 HDDS-11908. Snapshot diff DAG traversal should not skip node based on prefix presence (apache#7567)
16ba289 Revert "HDDS-11413. PipelineManagerImpl lock optimization reduces AllocateBlock latency (apache#7160)"
bf6f323 HDDS-11413. PipelineManagerImpl lock optimization reduces AllocateBlock latency (apache#7160)
853d657 HDDS-11893. Fix full snapshot diff fallback logic because of DAG pruning (apache#7549)
b5d04e2 HDDS-11915. Netty OpenSsl not available in acceptance tests on arm64 (apache#7570)
8536490 HDDS-11367. Fix flaky balancer robot test (apache#7569)
745ed1c HDDS-11367. Improve ozone balancing status command output (apache#7139)
6b9cbe0 HDDS-11909. Intermittent timeout building Hadoop in s3a test (apache#7559)
eea5600 HDDS-11911. Return consistent error code when snapshot is not found in the DB or Snapshot Chain. (apache#7557)
e8f3b25 HDDS-11873. Skip old-only xcompat read tests (apache#7534)
ec348a7 HDDS-11889. Include Maven dependencies for hdds-rocks-native in cache (apache#7546)
aa37ae8 HDDS-11885. Download Hadoop for S3A test from mirrors if available (apache#7545)
befd64e HDDS-11694. Safemode Improvement: Introduce factory class to create safemode rules. (apache#7433)
a46153d HDDS-11872. Disable Apache snapshots repo (apache#7536)
092b000 HDDS-11890. Update project description in GitHub (apache#7547)
80c6446 HDDS-8101. Add tool to repair broken FSO tree (apache#7368)
23197e2 HDDS-11605. Directory deletion service should support multiple threads (apache#7349)
055b13c HDDS-11751. Use Java 21 in CI (apache#7458)
d0d82c5 HDDS-11886. Bump license-maven-plugin to 2.5.0 (apache#7539)
f7fe30a HDDS-11691. Support object tags in ObjectEndpointStreaming#put (apache#7543)
9854591 HDDS-11882. Make BOM, not aggregate one (apache#7544)
af345b2 HDDS-11877. Enable Maven cache for more checks (apache#7538)
51c6ed6 HDDS-11830. Subcommands should not extend GenericCli .(apache#7537)
959a39d HDDS-11851. Finer-grained subcommand interface for OzoneDebug and OzoneRepair. (apache#7526)
d4c41e5 HDDS-11334. Improve EC xcompat acceptance test (apache#7492)
8526f2e HDDS-11826. Interactive mode for ozone shell. (apache#7515)
fc63710 HDDS-11833. Return NotImplemented for S3 put-object-acl request. (apache#7531)
e8ad7ad HDDS-11782. ozone debug ldb --with-keys defaults to false instead of true (apache#7521)
cb0a402 HDDS-11859. Remove mention of fuse from s3 interface docs page (apache#7530)
e17f92c HDDS-10821. Ensure ChunkBuffer fully writes buffer to FileChannel (apache#6652)
d66c088 HDDS-11848. Serialization bug in Recon listKeys API (apache#7524)
faab1e8 HDDS-11656. Default native ACL limits to user and user's primary group (apache#7455)
8a1967e HDDS-11719. Remove dependency on server components from ozonefs-common (apache#7438)
9fcecc1 HDDS-11855. Mark TestContainerBalancerDatanodeNodeLimit#checkIterationResultException as flaky
65df308 HDDS-11794. Display HostName in OM / SCM Overview. (apache#7482)
0bde3a2 HDDS-11410. Refactoring more tests from TestContainerBalancerTask (apache#7156)
98e070e HDDS-11728. Refactor subcommand layouts of ozone debug and repair (apache#7489)
1c5676b HDDS-11849. Mark TestBlockOutputStreamWithFailures.test2DatanodesFailure as flaky
9dd8bb9 HDDS-11847. Mark TestSnapshotDeletingServiceIntegrationTest#testParallelExcecutionOfKeyDeletionAndSnapshotDeletion as flaky
b27714f HDDS-11266. Update proto.lock for Ozone 1.4.1 (apache#7504)
9b61937 HDDS-11806. Add HttpFS and Recon in getting-started k8s example (apache#7485)
77ce962 HDDS-10568. When the ldb command is executed, it is output by line (apache#7467)
69538b0 HDDS-11687. Robot warning: replace "is not" with "!=" (apache#7516)
b60d897 HDDS-11820. Create test principals at test run time (apache#7507)
6ba309a HDDS-11831. Finer-grained interface for dynamically registered subcommands (apache#7514)
db36e39 HDDS-11822. Register subcommands in OzoneShell (apache#7513)
e6bd3f5 HDDS-11829. Bump zstd-jni to 1.5.6-8 (apache#7510)
09a4a90 HDDS-11827. Bump exec-maven-plugin to 3.5.0 (apache#7512)
ad867bc HDDS-11824. Bump sqlite-jdbc to 3.47.1.0 (apache#7511)
850306d HDDS-11823. Bump cyclonedx-maven-plugin to 2.9.1 (apache#7508)
34f9d9e HDDS-11742. Update metrics with leaderId if known when starting SCM (apache#7471)
2c6c116 HDDS-11265. Add Ozone 1.4.1 to compatibility acceptance tests (apache#7503)
ebcdc6a HDDS-11810. Secure acceptance test on arm64 fails with LoginException: Checksum failed (apache#7498)
7f40624 HDDS-11821. Mark TestECKeyOutputStream#testECKeyCreatetWithDatanodeIdChange as unhealthy
9b26156 HDDS-11811. rocksdbjni deleted on exit could be used by other components apache#7493
3f92663 HDDS-11785. DataNode aborts ContainerStateMachine if it does not know any follower next index (apache#7480)
f0a2c87 HDDS-11773. Prevent frequent DataNode Ratis snapshotting. (apache#7473)
1383c18 HDDS-11718. Some CI checks passing despite error (apache#7483)
f98eac2 HDDS-11561. Refactor Open Key Search Endpoint and Consolidate with OmDBInsightEndpoint Using StartPrefix Parameter. (apache#7336)
a99ab27 HDDS-11243. SCM SafeModeRule Support EC. (apache#7008)
6871547 HDDS-11716. Address Incomplete Upgrade Scenario in Recon Upgrade Framework (apache#7452)
9bc9145 HDDS-10411. Support incremental ChunkBuffer checksum calculation (apache#7189)
579a38e HDDS-11723. Tool to better micro benchmark hbase performance in Ozone (apache#7463)
cc1a374 HDDS-11704. Hadoop test leaves running containers in case of failure (apache#7435)
c4b2056 Update documentation to mention that container schemaV3 is default (apache#7481)
20c4cfa HDDS-11386. Multithreading bug in ContainerBalancerTask (apache#7339)
b090312 HDDS-11780. Increase client write retry when SCM is in safe mode (apache#7470)
f8e4db9 HDDS-11793. Bump maven-checkstyle-plugin to 3.6.0 (apache#7476)
6a1ff84 HDDS-11791. Bump commons-io to 2.18.0 (apache#7478)
d7f8235 HDDS-11788. Bump log4j2 to 2.24.2 (apache#7479)
6547de7 HDDS-11790. Bump commons-lang3 to 3.17.0 (apache#7475)
1d8abd6 HDDS-11789. Bump zstd-jni to 1.5.6-7 (apache#7477)
6ca7230 HDDS-11769. Add tools folder into ozone src package. (apache#7466)
3b8ed58 HDDS-11682. Bump maven-resources-plugin to 3.3.0 (apache#7384)
f4a9ee0 HDDS-11702. Merge test_bucket_encryption into robot compatibility test (apache#7451)
d6a5488 HDDS-11713. Use seek to reach the start transaction. (apache#7460)
d52615a HDDS-11733. Remove okio versioning and unused okhttp dependency (apache#7447)
1a49991 HDDS-11617. Update hadoop to 3.4.1 (apache#7376)
9945de6 HDDS-11667. Validating DatanodeID on any request to the datanode (apache#7418)
fc6a2ea HDDS-11650. ContainerId list to track all containers created in a datanode (apache#7402)
a8db9cd HDDS-11749. Extract moveToTrash implementation to client code (apache#7453)
3ba3474 HDDS-11755. mktemp --suffix does not work on Mac (apache#7457)
433c7bb HDDS-11729. Update skipRecon property to skip only frontend build (apache#7454)
6b40003 HDDS-11739. Extract generic unmarshaller for S3 requests (apache#7449)
c7f65e7 HDDS-11740. Add debug command to show internal component versions (apache#7450)
0f7104e HDDS-11708. Recon ListKeys API should return a proper http response status code if NSSummary rebuild is in progress. (apache#7437)
0e0d5e9 HDDS-11163. Improve Heatmap page UI (apache#7420)
2cef393 HDDS-11696. Limit max number of entries in list keys/status response (apache#7431)
e96e314 HDDS-11697. Integrate Ozone Filesystem Implementation with Ozone ListStatusLight API (apache#7440)
ebcbce7 HDDS-11644. Close OMLayoutVersionManager (apache#7445)
20e4969 HDDS-11737. UnsupportedOperationException in S3 setBucketAcl (apache#7448)
b252181 HDDS-10804. Include only limited set of ports in Pipeline proto (apache#6655)
79ca956 HDDS-8829. Symmetric Keys for Delegation Tokens (apache#7394)
3e798e6 HDDS-11698. Use hadoop images from GitHub in CI (apache#7432)
3e278b7 HDDS-10655. Support PutObjectTagging, GetObjectTagging, and DeleteObjectTagging (apache#6756)
036e727 HDDS-11732. Fix ACL check on bucket resolution while reading from snapshot (apache#7446)
dbda703 HDDS-11736. Bump maven-javadoc-plugin to 3.11.1 (apache#7444)
238f232 HDDS-11692. Skip spotbugs for modules with only generated code. (apache#7428)
f60ad61 HDDS-11705. Snapshot operations on linked buckets should work on actual underlying bucket (apache#7434)
dd22dbe HDDS-11615. Add Upgrade Action for Initial Schema Constraints for Unhealthy Container Table in Recon. (apache#7372)
4066c7c HDDS-117. Add convenience methods for port management in DatanodeDetails (apache#7408)
12419fa HDDS-11695. SCM follower should not log NotLeaderException during Pipeline Report processing. (apache#7430)
5275ded HDDS-10133. Add a method to check key name in OMKeyRequest (apache#6012)
fd5c6d8 HDDS-11689. Extract scheduled workflow for populate-cache (apache#7429)
889ba80 HDDS-11653. Bump Ratis to 3.1.2 (apache#7427)
47ec4dd HDDS-11671. Refer to website for supported versions (apache#7412)
10cac80 HDDS-11686. Use ozone image from GitHub in CI (apache#7425)
aa6da3e HDDS-9781. Limited maxOpenFiles, disabled enableCompactionDag, and createCheckpointDirs when creating OMMetadataManager instance for bootstrapping (apache#7095)
9dd6a83 HDDS-11645. Mark TestReconScmSnapshot#testExplicitRemovalOfNode as flaky
8e617dc HDDS-11672. Mark TestSnapshotBackgroundServices#testCompactionLogBackgroundService as flaky
d09e6d4 HDDS-11646. Mark TestXceiverClientMetrics#testMetrics as flaky
8e4a508 HDDS-11668. Recon List Keys API: Reuse key prefix if parentID is the same (apache#7410)
ee63232 HDDS-11684. Remove suppression of HiddenField (apache#7423)
a33d8a3 HDDS-10166. Replace GenericTestUtils temporary directories with `@TempDir` (apache#7399)
3a18a9d HDDS-11664. Hadoop download failure not reported as error (apache#7421)
47c2409 HDDS-64. OzoneClientException should extend IOException. (apache#7403)
27fcd0c HDDS-11685. Use ozone-testkrb5 from GitHub (apache#7424)
5663971 HDDS-11665. Minor optimizations on the write path (apache#7407)
cb81f0c HDDS-11683. Skip shade in most integration checks (apache#7422)
952e0ec HDDS-11681. Bump Bouncy Castle to 1.79 (apache#7387)
2797c45 HDDS-11677. Bump sqlite-jdbc to 3.47.0.0 (apache#7413)
358534b HDDS-11675. Bump maven-site-plugin to 3.21.0 (apache#7414)
cf79245 HDDS-11674. Bump junit to 5.11.3 (apache#7415)
6dd566f HDDS-11583. Use ozone-runner from GitHub in CI (apache#7409)
ef2bf98 HDDS-11669. In OmUtils.normalizeKey isDebugEnabled should be evaluated first (apache#7411)
a7e3014 HDDS-11660. Recon List Key API: Reduce object creation and buffering memory (apache#7405)
5d18b9c HDDS-11659. Improve HSync compatibility test (apache#7404)
4e603aa HDDS-11462. Enhancing DataNode I/O Monitoring Capabilities. (apache#7206)
18f6e8a HDDS-11311. Added Compatibility test for HSync (apache#7400)
0415c0b HDDS-11649. Recon ListKeys API: Simplify filter predicates (apache#7395)
2547ac0 HDDS-11652. Fix SCM start command in SCM-HA doc (apache#7398)
c045839 HDDS-11623. Improve OM Ratis Configuration change log message (apache#7388)
2b1524b HDDS-11609. Switch to Recon v2 UI as the default UI (apache#7358)
efe5892 HDDS-11641. Allow testing Hadoop with custom docker images (apache#7393)
c055036 HDDS-11637. Compile failure is ignored in build check (apache#7389)
67e5261 HDDS-11563. Display OM/SCM service ID as Namespace in web UI (apache#7321)
0fb5e50 HDDS-11587. Ozone Manager not processing file put requests with multi-tenancy enabled (apache#7316)
786bb49 HDDS-11642. MutableQuantiles should be stopped (apache#7392)
76ec9b9 HDDS-11639. Upgrade ozone-runner to Rocky Linux 9.3 (apache#7391)
3bc3b8a HDDS-11621. Fix missing HADOOP_ variables in MR acceptance test (apache#7375)
6f9db61 HDDS-11200. Hsync client-side metrics (apache#7371)
58d1443 HDDS-10240. Cleanup zero-copy EC (apache#7340)
5b065d8 HDDS-11638. Bump cyclonedx-maven-plugin to 2.9.0 (apache#7383)
c7a196f HDDS-11635. Memory leak when using Ozone FS via Hadoop FileContext API (apache#7382)
c9956a1 HDDS-11601. Intermittent failure in acceptance balancer test. (apache#7343)
a737fc3 HDDS-11619. Remove dependency on hadoop-shaded-guava (apache#7373)
c4d6857 HDDS-11584. Document ozone debug ldb command (apache#7313)
dded26e HDDS-11588. Add main artifact jar to classpath file (apache#7324)
afed6d9 HDDS-11558. Make OM client retry idempotent (apache#7329)
e85b32d HDDS-11591. Copy dependencies when building each module (apache#7325)
72e56d7 HDDS-11601. Disable flaky EC balancer acceptance test
ab16cbe HDDS-11507. Add error information to log while handling ServiceException. (apache#7367)
980b960 HDDS-11380. Make node decommission error message more comprehensive (apache#7155)
61c094f HDDS-11614. Speed up TestTransferLeadershipShell (apache#7370)
91188b3 HDDS-11352. Remove Flaky annotation from TestOzoneManagerHAWithStoppedNodes using Ratis 3.1.1
faf133d HDDS-11220. Initialize block length using the chunk list from DataNode before seek (apache#7221)
91d41a0 HDDS-11465. Introducing Schema Versioning for Recon to Handle Fresh Installs and Upgrades. (apache#7213)
7a27db2 HDDS-11134. Create compatibility test for FSO bucket usage (apache#7350)
30906d1 HDDS-11612. Bump jnr-posix to 3.1.20 (apache#7360)
bed4aef HDDS-11611. Bump docker-maven-plugin to 0.45.1 (apache#7362)
e2c3d57 HDDS-11610. Bump maven-dependency-plugin to 3.8.1 (apache#7361)
24c1000 HDDS-11041. Add admin request filter for S3 requests and UGI support for GrpcOmTransport (apache#7268)
0b84998 HDDS-11594. Update batchPut buffer log for rocksdb. (apache#7356)
c013516 HDDS-11608. Client should not retry invalid protobuf request (apache#7354)
32a8c09 HDDS-11160. Improve Insights page UI (apache#7327)
782ad62 HDDS-11388. Fix unnecessary call to the DB for ContainerBalancer#getBalancerStatusInfo (apache#7224)
35b6a3a HDDS-11600. Intermittent failure in repro due to ordering differences in builddef.lst (apache#7342)
e7bf154 HDDS-11132. Revert client version bump done as part of HDDS-10983 (apache#7348)
ea5cbff HDDS-11602. Bump ozone-runner to 20241022-jdk17-1 (apache#7347)
3f98df5 HDDS-11580. Validate 'hdds.datanode.dir.du.reserved' property (apache#7328)
8568075 HDDS-11570. Fix HDDS Docs build failure with Hugo v0.135.0 (apache#7337)
721ae58 HDDS-11057. Enable reproducible builds (apache#6856)
86b7aae HDDS-11205. Implement a search feature for users to locate keys pending Deletion within the OM Deleted Keys Insights section (apache#6969)
f7b428d HDDS-11503. Add Robot test to verify Container Balancer for EC containers. (apache#7311)
85eb89b HDDS-11483. Make s3g object get and put operation buffer configurable (apache#7233)
515977a HDDS-11582. Bump body-parser to 1.20.3 (apache#7307)
9b66267 HDDS-11589. ReconSCMDBDefinition should be singleton. (apache#7323)
f784a84 HDDS-11578. Unify constants for RATIS_SNAPSHOT_DIR (apache#7310)
4670a5e HDDS-11498. Improve SCM deletion efficiency. (apache#7249)
3fb2cf0 HDDS-11108. Extract keywords for multipart upload tests (apache#7318)
4b24aa9 HDDS-11545. [UI] Add OM and SCM ID information (apache#7287)
860e269 HDDS-11538. Let coverage report link to java sources (apache#7280)
2139367 HDDS-11581. Remove duplicate ContainerStateMachine#RaftGroupId (apache#7312)
64e035d HDDS-11573. Remove lib/gson-2.10.1.jar (apache#7309)
ce07a3c HDDS-11456. Require successful dependency/licence checks for acceptance/compile/kubernetes (apache#7209)
c579d06 HDDS-11574. Ozone client leak in TestS3SDKV1 (apache#7308)
c044b79 HDDS-10390. MiniOzoneCluster to support S3 Gateway (apache#7281)
8eef589 HDDS-11557. Simplify DBColumnFamilyDefinition. (apache#7298)
4c77f6b HDDS-11562. Parameterize TestSCMNodeManager#testProcessLayoutVersion (apache#7300)
b51c4b3 HDDS-11572. Bump commons-io to 2.17.0 (apache#7305)
3a37870 HDDS-11571. Bump log4j2 to 2.24.1 (apache#7301)
494798c HDDS-11564. Mark TestBlockOutputStream#testWriteExactly... as flaky
fabf512 HDDS-11569. Bump restrict-imports-enforcer-rule from 2.5.0 to 2.6.0 (apache#7303)
1e62a0a HDDS-11568. Bump commons-codec to 1.17.1 (apache#7304)
e9f92a7 HDDS-11567. Bump common-custom-user-data-maven-extension to 2.0.1 (apache#7302)
cb44d5e HDDS-11555. SCMDBDefinition should be singleton. (apache#7296)
d473134 HDDS-11486. Reduce log level for NativeLibraryNotLoadedException in SnapshotDiffManager (apache#7290)
3348d91 HDDS-11564. Mark TestBlockOutputStream as flaky
e2f2aeb HDDS-11548. Add some logging to the StateMachine (apache#7291)
523c860 HDDS-11439. De-duplicate code for ReplicatedFileChecksumHelper and ECFileChecksumHelper (apache#7264)
05a409e HDDS-11519. Clean up unused lines in BlockOutputStream
ffe7198 HDDS-11544. Improve work with arrays (apache#7286)
5657604 HDDS-11556. Add a getTypeClass method to Codec. (apache#7295)
256aad9 HDDS-11546. Add regex operation to filter option of ldb scan command. (apache#7289)
7ef7de2 HDDS-11482. EC Checksum throws IllegalArgumentException because the buffer limit is negative (apache#7230)
77c17df HDDS-11551. Provide details about integration check failure (apache#7294)
911a583 HDDS-8188. Support max allowed length in response of ozone admin container list (apache#7181)
7f2e0e3 HDDS-11554. OMDBDefinition should be singleton. (apache#7292)
170761c HDDS-11547. Make MAVEN_OPTS optional (apache#7288)
4846e97 HDDS-11543. Track OzoneClient object leaks via LeakDetector framework. (apache#7285)
e00f7ae HDDS-11159. Improve Containers page UI (apache#7267)
cfda951 HDDS-11520. Fix Delete pending directories key mapping (apache#7269)
2e3de8a HDDS-11476. Implement lesser/greater operation for --filter option of ldb scan command (apache#7222)
06ccdb3 HDDS-11526. Fix hdds.datanode.metadata.rocksdb.cache.size default value mismatch (apache#7284)
b3afaec HDDS-11535. Incomplete SCM roles table header (apache#7278)
ed2a073 HDDS-11536. Bump macOS runner version to macos-13 (apache#7279)
1887f83 HDDS-11537. Bump frontend-maven-plugin to 1.15.1 (apache#7276)
1f1e618 HDDS-6776. Cleanup TestSCMSafeModeManager (apache#7272)
4bee3e9 HDDS-11534. Bump cyclonedx-maven-plugin to 2.8.2 (apache#7277)
789fb53 HDDS-11533. Bump maven-gpg-plugin to 3.2.7 (apache#7275)
eb26677 HDDS-11268. Add --table mode for OM/SCM Roles CLI (apache#7016)
28ea480 HDDS-11527. Avoid unnecessary duplicate build (apache#7270)
30da31f HDDS-3498. Shutdown datanode if address is already in use (apache#7256)
2401d27 HDDS-11046. Coverage decreased due to running tests with Java 17 (apache#7263)
78d8418 HDDS-11524. Bump snappy-java to 1.1.10.7 (apache#7202)
8747c0e HDDS-11518. Recon OmDB Insights show isKey=true for directories (apache#7260)
5d2bbc3 HDDS-11480. Refactor OM volume response tests (apache#7265)
31f9f2c HDDS-11517. Update version to 2.0.0-SNAPSHOT (apache#7258)
a0f0872 HDDS-11444. Make Datanode Command metrics consistent across all commands (apache#7191)
d3b63c6 HDDS-11492. Directory deletion get stuck having millions of directory (apache#7254)
f52f0af HDDS-11127. [hsync] Improve test coverage for XceiverClientRatis.java (apache#7225)
360fea5 HDDS-11494. Improve the duration option of freon ombg (apache#7246)
10d3b21 HDDS-11504. Update Ratis to 3.1.1. (apache#7257)
ce46297 HDDS-11162. Improve Disk Usage page UI (apache#7214)
c91f1c7 HDDS-11491. Avoid sharing clientId among deleting services (apache#7250)
b0943d5 HDDS-11501. Improve logging in XceiverServerRatis (apache#7252)
55925ab HDDS-11502. Class path contains multiple SLF4J providers (apache#7255)
d0ad836 HDDS-11472. Avoid recreating external access authorizer on OM state reload (apache#7238)
254db9e HDDS-11500. RootCARotationManager cancelling wrong task in notifyStatusChanged (apache#7251)
1e6e4b3 HDDS-11499. Remove redundant code from ECReconstructionCoordinator. (apache#7248)
adb2821 HDDS-11490. Bump rollup to 3.29.5 (apache#7232)
189a9fe HDDS-11484. Validate javadoc in CI (apache#7245)
64a29c6 HDDS-11497. Bump commons-configuration2 to 2.11.0 (apache#7242)
95cfadd HDDS-11496. Bump maven-install-plugin to 3.1.3 (apache#7244)
0a999cf HDDS-11493. Bump sqlite-jdbc to 3.46.1.3 (apache#7243)
a214a31 HDDS-11329. Update Ozone images to Rocky Linux-based runner (apache#7241)
56ddb85 HDDS-11371. Handle cases where OM does not have getServerDefaults() implemented. (apache#7130)
b5097c7 HDDS-11347. Add rocks_tools_native lib check in Ozone CLI checknative subcommand (apache#7101)
fb0bf77 HDDS-11489. Bump maven-site-plugin to 3.20.0 (apache#7226)
70e6e40 HDDS-11122. Fix javadoc warnings (apache#7234)
acf3fdc HDDS-11458. Selective checks: trigger checkstyle for properties file changes (apache#7196)
6b87207 HDDS-11469. Statistics of Pipeline and Container (apache#7217)
1b8468b HDDS-11411. Snapshot garbage collection should not run when the keys are moved from a deleted snapshot to the next snapshot in the chain (apache#7193)
1f86ce8 HDDS-10617. Unexpected number of files in ITestS3AContractGetFileStatusV1List (apache#7208)
73a3bcc HDDS-11467. Bump vite to 4.5.5 (apache#7212)
d45aa1d HDDS-11460. Bump express to 4.21.0 (apache#7197)
e2e30b8 HDDS-11354. Intermittent failure in TestOzoneManagerSnapshotAcl#testLookupKeyWithNotAllowedUserForPrefixAcl (apache#7205)
0fcb645 HDDS-11477. [doc] Add configuration description for datanode docs (apache#7223)
3598ee3 HDDS-11464. Removed unused constants from OzoneConsts. (apache#7207)
8c0b54e HDDS-11408. Snapshot rename table entries are propagated incorrectly on snapshot deletes (apache#7200)
719bdf9 HDDS-11396. NPE due to empty Handler#clusterId (apache#7145)
40c4001 HDDS-10479. Add ozone admin ratis local raftMetaConf (apache#7170)
45f9138 HDDS-11394. Fix pipeline close --all command (apache#7138)
2b196d1 HDDS-11468. Enabled DB sync button (apache#7216)
d3899d2 Clean up files created after TestKeyValueHandlerWithUnhealthyContainer#testMarkContainerUnhealthyInFailedVolume (apache#7219)
70b8dd5 HDDS-11157. Improve Datanodes page UI (apache#7168)
151709a HDDS-11446. Downgrade picocli to 4.7.5 due to regression (apache#7215)
7a26aff HDDS-11158. Improve Pipelines page UI (apache#7171)
c365aa0 HDDS-11181. Cleanup of unnecessary try-catch blocks (apache#7210)
88dd436 HDDS-11423. Implement equals operation for --filter option to ozone ldb scan (apache#7167)
e0060a8 HDDS-11196. Improve SCM WebUI Display (apache#6960)
22ddfb9 Revert "HDDS-11456. Require successful dependency/licence checks for acceptance/compile/kubernetes (apache#7192)"
9f5bf43 HDDS-11457. Internal error on S3 CompleteMultipartUpload if parts are not specified (apache#7195)
10c47a1 HDDS-11459. Bump develocity-maven-extension to 1.22.1 (apache#7201)
50f2563 HDDS-11419. Fix waitForCheckpointDirectoryExist log message (apache#7199)
a7d7e37 HDDS-11456. Require successful dependency/licence checks for acceptance/compile/kubernetes (apache#7192)
5feb9ea HDDS-11453. OmSnapshotPurge should be in a different ozone manager double buffer batch (apache#7188)
703c4d5 HDDS-10984. Tool to restore SCM certificates from RocksDB. (apache#6781)
d221065 HDDS-11440. Add a lastTransactionInfo field in SnapshotInfo to check for transactions in flight on the snapshot (apache#7179)
e573701 HDDS-11448. Improve documentation in ContainerStateMachine (apache#7183)
0e49f7a HDDS-11449. Remove unnecessary log from client console. (apache#7184)
cd251f2 HDDS-11438. Ensure DataInputBuffer is closed in OMPBHelper#convert (apache#7182)
4b47812 HDDS-11389. Incorrect number of deleted containers shown in Recon UI. (apache#7149)
0915f0b HDDS-10985. EC Reconstruction failed because the size of currentChunks was not equal to checksumBlockDataChunks. (apache#7009)
0f16195 HDDS-11416. refactor ratis submit request avoid code duplicate (apache#7166)
86fe920 HDDS-11376. Improve ReplicationSupervisor to record replication metrics (apache#7140)
883a63f HDDS-11441. ozone sh key put should only accept positive expectedGeneration (apache#7180)
33dbd4a HDDS-11357. Datanode Usageinfo Support Display Pipeline. (apache#7105)
9477aa6 HDDS-11436. Minor update in Recon API handling. (apache#7178)
8ca33c7 HDDS-11414. Key listing for FSO buckets fails with forward client (apache#7161)
f1ebd39 HDDS-11435. Bump sqlite-jdbc to 3.46.1.0 (apache#7174)
aaf8bd0 HDDS-11434. Bump log4j2 to 2.24.0 (apache#7176)
3510ce7 HDDS-11433. Bump Jetty to 9.4.56.v20240826 (apache#7175)
0047cd2 HDDS-11400. Bump maven-core to 3.9.9 (apache#7144)
274da83 HDDS-10488. Datanode OOM due to run out of mmap handler (apache#6690)
7a452ca HDDS-11391. Addendum to fix test failure.
7e1d9b0 HDDS-11145. ozone admin om cancelprepare --service-id improvement (apache#7159)
6888cf2 HDDS-11383. Improve read key dashboard to include add the read key related OM metrics. (apache#7131)
3e0d76c HDDS-11369. [hsync] Remove KeyOutputStreamSemaphore logs (apache#7136)
b23981c HDDS-11342. [hsync] Add a config as HBase-related features master switch (apache#7126)
3e1188a HDDS-11285. cli to trigger quota repair and status (apache#7104)
2e33978 HDDS-11401. Code cleanup in DatanodeStateMachine (apache#7146)
f563d67 HDDS-11391. Frequent Ozone DN Crashes During OM + DN Decommission with Freon (apache#7154)
18b28d2 HDDS-11312. [hsync] Added upgrade tests (apache#7110)
b29beb3 HDDS-11350. NullPointerException thrown on checking container balancer status (apache#7134)
111b9df HDDS-11407. Use OMLayoutFeature.HBASE_SUPPORT for HSYNC (apache#7152)
966b8d0 HDDS-11390. Removed hsync and hflush capability check in ContentGenerator (apache#7153)
877504a HDDS-11156. Improve Buckets page UI (apache#7100)
814f78f HDDS-11392. ChecksumByteBufferImpl's static initializer fails with java 17+ (apache#7135)
b5e1a8b HDDS-11398. Bump commons-compress to 1.27.1 (apache#7142)
a8e3ea9 HDDS-11397. Bump Jersey2 to 2.45 (apache#7141)
5992837 HDDS-11399. Bump maven-deploy-plugin to 3.1.3 (apache#7143)
47564bb HDDS-11359. Intermittent timeout in TestPipelineManagerMXBean#testPipelineInfo (apache#7132)
cc4e026 HDDS-11304. Make up for the missing functionality in CommandDispatcher (apache#7062)
2d372f6 HDDS-11339. Let PrometheusServlet rely on periodically published metrics (apache#7092)
f22c6f8 HDDS-11164. Improve Navbar UI (apache#7088)
23211c1 HDDS-11381. Adding logging for sortByDistanceCost in NetworkTopologyImpl (apache#7133)
3e9cdb6 HDDS-11378. Allow disabling OM version-specific feature via config (apache#7129)
23f3e5b HDDS-11152. OMDoubleBuffer error when handling snapshot's background operations (apache#7112)
5659b7e HDDS-11375. DN startup fails due to illegal configuration of raft.grpc.message.size.max (apache#7128)
41d8147 HDDS-11368. Remove dependency on Babel in Vite (apache#7119)
0bd8ba1 HDDS-11372. No coverage for org.apache.ozone packages (apache#7124)
3bd237d HDDS-11325. (addendum) Intermittent failure in TestBlockOutputStreamWithFailures#testContainerClose (apache#7121)
8306290 HDDS-11373. Log for EC reconstruction command lists the missing indexes as ASCII control characters (apache#7123)
dab1538 HDDS-11216. Replace HAUtils#buildCAX509List usages with other direct usages (apache#6981)
51a5fb9 Revert "HDDS-11235. Spare InfoBucket RPC call in FileSystem#mkdir() call. (apache#6990)" (apache#7122)
fab56b4 HDDS-11229. Chain optionals in Recon Insight (apache#7064)
2236041 HDDS-11365. Fix the NOTICE file (apache#7120)
2e30dc1 HDDS-11190. Add --fields option to ldb scan command (apache#6976)
0b75cb0 HDDS-11251. Deprecate definitions and remove listTrash and recoverTrash APIs (apache#7060)
be34303 HDDS-9198. Maintain local cache in OMSnapshotPurgeRequest to get updated snapshotInfo and pass the same to OMSnapshotPurgeResponse (apache#7045)
c07b408 HDDS-11208. Change RatisBlockOutputStream to use HDDS-11174. (apache#7072)
8f8d809 HDDS-11309. Increase CONTAINER_STATE Column Length in UNHEALTHY_CONTAINERS to Avoid Truncation (apache#7071)
350a340 HDDS-11364. Bump jgraphx to 3.9.12 (apache#7116)
45b7056 HDDS-11363. Bump develocity-maven-extension to 1.22 (apache#7115)
9dd18f1 HDDS-11362. Bump snappy-java to 1.1.10.6 (apache#7114)
637cb91 HDDS-11361. Bump Jersey2 to 2.44 (apache#7113)

Conflicts:
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/OzoneContainer.java
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueHandlerWithUnhealthyContainer.java
hadoop-ozone/dist/src/main/smoketest/admincli/container.robot

Modified during conflict:
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/DNContainerOperationClient.java
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/ReconcileContainerTask.java
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestECKeyOutputStream.java
aswinshakil added a commit to aswinshakil/ozone that referenced this pull request Dec 19, 2024
…239-container-reconciliation-merge

Commits:
0066526 HDDS-11869. Enable OM Ratis in TestOzoneDelegationTokenSecretManager (apache#7594)
4fe166d HDDS-11957. Make breadcrumb scrollable for long path names in DiskUsage page (apache#7590)
a523e17 HDDS-11846. [Recon] Recon Schema version_number column is always set as -1. (apache#7554)
f5ff2f0 HDDS-11868. Enable OM Ratis in TestQuotaRepairTask (apache#7593)
3a0e3b5 HDDS-11845. Extract k8s definitions for HttpFS and Recon from getting-started example (apache#7523)
6e0c753 HDDS-11509. logging improvements for directory deletion (apache#7314)
1f29e05 HDDS-11934. Split compat suite to old/new (apache#7578)
bde8cf4 HDDS-11759. Remove LegacyReplicationManager (apache#7580)
a27e4ec HDDS-11779. Add DN metrics to show deletion progress (apache#7552)
976e45f HDDS-11711. Add SCM metrics for delete commands sent and response received per datanode (apache#7522)
c28e16e HDDS-11950. Enable sortpom in dev-support module. (apache#7586)
dae388b HDDS-11907. OzoneSecretKey does not need to implement Writable (apache#7574)
8bb0587 HDDS-11712. Process other DeletedBlocksTransaction before retrying failed one. (apache#7532)
3648b59 HDDS-11906. Add sortpom dependency, sort root POM. (apache#7555)
54f0272 HDDS-11807. Make callId different for each request in openKeyCleanupService (apache#7551)
c523825 HDDS-11926 - Rename bucket name for bucket info/ls for linked buckets (apache#7581)
daf2f9f HDDS-11863. Speed up TestFSORepairTool (apache#7561)
f5e5493 HDDS-11927. Fix flaky TestContainerBalancerStatusInfo.testGetCurrentStatisticsWhileBalancingInProgress (apache#7579)
bef2415 HDDS-11940. Bump jline to 3.28.0 (apache#7576)
1453fd9 HDDS-11935. Bump develocity-maven-extension to 1.23 (apache#7577)
202b0c7 HDDS-11860. Improve BufferUtils.writeFully. (apache#7564)
008f9a6 HDDS-11852. Reduce duplication in some GenericCli subclasses (apache#7553)
7a46080 HDDS-11914. Snapshot diff should not filter SST Files based by reading SST file reader (apache#7563)
1835326 HDDS-11927. Mark testGetCurrentStatisticsWhileBalancingInProgress as flaky
66ccc25 HDDS-11908. Snapshot diff DAG traversal should not skip node based on prefix presence (apache#7567)
16ba289 Revert "HDDS-11413. PipelineManagerImpl lock optimization reduces AllocateBlock latency (apache#7160)"
bf6f323 HDDS-11413. PipelineManagerImpl lock optimization reduces AllocateBlock latency (apache#7160)
853d657 HDDS-11893. Fix full snapshot diff fallback logic because of DAG pruning (apache#7549)
b5d04e2 HDDS-11915. Netty OpenSsl not available in acceptance tests on arm64 (apache#7570)
8536490 HDDS-11367. Fix flaky balancer robot test (apache#7569)
745ed1c HDDS-11367. Improve ozone balancing status command output (apache#7139)
6b9cbe0 HDDS-11909. Intermittent timeout building Hadoop in s3a test (apache#7559)
eea5600 HDDS-11911. Return consistent error code when snapshot is not found in the DB or Snapshot Chain. (apache#7557)
e8f3b25 HDDS-11873. Skip old-only xcompat read tests (apache#7534)
ec348a7 HDDS-11889. Include Maven dependencies for hdds-rocks-native in cache (apache#7546)
aa37ae8 HDDS-11885. Download Hadoop for S3A test from mirrors if available (apache#7545)
befd64e HDDS-11694. Safemode Improvement: Introduce factory class to create safemode rules. (apache#7433)
a46153d HDDS-11872. Disable Apache snapshots repo (apache#7536)
092b000 HDDS-11890. Update project description in GitHub (apache#7547)
80c6446 HDDS-8101. Add tool to repair broken FSO tree (apache#7368)
23197e2 HDDS-11605. Directory deletion service should support multiple threads (apache#7349)
055b13c HDDS-11751. Use Java 21 in CI (apache#7458)
d0d82c5 HDDS-11886. Bump license-maven-plugin to 2.5.0 (apache#7539)
f7fe30a HDDS-11691. Support object tags in ObjectEndpointStreaming#put (apache#7543)
9854591 HDDS-11882. Make BOM, not aggregate one (apache#7544)
af345b2 HDDS-11877. Enable Maven cache for more checks (apache#7538)
51c6ed6 HDDS-11830. Subcommands should not extend GenericCli .(apache#7537)
959a39d HDDS-11851. Finer-grained subcommand interface for OzoneDebug and OzoneRepair. (apache#7526)
d4c41e5 HDDS-11334. Improve EC xcompat acceptance test (apache#7492)
8526f2e HDDS-11826. Interactive mode for ozone shell. (apache#7515)
fc63710 HDDS-11833. Return NotImplemented for S3 put-object-acl request. (apache#7531)
e8ad7ad HDDS-11782. ozone debug ldb --with-keys defaults to false instead of true (apache#7521)
cb0a402 HDDS-11859. Remove mention of fuse from s3 interface docs page (apache#7530)
e17f92c HDDS-10821. Ensure ChunkBuffer fully writes buffer to FileChannel (apache#6652)
d66c088 HDDS-11848. Serialization bug in Recon listKeys API (apache#7524)
faab1e8 HDDS-11656. Default native ACL limits to user and user's primary group (apache#7455)
8a1967e HDDS-11719. Remove dependency on server components from ozonefs-common (apache#7438)
9fcecc1 HDDS-11855. Mark TestContainerBalancerDatanodeNodeLimit#checkIterationResultException as flaky
65df308 HDDS-11794. Display HostName in OM / SCM Overview. (apache#7482)
0bde3a2 HDDS-11410. Refactoring more tests from TestContainerBalancerTask (apache#7156)
98e070e HDDS-11728. Refactor subcommand layouts of ozone debug and repair (apache#7489)
1c5676b HDDS-11849. Mark TestBlockOutputStreamWithFailures.test2DatanodesFailure as flaky
9dd8bb9 HDDS-11847. Mark TestSnapshotDeletingServiceIntegrationTest#testParallelExcecutionOfKeyDeletionAndSnapshotDeletion as flaky
b27714f HDDS-11266. Update proto.lock for Ozone 1.4.1 (apache#7504)
9b61937 HDDS-11806. Add HttpFS and Recon in getting-started k8s example (apache#7485)
77ce962 HDDS-10568. When the ldb command is executed, it is output by line (apache#7467)
69538b0 HDDS-11687. Robot warning: replace "is not" with "!=" (apache#7516)
b60d897 HDDS-11820. Create test principals at test run time (apache#7507)
6ba309a HDDS-11831. Finer-grained interface for dynamically registered subcommands (apache#7514)
db36e39 HDDS-11822. Register subcommands in OzoneShell (apache#7513)
e6bd3f5 HDDS-11829. Bump zstd-jni to 1.5.6-8 (apache#7510)
09a4a90 HDDS-11827. Bump exec-maven-plugin to 3.5.0 (apache#7512)
ad867bc HDDS-11824. Bump sqlite-jdbc to 3.47.1.0 (apache#7511)
850306d HDDS-11823. Bump cyclonedx-maven-plugin to 2.9.1 (apache#7508)
34f9d9e HDDS-11742. Update metrics with leaderId if known when starting SCM (apache#7471)
2c6c116 HDDS-11265. Add Ozone 1.4.1 to compatibility acceptance tests (apache#7503)
ebcdc6a HDDS-11810. Secure acceptance test on arm64 fails with LoginException: Checksum failed (apache#7498)
7f40624 HDDS-11821. Mark TestECKeyOutputStream#testECKeyCreatetWithDatanodeIdChange as unhealthy
9b26156 HDDS-11811. rocksdbjni deleted on exit could be used by other components apache#7493
3f92663 HDDS-11785. DataNode aborts ContainerStateMachine if it does not know any follower next index (apache#7480)
f0a2c87 HDDS-11773. Prevent frequent DataNode Ratis snapshotting. (apache#7473)
1383c18 HDDS-11718. Some CI checks passing despite error (apache#7483)
f98eac2 HDDS-11561. Refactor Open Key Search Endpoint and Consolidate with OmDBInsightEndpoint Using StartPrefix Parameter. (apache#7336)
a99ab27 HDDS-11243. SCM SafeModeRule Support EC. (apache#7008)
6871547 HDDS-11716. Address Incomplete Upgrade Scenario in Recon Upgrade Framework (apache#7452)
9bc9145 HDDS-10411. Support incremental ChunkBuffer checksum calculation (apache#7189)
579a38e HDDS-11723. Tool to better micro benchmark hbase performance in Ozone (apache#7463)
cc1a374 HDDS-11704. Hadoop test leaves running containers in case of failure (apache#7435)
c4b2056 Update documentation to mention that container schemaV3 is default (apache#7481)
20c4cfa HDDS-11386. Multithreading bug in ContainerBalancerTask (apache#7339)
b090312 HDDS-11780. Increase client write retry when SCM is in safe mode (apache#7470)
f8e4db9 HDDS-11793. Bump maven-checkstyle-plugin to 3.6.0 (apache#7476)
6a1ff84 HDDS-11791. Bump commons-io to 2.18.0 (apache#7478)
d7f8235 HDDS-11788. Bump log4j2 to 2.24.2 (apache#7479)
6547de7 HDDS-11790. Bump commons-lang3 to 3.17.0 (apache#7475)
1d8abd6 HDDS-11789. Bump zstd-jni to 1.5.6-7 (apache#7477)
6ca7230 HDDS-11769. Add tools folder into ozone src package. (apache#7466)
3b8ed58 HDDS-11682. Bump maven-resources-plugin to 3.3.0 (apache#7384)
f4a9ee0 HDDS-11702. Merge test_bucket_encryption into robot compatibility test (apache#7451)
d6a5488 HDDS-11713. Use seek to reach the start transaction. (apache#7460)
d52615a HDDS-11733. Remove okio versioning and unused okhttp dependency (apache#7447)
1a49991 HDDS-11617. Update hadoop to 3.4.1 (apache#7376)
9945de6 HDDS-11667. Validating DatanodeID on any request to the datanode (apache#7418)
fc6a2ea HDDS-11650. ContainerId list to track all containers created in a datanode (apache#7402)
a8db9cd HDDS-11749. Extract moveToTrash implementation to client code (apache#7453)
3ba3474 HDDS-11755. mktemp --suffix does not work on Mac (apache#7457)
433c7bb HDDS-11729. Update skipRecon property to skip only frontend build (apache#7454)
6b40003 HDDS-11739. Extract generic unmarshaller for S3 requests (apache#7449)
c7f65e7 HDDS-11740. Add debug command to show internal component versions (apache#7450)
0f7104e HDDS-11708. Recon ListKeys API should return a proper http response status code if NSSummary rebuild is in progress. (apache#7437)
0e0d5e9 HDDS-11163. Improve Heatmap page UI (apache#7420)
2cef393 HDDS-11696. Limit max number of entries in list keys/status response (apache#7431)
e96e314 HDDS-11697. Integrate Ozone Filesystem Implementation with Ozone ListStatusLight API (apache#7440)
ebcbce7 HDDS-11644. Close OMLayoutVersionManager (apache#7445)
20e4969 HDDS-11737. UnsupportedOperationException in S3 setBucketAcl (apache#7448)
b252181 HDDS-10804. Include only limited set of ports in Pipeline proto (apache#6655)
79ca956 HDDS-8829. Symmetric Keys for Delegation Tokens (apache#7394)
3e798e6 HDDS-11698. Use hadoop images from GitHub in CI (apache#7432)
3e278b7 HDDS-10655. Support PutObjectTagging, GetObjectTagging, and DeleteObjectTagging (apache#6756)
036e727 HDDS-11732. Fix ACL check on bucket resolution while reading from snapshot (apache#7446)
dbda703 HDDS-11736. Bump maven-javadoc-plugin to 3.11.1 (apache#7444)
238f232 HDDS-11692. Skip spotbugs for modules with only generated code. (apache#7428)
f60ad61 HDDS-11705. Snapshot operations on linked buckets should work on actual underlying bucket (apache#7434)
dd22dbe HDDS-11615. Add Upgrade Action for Initial Schema Constraints for Unhealthy Container Table in Recon. (apache#7372)
4066c7c HDDS-117. Add convenience methods for port management in DatanodeDetails (apache#7408)
12419fa HDDS-11695. SCM follower should not log NotLeaderException during Pipeline Report processing. (apache#7430)
5275ded HDDS-10133. Add a method to check key name in OMKeyRequest (apache#6012)
fd5c6d8 HDDS-11689. Extract scheduled workflow for populate-cache (apache#7429)
889ba80 HDDS-11653. Bump Ratis to 3.1.2 (apache#7427)
47ec4dd HDDS-11671. Refer to website for supported versions (apache#7412)
10cac80 HDDS-11686. Use ozone image from GitHub in CI (apache#7425)
aa6da3e HDDS-9781. Limited maxOpenFiles, disabled enableCompactionDag, and createCheckpointDirs when creating OMMetadataManager instance for bootstrapping (apache#7095)
9dd6a83 HDDS-11645. Mark TestReconScmSnapshot#testExplicitRemovalOfNode as flaky
8e617dc HDDS-11672. Mark TestSnapshotBackgroundServices#testCompactionLogBackgroundService as flaky
d09e6d4 HDDS-11646. Mark TestXceiverClientMetrics#testMetrics as flaky
8e4a508 HDDS-11668. Recon List Keys API: Reuse key prefix if parentID is the same (apache#7410)
ee63232 HDDS-11684. Remove suppression of HiddenField (apache#7423)
a33d8a3 HDDS-10166. Replace GenericTestUtils temporary directories with `@TempDir` (apache#7399)
3a18a9d HDDS-11664. Hadoop download failure not reported as error (apache#7421)
47c2409 HDDS-64. OzoneClientException should extend IOException. (apache#7403)
27fcd0c HDDS-11685. Use ozone-testkrb5 from GitHub (apache#7424)
5663971 HDDS-11665. Minor optimizations on the write path (apache#7407)
cb81f0c HDDS-11683. Skip shade in most integration checks (apache#7422)
952e0ec HDDS-11681. Bump Bouncy Castle to 1.79 (apache#7387)
2797c45 HDDS-11677. Bump sqlite-jdbc to 3.47.0.0 (apache#7413)
358534b HDDS-11675. Bump maven-site-plugin to 3.21.0 (apache#7414)
cf79245 HDDS-11674. Bump junit to 5.11.3 (apache#7415)
6dd566f HDDS-11583. Use ozone-runner from GitHub in CI (apache#7409)
ef2bf98 HDDS-11669. In OmUtils.normalizeKey isDebugEnabled should be evaluated first (apache#7411)
a7e3014 HDDS-11660. Recon List Key API: Reduce object creation and buffering memory (apache#7405)
5d18b9c HDDS-11659. Improve HSync compatibility test (apache#7404)
4e603aa HDDS-11462. Enhancing DataNode I/O Monitoring Capabilities. (apache#7206)
18f6e8a HDDS-11311. Added Compatibility test for HSync (apache#7400)
0415c0b HDDS-11649. Recon ListKeys API: Simplify filter predicates (apache#7395)
2547ac0 HDDS-11652. Fix SCM start command in SCM-HA doc (apache#7398)
c045839 HDDS-11623. Improve OM Ratis Configuration change log message (apache#7388)
2b1524b HDDS-11609. Switch to Recon v2 UI as the default UI (apache#7358)
efe5892 HDDS-11641. Allow testing Hadoop with custom docker images (apache#7393)
c055036 HDDS-11637. Compile failure is ignored in build check (apache#7389)
67e5261 HDDS-11563. Display OM/SCM service ID as Namespace in web UI (apache#7321)
0fb5e50 HDDS-11587. Ozone Manager not processing file put requests with multi-tenancy enabled (apache#7316)
786bb49 HDDS-11642. MutableQuantiles should be stopped (apache#7392)
76ec9b9 HDDS-11639. Upgrade ozone-runner to Rocky Linux 9.3 (apache#7391)
3bc3b8a HDDS-11621. Fix missing HADOOP_ variables in MR acceptance test (apache#7375)
6f9db61 HDDS-11200. Hsync client-side metrics (apache#7371)
58d1443 HDDS-10240. Cleanup zero-copy EC (apache#7340)
5b065d8 HDDS-11638. Bump cyclonedx-maven-plugin to 2.9.0 (apache#7383)
c7a196f HDDS-11635. Memory leak when using Ozone FS via Hadoop FileContext API (apache#7382)
c9956a1 HDDS-11601. Intermittent failure in acceptance balancer test. (apache#7343)
a737fc3 HDDS-11619. Remove dependency on hadoop-shaded-guava (apache#7373)
c4d6857 HDDS-11584. Document ozone debug ldb command (apache#7313)
dded26e HDDS-11588. Add main artifact jar to classpath file (apache#7324)
afed6d9 HDDS-11558. Make OM client retry idempotent (apache#7329)
e85b32d HDDS-11591. Copy dependencies when building each module (apache#7325)
72e56d7 HDDS-11601. Disable flaky EC balancer acceptance test
ab16cbe HDDS-11507. Add error information to log while handling ServiceException. (apache#7367)
980b960 HDDS-11380. Make node decommission error message more comprehensive (apache#7155)
61c094f HDDS-11614. Speed up TestTransferLeadershipShell (apache#7370)
91188b3 HDDS-11352. Remove Flaky annotation from TestOzoneManagerHAWithStoppedNodes using Ratis 3.1.1
faf133d HDDS-11220. Initialize block length using the chunk list from DataNode before seek (apache#7221)
91d41a0 HDDS-11465. Introducing Schema Versioning for Recon to Handle Fresh Installs and Upgrades. (apache#7213)
7a27db2 HDDS-11134. Create compatibility test for FSO bucket usage (apache#7350)
30906d1 HDDS-11612. Bump jnr-posix to 3.1.20 (apache#7360)
bed4aef HDDS-11611. Bump docker-maven-plugin to 0.45.1 (apache#7362)
e2c3d57 HDDS-11610. Bump maven-dependency-plugin to 3.8.1 (apache#7361)
24c1000 HDDS-11041. Add admin request filter for S3 requests and UGI support for GrpcOmTransport (apache#7268)
0b84998 HDDS-11594. Update batchPut buffer log for rocksdb. (apache#7356)
c013516 HDDS-11608. Client should not retry invalid protobuf request (apache#7354)
32a8c09 HDDS-11160. Improve Insights page UI (apache#7327)
782ad62 HDDS-11388. Fix unnecessary call to the DB for ContainerBalancer#getBalancerStatusInfo (apache#7224)
35b6a3a HDDS-11600. Intermittent failure in repro due to ordering differences in builddef.lst (apache#7342)
e7bf154 HDDS-11132. Revert client version bump done as part of HDDS-10983 (apache#7348)
ea5cbff HDDS-11602. Bump ozone-runner to 20241022-jdk17-1 (apache#7347)
3f98df5 HDDS-11580. Validate 'hdds.datanode.dir.du.reserved' property (apache#7328)
8568075 HDDS-11570. Fix HDDS Docs build failure with Hugo v0.135.0 (apache#7337)
721ae58 HDDS-11057. Enable reproducible builds (apache#6856)
86b7aae HDDS-11205. Implement a search feature for users to locate keys pending Deletion within the OM Deleted Keys Insights section (apache#6969)
f7b428d HDDS-11503. Add Robot test to verify Container Balancer for EC containers. (apache#7311)
85eb89b HDDS-11483. Make s3g object get and put operation buffer configurable (apache#7233)
515977a HDDS-11582. Bump body-parser to 1.20.3 (apache#7307)
9b66267 HDDS-11589. ReconSCMDBDefinition should be singleton. (apache#7323)
f784a84 HDDS-11578. Unify constants for RATIS_SNAPSHOT_DIR (apache#7310)
4670a5e HDDS-11498. Improve SCM deletion efficiency. (apache#7249)
3fb2cf0 HDDS-11108. Extract keywords for multipart upload tests (apache#7318)
4b24aa9 HDDS-11545. [UI] Add OM and SCM ID information (apache#7287)
860e269 HDDS-11538. Let coverage report link to java sources (apache#7280)
2139367 HDDS-11581. Remove duplicate ContainerStateMachine#RaftGroupId (apache#7312)
64e035d HDDS-11573. Remove lib/gson-2.10.1.jar (apache#7309)
ce07a3c HDDS-11456. Require successful dependency/licence checks for acceptance/compile/kubernetes (apache#7209)
c579d06 HDDS-11574. Ozone client leak in TestS3SDKV1 (apache#7308)
c044b79 HDDS-10390. MiniOzoneCluster to support S3 Gateway (apache#7281)
8eef589 HDDS-11557. Simplify DBColumnFamilyDefinition. (apache#7298)
4c77f6b HDDS-11562. Parameterize TestSCMNodeManager#testProcessLayoutVersion (apache#7300)
b51c4b3 HDDS-11572. Bump commons-io to 2.17.0 (apache#7305)
3a37870 HDDS-11571. Bump log4j2 to 2.24.1 (apache#7301)
494798c HDDS-11564. Mark TestBlockOutputStream#testWriteExactly... as flaky
fabf512 HDDS-11569. Bump restrict-imports-enforcer-rule from 2.5.0 to 2.6.0 (apache#7303)
1e62a0a HDDS-11568. Bump commons-codec to 1.17.1 (apache#7304)
e9f92a7 HDDS-11567. Bump common-custom-user-data-maven-extension to 2.0.1 (apache#7302)
cb44d5e HDDS-11555. SCMDBDefinition should be singleton. (apache#7296)
d473134 HDDS-11486. Reduce log level for NativeLibraryNotLoadedException in SnapshotDiffManager (apache#7290)
3348d91 HDDS-11564. Mark TestBlockOutputStream as flaky
e2f2aeb HDDS-11548. Add some logging to the StateMachine (apache#7291)
523c860 HDDS-11439. De-duplicate code for ReplicatedFileChecksumHelper and ECFileChecksumHelper (apache#7264)
05a409e HDDS-11519. Clean up unused lines in BlockOutputStream
ffe7198 HDDS-11544. Improve work with arrays (apache#7286)
5657604 HDDS-11556. Add a getTypeClass method to Codec. (apache#7295)
256aad9 HDDS-11546. Add regex operation to filter option of ldb scan command. (apache#7289)
7ef7de2 HDDS-11482. EC Checksum throws IllegalArgumentException because the buffer limit is negative (apache#7230)
77c17df HDDS-11551. Provide details about integration check failure (apache#7294)
911a583 HDDS-8188. Support max allowed length in response of ozone admin container list (apache#7181)
7f2e0e3 HDDS-11554. OMDBDefinition should be singleton. (apache#7292)
170761c HDDS-11547. Make MAVEN_OPTS optional (apache#7288)
4846e97 HDDS-11543. Track OzoneClient object leaks via LeakDetector framework. (apache#7285)
e00f7ae HDDS-11159. Improve Containers page UI (apache#7267)
cfda951 HDDS-11520. Fix Delete pending directories key mapping (apache#7269)
2e3de8a HDDS-11476. Implement lesser/greater operation for --filter option of ldb scan command (apache#7222)
06ccdb3 HDDS-11526. Fix hdds.datanode.metadata.rocksdb.cache.size default value mismatch (apache#7284)
b3afaec HDDS-11535. Incomplete SCM roles table header (apache#7278)
ed2a073 HDDS-11536. Bump macOS runner version to macos-13 (apache#7279)
1887f83 HDDS-11537. Bump frontend-maven-plugin to 1.15.1 (apache#7276)
1f1e618 HDDS-6776. Cleanup TestSCMSafeModeManager (apache#7272)
4bee3e9 HDDS-11534. Bump cyclonedx-maven-plugin to 2.8.2 (apache#7277)
789fb53 HDDS-11533. Bump maven-gpg-plugin to 3.2.7 (apache#7275)
eb26677 HDDS-11268. Add --table mode for OM/SCM Roles CLI (apache#7016)
28ea480 HDDS-11527. Avoid unnecessary duplicate build (apache#7270)
30da31f HDDS-3498. Shutdown datanode if address is already in use (apache#7256)
2401d27 HDDS-11046. Coverage decreased due to running tests with Java 17 (apache#7263)
78d8418 HDDS-11524. Bump snappy-java to 1.1.10.7 (apache#7202)
8747c0e HDDS-11518. Recon OmDB Insights show isKey=true for directories (apache#7260)
5d2bbc3 HDDS-11480. Refactor OM volume response tests (apache#7265)
31f9f2c HDDS-11517. Update version to 2.0.0-SNAPSHOT (apache#7258)
a0f0872 HDDS-11444. Make Datanode Command metrics consistent across all commands (apache#7191)
d3b63c6 HDDS-11492. Directory deletion get stuck having millions of directory (apache#7254)
f52f0af HDDS-11127. [hsync] Improve test coverage for XceiverClientRatis.java (apache#7225)
360fea5 HDDS-11494. Improve the duration option of freon ombg (apache#7246)
10d3b21 HDDS-11504. Update Ratis to 3.1.1. (apache#7257)
ce46297 HDDS-11162. Improve Disk Usage page UI (apache#7214)
c91f1c7 HDDS-11491. Avoid sharing clientId among deleting services (apache#7250)
b0943d5 HDDS-11501. Improve logging in XceiverServerRatis (apache#7252)
55925ab HDDS-11502. Class path contains multiple SLF4J providers (apache#7255)
d0ad836 HDDS-11472. Avoid recreating external access authorizer on OM state reload (apache#7238)
254db9e HDDS-11500. RootCARotationManager cancelling wrong task in notifyStatusChanged (apache#7251)
1e6e4b3 HDDS-11499. Remove redundant code from ECReconstructionCoordinator. (apache#7248)
adb2821 HDDS-11490. Bump rollup to 3.29.5 (apache#7232)
189a9fe HDDS-11484. Validate javadoc in CI (apache#7245)
64a29c6 HDDS-11497. Bump commons-configuration2 to 2.11.0 (apache#7242)
95cfadd HDDS-11496. Bump maven-install-plugin to 3.1.3 (apache#7244)
0a999cf HDDS-11493. Bump sqlite-jdbc to 3.46.1.3 (apache#7243)
a214a31 HDDS-11329. Update Ozone images to Rocky Linux-based runner (apache#7241)
56ddb85 HDDS-11371. Handle cases where OM does not have getServerDefaults() implemented. (apache#7130)
b5097c7 HDDS-11347. Add rocks_tools_native lib check in Ozone CLI checknative subcommand (apache#7101)
fb0bf77 HDDS-11489. Bump maven-site-plugin to 3.20.0 (apache#7226)
70e6e40 HDDS-11122. Fix javadoc warnings (apache#7234)
acf3fdc HDDS-11458. Selective checks: trigger checkstyle for properties file changes (apache#7196)
6b87207 HDDS-11469. Statistics of Pipeline and Container (apache#7217)
1b8468b HDDS-11411. Snapshot garbage collection should not run when the keys are moved from a deleted snapshot to the next snapshot in the chain (apache#7193)
1f86ce8 HDDS-10617. Unexpected number of files in ITestS3AContractGetFileStatusV1List (apache#7208)
73a3bcc HDDS-11467. Bump vite to 4.5.5 (apache#7212)
d45aa1d HDDS-11460. Bump express to 4.21.0 (apache#7197)
e2e30b8 HDDS-11354. Intermittent failure in TestOzoneManagerSnapshotAcl#testLookupKeyWithNotAllowedUserForPrefixAcl (apache#7205)
0fcb645 HDDS-11477. [doc] Add configuration description for datanode docs (apache#7223)
3598ee3 HDDS-11464. Removed unused constants from OzoneConsts. (apache#7207)
8c0b54e HDDS-11408. Snapshot rename table entries are propagated incorrectly on snapshot deletes (apache#7200)
719bdf9 HDDS-11396. NPE due to empty Handler#clusterId (apache#7145)
40c4001 HDDS-10479. Add ozone admin ratis local raftMetaConf (apache#7170)
45f9138 HDDS-11394. Fix pipeline close --all command (apache#7138)
2b196d1 HDDS-11468. Enabled DB sync button (apache#7216)
d3899d2 Clean up files created after TestKeyValueHandlerWithUnhealthyContainer#testMarkContainerUnhealthyInFailedVolume (apache#7219)
70b8dd5 HDDS-11157. Improve Datanodes page UI (apache#7168)
151709a HDDS-11446. Downgrade picocli to 4.7.5 due to regression (apache#7215)
7a26aff HDDS-11158. Improve Pipelines page UI (apache#7171)
c365aa0 HDDS-11181. Cleanup of unnecessary try-catch blocks (apache#7210)
88dd436 HDDS-11423. Implement equals operation for --filter option to ozone ldb scan (apache#7167)
e0060a8 HDDS-11196. Improve SCM WebUI Display (apache#6960)
22ddfb9 Revert "HDDS-11456. Require successful dependency/licence checks for acceptance/compile/kubernetes (apache#7192)"
9f5bf43 HDDS-11457. Internal error on S3 CompleteMultipartUpload if parts are not specified (apache#7195)
10c47a1 HDDS-11459. Bump develocity-maven-extension to 1.22.1 (apache#7201)
50f2563 HDDS-11419. Fix waitForCheckpointDirectoryExist log message (apache#7199)
a7d7e37 HDDS-11456. Require successful dependency/licence checks for acceptance/compile/kubernetes (apache#7192)
5feb9ea HDDS-11453. OmSnapshotPurge should be in a different ozone manager double buffer batch (apache#7188)
703c4d5 HDDS-10984. Tool to restore SCM certificates from RocksDB. (apache#6781)
d221065 HDDS-11440. Add a lastTransactionInfo field in SnapshotInfo to check for transactions in flight on the snapshot (apache#7179)
e573701 HDDS-11448. Improve documentation in ContainerStateMachine (apache#7183)
0e49f7a HDDS-11449. Remove unnecessary log from client console. (apache#7184)
cd251f2 HDDS-11438. Ensure DataInputBuffer is closed in OMPBHelper#convert (apache#7182)
4b47812 HDDS-11389. Incorrect number of deleted containers shown in Recon UI. (apache#7149)
0915f0b HDDS-10985. EC Reconstruction failed because the size of currentChunks was not equal to checksumBlockDataChunks. (apache#7009)
0f16195 HDDS-11416. refactor ratis submit request avoid code duplicate (apache#7166)
86fe920 HDDS-11376. Improve ReplicationSupervisor to record replication metrics (apache#7140)
883a63f HDDS-11441. ozone sh key put should only accept positive expectedGeneration (apache#7180)
33dbd4a HDDS-11357. Datanode Usageinfo Support Display Pipeline. (apache#7105)
9477aa6 HDDS-11436. Minor update in Recon API handling. (apache#7178)
8ca33c7 HDDS-11414. Key listing for FSO buckets fails with forward client (apache#7161)
f1ebd39 HDDS-11435. Bump sqlite-jdbc to 3.46.1.0 (apache#7174)
aaf8bd0 HDDS-11434. Bump log4j2 to 2.24.0 (apache#7176)
3510ce7 HDDS-11433. Bump Jetty to 9.4.56.v20240826 (apache#7175)
0047cd2 HDDS-11400. Bump maven-core to 3.9.9 (apache#7144)
274da83 HDDS-10488. Datanode OOM due to run out of mmap handler (apache#6690)
7a452ca HDDS-11391. Addendum to fix test failure.
7e1d9b0 HDDS-11145. ozone admin om cancelprepare --service-id improvement (apache#7159)
6888cf2 HDDS-11383. Improve read key dashboard to include add the read key related OM metrics. (apache#7131)
3e0d76c HDDS-11369. [hsync] Remove KeyOutputStreamSemaphore logs (apache#7136)
b23981c HDDS-11342. [hsync] Add a config as HBase-related features master switch (apache#7126)
3e1188a HDDS-11285. cli to trigger quota repair and status (apache#7104)
2e33978 HDDS-11401. Code cleanup in DatanodeStateMachine (apache#7146)
f563d67 HDDS-11391. Frequent Ozone DN Crashes During OM + DN Decommission with Freon (apache#7154)
18b28d2 HDDS-11312. [hsync] Added upgrade tests (apache#7110)
b29beb3 HDDS-11350. NullPointerException thrown on checking container balancer status (apache#7134)
111b9df HDDS-11407. Use OMLayoutFeature.HBASE_SUPPORT for HSYNC (apache#7152)
966b8d0 HDDS-11390. Removed hsync and hflush capability check in ContentGenerator (apache#7153)
877504a HDDS-11156. Improve Buckets page UI (apache#7100)
814f78f HDDS-11392. ChecksumByteBufferImpl's static initializer fails with java 17+ (apache#7135)
b5e1a8b HDDS-11398. Bump commons-compress to 1.27.1 (apache#7142)
a8e3ea9 HDDS-11397. Bump Jersey2 to 2.45 (apache#7141)
5992837 HDDS-11399. Bump maven-deploy-plugin to 3.1.3 (apache#7143)
47564bb HDDS-11359. Intermittent timeout in TestPipelineManagerMXBean#testPipelineInfo (apache#7132)
cc4e026 HDDS-11304. Make up for the missing functionality in CommandDispatcher (apache#7062)
2d372f6 HDDS-11339. Let PrometheusServlet rely on periodically published metrics (apache#7092)
f22c6f8 HDDS-11164. Improve Navbar UI (apache#7088)
23211c1 HDDS-11381. Adding logging for sortByDistanceCost in NetworkTopologyImpl (apache#7133)
3e9cdb6 HDDS-11378. Allow disabling OM version-specific feature via config (apache#7129)
23f3e5b HDDS-11152. OMDoubleBuffer error when handling snapshot's background operations (apache#7112)
5659b7e HDDS-11375. DN startup fails due to illegal configuration of raft.grpc.message.size.max (apache#7128)
41d8147 HDDS-11368. Remove dependency on Babel in Vite (apache#7119)
0bd8ba1 HDDS-11372. No coverage for org.apache.ozone packages (apache#7124)
3bd237d HDDS-11325. (addendum) Intermittent failure in TestBlockOutputStreamWithFailures#testContainerClose (apache#7121)
8306290 HDDS-11373. Log for EC reconstruction command lists the missing indexes as ASCII control characters (apache#7123)
dab1538 HDDS-11216. Replace HAUtils#buildCAX509List usages with other direct usages (apache#6981)
51a5fb9 Revert "HDDS-11235. Spare InfoBucket RPC call in FileSystem#mkdir() call. (apache#6990)" (apache#7122)
fab56b4 HDDS-11229. Chain optionals in Recon Insight (apache#7064)
2236041 HDDS-11365. Fix the NOTICE file (apache#7120)
2e30dc1 HDDS-11190. Add --fields option to ldb scan command (apache#6976)
0b75cb0 HDDS-11251. Deprecate definitions and remove listTrash and recoverTrash APIs (apache#7060)
be34303 HDDS-9198. Maintain local cache in OMSnapshotPurgeRequest to get updated snapshotInfo and pass the same to OMSnapshotPurgeResponse (apache#7045)
c07b408 HDDS-11208. Change RatisBlockOutputStream to use HDDS-11174. (apache#7072)
8f8d809 HDDS-11309. Increase CONTAINER_STATE Column Length in UNHEALTHY_CONTAINERS to Avoid Truncation (apache#7071)
350a340 HDDS-11364. Bump jgraphx to 3.9.12 (apache#7116)
45b7056 HDDS-11363. Bump develocity-maven-extension to 1.22 (apache#7115)
9dd18f1 HDDS-11362. Bump snappy-java to 1.1.10.6 (apache#7114)
637cb91 HDDS-11361. Bump Jersey2 to 2.44 (apache#7113)

Conflicts:
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/OzoneContainer.java
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/TestKeyValueHandlerWithUnhealthyContainer.java
hadoop-ozone/dist/src/main/smoketest/admincli/container.robot

Modified during conflict:
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/DNContainerOperationClient.java
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/ReconcileContainerTask.java
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestECKeyOutputStream.java
ptlrs pushed a commit to ptlrs/ozone that referenced this pull request Mar 8, 2025
…currentChunks was not equal to checksumBlockDataChunks. (apache#7009) (apache#188)

(cherry picked from commit 0915f0b)

Co-authored-by: slfan1989 <[email protected]>
vtutrinov pushed a commit to vtutrinov/ozone that referenced this pull request Jul 15, 2025