Conversation

@sodonnel (Contributor)

What changes were proposed in this pull request?

If an EC container is missing multiple indexes and there are not enough nodes to recover all of them, the process currently fails. It may still be possible to recover some of the indexes, which would reduce the chance of data loss.

Therefore, if we cannot select enough nodes to recover all indexes, we should try to select enough nodes to recover N - 1 indexes, then N - 2, and so on, as recovering some of them is better than recovering none.
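
A minimal sketch of the idea, using hypothetical names (this is not the handler's actual code): if only k target nodes can be found for m missing indexes, reconstruct k of them now and leave the rest to be retried later.

import java.util.List;

/**
 * Illustrative sketch only; class, method and parameter names are assumptions.
 */
public final class PartialRecoverySketch {

  private PartialRecoverySketch() {
  }

  /**
   * Recover as many missing indexes as there are usable target nodes,
   * instead of failing the whole reconstruction.
   */
  static List<Integer> indexesToRecover(
      List<Integer> missingIndexes, int availableTargetNodes) {
    int recoverable = Math.min(missingIndexes.size(), availableTargetNodes);
    // The remaining indexes stay under-replicated and can be retried once
    // more nodes become available.
    return missingIndexes.subList(0, recoverable);
  }
}

For example, with missing indexes [2, 4, 5] and only two usable targets, indexes 2 and 4 would be reconstructed now and index 5 retried on a later pass.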

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8333

How was this patch tested?

New unit tests were added, and some existing tests were modified to handle the change in behavior around throwing an exception on partial success.

The following review thread is attached to this code in the diff:

    final long dataSizeRequired =
        Math.max(container.getUsedBytes(), defaultContainerSize);

    int mutableRequiredNodes = requiredNodes;

Contributor:

Generally this works. But shouldn't we fix this kind of check, so we can avoid this loop?

    if (chosenNodes.size() < additionalRacksRequired) {
      String reason = "Chosen nodes size from Unique Racks: " + chosenNodes
              .size() + ", but required nodes to choose from Unique Racks: "
              + additionalRacksRequired + " do not match.";
      LOG.warn("Placement policy could not choose the enough nodes from " +
                      "available racks. {} Available racks count: {},"
                      + " Excluded nodes count: {}",
              reason, racks.size(), excludedNodesCount);
      throw new SCMException(reason,
              SCMException.ResultCodes.FAILED_TO_FIND_HEALTHY_NODES);
    }

Contributor Author:

I don't understand what you mean by this comment.

Contributor:

We could remove this if check and avoid the loop entirely by changing the definition of the chooseDatanode function in the placement policy to return up to the requested number of nodes that adhere to the placement policy.
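
A minimal sketch of what such a relaxed contract might look like (hypothetical interface and method name, not the existing PlacementPolicy API):

import java.util.List;

import org.apache.hadoop.hdds.protocol.DatanodeDetails;

/**
 * Hypothetical sketch of the suggestion above, not existing code: rather than
 * "all requested nodes or an exception", the policy would return up to
 * nodesRequired nodes that satisfy the placement rules, possibly fewer.
 */
interface BestEffortPlacement {
  List<DatanodeDetails> chooseUpToDatanodes(
      List<DatanodeDetails> excludedNodes, int nodesRequired,
      long dataSizeRequired);
}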

Contributor Author:

I'm not going to start with changing the placement policy now. It is called from too many places, and it would require changing all the placement policies, making it a rather large change. As it stands, the placement policy contract is that it returns all the requested nodes or throws an exception, and I am happy to keep it that way for now. This loop is reused in a few places in this utility function, and therefore it is nicely contained in one place now.
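
For illustration, a simplified sketch of the decrement-and-retry pattern being described (method names, the chooseDatanodes signature and import paths are assumptions; the actual helper in this PR may differ):

import java.util.List;

import org.apache.hadoop.hdds.protocol.DatanodeDetails;
import org.apache.hadoop.hdds.scm.PlacementPolicy;
import org.apache.hadoop.hdds.scm.exceptions.SCMException;

/**
 * Illustrative sketch only; names and the chooseDatanodes signature are
 * assumptions rather than the exact code added by this PR.
 */
public final class PartialPlacementSketch {

  private PartialPlacementSketch() {
  }

  /**
   * Keep the existing placement policy contract (all requested nodes or an
   * exception) and retry with one fewer required node each time the policy
   * cannot satisfy the request.
   */
  public static List<DatanodeDetails> chooseNodesAllowingPartial(
      PlacementPolicy policy, List<DatanodeDetails> excludedNodes,
      int requiredNodes, long dataSizeRequired) throws SCMException {
    int mutableRequiredNodes = requiredNodes;
    while (mutableRequiredNodes > 0) {
      try {
        // Assumed argument order: excluded nodes, favored nodes, number of
        // nodes required, metadata size required, data size required.
        return policy.chooseDatanodes(
            excludedNodes, null, mutableRequiredNodes, 0, dataSizeRequired);
      } catch (SCMException e) {
        // Not enough nodes for this many targets; try to place one fewer.
        mutableRequiredNodes--;
      }
    }
    throw new SCMException("No nodes available for reconstruction",
        SCMException.ResultCodes.FAILED_TO_FIND_HEALTHY_NODES);
  }
}

The design question in this thread is whether that loop belongs in the caller, as sketched, or inside a best-effort placement policy API.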

@umamaheswararao (Contributor), Apr 19, 2023:

> This loop is reused in a few places in this utility function, and therefore it is nicely contained in one place now.

If this is reused, then wouldn't the right place to fix it be the placement policy? We could probably have a separate API with this contract. Currently it is "throw an exception, catch it and try again", even though the node choice may end up producing the same result inside the policy.

Anyway, this is committed now, and it works functionally. Does it keep looping even when chosenNodes is 0, and then throw the exception out from the policy?

@adoroszlai added the scm label Apr 18, 2023
@sodonnel merged commit b74faef into apache:master Apr 18, 2023
errose28 added a commit to errose28/ozone that referenced this pull request Apr 20, 2023
* master: (440 commits)
  HDDS-8445. Move PlacementPolicy back to SCM (apache#4588)
  HDDS-8335. ReplicationManager: EC Mis and Under replication handlers should handle overloaded exceptions (apache#4593)
  HDDS-8355. Intermittent failure in TestOMRatisSnapshots#testInstallSnapshot (apache#4592)
  HDDS-8444. Increase timeout of CI build (apache#4586)
  HDDS-8446. Selective checks: handle change in ci.yaml (apache#4587)
  HDDS-8440. Ozone Manager crashed with ClassCastException when deleting FSO bucket. (apache#4582)
  HDDS-7309. Enable by default GRPC between S3G and OM (apache#3820)
  HDDS-8458. Mark TestBlockDeletion#testBlockDeletion as flaky
  HDDS-8385. Ozone can't process snapshot when service UID > 2097151 (apache#4580)
  HDDS-8424: Preserve legacy bucket getKeyInfo behavior (apache#4576)
  HDDS-8453. Mark TestDirectoryDeletingServiceWithFSO#testDirDeletedTableCleanUpForSnapshot as flaky
  HDDS-8137. [Snapshot] SnapDiff to use tombstone entries in SST files (apache#4376)
  HDDS-8270. Measure checkAccess latency for Ozone objects (apache#4467)
  HDDS-8109. Seperate Ratis and EC MisReplication Handling (apache#4577)
  HDDS-8429. Checkpoint is not closed properly in OMDBCheckpointServlet (apache#4575)
  HDDS-8253. Set ozone.metadata.dirs to temporary dir if not defined in S3 Gateway (apache#4455)
  HDDS-8400. Expose rocksdb last sequence number through metrics (apache#4557)
  HDDS-8333. ReplicationManager: Allow partial EC reconstruction if insufficient nodes available (apache#4579)
  HDDS-8147. Introduce latency metrics for S3 Gateway operations (apache#4383)
  HDDS-7908. Support OM Metadata operation Generator in `Ozone freon` (apache#4251)
  ...