HDDS-8333. ReplicationManager: Allow partial EC reconstruction if insufficient nodes available #4579
Conversation
final long dataSizeRequired =
    Math.max(container.getUsedBytes(), defaultContainerSize);

int mutableRequiredNodes = requiredNodes;
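The retry loop under discussion is not visible in this excerpt. A minimal sketch of the pattern, assuming hypothetical class and helper names (and an approximate chooseDatanodes signature), might look like this:

import java.util.Collections;
import java.util.List;

import org.apache.hadoop.hdds.protocol.DatanodeDetails;
import org.apache.hadoop.hdds.scm.PlacementPolicy;
import org.apache.hadoop.hdds.scm.exceptions.SCMException;

final class BestEffortTargetSelector {
  // Hypothetical helper: request requiredNodes targets from the policy,
  // asking for one node fewer on each failure until the request succeeds
  // or no nodes can be found at all.
  static List<DatanodeDetails> select(PlacementPolicy policy,
      List<DatanodeDetails> excludedNodes, int requiredNodes,
      long dataSizeRequired) {
    int mutableRequiredNodes = requiredNodes;
    while (mutableRequiredNodes > 0) {
      try {
        // Policy contract: return all requested nodes or throw.
        return policy.chooseDatanodes(excludedNodes, null,
            mutableRequiredNodes, 0, dataSizeRequired);
      } catch (SCMException e) {
        // Not enough healthy nodes for this count; try one fewer, since
        // a partial EC reconstruction is better than total failure.
        mutableRequiredNodes--;
      }
    }
    return Collections.emptyList();
  }
}

In this sketch the loop is guaranteed to terminate, because the requested count decreases on every failed attempt.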
Generally this works. But shouldn't we fix checks like this one, so we can avoid this loop?
if (chosenNodes.size() < additionalRacksRequired) {
  String reason = "Chosen nodes size from Unique Racks: " + chosenNodes
      .size() + ", but required nodes to choose from Unique Racks: "
      + additionalRacksRequired + " do not match.";
  LOG.warn("Placement policy could not choose enough nodes from " +
      "available racks. {} Available racks count: {},"
      + " Excluded nodes count: {}",
      reason, racks.size(), excludedNodesCount);
  throw new SCMException(reason,
      SCMException.ResultCodes.FAILED_TO_FIND_HEALTHY_NODES);
}
I don't understand what you mean by this comment. Could you clarify?
We could remove this if-check and handle the shortfall inside the policy instead, avoiding the loop. That is, we could change the definition of the chooseDatanode function in the placement policy so it returns at most the requested number of nodes that adhere to the placement policy.
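A sketch of what that contract change might look like, as a hypothetical method that does not exist in the current PlacementPolicy interface:

import java.util.List;

import org.apache.hadoop.hdds.protocol.DatanodeDetails;

// Hypothetical extension, not part of the current PlacementPolicy API:
// return up to nodesRequired policy-conforming nodes instead of throwing
// when fewer than nodesRequired are available.
interface BestEffortPlacementPolicy {
  List<DatanodeDetails> chooseUpToDatanodes(
      List<DatanodeDetails> excludedNodes,
      List<DatanodeDetails> favoredNodes,
      int nodesRequired,
      long metadataSizeRequired,
      long dataSizeRequired);
}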
I'm not going to start changing the placement policy now. It is called from too many places, and the change would have to be made in all the placement policies, making it rather large. As it stands, the placement policy contract is that it returns all the requested nodes or throws an exception, and I am happy to keep it that way for now. This loop is reused in a few places in this utility function, so it is nicely contained in one place.
This loop is reused in a few places in this utility function, and therefore it is nicely contained in one place now.
If this is reused, then isn't the placement policy the right place to fix it? We could probably add a separate API with this contract. Currently the pattern is: throw an exception, catch it, and try again, even though choosing nodes ends up producing the same result inside the policy.
Anyway, this is committed now, and it works functionally. Does it keep looping even when chosenNodes is 0 and the policy keeps throwing the exception?
* master: (440 commits)
  HDDS-8445. Move PlacementPolicy back to SCM (apache#4588)
  HDDS-8335. ReplicationManager: EC Mis and Under replication handlers should handle overloaded exceptions (apache#4593)
  HDDS-8355. Intermittent failure in TestOMRatisSnapshots#testInstallSnapshot (apache#4592)
  HDDS-8444. Increase timeout of CI build (apache#4586)
  HDDS-8446. Selective checks: handle change in ci.yaml (apache#4587)
  HDDS-8440. Ozone Manager crashed with ClassCastException when deleting FSO bucket. (apache#4582)
  HDDS-7309. Enable by default GRPC between S3G and OM (apache#3820)
  HDDS-8458. Mark TestBlockDeletion#testBlockDeletion as flaky
  HDDS-8385. Ozone can't process snapshot when service UID > 2097151 (apache#4580)
  HDDS-8424: Preserve legacy bucket getKeyInfo behavior (apache#4576)
  HDDS-8453. Mark TestDirectoryDeletingServiceWithFSO#testDirDeletedTableCleanUpForSnapshot as flaky
  HDDS-8137. [Snapshot] SnapDiff to use tombstone entries in SST files (apache#4376)
  HDDS-8270. Measure checkAccess latency for Ozone objects (apache#4467)
  HDDS-8109. Seperate Ratis and EC MisReplication Handling (apache#4577)
  HDDS-8429. Checkpoint is not closed properly in OMDBCheckpointServlet (apache#4575)
  HDDS-8253. Set ozone.metadata.dirs to temporary dir if not defined in S3 Gateway (apache#4455)
  HDDS-8400. Expose rocksdb last sequence number through metrics (apache#4557)
  HDDS-8333. ReplicationManager: Allow partial EC reconstruction if insufficient nodes available (apache#4579)
  HDDS-8147. Introduce latency metrics for S3 Gateway operations (apache#4383)
  HDDS-7908. Support OM Metadata operation Generator in `Ozone freon` (apache#4251)
  ...
What changes were proposed in this pull request?
If an EC container is missing multiple indexes and there are not enough nodes available to recover all of them, the process currently fails outright. Recovering some of the indexes is often still possible and would reduce the chance of data loss.
Therefore, if we cannot select enough nodes to recover all N missing indexes, we should try to select enough nodes to recover N - 1 indexes, then N - 2, and so on, since recovering some of them is better than recovering none.
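As a concrete illustration of the fallback (hypothetical values and names, not the PR's actual code), the decision reduces to attempting as many missing indexes as there are usable target nodes:

import java.util.Arrays;
import java.util.List;

public class PartialReconstructionExample {
  public static void main(String[] args) {
    // Suppose three indexes of an EC container are missing, but the
    // placement policy can only supply two target nodes.
    List<Integer> missingIndexes = Arrays.asList(2, 5, 9);
    int availableTargets = 2;

    // Attempt only as many indexes as there are targets.
    int count = Math.min(missingIndexes.size(), availableTargets);
    List<Integer> toReconstruct = missingIndexes.subList(0, count);

    // Reconstructing [2, 5] still improves the container's durability,
    // whereas failing the whole operation would recover nothing.
    System.out.println("Reconstructing indexes: " + toReconstruct);
  }
}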
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-8333
How was this patch tested?
New unit tests were added, and some existing tests were modified to account for the changed behavior of throwing an exception on partial success.