Conversation

@sodonnel (Contributor)

What changes were proposed in this pull request?

If an EC container is missing multiple indexes and there are not enough nodes to recover all of them, the process currently fails. It may still be possible to recover some of the indexes, which would reduce the chance of data loss.

Therefore, if we cannot select enough nodes to recover all indexes, we should try to select enough nodes to recover N - 1 indexes, then N - 2, and so on, as recovering some of them is better than recovering none.
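
A minimal sketch of the idea, using hypothetical names (this is not the handler's actual code): if only k target nodes can be found for m missing indexes, reconstruct k of them now and leave the rest to be retried later.

import java.util.List;

/**
 * Illustrative sketch only; class, method and parameter names are assumptions.
 */
public final class PartialRecoverySketch {

  private PartialRecoverySketch() {
  }

  /**
   * Recover as many missing indexes as there are usable target nodes,
   * instead of failing the whole reconstruction.
   */
  static List<Integer> indexesToRecover(
      List<Integer> missingIndexes, int availableTargetNodes) {
    int recoverable = Math.min(missingIndexes.size(), availableTargetNodes);
    // The remaining indexes stay under-replicated and can be retried once
    // more nodes become available.
    return missingIndexes.subList(0, recoverable);
  }
}

For example, with missing indexes [2, 4, 5] and only two usable targets, indexes 2 and 4 would be reconstructed now and index 5 retried on a later pass.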

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8333

How was this patch tested?

New unit tests were added, and some existing tests were modified to handle the change in behavior around throwing an exception on partial success.

The following review thread is attached to this code in the diff:

    final long dataSizeRequired =
        Math.max(container.getUsedBytes(), defaultContainerSize);

    int mutableRequiredNodes = requiredNodes;

Contributor:

Generally this works. But shouldn't we fix this kind of check, so we can avoid this loop?

    if (chosenNodes.size() < additionalRacksRequired) {
      String reason = "Chosen nodes size from Unique Racks: " + chosenNodes
              .size() + ", but required nodes to choose from Unique Racks: "
              + additionalRacksRequired + " do not match.";
      LOG.warn("Placement policy could not choose the enough nodes from " +
                      "available racks. {} Available racks count: {},"
                      + " Excluded nodes count: {}",
              reason, racks.size(), excludedNodesCount);
      throw new SCMException(reason,
              SCMException.ResultCodes.FAILED_TO_FIND_HEALTHY_NODES);
    }

Contributor Author:

I don't understand what you mean by this comment.

Contributor:

We could remove this if check and avoid the loop entirely by changing the definition of the chooseDatanode function in the placement policy to return up to the requested number of nodes that adhere to the placement policy.
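
A minimal sketch of what such a relaxed contract might look like (hypothetical interface and method name, not the existing PlacementPolicy API):

import java.util.List;

import org.apache.hadoop.hdds.protocol.DatanodeDetails;

/**
 * Hypothetical sketch of the suggestion above, not existing code: rather than
 * "all requested nodes or an exception", the policy would return up to
 * nodesRequired nodes that satisfy the placement rules, possibly fewer.
 */
interface BestEffortPlacement {
  List<DatanodeDetails> chooseUpToDatanodes(
      List<DatanodeDetails> excludedNodes, int nodesRequired,
      long dataSizeRequired);
}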

Contributor Author:

I'm not going to start with changing the placement policy now. It is called from too many places, and it would require changing all the placement policies, making it a rather large change. As it stands, the placement policy contract is that it returns all the requested nodes or throws an exception, and I am happy to keep it that way for now. This loop is reused in a few places in this utility function, and therefore it is nicely contained in one place now.
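
For illustration, a simplified sketch of the decrement-and-retry pattern being described (method names, the chooseDatanodes signature and import paths are assumptions; the actual helper in this PR may differ):

import java.util.List;

import org.apache.hadoop.hdds.protocol.DatanodeDetails;
import org.apache.hadoop.hdds.scm.PlacementPolicy;
import org.apache.hadoop.hdds.scm.exceptions.SCMException;

/**
 * Illustrative sketch only; names and the chooseDatanodes signature are
 * assumptions rather than the exact code added by this PR.
 */
public final class PartialPlacementSketch {

  private PartialPlacementSketch() {
  }

  /**
   * Keep the existing placement policy contract (all requested nodes or an
   * exception) and retry with one fewer required node each time the policy
   * cannot satisfy the request.
   */
  public static List<DatanodeDetails> chooseNodesAllowingPartial(
      PlacementPolicy policy, List<DatanodeDetails> excludedNodes,
      int requiredNodes, long dataSizeRequired) throws SCMException {
    int mutableRequiredNodes = requiredNodes;
    while (mutableRequiredNodes > 0) {
      try {
        // Assumed argument order: excluded nodes, favored nodes, number of
        // nodes required, metadata size required, data size required.
        return policy.chooseDatanodes(
            excludedNodes, null, mutableRequiredNodes, 0, dataSizeRequired);
      } catch (SCMException e) {
        // Not enough nodes for this many targets; try to place one fewer.
        mutableRequiredNodes--;
      }
    }
    throw new SCMException("No nodes available for reconstruction",
        SCMException.ResultCodes.FAILED_TO_FIND_HEALTHY_NODES);
  }
}

The design question in this thread is whether that loop belongs in the caller, as sketched, or inside a best-effort placement policy API.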

@umamaheswararao (Contributor), Apr 19, 2023:

> This loop is reused in a few places in this utility function, and therefore it is nicely contained in one place now.

If this is reused, then wouldn't the right place to fix it be the placement policy? We could probably have a separate API with this contract. Currently it is "throw an exception, catch it and try again", even though the node choice may end up producing the same result inside the policy.

Anyway, this is committed now, and it works functionally. Does it keep looping even when chosenNodes is 0, and then throw the exception out from the policy?

@adoroszlai added the scm label Apr 18, 2023
@sodonnel merged commit b74faef into apache:master Apr 18, 2023
errose28 added a commit to errose28/ozone that referenced this pull request Apr 20, 2023
* master: (440 commits)
  HDDS-8445. Move PlacementPolicy back to SCM (apache#4588)
  HDDS-8335. ReplicationManager: EC Mis and Under replication handlers should handle overloaded exceptions (apache#4593)
  HDDS-8355. Intermittent failure in TestOMRatisSnapshots#testInstallSnapshot (apache#4592)
  HDDS-8444. Increase timeout of CI build (apache#4586)
  HDDS-8446. Selective checks: handle change in ci.yaml (apache#4587)
  HDDS-8440. Ozone Manager crashed with ClassCastException when deleting FSO bucket. (apache#4582)
  HDDS-7309. Enable by default GRPC between S3G and OM (apache#3820)
  HDDS-8458. Mark TestBlockDeletion#testBlockDeletion as flaky
  HDDS-8385. Ozone can't process snapshot when service UID > 2097151 (apache#4580)
  HDDS-8424: Preserve legacy bucket getKeyInfo behavior (apache#4576)
  HDDS-8453. Mark TestDirectoryDeletingServiceWithFSO#testDirDeletedTableCleanUpForSnapshot as flaky
  HDDS-8137. [Snapshot] SnapDiff to use tombstone entries in SST files (apache#4376)
  HDDS-8270. Measure checkAccess latency for Ozone objects (apache#4467)
  HDDS-8109. Seperate Ratis and EC MisReplication Handling (apache#4577)
  HDDS-8429. Checkpoint is not closed properly in OMDBCheckpointServlet (apache#4575)
  HDDS-8253. Set ozone.metadata.dirs to temporary dir if not defined in S3 Gateway (apache#4455)
  HDDS-8400. Expose rocksdb last sequence number through metrics (apache#4557)
  HDDS-8333. ReplicationManager: Allow partial EC reconstruction if insufficient nodes available (apache#4579)
  HDDS-8147. Introduce latency metrics for S3 Gateway operations (apache#4383)
  HDDS-7908. Support OM Metadata operation Generator in `Ozone freon` (apache#4251)
  ...