@errose28
Contributor

What changes were proposed in this pull request?

Summary

In the placeholder reconciliation implementation, reconcile container commands sent from SCM to datanodes are handled in the KeyValueHandler. In the final implementation they should go through the ReplicationSupervisor, like replication and reconstruction commands, so that they are scheduled properly.

This change also makes sure the same ContainerChecksumTreeManager instance is available everywhere it is needed within the datanode, and closes its metrics properly instead of relying on the workaround added in HDDS-10373.
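
As an illustration of why a single shared instance with an explicit close matters, here is a minimal sketch. It does not use the real Ozone classes: `Registry`, `ChecksumTreeManager`, `Handler`, and `Checker` are hypothetical stand-ins, and the registry's rejection of duplicate names is an assumption made for the example.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; none of these types are the actual Ozone classes. */
final class SharedManagerSketch {

  /** Stand-in metrics registry that rejects duplicate source names (assumed behavior). */
  static final class Registry {
    private final Set<String> names = new HashSet<>();

    synchronized void register(String name) {
      if (!names.add(name)) {
        throw new IllegalStateException("Metrics source already registered: " + name);
      }
    }

    synchronized void unregister(String name) {
      names.remove(name);
    }

    synchronized boolean isRegistered(String name) {
      return names.contains(name);
    }
  }

  /** Stand-in manager: registers its metrics once, unregisters them on close. */
  static final class ChecksumTreeManager implements AutoCloseable {
    static final String METRICS_NAME = "ChecksumTreeMetrics";
    private final Registry registry;

    ChecksumTreeManager(Registry registry) {
      this.registry = registry;
      registry.register(METRICS_NAME);
    }

    @Override
    public void close() {
      registry.unregister(METRICS_NAME);
    }
  }

  /** Two components that receive the shared manager instead of creating their own. */
  static final class Handler {
    private final ChecksumTreeManager manager;

    Handler(ChecksumTreeManager manager) {
      this.manager = manager;
    }

    ChecksumTreeManager manager() {
      return manager;
    }
  }

  static final class Checker {
    private final ChecksumTreeManager manager;

    Checker(ChecksumTreeManager manager) {
      this.manager = manager;
    }

    ChecksumTreeManager manager() {
      return manager;
    }
  }

  public static void main(String[] args) {
    Registry registry = new Registry();
    // The owner creates the manager once and closes it once.
    try (ChecksumTreeManager manager = new ChecksumTreeManager(registry)) {
      Handler handler = new Handler(manager);
      Checker checker = new Checker(manager);
      System.out.println("same instance: " + (handler.manager() == checker.manager()));
    }
    System.out.println("registered after close: "
        + registry.isRegistered(ChecksumTreeManager.METRICS_NAME));
  }
}
```

In this sketch, constructing a second manager against the same registry without closing the first would throw, which is the kind of failure a missed unregistration can cause in tests that start several datanodes in one JVM.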

Specific Changes

  • Usage of ContainerChecksumTreeManager is unified to one instance for the whole datanode.
  • Merkle tree metrics object is properly closed.
  • Reconcile commands are deduplicated by container ID and peer list.
    • The ReplicationSupervisor should drop duplicate commands.
  • ReconcileContainerCommandHandler uses ReplicationSupervisor for queueing. It does not implement its own thread pool and queue.
    • With this change, it also uses ReplicationSupervisor for metrics.
      • This is similar to how ReconstructECContainersCommandHandler and ReplicateContainerCommandHandler are implemented (total time and invocation count are unused in that class).
      • It would be good to have more fine-grained metrics for each of these task types instead of the shared ReplicationSupervisorMetrics, but that would be a broader change outside the scope of this PR.
  • New test helper methods were added for creating KeyValueHandler and HddsDispatcher instances when only a subset of the functionality is required, and existing tests were updated to use them.
    • This greatly reduced test dependence on the KeyValueHandler constructor, allowing us to change it to get the ContainerChecksumTreeManager where we need it.
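
The deduplication described in the list above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `ReconcileTask` and `Supervisor` are hypothetical stand-ins, and treating the peer list as order-insensitive is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch; these are not the actual Ozone classes. */
final class ReconcileDedupSketch {

  /** A task whose identity is the container ID plus the set of peers. */
  static final class ReconcileTask {
    private final long containerId;
    private final Set<String> peers;

    ReconcileTask(long containerId, List<String> peers) {
      this.containerId = containerId;
      // Assumption: peer order does not matter, so store peers as a sorted set.
      this.peers = new TreeSet<>(peers);
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof ReconcileTask)) {
        return false;
      }
      ReconcileTask other = (ReconcileTask) o;
      return containerId == other.containerId && peers.equals(other.peers);
    }

    @Override
    public int hashCode() {
      return Objects.hash(containerId, peers);
    }
  }

  /** Stand-in supervisor that tracks in-flight tasks in a set. */
  static final class Supervisor {
    private final Set<ReconcileTask> inFlight = new LinkedHashSet<>();

    /** Returns true if the task was queued, false if an equal task is already in flight. */
    synchronized boolean addTask(ReconcileTask task) {
      return inFlight.add(task);
    }

    /** Marks a task as finished so an equal task can be queued again. */
    synchronized void complete(ReconcileTask task) {
      inFlight.remove(task);
    }

    synchronized int inFlightCount() {
      return inFlight.size();
    }
  }

  public static void main(String[] args) {
    Supervisor supervisor = new Supervisor();
    ReconcileTask first = new ReconcileTask(1L, Arrays.asList("dn1", "dn2"));
    ReconcileTask duplicate = new ReconcileTask(1L, Arrays.asList("dn2", "dn1"));
    System.out.println("first queued: " + supervisor.addTask(first));
    System.out.println("duplicate queued: " + supervisor.addTask(duplicate));
  }
}
```

Putting the identity in `equals` and `hashCode` keeps the supervisor generic: it only needs a set of in-flight tasks and does not have to know what makes two reconcile commands the same.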

What is the link to the Apache JIRA

HDDS-11254

How was this patch tested?

  • Metrics unregistration has been seen to fail existing tests if not done properly. No failures observed here.
  • TestReconcileCommandHandler tests for metrics were removed since those are now covered by TestReplicationSupervisor.
  • A unit test, TestReconcileContainerTask, was added to check that the task properly detects failure and implements equality.
  • Existing e2e acceptance tests for reconciliation pass.

errose28 added 12 commits August 1, 2024 18:20
Depends on PRs that are still open
…p-supervisor

Merge conflicts are resolved but the change does not yet build.

* HDDS-10239-container-reconciliation: (183 commits)
  HDDS-10376. Add a Datanode API to supply a merkle tree for a given container. (apache#6945)
  HDDS-11289. Bump docker-maven-plugin to 0.45.0 (apache#7024)
  HDDS-11287. Code cleanup in XceiverClientSpi (apache#7043)
  HDDS-11283. Refactor KeyValueStreamDataChannel to avoid spurious IDE build issues (apache#7040)
  HDDS-11257. Ozone write does not work when http proxy is set for the JVM. (apache#7036)
  HDDS-11249. Bump ozone-runner to 20240729-jdk17-1 (apache#7003)
  HDDS-10517. Recon - Add a UI component for showing DN decommissioning detailed status and info (apache#6724)
  HDDS-10926. Block deletion should update container merkle tree. (apache#6875)
  HDDS-11270. [hsync] Add DN layout version (HBASE_SUPPORT/version 8) upgrade test. (apache#7021)
  HDDS-11272. Statistics some node status information (apache#7025)
  HDDS-11278. Move code out of Hadoop util package (apache#7028)
  HDDS-11274. (addendum) Replace Hadoop annotations/configs with Ozone-specific ones
  HDDS-11274. Replace Hadoop annotations/configs with Ozone-specific ones (apache#7026)
  HDDS-11280. Add Synchronize in AbstractCommitWatcher.addAckDataLength (apache#7032)
  HDDS-11235. Spare InfoBucket RPC call in FileSystem#mkdir() call. (apache#6990)
  HDDS-11273. Bump commons-compress to 1.26.2 (apache#7023)
  HDDS-11225. Increase ipc.server.read.threadpool.size (apache#7007)
  HDDS-11224. Increase hdds.datanode.handler.count (apache#7011)
  HDDS-11259. [hsync] DataNode should verify HBASE_SUPPORT layout version for every PutBlock. (apache#7012)
  HDDS-11214. Added config to set rocksDB's max log file size and num of log files (apache#7014)
  ...

Conflicts:
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/OzoneContainer.java
@kerneltime
Contributor

This looks good. Once the CI passes on your fork, we can make it ready to merge.

@errose28 errose28 marked this pull request as ready for review August 20, 2024 22:23
@errose28
Contributor Author

Passing on my fork here

@kerneltime kerneltime merged commit 0baba47 into apache:HDDS-10239-container-reconciliation Aug 21, 2024
errose28 added a commit to errose28/ozone that referenced this pull request Aug 26, 2024
…an-on-error

* HDDS-10239-container-reconciliation: (428 commits)
  HDDS-11081. Use thread-local instance of FileSystem in Freon tests (apache#7091)
  HDDS-11333. Avoid hard-coded current version in upgrade/xcompat tests (apache#7089)
  Mark TestPipelineManagerMXBean#testPipelineInfo as flaky
  Mark TestAddRemoveOzoneManager#testForceBootstrap as flaky
  HDDS-11352. HDDS-11353. Mark TestOzoneManagerHAWithStoppedNodes as flaky
  HDDS-11354. Mark TestOzoneManagerSnapshotAcl#testLookupKeyWithNotAllowedUserForPrefixAcl as flaky
  HDDS-11355. Mark TestMultiBlockWritesWithDnFailures#testMultiBlockWritesWithIntermittentDnFailures as flaky
  HDDS-11227. Use server default key provider to encrypt/decrypt keys from multiple OMs. (apache#7081)
  HDDS-11316. Improve Create Key and Chunk IO Dashboards (apache#7075)
  HDDS-11239. Fix KeyOutputStream's exception handling when calling hsync concurrently (apache#7047)
  HDDS-11254. Reconcile commands should be handled by datanode ReplicationSupervisor (apache#7076)
  HDDS-11331. Fix Datanode unable to report for a long time (apache#7090)
  HDDS-11346. FS CLI gives incorrect recursive volume deletion prompt (apache#7102)
  HDDS-11349. Add NullPointer handling when volume/bucket tables are not initialized (apache#7103)
  HDDS-11209. Avoid insufficient EC pipelines in the container pipeline cache (apache#6974)
  HDDS-11284. refactor quota repair non-blocking while upgrade (apache#7035)
  HDDS-9790. Add tests for Overview page (apache#6983)
  HDDS-10904. [hsync] Enable PutBlock piggybacking and incremental chunk list by default (apache#7074)
  HDDS-11322. [hsync] Block ECKeyOutputStream from calling hsync and hflush (apache#7098)
  HDDS-11325. Intermittent failure in TestBlockOutputStreamWithFailures#testContainerClose (apache#7099)
  ...

Conflicts:
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/ContainerChecksumTreeManager.java
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainerCheck.java
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueHandler.java
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/OzoneContainer.java
errose28 added a commit to errose28/ozone that referenced this pull request Aug 28, 2024
…rrupt-files

* HDDS-10239-container-reconciliation: (61 commits)
  HDDS-11081. Use thread-local instance of FileSystem in Freon tests (apache#7091)
  HDDS-11333. Avoid hard-coded current version in upgrade/xcompat tests (apache#7089)
  Mark TestPipelineManagerMXBean#testPipelineInfo as flaky
  Mark TestAddRemoveOzoneManager#testForceBootstrap as flaky
  HDDS-11352. HDDS-11353. Mark TestOzoneManagerHAWithStoppedNodes as flaky
  HDDS-11354. Mark TestOzoneManagerSnapshotAcl#testLookupKeyWithNotAllowedUserForPrefixAcl as flaky
  HDDS-11355. Mark TestMultiBlockWritesWithDnFailures#testMultiBlockWritesWithIntermittentDnFailures as flaky
  HDDS-11227. Use server default key provider to encrypt/decrypt keys from multiple OMs. (apache#7081)
  HDDS-11316. Improve Create Key and Chunk IO Dashboards (apache#7075)
  HDDS-11239. Fix KeyOutputStream's exception handling when calling hsync concurrently (apache#7047)
  HDDS-11254. Reconcile commands should be handled by datanode ReplicationSupervisor (apache#7076)
  HDDS-11331. Fix Datanode unable to report for a long time (apache#7090)
  HDDS-11346. FS CLI gives incorrect recursive volume deletion prompt (apache#7102)
  HDDS-11349. Add NullPointer handling when volume/bucket tables are not initialized (apache#7103)
  HDDS-11209. Avoid insufficient EC pipelines in the container pipeline cache (apache#6974)
  HDDS-11284. refactor quota repair non-blocking while upgrade (apache#7035)
  HDDS-9790. Add tests for Overview page (apache#6983)
  HDDS-10904. [hsync] Enable PutBlock piggybacking and incremental chunk list by default (apache#7074)
  HDDS-11322. [hsync] Block ECKeyOutputStream from calling hsync and hflush (apache#7098)
  HDDS-11325. Intermittent failure in TestBlockOutputStreamWithFailures#testContainerClose (apache#7099)
  ...

Conflicts:
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/checksum/ContainerChecksumTreeManager.java