@kaijchen
Member

What changes were proposed in this pull request?

Force close non-RATIS containers in ReplicationManager.

} else if (closeCommand.getForce()) {
  // Non-RATIS containers should have the force close flag set, so they
  // are moved to CLOSED immediately rather than going to quasi-closed.
  controller.closeContainer(containerId);
} else {
  controller.quasiCloseContainer(containerId);
  LOG.info("Marking Container {} quasi closed", containerId);
}
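
The two branches shown above amount to a small decision rule: a close command carrying the force flag moves the replica straight to CLOSED, while one without it only quasi-closes the replica. The following is a minimal, self-contained sketch of that rule. `CloseAction` and `CloseDecisionSketch` are hypothetical names used for illustration, not Ozone classes, and the branch that precedes the `else if` in the real handler is not shown in the excerpt and is not modeled here.

```java
// Hypothetical stand-ins for illustration only; these are not Ozone APIs.
enum CloseAction { CLOSE, QUASI_CLOSE }

final class CloseDecisionSketch {
  private CloseDecisionSketch() { }

  /**
   * Mirrors the two branches in the excerpt: a forced close goes straight
   * to CLOSED, anything else is only quasi-closed.
   */
  static CloseAction decide(boolean force) {
    return force ? CloseAction.CLOSE : CloseAction.QUASI_CLOSE;
  }

  public static void main(String[] args) {
    System.out.println("force=true  -> " + decide(true));
    System.out.println("force=false -> " + decide(false));
  }
}
```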

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-7396

How was this patch tested?

TestClosingContainerHandler#testOpenOrClosingReplicasAreClosed

// Add CLOSING container replicas with index [1, closing]
for (int i = 1; i <= closing; i++) {
  containerReplicas.add(ReplicationTestUtil.createContainerReplica(
      containerInfo.containerID(), i,
Contributor

Minor point, but just in case it causes problems in the future, we should fix it.

Ratis replicas should always have index = 0. EC replicas should always have indexes >= 1.

Based on the force flag, you could set the index to force == true ? i : 0 in the two loops.

Member Author

Thanks, I missed that.

Member Author

I have set it to ReplicationType == EC ? i : 0, in case STAND_ALONE or CHAINED is passed.
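
As a rough illustration of the index rule discussed in this thread (index 0 for Ratis replicas, indexes >= 1 for EC replicas), here is a minimal sketch. `ReplicationTypeSketch` and `ReplicaIndexSketch` are hypothetical stand-ins rather than the Ozone classes or the test utility touched by the patch; only the ternary itself comes from the comment above.

```java
// Hypothetical stand-ins for illustration only; not the Ozone test utilities.
enum ReplicationTypeSketch { RATIS, STAND_ALONE, CHAINED, EC }

final class ReplicaIndexSketch {
  private ReplicaIndexSketch() { }

  /**
   * Returns the replica index to use for the i-th replica in a test loop:
   * EC replicas keep the loop counter (expected to be >= 1), every other
   * replication type gets index 0.
   */
  static int replicaIndex(ReplicationTypeSketch type, int i) {
    return type == ReplicationTypeSketch.EC ? i : 0;
  }

  public static void main(String[] args) {
    for (ReplicationTypeSketch type : ReplicationTypeSketch.values()) {
      System.out.println(type + " -> " + replicaIndex(type, 3));
    }
  }
}
```

Keying on the replication type rather than on the force flag means STAND_ALONE and CHAINED also fall into the index-0 branch.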

Contributor

@sodonnel sodonnel left a comment

Change looks good. I just have one minor comment on the index used in the test. If we fix it we should be good to commit.

-  public void testOpenOrClosingReplicasAreClosed() {
+  @ParameterizedTest
+  @MethodSource("replicationConfigs")
+  public void testOpenOrClosingReplicasAreClosed(ReplicationConfig repConfig) {
Contributor

Parameterizing this is a good idea. If I understand correctly, this test will now run for both EC and RATIS replication configs. If so, we can delete the next test testOpenOrClosingRatisReplicasAreClosed().

Member Author

Yeah, it's covered by the parameterized test.
Thanks for pointing this out.
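
For readers unfamiliar with the pattern: in JUnit 5, `@MethodSource("replicationConfigs")` names a static factory method whose elements are fed to the test one at a time, so a single test body runs once per config. The sketch below emulates that flow in plain Java without the JUnit dependency. The config labels are assumptions chosen for illustration, since the real `replicationConfigs` method is not shown in this thread, and `MethodSourceSketch` is a hypothetical name.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class MethodSourceSketch {
  private MethodSourceSketch() { }

  // Plays the role of the factory named by @MethodSource("replicationConfigs").
  // The two labels are assumed values, not taken from the patch.
  static Stream<String> replicationConfigs() {
    return Stream.of("RATIS/THREE", "EC/rs-3-2-1024k");
  }

  // Stands in for the parameterized test body; here it only builds a label.
  static String runCase(String repConfig) {
    return "testOpenOrClosingReplicasAreClosed[" + repConfig + "]";
  }

  // Emulates the test engine: one invocation per supplied config.
  static List<String> runAll() {
    return replicationConfigs()
        .map(MethodSourceSketch::runCase)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    runAll().forEach(System.out::println);
  }
}
```

Because the RATIS case is one of the supplied configs, a separate RATIS-only test would repeat coverage that the parameterized test already provides.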

Contributor

@siddhantsangwan siddhantsangwan left a comment

@kaijchen Thanks for working on this. Looks good! Pending CI.

Contributor

@sodonnel sodonnel left a comment

LGTM when the CI is green.

Contributor

@JacksonYao287 JacksonYao287 left a comment

thanks @kaijchen for the work, and @sodonnel @siddhantsangwan for the review! LGTM

@JacksonYao287 JacksonYao287 merged commit 462f32d into apache:master Oct 26, 2022
@kaijchen
Member Author

Thanks @sodonnel @siddhantsangwan @JacksonYao287 for the review.

@kaijchen kaijchen deleted the HDDS-7396 branch October 26, 2022 02:20
errose28 added a commit to errose28/ozone that referenced this pull request Oct 26, 2022
* master: (718 commits)
  HDDS-7342. Move encryption-related code from MultipartCryptoKeyInputStream to OzoneCryptoInputStream (apache#3852)
  HDDS-7413. Fix logging while marking container state unhealthy (apache#3887)
  Revert "HDDS-7253. Fix exception when '/' in key name (apache#3774)"
  HDDS-7396. Force close non-RATIS containers in ReplicationManager (apache#3877)
  HDDS-7121. Support namespace summaries (du, dist & counts) for legacy FS buckets (apache#3746)
  HDDS-7258. Cleanup the allocated but uncommitted blocks (apache#3778)
  HDDS-7381. Cleanup of VolumeManagerImpl (apache#3873)
  HDDS-7253. Fix exception when '/' in key name (apache#3774)
  HDDS-7182. Add property to control RocksDB max open files (apache#3843)
  HDDS-7284. JVM crash for rocksdb for read/write after close (apache#3801)
  HDDS-7368. [Multi-Tenant] Add Volume Existence check in preExecute for OMTenantCreateRequest (apache#3869)
  HDDS-7403. README Security Improvement (apache#3879)
  HDDS-7199. Implement new mix workload Read/Write Freon command (apache#3872)
  HDDS-7248. Recon: Expand the container status page to show all unhealthy container states (apache#3837)
  HDDS-7141. Recon: Improve Disk Usage Page (apache#3789)
  HDDS-7369. Fix wrong order of command arguments in Nonrolling-Upgrade.md (apache#3866)
  HDDS-6210. EC: Add EC metrics (apache#3851)
  HDDS-7355. non-primordial scm fail to get signed cert from primordial SCM when converting an unsecure cluster to secure (apache#3859)
  HDDS-7356. Update SCM-HA.zh.md to match the English version (apache#3861)
  HDDS-6930. SCM,OM,RECON should not print ERROR and exit with code 1 on successful shutdown (apache#3848)
  ...

Conflicts:
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/LegacyReplicationManager.java
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestLegacyReplicationManager.java