Conversation

@ChenSammi
@jojochuang jojochuang left a comment


So basically this makes the delete block operation asynchronous, moving it to another thread.
From a very quick look at the PR, it looks reasonable to me.
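A minimal sketch of the pattern being described (not the actual Ozone code; the class and method names here are illustrative): the heartbeat/dispatcher thread only enqueues the delete command, while a dedicated single-thread executor drains the queue and does the slow RocksDB work, so the heartbeat is never blocked.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncDeleteSketch {
  private final Queue<String> deleteCommandQueue = new ConcurrentLinkedQueue<>();
  private final ExecutorService executor = Executors.newSingleThreadExecutor();

  // Called from the heartbeat/dispatcher thread: O(1), non-blocking.
  public void handle(String deleteCmd) {
    deleteCommandQueue.offer(deleteCmd);
    executor.execute(this::drainQueue);
  }

  // Runs on the executor thread; does the slow persistence work.
  private void drainQueue() {
    String cmd;
    while ((cmd = deleteCommandQueue.poll()) != null) {
      processCmd(cmd);
    }
  }

  private void processCmd(String cmd) {
    // placeholder for the real work: open RocksDB, persist the transaction
  }

  public int pending() {
    return deleteCommandQueue.size();
  }

  public void shutdown() {
    executor.shutdown();
    try {
      executor.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    AsyncDeleteSketch handler = new AsyncDeleteSketch();
    handler.handle("txn-1");
    handler.handle("txn-2");
    handler.shutdown();
    System.out.println(handler.pending()); // 0: the worker drained the queue
  }
}
```

A single worker thread is enough here because the goal is to unblock the caller, not to parallelize the RocksDB writes themselves.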

Contributor:

This is hard to read. Can you move this Callable out?

Contributor Author:

Sure. I refactored the code.

ChenSammi commented Jul 16, 2021

[image: RocksDB cache metrics from one DN]

This is the RocksDB cache metrics data captured from one DN. It costs about 200ms to open a RocksDB instance to persist the block deletion transaction command. Based on the current default configuration, if the cache misses, one DN can handle at most 60s / 200ms = 300 RocksDB opens within 60s, which is the SCM block deleting command generation interval.
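The back-of-the-envelope bound in this comment can be written out directly (the 200ms open cost and 60s interval are the values quoted above, not re-measured here):

```java
public class ThroughputBound {
  // Upper bound on cache-miss RocksDB opens a DN can perform
  // within one SCM command-generation interval.
  public static long maxOpensPerInterval(long intervalMs, long openCostMs) {
    return intervalMs / openCostMs;
  }

  public static void main(String[] args) {
    // 60s interval, ~200ms per open
    System.out.println(maxOpensPerInterval(60_000, 200)); // prints 300
  }
}
```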

@jojochuang

This is yet another reason why we should reconsider the 1-db-per-container design :)

ChenSammi commented Jul 20, 2021

@jojochuang , do you mean 1-db-per-volume design? We think the same way.


@jojochuang jojochuang left a comment


just one comment. The rest looks good to me.

Contributor:

Oh one more thing. We should do
Thread.currentThread().interrupt();
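This is the standard idiom for handling InterruptedException in worker code: catching the exception clears the thread's interrupt flag, so the catch block should call Thread.currentThread().interrupt() to restore it, letting callers further up the stack still observe the interruption. A small self-contained illustration (the sleepQuietly helper is hypothetical, not from the PR):

```java
import java.util.concurrent.TimeUnit;

public class InterruptIdiom {
  // Returns true if the sleep completed, false if it was interrupted.
  public static boolean sleepQuietly(long millis) {
    try {
      TimeUnit.MILLISECONDS.sleep(millis);
      return true;
    } catch (InterruptedException e) {
      // Re-set the flag instead of swallowing the interruption.
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    Thread.currentThread().interrupt();   // arrange: a pending interrupt
    boolean completed = sleepQuietly(10); // sleep() throws immediately
    System.out.println(completed);            // false
    System.out.println(Thread.interrupted()); // true: the flag was restored
  }
}
```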

@ChenSammi

Hi @adoroszlai, do we need to do something to free disk space when "Warning: ForkStarter IOException: java.io.IOException: No space left on device" is found?


@jojochuang jojochuang left a comment


+1 from me. Waiting for Attila's review.

while (!deleteCommandQueues.isEmpty()) {
  DeleteCmdInfo cmd = deleteCommandQueues.poll();
  try {
    processCmd(cmd.getCmd(), cmd.getContainer(), cmd.getContext(),
Contributor:

Thanks for the work! NIT: can we use DeleteCmdInfo as the only parameter of processCmd?
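A hypothetical sketch of the refactor being suggested, with simplified stand-ins for the real Ozone types: instead of passing the command, container, and context as separate arguments, group them into the DeleteCmdInfo holder and pass only that.

```java
public class DeleteCmdRefactor {
  // Simplified stand-ins for the real Ozone types.
  static class DeleteBlocksCommand { }
  static class OzoneContainer { }
  static class StateContext { }

  // Parameter-object holder, mirroring the getters used in the diff above.
  static class DeleteCmdInfo {
    private final DeleteBlocksCommand cmd;
    private final OzoneContainer container;
    private final StateContext context;

    DeleteCmdInfo(DeleteBlocksCommand cmd, OzoneContainer container,
        StateContext context) {
      this.cmd = cmd;
      this.container = container;
      this.context = context;
    }

    DeleteBlocksCommand getCmd() { return cmd; }
    OzoneContainer getContainer() { return container; }
    StateContext getContext() { return context; }
  }

  // After the refactor: one parameter instead of several.
  static String processCmd(DeleteCmdInfo info) {
    return "processed " + (info.getCmd() != null ? "cmd" : "nothing");
  }
}
```

The parameter object keeps the processCmd signature stable if more per-command state is added later.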

Contributor Author:

Sure.

@ChenSammi

Thanks @jojochuang and @JacksonYao287 for the code review. Also thanks @adoroszlai for taking care of the "no space left" issue.

@ChenSammi ChenSammi merged commit f9ec7e9 into apache:master Aug 10, 2021

cluster.shutdownHddsDatanode(keyLocation.getPipeline().getFirstNode());

waitForReplicaCount(containerID, 2, cluster);

@adoroszlai adoroszlai Aug 10, 2021


Why was this removed? Without this, the test may give false negative: replica count may be 3 both initially and after re-replication, and we need to distinguish between the two.

Contributor Author:

Because the delete thread pool shutdown takes time; by the time it finishes, the replication manager has already re-replicated the container. So the wait for 2 replicas randomly fails because there are already 3 replicas.

@ChenSammi

> TestBlockDeletion timed out twice on master since this commit.
>
> * https://github.com/elek/ozone-build-results/blob/master/2021/08/10/9545/it-ozone/hadoop-ozone/integration-test/org.apache.hadoop.ozone.container.common.statemachine.commandhandler.TestBlockDeletion.txt
>
> * https://github.com/elek/ozone-build-results/blob/master/2021/08/10/9555/it-ozone/hadoop-ozone/integration-test/org.apache.hadoop.ozone.container.common.statemachine.commandhandler.TestBlockDeletion.txt
>
> I think this should be fixed.

Sure. I will take a look at the timeout issue.

@adoroszlai

> Sure. I will take a look at the timeout issue.

Thanks. Filed HDDS-5605 for it. HDDS-5606 is also related, but it started happening earlier (a few weeks ago).

errose28 added a commit to errose28/ozone that referenced this pull request Aug 12, 2021
* master:
  HDDS-5358. Incorrect cache entry invalidation causes intermittent failure in testGetS3SecretAndRevokeS3Secret (apache#2518)
  HDDS-5608. Fix wrong command in ugrade doc (apache#2524)
  HDDS-5000. Run CI checks selectively (apache#2479)
  HDDS-4929. Select target datanodes and containers to move for Container Balancer (apache#2441)
  HDDS-5283. getStorageSize cast to int can cause issue (apache#2303)
  HDDS-5449 Recon namespace summary 'du' information should return replicated size of a key (apache#2489)
  HDDS-5558. vUnit invocation unit() may produce NPE (apache#2513)
  HDDS-5531. For Link Buckets avoid showing metadata. (apache#2502)
  HDDS-5549. Add 1.1 to supported versions in security policy (apache#2519)
  HDDS-5555. remove pipeline manager v1 code (apache#2511)
  HDDS-5546.OM Service ID change causes OM startup failure. (apache#2512)
  HDDS-5360. DN failed to process all delete block commands in one heartbeat interval (apache#2420)
  HDDS-5021. dev-support Dockerfile is badly outdated (apache#2480)
errose28 added a commit to errose28/ozone that referenced this pull request Oct 5, 2021
…one heartbeat interval (apache#2420)"

This reverts commit f9ec7e9.

Conflicts:
	hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/DatanodeConfiguration.java
	hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/DeleteBlocksCommandHandler.java
