Skip to content

Conversation

@lokeshj1703
Copy link
Contributor

@lokeshj1703 lokeshj1703 commented Jan 20, 2021

What changes were proposed in this pull request?

If SCM cleans up all the DeletedBlocksTransaction in deletedBlocksTable, then restart, the txID in SCMMetadataStoreImpl will drop to 0.

in SCMMetadataStoreImpl.java

public SCMMetadataStoreImpl(OzoneConfiguration config)
    throws IOException {
  this.configuration = config;
  start(this.configuration);
  this.txID = new AtomicLong(this.getLargestRecordedTXID());
}

private Long getLargestRecordedTXID() throws IOException {
  try (TableIterator<Long, ? extends KeyValue<Long, DeletedBlocksTransaction>>
      txIter = deletedBlocksTable.iterator()) {
    txIter.seekToLast();
    Long txid = txIter.key();
    if (txid != null) {
      return txid;
    }
  }
  return 0L;
}

in DeletedBlockLogImpl.java

public void commitTransactions(
    List<DeleteBlockTransactionResult> transactionResults, UUID dnID) {
  ...
  scmMetadataStore.getDeletedBlocksTXTable().delete(txID);
  ...
}

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-4477

How was this patch tested?

Adds UT

@lokeshj1703
Copy link
Contributor Author

@GlenGeng Can you please take a look?

@lokeshj1703 lokeshj1703 self-assigned this Jan 20, 2021
@amaliujia
Copy link
Contributor

amaliujia commented Jan 20, 2021

I think currently we are planning to use a distributed sequence ID generator (design doc: https://docs.google.com/document/d/13qSwLuDBw7mytJtF9h2oauuLocFfr-XJSBdTfc1YvxM/edit?usp=sharing)

The idea is we are migrating SCM to Ratis based in HDDS-2823 (so for both HA and non-HA it uses the same code). Thus we implement a ID generator that based on term, which guarantees to be monotonic increasing.

Because of it, I suggest we postpone the fix in master branch for this issue unless it is causing serious production issue.

@GlenGeng-awx
Copy link
Contributor

GlenGeng-awx commented Jan 21, 2021

cc @runzhiwang, he is currently working on integrating DeleteBlockLog with ratis.

This solution is fine to me. Thanks for this smart solution!

In SCM HA, we decided to use the distributed sequence id <term, counter> as the txnId, which can easily maintain monotonically increasing property. Meanwhile in SCM running without ratis, we do not have term, thus need to leverage this solution to provide a monotonically increasing txnId.

As mentioned by @bshashikant, we might need support switch from HA (using Ratis) to non-HA (not using Ratis), it will be hard to maintain the monotonically increasing property between the two kind of txnIds.

For the txnId, since SCM HA will keep the DeletedBlocksTXTable to be the same across SCMs, I consider we don't need the distributed sequence id solution any more, this solution will be sufficient.

What do you think @runzhiwang @bshashikant @amaliujia @nandakumar131 ?

Attention that we might meet same problem for containerID and localID generation if we need to support switch ratis SCM to non-ratis SCM.

lock.unlock();
}
Map<Long, List<Long>> map = Collections.singletonMap(containerID, blocks);
addTransactions(map);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about remove this addTransaction(), since it is only used in unit test ?

Copy link
Contributor

@runzhiwang runzhiwang Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GlenGeng I have done this in my DeleteBlock pr.

@GlenGeng-awx
Copy link
Contributor

LGTM.

@amaliujia
Copy link
Contributor

Yes. I think if we eventually decide to not use Ratis for SCM non-HA, the fix in this PR will still be needed. If we decide to use Ratis for SCM non-HA, the solution for this case might will complicate HDDS-2823 a bit.

I would also like to hear other people's opinion for sure.

scmMetadataStore.getDeletedBlocksTXTable()
.putWithBatch(batch, largestTxnIdHolderKey,
DeletedBlocksTransaction.newBuilder().setTxID(getCurrentTXID())
.setContainerID(1).setCount(1).build());
Copy link
Contributor

@runzhiwang runzhiwang Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lokeshj1703 maybe we need some comments. Besides can we define a new method to do the following. And because the containerID and count are dummy, we can define a const such as DUMMY_CONTAINERID = 1.

      scmMetadataStore.getDeletedBlocksTXTable()
          .putWithBatch(batch, largestTxnIdHolderKey,
              DeletedBlocksTransaction.newBuilder().setTxID(getCurrentTXID())
                  .setContainerID(1).setCount(1).build());

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in latest commit.

@lokeshj1703
Copy link
Contributor Author

@GlenGeng @amaliujia @runzhiwang Thanks for reviewing the PR!

For the txnId, since SCM HA will keep the DeletedBlocksTXTable to be the same across SCMs, I consider we don't need the distributed sequence id solution any more, this solution will be sufficient.

I agree with @GlenGeng . Since delete blocks table would be consistent across all SCM instances, we might not need DistributedSequenceIdGenerator for delete transactionId.

@bshashikant
Copy link
Contributor

@GlenGeng @amaliujia @runzhiwang Thanks for reviewing the PR!

For the txnId, since SCM HA will keep the DeletedBlocksTXTable to be the same across SCMs, I consider we don't need the distributed sequence id solution any more, this solution will be sufficient.

I agree with @GlenGeng . Since delete blocks table would be consistent across all SCM instances, we might not need DistributedSequenceIdGenerator for delete transactionId.

I agree with @lokeshj1703 and @GlenGeng here. This seems to be better approach to handle delete transactionId for both ratis and non-ratis case. I would prefer to get this patch committed.

@amaliujia
Copy link
Contributor

amaliujia commented Jan 21, 2021

Sure I agree for for TxnID we can follow the approach in this PR.

To make such approach be safe for SCM HA on Ratis, we will need follower installSnapshot implementation. That is on my TODO list in the near future.

Copy link
Contributor

@runzhiwang runzhiwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Need to fix checkstyle

@lokeshj1703
Copy link
Contributor Author

@bshashikant Thanks for bringing up upgrade compatibility in internal discussion!

I have added a commit to address checkstyle issues and an upgrade fix. For upgrade, the old table would not have largestTxnIdHolderKey and therefore we would need to fetch the largest transaction from the table if it exists.

@bshashikant bshashikant merged commit b9d0f55 into apache:master Jan 25, 2021
@bshashikant
Copy link
Contributor

Thanks @lokeshj1703 for the contribution.

errose28 added a commit to errose28/ozone that referenced this pull request Feb 1, 2021
* master: (176 commits)
  HDDS-4760. Intermittent failure in ozone-ha acceptance test (apache#1853)
  HDDS-4770. Upgrade Ratis Thirdparty to 0.6.0 (apache#1868)
  HDDS-4765. Update close-pending workflow for new repo (apache#1856)
  HDDS-4737. Add ModifierOrder to checkstyle rules (apache#1839)
  HDDS-4704. Add permission check in OMDBCheckpointServlet (apache#1801)
  HDDS-4757. Unnecessary WARNING to set OZONE_CONF_DIR (apache#1849)
  HDDS-4751. TestOzoneFileSystem#testTrash failed when enabledFileSystemPaths and omRatisDisabled (apache#1851)
  HDDS-4736. Intermittent failure in testExpiredCertificate (apache#1838)
  HDDS-4758. Adjust classpath of ozone version to include log4j (apache#1850)
  HDDS-4518. Add metrics around Trash Operations. (apache#1832)
  HDDS-4708. Optimization: update RetryCount less frequently (update once per ~100) (apache#1805)
  HDDS-4748. sonarqube issue fix - "static" members should be accessed statically (apache#1748)
  HDDS-2402. Adapt hadolint check to improved CI framework (apache#1778)
  HDDS-4698. Upgrade Java for Sonar check (apache#1800)
  HDDS-4739. Upgrade Ratis to 1.1.0-eb66796d-SNAPSHOT (apache#1842)
  HDDS-4735. Fix typo in hdds.proto (apache#1837)
  HDDS-4430. OM failover timeout is too short (apache#1807)
  HDDS-4477. Delete txnId in SCMMetadataStoreImpl may drop to 0 after SCM restart. (apache#1828)
  HDDS-4688. Update Hadoop version to 3.2.2 (apache#1795)
  HDDS-4725. Change metrics unit from nanosecond to millisecond (apache#1823)
  ...
jojochuang pushed a commit to jojochuang/ozone that referenced this pull request Mar 11, 2021
…CM restart. (apache#1828)

(cherry picked from commit b9d0f55)
Change-Id: I98b2cbad99aa4007ad4f3892a59bde7856ba64c2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants