@amaliujia
Contributor

What changes were proposed in this pull request?

Details in the JIRA

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-4708

How was this patch tested?

Unit tests (UT).

@bshashikant
Contributor

Probably, we should think about removing persisting the retry count in db altogether here.
cc ~ @lokeshj1703

@amaliujia
Contributor Author

Writing the retry count into the DB will still be useful, at least when the retry count exceeds maxRetry. When some blocks cannot be deleted for some reason, there will be a record in the DB that people can use to analyze the cause.

Contributor

@lokeshj1703 left a comment

@amaliujia Thanks for working on this! The changes look good to me. I have a few comments inline. The added test seems to be failing. Can you please take a look?

Contributor

(nextCount % 100) - (currentCount % 100): this would always be 0, since currentCount would now be equal to nextCount. I think we could also use something like nextCount % 100 == 99?

Contributor Author

Oh, I see: at line 154 I wrote int nextCount = currentCount++;.

Indeed, that assigns nextCount the value currentCount had before the increment (the value is used first, then incremented).

I updated it to int nextCount = currentCount + 1; and now (nextCount % 100) - (currentCount % 100) is supposed to work.
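
To make the difference between the two forms concrete, here is a minimal standalone sketch. The class and method names are hypothetical and are not taken from the PR; only the two assignment statements come from the comment above.

```java
final class PostIncrementDemo {
  // Mirrors `int nextCount = currentCount++;` and returns {nextCount, currentCount}.
  static int[] withPostIncrement(int currentCount) {
    // Post-increment: nextCount receives the old value, then currentCount is bumped.
    int nextCount = currentCount++;
    return new int[] {nextCount, currentCount};
  }

  // Mirrors `int nextCount = currentCount + 1;` and returns {nextCount, currentCount}.
  static int[] withPlusOne(int currentCount) {
    // currentCount is left untouched; nextCount is one larger.
    int nextCount = currentCount + 1;
    return new int[] {nextCount, currentCount};
  }

  public static void main(String[] args) {
    int[] a = withPostIncrement(5);
    System.out.println(a[0] + " " + a[1]); // prints "5 6"
    int[] b = withPlusOne(5);
    System.out.println(b[0] + " " + b[1]); // prints "6 5"
  }
}
```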

Contributor Author

Oh, I see what you were suggesting:

It needs to be at least (nextCount / 100) - (currentCount / 100) :)
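
A small sketch of why the division form detects a boundary while the modulo form does not. It assumes nextCount = currentCount + 1 and non-negative counts; the class and method names are hypothetical.

```java
final class BoundaryCheckDemo {
  // Difference of remainders: 1 for most steps, -99 when stepping onto a multiple of 100.
  static int moduloDiff(int currentCount) {
    int nextCount = currentCount + 1;
    return (nextCount % 100) - (currentCount % 100);
  }

  // Difference of quotients: 0 for most steps, 1 exactly when a multiple of 100 is reached.
  static int divisionDiff(int currentCount) {
    int nextCount = currentCount + 1;
    return (nextCount / 100) - (currentCount / 100);
  }

  public static void main(String[] args) {
    System.out.println(moduloDiff(5));     // prints 1
    System.out.println(moduloDiff(99));    // prints -99
    System.out.println(divisionDiff(5));   // prints 0
    System.out.println(divisionDiff(99));  // prints 1
    System.out.println(divisionDiff(199)); // prints 1
  }
}
```

For a step of exactly one, divisionDiff(currentCount) == 1 is equivalent to nextCount % 100 == 0, which is the simpler check proposed later in this thread.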

Comment on lines 146 to 150
Contributor

Suggested change
- int currentCount = -1;
- if (transactionRetryCountMap.containsKey(txID)) {
-   currentCount = transactionRetryCountMap.get(txID);
- } else {
-   currentCount = block.getCount();
+ int currentCount = transactionRetryCountMap.getOrDefault(txID, block.getCount());

We can use the getOrDefault API here.
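
As an illustration of the suggestion, the sketch below compares the containsKey/get pattern with Map.getOrDefault. The map's key and value types (Long and Integer) and the persistedCount parameter, which stands in for block.getCount(), are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

final class RetryCountLookup {
  // Original pattern: explicit containsKey check followed by get.
  static int verbose(Map<Long, Integer> retryCounts, long txID, int persistedCount) {
    int currentCount;
    if (retryCounts.containsKey(txID)) {
      currentCount = retryCounts.get(txID);
    } else {
      currentCount = persistedCount;
    }
    return currentCount;
  }

  // Suggested pattern: a single lookup with a fallback value.
  static int concise(Map<Long, Integer> retryCounts, long txID, int persistedCount) {
    return retryCounts.getOrDefault(txID, persistedCount);
  }

  public static void main(String[] args) {
    Map<Long, Integer> retryCounts = new HashMap<>();
    retryCounts.put(7L, 3);
    System.out.println(concise(retryCounts, 7L, 0));  // prints 3
    System.out.println(concise(retryCounts, 8L, 42)); // prints 42
  }
}
```

Note that getOrDefault evaluates its default argument eagerly, so the fallback expression runs even when the key is present.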

Contributor

Can we remove the commented-out code here?

@lokeshj1703
Contributor

Probably, we should think about removing persisting the retry count in db altogether here.

Sure, @bshashikant. Let's discuss this in a separate JIRA?

@amaliujia
Contributor Author

@lokeshj1703 comments addressed.

Had to refactor testing code a bit to fix the failed UT.

Contributor

@lokeshj1703 left a comment

@amaliujia Thanks for updating the PR! Regarding TestDeletedBlockLogBase and the other inherited tests, can we move them all into TestDeletedBlockLog itself? I see that you have defined public abstract int getMaxRetry(); to configure maxRetry. I think this can be done within a test as well, following the approach in TestDeletedBlockLog#testPersistence: we can recreate DeletedBlockLog after changing the configuration for the test. Sorry, this might require some refactoring.
There are a few other comments inline.

Comment on lines 148 to 150
Contributor

Redundant change.

Contributor

Suggested change
- nextCount = -1;
+ transactionRetryCountMap.remove(txID);

We can remove the entry from the map here.

Contributor

It's difficult to understand the logic here. Can we replace it using %? Perhaps nextCount % 100 == 0 or 99?

Contributor Author

nextCount % 100 == 0 is good.
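
The idea agreed on here, counting every retry in memory but writing to the DB only on every 100th increment, can be sketched as follows. This is not the code from the PR: the class, the persistedCounts map standing in for the DB, and the dbWrites counter are hypothetical names introduced for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class ThrottledRetryCounter {
  // In-memory retry counts, updated on every increment.
  private final Map<Long, Integer> inMemoryCounts = new HashMap<>();
  // Hypothetical stand-in for the persisted (DB) retry counts.
  private final Map<Long, Integer> persistedCounts = new HashMap<>();
  private int dbWrites = 0;

  // Increments the retry count for txID and persists it only when it hits a multiple of 100.
  int increment(long txID) {
    int currentCount =
        inMemoryCounts.getOrDefault(txID, persistedCounts.getOrDefault(txID, 0));
    int nextCount = currentCount + 1;
    inMemoryCounts.put(txID, nextCount);
    if (nextCount % 100 == 0) {
      persistedCounts.put(txID, nextCount);
      dbWrites++;
    }
    return nextCount;
  }

  int dbWrites() {
    return dbWrites;
  }

  public static void main(String[] args) {
    ThrottledRetryCounter counter = new ThrottledRetryCounter();
    int last = 0;
    for (int i = 0; i < 250; i++) {
      last = counter.increment(1L);
    }
    System.out.println(last + " retries, " + counter.dbWrites() + " DB writes");
    // prints "250 retries, 2 DB writes"
  }
}
```

The trade-off in this sketch is that up to 99 increments can be lost if the process restarts before the next persisted write.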

Contributor

This can be moved inside the else if.

Contributor

Unrelated to this PR: can we also remove the entry from transactionToDNsCommitMap here, to be safe?

Contributor Author

@amaliujia Jan 22, 2021

In fact, this might be a bug: in purgeTransaction, I didn't see transactionToDNsCommitMap being cleaned up properly.

Added the cleanup of transactionToDNsCommitMap here.
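
A minimal sketch of the cleanup being discussed: when a transaction is purged, its entries are removed from both in-memory maps so neither one leaks. The map names come from this thread, but their types, the enclosing class, and the track and purge methods are assumptions for illustration and do not reproduce the actual DeletedBlockLog implementation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

final class TransactionStateCleanup {
  // Assumed shape: transaction ID -> retry count.
  final Map<Long, Integer> transactionRetryCountMap = new HashMap<>();
  // Assumed shape: transaction ID -> datanodes that have committed it.
  final Map<Long, Set<UUID>> transactionToDNsCommitMap = new HashMap<>();

  // Hypothetical helper that records one retry and one datanode commit for txID.
  void track(long txID, UUID datanode) {
    transactionRetryCountMap.merge(txID, 1, Integer::sum);
    transactionToDNsCommitMap.computeIfAbsent(txID, k -> new HashSet<>()).add(datanode);
  }

  // Hypothetical purge step: drop every piece of in-memory state for txID.
  void purge(long txID) {
    transactionRetryCountMap.remove(txID);
    transactionToDNsCommitMap.remove(txID);
  }

  public static void main(String[] args) {
    TransactionStateCleanup state = new TransactionStateCleanup();
    state.track(1L, UUID.randomUUID());
    state.purge(1L);
    System.out.println(state.transactionRetryCountMap.isEmpty()
        && state.transactionToDNsCommitMap.isEmpty()); // prints true
  }
}
```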

@amaliujia
Contributor Author

@lokeshj1703 your suggestion was actually great: simple and easy.

Now this PR changes less code than before.

Contributor Author

Checkstyle complains that this line exceeds the 80-character limit, so I made this change.

@amaliujia
Contributor Author

@lokeshj1703 PR rebased and conflicts resolved. Any further comments?

Contributor

@lokeshj1703 left a comment

@amaliujia Thanks for updating the PR! The changes look good to me. +1.

lokeshj1703 pushed a commit that referenced this pull request Jan 27, 2021
@lokeshj1703
Contributor

@amaliujia Thanks for the contribution! I have committed the PR to the master branch.

@amaliujia amaliujia deleted the HDDS-4708 branch January 27, 2021 18:39
errose28 added a commit to errose28/ozone that referenced this pull request Feb 1, 2021
* master: (176 commits)
  HDDS-4760. Intermittent failure in ozone-ha acceptance test (apache#1853)
  HDDS-4770. Upgrade Ratis Thirdparty to 0.6.0 (apache#1868)
  HDDS-4765. Update close-pending workflow for new repo (apache#1856)
  HDDS-4737. Add ModifierOrder to checkstyle rules (apache#1839)
  HDDS-4704. Add permission check in OMDBCheckpointServlet (apache#1801)
  HDDS-4757. Unnecessary WARNING to set OZONE_CONF_DIR (apache#1849)
  HDDS-4751. TestOzoneFileSystem#testTrash failed when enabledFileSystemPaths and omRatisDisabled (apache#1851)
  HDDS-4736. Intermittent failure in testExpiredCertificate (apache#1838)
  HDDS-4758. Adjust classpath of ozone version to include log4j (apache#1850)
  HDDS-4518. Add metrics around Trash Operations. (apache#1832)
  HDDS-4708. Optimization: update RetryCount less frequently (update once per ~100) (apache#1805)
  HDDS-4748. sonarqube issue fix - "static" members should be accessed statically (apache#1748)
  HDDS-2402. Adapt hadolint check to improved CI framework (apache#1778)
  HDDS-4698. Upgrade Java for Sonar check (apache#1800)
  HDDS-4739. Upgrade Ratis to 1.1.0-eb66796d-SNAPSHOT (apache#1842)
  HDDS-4735. Fix typo in hdds.proto (apache#1837)
  HDDS-4430. OM failover timeout is too short (apache#1807)
  HDDS-4477. Delete txnId in SCMMetadataStoreImpl may drop to 0 after SCM restart. (apache#1828)
  HDDS-4688. Update Hadoop version to 3.2.2 (apache#1795)
  HDDS-4725. Change metrics unit from nanosecond to millisecond (apache#1823)
  ...

3 participants