Skip to content

Conversation

@errose28
Copy link
Contributor

@errose28 errose28 commented Oct 21, 2020

What changes were proposed in this pull request?

This pull request completes the open key cleanup service outlined in the parent Jira HDDS-4120. It implements the OpenKeyCleanupService class, and starts and stops the service in KeyManagerImpl. The following configurations have been defined to specify the service's behavior:

  1. ozone.open.key.cleanup.service.interval (default value 24 hours)
  2. ozone.open.key.expire.threshold (default value 1 week)
  3. ozone.open.key.cleanup.limit.per.task (default value 1000 keys)

See ozone-defaults.xml for their corresponding descriptions. Configurations for service interval and expiration threshold were previously defined in OzoneConfigKeys and done as raw numbers. This feature moves them to OMConfigKeys and implements them as TimeDurations, which requires an update to each spot in the code previously referring to these config values (mostly in test setups).

What is the link to the Apache JIRA

HDDS-4123

How was this patch tested?

Integration test has been added.

Move old configs out of OzoneConfigKeys and into OMConfigKeys.
Add expiration time parameter and time unit to all getExpiredOpenKeys calls.
Initial structure for OpenKeyCleanupService implementation.
Have not yet added a service timeout for it, this is set to 0 right now.
Add the ability to disable service timeout warning in BackgroundService.
Open key cleanup's OM RPC call will time out after 15 minutes.
* master: (23 commits)
  HDDS-4122. Implement OM Delete Expired Open Key Request and Response (apache#1435)
  HDDS-4336. ContainerInfo does not persist BCSID (sequenceId) leading to failed replica reports (apache#1488)
  Remove extra serialization from getBlockID (apache#1470)
  HDDS-4262. Use ClientID and CallID from Rpc Client to detect retry requests (apache#1436)
  HDDS-4285. Read is slow due to frequent calls to UGI.getCurrentUser() and getTokens() (apache#1454)
  HDDS-4312. findbugs check succeeds despite compile error (apache#1476)
  HDDS-4311. Type-safe config design doc points to OM HA (apache#1477)
  HDDS-3814. Drop a column family through debug cli tool (apache#1083)
  HDDS-3728. Bucket space: check quotaUsageInBytes when write key and allocate block. (apache#1458)
  HDDS-4316. Upgrade to angular 1.8.0 due to CVE-2020-7676 (apache#1481)
  HDDS-4325. Incompatible return codes from Ozone getconf -confKey (apache#1485). Contributed by Doroszlai, Attila.
  HDDS-4309. Fix inconsistency in recon config keys starting with recon and not ozone (apache#1478)
  HDDS-4310: Ozone getconf broke the compatibility (apache#1475)
  HDDS-4298. Use an interface in Ozone client instead of XceiverClientManager (apache#1460)
  HDDS-4280. Document notable configurations for Recon. (apache#1448)
  HDDS-4156. add hierarchical layout to Chinese doc (apache#1368)
  HDDS-4242. Copy PrefixInfo proto to new project hadoop-ozone/interface-storage (apache#1444)
  HDDS-4264. Uniform naming conventions of Ozone Shell Options. (apache#1447)
  HDDS-4271. Avoid logging chunk content in Ozone Insight (apache#1466)
  HDDS-4299. Display Ratis version with ozone version (apache#1464)
  ...
* master:
  HDDS-4301. SCM CA certificate does not encode KeyUsage extension properly (apache#1468)
  HDDS-4158. Provide a class type for Java based configuration (apache#1407)
  HDDS-4297. Allow multiple transactions per container to be sent for deletion by SCM.
  HDDS-2922. Balance ratis leader distribution in datanodes (apache#1371)
  HDDS-4269. Ozone DataNode thinks a volume is failed if an unexpected file is in the HDDS root directory. (apache#1490)
  HDDS-4327. Potential resource leakage using BatchOperation. (apache#1493)
  HDDS-3995. Fix s3g met NPE exception while write file by multiPartUpload (apache#1499)
  HDDS-4343. ReplicationManager.handleOverReplicatedContainer() does not handle unhealthyReplicas properly. (apache#1495)
task limit = 1000
service interval = 1 day
expiration threshold = 1 week
w for week does not exist, use d for days instead.
@errose28 errose28 marked this pull request as ready for review October 28, 2020 19:37
Copy link
Contributor

@bharatviswa504 bharatviswa504 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expiry threshold is set to 2 seconds, do we need this to be set in all tests?
Previously this config is not used, but now this will affect, but as service interval is 24 hours it is fine I think so.

"ozone.key.deleting.limit.per.task";
public static final int OZONE_KEY_DELETING_LIMIT_PER_TASK_DEFAULT = 1000;

public static final String OZONE_OPEN_KEY_CLEANUP_SERVICE_INTERVAL =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Do we want to append om to these configs?
Generally, most OM configs have "ozone.om"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I added the ozone.om prefix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the config variable name OZONE_OPEN_KEY_CLEANUP_SERVICE_INTERVAL also to include "OM".
The convention for variable names is to have the exact words as is in the config key.

omId = UUID.randomUUID().toString();
conf.setBoolean(OZONE_ACL_ENABLED, true);
conf.setInt(OZONE_OPEN_KEY_EXPIRE_THRESHOLD_SECONDS, 2);
conf.setTimeDuration(OZONE_OPEN_KEY_EXPIRE_THRESHOLD, 2, TimeUnit.SECONDS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to set this in this test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this setting in this or the other tests, I have removed these left over open key expiration configs that were present in tests before this feature.

omId = UUID.randomUUID().toString();
conf.setBoolean(OZONE_ACL_ENABLED, true);
conf.setInt(OZONE_OPEN_KEY_EXPIRE_THRESHOLD_SECONDS, 2);
conf.setTimeDuration(OZONE_OPEN_KEY_EXPIRE_THRESHOLD, 2, TimeUnit.SECONDS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to set this in this test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does not need to be set, I have removed it per the comment above.


private static final Logger LOG =
LoggerFactory.getLogger(OpenKeyCleanupService.class);
LoggerFactory.getLogger(KeyDeletingService.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: KeyDeletingService -> OpenKeyCleanupService

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks!

// Use only a single thread for open key deletion. Multiple threads would read
// from the same table and can send deletion requests for same key multiple
// times.
private final static int KEY_DELETING_CORE_POOL_SIZE = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KEY_DELETING_CORE_POOL_SIZE -> OPEN_KEY_DELETING_CORE_POOL_SIZE

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks!

// OzoneManager can be null for testing
return true;
}
return ozoneManager.isLeader();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to use isLeaderReady to avoid retry of the request and getting the keys from the delete table.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isLeaderReady API is added as part of HDDS-4484

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am now using the isLeaderReady API here. Let me know if I have done so correctly.

} catch (IOException e) {
LOG.error("Unable to get hanging open keys, retry in"
+ " next interval", e);
ozoneManager.getOmServerProtocol().submitRequest(null, omRequest);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to use new API added by HDDS-4481

/**
  * API used internally from OzoneManager Server when requests needs to be
  * submitted to ratis, where the crafted RaftClientRequest is passed along.
  * @param omRequest
  * @param raftClientRequest
  * @return OMResponse
  * @throws ServiceException
  */
 public OMResponse submitRequest(OMRequest omRequest,
     RaftClientRequest raftClientRequest) throws ServiceException {
   RaftClientReply raftClientReply = submitRequestToRatis(raftClientRequest);
   return processReply(omRequest, raftClientReply);
 }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the code to use the new API. Let me know if I have done so correctly.

try {
OpenKey openKey = OpenKey.newBuilder()
.setName(split[3])
.setClientID(Long.parseLong(split[4]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clientID will be split.length -1 right?
Example:
/vol/buck/dir1/dir2/dir3/file1/10000

split[4] in this case will be dir3, which is not clientID.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there were some bugs here related to separators, good catch. I believe they should be fixed now, and the tests in TestOpenKeyCleanupService now use some messy keys with extra separators to make sure they are handled correctly.

* master: (40 commits)
  HDDS-4473. Reduce number of sortDatanodes RPC calls (apache#1610)
  HDDS-4485. [DOC] add the authentication rules of the Ozone Ranger. (apache#1603)
  HDDS-4528. Upgrade slf4j to 1.7.30 (apache#1639)
  HDDS-4424. Update README with information how to report security issues (apache#1548)
  HDDS-4484. Use RaftServerImpl isLeader instead of periodic leader update logic in OM and isLeaderReady for read/write requests (apache#1638)
  HDDS-4429. Create unit test for SimpleContainerDownloader. (apache#1551)
  HDDS-4461. Reuse compiled binaries in acceptance test (apache#1588)
  HDDS-4511: Avoiding StaleNodeHandler to take effect in TestDeleteWithSlowFollower. (apache#1625)
  HDDS-4510. SCM can avoid creating RetriableDatanodeEventWatcher for deletion command ACK (apache#1626)
  HDDS-3363. Intermittent failure in testContainerImportExport (apache#1618)
  HDDS-4370. Datanode deletion service can avoid storing deleted blocks. (apache#1620)
  HDDS-4512. Remove unused netty3 transitive dependency (apache#1627)
  HDDS-4481. With HA OM can send deletion blocks to SCM multiple times. (apache#1608)
  HDDS-4487. SCM can avoid using RETRIABLE_DATANODE_COMMAND for datanode deletion commands. (apache#1621)
  HDDS-4471. GrpcOutputStream length can overflow (apache#1617)
  HDDS-4308. Fix issue with quota update (apache#1489)
  HDDS-4392. [DOC] Add Recon architecture to docs (apache#1602)
  HDDS-4501. Reload OM State fail should terminate OM for any exceptions. (apache#1622)
  HDDS-4492. CLI flag --quota should default to 'spaceQuota' to preserve backward compatibility. (apache#1609)
  HDDS-3689. Add various profiles to MiniOzoneChaosCluster to run different modes. (apache#1420)
  ...
conf.setInt(OMConfigKeys.OZONE_OPEN_KEY_CLEANUP_LIMIT_PER_TASK,
TESTING_TASK_LIMIT);

cluster = MiniOzoneCluster.newBuilder(conf).build();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add also a similar test for OM HA

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestOpenKeyCleanupService is now parameterized for HA and non-HA testing.

@errose28
Copy link
Contributor Author

errose28 commented Jan 5, 2021

Thanks for the review @bharatviswa504, Let me know if your comments have been addressed correctly in the latest version of the code.

* master: (176 commits)
  HDDS-4760. Intermittent failure in ozone-ha acceptance test (apache#1853)
  HDDS-4770. Upgrade Ratis Thirdparty to 0.6.0 (apache#1868)
  HDDS-4765. Update close-pending workflow for new repo (apache#1856)
  HDDS-4737. Add ModifierOrder to checkstyle rules (apache#1839)
  HDDS-4704. Add permission check in OMDBCheckpointServlet (apache#1801)
  HDDS-4757. Unnecessary WARNING to set OZONE_CONF_DIR (apache#1849)
  HDDS-4751. TestOzoneFileSystem#testTrash failed when enabledFileSystemPaths and omRatisDisabled (apache#1851)
  HDDS-4736. Intermittent failure in testExpiredCertificate (apache#1838)
  HDDS-4758. Adjust classpath of ozone version to include log4j (apache#1850)
  HDDS-4518. Add metrics around Trash Operations. (apache#1832)
  HDDS-4708. Optimization: update RetryCount less frequently (update once per ~100) (apache#1805)
  HDDS-4748. sonarqube issue fix - "static" members should be accessed statically (apache#1748)
  HDDS-2402. Adapt hadolint check to improved CI framework (apache#1778)
  HDDS-4698. Upgrade Java for Sonar check (apache#1800)
  HDDS-4739. Upgrade Ratis to 1.1.0-eb66796d-SNAPSHOT (apache#1842)
  HDDS-4735. Fix typo in hdds.proto (apache#1837)
  HDDS-4430. OM failover timeout is too short (apache#1807)
  HDDS-4477. Delete txnId in SCMMetadataStoreImpl may drop to 0 after SCM restart. (apache#1828)
  HDDS-4688. Update Hadoop version to 3.2.2 (apache#1795)
  HDDS-4725. Change metrics unit from nanosecond to millisecond (apache#1823)
  ...
Copy link
Contributor

@hanishakoneru hanishakoneru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @errose28 for working on this. PR LGTM overall. Minor comments posted inline.

"ozone.key.deleting.limit.per.task";
public static final int OZONE_KEY_DELETING_LIMIT_PER_TASK_DEFAULT = 1000;

public static final String OZONE_OPEN_KEY_CLEANUP_SERVICE_INTERVAL =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the config variable name OZONE_OPEN_KEY_CLEANUP_SERVICE_INTERVAL also to include "OM".
The convention for variable names is to have the exact words as is in the config key.

.setNumOfOzoneManagers(numOfOMs)
.build();
cluster.waitForClusterToBeReady();
cluster.restartOzoneManager();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need OM restart in init?

conf.setBoolean(OZONE_ACL_ENABLED, true);
conf.setBoolean(OZONE_SECURITY_ENABLED_KEY, true);
conf.setInt(OZONE_OPEN_KEY_EXPIRE_THRESHOLD_SECONDS, 2);
conf.setTimeDuration(OZONE_OPEN_KEY_EXPIRE_THRESHOLD, 2, TimeUnit.SECONDS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was OZONE_SECURITY_ENABLED_KEY set call removed by mistake?

long serviceInterval = configuration.getTimeDuration(
OZONE_OPEN_KEY_CLEANUP_SERVICE_INTERVAL,
OZONE_OPEN_KEY_CLEANUP_SERVICE_INTERVAL_DEFAULT.getDuration(),
TimeUnit.MILLISECONDS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If user sets this config to an integer value (without giving the unit suffix), then the serviceInterval will take it as milliseconds. For example if user wants to set to 48 hours and sets it to value 48, the service interval will be calculated as 48 milliseconds.
To provide one layer of protection against this, we can use getTimeDuration(String name, long defaultValue, TimeUnit defaultUnit, TimeUnit returnUnit) instead.
Also, we should add "Unit could be defined with... " to the description in ozone-default.xml (similar to how it is in ozone.block.deleting.service.interval).

@Override
public List<String> getExpiredOpenKeys(int count) throws IOException {
public List<String> getExpiredOpenKeys(TimeDuration expireThreshold,
int count) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: Since all the time comparisons are in milliseconds, would it better to just send expireThresholdMs long value instead of expireThreshold TimeDuration?

private boolean isRatisEnabled() {
if (ozoneManager == null) {
return false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

// Check if this is the Leader OM. If not leader, no need to execute this
// task.
if (shouldRun()) {
runCount.incrementAndGet();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A suggestion - It might be useful to expose a metric here to understand how many open keys have been cleaned up. It does not have to be part of this Jira.

// If the client ID cannot be parsed correctly, do not add the key to
// the map.
LOG.error("Failed to parse client ID {} as a long from open key {}.",
clientID, openKeyName, ex);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add in the message that this happened during Open key cleanup to be more clear.

.setClientId(ClientId.randomId())
.setServerId(server.getRaftPeerId())
.setGroupId(server.getRaftGroupId())
.setCallId(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we set the callId to runCount?

}
} catch (IOException e) {
LOG.error("Error while running delete keys background task. Will " +
"retry at next run.", e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change "delete keys background task" to "open keys cleanup task" to avoid confusion.

@elek
Copy link
Member

elek commented Mar 1, 2021

/pending review comments from Hanisha are not addressed (as fat as I see)

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking this issue as un-mergeable as requested.

Please use /ready comment when it's resolved.

Please note that the PR will be closed after 21 days of inactivity from now. (But can be re-opened anytime later...)

review comments from Hanisha are not addressed (as fat as I see)

@github-actions
Copy link

Thank you very much for the patch. I am closing this PR temporarily as there was no activity recently and it is waiting for response from its author.

It doesn't mean that this PR is not important or ignored: feel free to reopen the PR at any time.

It only means that attention of committers is not required. We prefer to keep the review queue clean. This ensures PRs in need of review are more visible, which results in faster feedback for all PRs.

If you need ANY help to finish this PR, please contact the community on the mailing list or the slack channel."

@github-actions github-actions bot closed this Mar 23, 2021
@captainzmc
Copy link
Member

Hi @errose28, Thank you very much for the patch. I think this PR is important because many exceptions cause write failures and we need to clean up the failed keys.
I think this patch is very close to completion. Will you continue and reopen this PR? I can also help review this PR and merge this.

@errose28
Copy link
Contributor Author

Hi @captainzmc thanks for your interest in this pull request, and apologies for abandoning it. I see you need this feature for HDDS-5867. I recall quota support was being added around the same time I was working on this, so it is understandable there may be some gaps in integrating the two features.

I am not sure how close this actually is to completion right now. It needs to be rebased on a year's worth of changes, updated to support the new open file table for FSO (this might be better done in another PR), and checked against the quota feature. I am willing to finish it but do not have the bandwidth right now. I will try to get to it in a week's time, but if someone else is interested in getting this done ASAP I will gladly let someone else finish it and can offer assistance on Slack if needed.

@kaijchen
Copy link
Member

Hi @errose28 and @captainzmc, i will look into this. Any advice would be appreciated.

@captainzmc
Copy link
Member

captainzmc commented Jan 18, 2022

Hi @errose28 and @captainzmc, i will look into this. Any advice would be appreciated.

Thanks @kaijchen working for this. I think you can rebase to the master using the development branch of @errose28. Then resubmit the PR.

@errose28
Copy link
Contributor Author

Thanks for picking this up @kaijchen. Feel free to reach out to me on slack with any questions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants