HDDS-2424. Add the recover-trash command server side handling. #399

cxorm · 2019-12-28T11:59:51Z

What changes were proposed in this pull request?

This PR adds the server-side handling for recover-trash.

Updated components (in propagation order):
OzoneManager and OzoneManagerRequestHandler
KeyManager and its implementation
OMMetadataManager and its implementation

The remaining fixes (including startKey and prefix) will be completed in HDDS-2425 and HDDS-2426.

Note

Because recoverTrash is a write request, it should be handled through OM HA.

Following this doc, we use late validation to handle write requests.

This PR mainly consists of updating OzoneManager and
fixing OzoneManagerProtocolClientSideTranslatorPB, as well as
OMTrashRecoverRequest and OMTrashRecoverResponse, to handle write-type requests.

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-2424

How was this patch tested?

Because this PR only adds the server-side handling,
I only tested the propagation and ran the unit tests.

maobaolong

Looks good overall; I just left some minor comments.

maobaolong · 2019-12-30T01:38:34Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java

+  public boolean recoverTrash(String volumeName, String bucketName,
+      String keyName, String destinationBucket) throws IOException {
+
+    Preconditions.checkNotNull(volumeName);


Would @Nonnull be a better way? The Nonnull annotation is already used in the hadoop-ozone project, and findbugs will flag null arguments during the build.

Thanks @maobaolong for this advice.

As far as I know, we can use both @Nonnull and Preconditions.checkNotNull(): the former tells developers not to pass null for the parameter, and the latter validates the argument at runtime.

So I think this part does not need to change,
and we can track the annotation issue in a separate Jira: HDDS-2824
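
To illustrate the distinction above, here is a minimal JDK-only sketch. It is an assumption-laden stand-in, not the code in KeyManagerImpl: java.util.Objects.requireNonNull plays the role of Guava's Preconditions.checkNotNull, the class name NullCheckSketch is hypothetical, and the annotation appears only in a comment because it needs an extra dependency.

```java
import java.util.Objects;

/** Hypothetical illustration only; not the code in KeyManagerImpl. */
final class NullCheckSketch {

  // A static-analysis annotation such as @Nonnull would be placed on each
  // parameter. It documents intent and lets tools warn at build time, but on
  // its own it does not stop a null from arriving at runtime.
  static boolean recoverTrash(String volumeName, String bucketName,
      String keyName, String destinationBucket) {
    // Runtime validation: fail fast with a NullPointerException.
    // Objects.requireNonNull stands in for Preconditions.checkNotNull here.
    Objects.requireNonNull(volumeName, "volumeName");
    Objects.requireNonNull(bucketName, "bucketName");
    Objects.requireNonNull(keyName, "keyName");
    Objects.requireNonNull(destinationBucket, "destinationBucket");
    return true; // stub result, mirroring the TODO state of the patch
  }

  public static void main(String[] args) {
    System.out.println(recoverTrash("vol", "bucket", "key", "destBucket"));
  }
}
```

The two mechanisms are complementary: the annotation catches mistakes the analyzer can see, and the runtime check catches the rest.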

maobaolong · 2019-12-30T01:41:58Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java

+  public boolean recoverTrash(String volumeName, String bucketName,
+      String keyName, String destinationBucket) throws IOException {
+
+    // TODO: core logic stub would be added in later patch.


Maybe we need a JIRA ticket to track this item?

Thanks @maobaolong for the suggestion.

Yes, HDDS-2425 and HDDS-2426 track this item.
Updated.

maobaolong · 2019-12-30T01:43:33Z

hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/TestTrashService.java

    String destinationBucket = "destBucket";
    createAndDeleteKey(keyName);

-    /* TODO:HDDS-2424. */


Yes, what I mean is something like this TODO comment.

bharatviswa504 · 2020-01-06T22:35:07Z

Recover trash is a write request command, as it moves the keys from the delete table to the original key table. Let me know if I am missing something here.

Two comments:

1. In the code it is marked as read-only; should it be a write request in OM?
2. If it is a write request, it should follow the implementation of write requests (preExecute/validateAndUpdateCache).

For write request commands, we should follow the HA style of request implementation. For reference, see this design link and write link.

bharatviswa504

I have one general question on the PR approach.

bharatviswa504 · 2020-01-07T04:45:03Z

@cxorm I attached the Cache Design document to HDDS-505. Sorry for the trouble: the links in the "write link" document above are internal links, so I attached them to HDDS-505.

cxorm · 2020-01-07T04:53:31Z

Thank you @bharatviswa504 for the document.
I'm going to fix this.

cxorm · 2020-01-19T18:02:21Z

Thanks @bharatviswa504 for the document.

recoverTrash is a write request, so following the document
I updated the PR with OMTrashRecoverRequest#preExecute and OMTrashRecoverRequest#validateAndUpdateCache, as well as a draft of OMTrashRecoverResponse.
The full implementation will be completed in a following PR.
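
As a rough illustration of the two-phase shape mentioned above, here is a simplified sketch. It is not the Ozone API: the class TrashRecoverSketch, the Request type, and the plain maps standing in for the deleted table and key table are all hypothetical, and repeated deletions of the same key are ignored for brevity.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of a two-phase write request. */
final class TrashRecoverSketch {

  /** Immutable request; modificationTime is stamped during preExecute. */
  static final class Request {
    final String key;
    final long modificationTime;

    Request(String key, long modificationTime) {
      this.key = key;
      this.modificationTime = modificationTime;
    }
  }

  /** Phase 1: only normalizes the request (here, stamps a time). */
  static Request preExecute(Request raw, long nowMillis) {
    return new Request(raw.key, nowMillis);
  }

  /**
   * Phase 2: validation happens here, when the request is applied
   * ("late validation"), followed by the state update.
   */
  static boolean validateAndUpdateCache(Request req,
      Map<String, String> deletedTable, Map<String, String> keyTable) {
    String info = deletedTable.get(req.key);
    if (info == null) {
      return false; // nothing to recover for this key
    }
    deletedTable.remove(req.key);
    keyTable.put(req.key, info);
    return true;
  }

  public static void main(String[] args) {
    Map<String, String> deleted = new HashMap<>();
    Map<String, String> keys = new HashMap<>();
    deleted.put("key1", "keyInfo");
    Request req = preExecute(new Request("key1", 0L), System.currentTimeMillis());
    System.out.println(validateAndUpdateCache(req, deleted, keys));
  }
}
```

Keeping validation in the second phase means every check runs against the state at apply time rather than at submission time.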

elek · 2020-02-10T10:48:05Z

@bharatviswa504 Can we commit this PR?

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManager.java

bharatviswa504 · 2020-02-14T06:01:40Z

...ne-manager/src/main/java/org/apache/hadoop/ozone/om/response/key/OMTrashRecoverResponse.java

+    omKeyInfo = OmUtils.prepareKeyForRecover(omKeyInfo, repeatedOmKeyInfo);
+    omMetadataManager.getDeletedTable()
+        .deleteWithBatch(batchOperation, omKeyInfo.getKeyName());
+    /* TODO: trashKey should be updated to destinationBucket. */


One question:

If a key is created, deleted, then created and deleted again, which omKeyInfo from the delete table will be used on recovery?

Thank you @bharatviswa504 for the question.

According to the DeletedTable processing in OMKeyDeleteResponse#addToDBBatch() and OmUtils#prepareKeyForDelete(), the most recently deleted key is appended to the tail of RepeatedOmKeyInfo#omKeyInfoList.

So I think we could recover the latest deleted key from the DeletedTable in this created-deleted-created-deleted situation. (And when recovering the latest key, I think we should clear the old deleted key.)

Could you please advise me if I am missing something?
If the idea is sound, I will update the description of this Jira.

I am fine with recovering the last deleted key if that is the expected behavior.

Regarding "(And when recovering the latest key, I think we should clear the old deleted key.)": we should not delete the other keys, because the background trash service will pick them up and the data for those keys needs to be deleted.

From my understanding, this approach is also not fully correct. Say we put those keys in the delete table, and the background delete-key service picks them up and sends them to SCM for deletion. If a recover-trash command arrives at that point, we may recover a key that has no data, because the deletion request has already been submitted to SCM, which in turn sends it to the DNs. How should we handle this kind of scenario?

This can happen because entries are removed from the delete table only when the key purge request happens.

Code snippet link #link
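
To make the version-ordering part of this thread concrete, here is a small sketch under stated assumptions: the class DeletedVersions is a hypothetical stand-in for a deleted-table entry that holds repeated deletions of one key, and plain strings stand in for key info objects. It does not address the purge race described above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one deleted-table entry with repeated deletions. */
final class DeletedVersions {
  private final List<String> versions = new ArrayList<>();

  /** Each deletion is appended, so the tail is the most recent one. */
  void addDeleted(String version) {
    versions.add(version);
  }

  /** Removes and returns only the latest version; older ones stay for purging. */
  String recoverLatest() {
    if (versions.isEmpty()) {
      throw new IllegalStateException("nothing to recover");
    }
    return versions.remove(versions.size() - 1);
  }

  /** Defensive copy of the versions still waiting in the entry. */
  List<String> remaining() {
    return new ArrayList<>(versions);
  }

  public static void main(String[] args) {
    DeletedVersions entry = new DeletedVersions();
    entry.addDeleted("first-deletion");
    entry.addDeleted("second-deletion");
    System.out.println(entry.recoverLatest());
    System.out.println(entry.remaining());
  }
}
```

Leaving the older versions in place matches the point that their data still has to be cleaned up by the background service.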

/pending I'm tracing the background part. (Hope soon)

Thank you @bharatviswa504 for taking time to review this.

Here is my thought.
We set modificationTime when deleting a key.

So I think we can compare the modificationTime with RECOVERY_WINDOW to exclude keys (those in trash-enabled buckets) from purging.

The code snippet to be added after this line might look like:

if (trashEnable(info.getBucketName())
    && (Time.now() - info.getModificationTime()) < RECOVERY_WINDOW) {
  /* Would not delete key in this situation. */
}

Note: the bucket's recovery window will be added in a later Jira.
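
A self-contained sketch of that check, with everything passed in explicitly so it can be exercised in isolation. The class RecoveryWindowSketch, the method shouldSkipPurge, and its parameters are hypothetical names for illustration; how the trash-enabled flag and the window are actually obtained per bucket is left open.

```java
import java.time.Duration;

/** Hypothetical helper; names and parameters are illustrative only. */
final class RecoveryWindowSketch {

  /**
   * Returns true when a deleted key should be kept, i.e. its bucket has
   * trash enabled and the deletion is still inside the recovery window.
   */
  static boolean shouldSkipPurge(boolean trashEnabled, long deletedAtMillis,
      long nowMillis, Duration recoveryWindow) {
    long ageMillis = nowMillis - deletedAtMillis;
    return trashEnabled && ageMillis < recoveryWindow.toMillis();
  }

  public static void main(String[] args) {
    Duration window = Duration.ofHours(1);
    long now = System.currentTimeMillis();
    long deletedTenMinutesAgo = now - Duration.ofMinutes(10).toMillis();
    System.out.println(shouldSkipPurge(true, deletedTenMinutesAgo, now, window));
  }
}
```

Passing the current time as an argument rather than reading the clock inside the method keeps the predicate deterministic in unit tests.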

Could you please share your thoughts if I am missing something? Thank you.

And here is the discussion about trash-recovery.

...zone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMTrashRecoverRequest.java

...ne-manager/src/main/java/org/apache/hadoop/ozone/om/response/key/OMTrashRecoverResponse.java

bharatviswa504

A few comments inline.
Thank You @cxorm for the update.

elek · 2020-03-10T11:56:51Z

/pending Comments from @bharatviswa504 are not addressed, yet...

github-actions

Marking this issue as un-mergeable as requested.

Please use /ready comment when it's resolved.

Comments from @bharatviswa504 are not addressed, yet...

elek · 2020-04-01T08:26:55Z

@bharatviswa504 Can you please check it?

bharatviswa504 · 2020-04-06T18:57:45Z

@elek There are some pending comments which need to be resolved.

cxorm · 2020-04-12T21:46:53Z

hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/protocol/OzoneManagerProtocol.java

@bharatviswa504
What do you think about this cleanup of the write request?
Could we give the write operation a default implementation, or
do we need a separate interface for write operations?

bshashikant · 2020-04-17T09:10:37Z

@cxorm, can you please update the PR?

bshashikant · 2020-04-24T14:20:21Z

@bharatviswa504, can you please have a look?

cxorm · 2020-04-29T07:29:30Z

Thanks @bshashikant for the reminder.
Rebased onto the latest master branch (#843).

cxorm · 2020-04-30T21:24:30Z

Rebased onto the latest master branch (#839) to resolve a conflict.

cxorm · 2020-05-05T07:53:45Z

Rebased onto the latest master branch (#848) and triggered github-actions.

bshashikant · 2020-05-05T12:14:58Z

Thanks @cxorm for working on this. I have committed this.

cxorm · 2020-05-06T07:39:04Z

Thanks @bharatviswa504 for the review
and thanks @bshashikant for the commit.

maobaolong reviewed Dec 30, 2019

xiaoyuyao requested a review from anuengineer January 6, 2020 17:18

bharatviswa504 requested changes Jan 6, 2020

cxorm force-pushed the HDDS-2424 branch from d7be25d to b41ebda January 18, 2020 11:06

cxorm requested a review from bharatviswa504 January 19, 2020 18:05