-
Notifications
You must be signed in to change notification settings - Fork 587
HDDS-3286. BasicOzoneFileSystem support batchDelete. #814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Please have a look at https://issues.apache.org/jira/browse/HDDS-2939. The problem is being addressed there. |
rakeshadr
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the good work.
Added few suggestions and comments. Please go through it. Thanks!
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/protocol/ClientProtocol.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need String keyName argument ? Can you please incorporate String keyName argument into the keyNameList argument, something similar you have very well refactored for keyNameList.add(keyName); in deleteKey api.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need
String keyNameargument ? Can you please incorporateString keyNameargument into thekeyNameListargument, something similar you have very well refactored forkeyNameList.add(keyName);in deleteKey api.
The keyName in OmKeyArgs is Shared with other interfaces. Operations like create and list still need to use it.
...e/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyRenameRequest.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/RenameInfo.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/RenameInfo.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/ozonefs/src/main/java/org/apache/hadoop/fs/ozone/BasicOzoneFileSystem.java
Outdated
Show resolved
Hide resolved
...one-manager/src/test/java/org/apache/hadoop/ozone/om/request/key/TestOMKeyRenameRequest.java
Outdated
Show resolved
Hide resolved
@mukul1987 Thanks for the tip. After looking at the design documentation, I found that is a better implementation. I'm going to colse this PR. |
Hi @rakeshadr Thank for you review. By looking at the design documentation of https://issues.apache.org/jira/browse/HDDS-2939. I found that the HDDS-2939 works better, and I'm going to colse this PR. |
c927d2b to
6208606
Compare
|
hi @rakeshadr Thank for you review. |
|
Hi @ChenSammi @xiaoyuyao |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a list of KeyArgs here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. Use list of KeyArgs instead of add a new keyNameList field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good advice. Instead of modifying KeyArgs, I'll try to implement delete with a List that contains KeyArgs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not remain KeyArgs as for single key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, We can avoid changing it here,by use list of KeyArgs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use repeated KeyArgs here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to iterate all omKeyInfo in the list as well here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related to this class. But you wanna visit TestOzoneManagerHAWithData and see if HA needs a test case for deleting a list of keys.
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/OmKeyArgs.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great if we add a failure case here like there is an unknown key in a list of known keys. We would like to also test exceptions.
|
Hi @bharatviswa504, Could you please help to review this PR? Let's see if there is any effect on OM HA. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to change toKeyName from required to optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This place is useless for modification, I will restore this place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we delete a list of keys, upon failure in the middle, can we return a list of deleted keys and undeleted keys? This may not be an issue when you delete a single key but when batch deleting, it is hard to recover from the failures without that information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with Xiaoyu, we should return a list of deleted key result on failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add unDeletedKeys and deletedKeys in exception when an exception occurs when delete keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is OK with o3fs as we will mount a single bucket. In the context of ofs where you can have multiple volume buckets under the root. This lock can't guarantee atomic across all the delete keyname list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I will make sure that all buckets I used are locked to ensure compatibility with ofs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will only print the address of the keNameList. You may want to expand the list and also protect the LOG.debug with a if LOG.isDebugEnabled().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above. we might only want to print the key that failed in the deletion instead of the whole list upon failures. print the whole list can be a debug or trace level log.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we define a new method called deleteObjects to be backward compatible?
|
Fixed issues with large lock granularity. PR can continue to be reviewed. |
|
Had rebased master and fixed conflict. PR can continue to be reviewed. cc @timmylicheng @ChenSammi @xiaoyuyao Thanks. |
ChenSammi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sugguest to change all deleteBatchKey liked namning to deleleKeys.
|
|
||
| public static final String OZONE_FS_ITERATE_BATCH_SIZE = | ||
| "ozone.fs.iterate.batch-size"; | ||
| public static final int OZONE_FS_ITERATE_BATCH_SIZE_DEFAULT = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest a bigger default size, say 100?
| "ozone.s3.token.max.lifetime"; | ||
| public static final String OZONE_S3_AUTHINFO_MAX_LIFETIME_KEY_DEFAULT = "3m"; | ||
|
|
||
| public static final String OZONE_FS_ITERATE_BATCH_SIZE = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this property be used for batch rename too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, batch Rename will also use this property.
| * @throws IOException | ||
| */ | ||
| void deleteKeys(String volumeName, String bucketName, | ||
| List<String> keyNameList) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indent is incorrect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with Xiaoyu, we should return a list of deleted key result on failure.
...ne-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMBatchKeyDeleteRequest.java
Show resolved
Hide resolved
0a2851d to
8604295
Compare
Codecov Report
@@ Coverage Diff @@
## master #814 +/- ##
=========================================
Coverage ? 69.43%
Complexity ? 9125
=========================================
Files ? 962
Lines ? 48287
Branches ? 4694
=========================================
Hits ? 33528
Misses ? 12541
Partials ? 2218 Continue to review full report at Codecov.
|
691eca8 to
3f720f1
Compare
3f720f1 to
257a8d4
Compare
|
Status updates: |
...op-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/OMClientRequest.java
Outdated
Show resolved
Hide resolved
9bf3f10 to
078c9d4
Compare
078c9d4 to
44c5788
Compare
|
Thanks @xiaoyuyao for starting to review this PR again. I have made some modifications according to the suggestions above. In addition, Due to the conflict problem, I rebase the previous commit. |
|
Thanks @captainzmc for the update. The latest change LGTM, +1. |
* upstream/master: (56 commits) HDDS-3264. Fix TestCSMMetrics.java. (apache#1120) HDDS-3858. Remove support to start Ozone and HDFS datanodes in the same JVM (apache#1117) HDDS-3704. Update all the documentation to use ozonefs-hadoop2/3 instead of legacy/current (apache#1099) HDDS-3773. Add OMDBDefinition to define structure of om.db. (apache#1076) Revert "HDDS-3263. Fix TestCloseContainerByPipeline.java. (apache#1119)" (apache#1126) HDDS-3821. Disable Ozone SPNEGO should not fall back to hadoop.http.a… (apache#1101) HDDS-3819. OzoneManager#listVolumeByUser ignores userName parameter when ACL is enabled (apache#1087) HDDS-3779. Add csi interface documents to show how to use ozone csi (apache#1059) HDDS-3857. Datanode in compose/ozonescripts can't be started (apache#1116) HDDS-3430. Enable TestWatchForCommit test cases. (apache#1114) HDDS-3263. Fix TestCloseContainerByPipeline.java. (apache#1119) HDDS-3512. s3g multi-part-upload saved incorrect content using streaming (apache#1092) HDDS-3836. Modify ContainerPlacementPolicyFactory JavaDoc (apache#1097) HDDS-3780. Replace the imagePullPolicy from always to IfNotPresent (apache#1055) HDDS-3847. Change OMNotLeaderException logging to DEBUG (apache#1118) HDDS-3745. Improve OM and SCM performance with 64% by avoid collect datanode information to s3g (apache#1031) HDDS-3286. BasicOzoneFileSystem support batchDelete. (apache#814) HDDS-3850. Update the admin document to let user know how to show the status of all rules. (apache#1109) HDDS-3848. Add ratis.thirdparty.version in main pom.xml (apache#1108) HDDS-3815. Avoid buffer copy in ContainerCommandRequestProto. (apache#1085) ...
What changes were proposed in this pull request?
Currently delete file is to get all the keys in the directory, and then delete one by one. This makes for poor performance.
By tested the deletion directory with 100,000 files, which took 3718.70 sec. And rename it took 7327.936 sec.
Using this PR, when batch-size is set to 100, the time of delete and rename's directory of 100,000 files is 62.498 sec and 46.002 sec. Performance improved nearly 100 times。
Rename has the same problem, and I'll add rename to the other PR.
##What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-3286
How was this patch tested?
Added UT
Existing acceptance tests.