Skip to content

Conversation

@captainzmc
Copy link
Member

@captainzmc captainzmc commented Apr 13, 2020

What changes were proposed in this pull request?

Currently delete file is to get all the keys in the directory, and then delete one by one. This makes for poor performance.
By tested the deletion directory with 100,000 files, which took 3718.70 sec. And rename it took 7327.936 sec.
Using this PR, when batch-size is set to 100, the time of delete and rename's directory of 100,000 files is 62.498 sec and 46.002 sec. Performance improved nearly 100 times。

Rename has the same problem, and I'll add rename to the other PR.

##What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-3286

How was this patch tested?

Added UT
Existing acceptance tests.

@mukul1987
Copy link
Contributor

Please have a look at https://issues.apache.org/jira/browse/HDDS-2939. The problem is being addressed there.

Copy link
Contributor

@rakeshadr rakeshadr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the good work.
Added few suggestions and comments. Please go through it. Thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need String keyName argument ? Can you please incorporate String keyName argument into the keyNameList argument, something similar you have very well refactored for keyNameList.add(keyName); in deleteKey api.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need String keyName argument ? Can you please incorporate String keyName argument into the keyNameList argument, something similar you have very well refactored for keyNameList.add(keyName); in deleteKey api.

The keyName in OmKeyArgs is Shared with other interfaces. Operations like create and list still need to use it.

@captainzmc
Copy link
Member Author

Please have a look at https://issues.apache.org/jira/browse/HDDS-2939. The problem is being addressed there.

@mukul1987 Thanks for the tip. After looking at the design documentation, I found that is a better implementation. I'm going to colse this PR.

@captainzmc
Copy link
Member Author

Thanks for the good work.
Added few suggestions and comments. Please go through it. Thanks!

Hi @rakeshadr Thank for you review. By looking at the design documentation of https://issues.apache.org/jira/browse/HDDS-2939. I found that the HDDS-2939 works better, and I'm going to colse this PR.

@captainzmc captainzmc closed this Apr 15, 2020
@captainzmc captainzmc reopened this May 10, 2020
@captainzmc captainzmc closed this May 10, 2020
@captainzmc captainzmc changed the title HDDS-3286. BasicOzoneFileSystem support batchDelete and batchRename. (WIP)HDDS-3286. BasicOzoneFileSystem support batchDelete and batchRename. May 18, 2020
@captainzmc captainzmc reopened this May 18, 2020
@captainzmc captainzmc force-pushed the HDDS-3286-new branch 2 times, most recently from c927d2b to 6208606 Compare May 19, 2020 06:39
@captainzmc captainzmc changed the title (WIP)HDDS-3286. BasicOzoneFileSystem support batchDelete and batchRename. (WIP)HDDS-3286. BasicOzoneFileSystem support batchDelete. May 19, 2020
@captainzmc captainzmc changed the title (WIP)HDDS-3286. BasicOzoneFileSystem support batchDelete. HDDS-3286. BasicOzoneFileSystem support batchDelete. May 19, 2020
@captainzmc
Copy link
Member Author

captainzmc commented May 19, 2020

hi @rakeshadr Thank for you review.
After negotiation, I re-opened this PR, and I modified it according to your suggestion. In addition, for batch optimization of rename, I plan to submit a PR hdds-3620 separately based on this PR.

@captainzmc
Copy link
Member Author

captainzmc commented May 19, 2020

Hi @ChenSammi @xiaoyuyao
As we discussed before, because the progress of HDDS-2939 was slow, and it's going to take a long time to merge into master. So we optimized the batch operation based on the current interface. We were able to get better performance with this PR before the HDDS-2939 merged.
In addition, for batch rename operation, I plan to submit a PR HDDS-3620 separately based on this.
Could you please help me review this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a list of KeyArgs here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Use list of KeyArgs instead of add a new keyNameList field.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good advice. Instead of modifying KeyArgs, I'll try to implement delete with a List that contains KeyArgs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not remain KeyArgs as for single key?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, We can avoid changing it here,by use list of KeyArgs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use repeated KeyArgs here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to iterate all omKeyInfo in the list as well here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to this class. But you wanna visit TestOzoneManagerHAWithData and see if HA needs a test case for deleting a list of keys.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great if we add a failure case here like there is an unknown key in a list of known keys. We would like to also test exceptions.

@captainzmc
Copy link
Member Author

captainzmc commented May 22, 2020

Hi @bharatviswa504, Could you please help to review this PR? Let's see if there is any effect on OM HA.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to change toKeyName from required to optional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This place is useless for modification, I will restore this place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we delete a list of keys, upon failure in the middle, can we return a list of deleted keys and undeleted keys? This may not be an issue when you delete a single key but when batch deleting, it is hard to recover from the failures without that information.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with Xiaoyu, we should return a list of deleted key result on failure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add unDeletedKeys and deletedKeys in exception when an exception occurs when delete keys.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is OK with o3fs as we will mount a single bucket. In the context of ofs where you can have multiple volume buckets under the root. This lock can't guarantee atomic across all the delete keyname list.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I will make sure that all buckets I used are locked to ensure compatibility with ofs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will only print the address of the keNameList. You may want to expand the list and also protect the LOG.debug with a if LOG.isDebugEnabled().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above. we might only want to print the key that failed in the deletion instead of the whole list upon failures. print the whole list can be a debug or trace level log.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we define a new method called deleteObjects to be backward compatible?

@captainzmc captainzmc reopened this Jun 2, 2020
@captainzmc
Copy link
Member Author

Fixed issues with large lock granularity. PR can continue to be reviewed.

@captainzmc
Copy link
Member Author

Had rebased master and fixed conflict. PR can continue to be reviewed. cc @timmylicheng @ChenSammi @xiaoyuyao Thanks.

Copy link
Contributor

@ChenSammi ChenSammi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sugguest to change all deleteBatchKey liked namning to deleleKeys.


public static final String OZONE_FS_ITERATE_BATCH_SIZE =
"ozone.fs.iterate.batch-size";
public static final int OZONE_FS_ITERATE_BATCH_SIZE_DEFAULT = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest a bigger default size, say 100?

"ozone.s3.token.max.lifetime";
public static final String OZONE_S3_AUTHINFO_MAX_LIFETIME_KEY_DEFAULT = "3m";

public static final String OZONE_FS_ITERATE_BATCH_SIZE =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this property be used for batch rename too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, batch Rename will also use this property.

* @throws IOException
*/
void deleteKeys(String volumeName, String bucketName,
List<String> keyNameList)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indent is incorrect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with Xiaoyu, we should return a list of deleted key result on failure.

@codecov-commenter
Copy link

Codecov Report

❗ No coverage uploaded for pull request base (master@26d8375). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #814   +/-   ##
=========================================
  Coverage          ?   69.43%           
  Complexity        ?     9125           
=========================================
  Files             ?      962           
  Lines             ?    48287           
  Branches          ?     4694           
=========================================
  Hits              ?    33528           
  Misses            ?    12541           
  Partials          ?     2218           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 26d8375...8604295. Read the comment docs.

@captainzmc
Copy link
Member Author

Status updates:
Some changes have been made based on the new comments. This PR can continue to be reviewed.

@captainzmc
Copy link
Member Author

Thanks @xiaoyuyao for starting to review this PR again. I have made some modifications according to the suggestions above. In addition, Due to the conflict problem, I rebase the previous commit.
So, this PR can continue to be reviewed.

@xiaoyuyao
Copy link
Contributor

Thanks @captainzmc for the update. The latest change LGTM, +1.

@xiaoyuyao xiaoyuyao merged commit a5556d3 into apache:master Jun 23, 2020
errose28 added a commit to errose28/ozone that referenced this pull request Jun 25, 2020
* upstream/master: (56 commits)
  HDDS-3264. Fix TestCSMMetrics.java. (apache#1120)
  HDDS-3858. Remove support to start Ozone and HDFS datanodes in the same JVM (apache#1117)
  HDDS-3704. Update all the documentation to use ozonefs-hadoop2/3 instead of legacy/current (apache#1099)
  HDDS-3773. Add OMDBDefinition to define structure of om.db. (apache#1076)
  Revert "HDDS-3263. Fix TestCloseContainerByPipeline.java. (apache#1119)" (apache#1126)
  HDDS-3821. Disable Ozone SPNEGO should not fall back to hadoop.http.a… (apache#1101)
  HDDS-3819. OzoneManager#listVolumeByUser ignores userName parameter when ACL is enabled (apache#1087)
  HDDS-3779. Add csi interface documents to show how to use ozone csi (apache#1059)
  HDDS-3857. Datanode in compose/ozonescripts can't be started (apache#1116)
  HDDS-3430. Enable TestWatchForCommit test cases. (apache#1114)
  HDDS-3263. Fix TestCloseContainerByPipeline.java. (apache#1119)
  HDDS-3512. s3g multi-part-upload saved incorrect content using streaming (apache#1092)
  HDDS-3836. Modify ContainerPlacementPolicyFactory JavaDoc (apache#1097)
  HDDS-3780. Replace the imagePullPolicy from always to IfNotPresent (apache#1055)
  HDDS-3847. Change OMNotLeaderException logging to DEBUG (apache#1118)
  HDDS-3745. Improve OM and SCM performance with 64% by avoid collect datanode information to s3g (apache#1031)
  HDDS-3286. BasicOzoneFileSystem  support batchDelete. (apache#814)
  HDDS-3850. Update the admin document to let user know how to show the status of all rules. (apache#1109)
  HDDS-3848. Add ratis.thirdparty.version in main pom.xml (apache#1108)
  HDDS-3815. Avoid buffer copy in ContainerCommandRequestProto. (apache#1085)
  ...
@captainzmc captainzmc deleted the HDDS-3286-new branch July 17, 2020 12:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants