@mukund-thakur
Contributor

Description of PR

Implements batching of requests during bulk delete operations, based on the configured page size.

How was this patch tested?

Added a new test and re-ran the existing integration tests.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@mukund-thakur
Contributor Author

I had to implement a Lists.partition() method in hadoop-common, because an enforcer rule fails if I try to import Guava.
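
For readers unfamiliar with the helper mentioned here: Guava's Lists.partition() splits a list into consecutive sublists of a fixed size. The following is a minimal sketch of that technique, not the code merged in this PR; the class name PartitionSketch and the method signature are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a partition helper; not the hadoop-common implementation.
final class PartitionSketch {
  private PartitionSketch() {
  }

  /**
   * Split a list into consecutive pages of at most pageSize elements.
   * Each page is a subList view of the source, so elements are not copied.
   */
  static <T> List<List<T>> partition(List<T> source, int pageSize) {
    if (source == null) {
      throw new NullPointerException("source");
    }
    if (pageSize <= 0) {
      throw new IllegalArgumentException("pageSize must be positive: " + pageSize);
    }
    List<List<T>> pages = new ArrayList<>();
    for (int start = 0; start < source.size(); start += pageSize) {
      // The last page may be shorter than pageSize.
      int end = Math.min(start + pageSize, source.size());
      pages.add(source.subList(start, end));
    }
    return pages;
  }
}
```

Because the pages are views, they remain valid only while the source list is not structurally modified.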

@mukund-thakur
Contributor Author

CC @steveloughran @mehakmeet

Contributor

@mehakmeet mehakmeet left a comment

looks good, some minor comments.

@mukund-thakur
Contributor Author

Why are there so many javadoc errors in this PR? I haven't changed MarkerTool.
[ERROR] /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4045@2/ubuntu-focal/src/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/tools/MarkerTool.java:150: warning: empty <p> tag
[ERROR] * <p></p>
[ERROR] ^
[ERROR] /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4045@2/ubuntu-focal/src/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/tools/MarkerTool.java:964: warning: no @param for source
[ERROR] public ScanArgsBuilder withSourceFS(final FileSystem source) {
[ERROR] ^
[ERROR] /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4045@2/ubuntu-focal/src/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/tools/MarkerTool.java:964: warning: no @return
[ERROR] public ScanArgsBuilder withSourceFS(final FileSystem source) {
[ERROR] ^
[ERROR] /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4045@2/ubuntu-focal/src/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/tools/MarkerTool.java:970: warning: no @param for p
[ERROR] public ScanArgsBuilder withPath(final Path p) {
[ERROR] ^

@apache apache deleted a comment from hadoop-yetus Mar 7, 2022
@steveloughran
Contributor

steveloughran commented Mar 7, 2022

Javadoc. Hmm. We've had so many problems across javadoc versions over tags: <p/> was blocked, an open <p> was an error, and <p></p> without content was an error. I don't think there is a good answer for a tool written in the 1990s which probably doesn't complain about the <blink> tag. Ignore it.

Contributor

@steveloughran steveloughran left a comment

Looks OK. Just make sure there's no list rebuilding when lists that are already in range (within the page size) are passed in.
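
One reading of this review comment is that a key list which already fits in a single page should be passed through as-is instead of being copied into a new list. The sketch below illustrates that idea under stated assumptions: the class InRangePagingSketch and the method pagesOf are hypothetical names, not identifiers from the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of a "no rebuild" fast path; not the merged code.
final class InRangePagingSketch {
  private InRangePagingSketch() {
  }

  static <T> List<List<T>> pagesOf(List<T> keys, int pageSize) {
    if (pageSize <= 0) {
      throw new IllegalArgumentException("pageSize must be positive: " + pageSize);
    }
    if (keys.isEmpty()) {
      // Nothing to delete, so no pages at all.
      return Collections.emptyList();
    }
    if (keys.size() <= pageSize) {
      // Already in range: return the caller's own list instance, with no copy.
      return Collections.singletonList(keys);
    }
    List<List<T>> pages = new ArrayList<>();
    for (int start = 0; start < keys.size(); start += pageSize) {
      int end = Math.min(start + pageSize, keys.size());
      // subList returns a view, so the oversized list is not rebuilt either.
      pages.add(keys.subList(start, end));
    }
    return pages;
  }
}
```

In the in-range case the single page is the same object the caller supplied, which a test can check with a reference comparison.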

@hadoop-yetus

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|------|-----------|---------|---------|---------|
| +0 🆗 | reexec | 0m 48s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 7 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +0 🆗 | mvndep | 12m 37s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 24m 28s | | trunk passed |
| +1 💚 | compile | 23m 16s | | trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 |
| +1 💚 | compile | 19m 53s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 💚 | checkstyle | 3m 47s | | trunk passed |
| +1 💚 | mvnsite | 2m 41s | | trunk passed |
| +1 💚 | javadoc | 1m 55s | | trunk passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 |
| +1 💚 | javadoc | 2m 28s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 💚 | spotbugs | 3m 48s | | trunk passed |
| +1 💚 | shadedclient | 21m 17s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 27s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 1m 33s | | the patch passed |
| +1 💚 | compile | 22m 4s | | the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 |
| +1 💚 | javac | 22m 4s | | the patch passed |
| +1 💚 | compile | 19m 50s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 💚 | javac | 19m 50s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 3m 38s | | the patch passed |
| +1 💚 | mvnsite | 2m 40s | | the patch passed |
| +1 💚 | javadoc | 1m 53s | | the patch passed with JDK Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 |
| +1 💚 | javadoc | 2m 35s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 💚 | spotbugs | 4m 6s | | the patch passed |
| +1 💚 | shadedclient | 21m 23s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 17m 47s | | hadoop-common in the patch passed. |
| +1 💚 | unit | 2m 28s | | hadoop-aws in the patch passed. |
| +1 💚 | asflicense | 1m 1s | | The patch does not generate ASF License warnings. |
| | | 222m 7s | | |

| Subsystem | Report/Notes |
|-----------|--------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4045/5/artifact/out/Dockerfile |
| GITHUB PR | #4045 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
| uname | Linux 917309fb88fa 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 64f9319 |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4045/5/testReport/ |
| Max. process+thread count | 1603 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4045/5/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

@apache apache deleted a comment from hadoop-yetus Mar 9, 2022
@apache apache deleted a comment from hadoop-yetus Mar 9, 2022
@apache apache deleted a comment from hadoop-yetus Mar 9, 2022
Contributor

@steveloughran steveloughran left a comment

+1
lovely

@mukund-thakur mukund-thakur merged commit 672e380 into apache:trunk Mar 11, 2022
mukund-thakur added a commit that referenced this pull request Mar 11, 2022
Multi object delete of size more than 1000 is not supported by S3 and 
fails with MalformedXML error. So implementing paging of requests to 
reduce the number of keys in a single request. Page size can be configured
using "fs.s3a.bulk.delete.page.size" 

 Contributed By: Mukund Thakur
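
The paging described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the S3A implementation: PagedDeleteSketch, deleteInPages and the Consumer callback standing in for a single multi-object delete request are all hypothetical, and the page size is passed in directly instead of being read from "fs.s3a.bulk.delete.page.size".

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of paged bulk deletion; not the S3A code from this PR.
final class PagedDeleteSketch {
  /** Per the commit message, S3 rejects multi-object deletes of more than 1000 keys. */
  static final int MAX_KEYS_PER_REQUEST = 1000;

  private PagedDeleteSketch() {
  }

  /**
   * Issue one delete request per page of keys.
   * @param keys all keys to delete
   * @param pageSize maximum number of keys per request
   * @param deleteRequest stand-in for the call that sends one bulk delete request
   * @return the number of requests issued
   */
  static int deleteInPages(List<String> keys, int pageSize,
      Consumer<List<String>> deleteRequest) {
    if (pageSize <= 0 || pageSize > MAX_KEYS_PER_REQUEST) {
      throw new IllegalArgumentException(
          "pageSize must be between 1 and " + MAX_KEYS_PER_REQUEST + ": " + pageSize);
    }
    int requests = 0;
    for (int start = 0; start < keys.size(); start += pageSize) {
      int end = Math.min(start + pageSize, keys.size());
      // Each request carries at most pageSize keys.
      deleteRequest.accept(keys.subList(start, end));
      requests++;
    }
    return requests;
  }
}
```

With 2500 keys and a page size of 1000, this sketch issues three requests carrying 1000, 1000 and 500 keys.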
@mukund-thakur
Contributor Author

merged to branch-3.3

HarshitGupta11 pushed a commit to HarshitGupta11/hadoop that referenced this pull request Nov 28, 2022
4 participants