@virajjasani
Contributor

Description of PR

HttpFS should support getSnapshotDiffReportListing API for improved snapshot diff.

How was this patch tested?

Local testing. Added some tests in httpfs module.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 53s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 13m 4s Maven dependency ordering for branch
+1 💚 mvninstall 26m 2s trunk passed
+1 💚 compile 5m 55s trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 compile 5m 27s trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 checkstyle 1m 14s trunk passed
+1 💚 mvnsite 1m 31s trunk passed
+1 💚 javadoc 1m 7s trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 0s trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 spotbugs 3m 18s trunk passed
+1 💚 shadedclient 22m 45s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 22s Maven dependency ordering for patch
+1 💚 mvninstall 1m 10s the patch passed
+1 💚 compile 5m 47s the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javac 5m 47s the patch passed
+1 💚 compile 5m 12s the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 javac 5m 12s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 1m 8s /results-checkstyle-hadoop-hdfs-project.txt hadoop-hdfs-project: The patch generated 1 new + 412 unchanged - 0 fixed = 413 total (was 412)
+1 💚 mvnsite 1m 18s the patch passed
+1 💚 javadoc 0m 53s the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 0m 48s the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 spotbugs 3m 29s the patch passed
+1 💚 shadedclient 22m 24s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 16s hadoop-hdfs-client in the patch passed.
+1 💚 unit 7m 53s hadoop-hdfs-httpfs in the patch passed.
+1 💚 asflicense 0m 31s The patch does not generate ASF License warnings.
138m 16s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3730/1/artifact/out/Dockerfile
GITHUB PR #3730
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux 9bf904b7031b 4.15.0-153-generic #160-Ubuntu SMP Thu Jul 29 06:54:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / da2c6c2
Default Java Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3730/1/testReport/
Max. process+thread count 667 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs-httpfs U: hadoop-hdfs-project
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3730/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@virajjasani
Contributor Author

@iwasakims @jojochuang @aajisaka Could you please take a look?

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 51s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 12m 37s Maven dependency ordering for branch
+1 💚 mvninstall 24m 28s trunk passed
+1 💚 compile 5m 52s trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 compile 5m 21s trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 checkstyle 1m 16s trunk passed
+1 💚 mvnsite 1m 30s trunk passed
+1 💚 javadoc 1m 7s trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 0m 58s trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 spotbugs 3m 21s trunk passed
+1 💚 shadedclient 22m 52s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 24s Maven dependency ordering for patch
+1 💚 mvninstall 1m 12s the patch passed
+1 💚 compile 5m 39s the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javac 5m 39s the patch passed
+1 💚 compile 5m 17s the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 javac 5m 17s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 8s the patch passed
+1 💚 mvnsite 1m 17s the patch passed
+1 💚 javadoc 0m 54s the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 0m 50s the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 spotbugs 3m 29s the patch passed
+1 💚 shadedclient 22m 34s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 15s hadoop-hdfs-client in the patch passed.
+1 💚 unit 7m 42s hadoop-hdfs-httpfs in the patch passed.
+1 💚 asflicense 0m 30s The patch does not generate ASF License warnings.
135m 58s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3730/2/artifact/out/Dockerfile
GITHUB PR #3730
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux ccbff9a77bd8 4.15.0-153-generic #160-Ubuntu SMP Thu Jul 29 06:54:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 0b94a07
Default Java Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3730/2/testReport/
Max. process+thread count 670 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs-httpfs U: hadoop-hdfs-project
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3730/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@virajjasani
Contributor Author

Minor refactor in the latest commit.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 6s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 12m 35s Maven dependency ordering for branch
+1 💚 mvninstall 27m 33s trunk passed
+1 💚 compile 6m 43s trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 compile 5m 44s trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 checkstyle 1m 20s trunk passed
+1 💚 mvnsite 1m 43s trunk passed
+1 💚 javadoc 1m 14s trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 1m 2s trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 spotbugs 3m 35s trunk passed
+1 💚 shadedclient 23m 50s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 27s Maven dependency ordering for patch
+1 💚 mvninstall 1m 19s the patch passed
+1 💚 compile 6m 41s the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javac 6m 41s the patch passed
+1 💚 compile 5m 59s the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 javac 5m 59s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 13s the patch passed
+1 💚 mvnsite 1m 32s the patch passed
+1 💚 javadoc 1m 0s the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 0m 55s the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 spotbugs 3m 51s the patch passed
+1 💚 shadedclient 24m 1s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 31s hadoop-hdfs-client in the patch passed.
+1 💚 unit 8m 52s hadoop-hdfs-httpfs in the patch passed.
+1 💚 asflicense 0m 31s The patch does not generate ASF License warnings.
148m 1s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3730/3/artifact/out/Dockerfile
GITHUB PR #3730
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux aca8e46c383a 4.15.0-153-generic #160-Ubuntu SMP Thu Jul 29 06:54:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 9356fb1
Default Java Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3730/3/testReport/
Max. process+thread count 657 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs-httpfs U: hadoop-hdfs-project
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3730/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@virajjasani
Contributor Author

@ferhui @tasanuma could you please take a look?

@iwasakims
Member

@virajjasani
LGTM overall. I'm not confident that making the getSnapshotDiffReportListing of DistributedFileSystem and WebHdfsFileSystem public is right while I have no idea of alternative. Exposing DFSClient to HttpFS is not good. I would like to wait some time for comments from other reviewers.

HttpFSFileSystem might be able to leverage DFSUtilClient#getSnapshotDiffReport. It can be addressed in follow-up JIRAs.

@virajjasani
Contributor Author

Thanks for taking a look.

> I'm not confident that making the getSnapshotDiffReportListing of DistributedFileSystem and WebHdfsFileSystem public is right while I have no idea of alternative.

I made WebHdfsFileSystem#getSnapshotDiffReportListing public only for test usage, so let me mark it @VisibleForTesting and add a comment as well.

> Exposing DFSClient to HttpFS is not good.

On the other hand, almost every other HttpFS API uses DFSClient (through DFS), e.g. getSnapshotListing and getSnapshottableDirListing; even DFS#access is used directly by HttpFS. These APIs are accessed only if DFS is the underlying file system, and the same logic applies to this change. I am wondering what problem it can cause here. Are there any known issues, similar to WebHdfsFileSystem using NameNode APIs directly (i.e. NamenodeWebHdfsMethods should use the RPC ClientProtocol rather than accessing FileSystem APIs directly)?

> HttpFSFileSystem might be able to leverage DFSUtilClient#getSnapshotDiffReport. It can be addressed in follow-up JIRAs.

Sounds good. However, I have the same question as above: the majority of HttpFS APIs still use DFSClient directly if the underlying FileSystem is DFS. Does this cause any known issues?
Also, I took another look at the subclasses of FSOperations that HttpFS uses, and most of them call FileSystem APIs directly (for DFS-specific functionality, we check whether the underlying FS is DFS, and if not we throw UnsupportedOperationException).
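
The type check mentioned above can be sketched roughly as follows. This is a self-contained illustration with stand-in classes, not the actual HttpFS code: `GenericFs`, `SnapshotCapableFs`, `snapshotDiffListing`, and `execute` are hypothetical names chosen for the example, and the returned strings are placeholder data.

```java
import java.util.Arrays;
import java.util.List;

public class FsTypeCheckSketch {

  /** Stand-in for a generic file system (hypothetical; not Hadoop's FileSystem). */
  static class GenericFs { }

  /** Stand-in for a file system that supports snapshot diffs (hypothetical). */
  static class SnapshotCapableFs extends GenericFs {
    List<String> snapshotDiffListing(String path, String from, String to) {
      // Placeholder payload for illustration only.
      return Arrays.asList(path + ": " + from + " -> " + to);
    }
  }

  /**
   * Delegates to the snapshot-capable implementation when the underlying
   * file system is of that type; otherwise rejects the call.
   */
  static List<String> execute(GenericFs fs, String path, String from, String to) {
    if (fs instanceof SnapshotCapableFs) {
      return ((SnapshotCapableFs) fs).snapshotDiffListing(path, from, to);
    }
    throw new UnsupportedOperationException(
        "snapshot diff listing is not supported for "
            + fs.getClass().getSimpleName());
  }

  public static void main(String[] args) {
    // Supported file system: the call is delegated.
    System.out.println(execute(new SnapshotCapableFs(), "/dir", "s1", "s2"));

    // Unsupported file system: the call is rejected.
    try {
      execute(new GenericFs(), "/dir", "s1", "s2");
    } catch (UnsupportedOperationException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

The point of the pattern is that file-system-specific calls are reached only after the type check succeeds, so other file system implementations fail fast with a clear exception.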

Thanks @iwasakims, let me add @VisibleForTesting on WebHdfsFileSystem#getSnapshotDiffReportListing to be more specific about its usage policy.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 56s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 12m 29s Maven dependency ordering for branch
+1 💚 mvninstall 24m 34s trunk passed
+1 💚 compile 5m 53s trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 compile 5m 20s trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 checkstyle 1m 16s trunk passed
+1 💚 mvnsite 1m 30s trunk passed
+1 💚 javadoc 1m 8s trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 0m 58s trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 spotbugs 3m 21s trunk passed
+1 💚 shadedclient 23m 31s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 23s Maven dependency ordering for patch
+1 💚 mvninstall 1m 11s the patch passed
+1 💚 compile 5m 39s the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javac 5m 39s the patch passed
+1 💚 compile 5m 15s the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 javac 5m 15s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 9s the patch passed
+1 💚 mvnsite 1m 17s the patch passed
+1 💚 javadoc 0m 55s the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04
+1 💚 javadoc 0m 48s the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
+1 💚 spotbugs 3m 31s the patch passed
+1 💚 shadedclient 23m 7s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 16s hadoop-hdfs-client in the patch passed.
+1 💚 unit 7m 57s hadoop-hdfs-httpfs in the patch passed.
+1 💚 asflicense 0m 30s The patch does not generate ASF License warnings.
137m 46s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3730/4/artifact/out/Dockerfile
GITHUB PR #3730
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux ce9a113ec40c 4.15.0-143-generic #147-Ubuntu SMP Wed Apr 14 16:10:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 169e413
Default Java Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3730/4/testReport/
Max. process+thread count 667 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs-httpfs U: hadoop-hdfs-project
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3730/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

Contributor

@jojochuang jojochuang left a comment

BTW TestHttpFSFWithSWebhdfsFileSystem is currently failing consistently in trunk, so we really need to fix it. (How did we miss it before?)

Contributor

@jojochuang jojochuang left a comment

From a quick look the code looks good to me.

@virajjasani
Copy link
Contributor Author

> BTW TestHttpFSFWithSWebhdfsFileSystem is currently failing consistently in trunk. So we really need to fix it. (how did we miss it before?)

Thanks @jojochuang. These tests are passing as per QA results on this PR:
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3730/4/testReport/org.apache.hadoop.fs.http.client/
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3730/3/testReport/org.apache.hadoop.fs.http.client/

I am not 100% sure why the failure was missed, but my guess is that when a change touches only the WebHDFS module, QA does not run the HttpFS tests, so an existing HttpFS test failure can go unnoticed.

Member

@iwasakims iwasakims left a comment

Thanks for the update, @virajjasani. +1. Merging this.

@iwasakims iwasakims merged commit 0c62a51 into apache:trunk Dec 3, 2021
HarshitGupta11 pushed a commit to HarshitGupta11/hadoop that referenced this pull request Nov 28, 2022