Merged
2 changes: 2 additions & 0 deletions hudi-common/pom.xml
@@ -144,6 +144,7 @@
<artifactId>*</artifactId>
</exclusion>
</exclusions>
<scope>provided</scope>
Member: I guess this should be ok. @n3nash agree?

Member: I think so. We explicitly whitelist in bundles anyway, so it does not matter.
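
For context, a minimal sketch of the whitelisting mentioned above, assuming the bundle modules use the maven-shade-plugin with an explicit artifact whitelist (the include list here is illustrative, not copied from this PR):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <artifactSet>
      <!-- Only artifacts listed here are shaded into the bundle jar, so a
           stray compile-scope dependency in hudi-common would not leak
           into the bundles regardless of its scope. -->
      <includes>
        <include>org.apache.hudi:hudi-common</include>
        <include>org.apache.parquet:parquet-avro</include>
      </includes>
    </artifactSet>
  </configuration>
</plugin>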

</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
@@ -154,6 +155,7 @@
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<scope>provided</scope>
Member: Wondering, if the parent pom's scope is already provided, isn't that inherited?

Member: +1, I think we can remove all the unnecessary Hadoop dependencies from the sub-modules since they already exist in the parent pom. Do you mind removing those in this PR, @Zhangchaoming?

Contributor (Author): @garyli1019 My pleasure.
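
As a sketch of the inheritance being discussed (assuming the root pom manages the Hadoop artifacts; the version property is illustrative): a scope declared in the parent's dependencyManagement is inherited by any sub-module that declares the dependency without an explicit scope, which is what would make the per-module <scope>provided</scope> entries redundant.

<!-- parent pom.xml -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
      <version>${hadoop.version}</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<!-- sub-module pom.xml: version and scope are inherited from the parent,
     so no <scope> element is needed here -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-hdfs</artifactId>
</dependency>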

</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
FSUtils.java
@@ -80,9 +80,6 @@ public class FSUtils {
private static final PathFilter ALLOW_ALL_FILTER = file -> true;

public static Configuration prepareHadoopConf(Configuration conf) {
conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
Member: This is some old, old code. Have you ensured that this can run on a non-HDFS filesystem now? (HDFS is already tested in unit and integration tests.)

Contributor (Author): @vinothchandar On my machine, the filesystem implementation is org.apache.hadoop.fs.viewfs.ViewFileSystem. IMO, following the user's actual configuration would be better.

Member: Just to be clear, this code basically says: if you use the hdfs: or file: schemes, these classes are used. Of course we honor the user's configuration; otherwise we would not be able to write to S3 or GCS :)

Contributor (Author): My point is that this hard-coded setting ignores the user's configuration.
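
To illustrate the point: Hadoop resolves the FileSystem implementation for a scheme from the fs.<scheme>.impl key, so the hard-coded conf.set(...) calls removed here would silently shadow a user entry such as the following (a hypothetical core-site.xml, not part of this PR):

<!-- core-site.xml (hypothetical user configuration) -->
<property>
  <name>fs.hdfs.impl</name>
  <!-- e.g. a user mapping the hdfs: scheme to a federated view -->
  <value>org.apache.hadoop.fs.viewfs.ViewFileSystem</value>
</property>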

conf.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());

// look for all properties, prefixed to be picked up
for (Entry<String, String> prop : System.getenv().entrySet()) {
if (prop.getKey().startsWith(HOODIE_ENV_PROPS_PREFIX)) {
@@ -607,8 +604,8 @@ public static HoodieWrapperFileSystem getFs(String path, SerializableConfigurati
* Helper to filter out paths under metadata folder when running fs.globStatus.
* @param fs File System
* @param globPath Glob Path
* @return
* @throws IOException
* @return the file status list for globPath, excluding the meta folder
* @throws IOException if listing the path fails
*/
public static List<FileStatus> getGlobStatusExcludingMetaFolder(FileSystem fs, Path globPath) throws IOException {
FileStatus[] statuses = fs.globStatus(globPath);
43 changes: 4 additions & 39 deletions hudi-flink/pom.xml
@@ -153,39 +153,11 @@
<scope>provided</scope>
</dependency>

<!-- Hadoop -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<scope>compile</scope>
<exclusions>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<scope>compile</scope>
<exclusions>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- Parquet -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-auth</artifactId>
<scope>compile</scope>
<exclusions>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</exclusion>
</exclusions>
<groupId>org.apache.parquet</groupId>
<artifactId>parquet-avro</artifactId>
<scope>test</scope>
</dependency>

<!-- Avro -->
@@ -197,13 +169,6 @@
<scope>compile</scope>
</dependency>

<!-- Parquet -->
<dependency>
<groupId>org.apache.parquet</groupId>
<artifactId>parquet-avro</artifactId>
<scope>compile</scope>
</dependency>

<!-- Hadoop -->
<dependency>
<groupId>org.apache.hadoop</groupId>