@zhedoubushishi
Contributor

@zhedoubushishi zhedoubushishi commented Jul 11, 2022

What is the purpose of the pull request

This PR aims to fix Hudi Presto query failures. We saw java.lang.NoSuchFieldError: LOG during Presto queries because the HoodieCopyOnWriteTableInputFormat class inherits the field LOG from its parent class FileInputFormat, which is a Hadoop class.

My understanding is that at compile time the reference to this field is bound to FileInputFormat.class. At runtime, however, Presto does not have all the Hadoop classes on its classpath; it uses its own Hadoop dependency, e.g. hadoop-apache2:jar:2.7.4-9. I checked that hadoop-apache2 does not have the FileInputFormat class shaded, which causes this runtime error.
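To illustrate the mechanism, here is a minimal, self-contained sketch. It is not Hudi or Hadoop code: `ParentFormat`, `InheritingFormat`, and `SelfContainedFormat` are hypothetical stand-ins, and `java.util.logging.Logger` is used only to keep the example dependency-free. It shows that a subclass using an inherited static `LOG` does not own that field, so the reference has to be resolved against whatever parent class is on the runtime classpath, while a subclass that declares its own `LOG` has no such dependency. Declaring a local logger is one common way to avoid this kind of error; it is shown here as an assumption, not as a description of the exact change in this PR.

```java
import java.util.logging.Logger;

public class InheritedFieldDemo {

    // Hypothetical stand-in for a third-party parent class such as FileInputFormat.
    static class ParentFormat {
        public static final Logger LOG = Logger.getLogger("parent-format");
    }

    // Uses the inherited LOG. The JVM resolves this field reference at run time
    // by searching this class and then its superclasses; if the parent class on
    // the runtime classpath has no matching LOG field, resolution fails with
    // NoSuchFieldError.
    static class InheritingFormat extends ParentFormat {
        static String loggerName() {
            return LOG.getName();
        }
    }

    // Declares its own LOG, which hides the parent's field, so the reference
    // never depends on what the parent class looks like at run time.
    static class SelfContainedFormat extends ParentFormat {
        private static final Logger LOG = Logger.getLogger("self-contained-format");

        static String loggerName() {
            return LOG.getName();
        }
    }

    // Returns true only if the class itself declares a field named LOG
    // (inherited fields are not reported by getDeclaredField).
    static boolean declaresLog(Class<?> type) {
        try {
            type.getDeclaredField("LOG");
            return true;
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("InheritingFormat declares LOG: "
                + declaresLog(InheritingFormat.class)
                + ", uses logger " + InheritingFormat.loggerName());
        System.out.println("SelfContainedFormat declares LOG: "
                + declaresLog(SelfContainedFormat.class)
                + ", uses logger " + SelfContainedFormat.loggerName());
    }
}
```

In a single compilation unit both variants run fine; the failure only appears when the parent class seen at run time differs from the one seen at compile time, which is the situation described above.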

Brief change log

(for example:)

  • Modify AnnotationLocation checkstyle rule in checkstyle.xml

Verify this pull request

This pull request is a trivial rework / code cleanup without any test coverage.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

Contributor

@KnightChess KnightChess left a comment

I have a question: if the FileInputFormat class is not available, wouldn't loading this class cause a ClassNotFoundException?

@zhedoubushishi
Contributor Author

I have a question: if the FileInputFormat class is not available, wouldn't loading this class cause a ClassNotFoundException?

Right. During the Presto query, it throws something like:

java.lang.NoSuchFieldError: LOG
at org.apache.hudi.hadoop.HoodieCopyOnWriteTableInputFormat.makeExternalFileSplit(HoodieCopyOnWriteTableInputFormat.java:199)
at org.apache.hudi.hadoop.HoodieCopyOnWriteTableInputFormat.makeSplit(HoodieCopyOnWriteTableInputFormat.java:100)
at org.apache.hudi.hadoop.realtime.HoodieMergeOnReadTableInputFormat.doMakeSplitForRealtimePath(HoodieMergeOnReadTableInputFormat.java:266)
at org.apache.hudi.hadoop.realtime.HoodieMergeOnReadTableInputFormat.makeSplit(HoodieMergeOnReadTableInputFormat.java:211)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:345)
at org.apache.hudi.hadoop.realtime.HoodieMergeOnReadTableInputFormat.getSplits(HoodieMergeOnReadTableInputFormat.java:79)
at org.apache.hudi.hadoop.HoodieParquetInputFormatBase.getSplits(HoodieParquetInputFormatBase.java:68)
at com.facebook.presto.hive.StoragePartitionLoader.loadPartition(StoragePartitionLoader.java:278)
at com.facebook.presto.hive.DelegatingPartitionLoader.loadPartition(DelegatingPartitionLoader.java:81)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:224)
at com.facebook.presto.hive.BackgroundHiveSplitLoader.access$700(BackgroundHiveSplitLoader.java:50)
at com.facebook.presto.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:153)

@zhedoubushishi
Contributor Author

@hudi-bot run azure

@xushiyan added the priority:medium (Moderate impact; usability gaps) and area:query-engine (Query engine integrations) labels on Jul 12, 2022

@zhedoubushishi
Contributor Author

Merged in #6161.
