[HUDI-3130] Fixing Hive getSchema for RT tables addressing different partitions having different schemas #4468
@codope: The patch is ready for review. I have pushed commits to address the feedback.
codope left a comment:
Looks good. One minor comment. Also, it would be good to add a test for a multi-partitioned table with schema evolution in one of the partitions.
Sure, I will add a test for schema evolution.
@codope: I have added a test for schema evolution across partitions. Please take a look.
   * @throws Exception
   */
  public Schema getTableAvroSchema() throws Exception {
    return getTableAvroSchema(metaClient.getTableConfig().populateMetaFields());
@aditiwari01 @codope folks, can you please elaborate why this has been changed? Why are we assuming that Table's schema should have meta-fields?
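For context on the question above, here is a minimal sketch of what including meta-fields in a table schema amounts to. It is not Hudi code: the class `MetaFieldSketch` and the method `tableFields` are hypothetical, schemas are reduced to plain lists of column names, and the boolean parameter stands in for the `populateMetaFields()` flag that the no-argument overload in the diff passes along. The five column names are Hudi's standard record-level meta columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; real Hudi schemas are Avro Schema objects.
class MetaFieldSketch {
    // Hudi's standard record-level meta columns.
    static final List<String> META_FIELDS = Arrays.asList(
        "_hoodie_commit_time",
        "_hoodie_commit_seqno",
        "_hoodie_record_key",
        "_hoodie_partition_path",
        "_hoodie_file_name");

    // Returns the column list, with the meta columns in front when requested.
    static List<String> tableFields(List<String> dataFields, boolean includeMetaFields) {
        List<String> out = new ArrayList<>();
        if (includeMetaFields) {
            out.addAll(META_FIELDS);
        }
        out.addAll(dataFields);
        return out;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("id", "name");
        System.out.println(tableFields(data, true));   // meta columns, then data columns
        System.out.println(tableFields(data, false));  // data columns only
    }
}
```

In this sketch, whether callers see the meta columns depends entirely on the flag, so a caller that never sets it explicitly gets whatever the table configuration says.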
…partitions having different schemas (apache#4468)
* Fixing Hive getSchema for RT tables
* Addressing feedback
* temp diff
* fixing tests after spark datasource read support for metadata table is merged to master
* Adding multi-partition schema evolution tests to HoodieRealTimeRecordReader
Co-authored-by: Aditya Tiwari <[email protected]>
Co-authored-by: sivabalan <[email protected]>
What is the purpose of the pull request
Fixing Hive getSchema for RT tables
Refer to issue #2802 for more details.
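The problem named in the title can be pictured with a minimal sketch. It is not Hudi code and does not reproduce the patch: the class `PartitionSchemaSketch` and the method `readWithSchema` are hypothetical, records are plain maps, and schemas are lists of field names. It assumes one partition was written before an `email` column was added and another after. Reading with the older partition's schema silently loses the new column, while reading with a single table-level schema yields every column and fills `null` where an older record has no value.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; real readers resolve Avro schemas.
class PartitionSchemaSketch {
    // Projects a stored record onto the reader's field list.
    // Fields missing from the stored record come back as null.
    static Map<String, Object> readWithSchema(Map<String, Object> stored,
                                              List<String> readerFields) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (String field : readerFields) {
            row.put(field, stored.get(field));
        }
        return row;
    }

    public static void main(String[] args) {
        // Assumed schemas: the older partition lacks the "email" column.
        List<String> oldPartitionFields = Arrays.asList("id", "name");
        List<String> tableFields = Arrays.asList("id", "name", "email");

        Map<String, Object> oldRecord = new LinkedHashMap<>();
        oldRecord.put("id", 1);
        oldRecord.put("name", "a");

        Map<String, Object> newRecord = new LinkedHashMap<>();
        newRecord.put("id", 2);
        newRecord.put("name", "b");
        newRecord.put("email", "b@example.com");

        // Reading the newer record with the older schema drops "email".
        System.out.println(readWithSchema(newRecord, oldPartitionFields));
        // Reading the older record with the table-level schema keeps all columns.
        System.out.println(readWithSchema(oldRecord, tableFields));
    }
}
```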
Verify this pull request:
Unit tests are still to be explored. I have tested this on a local setup and on a cluster with spark-hive.
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.