[HUDI-1441] Fixing HoodieAvroUtils.rewriteRecord for nested record schema evolution #2982
Codecov Report
@@             Coverage Diff              @@
##             master    #2982      +/-   ##
============================================
+ Coverage     54.80%   70.88%   +16.07%
+ Complexity     3826      386     -3440
============================================
  Files           485       54      -431
  Lines         23424     2016    -21408
  Branches       2495      241     -2254
============================================
- Hits          12838     1429    -11409
+ Misses         9431      454     -8977
+ Partials       1155      133     -1022
Force-pushed: 272b5a7 to 92ca2e9
cc @codope could you chime in here?
@codope: can you review this patch, please?
codope left a comment:
Overall looks good. Left a few minor comments.
}

@Test
public void testRewriteToEvolvedNestedRecord() throws Exception {
Can we remove the throws Exception clause here and below?
  }
  return datum;
case UNION:
  Integer idx = (newSchema.getTypes().get(0).getType() == Schema.Type.NULL) ? 1 : 0;
Instead of the wrapper class Integer, can we use the primitive type int here?
  assertEquals("val2", newRecord.get("pii_col"));
  assertEquals(null, ((GenericRecord)newRecord.get("color_rec")).get("color_name"));
} catch (Exception e) {
  e.printStackTrace();
Let's remove the printStackTrace calls here and below.
Is this still relevant? @jonvex could you check this too?
The revamp PR is #11893.
Redo of #2309
What is the purpose of the pull request
If a schema contains nested records, the HoodieAvroUtils rewrite() function copies the nested record fields as-is from the old record to the new record. If the fields of a nested record have evolved, this results in a SchemaCompatibilityException or an ArrayIndexOutOfBoundsException.
Brief change log
Modify HoodieAvroUtils rewrite() so that it also rewrites the evolved fields of nested records, with new or evolved fields initialized to null.
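The idea behind this change can be illustrated with a minimal sketch. It is not the code from this patch and does not use the Avro API: RecordSchema, NestedRewriteSketch and rewrite() below are hypothetical stand-ins, with a record modelled as a Map and a schema as an ordered list of field names. It only shows the general technique of walking the new schema recursively so that nested fields missing from the old record end up as null instead of being copied blindly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified, hypothetical model; the real patch operates on Avro Schema and GenericRecord.
final class NestedRewriteSketch {

    /** Stand-in for a record schema: field name -> nested record schema, or null for a leaf field. */
    static final class RecordSchema {
        final Map<String, RecordSchema> fields = new LinkedHashMap<>();

        RecordSchema leaf(String name) {
            fields.put(name, null);
            return this;
        }

        RecordSchema nested(String name, RecordSchema schema) {
            fields.put(name, schema);
            return this;
        }
    }

    /**
     * Builds a new record that has exactly the fields of newSchema.
     * Fields absent from oldRecord become null; fields absent from newSchema are dropped;
     * nested records are rewritten recursively instead of being copied as-is.
     */
    static Map<String, Object> rewrite(Map<String, Object> oldRecord, RecordSchema newSchema) {
        Map<String, Object> newRecord = new LinkedHashMap<>();
        for (Map.Entry<String, RecordSchema> field : newSchema.fields.entrySet()) {
            String name = field.getKey();
            RecordSchema nestedSchema = field.getValue();
            Object oldValue = oldRecord.get(name); // null when the field is new

            if (nestedSchema != null && oldValue instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> oldNested = (Map<String, Object>) oldValue;
                // Recurse so that evolved fields inside the nested record are handled too.
                newRecord.put(name, rewrite(oldNested, nestedSchema));
            } else {
                newRecord.put(name, oldValue);
            }
        }
        return newRecord;
    }
}
```

In this model, rewriting an old record whose color_rec lacks color_name against a schema that includes it yields a color_rec with color_name set to null, which mirrors the assertion in testRewriteToEvolvedNestedRecord(). Rewriting against a schema with fewer fields simply drops the extra ones.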
Verify this pull request
This pull request is already covered by existing tests, such as TestHoodieAvroUtils.
It also adds testRewriteToEvolvedNestedRecord() and testRewriteToShorterRecord().
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.