[HUDI-4614] fix primary key extract of delete_record when complexKeyGen configured and ChangeLogDisabled #6385
@garyli1019 @danny0405 could you help review this? Thanks.

@hudi-bot run azure
  }

  return Arrays.stream(fieldKV).map(kv -> {
    final String[] kvArray = kv.split(":");
So why does a simple key use the form `name:val` instead of just `val`?
options.put(FlinkOptions.RECORD_KEY_FIELD.key(), "uuid");
options.put(FlinkOptions.PARTITION_PATH_FIELD.key(), "partition,name");
options.put(FlinkOptions.KEYGEN_TYPE.key(), KeyGeneratorType.COMPLEX.name());
If the primary key is "uuid" and the partition path is "partition,name", Flink SQL uses the COMPLEX key generator, so the record key is stored as "uuid:danny". I've tested this.
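To make the two key forms in this thread concrete, here is a minimal sketch of how a simple key and a complex key could be rendered as strings. The class and method names are hypothetical and this is not Hudi's key generator code. It only mimics the `name:val` layout described above, assuming multiple fields are joined with commas.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration only; not part of Hudi.
final class RecordKeyFormatSketch {

  // Simple form: the record key is just the field value.
  static String simpleKey(String value) {
    return value;
  }

  // Complex form (assumed layout): each key field is rendered as name:value,
  // and the pairs are joined with commas in field order.
  static String complexKey(Map<String, String> keyFields) {
    return keyFields.entrySet().stream()
        .map(e -> e.getKey() + ":" + e.getValue())
        .collect(Collectors.joining(","));
  }

  public static void main(String[] args) {
    Map<String, String> keyFields = new LinkedHashMap<>();
    keyFields.put("uuid", "danny");
    System.out.println(simpleKey("danny"));
    System.out.println(complexKey(keyFields));
  }
}
```

With a single key field the complex form still carries the field name, which is why the stored key differs from the raw value.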
@danny0405 I've explained it above; looking forward to your reply.
So why do we configure a `COMPLEX` key generator when the key here is just a simple one?
Because Flink SQL defaults to the COMPLEX key generator whenever this condition is true: `boolean complexHoodieKey = pks.length > 1 || partitions.length > 1;`
https://github.com/apache/hudi/blob/master/hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java#L239
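As a rough sketch of that condition, the check can be isolated as follows. The class name, method name, and the comma-splitting of the configured field lists are assumptions made for illustration. Only the boolean expression itself comes from the comment above.

```java
// Hypothetical illustration only; not the HoodieTableFactory code.
final class KeyGenChoiceSketch {

  // Returns true when the complex key form would be chosen: more than one
  // record key field, or more than one partition path field.
  static boolean isComplexKey(String recordKeyFields, String partitionPathFields) {
    String[] pks = recordKeyFields.split(",");
    String[] partitions = partitionPathFields.split(",");
    return pks.length > 1 || partitions.length > 1;
  }

  public static void main(String[] args) {
    // Single primary key, two partition fields: the case discussed in this thread.
    System.out.println(isComplexKey("uuid", "partition,name"));
    // Single primary key, single partition field.
    System.out.println(isComplexKey("uuid", "partition"));
  }
}
```

So a table with one primary key can still end up with `name:val` record keys, purely because it has more than one partition field.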
Yeah, I have merged #6539, so this PR can be closed.
@danny0405 #6539 has a small problem: with a single primary key and the simple key generator, we store 'danny', not 'id:danny', so accessing kvArray[1] fails because splitting on ":" yields only one element.
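One way to tolerate both stored forms is sketched below. This is an assumption-laden illustration, not the fix that was merged. The class, the method, and the `complexFormat` flag are hypothetical, and the sketch assumes key values contain no commas.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the code from this PR or from #6539.
final class RecordKeyParseSketch {

  // Extracts the key values from a stored record key.
  // complexFormat == false: the key is the raw value, e.g. "danny".
  // complexFormat == true: the key looks like "name:val[,name:val...]".
  static String[] extractValues(String recordKey, boolean complexFormat) {
    if (!complexFormat) {
      return new String[] {recordKey};
    }
    return Arrays.stream(recordKey.split(","))
        .map(kv -> {
          // Cut at the first ':' only, so values that contain ':' stay intact,
          // and fall back to the whole token when no ':' is present.
          int idx = kv.indexOf(':');
          return idx < 0 ? kv : kv.substring(idx + 1);
        })
        .toArray(String[]::new);
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(extractValues("danny", false)));
    System.out.println(Arrays.toString(extractValues("uuid:danny", true)));
  }
}
```

Using `indexOf` instead of indexing into the result of `split(":")` avoids the out-of-bounds access on keys that have no field-name prefix.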
Thanks, we can rebase the PR and fix it.
@danny0405, I've rebased this PR onto master to fix the issue.
By the way, could you review #6429? Thanks.
I have fixed it in master, so this PR can be closed. I will review #6429 next.
…en configured and ChangeLogDisabled
c879ed6 to 1a97525
Change Logs
Fix primary key extraction for delete records when complexKeyGen is configured and changelog is disabled (ChangeLogDisabled).
In this case, the primary key in RowData looks like "uuid:id1", not "id1".
Impact
When complexKeyGen is configured and changelog is disabled, the primary key output is incorrect.
Contributor's checklist