[HUDI-8235] Adding support for EPOCHMICROSECONDS in TimestampBasedAvroKeyGenerator #7913
Conversation
bvaradar
@sydneybeal: Thanks a lot for contributing to Hudi. It looks like the PR validation is failing. Can you take a look at this? Also, can you add a unit test for this change?
@sydneybeal: Pinging to see if you can address the comments?
@bvaradar Hi there, I just realized I made this PR using my old GitHub account, so I've missed these notifications. I will look at addressing this test case as soon as I have a chance.
yihua
@sydneybeal Thanks for your contribution. Since it's been a while, I've rebased the PR and added a couple of tests, along with a docs update.
balaji-varadarajan-ai
@yihua LGTM. Does the existing behavior treat these types as strings? In that case, this patch will change behavior (we need a release description about this).
Thanks. There is no behavior change to existing types;
Approved by Balaji above through a different account
…oKeyGenerator (apache#7913) Co-authored-by: Sydney Beal <[email protected]> Co-authored-by: Y Ethan Guo <[email protected]>
Change Logs
Adding support for EPOCHMICROSECONDS in TimestampBasedAvroKeyGenerator
Impact
Users can pass the value EPOCHMICROSECONDS to the config hoodie.deltastreamer.keygen.timebased.timestamp.type; the key generator field is then parsed into the correct partitions, i.e. yyyy/mm/dd.
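To illustrate what "parsed into the correct partitions" means for a microsecond value, here is a minimal standalone sketch using only java.time. It is not the code from this PR or from TimestampBasedAvroKeyGenerator; the class name EpochMicrosPartitionSketch, the method partitionPath, the UTC time zone and the yyyy/MM/dd pattern (Java's spelling of yyyy/mm/dd, since lowercase mm means minutes) are all assumptions made for the example.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Hudi: shows the unit conversion only.
public class EpochMicrosPartitionSketch {
    // Assumed output format and time zone; in Hudi these come from key generator configs.
    private static final DateTimeFormatter PARTITION_FORMAT =
        DateTimeFormatter.ofPattern("yyyy/MM/dd").withZone(ZoneOffset.UTC);

    // Converts microseconds since the Unix epoch into a date-based partition path.
    static String partitionPath(long epochMicros) {
        // floorDiv/floorMod keep pre-1970 (negative) values on the correct day.
        long seconds = Math.floorDiv(epochMicros, 1_000_000L);
        long nanos = Math.floorMod(epochMicros, 1_000_000L) * 1_000L;
        return PARTITION_FORMAT.format(Instant.ofEpochSecond(seconds, nanos));
    }

    public static void main(String[] args) {
        // 1676246400 seconds after the epoch is 2023-02-13T00:00:00Z.
        System.out.println(partitionPath(1_676_246_400_000_000L));
    }
}
```

If the same number were read as milliseconds instead, the resulting date would land tens of thousands of years in the future, which is the kind of mis-partitioning a dedicated microsecond type avoids.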
Risk level (write none, low, medium or high below)
Low
For testing, I built a jar locally from hudi-utilities-bundle_2.12-0.12.2 with my code change and substituted it in my Spark job. It worked as planned: microsecond timestamps are now partitioned as expected. The setup was a Hudi Deltastreamer job running on Google Dataproc, reading from a Debezium Avro Kafka topic on Confluent Cloud and writing Hudi outputs to S3.
Documentation Update
Describe any necessary documentation update if there is any new feature, config, or user-facing change
This would require updating the keygen timestamp.type entry in the documentation to include EPOCHMICROSECONDS.
Contributor's checklist