@sydneybeal sydneybeal commented Feb 9, 2023

Change Logs

Adding support for EPOCHMICROSECONDS in TimestampBasedAvroKeyGenerator

Impact

Users can set the config hoodie.deltastreamer.keygen.timebased.timestamp.type to EPOCHMICROSECONDS. The key generator field is then interpreted as epoch microseconds and parsed into the correct partitions, i.e. yyyy/mm/dd.
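
To illustrate the intended effect, here is a minimal standalone sketch of turning an epoch-microsecond value into a yyyy/MM/dd partition path. It is not the code in TimestampBasedAvroKeyGenerator; the class name, the method name, and the choice of UTC are assumptions made for the example.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration only; not part of Hudi.
class EpochMicrosPartitionSketch {
    // Assumption: partitions are derived in UTC with a yyyy/MM/dd layout.
    private static final DateTimeFormatter PARTITION_FORMAT =
        DateTimeFormatter.ofPattern("yyyy/MM/dd").withZone(ZoneOffset.UTC);

    static String toPartitionPath(long epochMicros) {
        // Split into whole seconds plus a nanosecond remainder.
        // floorDiv/floorMod keep the result correct for pre-1970 values.
        long seconds = Math.floorDiv(epochMicros, 1_000_000L);
        long nanos = Math.floorMod(epochMicros, 1_000_000L) * 1_000L;
        return PARTITION_FORMAT.format(Instant.ofEpochSecond(seconds, nanos));
    }

    public static void main(String[] args) {
        // 1675900800000000 microseconds is 2023-02-09T00:00:00Z.
        System.out.println(toPartitionPath(1_675_900_800_000_000L)); // prints 2023/02/09
    }
}
```

If such a value were read as milliseconds instead, it would resolve to a date tens of thousands of years in the future, which is why a dedicated timestamp type is useful for sources that emit microsecond timestamps.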

Risk level (write none, low medium or high below)

Low

For testing, I built a jar locally from hudi-utilities-bundle_2.12-0.12.2 with my code change and substituted it in my Spark job. It worked as planned: microsecond timestamps are now partitioned as expected. The setup was a Hudi Deltastreamer job running on Google Dataproc, reading from a Debezium Avro Kafka topic on Confluent Cloud and writing Hudi outputs to S3.

Documentation Update

Describe any necessary documentation update if there is any new feature, config, or user-facing change

This would require updating the description of the keygen timestamp.type config in the documentation to include EPOCHMICROSECONDS.

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • [] Adequate tests were added if applicable
  • [] CI passed

@bvaradar bvaradar self-assigned this Feb 22, 2023
@bvaradar bvaradar marked this pull request as ready for review February 22, 2023 09:23

@bvaradar bvaradar left a comment

@sydneybeal: Thanks a lot for contributing to Hudi. It looks like the PR validation is failing. Can you take a look at this? Also, can you add a unit test for this change?

@bvaradar

@sydneybeal: Pinging to see if you can address the comments.

@sydneyhoran

@bvaradar Hi there, I just realized I made this PR using my old GitHub account, so I've missed these notifications. I will look at addressing this test case as soon as I have a chance.

@github-actions github-actions bot added the size:XS PR with lines of changes in <= 10 label Feb 26, 2024
@yihua yihua changed the title Adding support for EPOCHMICROSECONDS in TimestampBasedAvroKeyGenerator [HUDI-8235] Adding support for EPOCHMICROSECONDS in TimestampBasedAvroKeyGenerator Sep 22, 2024

@yihua yihua left a comment

@sydneybeal Thanks for your contribution. Since it's been a while, I've rebased the PR and added a couple of tests, along with a docs update.

@github-actions github-actions bot added size:S PR with lines of changes in (10, 100] and removed size:XS PR with lines of changes in <= 10 labels Sep 22, 2024

@balaji-varadarajan-ai balaji-varadarajan-ai left a comment

@yihua LGTM. Does the existing behavior treat these types as strings? If so, this patch changes behavior, and the release description needs to mention it.

@apache apache deleted a comment from hudi-bot Sep 22, 2024
yihua commented Sep 22, 2024

> @yihua LGTM. Does the existing behavior treat these types as strings? If so, this patch changes behavior, and the release description needs to mention it.

Thanks. There is no behavior change to existing types; EPOCHMICROSECONDS is a newly supported timestamp type, like EPOCHMILLISECONDS, in the Timestamp Key Generator. The config docs are updated, so when we cut the release docs, the configuration page will show this newly supported type.
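
As a rough illustration of why adding a type this way leaves existing types untouched, the sketch below normalizes each timestamp type to epoch milliseconds in its own branch. The enum here is a simplified, hypothetical stand-in (the real Hudi enum and its handling may differ), and the method name is an assumption.

```java
// Hypothetical, simplified model for illustration only; not Hudi's actual code.
class TimestampTypeSketch {
    // Assumption: a reduced set of types; the real enum has more members.
    enum TimestampType { UNIX_TIMESTAMP, EPOCHMILLISECONDS, EPOCHMICROSECONDS }

    // Normalize a raw numeric value to epoch milliseconds based on its declared type.
    static long toEpochMillis(TimestampType type, long value) {
        switch (type) {
            case UNIX_TIMESTAMP:
                // Seconds since the epoch.
                return Math.multiplyExact(value, 1_000L);
            case EPOCHMILLISECONDS:
                // Already in milliseconds; returned unchanged.
                return value;
            case EPOCHMICROSECONDS:
                // New branch: drop sub-millisecond precision.
                return Math.floorDiv(value, 1_000L);
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }
}
```

Because the new type gets its own branch, values declared as UNIX_TIMESTAMP or EPOCHMILLISECONDS follow exactly the same path as before.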

@yihua yihua dismissed bvaradar’s stale review September 22, 2024 22:29

Approved by Balaji above through a different account

@yihua yihua merged commit 5ec6bcf into apache:master Sep 22, 2024
linliu-code pushed a commit to linliu-code/hudi that referenced this pull request Dec 30, 2025
…oKeyGenerator (apache#7913)

Co-authored-by: Sydney Beal <[email protected]>
Co-authored-by: Y Ethan Guo <[email protected]>