Skip to content

Conversation

@xiaochen-zhou
Copy link
Contributor

Change Logs

HoodieTableConfig has a config to let users to override timezone for commit time generation. but looks like there are some places where we use current system's zone instead of honoring the config.

Impact

Fix the time based on time zone set in table config.

Risk level (write none, low medium or high below)

low

Documentation Update

none

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

return d.toInstant().atZone(ZoneId.systemDefault()).toLocalDateTime();
return d.toInstant().atZone(getZoneId()).toLocalDateTime();
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we supplement some UTs for parseDateFromInstantTime and convertDateToTemporalAccessor ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we supplement some UTs for parseDateFromInstantTime and convertDateToTemporalAccessor ?

I would be happy to do it.

Copy link
Contributor Author

@xiaochen-zhou xiaochen-zhou May 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convertDateToTemporalAccessor

I added two UTs: testFormatDateWithCommitTimeZone and testInstantDateParsingWithCommitTimeZone, testInstantDateParsingWithCommitTimeZone is used to test the correctness of the HoodieInstantTimeGenerator#convertDateToTemporalAccessor()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we supplement some UTs for parseDateFromInstantTime and convertDateToTemporalAccessor ?

And in the TestHoodieActiveTimeline.java, there are many UTs related to DateParsing, such as:

  • testInvalidInstantDateParsing
  • testMillisGranularityInstantDateParsing
    etc.

return Date.from(dt.atZone(ZoneId.systemDefault()).toInstant());
Instant instant = dt.atZone(getZoneId()).toInstant();
TimeZone.setDefault(TimeZone.getTimeZone(getZoneId()));
return Date.from(instant);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is risky to set up timezone per JVM process: TimeZone.setDefault(, this could impact all the threads in the JVM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is risky to set up timezone per JVM process: TimeZone.setDefault(, this could impact all the threads in the JVM.

done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is risky to set up timezone per JVM process: TimeZone.setDefault(, this could impact all the threads in the JVM.

One of the tests failed, I will try to find the reason and change the code.

[ERROR] Failures: 
[ERROR]   TestHoodieDeltaStreamer.testCleanerDeleteReplacedDataWithArchive:1120 expected: <1> but was: <0>
[ERROR]   TestHoodieDeltaStreamer.testCleanerDeleteReplacedDataWithArchive:1090 expected: <false> but was: <true>
[ERROR]   TestHoodieDeltaStreamer.testCleanerDeleteReplacedDataWithArchive:1120 expected: <1> but was: <0>
[ERROR]   TestHoodieDeltaStreamer.testCleanerDeleteReplacedDataWithArchive:1090 expected: <false> but was: <true>
[INFO] 
[ERROR] Tests run: 354, Failures: 4, Errors: 0, Skipped: 7

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is known flaky test that has been fixed, just rebase with the latest master would solve it.


private static ZoneId getZoneId() {
return commitTimeZone.equals(HoodieTimelineTimeZone.LOCAL)
? ZoneId.systemDefault()
Copy link
Contributor

@danny0405 danny0405 May 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, fetch the timezone with metaClient.tableConfig, the HoodieTimelineTimeZone can not assure the initialization of zoneId.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, fetch the timezone whout metaClient.tableConfig, the HoodieTimelineTimeZone can not assure the initialization of zoneId.

I will try to modify the code as you say

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, fetch the timezone whout metaClient.tableConfig, the HoodieTimelineTimeZone can not assure the initialization of zoneId.

In the class HoodieInstantTimeGenerator, set an initial value( HoodieTimelineTimeZone.LOCAL ) for the property commitTimeZone

private static HoodieTimelineTimeZone commitTimeZone = HoodieTimelineTimeZone.LOCAL;

And update commitTimeZone value in HoodieTableConfig#create

if (hoodieConfig.contains(TIMELINE_TIMEZONE)) {
    HoodieInstantTimeGenerator.setCommitTimeZone(HoodieTimelineTimeZone.valueOf(hoodieConfig.getString(TIMELINE_TIMEZONE)));
}
public static void setCommitTimeZone(HoodieTimelineTimeZone commitTimeZone) {
  HoodieInstantTimeGenerator.commitTimeZone = commitTimeZone;
}

So, I think getting ZoneId by HoodieTimelineTimeZone should be correct. and I don't really understand the meaning of the HoodieTimelineTimeZone can not assure the initialization of zoneId.
I don't know if my idea is correct, looking forward to your reply.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the discussions we take in: #8631

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the discussions we take in: #8631

I sees, I will try to modify the code as you say.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the discussions we take in: #8631

It seems that there is no good way to get HoodieTimelineTimeZone through HoodieTableMetaClient in HoodieInstantTimeGenerator, I currently get HoodieTimelineTimeZone by instantiate a HoodieTableConfig, can you give me some advice?

Copy link
Contributor

@danny0405 danny0405 May 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I currently get HoodieTimelineTimeZone by instantiate a HoodieTableConfig

If no existing table meta client or table config can be reused, we must instantiate a new one. For HoodieTableConfig, usually we fetch a meta client first then get the config, take

public static Option<HoodieTableConfig> getTableConfig(String basePath, org.apache.hadoop.conf.Configuration hadoopConf) {
for a reference.

@danny0405 danny0405 self-assigned this May 10, 2023
@danny0405 danny0405 added area:table-service Table services timezone priority:medium Moderate impact; usability gaps labels May 10, 2023
@hudi-bot
Copy link
Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@danny0405
Copy link
Contributor

Does PR #8631 already fixes the problem?

@xiaochen-zhou
Copy link
Contributor Author

xiaochen-zhou commented May 16, 2023

Does PR #8631 already fixes the problem?

Looks like it's been fixed, sorry for take up your time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:table-service Table services priority:medium Moderate impact; usability gaps

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

3 participants