-
Notifications
You must be signed in to change notification settings - Fork 2.5k
[HUDI-6155] Fix cleaner based on hours for earliest commit to retain #8659
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
74a4018
4907a9c
4173ee7
1defdc0
96b692c
e667c9b
9012eb0
4852055
d982ea3
eaf85f3
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||
|---|---|---|---|---|
|
|
@@ -19,6 +19,7 @@ | |||
| package org.apache.hudi.common.table.timeline; | ||||
|
|
||||
| import org.apache.hudi.common.model.HoodieTimelineTimeZone; | ||||
| import org.apache.hudi.common.table.HoodieTableConfig; | ||||
|
|
||||
| import java.text.ParseException; | ||||
| import java.time.LocalDateTime; | ||||
|
|
@@ -44,10 +45,10 @@ public class HoodieInstantTimeGenerator { | |||
| public static final int MILLIS_INSTANT_TIMESTAMP_FORMAT_LENGTH = MILLIS_INSTANT_TIMESTAMP_FORMAT.length(); | ||||
| // Formatter to generate Instant timestamps | ||||
| // Unfortunately millisecond format is not parsable as is https://bugs.openjdk.java.net/browse/JDK-8031085. hence have to do appendValue() | ||||
| private static DateTimeFormatter MILLIS_INSTANT_TIME_FORMATTER = new DateTimeFormatterBuilder().appendPattern(SECS_INSTANT_TIMESTAMP_FORMAT) | ||||
| private static final DateTimeFormatter MILLIS_INSTANT_TIME_FORMATTER = new DateTimeFormatterBuilder().appendPattern(SECS_INSTANT_TIMESTAMP_FORMAT) | ||||
| .appendValue(ChronoField.MILLI_OF_SECOND, 3).toFormatter(); | ||||
| private static final String MILLIS_GRANULARITY_DATE_FORMAT = "yyyy-MM-dd HH:mm:ss.SSS"; | ||||
| private static DateTimeFormatter MILLIS_GRANULARITY_DATE_FORMATTER = DateTimeFormatter.ofPattern(MILLIS_GRANULARITY_DATE_FORMAT); | ||||
| private static final DateTimeFormatter MILLIS_GRANULARITY_DATE_FORMATTER = DateTimeFormatter.ofPattern(MILLIS_GRANULARITY_DATE_FORMAT); | ||||
|
|
||||
| // The last Instant timestamp generated | ||||
| private static AtomicReference<String> lastInstantTime = new AtomicReference<>(String.valueOf(Integer.MIN_VALUE)); | ||||
|
|
@@ -61,7 +62,7 @@ public class HoodieInstantTimeGenerator { | |||
|
|
||||
| /** | ||||
| * Returns next instant time that adds N milliseconds to the current time. | ||||
| * Ensures each instant time is atleast 1 second apart since we create instant times at second granularity | ||||
| * Ensures each instant time is at least 1 second apart since we create instant times at second granularity | ||||
| * | ||||
| * @param milliseconds Milliseconds to add to current time while generating the new instant time | ||||
| */ | ||||
|
|
@@ -94,7 +95,7 @@ public static Date parseDateFromInstantTime(String timestamp) throws ParseExcept | |||
| } | ||||
|
|
||||
| LocalDateTime dt = LocalDateTime.parse(timestampInMillis, MILLIS_INSTANT_TIME_FORMATTER); | ||||
| return Date.from(dt.atZone(ZoneId.systemDefault()).toInstant()); | ||||
| return Date.from(dt.atZone(getZoneId()).toInstant()); | ||||
| } catch (DateTimeParseException e) { | ||||
| throw new ParseException(e.getMessage(), e.getErrorIndex()); | ||||
| } | ||||
|
|
@@ -129,7 +130,7 @@ public static String getInstantForDateString(String dateString) { | |||
| } | ||||
|
|
||||
| private static TemporalAccessor convertDateToTemporalAccessor(Date d) { | ||||
| return d.toInstant().atZone(ZoneId.systemDefault()).toLocalDateTime(); | ||||
| return d.toInstant().atZone(getZoneId()).toLocalDateTime(); | ||||
| } | ||||
|
|
||||
| public static void setCommitTimeZone(HoodieTimelineTimeZone commitTimeZone) { | ||||
|
|
@@ -144,4 +145,11 @@ public static boolean isValidInstantTime(String instantTime) { | |||
| return false; | ||||
| } | ||||
| } | ||||
|
|
||||
| private static ZoneId getZoneId() { | ||||
| HoodieTimelineTimeZone timelineTimezone = new HoodieTableConfig().getTimelineTimezone(); | ||||
| return timelineTimezone.equals(HoodieTimelineTimeZone.LOCAL) | ||||
| ? ZoneId.systemDefault() | ||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If possible, fetch the timezone with metaClient.tableConfig, the
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I will try to modify the code as you say
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
In the class private static HoodieTimelineTimeZone commitTimeZone = HoodieTimelineTimeZone.LOCAL;And update if (hoodieConfig.contains(TIMELINE_TIMEZONE)) {
HoodieInstantTimeGenerator.setCommitTimeZone(HoodieTimelineTimeZone.valueOf(hoodieConfig.getString(TIMELINE_TIMEZONE)));
}public static void setCommitTimeZone(HoodieTimelineTimeZone commitTimeZone) {
HoodieInstantTimeGenerator.commitTimeZone = commitTimeZone;
}So, I think getting ZoneId by HoodieTimelineTimeZone should be correct. and I don't really understand the meaning of
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See the discussions we take in: #8631
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I sees, I will try to modify the code as you say.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
It seems that there is no good way to get
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
If no existing table meta client or table config can be reused, we must instantiate a new one. For hudi/hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/util/StreamerUtil.java Line 307 in 42b517d
|
||||
| : ZoneId.of(commitTimeZone.getTimeZone().toUpperCase()); | ||||
| } | ||||
| } | ||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we supplement some UTs for
parseDateFromInstantTimeandconvertDateToTemporalAccessor?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be happy to do it.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added two UTs:
testFormatDateWithCommitTimeZoneandtestInstantDateParsingWithCommitTimeZone,testInstantDateParsingWithCommitTimeZoneis used to test the correctness of the HoodieInstantTimeGenerator#convertDateToTemporalAccessor()There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And in the TestHoodieActiveTimeline.java, there are many UTs related to DateParsing, such as:
testInvalidInstantDateParsingtestMillisGranularityInstantDateParsingetc.