Skip to content

[HUDI-3184] hudi-flink support timestamp-micros#4548

Merged
danny0405 merged 9 commits intoapache:masterfrom
AirToSupply:hudi_flink_support_timestamp_micro
Jan 12, 2022
Merged

[HUDI-3184] hudi-flink support timestamp-micros#4548
danny0405 merged 9 commits intoapache:masterfrom
AirToSupply:hudi_flink_support_timestamp_micro

Conversation

@AirToSupply
Copy link
Copy Markdown
Contributor

Tips

What is the purpose of the pull request

hudi-flink module support timestamp-micros. (HUDI-3184)

Brief change log

(for example:)

  • Modify AnnotationLocation checkstyle rule in checkstyle.xml

Verify this pull request

(Please pick either of the following options)

This pull request is a trivial rework / code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end.
  • Added HoodieClientWriteTest to verify the change.
  • Manually verified the change by running a job locally.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

switch (unit) {
case MILLIS:
chronoUnit = ChronoUnit.MILLIS;
break;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the chronoUnit can be a class member variable right ?
And, we should also fix the non-utc timezone code path.

case TIMESTAMP_WITHOUT_TIME_ZONE:
return AvroToRowDataConverters::convertToTimestamp;
return avroObject -> convertToTimestamp(avroObject, (TimestampType) type);
case CHAR:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also instantiate the chronoUnit first based on the type precision.

+ ", it only supports precision less than 6.");
}
return TimestampData.fromLocalDateTime(
LocalDateTime.ofInstant(Instant.ofEpochSecond(0).plus(timestamp, chronoUnit), ZoneId.systemDefault()));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should utc timezone here:

TimestampData.fromInstant(Instant.EPOCH.plus(timestamp, chronoUnit))

final ZoneOffset offset = ZoneOffset.systemDefault().getRules().getOffset(LocalDateTime.now());
if (schema.getLogicalType() instanceof LogicalTypes.TimestampMillis) {
return now.toInstant(offset).toEpochMilli();
} else if (schema.getLogicalType() instanceof LogicalTypes.TimestampMicros) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can instantiate two converters like this:

TimestampType timestampType = (TimestampType) type;
        if (timestampType.getPrecision() <= 3) {
          converter =
              new RowDataToAvroConverter() {
                private static final long serialVersionUID = 1L;

                @Override
                public Object convert(Schema schema, Object object) {
                  return ((TimestampData) object).toInstant().toEpochMilli();
                }
              };
        } else if (timestampType.getPrecision() <= 6) {
          converter =
              new RowDataToAvroConverter() {
                private static final long serialVersionUID = 1L;

                @Override
                public Object convert(Schema schema, Object object) {
                  return ChronoUnit.MICROS.between(Instant.EPOCH, ((TimestampData) object).toInstant());
                }
              };
        } else {
          throw new UnsupportedOperationException("Unsupported type: " + type);
        }

}

private static TimestampData convertToTimestamp(long timestamp, TimestampType logicalType) {
final int precision = logicalType.getPrecision();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as RowDataToAvroConverters, instantiate two converters here.

switch (unit) {
case MILLIS:
return TimeUnit.SECONDS.toMillis(now.toEpochSecond(offset))
+ now.toInstant(offset).getLong(ChronoField.MILLI_OF_SECOND);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we convert the TimestampData to Instant first then to the long then ? It is more concise i think.

@danny0405
Copy link
Copy Markdown
Contributor

Thanks for the contribution, i have left some comments ~

@danny0405 danny0405 self-assigned this Jan 10, 2022
private static TimestampData int64ToTimestamp(
boolean utcTimestamp,
long millionsOfDay,
ChronoUnit unit) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

millionsOfDay => interval

@hudi-bot
Copy link
Copy Markdown
Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@danny0405 danny0405 merged commit 4b01119 into apache:master Jan 12, 2022
@vinishjail97 vinishjail97 mentioned this pull request Jan 24, 2022
5 tasks
vingov pushed a commit to vingov/hudi that referenced this pull request Jan 26, 2022
* support both avro and parquet code path
* string rowdata conversion is also supported
liusenhua pushed a commit to liusenhua/hudi that referenced this pull request Mar 1, 2022
* support both avro and parquet code path
* string rowdata conversion is also supported
vingov pushed a commit to vingov/hudi that referenced this pull request Apr 3, 2022
* support both avro and parquet code path
* string rowdata conversion is also supported
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants