Hudi uses the partition path field as the Hive partition field in Flink #5394

@onlywangyh

To Reproduce

Steps to reproduce the behavior:

  1. Create a MySQL table like this:
CREATE TABLE `timeTypeTest` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `datetime1` datetime DEFAULT NULL,
  `date1` date DEFAULT NULL,
  `datetime16` datetime(6) DEFAULT NULL,
  `time16` time DEFAULT NULL,
  `timestamp16` timestamp(6) NULL DEFAULT NULL,
  `timestamp16Partition` varchar(45) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id_UNIQUE` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=latin1
  2. Insert a row:
    insert into mydb.timeTypeTest values ('2', '2020-07-30 10:08:22', '2020-07-30', '2020-07-30 10:08:22.000000', '10:08:22', '2020-07-30 10:08:22.000000', '2020-07-30');

  3. Start a Flink CDC job that sinks to Hudi with these config properties:

--hive-sync-enable=true
--hive-sync-jdbc-url=jdbc:hive2://localhost:10000
--hive-sync-db=testDb
--hive-sync-table=testTable
--record-key-field=id
--partition-path-field=timestamp16
--hive-sync-partition-fields=inc_day
--hive-style-partitioning=true
--hive-sync-mode=jdbc
--hive-sync-username=hive
--hive-sync-password=hive

hoodie.deltastreamer.keygen.timebased.timestamp.type=EPOCHMILLISECONDS
hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy-MM-dd
hoodie.datasource.write.keygenerator.consistent.logical.timestamp.enabled=true
hive_sync.partition_extractor_class=org.apache.hudi.keygen.TimestampBasedAvroKeyGenerator
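
To make the intended mapping concrete, here is a minimal sketch of how an epoch-milliseconds value could be turned into a `yyyy-MM-dd` partition value and a hive-style path segment. It is not Hudi's implementation. The function names `partition_value` and `hive_style_path` are hypothetical, and it assumes the sample `timestamp16` value is interpreted as UTC.

```python
from datetime import datetime, timezone

# Python strftime equivalent of the Java pattern "yyyy-MM-dd" used in
# hoodie.deltastreamer.keygen.timebased.output.dateformat above.
OUTPUT_FORMAT = "%Y-%m-%d"


def partition_value(epoch_millis: int) -> str:
    """Format an epoch-milliseconds timestamp as a date string (UTC assumed)."""
    dt = datetime.fromtimestamp(epoch_millis / 1000, tz=timezone.utc)
    return dt.strftime(OUTPUT_FORMAT)


def hive_style_path(field: str, value: str) -> str:
    """Build a hive-style partition path segment of the form field=value."""
    return f"{field}={value}"


# The timestamp16 value of the sample row, read as UTC (an assumption).
sample = datetime(2020, 7, 30, 10, 8, 22, tzinfo=timezone.utc)
sample_millis = int(sample.timestamp() * 1000)

print(hive_style_path("timestamp16", partition_value(sample_millis)))
# prints "timestamp16=2020-07-30"
```

The sketch covers only the formatting of the path value. The path segment is named after the partition path field (`timestamp16`), while `inc_day` appears only in the Hive sync setting.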

Expected behavior
Hive sync should create a Hive table testTable with a string partition field inc_day and add the partition "2020-07-30". Instead, the partition field is timestamp16 with type bigint:

show partitions testTable;  ---- "2020-07-30"
select timestamp16 from testTable; ----- null
