[Bug] [connector-hive] The file_name_expression does not take effect in Hive sink. #9822

@Adamyuanyuan

Search before asking

  • I had searched in the issues and found no similar issues.

What happened

The file_name_expression parameter is hard-coded to ${transactionId} in the Hive sink, so a value set in the job config has no effect.
In production, when writing to Hive in STREAMING mode, some scenarios require changing file_name_expression.
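
To make the report concrete, here is a minimal sketch of the difference between the behavior described above and the behavior being requested. It is not SeaTunnel's actual code: the class `FileNameExpressionSketch`, its methods, and the sample variable values are hypothetical and only model the idea of placeholder substitution.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is NOT the SeaTunnel implementation.
final class FileNameExpressionSketch {

    // Default expression named in this issue.
    static final String DEFAULT_EXPRESSION = "${transactionId}";

    // Replaces each ${name} placeholder with its value; unknown placeholders are left as-is.
    static String resolve(String expression, Map<String, String> variables) {
        String result = expression;
        for (Map.Entry<String, String> entry : variables.entrySet()) {
            result = result.replace("${" + entry.getKey() + "}", entry.getValue());
        }
        return result;
    }

    // Models the reported behavior: the configured value is ignored.
    static String reportedBehavior(String configuredExpression) {
        return DEFAULT_EXPRESSION;
    }

    // Models the requested behavior: use the configured value, fall back to the default.
    static String requestedBehavior(String configuredExpression) {
        if (configuredExpression == null || configuredExpression.isEmpty()) {
            return DEFAULT_EXPRESSION;
        }
        return configuredExpression;
    }

    public static void main(String[] args) {
        // Sample values are made up for the sketch.
        Map<String, String> variables = new LinkedHashMap<>();
        variables.put("transactionId", "T_1");
        variables.put("now", "2025.01.01");

        String configured = "${now}"; // value from the sink config in this issue
        System.out.println(resolve(reportedBehavior(configured), variables));
        System.out.println(resolve(requestedBehavior(configured), variables));
    }
}
```

With the config in this issue, the first call models what happens today (the name follows the transaction ID) and the second models what is expected (the name follows `${now}`).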



SeaTunnel Version

2.3.11

SeaTunnel Config

env {
  # Execution parallelism
  execution.parallelism = 1
  # Streaming or batch job
  job.mode = "STREAMING"
  # Checkpoint interval
  checkpoint.interval = 60000
}

source {
  Kafka {
    schema = {
      fields {
        id = bigint
        page_no = bigint
        ...
      }
    }
    format = text
    field_delimiter = "\\|"
    topic = "push_report_event"
    bootstrap.servers = "..."
    consumer.group = "push_report_event"
    kafka.config = {
      max.poll.records = 100
      auto.offset.reset = "earliest"
      enable.auto.commit = "false"
    }
    result_table_name = "result_table"
  }
}

transform {
  Sql {
    # Source result table name
    source_table_name = "result_table"
    # Target result table name
    result_table_name = "source_table"
    query = "select id, page_no..."
  }
}

sink {
  Hive {
    table_name = "gang_test.kafka_source_text"
    metastore_uri = "thrift://...:9083"
    hdfs_site_path = "datasource-conf/.../hdfs-site.xml"
    hive_site_path = "datasource-conf/.../hive-site.xml"
    krb5_path = "datasource-conf/.../krb5.conf"
    kerberos_principal = "hive/bigdata-..@MR......"
    kerberos_keytab_path = "datasource-conf/.../hive.keytab"
    overwrite = false
    # compress_codec = "SNAPPY"
    tmp_path = "hdfs://.../.../hive/warehouse/_seatunnel_tmp_kafka_source_text"
    batch_size = 5000
    batch_interval_ms = 15000
    source_table_name = "source_table"
    file_name_expression = "${now}"
  }
}

Running Command

Not needed

Error Exception

No exception is thrown; the configured file_name_expression is ignored.

Zeta or Flink or Spark Version

Flink and Zeta

Java or Scala Version

No response

Screenshots

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
