[HUDI-2735] Allow empty commits in Kafka Connect Sink for Hudi#4544

Merged
nsivabalan merged 1 commit into apache:master from yihua:HUDI-2735-fix-kafka-connect-rollback-archival on Jan 10, 2022

@yihua yihua commented Jan 10, 2022

What is the purpose of the pull request

This PR makes the Kafka Connect Sink for Hudi write empty commits when there are no new messages in the Kafka topic, which avoids constant rollbacks while the topic is idle. The write commit logic, including archival, now always runs regardless of whether there are new messages. This also fixes the problem of rollbacks never being archived when there are no new messages.

Brief change log

  • Removes the check of the size of write status list from all participants in ConnectTransactionCoordinator.
  • Adds a new test for empty status list.
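
For readers who want a picture of the change, here is a minimal sketch of the decision it describes. It is a toy model, not the code in ConnectTransactionCoordinator: the class name, the method names, and the use of plain strings for write statuses are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative model only: names and types here are hypothetical, not Hudi's real API.
final class EmptyCommitSketch {

    /** Merges the write statuses reported by every participant (modeled as strings). */
    static List<String> mergeStatuses(List<List<String>> perParticipant) {
        List<String> all = new ArrayList<>();
        for (List<String> statuses : perParticipant) {
            all.addAll(statuses);
        }
        return all;
    }

    /**
     * Decides whether the coordinator runs its commit path.
     * requireNonEmpty = true models the size check that this PR removes;
     * requireNonEmpty = false models the behavior after the change.
     */
    static boolean shouldCommit(List<List<String>> perParticipant, boolean requireNonEmpty) {
        List<String> all = mergeStatuses(perParticipant);
        return !requireNonEmpty || !all.isEmpty();
    }

    public static void main(String[] args) {
        // Two participants, neither of which received any Kafka messages.
        List<List<String>> idleRound = Arrays.asList(
            Collections.<String>emptyList(), Collections.<String>emptyList());
        // Two participants, one of which wrote a file.
        List<List<String>> busyRound = Arrays.asList(
            Arrays.asList("file-1"), Collections.<String>emptyList());

        System.out.println("idle round, with size check: " + shouldCommit(idleRound, true));
        System.out.println("idle round, check removed:   " + shouldCommit(idleRound, false));
        System.out.println("busy round, check removed:   " + shouldCommit(busyRound, false));
    }
}
```

In this model an idle round only reaches the commit path once the size check is gone, which is what lets the commit-time steps such as archival run even when no data arrived.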

Verify this pull request

This change added tests and can be verified as follows:

  • Run Kafka Connect Sink for Hudi using Quick Start Guide
  • Publish some messages to the Kafka topic: bash setupKafka.sh -n 100 -b 6
  • Wait for some time so the Sink ingests all messages and writes empty commits
  • Publish more messages to the topic: bash setupKafka.sh -n 100 -b 6 -o 600 -t
  • Verify the table timeline using hudi-cli:
hudi:hudi-test-topic->commits show
╔═══════════════════╤═════════════════════╤═══════════════════╤═════════════════════╤══════════════════════════╤═══════════════════════╤══════════════════════════════╤══════════════╗
║ CommitTime        │ Total Bytes Written │ Total Files Added │ Total Files Updated │ Total Partitions Written │ Total Records Written │ Total Update Records Written │ Total Errors ║
╠═══════════════════╪═════════════════════╪═══════════════════╪═════════════════════╪══════════════════════════╪═══════════════════════╪══════════════════════════════╪══════════════╣
║ 20220109184255282 │ 76.1 KB             │ 0                 │ 20                  │ 5                        │ 300                   │ 300                          │ 0            ║
╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220109184129070 │ 75.7 KB             │ 0                 │ 20                  │ 5                        │ 300                   │ 300                          │ 0            ║
╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220109183955630 │ 0.0 B               │ 0                 │ 0                   │ 0                        │ 0                     │ 0                            │ 0            ║
╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220109183755160 │ 0.0 B               │ 0                 │ 0                   │ 0                        │ 0                     │ 0                            │ 0            ║
╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220109183554995 │ 0.0 B               │ 0                 │ 0                   │ 0                        │ 0                     │ 0                            │ 0            ║
╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220109183354904 │ 0.0 B               │ 0                 │ 0                   │ 0                        │ 0                     │ 0                            │ 0            ║
╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220109183225656 │ 75.7 KB             │ 0                 │ 20                  │ 5                        │ 300                   │ 300                          │ 0            ║
╟───────────────────┼─────────────────────┼───────────────────┼─────────────────────┼──────────────────────────┼───────────────────────┼──────────────────────────────┼──────────────╢
║ 20220109183055068 │ 71.8 KB             │ 0                 │ 16                  │ 5                        │ 300                   │ 300                          │ 0            ║
╚═══════════════════╧═════════════════════╧═══════════════════╧═════════════════════╧══════════════════════════╧═══════════════════════╧══════════════════════════════╧══════════════╝
> ls -l /tmp/hoodie/hudi-test-topic/.hoodie/
total 400
  17296 Jan  9 18:32 20220109183055068.deltacommit
      0 Jan  9 18:30 20220109183055068.deltacommit.inflight
      0 Jan  9 18:30 20220109183055068.deltacommit.requested
  22482 Jan  9 18:33 20220109183225656.deltacommit
      0 Jan  9 18:32 20220109183225656.deltacommit.inflight
      0 Jan  9 18:32 20220109183225656.deltacommit.requested
    589 Jan  9 18:35 20220109183354904.deltacommit
      0 Jan  9 18:33 20220109183354904.deltacommit.inflight
      0 Jan  9 18:33 20220109183354904.deltacommit.requested
    589 Jan  9 18:37 20220109183554995.deltacommit
      0 Jan  9 18:35 20220109183554995.deltacommit.inflight
      0 Jan  9 18:35 20220109183554995.deltacommit.requested
    589 Jan  9 18:39 20220109183755160.deltacommit
      0 Jan  9 18:37 20220109183755160.deltacommit.inflight
      0 Jan  9 18:37 20220109183755160.deltacommit.requested
   7490 Jan  9 18:39 20220109183955045.compaction.requested
    589 Jan  9 18:41 20220109183955630.deltacommit
      0 Jan  9 18:39 20220109183955630.deltacommit.inflight
      0 Jan  9 18:39 20220109183955630.deltacommit.requested
  21413 Jan  9 18:42 20220109184129070.deltacommit
      0 Jan  9 18:41 20220109184129070.deltacommit.inflight
      0 Jan  9 18:41 20220109184129070.deltacommit.requested
  22743 Jan  9 18:44 20220109184255282.deltacommit
      0 Jan  9 18:42 20220109184255282.deltacommit.inflight
      0 Jan  9 18:42 20220109184255282.deltacommit.requested
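
One way to sanity-check the listing above is to compute the spacing between the four empty commits from their instant times. The sketch below assumes the 17-digit instant times follow the pattern yyyyMMddHHmmssSSS, which matches the values shown; the class and method names are hypothetical helpers, not part of Hudi or hudi-cli.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

// Hypothetical helper for reading the timeline above; not part of Hudi.
final class InstantGapSketch {

    // Assumed layout: yyyyMMddHHmmss followed by three millisecond digits.
    private static final DateTimeFormatter INSTANT_FORMAT = new DateTimeFormatterBuilder()
        .appendPattern("yyyyMMddHHmmss")
        .appendValue(ChronoField.MILLI_OF_SECOND, 3)
        .toFormatter();

    /** Parses a 17-digit instant time into a local date-time. */
    static LocalDateTime parse(String instant) {
        return LocalDateTime.parse(instant, INSTANT_FORMAT);
    }

    /** Milliseconds elapsed between two instant times. */
    static long gapMillis(String earlier, String later) {
        return Duration.between(parse(earlier), parse(later)).toMillis();
    }

    public static void main(String[] args) {
        // The four empty commits (0.0 B written) from the output above.
        String[] emptyCommits = {
            "20220109183354904", "20220109183554995",
            "20220109183755160", "20220109183955630"
        };
        for (int i = 1; i < emptyCommits.length; i++) {
            long gap = gapMillis(emptyCommits[i - 1], emptyCommits[i]);
            System.out.println(emptyCommits[i - 1] + " -> " + emptyCommits[i] + ": " + gap + " ms");
        }
    }
}
```

Evenly spaced empty commits, with no rollback instants between them, are consistent with the Sink committing on schedule while the topic is idle.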

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@hudi-bot
Collaborator

CI report:

Bot commands

@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

@nsivabalan nsivabalan added the priority:critical Production degraded; pipelines stalled label Jan 10, 2022

@nsivabalan nsivabalan left a comment

LGTM

@nsivabalan nsivabalan merged commit bc95571 into apache:master Jan 10, 2022
@vinishjail97 vinishjail97 mentioned this pull request Jan 24, 2022
vingov pushed a commit to vingov/hudi that referenced this pull request Jan 26, 2022
liusenhua pushed a commit to liusenhua/hudi that referenced this pull request Mar 1, 2022
vingov pushed a commit to vingov/hudi that referenced this pull request Apr 3, 2022
Labels

priority:critical Production degraded; pipelines stalled

3 participants