KAFKA-16106: revert classic state transitions if deletion fails #16511

Open: wants to merge 3 commits into base: trunk


@jeffkbkim jeffkbkim commented Jul 2, 2024

An expire-group-metadata operation generates tombstone records, updates the groups state, decrements the group size counters, and then writes to the log. If the write fails (for instance, because of a __consumer_offsets partition reassignment), the groups state is reverted to an earlier snapshot, but the classic group size counters are not. This creates an inconsistency between the metrics and the actual group count. The same applies to every unsuccessful write operation that alters the classic group state.

However, some operations that alter the classic group state do not produce records. This means that we cannot rely on timeline data structures as we do for consumer group states.
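
The drift described above can be reproduced in a small, self-contained sketch. Everything here is an assumption made for illustration: `GroupCounterSketch`, `deleteGroup`, and the `revertCounter` flag are hypothetical names, the snapshot is a plain map copy rather than a timeline data structure, and this is not the code in this PR. It only shows that a counter kept outside the snapshot has to be restored explicitly when the write fails.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the coordinator shard; these names are not Kafka's.
class GroupCounterSketch {
    // Stands in for snapshot-backed state that is rolled back on a failed write.
    private Map<String, String> groups = new HashMap<>();
    // Plain counter kept outside the snapshot, like a metrics gauge.
    private long classicGroupCount = 0;

    void addGroup(String groupId) {
        groups.put(groupId, "EMPTY");
        classicGroupCount++;
    }

    // Deletes a group, then simulates the log write. On failure the map is
    // restored; the counter is restored only when revertCounter is true.
    boolean deleteGroup(String groupId, boolean writeSucceeds, boolean revertCounter) {
        Map<String, String> snapshot = new HashMap<>(groups);
        int deleted = 0;
        if (groups.remove(groupId) != null) {
            classicGroupCount--;
            deleted++;
        }
        if (writeSucceeds) {
            return true;
        }
        groups = snapshot;
        if (revertCounter) {
            classicGroupCount += deleted;
        }
        return false;
    }

    int groupCount() {
        return groups.size();
    }

    long counter() {
        return classicGroupCount;
    }
}
```

With `revertCounter` set to `false`, a failed delete leaves the map at its original size while the counter stays decremented. With `true`, both agree again.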

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@dajac dajac added the KIP-848 label Jul 4, 2024
@jeffkbkim jeffkbkim marked this pull request as ready for review July 8, 2024 20:17
@@ -575,18 +581,32 @@ public CoordinatorResult<OffsetDeleteResponseData, CoordinatorRecord> deleteOffs
public CoordinatorResult<Void, CoordinatorRecord> cleanupGroupMetadata() {
    long startMs = time.milliseconds();
    List<CoordinatorRecord> records = new ArrayList<>();
    AtomicInteger deletedClassicGroupCount = new AtomicInteger(0);
Contributor


Just to confirm, is the AtomicInteger used to match the following forEach loop?

Contributor Author


If you mean using it to conform to the lambda expression, yes.
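
For context, a minimal sketch of the Java rule behind this exchange: a local variable captured by a lambda must be effectively final, so a plain `int` cannot be incremented inside `forEach`, while a final reference to an `AtomicInteger` can be mutated. The class name, method name, and the `classic-` prefix check are hypothetical and stand in for the real deletion logic.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not the code under review.
class LambdaCounterSketch {
    static int countClassic(List<String> groupIds) {
        // The reference is effectively final, but the value it holds is mutable.
        AtomicInteger deleted = new AtomicInteger(0);
        groupIds.forEach(groupId -> {
            // `int deleted; deleted++` would not compile here, because locals
            // captured by a lambda must be effectively final.
            if (groupId.startsWith("classic-")) {
                deleted.incrementAndGet();
            }
        });
        return deleted.get();
    }
}
```

An ordinary `for` loop with an `int` counter would work just as well. The `AtomicInteger` is only needed because the loop body is a lambda, not for thread safety.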


when(groupMetadataManager.groupIds()).thenReturn(mkSet("group-id", "other-group-id"));
when(offsetMetadataManager.cleanupExpiredOffsets(eq("group-id"), eq(new ArrayList<>()))).thenReturn(true);
when(groupMetadataManager.maybeDeleteGroup(eq("group-id"), eq(new ArrayList<>()))).thenReturn(true);
Contributor


Should we add something to the record list and assert it's non-null later?
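
One way to read this suggestion, sketched without Mockito so it stays self-contained: have the stub append a record to the list it receives, then assert on the records that come back. `CleanupRecordsSketch`, `cleanup`, and the use of `String` records with a `BiPredicate` stub are all hypothetical simplifications, not the actual test or the `cleanupGroupMetadata()` implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical simplification of a cleanup pass that collects records.
class CleanupRecordsSketch {
    // Calls the stubbed maybeDeleteGroup for each group id and returns
    // whatever records the stub appended to the shared list.
    static List<String> cleanup(List<String> groupIds,
                                BiPredicate<String, List<String>> maybeDeleteGroup) {
        List<String> records = new ArrayList<>();
        for (String groupId : groupIds) {
            maybeDeleteGroup.test(groupId, records);
        }
        return records;
    }
}
```

A stub such as `(id, recs) -> id.equals("group-id") && recs.add("tombstone:" + id)` would then let the test assert that the returned list contains exactly the tombstone for `group-id`, which is a stronger check than matching an empty list.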
