NRG: Parallel catchups can truncate committed #7424

Merged
neilalexander merged 1 commit into main from maurice/nrg-catchup-rollback on Oct 13, 2025

@MauriceVanVeen (Member)

Follow-up fix for a small bug introduced in #7209 that could result in desync under niche conditions.

If a follower is behind, it requests catchup from the leader. The leader, however, can send more catchup data than the follower requested, because it sends entries up to the last known entry. The follower marks catchup as finished once it has received what it knew to be all entries, but it may still receive more messages from the leader after that point.

If multiple catchups were running in parallel, the follower could mark the catchup as complete and then still let through an append entry from another catchup. That entry could truncate the WAL, removing entries the follower had already responded to as "can be committed".

This PR adds a small guard that disallows truncation from a catchup entry once catchup has completed. Later on we'll need to look at how to improve catchup in general, especially since fast batch ingest will require Raft catchup to be faster than the ingest rate as well.
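
To make the guard concrete, here is a minimal Go sketch of the idea as a toy model. It is not the nats-server code: the `follower` and `appendEntry` types, the `fromCatchup` and `catchingUp` fields, and the `process` method are hypothetical names chosen for illustration, and the WAL is reduced to a slice of strings.

```go
package main

import "fmt"

// appendEntry is a simplified, hypothetical stand-in for a Raft append entry.
type appendEntry struct {
	pindex      uint64 // number of entries the sender expects to precede this one
	data        string
	fromCatchup bool // true when the entry arrived through a catchup stream
}

// follower is a toy model of a follower's log state (hypothetical, for illustration).
type follower struct {
	wal        []string // wal[i] holds the entry at index i+1
	catchingUp bool     // true while a catchup is still in progress
}

// process applies one append entry and reports whether the WAL was changed.
func (f *follower) process(ae appendEntry) bool {
	last := uint64(len(f.wal))
	switch {
	case ae.pindex == last:
		// The entry lines up with the end of the WAL: append it.
		f.wal = append(f.wal, ae.data)
		return true
	case ae.pindex < last:
		// The entry points into the existing WAL, so storing it means truncating.
		// Guard: a catchup entry may not truncate once catchup has completed,
		// because entries after pindex may already have been acknowledged.
		if ae.fromCatchup && !f.catchingUp {
			return false
		}
		f.wal = append(f.wal[:ae.pindex], ae.data)
		return true
	default:
		// Gap in the log: ignore the entry (a real follower would request catchup).
		return false
	}
}

func main() {
	// The follower holds three entries and has already finished catching up.
	f := &follower{wal: []string{"a", "b", "c"}, catchingUp: false}

	// A late entry from another, parallel catchup arrives and points at index 1.
	stale := appendEntry{pindex: 1, data: "b", fromCatchup: true}
	fmt.Println(f.process(stale), len(f.wal)) // prints "false 3"
}
```

Without the `fromCatchup && !catchingUp` check, the stale entry in `main` would cut the toy WAL back to two entries and drop `"c"`, which is the kind of truncation described above.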

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>

@MauriceVanVeen requested a review from a team as a code owner on October 13, 2025 11:37

@neilalexander (Member) left a comment:

LGTM

@neilalexander merged commit f8cc99d into main on Oct 13, 2025
90 of 92 checks passed
@neilalexander deleted the maurice/nrg-catchup-rollback branch on October 13, 2025 12:24

neilalexander added a commit that referenced this pull request on Oct 13, 2025
Includes the following:

- #7416
- #7423
- #7425
- #7424
- #7411

Signed-off-by: Neil Twigg <neil@nats.io>

neilalexander added a commit that referenced this pull request on Oct 28, 2025
Includes the following:

- #7380
- #7384
- #7385
- #7388
- #7395
- #7400
- #7399
- #7401
- #7402
- #7423
- #7424
- #7411
- #7428
- #7429
- #7431
- #7435
- #7433
- #7443
- #7455
- #7465
- #7466
- #7460
- #7484
- #7479

Signed-off-by: Neil Twigg <neil@nats.io>