[IMPROVED] Resync for Mirrors and Sources on LN reconnect in complex topologies#7265
Merged
neilalexander merged 1 commit into main on Sep 7, 2025
MauriceVanVeen
approved these changes
Sep 7, 2025
server/jetstream_leafnode_test.go
Outdated
  })
- if elapsed := time.Since(start); elapsed > 2*time.Second {
+ if elapsed := time.Since(start); elapsed > 3*time.Second {
      t.Fatalf("Expected to resync all streams <2s but got %v", elapsed)
Member
Should this also be updated to 3 seconds?
Member
Author
Yes, I upped it because I saw one failure when running in a loop. Will update.
7669a53 to 5ec9c25
neilalexander added a commit that referenced this pull request on Sep 8, 2025
Includes the following:
- #7200
- #7201
- #7202
- #7209
- #7210
- #7211
- #7213
- #7212
- #7216
- #7217
- #7230
- #7239
- #7246
- #7248
- 8241a15, specifically delayed errors that are not JS API errors
- #7158 (not containing 2.12-specific changes)
- #7233
- #7255
- #7249
- #7259
- #7265
- #7273 (not including Go 1.25.x)
- #7258
- #7222

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
Signed-off-by: Neil Twigg <neil@nats.io>
This improves our resync logic for sources and mirrors when leafnode connections are re-established.
We previously improved this with PR #6981, but that approach was too rigid. It expected the LN to have JS enabled and to be in the same domain. The test also simulated the link being down for a long time and manually changed the state to not in progress (si.sip).
For simpler setups this worked, but if LNs were daisy chained and the GW leafnode either did not have JS enabled or had it enabled with a different domain, the speedup would fail.
Now we are much broader about the conditions under which we retry. I did look into checking for $JS.<DOMAIN>.API.INFO, but this was brittle and depended on timing and on doing retries or backoffs. Will revisit in the future (we do have the ability to register a callback for interest in a subject, which could be utilized). For now this works well and is simple, and the cost of being "wrong" in very complicated setups is minimal.
Signed-off-by: Derek Collison <derek@nats.io>
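To illustrate the difference between the earlier and the relaxed retry conditions described above, here is a minimal sketch. It is not the nats-server implementation: `leafInfo`, `narrowShouldResync`, and `broadShouldResync` are hypothetical names, and the relaxed rule is simplified to "always retry" because the exact conditions used by this PR are not shown here.

```go
package main

import "fmt"

// leafInfo is a hypothetical summary of what the local server knows about
// the leafnode connection that was just re-established.
type leafInfo struct {
	jsEnabled bool
	domain    string
}

// narrowShouldResync models the earlier, rigid rule: only resync when the
// leafnode has JetStream enabled and is in the same domain.
func narrowShouldResync(localDomain string, ln leafInfo) bool {
	return ln.jsEnabled && ln.domain == localDomain
}

// broadShouldResync models the relaxed rule as "always retry on reconnect".
// This is an assumption made for illustration; a spurious retry is treated
// as cheap, which matches the stated trade-off.
func broadShouldResync(localDomain string, ln leafInfo) bool {
	return true
}

func main() {
	cases := []struct {
		name string
		ln   leafInfo
	}{
		{"same domain, JS enabled", leafInfo{jsEnabled: true, domain: "hub"}},
		{"daisy chained, no JS", leafInfo{jsEnabled: false}},
		{"daisy chained, other domain", leafInfo{jsEnabled: true, domain: "edge"}},
	}
	for _, c := range cases {
		fmt.Printf("%-28s narrow=%-5v broad=%v\n", c.name,
			narrowShouldResync("hub", c.ln), broadShouldResync("hub", c.ln))
	}
}
```

Under these assumptions, the two daisy-chained cases are the ones where the narrow rule skips the fast resync and the broad rule does not.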