
Cherry-picks for 2.11.9-RC.3 #7252

Merged
neilalexander merged 27 commits into release/v2.11.9 from maurice/2119rc3
Sep 8, 2025

@MauriceVanVeen
Member

@MauriceVanVeen MauriceVanVeen commented Sep 3, 2025

Includes the following:

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
Signed-off-by: Neil Twigg <neil@nats.io>

kozlovic and others added 16 commits September 3, 2025 15:12
…ials

With route pooling, the server would create a connection to the remote
and, when receiving the INFO protocol, would start a new route if the
number of routes was below the effective pool size. But if the route
failed due to an authentication error, this would cause a rapid, endless
loop of attempts to recreate the route.

The approach here is to delay the creation of the new route until after
receiving the first PONG in response to a PING.
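
As a rough illustration of the idea described above, the following sketch defers growing the pool until the first PONG confirms that the connection survived authentication. The `route` and `pool` types and the `onInfo`, `onPong`, and `onAuthError` methods are hypothetical names invented for this example; they are not the nats-server implementation.

```go
package main

import "fmt"

// route is a hypothetical, heavily simplified stand-in for a pooled route
// connection. It is not the nats-server type.
type route struct {
	pingSent  bool // a PING was sent after the remote INFO was received
	confirmed bool // the first PONG arrived, so the connection was accepted
}

// pool tracks the existing routes against the configured pool size.
type pool struct {
	size   int
	routes []*route
}

// onInfo handles the remote INFO. Instead of creating the next route right
// away, it only records that a PING went out.
func (p *pool) onInfo(r *route) {
	r.pingSent = true
}

// onPong handles a PONG. Only the first PONG after our PING may trigger the
// creation of the next pooled route. It reports whether a route was added.
func (p *pool) onPong(r *route) bool {
	if !r.pingSent || r.confirmed {
		return false
	}
	r.confirmed = true
	if len(p.routes) < p.size {
		p.routes = append(p.routes, &route{})
		return true
	}
	return false
}

// onAuthError removes a route that failed authentication. Because no PONG
// was ever seen for it, it never caused another route to be created.
func (p *pool) onAuthError(r *route) {
	for i, x := range p.routes {
		if x == r {
			p.routes = append(p.routes[:i], p.routes[i+1:]...)
			return
		}
	}
}

func main() {
	first := &route{}
	p := &pool{size: 3, routes: []*route{first}}

	p.onInfo(first)
	fmt.Println("routes after INFO:", len(p.routes))

	p.onPong(first)
	fmt.Println("routes after first PONG:", len(p.routes))

	// The second route is rejected before any PONG arrives.
	p.onAuthError(p.routes[1])
	fmt.Println("routes after auth error:", len(p.routes))
}
```

In this toy model a route that is rejected during authentication never reaches `onPong`, so it cannot schedule a replacement and the retry loop does not start.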

Resolves #7194

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
Signed-off-by: Waldemar Quevedo <wally@nats.io>
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
Signed-off-by: Waldemar Quevedo <wally@nats.io>
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
Signed-off-by: Alberto Ricart <alberto@synadia.com>
The test leaves behind the file server/tav.idx. Set the raft's
store to use a temporary directory instead, which is automatically
removed at the end of the test.

Signed-off-by: Daniele Sciascia <daniele@nats.io>
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
On shutdown, server may find the cluster meta node to be nil.
If so, return early from the method.
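
The guard described above might look roughly like this. The `server` and `metaNode` types and the `isMetaLeader` method are hypothetical stand-ins chosen for illustration, not the actual nats-server code.

```go
package main

import "fmt"

// metaNode and server are hypothetical stand-ins, not nats-server types.
type metaNode struct {
	leader bool
}

type server struct {
	meta *metaNode // may be nil once shutdown has begun
}

// isMetaLeader returns early when the meta node is gone instead of
// dereferencing a nil pointer.
func (s *server) isMetaLeader() bool {
	if s.meta == nil {
		return false
	}
	return s.meta.leader
}

func main() {
	running := &server{meta: &metaNode{leader: true}}
	shuttingDown := &server{}

	fmt.Println(running.isMetaLeader())
	fmt.Println(shuttingDown.isMetaLeader())
}
```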

Signed-off-by: Daniele Sciascia <daniele@nats.io>
Replace the incorrect comment about leaders needing to break a tie when
using the same term with an assert.Unreachable. Two leaders should
never use the same term.
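
To illustrate the shape of such an assertion, here is a sketch that uses a local no-op function in place of the real assertion library. The `unreachable` variable, the `node` type, and `handleLeaderMessage` are all hypothetical, and the term handling is a simplification for the example rather than the server's raft logic.

```go
package main

import "fmt"

// unreachable is a hypothetical local stand-in for an Unreachable-style
// assertion. By default it does nothing, so it costs nothing in production.
var unreachable = func(message string, details map[string]any) {}

// node is a hypothetical, simplified raft participant.
type node struct {
	term   uint64
	leader bool
}

// handleLeaderMessage reacts to a message sent by a node claiming to be
// leader for the given term.
func (n *node) handleLeaderMessage(term uint64) string {
	switch {
	case term > n.term:
		// A newer term always wins: adopt it and step down.
		n.term, n.leader = term, false
		return "stepdown"
	case term < n.term:
		return "reject"
	default:
		if n.leader {
			// Two leaders must never share a term, so this branch should
			// never run. Report it instead of attempting a tie-break.
			unreachable("two leaders in the same term", map[string]any{"term": term})
			return "reject"
		}
		return "accept"
	}
}

func main() {
	// A test harness can swap in a hook that records violations.
	hits := 0
	unreachable = func(message string, details map[string]any) {
		hits++
		fmt.Println("violation:", message, details)
	}

	n := &node{term: 5, leader: true}
	fmt.Println(n.handleLeaderMessage(5), hits)
}
```

Keeping the default a no-op means the check only has an effect when a test environment replaces the hook.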

Signed-off-by: Daniele Sciascia <daniele@nats.io>
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
@wallyqs
Member

wallyqs commented Sep 3, 2025

offline asserts support ok to include?

@MauriceVanVeen
Member Author

> offline asserts support ok to include?

Yes, that's intended to be available in 2.11, so downgrading from 2.12 when using new features will be safe.

@wallyqs
Member

wallyqs commented Sep 3, 2025

just checking but is v2.11.9 the first time we are adding github.com/antithesishq/antithesis-sdk-go/assert outside of tests?

@MauriceVanVeen
Member Author

MauriceVanVeen commented Sep 3, 2025

> just checking but is v2.11.9 the first time we are adding github.com/antithesishq/antithesis-sdk-go/assert outside of tests?

Yes, but do note it's a "noop"-implementation by default. We swap it for the real dependency when testing under Antithesis.

Also, these asserts don't trigger in the hot-path, only in cases where things would have gone wrong already.

@derekcollison
Member

I still grimace when I see these in production code tbh.

@MauriceVanVeen
Member Author

> I still grimace when I see these in production code tbh.

I understand that, but these kinds of asserts don't have any "production-value" because they're noop. These are only active under Antithesis, and ensure we can assert unreachable conditions are not reached, for conditions that are hard or impossible to observe from the outside.

In the end these asserts are a display of our commitment to correctness, as well as enabling us to detect possible regressions.

MauriceVanVeen and others added 3 commits September 3, 2025 16:57
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
The issue can be observed, for instance, with JetStream source
streams, which would not populate properly depending on where
a server connects in a given topology. See the new test
`TestLeafNodeDaisyChainWithAccountImportExport` for a demonstration
of the issue without the fix in place.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
@wallyqs
Member

wallyqs commented Sep 4, 2025

also need to include #7258

kozlovic and others added 3 commits September 5, 2025 10:03
This is related to PR #5519 where the propagation of a shadow subscription
was suppressed for a subscription coming from a spoke leafnode connection,
but the handling of the unsubscribe of such shadow subscription was not
fixed, which could result in removing a legitimate interest.
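
The asymmetry described above can be sketched with a toy interest counter. The `interest` map, the `sub` type, and the `propagates` rule are hypothetical simplifications for illustration. The point is only that the unsubscribe path must apply the same suppression check as the subscribe path; otherwise it decrements interest that the shadow subscription never added.

```go
package main

import "fmt"

// sub is a hypothetical, simplified subscription.
type sub struct {
	subject   string
	fromSpoke bool // shadow subscription originating from a spoke leafnode
}

// interest counts how many subscriptions were propagated per subject.
type interest map[string]int

// propagates reports whether a subscription is forwarded at all. In this
// sketch, shadow subscriptions coming from a spoke leafnode are suppressed.
func propagates(s sub) bool {
	return !s.fromSpoke
}

// subscribe records interest only for subscriptions that are propagated.
func (in interest) subscribe(s sub) {
	if propagates(s) {
		in[s.subject]++
	}
}

// unsubscribe mirrors subscribe: a subscription that was never propagated
// must not remove interest either.
func (in interest) unsubscribe(s sub) {
	if !propagates(s) {
		return
	}
	if in[s.subject] > 1 {
		in[s.subject]--
		return
	}
	delete(in, s.subject)
}

func main() {
	in := interest{}
	legit := sub{subject: "foo"}
	shadow := sub{subject: "foo", fromSpoke: true}

	in.subscribe(legit)
	in.subscribe(shadow)
	in.unsubscribe(shadow)

	// The legitimate interest in "foo" is still present.
	fmt.Println(in["foo"])
}
```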

The test `TestLeafNodeDupeDeliveryQueueSubAndPlainSub` that was added
in PR #5519 was modified to demonstrate what would have been the issue
without the fix in this PR.

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
…are re-established.

We previously improved this with PR #6981, but that was too rigid: it expected the LN to have JS enabled and to be in the same domain.
The test also simulated a long time for the link to be down and manually changed the state to not in progress (si.sip).

For simpler setups this worked, but if LNs were daisy chained, and the GW Leafnode either did not have JS enabled or had it enabled with a different domain, the speedup would fail.

Now we are much broader about the conditions to retry. I did look into checking for $JS.<DOMAIN>.API.INFO, but this was brittle and depended on timing and on doing retries or backoffs.
We will revisit this in the future (we do have the ability to register a callback for interest in a subject, which could be utilized).
For now this works well and is simple, and the cost of being "wrong" in very complicated setups is minimal.

Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Neil Twigg <neil@nats.io>
Signed-off-by: Neil Twigg <neil@nats.io>
Signed-off-by: Neil Twigg <neil@nats.io>
@wallyqs
Member

wallyqs commented Sep 8, 2025

Need to bump client nats.go to v1.45.0 so that test from #7258 passes.

Signed-off-by: Waldemar Quevedo <wally@nats.io>
Signed-off-by: Waldemar Quevedo <wally@nats.io>
@neilalexander neilalexander marked this pull request as ready for review September 8, 2025 13:01
@neilalexander neilalexander requested a review from a team as a code owner September 8, 2025 13:01
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
@neilalexander neilalexander merged commit cfd563e into release/v2.11.9 Sep 8, 2025
89 of 92 checks passed
@neilalexander neilalexander deleted the maurice/2119rc3 branch September 8, 2025 14:49


7 participants