Skip to content

Direct: Fixes missing minimum value validation for MaxTcpConnectionsPerEndpoint#5887

Open
yumnahussain wants to merge 2 commits into
Azure:mainfrom
yumnahussain:users/t-yhussain/fix-max-tcp-connections-validation
Open

Direct: Fixes missing minimum value validation for MaxTcpConnectionsPerEndpoint#5887
yumnahussain wants to merge 2 commits into
Azure:mainfrom
yumnahussain:users/t-yhussain/fix-max-tcp-connections-validation

Conversation

@yumnahussain
Copy link
Copy Markdown
Member

@yumnahussain yumnahussain commented May 19, 2026

Description

Fixes #5878

MaxTcpConnectionsPerEndpoint documents that the value must be >= 16 (default 65,535), but the setter had no validation. This PR adds bound checking using a clamp-and-warn pattern — values below the documented minimum are clamped to 16 and a DefaultTrace.TraceWarning is emitted.

This is a non-breaking change: existing customers who previously passed invalid values will see their apps continue to start successfully, with a trace warning indicating the clamped value.

Changes

  • ConnectionPolicy.cs — Added MinimumMaxTcpConnectionsPerEndpoint (16) constant. MaxTcpConnectionsPerEndpoint setter now clamps invalid values and emits a trace warning.
  • CosmosClientOptions.cs — Added clamp+warn validation in ValidateDirectTCPSettings() for MaxTcpConnectionsPerEndpoint.
  • CosmosClientOptionsUnitTests.cs — New tests for clamping behavior on both CosmosClientOptions and ConnectionPolicy.
  • changelog.md — Updated Bugs Fixed entry under Unreleased.

Testing

  • Added unit tests covering boundary conditions
  • Non-breaking: no exceptions thrown, invalid values gracefully clamped to documented minimum with trace warning
  • Constant shared between ConnectionPolicy and CosmosClientOptions to prevent drift

Copy link
Copy Markdown
Contributor

@NaluTripician NaluTripician left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up! A few things below — the headline one is that the PR doesn't build locally (missing using for DefaultTrace). I also think we should follow the issue's request and throw ArgumentOutOfRangeException rather than silently reset the value, and extend the same validation to ConnectionPolicy.MaxTcpConnectionsPerEndpoint so the lower-level API surface is covered too.

Comment thread Microsoft.Azure.Cosmos/src/CosmosClientOptions.cs
Comment thread Microsoft.Azure.Cosmos/src/CosmosClientOptions.cs Outdated
Comment thread changelog.md Outdated

#### Bugs Fixed

- [#5878](https://github.com/Azure/azure-cosmos-dotnet-v3/pull/5878) Direct: Fixes missing validation for `MaxTcpConnectionsPerEndpoint`. Values below the documented minimum (16) now emit a warning and reset to the default (null).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The link points to pull/5878, which is the issue, not this PR. Every other entry in this file references the PR number (e.g. [5815](https://github.com/Azure/azure-cosmos-dotnet-v3/pull/5815)). Please change to [#5887](https://github.com/Azure/azure-cosmos-dotnet-v3/pull/5887).

Also, once the behavior switches to throwing, the entry text should be updated — something like: Direct: Fixes missing validation for MaxTcpConnectionsPerEndpoint. Values below the documented minimum (16) now throw ArgumentOutOfRangeException.

Copy link
Copy Markdown

@orange-dot orange-dot May 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NaluTripician @yumnahussain I closed my PR with this approach. #5883
It also covers other issues mentioned in description.
New to this flow, so I closed my PR when i saw same issue is addressed in this PR. Just wanted to check if my PR was in right direction or was too broad? Thanks.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @orange-dot, no worries at all, welcome, and thanks for jumping in! Your direction was absolutely right, and it wasn't too broad.

For future reference: you don't need to close your PR when you see overlap. It's usually better to leave a comment on both PRs cross-linking them so the maintainers can decide which to take (or ask you to rebase/narrow scope). In this case the timing was unfortunate.

Thank you for taking a look at this issue and please keep contributing! We love when our community contributes :)

You can also take a look at the contributing guide or tag one of the contributors on your PR if you have future questions.

@yumnahussain yumnahussain force-pushed the users/t-yhussain/fix-max-tcp-connections-validation branch from 1069493 to 35d7a51 Compare May 19, 2026 18:16
…erEndpoint

Values below the documented minimum (16) now throw ArgumentOutOfRangeException.
Validation is enforced in both CosmosClientOptions (via ValidateDirectTCPSettings)
and ConnectionPolicy (via property setter) to cover all API surface levels.

Fixes Azure#5878

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@yumnahussain yumnahussain force-pushed the users/t-yhussain/fix-max-tcp-connections-validation branch from 35d7a51 to 9765aab Compare May 19, 2026 18:31
@NaluTripician
Copy link
Copy Markdown
Contributor

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Copy Markdown
Contributor

@NaluTripician NaluTripician left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall: a few comments inline. Verdict: small but non-trivial revisions before merge. Detail on the design issues is inline at each anchor.

One PR-level item that doesn't have a code anchor:

The PR description doesn't match the code. It still describes a warn-and-reset design ("Values below 16 trigger a DefaultTrace.TraceWarning and reset to null ... No exception is thrown — this is a non-breaking change") and lists test names (MaxTcpConnectionsPerEndpoint_BelowMinimum_ResetsToDefault, MaxTcpConnectionsPerEndpoint_AtOrAboveMinimum_Succeeds) that don't exist in the diff. The actual code throws ArgumentOutOfRangeException and the tests use PascalCase names without underscores. Since this will likely be squash-merged, the description becomes the commit message — please rewrite it so future readers and release-note tooling don't get the wrong picture.

(I'm fine with the throw itself — just want the description and changelog story to match what ships.)

throw new ArgumentException($"{settingName} requires {nameof(this.ConnectionMode)} to be set to {nameof(ConnectionMode.Direct)}");
}

if (this.maxTcpConnectionsPerEndpoint.HasValue && this.maxTcpConnectionsPerEndpoint.Value < 16)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This range check doesn't really belong in ValidateDirectTCPSettings() — that method's job is enforcing "Direct settings can't be used in Gateway mode", not value-range validation. Two practical consequences:

  1. The MaxTcpConnectionsPerEndpoint setter (around line 641) still does assign-then-validate: it writes the field and then calls this method. If a caller catches the new ArgumentOutOfRangeException, the field holds the invalid value, and any later assignment to ConnectionMode / MaxRequestsPerTcpConnection / etc. re-runs this method and throws again with paramName = "MaxTcpConnectionsPerEndpoint" — confusing to debug since the caller wasn't touching that property. The new ConnectionPolicy.MaxTcpConnectionsPerEndpoint setter you added in this PR does validate-then-assign correctly, so the PR is internally inconsistent.
  2. In Gateway mode the existing mode-mismatch ArgumentException fires before this new check, so a user who passes 5 in Gateway mode has to switch to Direct and re-run to ever see the "≥ 16" message.

Recommend moving the range check into the MaxTcpConnectionsPerEndpoint setter (validate before assigning) and dropping it from here. If you extract a small ValidateDirectTcpConnectionLimit(value, min, name) helper it can be shared with ConnectionPolicy.cs, which has the same check.

get => this.maxTcpConnectionsPerEndpoint;
set
{
if (value.HasValue && value.Value < 16)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two notes here:


Assert.ThrowsException<ArgumentOutOfRangeException>(() => options.MaxTcpConnectionsPerEndpoint = 15);
Assert.ThrowsException<ArgumentOutOfRangeException>(() => options.MaxTcpConnectionsPerEndpoint = 0);
Assert.ThrowsException<ArgumentOutOfRangeException>(() => options.MaxTcpConnectionsPerEndpoint = -1);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two test improvements:

  • Worth asserting ParamName and ActualValue on the thrown exception — those are public contract that tooling and customer catch when clauses rely on, and locking them in tests prevents a future refactor from silently shifting the param name (e.g., from nameof(this.MaxTcpConnectionsPerEndpoint) to nameof(value)):
    ArgumentOutOfRangeException ex = Assert.ThrowsException<ArgumentOutOfRangeException>(
        () => options.MaxTcpConnectionsPerEndpoint = 15);
    Assert.AreEqual("MaxTcpConnectionsPerEndpoint", ex.ParamName);
    Assert.AreEqual(15, ex.ActualValue);
  • ConnectionPolicy.MaxTcpConnectionsPerEndpoint got a new validating setter in this PR but has no coverage here. ConnectionPolicy is internal but reachable from this test assembly via InternalsVisibleTo. A parallel test would prevent the two validation sites from drifting.

Copy link
Copy Markdown
Member

@kushagraThapar kushagraThapar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this validation gap, @yumnahussain! Before we merge I'd love to align on direction, because I think we're crossing into behavioral breaking change territory and the throw is risky.

Concern: this is a breaking change in disguise

Today (pre-PR), a customer passing MaxTcpConnectionsPerEndpoint = 5 (or 0, or any value < 16) gets silent acceptance — the value flows through to the RNTBD transport and the app starts up successfully. After this PR, those same customers will get ArgumentOutOfRangeException at CosmosClient construction on upgrade.

That's an observable runtime behavior change on previously-working code, even though we're enforcing a documented contract. For an SDK setting that customers may have hardcoded years ago, throwing is the riskiest possible upgrade behavior — we'd be turning a successful client construction into a startup crash with no opt-out.

Proposal: clamp + warn instead of throw

Could you please consider changing the implementation to clamp out-of-range values to the supported boundary and emit a warning trace, rather than throwing? Something like:

private const int MinMaxTcpConnectionsPerEndpoint = 16;

if (value.HasValue && value.Value < MinMaxTcpConnectionsPerEndpoint)
{
    DefaultTrace.TraceWarning(
        "MaxTcpConnectionsPerEndpoint value {0} is below the supported minimum of {1}; clamping to {1}.",
        value.Value,
        MinMaxTcpConnectionsPerEndpoint);
    value = MinMaxTcpConnectionsPerEndpoint;
}

This way:

  • Existing customer apps that today pass 5 keep starting up on upgrade (no live-site impact from a customer SDK bump).
  • Customers still get pushed toward a sane configuration via the warning trace (which we can grep for in diagnostics).
  • We avoid having to file this under ### Breaking Changes and the migration risk that comes with that.
  • The same pattern applies symmetrically to both ConnectionPolicy and CosmosClientOptions (and could be extracted into a single helper / constant for MaxRequestsPerTcpConnection and IdleTcpConnectionTimeout later — though those are out of scope here).

Asks

  1. Could you please switch the implementation to clamp-and-warn in both ConnectionPolicy.cs and CosmosClientOptions.cs?
  2. Update the changelog entry wording so it reflects the new behavior (still under ### Bugs Fixed is fine if we clamp — see inline suggestion).
  3. Update the tests to assert clamping behavior + the warning trace, not the throw.
  4. Refresh the PR description — it still describes the abandoned warn-and-reset design and references test names that don't exist (e.g. MaxTcpConnectionsPerEndpoint_BelowMinimum_ResetsToDefault). The squash-merge will use the description as the commit message.

I want to hear your thinking too — happy to discuss alternatives if you feel strongly about the throw direction. Thanks!

Comment on lines +488 to +494
if (value.HasValue && value.Value < 16)
{
throw new ArgumentOutOfRangeException(
nameof(this.MaxTcpConnectionsPerEndpoint),
value.Value,
$"{nameof(this.MaxTcpConnectionsPerEndpoint)} must be greater than or equal to 16.");
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please switch this from throwing to clamp + warn? Something like:

if (value.HasValue && value.Value < 16)
{
    DefaultTrace.TraceWarning(
        "MaxTcpConnectionsPerEndpoint value {0} is below the supported minimum of 16; clamping to 16.",
        value.Value);
    value = 16;
}

this.maxTcpConnectionsPerEndpoint = value;

Throwing on values that were previously silently accepted is a runtime behavior break for existing customers on upgrade. Clamping preserves app startup while still steering customers toward a sane configuration via the diagnostic trace. (Same change should be applied symmetrically inside ValidateDirectTCPSettings in CosmosClientOptions.cs.)

Comment on lines +1333 to +1339
if (this.maxTcpConnectionsPerEndpoint.HasValue && this.maxTcpConnectionsPerEndpoint.Value < 16)
{
throw new ArgumentOutOfRangeException(
nameof(this.MaxTcpConnectionsPerEndpoint),
this.maxTcpConnectionsPerEndpoint.Value,
$"{nameof(this.MaxTcpConnectionsPerEndpoint)} must be greater than or equal to 16.");
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as on ConnectionPolicy.cs — could we please switch this to a clamp + warn instead of a throw?

if (this.maxTcpConnectionsPerEndpoint.HasValue && this.maxTcpConnectionsPerEndpoint.Value < 16)
{
    DefaultTrace.TraceWarning(
        "MaxTcpConnectionsPerEndpoint value {0} is below the supported minimum of 16; clamping to 16.",
        this.maxTcpConnectionsPerEndpoint.Value);
    this.maxTcpConnectionsPerEndpoint = 16;
}

Two additional thoughts on the placement:

  1. Putting the bound check inside ValidateDirectTCPSettings means it fires on every TCP-related setter call (idle timeout, open timeout, max requests, port reuse, connection mode), not just on MaxTcpConnectionsPerEndpoint. With a throw that produces misleading paramName errors on unrelated setters (e.g., opts.ConnectionMode = Direct after a bad MaxTcp would complain about MaxTcp). Switching to clamp-and-warn at this central location actually works out cleanly since there's no paramName confusion to worry about.
  2. Could we please factor 16 into an internal const int MinMaxTcpConnectionsPerEndpoint = 16; so the two enforcement sites stay in sync as a single source of truth?

Comment thread changelog.md Outdated

#### Bugs Fixed

- [#5887](https://github.com/Azure/azure-cosmos-dotnet-v3/pull/5887) Direct: Fixes missing validation for `MaxTcpConnectionsPerEndpoint`. Values below the documented minimum (16) now throw `ArgumentOutOfRangeException`.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we switch the implementation to clamp + warn (see inline comments on the source files), could you please rephrase this entry to reflect the new behavior? Suggested wording (keep under ### Bugs Fixed):

#5887 Direct: Fixes silent acceptance of MaxTcpConnectionsPerEndpoint values below the documented minimum (16). Such values are now clamped to 16 with a warning trace.

If we keep the throw as-is, this entry needs to move under ### Breaking Changes instead — existing customer apps passing values < 16 will start failing at CosmosClient construction on upgrade, which is a runtime behavior break.

Copy link
Copy Markdown
Member Author

@yumnahussain yumnahussain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Kushagra, thanks for the detailed feedback!

I originally went with ArgumentOutOfRangeException because that's how other identities (like GatewayModeMaxConnectionLimit) handle invalid values, so I was following the existing pattern for consistency. However, after reading through your reasoning I fully agree — this would be a breaking change for existing customers who may have hardcoded values below 16 that work today, and causing a startup crash on SDK upgrade is too risky.

The clamp + warn approach resolves the validation gap while keeping existing apps stable on upgrade. Customers will still be made aware (via the trace warning) that their previous value for MaxTcpConnectionsPerEndpoint was not valid according to documentation, but without the live-site risk of a throw.

I'll update the implementation to clamp-and-warn in both ConnectionPolicy.cs and CosmosClientOptions.cs, update the tests to assert clamping behavior + the warning trace, and refresh the PR description and changelog accordingly. Thanks for the extra context!

@yumnahussain yumnahussain changed the title Direct: Fixes missing minimum value validation for MaxTcpConnectionsPerEndpoint Direct: Fixes missing minimum value validation for MaxTcpConnectionsPerEndpoint and MaxRequestsPerTcpConnection May 20, 2026
@yumnahussain yumnahussain changed the title Direct: Fixes missing minimum value validation for MaxTcpConnectionsPerEndpoint and MaxRequestsPerTcpConnection Direct: Fixes missing minimum value validation for MaxTcpConnectionsPerEndpoint May 20, 2026
…erEndpoint

Values below the documented minimum (16) are now clamped to 16 with a
DefaultTrace.TraceWarning instead of throwing, avoiding a breaking change
for existing customers.

Validation is enforced in both ConnectionPolicy (via property setter)
and CosmosClientOptions (via ValidateDirectTCPSettings) with a shared
constant to prevent drift.

Fixes Azure#5878

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@yumnahussain yumnahussain force-pushed the users/t-yhussain/fix-max-tcp-connections-validation branch from 5dc8d10 to 09089e3 Compare May 20, 2026 22:47
private const int defaultMaxConcurrentFanoutRequests = 32;
private const int defaultMaxConcurrentConnectionLimit = 50;

internal const int MinimumMaxTcpConnectionsPerEndpoint = 16;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Rename to MinimumTcpConnectionsPerEndpoint ?

Copy link
Copy Markdown
Member

@kushagraThapar kushagraThapar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converting my prior review to a blocking one — same feedback stands. Please address the comments there before merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Direct: MaxTcpConnectionsPerEndpoint does not validate documented minimum value constraint

5 participants