Fix KnownInstanceMetadataIsUpToDateAsync: guard HTTP status codes on discovery endpoints#6048
Merged
Merged
Conversation
… for discovery endpoints
Copilot
AI
changed the title
[WIP] Fix instance discovery integration test for status code check
Fix KnownInstanceMetadataIsUpToDateAsync: guard HTTP status codes on discovery endpoints
Jun 2, 2026
Contributor
There was a problem hiding this comment.
Pull request overview
This PR hardens the KnownInstanceMetadataIsUpToDateAsync integration test by adding HTTP status-code guards before attempting to JSON-deserialize discovery responses, preventing cryptic JSON parsing failures when an endpoint returns non-JSON error content (notably login.windows-ppe.net in restricted environments).
Changes:
- Read discovery response bodies and check
IsSuccessStatusCodebefore deserialization, surfacing clear failure diagnostics (status code, reason phrase, truncated body). - Treat PPE discovery endpoint failures as environmental by marking the test
Assert.Inconclusive, while keeping the public endpoint as a hard failure viaAssert.Fail. - Add a small
Truncatehelper to keep diagnostic output readable.
RyAuld
approved these changes
Jun 2, 2026
neha-bhargava
approved these changes
Jun 2, 2026
This was referenced Jun 5, 2026
Open
This was referenced Jun 8, 2026
Open
Merged
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
KnownInstanceMetadataIsUpToDateAsyncwas throwing a crypticSystem.Text.Json.JsonException: 'F' is an invalid start of a valuein CI becauselogin.windows-ppe.net(an internal/corp-only endpoint) returns403 Forbiddenfrom restricted agents — and the test was feeding that non-JSON body directly intoJsonHelper.DeserializeFromJsonwithout checking the HTTP status first.Changes proposed in this request
IsSuccessStatusCode. On failure, surface a clear message with the status code, reason phrase, and a truncated body snippet instead of letting the JSON parser choke.Assert.Inconclusive—login.windows-ppe.netis an environmental precondition (corp-only), not a product bug. On non-success, the test is now marked inconclusive with a descriptive message and exits before the assertion logic, so public-cloud coverage still ran.Assert.Fail—login.microsoftonline.commust be reachable; non-success there is a hard failure with the same diagnostic message format.Truncatehelper — private staticTruncate(string s, int maxLength = 500)handles null/empty safely; used in both error messages to keep them readable.No product code changed. The sovereign-cloud filter,
InstanceDiscoveryMetadataEntryComparer, andKnownMetadataProviderare untouched. When both endpoints succeed, test behavior is identical to before.Testing
Existing test logic is preserved end-to-end when both endpoints are reachable. The new paths (inconclusive/fail) are exercised when the respective endpoint returns non-2xx.
Performance impact
None.
Documentation
Original prompt
Problem
The integration test
Microsoft.Identity.Test.Integration.HeadlessTests.InstanceDiscoveryIntegrationTests.KnownInstanceMetadataIsUpToDateAsyncis failing with a confusing error:Root cause
The test located at
tests/Microsoft.Identity.Test.Integration.netcore/HeadlessTests/InstanceDiscoveryIntegrationTests.csmakes two raw HTTP GET calls and feeds the response body straight intoJsonHelper.DeserializeFromJson<InstanceDiscoveryResponse>without checking the HTTP status code:https://login.microsoftonline.com/common/discovery/instance?...(public)https://login.windows-ppe.net/common/discovery/instance?...(PPE / pre-production)The PPE endpoint
login.windows-ppe.netis internal/restricted. From CI agents that aren't on the corp network / allowlist, it routinely returns a non-JSON error response (e.g. HTTP 403 with a body starting withForbidden). That body's first byteFis whatSystem.Text.Jsonis choking on atLineNumber: 0 | BytePositionInLine: 0.This is environmental, not a regression in the recent sovereign-cloud filtering (Bleu/Delos/GovSG) — that code runs strictly after the deserialize call that threw.
Required changes
Edit
tests/Microsoft.Identity.Test.Integration.netcore/HeadlessTests/InstanceDiscoveryIntegrationTests.cs, methodKnownInstanceMetadataIsUpToDateAsync:Guard each HTTP call with a status-code check. If a discovery endpoint returns a non-success status, surface a clear message including the status code, reason phrase, and a snippet of the body, instead of letting
System.Text.Jsonthrow a cryptic'F' is an invalid start of a valueerror.Treat PPE unreachability as inconclusive, not a failure. Because
login.windows-ppe.netis an environmental precondition (not a product bug), useAssert.Inconclusive(...)when the PPE call fails. Keep failures on the publiclogin.microsoftonline.comendpoint as a hardAssert.Fail(or just let it fail), since that endpoint must be reachable.Decouple PPE from the public-cloud assertion. If only PPE is unreachable, still verify the public metadata. Concretely: only concat
actualPpeMetadatainto the comparison if the PPE call succeeded; otherwise compare against just the public metadata (and note in the inconclusive message that PPE was skipped), OR split into a clean early-return after the inconclusive call. The simpler approach is: on PPE failure, callAssert.Inconclusive(which short-circuits the test) so the public-cloud half still ran the network call but we don't false-fail. Pick whichever is cleanest while keeping the existing assertion logic intact when both endpoints succeed.Suggested shape (adapt as needed for code style in the file):
Where
Truncateis a small local helper (or inlines.Substring(0, Math.Min(500, s.Length))) so the assertion message stays readable.Do not change the existing
sovereignCloudsNotInDiscoveryfilter logic, theInstanceDiscoveryMetadataEntryComparer, orKnownMetadataProvider. The fix is scoped strictly to the HTTP/deserialization handling at the top of the test method.Properly dispose /
usingtheHttpClientandHttpResponseMessageinstances if the surrounding code style does so; otherwise leave allocation patterns as-is to minimize diff.Acceptance criteria
login.windows-ppe.netreturns a non-success status (e.g. 403), the...This pull request was created from Copilot chat.