Skip to content

Fix ESS credential staleness: proactively evict cert using token_not_after OID#6017

Closed
Robbie-Microsoft wants to merge 3 commits into
mainfrom
rginsburg/mtls_ess_cert_staleness_fix
Closed

Fix ESS credential staleness: proactively evict cert using token_not_after OID#6017
Robbie-Microsoft wants to merge 3 commits into
mainfrom
rginsburg/mtls_ess_cert_staleness_fix

Conversation

@Robbie-Microsoft

Copy link
Copy Markdown
Contributor

Problem

ESS-issued Managed Identity credentials have a split lifetime: X.509 NotAfter is ~14 days (for cert-based renewal), but Entra only accepts token requests for ~7 days via the token_not_after extension (OID 1.3.6.1.4.1.311.90.2.1). After day 7, TLS succeeds but Entra rejects with AADSTS1000901. MSAL had no recovery.

Fix

MSAL now reads OID 1.3.6.1.4.1.311.90.2.1 directly from the cached cert. If now >= token_not_after, the cert is proactively evicted and a fresh one is minted before any rejection occurs. Falls back to cert.NotAfter for certs without the OID.

Changes

  • MtlsCertificateCache: Replace hardcoded 7-day threshold with OID reader + DER parser (TryParseTokenNotAfterExtension).
  • ImdsV2ManagedIdentitySource: Remove AADSTS1000901 reactive-retry (proactive check makes it redundant); restore original SCHANNEL-only catch.
  • Tests: Updated to use OID-bearing certs; added OID parser and boundary tests.

Robbie-Microsoft and others added 2 commits May 19, 2026 12:09
…e token_not_after eviction

Reactive fix (ImdsV2ManagedIdentitySource):
- Add s_staleBindingAadstsCodes array seeded with AADSTS1000901; adding
  future codes is a one-line change
- Add IsStaleBindingAadstsError(): detects STS-layer cert rejection via
  colon-terminated code match to prevent false prefix matches
- Add TryGetStaleBindingReason(): unified SCHANNEL + AADSTS detection;
  ensures AADSTS-only exceptions are never mislabeled as SCHANNEL
- Add ExecuteWithMtlsSelfHealingAsync(): single retry wrapper replacing
  the former inline SCHANNEL-only catch; covers both failure modes;
  retry is structurally once-only (second throw propagates outside catch)
- Simplify AuthenticateAsync() to delegate to the new wrapper

Proactive fix (MtlsCertificateCache):
- Add CertTokenNotAfterThreshold (7 days) matching the known ESS
  token_not_after window relative to cert issuance (NotBefore)
- Add IsCertTokenExpiredForTokenRequests(): treats a cert as expired
  for token purposes when its age exceeds the threshold
- Apply the check at all three cache-read sites (memory pre-gate,
  memory post-gate, persistent) so a stale cert is never served from
  any layer without first minting a fresh one

Tests (PersistentCertificateStoreUnitTests):
- 6 tests for IsStaleBindingAadstsError true/false paths including
  prefix-collision guard and wrong-ErrorCode guard
- 6 tests for TryGetStaleBindingReason including mislabel regression
  (AADSTS exception with SocketException inner must not be SCHANNEL)
- 4 unit tests for IsCertTokenExpiredForTokenRequests (fresh, stale,
  boundary, null)
- 1 integration test GetOrCreateAsync_EvictsStaleToken_And_MintsNew
  verifying end-to-end eviction path

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…cert

Replace hardcoded 7-day threshold with OID-based proactive eviction.

The ESS-issued cert embeds OID 1.3.6.1.4.1.311.90.2.1 (token_not_after
extension) which Entra uses to enforce when the cert may no longer
acquire tokens -- even while the X.509 NotAfter is still valid for TLS
renewal purposes.

Changes:
- MtlsCertificateCache: Replace CertTokenNotAfterThreshold (7-day
  constant) with TokenNotAfterOid const. IsCertTokenExpiredForTokenRequests
  now reads the OID from cert.Extensions and compares its parsed
  GeneralizedTime/UTCTime value to DateTimeOffset.UtcNow. Falls back to
  cert.NotAfter when the OID is absent (old cert format or CSK certs
  where token validity == cert validity per IMDS PR #15749846).
  Adds TryParseTokenNotAfterExtension for DER parsing.

- ImdsV2ManagedIdentitySource: Remove AADSTS1000901 reactive-retry path
  (Fix 1). Restore original SCHANNEL-only catch in AuthenticateAsync.
  The proactive OID check in Fix 2 prevents the AADSTS1000901 case from
  ever being reached, making Fix 1 redundant.

- Tests: Remove IsStaleBindingAadstsError and TryGetStaleBindingReason
  test regions. Update IsCertTokenExpiredForTokenRequests tests to use
  certs bearing the token_not_after OID extension. Add OID parser tests.
  Replace CreateSelfSignedCertWithNotBefore helper with
  CreateSelfSignedCertWithTokenNotAfterOid.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 19, 2026 17:16
@Robbie-Microsoft Robbie-Microsoft requested a review from a team as a code owner May 19, 2026 17:16
- Split TryParseTokenNotAfterExtension_ReturnsFalse_For_InvalidData into
  three focused tests (NullData, EmptyData, NonTimeTag) with Arrange/Act/Assert
- Fix NonTimeTag test to use 4-byte buffer so it exercises the tag check
  rather than the minimum-length guard (rawData.Length < 4)
- Add TryParseTokenNotAfterExtension_Parses_SequenceWrappedGeneralizedTime to
  cover the SEQUENCE (0x30) wrapper path in the DER parser
- Fix misleading subject CN in IsCertTokenExpiredForTokenRequests_ReturnsTrue_For_CertWithoutOid_ExpiredNotAfter
- Move TryParseTokenNotAfterExtension and GetOrCreateAsync tests into their own
  named regions for clarity

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR addresses ESS-managed identity credential “staleness” by proactively evicting cached mTLS certificates once their token_not_after X.509 extension (OID 1.3.6.1.4.1.311.90.2.1) has elapsed, preventing Entra token request failures after the shorter token window closes.

Changes:

  • Add token_not_after OID parsing and use it as the cache-eviction boundary (fallback to X.509 NotAfter when missing/unparseable).
  • Update cache lookup logic to treat token-expired certs as cache misses and mint fresh credentials.
  • Add/adjust unit tests and add test app.config binding redirects for System.Memory on .NET Framework test projects.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/Microsoft.Identity.Test.Unit/app.config Adds System.Memory bindingRedirect for unit tests on .NET Framework.
tests/Microsoft.Identity.Test.Unit/ManagedIdentityTests/PersistentCertificateStoreUnitTests.cs Adds tests for token window expiry behavior and DER time parsing; adds helper to generate OID-bearing certs.
tests/Microsoft.Identity.Test.Integration.netfx/app.config Adds System.Memory bindingRedirect for integration tests on .NET Framework.
src/client/Microsoft.Identity.Client/ManagedIdentity/V2/MtlsCertificateCache.cs Implements token_not_after OID parsing and proactive eviction in MtlsBindingCache.
src/client/Microsoft.Identity.Client/ManagedIdentity/V2/ImdsV2ManagedIdentitySource.cs Removes redundant XML doc lines for SCHANNEL detection helper.

else
{
// OID absent or unparseable — fall back to the X.509 NotAfter as the token boundary.
tokenNotAfter = new DateTimeOffset(cert.NotAfter, TimeSpan.Zero);
Comment on lines +272 to +283
string timeStr = Encoding.ASCII.GetString(rawData, pos, len).TrimEnd('Z');

// DER GeneralizedTime: YYYYMMDDHHMMSS[.fff]
if (tag == 0x18)
{
return DateTimeOffset.TryParseExact(
timeStr,
new[] { "yyyyMMddHHmmss", "yyyyMMddHHmmss.fff" },
CultureInfo.InvariantCulture,
DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal,
out result);
}
Comment on lines +285 to +291
// DER UTCTime: YYMMDDHHMMSS
return DateTimeOffset.TryParseExact(
timeStr,
"yyMMddHHmmss",
CultureInfo.InvariantCulture,
DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal,
out result);
Comment on lines 122 to 127
logger.Verbose(() =>
$"[PersistentCert] mTLS binding cache HIT (persistent) for '{cacheKey}'.");

if (persistedEntry.Certificate.HasPrivateKey)
if (persistedEntry.Certificate.HasPrivateKey &&
!IsCertTokenExpiredForTokenRequests(persistedEntry.Certificate, logger))
{
@Robbie-Microsoft

Copy link
Copy Markdown
Contributor Author

Will be fixed on the IMDS side

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants