Management API: Fix OAuth client registration permanently skipped after transient failure (closes #22356) by AndyButland · Pull Request #22368 · umbraco/Umbraco-CMS

AndyButland · 2026-04-07T14:19:02Z

Description

This PR addresses the report in #22356 of being unable to access the Swagger endpoints before a restart after an unattended install.

I haven't been able to replicate the issue with normal use. That said, analysis uncovered a couple of places where the code could be more defensive, and these may be what triggers the problem in practice.

Background

PR #22020 introduced RuntimeLevel.Upgrading and moved unattended upgrades to a background service. During the Upgrading phase, the HTTP server is up and accepting requests while migrations run concurrently in the background.

BackOfficeAuthorizationInitializationMiddleware registers OAuth clients on the first backoffice request when RuntimeLevel >= Upgrade. The Upgrading level (4) satisfies this check, so the middleware attempts registration. However, if EnsureBackOfficeApplicationAsync fails during this window — for example due to database contention with the concurrent migration — two bugs in the middleware make the failure permanent (or at least until a restart):

Host cached before registration: The host was added to _knownHosts before calling EnsureBackOfficeApplicationAsync. On failure, the host remained cached and all subsequent requests skipped registration.
Semaphore leak on failure: The semaphore was released manually without try/finally. If EnsureBackOfficeApplicationAsync threw, the semaphore was never released, risking deadlocks for new hosts.
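
To make the two failure modes concrete, here is a minimal, hypothetical reduction of the pre-fix ordering. It is not the actual Umbraco middleware: `BuggyRegistrationGuard`, `EnsureRegisteredAsync`, `AvailableLockCount` and `BuggyDemo` are names invented for this sketch, and the `register` delegate stands in for EnsureBackOfficeApplicationAsync.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical reduction of the pre-fix ordering; not the actual Umbraco middleware.
public sealed class BuggyRegistrationGuard
{
    private readonly SemaphoreSlim _firstRequestLock = new(1, 1);
    private readonly ConcurrentDictionary<string, bool> _knownHosts = new();
    private readonly Func<string, Task> _register;

    public BuggyRegistrationGuard(Func<string, Task> register) => _register = register;

    // Exposed only so the demo can observe the leaked semaphore.
    public int AvailableLockCount => _firstRequestLock.CurrentCount;

    public async Task EnsureRegisteredAsync(string host)
    {
        // Bug 1: the host is cached before registration has succeeded.
        if (!_knownHosts.TryAdd(host, true))
        {
            return; // Later requests for this host land here, even after a failure.
        }

        await _firstRequestLock.WaitAsync();
        await _register(host);       // If this throws...
        _firstRequestLock.Release(); // Bug 2: ...this line never runs.
    }
}

public static class BuggyDemo
{
    // One failing request followed by one follow-up request for the same host.
    public static async Task<(int Attempts, int LockCount)> RunAsync()
    {
        var attempts = 0;
        var guard = new BuggyRegistrationGuard(_ =>
        {
            attempts++;
            throw new InvalidOperationException("Simulated transient failure");
        });

        try { await guard.EnsureRegisteredAsync("localhost"); }
        catch (InvalidOperationException) { /* first request fails */ }

        await guard.EnsureRegisteredAsync("localhost"); // skipped: host already cached

        return (attempts, guard.AvailableLockCount);
    }
}
```

In this sketch `BuggyDemo.RunAsync()` yields one registration attempt and a semaphore count of zero: the second request never retries, and any new host would wait on the semaphore indefinitely.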

Theory as to why this isn't always seen

Triggering the issue requires a backoffice request to arrive during the brief Upgrading window while the background migration service holds database resources. On a local dev machine with a fresh install and no external packages, migrations typically complete near-instantly, making the window very small. But it could be hit if a request is made during boot.

After a restart, RuntimeLevel resolves directly to Run with no concurrent migrations, so registration succeeds — which is why the restart workaround works.

How the fix resolves it

The fix is defensive — even if we can't always reproduce the exact race condition, the middleware should be resilient to transient failures:

Host cached only after success: _knownHosts is populated only after EnsureBackOfficeApplicationAsync completes without throwing. If it fails, the next request retries.
Semaphore in try/finally: The semaphore is always released, preventing deadlocks.
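
The two changes above can be sketched as follows. This is an illustrative reduction under stated assumptions rather than the code in this PR: `RegistrationGuard`, `EnsureRegisteredAsync`, `AvailableLockCount` and `FixedDemo` are hypothetical names, and the `register` delegate stands in for EnsureBackOfficeApplicationAsync.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical reduction of the fixed pattern; not the actual Umbraco middleware.
public sealed class RegistrationGuard
{
    private readonly SemaphoreSlim _firstRequestLock = new(1, 1);
    private readonly ConcurrentDictionary<string, bool> _knownHosts = new();
    private readonly Func<string, Task> _register;

    public RegistrationGuard(Func<string, Task> register) => _register = register;

    // Exposed only so the demo can observe that the semaphore was released.
    public int AvailableLockCount => _firstRequestLock.CurrentCount;

    public async Task EnsureRegisteredAsync(string host)
    {
        if (_knownHosts.ContainsKey(host))
        {
            return; // Registration already succeeded for this host.
        }

        await _firstRequestLock.WaitAsync();
        try
        {
            // Re-check inside the lock: another request may have registered meanwhile.
            if (_knownHosts.ContainsKey(host))
            {
                return;
            }

            await _register(host);          // May throw on a transient failure.
            _knownHosts.TryAdd(host, true); // Cache the host only after success.
        }
        finally
        {
            _firstRequestLock.Release(); // Always runs, even when registration throws.
        }
    }
}

public static class FixedDemo
{
    // Three requests for one host: the first fails, the second retries, the third is cached.
    public static async Task<(int Attempts, int LockCount)> RunAsync()
    {
        var attempts = 0;
        var guard = new RegistrationGuard(_ =>
        {
            attempts++;
            if (attempts == 1)
            {
                throw new InvalidOperationException("Simulated transient failure");
            }
            return Task.CompletedTask;
        });

        try { await guard.EnsureRegisteredAsync("localhost"); }
        catch (InvalidOperationException) { /* first request fails */ }

        await guard.EnsureRegisteredAsync("localhost"); // retried, succeeds
        await guard.EnsureRegisteredAsync("localhost"); // skipped, host now cached

        return (attempts, guard.AvailableLockCount);
    }
}
```

In this sketch `FixedDemo.RunAsync()` yields two registration attempts and a semaphore count of one: the failed attempt is retried on the next request, and the lock is available again afterwards.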

Test plan

Automated

Unit tests added to BackOfficeAuthorizationInitializationMiddlewareTests should pass.

Manual

As mentioned, I've not been able to replicate with normal use, but have used this method to deterministically reproduce the race condition: add a temporary debug hack that makes the first EnsureBackOfficeApplicationAsync call throw, then test with the old and new middleware code.

1. Add debug hack (temporary — revert before merging)

In src/Umbraco.Cms.Api.Management/Security/BackOfficeApplicationManager.cs, add a fail-once counter at the top of EnsureBackOfficeApplicationAsync:

// DEBUG: Remove before merging.
private static int _debugCallCount;

public async Task EnsureBackOfficeApplicationAsync(
    IEnumerable<Uri> backOfficeHosts, CancellationToken cancellationToken = default)
{
    // --- START DEBUG BLOCK (remove before merging) ---
    // Capture the incremented value so the logged attempt number belongs to this call,
    // even if another request increments the counter concurrently.
    var attempt = Interlocked.Increment(ref _debugCallCount);
    if (attempt == 1)
    {
        _logger.LogWarning(
            "=== DEBUG: Simulating transient failure (attempt #{Count}) ===",
            attempt);
        await Task.Delay(100, cancellationToken);
        throw new Exception("DEBUG: Simulated database contention during upgrade");
    }
    _logger.LogWarning(
        "=== DEBUG: Registration attempt #{Count} — proceeding normally ===",
        attempt);
    // --- END DEBUG BLOCK ---

    // ... rest of method unchanged ...

2a. Confirm the bug (old middleware code on `main`)

Check out the original InitializeBackOfficeAuthorizationOnceAsync from main

Delete the existing tokens and applications:

  DELETE FROM umbracoOpenIddictTokens;
  DELETE FROM umbracoOpenIddictAuthorizations;
  DELETE FROM umbracoOpenIddictApplications WHERE ClientId IN ('umbraco-swagger', 'umbraco-postman');

Run dotnet run --project src/Umbraco.Web.UI, navigate to https://localhost:44339/umbraco.

Log shows: Simulating transient failure (attempt #1)
Refresh — no second log line. The host was cached before the throw, so the middleware skips retry.

Try to authorize via the Swagger UI and the result will be:

error:invalid_request
error_description:The specified 'client_id' is invalid.
error_uri:https://documentation.openiddict.com/errors/ID2052

2b. Confirm the fix (this PR)

Switch to the fixed InitializeBackOfficeAuthorizationOnceAsync from this branch:

Delete the tokens and applications again.

Run dotnet run --project src/Umbraco.Web.UI, navigate to https://localhost:44339/umbraco.

Log shows two lines: Simulating transient failure (attempt #1) then Registration attempt #2 — proceeding normally.

Try to authorize via the Swagger UI and the result should be successful.

3. Cleanup

Remove the debug code.

Copilot

Pull request overview

This PR hardens BackOfficeAuthorizationInitializationMiddleware to ensure OAuth client registration isn’t permanently skipped after a transient failure during RuntimeLevel.Upgrading, and adds unit tests to prevent regressions.

Changes:

Release the first-request semaphore via try/finally to avoid deadlocks after exceptions.
Only add hosts to _knownHosts after successful EnsureBackOfficeApplicationAsync, so transient failures are retried.
Add unit tests covering retry-on-failure, caching-on-success, semaphore release, and runtime-level guard behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `src/Umbraco.Cms.Api.Management/Middleware/BackOfficeAuthorizationInitializationMiddleware.cs` | Makes registration retryable after failures and ensures semaphore is always released. |
| `tests/Umbraco.Tests.UnitTests/Umbraco.Cms.Api.Management/Middleware/BackOfficeAuthorizationInitializationMiddlewareTests.cs` | Adds regression tests for retry, caching, guard clause, and semaphore-release behavior. |

hifi-phil · 2026-04-08T14:10:59Z

I was seeing this on install when I created a new instance of Umbraco with the Clean starter kit installed and ran everything at once. It may be that this brief window is extended when trying to run everything at once: install, migration, site setup.

Migaroez

Haven't been able to reproduce the race condition outside of forcing it through code as documented. Will give it a go on a slower PC later today. Either way, I don't think the code changes will do any harm, and they definitely improve code health and startup behaviour.

Prevent OAuth client registration from being permanently skipped after transient failure.

ea52063

Copilot AI review requested due to automatic review settings April 7, 2026 14:19

Copilot started reviewing on behalf of AndyButland April 7, 2026 14:19

AndyButland marked this pull request as draft April 7, 2026 14:21

Copilot AI reviewed Apr 7, 2026

AndyButland added 2 commits April 8, 2026 09:14

Merge branch 'main' into v17/bugfix/22356-oauth-client-registrations

3a98493

Addressed code review feedback.

d7d75c2

AndyButland marked this pull request as ready for review April 8, 2026 13:50

Migaroez approved these changes Apr 9, 2026

AndyButland merged commit 8d25312 into main Apr 9, 2026
26 of 27 checks passed

AndyButland deleted the v17/bugfix/22356-oauth-client-registrations branch April 9, 2026 12:00

AndyButland added the release/17.3.2 label Apr 9, 2026

AndyButland mentioned this pull request Apr 9, 2026

Umbraco 17.3 regression: OAuth clients (umbraco-swagger, umbraco-postman) not registered after unattended install #22356

Closed

AndyButland merged 3 commits into main from v17/bugfix/22356-oauth-client-registrations
