feat: add control plane for multi-proxy worker management #24217

Merged
ryan-crabbe merged 5 commits into main from litellm_ryan_march_18 on Mar 20, 2026

@ryan-crabbe ryan-crabbe commented Mar 20, 2026

Type

🆕 New Feature

Circle CI

https://app.circleci.com/pipelines/github/BerriAI/litellm?branch=litellm_ryan_march_18

Changes

Adds a control plane capability that enables a central admin instance to manage multiple regional worker proxies from a single UI.

Backend:

  • Worker registry loaded from YAML config (worker_id, name, url)
  • /.well-known/litellm-ui-config exposes is_control_plane and workers list
  • /v3/login + /v3/login/exchange: opaque code exchange for cross-origin username/password auth (JWT never in URL/logs, single-use 60s TTL)
  • SSO cookie handoff with return_to → opaque code → exchange
  • _validate_return_to: full origin validation (scheme+hostname+port)
  • Startup warning when control_plane_url set without Redis
  • Both /v3 endpoints gated behind control_plane_url config

Frontend:

  • Worker selector dropdown on login page (gated behind is_control_plane)
  • Cross-origin SSO code exchange handling on callback
  • switchToWorkerUrl: localStorage-persisted worker URL for API calls
  • useWorker hook: shared worker state management
  • WorkerDropdown in navbar for switching workers
  • Logout/switch clears worker state from localStorage

Tests:
- 7 tests for /v3/login + /v3/login/exchange
- 10 tests for _validate_return_to
- 2 tests for control plane discovery endpoint
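The origin check described for _validate_return_to (scheme + hostname + port) could look roughly like the sketch below. This is not the PR's implementation: `origin_of` and `is_allowed_return_to` are hypothetical names, and comparing against a single `control_plane_url` is an assumption.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> Optional[Tuple[str, str, int]]:
    """Return (scheme, hostname, port) for an absolute http(s) URL, else None."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return None
    scheme = parts.scheme.lower()
    if scheme not in DEFAULT_PORTS or not parts.hostname:
        return None
    # Fill in the scheme's default port so ":443" and no port compare equal.
    if port is None:
        port = DEFAULT_PORTS[scheme]
    return (scheme, parts.hostname.lower(), port)


def is_allowed_return_to(return_to: str, control_plane_url: str) -> bool:
    """Accept return_to only if its origin equals the control plane's origin."""
    expected = origin_of(control_plane_url)
    return expected is not None and origin_of(return_to) == expected
```

Comparing parsed origins rather than string prefixes is what rejects look-alike hosts such as https://cp.example.com.evil.test.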

vercel bot commented Mar 20, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
litellm Ready Ready Preview, Comment Mar 20, 2026 9:00pm

codspeed-hq bot commented Mar 20, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing litellm_ryan_march_18 (541863a) with main (50f88c8)

// Exchange it for the JWT via the worker's /v3/login/exchange endpoint.
const params = new URLSearchParams(window.location.search);
const ssoCode = params.get("code");
if (ssoCode) {

Check failure

Code scanning / CodeQL

User-controlled bypass of security check High

This condition guards a sensitive action, but a user-provided value controls it.

greptile-apps bot commented Mar 20, 2026

Greptile Summary

This PR introduces a control-plane architecture for LiteLLM that lets a central admin instance manage multiple regional worker proxies from a single UI. The backend adds a worker registry (YAML-configured), exposes is_control_plane / workers via the UI discovery endpoint, and implements two new endpoints — POST /v3/login (opaque-code issue) and POST /v3/login/exchange (single-use code redemption) — to support cross-origin username/password and SSO login flows without exposing JWTs in URLs or browser history. The frontend adds a worker selector on the login page, useWorker hook for shared worker state, a WorkerDropdown in the navbar, and switchToWorkerUrl / exchangeLoginCode helpers in networking.tsx.

Key concerns identified (some new, several previously discussed):

  • SSO code exchange fires on all instances (LoginPage.tsx line 50): the ?code= exchange path is not gated on uiConfig?.is_control_plane, so a crafted URL with ?code= on a non-control-plane instance silently fails and leaves the user stuck on the loading screen (no .catch() handler, which was separately flagged).
  • KeyError on malformed cache entry (proxy_server.py login_v3_exchange): cached_data["token"] and cached_data["redirect_url"] are accessed directly; if the dict exists but lacks these keys the handler raises KeyError which surfaces as a confusing 500 rather than 401.
  • Non-JSON request body returns 500 (proxy_server.py login_v3 and login_v3_exchange): unlike /v2/login, invalid JSON bodies aren't caught explicitly and return 500 instead of 400.
  • Previously flagged (do not re-raise): TOCTOU race on code exchange, missing cookie Secure attribute, str(None) credential coercion, missing credentials: "include" in exchangeLoginCode, hardcoded localStorage key strings, hardcoded /ui/login path, and unauthenticated worker URL exposure in the discovery endpoint.

Confidence Score: 2/5

  • Not safe to merge yet — several P1 issues in the authentication critical path remain open across this review and the previous thread.
  • The PR introduces a meaningful new security surface (cross-origin auth with opaque code exchange, cookie-based SSO handoff, worker URL persisted in localStorage). Several P1 issues exist: the TOCTOU race that breaks the single-use guarantee under concurrent requests, missing cookie security attributes (Secure, httponly, samesite) on the token cookie, str(None) silently producing valid-looking credentials, missing .catch() leaving users stuck, the SSO exchange firing on non-control-plane instances, and a KeyError that converts a 401 scenario into a 500. Until these are addressed the auth flow is fragile for multi-pod Redis deployments and the cookie hardening is insufficient for HTTPS production deployments.
  • litellm/proxy/proxy_server.py (new v3 endpoints), litellm/proxy/management_endpoints/ui_sso.py (SSO redirect and cookie), and ui/litellm-dashboard/src/app/login/LoginPage.tsx (code exchange and error handling).

Important Files Changed

Filename Overview
litellm/proxy/proxy_server.py Adds /v3/login and /v3/login/exchange endpoints for control-plane cross-origin auth. Multiple issues flagged (including in previous threads): TOCTOU on code deletion, missing cookie security attributes, str(None) credential coercion, non-JSON body returning 500 instead of 400, and unguarded dict key access on cached data that can produce a KeyError → 500.
litellm/proxy/management_endpoints/ui_sso.py Extends SSO flow with return_to support for cross-origin worker redirects. _validate_return_to is well-implemented (origin-exact, case-insensitive, default-port normalization). Main concern (flagged in previous thread): litellm_cp_return_to cookie is missing the Secure attribute for HTTPS deployments.
litellm/proxy/discovery_endpoints/ui_discovery_endpoints.py Adds is_control_plane and workers fields to /.well-known/litellm-ui-config. Logic is correct — is_control_plane derived from a non-empty worker registry. Worker URLs are exposed unauthenticated (discussed in previous thread).
litellm/types/proxy/control_plane_endpoints.py New WorkerRegistryEntry Pydantic model with URL scheme validation. Clean and correct.
ui/litellm-dashboard/src/app/login/LoginPage.tsx Most complex frontend change. Adds worker selector, SSO code-exchange flow, and worker-switch token clearing. Multiple issues flagged across previous thread and this review: missing .catch() on exchangeLoginCode, SSO code exchange not gated on is_control_plane, hardcoded localStorage key, hardcoded /ui/login path, and uiConfig load-failure leaving user stuck.
ui/litellm-dashboard/src/components/networking.tsx Adds switchToWorkerUrl, exchangeLoginCode, and localStorage-backed proxyBaseUrl initialization. Worker URL is validated for HTTP/HTTPS scheme before storing. Known issues (previous thread): exchangeLoginCode missing credentials: "include" for cross-origin cookie delivery, and WORKER_URL_KEY constant not exported for use in other files.
ui/litellm-dashboard/src/hooks/useWorker.ts New hook for worker state management. Clean implementation — initializes from localStorage, syncs proxyBaseUrl on mount via useEffect, and provides selectWorker/disconnectFromWorker callbacks. Note that each component instance gets its own state; synchronization happens via localStorage rather than shared React state.
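For readers unfamiliar with the registry shape, here is a minimal stand-in for a worker entry with URL scheme validation. The PR uses a Pydantic model (WorkerRegistryEntry); this sketch substitutes a stdlib dataclass so it runs without dependencies, and `WorkerEntry` and `load_workers` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List
from urllib.parse import urlsplit


@dataclass(frozen=True)
class WorkerEntry:
    """Stand-in for a registry entry with the fields named in the PR."""

    worker_id: str
    name: str
    url: str

    def __post_init__(self) -> None:
        # Reject anything that is not an absolute http(s) URL.
        parts = urlsplit(self.url)
        if parts.scheme not in ("http", "https") or not parts.hostname:
            raise ValueError(
                f"worker {self.worker_id!r}: url must be an absolute http(s) URL"
            )


def load_workers(raw_entries: List[Dict[str, str]]) -> List[WorkerEntry]:
    """Build validated entries from already-parsed config dictionaries."""
    return [WorkerEntry(**entry) for entry in raw_entries]
```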

Sequence Diagram

sequenceDiagram
    participant CP as Control Plane UI
    participant Worker as Worker Proxy
    participant SSO as SSO Provider

    rect rgb(230, 240, 255)
        CP->>CP: switchToWorkerUrl worker.url
        CP->>Worker: POST /v3/login username+password
        Worker->>Worker: authenticate and generate JWT
        Worker->>Worker: cache JWT at login_code:CODE TTL 60s
        Worker-->>CP: response with code and expires_in
        CP->>Worker: POST /v3/login/exchange with code
        Worker->>Worker: GET then DELETE cache entry
        Worker-->>CP: response with token and redirect_url
        CP->>CP: set token cookie and navigate to dashboard
    end

    rect rgb(255, 240, 230)
        CP->>CP: store worker URL in localStorage
        CP->>Worker: GET /sso/key/generate?return_to=CP_URL
        Worker->>Worker: validate return_to origin
        Worker->>Worker: set litellm_cp_return_to cookie
        Worker-->>CP: redirect to SSO Provider
        CP->>SSO: OAuth2 authorization request
        SSO-->>Worker: callback with auth code
        Worker->>Worker: exchange auth code for user info
        Worker->>Worker: cache JWT at login_code:CODE
        Worker->>Worker: delete return_to cookie
        Worker-->>CP: redirect to CP_URL with login code
        CP->>Worker: POST /v3/login/exchange with code
        Worker-->>CP: response with token
        CP->>CP: set token cookie and navigate to dashboard
    end
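
The issue-then-redeem steps in the diagram can be sketched with an in-memory store. The `login_code:` key prefix and the 60-second TTL come from the PR; `LoginCodeStore`, `issue`, `redeem`, and the injectable clock are hypothetical, and a real deployment would use the proxy's cache or Redis instead of a dict.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple

LOGIN_CODE_TTL_SECONDS = 60


class LoginCodeStore:
    """In-memory illustration of single-use, short-lived login codes."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def issue(self, token: str, redirect_url: str) -> str:
        """Store the JWT under a random opaque code and return the code."""
        code = secrets.token_urlsafe(32)
        expires_at = self._clock() + LOGIN_CODE_TTL_SECONDS
        payload = {"token": token, "redirect_url": redirect_url}
        self._entries[f"login_code:{code}"] = (expires_at, payload)
        return code

    def redeem(self, code: str) -> Optional[dict]:
        """Consume the code; return the payload only if it has not expired."""
        entry = self._entries.pop(f"login_code:{code}", None)
        if entry is None:
            return None
        expires_at, payload = entry
        return payload if self._clock() < expires_at else None
```

Only the opaque code ever travels in a URL; the JWT itself is returned in the exchange response body.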

Comments Outside Diff (2)

  1. litellm/proxy/proxy_server.py, lines 266-268

    P1 Non-JSON body returns 500 instead of 400

    await request.json() raises a json.JSONDecodeError when the body is not valid JSON, which is caught by the outer except Exception as e handler and re-raised as HTTP_500_INTERNAL_SERVER_ERROR. The equivalent /v2/login path already has a test (test_login_v2_returns_json_on_invalid_json_body) covering the 400-on-bad-body case, but the v3 endpoint has no such guard.

    Consider catching the JSON parse failure explicitly and raising a ProxyException with HTTP_400_BAD_REQUEST. The same applies to the body = await request.json() line inside login_v3_exchange.

  2. litellm/proxy/proxy_server.py, lines 383-391

    P1 Unguarded dict key access on cached data can raise KeyError

    cached_data["token"] and cached_data["redirect_url"] are accessed directly after only checking not cached_data or not isinstance(cached_data, dict). If cached_data is a non-empty dict but lacks one of those keys — for example if a different part of the codebase wrote something to a login_code: prefixed Redis key — a KeyError is raised, caught by the outer except Exception, and surfaced as a confusing 500 Internal Server Error rather than the more appropriate 401 Unauthorized.

    Prefer using .get() on both keys followed by an explicit None check, raising a ProxyException with HTTP_401_UNAUTHORIZED if either value is missing. This keeps the error response semantically consistent with the "invalid or expired code" case just above.
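
Both suggestions above can be sketched in a few lines. `ApiError` is a hypothetical stand-in for the proxy's ProxyException, and `parse_json_object` and `read_login_entry` are illustrative helper names, not functions from the codebase.

```python
import json
from typing import Any, Tuple


class ApiError(Exception):
    """Hypothetical stand-in for ProxyException, carrying an HTTP status."""

    def __init__(self, status_code: int, message: str) -> None:
        super().__init__(message)
        self.status_code = status_code


def parse_json_object(raw_body: bytes) -> dict:
    """Map an unparseable or non-object body to 400 instead of a generic 500."""
    try:
        body = json.loads(raw_body)
    except (json.JSONDecodeError, UnicodeDecodeError):
        raise ApiError(400, "Request body must be valid JSON") from None
    if not isinstance(body, dict):
        raise ApiError(400, "Request body must be a JSON object")
    return body


def read_login_entry(cached_data: Any) -> Tuple[str, str]:
    """Treat a missing or malformed cache entry as an invalid code (401)."""
    if not isinstance(cached_data, dict):
        raise ApiError(401, "Invalid or expired login code")
    token = cached_data.get("token")
    redirect_url = cached_data.get("redirect_url")
    if token is None or redirect_url is None:
        raise ApiError(401, "Invalid or expired login code")
    return token, redirect_url
```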

Last reviewed commit: "Merge branch 'litell..."

Comment on lines +50 to +59
if (ssoCode) {
  const workerUrl = localStorage.getItem("litellm_worker_url");
  exchangeLoginCode(ssoCode, workerUrl).then(() => {
    params.delete("code");
    const cleanSearch = params.toString();
    window.history.replaceState(null, "", window.location.pathname + (cleanSearch ? `?${cleanSearch}` : ""));
    router.replace("/ui/?login=success");
  });
  return;
}

P1 Unhandled rejection leaves user stuck on loading screen

exchangeLoginCode(...).then(...) has no .catch() / rejection handler. If the exchange fails (e.g., the 60 s TTL expired, Redis missed the code, or the network request errored), the promise rejects silently. isLoading stays true (the setIsLoading(false) is never reached), and the user sees an infinite <LoadingScreen /> with no error message and no way to recover.

Suggested change
- if (ssoCode) {
-   const workerUrl = localStorage.getItem("litellm_worker_url");
-   exchangeLoginCode(ssoCode, workerUrl).then(() => {
-     params.delete("code");
-     const cleanSearch = params.toString();
-     window.history.replaceState(null, "", window.location.pathname + (cleanSearch ? `?${cleanSearch}` : ""));
-     router.replace("/ui/?login=success");
-   });
-   return;
- }
+ if (ssoCode) {
+   const workerUrl = localStorage.getItem("litellm_worker_url");
+   exchangeLoginCode(ssoCode, workerUrl)
+     .then(() => {
+       params.delete("code");
+       const cleanSearch = params.toString();
+       window.history.replaceState(null, "", window.location.pathname + (cleanSearch ? `?${cleanSearch}` : ""));
+       router.replace("/ui/?login=success");
+     })
+     .catch(() => {
+       setIsLoading(false);
+     });
+   return;
+ }

Comment on lines +11244 to +11256
if not cached_data or not isinstance(cached_data, dict):
    raise ProxyException(
        message="Invalid or expired login code",
        type=ProxyErrorTypes.auth_error,
        param="code",
        code=status.HTTP_401_UNAUTHORIZED,
    )

# Single-use: delete immediately
if redis_usage_cache is not None:
    await redis_usage_cache.async_delete_cache(key=cache_key)
else:
    await user_api_key_cache.async_delete_cache(key=cache_key)

P1 TOCTOU race: single-use code is not atomically consumed

The get → delete sequence is not atomic. Two concurrent POST /v3/login/exchange requests with the same code can both pass the if not cached_data check before either one executes the delete — meaning both callers receive a valid token and the single-use guarantee is defeated.

On the Redis-backed path (the critical multi-pod case), the fix is to use an atomic GETDEL (or a Lua transaction) so the read and removal are a single round-trip. On the in-memory user_api_key_cache path the risk is lower (single-threaded asyncio event loop) but still theoretically exploitable if the cache implementation yields control between the two awaits.

Until an atomic helper is available, ensure at minimum that no additional await expressions appear between the async_get_cache and async_delete_cache calls, to keep the window as small as possible.
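
To make the race concrete, the sketch below models a cache whose get and delete are separate awaits and compares it with a combined read-and-remove. `RacyCache` and its methods are hypothetical; on Redis the combined step would correspond to the GETDEL command mentioned above, not to this dict.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class RacyCache:
    """Mimics an async cache; each call yields once like a network round-trip."""

    def __init__(self) -> None:
        self.data: dict = {}

    async def get(self, key: str) -> Optional[dict]:
        await asyncio.sleep(0)
        return self.data.get(key)

    async def delete(self, key: str) -> None:
        await asyncio.sleep(0)
        self.data.pop(key, None)

    async def get_and_delete(self, key: str) -> Optional[dict]:
        await asyncio.sleep(0)
        return self.data.pop(key, None)  # read and removal in one step


async def exchange_racy(cache: RacyCache, key: str) -> Optional[dict]:
    """Check-then-delete: another task can read the entry in between."""
    value = await cache.get(key)
    if value is not None:
        await cache.delete(key)
    return value


async def exchange_atomic(cache: RacyCache, key: str) -> Optional[dict]:
    """Single step: at most one caller can observe the entry."""
    return await cache.get_and_delete(key)


async def count_winners(
    exchange: Callable[[RacyCache, str], Awaitable[Optional[dict]]]
) -> int:
    """Run two concurrent exchanges of one code; count how many get a token."""
    cache = RacyCache()
    cache.data["login_code:abc"] = {"token": "jwt"}
    results = await asyncio.gather(
        exchange(cache, "login_code:abc"), exchange(cache, "login_code:abc")
    )
    return sum(result is not None for result in results)
```

Running `asyncio.run(count_winners(exchange_racy))` shows both callers receiving the token, while the atomic variant lets only one through.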

Comment on lines +406 to +414
if return_to is not None and sso_redirect is not None:
    SSOAuthenticationHandler._validate_return_to(return_to)
    sso_redirect.set_cookie(
        key="litellm_cp_return_to",
        value=return_to,
        max_age=600,
        httponly=True,
        samesite="lax",
    )

P1 litellm_cp_return_to cookie is missing the Secure attribute

The cookie is created with httponly=True and samesite="lax" but without secure=True. Control-plane deployments run over HTTPS; without the Secure flag the browser will also transmit the cookie over plain HTTP connections, which could expose the return_to value to network interception.

Adding secure=True (or conditionally, only when control_plane_url starts with https://) brings this in line with standard security hardening for cookies used in a security-sensitive redirect flow.
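
One way to express the conditional variant is to compute the cookie arguments in a small helper. `return_to_cookie_kwargs` is a hypothetical name, and deriving the flag from `control_plane_url` follows the suggestion above rather than anything confirmed in the codebase.

```python
from urllib.parse import urlsplit


def return_to_cookie_kwargs(return_to: str, control_plane_url: str) -> dict:
    """Build set_cookie arguments for the return_to cookie."""
    return {
        "key": "litellm_cp_return_to",
        "value": return_to,
        "max_age": 600,
        "httponly": True,
        "samesite": "lax",
        # Mark the cookie Secure only for HTTPS control planes, so
        # plain-HTTP local setups keep working.
        "secure": urlsplit(control_plane_url).scheme.lower() == "https",
    }
```

The result could then be passed as `sso_redirect.set_cookie(**return_to_cookie_kwargs(return_to, control_plane_url))`.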

const params = new URLSearchParams(window.location.search);
const ssoCode = params.get("code");
if (ssoCode) {
  const workerUrl = localStorage.getItem("litellm_worker_url");

P2 Hardcoded localStorage key duplicates a constant defined in networking.tsx

"litellm_worker_url" is already defined as WORKER_URL_KEY in networking.tsx (line 89). Reading it directly here as a magic string means that if the key is ever renamed in networking.tsx, this lookup will silently break the SSO code-exchange flow.

Consider exporting WORKER_URL_KEY from networking.tsx and importing it here, or exposing a dedicated getStoredWorkerUrl() helper that centralises the localStorage access.

},
status_code=status.HTTP_200_OK,
)
json_response.set_cookie(key="token", value=cached_data["token"])

P2 Token cookie set without httponly, secure, or samesite attributes

json_response.set_cookie(...) uses all defaults here, which means the cookie is readable by JavaScript, has no SameSite policy, and is sent over HTTP. For a JWT that grants admin-UI access, the cookie should include httponly=True, samesite="lax", and secure=True for HTTPS deployments.

Also note that in the cross-origin case (control-plane frontend fetching the worker's exchange endpoint) this server-set cookie will be blocked by modern browsers anyway because a cross-origin Set-Cookie without SameSite=None; Secure is rejected. The primary token delivery path (setting document.cookie from the response body in networking.tsx) still works — but hardening the server-side cookie is good defence-in-depth for same-origin calls.

@yuneng-jiang yuneng-jiang self-requested a review March 20, 2026 16:55
Comment on lines +9143 to +9151
    throw new Error(deriveErrorMessage(errorData));
  }

  const exchangeData: LoginResponse = await exchangeResponse.json();
  if (exchangeData.token) {
    document.cookie = `token=${exchangeData.token}; path=/; SameSite=Lax`;
  }
  return exchangeData;
}

P2 Cross-origin exchangeLoginCode missing credentials: "include"

The loginCall exchange path (used in the direct username/password flow) passes credentials: "include" when calling /v3/login/exchange. The exchangeLoginCode function (used in the SSO callback flow) does not, which creates an inconsistency. While the JWT is extracted from the response body in both paths — making the omission non-blocking in most cases — the browser won't send or store cookies for the worker domain without credentials: "include". This diverges from the same endpoint's behaviour elsewhere and may cause subtle issues for consumers that rely on the server-set token cookie on the worker origin.

Suggested change
-     throw new Error(deriveErrorMessage(errorData));
-   }
-   const exchangeData: LoginResponse = await exchangeResponse.json();
-   if (exchangeData.token) {
-     document.cookie = `token=${exchangeData.token}; path=/; SameSite=Lax`;
-   }
-   return exchangeData;
- }
+ const response = await fetch(`${base}/v3/login/exchange`, {
+   method: "POST",
+   body: JSON.stringify({ code }),
+   credentials: "include",
+   headers: { "Content-Type": "application/json" },
+ });

Comment on lines +11270 to +11272
verbose_proxy_logger.exception(
"litellm.proxy.proxy_server.login_v3_exchange(): Exception occurred - {}".format(
str(e)

P1 CORS headers required for cross-origin worker /v3/login and /v3/login/exchange calls

The control-plane UI (e.g. https://cp.example.com) calls these endpoints on a worker (e.g. https://worker1.example.com), which is a cross-origin request. Because these POSTs send a JSON body with Content-Type: application/json, browsers issue a CORS preflight (OPTIONS) first. If the worker's response lacks an Access-Control-Allow-Origin header matching the control-plane origin, the preflight fails and the browser never sends the POST, so the entire cross-origin login flow fails with nothing visible to the user beyond a CORS message in the console.

If LiteLLM's existing global CORS middleware is already permissive enough this may "just work", but the PR doesn't document or verify this. Consider:

  • Adding an explicit control_plane_url → Access-Control-Allow-Origin mapping for these two endpoints, OR
  • Documenting that the deployer must configure the CORS middleware's allowed_origins to include the control plane URL when control_plane_url is set.

Without this, cross-origin username/password login and the SSO code-exchange callback will be blocked by browsers in all production deployments that separate control plane and worker origins.
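
As an illustration of the first option, the helper below returns CORS headers only when the request's Origin matches the configured control plane. `cors_headers` and `origin_string` are hypothetical names, the header set is an assumption about what these endpoints would need, and this is not how LiteLLM's existing CORS middleware is configured.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def origin_string(url: str) -> Optional[str]:
    """Reduce a URL to 'scheme://host[:port]'; None if not absolute http(s)."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return None
    # Note: no default-port normalization here; ':443' and no port differ.
    return f"{parts.scheme}://{parts.netloc.lower()}"


def cors_headers(
    request_origin: Optional[str], control_plane_url: Optional[str]
) -> Dict[str, str]:
    """Return CORS response headers, or {} when the origin is not allowed."""
    allowed = origin_string(control_plane_url or "")
    requested = origin_string(request_origin or "")
    if allowed is None or requested != allowed:
        return {}
    return {
        "Access-Control-Allow-Origin": requested,
        "Access-Control-Allow-Credentials": "true",
        "Access-Control-Allow-Methods": "POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type",
        "Vary": "Origin",
    }
```

Echoing one specific origin (rather than `*`) matters here because credentialed requests cannot be combined with a wildcard Access-Control-Allow-Origin.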

Comment on lines +14 to +15
is_control_plane: bool = False
workers: List[WorkerRegistryEntry] = []

P2 Worker registry (including internal URLs) exposed to unauthenticated callers

/.well-known/litellm-ui-config is publicly accessible without authentication (by design, since the login page fetches it before the user is authenticated). As a result, the full workers list — including each worker's name, ID, and internal/external URL — is visible to any unauthenticated user who can reach the control plane.

For deployments where worker URLs are internal hostnames or carry access tokens embedded in the URL, this is a meaningful information-disclosure surface. Consider:

  • Omitting url from the workers field in UiDiscoveryEndpoints and having the UI resolve the URL via a worker ID after authentication, OR
  • Keeping the current design but calling out in documentation that worker URLs should not contain credentials and should be treated as semi-public.
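
Both options can be sketched briefly. `public_worker_view` and `url_has_userinfo` are hypothetical helpers, and the dict shape simply mirrors the worker_id, name, and url fields named in the PR.

```python
from urllib.parse import urlsplit


def public_worker_view(worker: dict) -> dict:
    """Option 1: expose only the ID and name to unauthenticated callers."""
    return {"worker_id": worker["worker_id"], "name": worker["name"]}


def url_has_userinfo(url: str) -> bool:
    """Option 2 helper: flag URLs carrying 'user:password@' style credentials."""
    parts = urlsplit(url)
    return parts.username is not None or parts.password is not None
```

A startup check built on `url_has_userinfo` would catch only credentials in the userinfo part; tokens in paths or query strings would need separate handling.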

Comment on lines +76 to +82
// If switching workers on a control plane, clear the old token and show login
const switchingWorker = params.has("worker");
if (switchingWorker && uiConfig?.is_control_plane) {
  clearTokenCookies();
  setIsLoading(false);
  return;
}

P2 Worker switch clears cookies only when is_control_plane is true — uiConfig may be stale

The guard uiConfig?.is_control_plane is read from the already-loaded config, which is correct. However, the effect dependency array is [isConfigLoading, router, uiConfig]. If a user navigates to /ui/login?worker=team-b on a non-control-plane instance and uiConfig resolves to undefined (e.g., a network error), uiConfig?.is_control_plane is undefined (falsy) — the tokens are not cleared and isLoading is never set to false, leaving the user stuck on the loading screen.

If uiConfig fails to load and the URL has ?worker=, the effect should still call setIsLoading(false) so the spinner does not hang indefinitely.

Comment on lines +11148 to +11149
master_key=master_key,
prisma_client=prisma_client,

P1 str(body.get(...)) silently coerces missing fields to "None"

When username or password is absent from the request body, body.get("username") returns None, and str(None) produces the literal string "None". authenticate_user then receives "None" as the credential rather than raising a clear validation error, making missing-parameter failures harder to diagnose and potentially interacting unexpectedly with username-based lookups.

Consider explicitly checking for the presence of both fields before coercing to str, and raising a ProxyException with HTTP_400_BAD_REQUEST if either is absent — consistent with how other endpoints validate required body fields.
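
A minimal sketch of that check follows. `MissingFieldError` and `require_str` are hypothetical names; in the proxy the error would presumably be a ProxyException with HTTP_400_BAD_REQUEST, as suggested above.

```python
class MissingFieldError(ValueError):
    """Hypothetical error; the caller would translate it into a 400 response."""


def require_str(body: dict, field: str) -> str:
    """Return body[field] only if it is a non-empty string."""
    value = body.get(field)
    if not isinstance(value, str) or not value:
        raise MissingFieldError(f"'{field}' is required and must be a non-empty string")
    return value


# The coercion this replaces: a missing key turns into the literal text "None".
coerced = str({}.get("username"))
```

With this helper, `require_str(body, "username")` fails fast on an absent field instead of passing the string "None" to authentication.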

Comment on lines +82 to +91
  localStorage.removeItem("litellm_selected_worker_id");
  localStorage.removeItem("litellm_worker_url");
  window.location.href = logoutUrl;
};

const handleWorkerSwitch = (workerId: string) => {
  clearTokenCookies();
  clearStoredReturnUrl();
  localStorage.removeItem("litellm_selected_worker_id");
  localStorage.removeItem("litellm_worker_url");

P2 Magic-string localStorage keys duplicated from constants

Both handleLogout (lines 82–83) and handleWorkerSwitch (lines 90–91) manually remove "litellm_selected_worker_id" and "litellm_worker_url" from localStorage using hardcoded strings. These strings are already defined as constants in useWorker.ts (SELECTED_WORKER_KEY) and networking.tsx (WORKER_URL_KEY), and the useWorker hook already exposes a disconnectFromWorker() function that atomically clears both keys and resets proxyBaseUrl.

If either constant is renamed, these calls will silently stop working. Consider importing disconnectFromWorker from useWorker and calling it in both handlers instead of duplicating the removal logic:

// Before redirecting in handleLogout / handleWorkerSwitch:
disconnectFromWorker();  // clears SELECTED_WORKER_KEY, WORKER_URL_KEY, and proxyBaseUrl

Or at minimum, export SELECTED_WORKER_KEY from useWorker.ts and WORKER_URL_KEY from networking.tsx and import them here.

// SSO on the worker (or this instance if no worker), always
// include return_to so the callback redirects back here
const ssoBase = selectedWorker?.url ?? getProxyBaseUrl();
const returnTo = encodeURIComponent(window.location.origin + "/ui/login");

P1 Hardcoded /ui/login path breaks non-root server path deployments

window.location.origin + "/ui/login" builds the SSO return_to URL by always appending the hardcoded path /ui/login. For deployments where the control plane is served under a server root path prefix (e.g. https://cp.example.com/litellm/ui/login), window.location.origin is https://cp.example.com, so return_to becomes https://cp.example.com/ui/login, which does not match the actual login page URL.

After the SSO redirect, the browser lands on the wrong path and the ?code= parameter is never consumed by LoginPage.tsx, causing the login flow to silently fail.

Use window.location.pathname (which already contains the full path including any server root path prefix) instead of the hardcoded string:

Suggested change
- const returnTo = encodeURIComponent(window.location.origin + "/ui/login");
+ const returnTo = encodeURIComponent(window.location.origin + window.location.pathname);

@yuneng-jiang yuneng-jiang left a comment

lgtm

@ryan-crabbe ryan-crabbe enabled auto-merge March 20, 2026 20:54
@ryan-crabbe ryan-crabbe merged commit 72c307d into main Mar 20, 2026
79 of 99 checks passed
Comment on lines +50 to +58
if (ssoCode) {
  const workerUrl = localStorage.getItem("litellm_worker_url");
  exchangeLoginCode(ssoCode, workerUrl).then(() => {
    params.delete("code");
    const cleanSearch = params.toString();
    window.history.replaceState(null, "", window.location.pathname + (cleanSearch ? `?${cleanSearch}` : ""));
    router.replace("/ui/?login=success");
  });
  return;

P1 SSO code exchange not gated on is_control_plane

The ?code= exchange path fires on every instance — including workers and non-control-plane deployments — regardless of whether this is actually a control-plane UI. On any instance where control_plane_url is not configured, calling /v3/login/exchange returns a 404. Because there's no .catch() handler, the error is swallowed silently, isLoading stays true, and the user sees an infinite loading screen whenever a ?code= query param appears in the URL on a non-control-plane instance.

Additionally, the exchange is attempted even when params.get("login") is not "success", meaning any ?code= query param (from any source) triggers it.

A minimal guard prevents both failure modes:

const ssoCode = params.get("code");
const loginSuccess = params.get("login") === "success";
if (ssoCode && loginSuccess && uiConfig?.is_control_plane) {
  const workerUrl = localStorage.getItem("litellm_worker_url");
  exchangeLoginCode(ssoCode, workerUrl)
    .then(() => { /* ... */ })
    .catch(() => { setIsLoading(false); });
  return;
}

@ishaan-berri ishaan-berri deleted the litellm_ryan_march_18 branch March 26, 2026 22:29