
fix(web): gate platform API calls on organization ID readiness#33953

Merged: agarg5 merged 3 commits into main from fix-org-gate-race on Jun 8, 2026

@agarg5 agarg5 commented Jun 8, 2026

Contributor

Summary

  • Platform API calls (assistantsList, featureFlagsClientFlagValuesRetrieve) require the Vellum-Organization-Id header, but could fire before the organization store finished loading — causing 400 errors and leaving the hatching screen stuck on "Setting up your assistant…" indefinitely.
  • Gates the assistant lifecycle query (shouldQueryServer) and the client feature flag sync on currentOrganizationId being non-null before firing.
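
As a rough illustration of the two gates listed above, the sketch below writes each one as a pure predicate. The `StartupState` shape, the `shouldSyncClientFlags` name, and the `org_123` ID are hypothetical. In the PR these values come from hooks and stores, and the exact flag-sync condition is an assumption here.

```typescript
// Hypothetical snapshot of startup state; the real code reads these values
// from hooks and stores rather than from a single object.
interface StartupState {
  isAuthenticated: boolean;
  isGatewayAuthMode: boolean;
  hasPlatformSession: boolean;
  isLocalMode: boolean;
  currentOrganizationId: string | null;
}

// Gate for the assistant lifecycle query: the pre-existing conditions plus
// the new requirement that the organization ID has loaded.
function shouldQueryServer(s: StartupState): boolean {
  return (
    s.isAuthenticated &&
    !s.isGatewayAuthMode &&
    (s.hasPlatformSession || !s.isLocalMode) &&
    !!s.currentOrganizationId
  );
}

// Gate for the client feature flag sync (assumed shape): only sync for a
// platform session whose organization ID is known.
function shouldSyncClientFlags(s: StartupState): boolean {
  return s.hasPlatformSession && !!s.currentOrganizationId;
}

// Authenticated platform session while the organization store is still loading.
const hydrating: StartupState = {
  isAuthenticated: true,
  isGatewayAuthMode: false,
  hasPlatformSession: true,
  isLocalMode: false,
  currentOrganizationId: null,
};

// Same session once the organization store has resolved.
const ready: StartupState = { ...hydrating, currentOrganizationId: "org_123" };

console.log(shouldQueryServer(hydrating), shouldQueryServer(ready)); // false true
console.log(shouldSyncClientFlags(hydrating), shouldSyncClientFlags(ready)); // false true
```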

Test plan

  • Load the app on production/dev with a platform session — verify no 400 errors on /v1/assistants or /v1/feature-flags/client-flag-values/ during startup
  • Verify the hatching screen progresses past "Setting up your assistant…" without hanging
  • Verify local mode (no platform session) still works — the org gate should be irrelevant since shouldQueryServer is already false

🤖 Generated with Claude Code

Platform API endpoints require the Vellum-Organization-Id header.
During app startup, the organization store loads asynchronously, but
the assistant lifecycle query and client feature flag fetch could fire
before it resolved — sending requests without the header and getting
400 responses from Django. This caused the hatching screen to hang
on "Setting up your assistant…" indefinitely.

Gate both the lifecycle server query and the feature flag sync on the
organization ID being available before firing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1618ae5c66

Comment thread apps/web/src/assistant/use-lifecycle.ts Outdated
Comment on lines +63 to +67
  const shouldQueryServer =
    isAuthenticated(sessionStatus) &&
    !isGatewayAuthMode() &&
-   (hasPlatformSession || !isLocalMode());
+   (hasPlatformSession || !isLocalMode()) &&
+   !!currentOrganizationId;

P1: Gate the imperative lifecycle fetch on org readiness

For authenticated platform sessions while currentOrganizationId is still hydrating, this only disables the passive useAssistantQuery; the effect below still calls lifecycleService.respondToInputs(), which immediately runs checkAssistant()/fetchQuery() and hits assistantsList without the Vellum-Organization-Id header. In the common case where selectedPlatformAssistantId remains null (multi-assistant disabled), the effect also won't rerun when the org later arrives, so the startup 400 can still put the lifecycle into an error/stuck state instead of recovering.
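
The gap described here can be sketched with a hand-rolled stand-in for a query client. `FakeQueryClient`, `observe`, and `runStartup` are hypothetical names, not the real library or the PR's code. The only point is that an `enabled`-style flag guards the passive path and does nothing for an imperative fetch.

```typescript
// Hand-rolled stand-in for a query client; NOT the real library API.
class FakeQueryClient {
  public requests: string[] = [];

  // Passive path: an observer only sends a request when `enabled` is true.
  observe(key: string, enabled: boolean): void {
    if (enabled) this.requests.push(key);
  }

  // Imperative path: always sends a request; there is no `enabled` flag here.
  fetchQuery(key: string): void {
    this.requests.push(key);
  }
}

// Models the first commit's behavior: the passive query is gated on the
// organization ID, but the effect still triggers the imperative fetch.
function runStartup(client: FakeQueryClient, orgId: string | null): void {
  const shouldQueryServer = !!orgId;
  client.observe("assistantsList", shouldQueryServer); // gated
  client.fetchQuery("assistantsList"); // ungated, fires without the org header
}

const client = new FakeQueryClient();
runStartup(client, null); // organization store still hydrating
console.log(client.requests); // ["assistantsList"], sent by the imperative path only
```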

agarg5 commented Jun 8, 2026

Contributor Author

Root cause: the race was introduced by #31194 (migrate OrganizationProvider from React Context to Zustand store). The original Context provider blocked rendering until the org was resolved; the Zustand migration made the fetch non-blocking, so children can fire platform API calls before the org ID is ready. A sessionStorage fallback was added to cover refreshes, but it's empty on first visits or cleared storage — leaving a window where requests go out without the Vellum-Organization-Id header and get 400'd by Django.
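
A minimal sketch of that window, under stated assumptions: a `Map` stands in for sessionStorage, the storage key is invented, and `resolveOrgId` / `buildPlatformHeaders` are hypothetical helpers rather than functions from the codebase. It only illustrates why a first visit can produce a request with no `Vellum-Organization-Id` header.

```typescript
// Hypothetical storage key; the real key is not given in this PR.
const ORG_STORAGE_KEY = "organization-id";

// Resolve the org ID from the store first, then from the storage fallback.
// `storage` is a Map standing in for sessionStorage so the sketch runs anywhere.
function resolveOrgId(
  storeValue: string | null,
  storage: Map<string, string>,
): string | null {
  return storeValue ?? storage.get(ORG_STORAGE_KEY) ?? null;
}

// Build request headers; the org header is only present when an ID is known.
function buildPlatformHeaders(orgId: string | null): Record<string, string> {
  return orgId ? { "Vellum-Organization-Id": orgId } : {};
}

// First visit: the store has not loaded and storage is empty, so no header.
const firstVisit = new Map<string, string>();
console.log(buildPlatformHeaders(resolveOrgId(null, firstVisit))); // {}

// Refresh: the store has not loaded yet, but the fallback covers the gap.
const afterRefresh = new Map<string, string>([[ORG_STORAGE_KEY, "org_123"]]);
console.log(buildPlatformHeaders(resolveOrgId(null, afterRefresh)));
// { "Vellum-Organization-Id": "org_123" }
```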

@agarg5 agarg5 requested a review from ashleeradka June 8, 2026 18:19
The Codex review correctly identified that respondToInputs() calls
checkAssistant() imperatively via fetchQuery, bypassing the passive
useAssistantQuery enabled gate. Pass hasOrganization through the
service inputs and short-circuit respondToInputs() before the
imperative fetch when the org store hasn't resolved yet.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
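
The short-circuit described in this commit might look roughly like the sketch below. `LifecycleServiceSketch` and its counter are hypothetical, and the real `respondToInputs()` takes more inputs and does more work. The sketch only shows the early return, plus a second call once the org arrives.

```typescript
// Hypothetical subset of the service inputs; only the new field is modeled.
interface ServiceInputs {
  hasOrganization: boolean;
}

class LifecycleServiceSketch {
  // Counts imperative fetches so the gate's effect is observable.
  public checkCount = 0;

  respondToInputs(inputs: ServiceInputs): void {
    // Short-circuit before the imperative fetch while the org is unresolved.
    if (!inputs.hasOrganization) return;
    this.checkAssistant();
  }

  private checkAssistant(): void {
    // Stands in for the fetchQuery call that hits assistantsList.
    this.checkCount += 1;
  }
}

const service = new LifecycleServiceSketch();
service.respondToInputs({ hasOrganization: false }); // org still hydrating: no fetch
service.respondToInputs({ hasOrganization: true }); // org resolved: fetch runs
console.log(service.checkCount); // 1
```

The second call models the effect re-running once the organization ID arrives, which relies on that value being part of the effect's dependencies.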

agarg5 commented Jun 8, 2026

Contributor Author

Ran into this issue when hatching assistants on dev: the hatching screen loaded indefinitely.

@vex-assistant-bot vex-assistant-bot Bot left a comment

Contributor

COMMENT — fix shape is right, one helper-use nit.

What this does: gates shouldQueryServer, useClientFeatureFlagSync, and the imperative respondToInputs → checkAssistant path on the org store having a currentOrganizationId. Cleanly addresses Codex's P1 about the lifecycle effect still firing checkAssistant without the org header, confirmed in lifecycle-service.ts:204 with the early-return + currentOrganizationId added to the effect deps array.

One observation (non-blocking): there's already a canonical helper for this exact case — useIsOrgReady() in apps/web/src/hooks/use-is-org-ready.ts, used in ~8 production sites (conversation-queries.ts, text-to-speech-card.tsx, use-stored-credential-presence.ts, etc.) and called out in apps/web/AGENTS.md as the canonical org-hydration gate. Established in #32912 (LUM-2114) for exactly this 400 Organization-Id header failure mode, reused across #33160 / #33165 / #33201.

It's also slightly more correct: returns !hasPlatformSession || currentOrgId != null, so local/self-hosted/gateway-only sessions read true instead of getting blocked on an org that never arrives. With raw !!currentOrganizationId in the new lifecycle-service.ts gate, respondToInputs will early-return for the entire session in pure local-mode without a platform session — worth confirming checkAssistant() isn't load-bearing on that path (your test plan covers it via shouldQueryServer=false, but the service-level gate is a separate code path).
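
To make the difference concrete, here is a sketch that compares the two gates as pure functions. `isOrgReady` follows the expression quoted above. `rawOrgGate` and the function signatures are hypothetical, since the real `useIsOrgReady()` is a hook that reads from stores.

```typescript
// Pure-function version of the expression quoted above.
function isOrgReady(
  hasPlatformSession: boolean,
  currentOrgId: string | null,
): boolean {
  return !hasPlatformSession || currentOrgId != null;
}

// The raw check used in the earlier commits of this PR.
function rawOrgGate(currentOrgId: string | null): boolean {
  return !!currentOrgId;
}

// Local mode without a platform session: no org will ever arrive.
console.log(isOrgReady(false, null)); // true, the session is not blocked
console.log(rawOrgGate(null)); // false, the gate never opens

// Platform session: both gates wait for the org ID, then open.
console.log(isOrgReady(true, null), rawOrgGate(null)); // false false
console.log(isOrgReady(true, "org_123"), rawOrgGate("org_123")); // true true
```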

Suggested shape — three call sites become one:

const isOrgReady = useIsOrgReady();
// shouldQueryServer && isOrgReady
// useClientFeatureFlagSync(hasPlatformSession && !isSessionInitializing && isOrgReady)
// hasOrganization: isOrgReady

Heads up: Codex's review is anchored at 1618ae5c (the pre-fix commit). The 471d9bd4 commit is what actually addresses its P1 — a fresh @codex review would clear the stale finding at HEAD.

Vellum Constitution — Trust-seeking: consolidating on one named gate keeps the org-hydration invariant auditable; reinventing it per call-site is how the next 400 sneaks back in.

Replace raw `!!currentOrganizationId` / `hasOrganization` checks with
the canonical `useIsOrgReady()` hook (established in #32912). The hook
returns `!hasPlatformSession || currentOrgId != null`, so local/gateway
sessions pass through instead of being blocked on an org that never
arrives.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@agarg5 agarg5 merged commit d075433 into main Jun 8, 2026
7 checks passed
@agarg5 agarg5 deleted the fix-org-gate-race branch June 8, 2026 18:38