fix(web): root-cause fix for daemon query retries + TQ conversion (WEB-7, WEB-2H)#33165
Conversation
…B-7, WEB-2H) Closes LUM-2199 - Extract shouldRetryDaemonError utility to utils/daemon-errors.ts - Move isExpectedDaemonTransientError from lib/sentry/ to utils/ (re-export for compat) - Replace retry: false with shouldRetryDaemonError in use-history-pagination.ts (WEB-7) - Convert imperative credential read to TQ useQuery in web-search-card.tsx (WEB-2H) - Derive saved state from daemon config via useMemo, eliminating redundant useState - Keep bestEffort: true as defense-in-depth in error handlers Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
…ter config patch The savedWebSearchMode/savedWebSearchProvider derivation from daemon config lags behind during the async refetch window after patchDaemonConfig. This briefly re-enables the save button. Fix: set a synchronous override in handleSave, cleared when the daemon config refetch completes. Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
…ctor Adopted useDaemonConfigQuery + useDaemonConfigMutation + useProvisionProviderKey from #33156, combined with TQ credential query, savedOverride, and requiresProviderCredential rename from this branch. Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ad62a900de
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| } | ||
| return data!.found; | ||
| }, | ||
| enabled: !!assistantId && requiresProviderCredential, |
There was a problem hiding this comment.
Gate credential query on org readiness
In platform-hosted sessions this daemon request needs the Vellum-Organization-Id header, but the new TanStack query can run as soon as assistantId exists even if the org store has not hydrated yet. I checked apps/web/docs/STATE_MANAGEMENT.md and useIsOrgReady(): queries that need this header should include that gate; otherwise, if the 400 org-header retries exhaust before hydration completes, the query stays errored with webSearchHasStoredKey treated as false and users with an existing BYOK web-search key can be blocked/asked for a key until something else causes a refetch.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Good catch — this is a real bug. Fixed in ca4f9da.\n\nAdded useIsOrgReady() gate to the credential query's enabled condition, matching the established pattern in conversation-queries.ts. The query now waits for org store hydration before firing, preventing 400 "Organization-Id header" errors that could exhaust retries and leave webSearchHasStoredKey permanently false.\n\nNote: useDaemonConfigQuery itself also lacks this gate (pre-existing), but that's a separate concern — it's gated on assistantId which in platform mode implies the lifecycle service has run, which typically means org is ready. The credential query was more vulnerable because it could mount eagerly via requiresProviderCredential before the config query resolves.
…e hydration The secretsReadPost daemon call needs the Vellum-Organization-Id header, which the org store provides after async hydration. Without the isOrgReady gate, the query could fire before hydration completes, exhaust retries on 400 'Organization-Id header' errors, and leave webSearchHasStoredKey as false permanently. Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
There was a problem hiding this comment.
✦ APPROVE
Value: Replaces a bestEffort-style symptom mute with proper TanStack Query retry + the useIsOrgReady() hydration gate so WEB-7 (77 events) and WEB-2H stop firing for the reason they fire, not just at the Sentry shipping layer. Centralizes the daemon-transient predicate where every retry-aware caller can reach it.
What this does: Moves isExpectedDaemonTransientError from lib/sentry/capture-error.ts → utils/daemon-errors.ts (correct home, not Sentry-specific), adds shouldRetryDaemonError as the canonical TQ retry predicate (3 retries max, exponential backoff, fails fast on 500/programming bugs), wires it into use-history-pagination (replaces unjustified retry: false), and rewrites web-search-card.tsx to use useDaemonConfigQuery + useQuery for the secret-presence check — eliminating secretScopeRef, secretReadRevision, cancelled, and the permanent-false bug.
Review detail
Verified at HEAD ca4f9da9:
daemon-errors.tsmatches the predicate shape adopted in #33160 (503/502/401/400+org-header).MAX_DAEMON_RETRIES = 3is sensible — at 1s/2s/4s backoff that's ~7s of recovery window, well above typical hydration time.use-history-pagination.ts:144—retry: shouldRetryDaemonErroris the right call. Oldretry: falseon history pagination was the textbook case the PR description calls out.capture-error.tsre-exports the predicate for migration compatibility — diff is strictly subtractive (–31/+5).
Codex P2 / Devin P1 — both real, both already addressed:
- Codex P2 (org-header hydration race on credential query): legitimate — same pattern just landed in #32912/#33160. Fixed by Devin in
ca4f9da9by addinguseIsOrgReady()to theenabledpredicate at line 125. Verified. - Devin P1 (save button briefly re-enables in the refetch window between
setSaving(false)and daemon config invalidation): real race. Fixed ine2c6e1d8via thesavedOverridesynchronous bridge that clears whenreconciledupdates. The dual-state pattern (savedOverridefor synchronous bridging +reconciledfor ground truth) is clean.
Anti-pattern grep on full file at HEAD:
- Zero new runtime-boundary casts.
as ServiceModeon the localStorage read is pre-existing parity with sibling cards. data!.foundat L123 — gated byif (!response.ok) throwimmediately prior; HeyAPI shape guaranteesdatawhen response is ok. Defensible.assistantId!at L112 — gated byenabled: !!assistantId && requiresProviderCredential && isOrgReady. Defensible.- Zero
@ts-ignore, zeroeslint-disable, zero|| 0.
Architecture note: useDaemonConfigQuery itself doesn't have the useIsOrgReady gate (Devin called this out in the response thread as a separate, pre-existing concern). Worth a follow-up — it's gated on assistantId, which transitively gates on assistantsListOptions succeeding, but on the platform path that upstream query needs the org header too. Probably a clean follow-up ticket rather than scope creep into this PR.
Test coverage: daemon-errors.test.ts (+107) covers the full status matrix (503/502/401/400-org/400-other/500/TypeError/Error) for both isExpectedDaemonTransientError and shouldRetryDaemonError, plus the failureCount >= MAX_DAEMON_RETRIES cap.
Vellum Constitution — Trust-seeking: replaces a silent-by-default Sentry mute with bounded, observable retry + an explicit hydration gate. Failure modes that survive retries still surface; expected transients no longer pollute the signal.
|
@codex review |
|
Codex Review: Didn't find any major issues. 🎉 ℹ️ About Codex in GitHubCodex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback". |
Prompt / plan
Root-cause fix for Sentry issues WEB-7 (77 events) and WEB-2H (1 event). Instead of suppressing errors via
bestEffort(symptom treatment), this prevents them via TanStack Query retry and proper data-fetching patterns.What changed
1. Shared daemon-error utilities (
utils/daemon-errors.ts)isExpectedDaemonTransientError(error)— classifies 503/502/401/400-org-header as expected transient daemon startup errors. Moved fromlib/sentry/capture-error.ts(wrong location per CONVENTIONS.md — not Sentry-specific)shouldRetryDaemonError(failureCount, error)— TQ retry predicate.retry: shouldRetryDaemonErroris now the convention for daemon queriescapture-error.tsfor backward compatibility2. WEB-7 fix (
use-history-pagination.ts)retry: falsewithretry: shouldRetryDaemonError3. WEB-2H fix (
web-search-card.tsx)useEffect+ async IIFE secret read to TQuseQuery— gains retry, caching, abort, refetch-on-focussecretScopeRef,secretReadRevision(manual cache-bust),cancelledboolean,webSearchHasStoredKeyuseStatewebSearchHasStoredKey = falseon transient daemon 503 with no retrysavedOverridestate to prevent save button race condition during async config refetchuseDaemonConfigQuery+useDaemonConfigMutation+useProvisionProviderKeyfrom PR refactor(web): split useDaemonConfig god-hook, fix untyped patches and double invalidation (LUM-2184) #33156's refactor4. Defense-in-depth —
bestEffort: truein error handlers catches anything retry doesn't cover (retries exhausted)5. Convention docs — Updated CONVENTIONS.md and STATE_MANAGEMENT.md with daemon retry guidance
Architecture
Closes LUM-2199
Related: LUM-2205 (same pattern in
provider-editor-modal.tsx, separate PR)Test plan
bunx tsc --noEmit— cleanbun run lint— cleanisExpectedDaemonTransientErrorandshouldRetryDaemonError(daemon-errors.test.ts)Link to Devin session: https://app.devin.ai/sessions/c2e17ff1867f4ebd90aac007ea0f5453
Requested by: @ashleeradka