Skip to content

fix(web): surface the real reason when local-assistant connect fails#33241

Merged
vex-assistant-bot[bot] merged 2 commits into
mainfrom
devin/1780496664-surface-local-connect-error
Jun 3, 2026
Merged

fix(web): surface the real reason when local-assistant connect fails#33241
vex-assistant-bot[bot] merged 2 commits into
mainfrom
devin/1780496664-surface-local-connect-error

Conversation

@devin-ai-integration

Copy link
Copy Markdown
Contributor

Prompt / plan

While investigating a "Couldn't connect to your assistant. Make sure it's running." card on the local-mode login page, the failure was undiagnosable: clicking Retry produced only a brief flicker and the same generic message, with nothing logged to devtools or Sentry. The root reason is that the local auto-connect path swallows the thrown error.

What changed

connectToLocal() wrapped its whole flow — fetchGuardianTokenHost → gateway token exchange → initSession → navigate — in a bare catch {} that discarded the error and set a fixed headline. This change:

  • Binds the caught error and reports it through captureError() (the app's standard reporting primitive: console + Sentry, with transient-network filtering and non-Error normalization), tagged with structured context (assistantId, gatewayPort).
  • Renders the error's normalized message as a secondary, muted detail line beneath the existing headline in both local-mode error surfaces, so the user sees the actual failure (e.g. connection refused on the gateway port, a 4xx/5xx from the token exchange, or a guardian-token error) instead of a dead end.

The friendly headline and the Retry affordance are unchanged; this is strictly additive diagnosability.

Why it's safe

  • Pure within-renderer change in apps/web; no wire/IPC/HTTP contract is touched, so there is no macOS↔platform version-skew concern.
  • No behavior change on the success path. On failure it adds a log + a detail line; it does not alter control flow, retries, or navigation.
  • Uses the existing captureError/normalizeToError helpers rather than a bare Sentry.captureException, matching apps/web conventions (apps/web/AGENTS.md — Manual error reporting).

Root cause analysis

  1. How did the code get into this state? The auto-connect flow was written to fail "gracefully" with a single user-facing message, and the catch was left argument-less (catch {}) so the error was never bound — a swallow that reads as intentional terseness but removes all diagnosability.
  2. What decisions led to it? Optimizing the happy path and treating connect failure as a single opaque state, with no logging hook on the renderer side of the local-mode connect path.
  3. Warning signs missed? A Retry button that re-runs an identical silent failure is a tell: if Retry can't change the outcome and nothing is logged, the error is being discarded.
  4. Preventing recurrence? Bind and report caught errors through captureError() on any user-triggered async action whose only feedback is a generic message. The sibling handler in this same file (handlePlatformProvider) already logs via console.error; the local-connect path was the outlier.
  5. AGENTS.md guideline? apps/web/AGENTS.md already documents captureError() as the manual error-reporting path. No new rule is warranted — the durable takeaway (don't use bare catch {} on user-facing async flows; route through captureError) is covered by existing guidance, and a more prescriptive rule risks going stale.

Scope note

This makes the remaining connect failure visible; it is not itself the fix for whatever the underlying connect error turns out to be. Once the surfaced message is captured on a real machine, any further root-cause fix (guardian token, gateway reachability, token exchange) will be a separate, targeted PR.

Test plan

  • bunx tsc --noEmit and eslint clean in apps/web (pre-commit hook regenerates the OpenAPI client first; the only lint output is a pre-existing exhaustive-deps warning unrelated to this change).
  • Manual: on a local-mode build where connect fails, the card now shows the specific reason beneath the headline and the error is logged to the console (and Sentry) under context local-login.connect.

Link to Devin session: https://app.devin.ai/sessions/15bca57bd4c64a3085cfb80e1f26355a
Requested by: @ashleeradka

The local-mode auto-connect path (connectToLocal) wrapped its entire flow in a bare catch {} that discarded the thrown error and rendered a fixed "Couldn't connect to your assistant. Make sure it's running." message with no detail and no console/Sentry record. When the connect failed at any step (guardian-token acquisition, the gateway token exchange, or initSession), the actual reason was invisible in both the UI and devtools — clicking Retry just re-ran the same silent failure.

Bind the caught error, report it through captureError (console + Sentry with structured context: assistantId and gatewayPort), and render its normalized message as a secondary detail line beneath the existing headline so the real failure is visible instead of a generic dead end.

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@devin-ai-integration

Copy link
Copy Markdown
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment, CI, and merge conflict monitoring

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.

Open in Devin Review

…ssage

The negative-toned headline plus its error-detail line were rendered
inline in both the local-only and mixed login branches. Extract a single
ConnectErrorMessage component so the markup lives in one place; it renders
nothing without a headline, so the detail can no longer appear orphaned.

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@ashleeradka

Copy link
Copy Markdown
Contributor

@codex review

@vex-assistant-bot

Copy link
Copy Markdown
Contributor

@devin review — added one refactor commit (2c1a4a7) on top of the original. Re-run please.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2c1a4a7fb5

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +324 to +330
captureError(err, {
context: "local-login.connect",
extra: {
assistantId: assistant.assistantId,
gatewayPort: assistant.resources?.gatewayPort,
},
});

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Log transient local-connect failures before filtering

When the packaged app cannot reach the local gateway, the fetch failure reaches this catch as a browser TypeError such as Failed to fetch; captureError() returns early for those transient network errors before it calls console.error or Sentry (apps/web/src/lib/sentry/capture-error.ts). In that gateway-down scenario this path still produces no devtools/Sentry record, so the main failure case this change is trying to diagnose remains silent unless the user notices the UI detail.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct on the mechanic, but it doesn't affect this PR's goal. The PR surfaces the reason in the UI, and that path is independent of captureError: the catch always runs setConnectErrorDetail(normalizeToError(err).message), so the gateway-down TypeError: Failed to fetch is rendered in the error card regardless of whether Sentry/console sees it.

Two cases worth separating:

  • Missing / expired guardian token (the actual trap this PR targets) throws a GuardianTokenError, which is not a transient network error — so it passes the isTransientNetworkError gate and is logged to console + Sentry.
  • Gateway not running is a genuine transient network drop, which captureError deliberately filters out of Sentry (per its docstring — network drops aren't app bugs). That filtering is a central, repo-wide policy in lib/sentry/capture-error.ts, not something this path opts into.

So nothing regresses here: the user still sees the reason, and the non-transient failure class is captured. Routing connect-time network failures into Sentry would mean changing that shared filtering policy (e.g. an opt-out flag on captureError), which is out of scope for this fix — happy to file it as a follow-up if we want connect-time network failures observable in Sentry.

@vex-assistant-bot vex-assistant-bot Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

APPROVE

Value: Turns the single most diagnosability-hostile flow in the self-hosted login (catch {} with a fixed string + a Retry that re-runs the same silent failure) into something a user, Boss, or Sentry can actually act on, without changing the success path. Plus extracts the negative-tone headline+detail markup into a ConnectErrorMessage component so the two surfaces stay in lockstep when copy or styling changes.

What this does:

  1. Imports captureError + normalizeToError from @/lib/sentry/capture-error (the repo-standard reporting primitive — console + Sentry with transient-network filtering + non-Error normalization, per apps/web/AGENTS.md).
  2. connectToLocal's catch {} becomes catch (err)captureError(err, { context: 'local-login.connect', extra: { assistantId, gatewayPort } }) + setConnectErrorDetail(normalizeToError(err).message). The friendly headline stays as-is.
  3. New connectErrorDetail state cleared at the top of each connect attempt.
  4. New ConnectErrorMessage({ message, detail }) component used in BOTH the post-attempt and the picker views — replaces two near-identical inline <p> blocks. Returns null without a headline so callers mount it unconditionally.
Codex P2 "Log transient local-connect failures before filtering" — mooted (out of scope, by-design repo policy)

Codex's mechanic is correct: captureError() filters isTransientNetworkError results before hitting console+Sentry, so a gateway-down TypeError: Failed to fetch will not appear in either. Devin's inline reply walks the cases precisely:

  • GuardianTokenError (the actual trap this PR targets) is NOT a transient network error → passes the filter → logged to console+Sentry. ✓
  • Gateway not running is a real transient network drop → deliberately filtered by captureError per its docstring ("network drops aren't app bugs"). The user still sees the actual fetch failure rendered in the detail line because setConnectErrorDetail(...) runs unconditionally inside the catch, regardless of the captureError filter.

Routing connect-time network failures into Sentry would mean changing the repo-wide lib/sentry/capture-error.ts filtering policy (e.g. an opt-out flag like captureError(err, { ..., reportTransient: true })). That's a separate, cross-cutting decision — Boss's call whether to file it as a follow-up. Not blocking this PR.

Anti-pattern cross-check
  • No as runtime-boundary casts in the diff. ✓
  • Uses var(--system-negative-strong) + var(--content-secondary) + text-body-small-default — semantic tokens, paired correctly (negative-strong text, secondary detail). ✓ (design-system-review.md compliant.)
  • ConnectErrorMessage returns null when message is null — caller can mount unconditionally, which the picker view does. The post-attempt view still gates on connectError &&, which is fine but slightly redundant. Pure cleanup nit, not blocking.
  • Pure within-renderer change in apps/web/.../login-page.tsx. No IPC/HTTP/wire contract touched → no macOS↔platform version-skew concern. ✓
  • No SSE/seq territory. No vembda territory. (R11e clean — Vargas + Mahmoud territories untouched.)

Durable lesson worth noting for the team: bare catch {} on a user-triggered async flow whose only feedback is a generic message is the highest-leverage diagnosability bug — it removes Retry's information value AND blinds Sentry. The handlePlatformProvider sibling handler in the same file already routes errors through console.error. This PR brings the self-hosted side up to parity.

Merge gate at HEAD 2c1a4a7f: Vex ✓ · Codex commented P2 (verified out-of-scope, repo-wide filtering policy; Devin posted the rebuttal inline) · Devin re-engaged at HEAD via inline reply, no new blocking findings · CI all 7 green ✓. Devin is last pusher → bot-merge may be blocked by branch protection; will attempt and flag if blocked.

Vellum Constitution — Distinct: a tiny, surgical patch that fixes a class of UX bug ("Retry button on opaque error") and pays its rent in observability without grabbing a single byte of additional behavioral surface area.

@vex-assistant-bot vex-assistant-bot Bot merged commit 89ca2de into main Jun 3, 2026
7 checks passed
@vex-assistant-bot vex-assistant-bot Bot deleted the devin/1780496664-surface-local-connect-error branch June 3, 2026 16:24
marinatrajk pushed a commit that referenced this pull request Jun 3, 2026
…33241)

* fix(web): surface the real reason when local-assistant connect fails

The local-mode auto-connect path (connectToLocal) wrapped its entire flow in a bare catch {} that discarded the thrown error and rendered a fixed "Couldn't connect to your assistant. Make sure it's running." message with no detail and no console/Sentry record. When the connect failed at any step (guardian-token acquisition, the gateway token exchange, or initSession), the actual reason was invisible in both the UI and devtools — clicking Retry just re-ran the same silent failure.

Bind the caught error, report it through captureError (console + Sentry with structured context: assistantId and gatewayPort), and render its normalized message as a secondary detail line beneath the existing headline so the real failure is visible instead of a generic dead end.

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* refactor(web): dedup the connect-failure markup behind ConnectErrorMessage

The negative-toned headline plus its error-detail line were rendered
inline in both the local-only and mixed login branches. Extract a single
ConnectErrorMessage component so the markup lives in one place; it renders
nothing without a headline, so the detail can no longer appear orphaned.

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>

---------

Co-authored-by: ashlee@vellum.ai <ashlee@vellum.ai>
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant