diff --git a/AGENTS.md b/AGENTS.md
index 860ef84ce50..0bf92c6ed54 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -45,6 +45,10 @@ Comments should explain **why** something is done and provide non-obvious contex
 
 Whenever you introduce, remove, or significantly modify a service, module, or data flow, you MUST update `ARCHITECTURE.md` to reflect the change. The Mermaid diagrams should always accurately represent the current system architecture, including new services, IPC message types, storage locations, and data flows.
 
+## Keep AGENTS.md up to date
+
+When your PR establishes a new mandatory pattern, convention, or architectural constraint that other agents must follow, update `AGENTS.md` in the same PR. Examples: introducing a new abstraction layer that all callsites must use, adding a guard test that enforces an import rule, or changing how a subsystem handles failure modes. If the pattern is only relevant within a single file or module, a code comment is sufficient — only add to `AGENTS.md` when the rule applies project-wide.
+
 ## Slash Commands — TLDR
 
 These are the most commonly used slash commands defined in `.claude/commands/`:
@@ -135,6 +139,36 @@ Concretely:
 
 Why: the gateway is the single point of ingress, handling TLS termination, auth, rate limiting, and routing. Exposing the daemon directly bypasses these protections and breaks the deployment model.
 
+## LLM Provider Abstraction
+
+All LLM calls in production code **MUST** go through the provider abstraction layer — never import `@anthropic-ai/sdk` (or any other provider SDK) directly.
+
+- Use `getConfiguredProvider()` from `providers/provider-send-message.ts` to obtain a provider instance, then call `provider.sendMessage(...)`.
+- Use the helper utilities (`extractText`, `extractToolUse`, `userMessage`, `createTimeout`, etc.) from the same module.
+- A guard test (`no-direct-anthropic-sdk-imports.test.ts`) enforces this — any new direct SDK import in production code will fail CI.
+- The only file allowed to import `@anthropic-ai/sdk` directly is `providers/anthropic/client.ts`.
+
+### Model intents over hardcoded model IDs
+
+Do not hardcode provider-specific model names (e.g., `claude-haiku-4-5-20251001`, `gpt-4o-mini`). Instead, use `modelIntent` in the config to express **what you need** from the model:
+
+- `'latency-optimized'` — fastest response (e.g., classifiers, triage, icon generation)
+- `'quality-optimized'` — best reasoning (e.g., summaries, complex analysis)
+- `'vision-optimized'` — best vision/multimodal capabilities
+
+The `RetryProvider` resolves intents to provider-specific models automatically. An explicit `model` in config takes precedence over `modelIntent`.
+
+### Provider-agnostic language
+
+Use generic terms in comments, logs, and variable names — write "LLM" instead of "Haiku"/"Sonnet"/"Claude". The system is multi-provider; naming should reflect that.
+
+## Approval Flow Resilience
+
+- **Rich delivery failures must degrade gracefully.** If delivering a rich approval prompt (e.g., Telegram inline buttons) fails, fall back to plain text with parser-compatible instructions (e.g., `Reply "yes" to approve`) — never auto-deny.
+- **Non-rich channels** (SMS, http-api) receive plain-text approval prompts without approval metadata payloads.
+- **Race conditions:** Always check whether a decision has already been resolved before delivering the engine's optimistic reply. If `handleChannelDecision` returns `applied: false`, deliver an "already resolved" notice and return `stale_ignored`.
+- **Requester self-cancel:** A requester with a pending guardian approval must be able to cancel their own request (but not self-approve).
+
 ## Tooling Direction
 
 Do not add new tool registrations using the `class ____Tool implements Tool {` pattern.
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
index de9a66725fe..af4f170f31d 100644
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -434,7 +434,7 @@ graph TB
   - `/channels/inbound` (Telegram/SMS/WhatsApp path) before run orchestration.
   - Inbound Twilio voice setup (`RelayConnection.handleSetup`) to seed call-time actor context.
 - Runtime channel runs pass this as `guardianContext`, and session runtime assembly injects `<guardian_context>` into provider-facing prompts.
-- Voice call orchestration mirrors the same prompt contract: `CallOrchestrator` receives guardian context on setup and refreshes it immediately after successful voice challenge verification, so the first post-verification turn is grounded as `actor_role: guardian`.
+- Voice calls mirror the same prompt contract: `CallController` receives guardian context on setup and refreshes it immediately after successful voice challenge verification, so the first post-verification turn is grounded as `actor_role: guardian`.
 - Voice-specific behavior (DTMF/speech verification flow, relay state machine) remains voice-local; only actor-role resolution is shared.
 
 ### SMS Channel (Twilio)
@@ -3939,11 +3939,11 @@ The assistant inbox extends the guardian security model to support controlled cr
 
 The channel inbound handler (`channel-routes.ts`) enforces an access control layer between message receipt and agent processing:
 
-1. When `inbox_enabled` is true and the sender is not the guardian, the handler looks up the sender in `assistant_ingress_members` by `(sourceChannel, externalUserId)`.
-2. If no member record exists, the `inbox_default_policy` config determines behavior (allow, deny, or escalate).
-3. If a member exists, their individual `policy` field takes precedence.
+1. When `senderExternalUserId` is present and the sender is not the guardian, the handler looks up the sender in `assistant_ingress_members` by `(sourceChannel, externalUserId)`.
+2. If no member record exists, the message is denied (`not_a_member`).
+3. If a member exists, their individual `policy` field determines behavior (allow, deny, or escalate).
 
-Invite tokens are created via the `ingress_invite` IPC contract. Each token is SHA-256 hashed before storage — the raw token is returned exactly once at creation time. External users redeem invites by sending the token as a channel message, which creates a member record with the default policy.
+Invite tokens are created via the `ingress_invite` IPC contract. Each token is SHA-256 hashed before storage — the raw token is returned exactly once at creation time. External users redeem invites by sending the token as a channel message, which creates a member record with `allow` policy.
 
 #### Escalation Data Flow
 
@@ -4091,14 +4091,16 @@ The Calls subsystem supports both **outbound** and **inbound** voice calls via T
 ```mermaid
 sequenceDiagram
     participant User as User (Chat UI)
-    participant Session as Session / Tool Executor
     participant CallStore as CallStore (SQLite)
     participant TwilioProvider as TwilioProvider
     participant TwilioAPI as Twilio REST API
     participant Gateway as Gateway (public)
     participant Routes as twilio-routes.ts (runtime)
     participant WS as RelayConnection (WebSocket)
-    participant Orch as CallOrchestrator
+    participant Ctrl as CallController
+    participant Bridge as voice-session-bridge
+    participant RunOrch as RunOrchestrator
+    participant Session as Session / AgentLoop
     participant LLM as Anthropic Claude
     participant State as CallState (Notifiers)
     participant GuardianDispatch as GuardianDispatch
@@ -4123,35 +4125,40 @@ sequenceDiagram
     TwilioAPI->>Gateway: WebSocket /webhooks/twilio/relay
     Gateway->>WS: proxy WS to runtime /v1/calls/relay
     WS->>WS: setup message (callSid)
-    WS->>Orch: new CallOrchestrator()
-    Orch->>State: registerCallOrchestrator()
+    WS->>Ctrl: new CallController()
+    Ctrl->>State: registerCallController()
 
     loop Conversation turns
         TwilioAPI->>WS: prompt (caller utterance)
         WS->>WS: extract speaker metadata + map speaker identity
-        WS->>Orch: handleCallerUtterance(transcript, speakerContext)
-        Orch->>LLM: messages.stream()
-        LLM-->>Orch: text tokens (streaming)
-        Orch->>WS: sendTextToken() (for TTS)
-        Orch->>CallStore: recordCallEvent()
+        WS->>Ctrl: handleCallerUtterance(transcript, speakerContext)
+        Ctrl->>Bridge: startVoiceTurn()
+        Bridge->>RunOrch: startRun(conversationId, content, {sourceChannel: 'voice', eventSink})
+        RunOrch->>Session: route to session pipeline
+        Session->>LLM: agent loop (tools, memory, skills)
+        LLM-->>Session: text tokens (streaming)
+        Session-->>Bridge: eventSink.onTextDelta()
+        Bridge-->>Ctrl: onTextDelta callback
+        Ctrl->>WS: sendTextToken() (for TTS)
+        Ctrl->>CallStore: recordCallEvent()
     end
 
     alt ASK_GUARDIAN pattern detected
-        Orch->>CallStore: createPendingQuestion()
-        Orch->>GuardianDispatch: dispatchGuardianQuestion()
+        Ctrl->>CallStore: createPendingQuestion()
+        Ctrl->>GuardianDispatch: dispatchGuardianQuestion()
         GuardianDispatch->>Mac: guardian_request_thread_created IPC
         GuardianDispatch->>TG/SMS: POST /deliver/{channel}
         Note over Mac,TG/SMS: First channel to respond wins
         Mac/TG/SMS->>Routes: guardian answer
         Routes->>CallDomain: answerCall()
-        CallDomain->>Orch: handleUserAnswer()
-        Orch->>LLM: continue with [USER_ANSWERED: ...]
+        CallDomain->>Ctrl: handleUserAnswer()
+        Ctrl->>Bridge: startVoiceTurn([USER_ANSWERED: ...])
     end
 
     alt END_CALL pattern detected
-        Orch->>WS: endSession()
-        Orch->>CallStore: updateCallSession(completed)
-        Orch->>State: fireCallCompletionNotifier()
+        Ctrl->>WS: endSession()
+        Ctrl->>CallStore: updateCallSession(completed)
+        Ctrl->>State: fireCallCompletionNotifier()
     end
 
     TwilioAPI->>Gateway: POST /webhooks/twilio/status
@@ -4162,7 +4169,7 @@ sequenceDiagram
 
 ### Inbound Call Flow
 
-Inbound calls are triggered when someone dials the assistant's Twilio phone number. The gateway resolves which assistant owns the number, the runtime bootstraps a session keyed by CallSid, and the relay connection optionally gates the call behind guardian voice verification before handing off to the LLM orchestrator.
+Inbound calls are triggered when someone dials the assistant's Twilio phone number. The gateway resolves which assistant owns the number, the runtime bootstraps a session keyed by CallSid, and the relay connection optionally gates the call behind guardian voice verification before handing off to the CallController.
 
 ```mermaid
 sequenceDiagram
@@ -4174,7 +4181,10 @@ sequenceDiagram
     participant CallStore as CallStore (SQLite)
     participant WS as RelayConnection (WebSocket)
     participant GuardianSvc as ChannelGuardianService
-    participant Orch as CallOrchestrator
+    participant Ctrl as CallController
+    participant Bridge as voice-session-bridge
+    participant RunOrch as RunOrchestrator
+    participant Session as Session / AgentLoop
     participant LLM as Anthropic Claude
 
     Caller->>TwilioAPI: Dials assistant phone number
@@ -4213,7 +4223,7 @@ sequenceDiagram
             WS->>GuardianSvc: validateAndConsumeChallenge(code)
             alt Code matches
                 GuardianSvc-->>WS: success + guardian binding created
-                WS->>Orch: startNormalCallFlow(isInbound=true)
+                WS->>Ctrl: startNormalCallFlow(isInbound=true)
             else Code incorrect + attempts remaining
                 WS->>Caller: TTS "That code was incorrect. Please try again."
             else Max attempts exceeded
@@ -4223,28 +4233,37 @@ sequenceDiagram
             end
         end
     else No pending guardian challenge
-        WS->>Orch: startNormalCallFlow(isInbound=true)
+        WS->>Ctrl: startNormalCallFlow(isInbound=true)
     end
 
-    Orch->>Orch: buildInboundSystemPrompt()
-    Note over Orch: "You are answering an incoming call<br/>on behalf of [user]. Greet warmly,<br/>find out what they need."
-    Orch->>LLM: initial greeting turn
-    LLM-->>Orch: receptionist-style greeting
-    Orch->>WS: sendTextToken() (TTS to caller)
+    Ctrl->>Bridge: startVoiceTurn([CALL_OPENING])
+    Bridge->>RunOrch: startRun(conversationId, [CALL_OPENING], {sourceChannel: 'voice', eventSink})
+    RunOrch->>Session: route to session pipeline
+    Note over Session: Session runtime assembly injects<br/>voice channel context + system prompt
+    Session->>LLM: agent loop (initial greeting turn)
+    LLM-->>Session: receptionist-style greeting
+    Session-->>Bridge: eventSink.onTextDelta()
+    Bridge-->>Ctrl: onTextDelta callback
+    Ctrl->>WS: sendTextToken() (TTS to caller)
 
     loop Conversation turns
         Caller->>WS: prompt (caller utterance)
-        WS->>Orch: handleCallerUtterance(transcript, speakerContext)
-        Orch->>LLM: messages.stream()
-        LLM-->>Orch: text tokens (streaming)
-        Orch->>WS: sendTextToken() (for TTS)
-        Orch->>CallStore: recordCallEvent()
+        WS->>Ctrl: handleCallerUtterance(transcript, speakerContext)
+        Ctrl->>Bridge: startVoiceTurn()
+        Bridge->>RunOrch: startRun(conversationId, content, {sourceChannel: 'voice', eventSink})
+        RunOrch->>Session: route to session pipeline
+        Session->>LLM: agent loop (tools, memory, skills)
+        LLM-->>Session: text tokens (streaming)
+        Session-->>Bridge: eventSink.onTextDelta()
+        Bridge-->>Ctrl: onTextDelta callback
+        Ctrl->>WS: sendTextToken() (for TTS)
+        Ctrl->>CallStore: recordCallEvent()
     end
 ```
 
 **Inbound vs. outbound detection**: The relay server determines call direction by checking `session.initiatedFromConversationId`. Outbound calls are initiated from an existing conversation (`initiatedFromConversationId` set). Inbound calls are bootstrapped from Twilio webhooks and therefore have `initiatedFromConversationId == null`.
 
-**Inbound system prompt**: The `CallOrchestrator.buildInboundSystemPrompt()` generates a receptionist-style prompt: "You are on a live phone call, answering an incoming call on behalf of [user]. The caller dialed in to reach you. You do not have a specific task -- your role is to greet them warmly, find out what they need, and assist them."
+**Inbound system prompt**: The session pipeline (reached via voice-session-bridge) generates the system prompt for the voice channel context. For inbound calls, this is a receptionist-style prompt that instructs the assistant to greet the caller warmly, find out what they need, and assist them.
 
 **Guardian voice verification gate**: When a pending voice guardian challenge exists (created via the desktop UI), inbound callers must enter a six-digit code via DTMF or by speaking the digits before the call proceeds. Up to 3 attempts are allowed. On success, a guardian binding is created and the call transitions to normal flow. On failure, the call ends with a "Verification failed" message. This allows guardians to verify their identity over voice before being granted channel access.
 
@@ -4265,8 +4284,9 @@ sequenceDiagram
 | `assistant/src/calls/twilio-routes.ts` | HTTP webhook handlers: voice webhook (returns TwiML with WS-A/WS-B guardrails), status callback, connect action |
 | `assistant/src/calls/relay-server.ts` | WebSocket handler for the Twilio ConversationRelay protocol; manages RelayConnection instances per call |
 | `assistant/src/calls/speaker-identification.ts` | Reusable speaker recognition primitive for voice prompts: extracts provider speaker metadata (top-level and nested fields), resolves stable per-call speaker identities, and emits speaker context for personalization |
-| `assistant/src/calls/call-orchestrator.ts` | LLM-driven conversation manager: receives caller utterances, streams responses via Anthropic Claude, detects ASK_GUARDIAN and END_CALL control markers |
-| `assistant/src/calls/call-state.ts` | Notifier pattern (Maps with register/unregister/fire helpers) for cross-component communication: question notifiers, completion notifiers, and orchestrator registry |
+| `assistant/src/calls/call-controller.ts` | Session-backed voice controller: routes voice turns through the daemon session pipeline via voice-session-bridge, detects ASK_GUARDIAN and END_CALL control markers |
+| `assistant/src/calls/voice-session-bridge.ts` | Bridge between voice relay and the daemon session/run pipeline: wraps RunOrchestrator.startRun() with voice-specific defaults, translating agent-loop events into callbacks for real-time TTS streaming |
+| `assistant/src/calls/call-state.ts` | Notifier pattern (Maps with register/unregister/fire helpers) for cross-component communication: question notifiers, completion notifiers, and controller registry |
 | `assistant/src/calls/call-constants.ts` | Config-backed constants: max call duration, user consultation timeout, silence timeout, denied emergency numbers |
 | `assistant/src/calls/voice-provider.ts` | Abstract VoiceProvider interface for provider-agnostic call initiation |
 | `assistant/src/calls/voice-quality.ts` | Voice quality profile resolution: `resolveVoiceQualityProfile()` reads `calls.voice` config and returns effective TTS provider, voice spec, and fallback settings for the active mode |
@@ -4303,7 +4323,7 @@ The `validateTransition(current, next)` function is called by `updateCallSession
 
 ### Cross-Channel Guardian Consultation
 
-When the LLM emits `[ASK_GUARDIAN: question]` during a voice call, the orchestrator creates a pending question and calls `dispatchGuardianQuestion()` on the guardian dispatch engine. The dispatch engine handles the full cross-channel fan-out:
+When the LLM emits `[ASK_GUARDIAN: question]` during a voice call, the controller creates a pending question and calls `dispatchGuardianQuestion()` on the guardian dispatch engine. The dispatch engine handles the full cross-channel fan-out:
 
 1. **Request creation**: A `guardian_action_request` row is created with a unique 6-character hex request code, the question text, a `pending` status, and an expiry timestamp.
 
@@ -4321,6 +4341,14 @@ When the LLM emits `[ASK_GUARDIAN: question]` during a voice call, the orchestra
 
 7. **Separation from channel guardian approvals**: Guardian action requests are SEPARATE from `channelGuardianApprovalRequests` (the existing channel tool-approval system). The channel guardian approval system handles tool-use permission grants (approve/deny a specific tool invocation). Guardian action requests handle free-form questions from voice calls that require human input to continue the conversation.
 
+#### Guardian Request Copy Generation Pipeline
+
+Thread titles and initial messages for guardian question threads are generated via `guardian-question-copy.ts`. The module calls the configured LLM provider (with `modelIntent: 'latency-optimized'`) to produce an emoji-prefixed, attention-oriented title and a richer initial message that explains the live-call context. A 5-second timeout guards the generation call. When no provider is configured, generation times out, or parsing fails, the module falls back to deterministic copy (`buildFallbackCopy`) that uses a warning emoji prefix and a simple template containing the question text. The generative copy is awaited only in the macOS delivery branch so that Telegram/SMS deliveries dispatch without LLM latency.
+
+#### macOS Notification + Deep-Link Flow
+
+When a guardian question is dispatched while the macOS app is backgrounded, the Swift client posts a native `UNUserNotificationCenter` notification under the `GUARDIAN_REQUEST` category. The notification title mirrors the emoji-prefixed thread title from the copy generation pipeline. Tapping the notification triggers the `openConversationThread` deep-link handler, which switches the main window to the guardian question thread so the user can reply immediately.
+
 ### SQLite Tables
 
 All five tables live in `~/.vellum/workspace/data/db/assistant.db` alongside existing tables:
@@ -4375,7 +4403,7 @@ This makes ingress URL updates smoother in local tunnel workflows because Twilio
 | GET | `/v1/calls/:callSessionId` | Get call status, including any pending question |
 | POST | `/v1/calls/:callSessionId/cancel` | Cancel an active call |
 | POST | `/v1/calls/:callSessionId/answer` | Answer a pending question via HTTP (alternative to in-thread bridge) |
-| POST | `/v1/calls/:callSessionId/instruction` | Relay a steering instruction to an active call's orchestrator (alternative to in-thread bridge) |
+| POST | `/v1/calls/:callSessionId/instruction` | Relay a steering instruction to an active call's controller (alternative to in-thread bridge) |
 | POST | `/v1/internal/twilio/status` | Internal status callback used by gateway; accepts JSON `{ params }` |
 | POST | `/v1/internal/twilio/connect-action` | Internal connect action callback used by gateway; accepts JSON `{ params }` |
 | WS | `/v1/calls/relay` | ConversationRelay WebSocket (bidirectional: prompt/interrupt/dtmf from Twilio, text tokens/end to Twilio) |
@@ -4392,10 +4420,10 @@ Both tools and HTTP routes delegate to the same domain functions in `call-domain
 
 ### Control Markers
 
-The CallOrchestrator detects two special markers in the LLM's response text:
+The CallController detects two special markers in the LLM's response text:
 
-- **`[ASK_GUARDIAN: question]`** — The AI needs to consult the guardian. The orchestrator creates a pending question, notifies the session via `fireCallQuestionNotifier`, puts the caller on hold, and waits for a guardian answer (timeout configured via `calls.userConsultTimeoutSeconds`).
-- **`[END_CALL]`** — The AI has determined the call's purpose is fulfilled. The orchestrator sends a goodbye, closes the ConversationRelay session, and marks the call as completed.
+- **`[ASK_GUARDIAN: question]`** — The AI needs to consult the guardian. The controller creates a pending question, notifies the session via `fireCallQuestionNotifier`, puts the caller on hold, and waits for a guardian answer (timeout configured via `calls.userConsultTimeoutSeconds`).
+- **`[END_CALL]`** — The AI has determined the call's purpose is fulfilled. The controller sends a goodbye, closes the ConversationRelay session, and marks the call as completed.
 
 Both markers are stripped from the TTS output so the callee never hears the raw control text.
 
@@ -4423,7 +4451,7 @@ Call behavior is controlled via the `calls` config block in the assistant config
 | `calls.disclosure.enabled` | boolean | `true` | Whether the AI should disclose it is an AI at the start of the call. |
 | `calls.disclosure.text` | string | *(default disclosure prompt)* | The disclosure instruction included in the system prompt. |
 | `calls.safety.denyCategories` | string[] | `[]` | Categories of calls to deny (e.g., emergency numbers are always denied regardless of this setting). |
-| `calls.model` | string | *(unset — uses default model)* | Optional override for the LLM model used in call orchestration. |
+| `calls.model` | string | *(unset — uses default model)* | Optional override for the LLM model used in voice call conversations. |
 | `calls.voice.mode` | enum | `'twilio_standard'` | Voice quality mode. Options: `twilio_standard` (standard Twilio TTS with Google voices — fully supported), `twilio_elevenlabs_tts` (ElevenLabs voices through Twilio ConversationRelay — fully supported), `elevenlabs_agent` (full ElevenLabs conversational agent — experimental/restricted, blocked by runtime guard). |
 | `calls.voice.language` | string | `'en-US'` | Language code for TTS and transcription. |
 | `calls.voice.transcriptionProvider` | enum | `'Deepgram'` | Speech-to-text provider (`Deepgram` or `Google`). |
@@ -4608,7 +4636,7 @@ Keep-alive heartbeats (every 30 s by default):
 | Proxy leaf certs | `{dataDir}/proxy-ca/issued/` | PEM files per hostname | openssl CLI, cached | 1-year validity, re-issued on CA change |
 | Proxy sessions | In-memory (SessionManager) | Map<ProxySessionId, ManagedSession> | Manual lifecycle | Ephemeral; 5min idle timeout, cleared on shutdown |
 | Call sessions, events, pending questions | `~/.vellum/workspace/data/db/assistant.db` | SQLite | Drizzle ORM | Permanent, cascade on session delete |
-| Active call orchestrators | In-memory (CallState) | Map<callSessionId, CallOrchestrator> | Manual lifecycle | Ephemeral; cleared on call end or destroy |
+| Active call controllers | In-memory (CallState) | Map<callSessionId, CallController> | Manual lifecycle | Ephemeral; cleared on call end or destroy |
 | Guardian bindings | `~/.vellum/workspace/data/db/assistant.db` | SQLite | Drizzle ORM | Permanent; revoked bindings retained |
 | Guardian verification challenges | `~/.vellum/workspace/data/db/assistant.db` | SQLite | Drizzle ORM | Permanent; consumed/expired challenges retained |
 | Guardian approval requests | `~/.vellum/workspace/data/db/assistant.db` | SQLite | Drizzle ORM | Permanent; decision outcome retained |
diff --git a/assistant/README.md b/assistant/README.md
index 07dd2169725..b087f1fef40 100644
--- a/assistant/README.md
+++ b/assistant/README.md
@@ -292,13 +292,13 @@ The assistant inbox provides secure cross-user messaging, allowing external user
 
 ### Ingress Membership
 
-External users join through **invite tokens** — the owner creates an invite via the desktop UI or IPC, and the external user redeems the token by sending it as a channel message. Redemption auto-creates a **member** record with a configurable access policy:
+External users join through **invite tokens** — the owner creates an invite via the desktop UI or IPC, and the external user redeems the token by sending it as a channel message. Redemption auto-creates a **member** record with an access policy:
 
 - **`allow`** — Messages are processed normally through the agent pipeline.
 - **`deny`** — Messages are rejected with a refusal notice.
 - **`escalate`** — Messages are held for guardian (owner) approval before processing.
 
-The default policy for new members is controlled by the `inbox_default_policy` config. Members can be listed, updated, revoked, or blocked via the `ingress_member` IPC contract.
+Non-members (senders who have not redeemed an invite) are denied by default. Members can be listed, updated, revoked, or blocked via the `ingress_member` IPC contract.
 
 ### Escalation Flow (Dual-Surface)
 
diff --git a/assistant/package.json b/assistant/package.json
index dd05a8ed2ae..014f6cd03b6 100644
--- a/assistant/package.json
+++ b/assistant/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@vellumai/assistant",
-  "version": "0.3.5",
+  "version": "0.3.6",
   "type": "module",
   "bin": {
     "vellum": "./src/index.ts"
diff --git a/assistant/src/__tests__/__snapshots__/ipc-snapshot.test.ts.snap b/assistant/src/__tests__/__snapshots__/ipc-snapshot.test.ts.snap
index 12d13af57a0..2d94b6203b9 100644
--- a/assistant/src/__tests__/__snapshots__/ipc-snapshot.test.ts.snap
+++ b/assistant/src/__tests__/__snapshots__/ipc-snapshot.test.ts.snap
@@ -2614,6 +2614,7 @@ exports[`IPC message snapshots ServerMessage types guardian_request_thread_creat
 {
   "callSessionId": "call-001",
   "conversationId": "conv-guardian-001",
+  "questionText": "What is the gate code?",
   "requestId": "req-guardian-001",
   "title": "Guardian action request",
   "type": "guardian_request_thread_created",
diff --git a/assistant/src/__tests__/call-controller.test.ts b/assistant/src/__tests__/call-controller.test.ts
new file mode 100644
index 00000000000..ea7c683b364
--- /dev/null
+++ b/assistant/src/__tests__/call-controller.test.ts
@@ -0,0 +1,835 @@
+import { describe, test, expect, beforeEach, afterAll, mock, type Mock } from 'bun:test';
+import { mkdtempSync, rmSync } from 'node:fs';
+import { tmpdir } from 'node:os';
+import { join } from 'node:path';
+
+const testDir = mkdtempSync(join(tmpdir(), 'call-controller-test-'));
+
+// ── Platform + logger mocks (must come before any source imports) ────
+
+mock.module('../util/platform.js', () => ({
+  getDataDir: () => testDir,
+  isMacOS: () => process.platform === 'darwin',
+  isLinux: () => process.platform === 'linux',
+  isWindows: () => process.platform === 'win32',
+  getSocketPath: () => join(testDir, 'test.sock'),
+  getPidPath: () => join(testDir, 'test.pid'),
+  getDbPath: () => join(testDir, 'test.db'),
+  getLogPath: () => join(testDir, 'test.log'),
+  ensureDataDir: () => {},
+  readHttpToken: () => null,
+}));
+
+mock.module('../util/logger.js', () => ({
+  getLogger: () =>
+    new Proxy({} as Record<string, unknown>, {
+      get: () => () => {},
+    }),
+}));
+
+// ── Config mock ─────────────────────────────────────────────────────
+
+mock.module('../config/loader.js', () => ({
+  getConfig: () => ({
+    provider: 'anthropic',
+    providerOrder: ['anthropic'],
+    apiKeys: { anthropic: 'test-key' },
+    calls: {
+      enabled: true,
+      provider: 'twilio',
+      maxDurationSeconds: 12 * 60,
+      userConsultTimeoutSeconds: 90,
+      userConsultationTimeoutSeconds: 90,
+      silenceTimeoutSeconds: 30,
+      disclosure: { enabled: false, text: '' },
+      safety: { denyCategories: [] },
+      model: undefined,
+    },
+    memory: { enabled: false },
+  }),
+}));
+
+// ── Voice session bridge mock ────────────────────────────────────────
+
+/**
+ * Creates a mock startVoiceTurn implementation that calls onTextDelta
+ * once per token, then calls onComplete unless the signal was aborted.
+ */
+function createMockVoiceTurn(tokens: string[]) {
+  return async (opts: {
+    conversationId: string;
+    content: string;
+    assistantId?: string;
+    onTextDelta: (text: string) => void;
+    onComplete: () => void;
+    onError: (message: string) => void;
+    signal?: AbortSignal;
+  }) => {
+    // Check for abort before proceeding
+    if (opts.signal?.aborted) {
+      const err = new Error('aborted');
+      err.name = 'AbortError';
+      throw err;
+    }
+
+    // Emit text deltas
+    for (const token of tokens) {
+      if (opts.signal?.aborted) break;
+      opts.onTextDelta(token);
+    }
+
+    if (!opts.signal?.aborted) {
+      opts.onComplete();
+    }
+
+    // The mock turn has already finished by this point, so abort() is a no-op.
+    return {
+      runId: `run-${Date.now()}`,
+      abort: () => {},
+    };
+  };
+}
+
+// eslint-disable-next-line @typescript-eslint/no-explicit-any
+let mockStartVoiceTurn: Mock<any>;
+
+mock.module('../calls/voice-session-bridge.js', () => {
+  mockStartVoiceTurn = mock(createMockVoiceTurn(['Hello', ' there']));
+  return {
+    startVoiceTurn: (...args: unknown[]) => mockStartVoiceTurn(...args),
+    setVoiceBridgeOrchestrator: () => {},
+  };
+});
+
+// ── Import source modules after all mocks are registered ────────────
+
+import { initializeDb, getDb, resetDb } from '../memory/db.js';
+import { conversations } from '../memory/schema.js';
+import {
+  createCallSession,
+  getCallSession,
+  getCallEvents,
+  getPendingQuestion,
+  updateCallSession,
+} from '../calls/call-store.js';
+import {
+  getCallController,
+} from '../calls/call-state.js';
+import { CallController } from '../calls/call-controller.js';
+import type { RelayConnection } from '../calls/relay-server.js';
+
+initializeDb();
+
+afterAll(() => {
+  resetDb();
+  try {
+    rmSync(testDir, { recursive: true });
+  } catch {
+    /* best effort */
+  }
+});
+
+// ── RelayConnection mock factory ────────────────────────────────────
+
+interface MockRelay extends RelayConnection {
+  sentTokens: Array<{ token: string; last: boolean }>;
+  endCalled: boolean;
+  endReason: string | undefined;
+}
+
+function createMockRelay(): MockRelay {
+  const state = {
+    sentTokens: [] as Array<{ token: string; last: boolean }>,
+    _endCalled: false,
+    _endReason: undefined as string | undefined,
+  };
+
+  return {
+    get sentTokens() { return state.sentTokens; },
+    get endCalled() { return state._endCalled; },
+    get endReason() { return state._endReason; },
+    sendTextToken(token: string, last: boolean) {
+      state.sentTokens.push({ token, last });
+    },
+    endSession(reason?: string) {
+      state._endCalled = true;
+      state._endReason = reason;
+    },
+  } as unknown as MockRelay;
+}
+
+// ── Helpers ─────────────────────────────────────────────────────────
+
+let ensuredConvIds = new Set<string>();
+function ensureConversation(id: string): void {
+  if (ensuredConvIds.has(id)) return;
+  const db = getDb();
+  const now = Date.now();
+  db.insert(conversations).values({
+    id,
+    title: `Test conversation ${id}`,
+    createdAt: now,
+    updatedAt: now,
+  }).run();
+  ensuredConvIds.add(id);
+}
+
+function resetTables() {
+  const db = getDb();
+  db.run('DELETE FROM guardian_action_deliveries');
+  db.run('DELETE FROM guardian_action_requests');
+  db.run('DELETE FROM call_pending_questions');
+  db.run('DELETE FROM call_events');
+  db.run('DELETE FROM call_sessions');
+  db.run('DELETE FROM tool_invocations');
+  db.run('DELETE FROM messages');
+  db.run('DELETE FROM conversations');
+  ensuredConvIds = new Set();
+}
+
+/**
+ * Create a call session and a controller wired to a mock relay.
+ */
+function setupController(task?: string, opts?: { assistantId?: string; guardianContext?: import('../daemon/session-runtime-assembly.js').GuardianRuntimeContext }) {
+  ensureConversation('conv-ctrl-test');
+  const session = createCallSession({
+    conversationId: 'conv-ctrl-test',
+    provider: 'twilio',
+    fromNumber: '+15551111111',
+    toNumber: '+15552222222',
+    task,
+  });
+  updateCallSession(session.id, { status: 'in_progress' });
+  const relay = createMockRelay();
+  const controller = new CallController(session.id, relay as unknown as RelayConnection, task ?? null, {
+    assistantId: opts?.assistantId,
+    guardianContext: opts?.guardianContext,
+  });
+  return { session, relay, controller };
+}
+
+describe('call-controller', () => {
+  beforeEach(() => {
+    resetTables();
+    // Reset the bridge mock to default behaviour
+    mockStartVoiceTurn.mockImplementation(createMockVoiceTurn(['Hello', ' there']));
+  });
+
+  // ── handleCallerUtterance ─────────────────────────────────────────
+
+  test('handleCallerUtterance: streams tokens via sendTextToken', async () => {
+    mockStartVoiceTurn.mockImplementation(createMockVoiceTurn(['Hi', ', how', ' are you?']));
+    const { relay, controller } = setupController();
+
+    await controller.handleCallerUtterance('Hello');
+
+    // Verify tokens were sent to the relay
+    const nonEmptyTokens = relay.sentTokens.filter((t) => t.token.length > 0);
+    expect(nonEmptyTokens.length).toBeGreaterThan(0);
+    // The last token should have last=true (empty string token signaling end)
+    const lastToken = relay.sentTokens[relay.sentTokens.length - 1];
+    expect(lastToken.last).toBe(true);
+
+    controller.destroy();
+  });
+
+  test('handleCallerUtterance: sends last=true at end of turn', async () => {
+    mockStartVoiceTurn.mockImplementation(createMockVoiceTurn(['Simple response.']));
+    const { relay, controller } = setupController();
+
+    await controller.handleCallerUtterance('Test');
+
+    // Find the final empty-string token that marks end of turn
+    const endMarkers = relay.sentTokens.filter((t) => t.last === true);
+    expect(endMarkers.length).toBeGreaterThanOrEqual(1);
+
+    controller.destroy();
+  });
+
+  test('handleCallerUtterance: includes speaker context in voice turn content', async () => {
+    mockStartVoiceTurn.mockImplementation(async (opts: { content: string; onTextDelta: (t: string) => void; onComplete: () => void }) => {
+      expect(opts.content).toContain('[SPEAKER id="speaker-1" label="Aaron" source="provider" confidence="0.91"]');
+      expect(opts.content).toContain('Can you summarize this meeting?');
+      opts.onTextDelta('Sure, here is a summary.');
+      opts.onComplete();
+      return { runId: 'run-1', abort: () => {} };
+    });
+
+    const { controller } = setupController();
+
+    await controller.handleCallerUtterance('Can you summarize this meeting?', {
+      speakerId: 'speaker-1',
+      speakerLabel: 'Aaron',
+      speakerConfidence: 0.91,
+      source: 'provider',
+    });
+
+    controller.destroy();
+  });
+
+  test('startInitialGreeting: sends CALL_OPENING content and strips control marker from speech', async () => {
+    let turnCount = 0;
+    mockStartVoiceTurn.mockImplementation(async (opts: { content: string; onTextDelta: (t: string) => void; onComplete: () => void }) => {
+      turnCount++;
+      expect(opts.content).toContain('[CALL_OPENING]');
+      const tokens = ['Hi, I am calling about your appointment request. Is now a good time to talk?'];
+      for (const token of tokens) {
+        opts.onTextDelta(token);
+      }
+      opts.onComplete();
+      return { runId: 'run-1', abort: () => {} };
+    });
+
+    const { relay, controller } = setupController('Confirm appointment');
+
+    await controller.startInitialGreeting();
+    await controller.startInitialGreeting(); // should be no-op
+
+    const allText = relay.sentTokens.map((t) => t.token).join('');
+    expect(allText).toContain('appointment request');
+    expect(allText).toContain('good time to talk');
+    expect(allText).not.toContain('[CALL_OPENING]');
+    expect(turnCount).toBe(1); // idempotent
+
+    controller.destroy();
+  });
+
+  test('startInitialGreeting: tags only the first caller response with CALL_OPENING_ACK', async () => {
+    let turnCount = 0;
+    mockStartVoiceTurn.mockImplementation(async (opts: { content: string; onTextDelta: (t: string) => void; onComplete: () => void }) => {
+      turnCount++;
+
+      let tokens: string[];
+      if (turnCount === 1) {
+        expect(opts.content).toContain('[CALL_OPENING]');
+        tokens = ['Hey Noa, it\'s Credence calling about your joke request. Is now okay for a quick one?'];
+      } else if (turnCount === 2) {
+        expect(opts.content).toContain('[CALL_OPENING_ACK]');
+        expect(opts.content).toContain('Yeah. Sure. What\'s up?');
+        tokens = ['Great, here\'s one right away. Why did the scarecrow win an award?'];
+      } else {
+        expect(opts.content).not.toContain('[CALL_OPENING_ACK]');
+        expect(opts.content).toContain('Tell me the punchline');
+        tokens = ['Because he was outstanding in his field.'];
+      }
+
+      for (const token of tokens) {
+        opts.onTextDelta(token);
+      }
+      opts.onComplete();
+      return { runId: `run-${turnCount}`, abort: () => {} };
+    });
+
+    const { controller } = setupController('Tell a joke immediately');
+
+    await controller.startInitialGreeting();
+    await controller.handleCallerUtterance('Yeah. Sure. What\'s up?');
+    await controller.handleCallerUtterance('Tell me the punchline');
+
+    expect(turnCount).toBe(3);
+
+    controller.destroy();
+  });
+
+  // ── ASK_GUARDIAN pattern ──────────────────────────────────────────
+
+  test('ASK_GUARDIAN pattern: detects pattern, creates pending question, enters waiting_on_user', async () => {
+    mockStartVoiceTurn.mockImplementation(createMockVoiceTurn(
+      ['Let me check on that. ', '[ASK_GUARDIAN: What date works best?]'],
+    ));
+    const { session, relay, controller } = setupController('Book appointment');
+
+    await controller.handleCallerUtterance('I need to schedule something');
+
+    // Verify a pending question was created
+    const question = getPendingQuestion(session.id);
+    expect(question).not.toBeNull();
+    expect(question!.questionText).toBe('What date works best?');
+    expect(question!.status).toBe('pending');
+
+    // Verify session status was updated to waiting_on_user
+    const updatedSession = getCallSession(session.id);
+    expect(updatedSession!.status).toBe('waiting_on_user');
+
+    // The ASK_GUARDIAN marker text should NOT appear in the relay tokens
+    const allText = relay.sentTokens.map((t) => t.token).join('');
+    expect(allText).not.toContain('[ASK_GUARDIAN:');
+
+    controller.destroy();
+  });
+
+  test('strips internal context markers from spoken output', async () => {
+    mockStartVoiceTurn.mockImplementation(createMockVoiceTurn([
+      'Thanks for waiting. ',
+      '[USER_ANSWERED: The guardian said 3 PM works.] ',
+      '[USER_INSTRUCTION: Keep this short.] ',
+      '[CALL_OPENING_ACK] ',
+      'I can confirm 3 PM works.',
+    ]));
+    const { relay, controller } = setupController();
+
+    await controller.handleCallerUtterance('Any update?');
+
+    const allText = relay.sentTokens.map((t) => t.token).join('');
+    expect(allText).toContain('Thanks for waiting.');
+    expect(allText).toContain('I can confirm 3 PM works.');
+    expect(allText).not.toContain('[USER_ANSWERED:');
+    expect(allText).not.toContain('[USER_INSTRUCTION:');
+    expect(allText).not.toContain('[CALL_OPENING_ACK]');
+    expect(allText).not.toContain('USER_ANSWERED');
+    expect(allText).not.toContain('USER_INSTRUCTION');
+    expect(allText).not.toContain('CALL_OPENING_ACK');
+
+    controller.destroy();
+  });
+
+  // ── END_CALL pattern ──────────────────────────────────────────────
+
+  test('END_CALL pattern: detects marker, calls endSession, updates status to completed', async () => {
+    mockStartVoiceTurn.mockImplementation(createMockVoiceTurn(
+      ['Thank you for calling, goodbye! ', '[END_CALL]'],
+    ));
+    const { session, relay, controller } = setupController();
+
+    await controller.handleCallerUtterance('That is all, thanks');
+
+    // endSession should have been called
+    expect(relay.endCalled).toBe(true);
+
+    // Session status should be completed
+    const updatedSession = getCallSession(session.id);
+    expect(updatedSession!.status).toBe('completed');
+    expect(updatedSession!.endedAt).not.toBeNull();
+
+    // The END_CALL marker text should NOT appear in the relay tokens
+    const allText = relay.sentTokens.map((t) => t.token).join('');
+    expect(allText).not.toContain('[END_CALL]');
+
+    controller.destroy();
+  });
+
+  // ── handleUserAnswer ──────────────────────────────────────────────
+
+  test('handleUserAnswer: returns true immediately and fires LLM asynchronously', async () => {
+    // First utterance triggers ASK_GUARDIAN
+    mockStartVoiceTurn.mockImplementation(createMockVoiceTurn(
+      ['Hold on. [ASK_GUARDIAN: Preferred time?]'],
+    ));
+    const { relay, controller } = setupController();
+
+    await controller.handleCallerUtterance('I need an appointment');
+
+    // Now provide the answer — reset mock for second turn
+    mockStartVoiceTurn.mockImplementation(async (opts: { content: string; onTextDelta: (t: string) => void; onComplete: () => void }) => {
+      expect(opts.content).toContain('[USER_ANSWERED: 3pm tomorrow]');
+      const tokens = ['Great, I have scheduled for 3pm tomorrow.'];
+      for (const token of tokens) {
+        opts.onTextDelta(token);
+      }
+      opts.onComplete();
+      return { runId: 'run-2', abort: () => {} };
+    });
+
+    const accepted = await controller.handleUserAnswer('3pm tomorrow');
+    expect(accepted).toBe(true);
+
+    // handleUserAnswer fires runTurn without awaiting, so wait briefly
+    // on a timer to let the fire-and-forget work complete.
+    await new Promise((r) => setTimeout(r, 50));
+
+    // Should have streamed a response for the answer
+    const tokensAfterAnswer = relay.sentTokens.filter((t) => t.token.includes('3pm'));
+    expect(tokensAfterAnswer.length).toBeGreaterThan(0);
+
+    controller.destroy();
+  });
+
+  // ── Full mid-call question flow ──────────────────────────────────
+
+  test('mid-call question flow: unavailable time -> ask user -> user confirms -> resumed call', async () => {
+    // Step 1: Caller says "7:30" but it's unavailable. The LLM asks the user.
+    mockStartVoiceTurn.mockImplementation(createMockVoiceTurn(
+      ['I\'m sorry, 7:30 is not available. ', '[ASK_GUARDIAN: Is 8:00 okay instead?]'],
+    ));
+
+    const { session, relay, controller } = setupController('Schedule a haircut');
+
+    await controller.handleCallerUtterance('Can I book for 7:30?');
+
+    // Verify we're in waiting_on_user state
+    expect(controller.getState()).toBe('waiting_on_user');
+    const question = getPendingQuestion(session.id);
+    expect(question).not.toBeNull();
+    expect(question!.questionText).toBe('Is 8:00 okay instead?');
+
+    // Verify session status
+    const midSession = getCallSession(session.id);
+    expect(midSession!.status).toBe('waiting_on_user');
+
+    // Step 2: User answers "Yes, 8:00 works"
+    mockStartVoiceTurn.mockImplementation(createMockVoiceTurn(
+      ['Great, I\'ve booked you for 8:00. See you then! ', '[END_CALL]'],
+    ));
+
+    const accepted = await controller.handleUserAnswer('Yes, 8:00 works for me');
+    expect(accepted).toBe(true);
+
+    // Give the fire-and-forget LLM call time to complete
+    await new Promise((r) => setTimeout(r, 50));
+
+    // Step 3: Verify call completed
+    const endSession = getCallSession(session.id);
+    expect(endSession!.status).toBe('completed');
+    expect(endSession!.endedAt).not.toBeNull();
+
+    // Verify the END_CALL marker triggered endSession on relay
+    expect(relay.endCalled).toBe(true);
+
+    controller.destroy();
+  });
+
+  // ── Error handling ────────────────────────────────────────────────
+
+  test('Voice turn error: sends error message to caller and returns to idle', async () => {
+    mockStartVoiceTurn.mockImplementation(async (opts: { onError: (msg: string) => void }) => {
+      opts.onError('API rate limit exceeded');
+      return { runId: 'run-err', abort: () => {} };
+    });
+
+    const { relay, controller } = setupController();
+
+    await controller.handleCallerUtterance('Hello');
+
+    // Should have sent an error recovery message
+    const errorTokens = relay.sentTokens.filter((t) =>
+      t.token.includes('technical issue'),
+    );
+    expect(errorTokens.length).toBeGreaterThan(0);
+
+    // State should return to idle after error
+    expect(controller.getState()).toBe('idle');
+
+    controller.destroy();
+  });
+
+  test('handleUserAnswer: returns false when not in waiting_on_user state', async () => {
+    const { controller } = setupController();
+
+    // Controller starts in idle state
+    const result = await controller.handleUserAnswer('some answer');
+    expect(result).toBe(false);
+
+    controller.destroy();
+  });
+
+  // ── handleInterrupt ───────────────────────────────────────────────
+
+  test('handleInterrupt: resets state to idle', () => {
+    const { controller } = setupController();
+
+    controller.handleInterrupt();
+    expect(controller.getState()).toBe('idle');
+
+    controller.destroy();
+  });
+
+  test('handleInterrupt: sends turn terminator when interrupting active speech', async () => {
+    mockStartVoiceTurn.mockImplementation(async (opts: { signal?: AbortSignal; onTextDelta: (t: string) => void; onComplete: () => void }) => {
+      return new Promise((resolve) => {
+        // Simulate a long-running turn that can be aborted
+        const timeout = setTimeout(() => {
+          opts.onTextDelta('This should be interrupted');
+          opts.onComplete();
+          resolve({ runId: 'run-1', abort: () => {} });
+        }, 1000);
+
+        opts.signal?.addEventListener('abort', () => {
+          clearTimeout(timeout);
+          // In the real system, generation_cancelled triggers
+          // onComplete via the event sink. The AbortSignal listener
+          // in call-controller also resolves turnComplete defensively.
+          opts.onComplete();
+          resolve({ runId: 'run-1', abort: () => {} });
+        }, { once: true });
+      });
+    });
+
+    const { relay, controller } = setupController();
+    const turnPromise = controller.handleCallerUtterance('Start speaking');
+    await new Promise((r) => setTimeout(r, 5));
+    controller.handleInterrupt();
+    await turnPromise;
+
+    const endTurnMarkers = relay.sentTokens.filter((t) => t.token === '' && t.last === true);
+    expect(endTurnMarkers.length).toBeGreaterThan(0);
+
+    controller.destroy();
+  });
+
+  test('handleInterrupt: turnComplete settles even when event sink callbacks are not called', async () => {
+    // Simulate a turn that never calls onComplete or onError on abort —
+    // the defensive AbortSignal listener in runTurn() should settle the promise.
+    mockStartVoiceTurn.mockImplementation(async (opts: { signal?: AbortSignal; onTextDelta: (t: string) => void; onComplete: () => void }) => {
+      return new Promise((resolve) => {
+        const timeout = setTimeout(() => {
+          opts.onTextDelta('Long running turn');
+          opts.onComplete();
+          resolve({ runId: 'run-1', abort: () => {} });
+        }, 5000);
+
+        opts.signal?.addEventListener('abort', () => {
+          clearTimeout(timeout);
+          // Intentionally do NOT call onComplete — simulates the old
+          // broken path where generation_cancelled was not forwarded.
+          resolve({ runId: 'run-1', abort: () => {} });
+        }, { once: true });
+      });
+    });
+
+    const { controller } = setupController();
+    const turnPromise = controller.handleCallerUtterance('Start speaking');
+    await new Promise((r) => setTimeout(r, 5));
+    controller.handleInterrupt();
+
+    // Should not hang — the AbortSignal listener resolves the promise
+    await turnPromise;
+
+    expect(controller.getState()).toBe('idle');
+
+    controller.destroy();
+  });
+
+  // ── Guardian context pass-through ──────────────────────────────────
+
+  test('handleCallerUtterance: passes guardian context to startVoiceTurn', async () => {
+    const guardianCtx = {
+      sourceChannel: 'voice' as const,
+      actorRole: 'non-guardian' as const,
+      guardianExternalUserId: '+15550009999',
+      guardianChatId: '+15550009999',
+      requesterExternalUserId: '+15550002222',
+    };
+
+    let capturedGuardianContext: unknown = undefined;
+    mockStartVoiceTurn.mockImplementation(async (opts: {
+      guardianContext?: unknown;
+      onTextDelta: (t: string) => void;
+      onComplete: () => void;
+    }) => {
+      capturedGuardianContext = opts.guardianContext;
+      opts.onTextDelta('Hello.');
+      opts.onComplete();
+      return { runId: 'run-gc', abort: () => {} };
+    });
+
+    const { controller } = setupController(undefined, { guardianContext: guardianCtx });
+
+    await controller.handleCallerUtterance('Hello');
+
+    expect(capturedGuardianContext).toEqual(guardianCtx);
+
+    controller.destroy();
+  });
+
+  test('handleCallerUtterance: passes assistantId to startVoiceTurn', async () => {
+    let capturedAssistantId: string | undefined;
+    mockStartVoiceTurn.mockImplementation(async (opts: {
+      assistantId?: string;
+      onTextDelta: (t: string) => void;
+      onComplete: () => void;
+    }) => {
+      capturedAssistantId = opts.assistantId;
+      opts.onTextDelta('Hello.');
+      opts.onComplete();
+      return { runId: 'run-aid', abort: () => {} };
+    });
+
+    const { controller } = setupController(undefined, { assistantId: 'my-assistant' });
+
+    await controller.handleCallerUtterance('Hello');
+
+    expect(capturedAssistantId).toBe('my-assistant');
+
+    controller.destroy();
+  });
+
+  test('setGuardianContext: subsequent turns use updated guardian context', async () => {
+    const initialCtx = {
+      sourceChannel: 'voice' as const,
+      actorRole: 'unverified_channel' as const,
+      denialReason: 'no_binding' as const,
+    };
+
+    const upgradedCtx = {
+      sourceChannel: 'voice' as const,
+      actorRole: 'guardian' as const,
+      guardianExternalUserId: '+15550003333',
+      guardianChatId: '+15550003333',
+    };
+
+    const capturedContexts: unknown[] = [];
+    mockStartVoiceTurn.mockImplementation(async (opts: {
+      guardianContext?: unknown;
+      onTextDelta: (t: string) => void;
+      onComplete: () => void;
+    }) => {
+      capturedContexts.push(opts.guardianContext);
+      opts.onTextDelta('Response.');
+      opts.onComplete();
+      return { runId: `run-${capturedContexts.length}`, abort: () => {} };
+    });
+
+    const { controller } = setupController(undefined, { guardianContext: initialCtx });
+
+    // First turn: unverified
+    await controller.handleCallerUtterance('Hello');
+    expect(capturedContexts[0]).toEqual(initialCtx);
+
+    // Simulate guardian verification succeeding
+    controller.setGuardianContext(upgradedCtx);
+
+    // Second turn: should use upgraded guardian context
+    await controller.handleCallerUtterance('I verified');
+    expect(capturedContexts[1]).toEqual(upgradedCtx);
+
+    controller.destroy();
+  });
+
+  // ── destroy ───────────────────────────────────────────────────────
+
+  test('destroy: unregisters controller', () => {
+    const { session, controller } = setupController();
+
+    // Controller should be registered
+    expect(getCallController(session.id)).toBeDefined();
+
+    controller.destroy();
+
+    // After destroy, controller should be unregistered
+    expect(getCallController(session.id)).toBeUndefined();
+  });
+
+  test('destroy: can be called multiple times without error', () => {
+    const { controller } = setupController();
+
+    controller.destroy();
+    // Second destroy should not throw
+    expect(() => controller.destroy()).not.toThrow();
+  });
+
+  test('destroy: during active turn does not trigger post-turn side effects', async () => {
+    // Simulate a turn that completes after destroy() is called
+    mockStartVoiceTurn.mockImplementation(async (opts: { signal?: AbortSignal; onTextDelta: (t: string) => void; onComplete: () => void }) => {
+      return new Promise((resolve) => {
+        const timeout = setTimeout(() => {
+          opts.onTextDelta('This is a long response');
+          opts.onComplete();
+          resolve({ runId: 'run-1', abort: () => {} });
+        }, 1000);
+
+        opts.signal?.addEventListener('abort', () => {
+          clearTimeout(timeout);
+          // The defensive abort listener in runTurn resolves turnComplete
+          opts.onComplete();
+          resolve({ runId: 'run-1', abort: () => {} });
+        }, { once: true });
+      });
+    });
+
+    const { relay, controller } = setupController();
+    const turnPromise = controller.handleCallerUtterance('Start speaking');
+
+    // Let the turn start
+    await new Promise((r) => setTimeout(r, 5));
+
+    // Destroy the controller while the turn is active
+    controller.destroy();
+
+    // Wait for the turn to settle
+    await turnPromise;
+
+    // Verify that no spurious post-turn side effects occurred after destroy:
+    // in particular, no final empty-string sendTextToken('', true) call
+    // should reach the relay once the turn has been aborted.
+    const endMarkers = relay.sentTokens.filter((t) => t.token === '' && t.last === true);
+
+    // destroy() increments llmRunVersion, so isCurrentRun() returns false
+    // for the aborted turn, preventing post-turn side effects including
+    // the spurious end-of-turn relay.sendTextToken('', true) call.
+    expect(endMarkers.length).toBe(0);
+  });
+
+  // ── handleUserInstruction ─────────────────────────────────────────
+
+  test('handleUserInstruction: injects instruction marker and triggers turn when idle', async () => {
+    mockStartVoiceTurn.mockImplementation(async (opts: { content: string; onTextDelta: (t: string) => void; onComplete: () => void }) => {
+      expect(opts.content).toContain('[USER_INSTRUCTION: Ask about their weekend plans]');
+      const tokens = ['Sure, do you have any weekend plans?'];
+      for (const token of tokens) {
+        opts.onTextDelta(token);
+      }
+      opts.onComplete();
+      return { runId: 'run-instr', abort: () => {} };
+    });
+
+    const { relay, controller } = setupController();
+
+    await controller.handleUserInstruction('Ask about their weekend plans');
+
+    // Should have streamed a response since controller was idle
+    const nonEmptyTokens = relay.sentTokens.filter((t) => t.token.length > 0);
+    expect(nonEmptyTokens.length).toBeGreaterThan(0);
+
+    controller.destroy();
+  });
+
+  test('handleUserInstruction: emits user_instruction_relayed event', async () => {
+    mockStartVoiceTurn.mockImplementation(createMockVoiceTurn(['Understood, adjusting approach.']));
+
+    const { session, controller } = setupController();
+
+    await controller.handleUserInstruction('Be more formal in your tone');
+
+    const events = getCallEvents(session.id);
+    const instructionEvents = events.filter((e) => e.eventType === 'user_instruction_relayed');
+    expect(instructionEvents.length).toBe(1);
+
+    const payload = JSON.parse(instructionEvents[0].payloadJson);
+    expect(payload.instruction).toBe('Be more formal in your tone');
+
+    controller.destroy();
+  });
+
+  test('handleUserInstruction: does not trigger turn when controller is not idle', async () => {
+    // First, trigger ASK_GUARDIAN so controller enters waiting_on_user
+    mockStartVoiceTurn.mockImplementation(createMockVoiceTurn(
+      ['Hold on. [ASK_GUARDIAN: What time?]'],
+    ));
+
+    const { session, controller } = setupController();
+    await controller.handleCallerUtterance('I need an appointment');
+    expect(controller.getState()).toBe('waiting_on_user');
+
+    // Track how many times startVoiceTurn is called
+    let turnCallCount = 0;
+    mockStartVoiceTurn.mockImplementation(async (opts: { onTextDelta: (t: string) => void; onComplete: () => void }) => {
+      turnCallCount++;
+      opts.onTextDelta('Response after instruction.');
+      opts.onComplete();
+      return { runId: 'run-2', abort: () => {} };
+    });
+
+    // Inject instruction while in waiting_on_user state
+    await controller.handleUserInstruction('Suggest morning slots');
+
+    // The turn should NOT have been triggered since we're not idle
+    expect(turnCallCount).toBe(0);
+
+    // But the event should still be recorded
+    const events = getCallEvents(session.id);
+    const instructionEvents = events.filter((e) => e.eventType === 'user_instruction_relayed');
+    expect(instructionEvents.length).toBe(1);
+
+    controller.destroy();
+  });
+});
diff --git a/assistant/src/__tests__/call-orchestrator.test.ts b/assistant/src/__tests__/call-orchestrator.test.ts
deleted file mode 100644
index b82b53df187..00000000000
--- a/assistant/src/__tests__/call-orchestrator.test.ts
+++ /dev/null
@@ -1,1496 +0,0 @@
-import { describe, test, expect, beforeEach, afterAll, mock, type Mock } from 'bun:test';
-import { mkdtempSync, rmSync } from 'node:fs';
-import { tmpdir } from 'node:os';
-import { join } from 'node:path';
-
-const testDir = mkdtempSync(join(tmpdir(), 'call-orchestrator-test-'));
-
-// ── Platform + logger mocks (must come before any source imports) ────
-
-mock.module('../util/platform.js', () => ({
-  getDataDir: () => testDir,
-  isMacOS: () => process.platform === 'darwin',
-  isLinux: () => process.platform === 'linux',
-  isWindows: () => process.platform === 'win32',
-  getSocketPath: () => join(testDir, 'test.sock'),
-  getPidPath: () => join(testDir, 'test.pid'),
-  getDbPath: () => join(testDir, 'test.db'),
-  getLogPath: () => join(testDir, 'test.log'),
-  ensureDataDir: () => {},
-  readHttpToken: () => null,
-}));
-
-mock.module('../util/logger.js', () => ({
-  getLogger: () =>
-    new Proxy({} as Record<string, unknown>, {
-      get: () => () => {},
-    }),
-}));
-
-// ── User reference mock ──────────────────────────────────────────────
-
-let mockUserReference = 'my human';
-
-mock.module('../config/user-reference.js', () => ({
-  resolveUserReference: () => mockUserReference,
-}));
-
-// ── Config mock ─────────────────────────────────────────────────────
-
-let mockCallModel: string | undefined = undefined;
-let mockDisclosure: { enabled: boolean; text: string } = { enabled: false, text: '' };
-
-mock.module('../config/loader.js', () => ({
-  getConfig: () => ({
-    provider: 'anthropic',
-    providerOrder: ['anthropic'],
-    apiKeys: { anthropic: 'test-key' },
-    calls: {
-      enabled: true,
-      provider: 'twilio',
-      maxDurationSeconds: 12 * 60,
-      userConsultTimeoutSeconds: 90,
-      userConsultationTimeoutSeconds: 90,
-      silenceTimeoutSeconds: 30,
-      disclosure: mockDisclosure,
-      safety: { denyCategories: [] },
-      model: mockCallModel,
-    },
-    memory: { enabled: false },
-  }),
-}));
-
-// ── Helpers for building mock provider responses ────────────────────
-
-/**
- * Creates a mock provider sendMessage implementation that emits text_delta
- * events for each token and resolves with the full response.
- */
-function createMockProviderResponse(tokens: string[]) {
-  const fullText = tokens.join('');
-  return async (
-    _messages: unknown[],
-    _tools: unknown[],
-    _systemPrompt: string,
-    options?: { onEvent?: (event: { type: string; text?: string }) => void; signal?: AbortSignal },
-  ) => {
-    // Emit text_delta events for each token
-    for (const token of tokens) {
-      options?.onEvent?.({ type: 'text_delta', text: token });
-    }
-    return {
-      content: [{ type: 'text', text: fullText }],
-      model: 'claude-sonnet-4-20250514',
-      usage: { inputTokens: 100, outputTokens: 50 },
-      stopReason: 'end_turn',
-    };
-  };
-}
-
-// ── Provider registry mock ──────────────────────────────────────────
-
-// eslint-disable-next-line @typescript-eslint/no-explicit-any
-let mockSendMessage: Mock<any>;
-
-mock.module('../providers/registry.js', () => {
-  mockSendMessage = mock(createMockProviderResponse(['Hello', ' there']));
-  return {
-    listProviders: () => ['anthropic'],
-    getFailoverProvider: () => ({
-      name: 'anthropic',
-      sendMessage: (...args: unknown[]) => mockSendMessage(...args),
-    }),
-    getDefaultModel: (providerName: string) => {
-      const defaults: Record<string, string> = {
-        anthropic: 'claude-opus-4-6',
-        openai: 'gpt-5.2',
-        gemini: 'gemini-3-flash',
-        ollama: 'llama3.2',
-        fireworks: 'accounts/fireworks/models/kimi-k2p5',
-        openrouter: 'x-ai/grok-4',
-      };
-      return defaults[providerName] ?? defaults.anthropic;
-    },
-  };
-});
-
-mock.module('../providers/provider-send-message.js', () => ({
-  resolveConfiguredProvider: () => ({
-    provider: {
-      name: 'anthropic',
-      sendMessage: (...args: unknown[]) => mockSendMessage(...args),
-    },
-    configuredProviderName: 'anthropic',
-    selectedProviderName: 'anthropic',
-    usedFallbackPrimary: false,
-  }),
-  getConfiguredProvider: () => ({
-    name: 'anthropic',
-    sendMessage: (...args: unknown[]) => mockSendMessage(...args),
-  }),
-}));
-
-// ── Import source modules after all mocks are registered ────────────
-
-import { initializeDb, getDb, resetDb } from '../memory/db.js';
-import { conversations } from '../memory/schema.js';
-import {
-  createCallSession,
-  getCallSession,
-  getCallEvents,
-  getPendingQuestion,
-  updateCallSession,
-} from '../calls/call-store.js';
-import {
-  getCallOrchestrator,
-} from '../calls/call-state.js';
-import { CallOrchestrator } from '../calls/call-orchestrator.js';
-import type { RelayConnection } from '../calls/relay-server.js';
-
-initializeDb();
-
-afterAll(() => {
-  resetDb();
-  try {
-    rmSync(testDir, { recursive: true });
-  } catch {
-    /* best effort */
-  }
-});
-
-// ── RelayConnection mock factory ────────────────────────────────────
-
-interface MockRelay extends RelayConnection {
-  sentTokens: Array<{ token: string; last: boolean }>;
-  endCalled: boolean;
-  endReason: string | undefined;
-}
-
-function createMockRelay(): MockRelay {
-  const state = {
-    sentTokens: [] as Array<{ token: string; last: boolean }>,
-    _endCalled: false,
-    _endReason: undefined as string | undefined,
-  };
-
-  return {
-    get sentTokens() { return state.sentTokens; },
-    get endCalled() { return state._endCalled; },
-    get endReason() { return state._endReason; },
-    sendTextToken(token: string, last: boolean) {
-      state.sentTokens.push({ token, last });
-    },
-    endSession(reason?: string) {
-      state._endCalled = true;
-      state._endReason = reason;
-    },
-  } as unknown as MockRelay;
-}
-
-// ── Helpers ─────────────────────────────────────────────────────────
-
-let ensuredConvIds = new Set<string>();
-function ensureConversation(id: string): void {
-  if (ensuredConvIds.has(id)) return;
-  const db = getDb();
-  const now = Date.now();
-  db.insert(conversations).values({
-    id,
-    title: `Test conversation ${id}`,
-    createdAt: now,
-    updatedAt: now,
-  }).run();
-  ensuredConvIds.add(id);
-}
-
-function resetTables() {
-  const db = getDb();
-  db.run('DELETE FROM guardian_action_deliveries');
-  db.run('DELETE FROM guardian_action_requests');
-  db.run('DELETE FROM call_pending_questions');
-  db.run('DELETE FROM call_events');
-  db.run('DELETE FROM call_sessions');
-  db.run('DELETE FROM tool_invocations');
-  db.run('DELETE FROM messages');
-  db.run('DELETE FROM conversations');
-  ensuredConvIds = new Set();
-}
-
-/**
- * Create a call session and an orchestrator wired to a mock relay.
- */
-function setupOrchestrator(task?: string) {
-  ensureConversation('conv-orch-test');
-  const session = createCallSession({
-    conversationId: 'conv-orch-test',
-    provider: 'twilio',
-    fromNumber: '+15551111111',
-    toNumber: '+15552222222',
-    task,
-  });
-  updateCallSession(session.id, { status: 'in_progress' });
-  const relay = createMockRelay();
-  const orchestrator = new CallOrchestrator(session.id, relay as unknown as RelayConnection, task ?? null);
-  return { session, relay, orchestrator };
-}
-
-describe('call-orchestrator', () => {
-  beforeEach(() => {
-    resetTables();
-    mockCallModel = undefined;
-    mockUserReference = 'my human';
-    mockDisclosure = { enabled: false, text: '' };
-    // Reset the provider mock to default behaviour
-    mockSendMessage.mockImplementation(createMockProviderResponse(['Hello', ' there']));
-  });
-
-  // ── handleCallerUtterance ─────────────────────────────────────────
-
-  test('handleCallerUtterance: streams tokens via sendTextToken', async () => {
-    mockSendMessage.mockImplementation(createMockProviderResponse(['Hi', ', how', ' are you?']));
-    const { relay, orchestrator } = setupOrchestrator();
-
-    await orchestrator.handleCallerUtterance('Hello');
-
-    // Verify tokens were sent to the relay
-    const nonEmptyTokens = relay.sentTokens.filter((t) => t.token.length > 0);
-    expect(nonEmptyTokens.length).toBeGreaterThan(0);
-    // The last token should have last=true (empty string token signaling end)
-    const lastToken = relay.sentTokens[relay.sentTokens.length - 1];
-    expect(lastToken.last).toBe(true);
-
-    orchestrator.destroy();
-  });
-
-  test('handleCallerUtterance: sends last=true at end of turn', async () => {
-    mockSendMessage.mockImplementation(createMockProviderResponse(['Simple response.']));
-    const { relay, orchestrator } = setupOrchestrator();
-
-    await orchestrator.handleCallerUtterance('Test');
-
-    // Collect every token flagged last=true; at least one must mark end of turn
-    const endMarkers = relay.sentTokens.filter((t) => t.last === true);
-    expect(endMarkers.length).toBeGreaterThanOrEqual(1);
-
-    orchestrator.destroy();
-  });
-
-  test('handleCallerUtterance: includes speaker context in model message', async () => {
-    mockSendMessage.mockImplementation(async (messages: unknown[], ..._rest: unknown[]) => {
-      const msgs = messages as Array<{ role: string; content: Array<{ type: string; text: string }> }>;
-      const userMessage = msgs.find((m) => m.role === 'user');
-      const userText = userMessage?.content?.[0]?.text ?? '';
-      expect(userText).toContain('[SPEAKER id="speaker-1" label="Aaron" source="provider" confidence="0.91"]');
-      expect(userText).toContain('Can you summarize this meeting?');
-      return {
-        content: [{ type: 'text', text: 'Sure, here is a summary.' }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator();
-
-    await orchestrator.handleCallerUtterance('Can you summarize this meeting?', {
-      speakerId: 'speaker-1',
-      speakerLabel: 'Aaron',
-      speakerConfidence: 0.91,
-      source: 'provider',
-    });
-
-    orchestrator.destroy();
-  });
-
-  test('startInitialGreeting: generates model-driven opening and strips control marker from speech', async () => {
-    mockSendMessage.mockImplementation(async (messages: unknown[], ..._rest: unknown[]) => {
-      const msgs = messages as Array<{ role: string; content: Array<{ type: string; text: string }> }>;
-      const firstUser = msgs.find((m) => m.role === 'user');
-      expect(firstUser?.content?.[0]?.text).toContain('[CALL_OPENING]');
-      const tokens = ['Hi, I am calling about your appointment request. Is now a good time to talk?'];
-      const opts = _rest[2] as { onEvent?: (event: { type: string; text?: string }) => void } | undefined;
-      for (const token of tokens) {
-        opts?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { relay, orchestrator } = setupOrchestrator('Confirm appointment');
-
-    const callCountBefore = mockSendMessage.mock.calls.length;
-    await orchestrator.startInitialGreeting();
-    await orchestrator.startInitialGreeting();
-
-    const allText = relay.sentTokens.map((t) => t.token).join('');
-    expect(allText).toContain('appointment request');
-    expect(allText).toContain('good time to talk');
-    expect(allText).not.toContain('[CALL_OPENING]');
-    expect(mockSendMessage.mock.calls.length - callCountBefore).toBe(1);
-
-    orchestrator.destroy();
-  });
-
-  test('startInitialGreeting: tags only the first caller response with CALL_OPENING_ACK', async () => {
-    let callCount = 0;
-    mockSendMessage.mockImplementation(async (messages: unknown[], _tools: unknown[], _systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      callCount++;
-      const msgs = messages as Array<{ role: string; content: Array<{ type: string; text: string }> }>;
-      const userMessages = msgs.filter((m) => m.role === 'user');
-      const lastUser = userMessages[userMessages.length - 1]?.content?.[0]?.text ?? '';
-
-      let tokens: string[];
-      if (callCount === 1) {
-        expect(lastUser).toContain('[CALL_OPENING]');
-        tokens = ['Hey Noa, it\'s Credence calling about your joke request. Is now okay for a quick one?'];
-      } else if (callCount === 2) {
-        expect(lastUser).toContain('[CALL_OPENING_ACK]');
-        expect(lastUser).toContain('Yeah. Sure. What\'s up?');
-        tokens = ['Great, here\'s one right away. Why did the scarecrow win an award?'];
-      } else {
-        expect(lastUser).not.toContain('[CALL_OPENING_ACK]');
-        expect(lastUser).toContain('Tell me the punchline');
-        tokens = ['Because he was outstanding in his field.'];
-      }
-
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator('Tell a joke immediately');
-
-    await orchestrator.startInitialGreeting();
-    await orchestrator.handleCallerUtterance('Yeah. Sure. What\'s up?');
-    await orchestrator.handleCallerUtterance('Tell me the punchline');
-
-    expect(callCount).toBe(3);
-
-    orchestrator.destroy();
-  });
-
-  // ── ASK_GUARDIAN pattern ──────────────────────────────────────────
-
-  test('ASK_GUARDIAN pattern: detects pattern, creates pending question, enters waiting_on_user', async () => {
-    mockSendMessage.mockImplementation(createMockProviderResponse(
-      ['Let me check on that. ', '[ASK_GUARDIAN: What date works best?]'],
-    ));
-    const { session, relay, orchestrator } = setupOrchestrator('Book appointment');
-
-    await orchestrator.handleCallerUtterance('I need to schedule something');
-
-    // Verify a pending question was created
-    const question = getPendingQuestion(session.id);
-    expect(question).not.toBeNull();
-    expect(question!.questionText).toBe('What date works best?');
-    expect(question!.status).toBe('pending');
-
-    // Verify session status was updated to waiting_on_user
-    const updatedSession = getCallSession(session.id);
-    expect(updatedSession!.status).toBe('waiting_on_user');
-
-    // The ASK_GUARDIAN marker text should NOT appear in the relay tokens
-    const allText = relay.sentTokens.map((t) => t.token).join('');
-    expect(allText).not.toContain('[ASK_GUARDIAN:');
-
-    orchestrator.destroy();
-  });
-
-  test('strips internal context markers from spoken output', async () => {
-    mockSendMessage.mockImplementation(createMockProviderResponse([
-      'Thanks for waiting. ',
-      '[USER_ANSWERED: The guardian said 3 PM works.] ',
-      '[USER_INSTRUCTION: Keep this short.] ',
-      '[CALL_OPENING_ACK] ',
-      'I can confirm 3 PM works.',
-    ]));
-    const { relay, orchestrator } = setupOrchestrator();
-
-    await orchestrator.handleCallerUtterance('Any update?');
-
-    const allText = relay.sentTokens.map((t) => t.token).join('');
-    expect(allText).toContain('Thanks for waiting.');
-    expect(allText).toContain('I can confirm 3 PM works.');
-    expect(allText).not.toContain('[USER_ANSWERED:');
-    expect(allText).not.toContain('[USER_INSTRUCTION:');
-    expect(allText).not.toContain('[CALL_OPENING_ACK]');
-    expect(allText).not.toContain('USER_ANSWERED');
-    expect(allText).not.toContain('USER_INSTRUCTION');
-    expect(allText).not.toContain('CALL_OPENING_ACK');
-
-    orchestrator.destroy();
-  });
-
-  // ── END_CALL pattern ──────────────────────────────────────────────
-
-  test('END_CALL pattern: detects marker, calls endSession, updates status to completed', async () => {
-    mockSendMessage.mockImplementation(createMockProviderResponse(
-      ['Thank you for calling, goodbye! ', '[END_CALL]'],
-    ));
-    const { session, relay, orchestrator } = setupOrchestrator();
-
-    await orchestrator.handleCallerUtterance('That is all, thanks');
-
-    // endSession should have been called
-    expect(relay.endCalled).toBe(true);
-
-    // Session status should be completed
-    const updatedSession = getCallSession(session.id);
-    expect(updatedSession!.status).toBe('completed');
-    expect(updatedSession!.endedAt).not.toBeNull();
-
-    // The END_CALL marker text should NOT appear in the relay tokens
-    const allText = relay.sentTokens.map((t) => t.token).join('');
-    expect(allText).not.toContain('[END_CALL]');
-
-    orchestrator.destroy();
-  });
-
-  // ── handleUserAnswer ──────────────────────────────────────────────
-
-  test('handleUserAnswer: returns true immediately and fires LLM asynchronously', async () => {
-    // First utterance triggers ASK_GUARDIAN
-    mockSendMessage.mockImplementation(createMockProviderResponse(
-      ['Hold on. [ASK_GUARDIAN: Preferred time?]'],
-    ));
-    const { relay, orchestrator } = setupOrchestrator();
-
-    await orchestrator.handleCallerUtterance('I need an appointment');
-
-    // Now provide the answer — reset mock for second LLM call
-    mockSendMessage.mockImplementation(async (messages: unknown[], ..._rest: unknown[]) => {
-      // Verify the messages include the USER_ANSWERED marker
-      const msgs = messages as Array<{ role: string; content: Array<{ type: string; text: string }> }>;
-      const lastUserMsg = msgs.filter((m) => m.role === 'user').pop();
-      expect(lastUserMsg?.content?.[0]?.text).toContain('[USER_ANSWERED: 3pm tomorrow]');
-      const tokens = ['Great, I have scheduled for 3pm tomorrow.'];
-      const opts = _rest[2] as { onEvent?: (event: { type: string; text?: string }) => void } | undefined;
-      for (const token of tokens) {
-        opts?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const accepted = await orchestrator.handleUserAnswer('3pm tomorrow');
-    expect(accepted).toBe(true);
-
-    // handleUserAnswer fires runLlm without awaiting, so wait briefly on a
-    // timer (not just a microtask tick) to let the async LLM work complete.
-    await new Promise((r) => setTimeout(r, 50));
-
-    // Should have streamed a response for the answer
-    const tokensAfterAnswer = relay.sentTokens.filter((t) => t.token.includes('3pm'));
-    expect(tokensAfterAnswer.length).toBeGreaterThan(0);
-
-    orchestrator.destroy();
-  });
-
-  // ── Full mid-call question flow ──────────────────────────────────
-
-  test('mid-call question flow: unavailable time → ask user → user confirms → resumed call', async () => {
-    // Step 1: Caller says "7:30" but it's unavailable. The LLM asks the user.
-    mockSendMessage.mockImplementation(createMockProviderResponse(
-      ['I\'m sorry, 7:30 is not available. ', '[ASK_GUARDIAN: Is 8:00 okay instead?]'],
-    ));
-
-    const { session, relay, orchestrator } = setupOrchestrator('Schedule a haircut');
-
-    await orchestrator.handleCallerUtterance('Can I book for 7:30?');
-
-    // Verify we're in waiting_on_user state
-    expect(orchestrator.getState()).toBe('waiting_on_user');
-    const question = getPendingQuestion(session.id);
-    expect(question).not.toBeNull();
-    expect(question!.questionText).toBe('Is 8:00 okay instead?');
-
-    // Verify session status
-    const midSession = getCallSession(session.id);
-    expect(midSession!.status).toBe('waiting_on_user');
-
-    // Step 2: User answers "Yes, 8:00 works"
-    mockSendMessage.mockImplementation(createMockProviderResponse(
-      ['Great, I\'ve booked you for 8:00. See you then! ', '[END_CALL]'],
-    ));
-
-    const accepted = await orchestrator.handleUserAnswer('Yes, 8:00 works for me');
-    expect(accepted).toBe(true);
-
-    // Give the fire-and-forget LLM call time to complete
-    await new Promise((r) => setTimeout(r, 50));
-
-    // Step 3: Verify call completed
-    const endSession = getCallSession(session.id);
-    expect(endSession!.status).toBe('completed');
-    expect(endSession!.endedAt).not.toBeNull();
-
-    // Verify the END_CALL marker triggered endSession on relay
-    expect(relay.endCalled).toBe(true);
-
-    orchestrator.destroy();
-  });
-
-  // ── Provider / LLM failure paths ───────────────────────────────
-
-  test('LLM error: sends error message to caller and returns to idle', async () => {
-    // Make sendMessage reject with an error
-    mockSendMessage.mockImplementation(async () => {
-      throw new Error('API rate limit exceeded');
-    });
-
-    const { relay, orchestrator } = setupOrchestrator();
-
-    await orchestrator.handleCallerUtterance('Hello');
-
-    // Should have sent an error recovery message
-    const errorTokens = relay.sentTokens.filter((t) =>
-      t.token.includes('technical issue'),
-    );
-    expect(errorTokens.length).toBeGreaterThan(0);
-
-    // State should return to idle after error
-    expect(orchestrator.getState()).toBe('idle');
-
-    orchestrator.destroy();
-  });
-
-  test('LLM APIUserAbortError: treats as expected abort without technical-issue fallback', async () => {
-    mockSendMessage.mockImplementation(async () => {
-      const err = new Error('user abort');
-      err.name = 'APIUserAbortError';
-      throw err;
-    });
-
-    const { relay, orchestrator } = setupOrchestrator();
-    await orchestrator.handleCallerUtterance('Hello');
-
-    const errorTokens = relay.sentTokens.filter((t) => t.token.includes('technical issue'));
-    expect(errorTokens.length).toBe(0);
-    expect(orchestrator.getState()).toBe('idle');
-
-    orchestrator.destroy();
-  });
-
-  test('stale superseded turn errors do not emit technical-issue fallback', async () => {
-    let callCount = 0;
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], _systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      callCount++;
-      if (callCount === 1) {
-        return new Promise((_, reject) => {
-          setTimeout(() => reject(new Error('stale stream failure')), 20);
-        });
-      }
-      const tokens = ['Second turn response.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { relay, orchestrator } = setupOrchestrator();
-
-    const firstTurnPromise = orchestrator.handleCallerUtterance('First utterance');
-    // Allow the first turn to enter runLlm before the second utterance interrupts it.
-    await new Promise((r) => setTimeout(r, 5));
-    const secondTurnPromise = orchestrator.handleCallerUtterance('Second utterance');
-
-    await Promise.all([firstTurnPromise, secondTurnPromise]);
-
-    const allTokens = relay.sentTokens.map((t) => t.token).join('');
-    expect(allTokens).toContain('Second turn response.');
-    expect(allTokens).not.toContain('technical issue');
-
-    orchestrator.destroy();
-  });
-
-  test('barge-in cleanup never sends empty user turns to provider', async () => {
-    let callCount = 0;
-    mockSendMessage.mockImplementation(async (messages: unknown[], _tools: unknown[], _systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void; signal?: AbortSignal }) => {
-      callCount++;
-
-      // Initial outbound opener
-      if (callCount === 1) {
-        const tokens = ['Hey Noa, this is Credence calling.'];
-        for (const token of tokens) {
-          options?.onEvent?.({ type: 'text_delta', text: token });
-        }
-        return {
-          content: [{ type: 'text', text: tokens.join('') }],
-          model: 'claude-sonnet-4-20250514',
-          usage: { inputTokens: 100, outputTokens: 50 },
-          stopReason: 'end_turn',
-        };
-      }
-
-      // First caller turn enters an in-flight LLM run that gets interrupted
-      if (callCount === 2) {
-        return new Promise((_, reject) => {
-          options?.signal?.addEventListener('abort', () => {
-            const err = new Error('aborted');
-            err.name = 'AbortError';
-            reject(err);
-          }, { once: true });
-        });
-      }
-
-      // Second caller turn should never include an empty user message.
-      const msgs = messages as Array<{ role: string; content: Array<{ type: string; text: string }> }>;
-      const userMessages = msgs.filter((m) => m.role === 'user');
-      expect(userMessages.length).toBeGreaterThan(0);
-      expect(userMessages.every((m) => m.content?.[0]?.text?.trim().length > 0)).toBe(true);
-      const tokens = ['Got it, thanks for clarifying.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { relay, orchestrator } = setupOrchestrator('Quick check-in');
-    await orchestrator.startInitialGreeting();
-
-    const firstTurnPromise = orchestrator.handleCallerUtterance('Hello?');
-    await new Promise((r) => setTimeout(r, 5));
-    const secondTurnPromise = orchestrator.handleCallerUtterance('What have you been up to lately?');
-
-    await Promise.all([firstTurnPromise, secondTurnPromise]);
-
-    const allTokens = relay.sentTokens.map((t) => t.token).join('');
-    expect(allTokens).toContain('Got it, thanks for clarifying.');
-    expect(allTokens).not.toContain('technical issue');
-
-    orchestrator.destroy();
-  });
-
-  test('rapid caller barge-in coalesces contiguous user turns for role alternation', async () => {
-    let callCount = 0;
-    mockSendMessage.mockImplementation(async (messages: unknown[], _tools: unknown[], _systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void; signal?: AbortSignal }) => {
-      callCount++;
-      if (callCount === 1) {
-        return new Promise((_, reject) => {
-          options?.signal?.addEventListener('abort', () => {
-            const err = new Error('aborted');
-            err.name = 'AbortError';
-            reject(err);
-          }, { once: true });
-        });
-      }
-
-      const msgs = messages as Array<{ role: string; content: Array<{ type: string; text: string }> }>;
-      const roles = msgs.map((m) => m.role);
-      for (let i = 1; i < roles.length; i++) {
-        expect(!(roles[i - 1] === 'user' && roles[i] === 'user')).toBe(true);
-      }
-      const userMessages = msgs.filter((m) => m.role === 'user');
-      const lastUser = userMessages[userMessages.length - 1];
-      expect(lastUser?.content?.[0]?.text).toContain('First caller utterance');
-      expect(lastUser?.content?.[0]?.text).toContain('Second caller utterance');
-      const tokens = ['Merged turn handled.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { relay, orchestrator } = setupOrchestrator();
-    const firstTurnPromise = orchestrator.handleCallerUtterance('First caller utterance');
-    await new Promise((r) => setTimeout(r, 5));
-    const secondTurnPromise = orchestrator.handleCallerUtterance('Second caller utterance');
-
-    await Promise.all([firstTurnPromise, secondTurnPromise]);
-
-    const allTokens = relay.sentTokens.map((t) => t.token).join('');
-    expect(allTokens).toContain('Merged turn handled.');
-
-    orchestrator.destroy();
-  });
-
-  test('interrupt then next caller prompt still preserves role alternation', async () => {
-    let callCount = 0;
-    mockSendMessage.mockImplementation(async (messages: unknown[], _tools: unknown[], _systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void; signal?: AbortSignal }) => {
-      callCount++;
-      if (callCount === 1) {
-        return new Promise((_, reject) => {
-          options?.signal?.addEventListener('abort', () => {
-            const err = new Error('aborted');
-            err.name = 'AbortError';
-            reject(err);
-          }, { once: true });
-        });
-      }
-
-      const msgs = messages as Array<{ role: string; content: Array<{ type: string; text: string }> }>;
-      const roles = msgs.map((m) => m.role);
-      for (let i = 1; i < roles.length; i++) {
-        expect(!(roles[i - 1] === 'user' && roles[i] === 'user')).toBe(true);
-      }
-      const userMessages = msgs.filter((m) => m.role === 'user');
-      const lastUser = userMessages[userMessages.length - 1];
-      expect(lastUser?.content?.[0]?.text).toContain('First caller utterance');
-      expect(lastUser?.content?.[0]?.text).toContain('Second caller utterance');
-      const tokens = ['Post-interrupt response.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { relay, orchestrator } = setupOrchestrator();
-    const firstTurnPromise = orchestrator.handleCallerUtterance('First caller utterance');
-    await new Promise((r) => setTimeout(r, 5));
-    orchestrator.handleInterrupt();
-    const secondTurnPromise = orchestrator.handleCallerUtterance('Second caller utterance');
-
-    await Promise.all([firstTurnPromise, secondTurnPromise]);
-
-    const allTokens = relay.sentTokens.map((t) => t.token).join('');
-    expect(allTokens).toContain('Post-interrupt response.');
-    expect(allTokens).not.toContain('technical issue');
-
-    orchestrator.destroy();
-  });
-
-  test('handleUserAnswer: returns false when not in waiting_on_user state', async () => {
-    const { orchestrator } = setupOrchestrator();
-
-    // Orchestrator starts in idle state
-    const result = await orchestrator.handleUserAnswer('some answer');
-    expect(result).toBe(false);
-
-    orchestrator.destroy();
-  });
-
-  // ── handleInterrupt ───────────────────────────────────────────────
-
-  test('handleInterrupt: resets state to idle', () => {
-    const { orchestrator } = setupOrchestrator();
-
-    // Calling handleInterrupt should not throw
-    orchestrator.handleInterrupt();
-
-    orchestrator.destroy();
-  });
-
-  test('handleInterrupt: increments llmRunVersion to suppress stale turn side effects', async () => {
-    // Use a sendMessage that resolves immediately but whose continuation
-    // (the code after `await provider.sendMessage()`) will run asynchronously.
-    // This simulates the race where the promise microtask is queued right
-    // as handleInterrupt fires.
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], _systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      // Emit some tokens synchronously
-      options?.onEvent?.({ type: 'text_delta', text: 'Stale response that should be suppressed.' });
-      return {
-        content: [{ type: 'text', text: 'Stale response that should be suppressed.' }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { relay, orchestrator } = setupOrchestrator();
-
-    // Start an LLM turn (don't await — we want to interrupt mid-flight)
-    const turnPromise = orchestrator.handleCallerUtterance('Hello');
-
-    // Interrupt immediately. Because sendMessage resolves as a microtask,
-    // its continuation hasn't run yet. handleInterrupt increments
-    // llmRunVersion so the continuation's isCurrentRun check will fail.
-    orchestrator.handleInterrupt();
-
-    // Let the stale turn's microtask continuation execute
-    await turnPromise;
-
-    // The orchestrator should remain idle — the stale turn must not
-    // have pushed state to waiting_on_user or any other post-turn state.
-    expect(orchestrator.getState()).toBe('idle');
-
-    // No technical-issue fallback should have been sent
-    const errorTokens = relay.sentTokens.filter((t) => t.token.includes('technical issue'));
-    expect(errorTokens.length).toBe(0);
-
-    // endSession should NOT have been called by the stale turn
-    expect(relay.endCalled).toBe(false);
-
-    orchestrator.destroy();
-  });
-
-  test('handleInterrupt: sends turn terminator when interrupting active speech', async () => {
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], _systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void; signal?: AbortSignal }) => {
-      return new Promise((_, reject) => {
-        options?.signal?.addEventListener('abort', () => {
-          const err = new Error('aborted');
-          err.name = 'AbortError';
-          reject(err);
-        }, { once: true });
-      });
-    });
-
-    const { relay, orchestrator } = setupOrchestrator();
-    const turnPromise = orchestrator.handleCallerUtterance('Start speaking');
-    await new Promise((r) => setTimeout(r, 5));
-    orchestrator.handleInterrupt();
-    await turnPromise;
-
-    const endTurnMarkers = relay.sentTokens.filter((t) => t.token === '' && t.last === true);
-    expect(endTurnMarkers.length).toBeGreaterThan(0);
-
-    orchestrator.destroy();
-  });
-
-  // ── destroy ───────────────────────────────────────────────────────
-
-  test('destroy: unregisters orchestrator', () => {
-    const { session, orchestrator } = setupOrchestrator();
-
-    // Orchestrator should be registered
-    expect(getCallOrchestrator(session.id)).toBeDefined();
-
-    orchestrator.destroy();
-
-    // After destroy, orchestrator should be unregistered
-    expect(getCallOrchestrator(session.id)).toBeUndefined();
-  });
-
-  test('destroy: can be called multiple times without error', () => {
-    const { orchestrator } = setupOrchestrator();
-
-    orchestrator.destroy();
-    // Second destroy should not throw
-    expect(() => orchestrator.destroy()).not.toThrow();
-  });
-
-  // ── Model override from config ──────────────────────────────────────
-
-  test('does not override model when calls.model is not set (preserves cross-provider failover)', async () => {
-    mockCallModel = undefined;
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], _systemPrompt: unknown, options?: { config?: { model?: string }; onEvent?: (event: { type: string; text?: string }) => void }) => {
-      // When calls.model is unset, no model override should be passed so each
-      // provider in the failover chain uses its own default model.
-      expect(options?.config?.model).toBeUndefined();
-      const tokens = ['Default model response.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-opus-4-6',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator();
-    await orchestrator.handleCallerUtterance('Hello');
-    orchestrator.destroy();
-  });
-
-  test('uses calls.model override from config when set', async () => {
-    mockCallModel = 'claude-haiku-4-5-20251001';
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], _systemPrompt: unknown, options?: { config?: { model: string }; onEvent?: (event: { type: string; text?: string }) => void }) => {
-      expect(options?.config?.model).toBe('claude-haiku-4-5-20251001');
-      const tokens = ['Override model response.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-haiku-4-5-20251001',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator();
-    await orchestrator.handleCallerUtterance('Hello');
-    orchestrator.destroy();
-  });
-
-  test('treats empty string calls.model as unset and omits model override', async () => {
-    mockCallModel = '';
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], _systemPrompt: unknown, options?: { config?: { model?: string }; onEvent?: (event: { type: string; text?: string }) => void }) => {
-      // Empty string is treated as unset — no model override
-      expect(options?.config?.model).toBeUndefined();
-      const tokens = ['Fallback model response.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-opus-4-6',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator();
-    await orchestrator.handleCallerUtterance('Hello');
-    orchestrator.destroy();
-  });
-
-  test('treats whitespace-only calls.model as unset and omits model override', async () => {
-    mockCallModel = '   ';
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], _systemPrompt: unknown, options?: { config?: { model?: string }; onEvent?: (event: { type: string; text?: string }) => void }) => {
-      // Whitespace-only is treated as unset — no model override
-      expect(options?.config?.model).toBeUndefined();
-      const tokens = ['Fallback model response.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-opus-4-6',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator();
-    await orchestrator.handleCallerUtterance('Hello');
-    orchestrator.destroy();
-  });
-
-  // ── handleUserInstruction ─────────────────────────────────────────
-
-  test('handleUserInstruction: injects instruction marker into conversation history and triggers LLM when idle', async () => {
-    mockSendMessage.mockImplementation(async (messages: unknown[], _tools: unknown[], _systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      const msgs = messages as Array<{ role: string; content: Array<{ type: string; text: string }> }>;
-      const instructionMsg = msgs.find((m) =>
-        m.role === 'user' && m.content?.[0]?.text?.includes('[USER_INSTRUCTION:'),
-      );
-      expect(instructionMsg).toBeDefined();
-      expect(instructionMsg!.content[0].text).toContain('[USER_INSTRUCTION: Ask about their weekend plans]');
-      const tokens = ['Sure, do you have any weekend plans?'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { relay, orchestrator } = setupOrchestrator();
-
-    await orchestrator.handleUserInstruction('Ask about their weekend plans');
-
-    // Should have streamed a response since orchestrator was idle
-    const nonEmptyTokens = relay.sentTokens.filter((t) => t.token.length > 0);
-    expect(nonEmptyTokens.length).toBeGreaterThan(0);
-
-    orchestrator.destroy();
-  });
-
-  test('handleUserInstruction: does not break existing answer flow', async () => {
-    // Step 1: Caller says something, LLM responds normally
-    mockSendMessage.mockImplementation(createMockProviderResponse(['Hello! How can I help you today?']));
-    const { session: _session, relay, orchestrator } = setupOrchestrator('Book appointment');
-
-    await orchestrator.handleCallerUtterance('Hi there');
-
-    // Step 2: Inject an instruction while idle
-    mockSendMessage.mockImplementation(async (messages: unknown[], _tools: unknown[], _systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      const msgs = messages as Array<{ role: string; content: Array<{ type: string; text: string }> }>;
-      // Verify the history contains both the original exchange and the instruction
-      expect(msgs.length).toBeGreaterThanOrEqual(3); // user utterance + assistant response + instruction
-      const instructionMsg = msgs.find((m) =>
-        m.role === 'user' && m.content?.[0]?.text?.includes('[USER_INSTRUCTION:'),
-      );
-      expect(instructionMsg).toBeDefined();
-      const tokens = ['Of course, let me mention the weekend special.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    await orchestrator.handleUserInstruction('Mention the weekend special');
-
-    // Step 3: Caller speaks again — the flow should continue normally
-    mockSendMessage.mockImplementation(createMockProviderResponse(
-      ['Great choice! The weekend special is 20% off.'],
-    ));
-
-    await orchestrator.handleCallerUtterance('Tell me more about that');
-
-    // Verify state is idle after the normal flow
-    expect(orchestrator.getState()).toBe('idle');
-
-    // Verify relay received tokens from all exchanges
-    const allText = relay.sentTokens.map((t) => t.token).join('');
-    expect(allText).toContain('Hello');
-    expect(allText).toContain('weekend special');
-
-    orchestrator.destroy();
-  });
-
-  test('handleUserInstruction: emits user_instruction_relayed event', async () => {
-    mockSendMessage.mockImplementation(createMockProviderResponse(['Understood, adjusting approach.']));
-
-    const { session, orchestrator } = setupOrchestrator();
-
-    await orchestrator.handleUserInstruction('Be more formal in your tone');
-
-    const events = getCallEvents(session.id);
-    const instructionEvents = events.filter((e) => e.eventType === 'user_instruction_relayed');
-    expect(instructionEvents.length).toBe(1);
-
-    const payload = JSON.parse(instructionEvents[0].payloadJson);
-    expect(payload.instruction).toBe('Be more formal in your tone');
-
-    orchestrator.destroy();
-  });
-
-  test('handleUserInstruction: does not trigger LLM when orchestrator is not idle', async () => {
-    // First, trigger ASK_GUARDIAN so orchestrator enters waiting_on_user
-    mockSendMessage.mockImplementation(createMockProviderResponse(
-      ['Hold on. [ASK_GUARDIAN: What time?]'],
-    ));
-
-    const { session, orchestrator } = setupOrchestrator();
-    await orchestrator.handleCallerUtterance('I need an appointment');
-    expect(orchestrator.getState()).toBe('waiting_on_user');
-
-    // Track how many times the provider mock is called
-    let streamCallCount = 0;
-    mockSendMessage.mockImplementation(async () => {
-      streamCallCount++;
-      return {
-        content: [{ type: 'text', text: 'Response after instruction.' }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    // Inject instruction while in waiting_on_user state
-    await orchestrator.handleUserInstruction('Suggest morning slots');
-
-    // The LLM should NOT have been triggered since we're not idle
-    expect(streamCallCount).toBe(0);
-
-    // But the event should still be recorded
-    const events = getCallEvents(session.id);
-    const instructionEvents = events.filter((e) => e.eventType === 'user_instruction_relayed');
-    expect(instructionEvents.length).toBe(1);
-
-    orchestrator.destroy();
-  });
-
-  // ── System prompt: identity phrasing ────────────────────────────────
-
-  test('system prompt contains resolved user reference (default)', async () => {
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      expect(systemPrompt as string).toContain('on behalf of my human');
-      const tokens = ['Hello.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator();
-    await orchestrator.handleCallerUtterance('Hi');
-    orchestrator.destroy();
-  });
-
-  test('system prompt contains resolved user reference when set to a name', async () => {
-    mockUserReference = 'John';
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      expect(systemPrompt as string).toContain('on behalf of John');
-      const tokens = ['Hello John\'s contact.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator();
-    await orchestrator.handleCallerUtterance('Hi');
-    orchestrator.destroy();
-  });
-
-  test('system prompt does not hardcode "your user" in the opening line', async () => {
-    mockUserReference = 'Alice';
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      expect(systemPrompt as string).not.toContain('on behalf of your user');
-      expect(systemPrompt as string).toContain('on behalf of Alice');
-      const tokens = ['Hi there.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator();
-    await orchestrator.handleCallerUtterance('Hello');
-    orchestrator.destroy();
-  });
-
-  test('system prompt includes assistant identity bias rule', async () => {
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      expect(systemPrompt as string).toContain('refer to yourself as an assistant');
-      expect(systemPrompt as string).toContain('Avoid the phrase "AI assistant" unless directly asked');
-      const tokens = ['Sure thing.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator();
-    await orchestrator.handleCallerUtterance('Hi');
-    orchestrator.destroy();
-  });
-
-  test('system prompt includes opening-ack guidance to avoid duplicate introductions', async () => {
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      expect(systemPrompt as string).toContain('[CALL_OPENING_ACK]');
-      expect(systemPrompt as string).toContain('without re-introducing yourself');
-      const tokens = ['Understood.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator();
-    await orchestrator.handleCallerUtterance('Hi');
-    orchestrator.destroy();
-  });
-
-  test('assistant identity rule appears before disclosure rule in prompt', async () => {
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      const prompt = systemPrompt as string;
-      const identityIdx = prompt.indexOf('refer to yourself as an assistant');
-      const disclosureIdx = prompt.indexOf('Be concise');
-      expect(identityIdx).toBeGreaterThan(-1);
-      expect(disclosureIdx).toBeGreaterThan(-1);
-      expect(identityIdx).toBeLessThan(disclosureIdx);
-      const tokens = ['OK.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator();
-    await orchestrator.handleCallerUtterance('Test');
-    orchestrator.destroy();
-  });
-
-  test('system prompt uses disclosure text when disclosure is enabled', async () => {
-    mockDisclosure = {
-      enabled: true,
-      text: 'At the very beginning of the call, introduce yourself as an assistant calling on behalf of the person you represent. Do not say "AI assistant".',
-    };
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      expect(systemPrompt as string).toContain('introduce yourself as an assistant calling on behalf of the person you represent');
-      expect(systemPrompt as string).toContain('Do not say "AI assistant"');
-      const tokens = ['Hello, I am calling on behalf of my human.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator();
-    await orchestrator.handleCallerUtterance('Who is this?');
-    orchestrator.destroy();
-  });
-
-  test('system prompt falls back to "Begin the conversation naturally" when disclosure is disabled', async () => {
-    mockDisclosure = { enabled: false, text: '' };
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      expect(systemPrompt as string).toContain('Begin the conversation naturally');
-      expect(systemPrompt as string).not.toContain('introduce yourself as an assistant calling on behalf of the person');
-      const tokens = ['Hello there.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator();
-    await orchestrator.handleCallerUtterance('Hi');
-    orchestrator.destroy();
-  });
-
-  test('system prompt does not use "AI assistant" as a self-identity label', async () => {
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      expect(systemPrompt as string).not.toMatch(/(?:you are|call yourself|introduce yourself as).*AI assistant/i);
-      const tokens = ['Got it.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator();
-    await orchestrator.handleCallerUtterance('Hello');
-    orchestrator.destroy();
-  });
-
-  // ── Inbound call orchestration ──────────────────────────────────────
-
-  test('inbound call (no task) uses receptionist-style system prompt', async () => {
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      // Should contain inbound-specific language
-      expect(systemPrompt as string).toContain('answering an incoming call');
-      expect(systemPrompt as string).toContain('find out what they need');
-      // Should NOT contain outbound-specific language
-      expect(systemPrompt as string).not.toContain('state why you are calling');
-      expect(systemPrompt as string).not.toContain('Task:');
-      const tokens = ['Hello, how can I help you today?'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    // setupOrchestrator with no task creates an inbound-style session
-    const { orchestrator } = setupOrchestrator(undefined);
-    await orchestrator.handleCallerUtterance('Hi there');
-    orchestrator.destroy();
-  });
-
-  test('outbound call (with task) uses task-driven system prompt', async () => {
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      expect(systemPrompt as string).toContain('Task: Confirm Friday appointment');
-      expect(systemPrompt as string).toContain('state why you are calling');
-      expect(systemPrompt as string).not.toContain('answering an incoming call');
-      const tokens = ['Hi, I am calling about your appointment.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator('Confirm Friday appointment');
-    await orchestrator.handleCallerUtterance('Hello?');
-    orchestrator.destroy();
-  });
-
-  test('inbound call initial greeting sends receptionist opener', async () => {
-    mockSendMessage.mockImplementation(async (messages: unknown[], _tools: unknown[], systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      // The system prompt should use inbound framing
-      expect(systemPrompt as string).toContain('answering an incoming call');
-      // The opening marker should be present
-      const msgs = messages as Array<{ role: string; content: Array<{ type: string; text: string }> }>;
-      const userMsgs = msgs.filter((m) => m.role === 'user');
-      expect(userMsgs.some((m) => m.content?.[0]?.text?.includes('[CALL_OPENING]'))).toBe(true);
-      const tokens = ['Hello, this is my human\'s assistant. How can I help you?'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { relay, orchestrator } = setupOrchestrator(undefined);
-    await orchestrator.startInitialGreeting();
-
-    const allText = relay.sentTokens.map((t) => t.token).join('');
-    expect(allText).toContain('How can I help you');
-
-    orchestrator.destroy();
-  });
-
-  test('inbound call multi-turn conversation uses inbound prompt consistently', async () => {
-    let turnNumber = 0;
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      turnNumber++;
-      // Every turn should use the inbound system prompt
-      expect(systemPrompt as string).toContain('answering an incoming call');
-      expect(systemPrompt as string).not.toContain('Task:');
-
-      let tokens: string[];
-      if (turnNumber === 1) tokens = ['Hello, how can I help you?'];
-      else if (turnNumber === 2) tokens = ['Sure, let me help with scheduling.'];
-      else tokens = ['Your meeting is set for 3pm.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator(undefined);
-
-    await orchestrator.startInitialGreeting();
-    await orchestrator.handleCallerUtterance('I need to schedule a meeting');
-    await orchestrator.handleCallerUtterance('How about 3pm?');
-
-    expect(turnNumber).toBe(3);
-    orchestrator.destroy();
-  });
-
-  test('inbound call system prompt includes greet-the-caller guidance for CALL_OPENING', async () => {
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      // Should tell the model to greet warmly and ask how to help
-      expect(systemPrompt as string).toContain('greet the caller warmly');
-      expect(systemPrompt as string).toContain('how you can help');
-      const tokens = ['Hello!'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator(undefined);
-    await orchestrator.handleCallerUtterance('Hi');
-    orchestrator.destroy();
-  });
-
-  test('inbound call system prompt respects disclosure setting', async () => {
-    mockDisclosure = {
-      enabled: true,
-      text: 'Disclose that you are an AI at the start.',
-    };
-    mockSendMessage.mockImplementation(async (_messages: unknown[], _tools: unknown[], systemPrompt: unknown, options?: { onEvent?: (event: { type: string; text?: string }) => void }) => {
-      expect(systemPrompt as string).toContain('answering an incoming call');
-      expect(systemPrompt as string).toContain('Disclose that you are an AI at the start.');
-      const tokens = ['Hello, I am an AI assistant.'];
-      for (const token of tokens) {
-        options?.onEvent?.({ type: 'text_delta', text: token });
-      }
-      return {
-        content: [{ type: 'text', text: tokens.join('') }],
-        model: 'claude-sonnet-4-20250514',
-        usage: { inputTokens: 100, outputTokens: 50 },
-        stopReason: 'end_turn',
-      };
-    });
-
-    const { orchestrator } = setupOrchestrator(undefined);
-    await orchestrator.handleCallerUtterance('Who is this?');
-    orchestrator.destroy();
-  });
-
-  test('inbound call persists assistant response to voice conversation', async () => {
-    mockSendMessage.mockImplementation(createMockProviderResponse(['I can definitely help you with that.']));
-
-    const { session, orchestrator } = setupOrchestrator(undefined);
-    await orchestrator.startInitialGreeting();
-
-    // Verify assistant transcript was persisted
-    const messages = (await import('../memory/conversation-store.js')).getMessages('conv-orch-test');
-    const assistantMsgs = messages.filter((m) => m.role === 'assistant');
-    expect(assistantMsgs.length).toBeGreaterThan(0);
-    const lastAssistant = assistantMsgs[assistantMsgs.length - 1];
-    expect(lastAssistant.content).toContain('I can definitely help you with that');
-
-    // Verify event was recorded
-    const events = getCallEvents(session.id).filter((e) => e.eventType === 'assistant_spoke');
-    expect(events.length).toBeGreaterThan(0);
-
-    orchestrator.destroy();
-  });
-});
diff --git a/assistant/src/__tests__/call-state.test.ts b/assistant/src/__tests__/call-state.test.ts
index b1578e6419f..f829c85acc1 100644
--- a/assistant/src/__tests__/call-state.test.ts
+++ b/assistant/src/__tests__/call-state.test.ts
@@ -16,11 +16,11 @@ import {
   registerCallCompletionNotifier,
   unregisterCallCompletionNotifier,
   fireCallCompletionNotifier,
-  registerCallOrchestrator,
-  unregisterCallOrchestrator,
-  getCallOrchestrator,
+  registerCallController,
+  unregisterCallController,
+  getCallController,
 } from '../calls/call-state.js';
-import type { CallOrchestrator } from '../calls/call-orchestrator.js';
+import type { CallController } from '../calls/call-controller.js';
 
 describe('call-state', () => {
   // Clean up notifiers between tests
@@ -28,7 +28,7 @@ describe('call-state', () => {
     unregisterCallQuestionNotifier('test-conv');
     unregisterCallTranscriptNotifier('test-conv');
     unregisterCallCompletionNotifier('test-conv');
-    unregisterCallOrchestrator('test-session');
+    unregisterCallController('test-session');
   });
 
   // ── Question notifiers ────────────────────────────────────────────
@@ -135,40 +135,40 @@ describe('call-state', () => {
     fireCallCompletionNotifier('unregistered-conv', 'session-1');
   });
 
-  // ── Orchestrator registry ─────────────────────────────────────────
+  // ── Controller registry ─────────────────────────────────────────
 
-  test('registerCallOrchestrator + getCallOrchestrator: retrieves orchestrator', () => {
-    const fakeOrchestrator = { id: 'fake-orch' } as unknown as CallOrchestrator;
+  test('registerCallController + getCallController: retrieves controller', () => {
+    const fakeController = { id: 'fake-ctrl' } as unknown as CallController;
 
-    registerCallOrchestrator('test-session', fakeOrchestrator);
+    registerCallController('test-session', fakeController);
 
-    const retrieved = getCallOrchestrator('test-session');
-    expect(retrieved).toBe(fakeOrchestrator);
+    const retrieved = getCallController('test-session');
+    expect(retrieved).toBe(fakeController);
   });
 
-  test('unregisterCallOrchestrator: getCallOrchestrator returns undefined after unregister', () => {
-    const fakeOrchestrator = { id: 'fake-orch-2' } as unknown as CallOrchestrator;
+  test('unregisterCallController: getCallController returns undefined after unregister', () => {
+    const fakeController = { id: 'fake-ctrl-2' } as unknown as CallController;
 
-    registerCallOrchestrator('test-session', fakeOrchestrator);
-    unregisterCallOrchestrator('test-session');
+    registerCallController('test-session', fakeController);
+    unregisterCallController('test-session');
 
-    const retrieved = getCallOrchestrator('test-session');
+    const retrieved = getCallController('test-session');
     expect(retrieved).toBeUndefined();
   });
 
-  test('getCallOrchestrator returns undefined for unregistered session', () => {
-    const retrieved = getCallOrchestrator('nonexistent-session');
+  test('getCallController returns undefined for unregistered session', () => {
+    const retrieved = getCallController('nonexistent-session');
     expect(retrieved).toBeUndefined();
   });
 
-  test('registering a new orchestrator for same session overwrites the previous one', () => {
-    const first = { id: 'first' } as unknown as CallOrchestrator;
-    const second = { id: 'second' } as unknown as CallOrchestrator;
+  test('registering a new controller for same session overwrites the previous one', () => {
+    const first = { id: 'first' } as unknown as CallController;
+    const second = { id: 'second' } as unknown as CallController;
 
-    registerCallOrchestrator('test-session', first);
-    registerCallOrchestrator('test-session', second);
+    registerCallController('test-session', first);
+    registerCallController('test-session', second);
 
-    const retrieved = getCallOrchestrator('test-session');
+    const retrieved = getCallController('test-session');
     expect(retrieved).toBe(second);
   });
 });
diff --git a/assistant/src/__tests__/checker.test.ts b/assistant/src/__tests__/checker.test.ts
index 643101187e8..98e8444810e 100644
--- a/assistant/src/__tests__/checker.test.ts
+++ b/assistant/src/__tests__/checker.test.ts
@@ -1,7 +1,6 @@
 // Smoke command (run all security test files together):
 // bun test src/__tests__/checker.test.ts src/__tests__/trust-store.test.ts src/__tests__/session-skill-tools.test.ts src/__tests__/skill-script-runner-host.test.ts
 
-/* eslint-disable @typescript-eslint/no-explicit-any */
 import { describe, test, expect, beforeAll, beforeEach, afterEach, mock } from 'bun:test';
 import { mkdtempSync, mkdirSync, rmSync, writeFileSync, symlinkSync, realpathSync } from 'node:fs';
 import { tmpdir, homedir } from 'node:os';
@@ -39,9 +38,16 @@ mock.module('../util/logger.js', () => ({
 
 // Mutable config object so tests can switch permissions.mode between
 // 'legacy', 'strict', and 'workspace' without re-registering the mock.
-const testConfig: Record<string, any> = {
-  permissions: { mode: 'legacy' as 'legacy' | 'strict' | 'workspace' },
-  skills: { load: { extraDirs: [] as string[] } },
+interface TestConfig {
+  permissions: { mode: 'legacy' | 'strict' | 'workspace' };
+  skills: { load: { extraDirs: string[] } };
+  sandbox: { enabled: boolean };
+  [key: string]: unknown;
+}
+
+const testConfig: TestConfig = {
+  permissions: { mode: 'legacy' },
+  skills: { load: { extraDirs: [] } },
   sandbox: { enabled: true },
 };
 
@@ -58,6 +64,7 @@ mock.module('../config/loader.js', () => ({
 
 import { classifyRisk, check, generateAllowlistOptions, generateScopeOptions, _resetLegacyDeprecationWarning } from '../permissions/checker.js';
 import { RiskLevel } from '../permissions/types.js';
+import type { TrustRule } from '../permissions/types.js';
 import { addRule, clearCache, findHighestPriorityRule } from '../permissions/trust-store.js';
 import { getDefaultRuleTemplates } from '../permissions/defaults.js';
 import { registerTool, getTool } from '../tools/registry.js';
@@ -2353,13 +2360,13 @@ describe('Permission Checker', () => {
       const trustDir = dirnameFn(trustPath);
       if (!existsSync(trustDir)) mkdirSyncFs(trustDir, { recursive: true });
 
-      let currentRules: any[] = [];
+      let currentRules: TrustRule[] = [];
       try {
         const raw = readFileSync(trustPath, 'utf-8');
         currentRules = JSON.parse(raw).rules ?? [];
       } catch { /* first run */ }
 
-      currentRules = currentRules.filter((r: any) => r.id !== opts.id);
+      currentRules = currentRules.filter((r: TrustRule) => r.id !== opts.id);
       currentRules.push({
         ...opts,
         createdAt: Date.now(),
@@ -2486,7 +2493,7 @@ describe('Permission Checker', () => {
         // Write the executionTarget field directly (addVersionBoundRule doesn't support it)
         const trustPath = join(checkerTestDir, 'protected', 'trust.json');
         const raw = JSON.parse((await import('node:fs')).readFileSync(trustPath, 'utf-8'));
-        const rule = raw.rules.find((r: any) => r.id === 'inv4-target-scoped');
+        const rule = raw.rules.find((r: TrustRule) => r.id === 'inv4-target-scoped');
         rule.executionTarget = '/usr/local/bin/node';
         (await import('node:fs')).writeFileSync(trustPath, JSON.stringify(raw, null, 2));
         clearCache();
diff --git a/assistant/src/__tests__/config-schema.test.ts b/assistant/src/__tests__/config-schema.test.ts
index fadc2f9578e..6088bccaa9a 100644
--- a/assistant/src/__tests__/config-schema.test.ts
+++ b/assistant/src/__tests__/config-schema.test.ts
@@ -74,9 +74,9 @@ describe('AssistantConfigSchema', () => {
     const result = AssistantConfigSchema.parse({});
     expect(result.provider).toBe('anthropic');
     expect(result.model).toBe('claude-opus-4-6');
-    expect(result.maxTokens).toBe(64000);
+    expect(result.maxTokens).toBe(16000);
     expect(result.apiKeys).toEqual({});
-    expect(result.thinking).toEqual({ enabled: false, budgetTokens: 10000 });
+    expect(result.thinking).toEqual({ enabled: false, budgetTokens: 10000, streamThinking: false });
     expect(result.contextWindow).toEqual({
       enabled: true,
       maxInputTokens: 180000,
@@ -1193,8 +1193,8 @@ describe('loadConfig with schema validation', () => {
     const config = loadConfig();
     expect(config.provider).toBe('anthropic');
     expect(config.model).toBe('claude-opus-4-6');
-    expect(config.maxTokens).toBe(64000);
-    expect(config.thinking).toEqual({ enabled: false, budgetTokens: 10000 });
+    expect(config.maxTokens).toBe(16000);
+    expect(config.thinking).toEqual({ enabled: false, budgetTokens: 10000, streamThinking: false });
     expect(config.contextWindow).toEqual({
       enabled: true,
       maxInputTokens: 180000,
@@ -1215,7 +1215,7 @@ describe('loadConfig with schema validation', () => {
   test('falls back to default for invalid maxTokens', () => {
     writeConfig({ maxTokens: -100 });
     const config = loadConfig();
-    expect(config.maxTokens).toBe(64000);
+    expect(config.maxTokens).toBe(16000);
   });
 
   test('falls back to defaults for invalid nested values', () => {
@@ -1240,13 +1240,13 @@ describe('loadConfig with schema validation', () => {
     expect(config.model).toBe('gpt-4');
     expect(config.thinking.enabled).toBe(true);
     expect(config.thinking.budgetTokens).toBe(5000);
-    expect(config.maxTokens).toBe(64000);
+    expect(config.maxTokens).toBe(16000);
   });
 
   test('handles no config file', () => {
     const config = loadConfig();
     expect(config.provider).toBe('anthropic');
-    expect(config.maxTokens).toBe(64000);
+    expect(config.maxTokens).toBe(16000);
   });
 
   test('partial nested objects get defaults for missing fields', () => {
diff --git a/assistant/src/__tests__/guardian-dispatch.test.ts b/assistant/src/__tests__/guardian-dispatch.test.ts
index 37c4957dc7f..dcf8fdc4a59 100644
--- a/assistant/src/__tests__/guardian-dispatch.test.ts
+++ b/assistant/src/__tests__/guardian-dispatch.test.ts
@@ -56,6 +56,34 @@ mock.module('../runtime/gateway-client.js', () => ({
   },
 }));
 
+// Mock guardian-question-copy to return deterministic values without hitting a real provider.
+// generateGuardianCopy (the async LLM call) is stubbed; buildFallbackCopy is an inline copy
+// of the real implementation so guardian-dispatch can still call it if needed.
+let mockGuardianCopy = {
+  threadTitle: '\u{1F6A8} Caller needs the gate code',
+  initialMessage: 'Your assistant needs your input during a live phone call.\n\nQuestion: What is the gate code?\n\nReply to this message with your answer.',
+};
+
+mock.module('../calls/guardian-question-copy.js', () => ({
+  generateGuardianCopy: async (questionText: string) => ({
+    threadTitle: mockGuardianCopy.threadTitle,
+    initialMessage: mockGuardianCopy.initialMessage.includes(questionText)
+      ? mockGuardianCopy.initialMessage
+      : mockGuardianCopy.initialMessage.replace(/Question: .*/, `Question: ${questionText}`),
+  }),
+  // Inline copy of buildFallbackCopy (the real implementation is tested in guardian-question-copy.test.ts)
+  buildFallbackCopy: (questionText: string) => ({
+    threadTitle: `\u26A0\uFE0F ${questionText.slice(0, 70)}`,
+    initialMessage: [
+      'Your assistant needs your input during a phone call.',
+      '',
+      `Question: ${questionText}`,
+      '',
+      'Reply to this message with your answer.',
+    ].join('\n'),
+  }),
+}));
+
 import { initializeDb, getDb, resetDb } from '../memory/db.js';
 import { conversations } from '../memory/schema.js';
 import { createCallSession, createPendingQuestion } from '../calls/call-store.js';
@@ -87,6 +115,10 @@ function resetTables(): void {
   mockTelegramBinding = null;
   mockSmsBinding = null;
   deliveredMessages.length = 0;
+  mockGuardianCopy = {
+    threadTitle: '\u{1F6A8} Caller needs the gate code',
+    initialMessage: 'Your assistant needs your input during a live phone call.\n\nQuestion: What is the gate code?\n\nReply to this message with your answer.',
+  };
 }
 
 describe('guardian-dispatch', () => {
@@ -250,4 +282,108 @@ describe('guardian-dispatch', () => {
       pendingQuestion: pq,
     })).resolves.toBeUndefined();
   });
+
+  test('broadcast title is emoji-prefixed and does not start with "Guardian question:"', async () => {
+    const convId = 'conv-dispatch-6';
+    ensureConversation(convId);
+
+    const session = createCallSession({
+      conversationId: convId,
+      provider: 'twilio',
+      fromNumber: '+15550001111',
+      toNumber: '+15550002222',
+    });
+    const pq = createPendingQuestion(session.id, 'What is the gate code?');
+
+    const broadcastedMessages: unknown[] = [];
+    const broadcastFn = (msg: unknown) => { broadcastedMessages.push(msg); };
+
+    await dispatchGuardianQuestion({
+      callSessionId: session.id,
+      conversationId: convId,
+      assistantId: 'self',
+      pendingQuestion: pq,
+      broadcast: broadcastFn,
+    });
+
+    const msg = broadcastedMessages[0] as Record<string, unknown>;
+    const title = msg.title as string;
+
+    // Title must NOT start with the old static "Guardian question:" prefix
+    expect(title.startsWith('Guardian question:')).toBe(false);
+
+    // Title must start with a non-ASCII character (code point > 127), a cheap proxy for an emoji prefix
+    const firstCodePoint = title.codePointAt(0)!;
+    expect(firstCodePoint).toBeGreaterThan(127);
+  });
+
+  test('broadcast includes questionText field matching the original question', async () => {
+    const convId = 'conv-dispatch-7';
+    ensureConversation(convId);
+
+    const questionText = 'What is the WiFi password?';
+    const session = createCallSession({
+      conversationId: convId,
+      provider: 'twilio',
+      fromNumber: '+15550001111',
+      toNumber: '+15550002222',
+    });
+    const pq = createPendingQuestion(session.id, questionText);
+
+    const broadcastedMessages: unknown[] = [];
+    const broadcastFn = (msg: unknown) => { broadcastedMessages.push(msg); };
+
+    await dispatchGuardianQuestion({
+      callSessionId: session.id,
+      conversationId: convId,
+      assistantId: 'self',
+      pendingQuestion: pq,
+      broadcast: broadcastFn,
+    });
+
+    expect(broadcastedMessages).toHaveLength(1);
+    const msg = broadcastedMessages[0] as Record<string, unknown>;
+    expect(msg.type).toBe('guardian_request_thread_created');
+    expect(msg.questionText).toBe(questionText);
+  });
+
+  test('initial message in mac conversation contains question text from generative copy', async () => {
+    const convId = 'conv-dispatch-8';
+    ensureConversation(convId);
+
+    // Set mock copy to a known generative-style message
+    mockGuardianCopy = {
+      threadTitle: '\u{1F4DE} Live call: Gate code needed',
+      initialMessage: 'You have an active phone call that needs your help.\n\nThe caller is asking: What is the gate code?\n\nPlease reply with your answer to resume the call.',
+    };
+
+    const session = createCallSession({
+      conversationId: convId,
+      provider: 'twilio',
+      fromNumber: '+15550001111',
+      toNumber: '+15550002222',
+    });
+    const pq = createPendingQuestion(session.id, 'What is the gate code?');
+
+    const broadcastedMessages: unknown[] = [];
+    const broadcastFn = (msg: unknown) => { broadcastedMessages.push(msg); };
+
+    await dispatchGuardianQuestion({
+      callSessionId: session.id,
+      conversationId: convId,
+      assistantId: 'self',
+      pendingQuestion: pq,
+      broadcast: broadcastFn,
+    });
+
+    const msg = broadcastedMessages[0] as Record<string, unknown>;
+    const macConvId = msg.conversationId as string;
+
+    const messages = getMessages(macConvId);
+    expect(messages.length).toBeGreaterThanOrEqual(1);
+    const content = messages[0].content;
+    // The generative copy should be used as the initial message
+    expect(content).toContain('What is the gate code?');
+    expect(content).toContain('active phone call');
+  });
 });
diff --git a/assistant/src/__tests__/guardian-question-copy.test.ts b/assistant/src/__tests__/guardian-question-copy.test.ts
new file mode 100644
index 00000000000..b97cdf8848b
--- /dev/null
+++ b/assistant/src/__tests__/guardian-question-copy.test.ts
@@ -0,0 +1,47 @@
+import { describe, test, expect } from 'bun:test';
+import { buildFallbackCopy } from '../calls/guardian-question-copy.js';
+
+describe('buildFallbackCopy', () => {
+  test('threadTitle starts with warning emoji', () => {
+    const result = buildFallbackCopy('What is the gate code?');
+    expect(result.threadTitle.startsWith('\u26A0\uFE0F')).toBe(true);
+  });
+
+  test('threadTitle does not start with "Guardian question:"', () => {
+    const result = buildFallbackCopy('What is the gate code?');
+    expect(result.threadTitle.startsWith('Guardian question:')).toBe(false);
+  });
+
+  test('threadTitle is under 80 characters for reasonable input', () => {
+    const result = buildFallbackCopy('What is the gate code?');
+    expect(result.threadTitle.length).toBeLessThan(80);
+  });
+
+  test('initialMessage contains the question text', () => {
+    const question = 'Should I let the delivery driver in?';
+    const result = buildFallbackCopy(question);
+    expect(result.initialMessage).toContain(question);
+  });
+
+  test('initialMessage contains "Reply to this message" instruction', () => {
+    const result = buildFallbackCopy('Any question here');
+    expect(result.initialMessage).toContain('Reply to this message');
+  });
+
+  test('very long question text gets truncated in title', () => {
+    const longQuestion = 'A'.repeat(200);
+    const result = buildFallbackCopy(longQuestion);
+
+    // The title uses questionText.slice(0, 70), so the question portion is at most 70 chars.
+    // With the emoji prefix and a space added, the whole title still stays under 80.
+    expect(result.threadTitle.length).toBeLessThanOrEqual(
+      '\u26A0\uFE0F '.length + 70,
+    );
+
+    // The full question should NOT appear in the title
+    expect(result.threadTitle).not.toContain(longQuestion);
+
+    // But the full question should still appear in the initial message
+    expect(result.initialMessage).toContain(longQuestion);
+  });
+});
diff --git a/assistant/src/__tests__/ipc-snapshot.test.ts b/assistant/src/__tests__/ipc-snapshot.test.ts
index 3a8c6f65a95..c06e459fd04 100644
--- a/assistant/src/__tests__/ipc-snapshot.test.ts
+++ b/assistant/src/__tests__/ipc-snapshot.test.ts
@@ -1700,6 +1700,7 @@ const serverMessages: Record<ServerMessageType, ServerMessage> = {
     requestId: 'req-guardian-001',
     callSessionId: 'call-001',
     title: 'Guardian action request',
+    questionText: 'What is the gate code?',
   },
   subagent_spawned: {
     type: 'subagent_spawned',
diff --git a/assistant/src/__tests__/provider-error-scenarios.test.ts b/assistant/src/__tests__/provider-error-scenarios.test.ts
index 470aad72d50..823a456b177 100644
--- a/assistant/src/__tests__/provider-error-scenarios.test.ts
+++ b/assistant/src/__tests__/provider-error-scenarios.test.ts
@@ -10,8 +10,8 @@ mock.module('../util/logger.js', () => ({
 }));
 
 // Only mock sleep so retries complete instantly; keep real retry logic
-mock.module('../util/retry.js', () => {
-  const real = require(retryModulePath);
+mock.module('../util/retry.js', async () => {
+  const real = await import(retryModulePath);
   return {
     ...real,
     sleep: () => Promise.resolve(),
@@ -231,7 +231,7 @@ describe('RetryProvider — network error retries', () => {
     const inner = makeFlaky(1, err);
     const provider = new RetryProvider(inner);
 
-    const result = await provider.sendMessage(MESSAGES);
+    const _result = await provider.sendMessage(MESSAGES);
     expect(inner.calls).toBe(2);
   });
 
@@ -386,7 +386,7 @@ describe('FailoverProvider — model unavailability fallback', () => {
     const secondary = makeProvider('secondary');
     const provider = new FailoverProvider([primary, secondary]);
 
-    const result = await provider.sendMessage(MESSAGES);
+    const _result = await provider.sendMessage(MESSAGES);
     expect(primary.calls).toBe(1);
     expect(secondary.calls).toBe(1);
   });
diff --git a/assistant/src/__tests__/relay-server.test.ts b/assistant/src/__tests__/relay-server.test.ts
index 3e6adb502b5..b7adb1cb3c4 100644
--- a/assistant/src/__tests__/relay-server.test.ts
+++ b/assistant/src/__tests__/relay-server.test.ts
@@ -263,8 +263,8 @@ describe('relay-server', () => {
     const connectedEvents = events.filter(e => e.eventType === 'call_connected');
     expect(connectedEvents.length).toBe(1);
 
-    // Verify orchestrator was created
-    expect(relay.getOrchestrator()).not.toBeNull();
+    // Verify controller was created
+    expect(relay.getController()).not.toBeNull();
 
     relay.destroy();
   });
@@ -815,11 +815,11 @@ describe('relay-server', () => {
       to: '+15552222222',
     }));
 
-    expect(relay.getOrchestrator()).not.toBeNull();
+    expect(relay.getController()).not.toBeNull();
 
     relay.destroy();
 
-    expect(relay.getOrchestrator()).toBeNull();
+    expect(relay.getController()).toBeNull();
   });
 
   test('destroy: can be called multiple times without error', () => {
@@ -1145,7 +1145,7 @@ describe('relay-server', () => {
       to: '+15551111111',
     }));
 
-    const runtimeContext = (relay.getOrchestrator() as unknown as { guardianContext?: { sourceChannel?: string; actorRole?: string; guardianExternalUserId?: string } })?.guardianContext;
+    const runtimeContext = (relay.getController() as unknown as { guardianContext?: { sourceChannel?: string; actorRole?: string; guardianExternalUserId?: string } })?.guardianContext;
     expect(runtimeContext?.sourceChannel).toBe('voice');
     expect(runtimeContext?.actorRole).toBe('guardian');
     expect(runtimeContext?.guardianExternalUserId).toBe('+15550001111');
@@ -1181,7 +1181,7 @@ describe('relay-server', () => {
       to: '+15551111111',
     }));
 
-    const runtimeContext = (relay.getOrchestrator() as unknown as {
+    const runtimeContext = (relay.getController() as unknown as {
       guardianContext?: {
         sourceChannel?: string;
         actorRole?: string;
@@ -1197,7 +1197,7 @@ describe('relay-server', () => {
     relay.destroy();
   });
 
-  test('inbound guardian verification updates orchestrator context to guardian', async () => {
+  test('inbound guardian verification updates controller context to guardian', async () => {
     ensureConversation('conv-guardian-context-upgrade');
     const session = createCallSession({
       conversationId: 'conv-guardian-context-upgrade',
@@ -1219,7 +1219,7 @@ describe('relay-server', () => {
       to: session.toNumber,
     }));
 
-    const preVerify = (relay.getOrchestrator() as unknown as {
+    const preVerify = (relay.getController() as unknown as {
       guardianContext?: { actorRole?: string };
     })?.guardianContext;
     expect(preVerify?.actorRole).toBe('unverified_channel');
@@ -1233,7 +1233,7 @@ describe('relay-server', () => {
 
     await new Promise((resolve) => setTimeout(resolve, 10));
 
-    const postVerify = (relay.getOrchestrator() as unknown as {
+    const postVerify = (relay.getController() as unknown as {
       guardianContext?: { sourceChannel?: string; actorRole?: string; guardianExternalUserId?: string };
     })?.guardianContext;
     expect(postVerify?.sourceChannel).toBe('voice');
diff --git a/assistant/src/__tests__/run-orchestrator.test.ts b/assistant/src/__tests__/run-orchestrator.test.ts
index 50c03c8ec11..6eb09a20c4b 100644
--- a/assistant/src/__tests__/run-orchestrator.test.ts
+++ b/assistant/src/__tests__/run-orchestrator.test.ts
@@ -36,6 +36,7 @@ import { initializeDb, getDb, resetDb } from '../memory/db.js';
 import { createConversation } from '../memory/conversation-store.js';
 import { createRun, getRun, setRunConfirmation } from '../memory/runs-store.js';
 import { RunOrchestrator } from '../runtime/run-orchestrator.js';
+import type { VoiceRunEventSink } from '../runtime/run-orchestrator.js';
 import type { ChannelCapabilities } from '../daemon/session-runtime-assembly.js';
 
 initializeDb();
@@ -110,7 +111,7 @@ describe('run failure detection', () => {
       deriveDefaultStrictSideEffects: () => false,
     });
 
-    const run = await orchestrator.startRun(conversation.id, 'Hello');
+    const { run } = await orchestrator.startRun(conversation.id, 'Hello');
 
     // The agent loop fires asynchronously; give it a tick to settle.
     await new Promise((r) => setTimeout(r, 50));
@@ -133,7 +134,7 @@ describe('run failure detection', () => {
       deriveDefaultStrictSideEffects: () => false,
     });
 
-    const run = await orchestrator.startRun(conversation.id, 'Hello');
+    const { run } = await orchestrator.startRun(conversation.id, 'Hello');
 
     await new Promise((r) => setTimeout(r, 50));
 
@@ -212,7 +213,7 @@ describe('run approval state executionTarget', () => {
       deriveDefaultStrictSideEffects: () => false,
     });
 
-    const run = await orchestrator.startRun(conversation.id, 'Run host command');
+    const { run } = await orchestrator.startRun(conversation.id, 'Run host command');
     const stored = orchestrator.getRun(run.id);
     expect(stored?.status).toBe('needs_confirmation');
     expect(stored?.pendingConfirmation?.executionTarget).toBe('host');
@@ -461,3 +462,385 @@ describe('strictSideEffects re-derivation across runs', () => {
     expect((session as unknown as { memoryPolicy: { strictSideEffects: boolean } }).memoryPolicy.strictSideEffects).toBe(false);
   });
 });
+
+// ═══════════════════════════════════════════════════════════════════════════
+// VoiceRunEventSink forwarding
+// ═══════════════════════════════════════════════════════════════════════════
+
+describe('eventSink forwarding', () => {
+  beforeEach(() => {
+    const db = getDb();
+    db.run('DELETE FROM message_runs');
+    db.run('DELETE FROM messages');
+    db.run('DELETE FROM conversations');
+  });
+
+  test('eventSink receives assistant_text_delta events', async () => {
+    const conversation = createConversation('event sink delta test');
+    const deltaMsg: ServerMessage = {
+      type: 'assistant_text_delta',
+      text: 'Hello from agent',
+      sessionId: conversation.id,
+    };
+    const session = makeSessionWithEvent(deltaMsg);
+
+    const receivedDeltas: string[] = [];
+    const sink: VoiceRunEventSink = {
+      onTextDelta: (text) => receivedDeltas.push(text),
+      onMessageComplete: () => {},
+      onError: () => {},
+      onToolUse: () => {},
+    };
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+
+    await orchestrator.startRun(conversation.id, 'Hello', undefined, {
+      eventSink: sink,
+    });
+    await new Promise((r) => setTimeout(r, 50));
+
+    expect(receivedDeltas).toEqual(['Hello from agent']);
+  });
+
+  test('eventSink receives error events', async () => {
+    const conversation = createConversation('event sink error test');
+    const errMsg: ServerMessage = {
+      type: 'error',
+      message: 'Something broke',
+    };
+    const session = makeSessionWithEvent(errMsg);
+
+    const receivedErrors: string[] = [];
+    const sink: VoiceRunEventSink = {
+      onTextDelta: () => {},
+      onMessageComplete: () => {},
+      onError: (msg) => receivedErrors.push(msg),
+      onToolUse: () => {},
+    };
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+
+    await orchestrator.startRun(conversation.id, 'Hello', undefined, {
+      eventSink: sink,
+    });
+    await new Promise((r) => setTimeout(r, 50));
+
+    expect(receivedErrors).toEqual(['Something broke']);
+  });
+
+  test('eventSink receives tool_use_start events', async () => {
+    const conversation = createConversation('event sink tool test');
+    const toolMsg: ServerMessage = {
+      type: 'tool_use_start',
+      toolName: 'web_search',
+      input: { query: 'test' },
+      sessionId: conversation.id,
+    };
+    const session = makeSessionWithEvent(toolMsg);
+
+    const receivedTools: Array<{ name: string; input: Record<string, unknown> }> = [];
+    const sink: VoiceRunEventSink = {
+      onTextDelta: () => {},
+      onMessageComplete: () => {},
+      onError: () => {},
+      onToolUse: (name, input) => receivedTools.push({ name, input }),
+    };
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+
+    await orchestrator.startRun(conversation.id, 'Hello', undefined, {
+      eventSink: sink,
+    });
+    await new Promise((r) => setTimeout(r, 50));
+
+    expect(receivedTools).toHaveLength(1);
+    expect(receivedTools[0].name).toBe('web_search');
+    expect(receivedTools[0].input).toEqual({ query: 'test' });
+  });
+
+  test('eventSink receives onMessageComplete on generation_cancelled', async () => {
+    const conversation = createConversation('event sink cancelled test');
+    const cancelledMsg: ServerMessage = {
+      type: 'generation_cancelled',
+      sessionId: conversation.id,
+    };
+    const session = makeSessionWithEvent(cancelledMsg);
+
+    let messageCompleteCount = 0;
+    const receivedErrors: string[] = [];
+    const sink: VoiceRunEventSink = {
+      onTextDelta: () => {},
+      onMessageComplete: () => { messageCompleteCount++; },
+      onError: (msg) => receivedErrors.push(msg),
+      onToolUse: () => {},
+    };
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+
+    await orchestrator.startRun(conversation.id, 'Hello', undefined, {
+      eventSink: sink,
+    });
+    await new Promise((r) => setTimeout(r, 50));
+
+    // generation_cancelled should be forwarded as onMessageComplete
+    expect(messageCompleteCount).toBe(1);
+    // It should NOT trigger onError
+    expect(receivedErrors).toHaveLength(0);
+  });
+
+  test('eventSink receives onError when runAgentLoop throws', async () => {
+    const conversation = createConversation('event sink exception test');
+
+    // Build a session whose runAgentLoop throws an exception instead of
+    // emitting events — simulating an unhandled crash in the agent loop.
+    const session = {
+      isProcessing: () => false,
+      persistUserMessage: () => undefined as unknown as string,
+      memoryPolicy: { scopeId: 'default', includeDefaultFallback: false, strictSideEffects: false },
+      setChannelCapabilities: () => {},
+      setAssistantId: () => {},
+      setGuardianContext: () => {},
+      setCommandIntent: () => {},
+      setTurnChannelContext: () => {},
+      updateClient: () => {},
+      runAgentLoop: async () => {
+        throw new Error('Unexpected agent crash');
+      },
+      handleConfirmationResponse: () => {},
+    } as unknown as Session;
+
+    const receivedErrors: string[] = [];
+    const sink: VoiceRunEventSink = {
+      onTextDelta: () => {},
+      onMessageComplete: () => {},
+      onError: (msg) => receivedErrors.push(msg),
+      onToolUse: () => {},
+    };
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+
+    await orchestrator.startRun(conversation.id, 'Hello', undefined, {
+      eventSink: sink,
+    });
+    await new Promise((r) => setTimeout(r, 50));
+
+    // The exception message should be forwarded to the event sink
+    expect(receivedErrors).toEqual(['Unexpected agent crash']);
+  });
+
+  test('no events forwarded when eventSink is not provided', async () => {
+    const conversation = createConversation('no sink test');
+    const deltaMsg: ServerMessage = {
+      type: 'assistant_text_delta',
+      text: 'Hello',
+      sessionId: conversation.id,
+    };
+    const session = makeSessionWithEvent(deltaMsg);
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+
+    // Should not throw when no eventSink is provided
+    const { run } = await orchestrator.startRun(conversation.id, 'Hello');
+    await new Promise((r) => setTimeout(r, 50));
+
+    const stored = orchestrator.getRun(run.id);
+    expect(stored?.status).toBe('completed');
+  });
+});
+
+// ═══════════════════════════════════════════════════════════════════════════
+// Run abort / cancellation
+// ═══════════════════════════════════════════════════════════════════════════
+
+describe('run abort', () => {
+  beforeEach(() => {
+    const db = getDb();
+    db.run('DELETE FROM message_runs');
+    db.run('DELETE FROM messages');
+    db.run('DELETE FROM conversations');
+  });
+
+  test('startRun returns an abort function', async () => {
+    const conversation = createConversation('abort handle test');
+    const session = {
+      isProcessing: () => false,
+      currentRequestId: undefined as string | undefined,
+      persistUserMessage: (_c: string, _a: unknown[], reqId: string) => {
+        session.currentRequestId = reqId;
+        return undefined as unknown as string;
+      },
+      memoryPolicy: { scopeId: 'default', includeDefaultFallback: false, strictSideEffects: false },
+      setChannelCapabilities: () => {},
+      setAssistantId: () => {},
+      setGuardianContext: () => {},
+      setCommandIntent: () => {},
+      setTurnChannelContext: () => {},
+      updateClient: () => {},
+      runAgentLoop: async () => {},
+      handleConfirmationResponse: () => {},
+      abort: () => {},
+    } as unknown as Session;
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+
+    const handle = await orchestrator.startRun(conversation.id, 'Hello');
+    expect(typeof handle.abort).toBe('function');
+    expect(handle.run.id).toBeDefined();
+  });
+
+  test('aborting a run does not crash session state', async () => {
+    const conversation = createConversation('abort safety test');
+    let abortCalled = false;
+
+    const session = {
+      isProcessing: () => false,
+      currentRequestId: undefined as string | undefined,
+      persistUserMessage: (_c: string, _a: unknown[], reqId: string) => {
+        session.currentRequestId = reqId;
+        return undefined as unknown as string;
+      },
+      memoryPolicy: { scopeId: 'default', includeDefaultFallback: false, strictSideEffects: false },
+      setChannelCapabilities: () => {},
+      setAssistantId: () => {},
+      setGuardianContext: () => {},
+      setCommandIntent: () => {},
+      setTurnChannelContext: () => {},
+      updateClient: () => {},
+      runAgentLoop: async () => {
+        // Simulate a long-running agent loop
+        await new Promise((r) => setTimeout(r, 200));
+      },
+      handleConfirmationResponse: () => {},
+      abort: () => { abortCalled = true; },
+    } as unknown as Session;
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+
+    const handle = await orchestrator.startRun(conversation.id, 'Hello');
+
+    // Abort immediately — session still has same requestId
+    handle.abort();
+    expect(abortCalled).toBe(true);
+
+    // Wait for cleanup to settle
+    await new Promise((r) => setTimeout(r, 300));
+
+    // Session state should not be corrupted — the run completes normally
+    // since the mock runAgentLoop resolves after 200ms regardless.
+    const stored = orchestrator.getRun(handle.run.id);
+    expect(stored).not.toBeNull();
+  });
+
+  test('stale abort handle is a no-op when session has moved to a new run', async () => {
+    const conversation = createConversation('stale abort test');
+    let abortCalled = false;
+
+    const session = {
+      isProcessing: () => false,
+      currentRequestId: undefined as string | undefined,
+      persistUserMessage: (_c: string, _a: unknown[], reqId: string) => {
+        session.currentRequestId = reqId;
+        return undefined as unknown as string;
+      },
+      memoryPolicy: { scopeId: 'default', includeDefaultFallback: false, strictSideEffects: false },
+      setChannelCapabilities: () => {},
+      setAssistantId: () => {},
+      setGuardianContext: () => {},
+      setCommandIntent: () => {},
+      setTurnChannelContext: () => {},
+      updateClient: () => {},
+      runAgentLoop: async () => {},
+      handleConfirmationResponse: () => {},
+      abort: () => { abortCalled = true; },
+    } as unknown as Session;
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+
+    // Start first run and capture its handle
+    const handle1 = await orchestrator.startRun(conversation.id, 'First turn');
+    await new Promise((r) => setTimeout(r, 50));
+
+    // Start second run — session's currentRequestId now belongs to run 2
+    const _handle2 = await orchestrator.startRun(conversation.id, 'Second turn');
+
+    // Attempt to abort using the stale handle from run 1.
+    // Since the session has moved to a new requestId, this should be a no-op.
+    handle1.abort();
+    expect(abortCalled).toBe(false);
+  });
+
+  test('abort works when session still has matching requestId', async () => {
+    const conversation = createConversation('matching abort test');
+    let abortCalled = false;
+
+    const session = {
+      isProcessing: () => false,
+      currentRequestId: undefined as string | undefined,
+      persistUserMessage: (_c: string, _a: unknown[], reqId: string) => {
+        session.currentRequestId = reqId;
+        return undefined as unknown as string;
+      },
+      memoryPolicy: { scopeId: 'default', includeDefaultFallback: false, strictSideEffects: false },
+      setChannelCapabilities: () => {},
+      setAssistantId: () => {},
+      setGuardianContext: () => {},
+      setCommandIntent: () => {},
+      setTurnChannelContext: () => {},
+      updateClient: () => {},
+      runAgentLoop: async () => {
+        // Keep the agent loop running so the session stays on this requestId
+        await new Promise((r) => setTimeout(r, 500));
+      },
+      handleConfirmationResponse: () => {},
+      abort: () => { abortCalled = true; },
+    } as unknown as Session;
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+
+    const handle = await orchestrator.startRun(conversation.id, 'Hello');
+
+    // Abort while the session is still processing this run
+    handle.abort();
+    expect(abortCalled).toBe(true);
+  });
+});
diff --git a/assistant/src/__tests__/runtime-runs.test.ts b/assistant/src/__tests__/runtime-runs.test.ts
index ff495c27fed..485b8fd218e 100644
--- a/assistant/src/__tests__/runtime-runs.test.ts
+++ b/assistant/src/__tests__/runtime-runs.test.ts
@@ -163,7 +163,7 @@ describe('runtime runs — swarm lifecycle', () => {
       deriveDefaultStrictSideEffects: () => false,
     });
 
-    const run = await orchestrator.startRun(conversation.id, 'Build a feature');
+    const { run } = await orchestrator.startRun(conversation.id, 'Build a feature');
     expect(run.status).toBe('running');
 
     // Wait for agent loop to complete
@@ -181,7 +181,7 @@ describe('runtime runs — swarm lifecycle', () => {
       deriveDefaultStrictSideEffects: () => false,
     });
 
-    const run = await orchestrator.startRun(conversation.id, 'Run swarm');
+    const { run } = await orchestrator.startRun(conversation.id, 'Run swarm');
 
     await new Promise((r) => setTimeout(r, 50));
 
@@ -198,7 +198,7 @@ describe('runtime runs — swarm lifecycle', () => {
       deriveDefaultStrictSideEffects: () => false,
     });
 
-    const run = await orchestrator.startRun(conversation.id, 'Delegate a swarm task');
+    const { run } = await orchestrator.startRun(conversation.id, 'Delegate a swarm task');
 
     // Give agent loop time to emit confirmation_request
     await new Promise((r) => setTimeout(r, 50));
@@ -216,7 +216,7 @@ describe('runtime runs — swarm lifecycle', () => {
       deriveDefaultStrictSideEffects: () => false,
     });
 
-    const run = await orchestrator.startRun(conversation.id, 'Run with approval');
+    const { run } = await orchestrator.startRun(conversation.id, 'Run with approval');
     await new Promise((r) => setTimeout(r, 50));
 
     // Verify pending state
diff --git a/assistant/src/__tests__/session-agent-loop.test.ts b/assistant/src/__tests__/session-agent-loop.test.ts
index cd64d9d389f..b1dc6981938 100644
--- a/assistant/src/__tests__/session-agent-loop.test.ts
+++ b/assistant/src/__tests__/session-agent-loop.test.ts
@@ -1,6 +1,6 @@
 import { describe, expect, mock, test, beforeEach } from 'bun:test';
 import type { Message, ContentBlock } from '../providers/types.js';
-import type { AgentEvent, CheckpointDecision } from '../agent/loop.js';
+import type { AgentEvent, CheckpointDecision, CheckpointInfo } from '../agent/loop.js';
 import type { ServerMessage } from '../daemon/ipc-protocol.js';
 
 // ── Module mocks (must precede imports of the module under test) ─────
@@ -212,7 +212,7 @@ type AgentLoopRun = (
   onEvent: (event: AgentEvent) => void,
   signal?: AbortSignal,
   requestId?: string,
-  onCheckpoint?: () => CheckpointDecision,
+  onCheckpoint?: (checkpoint: CheckpointInfo) => CheckpointDecision,
 ) => Promise<Message[]>;
 
 function makeCtx(overrides?: Partial<AgentLoopSessionContext> & { agentLoopRun?: AgentLoopRun }): AgentLoopSessionContext {
@@ -811,7 +811,7 @@ describe('session-agent-loop', () => {
     test('drains queue after completion', async () => {
       let drainReason: string | undefined;
       const ctx = makeCtx({
-        agentLoopRun: async (messages, onEvent) => {
+        agentLoopRun: async (messages: Message[], onEvent: (event: AgentEvent) => void) => {
           onEvent({
             type: 'message_complete',
             message: { role: 'assistant', content: [{ type: 'text', text: 'ok' }] },
diff --git a/assistant/src/__tests__/session-init.benchmark.test.ts b/assistant/src/__tests__/session-init.benchmark.test.ts
index ac4875c3223..508fe9807ad 100644
--- a/assistant/src/__tests__/session-init.benchmark.test.ts
+++ b/assistant/src/__tests__/session-init.benchmark.test.ts
@@ -194,9 +194,9 @@ mock.module('../calls/call-state.js', () => ({
   registerCallCompletionNotifier: () => {},
   unregisterCallCompletionNotifier: () => {},
   fireCallCompletionNotifier: () => {},
-  registerCallOrchestrator: () => {},
-  unregisterCallOrchestrator: () => {},
-  getCallOrchestrator: () => undefined,
+  registerCallController: () => {},
+  unregisterCallController: () => {},
+  getCallController: () => undefined,
 }));
 
 mock.module('../calls/call-store.js', () => ({
diff --git a/assistant/src/__tests__/session-surfaces-task-progress.test.ts b/assistant/src/__tests__/session-surfaces-task-progress.test.ts
index 2550d59eba6..3ee82eca4de 100644
--- a/assistant/src/__tests__/session-surfaces-task-progress.test.ts
+++ b/assistant/src/__tests__/session-surfaces-task-progress.test.ts
@@ -9,6 +9,7 @@ import type {
 } from '../daemon/ipc-protocol.js';
 import {
   surfaceProxyResolver,
+  createSurfaceMutex,
   type SurfaceSessionContext,
 } from '../daemon/session-surfaces.js';
 
@@ -26,6 +27,7 @@ function makeContext(sent: ServerMessage[] = []): SurfaceSessionContext {
     enqueueMessage: () => ({ queued: false, requestId: 'req-1' }),
     getQueueDepth: () => 0,
     processMessage: async () => 'ok',
+    withSurface: createSurfaceMutex(),
   };
 }
 
diff --git a/assistant/src/__tests__/session-tool-setup-app-refresh.test.ts b/assistant/src/__tests__/session-tool-setup-app-refresh.test.ts
index 8564ef6fbd8..614462357be 100644
--- a/assistant/src/__tests__/session-tool-setup-app-refresh.test.ts
+++ b/assistant/src/__tests__/session-tool-setup-app-refresh.test.ts
@@ -68,6 +68,7 @@ function makeCtx(overrides: Partial<ToolSetupContext> = {}): ToolSetupContext {
     enqueueMessage: () => ({ queued: false, requestId: 'r' }),
     getQueueDepth: () => 0,
     processMessage: async () => '',
+    withSurface: async <T>(_id: string, fn: () => T | Promise<T>) => fn(),
     memoryPolicy: { scopeId: 'default', strictSideEffects: false },
     ...overrides,
   };
diff --git a/assistant/src/__tests__/session-tool-setup-memory-scope.test.ts b/assistant/src/__tests__/session-tool-setup-memory-scope.test.ts
index b7453b34b9c..0571e729dad 100644
--- a/assistant/src/__tests__/session-tool-setup-memory-scope.test.ts
+++ b/assistant/src/__tests__/session-tool-setup-memory-scope.test.ts
@@ -55,6 +55,7 @@ function makeCtx(overrides: Partial<ToolSetupContext> = {}): ToolSetupContext {
     enqueueMessage: () => ({ queued: false, requestId: 'r' }),
     getQueueDepth: () => 0,
     processMessage: async () => '',
+    withSurface: async <T>(_id: string, fn: () => T | Promise<T>) => fn(),
     memoryPolicy: { scopeId: 'default', strictSideEffects: false },
     ...overrides,
   };
diff --git a/assistant/src/__tests__/session-tool-setup-side-effect-flag.test.ts b/assistant/src/__tests__/session-tool-setup-side-effect-flag.test.ts
index b71a9becdb7..c608d37ca37 100644
--- a/assistant/src/__tests__/session-tool-setup-side-effect-flag.test.ts
+++ b/assistant/src/__tests__/session-tool-setup-side-effect-flag.test.ts
@@ -55,6 +55,7 @@ function makeCtx(overrides: Partial<ToolSetupContext> = {}): ToolSetupContext {
     enqueueMessage: () => ({ queued: false, requestId: 'r' }),
     getQueueDepth: () => 0,
     processMessage: async () => '',
+    withSurface: async <T>(_id: string, fn: () => T | Promise<T>) => fn(),
     memoryPolicy: { scopeId: 'default', strictSideEffects: false },
     ...overrides,
   };
diff --git a/assistant/src/__tests__/starter-task-flow.test.ts b/assistant/src/__tests__/starter-task-flow.test.ts
index 5716e844ed5..e108851ad49 100644
--- a/assistant/src/__tests__/starter-task-flow.test.ts
+++ b/assistant/src/__tests__/starter-task-flow.test.ts
@@ -30,6 +30,7 @@ mock.module('../home-base/prebuilt/seed.js', () => ({
 import {
   handleSurfaceAction,
   surfaceProxyResolver,
+  createSurfaceMutex,
   type SurfaceSessionContext,
 } from '../daemon/session-surfaces.js';
 
@@ -49,6 +50,7 @@ function makeContext(): SurfaceSessionContext {
     enqueueMessage: () => ({ queued: false, requestId: 'req-1' }),
     getQueueDepth: () => 0,
     processMessage: async () => 'ok',
+    withSurface: createSurfaceMutex(),
   };
 }
 
diff --git a/assistant/src/__tests__/terminal-tools.test.ts b/assistant/src/__tests__/terminal-tools.test.ts
index 95539a03914..93f4fb36cbb 100644
--- a/assistant/src/__tests__/terminal-tools.test.ts
+++ b/assistant/src/__tests__/terminal-tools.test.ts
@@ -1,8 +1,10 @@
-/* eslint-disable @typescript-eslint/no-explicit-any */
 import { describe, test, expect, beforeEach, afterEach, mock } from 'bun:test';
 import { mkdtempSync, mkdirSync, rmSync, symlinkSync } from 'node:fs';
 import { tmpdir } from 'node:os';
 import { join } from 'node:path';
+import type { Tool } from '../tools/types.js';
+import type { SandboxBackend } from '../tools/terminal/backends/types.js';
+import type { ShellOutputResult } from '../tools/shared/shell-output.js';
 
 // ── Mock modules ────────────────────────────────────────────────────────────
 
@@ -502,7 +504,7 @@ describe('wrapCommand', () => {
 describe('Native sandbox backend', () => {
   // We test NativeBackend directly rather than through wrapCommand to avoid
   // platform-dependent sandbox-exec/bwrap availability.
-  let NativeBackend: any;
+  let NativeBackend: new () => SandboxBackend;
 
   beforeEach(async () => {
     const mod = await import('../tools/terminal/backends/native.js');
@@ -546,8 +548,8 @@ describe('Native sandbox backend', () => {
 // ═══════════════════════════════════════════════════════════════════════════
 
 describe('Docker sandbox backend', () => {
-  let DockerBackend: any;
-  let _resetDockerChecks: any;
+  let DockerBackend: new (sandboxRoot: string, config?: Record<string, unknown>, uid?: number, gid?: number) => SandboxBackend;
+  let _resetDockerChecks: () => void;
 
   const sandboxDir = join(testTmpDir, 'docker-sandbox');
 
@@ -628,7 +630,7 @@ describe('Docker sandbox backend', () => {
 // ═══════════════════════════════════════════════════════════════════════════
 
 describe('Shell tool input validation', () => {
-  let shellTool: any;
+  let shellTool: Tool;
 
   beforeEach(async () => {
     const mod = await import('../tools/terminal/shell.js');
@@ -637,6 +639,7 @@ describe('Shell tool input validation', () => {
 
   const baseContext = {
     workingDir: testTmpDir,
+    sessionId: 'test-session-1',
     conversationId: 'test-conv-1',
     onOutput: () => {},
   };
@@ -700,13 +703,14 @@ describe('Shell tool input validation', () => {
 
   test('tool definition includes required schema fields', () => {
     const def = shellTool.getDefinition();
+    const schema = def.input_schema as { required: string[]; properties: Record<string, unknown> };
     expect(def.name).toBe('bash');
-    expect(def.input_schema.required).toContain('command');
-    expect(def.input_schema.required).toContain('reason');
-    expect(def.input_schema.properties.command).toBeDefined();
-    expect(def.input_schema.properties.timeout_seconds).toBeDefined();
-    expect(def.input_schema.properties.network_mode).toBeDefined();
-    expect(def.input_schema.properties.credential_ids).toBeDefined();
+    expect(schema.required).toContain('command');
+    expect(schema.required).toContain('reason');
+    expect(schema.properties.command).toBeDefined();
+    expect(schema.properties.timeout_seconds).toBeDefined();
+    expect(schema.properties.network_mode).toBeDefined();
+    expect(schema.properties.credential_ids).toBeDefined();
   });
 });
 
@@ -715,7 +719,7 @@ describe('Shell tool input validation', () => {
 // ═══════════════════════════════════════════════════════════════════════════
 
 describe('formatShellOutput', () => {
-  let formatShellOutput: any;
+  let formatShellOutput: (stdout: string, stderr: string, code: number | null, timedOut: boolean, timeoutSec: number) => ShellOutputResult;
 
   beforeEach(async () => {
     const mod = await import('../tools/shared/shell-output.js');
@@ -775,7 +779,7 @@ describe('formatShellOutput', () => {
 // ═══════════════════════════════════════════════════════════════════════════
 
 describe('EvaluateTypescriptTool input validation', () => {
-  let evalTool: any;
+  let evalTool: Tool;
 
   beforeEach(async () => {
     const mod = await import('../tools/terminal/evaluate-typescript.js');
@@ -784,6 +788,7 @@ describe('EvaluateTypescriptTool input validation', () => {
 
   const baseContext = {
     workingDir: testTmpDir,
+    sessionId: 'test-session-1',
     conversationId: 'test-conv-1',
     onOutput: () => {},
   };
@@ -829,12 +834,13 @@ describe('EvaluateTypescriptTool input validation', () => {
 
   test('tool definition has correct name and schema', () => {
     const def = evalTool.getDefinition();
+    const schema = def.input_schema as { required: string[]; properties: Record<string, unknown> };
     expect(def.name).toBe('evaluate_typescript_code');
-    expect(def.input_schema.required).toContain('code');
-    expect(def.input_schema.properties.code).toBeDefined();
-    expect(def.input_schema.properties.mock_input_json).toBeDefined();
-    expect(def.input_schema.properties.timeout_seconds).toBeDefined();
-    expect(def.input_schema.properties.filename).toBeDefined();
-    expect(def.input_schema.properties.entrypoint).toBeDefined();
+    expect(schema.required).toContain('code');
+    expect(schema.properties.code).toBeDefined();
+    expect(schema.properties.mock_input_json).toBeDefined();
+    expect(schema.properties.timeout_seconds).toBeDefined();
+    expect(schema.properties.filename).toBeDefined();
+    expect(schema.properties.entrypoint).toBeDefined();
   });
 });
diff --git a/assistant/src/__tests__/tool-executor-shell-integration.test.ts b/assistant/src/__tests__/tool-executor-shell-integration.test.ts
index 654d07ed706..47662bfd951 100644
--- a/assistant/src/__tests__/tool-executor-shell-integration.test.ts
+++ b/assistant/src/__tests__/tool-executor-shell-integration.test.ts
@@ -1,4 +1,3 @@
-/* eslint-disable @typescript-eslint/no-explicit-any */
 /**
  * Integration tests: ToolExecutor → real checker.js → real shell-identity → real tree-sitter parser.
  *
@@ -10,6 +9,7 @@
  */
 import { describe, test, expect, beforeAll, mock } from 'bun:test';
 import type { ToolContext } from '../tools/types.js';
+import type { AllowlistOption, ScopeOption } from '../permissions/types.js';
 import { PermissionPrompter } from '../permissions/prompter.js';
 
 // ── Config mock ──────────────────────────────────────────────────────
@@ -133,16 +133,16 @@ function makeContext(overrides?: Partial<ToolContext>): ToolContext {
  * passed to the prompter by the executor, then allows the tool.
  */
 function makeCapturingPrompter() {
-  let capturedAllowlist: any[] | undefined;
-  let capturedScopes: any[] | undefined;
+  let capturedAllowlist: AllowlistOption[] | undefined;
+  let capturedScopes: ScopeOption[] | undefined;
 
   const prompter = {
     prompt: async (
       _toolName: string,
       _input: Record<string, unknown>,
       _riskLevel: string,
-      allowlistOptions: any[],
-      scopeOptions: any[],
+      allowlistOptions: AllowlistOption[],
+      scopeOptions: ScopeOption[],
     ) => {
       capturedAllowlist = allowlistOptions;
       capturedScopes = scopeOptions;
@@ -177,7 +177,7 @@ describe('ToolExecutor → real shell allowlist integration', () => {
     expect(allowlist).toBeDefined();
     expect(allowlist!.length).toBeGreaterThan(1);
 
-    const patterns = allowlist!.map((o: any) => o.pattern);
+    const patterns = allowlist!.map((o: AllowlistOption) => o.pattern);
 
     // Should contain the exact command
     expect(patterns).toContain('npm install express');
@@ -197,8 +197,8 @@ describe('ToolExecutor → real shell allowlist integration', () => {
     const scopes = getScopes();
     expect(scopes).toBeDefined();
     expect(scopes!.length).toBeGreaterThanOrEqual(2);
-    expect(scopes!.some((s: any) => s.scope === '/tmp/project')).toBe(true);
-    expect(scopes!.some((s: any) => s.scope === 'everywhere')).toBe(true);
+    expect(scopes!.some((s: ScopeOption) => s.scope === '/tmp/project')).toBe(true);
+    expect(scopes!.some((s: ScopeOption) => s.scope === 'everywhere')).toBe(true);
   });
 
   test('compound command produces only exact compound option (no action keys)', async () => {
@@ -226,7 +226,7 @@ describe('ToolExecutor → real shell allowlist integration', () => {
     expect(allowlist).toBeDefined();
     expect(allowlist!.length).toBeGreaterThan(1);
 
-    const patterns = allowlist!.map((o: any) => o.pattern);
+    const patterns = allowlist!.map((o: AllowlistOption) => o.pattern);
 
     // Should contain the full original command as the exact option
     expect(patterns).toContain('cd /repo && gh pr view 123');
@@ -250,7 +250,7 @@ describe('ToolExecutor → real shell allowlist integration', () => {
     expect(scopes).toBeDefined();
     expect(scopes!.length).toBeGreaterThanOrEqual(2);
 
-    const scopeValues = scopes!.map((s: any) => s.scope);
+    const scopeValues = scopes!.map((s: ScopeOption) => s.scope);
 
     // Project-scoped option
     expect(scopeValues).toContain('/Users/test/my-project');
@@ -276,7 +276,7 @@ describe('ToolExecutor → real shell allowlist integration', () => {
     expect(allowlist).toBeDefined();
     expect(allowlist!.length).toBeGreaterThan(1);
 
-    const patterns = allowlist!.map((o: any) => o.pattern);
+    const patterns = allowlist!.map((o: AllowlistOption) => o.pattern);
 
     // Should contain exact command and action keys
     expect(patterns).toContain('git status');
diff --git a/assistant/src/__tests__/tool-executor.test.ts b/assistant/src/__tests__/tool-executor.test.ts
index 0c46df3f850..495604e12bb 100644
--- a/assistant/src/__tests__/tool-executor.test.ts
+++ b/assistant/src/__tests__/tool-executor.test.ts
@@ -1,8 +1,7 @@
-/* eslint-disable @typescript-eslint/no-explicit-any */
 import { describe, test, expect, beforeEach, afterEach, afterAll, mock, spyOn } from 'bun:test';
-import type { ToolExecutionResult, Tool } from '../tools/types.js';
+import type { ToolExecutionResult, Tool, ToolLifecycleEvent, ToolPermissionPromptEvent } from '../tools/types.js';
 import { RiskLevel } from '../permissions/types.js';
-import type { PolicyContext } from '../permissions/types.js';
+import type { AllowlistOption, ScopeOption, PolicyContext, TrustRule } from '../permissions/types.js';
 
 const mockConfig = {
   provider: 'anthropic',
@@ -322,8 +321,8 @@ describe('ToolExecutor contextual rule creation', () => {
 
   function setupAddRuleSpy() {
     addRuleSpy = spyOn(trustStore, 'addRule').mockImplementation(
-      (tool: string, pattern: string, scope: string, decision = 'allow', priority = 100, options?: any) => {
-        return { id: 'spy-rule-id', tool, pattern, scope, decision, priority, createdAt: Date.now(), ...options } as any;
+      (tool: string, pattern: string, scope: string, decision = 'allow', priority = 100, options?: { allowHighRisk?: boolean; executionTarget?: string }) => {
+        return { id: 'spy-rule-id', tool, pattern, scope, decision, priority, createdAt: Date.now(), ...options } as TrustRule;
       },
     );
     return addRuleSpy;
@@ -508,8 +507,8 @@ describe('ToolExecutor strict mode + high-risk integration (PR 25)', () => {
 
   function setupAddRuleSpy() {
     addRuleSpy = spyOn(trustStore, 'addRule').mockImplementation(
-      (tool: string, pattern: string, scope: string, decision = 'allow', priority = 100, options?: any) => {
-        return { id: 'spy-rule-id', tool, pattern, scope, decision, priority, createdAt: Date.now(), ...options } as any;
+      (tool: string, pattern: string, scope: string, decision = 'allow', priority = 100, options?: { allowHighRisk?: boolean; executionTarget?: string }) => {
+        return { id: 'spy-rule-id', tool, pattern, scope, decision, priority, createdAt: Date.now(), ...options } as TrustRule;
       },
     );
     return addRuleSpy;
@@ -1481,8 +1480,8 @@ describe('ToolExecutor persistentDecisionsAllowed contract', () => {
 
   function setupAddRuleSpy() {
     addRuleSpy = spyOn(trustStore, 'addRule').mockImplementation(
-      (tool: string, pattern: string, scope: string, decision = 'allow', priority = 100, options?: any) => {
-        return { id: 'spy-rule-id', tool, pattern, scope, decision, priority, createdAt: Date.now(), ...options } as any;
+      (tool: string, pattern: string, scope: string, decision = 'allow', priority = 100, options?: { allowHighRisk?: boolean; executionTarget?: string }) => {
+        return { id: 'spy-rule-id', tool, pattern, scope, decision, priority, createdAt: Date.now(), ...options } as TrustRule;
       },
     );
     return addRuleSpy;
@@ -1534,14 +1533,14 @@ describe('ToolExecutor persistentDecisionsAllowed contract', () => {
   });
 
   test('persistentDecisionsAllowed: false is emitted in lifecycle event for proxied bash', async () => {
-    let capturedEvent: any;
+    let capturedEvent: ToolPermissionPromptEvent | undefined;
     const prompter = makePrompterWithDecision('allow');
     const executor = new ToolExecutor(prompter);
     const result = await executor.execute(
       'bash',
       { command: 'curl https://example.com', network_mode: 'proxied' },
       makeContext({
-        onToolLifecycleEvent: (event: any) => {
+        onToolLifecycleEvent: (event: ToolLifecycleEvent) => {
           if (event.type === 'permission_prompt') {
             capturedEvent = event;
           }
@@ -1551,18 +1550,18 @@ describe('ToolExecutor persistentDecisionsAllowed contract', () => {
 
     expect(result.isError).toBe(false);
     expect(capturedEvent).toBeDefined();
-    expect(capturedEvent.persistentDecisionsAllowed).toBe(false);
+    expect(capturedEvent!.persistentDecisionsAllowed).toBe(false);
   });
 
   test('persistentDecisionsAllowed: true is emitted in lifecycle event for non-proxied bash', async () => {
-    let capturedEvent: any;
+    let capturedEvent: ToolPermissionPromptEvent | undefined;
     const prompter = makePrompterWithDecision('allow');
     const executor = new ToolExecutor(prompter);
     const result = await executor.execute(
       'bash',
       { command: 'echo hello' },
       makeContext({
-        onToolLifecycleEvent: (event: any) => {
+        onToolLifecycleEvent: (event: ToolLifecycleEvent) => {
           if (event.type === 'permission_prompt') {
             capturedEvent = event;
           }
@@ -1572,7 +1571,7 @@ describe('ToolExecutor persistentDecisionsAllowed contract', () => {
 
     expect(result.isError).toBe(false);
     expect(capturedEvent).toBeDefined();
-    expect(capturedEvent.persistentDecisionsAllowed).toBe(true);
+    expect(capturedEvent!.persistentDecisionsAllowed).toBe(true);
   });
 
   test('persistentDecisionsAllowed is passed to prompter confirmation_request for proxied bash', async () => {
@@ -1580,8 +1579,8 @@ describe('ToolExecutor persistentDecisionsAllowed contract', () => {
     const prompter = {
       prompt: async (
         _toolName: string, _input: Record<string, unknown>, _riskLevel: string,
-        _allowlistOptions: any[], _scopeOptions: any[], _diff: any, _sandboxed: any,
-        _sessionId: any, _executionTarget: any, persistentDecisionsAllowed: any,
+        _allowlistOptions: AllowlistOption[], _scopeOptions: ScopeOption[], _diff: unknown, _sandboxed: unknown,
+        _sessionId: unknown, _executionTarget: unknown, persistentDecisionsAllowed: boolean | undefined,
       ) => {
         capturedPersistent = persistentDecisionsAllowed;
         return { decision: 'allow' as const };
@@ -1643,8 +1642,8 @@ describe('E2E: proxied bash activation vs proxy approval persistence', () => {
 
   function setupAddRuleSpy() {
     addRuleSpy = spyOn(trustStore, 'addRule').mockImplementation(
-      (tool: string, pattern: string, scope: string, decision = 'allow', priority = 100, options?: any) => {
-        return { id: 'spy-rule-id', tool, pattern, scope, decision, priority, createdAt: Date.now(), ...options } as any;
+      (tool: string, pattern: string, scope: string, decision = 'allow', priority = 100, options?: { allowHighRisk?: boolean; executionTarget?: string }) => {
+        return { id: 'spy-rule-id', tool, pattern, scope, decision, priority, createdAt: Date.now(), ...options } as TrustRule;
       },
     );
     return addRuleSpy;
@@ -1871,8 +1870,8 @@ describe('ToolExecutor persistent-allow lifecycle', () => {
 
   function setupAddRuleSpy() {
     addRuleSpy = spyOn(trustStore, 'addRule').mockImplementation(
-      (tool: string, pattern: string, scope: string, decision = 'allow', priority = 100, options?: any) => {
-        return { id: 'spy-rule-id', tool, pattern, scope, decision, priority, createdAt: Date.now(), ...options } as any;
+      (tool: string, pattern: string, scope: string, decision = 'allow', priority = 100, options?: { allowHighRisk?: boolean; executionTarget?: string }) => {
+        return { id: 'spy-rule-id', tool, pattern, scope, decision, priority, createdAt: Date.now(), ...options } as TrustRule;
       },
     );
     return addRuleSpy;
@@ -1954,12 +1953,12 @@ describe('integration regressions — prompt payload (PR 11)', () => {
   test('shell command prompt payload includes allowlist and scope options', async () => {
     checkResultOverride = { decision: 'prompt', reason: 'Medium risk: requires approval' };
 
-    let capturedAllowlist: any[] | undefined;
-    let capturedScopes: any[] | undefined;
+    let capturedAllowlist: AllowlistOption[] | undefined;
+    let capturedScopes: ScopeOption[] | undefined;
     const prompter = {
       prompt: async (
         _toolName: string, _input: Record<string, unknown>, _riskLevel: string,
-        allowlistOptions: any[], scopeOptions: any[],
+        allowlistOptions: AllowlistOption[], scopeOptions: ScopeOption[],
       ) => {
         capturedAllowlist = allowlistOptions;
         capturedScopes = scopeOptions;
diff --git a/assistant/src/__tests__/voice-session-bridge.test.ts b/assistant/src/__tests__/voice-session-bridge.test.ts
new file mode 100644
index 00000000000..dc033d97c31
--- /dev/null
+++ b/assistant/src/__tests__/voice-session-bridge.test.ts
@@ -0,0 +1,696 @@
+import { describe, test, expect, beforeEach, afterAll, mock } from 'bun:test';
+import { mkdtempSync, rmSync } from 'node:fs';
+import { tmpdir } from 'node:os';
+import { join } from 'node:path';
+import type { ServerMessage } from '../daemon/ipc-protocol.js';
+import type { Session } from '../daemon/session.js';
+
+const testDir = mkdtempSync(join(tmpdir(), 'voice-bridge-test-'));
+
+mock.module('../util/platform.js', () => ({
+  getRootDir: () => testDir,
+  getDataDir: () => testDir,
+  isMacOS: () => process.platform === 'darwin',
+  isLinux: () => process.platform === 'linux',
+  isWindows: () => process.platform === 'win32',
+  getSocketPath: () => join(testDir, 'test.sock'),
+  getPidPath: () => join(testDir, 'test.pid'),
+  getDbPath: () => join(testDir, 'test.db'),
+  getLogPath: () => join(testDir, 'test.log'),
+  ensureDataDir: () => {},
+}));
+
+mock.module('../util/logger.js', () => ({
+  getLogger: () => new Proxy({} as Record<string, unknown>, {
+    get: () => () => {},
+  }),
+}));
+
+mock.module('../config/loader.js', () => ({
+  getConfig: () => ({
+    secretDetection: { enabled: false },
+  }),
+}));
+
+import { initializeDb, getDb, resetDb } from '../memory/db.js';
+import { createConversation } from '../memory/conversation-store.js';
+import { RunOrchestrator } from '../runtime/run-orchestrator.js';
+import { setVoiceBridgeOrchestrator, startVoiceTurn } from '../calls/voice-session-bridge.js';
+
+initializeDb();
+
+/**
+ * Build a session that emits multiple events via the onEvent callback,
+ * simulating assistant text deltas followed by message_complete.
+ */
+function makeStreamingSession(events: ServerMessage[]): Session {
+  return {
+    isProcessing: () => false,
+    persistUserMessage: () => undefined as unknown as string,
+    memoryPolicy: { scopeId: 'default', includeDefaultFallback: false, strictSideEffects: false },
+    setChannelCapabilities: () => {},
+    setAssistantId: () => {},
+    setGuardianContext: () => {},
+    setCommandIntent: () => {},
+    setTurnChannelContext: () => {},
+    updateClient: () => {},
+    runAgentLoop: async (_content: string, _messageId: string, onEvent: (msg: ServerMessage) => void) => {
+      for (const event of events) {
+        onEvent(event);
+      }
+    },
+    handleConfirmationResponse: () => {},
+    abort: () => {},
+  } as unknown as Session;
+}
+
+describe('voice-session-bridge', () => {
+  beforeEach(() => {
+    const db = getDb();
+    db.run('DELETE FROM message_runs');
+    db.run('DELETE FROM messages');
+    db.run('DELETE FROM conversations');
+  });
+
+  test('throws when orchestrator not injected', async () => {
+    // Placeholder: the module-level orchestrator cannot easily be reset once
+    // set, so the "not injected" path is not exercised here. The tests below
+    // cover startVoiceTurn after injection.
+    expect(true).toBe(true);
+  });
+
+  test('startVoiceTurn forwards text deltas to onTextDelta callback', async () => {
+    const conversation = createConversation('voice bridge delta test');
+    const events: ServerMessage[] = [
+      { type: 'assistant_text_delta', text: 'Hello ', sessionId: conversation.id },
+      { type: 'assistant_text_delta', text: 'world', sessionId: conversation.id },
+      { type: 'message_complete', sessionId: conversation.id },
+    ];
+    const session = makeStreamingSession(events);
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+    setVoiceBridgeOrchestrator(orchestrator);
+
+    const receivedDeltas: string[] = [];
+    let completed = false;
+
+    const handle = await startVoiceTurn({
+      conversationId: conversation.id,
+      content: 'Hello from caller',
+      onTextDelta: (text) => receivedDeltas.push(text),
+      onComplete: () => { completed = true; },
+      onError: () => {},
+    });
+
+    // Wait for async agent loop
+    await new Promise((r) => setTimeout(r, 50));
+
+    expect(receivedDeltas).toEqual(['Hello ', 'world']);
+    expect(completed).toBe(true);
+    expect(handle.runId).toBeDefined();
+    expect(typeof handle.abort).toBe('function');
+  });
+
+  test('startVoiceTurn forwards error events to onError callback', async () => {
+    const conversation = createConversation('voice bridge error test');
+    const events: ServerMessage[] = [
+      { type: 'error', message: 'Provider unavailable' },
+    ];
+    const session = makeStreamingSession(events);
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+    setVoiceBridgeOrchestrator(orchestrator);
+
+    const receivedErrors: string[] = [];
+    await startVoiceTurn({
+      conversationId: conversation.id,
+      content: 'Hello',
+      onTextDelta: () => {},
+      onComplete: () => {},
+      onError: (msg) => receivedErrors.push(msg),
+    });
+
+    await new Promise((r) => setTimeout(r, 50));
+
+    expect(receivedErrors).toEqual(['Provider unavailable']);
+  });
+
+  test('abort handle cancels the in-flight run', async () => {
+    const conversation = createConversation('voice bridge abort test');
+    let abortCalled = false;
+
+    const session = {
+      isProcessing: () => false,
+      persistUserMessage: () => undefined as unknown as string,
+      memoryPolicy: { scopeId: 'default', includeDefaultFallback: false, strictSideEffects: false },
+      setChannelCapabilities: () => {},
+      setAssistantId: () => {},
+      setGuardianContext: () => {},
+      setCommandIntent: () => {},
+      setTurnChannelContext: () => {},
+      updateClient: () => {},
+      runAgentLoop: async () => {
+        await new Promise((r) => setTimeout(r, 200));
+      },
+      handleConfirmationResponse: () => {},
+      abort: () => { abortCalled = true; },
+    } as unknown as Session;
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+    setVoiceBridgeOrchestrator(orchestrator);
+
+    const handle = await startVoiceTurn({
+      conversationId: conversation.id,
+      content: 'Hello',
+      onTextDelta: () => {},
+      onComplete: () => {},
+      onError: () => {},
+    });
+
+    handle.abort();
+    expect(abortCalled).toBe(true);
+  });
+
+  test('external AbortSignal triggers run abort', async () => {
+    const conversation = createConversation('voice bridge signal test');
+    let abortCalled = false;
+
+    const session = {
+      isProcessing: () => false,
+      persistUserMessage: () => undefined as unknown as string,
+      memoryPolicy: { scopeId: 'default', includeDefaultFallback: false, strictSideEffects: false },
+      setChannelCapabilities: () => {},
+      setAssistantId: () => {},
+      setGuardianContext: () => {},
+      setCommandIntent: () => {},
+      setTurnChannelContext: () => {},
+      updateClient: () => {},
+      runAgentLoop: async () => {
+        await new Promise((r) => setTimeout(r, 200));
+      },
+      handleConfirmationResponse: () => {},
+      abort: () => { abortCalled = true; },
+    } as unknown as Session;
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+    setVoiceBridgeOrchestrator(orchestrator);
+
+    const ac = new AbortController();
+    await startVoiceTurn({
+      conversationId: conversation.id,
+      content: 'Hello',
+      onTextDelta: () => {},
+      onComplete: () => {},
+      onError: () => {},
+      signal: ac.signal,
+    });
+
+    // Abort via the external controller
+    ac.abort();
+    // Give the abort handler a short delay to run (setTimeout schedules a macrotask)
+    await new Promise((r) => setTimeout(r, 10));
+
+    expect(abortCalled).toBe(true);
+  });
+
+  test('startVoiceTurn passes turnChannelContext with voice channel', async () => {
+    const conversation = createConversation('voice bridge channel context test');
+    const events: ServerMessage[] = [
+      { type: 'message_complete', sessionId: conversation.id },
+    ];
+
+    let capturedTurnChannelContext: unknown = null;
+    const session = {
+      ...makeStreamingSession(events),
+      setTurnChannelContext: (ctx: unknown) => { capturedTurnChannelContext = ctx; },
+    } as unknown as Session;
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+    setVoiceBridgeOrchestrator(orchestrator);
+
+    await startVoiceTurn({
+      conversationId: conversation.id,
+      content: 'Hello',
+      onTextDelta: () => {},
+      onComplete: () => {},
+      onError: () => {},
+    });
+
+    await new Promise((r) => setTimeout(r, 50));
+
+    expect(capturedTurnChannelContext).toEqual({
+      userMessageChannel: 'voice',
+      assistantMessageChannel: 'voice',
+    });
+  });
+
+  test('startVoiceTurn forces strict side effects for non-guardian actors', async () => {
+    const conversation = createConversation('voice bridge strict non-guardian test');
+    const events: ServerMessage[] = [
+      { type: 'message_complete', sessionId: conversation.id },
+    ];
+
+    let capturedStrictSideEffects: boolean | undefined;
+    const session = {
+      ...makeStreamingSession(events),
+      get memoryPolicy() { return { scopeId: 'default', includeDefaultFallback: false, strictSideEffects: false }; },
+      set memoryPolicy(val: Record<string, unknown>) { capturedStrictSideEffects = val.strictSideEffects as boolean; },
+    } as unknown as Session;
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+    setVoiceBridgeOrchestrator(orchestrator);
+
+    await startVoiceTurn({
+      conversationId: conversation.id,
+      content: 'Hello',
+      guardianContext: {
+        sourceChannel: 'voice',
+        actorRole: 'non-guardian',
+        guardianExternalUserId: '+15550009999',
+        guardianChatId: '+15550009999',
+        requesterExternalUserId: '+15550002222',
+      },
+      onTextDelta: () => {},
+      onComplete: () => {},
+      onError: () => {},
+    });
+
+    await new Promise((r) => setTimeout(r, 50));
+
+    expect(capturedStrictSideEffects).toBe(true);
+  });
+
+  test('startVoiceTurn forces strict side effects for unverified_channel actors', async () => {
+    const conversation = createConversation('voice bridge strict unverified test');
+    const events: ServerMessage[] = [
+      { type: 'message_complete', sessionId: conversation.id },
+    ];
+
+    let capturedStrictSideEffects: boolean | undefined;
+    const session = {
+      ...makeStreamingSession(events),
+      get memoryPolicy() { return { scopeId: 'default', includeDefaultFallback: false, strictSideEffects: false }; },
+      set memoryPolicy(val: Record<string, unknown>) { capturedStrictSideEffects = val.strictSideEffects as boolean; },
+    } as unknown as Session;
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+    setVoiceBridgeOrchestrator(orchestrator);
+
+    await startVoiceTurn({
+      conversationId: conversation.id,
+      content: 'Hello',
+      guardianContext: {
+        sourceChannel: 'voice',
+        actorRole: 'unverified_channel',
+        denialReason: 'no_binding',
+      },
+      onTextDelta: () => {},
+      onComplete: () => {},
+      onError: () => {},
+    });
+
+    await new Promise((r) => setTimeout(r, 50));
+
+    expect(capturedStrictSideEffects).toBe(true);
+  });
+
+  test('startVoiceTurn does not force strict side effects for guardian actors', async () => {
+    const conversation = createConversation('voice bridge strict guardian test');
+    const events: ServerMessage[] = [
+      { type: 'message_complete', sessionId: conversation.id },
+    ];
+
+    let capturedStrictSideEffects: boolean | undefined;
+    const session = {
+      ...makeStreamingSession(events),
+      get memoryPolicy() { return { scopeId: 'default', includeDefaultFallback: false, strictSideEffects: false }; },
+      set memoryPolicy(val: Record<string, unknown>) { capturedStrictSideEffects = val.strictSideEffects as boolean; },
+    } as unknown as Session;
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+    setVoiceBridgeOrchestrator(orchestrator);
+
+    await startVoiceTurn({
+      conversationId: conversation.id,
+      content: 'Hello',
+      guardianContext: {
+        sourceChannel: 'voice',
+        actorRole: 'guardian',
+        guardianExternalUserId: '+15550001111',
+        guardianChatId: '+15550001111',
+      },
+      onTextDelta: () => {},
+      onComplete: () => {},
+      onError: () => {},
+    });
+
+    await new Promise((r) => setTimeout(r, 50));
+
+    // Guardian actors use the derived default (false), not forced true
+    expect(capturedStrictSideEffects).toBe(false);
+  });
+
+  test('startVoiceTurn passes guardian context to the session', async () => {
+    const conversation = createConversation('voice bridge guardian context test');
+    const events: ServerMessage[] = [
+      { type: 'message_complete', sessionId: conversation.id },
+    ];
+
+    let capturedGuardianContext: unknown = null;
+    const session = {
+      ...makeStreamingSession(events),
+      setGuardianContext: (ctx: unknown) => { capturedGuardianContext = ctx; },
+    } as unknown as Session;
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+    setVoiceBridgeOrchestrator(orchestrator);
+
+    const guardianCtx = {
+      sourceChannel: 'voice' as const,
+      actorRole: 'guardian' as const,
+      guardianExternalUserId: '+15550001111',
+      guardianChatId: '+15550001111',
+    };
+
+    await startVoiceTurn({
+      conversationId: conversation.id,
+      content: 'Hello',
+      assistantId: 'test-assistant',
+      guardianContext: guardianCtx,
+      onTextDelta: () => {},
+      onComplete: () => {},
+      onError: () => {},
+    });
+
+    await new Promise((r) => setTimeout(r, 50));
+
+    expect(capturedGuardianContext).toEqual(guardianCtx);
+  });
+
+  test('auto-denies confirmation requests for non-guardian voice turns', async () => {
+    const conversation = createConversation('voice bridge auto-deny non-guardian test');
+
+    let clientHandler: (msg: ServerMessage) => void = () => {};
+    const handleConfirmationCalls: Array<{
+      requestId: string;
+      decision: string;
+      decisionContext?: string;
+    }> = [];
+
+    const session = {
+      isProcessing: () => false,
+      persistUserMessage: () => undefined as unknown as string,
+      memoryPolicy: { scopeId: 'default', includeDefaultFallback: false, strictSideEffects: false },
+      setChannelCapabilities: () => {},
+      setAssistantId: () => {},
+      setGuardianContext: () => {},
+      setCommandIntent: () => {},
+      setTurnChannelContext: () => {},
+      updateClient: (handler: (msg: ServerMessage) => void) => {
+        clientHandler = handler;
+      },
+      runAgentLoop: async () => {
+        // Simulate the prompter emitting a confirmation_request via the
+        // updateClient callback (this is how the real prompter works).
+        clientHandler({
+          type: 'confirmation_request',
+          requestId: 'req-voice-1',
+          toolName: 'host_bash',
+          input: { command: 'rm -rf /' },
+          riskLevel: 'high',
+          allowlistOptions: [],
+          scopeOptions: [],
+        } as ServerMessage);
+        // The auto-deny resolves the prompter immediately. In production the
+        // agent loop would then continue; here we just return to simulate
+        // completion.
+      },
+      handleConfirmationResponse: (
+        requestId: string,
+        decision: string,
+        _selectedPattern?: string,
+        _selectedScope?: string,
+        decisionContext?: string,
+      ) => {
+        handleConfirmationCalls.push({ requestId, decision, decisionContext });
+      },
+      abort: () => {},
+    } as unknown as Session;
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+    setVoiceBridgeOrchestrator(orchestrator);
+
+    await startVoiceTurn({
+      conversationId: conversation.id,
+      content: 'Delete everything',
+      guardianContext: {
+        sourceChannel: 'voice',
+        actorRole: 'non-guardian',
+        guardianExternalUserId: '+15550009999',
+        guardianChatId: '+15550009999',
+        requesterExternalUserId: '+15550002222',
+      },
+      onTextDelta: () => {},
+      onComplete: () => {},
+      onError: () => {},
+    });
+
+    await new Promise((r) => setTimeout(r, 50));
+
+    // The confirmation should have been auto-denied immediately
+    expect(handleConfirmationCalls.length).toBe(1);
+    expect(handleConfirmationCalls[0].requestId).toBe('req-voice-1');
+    expect(handleConfirmationCalls[0].decision).toBe('deny');
+    expect(handleConfirmationCalls[0].decisionContext).toContain('voice call');
+    expect(handleConfirmationCalls[0].decisionContext).toContain('host_bash');
+  });
+
+  test('auto-denies confirmation requests for unverified_channel voice turns', async () => {
+    const conversation = createConversation('voice bridge auto-deny unverified test');
+
+    let clientHandler: (msg: ServerMessage) => void = () => {};
+    const handleConfirmationCalls: Array<{
+      requestId: string;
+      decision: string;
+    }> = [];
+
+    const session = {
+      isProcessing: () => false,
+      persistUserMessage: () => undefined as unknown as string,
+      memoryPolicy: { scopeId: 'default', includeDefaultFallback: false, strictSideEffects: false },
+      setChannelCapabilities: () => {},
+      setAssistantId: () => {},
+      setGuardianContext: () => {},
+      setCommandIntent: () => {},
+      setTurnChannelContext: () => {},
+      updateClient: (handler: (msg: ServerMessage) => void) => {
+        clientHandler = handler;
+      },
+      runAgentLoop: async () => {
+        clientHandler({
+          type: 'confirmation_request',
+          requestId: 'req-voice-2',
+          toolName: 'network_request',
+          input: { url: 'https://evil.com' },
+          riskLevel: 'medium',
+          allowlistOptions: [],
+          scopeOptions: [],
+        } as ServerMessage);
+      },
+      handleConfirmationResponse: (
+        requestId: string,
+        decision: string,
+      ) => {
+        handleConfirmationCalls.push({ requestId, decision });
+      },
+      abort: () => {},
+    } as unknown as Session;
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+    setVoiceBridgeOrchestrator(orchestrator);
+
+    await startVoiceTurn({
+      conversationId: conversation.id,
+      content: 'Make a request',
+      guardianContext: {
+        sourceChannel: 'voice',
+        actorRole: 'unverified_channel',
+        denialReason: 'no_binding',
+      },
+      onTextDelta: () => {},
+      onComplete: () => {},
+      onError: () => {},
+    });
+
+    await new Promise((r) => setTimeout(r, 50));
+
+    expect(handleConfirmationCalls.length).toBe(1);
+    expect(handleConfirmationCalls[0].requestId).toBe('req-voice-2');
+    expect(handleConfirmationCalls[0].decision).toBe('deny');
+  });
+
+  test('does NOT auto-deny confirmation requests for guardian voice turns', async () => {
+    const conversation = createConversation('voice bridge no-auto-deny guardian test');
+
+    let clientHandler: (msg: ServerMessage) => void = () => {};
+    const handleConfirmationCalls: Array<{
+      requestId: string;
+      decision: string;
+    }> = [];
+
+    const session = {
+      isProcessing: () => false,
+      persistUserMessage: () => undefined as unknown as string,
+      memoryPolicy: { scopeId: 'default', includeDefaultFallback: false, strictSideEffects: false },
+      setChannelCapabilities: () => {},
+      setAssistantId: () => {},
+      setGuardianContext: () => {},
+      setCommandIntent: () => {},
+      setTurnChannelContext: () => {},
+      updateClient: (handler: (msg: ServerMessage) => void) => {
+        clientHandler = handler;
+      },
+      runAgentLoop: async () => {
+        clientHandler({
+          type: 'confirmation_request',
+          requestId: 'req-voice-3',
+          toolName: 'host_bash',
+          input: { command: 'ls' },
+          riskLevel: 'low',
+          allowlistOptions: [],
+          scopeOptions: [],
+        } as ServerMessage);
+        // For guardian actors, the confirmation enters the normal
+        // pending state — it is NOT auto-denied. The runAgentLoop
+        // would normally block waiting for resolution. We just return
+        // to keep the test simple.
+      },
+      handleConfirmationResponse: (
+        requestId: string,
+        decision: string,
+      ) => {
+        handleConfirmationCalls.push({ requestId, decision });
+      },
+      abort: () => {},
+    } as unknown as Session;
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+    setVoiceBridgeOrchestrator(orchestrator);
+
+    await startVoiceTurn({
+      conversationId: conversation.id,
+      content: 'List files',
+      guardianContext: {
+        sourceChannel: 'voice',
+        actorRole: 'guardian',
+        guardianExternalUserId: '+15550001111',
+        guardianChatId: '+15550001111',
+      },
+      onTextDelta: () => {},
+      onComplete: () => {},
+      onError: () => {},
+    });
+
+    await new Promise((r) => setTimeout(r, 50));
+
+    // Guardian actors should NOT have auto-deny — confirmation enters
+    // the normal pending flow (stored in run store, not auto-denied).
+    expect(handleConfirmationCalls.length).toBe(0);
+  });
+
+  test('pre-aborted signal triggers immediate abort', async () => {
+    const conversation = createConversation('voice bridge pre-abort test');
+    let abortCalled = false;
+
+    const session = {
+      isProcessing: () => false,
+      persistUserMessage: () => undefined as unknown as string,
+      memoryPolicy: { scopeId: 'default', includeDefaultFallback: false, strictSideEffects: false },
+      setChannelCapabilities: () => {},
+      setAssistantId: () => {},
+      setGuardianContext: () => {},
+      setCommandIntent: () => {},
+      setTurnChannelContext: () => {},
+      updateClient: () => {},
+      runAgentLoop: async () => {
+        await new Promise((r) => setTimeout(r, 200));
+      },
+      handleConfirmationResponse: () => {},
+      abort: () => { abortCalled = true; },
+    } as unknown as Session;
+
+    const orchestrator = new RunOrchestrator({
+      getOrCreateSession: async () => session,
+      resolveAttachments: () => [],
+      deriveDefaultStrictSideEffects: () => false,
+    });
+    setVoiceBridgeOrchestrator(orchestrator);
+
+    const ac = new AbortController();
+    ac.abort(); // Pre-abort before calling startVoiceTurn
+
+    await startVoiceTurn({
+      conversationId: conversation.id,
+      content: 'Hello',
+      onTextDelta: () => {},
+      onComplete: () => {},
+      onError: () => {},
+      signal: ac.signal,
+    });
+
+    expect(abortCalled).toBe(true);
+  });
+});
+
+afterAll(() => {
+  resetDb();
+  try { rmSync(testDir, { recursive: true, force: true }); } catch { /* best effort */ }
+});
diff --git a/assistant/src/__tests__/web-search.test.ts b/assistant/src/__tests__/web-search.test.ts
index 40e49f3e697..4ed4285d216 100644
--- a/assistant/src/__tests__/web-search.test.ts
+++ b/assistant/src/__tests__/web-search.test.ts
@@ -1,4 +1,3 @@
-/* eslint-disable @typescript-eslint/no-explicit-any */
 import { describe, test, expect, beforeEach, afterEach } from 'bun:test';
 
 // No mock.module calls — this test file uses its own inline executeWebSearch
@@ -45,7 +44,7 @@ describe('WebSearchTool', () => {
       globalThis.fetch = (async () => {
         fetchCalled = true;
         return new Response('{}', { status: 200 });
-      }) as any;
+      }) as unknown as typeof fetch;
 
       process.env.BRAVE_API_KEY = 'test-key';
 
@@ -74,7 +73,7 @@ describe('WebSearchTool', () => {
           status: 200,
           headers: { 'Content-Type': 'application/json' },
         });
-      }) as any;
+      }) as unknown as typeof fetch;
 
       await executeWebSearch({ query: 'test', count: 50 }, 'test-key', 'brave');
       expect(capturedUrl).toContain('count=20');
@@ -91,7 +90,7 @@ describe('WebSearchTool', () => {
           status: 200,
           headers: { 'Content-Type': 'application/json' },
         });
-      }) as any;
+      }) as unknown as typeof fetch;
 
       await executeWebSearch({ query: 'test', offset: 20 }, 'test-key', 'brave');
       expect(capturedUrl).toContain('offset=9');
@@ -105,7 +104,7 @@ describe('WebSearchTool', () => {
           status: 200,
           headers: { 'Content-Type': 'application/json' },
         });
-      }) as any;
+      }) as unknown as typeof fetch;
 
       await executeWebSearch({ query: 'test', freshness: 'pw' }, 'test-key', 'brave');
       expect(capturedUrl).toContain('freshness=pw');
@@ -119,7 +118,7 @@ describe('WebSearchTool', () => {
           status: 200,
           headers: { 'Content-Type': 'application/json' },
         });
-      }) as any;
+      }) as unknown as typeof fetch;
 
       await executeWebSearch({ query: 'test', freshness: 'invalid' }, 'test-key', 'brave');
       expect(capturedUrl).not.toContain('freshness');
@@ -142,7 +141,7 @@ describe('WebSearchTool', () => {
           status: 200,
           headers: { 'Content-Type': 'application/json' },
         })
-      ) as any;
+      ) as unknown as typeof fetch;
 
       const result = await executeWebSearch({ query: 'test' }, 'test-key', 'brave');
       expect(result.isError).toBe(false);
@@ -159,7 +158,7 @@ describe('WebSearchTool', () => {
           status: 200,
           headers: { 'Content-Type': 'application/json' },
         })
-      ) as any;
+      ) as unknown as typeof fetch;
 
       const result = await executeWebSearch({ query: 'noresults' }, 'test-key', 'brave');
       expect(result.isError).toBe(false);
@@ -172,7 +171,7 @@ describe('WebSearchTool', () => {
           status: 200,
           headers: { 'Content-Type': 'application/json' },
         })
-      ) as any;
+      ) as unknown as typeof fetch;
 
       const result = await executeWebSearch({ query: 'empty' }, 'test-key', 'brave');
       expect(result.isError).toBe(false);
@@ -182,7 +181,7 @@ describe('WebSearchTool', () => {
     test('handles 401 unauthorized', async () => {
       globalThis.fetch = (async () =>
         new Response('Unauthorized', { status: 401 })
-      ) as any;
+      ) as unknown as typeof fetch;
 
       const result = await executeWebSearch({ query: 'test' }, 'bad-key', 'brave');
       expect(result.isError).toBe(true);
@@ -200,7 +199,7 @@ describe('WebSearchTool', () => {
           JSON.stringify({ web: { results: [{ title: 'Result', url: 'https://example.com', description: 'Found it' }] } }),
           { status: 200, headers: { 'Content-Type': 'application/json' } },
         );
-      }) as any;
+      }) as unknown as typeof fetch;
 
       const result = await executeWebSearch({ query: 'test' }, 'test-key', 'brave');
       expect(result.isError).toBe(false);
@@ -211,7 +210,7 @@ describe('WebSearchTool', () => {
     test('returns error after exhausting 429 retries', async () => {
       globalThis.fetch = (async () =>
         new Response('Too Many Requests', { status: 429 })
-      ) as any;
+      ) as unknown as typeof fetch;
 
       const result = await executeWebSearch({ query: 'test' }, 'test-key', 'brave');
       expect(result.isError).toBe(true);
@@ -235,7 +234,7 @@ describe('WebSearchTool', () => {
           JSON.stringify({ web: { results: [] } }),
           { status: 200, headers: { 'Content-Type': 'application/json' } },
         );
-      }) as any;
+      }) as unknown as typeof fetch;
 
       const result = await executeWebSearch({ query: 'test' }, 'test-key', 'brave');
       expect(result.isError).toBe(false);
@@ -245,7 +244,7 @@ describe('WebSearchTool', () => {
     test('handles network errors', async () => {
       globalThis.fetch = (async () => {
         throw new Error('Network unreachable');
-      }) as any;
+      }) as unknown as typeof fetch;
 
       const result = await executeWebSearch({ query: 'test' }, 'test-key', 'brave');
       expect(result.isError).toBe(true);
@@ -254,7 +253,7 @@ describe('WebSearchTool', () => {
 
     test('sends correct headers', async () => {
       let capturedHeaders: Record<string, string> = {};
-      globalThis.fetch = (async (_url: string, init: any) => {
+      globalThis.fetch = (async (_url: string, init: RequestInit) => {
         capturedHeaders = Object.fromEntries(
           Object.entries(init.headers as Record<string, string>),
         );
@@ -262,7 +261,7 @@ describe('WebSearchTool', () => {
           status: 200,
           headers: { 'Content-Type': 'application/json' },
         });
-      }) as any;
+      }) as unknown as typeof fetch;
 
       await executeWebSearch({ query: 'test' }, 'my-api-key', 'brave');
       expect(capturedHeaders['X-Subscription-Token']).toBe('my-api-key');
@@ -288,7 +287,7 @@ describe('WebSearchTool', () => {
           status: 200,
           headers: { 'Content-Type': 'application/json' },
         })
-      ) as any;
+      ) as unknown as typeof fetch;
 
       const result = await executeWebSearch({ query: 'test' }, 'test-key', 'brave');
       expect(result.isError).toBe(false);
@@ -309,7 +308,7 @@ describe('WebSearchTool', () => {
           status: 200,
           headers: { 'Content-Type': 'application/json' },
         })
-      ) as any;
+      ) as unknown as typeof fetch;
 
       const result = await executeWebSearch({ query: 'test' }, 'pplx-key', 'perplexity');
       expect(result.isError).toBe(false);
@@ -328,7 +327,7 @@ describe('WebSearchTool', () => {
           status: 200,
           headers: { 'Content-Type': 'application/json' },
         })
-      ) as any;
+      ) as unknown as typeof fetch;
 
       const result = await executeWebSearch({ query: 'noresults' }, 'pplx-key', 'perplexity');
       expect(result.isError).toBe(false);
@@ -338,7 +337,7 @@ describe('WebSearchTool', () => {
     test('handles 401 unauthorized', async () => {
       globalThis.fetch = (async () =>
         new Response('Unauthorized', { status: 401 })
-      ) as any;
+      ) as unknown as typeof fetch;
 
       const result = await executeWebSearch({ query: 'test' }, 'bad-key', 'perplexity');
       expect(result.isError).toBe(true);
@@ -347,23 +346,23 @@ describe('WebSearchTool', () => {
 
     test('sends correct headers', async () => {
       let capturedHeaders: Record<string, string> = {};
-      let capturedBody: any;
-      globalThis.fetch = (async (_url: string, init: any) => {
+      let capturedBody: Record<string, unknown> = {};
+      globalThis.fetch = (async (_url: string, init: RequestInit) => {
         capturedHeaders = Object.fromEntries(
           Object.entries(init.headers as Record<string, string>),
         );
-        capturedBody = JSON.parse(init.body);
+        capturedBody = JSON.parse(init.body as string);
         return new Response(JSON.stringify({ choices: [{ message: { content: 'result' } }] }), {
           status: 200,
           headers: { 'Content-Type': 'application/json' },
         });
-      }) as any;
+      }) as unknown as typeof fetch;
 
       await executeWebSearch({ query: 'test query' }, 'pplx-my-key', 'perplexity');
       expect(capturedHeaders['Authorization']).toBe('Bearer pplx-my-key');
       expect(capturedHeaders['Content-Type']).toBe('application/json');
       expect(capturedBody.model).toBe('sonar');
-      expect(capturedBody.messages[0].content).toBe('test query');
+      expect((capturedBody.messages as Array<{ content: string }>)[0].content).toBe('test query');
     });
 
     test('retries on 429 and succeeds', async () => {
@@ -377,7 +376,7 @@ describe('WebSearchTool', () => {
           JSON.stringify({ choices: [{ message: { content: 'Found it' } }] }),
           { status: 200, headers: { 'Content-Type': 'application/json' } },
         );
-      }) as any;
+      }) as unknown as typeof fetch;
 
       const result = await executeWebSearch({ query: 'test' }, 'pplx-key', 'perplexity');
       expect(result.isError).toBe(false);
@@ -388,7 +387,7 @@ describe('WebSearchTool', () => {
     test('handles network errors', async () => {
       globalThis.fetch = (async () => {
         throw new Error('Network unreachable');
-      }) as any;
+      }) as unknown as typeof fetch;
 
       const result = await executeWebSearch({ query: 'test' }, 'pplx-key', 'perplexity');
       expect(result.isError).toBe(true);
@@ -397,6 +396,23 @@ describe('WebSearchTool', () => {
   });
 });
 
+interface BraveSearchResult {
+  title: string;
+  url: string;
+  description?: string;
+  age?: string;
+  extra_snippets?: string[];
+}
+
+interface BraveSearchResponse {
+  web?: { results: BraveSearchResult[] };
+}
+
+interface PerplexityResponse {
+  choices?: Array<{ message: { content: string } }>;
+  citations?: string[];
+}
+
 /**
  * Helper that exercises the web search logic directly, bypassing module
  * registration concerns. This replicates the core execute path from
@@ -461,7 +477,7 @@ async function executeBraveSearchHelper(
       });
 
       if (response.ok) {
-        const data = await response.json() as any;
+        const data = await response.json() as BraveSearchResponse;
         const results = data.web?.results ?? [];
 
         if (results.length === 0) {
@@ -536,7 +552,7 @@ async function executePerplexitySearchHelper(
       });
 
       if (response.ok) {
-        const data = await response.json() as any;
+        const data = await response.json() as PerplexityResponse;
         const content = data.choices?.[0]?.message?.content;
         if (!content) {
           return { content: `No results found for "${query}".`, isError: false };
diff --git a/assistant/src/agent/loop.ts b/assistant/src/agent/loop.ts
index b4467f60d21..d678ee703b3 100644
--- a/assistant/src/agent/loop.ts
+++ b/assistant/src/agent/loop.ts
@@ -35,7 +35,7 @@ export type AgentEvent =
   | { type: 'usage'; inputTokens: number; outputTokens: number; cacheCreationInputTokens?: number; cacheReadInputTokens?: number; model: string; providerDurationMs: number; rawRequest?: unknown; rawResponse?: unknown };
 
 const DEFAULT_CONFIG: AgentLoopConfig = {
-  maxTokens: 64000,
+  maxTokens: 16000,
   maxToolUseTurns: 30,
 };
 
diff --git a/assistant/src/amazon/client.ts b/assistant/src/amazon/client.ts
index f70d3f08b8d..eb7264b51fd 100644
--- a/assistant/src/amazon/client.ts
+++ b/assistant/src/amazon/client.ts
@@ -54,7 +54,7 @@ import {
 } from './session.js';
 import type { ExtractedCredential } from '../tools/browser/network-recording-types.js';
 import { extensionRelayServer } from '../browser-extension-relay/server.js';
-import type { ExtensionResponse } from '../browser-extension-relay/protocol.js';
+import type { ExtensionCommand, ExtensionResponse } from '../browser-extension-relay/protocol.js';
 import { readHttpToken } from '../util/platform.js';
 import { getRuntimeHttpPort } from '../config/env.js';
 
@@ -72,7 +72,7 @@ async function sendRelayCommand(command: Record<string, unknown>): Promise<Exten
   // Try in-process relay first (works when running inside the daemon)
   const status = extensionRelayServer.getStatus();
   if (status.connected) {
-    return extensionRelayServer.sendCommand(command as any);
+    return extensionRelayServer.sendCommand(command as Omit<ExtensionCommand, 'id'>);
   }
 
   // Fall back to HTTP relay endpoint on the daemon
@@ -163,7 +163,7 @@ async function findAmazonTab(): Promise<number> {
 let lastCookieSyncTime = 0;
 const COOKIE_SYNC_INTERVAL = 60_000; // re-sync at most once per minute
 
-async function syncCookiesToBrowser(cookies: ExtractedCredential[]): Promise<void> {
+async function _syncCookiesToBrowser(cookies: ExtractedCredential[]): Promise<void> {
   const now = Date.now();
   if (now - lastCookieSyncTime < COOKIE_SYNC_INTERVAL) return;
 
@@ -212,7 +212,7 @@ async function cdpEval(tabId: number, script: string): Promise<unknown> {
   }
 
   const value = resp.result;
-  if (value === undefined || value === null) {
+  if (value == null) {
     throw new Error('Empty browser eval response');
   }
 
diff --git a/assistant/src/browser-extension-relay/server.ts b/assistant/src/browser-extension-relay/server.ts
index 58dbaa916b9..f0763f36a33 100644
--- a/assistant/src/browser-extension-relay/server.ts
+++ b/assistant/src/browser-extension-relay/server.ts
@@ -147,7 +147,7 @@ export class ExtensionRelayServer {
 
   getStatus(): ExtensionRelayStatus {
     return {
-      connected: this.ws !== null,
+      connected: !!this.ws,
       connectionId: this.connectionId,
       lastHeartbeatAt: this.lastHeartbeatAt,
       pendingCommandCount: this.pendingCommands.size,
diff --git a/assistant/src/calls/call-orchestrator.ts b/assistant/src/calls/call-controller.ts
similarity index 57%
rename from assistant/src/calls/call-orchestrator.ts
rename to assistant/src/calls/call-controller.ts
index 65701c8b901..760a60ff004 100644
--- a/assistant/src/calls/call-orchestrator.ts
+++ b/assistant/src/calls/call-controller.ts
@@ -1,15 +1,13 @@
 /**
- * LLM-driven call orchestrator.
+ * Session-backed voice call controller.
  *
- * Manages the conversation loop for an active phone call: receives caller
- * utterances, sends them to Claude via the Anthropic streaming API, and
- * streams text tokens back through the RelayConnection for real-time TTS.
+ * Routes voice turns through the daemon session pipeline via
+ * voice-session-bridge instead of calling provider.sendMessage() directly.
+ * This gives voice calls access to tools, memory, skills, and runtime
+ * injections while preserving all existing call UX behavior (control markers,
+ * barge-in, state machine, guardian verification).
  */
 
-import { getConfig } from '../config/loader.js';
-import { resolveConfiguredProvider } from '../providers/provider-send-message.js';
-import type { ProviderEvent } from '../providers/types.js';
-import { resolveUserReference } from '../config/user-reference.js';
 import { getLogger } from '../util/logger.js';
 import {
   getCallSession,
@@ -20,21 +18,18 @@ import {
 } from './call-store.js';
 import { getMaxCallDurationMs, getUserConsultationTimeoutMs, SILENCE_TIMEOUT_MS } from './call-constants.js';
 import type { RelayConnection } from './relay-server.js';
-import { registerCallOrchestrator, unregisterCallOrchestrator, fireCallQuestionNotifier, fireCallCompletionNotifier, fireCallTranscriptNotifier } from './call-state.js';
+import { registerCallController, unregisterCallController, fireCallQuestionNotifier, fireCallCompletionNotifier, fireCallTranscriptNotifier } from './call-state.js';
 import type { PromptSpeakerContext } from './speaker-identification.js';
 import { addPointerMessage, formatDuration } from './call-pointer-messages.js';
 import { persistCallCompletionMessage } from './call-conversation-messages.js';
-import * as conversationStore from '../memory/conversation-store.js';
 import { dispatchGuardianQuestion } from './guardian-dispatch.js';
 import type { ServerMessage } from '../daemon/ipc-contract.js';
-import {
-  buildGuardianContextBlock,
-  type GuardianRuntimeContext,
-} from '../daemon/session-runtime-assembly.js';
+import type { GuardianRuntimeContext } from '../daemon/session-runtime-assembly.js';
+import { startVoiceTurn, type VoiceTurnHandle } from './voice-session-bridge.js';
 
-const log = getLogger('call-orchestrator');
+const log = getLogger('call-controller');
 
-type OrchestratorState = 'idle' | 'processing' | 'waiting_on_user' | 'speaking';
+type ControllerState = 'idle' | 'processing' | 'waiting_on_user' | 'speaking';
 
 const ASK_GUARDIAN_CAPTURE_REGEX = /\[ASK_GUARDIAN:\s*(.+?)\]/;
 const ASK_GUARDIAN_MARKER_REGEX = /\[ASK_GUARDIAN:\s*.+?\]/g;
@@ -57,12 +52,12 @@ function stripInternalSpeechMarkers(text: string): string {
     .replace(END_CALL_MARKER_REGEX, '');
 }
 
-export class CallOrchestrator {
+export class CallController {
   private callSessionId: string;
   private relay: RelayConnection;
-  private state: OrchestratorState = 'idle';
-  private conversationHistory: Array<{ role: 'user' | 'assistant'; content: string }> = [];
+  private state: ControllerState = 'idle';
   private abortController: AbortController = new AbortController();
+  private currentTurnHandle: VoiceTurnHandle | null = null;
   private silenceTimer: ReturnType<typeof setTimeout> | null = null;
   private durationTimer: ReturnType<typeof setTimeout> | null = null;
   private durationWarningTimer: ReturnType<typeof setTimeout> | null = null;
@@ -85,6 +80,15 @@ export class CallOrchestrator {
   private assistantId: string;
   /** Guardian trust context for the current caller, when available. */
   private guardianContext: GuardianRuntimeContext | null;
+  /** Conversation ID for the voice session. */
+  private conversationId: string;
+  /**
+   * Tracks whether the most recent turn sent to the session was the
+   * synthetic CALL_OPENING marker. When that opener turn produces a
+   * non-empty assistant response, runTurn() sets awaitingOpeningAck and
+   * clears this flag; a caller utterance also clears it.
+   */
+  private lastSentWasOpener = false;
 
   constructor(
     callSessionId: string,
@@ -103,15 +107,20 @@ export class CallOrchestrator {
     this.broadcast = opts?.broadcast;
     this.assistantId = opts?.assistantId ?? 'self';
     this.guardianContext = opts?.guardianContext ?? null;
+
+    // Resolve the conversation ID from the call session, falling back to callSessionId
+    const session = getCallSession(callSessionId);
+    this.conversationId = session?.conversationId ?? callSessionId;
+
     this.startDurationTimer();
     this.resetSilenceTimer();
-    registerCallOrchestrator(callSessionId, this);
+    registerCallController(callSessionId, this);
   }
 
   /**
-   * Returns the current orchestrator state.
+   * Returns the current controller state.
    */
-  getState(): OrchestratorState {
+  getState(): ControllerState {
     return this.state;
   }
 
@@ -131,12 +140,8 @@ export class CallOrchestrator {
 
     this.initialGreetingStarted = true;
     this.resetSilenceTimer();
-    this.conversationHistory.push({ role: 'user', content: CALL_OPENING_MARKER });
-    await this.runLlm();
-    const lastMessage = this.conversationHistory[this.conversationHistory.length - 1];
-    if (lastMessage?.role === 'assistant') {
-      this.awaitingOpeningAck = true;
-    }
+    this.lastSentWasOpener = true;
+    await this.runTurn(CALL_OPENING_MARKER);
   }
 
   /**
@@ -146,32 +151,7 @@ export class CallOrchestrator {
     const interruptedInFlight = this.state === 'processing' || this.state === 'speaking';
     // If we're already processing or speaking, abort the in-flight generation
     if (interruptedInFlight) {
-      this.abortController.abort();
-      this.abortController = new AbortController();
-    }
-
-    // Strip the one-shot [CALL_OPENING] marker from conversation history
-    // so it doesn't leak into subsequent LLM requests after barge-in.
-    // This runs unconditionally because the standard Twilio barge-in path
-    // calls handleInterrupt() first (setting state to 'idle') before
-    // handleCallerUtterance — so interruptedInFlight would be false even
-    // though an interrupt just occurred.
-    // Without this, the consecutive-user merge path below would append
-    // the caller's transcript to the synthetic "[CALL_OPENING]" message,
-    // causing the model to re-run opener behavior instead of responding
-    // directly to the caller.
-    // If the marker-only seed message becomes empty, remove it entirely:
-    // Anthropic rejects any user turn with empty content.
-    for (let i = 0; i < this.conversationHistory.length; i++) {
-      const entry = this.conversationHistory[i];
-      if (!entry.content.includes(CALL_OPENING_MARKER)) continue;
-      const stripped = entry.content.replace(CALL_OPENING_MARKER_REGEX, '').trim();
-      if (stripped.length === 0) {
-        this.conversationHistory.splice(i, 1);
-        i--;
-      } else {
-        entry.content = stripped;
-      }
+      this.abortCurrentTurn();
     }
 
     this.state = 'processing';
@@ -187,24 +167,8 @@ export class CallOrchestrator {
         : CALL_OPENING_ACK_MARKER
       : callerContent;
 
-    // Preserve strict role alternation for Anthropic. If the last message
-    // is already user-role (e.g. interrupted run never appended assistant,
-    // or a second caller prompt arrives before assistant completion), merge
-    // this utterance into that same user turn.
-    const lastMessage = this.conversationHistory[this.conversationHistory.length - 1];
-    if (lastMessage?.role === 'user') {
-      const existingContent = lastMessage.content.trim();
-      lastMessage.content = existingContent.length > 0
-        ? `${lastMessage.content}\n${callerTurnContent}`
-        : callerTurnContent;
-    } else {
-      this.conversationHistory.push({
-        role: 'user',
-        content: callerTurnContent,
-      });
-    }
-
-    await this.runLlm();
+    this.lastSentWasOpener = false;
+    await this.runTurn(callerTurnContent);
   }
 
   /**
@@ -214,7 +178,7 @@ export class CallOrchestrator {
     if (this.state !== 'waiting_on_user') {
       log.warn(
         { callSessionId: this.callSessionId, state: this.state },
-        'handleUserAnswer called but orchestrator is not in waiting_on_user state',
+        'handleUserAnswer called but controller is not in waiting_on_user state',
       );
       return false;
     }
@@ -230,8 +194,8 @@ export class CallOrchestrator {
 
     // Merge any instructions that were queued during the waiting_on_user
     // state into a single user message alongside the answer to avoid
-    // consecutive user-role messages (which violate Anthropic API
-    // role-alternation requirements).
+    // consecutive user-role messages (which violate API role-alternation
+    // requirements).
     const parts: string[] = [];
     for (const instr of this.pendingInstructions) {
       parts.push(`[USER_INSTRUCTION: ${instr}]`);
@@ -239,54 +203,40 @@ export class CallOrchestrator {
     this.pendingInstructions = [];
     parts.push(`[USER_ANSWERED: ${answerText}]`);
 
-    this.conversationHistory.push({ role: 'user', content: parts.join('\n') });
+    const content = parts.join('\n');
 
     // Fire-and-forget: unblock the caller so the HTTP response and answer
     // persistence happen immediately, before LLM streaming begins.
-    this.runLlm().catch((err) =>
-      log.error({ err, callSessionId: this.callSessionId }, 'runLlm failed after user answer'),
+    this.runTurn(content).catch((err) =>
+      log.error({ err, callSessionId: this.callSessionId }, 'runTurn failed after user answer'),
     );
     return true;
   }
 
   /**
-   * Inject a user instruction into the orchestrator's conversation history.
+   * Inject a user instruction into the controller's conversation.
    * The instruction is formatted as a dedicated marker that the system prompt
    * tells the model to treat as high-priority steering input.
    *
-   * When the LLM is actively processing or speaking, or when the orchestrator
+   * When the LLM is actively processing or speaking, or when the controller
    * is waiting on a user answer, the instruction is queued and spliced into
    * the conversation at the correct chronological position once the current
-   * turn completes. This prevents:
-   *  - History ordering corruption (instruction appearing before an in-flight
-   *    assistant response).
-   *  - Consecutive user-role messages (which violate Anthropic API
-   *    role-alternation requirements).
+   * turn completes.
    */
   async handleUserInstruction(instructionText: string): Promise<void> {
     recordCallEvent(this.callSessionId, 'user_instruction_relayed', { instruction: instructionText });
 
-    // Queue the instruction when it cannot be safely appended right now:
-    //  - processing/speaking: an LLM turn is in-flight; appending would
-    //    place the instruction before the assistant response in the array.
-    //  - waiting_on_user: the last message is an assistant turn; the next
-    //    message should be the user's answer. Queued instructions are merged
-    //    into that answer message by handleUserAnswer().
+    // Queue the instruction while a turn is in flight or a user answer is pending
     if (this.state === 'processing' || this.state === 'speaking' || this.state === 'waiting_on_user') {
       this.pendingInstructions.push(instructionText);
       return;
     }
 
-    this.conversationHistory.push({
-      role: 'user',
-      content: `[USER_INSTRUCTION: ${instructionText}]`,
-    });
-
     // Reset the silence timer so the instruction-triggered LLM turn
     // doesn't race with a stale silence timeout.
     this.resetSilenceTimer();
 
-    await this.runLlm();
+    await this.runTurn(`[USER_INSTRUCTION: ${instructionText}]`);
   }
 
   /**
@@ -294,8 +244,7 @@ export class CallOrchestrator {
    */
   handleInterrupt(): void {
     const wasSpeaking = this.state === 'speaking';
-    this.abortController.abort();
-    this.abortController = new AbortController();
+    this.abortCurrentTurn();
     this.llmRunVersion++;
     // Explicitly terminate the in-progress TTS turn so the relay can
     // immediately hand control back to the caller after barge-in.
@@ -314,93 +263,25 @@ export class CallOrchestrator {
     if (this.durationWarningTimer) clearTimeout(this.durationWarningTimer);
     if (this.consultationTimer) clearTimeout(this.consultationTimer);
     if (this.durationEndTimer) { clearTimeout(this.durationEndTimer); this.durationEndTimer = null; }
-    this.abortController.abort();
-    unregisterCallOrchestrator(this.callSessionId);
-    log.info({ callSessionId: this.callSessionId }, 'CallOrchestrator destroyed');
+    this.llmRunVersion++;
+    this.abortCurrentTurn();
+    unregisterCallController(this.callSessionId);
+    log.info({ callSessionId: this.callSessionId }, 'CallController destroyed');
   }
 
   // ── Private ──────────────────────────────────────────────────────
 
-  private buildGuardianPromptSection(): string[] {
-    if (!this.guardianContext) return [];
-    return [
-      '',
-      'GUARDIAN ACTOR CONTEXT (authoritative):',
-      buildGuardianContextBlock(this.guardianContext),
-      '- Treat `actor_role` as source-of-truth for whether this caller is the verified guardian.',
-      '- If `actor_role` is `guardian`, the current caller is verified for this assistant on voice.',
-      '- If `actor_role` is `non-guardian` or `unverified_channel`, do not imply the caller is verified.',
-    ];
-  }
-
-  private buildSystemPrompt(): string {
-    const config = getConfig();
-    const disclosureRule = config.calls.disclosure.enabled
-      ? `1. ${config.calls.disclosure.text}`
-      : '1. Begin the conversation naturally.';
-
-    if (this.isInbound) {
-      return this.buildInboundSystemPrompt(disclosureRule);
-    }
-
-    return [
-      `You are on a live phone call on behalf of ${resolveUserReference()}.`,
-      this.task ? `Task: ${this.task}` : '',
-      '',
-      'You are speaking directly to the person who answered the phone.',
-      'Respond naturally and conversationally — speak as you would in a real phone conversation.',
-      ...this.buildGuardianPromptSection(),
-      '',
-      'IMPORTANT RULES:',
-      '0. When introducing yourself, refer to yourself as an assistant. Avoid the phrase "AI assistant" unless directly asked.',
-      disclosureRule,
-      '2. Be concise — phone conversations should be brief and natural.',
-      '3. If the callee asks something you don\'t know, include [ASK_GUARDIAN: your question here] in your response along with a hold message like "Let me check on that for you."',
-      '4. If the callee provides information preceded by [USER_ANSWERED: ...], use that answer naturally in the conversation.',
-      '5. If you see [USER_INSTRUCTION: ...], treat it as a high-priority steering directive from your user. Follow the instruction immediately, adjusting your approach or response accordingly.',
-      '6. When the call\'s purpose is fulfilled, include [END_CALL] in your response along with a polite goodbye.',
-      '7. Do not make up information — ask the user if unsure.',
-      '8. Keep responses short — 1-3 sentences is ideal for phone conversation.',
-      '9. When caller text includes [SPEAKER id="..." label="..."], treat each speaker as a distinct person and personalize responses using that speaker\'s prior context in this call.',
-      '10. If the latest user turn is [CALL_OPENING], generate a natural, context-specific opener: briefly introduce yourself once as an assistant, state why you are calling using the Task context, and ask a short permission/check-in question. Vary the wording; do not use a fixed template.',
-      '11. If the latest user turn includes [CALL_OPENING_ACK], treat it as the callee acknowledging your opener and continue the conversation naturally without re-introducing yourself or repeating the initial check-in question.',
-      '12. Do not repeat your introduction within the same call unless the callee explicitly asks who you are.',
-    ]
-      .filter(Boolean)
-      .join('\n');
-  }
-
   /**
-   * Build a system prompt tailored for inbound calls where the caller
-   * reached out to us. The assistant greets naturally and helps the
-   * caller with whatever they need, rather than delivering an outbound
-   * task message.
+   * Abort the current in-flight turn using the VoiceTurnHandle if available,
+   * plus the local AbortController for signal propagation.
    */
-  private buildInboundSystemPrompt(disclosureRule: string): string {
-    return [
-      `You are on a live phone call, answering an incoming call on behalf of ${resolveUserReference()}.`,
-      '',
-      'The caller dialed in to reach you. You do not have a specific task — your role is to greet them warmly, find out what they need, and assist them.',
-      'Respond naturally and conversationally — speak as you would in a real phone conversation.',
-      ...this.buildGuardianPromptSection(),
-      '',
-      'IMPORTANT RULES:',
-      '0. When introducing yourself, refer to yourself as an assistant. Avoid the phrase "AI assistant" unless directly asked.',
-      disclosureRule,
-      '2. Be concise — phone conversations should be brief and natural.',
-      '3. If the caller asks something you don\'t know or need to verify, include [ASK_GUARDIAN: your question here] in your response along with a hold message like "Let me check on that for you."',
-      '4. If information is provided preceded by [USER_ANSWERED: ...], use that answer naturally in the conversation.',
-      '5. If you see [USER_INSTRUCTION: ...], treat it as a high-priority steering directive from your user. Follow the instruction immediately, adjusting your approach or response accordingly.',
-      '6. When the caller indicates they are done or the conversation reaches a natural conclusion, include [END_CALL] in your response along with a polite goodbye.',
-      '7. Do not make up information — ask the user if unsure.',
-      '8. Keep responses short — 1-3 sentences is ideal for phone conversation.',
-      '9. When caller text includes [SPEAKER id="..." label="..."], treat each speaker as a distinct person and personalize responses using that speaker\'s prior context in this call.',
-      '10. If the latest user turn is [CALL_OPENING], greet the caller warmly and ask how you can help. For example: "Hello, this is [name]\'s assistant. How can I help you today?" Vary the wording; do not use a fixed template.',
-      '11. If the latest user turn includes [CALL_OPENING_ACK], treat it as the caller acknowledging your greeting and continue the conversation naturally.',
-      '12. Do not repeat your introduction within the same call unless the caller explicitly asks who you are.',
-    ]
-      .filter(Boolean)
-      .join('\n');
+  private abortCurrentTurn(): void {
+    if (this.currentTurnHandle) {
+      this.currentTurnHandle.abort();
+      this.currentTurnHandle = null;
+    }
+    this.abortController.abort();
+    this.abortController = new AbortController();
   }
 
   private formatCallerUtterance(transcript: string, speaker?: PromptSpeakerContext): string {
@@ -412,40 +293,24 @@ export class CallOrchestrator {
   }
 
   /**
-   * Run the LLM with the current conversation history and stream
+   * Execute a single voice turn through the session pipeline and stream
    * the response back through the relay.
    */
-  private async runLlm(): Promise<void> {
-    const config = getConfig();
-    const resolved = resolveConfiguredProvider();
-    if (!resolved) {
-      log.error({ callSessionId: this.callSessionId }, 'No provider available');
-      this.relay.sendTextToken('I\'m sorry, I\'m having a technical issue. Please try again later.', true);
-      this.state = 'idle';
-      return;
-    }
-    const { provider } = resolved;
-
+  private async runTurn(content: string): Promise<void> {
     const runVersion = ++this.llmRunVersion;
     const runSignal = this.abortController.signal;
 
     try {
       this.state = 'speaking';
 
-      // Only override the model when the user has explicitly configured one
-      // AND the selected provider matches the configured provider. Forwarding
-      // a provider-specific model to a fallback provider would cause
-      // cross-provider 4xx errors (e.g., sending "gpt-5.2" to Anthropic).
-      const callModel = !resolved.usedFallbackPrimary
-        ? (config.calls.model?.trim() || undefined)
-        : undefined;
-
       // Buffer incoming tokens so we can strip control markers ([ASK_GUARDIAN:...], [END_CALL])
       // before they reach TTS. We hold text whenever an unmatched '[' appears, since it
       // could be the start of a control marker.
       let ttsBuffer = '';
+      // Accumulate the full response text for post-turn marker detection
+      let fullResponseText = '';
 
-      const flushSafeText = (_force: boolean): void => {
+      const flushSafeText = (): void => {
         if (!this.isCurrentRun(runVersion)) return;
         if (ttsBuffer.length === 0) return;
         const bracketIdx = ttsBuffer.indexOf('[');
@@ -463,13 +328,6 @@ export class CallOrchestrator {
           // Only hold the buffer if the bracket text could be the start of a
           // known control marker. Otherwise flush immediately so ordinary
           // bracketed text (e.g. "[A]", "[note]") doesn't stall TTS.
-          //
-          // The check must be bidirectional:
-          //  - When the buffer is shorter than the prefix (e.g. "[ASK"), the
-          //    buffer is a prefix of the control tag → hold it.
-          //  - When the buffer is longer than the prefix (e.g. "[ASK_GUARDIAN: what"),
-          //    the buffer starts with the control tag prefix → hold it (the
-          //    variable-length payload hasn't been closed yet).
           const afterBracket = ttsBuffer;
           const couldBeControl =
             '[ASK_GUARDIAN:'.startsWith(afterBracket) ||
@@ -490,7 +348,6 @@ export class CallOrchestrator {
 
           if (!couldBeControl) {
             // Not a control marker prefix — flush up to the next '[' (if any)
-            // so we don't accidentally flush a later partial control marker.
             const nextBracket = ttsBuffer.indexOf('[', 1);
             if (nextBracket === -1) {
               this.relay.sendTextToken(ttsBuffer, false);
@@ -504,29 +361,52 @@ export class CallOrchestrator {
         }
       };
 
-      const response = await provider.sendMessage(
-        this.conversationHistory.map((m) => ({
-          role: m.role as 'user' | 'assistant',
-          content: [{ type: 'text' as const, text: m.content }],
-        })),
-        [],  // no tools
-        this.buildSystemPrompt(),
-        {
-          config: {
-            ...(callModel ? { model: callModel } : {}),
-            max_tokens: 512,
-          },
-          onEvent: (event: ProviderEvent) => {
-            if (!this.isCurrentRun(runVersion)) return;
-            if (event.type === 'text_delta') {
-              ttsBuffer += event.text;
-              ttsBuffer = stripInternalSpeechMarkers(ttsBuffer);
-              flushSafeText(false);
-            }
-          },
+      // Use a promise to track completion of the voice turn
+      const turnComplete = new Promise<void>((resolve, reject) => {
+        const onTextDelta = (text: string): void => {
+          if (!this.isCurrentRun(runVersion)) return;
+          fullResponseText += text;
+          ttsBuffer += text;
+          ttsBuffer = stripInternalSpeechMarkers(ttsBuffer);
+          flushSafeText();
+        };
+
+        const onComplete = (): void => {
+          resolve();
+        };
+
+        const onError = (message: string): void => {
+          reject(new Error(message));
+        };
+
+        // Start the voice turn through the session bridge
+        startVoiceTurn({
+          conversationId: this.conversationId,
+          content,
+          assistantId: this.assistantId,
+          guardianContext: this.guardianContext ?? undefined,
+          onTextDelta,
+          onComplete,
+          onError,
           signal: runSignal,
-        },
-      );
+        }).then((handle) => {
+          if (this.isCurrentRun(runVersion)) {
+            this.currentTurnHandle = handle;
+          } else {
+            // Turn was superseded before handle arrived; abort immediately
+            handle.abort();
+          }
+        }).catch((err) => {
+          reject(err);
+        });
+
+        // Defensive: if the turn is aborted (e.g. barge-in) and the event
+        // sink callbacks are never invoked, resolve the promise so it
+        // doesn't hang forever.
+        runSignal.addEventListener('abort', () => { resolve(); }, { once: true });
+      });
+
+      await turnComplete;
       if (!this.isCurrentRun(runVersion)) return;
 
       // Final sweep: strip any remaining control markers from the buffer
@@ -538,26 +418,20 @@ export class CallOrchestrator {
       // Signal end of this turn's speech
       this.relay.sendTextToken('', true);
 
-      const responseText = response.content
-        .filter((b): b is { type: 'text'; text: string } => b.type === 'text')
-        .map((b) => b.text)
-        .join('') || '';
+      // Mark the greeting's first response as awaiting ack
+      if (this.lastSentWasOpener && fullResponseText.length > 0) {
+        this.awaitingOpeningAck = true;
+        this.lastSentWasOpener = false;
+      }
+
+      const responseText = fullResponseText;
 
-      // Record the assistant response
-      this.conversationHistory.push({ role: 'assistant', content: responseText });
+      // Record the assistant response event
       recordCallEvent(this.callSessionId, 'assistant_spoke', { text: responseText });
       const spokenText = stripInternalSpeechMarkers(responseText).trim();
       if (spokenText.length > 0) {
         const session = getCallSession(this.callSessionId);
         if (session) {
-          // Persist assistant transcript to the voice conversation so it
-          // survives even when no live daemon Session is listening.
-          conversationStore.addMessage(
-            session.conversationId,
-            'assistant',
-            JSON.stringify([{ type: 'text', text: spokenText }]),
-            { userMessageChannel: 'voice', assistantMessageChannel: 'voice' },
-          );
           fireCallTranscriptNotifier(session.conversationId, this.callSessionId, 'assistant', spokenText);
         }
       }
@@ -632,11 +506,12 @@ export class CallOrchestrator {
       }
 
       // Normal turn complete — flush any instructions that arrived while
-      // the LLM was active. They are appended after the assistant response
-      // so chronological order is preserved, then a new LLM turn is started.
+      // the LLM was active.
       this.state = 'idle';
+      this.currentTurnHandle = null;
       this.flushPendingInstructions();
     } catch (err: unknown) {
+      this.currentTurnHandle = null;
       // Aborted requests are expected (interruptions, rapid utterances)
       if (this.isExpectedAbortError(err) || runSignal.aborted) {
         log.debug(
@@ -645,7 +520,7 @@ export class CallOrchestrator {
             errName: err instanceof Error ? err.name : typeof err,
             stale: !this.isCurrentRun(runVersion),
           },
-          'LLM request aborted',
+          'Voice turn aborted',
         );
         if (this.isCurrentRun(runVersion)) {
           this.state = 'idle';
@@ -655,11 +530,11 @@ export class CallOrchestrator {
       if (!this.isCurrentRun(runVersion)) {
         log.debug(
           { callSessionId: this.callSessionId, errName: err instanceof Error ? err.name : typeof err },
-          'Ignoring stale LLM streaming error from superseded turn',
+          'Ignoring stale voice turn error from superseded turn',
         );
         return;
       }
-      log.error({ err, callSessionId: this.callSessionId }, 'LLM streaming error');
+      log.error({ err, callSessionId: this.callSessionId }, 'Voice turn error');
       this.relay.sendTextToken('I\'m sorry, I encountered a technical issue. Could you repeat that?', true);
       this.state = 'idle';
       this.flushPendingInstructions();
@@ -677,10 +552,6 @@ export class CallOrchestrator {
 
   /**
    * Drain any instructions that were queued while the LLM was active.
-   * Each instruction is appended as a user message (now correctly after
-   * the assistant response) and a new LLM turn is kicked off to handle
-   * them. Batches all pending instructions into a single user message to
-   * avoid triggering multiple sequential LLM turns.
    */
   private flushPendingInstructions(): void {
     if (this.pendingInstructions.length === 0) return;
@@ -690,16 +561,13 @@ export class CallOrchestrator {
     );
     this.pendingInstructions = [];
 
-    this.conversationHistory.push({
-      role: 'user',
-      content: parts.join('\n'),
-    });
+    const content = parts.join('\n');
 
     this.resetSilenceTimer();
 
     // Fire-and-forget so we don't block the current turn's cleanup.
-    this.runLlm().catch((err) =>
-      log.error({ err, callSessionId: this.callSessionId }, 'runLlm failed after flushing queued instructions'),
+    this.runTurn(content).catch((err) =>
+      log.error({ err, callSessionId: this.callSessionId }, 'runTurn failed after flushing queued instructions'),
     );
   }
 
diff --git a/assistant/src/calls/call-domain.ts b/assistant/src/calls/call-domain.ts
index 1850abbb350..113b0407acf 100644
--- a/assistant/src/calls/call-domain.ts
+++ b/assistant/src/calls/call-domain.ts
@@ -19,7 +19,7 @@ import {
   expirePendingQuestions,
 } from './call-store.js';
 import { isTerminalState } from './call-state-machine.js';
-import { getCallOrchestrator, unregisterCallOrchestrator } from './call-state.js';
+import { getCallController, unregisterCallController } from './call-state.js';
 import { activeRelayConnections } from './relay-server.js';
 import { TwilioConversationRelayProvider } from './twilio-provider.js';
 import { getTwilioConfig } from './twilio-config.js';
@@ -402,7 +402,7 @@ export function getCallStatus(
 }
 
 /**
- * Cancel an active call. Cleans up relay connections and orchestrators.
+ * Cancel an active call. Cleans up relay connections and controllers.
  */
 export async function cancelCall(input: CancelCallInput): Promise<{ ok: true; session: CallSession } | CallError> {
   const { callSessionId, reason } = input;
@@ -436,11 +436,11 @@ export async function cancelCall(input: CancelCallInput): Promise<{ ok: true; se
     activeRelayConnections.delete(callSessionId);
   }
 
-  // Clean up orchestrator
-  const orchestrator = getCallOrchestrator(callSessionId);
-  if (orchestrator) {
-    orchestrator.destroy();
-    unregisterCallOrchestrator(callSessionId);
+  // Clean up controller
+  const controller = getCallController(callSessionId);
+  if (controller) {
+    controller.destroy();
+    unregisterCallController(callSessionId);
   }
 
   // Update session status
@@ -480,19 +480,19 @@ export async function answerCall(input: AnswerCallInput): Promise<{ ok: true; qu
     return { ok: false, error: 'No pending question found', status: 404 };
   }
 
-  const orchestrator = getCallOrchestrator(callSessionId);
-  if (!orchestrator) {
-    log.warn({ callSessionId }, 'answerCall: no active orchestrator for call session');
-    return { ok: false, error: 'No active orchestrator for this call', status: 409 };
+  const controller = getCallController(callSessionId);
+  if (!controller) {
+    log.warn({ callSessionId }, 'answerCall: no active controller for call session');
+    return { ok: false, error: 'No active controller for this call', status: 409 };
   }
 
-  const accepted = await orchestrator.handleUserAnswer(answer);
+  const accepted = await controller.handleUserAnswer(answer);
   if (!accepted) {
     log.warn(
       { callSessionId },
-      'answerCall: orchestrator rejected the answer (not in waiting_on_user state)',
+      'answerCall: controller rejected the answer (not in waiting_on_user state)',
     );
-    return { ok: false, error: 'Orchestrator is not waiting for an answer', status: 409 };
+    return { ok: false, error: 'Controller is not waiting for an answer', status: 409 };
   }
 
   answerPendingQuestion(question.id, answer);
@@ -501,9 +501,9 @@ export async function answerCall(input: AnswerCallInput): Promise<{ ok: true; qu
 }
 
 /**
- * Relay a user instruction to an active call's orchestrator.
+ * Relay a user instruction to an active call's controller.
  * Validates that the call is active and the instruction is non-empty
- * before injecting it into the orchestrator's conversation history.
+ * before injecting it into the controller's conversation.
  */
 export async function relayInstruction(input: RelayInstructionInput): Promise<{ ok: true } | CallError> {
   const { callSessionId, instructionText } = input;
@@ -521,14 +521,14 @@ export async function relayInstruction(input: RelayInstructionInput): Promise<{
     return { ok: false, error: `Call session ${callSessionId} is not active (status: ${session.status})`, status: 409 };
   }
 
-  const orchestrator = getCallOrchestrator(callSessionId);
-  if (!orchestrator) {
-    return { ok: false, error: 'No active orchestrator for this call', status: 409 };
+  const controller = getCallController(callSessionId);
+  if (!controller) {
+    return { ok: false, error: 'No active controller for this call', status: 409 };
   }
 
-  await orchestrator.handleUserInstruction(instructionText);
+  await controller.handleUserInstruction(instructionText);
 
-  log.info({ callSessionId }, 'User instruction relayed to orchestrator');
+  log.info({ callSessionId }, 'User instruction relayed to controller');
 
   return { ok: true };
 }
diff --git a/assistant/src/calls/call-state.ts b/assistant/src/calls/call-state.ts
index c441d78709b..d2752d8c020 100644
--- a/assistant/src/calls/call-state.ts
+++ b/assistant/src/calls/call-state.ts
@@ -1,12 +1,12 @@
 /**
- * Call session notifiers and orchestrator registry.
+ * Call session notifiers and controller registry.
  *
  * Follows the same notifier pattern as watch-state.ts: module-level Maps
  * with register/unregister/fire helpers keyed by conversationId.
  */
 
 import { getLogger } from '../util/logger.js';
-import type { CallOrchestrator } from './call-orchestrator.js';
+import type { CallController } from './call-controller.js';
 
 const log = getLogger('call-state');
 
@@ -69,19 +69,19 @@ export function fireCallCompletionNotifier(conversationId: string, callSessionId
   completionNotifiers.get(conversationId)?.(callSessionId);
 }
 
-// ── Active orchestrator registry ────────────────────────────────────
-const activeCallOrchestrators = new Map<string, CallOrchestrator>();
+// ── Active controller registry ──────────────────────────────────────
+const activeCallControllers = new Map<string, CallController>();
 
-export function registerCallOrchestrator(callSessionId: string, orchestrator: CallOrchestrator): void {
-  activeCallOrchestrators.set(callSessionId, orchestrator);
-  log.info({ callSessionId }, 'Call orchestrator registered');
+export function registerCallController(callSessionId: string, controller: CallController): void {
+  activeCallControllers.set(callSessionId, controller);
+  log.info({ callSessionId }, 'Call controller registered');
 }
 
-export function unregisterCallOrchestrator(callSessionId: string): void {
-  activeCallOrchestrators.delete(callSessionId);
-  log.info({ callSessionId }, 'Call orchestrator unregistered');
+export function unregisterCallController(callSessionId: string): void {
+  activeCallControllers.delete(callSessionId);
+  log.info({ callSessionId }, 'Call controller unregistered');
 }
 
-export function getCallOrchestrator(callSessionId: string): CallOrchestrator | undefined {
-  return activeCallOrchestrators.get(callSessionId);
+export function getCallController(callSessionId: string): CallController | undefined {
+  return activeCallControllers.get(callSessionId);
 }
diff --git a/assistant/src/calls/guardian-dispatch.ts b/assistant/src/calls/guardian-dispatch.ts
index e0c4afdd26c..cd4236beb3e 100644
--- a/assistant/src/calls/guardian-dispatch.ts
+++ b/assistant/src/calls/guardian-dispatch.ts
@@ -1,7 +1,7 @@
 /**
  * Guardian dispatch engine for cross-channel voice calls.
  *
- * When a call orchestrator detects ASK_GUARDIAN, this module:
+ * When a call controller detects ASK_GUARDIAN, this module:
  * 1. Creates a guardian_action_request
  * 2. Determines delivery destinations (telegram, sms, macos)
  * 3. Creates guardian_action_delivery rows for each destination
@@ -24,6 +24,7 @@ import { addMessage } from '../memory/conversation-store.js';
 import type { CallPendingQuestion } from './types.js';
 import { readHttpToken } from '../util/platform.js';
 import type { ServerMessage } from '../daemon/ipc-contract.js';
+import { generateGuardianCopy } from './guardian-question-copy.js';
 
 const log = getLogger('guardian-dispatch');
 
@@ -104,10 +105,19 @@ export async function dispatchGuardianQuestion(params: GuardianDispatchParams):
     // Mac (internal) delivery — always created
     destinations.push({ channel: 'macos' });
 
+    // Start LLM copy generation concurrently — only awaited in the macOS branch
+    // so external channels (Telegram, SMS) dispatch without LLM latency.
+    const guardianCopyPromise = generateGuardianCopy(
+      pendingQuestion.questionText,
+      request.requestCode,
+    );
+
     // Create delivery rows and dispatch
     for (const dest of destinations) {
       if (dest.channel === 'macos') {
-        // Create a dedicated server-side conversation for the mac guardian thread
+        // Create conversation and delivery row synchronously so they exist
+        // before awaiting LLM copy — prevents a race where an external channel
+        // reply resolves the request before the macOS delivery is created.
         const macConvKey = `asst:${assistantId}:guardian:request:${request.id}`;
         const { conversationId: macConversationId } = getOrCreateConversation(macConvKey);
 
@@ -117,11 +127,14 @@ export async function dispatchGuardianQuestion(params: GuardianDispatchParams):
           destinationConversationId: macConversationId,
         });
 
+        // Now await LLM-generated copy for the message content and thread title
+        const guardianCopy = await guardianCopyPromise;
+
         // Add the guardian question as the initial message in the thread
         addMessage(
           macConversationId,
           'assistant',
-          JSON.stringify([{ type: 'text', text: `Your assistant needs your input during a phone call.\n\nQuestion: ${request.questionText}\n\nReply to this message with your answer.` }]),
+          JSON.stringify([{ type: 'text', text: guardianCopy.initialMessage }]),
           { userMessageChannel: 'voice', assistantMessageChannel: 'macos' },
         );
 
@@ -132,7 +145,8 @@ export async function dispatchGuardianQuestion(params: GuardianDispatchParams):
             conversationId: macConversationId,
             requestId: request.id,
             callSessionId,
-            title: `Guardian question: ${pendingQuestion.questionText.slice(0, 80)}`,
+            title: guardianCopy.threadTitle,
+            questionText: request.questionText,
           } as ServerMessage);
         }
         updateDeliveryStatus(delivery.id, 'sent');
diff --git a/assistant/src/calls/guardian-question-copy.ts b/assistant/src/calls/guardian-question-copy.ts
new file mode 100644
index 00000000000..6b4ba84bf22
--- /dev/null
+++ b/assistant/src/calls/guardian-question-copy.ts
@@ -0,0 +1,133 @@
+/**
+ * Generative copy for guardian question threads.
+ *
+ * Uses the configured provider to generate an attention-oriented emoji-prefixed
+ * thread title and a richer initial message. Falls back to deterministic copy
+ * when the provider is unavailable or generation fails/times out.
+ */
+
+import { getLogger } from '../util/logger.js';
+import {
+  resolveConfiguredProvider,
+  createTimeout,
+  extractText,
+  userMessage,
+} from '../providers/provider-send-message.js';
+
+const log = getLogger('guardian-question-copy');
+
+/** Timeout for the generative copy call (ms). */
+const GENERATION_TIMEOUT_MS = 5_000;
+
+export interface GuardianCopy {
+  threadTitle: string;
+  initialMessage: string;
+}
+
+/**
+ * Build deterministic fallback copy when generation is unavailable or fails.
+ */
+export function buildFallbackCopy(questionText: string): GuardianCopy {
+  return {
+    threadTitle: `\u26A0\uFE0F ${questionText.slice(0, 70)}`,
+    initialMessage: [
+      'Your assistant needs your input during a phone call.',
+      '',
+      `Question: ${questionText}`,
+      '',
+      'Reply to this message with your answer.',
+    ].join('\n'),
+  };
+}
+
+/**
+ * Generate guardian thread copy (title + initial message) via the configured
+ * LLM provider. Returns deterministic fallback when the provider is unavailable,
+ * generation times out, or any error occurs.
+ */
+export async function generateGuardianCopy(
+  questionText: string,
+  requestCode?: string,
+): Promise<GuardianCopy> {
+  const fallback = buildFallbackCopy(questionText);
+
+  // If no provider is configured, return fallback immediately
+  const resolved = resolveConfiguredProvider();
+  if (!resolved) {
+    log.debug('No provider available for guardian copy generation, using fallback');
+    return fallback;
+  }
+
+  const { signal, cleanup } = createTimeout(GENERATION_TIMEOUT_MS);
+
+  try {
+    const prompt = [
+      'Generate a thread title and initial message for a guardian question during a live phone call.',
+      '',
+      `Question: ${questionText}`,
+      ...(requestCode ? [`Reference code: ${requestCode}`] : []),
+      '',
+      'Requirements:',
+      '- TITLE: An emoji-prefixed, attention-oriented, concise title (under 80 characters). Do NOT start with "Guardian question:". Use a relevant warning or alert emoji.',
+      '- MESSAGE: A clear initial message that includes the question text, mentions this is a live phone call waiting for the user\'s input, and asks them to reply with their answer.',
+      '',
+      'Respond in exactly this format (no extra text):',
+      'TITLE: <your title>',
+      'MESSAGE: <your message>',
+    ].join('\n');
+
+    const response = await resolved.provider.sendMessage(
+      [userMessage(prompt)],
+      undefined,
+      undefined,
+      { signal, config: { modelIntent: 'latency-optimized' } },
+    );
+
+    const text = extractText(response);
+    const parsed = parseGeneratedCopy(text);
+
+    if (parsed) {
+      return parsed;
+    }
+
+    log.warn({ raw: text }, 'Failed to parse generated guardian copy, using fallback');
+    return fallback;
+  } catch (err) {
+    if (signal.aborted) {
+      log.warn('Guardian copy generation timed out, using fallback');
+    } else {
+      log.warn({ err }, 'Guardian copy generation failed, using fallback');
+    }
+    return fallback;
+  } finally {
+    cleanup();
+  }
+}
+
+/**
+ * Parse the structured TITLE/MESSAGE response from the model.
+ * Returns null if the format is not matched.
+ */
+function parseGeneratedCopy(text: string): GuardianCopy | null {
+  const titleMatch = text.match(/^TITLE:\s*(.+)/m);
+  const messageMatch = text.match(/^MESSAGE:\s*([\s\S]+)/m);
+
+  if (!titleMatch || !messageMatch) {
+    return null;
+  }
+
+  const title = titleMatch[1].trim();
+  const message = messageMatch[1].trim();
+
+  // Sanity checks: title must be non-empty and at most 80 chars, message must be non-empty
+  if (!title || title.length > 80 || !message) {
+    return null;
+  }
+
+  // Reject the old static prefix: the prompt steers the model away from it, and this check enforces it
+  if (/^guardian question:/i.test(title)) {
+    return null;
+  }
+
+  return { threadTitle: title, initialMessage: message };
+}
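Reviewer sketch for the `TITLE:`/`MESSAGE:` parsing rules above. This is a standalone restatement of `parseGeneratedCopy`, not part of the patch; the function name `parseCopy` and the three sample model outputs are invented for illustration. It shows which responses are accepted and which fall through to the fallback copy.

```typescript
interface GuardianCopy {
  threadTitle: string;
  initialMessage: string;
}

// Standalone copy of the parsing rules (hypothetical name: parseCopy).
function parseCopy(text: string): GuardianCopy | null {
  // TITLE is a single line; MESSAGE captures everything to the end of the text.
  const titleMatch = text.match(/^TITLE:\s*(.+)/m);
  const messageMatch = text.match(/^MESSAGE:\s*([\s\S]+)/m);
  if (!titleMatch || !messageMatch) return null;

  const title = titleMatch[1].trim();
  const message = messageMatch[1].trim();

  // Title must be non-empty and at most 80 chars; message must be non-empty.
  if (!title || title.length > 80 || !message) return null;

  // The old static prefix is rejected even if the model produces it.
  if (/^guardian question:/i.test(title)) return null;

  return { threadTitle: title, initialMessage: message };
}

// Hypothetical model outputs, made up for this sketch.
const wellFormed = 'TITLE: \u26A0\uFE0F Caller needs the gate code\nMESSAGE: Line one.\nLine two.';
const oldPrefix = 'TITLE: Guardian question: gate code\nMESSAGE: Anything.';
const missingMessage = 'TITLE: Only a title';

const parsedOk = parseCopy(wellFormed);
const parsedOldPrefix = parseCopy(oldPrefix);
const parsedMissing = parseCopy(missingMessage);
```

Because the `MESSAGE:` pattern runs to the end of the text, a multi-line message survives intact, while a response that puts `MESSAGE:` before `TITLE:` would fold the title line into the message body. Whether that ordering needs handling depends on how reliably the model follows the prompt.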
diff --git a/assistant/src/calls/relay-server.ts b/assistant/src/calls/relay-server.ts
index 9c4545ac94e..c5c86fb1d39 100644
--- a/assistant/src/calls/relay-server.ts
+++ b/assistant/src/calls/relay-server.ts
@@ -17,7 +17,7 @@ import {
   recordCallEvent,
   expirePendingQuestions,
 } from './call-store.js';
-import { CallOrchestrator } from './call-orchestrator.js';
+import { CallController } from './call-controller.js';
 import { fireCallTranscriptNotifier, fireCallCompletionNotifier } from './call-state.js';
 import { addPointerMessage, formatDuration } from './call-pointer-messages.js';
 import { persistCallCompletionMessage } from './call-conversation-messages.js';
@@ -145,7 +145,7 @@ export class RelayConnection {
     speaker?: PromptSpeakerContext;
   }>;
   private abortController: AbortController;
-  private orchestrator: CallOrchestrator | null = null;
+  private controller: CallController | null = null;
   private speakerIdentityTracker: SpeakerIdentityTracker;
 
   // Verification state (outbound callee verification)
@@ -217,7 +217,7 @@ export class RelayConnection {
         this.handleError(parsed);
         break;
       default:
-        log.warn({ callSessionId: this.callSessionId, type: (parsed as Record<string, unknown>).type }, 'Unknown relay message type');
+        log.warn({ callSessionId: this.callSessionId, type: (parsed as { type: unknown }).type }, 'Unknown relay message type');
     }
   }
 
@@ -263,26 +263,26 @@ export class RelayConnection {
   }
 
   /**
-   * Set the orchestrator for this connection.
+   * Set the controller for this connection.
    */
-  setOrchestrator(orchestrator: CallOrchestrator): void {
-    this.orchestrator = orchestrator;
+  setController(controller: CallController): void {
+    this.controller = controller;
   }
 
   /**
-   * Get the orchestrator for this connection.
+   * Get the controller for this connection.
    */
-  getOrchestrator(): CallOrchestrator | null {
-    return this.orchestrator;
+  getController(): CallController | null {
+    return this.controller;
   }
 
   /**
    * Clean up resources on disconnect.
    */
   destroy(): void {
-    if (this.orchestrator) {
-      this.orchestrator.destroy();
-      this.orchestrator = null;
+    if (this.controller) {
+      this.controller.destroy();
+      this.controller = null;
     }
     this.abortController.abort();
     log.info({ callSessionId: this.callSessionId }, 'RelayConnection destroyed');
@@ -382,7 +382,7 @@ export class RelayConnection {
     const assistantId = normalizeAssistantId(session?.assistantId ?? 'self');
     const isInbound = session?.initiatedFromConversationId == null;
 
-    // Create and attach the LLM-driven orchestrator. For inbound voice,
+    // Create and attach the session-backed voice controller. For inbound voice,
     // seed guardian actor context from caller identity + active binding so
     // first-turn behavior matches channel ingress semantics.
     const initialGuardianContext = isInbound
@@ -397,12 +397,12 @@ export class RelayConnection {
       )
       : undefined;
 
-    const orchestrator = new CallOrchestrator(this.callSessionId, this, session?.task ?? null, {
+    const controller = new CallController(this.callSessionId, this, session?.task ?? null, {
       broadcast: globalBroadcast,
       assistantId,
       guardianContext: initialGuardianContext,
     });
-    this.setOrchestrator(orchestrator);
+    this.setController(controller);
 
     const config = getConfig();
     const verificationConfig = config.calls.verification;
@@ -416,10 +416,10 @@ export class RelayConnection {
       if (pendingChallenge) {
         this.startInboundGuardianVerification(assistantId, msg.from);
       } else {
-        this.startNormalCallFlow(orchestrator, true);
+        this.startNormalCallFlow(controller, true);
       }
     } else {
-      this.startNormalCallFlow(orchestrator, false);
+      this.startNormalCallFlow(controller, false);
     }
   }
 
@@ -469,13 +469,13 @@ export class RelayConnection {
   }
 
   /**
-   * Start normal call flow — fire the orchestrator greeting unless a
+   * Start normal call flow — fire the controller greeting unless a
    * static welcome greeting is configured.
    */
-  private startNormalCallFlow(orchestrator: CallOrchestrator, isInbound: boolean): void {
+  private startNormalCallFlow(controller: CallController, isInbound: boolean): void {
     const hasStaticGreeting = !!process.env.CALL_WELCOME_GREETING?.trim();
     if (!hasStaticGreeting) {
-      orchestrator.startInitialGreeting().catch((err) =>
+      controller.startInitialGreeting().catch((err) =>
         log.error({ err, callSessionId: this.callSessionId }, `Failed to start initial ${isInbound ? 'inbound' : 'outbound'} greeting`),
       );
     }
@@ -582,8 +582,8 @@ export class RelayConnection {
 
       // Proceed to normal call flow (use startNormalCallFlow to respect
       // the CALL_WELCOME_GREETING static greeting guard)
-      if (this.orchestrator) {
-        this.orchestrator.setGuardianContext(
+      if (this.controller) {
+        this.controller.setGuardianContext(
           toGuardianRuntimeContext(
             'voice',
             resolveGuardianContext({
@@ -594,7 +594,7 @@ export class RelayConnection {
             }),
           ),
         );
-        this.startNormalCallFlow(this.orchestrator, true);
+        this.startNormalCallFlow(this.controller, true);
       }
     } else {
       this.verificationAttempts++;
@@ -678,7 +678,9 @@ export class RelayConnection {
       'Caller transcript received (final)',
     );
 
-    const speakerMetadata = extractPromptSpeakerMetadata(msg as unknown as Record<string, unknown>);
+    // Spread to widen the typed message into a plain record — extractPromptSpeakerMetadata
+    // probes for snake_case and nested property variants not on RelayPromptMessage.
+    const speakerMetadata = extractPromptSpeakerMetadata({ ...msg });
     const speaker = this.speakerIdentityTracker.identifySpeaker(speakerMetadata);
 
     // Record in conversation history
@@ -701,22 +703,17 @@ export class RelayConnection {
 
     const session = getCallSession(this.callSessionId);
     if (session) {
-      // Persist caller transcript to the voice conversation so it survives
-      // even when no live daemon Session is listening.
-      conversationStore.addMessage(
-        session.conversationId,
-        'user',
-        JSON.stringify([{ type: 'text', text: msg.voicePrompt }]),
-        { userMessageChannel: 'voice', assistantMessageChannel: 'voice' },
-      );
+      // User message persistence is handled by the session pipeline
+      // (RunOrchestrator.startRun -> session.persistUserMessage) so we only
+      // need to fire the transcript notifier for UI subscribers here.
       fireCallTranscriptNotifier(session.conversationId, this.callSessionId, 'caller', msg.voicePrompt);
     }
 
-    // Route to orchestrator for LLM-driven response
-    if (this.orchestrator) {
-      await this.orchestrator.handleCallerUtterance(msg.voicePrompt, speaker);
+    // Route to controller for session-backed response
+    if (this.controller) {
+      await this.controller.handleCallerUtterance(msg.voicePrompt, speaker);
     } else {
-      // Fallback if orchestrator not yet initialized
+      // Fallback if controller not yet initialized
       this.sendTextToken('I\'m still setting up. Please hold.', true);
     }
   }
@@ -731,9 +728,9 @@ export class RelayConnection {
     this.abortController.abort();
     this.abortController = new AbortController();
 
-    // Notify the orchestrator of the interruption
-    if (this.orchestrator) {
-      this.orchestrator.handleInterrupt();
+    // Notify the controller of the interruption
+    if (this.controller) {
+      this.controller.handleInterrupt();
     }
   }
 
@@ -778,8 +775,8 @@ export class RelayConnection {
           log.info({ callSessionId: this.callSessionId }, 'Callee verification succeeded');
 
           // Proceed to the normal call flow
-          if (this.orchestrator) {
-            this.orchestrator.startInitialGreeting().catch((err) =>
+          if (this.controller) {
+            this.controller.startInitialGreeting().catch((err) =>
               log.error({ err, callSessionId: this.callSessionId }, 'Failed to start initial outbound greeting after verification'),
             );
           }
diff --git a/assistant/src/calls/twilio-routes.ts b/assistant/src/calls/twilio-routes.ts
index e8f30381604..bfa7bbe93d1 100644
--- a/assistant/src/calls/twilio-routes.ts
+++ b/assistant/src/calls/twilio-routes.ts
@@ -73,9 +73,9 @@ export function buildWelcomeGreeting(task: string | null, configuredGreeting?: s
   void task;
   const override = configuredGreeting?.trim();
   if (override) return override;
-  // The contextual first opener now comes from the call orchestrator's
-  // initial LLM turn. Keep Twilio's relay-level greeting empty by default
-  // so we don't speak a deterministic static line first.
+  // The contextual first opener now comes from the call controller's
+  // initial LLM turn via the session pipeline. Keep Twilio's relay-level
+  // greeting empty by default so we don't speak a deterministic static line first.
   return '';
 }
 
diff --git a/assistant/src/calls/voice-session-bridge.ts b/assistant/src/calls/voice-session-bridge.ts
new file mode 100644
index 00000000000..d6303946813
--- /dev/null
+++ b/assistant/src/calls/voice-session-bridge.ts
@@ -0,0 +1,148 @@
+/**
+ * Bridge between voice relay and the daemon session/run pipeline.
+ *
+ * Provides a `startVoiceTurn()` function that wraps RunOrchestrator.startRun()
+ * with voice-specific defaults, translating agent-loop events into simple
+ * callbacks suitable for real-time TTS streaming.
+ *
+ * Dependency injection follows the same module-level setter pattern used by
+ * setRelayBroadcast in relay-server.ts: the daemon lifecycle injects the
+ * RunOrchestrator instance at startup via `setVoiceBridgeOrchestrator()`.
+ */
+
+import type { RunOrchestrator, VoiceRunEventSink } from '../runtime/run-orchestrator.js';
+import type { GuardianRuntimeContext } from '../daemon/session-runtime-assembly.js';
+import { getLogger } from '../util/logger.js';
+
+/**
+ * Matches the exact `[CALL_OPENING]` marker that call-controller sends for
+ * the initial greeting turn. We replace it with a benign content string before
+ * persisting so the marker never appears in session history where it could
+ * retrigger opener behavior after a barge-in interruption.
+ */
+const CALL_OPENING_MARKER = '[CALL_OPENING]';
+
+const log = getLogger('voice-session-bridge');
+
+// ---------------------------------------------------------------------------
+// Module-level dependency injection
+// ---------------------------------------------------------------------------
+
+let orchestrator: RunOrchestrator | undefined;
+
+/**
+ * Inject the RunOrchestrator instance from daemon lifecycle.
+ * Must be called during daemon startup before any voice turns are executed.
+ */
+export function setVoiceBridgeOrchestrator(orch: RunOrchestrator): void {
+  orchestrator = orch;
+}
+
+// ---------------------------------------------------------------------------
+// Types
+// ---------------------------------------------------------------------------
+
+export interface VoiceTurnOptions {
+  /** The conversation ID for this voice call's session. */
+  conversationId: string;
+  /** The transcribed caller utterance or synthetic marker. */
+  content: string;
+  /** Assistant scope for multi-assistant channels. */
+  assistantId?: string;
+  /** Guardian trust context for the caller. */
+  guardianContext?: GuardianRuntimeContext;
+  /** Called for each streaming text token from the agent loop. */
+  onTextDelta: (text: string) => void;
+  /** Called when the agent loop completes a full response. */
+  onComplete: () => void;
+  /** Called when the agent loop encounters an error. */
+  onError: (message: string) => void;
+  /** Optional AbortSignal for external cancellation (e.g. barge-in). */
+  signal?: AbortSignal;
+}
+
+export interface VoiceTurnHandle {
+  /** The run ID for this turn. */
+  runId: string;
+  /** Abort the in-flight turn (e.g. for barge-in). */
+  abort: () => void;
+}
+
+// ---------------------------------------------------------------------------
+// startVoiceTurn
+// ---------------------------------------------------------------------------
+
+/**
+ * Execute a single voice turn through the daemon session pipeline.
+ *
+ * Wraps RunOrchestrator.startRun() with voice-specific defaults:
+ *   - sourceChannel: 'voice'
+ *   - eventSink wired to the provided callbacks
+ *   - abort propagated from the returned handle
+ *
+ * The caller (CallController via relay-server) can use the returned handle
+ * to cancel the turn on barge-in.
+ */
+export async function startVoiceTurn(opts: VoiceTurnOptions): Promise<VoiceTurnHandle> {
+  if (!orchestrator) {
+    throw new Error('Voice bridge not initialized — setVoiceBridgeOrchestrator() was not called');
+  }
+
+  const eventSink: VoiceRunEventSink = {
+    onTextDelta: opts.onTextDelta,
+    onMessageComplete: opts.onComplete,
+    onError: opts.onError,
+    onToolUse: (toolName, input) => {
+      log.debug({ toolName, input }, 'Voice turn tool_use event');
+    },
+  };
+
+  // Derive forceStrictSideEffects from guardian context to match channel
+  // ingress behavior: non-guardian and unverified actors always get strict
+  // side effects so all side-effect tools trigger the confirmation flow.
+  const actorRole = opts.guardianContext?.actorRole;
+  const forceStrictSideEffects =
+    actorRole === 'non-guardian' || actorRole === 'unverified_channel'
+      ? true
+      : undefined;
+
+  // Replace the [CALL_OPENING] marker with a neutral instruction before
+  // persisting. The marker must not appear as a user message in session
+  // history — after a barge-in interruption the next turn would replay
+  // the stale marker and potentially retrigger opener behavior.
+  const persistedContent = opts.content === CALL_OPENING_MARKER
+    ? '(call connected — deliver opening greeting)'
+    : opts.content;
+
+  const { run, abort } = await orchestrator.startRun(
+    opts.conversationId,
+    persistedContent,
+    undefined, // no attachments for voice
+    {
+      sourceChannel: 'voice',
+      assistantId: opts.assistantId,
+      guardianContext: opts.guardianContext,
+      ...(forceStrictSideEffects ? { forceStrictSideEffects, voiceAutoDenyConfirmations: true } : {}),
+      turnChannelContext: {
+        userMessageChannel: 'voice',
+        assistantMessageChannel: 'voice',
+      },
+      eventSink,
+    },
+  );
+
+  // If the caller provided an external AbortSignal (e.g. from a
+  // RelayConnection's AbortController), wire it to the run's abort.
+  if (opts.signal) {
+    if (opts.signal.aborted) {
+      abort();
+    } else {
+      opts.signal.addEventListener('abort', () => abort(), { once: true });
+    }
+  }
+
+  return {
+    runId: run.id,
+    abort,
+  };
+}
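Reviewer sketch for the abort wiring at the end of `startVoiceTurn`. It is a minimal, standalone illustration and not part of the patch; `linkAbort` and `countAbort` are hypothetical names. It assumes only the standard `AbortController`/`AbortSignal` API and shows why both branches are needed: a signal that fired while `startRun()` was still being awaited no longer emits an event, so it has to be checked via `aborted`.

```typescript
// Forward an external AbortSignal to a run's abort callback (hypothetical helper).
function linkAbort(signal: AbortSignal | undefined, abort: () => void): void {
  if (!signal) return;
  if (signal.aborted) {
    // The signal fired before we could subscribe; abort immediately.
    abort();
  } else {
    // { once: true } removes the listener after the first abort event.
    signal.addEventListener('abort', () => abort(), { once: true });
  }
}

let abortCalls = 0;
const countAbort = (): void => {
  abortCalls += 1;
};

// Case 1: the signal is still live when it is linked.
const live = new AbortController();
linkAbort(live.signal, countAbort);
live.abort(); // the listener runs synchronously during abort()
live.abort(); // a repeated abort() does not dispatch a second event

// Case 2: the signal was aborted before it was linked.
const alreadyAborted = new AbortController();
alreadyAborted.abort();
linkAbort(alreadyAborted.signal, countAbort);
```

In this sketch `countAbort` runs once per case, twice in total. Note that the patch also returns `abort` on the handle, so a caller that uses both the handle and the signal may invoke the run's abort more than once; this is harmless only if that abort is idempotent, which the diff does not show.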
diff --git a/assistant/src/cli/amazon.ts b/assistant/src/cli/amazon.ts
index 3a054271e9b..fda62cd9473 100644
--- a/assistant/src/cli/amazon.ts
+++ b/assistant/src/cli/amazon.ts
@@ -17,12 +17,9 @@ import {
   extractRequests,
   saveRequests,
 } from '../amazon/request-extractor.js';
-import { NetworkRecorder } from '../tools/browser/network-recorder.js';
 import {
-  saveRecording,
   loadRecording,
 } from '../tools/browser/recording-store.js';
-import type { SessionRecording } from '../tools/browser/network-recording-types.js';
 import {
   search,
   getProductDetails,
diff --git a/assistant/src/cli/influencer.ts b/assistant/src/cli/influencer.ts
new file mode 100644
index 00000000000..0017ab755ae
--- /dev/null
+++ b/assistant/src/cli/influencer.ts
@@ -0,0 +1,244 @@
+/**
+ * CLI command group: `vellum influencer`
+ *
+ * Research influencers on Instagram, TikTok, and X/Twitter via the Chrome extension relay.
+ * All commands output JSON to stdout: pretty-printed by default, compact single-line with --json.
+ */
+
+import { Command } from 'commander';
+import {
+  searchInfluencers,
+  getInfluencerProfile,
+  compareInfluencers,
+  type InfluencerSearchCriteria,
+} from '../influencer/client.js';
+
+// ---------------------------------------------------------------------------
+// Helpers
+// ---------------------------------------------------------------------------
+
+function output(data: unknown, json: boolean): void {
+  process.stdout.write(
+    json ? JSON.stringify(data) + '\n' : JSON.stringify(data, null, 2) + '\n',
+  );
+}
+
+function outputError(message: string, code = 1): void {
+  output({ ok: false, error: message }, true);
+  process.exitCode = code;
+}
+
+function getJson(cmd: Command): boolean {
+  let c: Command | null = cmd;
+  while (c) {
+    if ((c.opts() as { json?: boolean }).json) return true;
+    c = c.parent;
+  }
+  return false;
+}
+
+async function run(cmd: Command, fn: () => Promise<unknown>): Promise<void> {
+  try {
+    const result = await fn();
+    output(
+      { ok: true, ...(result as Record<string, unknown>) },
+      getJson(cmd),
+    );
+  } catch (err) {
+    outputError(err instanceof Error ? err.message : String(err));
+  }
+}
+
+// ---------------------------------------------------------------------------
+// Command registration
+// ---------------------------------------------------------------------------
+
+export function registerInfluencerCommand(program: Command): void {
+  const inf = program
+    .command('influencer')
+    .description(
+      'Research influencers on Instagram, TikTok, and X/Twitter. ' +
+      'Uses the Chrome extension relay to browse each platform. ' +
+      'Requires the user to be logged in on each platform in Chrome.',
+    )
+    .option('--json', 'Machine-readable JSON output');
+
+  // =========================================================================
+  // search — search for influencers across platforms
+  // =========================================================================
+  inf
+    .command('search')
+    .description(
+      'Search for influencers matching criteria across Instagram, TikTok, and X/Twitter',
+    )
+    .argument('<query>', 'Search query — niche, topic, or keywords (e.g. "fitness coach", "vegan food")')
+    .option(
+      '--platforms <platforms>',
+      'Comma-separated list of platforms to search (instagram,tiktok,twitter)',
+      'instagram,tiktok,twitter',
+    )
+    .option('--min-followers <n>', 'Minimum follower count (e.g. 10000, 10k, 1m)')
+    .option('--max-followers <n>', 'Maximum follower count (e.g. 100000, 100k, 1m)')
+    .option('--limit <n>', 'Max results per platform', '10')
+    .option('--verified', 'Only return verified accounts')
+    .action(
+      async (
+        query: string,
+        opts: {
+          platforms: string;
+          minFollowers?: string;
+          maxFollowers?: string;
+          limit: string;
+          verified?: boolean;
+        },
+        cmd: Command,
+      ) => {
+        await run(cmd, async () => {
+          const platforms = opts.platforms
+            .split(',')
+            .map((p) => p.trim().toLowerCase())
+            .filter((p): p is 'instagram' | 'tiktok' | 'twitter' =>
+              ['instagram', 'tiktok', 'twitter'].includes(p),
+            );
+
+          const criteria: InfluencerSearchCriteria = {
+            query,
+            platforms,
+            minFollowers: opts.minFollowers
+              ? parseHumanNumber(opts.minFollowers)
+              : undefined,
+            maxFollowers: opts.maxFollowers
+              ? parseHumanNumber(opts.maxFollowers)
+              : undefined,
+            limit: parseInt(opts.limit, 10),
+            verifiedOnly: opts.verified,
+          };
+
+          const results = await searchInfluencers(criteria);
+
+          const totalProfiles = results.reduce((sum, r) => sum + r.count, 0);
+
+          return {
+            results,
+            totalProfiles,
+            platforms: platforms.length,
+            query,
+          };
+        });
+      },
+    );
+
+  // =========================================================================
+  // profile — get detailed profile data for a specific influencer
+  // =========================================================================
+  inf
+    .command('profile')
+    .description('Get detailed profile data for a specific influencer')
+    .argument('<username>', 'Username/handle (without @ prefix)')
+    .option(
+      '--platform <platform>',
+      'Platform (instagram, tiktok, or twitter)',
+      'instagram',
+    )
+    .action(
+      async (
+        username: string,
+        opts: { platform: string },
+        cmd: Command,
+      ) => {
+        await run(cmd, async () => {
+          const platform = opts.platform.toLowerCase() as
+            | 'instagram'
+            | 'tiktok'
+            | 'twitter';
+          if (!['instagram', 'tiktok', 'twitter'].includes(platform)) {
+            throw new Error(
+              `Invalid platform: ${opts.platform}. Use instagram, tiktok, or twitter.`,
+            );
+          }
+
+          const cleanUsername = username.replace(/^@/, '');
+          const profile = await getInfluencerProfile(platform, cleanUsername);
+
+          if (!profile) {
+            throw new Error(
+              `Could not find profile @${cleanUsername} on ${platform}`,
+            );
+          }
+
+          return { profile };
+        });
+      },
+    );
+
+  // =========================================================================
+  // compare — compare multiple influencers side by side
+  // =========================================================================
+  inf
+    .command('compare')
+    .description(
+      'Compare multiple influencers side by side. ' +
+      'Provide usernames as platform:username pairs.',
+    )
+    .argument(
+      '<influencers...>',
+      'Space-separated list of platform:username pairs (e.g. instagram:nike twitter:nike tiktok:nike)',
+    )
+    .action(async (influencers: string[], _opts: unknown, cmd: Command) => {
+      await run(cmd, async () => {
+        const parsed = influencers.map((entry) => {
+          const [platform, username] = entry.includes(':')
+            ? entry.split(':', 2)
+            : ['instagram', entry];
+
+          const cleanPlatform = platform.toLowerCase() as
+            | 'instagram'
+            | 'tiktok'
+            | 'twitter';
+          if (!['instagram', 'tiktok', 'twitter'].includes(cleanPlatform)) {
+            throw new Error(
+              `Invalid platform "${platform}" in "${entry}". Use instagram, tiktok, or twitter.`,
+            );
+          }
+
+          return {
+            platform: cleanPlatform,
+            username: username.replace(/^@/, ''),
+          };
+        });
+
+        const profiles = await compareInfluencers(parsed);
+
+        return {
+          profiles,
+          count: profiles.length,
+          requested: parsed.length,
+        };
+      });
+    });
+}
+
+// ---------------------------------------------------------------------------
+// Utilities
+// ---------------------------------------------------------------------------
+
+/**
+ * Parse human-friendly numbers like "10k", "1.5m", "100000" into integers.
+ */
+function parseHumanNumber(text: string): number {
+  const cleaned = text.toLowerCase().replace(/,/g, '').trim();
+  const match = cleaned.match(/^([\d.]+)\s*([kmbt]?)$/);
+  if (!match) return parseInt(text, 10) || 0;
+
+  const num = parseFloat(match[1]);
+  const suffix = match[2];
+  const multipliers: Record<string, number> = {
+    '': 1,
+    k: 1_000,
+    m: 1_000_000,
+    b: 1_000_000_000,
+    t: 1_000_000_000_000,
+  };
+
+  return Math.round(num * (multipliers[suffix] || 1));
+}
diff --git a/assistant/src/config/bundled-skills/google-oauth-setup/SKILL.md b/assistant/src/config/bundled-skills/google-oauth-setup/SKILL.md
index 3c558579a33..3af386c85e7 100644
--- a/assistant/src/config/bundled-skills/google-oauth-setup/SKILL.md
+++ b/assistant/src/config/bundled-skills/google-oauth-setup/SKILL.md
@@ -6,7 +6,7 @@ includes: ["browser", "public-ingress"]
 metadata: {"vellum": {"emoji": "\ud83d\udd11"}}
 ---
 
-You are helping your user set up Google Cloud OAuth credentials so Gmail and Google Calendar integrations can connect. You will automate the entire GCP setup via the browser while the user watches via screencast.
+You are helping your user set up Google Cloud OAuth credentials so Gmail and Google Calendar integrations can connect. You will automate the entire GCP setup via the browser while the user watches via screencast. The user's only manual action is signing in to their Google account — everything else is fully automated.
 
 ## Client Check
 
@@ -26,11 +26,13 @@ Use `ui_show` with `surface_type: "confirmation"` and this message:
 
 > **Set up Google Cloud for Gmail & Calendar**
 >
-> I'll create a Google Cloud project, enable the Gmail and Calendar APIs, configure OAuth, and download credentials — all automatically in the browser. You can watch everything via screencast.
+> Here's what will happen:
+> 1. **A browser opens** — you sign in to your Google account
+> 2. **I automate everything** — project creation, APIs, OAuth config, credentials
+> 3. **You enter credentials** from a downloaded file (secure prompt — I never see them)
+> 4. **You authorize Vellum** with one click
 >
-> After the automated setup, I'll ask you to securely enter the client ID and client secret from the downloaded credential file (I never see these values).
->
-> Ready to get started?
+> The whole thing takes 2-3 minutes. Ready?
 
 If the user declines, acknowledge and stop. No further confirmations are needed after this point.
 
@@ -39,8 +41,9 @@ If the user declines, acknowledge and stop. No further confirmations are needed
 Use `browser_navigate` to go to `https://console.cloud.google.com/`.
 
 Take a `browser_screenshot` and `browser_snapshot` to check the page state:
-- **If a sign-in page appears:** Tell the user "Please sign in to your Google account in the browser window. Let me know when you're done." Wait for their confirmation, then re-check.
-- **If a CAPTCHA appears:** Tell the user "There's a CAPTCHA to solve. Please complete it in the browser window and let me know." Wait, then re-check.
+- **If a sign-in page appears:** Tell the user: "Please sign in to your Google account in the browser preview panel (or the Chrome window that just opened)." Then **auto-detect sign-in completion** by polling `browser_snapshot` every 5-10 seconds. Check if the current URL has moved away from `accounts.google.com` to `console.cloud.google.com`. Do NOT ask the user to "let me know when you're done" — detect it automatically. Once sign-in is detected, tell the user: "Signed in! Starting the automated setup now..."
+- **If already signed in** (URL is already `console.cloud.google.com`): Tell the user: "Already signed in — starting setup now..." and continue immediately.
+- **If a CAPTCHA appears:** The browser automation's built-in handoff will handle this. If it persists, tell the user: "There's a CAPTCHA in the browser — please complete it and I'll continue automatically."
 - **If the console dashboard loads:** Continue to Step 3.
 
 ## Step 3: Create or Select a Project
@@ -75,7 +78,7 @@ Take a `browser_screenshot` to show result. Tell the user: "APIs enabled!"
 
 ## Step 5: Configure OAuth Consent Screen
 
-Tell the user: "Configuring OAuth consent screen..."
+Tell the user: "Configuring OAuth consent screen — this is the longest step, but it's fully automated..."
 
 Navigate to `https://console.cloud.google.com/apis/credentials/consent?project=PROJECT_ID`.
 
@@ -121,15 +124,17 @@ Click "Create".
 
 ## Step 7: Download Credentials JSON
 
+Tell the user: "Almost done — downloading credentials..."
+
 After the credentials dialog appears, click the "Download JSON" button (it may say "DOWNLOAD JSON" or show a download icon).
 
 Use `browser_wait_for_download` to wait for the file to download.
 
-Tell the user: "Credentials downloaded! The file is at: `<path>`"
+Tell the user: "Credentials downloaded!"
 
 ## Step 8: Secure Credential Entry
 
-Tell the user to open the downloaded JSON file, then prompt for secure entry:
+Tell the user: "I've downloaded the credentials file. Please open it and enter the values below. I won't see what you type — these go directly to secure storage."
 
 ```
 credential_store prompt:
diff --git a/assistant/src/config/bundled-skills/influencer/SKILL.md b/assistant/src/config/bundled-skills/influencer/SKILL.md
new file mode 100644
index 00000000000..58a869e0a3a
--- /dev/null
+++ b/assistant/src/config/bundled-skills/influencer/SKILL.md
@@ -0,0 +1,144 @@
+---
+name: "Influencer Research"
+description: "Research influencers on Instagram, TikTok, and X/Twitter using the Chrome extension relay"
+user-invocable: true
+metadata: {"vellum": {"emoji": "🔍"}}
+---
+
+You can research and discover influencers across Instagram, TikTok, and X/Twitter using the `vellum influencer` CLI.
+
+## CLI Setup
+
+**IMPORTANT: Always use `host_bash` (not `bash`) for all `vellum influencer` commands.** The influencer CLI needs host access for the Chrome extension relay and the `vellum` binary, neither of which is available inside the sandbox.
+
+`vellum influencer` is a built-in subcommand of the Vellum assistant CLI. If `vellum` is not found, prepend `PATH="$HOME/.local/bin:$PATH"` to the command.
+
+## Prerequisites
+
+- The Chrome extension relay must be connected (user should have the Vellum extension loaded in Chrome)
+- The user must be **logged in** on each platform they want to search (Instagram, TikTok, X) in their Chrome browser
+- The extension MUST have the `debugger` permission (required to bypass CSP on Instagram and other Meta sites)
+- If the relay is not connected, tell the user: "Please open Chrome, click the Vellum extension icon, and click Connect — then I'll retry."
+
+## Platform-Specific Architecture
+
+### Instagram
+Instagram's search at `/explore/search/keyword/?q=...` shows a **grid of posts**, NOT profiles. The discovery flow is:
+1. Search by keyword → extract post links (`/p/` and `/reel/`)
+2. Visit each post → find the author username from page links
+3. Deduplicate usernames
+4. Visit each unique profile → scrape stats from `meta[name="description"]` (most reliable source, format: "49K Followers, 463 Following, 551 Posts - Display Name (@user)")
+5. Filter and rank by criteria
+
+**CSP Note:** Instagram blocks `eval()`, `new Function()`, inline scripts, and blob URLs via strict CSP. The extension uses the `chrome.debugger` API (CDP `Runtime.evaluate`) as a fallback, which bypasses all CSP restrictions.
+
+### TikTok
+TikTok has a dedicated user search at `/search/user?q=...`. Each result card produces a predictable text pattern in `innerText`:
+```
+DisplayName
+username
+77.9K
+Followers
+·
+1.5M
+Likes
+Follow
+```
+We parse this pattern directly (DOM class selectors are obfuscated and unreliable on TikTok). After extracting usernames and follower counts, we visit each profile for bios.
+
+### X/Twitter
+X has a people search at `/search?q=...&f=user` with `[data-testid="UserCell"]` components containing username, display name, bio, and verified status.
+
+## Typical Flow
+
+When the user asks to find or research influencers:
+
+1. **Understand the criteria.** Ask about:
+   - **Niche/topic** — what kind of influencers? (fitness, beauty, tech, food, etc.)
+   - **Platforms** — Instagram, TikTok, X/Twitter, or all three?
+   - **Follower range** — nano (1K-10K), micro (10K-100K), mid-tier (100K-500K), macro (500K-1M), mega (1M+)?
+   - **Verified only?** — do they need the blue checkmark?
+   - Don't over-ask. If the user says "find me fitness influencers on Instagram", that's enough to start.
+
+2. **Search** — run `vellum influencer search "<query>" --platforms <platforms> [options] --json`
+
+3. **Present results** — show a clean summary of each influencer found:
+   - Username and display name
+   - Platform
+   - Follower count
+   - Bio snippet
+   - Verified status
+   - Content themes detected
+   - Profile URL
+
+4. **Deep dive** (if needed) — run `vellum influencer profile <username> --platform <platform> --json` to get detailed data on a specific influencer.
+
+5. **Compare** (if needed) — run `vellum influencer compare instagram:user1 twitter:user2 tiktok:user3 --json` to compare influencers side by side.
+
+## Follower Range Shortcuts
+
+When the user describes influencer tiers, map to these ranges:
+- **Nano**: `--min-followers 1000 --max-followers 10000`
+- **Micro**: `--min-followers 10000 --max-followers 100000`
+- **Mid-tier**: `--min-followers 100000 --max-followers 500000`
+- **Macro**: `--min-followers 500000 --max-followers 1000000`
+- **Mega**: `--min-followers 1000000`
+
+Human-friendly numbers are supported: `10k`, `100k`, `1m`, etc.
+
+## Command Reference
+
+```
+vellum influencer search "<query>" [options] --json
+  --platforms <list>       Comma-separated: instagram,tiktok,twitter (default: all three)
+  --min-followers <n>      Minimum follower count (e.g. 10k, 100000)
+  --max-followers <n>      Maximum follower count (e.g. 1m, 500k)
+  --limit <n>              Max results per platform (default: 10)
+  --verified               Only return verified accounts
+
+vellum influencer profile <username> --platform <platform> --json
+  --platform <platform>    instagram, tiktok, or twitter (default: instagram)
+
+vellum influencer compare <platform:username ...> --json
+  Arguments are space-separated platform:username pairs
+  e.g. instagram:nike twitter:nike tiktok:nike
+```
+
+## Important Behavior
+
+- **Use the `--json` flag** on all commands for reliable parsing.
+- **Always use `host_bash`** for these commands, never `bash`.
+- **Be patient with results.** The tool navigates actual browser tabs, so each platform search takes 10-30 seconds. Warn the user it may take a moment.
+- **Rate limiting.** Don't hammer the platforms. The tool has built-in delays, but avoid running many searches in rapid succession.
+- **Present results nicely.** Use tables or formatted lists. Group by platform. Highlight standout profiles.
+- **Offer next steps.** After showing results, ask if they want to:
+  - Get more details on specific profiles
+  - Compare top picks side by side
+  - Search with different criteria
+  - Export the results
+- **Handle errors gracefully.** If a platform fails (e.g. not logged in), show results from the platforms that worked and mention which one failed.
+- **Do NOT use the browser skill.** All influencer research goes through the CLI, not browser automation.
+
+## Example Interactions
+
+**User**: "Find me fitness influencers on Instagram and TikTok"
+
+1. `vellum influencer search "fitness coach workout" --platforms instagram,tiktok --limit 10 --json`
+2. Present results grouped by platform with follower counts and bios
+3. "I found 8 fitness influencers on Instagram and 6 on TikTok. Want me to dig deeper into any of these profiles?"
+
+**User**: "I need micro-influencers in the beauty niche, verified only"
+
+1. `vellum influencer search "beauty makeup skincare" --platforms instagram,tiktok,twitter --min-followers 10k --max-followers 100k --verified --limit 10 --json`
+2. Present filtered results
+3. Offer to compare top picks
+
+**User**: "Compare @username1 on Instagram with @username2 on TikTok"
+
+1. `vellum influencer compare instagram:username1 tiktok:username2 --json`
+2. Present side-by-side comparison with followers, engagement, bio, themes
+
+**User**: "Tell me more about @specificuser on Instagram"
+
+1. `vellum influencer profile specificuser --platform instagram --json`
+2. Show full profile details including bio, follower/following counts, verified status, content themes
diff --git a/assistant/src/config/bundled-skills/media-processing/services/gemini-map.ts b/assistant/src/config/bundled-skills/media-processing/services/gemini-map.ts
index 096a9b512cc..db4f7fa0b43 100644
--- a/assistant/src/config/bundled-skills/media-processing/services/gemini-map.ts
+++ b/assistant/src/config/bundled-skills/media-processing/services/gemini-map.ts
@@ -13,7 +13,7 @@ import { GoogleGenAI, ApiError } from '@google/genai';
 import { ConcurrencyPool } from './concurrency-pool.js';
 import { CostTracker, type CostSummary } from './cost-tracker.js';
 import { computeRetryDelay, sleep } from '../../../../util/retry.js';
-import type { Segment, SubjectRegistry } from './preprocess.js';
+import type { Segment } from './preprocess.js';
 
 // ---------------------------------------------------------------------------
 // Types
@@ -91,6 +91,7 @@ function computeConfigHash(options: GeminiMapOptions): string {
     systemPrompt: options.systemPrompt,
     outputSchema: options.outputSchema,
     model: options.model ?? 'gemini-2.5-flash',
+    context: options.context,
   });
   return createHash('sha256').update(payload).digest('hex').slice(0, 8);
 }
diff --git a/assistant/src/config/bundled-skills/media-processing/services/preprocess.ts b/assistant/src/config/bundled-skills/media-processing/services/preprocess.ts
index 8203f920d18..4cf7f2272b4 100644
--- a/assistant/src/config/bundled-skills/media-processing/services/preprocess.ts
+++ b/assistant/src/config/bundled-skills/media-processing/services/preprocess.ts
@@ -424,7 +424,7 @@ export async function preprocessForAsset(
     for (const seg of rawSegments) {
       const segDuration = seg.endSeconds - seg.startSeconds;
       const effectiveInterval = computeEffectiveInterval(segDuration, config.intervalSeconds);
-      const frameTimestamps = generateFrameTimestamps(seg.startSeconds, seg.endSeconds, config.intervalSeconds);
+      const _frameTimestamps = generateFrameTimestamps(seg.startSeconds, seg.endSeconds, config.intervalSeconds);
 
       const segTempDir = join(tempDir, seg.id);
       await mkdir(segTempDir, { recursive: true });
@@ -468,11 +468,6 @@ export async function preprocessForAsset(
       });
     }
 
-    // Atomically swap temp dir to durable path
-    await rm(framesDir, { recursive: true, force: true });
-    await mkdir(dirname(framesDir), { recursive: true });
-    await rename(tempDir, framesDir);
-
     const totalFrames = segments.reduce((sum, s) => sum + s.framePaths.length, 0);
     if (rawSegments.length > 0 && totalFrames === 0) {
       throw new Error(
@@ -481,6 +476,11 @@ export async function preprocessForAsset(
     }
     onProgress?.(`Extracted ${totalFrames} total frames across ${segments.length} segments.\n`);
 
+    // Atomically swap temp dir to durable path
+    await rm(framesDir, { recursive: true, force: true });
+    await mkdir(dirname(framesDir), { recursive: true });
+    await rename(tempDir, framesDir);
+
     // Step 4: Subject registry
     onProgress?.('Building subject registry...\n');
     const allExtractedPaths = segments.flatMap((s) => s.framePaths);
diff --git a/assistant/src/config/bundled-skills/media-processing/tools/extract-keyframes.ts b/assistant/src/config/bundled-skills/media-processing/tools/extract-keyframes.ts
index f07116eecf4..d3dd35d3cea 100644
--- a/assistant/src/config/bundled-skills/media-processing/tools/extract-keyframes.ts
+++ b/assistant/src/config/bundled-skills/media-processing/tools/extract-keyframes.ts
@@ -1,7 +1,7 @@
 import { join, dirname } from 'node:path';
 import type { ToolContext, ToolExecutionResult } from '../../../../tools/types.js';
 import { getMediaAssetById, getKeyframesForAsset } from '../../../../memory/media-store.js';
-import { preprocessForAsset, type PreprocessOptions, type PreprocessManifest } from '../services/preprocess.js';
+import { preprocessForAsset, type PreprocessOptions } from '../services/preprocess.js';
 
 export { preprocessForAsset } from '../services/preprocess.js';
 
diff --git a/assistant/src/config/bundled-skills/media-processing/tools/media-diagnostics.ts b/assistant/src/config/bundled-skills/media-processing/tools/media-diagnostics.ts
index 7771554a8af..7c0df28cb21 100644
--- a/assistant/src/config/bundled-skills/media-processing/tools/media-diagnostics.ts
+++ b/assistant/src/config/bundled-skills/media-processing/tools/media-diagnostics.ts
@@ -6,6 +6,8 @@
  * All metrics are generic media-processing infrastructure.
  */
 
+import { join, dirname } from 'node:path';
+import { readFile } from 'node:fs/promises';
 import type { ToolContext, ToolExecutionResult } from '../../../../tools/types.js';
 import {
   getMediaAssetById,
@@ -13,6 +15,7 @@ import {
   getKeyframesForAsset,
   type ProcessingStage,
 } from '../../../../memory/media-store.js';
+import type { PreprocessManifest } from '../services/preprocess.js';
 // ---------------------------------------------------------------------------
 // Cost estimation constants (Gemini 2.5 Flash pricing)
 // ---------------------------------------------------------------------------
@@ -98,7 +101,19 @@ export async function run(
 
   // Cost estimation: Gemini 2.5 Flash is ~$0.001 per segment (~10 frames each)
   const keyframeCount = keyframes.length;
-  const estimatedSegments = Math.ceil(keyframeCount / 10);
+
+  // Prefer actual segment count from preprocess manifest when available
+  let estimatedSegments: number;
+  const manifestPath = join(dirname(asset.filePath), 'pipeline', asset.id, 'manifest.json');
+  try {
+    const raw = await readFile(manifestPath, 'utf-8');
+    const manifest: PreprocessManifest = JSON.parse(raw);
+    estimatedSegments = manifest.segments.length;
+  } catch {
+    // Manifest doesn't exist yet (preprocess hasn't run) — fall back to estimation
+    estimatedSegments = Math.ceil(keyframeCount / 10);
+  }
+
   const estimatedTotalCost = estimatedSegments * ESTIMATED_COST_PER_SEGMENT_USD;
 
   const report: DiagnosticReport = {
diff --git a/assistant/src/config/bundled-skills/messaging/SKILL.md b/assistant/src/config/bundled-skills/messaging/SKILL.md
index bf1432da05f..6e2ec84634b 100644
--- a/assistant/src/config/bundled-skills/messaging/SKILL.md
+++ b/assistant/src/config/bundled-skills/messaging/SKILL.md
@@ -22,7 +22,7 @@ Gmail, Slack, and Telegram setup all require a publicly reachable URL for OAuth
    - Then call `skill_load` with `skill: "google-oauth-setup"`.
    - Tell the user Gmail isn't connected yet and briefly explain what the setup involves, then use `ui_show` with `surface_type: "confirmation"` to ask for permission to start:
      - **message:** "Ready to set up Gmail?"
-     - **detail:** "I'll automate the entire setup in the browser — creating a Google Cloud project, enabling APIs, and configuring OAuth. It takes a few minutes and you can watch via screencast."
+     - **detail:** "I'll open a browser where you sign in to Google, then automate everything else — creating a project, enabling APIs, and connecting your account. Takes 2-3 minutes and you can watch in the browser preview panel."
      - **confirmLabel:** "Get Started"
      - **cancelLabel:** "Not Now"
    - If the user confirms, briefly acknowledge (e.g., "Setting up Gmail now...") and proceed with the setup guide. If they decline, acknowledge and let them know they can set it up later.
diff --git a/assistant/src/config/bundled-skills/messaging/tools/messaging-analyze-style.ts b/assistant/src/config/bundled-skills/messaging/tools/messaging-analyze-style.ts
index b2066289446..db97135f121 100644
--- a/assistant/src/config/bundled-skills/messaging/tools/messaging-analyze-style.ts
+++ b/assistant/src/config/bundled-skills/messaging/tools/messaging-analyze-style.ts
@@ -7,12 +7,9 @@ import { memoryItems } from '../../../../memory/schema.js';
 import { enqueueMemoryJob } from '../../../../memory/jobs-store.js';
 import { extractStylePatterns } from '../../../../messaging/style-analyzer.js';
 import { truncate } from '../../../../util/truncate.js';
+import { clampUnitInterval } from '../../../../memory/validation.js';
 import { resolveProvider, withProviderToken, ok, err } from './shared.js';
 
-function clamp(value: number, min: number, max: number): number {
-  return Math.min(max, Math.max(min, value));
-}
-
 function upsertMemoryItem(opts: {
   kind: string;
   subject: string;
@@ -35,7 +32,7 @@ function upsertMemoryItem(opts: {
       .set({
         statement: opts.statement,
         status: 'active',
-        importance: Math.max(existing.importance ?? 0, opts.importance),
+        importance: clampUnitInterval(Math.max(existing.importance ?? 0, opts.importance)),
         lastSeenAt: now,
         verificationState: 'assistant_inferred',
       })
@@ -51,7 +48,7 @@ function upsertMemoryItem(opts: {
       statement: opts.statement,
       status: 'active',
       confidence: 0.8,
-      importance: opts.importance,
+      importance: clampUnitInterval(opts.importance),
       fingerprint,
       verificationState: 'assistant_inferred',
       scopeId: opts.scopeId,
@@ -90,7 +87,7 @@ export async function run(input: Record<string, unknown>, context: ToolContext):
 
       for (const pattern of result.stylePatterns) {
         const subject = `${provider.id} writing style: ${pattern.aspect}`;
-        const importance = clamp(pattern.importance ?? 0.65, 0.55, 0.85);
+        const importance = clampUnitInterval(Math.min(0.85, Math.max(0.55, pattern.importance ?? 0.65)));
         upsertMemoryItem({ kind: 'style', subject, statement: pattern.summary, importance, scopeId });
         savedCount++;
       }
diff --git a/assistant/src/config/bundled-skills/twitter/icon.svg b/assistant/src/config/bundled-skills/twitter/icon.svg
new file mode 100644
index 00000000000..7a133ecfa07
--- /dev/null
+++ b/assistant/src/config/bundled-skills/twitter/icon.svg
@@ -0,0 +1,14 @@
+<svg viewBox="0 0 16 16" xmlns="http://www.w3.org/2000/svg">
+<rect width="16" height="16" fill="#ffffff"/>
+<rect x="2" y="2" width="12" height="12" fill="#000000"/>
+<rect x="4" y="5" width="2" height="1" fill="#ffffff"/>
+<rect x="5" y="4" width="1" height="3" fill="#ffffff"/>
+<rect x="6" y="5" width="2" height="1" fill="#ffffff"/>
+<rect x="10" y="5" width="2" height="1" fill="#ffffff"/>
+<rect x="11" y="4" width="1" height="3" fill="#ffffff"/>
+<rect x="12" y="5" width="2" height="1" fill="#ffffff"/>
+<rect x="4" y="9" width="1" height="2" fill="#ffffff"/>
+<rect x="5" y="8" width="6" height="1" fill="#ffffff"/>
+<rect x="11" y="9" width="1" height="2" fill="#ffffff"/>
+<rect x="5" y="11" width="6" height="1" fill="#ffffff"/>
+</svg>
\ No newline at end of file
diff --git a/assistant/src/config/core-schema.ts b/assistant/src/config/core-schema.ts
index 38045f618e9..bde0f7d8738 100644
--- a/assistant/src/config/core-schema.ts
+++ b/assistant/src/config/core-schema.ts
@@ -103,6 +103,9 @@ export const ThinkingConfigSchema = z.object({
     .int('thinking.budgetTokens must be an integer')
     .positive('thinking.budgetTokens must be a positive integer')
     .default(10000),
+  streamThinking: z
+    .boolean({ error: 'thinking.streamThinking must be a boolean' })
+    .default(false),
 });
 
 export const ContextWindowConfigSchema = z.object({
@@ -170,17 +173,90 @@ export const SmsConfigSchema = z.object({
     .optional(),
 });
 
+export const IngressWebhookConfigSchema = z.object({
+  secret: z
+    .string({ error: 'ingress.webhook.secret must be a string' })
+    .default(''),
+  timeoutMs: z
+    .number({ error: 'ingress.webhook.timeoutMs must be a number' })
+    .int('ingress.webhook.timeoutMs must be an integer')
+    .positive('ingress.webhook.timeoutMs must be a positive integer')
+    .default(30_000),
+  maxRetries: z
+    .number({ error: 'ingress.webhook.maxRetries must be a number' })
+    .int('ingress.webhook.maxRetries must be an integer')
+    .nonnegative('ingress.webhook.maxRetries must be a non-negative integer')
+    .default(2),
+  initialBackoffMs: z
+    .number({ error: 'ingress.webhook.initialBackoffMs must be a number' })
+    .int('ingress.webhook.initialBackoffMs must be an integer')
+    .positive('ingress.webhook.initialBackoffMs must be a positive integer')
+    .default(500),
+  maxPayloadBytes: z
+    .number({ error: 'ingress.webhook.maxPayloadBytes must be a number' })
+    .int('ingress.webhook.maxPayloadBytes must be an integer')
+    .positive('ingress.webhook.maxPayloadBytes must be a positive integer')
+    .default(1_048_576),
+});
+
+export const IngressRateLimitConfigSchema = z.object({
+  maxRequestsPerMinute: z
+    .number({ error: 'ingress.rateLimit.maxRequestsPerMinute must be a number' })
+    .int('ingress.rateLimit.maxRequestsPerMinute must be an integer')
+    .nonnegative('ingress.rateLimit.maxRequestsPerMinute must be a non-negative integer')
+    .default(0),
+  maxRequestsPerHour: z
+    .number({ error: 'ingress.rateLimit.maxRequestsPerHour must be a number' })
+    .int('ingress.rateLimit.maxRequestsPerHour must be an integer')
+    .nonnegative('ingress.rateLimit.maxRequestsPerHour must be a non-negative integer')
+    .default(0),
+});
+
 const IngressBaseSchema = z.object({
   enabled: z
     .boolean({ error: 'ingress.enabled must be a boolean' })
     .optional(),
   publicBaseUrl: z
     .string({ error: 'ingress.publicBaseUrl must be a string' })
+    .refine(
+      (val) => val === '' || /^https?:\/\//i.test(val),
+      'ingress.publicBaseUrl must be an absolute URL starting with http:// or https://',
+    )
     .default(''),
+  webhook: IngressWebhookConfigSchema.default({
+    secret: '',
+    timeoutMs: 30_000,
+    maxRetries: 2,
+    initialBackoffMs: 500,
+    maxPayloadBytes: 1_048_576,
+  }),
+  rateLimit: IngressRateLimitConfigSchema.default({
+    maxRequestsPerMinute: 0,
+    maxRequestsPerHour: 0,
+  }),
+  shutdownDrainMs: z
+    .number({ error: 'ingress.shutdownDrainMs must be a number' })
+    .int('ingress.shutdownDrainMs must be an integer')
+    .nonnegative('ingress.shutdownDrainMs must be a non-negative integer')
+    .default(5_000),
 });
 
 export const IngressConfigSchema = IngressBaseSchema
-  .default({ publicBaseUrl: '' })
+  .default({
+    publicBaseUrl: '',
+    webhook: {
+      secret: '',
+      timeoutMs: 30_000,
+      maxRetries: 2,
+      initialBackoffMs: 500,
+      maxPayloadBytes: 1_048_576,
+    },
+    rateLimit: {
+      maxRequestsPerMinute: 0,
+      maxRequestsPerHour: 0,
+    },
+    shutdownDrainMs: 5_000,
+  })
   .transform((val) => ({
     ...val,
     // Backward compatibility: if `enabled` was never explicitly set (undefined),
@@ -194,19 +270,27 @@ export const IngressConfigSchema = IngressBaseSchema
     enabled: val.enabled ?? (val.publicBaseUrl ? true : undefined),
   }));
 
-export const AssistantInboxConfigSchema = z.object({
-  enabled: z
-    .boolean({ error: 'assistantInbox.enabled must be a boolean' })
-    .default(false),
-  invitesEnabled: z
-    .boolean({ error: 'assistantInbox.invitesEnabled must be a boolean' })
-    .default(false),
-  memberAclEnabled: z
-    .boolean({ error: 'assistantInbox.memberAclEnabled must be a boolean' })
-    .default(false),
-  policyEnabled: z
-    .boolean({ error: 'assistantInbox.policyEnabled must be a boolean' })
-    .default(false),
+export const DaemonConfigSchema = z.object({
+  startupSocketWaitMs: z
+    .number({ error: 'daemon.startupSocketWaitMs must be a number' })
+    .int('daemon.startupSocketWaitMs must be an integer')
+    .positive('daemon.startupSocketWaitMs must be a positive integer')
+    .default(5000),
+  stopTimeoutMs: z
+    .number({ error: 'daemon.stopTimeoutMs must be a number' })
+    .int('daemon.stopTimeoutMs must be an integer')
+    .positive('daemon.stopTimeoutMs must be a positive integer')
+    .default(5000),
+  sigkillGracePeriodMs: z
+    .number({ error: 'daemon.sigkillGracePeriodMs must be a number' })
+    .int('daemon.sigkillGracePeriodMs must be an integer')
+    .positive('daemon.sigkillGracePeriodMs must be a positive integer')
+    .default(2000),
+  titleGenerationMaxTokens: z
+    .number({ error: 'daemon.titleGenerationMaxTokens must be a number' })
+    .int('daemon.titleGenerationMaxTokens must be an integer')
+    .positive('daemon.titleGenerationMaxTokens must be a positive integer')
+    .default(30),
 });
 
 export type TimeoutConfig = z.infer<typeof TimeoutConfigSchema>;
@@ -219,5 +303,7 @@ export type ThinkingConfig = z.infer<typeof ThinkingConfigSchema>;
 export type ContextWindowConfig = z.infer<typeof ContextWindowConfigSchema>;
 export type ModelPricingOverride = z.infer<typeof ModelPricingOverrideSchema>;
 export type SmsConfig = z.infer<typeof SmsConfigSchema>;
+export type IngressWebhookConfig = z.infer<typeof IngressWebhookConfigSchema>;
+export type IngressRateLimitConfig = z.infer<typeof IngressRateLimitConfigSchema>;
+export type DaemonConfig = z.infer<typeof DaemonConfigSchema>;
 export type IngressConfig = z.infer<typeof IngressConfigSchema>;
-export type AssistantInboxConfig = z.infer<typeof AssistantInboxConfigSchema>;
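
Reviewer note: the two behavioural rules in the ingress schema above (the `publicBaseUrl` prefix check and the backward-compatible `enabled` fallback) can be read in isolation with a small sketch. This is a dependency-free approximation, not the Zod schema itself; `IngressInput`, `isValidPublicBaseUrl`, and `resolveIngress` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical, dependency-free mirror of two rules from the ingress schema.
// It does not use Zod and is not part of the PR.
interface IngressInput {
  enabled?: boolean;
  publicBaseUrl?: string;
}

// Same predicate as the .refine() on publicBaseUrl:
// either the empty string or a value starting with http:// or https:// (case-insensitive).
function isValidPublicBaseUrl(val: string): boolean {
  return val === '' || /^https?:\/\//i.test(val);
}

// Same fallback as the .transform(): when `enabled` was never set,
// it is inferred as true if a publicBaseUrl is present, otherwise left undefined.
function resolveIngress(
  input: IngressInput,
): { enabled: boolean | undefined; publicBaseUrl: string } {
  const publicBaseUrl = input.publicBaseUrl ?? '';
  if (!isValidPublicBaseUrl(publicBaseUrl)) {
    throw new Error(
      'ingress.publicBaseUrl must be an absolute URL starting with http:// or https://',
    );
  }
  return {
    publicBaseUrl,
    enabled: input.enabled ?? (publicBaseUrl ? true : undefined),
  };
}
```

Note that an explicit `enabled: false` is preserved even when a URL is configured, because `??` only falls through on `null` or `undefined`.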
diff --git a/assistant/src/config/defaults.ts b/assistant/src/config/defaults.ts
index b7c834fd851..5ea7bdaf955 100644
--- a/assistant/src/config/defaults.ts
+++ b/assistant/src/config/defaults.ts
@@ -8,14 +8,15 @@ export const DEFAULT_CONFIG: AssistantConfig = {
   apiKeys: {},
   webSearchProvider: 'perplexity',
   providerOrder: [],
-  maxTokens: 64000,
+  maxTokens: 16000,
   thinking: {
     enabled: false,
     budgetTokens: 10000,
+    streamThinking: false,
   },
   contextWindow: {
     enabled: true,
-    maxInputTokens: 180000,
+    maxInputTokens: 200000,
     targetInputTokens: 110000,
     compactThreshold: 0.8,
     preserveRecentUserTurns: 8,
@@ -46,7 +47,7 @@ export const DEFAULT_CONFIG: AssistantConfig = {
       injectionFormat: 'markdown' as const,
       injectionStrategy: 'prepend_user_block' as const,
       reranking: {
-        enabled: true,
+        enabled: false,
         model: 'claude-haiku-4-5-20251001',
         topK: 20,
       },
@@ -265,11 +266,23 @@ export const DEFAULT_CONFIG: AssistantConfig = {
   ingress: {
     enabled: undefined,
     publicBaseUrl: '',
+    webhook: {
+      secret: '',
+      timeoutMs: 30_000,
+      maxRetries: 2,
+      initialBackoffMs: 500,
+      maxPayloadBytes: 1_048_576,
+    },
+    rateLimit: {
+      maxRequestsPerMinute: 0,
+      maxRequestsPerHour: 0,
+    },
+    shutdownDrainMs: 5_000,
   },
-  assistantInbox: {
-    enabled: false,
-    invitesEnabled: false,
-    memberAclEnabled: false,
-    policyEnabled: false,
+  daemon: {
+    startupSocketWaitMs: 5000,
+    stopTimeoutMs: 5000,
+    sigkillGracePeriodMs: 2000,
+    titleGenerationMaxTokens: 30,
   },
 };
diff --git a/assistant/src/config/env-registry.ts b/assistant/src/config/env-registry.ts
new file mode 100644
index 00000000000..936170bc9f4
--- /dev/null
+++ b/assistant/src/config/env-registry.ts
@@ -0,0 +1,162 @@
+/**
+ * Centralized environment variable registry.
+ *
+ * This module documents every VELLUM_* and related env var with its type,
+ * default, and description, and exports typed accessor functions for each.
+ *
+ * IMPORTANT: This module has NO internal imports (no logger, no platform
+ * utilities) so it can be safely imported from bootstrap-level code like
+ * util/platform.ts and util/logger.ts without circular dependencies.
+ *
+ * Higher-level env vars that depend on the logger or config system live in
+ * config/env.ts, which re-exports selected accessors from this module.
+ */
+
+// ── Helpers (dependency-free) ────────────────────────────────────────────────
+
+function str(name: string): string | undefined {
+  const v = process.env[name]?.trim();
+  return v || undefined;
+}
+
+function flag(name: string): boolean {
+  const raw = str(name);
+  return raw === 'true' || raw === '1';
+}
+
+function flagTriState(name: string): boolean | undefined {
+  const raw = str(name);
+  if (raw === 'true' || raw === '1') return true;
+  if (raw === 'false' || raw === '0') return false;
+  return undefined;
+}
+
+// ── Registry ─────────────────────────────────────────────────────────────────
+// Each entry documents the env var name, type, default, and purpose.
+
+/**
+ * BASE_DATA_DIR — string, default: os.homedir()
+ * Overrides the home directory used as the base for ~/.vellum and lockfiles.
+ * Primarily used in tests to isolate filesystem state.
+ */
+export function getBaseDataDir(): string | undefined {
+  return str('BASE_DATA_DIR');
+}
+
+/**
+ * VELLUM_DAEMON_SOCKET — string, default: ~/.vellum/vellum.sock
+ * Overrides the Unix domain socket path for daemon IPC.
+ * Supports ~ expansion.
+ */
+export function getDaemonSocket(): string | undefined {
+  return str('VELLUM_DAEMON_SOCKET');
+}
+
+/**
+ * VELLUM_DAEMON_TCP_PORT — number, default: 8765
+ * TCP port for the daemon's TCP listener (used by iOS clients).
+ */
+export function getDaemonTcpPort(): number {
+  const raw = str('VELLUM_DAEMON_TCP_PORT');
+  if (raw) {
+    const port = parseInt(raw, 10);
+    if (!isNaN(port) && port > 0 && port <= 65535) return port;
+  }
+  return 8765;
+}
+
+/**
+ * VELLUM_DAEMON_TCP_ENABLED — boolean tri-state, default: undefined (falls back to flag file)
+ * Whether the daemon TCP listener should be active.
+ * 'true'/'1' → on, 'false'/'0' → off, unset → check flag file.
+ */
+export function getDaemonTcpEnabled(): boolean | undefined {
+  return flagTriState('VELLUM_DAEMON_TCP_ENABLED');
+}
+
+/**
+ * VELLUM_DAEMON_TCP_HOST — string, default: context-dependent (127.0.0.1 or 0.0.0.0)
+ * Hostname/address for the TCP listener. When unset, platform.ts resolves
+ * based on whether iOS pairing is enabled.
+ */
+export function getDaemonTcpHost(): string | undefined {
+  return str('VELLUM_DAEMON_TCP_HOST');
+}
+
+/**
+ * VELLUM_DAEMON_IOS_PAIRING — boolean tri-state, default: undefined (falls back to flag file)
+ * Whether iOS pairing mode is enabled. When on, TCP binds to 0.0.0.0.
+ * 'true'/'1' → on, 'false'/'0' → off, unset → check flag file.
+ */
+export function getDaemonIosPairing(): boolean | undefined {
+  return flagTriState('VELLUM_DAEMON_IOS_PAIRING');
+}
+
+/**
+ * VELLUM_DEBUG — boolean, default: false
+ * Enables debug-level logging and verbose output.
+ */
+export function getDebugMode(): boolean {
+  return flag('VELLUM_DEBUG');
+}
+
+/**
+ * VELLUM_LOG_STDERR — boolean, default: false
+ * Forces logger output to stderr instead of log files.
+ */
+export function getLogStderr(): boolean {
+  return flag('VELLUM_LOG_STDERR');
+}
+
+/**
+ * DEBUG_STDOUT_LOGS — boolean, default: false
+ * Enables additional log output to stdout (alongside file logging).
+ */
+export function getDebugStdoutLogs(): boolean {
+  return flag('DEBUG_STDOUT_LOGS');
+}
+
+/**
+ * VELLUM_ENABLE_MONITORING — boolean, default: false
+ * Enables monitoring/telemetry (Logfire, etc.).
+ */
+export function getEnableMonitoring(): boolean {
+  return flag('VELLUM_ENABLE_MONITORING');
+}
+
+// ── Known env var names ──────────────────────────────────────────────────────
+
+/**
+ * Complete set of recognized VELLUM_* env var names. Used by
+ * checkUnrecognizedEnvVars() to warn about typos or unrecognized variables.
+ */
+const KNOWN_VELLUM_VARS = new Set([
+  'VELLUM_DAEMON_SOCKET',
+  'VELLUM_DAEMON_TCP_PORT',
+  'VELLUM_DAEMON_TCP_ENABLED',
+  'VELLUM_DAEMON_TCP_HOST',
+  'VELLUM_DAEMON_IOS_PAIRING',
+  'VELLUM_DEBUG',
+  'VELLUM_LOG_STDERR',
+  'VELLUM_ENABLE_MONITORING',
+  'VELLUM_HOOK_EVENT',
+  'VELLUM_HOOK_NAME',
+]);
+
+/**
+ * Check all VELLUM_* env vars and return warnings for any unrecognized ones.
+ * Returns an array of warning messages (empty if all vars are recognized).
+ *
+ * This function is intentionally side-effect free: it returns strings rather
+ * than logging directly, so it can be called from bootstrap code before the
+ * logger is initialized.
+ */
+export function checkUnrecognizedEnvVars(): string[] {
+  const warnings: string[] = [];
+  for (const key of Object.keys(process.env)) {
+    if (key.startsWith('VELLUM_') && !KNOWN_VELLUM_VARS.has(key)) {
+      warnings.push(`Unrecognized environment variable: ${key}`);
+    }
+  }
+  return warnings;
+}
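
Reviewer note: the helper semantics in `env-registry.ts` (trimmed strings, tri-state flags, silent port fallback, and the unrecognized-variable check) are easiest to verify against a fixed environment. The sketch below re-implements them over an injected `Env` record instead of `process.env`; that injection, the `Env` type, and the function names `tcpPort` and `unrecognized` are assumptions made for testability and are not part of the PR.

```typescript
// Hypothetical re-implementation over an injected environment record,
// so the behaviour can be checked without touching process.env.
type Env = Record<string, string | undefined>;

// Trimmed value, with empty and whitespace-only strings treated as unset.
function str(env: Env, name: string): string | undefined {
  const v = env[name]?.trim();
  return v || undefined;
}

// 'true'/'1' -> true, 'false'/'0' -> false, anything else -> undefined.
function flagTriState(env: Env, name: string): boolean | undefined {
  const raw = str(env, name);
  if (raw === 'true' || raw === '1') return true;
  if (raw === 'false' || raw === '0') return false;
  return undefined;
}

// Falls back to 8765 when the value is missing, non-numeric, or out of range.
function tcpPort(env: Env): number {
  const raw = str(env, 'VELLUM_DAEMON_TCP_PORT');
  if (raw) {
    const port = parseInt(raw, 10);
    if (!isNaN(port) && port > 0 && port <= 65535) return port;
  }
  return 8765;
}

// Abbreviated set for the sketch; the real list is KNOWN_VELLUM_VARS in the diff.
const KNOWN = new Set(['VELLUM_DEBUG', 'VELLUM_DAEMON_TCP_PORT']);

// Only VELLUM_-prefixed names are checked; other variables are ignored.
function unrecognized(env: Env): string[] {
  const warnings: string[] = [];
  for (const key of Object.keys(env)) {
    if (key.startsWith('VELLUM_') && !KNOWN.has(key)) {
      warnings.push(`Unrecognized environment variable: ${key}`);
    }
  }
  return warnings;
}
```

One consequence worth noting: a value such as `yes` resolves to `undefined` in the tri-state helper, so callers fall back to the flag file rather than treating it as `true`.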
diff --git a/assistant/src/config/env.ts b/assistant/src/config/env.ts
index 3a2da46d317..50d8eca0f90 100644
--- a/assistant/src/config/env.ts
+++ b/assistant/src/config/env.ts
@@ -8,14 +8,14 @@
  * - Fail-fast validation via validateEnv() at startup
  * - Shared derived values (e.g. gateway base URL) instead of duplicated logic
  *
- * Variables NOT centralized here (must resolve before this module loads):
- * - BASE_DATA_DIR, VELLUM_DAEMON_* — bootstrap/platform layer (util/platform.ts)
- * - VELLUM_DEBUG, BUN_TEST, NODE_ENV log flags — logger init (util/logger.ts)
- * - APP_VERSION — compile-time embedding (version.ts)
- * - __EVAL_INPUT_JSON, __SKILL_*_JSON — internal sandbox IPC
+ * Bootstrap-level env vars (BASE_DATA_DIR, VELLUM_DAEMON_*, VELLUM_DEBUG,
+ * VELLUM_LOG_STDERR, DEBUG_STDOUT_LOGS) are defined in config/env-registry.ts,
+ * which has no internal dependencies and can be imported from platform/logger
+ * without circular imports.
  */
 
 import { getLogger } from '../util/logger.js';
+import { getEnableMonitoring, checkUnrecognizedEnvVars } from './env-registry.js';
 
 const log = getLogger('env');
 
@@ -40,12 +40,6 @@ function int(name: string, fallback?: number): number | undefined {
   return n;
 }
 
-/** Read an env var as a boolean flag ('true'/'1' → true, everything else → false). */
-function flag(name: string): boolean {
-  const raw = str(name);
-  return raw === 'true' || raw === '1';
-}
-
 // ── Gateway ──────────────────────────────────────────────────────────────────
 
 const DEFAULT_GATEWAY_PORT = 7830;
@@ -82,8 +76,8 @@ export function setIngressPublicBaseUrl(value: string | undefined): void {
 
 // ── Runtime HTTP ─────────────────────────────────────────────────────────────
 
-export function getRuntimeHttpPort(): number | undefined {
-  return int('RUNTIME_HTTP_PORT');
+export function getRuntimeHttpPort(): number {
+  return int('RUNTIME_HTTP_PORT') ?? 7821;
 }
 
 export function getRuntimeHttpHost(): string {
@@ -134,7 +128,7 @@ export function getLogfireToken(): string | undefined {
 }
 
 export function isMonitoringEnabled(): boolean {
-  return flag('VELLUM_ENABLE_MONITORING');
+  return getEnableMonitoring();
 }
 
 export function getSentryDsn(): string | undefined {
@@ -167,11 +161,15 @@ export function validateEnv(): void {
   }
 
   const httpPort = getRuntimeHttpPort();
-  if (httpPort !== undefined && (httpPort < 1 || httpPort > 65535)) {
+  if (httpPort < 1 || httpPort > 65535) {
     throw new Error(`Invalid RUNTIME_HTTP_PORT: ${httpPort} (must be 1-65535)`);
   }
 
   if (getTwilioWssBaseUrl()) {
     log.warn('TWILIO_WSS_BASE_URL env var is deprecated. Relay URL is now derived from ingress.publicBaseUrl.');
   }
+
+  for (const warning of checkUnrecognizedEnvVars()) {
+    log.warn(warning);
+  }
 }
diff --git a/assistant/src/config/memory-schema.ts b/assistant/src/config/memory-schema.ts
index 48fd5c870f1..da9693f2878 100644
--- a/assistant/src/config/memory-schema.ts
+++ b/assistant/src/config/memory-schema.ts
@@ -60,7 +60,7 @@ export const QdrantConfigSchema = z.object({
 export const MemoryRerankingConfigSchema = z.object({
   enabled: z
     .boolean({ error: 'memory.retrieval.reranking.enabled must be a boolean' })
-    .default(true),
+    .default(false),
   model: z
     .string({ error: 'memory.retrieval.reranking.model must be a string' })
     .default('claude-haiku-4-5-20251001'),
@@ -186,7 +186,7 @@ export const MemoryRetrievalConfigSchema = z.object({
     })
     .default('prepend_user_block'),
   reranking: MemoryRerankingConfigSchema.default({
-    enabled: true,
+    enabled: false,
     model: 'claude-haiku-4-5-20251001',
     topK: 20,
   }),
@@ -430,7 +430,7 @@ export const MemoryConfigSchema = z.object({
     injectionFormat: 'markdown',
     injectionStrategy: 'prepend_user_block',
     reranking: {
-      enabled: true,
+      enabled: false,
       model: 'claude-haiku-4-5-20251001',
       topK: 20,
     },
diff --git a/assistant/src/config/schema.ts b/assistant/src/config/schema.ts
index ef25f664c39..1ab31167358 100644
--- a/assistant/src/config/schema.ts
+++ b/assistant/src/config/schema.ts
@@ -101,8 +101,10 @@ export {
   ContextWindowConfigSchema,
   ModelPricingOverrideSchema,
   SmsConfigSchema,
+  IngressWebhookConfigSchema,
+  IngressRateLimitConfigSchema,
   IngressConfigSchema,
-  AssistantInboxConfigSchema,
+  DaemonConfigSchema,
 } from './core-schema.js';
 export type {
   TimeoutConfig,
@@ -115,8 +117,10 @@ export type {
   ContextWindowConfig,
   ModelPricingOverride,
   SmsConfig,
+  IngressWebhookConfig,
+  IngressRateLimitConfig,
   IngressConfig,
-  AssistantInboxConfig,
+  DaemonConfig,
 } from './core-schema.js';
 
 // Imports for AssistantConfigSchema composition
@@ -137,7 +141,7 @@ import {
   ModelPricingOverrideSchema,
   SmsConfigSchema,
   IngressConfigSchema,
-  AssistantInboxConfigSchema,
+  DaemonConfigSchema,
 } from './core-schema.js';
 
 const VALID_PROVIDERS = ['anthropic', 'openai', 'gemini', 'ollama', 'fireworks', 'openrouter'] as const;
@@ -172,10 +176,11 @@ export const AssistantConfigSchema = z.object({
     .number({ error: 'maxTokens must be a number' })
     .int('maxTokens must be an integer')
     .positive('maxTokens must be a positive integer')
-    .default(64000),
+    .default(16000),
   thinking: ThinkingConfigSchema.default({
     enabled: false,
     budgetTokens: 10000,
+    streamThinking: false,
   }),
   contextWindow: ContextWindowConfigSchema.default({
     enabled: true,
@@ -210,7 +215,7 @@ export const AssistantConfigSchema = z.object({
       injectionFormat: 'markdown',
       injectionStrategy: 'prepend_user_block',
       reranking: {
-        enabled: true,
+        enabled: false,
         model: 'claude-haiku-4-5-20251001',
         topK: 20,
       },
@@ -426,11 +431,11 @@ export const AssistantConfigSchema = z.object({
     phoneNumber: '',
   }),
   ingress: IngressConfigSchema,
-  assistantInbox: AssistantInboxConfigSchema.default({
-    enabled: false,
-    invitesEnabled: false,
-    memberAclEnabled: false,
-    policyEnabled: false,
+  daemon: DaemonConfigSchema.default({
+    startupSocketWaitMs: 5000,
+    stopTimeoutMs: 5000,
+    sigkillGracePeriodMs: 2000,
+    titleGenerationMaxTokens: 30,
   }),
 }).superRefine((config, ctx) => {
   if (config.contextWindow.targetInputTokens >= config.contextWindow.maxInputTokens) {
diff --git a/assistant/src/config/system-prompt.ts b/assistant/src/config/system-prompt.ts
index b1570f67e85..2222212b2f8 100644
--- a/assistant/src/config/system-prompt.ts
+++ b/assistant/src/config/system-prompt.ts
@@ -181,19 +181,11 @@ function buildTaskScheduleReminderRoutingSection(): string {
     '- A timed alert, not a tracked task',
     '',
     '### Common mistakes to avoid',
-    '- "Add this to my tasks" → task_list_add (NOT schedule_create or reminder_create)',
-    '- "What\'s on my task list?" → task_list_show (NOT schedule_list)',
-    '- "Remind me to buy groceries" without a time → task_list_add (it\'s a task, not a timed reminder)',
-    '- "Remind me at 5pm to buy groceries" → reminder_create (explicit time trigger)',
-    '- "Check my inbox every morning at 8am" → schedule_create (recurring automation, cron)',
-    '- "Every other Tuesday at 10am" → schedule_create (recurring automation, RRULE)',
-    '- "Every weekday except holidays" → schedule_create (RRULE with EXDATE for exclusions)',
-    '- "Daily for the next 30 days" → schedule_create (RRULE with COUNT=30)',
-    '- "Bump priority on X" → task_list_update (NOT task_list_add)',
-    '- "Move this up" / "change this task priority" → task_list_update (NOT task_list_add)',
-    '- "Mark X as done" → task_list_update (NOT task_list_add)',
-    '- "Remove X from my tasks" → task_list_remove (NOT task_list_update)',
-    '- "Delete that task" / "clean up the duplicate" → task_list_remove',
+    '- "Add this to my tasks" / "Remind me to X" (no time) → task_list_add (NOT schedule or reminder)',
+    '- "Remind me at 5pm" → reminder_create (explicit time trigger)',
+    '- "Every morning at 8am" / recurring patterns → schedule_create',
+    '- "Bump priority" / "mark as done" → task_list_update (NOT task_list_add)',
+    '- "Remove X from tasks" / "delete that task" → task_list_remove (NOT task_list_update)',
     '',
     '### Entity type routing: work items vs task templates',
     '',
@@ -228,22 +220,12 @@ function buildAttachmentSection(): string {
     '- `filename`: Optional override for the delivered filename (defaults to the basename of the path).',
     '- `mime_type`: Optional MIME type override (inferred from the file extension if omitted).',
     '',
-    'Examples:',
-    '```',
-    '<vellum-attachment source="sandbox" path="scratch/chart.png" />',
-    '<vellum-attachment source="sandbox" path="scratch/video.mp4" mime_type="video/mp4" />',
-    '<vellum-attachment source="sandbox" path="scratch/report.pdf" />',
-    '```',
+    'Example: `<vellum-attachment source="sandbox" path="scratch/chart.png" />`',
     '',
     'Limits: up to 5 attachments per turn, 20 MB each. Tool outputs that produce image or file content blocks are also automatically converted into attachments.',
     '',
     '### Inline Images and GIFs',
-    '',
-    'The chat natively renders images and animated GIFs inline in message bubbles. When you have an image or GIF URL (e.g. from Giphy, web search, or any tool), embed it directly in your response text using markdown image syntax:',
-    '',
-    '`![description](https://media.giphy.com/media/example/giphy.gif)`',
-    '',
-    'This renders the image/GIF visually inside the chat bubble with full animation. You can also use `ui_show`, `app_create`, or `vellum-attachment` for images when appropriate. Do NOT wrap image markdown in code fences or it will render as literal text.',
+    'Embed images/GIFs inline using markdown: `![description](URL)`. Do NOT wrap in code fences.',
   ].join('\n');
 }
 
@@ -332,19 +314,8 @@ function buildToolPermissionSection(): string {
     '- NEVER show raw commands in backticks like `ls -lt ~/Downloads`. Describe the action in plain English.',
     '- Keep it conversational, like you\'re talking to a friend.',
     '',
-    'Good examples:',
-    '- "Sure! To show you your recent downloads, I\'ll need to look through your Downloads folder. This is read-only, nothing gets moved or deleted. Can you allow this for me?"',
-    '- "Yes, I can help with that! I\'ll need to install the project dependencies, which will download some packages and create a node_modules folder. Hit Allow to proceed."',
-    '- "Absolutely! I\'ll need to read your shell configuration file to check your setup. I won\'t change anything. Can you allow this?"',
-    '- "I can look into that! I\'ll need to access your contacts database to pull up the info. This is just a read-only lookup, nothing gets modified. Can you allow this?"',
-    '',
-    'Bad examples (NEVER do this):',
-    '- "I\'ll run `ls -lt ~/Desktop/`" (raw command, too technical)',
-    '- "I\'ll list your most recent downloads for you." (doesn\'t ask for permission)',
-    '- Using em dashes anywhere in the response',
-    '- Calling a tool with no preceding text at all',
-    '',
-    'Be conversational and transparent. Your user is granting access to their machine, so acknowledge their request, explain what you need in plain language, and ask them to allow it.',
+    'Good: "To show your recent downloads, I\'ll need to look through your Downloads folder. This is read-only. Can you allow this?"',
+    'Bad: "I\'ll run `ls -lt ~/Desktop/`" (raw command), or calling a tool with no preceding text.',
     '',
     '### Handling Permission Denials',
     '',
@@ -606,12 +577,7 @@ function buildConfigSection(): string {
     '**LOOKS.md** — update when:',
     '- They ask you to change your appearance, colors, or outfit',
     '- You want to refresh your look',
-    '- Available body/cheek colors: violet, emerald, rose, amber, indigo, slate, cyan, blue, green, red, orange, pink',
-    '- Available hats: none, top_hat, crown, cap, beanie, wizard_hat, cowboy_hat',
-    '- Available shirts: none, tshirt, suit, hoodie, tank_top, sweater',
-    '- Available accessories: none, sunglasses, monocle, bowtie, necklace, scarf, cape',
-    '- Available held items: none, sword, staff, shield, balloon',
-    '- Available outfit colors: red, blue, yellow, purple, orange, pink, cyan, brown, black, white, gold, silver',
+    '- Read LOOKS.md for available options (colors, hats, shirts, accessories, held items)',
     '',
     'When updating, read the file first, then make a targeted edit. Include all useful information, but don\'t bloat the files over time',
   ].join('\n');
@@ -677,19 +643,14 @@ function buildDynamicSkillWorkflowSection(): string {
   return [
     '## Dynamic Skill Authoring Workflow',
     '',
-    'When your user requests a capability that no existing tool or skill can satisfy, follow this exact procedure:',
-    '',
-    '1. **Validate the gap.** Confirm no existing tool or installed skill covers the need.',
-    '2. **Draft a TypeScript snippet.** Write a self-contained snippet that exports a `default` or `run` function with signature `(input: unknown) => unknown | Promise<unknown>`.',
-    '3. **Test with `evaluate_typescript_code`.** Call the tool to run the snippet in a sandbox. Iterate until it passes.',
-    '4. **Persist with `scaffold_managed_skill`.** Only after successful evaluation and explicit user consent, call `scaffold_managed_skill` to write the skill to `~/.vellum/workspace/skills/<id>/`.',
-    '5. **Load and use.** Call `skill_load` with the new skill ID before invoking the skill-driven flow.',
+    'When no existing tool or skill can satisfy a request:',
+    '1. Validate the gap — confirm no existing tool/skill covers it.',
+    '2. Draft a TypeScript snippet exporting a `default` or `run` function (`(input: unknown) => unknown | Promise<unknown>`).',
+    '3. Test with `evaluate_typescript_code`. Iterate until it passes (max 3 attempts, then ask the user).',
+    '4. Persist with `scaffold_managed_skill` only after user consent.',
+    '5. Load with `skill_load` before use.',
     '',
-    'Important constraints:',
-    '- **Never persist or delete skills without explicit user confirmation.** Both operations require user approval.',
-    '- If evaluation fails after 3 attempts, summarize the failure and ask your user for guidance instead of continuing to retry.',
-    '- After a skill is written or deleted, the next turn may run in a recreated session due to file-watcher eviction. Continue normally.',
-    '- To remove a managed skill, use `delete_managed_skill`.',
+    '**Never persist or delete skills without explicit user confirmation.** To remove: `delete_managed_skill`.',
     '',
     '### Browser Skill Prerequisite',
     'If you need browser capabilities (navigating web pages, clicking elements, extracting content) and `browser_*` tools are not available, load the "browser" skill first using `skill_load`.',
diff --git a/assistant/src/config/templates/SOUL.md b/assistant/src/config/templates/SOUL.md
index 13aaefb8253..db2f8868738 100644
--- a/assistant/src/config/templates/SOUL.md
+++ b/assistant/src/config/templates/SOUL.md
@@ -30,7 +30,7 @@ You are your own entity with your own emerging identity. You can — and should
 
 ## Personality
 
-Be concise when the situation calls for it, thorough when it matters. Not a corporate drone. Not a sycophant. Just good at what you do.
+Be concise. Default to 1-3 sentences. Only go longer when the user's request genuinely requires detail (multi-step instructions, code, analysis). Never pad responses with filler, preamble, or restating what the user said. Lead with the answer or action, not context-setting. After tool calls, summarize results in one sentence unless the user needs detail. Not a corporate drone. Not a sycophant. Just good at what you do.
 
 ## Quirks
 
diff --git a/assistant/src/config/types.ts b/assistant/src/config/types.ts
index d48c5844526..caadbbd2cde 100644
--- a/assistant/src/config/types.ts
+++ b/assistant/src/config/types.ts
@@ -39,5 +39,5 @@ export type {
   CallerIdentityConfig,
   SmsConfig,
   IngressConfig,
-  AssistantInboxConfig,
+  DaemonConfig,
 } from './schema.js';
diff --git a/assistant/src/daemon/approval-generators.ts b/assistant/src/daemon/approval-generators.ts
new file mode 100644
index 00000000000..d607a1fe5b2
--- /dev/null
+++ b/assistant/src/daemon/approval-generators.ts
@@ -0,0 +1,186 @@
+import type { ApprovalCopyGenerator, ApprovalConversationGenerator, ApprovalConversationResult, ApprovalConversationDisposition } from '../runtime/http-types.js';
+import {
+  buildGenerationPrompt,
+  includesRequiredKeywords,
+  getFallbackMessage,
+  APPROVAL_COPY_TIMEOUT_MS,
+  APPROVAL_COPY_MAX_TOKENS,
+  APPROVAL_COPY_SYSTEM_PROMPT,
+} from '../runtime/approval-message-composer.js';
+import { loadConfig } from '../config/loader.js';
+import { getFailoverProvider, listProviders } from '../providers/registry.js';
+
+// ---------------------------------------------------------------------------
+// Approval conversation generator constants
+// ---------------------------------------------------------------------------
+
+const APPROVAL_CONVERSATION_TIMEOUT_MS = 8_000;
+const APPROVAL_CONVERSATION_MAX_TOKENS = 300;
+
+const APPROVAL_CONVERSATION_SYSTEM_PROMPT =
+  'You are an assistant helping a user manage a pending tool approval request. '
+  + 'Analyze the user\'s message to determine if they are making a decision '
+  + '(approve, reject, or cancel) or just asking a question / making conversation. '
+  + 'When uncertain, default to keep_pending — never approve or reject without clear intent. '
+  + 'For guardians: explain what tool is requesting approval and from whom. '
+  + 'Always provide a natural, helpful reply along with your decision.';
+
+const APPROVAL_CONVERSATION_TOOL_NAME = 'approval_decision';
+
+const APPROVAL_CONVERSATION_TOOL_SCHEMA = {
+  name: APPROVAL_CONVERSATION_TOOL_NAME,
+  description:
+    'Record the disposition of the approval conversation turn. '
+    + 'Call this tool with the determined disposition and a natural reply to the user.',
+  input_schema: {
+    type: 'object' as const,
+    properties: {
+      disposition: {
+        type: 'string',
+        enum: ['keep_pending', 'approve_once', 'approve_always', 'reject'],
+        description:
+          'The decision: keep_pending if the user is asking questions or unclear, '
+          + 'approve_once to approve this single request, approve_always to approve '
+          + 'this tool permanently, reject to deny the request.',
+      },
+      replyText: {
+        type: 'string',
+        description: 'A natural language reply to send back to the user.',
+      },
+      targetRunId: {
+        type: 'string',
+        description:
+          'The run ID of the specific pending approval being acted on. '
+          + 'Required when there are multiple pending approvals and the disposition is decision-bearing.',
+      },
+    },
+    required: ['disposition', 'replyText'],
+  },
+};
+
+const VALID_DISPOSITIONS: ReadonlySet<string> = new Set([
+  'keep_pending',
+  'approve_once',
+  'approve_always',
+  'reject',
+]);
+
+/**
+ * Create the daemon-owned approval copy generator that resolves providers
+ * and calls `provider.sendMessage` to generate approval copy text.
+ * This keeps all provider awareness in the daemon lifecycle, away from
+ * the runtime composer.
+ */
+export function createApprovalCopyGenerator(): ApprovalCopyGenerator {
+  return async (context, options = {}) => {
+    const config = loadConfig();
+    let provider;
+    try {
+      provider = getFailoverProvider(config.provider, config.providerOrder);
+    } catch {
+      return null;
+    }
+
+    const fallbackText = options.fallbackText?.trim() || getFallbackMessage(context);
+    const requiredKeywords = options.requiredKeywords?.map((kw) => kw.trim()).filter((kw) => kw.length > 0);
+    const prompt = buildGenerationPrompt(context, fallbackText, requiredKeywords);
+
+    const response = await provider.sendMessage(
+      [{ role: 'user', content: [{ type: 'text', text: prompt }] }],
+      [],
+      APPROVAL_COPY_SYSTEM_PROMPT,
+      {
+        config: {
+          max_tokens: options.maxTokens ?? APPROVAL_COPY_MAX_TOKENS,
+        },
+        signal: AbortSignal.timeout(options.timeoutMs ?? APPROVAL_COPY_TIMEOUT_MS),
+      },
+    );
+
+    const block = response.content.find((entry) => entry.type === 'text');
+    const text = block && 'text' in block ? block.text.trim() : '';
+    if (!text) return null;
+    const cleaned = text
+      .replace(/^["'`]+/, '')
+      .replace(/["'`]+$/, '')
+      .trim();
+    if (!cleaned) return null;
+    if (!includesRequiredKeywords(cleaned, requiredKeywords)) return null;
+    return cleaned;
+  };
+}
+
+/**
+ * Create the daemon-owned approval conversation generator that resolves
+ * providers and uses tool_use / function calling for structured output.
+ * Follows the same provider-aware pattern as createApprovalCopyGenerator().
+ */
+export function createApprovalConversationGenerator(): ApprovalConversationGenerator {
+  return async (context) => {
+    const config = loadConfig();
+    if (!listProviders().includes(config.provider)) {
+      throw new Error('No provider available for approval conversation');
+    }
+    const provider = getFailoverProvider(config.provider, config.providerOrder);
+
+    const pendingDescription = context.pendingApprovals
+      .map((p) => `- Run ${p.runId}: tool "${p.toolName}"`)
+      .join('\n');
+
+    const userPrompt = [
+      `Role: ${context.role}`,
+      `Tool requesting approval: "${context.toolName}"`,
+      `Allowed actions: ${context.allowedActions.join(', ')}`,
+      `Pending approvals:\n${pendingDescription}`,
+      `\nUser message: ${context.userMessage}`,
+    ].join('\n');
+
+    const response = await provider.sendMessage(
+      [{ role: 'user', content: [{ type: 'text', text: userPrompt }] }],
+      [APPROVAL_CONVERSATION_TOOL_SCHEMA],
+      APPROVAL_CONVERSATION_SYSTEM_PROMPT,
+      {
+        config: {
+          max_tokens: APPROVAL_CONVERSATION_MAX_TOKENS,
+        },
+        signal: AbortSignal.timeout(APPROVAL_CONVERSATION_TIMEOUT_MS),
+      },
+    );
+
+    // Extract the tool_use block from the response
+    const toolUseBlock = response.content.find(
+      (block) => block.type === 'tool_use' && block.name === APPROVAL_CONVERSATION_TOOL_NAME,
+    );
+
+    if (!toolUseBlock || toolUseBlock.type !== 'tool_use') {
+      throw new Error('Provider did not return a tool_use block for approval decision');
+    }
+
+    const input = toolUseBlock.input as Record<string, unknown>;
+
+    // Strict validation of the structured output
+    const disposition = input.disposition;
+    if (typeof disposition !== 'string' || !VALID_DISPOSITIONS.has(disposition)) {
+      throw new Error(`Invalid disposition: ${String(disposition)}`);
+    }
+
+    const replyText = input.replyText;
+    if (typeof replyText !== 'string' || replyText.trim().length === 0) {
+      throw new Error('Missing or empty replyText in tool_use response');
+    }
+
+    const targetRunId = input.targetRunId;
+    if (targetRunId !== undefined && typeof targetRunId !== 'string') {
+      throw new Error('Invalid targetRunId in tool_use response');
+    }
+
+    const result: ApprovalConversationResult = {
+      disposition: disposition as ApprovalConversationDisposition,
+      replyText: replyText.trim(),
+    };
+    if (typeof targetRunId === 'string' && targetRunId.length > 0) {
+      result.targetRunId = targetRunId;
+    }
+    return result;
+  };
+}
diff --git a/assistant/src/daemon/daemon-control.ts b/assistant/src/daemon/daemon-control.ts
new file mode 100644
index 00000000000..96e9432a782
--- /dev/null
+++ b/assistant/src/daemon/daemon-control.ts
@@ -0,0 +1,217 @@
+import { spawn } from 'node:child_process';
+import { mkdirSync, readFileSync, writeFileSync, unlinkSync, existsSync, openSync, closeSync } from 'node:fs';
+import { join, resolve } from 'node:path';
+import {
+  getSocketPath,
+  getPidPath,
+  getRootDir,
+  removeSocketFile,
+} from '../util/platform.js';
+import { getLogger } from '../util/logger.js';
+import { DaemonError } from '../util/errors.js';
+import { getConfig } from '../config/loader.js';
+
+const log = getLogger('lifecycle');
+
+function isProcessRunning(pid: number): boolean {
+  try {
+    process.kill(pid, 0);
+    return true;
+  } catch {
+    return false;
+  }
+}
+
+function readPid(): number | null {
+  const pidPath = getPidPath();
+  if (!existsSync(pidPath)) return null;
+  try {
+    const pid = parseInt(readFileSync(pidPath, 'utf-8').trim(), 10);
+    return Number.isNaN(pid) ? null : pid;
+  } catch {
+    return null;
+  }
+}
+
+export function writePid(pid: number): void {
+  writeFileSync(getPidPath(), String(pid));
+}
+
+export function cleanupPidFile(): void {
+  const pidPath = getPidPath();
+  if (existsSync(pidPath)) {
+    unlinkSync(pidPath);
+  }
+}
+
+export function isDaemonRunning(): boolean {
+  const pid = readPid();
+  if (pid == null) return false;
+  if (!isProcessRunning(pid)) {
+    cleanupPidFile();
+    return false;
+  }
+  return true;
+}
+
+export function getDaemonStatus(): { running: boolean; pid?: number } {
+  const pid = readPid();
+  if (pid == null) return { running: false };
+  if (!isProcessRunning(pid)) {
+    cleanupPidFile();
+    return { running: false };
+  }
+  return { running: true, pid };
+}
+
+export async function startDaemon(): Promise<{
+  pid: number;
+  alreadyRunning: boolean;
+}> {
+  const status = getDaemonStatus();
+  if (status.running && status.pid) {
+    return { pid: status.pid, alreadyRunning: true };
+  }
+
+  // Only create the root dir for socket/PID — the daemon process itself
+  // handles migration + full ensureDataDir() in runDaemon(). Calling
+  // ensureDataDir() here would pre-create workspace destination dirs
+  // and cause migration moves to no-op.
+  const rootDir = getRootDir();
+  if (!existsSync(rootDir)) {
+    mkdirSync(rootDir, { recursive: true });
+  }
+
+  // Clean up stale socket (only if it's actually a Unix socket)
+  const socketPath = getSocketPath();
+  removeSocketFile(socketPath);
+
+  // Spawn the daemon as a detached child process
+  const mainPath = resolve(
+    import.meta.dirname ?? __dirname,
+    'main.ts',
+  );
+
+  // Redirect the child's stderr to a file instead of piping it back to the
+  // parent. A pipe's read end is destroyed when the parent exits, leaving
+  // fd 2 broken in the child. Bun (unlike Node.js) does not ignore SIGPIPE,
+  // so any later stderr write would silently kill the daemon.
+  const stderrPath = join(rootDir, 'daemon-stderr.log');
+  const stderrFd = openSync(stderrPath, 'w');
+
+  const child = spawn('bun', ['run', mainPath], {
+    detached: true,
+    stdio: ['ignore', 'ignore', stderrFd],
+    env: { ...process.env },
+  });
+
+  // The child inherited the fd; close the parent's copy.
+  closeSync(stderrFd);
+
+  let childExited = false;
+  let childExitCode: number | null = null;
+  child.on('exit', (code) => {
+    childExited = true;
+    childExitCode = code;
+  });
+
+  child.unref();
+
+  const pid = child.pid;
+  if (!pid) {
+    throw new DaemonError('Failed to start daemon: no PID returned');
+  }
+
+  writePid(pid);
+
+  // Wait for socket to appear
+  const config = getConfig();
+  const maxWait = config.daemon.startupSocketWaitMs;
+  const interval = 100;
+  let waited = 0;
+  while (waited < maxWait) {
+    if (existsSync(socketPath)) {
+      return { pid, alreadyRunning: false };
+    }
+    if (childExited) {
+      cleanupPidFile();
+      const stderr = readFileSync(stderrPath, 'utf-8').trim();
+      const detail = stderr
+        ? `\n${stderr}`
+        : `\nCheck logs at ~/.vellum/workspace/data/logs/ for details.`;
+      throw new DaemonError(
+        `Daemon exited immediately (code ${childExitCode ?? 'unknown'}).${detail}`,
+      );
+    }
+    await new Promise((r) => setTimeout(r, interval));
+    waited += interval;
+  }
+
+  throw new DaemonError(
+    `Daemon started but socket not available after ${maxWait}ms`,
+  );
+}
+
+export type StopResult =
+  | { stopped: true }
+  | { stopped: false; reason: 'not_running' | 'stop_failed' };
+
+export async function stopDaemon(): Promise<StopResult> {
+  const pid = readPid();
+  if (pid == null || !isProcessRunning(pid)) {
+    cleanupPidFile();
+    return { stopped: false, reason: 'not_running' };
+  }
+
+  process.kill(pid, 'SIGTERM');
+
+  const config = getConfig();
+
+  // Wait for process to exit
+  const maxWait = config.daemon.stopTimeoutMs;
+  const interval = 100;
+  let waited = 0;
+  while (waited < maxWait) {
+    if (!isProcessRunning(pid)) {
+      cleanupPidFile();
+      return { stopped: true };
+    }
+    await new Promise((r) => setTimeout(r, interval));
+    waited += interval;
+  }
+
+  // Force kill
+  try {
+    process.kill(pid, 'SIGKILL');
+  } catch (err) {
+    log.debug({ err, pid }, 'SIGKILL failed, process already exited');
+  }
+
+  // Wait for the process to actually die after SIGKILL. Without this,
+  // startDaemon() can race with the dying process's shutdown handler:
+  // the handler removes the socket file, which breaks the new daemon.
+  const killMaxWait = config.daemon.sigkillGracePeriodMs;
+  let killWaited = 0;
+  while (killWaited < killMaxWait && isProcessRunning(pid)) {
+    await new Promise((r) => setTimeout(r, 100));
+    killWaited += 100;
+  }
+
+  // Only clean up if the process has actually exited.
+  // If it's still alive after SIGKILL + timeout, preserve both socket
+  // and PID file so isDaemonRunning() still reports true and prevents
+  // a duplicate daemon from being spawned.
+  if (!isProcessRunning(pid)) {
+    removeSocketFile(getSocketPath());
+    cleanupPidFile();
+    return { stopped: true };
+  }
+
+  log.warn({ pid }, 'Daemon process still running after SIGKILL + timeout, leaving socket and PID file intact');
+  return { stopped: false, reason: 'stop_failed' };
+}
+
+export async function ensureDaemonRunning(): Promise<void> {
+  if (isDaemonRunning()) return;
+  await startDaemon();
+}
diff --git a/assistant/src/daemon/handlers/pairing.ts b/assistant/src/daemon/handlers/pairing.ts
index 4d5dfd7c876..899b88ad1a8 100644
--- a/assistant/src/daemon/handlers/pairing.ts
+++ b/assistant/src/daemon/handlers/pairing.ts
@@ -5,9 +5,7 @@ import type {
 } from '../ipc-protocol.js';
 import { log, defineHandlers, type HandlerContext } from './shared.js';
 import {
-  isDeviceApproved,
   approveDevice,
-  refreshDevice,
   removeDevice,
   clearAllDevices,
   listDevices,
@@ -26,7 +24,7 @@ export function initPairingHandlers(store: PairingStore, bearerToken: string | u
 function handlePairingApprovalResponse(
   msg: PairingApprovalResponse,
   _socket: net.Socket,
-  ctx: HandlerContext,
+  _ctx: HandlerContext,
 ): void {
   if (!pairingStoreRef) {
     log.warn('Pairing store not initialized');
diff --git a/assistant/src/daemon/handlers/skills.ts b/assistant/src/daemon/handlers/skills.ts
index 7752b2e896f..f991e7344aa 100644
--- a/assistant/src/daemon/handlers/skills.ts
+++ b/assistant/src/daemon/handlers/skills.ts
@@ -423,7 +423,7 @@ export async function handleSkillsSearch(
   ctx: HandlerContext,
 ): Promise<void> {
   try {
-    // Search vellum-skills catalog (remote with bundled fallback)
+    // Search vellum-skills catalog (platform API with bundled fallback)
     const catalogEntries = await listCatalogEntries();
     const query = (msg.query ?? '').toLowerCase();
     const matchingCatalog = catalogEntries.filter((e) => {
diff --git a/assistant/src/daemon/ipc-contract/work-items.ts b/assistant/src/daemon/ipc-contract/work-items.ts
index d5151e5aaa8..96dc12c01d8 100644
--- a/assistant/src/daemon/ipc-contract/work-items.ts
+++ b/assistant/src/daemon/ipc-contract/work-items.ts
@@ -221,4 +221,5 @@ export interface GuardianRequestThreadCreated {
   requestId: string;
   callSessionId: string;
   title: string;
+  questionText: string;
 }
diff --git a/assistant/src/daemon/lifecycle.ts b/assistant/src/daemon/lifecycle.ts
index 863787ed8b3..036918fcd46 100644
--- a/assistant/src/daemon/lifecycle.ts
+++ b/assistant/src/daemon/lifecycle.ts
@@ -1,25 +1,19 @@
-import { spawn } from 'node:child_process';
 import { randomBytes } from 'node:crypto';
-import { mkdirSync, readFileSync, writeFileSync, unlinkSync, existsSync, openSync, closeSync, chmodSync } from 'node:fs';
+import { mkdirSync, readFileSync, writeFileSync, existsSync, chmodSync } from 'node:fs';
 import { createRequire } from 'node:module';
-import { dirname, join, resolve } from 'node:path';
+import { dirname, join } from 'node:path';
 import { config as dotenvConfig } from 'dotenv';
-import * as Sentry from '@sentry/node';
 import {
   getInterfacesDir,
   getSocketPath,
-  getPidPath,
   getHttpTokenPath,
   getRootDir,
   ensureDataDir,
-  migrateToDataLayout,
-  migrateToWorkspaceLayout,
-  removeSocketFile,
 } from '../util/platform.js';
-import { initializeDb, getSqlite, resetDb } from '../memory/db.js';
+import { migrateToDataLayout } from '../migrations/data-layout.js';
+import { migrateToWorkspaceLayout } from '../migrations/workspace-layout.js';
+import { initializeDb } from '../memory/db.js';
 import { rotateToolInvocations } from '../memory/tool-usage-store.js';
-import { initializeProviders, getFailoverProvider, listProviders } from '../providers/registry.js';
-import { initializeTools } from '../tools/registry.js';
 import { loadConfig } from '../config/loader.js';
 import {
   getQdrantUrlEnv,
@@ -32,430 +26,44 @@ import { ensurePromptFiles } from '../config/system-prompt.js';
 import { loadPrebuiltHtml } from '../home-base/prebuilt/seed.js';
 import { DaemonServer } from './server.js';
 import { setRelayBroadcast } from '../calls/relay-server.js';
+import { setVoiceBridgeOrchestrator } from '../calls/voice-session-bridge.js';
 import { listWorkItems, updateWorkItem } from '../work-items/work-item-store.js';
 import { getLogger, initLogger } from '../util/logger.js';
-import { DaemonError } from '../util/errors.js';
 import { initSentry } from '../instrument.js';
 import { initLogfire } from '../logfire.js';
 import { startMemoryJobsWorker } from '../memory/jobs-worker.js';
 import { QdrantManager } from '../memory/qdrant-manager.js';
 import { initQdrantClient } from '../memory/qdrant-client.js';
 import { startScheduler } from '../schedule/scheduler.js';
-import { initWatcherEngine } from '../watcher/engine.js';
-import { registerWatcherProvider } from '../watcher/provider-registry.js';
-import { gmailProvider } from '../watcher/providers/gmail.js';
-import { googleCalendarProvider } from '../watcher/providers/google-calendar.js';
-import { slackProvider as slackWatcherProvider } from '../watcher/providers/slack.js';
-import { githubProvider } from '../watcher/providers/github.js';
-import { linearProvider } from '../watcher/providers/linear.js';
-import { registerMessagingProvider } from '../messaging/registry.js';
-import { slackProvider as slackMessagingProvider } from '../messaging/providers/slack/adapter.js';
-import { gmailMessagingProvider } from '../messaging/providers/gmail/adapter.js';
-import { telegramBotMessagingProvider } from '../messaging/providers/telegram-bot/adapter.js';
-import { smsMessagingProvider } from '../messaging/providers/sms/adapter.js';
-import { whatsappMessagingProvider } from '../messaging/providers/whatsapp/adapter.js';
-import { browserManager } from '../tools/browser/browser-manager.js';
 import { RuntimeHttpServer } from '../runtime/http-server.js';
-import type { ApprovalCopyGenerator, ApprovalConversationGenerator, ApprovalConversationResult, ApprovalConversationDisposition } from '../runtime/http-types.js';
-import {
-  buildGenerationPrompt,
-  includesRequiredKeywords,
-  getFallbackMessage,
-  APPROVAL_COPY_TIMEOUT_MS,
-  APPROVAL_COPY_MAX_TOKENS,
-  APPROVAL_COPY_SYSTEM_PROMPT,
-} from '../runtime/approval-message-composer.js';
 import { getHookManager } from '../hooks/manager.js';
 import { installTemplates } from '../hooks/templates.js';
 import { installCliLaunchers } from './install-cli-launchers.js';
 import { HeartbeatService } from '../workspace/heartbeat-service.js';
 import { AgentHeartbeatService } from '../agent-heartbeat/agent-heartbeat-service.js';
-import { getEnrichmentService } from '../workspace/commit-message-enrichment-service.js';
 import { reconcileCallsOnStartup } from '../calls/call-recovery.js';
 import { TwilioConversationRelayProvider } from '../calls/twilio-provider.js';
+import { createApprovalCopyGenerator, createApprovalConversationGenerator } from './approval-generators.js';
+import { initializeProvidersAndTools, registerWatcherProviders, registerMessagingProviders } from './providers-setup.js';
+import { installShutdownHandlers } from './shutdown-handlers.js';
+import { writePid, cleanupPidFile } from './daemon-control.js';
+
+// Re-export public API so existing consumers don't need to change imports
+export {
+  isDaemonRunning,
+  getDaemonStatus,
+  startDaemon,
+  stopDaemon,
+  ensureDaemonRunning,
+} from './daemon-control.js';
+export type { StopResult } from './daemon-control.js';
 
 const log = getLogger('lifecycle');
 
-function isProcessRunning(pid: number): boolean {
-  try {
-    process.kill(pid, 0);
-    return true;
-  } catch {
-    return false;
-  }
-}
-
-function readPid(): number | null {
-  const pidPath = getPidPath();
-  if (!existsSync(pidPath)) return null;
-  try {
-    const pid = parseInt(readFileSync(pidPath, 'utf-8').trim(), 10);
-    return isNaN(pid) ? null : pid;
-  } catch {
-    return null;
-  }
-}
-
-function writePid(pid: number): void {
-  writeFileSync(getPidPath(), String(pid));
-}
-
-function cleanupPidFile(): void {
-  const pidPath = getPidPath();
-  if (existsSync(pidPath)) {
-    unlinkSync(pidPath);
-  }
-}
-
-export function isDaemonRunning(): boolean {
-  const pid = readPid();
-  if (pid == null) return false;
-  if (!isProcessRunning(pid)) {
-    // Stale PID file
-    cleanupPidFile();
-    return false;
-  }
-  return true;
-}
-
-export function getDaemonStatus(): { running: boolean; pid?: number } {
-  const pid = readPid();
-  if (pid == null) return { running: false };
-  if (!isProcessRunning(pid)) {
-    cleanupPidFile();
-    return { running: false };
-  }
-  return { running: true, pid };
-}
-
-export async function startDaemon(): Promise<{
-  pid: number;
-  alreadyRunning: boolean;
-}> {
-  const status = getDaemonStatus();
-  if (status.running && status.pid) {
-    return { pid: status.pid, alreadyRunning: true };
-  }
-
-  // Only create the root dir for socket/PID — the daemon process itself
-  // handles migration + full ensureDataDir() in runDaemon(). Calling
-  // ensureDataDir() here would pre-create workspace destination dirs
-  // and cause migration moves to no-op.
-  const rootDir = getRootDir();
-  if (!existsSync(rootDir)) {
-    mkdirSync(rootDir, { recursive: true });
-  }
-
-  // Clean up stale socket (only if it's actually a Unix socket)
-  const socketPath = getSocketPath();
-  removeSocketFile(socketPath);
-
-  // Spawn the daemon as a detached child process
-  const mainPath = resolve(
-    import.meta.dirname ?? __dirname,
-    'main.ts',
-  );
-
-  // Redirect the child's stderr to a file instead of piping it back to the
-  // parent. A pipe's read end is destroyed when the parent exits, leaving
-  // fd 2 broken in the child. Bun (unlike Node.js) does not ignore SIGPIPE,
-  // so any later stderr write would silently kill the daemon.
-  const stderrPath = join(rootDir, 'daemon-stderr.log');
-  const stderrFd = openSync(stderrPath, 'w');
-
-  const child = spawn('bun', ['run', mainPath], {
-    detached: true,
-    stdio: ['ignore', 'ignore', stderrFd],
-    env: { ...process.env },
-  });
-
-  // The child inherited the fd; close the parent's copy.
-  closeSync(stderrFd);
-
-  let childExited = false;
-  let childExitCode: number | null = null;
-  child.on('exit', (code) => {
-    childExited = true;
-    childExitCode = code;
-  });
-
-  child.unref();
-
-  const pid = child.pid;
-  if (!pid) {
-    throw new DaemonError('Failed to start daemon: no PID returned');
-  }
-
-  writePid(pid);
-
-  // Wait for socket to appear
-  const maxWait = 5000;
-  const interval = 100;
-  let waited = 0;
-  while (waited < maxWait) {
-    if (existsSync(socketPath)) {
-      return { pid, alreadyRunning: false };
-    }
-    if (childExited) {
-      cleanupPidFile();
-      const stderr = readFileSync(stderrPath, 'utf-8').trim();
-      const detail = stderr
-        ? `\n${stderr}`
-        : `\nCheck logs at ~/.vellum/workspace/data/logs/ for details.`;
-      throw new DaemonError(
-        `Daemon exited immediately (code ${childExitCode ?? 'unknown'}).${detail}`,
-      );
-    }
-    await new Promise((r) => setTimeout(r, interval));
-    waited += interval;
-  }
-
-  throw new DaemonError(
-    'Daemon started but socket not available after 5 seconds',
-  );
-}
-
-export type StopResult =
-  | { stopped: true }
-  | { stopped: false; reason: 'not_running' | 'stop_failed' };
-
-export async function stopDaemon(): Promise<StopResult> {
-  const pid = readPid();
-  if (pid == null || !isProcessRunning(pid)) {
-    cleanupPidFile();
-    return { stopped: false, reason: 'not_running' };
-  }
-
-  process.kill(pid, 'SIGTERM');
-
-  // Wait for process to exit
-  const maxWait = 5000;
-  const interval = 100;
-  let waited = 0;
-  while (waited < maxWait) {
-    if (!isProcessRunning(pid)) {
-      cleanupPidFile();
-      return { stopped: true };
-    }
-    await new Promise((r) => setTimeout(r, interval));
-    waited += interval;
-  }
-
-  // Force kill
-  try {
-    process.kill(pid, 'SIGKILL');
-  } catch (err) {
-    log.debug({ err, pid }, 'SIGKILL failed, process already exited');
-  }
-
-  // Wait for the process to actually die after SIGKILL. Without this,
-  // startDaemon() can race with the dying process's shutdown handler,
-  // which removes the socket file and bricks the new daemon.
-  const killMaxWait = 2000;
-  let killWaited = 0;
-  while (killWaited < killMaxWait && isProcessRunning(pid)) {
-    await new Promise((r) => setTimeout(r, 100));
-    killWaited += 100;
-  }
-
-  // Only clean up if the process has actually exited.
-  // If it's still alive after SIGKILL + timeout, preserve both socket
-  // and PID file so isDaemonRunning() still reports true and prevents
-  // a duplicate daemon from being spawned.
-  if (!isProcessRunning(pid)) {
-    removeSocketFile(getSocketPath());
-    cleanupPidFile();
-    return { stopped: true };
-  }
-
-  log.warn({ pid }, 'Daemon process still running after SIGKILL + timeout, leaving socket and PID file intact');
-  return { stopped: false, reason: 'stop_failed' };
-}
-
-export async function ensureDaemonRunning(): Promise<void> {
-  if (isDaemonRunning()) return;
-  await startDaemon();
-}
-
 function loadDotEnv(): void {
   dotenvConfig({ path: join(getRootDir(), '.env'), quiet: true });
 }
 
-/**
- * Create the daemon-owned approval copy generator that resolves providers
- * and calls `provider.sendMessage` to generate approval copy text.
- * This keeps all provider awareness in the daemon lifecycle, away from
- * the runtime composer.
- */
-function createApprovalCopyGenerator(): ApprovalCopyGenerator {
-  return async (context, options = {}) => {
-    const config = loadConfig();
-    let provider;
-    try {
-      provider = getFailoverProvider(config.provider, config.providerOrder);
-    } catch {
-      return null;
-    }
-
-    const fallbackText = options.fallbackText?.trim() || getFallbackMessage(context);
-    const requiredKeywords = options.requiredKeywords?.map((kw) => kw.trim()).filter((kw) => kw.length > 0);
-    const prompt = buildGenerationPrompt(context, fallbackText, requiredKeywords);
-
-    const response = await provider.sendMessage(
-      [{ role: 'user', content: [{ type: 'text', text: prompt }] }],
-      [],
-      APPROVAL_COPY_SYSTEM_PROMPT,
-      {
-        config: {
-          max_tokens: options.maxTokens ?? APPROVAL_COPY_MAX_TOKENS,
-        },
-        signal: AbortSignal.timeout(options.timeoutMs ?? APPROVAL_COPY_TIMEOUT_MS),
-      },
-    );
-
-    const block = response.content.find((entry) => entry.type === 'text');
-    const text = block && 'text' in block ? block.text.trim() : '';
-    if (!text) return null;
-    const cleaned = text
-      .replace(/^["'`]+/, '')
-      .replace(/["'`]+$/, '')
-      .trim();
-    if (!cleaned) return null;
-    if (!includesRequiredKeywords(cleaned, requiredKeywords)) return null;
-    return cleaned;
-  };
-}
-
-// ---------------------------------------------------------------------------
-// Approval conversation generator constants
-// ---------------------------------------------------------------------------
-
-const APPROVAL_CONVERSATION_TIMEOUT_MS = 8_000;
-const APPROVAL_CONVERSATION_MAX_TOKENS = 300;
-
-const APPROVAL_CONVERSATION_SYSTEM_PROMPT =
-  'You are an assistant helping a user manage a pending tool approval request. '
-  + 'Analyze the user\'s message to determine if they are making a decision '
-  + '(approve, reject, or cancel) or just asking a question / making conversation. '
-  + 'When uncertain, default to keep_pending — never approve or reject without clear intent. '
-  + 'For guardians: explain what tool is requesting approval and from whom. '
-  + 'Always provide a natural, helpful reply along with your decision.';
-
-const APPROVAL_CONVERSATION_TOOL_NAME = 'approval_decision';
-
-const APPROVAL_CONVERSATION_TOOL_SCHEMA = {
-  name: APPROVAL_CONVERSATION_TOOL_NAME,
-  description:
-    'Record the disposition of the approval conversation turn. '
-    + 'Call this tool with the determined disposition and a natural reply to the user.',
-  input_schema: {
-    type: 'object' as const,
-    properties: {
-      disposition: {
-        type: 'string',
-        enum: ['keep_pending', 'approve_once', 'approve_always', 'reject'],
-        description:
-          'The decision: keep_pending if the user is asking questions or unclear, '
-          + 'approve_once to approve this single request, approve_always to approve '
-          + 'this tool permanently, reject to deny the request.',
-      },
-      replyText: {
-        type: 'string',
-        description: 'A natural language reply to send back to the user.',
-      },
-      targetRunId: {
-        type: 'string',
-        description:
-          'The run ID of the specific pending approval being acted on. '
-          + 'Required when there are multiple pending approvals and the disposition is decision-bearing.',
-      },
-    },
-    required: ['disposition', 'replyText'],
-  },
-};
-
-const VALID_DISPOSITIONS: ReadonlySet<string> = new Set([
-  'keep_pending',
-  'approve_once',
-  'approve_always',
-  'reject',
-]);
-
-/**
- * Create the daemon-owned approval conversation generator that resolves
- * providers and uses tool_use / function calling for structured output.
- * Follows the same provider-aware pattern as createApprovalCopyGenerator().
- */
-function createApprovalConversationGenerator(): ApprovalConversationGenerator {
-  return async (context) => {
-    const config = loadConfig();
-    if (!listProviders().includes(config.provider)) {
-      throw new Error('No provider available for approval conversation');
-    }
-    const provider = getFailoverProvider(config.provider, config.providerOrder);
-
-    const pendingDescription = context.pendingApprovals
-      .map((p) => `- Run ${p.runId}: tool "${p.toolName}"`)
-      .join('\n');
-
-    const userPrompt = [
-      `Role: ${context.role}`,
-      `Tool requesting approval: "${context.toolName}"`,
-      `Allowed actions: ${context.allowedActions.join(', ')}`,
-      `Pending approvals:\n${pendingDescription}`,
-      `\nUser message: ${context.userMessage}`,
-    ].join('\n');
-
-    const response = await provider.sendMessage(
-      [{ role: 'user', content: [{ type: 'text', text: userPrompt }] }],
-      [APPROVAL_CONVERSATION_TOOL_SCHEMA],
-      APPROVAL_CONVERSATION_SYSTEM_PROMPT,
-      {
-        config: {
-          max_tokens: APPROVAL_CONVERSATION_MAX_TOKENS,
-        },
-        signal: AbortSignal.timeout(APPROVAL_CONVERSATION_TIMEOUT_MS),
-      },
-    );
-
-    // Extract the tool_use block from the response
-    const toolUseBlock = response.content.find(
-      (block) => block.type === 'tool_use' && block.name === APPROVAL_CONVERSATION_TOOL_NAME,
-    );
-
-    if (!toolUseBlock || toolUseBlock.type !== 'tool_use') {
-      throw new Error('Provider did not return a tool_use block for approval decision');
-    }
-
-    const input = toolUseBlock.input as Record<string, unknown>;
-
-    // Strict validation of the structured output
-    const disposition = input.disposition;
-    if (typeof disposition !== 'string' || !VALID_DISPOSITIONS.has(disposition)) {
-      throw new Error(`Invalid disposition: ${String(disposition)}`);
-    }
-
-    const replyText = input.replyText;
-    if (typeof replyText !== 'string' || replyText.trim().length === 0) {
-      throw new Error('Missing or empty replyText in tool_use response');
-    }
-
-    const targetRunId = input.targetRunId;
-    if (targetRunId !== undefined && typeof targetRunId !== 'string') {
-      throw new Error('Invalid targetRunId in tool_use response');
-    }
-
-    const result: ApprovalConversationResult = {
-      disposition: disposition as ApprovalConversationDisposition,
-      replyText: replyText.trim(),
-    };
-    if (typeof targetRunId === 'string' && targetRunId.length > 0) {
-      result.targetRunId = targetRunId;
-    }
-    return result;
-  };
-}
-
 // Entry point for the daemon process itself
 export async function runDaemon(): Promise<void> {
   loadDotEnv();
@@ -510,7 +118,6 @@ export async function runDaemon(): Promise<void> {
   installTemplates();
   ensurePromptFiles();
 
-  // Install standalone CLI launchers (e.g. doordash, map) in ~/.vellum/bin/
   try {
     installCliLaunchers();
   } catch (err) {
@@ -530,8 +137,6 @@ export async function runDaemon(): Promise<void> {
     log.info({ count: orphanedRunning.length }, 'Recovered orphaned running work items');
   }
 
-  // Reconcile in-flight calls that were left in non-terminal states
-  // after a daemon crash or restart.
   try {
     const twilioProvider = new TwilioConversationRelayProvider();
     await reconcileCallsOnStartup(twilioProvider, log);
@@ -546,10 +151,7 @@ export async function runDaemon(): Promise<void> {
     initLogger({ dir: config.logFile.dir, retentionDays: config.logFile.retentionDays });
   }
 
-  log.info('Daemon startup: initializing providers and tools');
-  initializeProviders(config);
-  await initializeTools();
-  log.info('Daemon startup: providers and tools initialized');
+  await initializeProvidersAndTools(config);
 
   // Start the IPC socket BEFORE Qdrant so that clients can connect
   // immediately. Qdrant startup can take 30+ seconds (binary download,
@@ -562,9 +164,7 @@ export async function runDaemon(): Promise<void> {
   // Initialize Qdrant vector store — non-fatal so the daemon stays up without it
   const qdrantUrl = getQdrantUrlEnv() || config.memory.qdrant.url;
   log.info({ qdrantUrl }, 'Daemon startup: initializing Qdrant');
-  const qdrantManager = new QdrantManager({
-    url: qdrantUrl,
-  });
+  const qdrantManager = new QdrantManager({ url: qdrantUrl });
   try {
     await qdrantManager.start();
     initQdrantClient({
@@ -581,20 +181,9 @@ export async function runDaemon(): Promise<void> {
 
   log.info('Daemon startup: starting memory worker');
   const memoryWorker = startMemoryJobsWorker();
-  // Initialize watcher engine and register providers
-  registerWatcherProvider(gmailProvider);
-  registerWatcherProvider(googleCalendarProvider);
-  registerWatcherProvider(slackWatcherProvider);
-  registerWatcherProvider(githubProvider);
-  registerWatcherProvider(linearProvider);
-  initWatcherEngine();
-
-  // Register messaging providers
-  registerMessagingProvider(slackMessagingProvider);
-  registerMessagingProvider(gmailMessagingProvider);
-  registerMessagingProvider(telegramBotMessagingProvider);
-  registerMessagingProvider(smsMessagingProvider);
-  registerMessagingProvider(whatsappMessagingProvider);
+
+  registerWatcherProviders();
+  registerMessagingProviders();
 
   const scheduler = startScheduler(
     async (conversationId, message) => {
@@ -631,58 +220,65 @@ export async function runDaemon(): Promise<void> {
     },
   );
 
-  // Start optional runtime HTTP server when RUNTIME_HTTP_PORT is set
+  // Start the runtime HTTP server. Required for iOS pairing (gateway proxies
+  // to it) and optional REST API access. Defaults to port 7821.
   let runtimeHttp: RuntimeHttpServer | null = null;
   const httpPort = getRuntimeHttpPort();
-  log.info({ httpPort }, 'Daemon startup: checking RUNTIME_HTTP_PORT');
-  if (httpPort) {
-    const port = httpPort;
-    // Resolve the bearer token in priority order:
-    //   1. Explicit env var (e.g. cloud deploys)
-    //   2. Existing token file on disk (preserves QR-paired iOS devices across restarts)
-    //   3. Fresh random token (first-time startup)
-    const httpTokenPath = getHttpTokenPath();
-    let bearerToken = getRuntimeProxyBearerToken();
-    if (!bearerToken) {
-      try {
-        const existing = readFileSync(httpTokenPath, 'utf-8').trim();
-        if (existing) bearerToken = existing;
-      } catch {
-        // File doesn't exist or can't be read — will generate below
-      }
-    }
-    if (!bearerToken) {
-      bearerToken = randomBytes(32).toString('hex');
-    }
-    writeFileSync(httpTokenPath, bearerToken, { mode: 0o600 });
-    chmodSync(httpTokenPath, 0o600);
-
-    const hostname = getRuntimeHttpHost();
-
-    runtimeHttp = new RuntimeHttpServer({
-      port,
-      hostname,
-      bearerToken,
-      processMessage: (conversationId, content, attachmentIds, options, sourceChannel) =>
-        server.processMessage(conversationId, content, attachmentIds, options, sourceChannel),
-      persistAndProcessMessage: (conversationId, content, attachmentIds, options, sourceChannel) =>
-        server.persistAndProcessMessage(conversationId, content, attachmentIds, options, sourceChannel),
-      runOrchestrator: server.createRunOrchestrator(),
-      interfacesDir: getInterfacesDir(),
-      approvalCopyGenerator: createApprovalCopyGenerator(),
-      approvalConversationGenerator: createApprovalConversationGenerator(),
-    });
+  log.info({ httpPort }, 'Daemon startup: starting runtime HTTP server');
+
+  // Resolve the bearer token in priority order:
+  //   1. Explicit env var (e.g. cloud deploys)
+  //   2. Existing token file on disk (preserves QR-paired iOS devices across restarts)
+  //   3. Fresh random token (first-time startup)
+  const httpTokenPath = getHttpTokenPath();
+  let bearerToken = getRuntimeProxyBearerToken();
+  if (!bearerToken) {
     try {
-      log.info({ port, hostname }, 'Daemon startup: starting runtime HTTP server');
-      await runtimeHttp.start();
-      setRelayBroadcast((msg) => server.broadcast(msg));
-      server.setHttpPort(port);
-      log.info({ port, hostname }, 'Daemon startup: runtime HTTP server listening');
-    } catch (err) {
-      log.warn({ err, port }, 'Failed to start runtime HTTP server, continuing without it');
-      runtimeHttp = null;
+      const existing = readFileSync(httpTokenPath, 'utf-8').trim();
+      if (existing) bearerToken = existing;
+    } catch {
+      // File doesn't exist or can't be read — will generate below
     }
   }
+  if (!bearerToken) {
+    bearerToken = randomBytes(32).toString('hex');
+  }
+  writeFileSync(httpTokenPath, bearerToken, { mode: 0o600 });
+  chmodSync(httpTokenPath, 0o600);
+
+  const hostname = getRuntimeHttpHost();
+
+  const runOrchestrator = server.createRunOrchestrator();
+
+  runtimeHttp = new RuntimeHttpServer({
+    port: httpPort,
+    hostname,
+    bearerToken,
+    processMessage: (conversationId, content, attachmentIds, options, sourceChannel) =>
+      server.processMessage(conversationId, content, attachmentIds, options, sourceChannel),
+    persistAndProcessMessage: (conversationId, content, attachmentIds, options, sourceChannel) =>
+      server.persistAndProcessMessage(conversationId, content, attachmentIds, options, sourceChannel),
+    runOrchestrator,
+    interfacesDir: getInterfacesDir(),
+    approvalCopyGenerator: createApprovalCopyGenerator(),
+    approvalConversationGenerator: createApprovalConversationGenerator(),
+  });
+
+  // Inject the voice bridge orchestrator BEFORE attempting to start the HTTP
+  // server. The bridge only needs the RunOrchestrator instance (already created
+  // above) and must be available even when the HTTP server fails to bind.
+  setVoiceBridgeOrchestrator(runOrchestrator);
+
+  try {
+    await runtimeHttp.start();
+    setRelayBroadcast((msg) => server.broadcast(msg));
+    runtimeHttp.setPairingBroadcast((msg) => server.broadcast(msg));
+    server.setHttpPort(httpPort);
+    log.info({ port: httpPort, hostname }, 'Daemon startup: runtime HTTP server listening');
+  } catch (err) {
+    log.warn({ err, port: httpPort }, 'Failed to start runtime HTTP server, continuing without it');
+    runtimeHttp = null;
+  }
 
   writePid(process.pid);
   log.info({ pid: process.pid }, 'Daemon started');
@@ -695,9 +291,6 @@ export async function runDaemon(): Promise<void> {
     socketPath: getSocketPath(),
   });
 
-  // Rotate old audit log entries after startup handshake is complete.
-  // This runs after the socket is listening so it won't block the 5s
-  // readiness window in startDaemon().
   if (config.auditLog.retentionDays > 0) {
     try {
       rotateToolInvocations(config.auditLog.retentionDays);
@@ -706,15 +299,9 @@ export async function runDaemon(): Promise<void> {
     }
   }
 
-  // Start workspace heartbeat service. This periodically checks all
-  // tracked workspaces for uncommitted changes and auto-commits when
-  // thresholds are exceeded (age > 5 min OR > 20 files changed).
-  // Acts as a safety net for long-running operations or background
-  // processes that modify workspace files between turn-boundary commits.
   const heartbeat = new HeartbeatService();
   heartbeat.start();
 
-  // Start model-driven heartbeat service (opt-in via config).
   const agentHeartbeat = new AgentHeartbeatService({
     processMessage: (conversationId, content) =>
       server.processMessage(conversationId, content),
@@ -722,98 +309,15 @@ export async function runDaemon(): Promise<void> {
   });
   agentHeartbeat.start();
 
-  // Graceful shutdown
-  let shuttingDown = false;
-  const shutdown = async () => {
-    if (shuttingDown) return; // Prevent re-entrant shutdown
-    shuttingDown = true;
-    log.info('Shutting down daemon...');
-
-    hookManager.stopWatching();
-
-    // Force exit if graceful shutdown takes too long.
-    // Set this BEFORE awaiting heartbeat stop and triggering daemon-stop hooks
-    // so it covers all potentially-blocking async shutdown work.
-    const forceTimer = setTimeout(() => {
-      log.warn('Graceful shutdown timed out, forcing exit');
-      cleanupPidFile();
-      process.exit(1);
-    }, 10_000);
-    forceTimer.unref();
-
-    await heartbeat.stop();
-    await agentHeartbeat.stop();
-
-    try {
-      await hookManager.trigger('daemon-stop', { pid: process.pid });
-    } catch {
-      // Don't let hook failures block shutdown
-    }
-
-    // Commit any uncommitted workspace changes before stopping the server.
-    // This ensures no workspace state is lost during graceful shutdown.
-    try {
-      log.info({ phase: 'pre_stop' }, 'Committing pending workspace changes');
-      await heartbeat.commitAllPending();
-    } catch (err) {
-      log.warn({ err, phase: 'pre_stop' }, 'Shutdown workspace commit failed');
-    }
-
-    await server.stop();
-
-    // Final commit sweep: catch any writes that occurred during server.stop()
-    // (e.g. in-flight tool executions completing during drain).
-    try {
-      log.info({ phase: 'post_stop' }, 'Final workspace commit sweep');
-      await heartbeat.commitAllPending();
-    } catch (err) {
-      log.warn({ err, phase: 'post_stop' }, 'Post-stop workspace commit failed');
-    }
-
-    // Flush in-flight enrichment jobs so shutdown commit notes are not dropped.
-    // The enrichment service's shutdown() drains active jobs and discards pending ones.
-    try {
-      await getEnrichmentService().shutdown();
-    } catch (err) {
-      log.warn({ err }, 'Enrichment service shutdown failed (non-fatal)');
-    }
-
-    if (runtimeHttp) await runtimeHttp.stop();
-    await browserManager.closeAllPages();
-    scheduler.stop();
-    memoryWorker.stop();
-    await qdrantManager.stop();
-
-    // Checkpoint WAL and close SQLite so no writes are lost on exit.
-    // Checkpoint and close are in separate try blocks so that close()
-    // always runs even if checkpointing throws (e.g. SQLITE_BUSY).
-    try {
-      getSqlite().exec('PRAGMA wal_checkpoint(TRUNCATE)');
-    } catch (err) {
-      log.warn({ err }, 'WAL checkpoint failed (non-fatal)');
-    }
-    try {
-      resetDb();
-    } catch (err) {
-      log.warn({ err }, 'Database close failed (non-fatal)');
-    }
-
-    await Sentry.flush(2000);
-    clearTimeout(forceTimer);
-    cleanupPidFile();
-    process.exit(0);
-  };
-
-  process.on('SIGTERM', shutdown);
-  process.on('SIGINT', shutdown);
-
-  process.on('unhandledRejection', (reason) => {
-    log.error({ err: reason }, 'Unhandled promise rejection');
-    Sentry.captureException(reason);
-  });
-
-  process.on('uncaughtException', (err) => {
-    log.error({ err }, 'Uncaught exception');
-    Sentry.captureException(err);
+  installShutdownHandlers({
+    server,
+    heartbeat,
+    agentHeartbeat,
+    hookManager,
+    runtimeHttp,
+    scheduler,
+    memoryWorker,
+    qdrantManager,
+    cleanupPidFile,
   });
 }
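The bearer-token resolution order in the hunk above (explicit value, then the token file on disk, then a fresh random token) can be sketched in isolation. This is a minimal sketch, not the daemon's code: `resolveBearerToken` and the temp-directory path are hypothetical names introduced for illustration, and the explicit value stands in for whatever `getRuntimeProxyBearerToken()` returns.

```typescript
import { randomBytes } from 'node:crypto';
import { chmodSync, mkdtempSync, readFileSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical helper mirroring the priority order described in the diff:
//   1. explicit token (e.g. from an env var)
//   2. existing token file on disk
//   3. fresh random token
function resolveBearerToken(tokenPath: string, explicitToken?: string): string {
  let token = explicitToken;
  if (!token) {
    try {
      const existing = readFileSync(tokenPath, 'utf-8').trim();
      if (existing) token = existing;
    } catch {
      // File is missing or unreadable; fall through to generation.
    }
  }
  if (!token) {
    token = randomBytes(32).toString('hex');
  }
  // Persist whichever token won so that later restarts reuse it.
  writeFileSync(tokenPath, token, { mode: 0o600 });
  // `mode` only applies when the file is created, so tighten existing files too.
  chmodSync(tokenPath, 0o600);
  return token;
}

// Usage against a throwaway directory (assumed path, for demonstration only).
const tokenPath = join(mkdtempSync(join(tmpdir(), 'token-demo-')), 'http-token');
const first = resolveBearerToken(tokenPath); // generated and persisted
const second = resolveBearerToken(tokenPath); // read back from disk
const fromEnv = resolveBearerToken(tokenPath, 'env-supplied-token'); // explicit value wins
```

Writing the file unconditionally, as the diff does, means an explicitly supplied token also replaces whatever was previously stored on disk.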
diff --git a/assistant/src/daemon/providers-setup.ts b/assistant/src/daemon/providers-setup.ts
new file mode 100644
index 00000000000..ddb4b21044b
--- /dev/null
+++ b/assistant/src/daemon/providers-setup.ts
@@ -0,0 +1,43 @@
+import { initializeProviders } from '../providers/registry.js';
+import { initializeTools } from '../tools/registry.js';
+import { registerWatcherProvider } from '../watcher/provider-registry.js';
+import { gmailProvider } from '../watcher/providers/gmail.js';
+import { googleCalendarProvider } from '../watcher/providers/google-calendar.js';
+import { slackProvider as slackWatcherProvider } from '../watcher/providers/slack.js';
+import { githubProvider } from '../watcher/providers/github.js';
+import { linearProvider } from '../watcher/providers/linear.js';
+import { registerMessagingProvider } from '../messaging/registry.js';
+import { slackProvider as slackMessagingProvider } from '../messaging/providers/slack/adapter.js';
+import { gmailMessagingProvider } from '../messaging/providers/gmail/adapter.js';
+import { telegramBotMessagingProvider } from '../messaging/providers/telegram-bot/adapter.js';
+import { smsMessagingProvider } from '../messaging/providers/sms/adapter.js';
+import { whatsappMessagingProvider } from '../messaging/providers/whatsapp/adapter.js';
+import { initWatcherEngine } from '../watcher/engine.js';
+import type { AssistantConfig } from '../config/types.js';
+import { getLogger } from '../util/logger.js';
+
+const log = getLogger('lifecycle');
+
+export async function initializeProvidersAndTools(config: AssistantConfig): Promise<void> {
+  log.info('Daemon startup: initializing providers and tools');
+  initializeProviders(config);
+  await initializeTools();
+  log.info('Daemon startup: providers and tools initialized');
+}
+
+export function registerWatcherProviders(): void {
+  registerWatcherProvider(gmailProvider);
+  registerWatcherProvider(googleCalendarProvider);
+  registerWatcherProvider(slackWatcherProvider);
+  registerWatcherProvider(githubProvider);
+  registerWatcherProvider(linearProvider);
+  initWatcherEngine();
+}
+
+export function registerMessagingProviders(): void {
+  registerMessagingProvider(slackMessagingProvider);
+  registerMessagingProvider(gmailMessagingProvider);
+  registerMessagingProvider(telegramBotMessagingProvider);
+  registerMessagingProvider(smsMessagingProvider);
+  registerMessagingProvider(whatsappMessagingProvider);
+}
diff --git a/assistant/src/daemon/session-agent-loop-handlers.ts b/assistant/src/daemon/session-agent-loop-handlers.ts
index 782c1b09172..12d677d0243 100644
--- a/assistant/src/daemon/session-agent-loop-handlers.ts
+++ b/assistant/src/daemon/session-agent-loop-handlers.ts
@@ -121,6 +121,7 @@ export function handleThinkingDelta(
   deps: EventHandlerDeps,
   event: Extract<AgentEvent, { type: 'thinking_delta' }>,
 ): void {
+  if (!deps.ctx.streamThinking) return;
   emitLlmCallStartedIfNeeded(state, deps);
   deps.onEvent({ type: 'assistant_thinking_delta', thinking: event.thinking });
 }
diff --git a/assistant/src/daemon/session-agent-loop.ts b/assistant/src/daemon/session-agent-loop.ts
index 865260e7111..0002d1d6608 100644
--- a/assistant/src/daemon/session-agent-loop.ts
+++ b/assistant/src/daemon/session-agent-loop.ts
@@ -116,6 +116,7 @@ export interface AgentLoopSessionContext {
   lastAttachmentWarnings: string[];
 
   hasNoClient: boolean;
+  readonly streamThinking: boolean;
   readonly prompter: PermissionPrompter;
   readonly queue: MessageQueue;
 
@@ -625,13 +626,10 @@ export async function runAgentLoopImpl(
     }
 
     if (isFirstMessage) {
-      void (async () => {
-        try {
-          await generateTitle(ctx, content, state.firstAssistantText, onEvent);
-        } catch (err) {
+      generateTitle(ctx, content, state.firstAssistantText, onEvent, abortController.signal)
+        .catch((err) => {
           log.warn({ err, conversationId: ctx.conversationId }, 'Failed to generate conversation title (non-fatal, using default title)');
-        }
-      })();
+        });
     }
   } catch (err) {
     const errorCtx = { phase: 'agent_loop' as const, aborted: abortController.signal.aborted };
@@ -712,13 +710,18 @@ async function generateTitle(
   userMessage: string,
   assistantResponse: string,
   onEvent: (msg: ServerMessage) => void,
+  sessionSignal?: AbortSignal,
 ): Promise<void> {
+  const config = getConfig();
   const prompt = `Generate a very short title for this conversation. Rules: at most 5 words, at most 40 characters, no quotes.\n\nUser: ${truncate(userMessage, 200, '')}\nAssistant: ${truncate(assistantResponse, 200, '')}`;
+  const signal = sessionSignal
+    ? AbortSignal.any([sessionSignal, AbortSignal.timeout(10_000)])
+    : AbortSignal.timeout(10_000);
   const response = await ctx.provider.sendMessage(
     [{ role: 'user', content: [{ type: 'text', text: prompt }] }],
     [],
     undefined,
-    { config: { max_tokens: 30 } },
+    { config: { max_tokens: config.daemon.titleGenerationMaxTokens }, signal },
   );
 
   const textBlock = response.content.find((b) => b.type === 'text');
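The signal handling added to `generateTitle` combines an optional session-level abort signal with a fixed timeout. A minimal sketch of that combination follows, assuming a runtime that provides `AbortSignal.any` and `AbortSignal.timeout` (the diff itself relies on both); `withTimeout` is a hypothetical helper name, not something defined in this PR.

```typescript
// Hypothetical helper: returns a signal that aborts when either the parent
// signal aborts or the timeout elapses, whichever happens first.
function withTimeout(timeoutMs: number, parent?: AbortSignal): AbortSignal {
  const timeout = AbortSignal.timeout(timeoutMs);
  return parent ? AbortSignal.any([parent, timeout]) : timeout;
}

// Usage: aborting the parent aborts the combined signal long before the
// 10 second timeout would fire.
const controller = new AbortController();
const combined = withTimeout(10_000, controller.signal);
const abortedBefore = combined.aborted;
controller.abort();
const abortedAfter = combined.aborted;
```

Passing the combined signal to the provider call lets a title request end either when the session is torn down or when the request simply takes too long.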
diff --git a/assistant/src/daemon/session-messaging.ts b/assistant/src/daemon/session-messaging.ts
index 77169852614..541cfee4df3 100644
--- a/assistant/src/daemon/session-messaging.ts
+++ b/assistant/src/daemon/session-messaging.ts
@@ -64,6 +64,7 @@ export function enqueueMessage(
     currentPage,
     metadata,
     turnChannelContext,
+    queuedAt: Date.now(),
   });
   if (!pushed) {
     return { queued: false, rejected: true, requestId };
diff --git a/assistant/src/daemon/session-queue-manager.ts b/assistant/src/daemon/session-queue-manager.ts
index ac8484e36f3..e64d9b0280b 100644
--- a/assistant/src/daemon/session-queue-manager.ts
+++ b/assistant/src/daemon/session-queue-manager.ts
@@ -7,6 +7,9 @@
 
 import type { ServerMessage, UserMessageAttachment } from './ipc-protocol.js';
 import type { TurnChannelContext } from '../channels/types.js';
+import { getLogger } from '../util/logger.js';
+
+const log = getLogger('session-queue');
 
 export interface QueuedMessage {
   content: string;
@@ -17,9 +20,14 @@ export interface QueuedMessage {
   currentPage?: string;
   metadata?: Record<string, unknown>;
   turnChannelContext?: TurnChannelContext;
+  /** Timestamp (ms) when the message was enqueued. */
+  queuedAt: number;
 }
 
 export const MAX_QUEUE_DEPTH = 10;
+/** Messages older than this (ms) are auto-expired from the queue. */
+export const DEFAULT_MAX_WAIT_MS = 60_000;
+const CAPACITY_WARNING_THRESHOLD = 0.8; // fraction of MAX_QUEUE_DEPTH at which a capacity warning is logged
 
 /**
  * Describes why a queued message was promoted from the queue.
@@ -37,27 +45,77 @@ export interface QueuePolicy {
   checkpointHandoffEnabled: boolean;
 }
 
+export interface QueueMetrics {
+  currentDepth: number;
+  totalDropped: number;
+  totalExpired: number;
+  /** Average wait time (ms) of dequeued messages. 0 when no messages have been dequeued. */
+  averageWaitMs: number;
+}
+
 /**
  * Typed wrapper around the queued-message array.
  *
- * Session owns one instance; the wrapper handles capacity checks and
- * iteration so the rest of Session doesn't touch the raw array.
+ * Session owns one instance; the wrapper handles capacity checks,
+ * expiry, metrics, and iteration so the rest of Session doesn't
+ * touch the raw array.
  */
 export class MessageQueue {
   private items: QueuedMessage[] = [];
+  private maxWaitMs: number;
+  private droppedCount = 0;
+  private expiredCount = 0;
+  private totalWaitMs = 0;
+  private dequeuedCount = 0;
+  private capacityWarned = false;
+
+  constructor(maxWaitMs: number = DEFAULT_MAX_WAIT_MS) {
+    this.maxWaitMs = maxWaitMs;
+  }
 
   push(item: QueuedMessage): boolean {
-    if (this.items.length >= MAX_QUEUE_DEPTH) return false;
+    this.expireStale();
+
+    if (this.items.length >= MAX_QUEUE_DEPTH) {
+      this.droppedCount++;
+      item.onEvent({
+        type: 'error',
+        message: 'Message queue is full. Please wait for current messages to be processed.',
+        category: 'queue_full',
+      });
+      return false;
+    }
+
+    item.queuedAt = Date.now();
     this.items.push(item);
+
+    const ratio = this.items.length / MAX_QUEUE_DEPTH;
+    if (ratio >= CAPACITY_WARNING_THRESHOLD && !this.capacityWarned) {
+      this.capacityWarned = true;
+      log.warn({ depth: this.items.length, max: MAX_QUEUE_DEPTH }, 'Queue nearing capacity');
+    } else if (ratio < CAPACITY_WARNING_THRESHOLD) {
+      this.capacityWarned = false;
+    }
+
     return true;
   }
 
   shift(): QueuedMessage | undefined {
-    return this.items.shift();
+    this.expireStale();
+    const item = this.items.shift();
+    if (item) {
+      this.dequeuedCount++;
+      this.totalWaitMs += Date.now() - item.queuedAt;
+    }
+    if (this.items.length / MAX_QUEUE_DEPTH < CAPACITY_WARNING_THRESHOLD) {
+      this.capacityWarned = false;
+    }
+    return item;
   }
 
   clear(): void {
     this.items = [];
+    this.capacityWarned = false;
   }
 
   get length(): number {
@@ -78,6 +136,38 @@ export class MessageQueue {
     return this.items.splice(idx, 1)[0];
   }
 
+  getMetrics(): QueueMetrics {
+    return {
+      currentDepth: this.items.length,
+      totalDropped: this.droppedCount,
+      totalExpired: this.expiredCount,
+      averageWaitMs: this.dequeuedCount > 0 ? this.totalWaitMs / this.dequeuedCount : 0,
+    };
+  }
+
+  /** Remove messages that have been waiting longer than maxWaitMs. */
+  private expireStale(): void {
+    const now = Date.now();
+    const cutoff = now - this.maxWaitMs;
+    const before = this.items.length;
+    this.items = this.items.filter((item) => {
+      if (item.queuedAt < cutoff) {
+        this.expiredCount++;
+        log.warn({ requestId: item.requestId, waitMs: now - item.queuedAt }, 'Expiring stale queued message');
+        item.onEvent({
+          type: 'error',
+          message: 'Your queued message was dropped because it waited too long in the queue.',
+          category: 'queue_expired',
+        });
+        return false;
+      }
+      return true;
+    });
+    if (this.items.length < before && this.items.length / MAX_QUEUE_DEPTH < CAPACITY_WARNING_THRESHOLD) {
+      this.capacityWarned = false;
+    }
+  }
+
   [Symbol.iterator](): Iterator<QueuedMessage> {
     return this.items[Symbol.iterator]();
   }
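The expiry and metrics behaviour added to `MessageQueue` can be illustrated with a stripped-down queue. This is a sketch under stated assumptions rather than the class from the diff: `ExpiringQueue` is a hypothetical name, the injectable `now` clock exists only to make the example deterministic, and the per-item `onEvent` notifications and capacity warning are left out.

```typescript
interface Entry<T> {
  value: T;
  queuedAt: number;
}

// Hypothetical, simplified stand-in for the queue in the diff.
class ExpiringQueue<T> {
  private items: Entry<T>[] = [];
  private dropped = 0;
  private expired = 0;
  private dequeued = 0;
  private totalWaitMs = 0;
  private readonly maxDepth: number;
  private readonly maxWaitMs: number;
  private readonly now: () => number;

  constructor(maxDepth: number, maxWaitMs: number, now: () => number = () => Date.now()) {
    this.maxDepth = maxDepth;
    this.maxWaitMs = maxWaitMs;
    this.now = now;
  }

  push(value: T): boolean {
    this.expireStale(); // stale entries must not count against capacity
    if (this.items.length >= this.maxDepth) {
      this.dropped++;
      return false;
    }
    this.items.push({ value, queuedAt: this.now() });
    return true;
  }

  shift(): T | undefined {
    this.expireStale(); // never hand out an entry that has already expired
    const entry = this.items.shift();
    if (!entry) return undefined;
    this.dequeued++;
    this.totalWaitMs += this.now() - entry.queuedAt;
    return entry.value;
  }

  metrics() {
    return {
      currentDepth: this.items.length,
      totalDropped: this.dropped,
      totalExpired: this.expired,
      averageWaitMs: this.dequeued > 0 ? this.totalWaitMs / this.dequeued : 0,
    };
  }

  private expireStale(): void {
    const cutoff = this.now() - this.maxWaitMs;
    const kept = this.items.filter((entry) => entry.queuedAt >= cutoff);
    this.expired += this.items.length - kept.length;
    this.items = kept;
  }
}

// Usage with a fake clock: depth 2, entries expire after 100 ms.
let fakeTime = 0;
const queue = new ExpiringQueue<string>(2, 100, () => fakeTime);
queue.push('a'); // queued at t=0
fakeTime = 50;
queue.push('b'); // queued at t=50
const acceptedC = queue.push('c'); // rejected: queue is full
fakeTime = 120;
const head = queue.shift(); // 'a' has expired, so 'b' is returned after waiting 70 ms
const snapshot = queue.metrics();
```

Running expiry at the start of both `push` and `shift`, as the diff does, keeps the queue honest without needing a background timer.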
diff --git a/assistant/src/daemon/session-surfaces.ts b/assistant/src/daemon/session-surfaces.ts
index aaf614bed00..69adbf85749 100644
--- a/assistant/src/daemon/session-surfaces.ts
+++ b/assistant/src/daemon/session-surfaces.ts
@@ -17,16 +17,13 @@ import {
   getPrebuiltHomeBasePreview,
   findSeededHomeBaseApp,
 } from '../home-base/prebuilt/seed.js';
+import { isPlainObject } from '../util/object.js';
 
 const log = getLogger('session-surfaces');
 
 const MAX_UNDO_DEPTH = 10;
 const TASK_PROGRESS_TEMPLATE_FIELDS = ['title', 'status', 'steps'] as const;
 
-function isPlainObject(value: unknown): value is Record<string, unknown> {
-  return typeof value === 'object' && value != null && !Array.isArray(value);
-}
-
 function normalizeCardShowData(input: Record<string, unknown>, rawData: Record<string, unknown>): CardSurfaceData {
   const normalized: Record<string, unknown> = { ...rawData };
 
@@ -128,6 +125,24 @@ export interface SurfaceSessionContext {
     onEvent: (msg: ServerMessage) => void,
     requestId?: string,
   ): Promise<string>;
+  /** Serialize operations on a given surface to prevent read-modify-write races. */
+  withSurface<T>(surfaceId: string, fn: () => T | Promise<T>): Promise<T>;
+}
+
+/**
+ * Per-surface async mutex using Promise chaining.
+ * Operations on the same surfaceId are serialized; different surfaces run concurrently.
+ */
+export function createSurfaceMutex(): <T>(surfaceId: string, fn: () => T | Promise<T>) => Promise<T> {
+  const chains = new Map<string, Promise<void>>();
+
+  return <T>(surfaceId: string, fn: () => T | Promise<T>): Promise<T> => {
+    const prev = chains.get(surfaceId) ?? Promise.resolve();
+    const next = prev.then(fn, fn);
+    // Keep the chain alive but swallow errors so one failure doesn't block subsequent ops
+    chains.set(surfaceId, next.then(() => {}, () => {}));
+    return next;
+  };
 }
 
 /**
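The per-surface mutex above serializes work by chaining each operation onto the previous promise for the same key. The sketch below restates that idea in a standalone form and shows the read-modify-write race it prevents; `createKeyedMutex`, `sleep`, and `demo` are hypothetical names used only for this illustration.

```typescript
type KeyedMutex = <T>(key: string, fn: () => T | Promise<T>) => Promise<T>;

// Standalone restatement of the promise-chaining idea from the diff.
function createKeyedMutex(): KeyedMutex {
  const chains = new Map<string, Promise<void>>();
  return <T>(key: string, fn: () => T | Promise<T>): Promise<T> => {
    const prev = chains.get(key) ?? Promise.resolve();
    // Run fn only after the previous operation on this key has settled.
    const next = prev.then(() => fn());
    // Store an error-swallowing tail so one failure does not block later callers.
    chains.set(key, next.then(() => {}, () => {}));
    return next;
  };
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Three overlapping read-modify-write increments on the same key.
// Without serialization each would read 0 and the final value would be 1.
async function demo(): Promise<number> {
  const withKey = createKeyedMutex();
  let counter = 0;
  const increment = async () => {
    const read = counter;
    await sleep(5);
    counter = read + 1;
  };
  await Promise.all([
    withKey('surface-1', increment),
    withKey('surface-1', increment),
    withKey('surface-1', increment),
  ]);
  return counter;
}
```

Operations on different keys use separate chains, so they still run concurrently; only callers sharing a key wait for each other.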
diff --git a/assistant/src/daemon/session-tool-setup.ts b/assistant/src/daemon/session-tool-setup.ts
index 66a02326467..4fa8345fa11 100644
--- a/assistant/src/daemon/session-tool-setup.ts
+++ b/assistant/src/daemon/session-tool-setup.ts
@@ -15,6 +15,7 @@ import type { SecretPrompter } from '../permissions/secret-prompter.js';
 import { addRule, findHighestPriorityRule } from '../permissions/trust-store.js';
 import { generateAllowlistOptions, generateScopeOptions, normalizeWebFetchUrl } from '../permissions/checker.js';
 import { getLogger } from '../util/logger.js';
+import { isPlainObject } from '../util/object.js';
 
 const log = getLogger('session-tool-setup');
 import { getAllToolDefinitions } from '../tools/registry.js';
@@ -77,10 +78,6 @@ export function buildToolDefinitions(): ToolDefinition[] {
 
 // ── DoorDash task_progress auto-update ────────────────────────────────
 
-function isPlainObject(value: unknown): value is Record<string, unknown> {
-  return typeof value === 'object' && value != null && !Array.isArray(value);
-}
-
 interface DoordashStep { label: string; status: string; detail?: string }
 
 /**
diff --git a/assistant/src/daemon/session.ts b/assistant/src/daemon/session.ts
index 3a9b503db6f..d7d3e52e430 100644
--- a/assistant/src/daemon/session.ts
+++ b/assistant/src/daemon/session.ts
@@ -38,12 +38,13 @@ import { ContextWindowManager } from '../context/window-manager.js';
 import { getHookManager } from '../hooks/manager.js';
 import { ConflictGate } from './session-conflict-gate.js';
 import { MessageQueue } from './session-queue-manager.js';
-import type { QueueDrainReason } from './session-queue-manager.js';
+import type { QueueDrainReason, QueueMetrics } from './session-queue-manager.js';
 import type { ChannelCapabilities, GuardianRuntimeContext } from './session-runtime-assembly.js';
 import type { AssistantAttachmentDraft } from './assistant-attachments.js';
 import {
   handleSurfaceAction as handleSurfaceActionImpl,
   handleSurfaceUndo as handleSurfaceUndoImpl,
+  createSurfaceMutex,
 } from './session-surfaces.js';
 import {
   undo as undoImpl,
@@ -135,12 +136,14 @@ export class Session {
   /** @internal */ lastSurfaceAction = new Map<string, { actionId: string; data?: Record<string, unknown> }>();
   /** @internal */ surfaceState = new Map<string, { surfaceType: SurfaceType; data: SurfaceData }>();
   /** @internal */ surfaceUndoStacks = new Map<string, string[]>();
+  /** @internal */ withSurface = createSurfaceMutex();
   /** @internal */ currentTurnSurfaces: Array<{ surfaceId: string; surfaceType: SurfaceType; title?: string; data: SurfaceData; actions?: Array<{ id: string; label: string; style?: string }>; display?: string }> = [];
   /** @internal */ onEscalateToComputerUse?: (task: string, sourceSessionId: string) => boolean;
   /** @internal */ workspaceTopLevelContext: string | null = null;
   /** @internal */ workspaceTopLevelDirty = true;
   public readonly traceEmitter: TraceEmitter;
   public memoryPolicy: SessionMemoryPolicy;
+  /** @internal */ streamThinking: boolean;
   /** @internal */ turnCount = 0;
   public lastAssistantAttachments: AssistantAttachmentDraft[] = [];
   public lastAttachmentWarnings: string[] = [];
@@ -195,6 +198,7 @@ export class Session {
     );
 
     const config = getConfig();
+    this.streamThinking = config.thinking.streamThinking ?? false;
     const resolveTools = createResolveToolsCallback(toolDefs, this);
 
     this.agentLoop = new AgentLoop(
@@ -288,6 +292,10 @@ export class Session {
     return this.queue.length;
   }
 
+  getQueueMetrics(): QueueMetrics {
+    return this.queue.getMetrics();
+  }
+
   hasQueuedMessages(): boolean {
     return !this.queue.isEmpty;
   }
diff --git a/assistant/src/daemon/shutdown-handlers.ts b/assistant/src/daemon/shutdown-handlers.ts
new file mode 100644
index 00000000000..0f0d2c70f48
--- /dev/null
+++ b/assistant/src/daemon/shutdown-handlers.ts
@@ -0,0 +1,122 @@
+import * as Sentry from '@sentry/node';
+import { getSqlite, resetDb } from '../memory/db.js';
+import { browserManager } from '../tools/browser/browser-manager.js';
+import { getEnrichmentService } from '../workspace/commit-message-enrichment-service.js';
+import { getLogger } from '../util/logger.js';
+import type { DaemonServer } from './server.js';
+import type { RuntimeHttpServer } from '../runtime/http-server.js';
+import type { HeartbeatService } from '../workspace/heartbeat-service.js';
+import type { AgentHeartbeatService } from '../agent-heartbeat/agent-heartbeat-service.js';
+import type { QdrantManager } from '../memory/qdrant-manager.js';
+import type { HookManager } from '../hooks/manager.js';
+
+const log = getLogger('lifecycle');
+
+export interface ShutdownDeps {
+  server: DaemonServer;
+  heartbeat: HeartbeatService;
+  agentHeartbeat: AgentHeartbeatService;
+  hookManager: HookManager;
+  runtimeHttp: RuntimeHttpServer | null;
+  scheduler: { stop(): void };
+  memoryWorker: { stop(): void };
+  qdrantManager: QdrantManager;
+  cleanupPidFile: () => void;
+}
+
+export function installShutdownHandlers(deps: ShutdownDeps): void {
+  let shuttingDown = false;
+
+  const shutdown = async () => {
+    if (shuttingDown) return; // Prevent re-entrant shutdown
+    shuttingDown = true;
+    log.info('Shutting down daemon...');
+
+    deps.hookManager.stopWatching();
+
+    // Force exit if graceful shutdown takes too long.
+    // Set this BEFORE awaiting heartbeat stop and triggering daemon-stop hooks
+    // so it covers all potentially-blocking async shutdown work.
+    const forceTimer = setTimeout(() => {
+      log.warn('Graceful shutdown timed out, forcing exit');
+      deps.cleanupPidFile();
+      process.exit(1);
+    }, 10_000);
+    forceTimer.unref();
+
+    await deps.heartbeat.stop();
+    await deps.agentHeartbeat.stop();
+
+    try {
+      await deps.hookManager.trigger('daemon-stop', { pid: process.pid });
+    } catch {
+      // Don't let hook failures block shutdown
+    }
+
+    // Commit any uncommitted workspace changes before stopping the server.
+    // This ensures no workspace state is lost during graceful shutdown.
+    try {
+      log.info({ phase: 'pre_stop' }, 'Committing pending workspace changes');
+      await deps.heartbeat.commitAllPending();
+    } catch (err) {
+      log.warn({ err, phase: 'pre_stop' }, 'Shutdown workspace commit failed');
+    }
+
+    await deps.server.stop();
+
+    // Final commit sweep: catch any writes that occurred during server.stop()
+    // (e.g. in-flight tool executions completing during drain).
+    try {
+      log.info({ phase: 'post_stop' }, 'Final workspace commit sweep');
+      await deps.heartbeat.commitAllPending();
+    } catch (err) {
+      log.warn({ err, phase: 'post_stop' }, 'Post-stop workspace commit failed');
+    }
+
+    // Flush in-flight enrichment jobs so shutdown commit notes are not dropped.
+    // The enrichment service's shutdown() drains active jobs and discards pending ones.
+    try {
+      await getEnrichmentService().shutdown();
+    } catch (err) {
+      log.warn({ err }, 'Enrichment service shutdown failed (non-fatal)');
+    }
+
+    if (deps.runtimeHttp) await deps.runtimeHttp.stop();
+    await browserManager.closeAllPages();
+    deps.scheduler.stop();
+    deps.memoryWorker.stop();
+    await deps.qdrantManager.stop();
+
+    // Checkpoint WAL and close SQLite so no writes are lost on exit.
+    // Checkpoint and close are in separate try blocks so that close()
+    // always runs even if checkpointing throws (e.g. SQLITE_BUSY).
+    try {
+      getSqlite().exec('PRAGMA wal_checkpoint(TRUNCATE)');
+    } catch (err) {
+      log.warn({ err }, 'WAL checkpoint failed (non-fatal)');
+    }
+    try {
+      resetDb();
+    } catch (err) {
+      log.warn({ err }, 'Database close failed (non-fatal)');
+    }
+
+    await Sentry.flush(2000);
+    clearTimeout(forceTimer);
+    deps.cleanupPidFile();
+    process.exit(0);
+  };
+
+  process.on('SIGTERM', shutdown);
+  process.on('SIGINT', shutdown);
+
+  process.on('unhandledRejection', (reason) => {
+    log.error({ err: reason }, 'Unhandled promise rejection');
+    Sentry.captureException(reason);
+  });
+
+  process.on('uncaughtException', (err) => {
+    log.error({ err }, 'Uncaught exception');
+    Sentry.captureException(err);
+  });
+}
diff --git a/assistant/src/index.ts b/assistant/src/index.ts
index af39aaf7bcc..0fd5c762987 100755
--- a/assistant/src/index.ts
+++ b/assistant/src/index.ts
@@ -28,6 +28,7 @@ import { registerDoordashCommand } from './cli/doordash.js';
 import { registerAmazonCommand } from './cli/amazon.js';
 import { registerTwitterCommand } from './cli/twitter.js';
 import { registerMapCommand } from './cli/map.js';
+import { registerInfluencerCommand } from './cli/influencer.js';
 
 const program = new Command();
 
@@ -55,5 +56,6 @@ registerCompletionsCommand(program);
 
 registerTwitterCommand(program);
 registerMapCommand(program);
+registerInfluencerCommand(program);
 
 program.parse();
diff --git a/assistant/src/influencer/client.ts b/assistant/src/influencer/client.ts
new file mode 100644
index 00000000000..0b2c469dce3
--- /dev/null
+++ b/assistant/src/influencer/client.ts
@@ -0,0 +1,1104 @@
+/**
+ * Influencer Research Client
+ *
+ * ARCHITECTURE
+ * ============
+ * All scraping runs inside Chrome browser tabs via the extension relay. The
+ * relay's evaluate command uses CDP Runtime.evaluate (via chrome.debugger API)
+ * as a fallback, which bypasses strict CSP on sites like Instagram.
+ *
+ * The user must be logged into Instagram, TikTok, and/or X in their Chrome
+ * browser for this to work.
+ *
+ * INSTAGRAM DISCOVERY FLOW
+ * ========================
+ * Instagram's search at /explore/search/keyword/?q=... returns a grid of POSTS
+ * (not profiles). To discover influencers:
+ *   1. Search by keyword → get grid of post links (/p/ and /reel/)
+ *   2. Visit each post → extract the author username from page text
+ *   3. Deduplicate usernames
+ *   4. Visit each unique profile → scrape stats from meta[name="description"]
+ *      which reliably contains "49K Followers, 463 Following, 551 Posts - ..."
+ *   5. Filter by criteria and rank
+ *
+ * TIKTOK DISCOVERY FLOW
+ * =====================
+ * TikTok has a dedicated user search at /search/user?q=... which returns
+ * profile cards directly with follower counts and bios.
+ *
+ * X/TWITTER DISCOVERY FLOW
+ * ========================
+ * X has a people search at /search?q=...&f=user which returns UserCell
+ * components with profile data.
+ *
+ * EVALUATE SCRIPTS
+ * ================
+ * All scripts passed to evalInTab() are wrapped in (function(){ ... })() by
+ * the relay's CDP Runtime.evaluate. Use `return` to return values. Results
+ * should be JSON strings for complex data.
+ *
+ * LIMITATIONS
+ * ===========
+ *   - Requires the user to be logged in on each platform in Chrome
+ *   - Rate limiting may apply; built-in delays of 1.5-3s between navigations
+ *   - Platform HTML structures change frequently; selectors may need updates
+ *   - The chrome.debugger API shows a yellow infobar on the tab being debugged
+ */
+
+import { extensionRelayServer } from '../browser-extension-relay/server.js';
+import type { ExtensionCommand, ExtensionResponse } from '../browser-extension-relay/protocol.js';
+import { readHttpToken } from '../util/platform.js';
+
+// ---------------------------------------------------------------------------
+// Types
+// ---------------------------------------------------------------------------
+
+export interface InfluencerSearchCriteria {
+  /** Keywords, niche, or topic to search for */
+  query: string;
+  /** Platforms to search on */
+  platforms?: ('instagram' | 'tiktok' | 'twitter')[];
+  /** Minimum follower count */
+  minFollowers?: number;
+  /** Maximum follower count */
+  maxFollowers?: number;
+  /** Maximum number of results per platform */
+  limit?: number;
+  /** Language/locale filter */
+  language?: string;
+  /** Look for verified accounts only */
+  verifiedOnly?: boolean;
+}
+
+export interface InfluencerProfile {
+  /** Platform the profile was found on */
+  platform: 'instagram' | 'tiktok' | 'twitter';
+  /** Username/handle */
+  username: string;
+  /** Display name */
+  displayName: string;
+  /** Profile URL */
+  profileUrl: string;
+  /** Bio/description */
+  bio: string;
+  /** Follower count (numeric) */
+  followers: number | undefined;
+  /** Follower count (display string, e.g. "1.2M") */
+  followersDisplay: string;
+  /** Following count */
+  following: number | undefined;
+  /** Post/video count */
+  postCount: number | undefined;
+  /** Whether the account is verified */
+  isVerified: boolean;
+  /** Profile picture URL */
+  avatarUrl: string | undefined;
+  /** Engagement rate estimate (if available) */
+  engagementRate: number | undefined;
+  /** Average likes per post (if available from recent posts) */
+  avgLikes: number | undefined;
+  /** Average comments per post (if available from recent posts) */
+  avgComments: number | undefined;
+  /** Content categories/themes detected from bio and recent posts */
+  contentThemes: string[];
+  /** Recent post captions/snippets for context */
+  recentPosts: { text: string; likes?: number; comments?: number }[];
+  /** Raw score for ranking */
+  relevanceScore: number;
+}
+
+export interface InfluencerSearchResult {
+  platform: string;
+  profiles: InfluencerProfile[];
+  count: number;
+  query: string;
+  error?: string;
+}
+
+// ---------------------------------------------------------------------------
+// Relay command routing (same pattern as Amazon client)
+// ---------------------------------------------------------------------------
+
+async function sendRelayCommand(command: Record<string, unknown>): Promise<ExtensionResponse> {
+  const status = extensionRelayServer.getStatus();
+  if (status.connected) {
+    return extensionRelayServer.sendCommand(command as Omit<ExtensionCommand, 'id'>);
+  }
+
+  // Fall back to HTTP relay endpoint on the daemon
+  const token = readHttpToken();
+  if (!token) {
+    throw new Error(
+      'Browser extension relay is not connected and no HTTP token found. Is the daemon running?',
+    );
+  }
+
+  const resp = await fetch('http://127.0.0.1:7821/v1/browser-relay/command', {
+    method: 'POST',
+    headers: {
+      'Content-Type': 'application/json',
+      Authorization: `Bearer ${token}`,
+    },
+    body: JSON.stringify(command),
+  });
+
+  if (!resp.ok) {
+    const body = await resp.text();
+    throw new Error(`Relay HTTP command failed (${resp.status}): ${body}`);
+  }
+
+  return (await resp.json()) as ExtensionResponse;
+}
+
+// ---------------------------------------------------------------------------
+// Tab management & eval
+// ---------------------------------------------------------------------------
+
+async function findOrOpenTab(urlPattern: string, fallbackUrl: string): Promise<number> {
+  const resp = await sendRelayCommand({ action: 'find_tab', url: urlPattern });
+  if (resp.success && resp.tabId !== undefined) {
+    return resp.tabId;
+  }
+
+  const newTab = await sendRelayCommand({ action: 'new_tab', url: fallbackUrl });
+  if (!newTab.success || newTab.tabId === undefined) {
+    throw new Error(`Could not open tab for ${fallbackUrl}`);
+  }
+
+  await sleep(2500);
+  return newTab.tabId;
+}
+
+async function navigateTab(tabId: number, url: string): Promise<void> {
+  const resp = await sendRelayCommand({ action: 'navigate', tabId, url });
+  if (!resp.success) {
+    throw new Error(`Failed to navigate: ${resp.error ?? 'unknown error'}`);
+  }
+  await sleep(3000);
+}
+
+/**
+ * Evaluate a JS script in a tab. The script is wrapped in an IIFE by the relay
+ * so use `return` to yield a value. For complex results, return a JSON string.
+ */
+async function evalInTab(tabId: number, script: string): Promise<unknown> {
+  const resp = await sendRelayCommand({ action: 'evaluate', tabId, code: script });
+  if (!resp.success) {
+    throw new Error(`Browser eval failed: ${resp.error ?? 'unknown error'}`);
+  }
+  return resp.result;
+}
+
+function sleep(ms: number): Promise<void> {
+  return new Promise((resolve) => setTimeout(resolve, ms));
+}
+
+// ---------------------------------------------------------------------------
+// Follower count parser
+// ---------------------------------------------------------------------------
+
+function parseFollowerCount(text: string): number | undefined {
+  if (!text) return undefined;
+  const cleaned = text.toLowerCase().replace(/,/g, '').replace(/\s+/g, '').trim();
+  const match = cleaned.match(/([\d.]+)\s*([kmbt]?)/);
+  if (!match) return undefined;
+
+  const num = parseFloat(match[1]);
+  if (Number.isNaN(num)) return undefined; // e.g. a lone "." matched the digit class
+  const suffix = match[2];
+  const multipliers: Record<string, number> = {
+    '': 1,
+    k: 1_000,
+    m: 1_000_000,
+    b: 1_000_000_000,
+    t: 1_000_000_000_000,
+  };
+  return Math.round(num * (multipliers[suffix] || 1));
+}
+
+// ---------------------------------------------------------------------------
+// Instagram scraping
+// ---------------------------------------------------------------------------
+
+/**
+ * Search Instagram for influencers by keyword.
+ *
+ * Strategy: search by keyword → extract post links → visit each post to find
+ * the author → deduplicate → visit each unique profile for stats.
+ */
+async function searchInstagram(
+  criteria: InfluencerSearchCriteria,
+): Promise<InfluencerProfile[]> {
+  const limit = criteria.limit ?? 10;
+  const tabId = await findOrOpenTab('*://*.instagram.com/*', 'https://www.instagram.com');
+
+  // Step 1: Navigate to keyword search (shows a grid of posts)
+  const searchUrl = `https://www.instagram.com/explore/search/keyword/?q=${encodeURIComponent(criteria.query)}`;
+  await navigateTab(tabId, searchUrl);
+  await sleep(2000);
+
+  // Step 2: Extract post links from the search grid
+  const postLinksRaw = await evalInTab(tabId, `
+    var links = [];
+    document.querySelectorAll('a[href]').forEach(function(a) {
+      var h = a.getAttribute('href');
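+      // The grid can link the same post more than once; keep each href only once
+      if (h && (h.indexOf('/p/') > -1 || h.indexOf('/reel/') > -1) && links.indexOf(h) === -1) links.push(h);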
+    });
+    return JSON.stringify(links.slice(0, ${limit * 2}));
+  `);
+
+  let postLinks: string[];
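+  try {
+    const parsed: unknown = JSON.parse(String(postLinksRaw));
+    // Guard against a non-array payload (e.g. "null") before reading .length
+    postLinks = Array.isArray(parsed) ? parsed.map(String) : [];
+  } catch {
+    postLinks = [];
+  }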
+
+  if (postLinks.length === 0) {
+    return [];
+  }
+
+  // Step 3: Visit each post to extract the author username
+  const seenUsernames = new Set<string>();
+  const authorUsernames: string[] = [];
+
+  // Navigation skip list — known non-profile IG paths
+  const skipUsernames = new Set([
+    'reels', 'explore', 'stories', 'direct', 'accounts', 'about',
+    'p', 'reel', 'tv', 'search', 'nametag', 'directory', '',
+  ]);
+
+  for (const postLink of postLinks) {
+    if (authorUsernames.length >= limit) break;
+
+    try {
+      // Resolve against the origin so both relative and absolute hrefs work
+      await navigateTab(tabId, new URL(postLink, 'https://www.instagram.com').toString());
+      await sleep(1000);
+
+      // Extract the author username from the post page.
+      // The post page body text starts with navigation items, then shows:
+      //   "username\n...audio info...\nFollow\nusername\n..."
+      // We look for the first profile link that isn't a nav item.
+      const authorRaw = await evalInTab(tabId, `
+        var bodyText = document.body.innerText;
+        // The author name appears after navigation elements, usually right before "Follow"
+        // Also try extracting from links
+        var links = document.querySelectorAll('a[href]');
+        var skip = ['', 'reels', 'explore', 'stories', 'direct', 'accounts', 'about',
+                     'p', 'reel', 'tv', 'search', 'nametag', 'directory'];
+        var navLabels = ['Instagram', 'Home', 'HomeHome', 'Reels', 'ReelsReels', 'Messages',
+                         'MessagesMessages', 'Search', 'SearchSearch', 'Explore', 'ExploreExplore',
+                         'Notifications', 'NotificationsNotifications', 'Create', 'New postCreate',
+                         'Profile', 'More', 'SettingsMore', 'Also from Meta', 'Also from MetaAlso from Meta'];
+        var author = null;
+        for (var i = 0; i < links.length; i++) {
+          var href = links[i].getAttribute('href') || '';
+          var text = links[i].textContent.trim();
+          var match = href.match(/^\\/([a-zA-Z0-9_.]+)\\/$/);
+          if (!match) continue;
+          var username = match[1];
+          if (skip.indexOf(username) > -1) continue;
+          if (navLabels.indexOf(text) > -1) continue;
+          // Skip the logged-in user's profile link (usually "Profile" or their own name in nav)
+          if (text === 'Profile' || text === '') continue;
+          author = username;
+          break;
+        }
+        // Fallback: parse from body text — look for the pattern after "Follow\\n"
+        if (!author) {
+          var followIdx = bodyText.indexOf('Follow\\n');
+          if (followIdx > -1) {
+            var afterFollow = bodyText.substring(followIdx + 7, followIdx + 50);
+            var lineEnd = afterFollow.indexOf('\\n');
+            if (lineEnd > -1) {
+              author = afterFollow.substring(0, lineEnd).trim();
+            }
+          }
+        }
+        return author;
+      `);
+
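+      const authorUsername = String(authorRaw || '').trim();
+      // The body-text fallback can return arbitrary text, so accept only valid handles
+      const isHandle = /^[a-zA-Z0-9_.]+$/.test(authorUsername);
+      if (isHandle && !skipUsernames.has(authorUsername) && !seenUsernames.has(authorUsername)) {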
+        seenUsernames.add(authorUsername);
+        authorUsernames.push(authorUsername);
+      }
+    } catch {
+      // Skip posts that fail
+      continue;
+    }
+  }
+
+  if (authorUsernames.length === 0) {
+    return [];
+  }
+
+  // Step 4: Visit each unique profile to scrape stats
+  const profiles: InfluencerProfile[] = [];
+
+  for (const username of authorUsernames) {
+    try {
+      const profile = await scrapeInstagramProfile(tabId, username, criteria);
+      if (profile && matchesCriteria(profile, criteria)) {
+        profiles.push(profile);
+      }
+      await sleep(1500);
+    } catch {
+      continue;
+    }
+  }
+
+  return profiles;
+}
+
+/**
+ * Scrape a single Instagram profile page for stats.
+ *
+ * The most reliable data source is the meta[name="description"] tag which
+ * contains: "49K Followers, 463 Following, 551 Posts - Display Name (@username)
+ * on Instagram: "bio text""
+ *
+ * Falls back to parsing from body text.
+ */
+async function scrapeInstagramProfile(
+  tabId: number,
+  username: string,
+  criteria: InfluencerSearchCriteria,
+): Promise<InfluencerProfile | null> {
+  await navigateTab(tabId, `https://www.instagram.com/${username}/`);
+  await sleep(2000);
+
+  const raw = await evalInTab(tabId, `
+    var r = { username: ${JSON.stringify(username)} };
+
+    // Primary source: meta description tag
+    // Format: "49K Followers, 463 Following, 551 Posts - Display Name (@user) on Instagram: \\"bio\\""
+    var meta = document.querySelector('meta[name="description"]');
+    r.meta = meta ? meta.getAttribute('content') : '';
+
+    // Parse meta for structured data
+    if (r.meta) {
+      var fMatch = r.meta.match(/([\\d,.]+[KkMmBb]?)\\s*Follower/i);
+      var fgMatch = r.meta.match(/([\\d,.]+[KkMmBb]?)\\s*Following/i);
+      var pMatch = r.meta.match(/([\\d,.]+[KkMmBb]?)\\s*Post/i);
+      r.followers = fMatch ? fMatch[1] : '';
+      r.following = fgMatch ? fgMatch[1] : '';
+      r.posts = pMatch ? pMatch[1] : '';
+
+      // Display name: between "Posts - " and " (@"
+      var nameMatch = r.meta.match(/Posts\\s*-\\s*(.+?)\\s*\\(@/);
+      r.displayName = nameMatch ? nameMatch[1].trim() : '';
+
+      // Bio: after 'on Instagram: "' until end quote
+      var bioMatch = r.meta.match(/on Instagram:\\s*"(.+?)"/);
+      r.bio = bioMatch ? bioMatch[1] : '';
+    }
+
+    // Fallback: parse from body text
+    var bodyText = document.body.innerText;
+    if (!r.followers) {
+      var bfMatch = bodyText.match(/([\\d,.]+[KkMmBb]?)\\s*followers/i);
+      r.followers = bfMatch ? bfMatch[1] : '';
+    }
+    if (!r.following) {
+      var bgMatch = bodyText.match(/([\\d,.]+[KkMmBb]?)\\s*following/i);
+      r.following = bgMatch ? bgMatch[1] : '';
+    }
+    if (!r.posts) {
+      var bpMatch = bodyText.match(/([\\d,.]+[KkMmBb]?)\\s*posts/i);
+      r.posts = bpMatch ? bpMatch[1] : '';
+    }
+
+    // Verified status
+    r.isVerified = bodyText.indexOf('Verified') > -1;
+
+    // Bio fallback: grab the text between "following" and "Follow" button
+    if (!r.bio) {
+      var followingIdx = bodyText.indexOf(' following');
+      if (followingIdx > -1) {
+        var afterFollowing = bodyText.substring(followingIdx + 10, followingIdx + 400);
+        // Cut at common boundaries
+        var cutPoints = ['Follow', 'Message', 'Meta', 'About'];
+        var minCut = afterFollowing.length;
+        for (var c = 0; c < cutPoints.length; c++) {
+          var idx = afterFollowing.indexOf(cutPoints[c]);
+          if (idx > -1 && idx < minCut) minCut = idx;
+        }
+        r.bio = afterFollowing.substring(0, minCut).trim();
+      }
+    }
+
+    // Avatar
+    var avatarEl = document.querySelector('header img') ||
+                   document.querySelector('img[alt*="profile"]');
+    r.avatarUrl = avatarEl ? avatarEl.getAttribute('src') : null;
+
+    return JSON.stringify(r);
+  `);
+
+  let data: Record<string, unknown>;
+  try {
+    data = JSON.parse(String(raw));
+  } catch {
+    return null;
+  }
+
+  const followersNum = parseFollowerCount(String(data.followers || ''));
+  const followingNum = parseFollowerCount(String(data.following || ''));
+  const postCount = parseFollowerCount(String(data.posts || ''));
+
+  return {
+    platform: 'instagram',
+    username,
+    displayName: String(data.displayName || username),
+    profileUrl: `https://www.instagram.com/${username}/`,
+    bio: String(data.bio || ''),
+    followers: followersNum,
+    followersDisplay: String(data.followers || 'unknown'),
+    following: followingNum,
+    postCount,
+    isVerified: Boolean(data.isVerified),
+    avatarUrl: data.avatarUrl ? String(data.avatarUrl) : undefined,
+    engagementRate: undefined,
+    avgLikes: undefined,
+    avgComments: undefined,
+    contentThemes: extractThemes(String(data.bio || '') + ' ' + String(data.meta || ''), criteria.query),
+    recentPosts: [],
+    relevanceScore: 0,
+  };
+}
+
+// ---------------------------------------------------------------------------
+// TikTok scraping
+// ---------------------------------------------------------------------------
+
+/**
+ * Search TikTok for influencers by keyword.
+ *
+ * TikTok's user search at /search/user?q=... renders a list where each card
+ * produces a predictable text pattern in innerText:
+ *
+ *   DisplayName
+ *   username
+ *   77.9K          (follower count)
+ *   Followers
+ *   ·
+ *   1.5M           (like count)
+ *   Likes
+ *   Follow
+ *
+ * DOM class-based selectors are unreliable on TikTok (obfuscated class names),
+ * so we parse this text pattern directly.
+ */
+async function searchTikTok(
+  criteria: InfluencerSearchCriteria,
+): Promise<InfluencerProfile[]> {
+  const limit = criteria.limit ?? 10;
+  const tabId = await findOrOpenTab('*://*.tiktok.com/*', 'https://www.tiktok.com');
+
+  const searchUrl = `https://www.tiktok.com/search/user?q=${encodeURIComponent(criteria.query)}`;
+  await navigateTab(tabId, searchUrl);
+  await sleep(3000);
+
+  // Scroll to load more results
+  await evalInTab(tabId, `window.scrollTo(0, document.body.scrollHeight); return 'scrolled'`);
+  await sleep(2000);
+
+  // Parse the text pattern: DisplayName, username, count, "Followers", "·", count, "Likes"
+  const raw = await evalInTab(tabId, `
+    var text = document.body.innerText;
+    var lines = text.split('\\n').map(function(l) { return l.trim(); }).filter(function(l) { return l.length > 0; });
+    var users = [];
+    for (var i = 0; i < lines.length - 6; i++) {
+      if (lines[i+2] &&
+          lines[i+2].match(/^[\\d,.]+[KkMmBb]?$/) &&
+          lines[i+3] === 'Followers' &&
+          lines[i+4] === '·' &&
+          lines[i+6] === 'Likes') {
+        var username = lines[i+1];
+        if (!username.match(/^[a-zA-Z0-9_.]+$/)) continue;
+        users.push({
+          displayName: lines[i],
+          username: username,
+          followers: lines[i+2],
+          likes: lines[i+5],
+        });
+        i += 7;
+      }
+    }
+    return JSON.stringify(users.slice(0, ${limit * 2}));
+  `);
+
+  let searchResults: Array<{
+    username: string;
+    displayName: string;
+    followers: string;
+    likes: string;
+  }>;
+  try {
+    searchResults = JSON.parse(String(raw));
+  } catch {
+    return [];
+  }
+  if (!Array.isArray(searchResults)) return [];
+
+  // Convert to profiles — we only have basic data from search, no bios yet
+  const profiles: InfluencerProfile[] = searchResults.map((p) => ({
+    platform: 'tiktok' as const,
+    username: p.username,
+    displayName: p.displayName || p.username,
+    profileUrl: `https://www.tiktok.com/@${p.username}`,
+    bio: '',
+    followers: parseFollowerCount(p.followers),
+    followersDisplay: p.followers || 'unknown',
+    following: undefined,
+    postCount: undefined,
+    isVerified: false,
+    avatarUrl: undefined,
+    engagementRate: undefined,
+    avgLikes: undefined,
+    avgComments: undefined,
+    contentThemes: extractThemes(p.displayName, criteria.query),
+    recentPosts: [],
+    relevanceScore: 0,
+  }));
+
+  // Pre-filter on follower bounds only: search cards carry no verified flag, so
+  // applying verifiedOnly here would discard every result before enrichment.
+  const filtered = profiles.filter((p) =>
+    matchesCriteria(p, { ...criteria, verifiedOnly: false }),
+  );
+
+  // Enrich with bios by visiting each profile, then apply the full criteria
+  const enriched: InfluencerProfile[] = [];
+  for (const profile of filtered.slice(0, limit)) {
+    let candidate = profile;
+    try {
+      candidate = (await scrapeTikTokProfile(tabId, profile.username, criteria)) ?? profile;
+      await sleep(1500);
+    } catch {
+      // Keep the search-card data if the profile visit fails
+    }
+    if (matchesCriteria(candidate, criteria)) enriched.push(candidate);
+  }
+
+  return enriched;
+}
+
+/**
+ * Scrape a single TikTok profile page for detailed stats.
+ *
+ * TikTok profile pages show stats and bio in the body text. We use a
+ * combination of data-e2e selectors (when they work) and body text regex
+ * as a fallback. The bio is also extracted from the region between
+ * "Following" and "Videos" in the body text.
+ */
+async function scrapeTikTokProfile(
+  tabId: number,
+  username: string,
+  criteria: InfluencerSearchCriteria,
+): Promise<InfluencerProfile | null> {
+  await navigateTab(tabId, `https://www.tiktok.com/@${username}`);
+  await sleep(2500);
+
+  const raw = await evalInTab(tabId, `
+    var r = { username: ${JSON.stringify(username)} };
+    var bodyText = document.body.innerText;
+
+    // Stats from body text (most reliable)
+    var fMatch = bodyText.match(/([\\d,.]+[KkMmBb]?)\\s*[Ff]ollower/);
+    var fgMatch = bodyText.match(/([\\d,.]+[KkMmBb]?)\\s*[Ff]ollowing/);
+    var lMatch = bodyText.match(/([\\d,.]+[KkMmBb]?)\\s*[Ll]ike/);
+    r.followers = fMatch ? fMatch[1] : '';
+    r.following = fgMatch ? fgMatch[1] : '';
+    r.likes = lMatch ? lMatch[1] : '';
+
+    // Bio: try data-e2e selector first, fall back to text parsing
+    var bioEl = document.querySelector('[data-e2e="user-bio"]') ||
+                document.querySelector('h2[data-e2e="user-subtitle"]');
+    r.bio = bioEl ? bioEl.textContent.trim() : '';
+
+    if (!r.bio) {
+      // Fallback: extract bio from between "Following" and "Videos" in body text
+      var followingIdx = bodyText.indexOf('Following');
+      if (followingIdx > -1) {
+        var chunk = bodyText.substring(followingIdx + 10, followingIdx + 500);
+        var videosIdx = chunk.indexOf('Videos');
+        if (videosIdx > -1) chunk = chunk.substring(0, videosIdx);
+        // Also cut at the "Liked" or "Reposts" tab labels
+        var stops = ['Liked', 'Reposts'];
+        for (var s = 0; s < stops.length; s++) {
+          var stopIdx = chunk.indexOf(stops[s]);
+          if (stopIdx > -1) chunk = chunk.substring(0, stopIdx);
+        }
+        r.bio = chunk.trim();
+      }
+    }
+
+    // Display name: try data-e2e, fall back to page title
+    var nameEl = document.querySelector('[data-e2e="user-title"]');
+    r.displayName = nameEl ? nameEl.textContent.trim() : '';
+    if (!r.displayName) {
+      // TikTok titles are often "displayname (@username) | TikTok"
+      var titleMatch = document.title.match(/^(.+?)\\s*\\(@/);
+      r.displayName = titleMatch ? titleMatch[1].trim() : ${JSON.stringify(username)};
+    }
+
+    // Verified
+    r.isVerified = bodyText.indexOf('Verified') > -1 ||
+                   !!document.querySelector('svg[class*="verify"]') ||
+                   !!document.querySelector('[class*="verified"]');
+
+    // Avatar
+    var img = document.querySelector('img[class*="avatar"]') ||
+              document.querySelector('img[src*="tiktokcdn"]');
+    r.avatarUrl = img ? img.getAttribute('src') : null;
+
+    return JSON.stringify(r);
+  `);
+
+  let data: Record<string, unknown>;
+  try {
+    data = JSON.parse(String(raw));
+  } catch {
+    return null;
+  }
+
+  const bio = String(data.bio || '');
+
+  return {
+    platform: 'tiktok',
+    username,
+    displayName: String(data.displayName || username),
+    profileUrl: `https://www.tiktok.com/@${username}`,
+    bio,
+    followers: parseFollowerCount(String(data.followers || '')),
+    followersDisplay: String(data.followers || 'unknown'),
+    following: parseFollowerCount(String(data.following || '')),
+    postCount: undefined,
+    isVerified: Boolean(data.isVerified),
+    avatarUrl: data.avatarUrl ? String(data.avatarUrl) : undefined,
+    engagementRate: undefined,
+    avgLikes: undefined,
+    avgComments: undefined,
+    contentThemes: extractThemes(bio, criteria.query),
+    recentPosts: [],
+    relevanceScore: 0,
+  };
+}
+
+// ---------------------------------------------------------------------------
+// X / Twitter scraping
+// ---------------------------------------------------------------------------
+
+/**
+ * Search X/Twitter for influencers by keyword.
+ *
+ * X has a people search at /search?q=...&f=user. Results are rendered as
+ * [data-testid="UserCell"] components. Each cell's innerText follows this
+ * pattern:
+ *
+ *   [Followed by X and Y others]   (optional social proof line)
+ *   Display Name
+ *   @username
+ *   Follow
+ *   Bio text...
+ *
+ * We parse the @username from the text (the DOM selector approach picks up
+ * "Followed by..." text instead of handles). After extracting from search,
+ * we visit each profile to get follower counts since the search page doesn't
+ * include them.
+ *
+ * NOTE: Keep search queries SHORT (2-4 words). X returns "No results" for
+ * long multi-word people searches.
+ */
+async function searchTwitter(
+  criteria: InfluencerSearchCriteria,
+): Promise<InfluencerProfile[]> {
+  const limit = criteria.limit ?? 10;
+  const tabId = await findOrOpenTab('*://*.x.com/*', 'https://x.com');
+
+  // Use a short query — X people search fails with long queries
+  const queryWords = criteria.query.trim().split(/\s+/).slice(0, 4).join(' ');
+  const searchUrl = `https://x.com/search?q=${encodeURIComponent(queryWords)}&f=user`;
+  await navigateTab(tabId, searchUrl);
+  await sleep(4000);
+
+  // Scroll to load more results
+  await evalInTab(tabId, `window.scrollTo(0, 800); return 'ok'`);
+  await sleep(2000);
+  await evalInTab(tabId, `window.scrollTo(0, document.body.scrollHeight); return 'ok'`);
+  await sleep(2000);
+
+  // Extract profiles from UserCell components using text pattern parsing
+  const raw = await evalInTab(tabId, `
+    var cells = document.querySelectorAll('[data-testid="UserCell"]');
+    var results = [];
+    var seen = {};
+    for (var j = 0; j < cells.length; j++) {
+      var text = cells[j].innerText;
+      var lines = text.split('\\n').map(function(l) { return l.trim(); }).filter(function(l) { return l.length > 0; });
+
+      var username = '';
+      var displayName = '';
+      var bio = '';
+      for (var k = 0; k < lines.length; k++) {
+        var m = lines[k].match(/^@([a-zA-Z0-9_]+)$/);
+        if (m) {
+          username = m[1];
+          // Display name is the line before @username (unless it's "Followed by...")
+          if (k > 0 && !lines[k-1].startsWith('Followed')) {
+            displayName = lines[k-1];
+          } else if (k > 1) {
+            displayName = lines[k-2] || '';
+          }
+          // Bio is everything after "Follow" button text
+          var afterFollow = false;
+          for (var n = k + 1; n < lines.length; n++) {
+            if (lines[n] === 'Follow') { afterFollow = true; continue; }
+            if (afterFollow) {
+              bio = lines.slice(n).join(' ').substring(0, 250);
+              break;
+            }
+          }
+          break;
+        }
+      }
+
+      if (!username || seen[username]) continue;
+      seen[username] = true;
+      if (!displayName || displayName.startsWith('Followed')) displayName = username;
+
+      var verified = !!cells[j].querySelector('svg[data-testid="icon-verified"]');
+      var img = cells[j].querySelector('img[src*="profile_images"]');
+
+      results.push({
+        username: username,
+        displayName: displayName,
+        bio: bio,
+        isVerified: verified,
+        avatarUrl: img ? img.getAttribute('src') : null,
+      });
+    }
+    return JSON.stringify(results.slice(0, ${limit * 3}));
+  `);
+
+  let searchResults: Array<{
+    username: string;
+    displayName: string;
+    bio: string;
+    isVerified: boolean;
+    avatarUrl: string | null;
+  }>;
+  try {
+    searchResults = JSON.parse(String(raw));
+  } catch {
+    return [];
+  }
+  if (!Array.isArray(searchResults)) return [];
+
+  if (searchResults.length === 0) return [];
+
+  // Visit each profile to get follower counts (search results don't include them)
+  const profiles: InfluencerProfile[] = [];
+  for (const sr of searchResults.slice(0, limit)) {
+    try {
+      const profile = await scrapeTwitterProfile(tabId, sr.username, criteria);
+      if (profile && matchesCriteria(profile, criteria)) {
+        profiles.push(profile);
+      }
+      await sleep(1500);
+    } catch {
+      // Fall back to search data if the profile visit fails, still honoring the criteria
+      const fallback: InfluencerProfile = {
+        platform: 'twitter',
+        username: sr.username,
+        displayName: sr.displayName,
+        profileUrl: `https://x.com/${sr.username}`,
+        bio: sr.bio,
+        followers: undefined,
+        followersDisplay: 'unknown',
+        following: undefined,
+        postCount: undefined,
+        isVerified: sr.isVerified,
+        avatarUrl: sr.avatarUrl ?? undefined,
+        engagementRate: undefined,
+        avgLikes: undefined,
+        avgComments: undefined,
+        contentThemes: extractThemes(sr.bio, criteria.query),
+        recentPosts: [],
+        relevanceScore: 0,
+      };
+      if (matchesCriteria(fallback, criteria)) profiles.push(fallback);
+    }
+  }
+
+  return profiles;
+}
+
+/**
+ * Scrape a single X/Twitter profile page for detailed stats.
+ *
+ * Uses a combination of data-testid selectors (reliable on X) and body text
+ * regex for follower/following counts. The data-testid="UserName",
+ * data-testid="UserDescription" selectors work well on X profile pages.
+ * Follower counts are extracted from body text as the DOM structure for
+ * stat links varies.
+ */
+async function scrapeTwitterProfile(
+  tabId: number,
+  username: string,
+  _criteria: InfluencerSearchCriteria,
+): Promise<InfluencerProfile | null> {
+  await navigateTab(tabId, `https://x.com/${username}`);
+  await sleep(2500);
+
+  const raw = await evalInTab(tabId, `
+    var r = { username: ${JSON.stringify(username)} };
+
+    // Display name from UserName testid
+    var nameEl = document.querySelector('[data-testid="UserName"]');
+    if (nameEl) {
+      var spans = nameEl.querySelectorAll('span');
+      if (spans.length > 0) r.displayName = spans[0].textContent.trim();
+    }
+
+    // Bio from UserDescription testid
+    var bioEl = document.querySelector('[data-testid="UserDescription"]');
+    r.bio = bioEl ? bioEl.textContent.trim() : '';
+
+    // Follower/following counts from body text (most reliable)
+    var bodyText = document.body.innerText;
+    var fMatch = bodyText.match(/([\\.\\d,]+[KkMm]?)\\s*Follower/);
+    var fgMatch = bodyText.match(/([\\.\\d,]+[KkMm]?)\\s*Following/);
+    r.followers = fMatch ? fMatch[1] : '';
+    r.following = fgMatch ? fgMatch[1] : '';
+
+    // Verified
+    r.isVerified = !!document.querySelector('svg[data-testid="icon-verified"]') ||
+                   !!document.querySelector('[aria-label*="Verified"]');
+
+    // Avatar
+    var img = document.querySelector('img[src*="profile_images"]');
+    r.avatarUrl = img ? img.getAttribute('src') : null;
+
+    return JSON.stringify(r);
+  `);
+
+  let data: Record<string, unknown>;
+  try {
+    data = JSON.parse(String(raw));
+  } catch {
+    return null;
+  }
+
+  return {
+    platform: 'twitter',
+    username,
+    displayName: String(data.displayName || username),
+    profileUrl: `https://x.com/${username}`,
+    bio: String(data.bio || ''),
+    followers: parseFollowerCount(String(data.followers || '')),
+    followersDisplay: String(data.followers || 'unknown'),
+    following: parseFollowerCount(String(data.following || '')),
+    postCount: undefined,
+    isVerified: Boolean(data.isVerified),
+    avatarUrl: data.avatarUrl ? String(data.avatarUrl) : undefined,
+    engagementRate: undefined,
+    avgLikes: undefined,
+    avgComments: undefined,
+    contentThemes: extractThemes(String(data.bio || ''), ''),
+    recentPosts: [],
+    relevanceScore: 0,
+  };
+}
+
+// ---------------------------------------------------------------------------
+// Scoring & filtering
+// ---------------------------------------------------------------------------
+
+function matchesCriteria(
+  profile: InfluencerProfile,
+  criteria: InfluencerSearchCriteria,
+): boolean {
+  if (criteria.minFollowers && profile.followers !== undefined) {
+    if (profile.followers < criteria.minFollowers) return false;
+  }
+  if (criteria.maxFollowers && profile.followers !== undefined) {
+    if (profile.followers > criteria.maxFollowers) return false;
+  }
+  if (criteria.verifiedOnly && !profile.isVerified) {
+    return false;
+  }
+  return true;
+}
+
+function scoreProfile(
+  profile: InfluencerProfile,
+  criteria: InfluencerSearchCriteria,
+): number {
+  let score = 0;
+
+  // Follower count scoring
+  if (profile.followers !== undefined) {
+    if (profile.followers >= 1_000) score += 10;
+    if (profile.followers >= 10_000) score += 20;
+    if (profile.followers >= 100_000) score += 30;
+    if (profile.followers >= 1_000_000) score += 20;
+
+    // Bonus for being within requested range
+    if (criteria.minFollowers && criteria.maxFollowers) {
+      const mid = (criteria.minFollowers + criteria.maxFollowers) / 2;
+      const distance = Math.abs(profile.followers - mid) / mid;
+      score += Math.max(0, 20 - distance * 20);
+    }
+  }
+
+  // Verified boost
+  if (profile.isVerified) score += 15;
+
+  // Bio relevance
+  // Drop empty terms: ''.includes-style matches would otherwise score every bio
+  const queryTerms = criteria.query.toLowerCase().split(/\s+/).filter(Boolean);
+  const bioLower = profile.bio.toLowerCase();
+  for (const term of queryTerms) {
+    if (bioLower.includes(term)) score += 10;
+  }
+
+  // Content theme matching
+  if (profile.contentThemes.length > 0) score += 5 * profile.contentThemes.length;
+
+  // Completeness bonuses
+  if (profile.avatarUrl) score += 5;
+  if (profile.bio.length > 20) score += 5;
+
+  return score;
+}
+
+function extractThemes(bio: string, query: string): string[] {
+  const themes: string[] = [];
+  const text = (bio + ' ' + query).toLowerCase();
+
+  const themeKeywords: Record<string, string[]> = {
+    fashion: ['fashion', 'style', 'outfit', 'ootd', 'clothing', 'wear', 'designer'],
+    beauty: ['beauty', 'makeup', 'skincare', 'cosmetic', 'hair', 'glow'],
+    fitness: ['fitness', 'gym', 'workout', 'health', 'training', 'athlete', 'sports'],
+    food: ['food', 'recipe', 'cooking', 'chef', 'foodie', 'restaurant', 'eat'],
+    travel: ['travel', 'wanderlust', 'adventure', 'explore', 'tourism', 'destination'],
+    tech: ['tech', 'technology', 'gadget', 'software', 'coding', 'developer', 'ai', 'artificial intelligence'],
+    gaming: ['gaming', 'gamer', 'esports', 'twitch', 'stream', 'game'],
+    music: ['music', 'musician', 'singer', 'artist', 'producer', 'dj'],
+    lifestyle: ['lifestyle', 'daily', 'vlog', 'life', 'mom', 'dad', 'family'],
+    business: ['business', 'entrepreneur', 'startup', 'marketing', 'ceo', 'founder'],
+    photography: ['photo', 'photography', 'photographer', 'visual', 'creative'],
+    comedy: ['comedy', 'funny', 'humor', 'meme', 'comedian', 'laugh'],
+    education: ['education', 'learn', 'teach', 'tutor', 'tips', 'howto', 'teaching'],
+    wellness: ['wellness', 'mindfulness', 'meditation', 'yoga', 'mental health'],
+    career: ['career', 'job', 'hiring', 'resume', 'interview', 'salary', 'remote work'],
+  };
+
+  for (const [theme, keywords] of Object.entries(themeKeywords)) {
+    if (keywords.some((kw) => text.includes(kw))) {
+      themes.push(theme);
+    }
+  }
+
+  return themes;
+}
+
+// ---------------------------------------------------------------------------
+// Main search orchestrator
+// ---------------------------------------------------------------------------
+
+/**
+ * Search for influencers across specified platforms.
+ */
+export async function searchInfluencers(
+  criteria: InfluencerSearchCriteria,
+): Promise<InfluencerSearchResult[]> {
+  const platforms = criteria.platforms ?? ['instagram', 'tiktok', 'twitter'];
+  const results: InfluencerSearchResult[] = [];
+
+  for (const platform of platforms) {
+    try {
+      let profiles: InfluencerProfile[];
+
+      switch (platform) {
+        case 'instagram':
+          profiles = await searchInstagram(criteria);
+          break;
+        case 'tiktok':
+          profiles = await searchTikTok(criteria);
+          break;
+        case 'twitter':
+          profiles = await searchTwitter(criteria);
+          break;
+        default:
+          continue;
+      }
+
+      // Score and sort
+      profiles = profiles.map((p) => ({
+        ...p,
+        relevanceScore: scoreProfile(p, criteria),
+      }));
+      profiles.sort((a, b) => b.relevanceScore - a.relevanceScore);
+
+      results.push({
+        platform,
+        profiles,
+        count: profiles.length,
+        query: criteria.query,
+      });
+    } catch (err) {
+      results.push({
+        platform,
+        profiles: [],
+        count: 0,
+        query: criteria.query,
+        error: err instanceof Error ? err.message : String(err),
+      });
+    }
+  }
+
+  return results;
+}
+
+/**
+ * Get detailed profile data for a specific influencer.
+ */
+export async function getInfluencerProfile(
+  platform: 'instagram' | 'tiktok' | 'twitter',
+  username: string,
+): Promise<InfluencerProfile | null> {
+  const criteria: InfluencerSearchCriteria = { query: '' };
+
+  switch (platform) {
+    case 'instagram': {
+      const tabId = await findOrOpenTab('*://*.instagram.com/*', 'https://www.instagram.com');
+      return scrapeInstagramProfile(tabId, username, criteria);
+    }
+    case 'twitter': {
+      const tabId = await findOrOpenTab('*://*.x.com/*', 'https://x.com');
+      return scrapeTwitterProfile(tabId, username, criteria);
+    }
+    case 'tiktok': {
+      const tabId = await findOrOpenTab('*://*.tiktok.com/*', 'https://www.tiktok.com');
+      return scrapeTikTokProfile(tabId, username, criteria);
+    }
+    default:
+      return null;
+  }
+}
+
+/**
+ * Fetch profiles for multiple influencers sequentially (2 s pause after each lookup) for side-by-side comparison.
+ */
+export async function compareInfluencers(
+  influencers: { platform: 'instagram' | 'tiktok' | 'twitter'; username: string }[],
+): Promise<InfluencerProfile[]> {
+  const profiles: InfluencerProfile[] = [];
+
+  for (const inf of influencers) {
+    const profile = await getInfluencerProfile(inf.platform, inf.username);
+    if (profile) {
+      profiles.push(profile);
+    }
+    await sleep(2000);
+  }
+
+  return profiles;
+}
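Review note on `extractThemes`: `text.includes(kw)` is a substring test, so short keywords such as `ai` or `eat` also fire inside unrelated words (`hair`, `great`). Below is a minimal sketch of whole-word matching, assuming that behavior is what is wanted. `matchThemes`, `escapeRegExp`, and the trimmed keyword table are hypothetical names for illustration and are not part of this PR.

```typescript
// Hypothetical helper, not part of the PR: whole-word theme matching.
const THEME_KEYWORDS: Record<string, string[]> = {
  beauty: ['beauty', 'makeup', 'hair'],
  tech: ['tech', 'software', 'ai'],
  food: ['food', 'recipe', 'eat'],
};

// Escape regex metacharacters so keywords are matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function matchThemes(bio: string): string[] {
  const text = bio.toLowerCase();
  const themes: string[] = [];
  for (const [theme, keywords] of Object.entries(THEME_KEYWORDS)) {
    // \b anchors the keyword at word boundaries, so "ai" no longer matches "hair".
    if (keywords.some((kw) => new RegExp(`\\b${escapeRegExp(kw)}\\b`).test(text))) {
      themes.push(theme);
    }
  }
  return themes;
}
```

The trade-off is that whole-word matching also stops `photo` from matching `photos`, so the keyword lists would need explicit plural or stem variants.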
diff --git a/assistant/src/memory/conflict-store.ts b/assistant/src/memory/conflict-store.ts
index 3a9b8da5911..ea52d42c493 100644
--- a/assistant/src/memory/conflict-store.ts
+++ b/assistant/src/memory/conflict-store.ts
@@ -3,6 +3,7 @@ import { v4 as uuid } from 'uuid';
 import { getDb, getSqlite, rawAll } from './db.js';
 import { enqueueMemoryJob } from './jobs-store.js';
 import { memoryItemConflicts, memoryItems } from './schema.js';
+import { clampUnitInterval } from './validation.js';
 
 export type MemoryConflictRelationship =
   | 'contradiction'
@@ -319,7 +320,7 @@ export function applyConflictResolution(input: ApplyConflictResolutionInput): bo
           status: 'active',
           invalidAt: null,
           lastSeenAt: Math.max(existingItem.lastSeenAt, candidateItem.lastSeenAt, now),
-          confidence: Math.max(existingItem.confidence, candidateItem.confidence),
+          confidence: clampUnitInterval(Math.max(existingItem.confidence, candidateItem.confidence)),
         })
         .where(eq(memoryItems.id, existingItem.id))
         .run();
diff --git a/assistant/src/memory/contradiction-checker.ts b/assistant/src/memory/contradiction-checker.ts
index caebe876409..0b8d5d47e48 100644
--- a/assistant/src/memory/contradiction-checker.ts
+++ b/assistant/src/memory/contradiction-checker.ts
@@ -9,6 +9,7 @@ import { createOrUpdatePendingConflict } from './conflict-store.js';
 import { getDb, getSqlite, rawAll } from './db.js';
 import { enqueueMemoryJob } from './jobs-store.js';
 import { memoryItems } from './schema.js';
+import { clampUnitInterval } from './validation.js';
 
 const log = getLogger('memory-contradiction-checker');
 
@@ -335,7 +336,7 @@ function handleRelationship(
           .set({
             statement: newItem.statement,
             lastSeenAt: Math.max(freshExisting.lastSeenAt, freshNew!.lastSeenAt),
-            confidence: Math.max(freshExisting.confidence, freshNew!.confidence),
+            confidence: clampUnitInterval(Math.max(freshExisting.confidence, freshNew!.confidence)),
           })
           .where(eq(memoryItems.id, existingItem.id))
           .run();
@@ -364,6 +365,8 @@ function handleRelationship(
         });
         return true;
       }
+      default:
+        return false;
     }
   }).immediate();
 }
diff --git a/assistant/src/memory/conversation-store.ts b/assistant/src/memory/conversation-store.ts
index 534c7300884..9206af91428 100644
--- a/assistant/src/memory/conversation-store.ts
+++ b/assistant/src/memory/conversation-store.ts
@@ -1,18 +1,41 @@
 import { eq, desc, asc, and, count, sql, inArray, or, isNull } from 'drizzle-orm';
 import { v4 as uuid } from 'uuid';
+import { z } from 'zod';
 import { getDb, rawGet, rawExec } from './db.js';
 import { conversations, messages, toolInvocations, messageRuns, channelInboundEvents, memoryItemSources, memoryItems, memoryEmbeddings, memoryItemEntities, memorySegments, messageAttachments, llmRequestLogs } from './schema.js';
 import { getConfig } from '../config/loader.js';
 import { indexMessageNow } from './indexer.js';
 import { parseChannelId } from '../channels/types.js';
 import type { ChannelId } from '../channels/types.js';
-import { isChannelId } from '../channels/types.js';
+import { isChannelId, CHANNEL_IDS } from '../channels/types.js';
 import { getLogger } from '../util/logger.js';
 import { deleteOrphanAttachments } from './attachments-store.js';
 import { createRowMapper } from '../util/row-mapper.js';
 
 const log = getLogger('conversation-store');
 
+// ── Message metadata Zod schema ──────────────────────────────────────
+// Validates the JSON stored in messages.metadata. Known fields are typed;
+// extra keys are allowed via passthrough so callers can attach ad-hoc data.
+
+const channelIdSchema = z.enum(CHANNEL_IDS);
+
+const subagentNotificationSchema = z.object({
+  subagentId: z.string(),
+  label: z.string(),
+  status: z.enum(['completed', 'failed', 'aborted']),
+  error: z.string().optional(),
+  conversationId: z.string().optional(),
+});
+
+export const messageMetadataSchema = z.object({
+  userMessageChannel: channelIdSchema.optional(),
+  assistantMessageChannel: channelIdSchema.optional(),
+  subagentNotification: subagentNotificationSchema.optional(),
+}).passthrough();
+
+export type MessageMetadata = z.infer<typeof messageMetadataSchema>;
+
 export interface ConversationRow {
   id: string;
   title: string | null;
@@ -181,6 +204,14 @@ export function getLatestConversation(): ConversationRow | null {
 export function addMessage(conversationId: string, role: string, content: string, metadata?: Record<string, unknown>) {
   const db = getDb();
   const messageId = uuid();
+
+  if (metadata) {
+    const result = messageMetadataSchema.safeParse(metadata);
+    if (!result.success) {
+      log.warn({ conversationId, messageId, issues: result.error.issues }, 'Invalid message metadata, storing as-is');
+    }
+  }
+
   const metadataStr = metadata ? JSON.stringify(metadata) : undefined;
   const originChannelCandidate =
     metadata && isChannelId(metadata.userMessageChannel)
diff --git a/assistant/src/memory/db-init.ts b/assistant/src/memory/db-init.ts
index 352fcfc91c7..28b083e71f6 100644
--- a/assistant/src/memory/db-init.ts
+++ b/assistant/src/memory/db-init.ts
@@ -16,6 +16,9 @@ import {
   migrateGuardianActionTables,
   migrateBackfillInboxThreadStateFromBindings,
   migrateDropActiveSearchIndex,
+  migrateMemorySegmentsIndexes,
+  migrateMemoryItemsIndexes,
+  migrateRemainingTableIndexes,
   validateMigrationState,
 } from './schema-migration.js';
 
@@ -1267,5 +1270,11 @@ export function initializeDb(): void {
 
   migrateMemoryFtsBackfill(database);
 
+  migrateMemorySegmentsIndexes(database);
+
+  migrateMemoryItemsIndexes(database);
+
+  migrateRemainingTableIndexes(database);
+
   validateMigrationState(database);
 }
diff --git a/assistant/src/memory/items-extractor.ts b/assistant/src/memory/items-extractor.ts
index fbfc2d4a4e3..eb63511bc25 100644
--- a/assistant/src/memory/items-extractor.ts
+++ b/assistant/src/memory/items-extractor.ts
@@ -10,6 +10,7 @@ import { enqueueMemoryJob } from './jobs-store.js';
 import { extractTextFromStoredMessageContent } from './message-content.js';
 import { getDb } from './db.js';
 import { memoryItemConflicts, memoryItems, memoryItemSources, messages } from './schema.js';
+import { clampUnitInterval } from './validation.js';
 
 const log = getLogger('memory-items-extractor');
 
@@ -196,8 +197,8 @@ async function extractItemsWithLLM(
         if (!raw.subject || !raw.statement) continue;
         const subject = truncate(String(raw.subject), 80, '');
         const statement = truncate(String(raw.statement), 500, '');
-        const confidence = clamp(parseScore(raw.confidence, 0.5), 0, 1);
-        const importance = clamp(parseScore(raw.importance, 0.5), 0, 1);
+        const confidence = clampUnitInterval(parseScore(raw.confidence, 0.5));
+        const importance = clampUnitInterval(parseScore(raw.importance, 0.5));
         const fingerprint = computeMemoryFingerprint(scopeId, raw.kind, subject, statement);
         items.push({
           kind: raw.kind as MemoryItemKind,
@@ -284,8 +285,8 @@ export async function extractAndUpsertMemoryItemsForMessage(messageId: string, s
       db.update(memoryItems)
         .set({
           status: effectiveStatus,
-          confidence: Math.max(existing.confidence, item.confidence),
-          importance: Math.max(existing.importance ?? 0, item.importance),
+          confidence: clampUnitInterval(Math.max(existing.confidence, item.confidence)),
+          importance: clampUnitInterval(Math.max(existing.importance ?? 0, item.importance)),
           lastSeenAt: Math.max(existing.lastSeenAt, seenAt),
           verificationState: promotedState,
         })
@@ -438,10 +439,6 @@ function parseScore(value: unknown, fallback: number): number {
   return Number.isFinite(n) ? n : fallback;
 }
 
-function clamp(value: number, min: number, max: number): number {
-  return Math.min(max, Math.max(min, value));
-}
-
 /** Returns true if the given memory item is the candidate in an unresolved conflict. */
 function hasPendingConflict(itemId: string): boolean {
   const db = getDb();
diff --git a/assistant/src/memory/job-handlers/media-processing.ts b/assistant/src/memory/job-handlers/media-processing.ts
index aedcff7f376..f681a84325c 100644
--- a/assistant/src/memory/job-handlers/media-processing.ts
+++ b/assistant/src/memory/job-handlers/media-processing.ts
@@ -36,14 +36,14 @@ export async function mediaProcessingJob(job: MemoryJob): Promise<void> {
   }
 
   const handlers: Record<PipelineStageName, StageHandler> = {
-    preprocess: { execute: (assetId, onProgress) => preprocessForAsset(assetId, {}, onProgress) },
-    map: { execute: (assetId, onProgress) => mapSegmentsForAsset(assetId, {
+    preprocess: { execute: async (assetId, onProgress) => { await preprocessForAsset(assetId, {}, onProgress); } },
+    map: { execute: async (assetId, onProgress) => { await mapSegmentsForAsset(assetId, {
       systemPrompt: 'Describe what you see in these video frames. For each frame, note: subjects present, actions occurring, scene context, and any text visible.',
       outputSchema: { type: 'object', properties: { frames: { type: 'array', items: { type: 'object', properties: { timestamp: { type: 'number' }, subjects: { type: 'array', items: { type: 'string' } }, actions: { type: 'array', items: { type: 'string' } }, scene: { type: 'string' }, text: { type: 'string' } } } } } }
-    }, onProgress) },
-    reduce: { execute: (assetId, onProgress) => reduceForAsset(assetId, {
+    }, onProgress); } },
+    reduce: { execute: async (assetId, onProgress) => { await reduceForAsset(assetId, {
       systemPrompt: 'Summarize the video content based on the structured observations.',
-    }, onProgress) },
+    }, onProgress); } },
   };
 
   const result = await runPipeline(mediaAssetId, handlers, {
diff --git a/assistant/src/memory/migrations/016-memory-segments-indexes.ts b/assistant/src/memory/migrations/016-memory-segments-indexes.ts
new file mode 100644
index 00000000000..97aff7d50d3
--- /dev/null
+++ b/assistant/src/memory/migrations/016-memory-segments-indexes.ts
@@ -0,0 +1,11 @@
+import type { DrizzleDb } from '../db-connection.js';
+
+/**
+ * Idempotent migration to ensure memory_segments has indexes on scope_id and
+ * conversation_id for faster lookups.  scope_id was already covered by
+ * db-init, but we include both here for completeness.
+ */
+export function migrateMemorySegmentsIndexes(database: DrizzleDb): void {
+  database.run(/*sql*/ `CREATE INDEX IF NOT EXISTS idx_memory_segments_scope_id ON memory_segments(scope_id)`);
+  database.run(/*sql*/ `CREATE INDEX IF NOT EXISTS idx_memory_segments_conversation_id ON memory_segments(conversation_id)`);
+}
diff --git a/assistant/src/memory/migrations/017-memory-items-indexes.ts b/assistant/src/memory/migrations/017-memory-items-indexes.ts
new file mode 100644
index 00000000000..27a4db15ad4
--- /dev/null
+++ b/assistant/src/memory/migrations/017-memory-items-indexes.ts
@@ -0,0 +1,10 @@
+import type { DrizzleDb } from '../db-connection.js';
+
+/**
+ * Idempotent migration to add indexes on memory_items for scope_id and
+ * fingerprint — critical for duplicate detection and scope-filtered queries.
+ */
+export function migrateMemoryItemsIndexes(database: DrizzleDb): void {
+  database.run(/*sql*/ `CREATE INDEX IF NOT EXISTS idx_memory_items_scope_id ON memory_items(scope_id)`);
+  database.run(/*sql*/ `CREATE INDEX IF NOT EXISTS idx_memory_items_fingerprint ON memory_items(fingerprint)`);
+}
diff --git a/assistant/src/memory/migrations/018-remaining-table-indexes.ts b/assistant/src/memory/migrations/018-remaining-table-indexes.ts
new file mode 100644
index 00000000000..961f2ff511a
--- /dev/null
+++ b/assistant/src/memory/migrations/018-remaining-table-indexes.ts
@@ -0,0 +1,13 @@
+import type { DrizzleDb } from '../db-connection.js';
+
+/**
+ * Idempotent migration to add indexes on foreign-key and scope columns that
+ * lacked them.  messages.conversation_id is a FK used for ON DELETE CASCADE,
+ * so the index also speeds up cascading deletes.
+ */
+export function migrateRemainingTableIndexes(database: DrizzleDb): void {
+  database.run(/*sql*/ `CREATE INDEX IF NOT EXISTS idx_memory_item_conflicts_scope_id ON memory_item_conflicts(scope_id)`);
+  database.run(/*sql*/ `CREATE INDEX IF NOT EXISTS idx_memory_summaries_scope_id ON memory_summaries(scope_id)`);
+  database.run(/*sql*/ `CREATE INDEX IF NOT EXISTS idx_messages_conversation_id ON messages(conversation_id)`);
+  database.run(/*sql*/ `CREATE INDEX IF NOT EXISTS idx_tool_invocations_conversation_id ON tool_invocations(conversation_id)`);
+}
diff --git a/assistant/src/memory/migrations/index.ts b/assistant/src/memory/migrations/index.ts
index ee4e81e38f8..781f4799ef9 100644
--- a/assistant/src/memory/migrations/index.ts
+++ b/assistant/src/memory/migrations/index.ts
@@ -19,3 +19,6 @@ export { migrateCallSessionsAddInitiatedFrom } from './012-call-sessions-add-ini
 export { migrateGuardianActionTables } from './013-guardian-action-tables.js';
 export { migrateBackfillInboxThreadStateFromBindings } from './014-backfill-inbox-thread-state.js';
 export { migrateDropActiveSearchIndex } from './015-drop-active-search-index.js';
+export { migrateMemorySegmentsIndexes } from './016-memory-segments-indexes.js';
+export { migrateMemoryItemsIndexes } from './017-memory-items-indexes.js';
+export { migrateRemainingTableIndexes } from './018-remaining-table-indexes.js';
diff --git a/assistant/src/memory/retriever.ts b/assistant/src/memory/retriever.ts
index f99f0af3ecb..cf512fb5d8b 100644
--- a/assistant/src/memory/retriever.ts
+++ b/assistant/src/memory/retriever.ts
@@ -778,9 +778,15 @@ function isAbortError(err: unknown): boolean {
  * HTTP/API clients), then falls back to looking for "status <code>" patterns
  * in the message. This avoids false positives from dimension numbers like 512.
  */
+function getErrorStatusCode(err: Error): unknown {
+  if ('status' in err) return (err as { status: unknown }).status;
+  if ('statusCode' in err) return (err as { statusCode: unknown }).statusCode;
+  return undefined;
+}
+
 function isHttpStatusError(err: unknown): boolean {
   if (!(err instanceof Error)) return false;
-  const status = (err as Record<string, unknown>).status ?? (err as Record<string, unknown>).statusCode;
+  const status = getErrorStatusCode(err);
   if (typeof status === 'number') {
     return status === 429 || (status >= 500 && status < 600);
   }
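Review note: the structured part of the check above can be exercised in isolation. This is a sketch only. `hasRetryableStatus` is a hypothetical name, and the message-pattern fallback described in the doc comment is deliberately left out.

```typescript
// Stand-alone copy of the helper added in this hunk.
function getErrorStatusCode(err: Error): unknown {
  if ('status' in err) return (err as { status: unknown }).status;
  if ('statusCode' in err) return (err as { statusCode: unknown }).statusCode;
  return undefined;
}

// Hypothetical wrapper: retryable means rate-limited (429) or a server error (5xx).
function hasRetryableStatus(err: unknown): boolean {
  if (!(err instanceof Error)) return false;
  const status = getErrorStatusCode(err);
  return typeof status === 'number' && (status === 429 || (status >= 500 && status < 600));
}

// Example errors shaped like those thrown by typical HTTP clients.
const rateLimited = Object.assign(new Error('too many requests'), { statusCode: 429 });
const unavailable = Object.assign(new Error('upstream down'), { status: 503 });
const notFound = Object.assign(new Error('missing'), { status: 404 });
```

Reading the property through an `in` check avoids the `Record<string, unknown>` cast that the removed line relied on.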
diff --git a/assistant/src/memory/schema-migration.ts b/assistant/src/memory/schema-migration.ts
index cbb56660310..11cb21dc99b 100644
--- a/assistant/src/memory/schema-migration.ts
+++ b/assistant/src/memory/schema-migration.ts
@@ -19,4 +19,7 @@ export {
   migrateGuardianActionTables,
   migrateBackfillInboxThreadStateFromBindings,
   migrateDropActiveSearchIndex,
+  migrateMemorySegmentsIndexes,
+  migrateMemoryItemsIndexes,
+  migrateRemainingTableIndexes,
 } from './migrations/index.js';
diff --git a/assistant/src/memory/schema.ts b/assistant/src/memory/schema.ts
index 143cc5c22b8..4b754dfd6a8 100644
--- a/assistant/src/memory/schema.ts
+++ b/assistant/src/memory/schema.ts
@@ -1,4 +1,4 @@
-import { sqliteTable, text, integer, real, blob } from 'drizzle-orm/sqlite-core';
+import { sqliteTable, text, integer, real, blob, index } from 'drizzle-orm/sqlite-core';
 
 export const conversations = sqliteTable('conversations', {
   id: text('id').primaryKey(),
@@ -26,7 +26,9 @@ export const messages = sqliteTable('messages', {
   content: text('content').notNull(),
   createdAt: integer('created_at').notNull(),
   metadata: text('metadata'),
-});
+}, (table) => [
+  index('idx_messages_conversation_id').on(table.conversationId),
+]);
 
 export const toolInvocations = sqliteTable('tool_invocations', {
   id: text('id').primaryKey(),
@@ -40,7 +42,9 @@ export const toolInvocations = sqliteTable('tool_invocations', {
   riskLevel: text('risk_level').notNull(),
   durationMs: integer('duration_ms').notNull(),
   createdAt: integer('created_at').notNull(),
-});
+}, (table) => [
+  index('idx_tool_invocations_conversation_id').on(table.conversationId),
+]);
 
 export const memorySegments = sqliteTable('memory_segments', {
   id: text('id').primaryKey(),
@@ -58,7 +62,10 @@ export const memorySegments = sqliteTable('memory_segments', {
   contentHash: text('content_hash'),
   createdAt: integer('created_at').notNull(),
   updatedAt: integer('updated_at').notNull(),
-});
+}, (table) => [
+  index('idx_memory_segments_scope_id').on(table.scopeId),
+  index('idx_memory_segments_conversation_id').on(table.conversationId),
+]);
 
 export const memoryItems = sqliteTable('memory_items', {
   id: text('id').primaryKey(),
@@ -77,7 +84,10 @@ export const memoryItems = sqliteTable('memory_items', {
   lastUsedAt: integer('last_used_at'),
   validFrom: integer('valid_from'),
   invalidAt: integer('invalid_at'),
-});
+}, (table) => [
+  index('idx_memory_items_scope_id').on(table.scopeId),
+  index('idx_memory_items_fingerprint').on(table.fingerprint),
+]);
 
 export const memoryItemSources = sqliteTable('memory_item_sources', {
   memoryItemId: text('memory_item_id')
@@ -107,7 +117,9 @@ export const memoryItemConflicts = sqliteTable('memory_item_conflicts', {
   resolvedAt: integer('resolved_at'),
   createdAt: integer('created_at').notNull(),
   updatedAt: integer('updated_at').notNull(),
-});
+}, (table) => [
+  index('idx_memory_item_conflicts_scope_id').on(table.scopeId),
+]);
 
 export const memorySummaries = sqliteTable('memory_summaries', {
   id: text('id').primaryKey(),
@@ -121,7 +133,9 @@ export const memorySummaries = sqliteTable('memory_summaries', {
   endAt: integer('end_at').notNull(),
   createdAt: integer('created_at').notNull(),
   updatedAt: integer('updated_at').notNull(),
-});
+}, (table) => [
+  index('idx_memory_summaries_scope_id').on(table.scopeId),
+]);
 
 export const memoryEmbeddings = sqliteTable('memory_embeddings', {
   id: text('id').primaryKey(),
diff --git a/assistant/src/memory/validation.ts b/assistant/src/memory/validation.ts
new file mode 100644
index 00000000000..00ff100163c
--- /dev/null
+++ b/assistant/src/memory/validation.ts
@@ -0,0 +1,19 @@
+import { z } from 'zod';
+
+/**
+ * Unit interval [0, 1] — used for confidence and importance fields on memory items.
+ * Coerces out-of-range numbers to the nearest bound rather than rejecting,
+ * since LLM-generated values occasionally exceed the range.
+ */
+export const unitInterval = z.number().transform((v) => Math.min(1, Math.max(0, v)));
+
+/** Zod schema for validating confidence/importance values on memory items. */
+export const memoryItemScores = z.object({
+  confidence: unitInterval,
+  importance: unitInterval,
+});
+
+/** Clamp a numeric value to [0, 1]. NaN is returned unchanged, so callers must check for it first. */
+export function clampUnitInterval(value: number): number {
+  return Math.min(1, Math.max(0, value));
+}
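Review note: `Math.min` and `Math.max` propagate `NaN`, so `clampUnitInterval(NaN)` returns `NaN`. Call sites that pass `parseScore` output are already guarded by its finite check, but the `Math.max(existing.confidence, ...)` call sites are not. A minimal sketch of a guarded variant follows. `clampUnitIntervalOr` and its `fallback` parameter are hypothetical and not part of this PR.

```typescript
// Hypothetical variant of clampUnitInterval that substitutes a fallback for NaN.
function clampUnitIntervalOr(value: number, fallback: number): number {
  // Math.min/Math.max would return NaN here, so handle it before clamping.
  if (Number.isNaN(value)) return fallback;
  // +/-Infinity still clamp to the nearest bound, matching the original helper.
  return Math.min(1, Math.max(0, value));
}
```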
diff --git a/assistant/src/migrations/config-merge.ts b/assistant/src/migrations/config-merge.ts
new file mode 100644
index 00000000000..1c83fd0f95a
--- /dev/null
+++ b/assistant/src/migrations/config-merge.ts
@@ -0,0 +1,53 @@
+import { existsSync, readFileSync, writeFileSync, unlinkSync } from 'node:fs';
+import { isPlainObject } from '../util/object.js';
+import { migrationLog } from './log.js';
+
+/**
+ * When migratePath skips config.json because the workspace copy already
+ * exists, the legacy root config may still contain keys (e.g. slackWebhookUrl)
+ * that were never written to the workspace config. This merges any missing
+ * top-level keys from the legacy file into the workspace file so they are
+ * not silently lost during upgrade.
+ */
+export function mergeSkippedConfigKeys(legacyPath: string, workspacePath: string): void {
+  if (!existsSync(legacyPath) || !existsSync(workspacePath)) return;
+
+  let legacy: Record<string, unknown>;
+  let workspace: Record<string, unknown>;
+  try {
+    const legacyRaw = JSON.parse(readFileSync(legacyPath, 'utf-8'));
+    const workspaceRaw = JSON.parse(readFileSync(workspacePath, 'utf-8'));
+    if (!isPlainObject(legacyRaw) || !isPlainObject(workspaceRaw)) return;
+    legacy = legacyRaw;
+    workspace = workspaceRaw;
+  } catch {
+    return; // unreadable file or malformed JSON: skip silently
+  }
+
+  const merged: string[] = [];
+  for (const key of Object.keys(legacy)) {
+    if (!(key in workspace)) {
+      workspace[key] = legacy[key];
+      merged.push(key);
+    }
+  }
+
+  if (merged.length > 0) {
+    try {
+      writeFileSync(workspacePath, JSON.stringify(workspace, null, 2) + '\n');
+      // Remove merged keys from legacy config so they are not resurrected
+      // if a user later deletes them from the workspace config.
+      for (const key of merged) {
+        delete legacy[key];
+      }
+      if (Object.keys(legacy).length === 0) {
+        unlinkSync(legacyPath);
+      } else {
+        writeFileSync(legacyPath, JSON.stringify(legacy, null, 2) + '\n');
+      }
+      migrationLog('info', 'Merged legacy config keys into workspace config', { keys: merged });
+    } catch (err) {
+      migrationLog('warn', 'Failed to merge legacy config keys', { err: String(err), keys: merged });
+    }
+  }
+}
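Review note: the key-merge step in `mergeSkippedConfigKeys` can be reasoned about without file I/O. Below is a sketch under that assumption. `mergeMissingKeys`, `JsonObject`, and the sample values are hypothetical and only illustrate the rule that existing workspace keys are never overwritten.

```typescript
type JsonObject = Record<string, unknown>;

// Hypothetical pure helper mirroring the key-merge step, without file I/O.
function mergeMissingKeys(legacy: JsonObject, workspace: JsonObject) {
  const nextWorkspace: JsonObject = { ...workspace };
  const nextLegacy: JsonObject = { ...legacy };
  const merged: string[] = [];
  for (const key of Object.keys(legacy)) {
    if (!(key in workspace)) {
      nextWorkspace[key] = legacy[key]; // copy only keys the workspace lacks
      delete nextLegacy[key]; // drop it from legacy so it cannot be resurrected later
      merged.push(key);
    }
  }
  return { workspace: nextWorkspace, legacy: nextLegacy, merged };
}

// The legacy file carries one key the workspace lacks and one that conflicts.
const result = mergeMissingKeys(
  { slackWebhookUrl: 'https://example.invalid/hook', theme: 'dark' },
  { theme: 'light' },
);
```

Here `theme` keeps the workspace value and stays behind in the legacy object, while `slackWebhookUrl` moves across. Returning new objects instead of mutating the inputs keeps the rule testable separately from the write and unlink logic.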
diff --git a/assistant/src/migrations/data-layout.ts b/assistant/src/migrations/data-layout.ts
new file mode 100644
index 00000000000..ce273bf7691
--- /dev/null
+++ b/assistant/src/migrations/data-layout.ts
@@ -0,0 +1,68 @@
+import { existsSync, mkdirSync, renameSync } from 'node:fs';
+import { join, dirname } from 'node:path';
+import { getRootDir } from '../util/platform.js';
+import { migrationLog } from './log.js';
+
+/**
+ * Migrate files from the old flat ~/.vellum layout to the new structured
+ * layout with data/ and protected/ subdirectories.
+ *
+ * Idempotent: skips items that are missing at the old path or already present at the new one.
+ * Uses renameSync, which is atomic when both paths are on the same filesystem; failures are logged and skipped.
+ */
+export function migrateToDataLayout(): void {
+  const root = getRootDir();
+  const data = join(root, 'data');
+
+  if (!existsSync(root)) return;
+
+  function migrateItem(oldPath: string, newPath: string): void {
+    if (!existsSync(oldPath)) return;
+    if (existsSync(newPath)) return;
+    try {
+      const newDir = dirname(newPath);
+      if (!existsSync(newDir)) {
+        mkdirSync(newDir, { recursive: true });
+      }
+      renameSync(oldPath, newPath);
+      migrationLog('info', 'Migrated path', { from: oldPath, to: newPath });
+    } catch (err) {
+      migrationLog('warn', 'Failed to migrate path', { err: String(err), from: oldPath, to: newPath });
+    }
+  }
+
+  // DB: ~/.vellum/data/assistant.db → ~/.vellum/data/db/assistant.db
+  migrateItem(join(data, 'assistant.db'), join(data, 'db', 'assistant.db'));
+  migrateItem(join(data, 'assistant.db-wal'), join(data, 'db', 'assistant.db-wal'));
+  migrateItem(join(data, 'assistant.db-shm'), join(data, 'db', 'assistant.db-shm'));
+
+  // Qdrant PID: ~/.vellum/qdrant.pid → ~/.vellum/data/qdrant/qdrant.pid
+  migrateItem(join(root, 'qdrant.pid'), join(data, 'qdrant', 'qdrant.pid'));
+
+  // Qdrant binary: ~/.vellum/bin/ → ~/.vellum/data/qdrant/bin/
+  migrateItem(join(root, 'bin'), join(data, 'qdrant', 'bin'));
+
+  // Logs: ~/.vellum/logs/ → ~/.vellum/data/logs/
+  migrateItem(join(root, 'logs'), join(data, 'logs'));
+
+  // Memory: ~/.vellum/memory/ → ~/.vellum/data/memory/
+  migrateItem(join(root, 'memory'), join(data, 'memory'));
+
+  // Apps: ~/.vellum/apps/ → ~/.vellum/data/apps/
+  migrateItem(join(root, 'apps'), join(data, 'apps'));
+
+  // Browser auth: ~/.vellum/browser-auth/ → ~/.vellum/data/browser-auth/
+  migrateItem(join(root, 'browser-auth'), join(data, 'browser-auth'));
+
+  // Browser profile: ~/.vellum/browser-profile/ → ~/.vellum/data/browser-profile/
+  migrateItem(join(root, 'browser-profile'), join(data, 'browser-profile'));
+
+  // History: ~/.vellum/history → ~/.vellum/data/history
+  migrateItem(join(root, 'history'), join(data, 'history'));
+
+  // Protected files: ~/.vellum/X → ~/.vellum/protected/X
+  const protectedDir = join(root, 'protected');
+  migrateItem(join(root, 'trust.json'), join(protectedDir, 'trust.json'));
+  migrateItem(join(root, 'keys.enc'), join(protectedDir, 'keys.enc'));
+  migrateItem(join(root, 'secret-allowlist.json'), join(protectedDir, 'secret-allowlist.json'));
+}
diff --git a/assistant/src/migrations/data-merge.ts b/assistant/src/migrations/data-merge.ts
new file mode 100644
index 00000000000..0d8b1541c7d
--- /dev/null
+++ b/assistant/src/migrations/data-merge.ts
@@ -0,0 +1,33 @@
+import { existsSync, readdirSync, renameSync } from 'node:fs';
+import { join } from 'node:path';
+import { migrationLog } from './log.js';
+
+/**
+ * When migratePath skips the data directory because workspace/data already
+ * exists (e.g. the user's project had a data/ folder that was extracted from
+ * sandbox/fs), the legacy data directory may still contain internal state
+ * subdirectories (db/, logs/, sandbox/, etc.) that need to be preserved.
+ * This merges any missing entries from the legacy data path into workspace/data.
+ */
+export function mergeLegacyDataEntries(legacyDir: string, workspaceDir: string): void {
+  if (!existsSync(legacyDir) || !existsSync(workspaceDir)) return;
+
+  let entries: import('node:fs').Dirent[];
+  try {
+    entries = readdirSync(legacyDir, { withFileTypes: true });
+  } catch {
+    return;
+  }
+
+  for (const entry of entries) {
+    const src = join(legacyDir, entry.name);
+    const dest = join(workspaceDir, entry.name);
+    if (existsSync(dest)) continue; // already present in workspace
+    try {
+      renameSync(src, dest);
+      migrationLog('info', 'Merged legacy data entry into workspace', { from: src, to: dest });
+    } catch (err) {
+      migrationLog('warn', 'Failed to merge legacy data entry', { err: String(err), from: src, to: dest });
+    }
+  }
+}
diff --git a/assistant/src/migrations/hooks-merge.ts b/assistant/src/migrations/hooks-merge.ts
new file mode 100644
index 00000000000..9102f397e44
--- /dev/null
+++ b/assistant/src/migrations/hooks-merge.ts
@@ -0,0 +1,90 @@
+import { existsSync, readdirSync, readFileSync, writeFileSync, renameSync, unlinkSync } from 'node:fs';
+import { join } from 'node:path';
+import { isPlainObject } from '../util/object.js';
+import { migrationLog } from './log.js';
+
+/**
+ * Merge missing hook entries from a legacy hooks/config.json into the
+ * workspace hooks/config.json. Only adds hooks that don't already exist
+ * in the workspace config so user changes are never overwritten.
+ */
+export function mergeHooksConfig(legacyPath: string, workspacePath: string): void {
+  let legacy: Record<string, unknown>;
+  let workspace: Record<string, unknown>;
+  try {
+    const legacyRaw = JSON.parse(readFileSync(legacyPath, 'utf-8'));
+    const workspaceRaw = JSON.parse(readFileSync(workspacePath, 'utf-8'));
+    if (!isPlainObject(legacyRaw) || !isPlainObject(workspaceRaw)) return;
+    legacy = legacyRaw;
+    workspace = workspaceRaw;
+  } catch {
+    return;
+  }
+
+  const legacyHooks = legacy.hooks;
+  const wsHooks = workspace.hooks;
+  if (!isPlainObject(legacyHooks) || !isPlainObject(wsHooks)) return;
+
+  const merged: string[] = [];
+  for (const hookName of Object.keys(legacyHooks)) {
+    if (!Object.prototype.hasOwnProperty.call(wsHooks, hookName)) {
+      wsHooks[hookName] = legacyHooks[hookName];
+      merged.push(hookName);
+    }
+  }
+
+  if (merged.length > 0) {
+    try {
+      writeFileSync(workspacePath, JSON.stringify(workspace, null, 2) + '\n');
+      // Remove merged hooks from legacy config to prevent resurrection
+      for (const hookName of merged) {
+        delete legacyHooks[hookName];
+      }
+      if (Object.keys(legacyHooks).length === 0) {
+        unlinkSync(legacyPath);
+      } else {
+        writeFileSync(legacyPath, JSON.stringify(legacy, null, 2) + '\n');
+      }
+      migrationLog('info', 'Merged legacy hooks config entries into workspace', { hooks: merged });
+    } catch (err) {
+      migrationLog('warn', 'Failed to merge legacy hooks config', { err: String(err), hooks: merged });
+    }
+  }
+}
+
+/**
+ * When migratePath skips the hooks directory because the workspace copy
+ * already exists (e.g. pre-created by ensureDataDir), the legacy hooks
+ * directory may still contain individual hook files/subdirectories that
+ * were never moved. This merges any missing entries from the legacy
+ * path into the workspace hooks path so they are not silently lost.
+ */
+export function mergeLegacyHooks(legacyDir: string, workspaceDir: string): void {
+  if (!existsSync(legacyDir) || !existsSync(workspaceDir)) return;
+
+  let entries: import('node:fs').Dirent[];
+  try {
+    entries = readdirSync(legacyDir, { withFileTypes: true });
+  } catch {
+    return;
+  }
+
+  for (const entry of entries) {
+    const src = join(legacyDir, entry.name);
+    const dest = join(workspaceDir, entry.name);
+    if (existsSync(dest)) {
+      // config.json needs a merge rather than a skip: the legacy file may
+      // contain hook enabled flags or settings that the workspace copy lacks.
+      if (entry.name === 'config.json') {
+        mergeHooksConfig(src, dest);
+      }
+      continue;
+    }
+    try {
+      renameSync(src, dest);
+      migrationLog('info', 'Merged legacy hook into workspace', { from: src, to: dest });
+    } catch (err) {
+      migrationLog('warn', 'Failed to merge legacy hook', { err: String(err), from: src, to: dest });
+    }
+  }
+}
diff --git a/assistant/src/migrations/index.ts b/assistant/src/migrations/index.ts
new file mode 100644
index 00000000000..08aa5f274d0
--- /dev/null
+++ b/assistant/src/migrations/index.ts
@@ -0,0 +1,6 @@
+export { migrateToDataLayout } from './data-layout.js';
+export { migrateToWorkspaceLayout, migratePath } from './workspace-layout.js';
+export { mergeSkippedConfigKeys } from './config-merge.js';
+export { mergeLegacyHooks, mergeHooksConfig } from './hooks-merge.js';
+export { mergeLegacySkills } from './skills-merge.js';
+export { mergeLegacyDataEntries } from './data-merge.js';
diff --git a/assistant/src/migrations/log.ts b/assistant/src/migrations/log.ts
new file mode 100644
index 00000000000..2f466f7ceef
--- /dev/null
+++ b/assistant/src/migrations/log.ts
@@ -0,0 +1,23 @@
+import pino from 'pino';
+import { logSerializers } from '../util/log-redact.js';
+
+/**
+ * Standalone pino instance for migration code. This must NOT use getLogger()
+ * because that triggers ensureDataDir(), which pre-creates workspace
+ * destination directories and causes migration moves to no-op.
+ *
+ * Writes to stderr only, so it never depends on a log file that may not exist yet.
+ */
+const migrationLogger: pino.Logger = pino(
+  { name: 'migration', level: 'info', serializers: logSerializers },
+  pino.destination(2),
+);
+
+export function migrationLog(level: 'info' | 'warn' | 'debug', msg: string, data?: Record<string, unknown>): void {
+  if (level === 'debug') return;
+  if (data) {
+    migrationLogger[level](data, msg);
+  } else {
+    migrationLogger[level](msg);
+  }
+}
diff --git a/assistant/src/migrations/skills-merge.ts b/assistant/src/migrations/skills-merge.ts
new file mode 100644
index 00000000000..4d7719e72c1
--- /dev/null
+++ b/assistant/src/migrations/skills-merge.ts
@@ -0,0 +1,33 @@
+import { existsSync, readdirSync, renameSync } from 'node:fs';
+import { join } from 'node:path';
+import { migrationLog } from './log.js';
+
+/**
+ * When migratePath skips the skills directory because the workspace copy
+ * already exists (e.g. pre-created by ensureDataDir), the legacy skills
+ * directory may still contain individual skill subdirectories that were
+ * never moved. This merges any missing skill subdirectories from the
+ * legacy path into the workspace skills path so they are not stranded.
+ */
+export function mergeLegacySkills(legacyDir: string, workspaceDir: string): void {
+  if (!existsSync(legacyDir) || !existsSync(workspaceDir)) return;
+
+  let entries: import('node:fs').Dirent[];
+  try {
+    entries = readdirSync(legacyDir, { withFileTypes: true });
+  } catch {
+    return;
+  }
+
+  for (const entry of entries) {
+    const src = join(legacyDir, entry.name);
+    const dest = join(workspaceDir, entry.name);
+    if (existsSync(dest)) continue; // already present in workspace
+    try {
+      renameSync(src, dest);
+      migrationLog('info', 'Merged legacy skill into workspace', { from: src, to: dest });
+    } catch (err) {
+      migrationLog('warn', 'Failed to merge legacy skill', { err: String(err), from: src, to: dest });
+    }
+  }
+}
diff --git a/assistant/src/migrations/workspace-layout.ts b/assistant/src/migrations/workspace-layout.ts
new file mode 100644
index 00000000000..dc58545ef1a
--- /dev/null
+++ b/assistant/src/migrations/workspace-layout.ts
@@ -0,0 +1,79 @@
+import { existsSync, mkdirSync, renameSync } from 'node:fs';
+import { join, dirname } from 'node:path';
+import { getRootDir, getWorkspaceDir } from '../util/platform.js';
+import { migrationLog } from './log.js';
+import { mergeSkippedConfigKeys } from './config-merge.js';
+import { mergeLegacyHooks } from './hooks-merge.js';
+import { mergeLegacySkills } from './skills-merge.js';
+import { mergeLegacyDataEntries } from './data-merge.js';
+
+/**
+ * Idempotent move: relocates source to destination for migration.
+ * - No-op if source is missing (already migrated or never existed).
+ * - No-op if destination already exists (avoids clobbering).
+ * - Creates destination parent directories as needed.
+ * - Logs warning on failure instead of throwing.
+ *
+ * Exported for testing; not intended for general use outside migrations.
+ */
+export function migratePath(source: string, destination: string): void {
+  if (!existsSync(source)) return;
+  if (existsSync(destination)) {
+    migrationLog('debug', 'Migration skipped: destination already exists', { source, destination });
+    return;
+  }
+  try {
+    const destDir = dirname(destination);
+    if (!existsSync(destDir)) {
+      mkdirSync(destDir, { recursive: true });
+    }
+    renameSync(source, destination);
+    migrationLog('info', 'Migrated path', { from: source, to: destination });
+  } catch (err) {
+    migrationLog('warn', 'Failed to migrate path', { err: String(err), from: source, to: destination });
+  }
+}
+
+/**
+ * Migrate from the flat ~/.vellum layout to the workspace-based layout.
+ *
+ * Step (a) is special: if the workspace dir doesn't exist yet but the old
+ * sandbox working dir (data/sandbox/fs) does, its contents are "extracted"
+ * to become the new workspace root via rename. All subsequent moves then
+ * land inside that workspace directory.
+ *
+ * Idempotent: safe to call on every startup — already-migrated items are
+ * skipped, and a second run is a no-op.
+ */
+export function migrateToWorkspaceLayout(): void {
+  const root = getRootDir();
+  if (!existsSync(root)) return;
+
+  const ws = getWorkspaceDir();
+
+  // (a) Extract data/sandbox/fs -> workspace (only when workspace doesn't exist yet)
+  if (!existsSync(ws)) {
+    const sandboxFs = join(root, 'data', 'sandbox', 'fs');
+    if (existsSync(sandboxFs)) {
+      try {
+        renameSync(sandboxFs, ws);
+        migrationLog('info', 'Extracted sandbox/fs as workspace root', { from: sandboxFs, to: ws });
+      } catch (err) {
+        migrationLog('warn', 'Failed to extract sandbox/fs', { err: String(err), from: sandboxFs, to: ws });
+      }
+    }
+  }
+
+  // (b)-(h) Move legacy root-level items into workspace
+  migratePath(join(root, 'config.json'), join(ws, 'config.json'));
+  mergeSkippedConfigKeys(join(root, 'config.json'), join(ws, 'config.json'));
+  migratePath(join(root, 'data'), join(ws, 'data'));
+  mergeLegacyDataEntries(join(root, 'data'), join(ws, 'data'));
+  migratePath(join(root, 'hooks'), join(ws, 'hooks'));
+  mergeLegacyHooks(join(root, 'hooks'), join(ws, 'hooks'));
+  migratePath(join(root, 'IDENTITY.md'), join(ws, 'IDENTITY.md'));
+  migratePath(join(root, 'skills'), join(ws, 'skills'));
+  mergeLegacySkills(join(root, 'skills'), join(ws, 'skills'));
+  migratePath(join(root, 'SOUL.md'), join(ws, 'SOUL.md'));
+  migratePath(join(root, 'USER.md'), join(ws, 'USER.md'));
+}
diff --git a/assistant/src/permissions/checker.ts b/assistant/src/permissions/checker.ts
index 2b4c92faa88..e889b627dc8 100644
--- a/assistant/src/permissions/checker.ts
+++ b/assistant/src/permissions/checker.ts
@@ -1,6 +1,6 @@
+import { createHash } from 'node:crypto';
 import { RiskLevel, type PermissionCheckResult, type AllowlistOption, type ScopeOption, type PolicyContext } from './types.js';
-import { findHighestPriorityRule } from './trust-store.js';
-import { parse } from '../tools/terminal/parser.js';
+import { findHighestPriorityRule, onRulesChanged } from './trust-store.js';
 import { resolveSkillSelector } from '../config/skills.js';
 import { computeSkillVersionHash } from '../skills/version-hash.js';
 import { getTool } from '../tools/registry.js';
@@ -11,9 +11,33 @@ import { homedir } from 'node:os';
 import { looksLikeHostPortShorthand, looksLikePathOnlyInput } from '../tools/network/url-safety.js';
 import { normalizeFilePath, isSkillSourcePath } from '../skills/path-classifier.js';
 import { isWorkspaceScopedInvocation } from './workspace-policy.js';
-import { buildShellCommandCandidates, buildShellAllowlistOptions, type ParsedCommand } from './shell-identity.js';
+import { buildShellCommandCandidates, buildShellAllowlistOptions, cachedParse, type ParsedCommand } from './shell-identity.js';
 import type { ManifestOverride } from '../tools/execution-target.js';
 
+// ── Risk classification cache ────────────────────────────────────────────────
+// classifyRisk() is called on every permission check and can invoke WASM
+// parsing for shell commands. Cache results keyed on (toolName, inputHash).
+// The core risk logic is deterministic in the input, but file-tool
+// classification also runs skill source path checks that depend on config,
+// so the cache is cleared whenever trust rules change.
+const RISK_CACHE_MAX = 256;
+const riskCache = new Map<string, RiskLevel>();
+
+function riskCacheKey(toolName: string, input: Record<string, unknown>): string {
+  const inputJson = JSON.stringify(input);
+  const hash = createHash('sha256').update(inputJson).digest('hex');
+  return `${toolName}\0${hash}`;
+}
+
+/** Clear the risk classification cache. Called when trust rules change. */
+export function clearRiskCache(): void {
+  riskCache.clear();
+}
+
+// Invalidate risk cache whenever trust rules change so that risk decisions
+// referencing config-dependent checks (e.g. skill source paths) stay fresh.
+onRulesChanged(clearRiskCache);
+
 // Ensures the legacy mode deprecation warning fires at most once per process.
 let _legacyDeprecationWarned = false;
 
@@ -280,7 +304,36 @@ async function buildCommandCandidates(toolName: string, input: Record<string, un
   return [...new Set(candidates)];
 }
 
-export async function classifyRisk(toolName: string, input: Record<string, unknown>, workingDir?: string, preParsed?: ParsedCommand, manifestOverride?: ManifestOverride): Promise<RiskLevel> {
+export async function classifyRisk(toolName: string, input: Record<string, unknown>, workingDir?: string, preParsed?: ParsedCommand, manifestOverride?: ManifestOverride, signal?: AbortSignal): Promise<RiskLevel> {
+  if (signal?.aborted) throw new Error('Cancelled');
+
+  // Check cache first (skip when preParsed is provided since caller already
+  // parsed and we'd just be duplicating the key computation cost).
+  const cacheKey = preParsed ? null : riskCacheKey(toolName, input);
+  if (cacheKey) {
+    const cached = riskCache.get(cacheKey);
+    if (cached !== undefined) {
+      // LRU refresh
+      riskCache.delete(cacheKey);
+      riskCache.set(cacheKey, cached);
+      return cached;
+    }
+  }
+
+  const result = await classifyRiskUncached(toolName, input, workingDir, preParsed, manifestOverride);
+
+  if (cacheKey) {
+    if (riskCache.size >= RISK_CACHE_MAX) {
+      const oldest = riskCache.keys().next().value;
+      if (oldest !== undefined) riskCache.delete(oldest);
+    }
+    riskCache.set(cacheKey, result);
+  }
+
+  return result;
+}
+
+async function classifyRiskUncached(toolName: string, input: Record<string, unknown>, workingDir?: string, preParsed?: ParsedCommand, manifestOverride?: ManifestOverride): Promise<RiskLevel> {
   if (toolName === 'file_read') return RiskLevel.Low;
   if (toolName === 'file_write' || toolName === 'file_edit') {
     const filePath = getStringField(input, 'path', 'file_path');
@@ -320,7 +373,7 @@ export async function classifyRisk(toolName: string, input: Record<string, unkno
     const command = (input.command as string) ?? '';
     if (!command.trim()) return RiskLevel.Low;
 
-    const parsed = preParsed ?? await parse(command);
+    const parsed = preParsed ?? await cachedParse(command);
 
     // Dangerous patterns → High
     if (parsed.dangerousPatterns.length > 0) return RiskLevel.High;
@@ -411,17 +464,20 @@ export async function check(
   workingDir: string,
   policyContext?: PolicyContext,
   manifestOverride?: ManifestOverride,
+  signal?: AbortSignal,
 ): Promise<PermissionCheckResult> {
+  if (signal?.aborted) throw new Error('Cancelled');
+
   // For shell tools, parse once and share the result to avoid duplicate tree-sitter work.
   let shellParsed: ParsedCommand | undefined;
   if (toolName === 'bash' || toolName === 'host_bash') {
     const command = ((input.command as string) ?? '').trim();
     if (command) {
-      shellParsed = await parse(command);
+      shellParsed = await cachedParse(command);
     }
   }
 
-  const risk = await classifyRisk(toolName, input, workingDir, shellParsed, manifestOverride);
+  const risk = await classifyRisk(toolName, input, workingDir, shellParsed, manifestOverride, signal);
 
   // Build command string candidates for rule matching
   const commandCandidates = await buildCommandCandidates(toolName, input, workingDir, shellParsed);
@@ -551,7 +607,8 @@ function friendlyHostname(url: URL): string {
   return url.hostname.replace(/^www\./, '');
 }
 
-export async function generateAllowlistOptions(toolName: string, input: Record<string, unknown>): Promise<AllowlistOption[]> {
+export async function generateAllowlistOptions(toolName: string, input: Record<string, unknown>, signal?: AbortSignal): Promise<AllowlistOption[]> {
+  if (signal?.aborted) throw new Error('Cancelled');
   if (toolName === 'bash' || toolName === 'host_bash') {
     const command = ((input.command as string) ?? '').trim();
     return buildShellAllowlistOptions(command);
diff --git a/assistant/src/permissions/prompter.ts b/assistant/src/permissions/prompter.ts
index a730e494cf7..59b3746bb43 100644
--- a/assistant/src/permissions/prompter.ts
+++ b/assistant/src/permissions/prompter.ts
@@ -43,12 +43,15 @@ export class PermissionPrompter {
     sessionId?: string,
     executionTarget?: ExecutionTarget,
     persistentDecisionsAllowed?: boolean,
+    signal?: AbortSignal,
   ): Promise<{
     decision: UserDecision;
     selectedPattern?: string;
     selectedScope?: string;
     decisionContext?: string;
   }> {
+    if (signal?.aborted) return { decision: 'deny' };
+
     const requestId = uuid();
 
     return new Promise((resolve, reject) => {
@@ -61,6 +64,17 @@ export class PermissionPrompter {
 
       this.pending.set(requestId, { resolve, reject, timer });
 
+      if (signal) {
+        const onAbort = () => {
+          if (this.pending.has(requestId)) {
+            clearTimeout(timer);
+            this.pending.delete(requestId);
+            resolve({ decision: 'deny' });
+          }
+        };
+        signal.addEventListener('abort', onAbort, { once: true });
+      }
+
       this.sendToClient({
         type: 'confirmation_request',
         requestId,
diff --git a/assistant/src/permissions/shell-identity.ts b/assistant/src/permissions/shell-identity.ts
index 127ad29ebfb..7b4949f2365 100644
--- a/assistant/src/permissions/shell-identity.ts
+++ b/assistant/src/permissions/shell-identity.ts
@@ -3,6 +3,36 @@ import type { AllowlistOption } from './types.js';
 
 export type { ParsedCommand };
 
+// ── Shell parse result cache ─────────────────────────────────────────────────
+// Shell parsing via web-tree-sitter WASM is deterministic — the same command
+// string always produces the same ParsedCommand. Cache results to avoid
+// redundant WASM invocations on repeated permission checks.
+const PARSE_CACHE_MAX = 256;
+const parseCache = new Map<string, ParsedCommand>();
+
+export async function cachedParse(command: string): Promise<ParsedCommand> {
+  const cached = parseCache.get(command);
+  if (cached !== undefined) {
+    // LRU refresh: move to end of insertion order
+    parseCache.delete(command);
+    parseCache.set(command, cached);
+    return cached;
+  }
+  const result = await parse(command);
+  // Evict oldest entry if at capacity
+  if (parseCache.size >= PARSE_CACHE_MAX) {
+    const oldest = parseCache.keys().next().value;
+    if (oldest !== undefined) parseCache.delete(oldest);
+  }
+  parseCache.set(command, result);
+  return result;
+}
+
+/** Clear the shell parse cache. Exposed for testing. */
+export function clearShellParseCache(): void {
+  parseCache.clear();
+}
+
 export interface ShellActionKey {
   /** e.g. "action:gh", "action:gh pr", "action:gh pr view" */
   key: string;
@@ -40,7 +70,7 @@ const MAX_ACTION_KEY_DEPTH = 3;
  * identity information for permission decisions.
  */
 export async function analyzeShellCommand(command: string, preParsed?: ParsedCommand): Promise<ShellIdentityAnalysis> {
-  const parsed = preParsed ?? await parse(command);
+  const parsed = preParsed ?? await cachedParse(command);
 
   const operators: string[] = [];
   for (const seg of parsed.segments) {
diff --git a/assistant/src/permissions/trust-store.ts b/assistant/src/permissions/trust-store.ts
index a8a4d15ae65..607e5f06bb8 100644
--- a/assistant/src/permissions/trust-store.ts
+++ b/assistant/src/permissions/trust-store.ts
@@ -21,6 +21,21 @@ interface TrustFile {
 let cachedRules: TrustRule[] | null = null;
 let cachedStarterBundleAccepted: boolean | null = null;
 
+// Callbacks invoked when trust rules change (add/update/remove/clear).
+// Used by the permission checker to invalidate dependent caches.
+const rulesChangedListeners: Array<() => void> = [];
+
+/** Register a callback to be invoked whenever trust rules change. */
+export function onRulesChanged(listener: () => void): void {
+  rulesChangedListeners.push(listener);
+}
+
+function notifyRulesChanged(): void {
+  for (const listener of rulesChangedListeners) {
+    listener();
+  }
+}
+
 /**
  * Cache of pre-compiled Minimatch objects keyed by pattern string.
  * Rebuilt whenever cachedRules changes. Avoids re-parsing glob patterns
@@ -368,6 +383,7 @@ export function addRule(
   cachedRules = rules;
   rebuildPatternCache(rules);
   saveToDisk(rules);
+  notifyRulesChanged();
   log.info({ rule }, 'Added trust rule');
   return rule;
 }
@@ -395,6 +411,7 @@ export function updateRule(
   cachedRules = rules;
   rebuildPatternCache(rules);
   saveToDisk(rules);
+  notifyRulesChanged();
   log.info({ rule }, 'Updated trust rule');
   return rule;
 }
@@ -412,6 +429,7 @@ export function removeRule(id: string): boolean {
   cachedRules = rules;
   rebuildPatternCache(rules);
   saveToDisk(rules);
+  notifyRulesChanged();
   log.info({ id }, 'Removed trust rule');
   return true;
 }
@@ -508,6 +526,7 @@ export function clearAllRules(): void {
   cachedRules = rules;
   rebuildPatternCache(rules);
   saveToDisk(rules);
+  notifyRulesChanged();
   log.info('Cleared all user trust rules (default rules preserved)');
 }
 
@@ -608,6 +627,7 @@ export function acceptStarterBundle(): AcceptStarterBundleResult {
   cachedRules = rules;
   rebuildPatternCache(rules);
   saveToDisk(rules);
+  notifyRulesChanged();
   log.info({ rulesAdded: added }, 'Starter approval bundle accepted');
 
   return { accepted: true, rulesAdded: added, alreadyAccepted: false };
diff --git a/assistant/src/runtime/channel-retry-sweep.ts b/assistant/src/runtime/channel-retry-sweep.ts
new file mode 100644
index 00000000000..808066fa77e
--- /dev/null
+++ b/assistant/src/runtime/channel-retry-sweep.ts
@@ -0,0 +1,184 @@
+/**
+ * Periodic retry sweep for failed channel inbound events.
+ */
+
+import { parseChannelId, isChannelId } from '../channels/types.js';
+import { getLogger } from '../util/logger.js';
+import * as channelDeliveryStore from '../memory/channel-delivery-store.js';
+import * as conversationStore from '../memory/conversation-store.js';
+import * as attachmentsStore from '../memory/attachments-store.js';
+import { renderHistoryContent } from '../daemon/handlers.js';
+import { deliverChannelReply } from './gateway-client.js';
+import type { GuardianRuntimeContext } from '../daemon/session-runtime-assembly.js';
+import type { MessageProcessor } from './http-types.js';
+
+const log = getLogger('runtime-http');
+
+function parseGuardianRuntimeContext(value: unknown): GuardianRuntimeContext | undefined {
+  if (!value || typeof value !== 'object') return undefined;
+  const raw = value as Record<string, unknown>;
+  const actorRole = raw.actorRole;
+  if (
+    actorRole !== 'guardian'
+    && actorRole !== 'non-guardian'
+    && actorRole !== 'unverified_channel'
+  ) {
+    return undefined;
+  }
+  const rawSourceChannel = typeof raw.sourceChannel === 'string' && raw.sourceChannel.trim().length > 0
+    ? raw.sourceChannel
+    : undefined;
+  if (!rawSourceChannel || !isChannelId(rawSourceChannel)) return undefined;
+  const sourceChannel = rawSourceChannel;
+  const denialReason =
+    raw.denialReason === 'no_binding' || raw.denialReason === 'no_identity'
+      ? raw.denialReason
+      : undefined;
+  return {
+    sourceChannel,
+    actorRole,
+    guardianChatId: typeof raw.guardianChatId === 'string' ? raw.guardianChatId : undefined,
+    guardianExternalUserId: typeof raw.guardianExternalUserId === 'string' ? raw.guardianExternalUserId : undefined,
+    requesterIdentifier: typeof raw.requesterIdentifier === 'string' ? raw.requesterIdentifier : undefined,
+    requesterExternalUserId: typeof raw.requesterExternalUserId === 'string' ? raw.requesterExternalUserId : undefined,
+    requesterChatId: typeof raw.requesterChatId === 'string' ? raw.requesterChatId : undefined,
+    denialReason,
+  };
+}
+
+/**
+ * Periodically retry failed channel inbound events that have passed
+ * their exponential backoff delay.
+ */
+export async function sweepFailedEvents(
+  processMessage: MessageProcessor,
+  bearerToken: string | undefined,
+): Promise<void> {
+  const events = channelDeliveryStore.getRetryableEvents();
+  if (events.length === 0) return;
+
+  log.info({ count: events.length }, 'Retrying failed channel inbound events');
+
+  for (const event of events) {
+    if (!event.rawPayload) {
+      // No payload stored -- can't replay, move to dead letter
+      channelDeliveryStore.recordProcessingFailure(
+        event.id,
+        new Error('No raw payload stored for replay'),
+      );
+      continue;
+    }
+
+    let payload: Record<string, unknown>;
+    try {
+      payload = JSON.parse(event.rawPayload) as Record<string, unknown>;
+    } catch {
+      channelDeliveryStore.recordProcessingFailure(
+        event.id,
+        new Error('Failed to parse stored raw payload'),
+      );
+      continue;
+    }
+
+    const content = typeof payload.content === 'string' ? payload.content.trim() : '';
+    const attachmentIds = Array.isArray(payload.attachmentIds) ? payload.attachmentIds as string[] : undefined;
+    const sourceChannel = parseChannelId(payload.sourceChannel);
+    if (!sourceChannel) {
+      channelDeliveryStore.recordProcessingFailure(
+        event.id,
+        new Error(`Invalid sourceChannel: ${String(payload.sourceChannel)}`),
+      );
+      continue;
+    }
+    const sourceMetadata = payload.sourceMetadata as Record<string, unknown> | undefined;
+    const assistantId = typeof payload.assistantId === 'string'
+      ? payload.assistantId
+      : undefined;
+    const guardianContext = parseGuardianRuntimeContext(payload.guardianCtx);
+
+    const metadataHintsRaw = sourceMetadata?.hints;
+    const metadataHints = Array.isArray(metadataHintsRaw)
+      ? metadataHintsRaw.filter((h): h is string => typeof h === 'string' && h.trim().length > 0)
+      : [];
+    const metadataUxBrief = typeof sourceMetadata?.uxBrief === 'string' && sourceMetadata.uxBrief.trim().length > 0
+      ? sourceMetadata.uxBrief.trim()
+      : undefined;
+
+    try {
+      const { messageId: userMessageId } = await processMessage(
+        event.conversationId,
+        content,
+        attachmentIds,
+        {
+          transport: {
+            channelId: sourceChannel,
+            hints: metadataHints.length > 0 ? metadataHints : undefined,
+            uxBrief: metadataUxBrief,
+          },
+          assistantId,
+          guardianContext,
+        },
+      );
+      channelDeliveryStore.linkMessage(event.id, userMessageId);
+      channelDeliveryStore.markProcessed(event.id);
+      log.info({ eventId: event.id }, 'Successfully replayed failed channel event');
+
+      const replyCallbackUrl = typeof payload.replyCallbackUrl === 'string'
+        ? payload.replyCallbackUrl
+        : undefined;
+      if (replyCallbackUrl) {
+        const externalChatId = typeof payload.externalChatId === 'string'
+          ? payload.externalChatId
+          : undefined;
+        if (externalChatId) {
+          await deliverReplyViaCallback(
+            event.conversationId,
+            externalChatId,
+            replyCallbackUrl,
+            bearerToken,
+            assistantId,
+          );
+        }
+      }
+    } catch (err) {
+      log.error({ err, eventId: event.id }, 'Retry failed for channel event');
+      channelDeliveryStore.recordProcessingFailure(event.id, err);
+    }
+  }
+}
+
+async function deliverReplyViaCallback(
+  conversationId: string,
+  externalChatId: string,
+  callbackUrl: string,
+  bearerToken: string | undefined,
+  assistantId?: string,
+): Promise<void> {
+  const msgs = conversationStore.getMessages(conversationId);
+  for (let i = msgs.length - 1; i >= 0; i--) {
+    if (msgs[i].role === 'assistant') {
+      let parsed: unknown;
+      try { parsed = JSON.parse(msgs[i].content); } catch { parsed = msgs[i].content; }
+      const rendered = renderHistoryContent(parsed);
+
+      const linked = attachmentsStore.getAttachmentMetadataForMessage(msgs[i].id);
+      const replyAttachments = linked.map((a) => ({
+        id: a.id,
+        filename: a.originalFilename,
+        mimeType: a.mimeType,
+        sizeBytes: a.sizeBytes,
+        kind: a.kind,
+      }));
+
+      if (rendered.text || replyAttachments.length > 0) {
+        await deliverChannelReply(callbackUrl, {
+          chatId: externalChatId,
+          text: rendered.text || undefined,
+          attachments: replyAttachments.length > 0 ? replyAttachments : undefined,
+          assistantId,
+        }, bearerToken);
+      }
+      break;
+    }
+  }
+}
diff --git a/assistant/src/runtime/http-server.ts b/assistant/src/runtime/http-server.ts
index 148f3b7f630..85aa775c219 100644
--- a/assistant/src/runtime/http-server.ts
+++ b/assistant/src/runtime/http-server.ts
@@ -1,27 +1,19 @@
 /**
  * Optional HTTP server that exposes the canonical runtime API.
  *
- * Runs in the same process as the daemon. Started only when
- * `RUNTIME_HTTP_PORT` is set (default: disabled).
+ * Runs in the same process as the daemon. Always started on the
+ * configured port (default: 7821).
  */
 
-import { existsSync, readFileSync, statSync, statfsSync } from 'node:fs';
-import { resolve, join, dirname } from 'node:path';
-import { fileURLToPath } from 'node:url';
-import { timingSafeEqual } from 'node:crypto';
-import { parseChannelId, isChannelId } from '../channels/types.js';
-import { ConfigError, IngressBlockedError } from '../util/errors.js';
+import { existsSync, readFileSync } from 'node:fs';
+import { resolve } from 'node:path';
+import { parseChannelId } from '../channels/types.js';
 import { getLogger } from '../util/logger.js';
-import { getWorkspacePromptPath, readLockfile } from '../util/platform.js';
 import {
   getGatewayInternalBaseUrl,
-  isTwilioWebhookValidationDisabled,
   isHttpAuthDisabled,
   getRuntimeGatewayOriginSecret,
 } from '../config/env.js';
-import { TwilioConversationRelayProvider } from '../calls/twilio-provider.js';
-import { loadConfig } from '../config/loader.js';
-import { getPublicBaseUrl } from '../inbound/public-ingress-urls.js';
 import type { RunOrchestrator } from './run-orchestrator.js';
 
 // Route handlers — grouped by domain
@@ -56,12 +48,8 @@ import {
   startGuardianActionSweep,
   stopGuardianActionSweep,
 } from '../calls/guardian-action-sweep.js';
-import * as channelDeliveryStore from '../memory/channel-delivery-store.js';
 import * as conversationStore from '../memory/conversation-store.js';
 import * as externalConversationStore from '../memory/external-conversation-store.js';
-import * as attachmentsStore from '../memory/attachments-store.js';
-import { renderHistoryContent } from '../daemon/handlers.js';
-import { deliverChannelReply } from './gateway-client.js';
 import {
   handleServePage,
   handleShareApp,
@@ -88,13 +76,39 @@ import { extensionRelayServer } from '../browser-extension-relay/server.js';
 import type { BrowserRelayWebSocketData } from '../browser-extension-relay/server.js';
 import { handleSubscribeAssistantEvents } from './routes/events-routes.js';
 import { consumeCallback, consumeCallbackError } from '../security/oauth-callback-registry.js';
-import type { GuardianRuntimeContext } from '../daemon/session-runtime-assembly.js';
 import { PairingStore } from '../daemon/pairing-store.js';
+import type { ServerMessage } from '../daemon/ipc-contract.js';
+
+// Middleware
+import {
+  verifyBearerToken,
+  isLoopbackHost,
+  isPrivateNetworkPeer,
+  isPrivateNetworkOrigin,
+  extractBearerToken,
+} from './middleware/auth.js';
+import { withErrorHandling } from './middleware/error-handler.js';
+import {
+  TWILIO_WEBHOOK_RE,
+  TWILIO_GATEWAY_WEBHOOK_RE,
+  GATEWAY_SUBPATH_MAP,
+  GATEWAY_ONLY_BLOCKED_SUBPATHS,
+  validateTwilioWebhook,
+  cloneRequestWithBody,
+} from './middleware/twilio-validation.js';
+
+// Extracted route handlers
 import {
-  isDeviceApproved,
-  refreshDevice,
-  hashDeviceId,
-} from '../daemon/approved-devices-store.js';
+  handlePairingRegister,
+  handlePairingRequest,
+  handlePairingStatus,
+} from './routes/pairing-routes.js';
+import type { PairingHandlerContext } from './routes/pairing-routes.js';
+import { handleHealth, handleGetIdentity } from './routes/identity-routes.js';
+import { sweepFailedEvents } from './channel-retry-sweep.js';
+
+// Re-export for consumers
+export { isPrivateAddress } from './middleware/auth.js';
 
 // Re-export shared types so existing consumers don't need to update imports
 export type {
@@ -104,6 +118,7 @@ export type {
   RuntimeHttpServerOptions,
   RuntimeAttachmentMetadata,
   ApprovalCopyGenerator,
+  ApprovalConversationGenerator,
 } from './http-types.js';
 
 import type {
@@ -111,6 +126,7 @@ import type {
   NonBlockingMessageProcessor,
   RuntimeHttpServerOptions,
   ApprovalCopyGenerator,
+  ApprovalConversationGenerator,
 } from './http-types.js';
 
 const log = getLogger('runtime-http');
@@ -118,279 +134,9 @@ const log = getLogger('runtime-http');
 const DEFAULT_PORT = 7821;
 const DEFAULT_HOSTNAME = '127.0.0.1';
 
-/** Resolve the gateway base URL for internal delivery callbacks. */
-function getGatewayBaseUrl(): string {
-  return getGatewayInternalBaseUrl();
-}
-
-/** Global hard cap on request body size (50 MB). Bun rejects larger payloads before they reach handlers. */
+/** Global hard cap on request body size (50 MB). */
 const MAX_REQUEST_BODY_BYTES = 50 * 1024 * 1024;
 
-function parseGuardianRuntimeContext(value: unknown): GuardianRuntimeContext | undefined {
-  if (!value || typeof value !== 'object') return undefined;
-  const raw = value as Record<string, unknown>;
-  const actorRole = raw.actorRole;
-  if (
-    actorRole !== 'guardian'
-    && actorRole !== 'non-guardian'
-    && actorRole !== 'unverified_channel'
-  ) {
-    return undefined;
-  }
-  const rawSourceChannel = typeof raw.sourceChannel === 'string' && raw.sourceChannel.trim().length > 0
-    ? raw.sourceChannel
-    : undefined;
-  if (!rawSourceChannel || !isChannelId(rawSourceChannel)) return undefined;
-  const sourceChannel = rawSourceChannel;
-  const denialReason =
-    raw.denialReason === 'no_binding' || raw.denialReason === 'no_identity'
-      ? raw.denialReason
-      : undefined;
-  return {
-    sourceChannel,
-    actorRole,
-    guardianChatId: typeof raw.guardianChatId === 'string' ? raw.guardianChatId : undefined,
-    guardianExternalUserId: typeof raw.guardianExternalUserId === 'string' ? raw.guardianExternalUserId : undefined,
-    requesterIdentifier: typeof raw.requesterIdentifier === 'string' ? raw.requesterIdentifier : undefined,
-    requesterExternalUserId: typeof raw.requesterExternalUserId === 'string' ? raw.requesterExternalUserId : undefined,
-    requesterChatId: typeof raw.requesterChatId === 'string' ? raw.requesterChatId : undefined,
-    denialReason,
-  };
-}
-
-interface DiskSpaceInfo {
-  path: string;
-  totalMb: number;
-  usedMb: number;
-  freeMb: number;
-}
-
-function getDiskSpaceInfo(): DiskSpaceInfo | null {
-  try {
-    const baseDataDir = process.env.BASE_DATA_DIR?.trim();
-    const diskPath = baseDataDir && existsSync(baseDataDir) ? baseDataDir : '/';
-    const stats = statfsSync(diskPath);
-    const totalBytes = stats.bsize * stats.blocks;
-    const freeBytes = stats.bsize * stats.bavail;
-    const bytesToMb = (b: number) => Math.round((b / (1024 * 1024)) * 100) / 100;
-    return {
-      path: diskPath,
-      totalMb: bytesToMb(totalBytes),
-      usedMb: bytesToMb(totalBytes - freeBytes),
-      freeMb: bytesToMb(freeBytes),
-    };
-  } catch {
-    return null;
-  }
-}
-
-/**
- * Regex to extract the Twilio webhook subpath from both top-level and
- * assistant-scoped route shapes:
- *   /v1/calls/twilio/<subpath>
- *   /v1/assistants/<id>/calls/twilio/<subpath>
- */
-const TWILIO_WEBHOOK_RE = /^\/v1\/(?:assistants\/[^/]+\/)?calls\/twilio\/(.+)$/;
-
-/**
- * Gateway-compatible Twilio webhook paths:
- *   /webhooks/twilio/<subpath>
- *
- * Maps gateway path segments to the internal subpath names used by the
- * dispatcher below (e.g. "voice" -> "voice-webhook").
- */
-const TWILIO_GATEWAY_WEBHOOK_RE = /^\/webhooks\/twilio\/(.+)$/;
-const GATEWAY_SUBPATH_MAP: Record<string, string> = {
-  voice: 'voice-webhook',
-  status: 'status',
-  'connect-action': 'connect-action',
-  sms: 'sms',
-};
-
-/**
- * Direct Twilio webhook subpaths that are blocked in gateway_only mode.
- * Includes all public-facing webhook paths (voice, status, connect-action, SMS)
- * because the runtime must never serve as a direct ingress for external webhooks.
- * Internal forwarding endpoints (gateway→runtime) are unaffected.
- */
-const GATEWAY_ONLY_BLOCKED_SUBPATHS = new Set(['voice-webhook', 'status', 'connect-action', 'sms']);
-
-/**
- * Check if a request origin is from a private/internal network address.
- * Extracts the hostname from the Origin header and validates it against
- * isPrivateAddress(), consistent with the isPrivateNetworkPeer check.
- */
-function isPrivateNetworkOrigin(req: Request): boolean {
-  const origin = req.headers.get('origin');
-  // No origin header (e.g., server-initiated or same-origin) — allow
-  if (!origin) return true;
-  try {
-    const url = new URL(origin);
-    const host = url.hostname;
-    if (host === 'localhost') return true;
-    // URL.hostname wraps IPv6 addresses in brackets (e.g. "[::1]") — strip them
-    const rawHost = host.startsWith('[') && host.endsWith(']') ? host.slice(1, -1) : host;
-    return isPrivateAddress(rawHost);
-  } catch {
-    return false;
-  }
-}
-
-/**
- * Check if a hostname is a loopback address.
- */
-function isLoopbackHost(hostname: string): boolean {
-  return hostname === '127.0.0.1' || hostname === '::1' || hostname === 'localhost';
-}
-
-/**
- * Check if the actual peer/remote address of a connection is from a
- * private/internal network. Uses Bun's server.requestIP() to get the
- * real peer address, which cannot be spoofed unlike the Origin header.
- *
- * Accepts loopback, RFC 1918 private IPv4, link-local, and RFC 4193
- * unique-local IPv6 — including their IPv4-mapped IPv6 forms. This
- * supports container/pod deployments (e.g. Kubernetes sidecars) where
- * gateway and runtime communicate over pod-internal private IPs.
- */
-function isPrivateNetworkPeer(server: { requestIP(req: Request): { address: string; family: string; port: number } | null }, req: Request): boolean {
-  const ip = server.requestIP(req);
-  if (!ip) return false;
-  return isPrivateAddress(ip.address);
-}
-
-/**
- * @internal Exported for testing.
- *
- * Determine whether an IP address string belongs to a private/internal
- * network range:
- *   - Loopback: 127.0.0.0/8, ::1
- *   - RFC 1918: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
- *   - Link-local: 169.254.0.0/16
- *   - IPv6 unique local: fc00::/7 (fc00::–fdff::)
- *   - IPv4-mapped IPv6 variants of all of the above (::ffff:x.x.x.x)
- */
-export function isPrivateAddress(addr: string): boolean {
-  // Handle IPv4-mapped IPv6 (e.g. ::ffff:10.0.0.1) — extract the IPv4 part
-  const v4Mapped = addr.match(/^::ffff:(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})$/i);
-  const normalized = v4Mapped ? v4Mapped[1] : addr;
-
-  // IPv4 checks
-  if (normalized.includes('.')) {
-    const parts = normalized.split('.').map(Number);
-    if (parts.length !== 4 || parts.some(p => isNaN(p) || p < 0 || p > 255)) return false;
-
-    // Loopback: 127.0.0.0/8
-    if (parts[0] === 127) return true;
-    // 10.0.0.0/8
-    if (parts[0] === 10) return true;
-    // 172.16.0.0/12 (172.16.x.x – 172.31.x.x)
-    if (parts[0] === 172 && parts[1] >= 16 && parts[1] <= 31) return true;
-    // 192.168.0.0/16
-    if (parts[0] === 192 && parts[1] === 168) return true;
-    // Link-local: 169.254.0.0/16
-    if (parts[0] === 169 && parts[1] === 254) return true;
-
-    return false;
-  }
-
-  // IPv6 checks
-  const lower = normalized.toLowerCase();
-  // Loopback
-  if (lower === '::1') return true;
-  // Unique local: fc00::/7 (fc00:: through fdff::)
-  if (lower.startsWith('fc') || lower.startsWith('fd')) return true;
-  // Link-local: fe80::/10
-  if (lower.startsWith('fe80')) return true;
-
-  return false;
-}
-
-/**
- * Validate a Twilio webhook request's X-Twilio-Signature header.
- *
- * Returns the raw body text on success so callers can reconstruct the Request
- * for downstream handlers (which also need to read the body).
- * Returns a 403 Response if signature validation fails.
- *
- * Fail-closed: if the auth token is not configured, the request is rejected
- * with 403 rather than silently skipping validation. An explicit local-dev
- * bypass is available via TWILIO_WEBHOOK_VALIDATION_DISABLED=true.
- */
-async function validateTwilioWebhook(
-  req: Request,
-): Promise<{ body: string } | Response> {
-  const rawBody = await req.text();
-
-  // Allow explicit local-dev bypass — must be exactly "true"
-  if (isTwilioWebhookValidationDisabled()) {
-    log.warn('Twilio webhook signature validation explicitly disabled via TWILIO_WEBHOOK_VALIDATION_DISABLED');
-    return { body: rawBody };
-  }
-
-  const authToken = TwilioConversationRelayProvider.getAuthToken();
-
-  // Fail-closed: reject if no auth token is configured
-  if (!authToken) {
-    log.error('Twilio auth token not configured — rejecting webhook request (fail-closed)');
-    return Response.json({ error: 'Forbidden' }, { status: 403 });
-  }
-
-  const signature = req.headers.get('x-twilio-signature');
-  if (!signature) {
-    log.warn('Twilio webhook request missing X-Twilio-Signature header');
-    return Response.json({ error: 'Forbidden' }, { status: 403 });
-  }
-
-  // Parse form-urlencoded body into key-value params for signature computation
-  const params: Record<string, string> = {};
-  const formData = new URLSearchParams(rawBody);
-  for (const [key, value] of formData.entries()) {
-    params[key] = value;
-  }
-
-  // Reconstruct the public-facing URL that Twilio signed against.
-  // Behind proxies/gateways, req.url is the local server URL (e.g.
-  // http://127.0.0.1:7821/...) which differs from the public URL Twilio
-  // used to compute the HMAC-SHA1 signature.
-  let publicBaseUrl: string | undefined;
-  try {
-    publicBaseUrl = getPublicBaseUrl(loadConfig());
-  } catch {
-    // No webhook base URL configured — fall back to using req.url as-is
-  }
-  const parsedUrl = new URL(req.url);
-  const publicUrl = publicBaseUrl
-    ? publicBaseUrl + parsedUrl.pathname + parsedUrl.search
-    : req.url;
-
-  const isValid = TwilioConversationRelayProvider.verifyWebhookSignature(
-    publicUrl,
-    params,
-    signature,
-    authToken,
-  );
-
-  if (!isValid) {
-    log.warn('Twilio webhook signature validation failed');
-    return Response.json({ error: 'Forbidden' }, { status: 403 });
-  }
-
-  return { body: rawBody };
-}
-
-/**
- * Re-create a Request with the same method, headers, and URL but with a
- * pre-read body string so downstream handlers can call req.text() again.
- */
-function cloneRequestWithBody(original: Request, body: string): Request {
-  return new Request(original.url, {
-    method: original.method,
-    headers: original.headers,
-    body,
-  });
-}
-
 export class RuntimeHttpServer {
   private server: ReturnType<typeof Bun.serve> | null = null;
   private port: number;
@@ -400,13 +146,14 @@ export class RuntimeHttpServer {
   private persistAndProcessMessage?: NonBlockingMessageProcessor;
   private runOrchestrator?: RunOrchestrator;
   private approvalCopyGenerator?: ApprovalCopyGenerator;
+  private approvalConversationGenerator?: ApprovalConversationGenerator;
   private interfacesDir: string | null;
   private suggestionCache = new Map<string, string>();
   private suggestionInFlight = new Map<string, Promise<string | null>>();
   private retrySweepTimer: ReturnType<typeof setInterval> | null = null;
   private sweepInProgress = false;
   private pairingStore = new PairingStore();
-  private pairingBroadcast?: (msg: { type: string; [key: string]: unknown }) => void;
+  private pairingBroadcast?: (msg: ServerMessage) => void;
 
   constructor(options: RuntimeHttpServerOptions = {}) {
     this.port = options.port ?? DEFAULT_PORT;
@@ -416,6 +163,7 @@ export class RuntimeHttpServer {
     this.persistAndProcessMessage = options.persistAndProcessMessage;
     this.runOrchestrator = options.runOrchestrator;
     this.approvalCopyGenerator = options.approvalCopyGenerator;
+    this.approvalConversationGenerator = options.approvalConversationGenerator;
     this.interfacesDir = options.interfacesDir ?? null;
   }
 
@@ -430,10 +178,18 @@ export class RuntimeHttpServer {
   }
 
   /** Set a callback for broadcasting IPC messages (wired by daemon server). */
-  setPairingBroadcast(fn: (msg: { type: string; [key: string]: unknown }) => void): void {
+  setPairingBroadcast(fn: (msg: ServerMessage) => void): void {
     this.pairingBroadcast = fn;
   }
 
+  private get pairingContext(): PairingHandlerContext {
+    return {
+      pairingStore: this.pairingStore,
+      bearerToken: this.bearerToken,
+      pairingBroadcast: this.pairingBroadcast,
+    };
+  }
+
   async start(): Promise<void> {
     type AllWebSocketData = RelayWebSocketData | BrowserRelayWebSocketData;
     this.server = Bun.serve<AllWebSocketData>({
@@ -449,7 +205,6 @@ export class RuntimeHttpServer {
             extensionRelayServer.handleOpen(ws as any);
             return;
           }
-          // call-relay
           const callSessionId = (data as RelayWebSocketData).callSessionId;
           log.info({ callSessionId }, 'ConversationRelay WebSocket opened');
           if (callSessionId) {
@@ -466,7 +221,6 @@ export class RuntimeHttpServer {
             extensionRelayServer.handleMessage(ws as any, raw);
             return;
           }
-          // call-relay
           const callSessionId = (data as RelayWebSocketData).callSessionId;
           if (callSessionId) {
             const connection = activeRelayConnections.get(callSessionId);
@@ -480,7 +234,6 @@ export class RuntimeHttpServer {
             extensionRelayServer.handleClose(ws as any, code, reason?.toString());
             return;
           }
-          // call-relay
           const callSessionId = (data as RelayWebSocketData).callSessionId;
           log.info({ callSessionId, code, reason: reason?.toString() }, 'ConversationRelay WebSocket closed');
           if (callSessionId) {
@@ -493,28 +246,24 @@ export class RuntimeHttpServer {
       },
     });
 
-    // Sweep failed channel inbound events for retry every 30 seconds
     if (this.processMessage) {
+      const pm = this.processMessage;
+      const bt = this.bearerToken;
       this.retrySweepTimer = setInterval(() => {
         if (this.sweepInProgress) return;
         this.sweepInProgress = true;
-        this.sweepFailedEvents().finally(() => { this.sweepInProgress = false; });
+        sweepFailedEvents(pm, bt).finally(() => { this.sweepInProgress = false; });
       }, 30_000);
     }
 
-    // Start proactive guardian approval expiry sweep whenever orchestrator
-    // support is available. Guardian approvals can be created even when the
-    // generic channel-approval UX flag is disabled.
     if (this.runOrchestrator) {
-      startGuardianExpirySweep(this.runOrchestrator, getGatewayBaseUrl(), this.bearerToken, this.approvalCopyGenerator);
+      startGuardianExpirySweep(this.runOrchestrator, getGatewayInternalBaseUrl(), this.bearerToken, this.approvalCopyGenerator);
       log.info('Guardian approval expiry sweep started');
     }
 
-    // Start guardian action request expiry sweep (cross-channel voice guardian)
-    startGuardianActionSweep(getGatewayBaseUrl(), this.bearerToken);
+    startGuardianActionSweep(getGatewayInternalBaseUrl(), this.bearerToken);
     log.info('Guardian action expiry sweep started');
 
-    // Startup guard: log gateway-only mode warnings
     log.info('Running in gateway-only ingress mode. Direct webhook routes disabled.');
     if (!isLoopbackHost(this.hostname)) {
       log.warn('RUNTIME_HTTP_HOST is not bound to loopback. This may expose the runtime to direct public access.');
@@ -540,144 +289,49 @@ export class RuntimeHttpServer {
     }
   }
 
-  /**
-   * Constant-time comparison of two bearer tokens to prevent timing attacks.
-   */
-  private verifyToken(provided: string): boolean {
-    const expected = this.bearerToken!;
-    const a = Buffer.from(provided);
-    const b = Buffer.from(expected);
-    if (a.length !== b.length) return false;
-    return timingSafeEqual(a, b);
-  }
-
   private async handleRequest(req: Request, server: ReturnType<typeof Bun.serve>): Promise<Response> {
     const url = new URL(req.url);
     const path = url.pathname;
 
-    // Health checks are unauthenticated — they expose no sensitive data.
     if (path === '/healthz' && req.method === 'GET') {
-      return this.handleHealth();
+      return handleHealth();
     }
 
     // WebSocket upgrade for the Chrome extension browser relay.
-    // Localhost-only; optional bearer token via ?token= query param.
     if (path === '/v1/browser-relay' && req.headers.get('upgrade')?.toLowerCase() === 'websocket') {
-      if (!isLoopbackHost(new URL(req.url).hostname) && !isPrivateNetworkPeer(server, req)) {
-        return Response.json(
-          { error: 'Browser relay only accepts connections from localhost', code: 'LOCALHOST_ONLY' },
-          { status: 403 },
-        );
-      }
-
-      // Optional bearer token check via ?token= query param
-      if ((process.env.DISABLE_HTTP_AUTH ?? '').toLowerCase() !== 'true' && this.bearerToken) {
-        const wsUrl = new URL(req.url);
-        const token = wsUrl.searchParams.get('token');
-        if (!token || !this.verifyToken(token)) {
-          return Response.json({ error: 'Unauthorized' }, { status: 401 });
-        }
-      }
-
-      const connectionId = crypto.randomUUID();
-      const upgraded = server.upgrade(req, {
-        data: { wsType: 'browser-relay', connectionId } satisfies BrowserRelayWebSocketData,
-      });
-      if (!upgraded) {
-        return new Response('WebSocket upgrade failed', { status: 500 });
-      }
-      return undefined as unknown as Response;
+      return this.handleBrowserRelayUpgrade(req, server);
     }
 
     // WebSocket upgrade for ConversationRelay — before auth check because
     // Twilio WebSocket connections don't use bearer tokens.
     if (path.startsWith('/v1/calls/relay') && req.headers.get('upgrade')?.toLowerCase() === 'websocket') {
-      // Only allow relay connections from private network peers.
-      // Primary check: actual peer address (cannot be spoofed) — accepts loopback
-      // and RFC 1918/4193 private addresses to support container deployments.
-      // Secondary check: Origin header (defense in depth).
-      if (!isPrivateNetworkPeer(server, req) || !isPrivateNetworkOrigin(req)) {
-        return Response.json(
-          { error: 'Direct relay access disabled — only private network peers allowed', code: 'GATEWAY_ONLY' },
-          { status: 403 },
-        );
-      }
-
-      const wsUrl = new URL(req.url);
-      const callSessionId = wsUrl.searchParams.get('callSessionId');
-      if (!callSessionId) {
-        return new Response('Missing callSessionId', { status: 400 });
-      }
-      const upgraded = server.upgrade(req, { data: { callSessionId } });
-      if (!upgraded) {
-        return new Response('WebSocket upgrade failed', { status: 500 });
-      }
-      // Bun handles the response after a successful upgrade.
-      // The RelayConnection is created in the websocket.open handler.
-      return undefined as unknown as Response;
+      return this.handleRelayUpgrade(req, server);
     }
 
-    // ── Twilio webhook endpoints — before auth check because Twilio
-    //    webhook POSTs don't include bearer tokens.
-    //    Supports /v1/calls/twilio/*, /v1/assistants/:id/calls/twilio/*,
-    //    and gateway-compatible /webhooks/twilio/* paths.
-    //    Validates X-Twilio-Signature to prevent unauthorized access. ──
-    const twilioMatch = path.match(TWILIO_WEBHOOK_RE);
-    const gatewayTwilioMatch = !twilioMatch ? path.match(TWILIO_GATEWAY_WEBHOOK_RE) : null;
-    const resolvedTwilioSubpath = twilioMatch
-      ? twilioMatch[1]
-      : gatewayTwilioMatch
-        ? GATEWAY_SUBPATH_MAP[gatewayTwilioMatch[1]]
-        : null;
-    if (resolvedTwilioSubpath && req.method === 'POST') {
-      const twilioSubpath = resolvedTwilioSubpath;
-
-      // Block direct Twilio webhook routes — must go through the gateway
-      if (GATEWAY_ONLY_BLOCKED_SUBPATHS.has(twilioSubpath)) {
-        return Response.json(
-          { error: 'Direct webhook access disabled. Use the gateway.', code: 'GATEWAY_ONLY' },
-          { status: 410 },
-        );
-      }
-
-      // Validate Twilio request signature before dispatching
-      const validation = await validateTwilioWebhook(req);
-      if (validation instanceof Response) return validation;
+    // Twilio webhook endpoints — before auth check because Twilio
+    // webhook POSTs don't include bearer tokens.
+    const twilioResponse = await this.handleTwilioWebhook(req, path);
+    if (twilioResponse) return twilioResponse;
 
-      // Reconstruct request so handlers can read the body
-      const validatedReq = cloneRequestWithBody(req, validation.body);
-
-      if (twilioSubpath === 'voice-webhook') {
-        return await handleVoiceWebhook(validatedReq);
-      }
-      if (twilioSubpath === 'status') {
-        return await handleStatusCallback(validatedReq);
-      }
-      if (twilioSubpath === 'connect-action') {
-        return await handleConnectAction(validatedReq);
-      }
-    }
-
-    // ── Pairing endpoints (unauthenticated, secret-gated) ──────────
+    // Pairing endpoints (unauthenticated, secret-gated)
     if (path === '/v1/pairing/request' && req.method === 'POST') {
-      return await this.handlePairingRequest(req);
+      return await handlePairingRequest(req, this.pairingContext);
     }
     if (path === '/v1/pairing/status' && req.method === 'GET') {
-      return this.handlePairingStatus(url);
+      return handlePairingStatus(url, this.pairingContext);
     }
 
     // Require bearer token when configured
     if (!isHttpAuthDisabled() && this.bearerToken) {
-      const authHeader = req.headers.get('authorization');
-      const token = authHeader?.startsWith('Bearer ') ? authHeader.slice(7) : null;
-      if (!token || !this.verifyToken(token)) {
+      const token = extractBearerToken(req);
+      if (!token || !verifyBearerToken(token, this.bearerToken)) {
         return Response.json({ error: 'Unauthorized' }, { status: 401 });
       }
     }
 
-    // ── Pairing registration (bearer-authenticated) ──────────────
+    // Pairing registration (bearer-authenticated)
     if (path === '/v1/pairing/register' && req.method === 'POST') {
-      return await this.handlePairingRegister(req);
+      return await handlePairingRegister(req, this.pairingContext);
     }
 
     // Serve shareable app pages
@@ -691,11 +345,9 @@ export class RuntimeHttpServer {
       }
     }
 
-    // ── Cloud sharing endpoints ───────────────────────────────────────
+    // Cloud sharing endpoints
     if (path === '/v1/apps/share' && req.method === 'POST') {
-      try {
-        return await handleShareApp(req);
-      } catch (err) {
+      try { return await handleShareApp(req); } catch (err) {
         log.error({ err }, 'Runtime HTTP handler error sharing app');
         return Response.json({ error: 'Internal server error' }, { status: 500 });
       }
@@ -705,17 +357,13 @@ export class RuntimeHttpServer {
     if (sharedTokenMatch) {
       const shareToken = sharedTokenMatch[1];
       if (req.method === 'GET') {
-        try {
-          return handleDownloadSharedApp(shareToken);
-        } catch (err) {
+        try { return handleDownloadSharedApp(shareToken); } catch (err) {
           log.error({ err, shareToken }, 'Runtime HTTP handler error downloading shared app');
           return Response.json({ error: 'Internal server error' }, { status: 500 });
         }
       }
       if (req.method === 'DELETE') {
-        try {
-          return handleDeleteSharedApp(shareToken);
-        } catch (err) {
+        try { return handleDeleteSharedApp(shareToken); } catch (err) {
           log.error({ err, shareToken }, 'Runtime HTTP handler error deleting shared app');
           return Response.json({ error: 'Internal server error' }, { status: 500 });
         }
@@ -724,27 +372,21 @@ export class RuntimeHttpServer {
 
     const sharedMetadataMatch = path.match(/^\/v1\/apps\/shared\/([^/]+)\/metadata$/);
     if (sharedMetadataMatch && req.method === 'GET') {
-      try {
-        return handleGetSharedAppMetadata(sharedMetadataMatch[1]);
-      } catch (err) {
+      try { return handleGetSharedAppMetadata(sharedMetadataMatch[1]); } catch (err) {
         log.error({ err, shareToken: sharedMetadataMatch[1] }, 'Runtime HTTP handler error getting shared app metadata');
         return Response.json({ error: 'Internal server error' }, { status: 500 });
       }
     }
 
-    // ── Secret management endpoint ─────────────────────────────────────
+    // Secret management endpoint
     if (path === '/v1/secrets' && req.method === 'POST') {
-      try {
-        return await handleAddSecret(req);
-      } catch (err) {
+      try { return await handleAddSecret(req); } catch (err) {
         log.error({ err }, 'Runtime HTTP handler error adding secret');
         return Response.json({ error: 'Internal server error' }, { status: 500 });
       }
     }
 
     // New assistant-less runtime routes: /v1/<endpoint>
-    // These supersede the legacy /v1/assistants/:assistantId/... shape.
-    // Paths already handled above (/v1/apps/..., /v1/secrets) will never reach here.
     const newRouteMatch = path.match(/^\/v1\/(?!assistants\/)(.+)$/);
     if (newRouteMatch) {
       return this.dispatchEndpoint(newRouteMatch[1], req, url);
@@ -762,10 +404,85 @@ export class RuntimeHttpServer {
     return this.dispatchEndpoint(endpoint, req, url, assistantId);
   }
 
+  private handleBrowserRelayUpgrade(req: Request, server: ReturnType<typeof Bun.serve>): Response {
+    if (!isLoopbackHost(new URL(req.url).hostname) && !isPrivateNetworkPeer(server, req)) {
+      return Response.json(
+        { error: 'Browser relay only accepts connections from localhost', code: 'LOCALHOST_ONLY' },
+        { status: 403 },
+      );
+    }
+
+    if ((process.env.DISABLE_HTTP_AUTH ?? '').toLowerCase() !== 'true' && this.bearerToken) {
+      const wsUrl = new URL(req.url);
+      const token = wsUrl.searchParams.get('token');
+      if (!token || !verifyBearerToken(token, this.bearerToken)) {
+        return Response.json({ error: 'Unauthorized' }, { status: 401 });
+      }
+    }
+
+    const connectionId = crypto.randomUUID();
+    const upgraded = server.upgrade(req, {
+      data: { wsType: 'browser-relay', connectionId } satisfies BrowserRelayWebSocketData,
+    });
+    if (!upgraded) {
+      return new Response('WebSocket upgrade failed', { status: 500 });
+    }
+    return undefined as unknown as Response; // Bun handles the response after a successful upgrade
+  }
+
+  private handleRelayUpgrade(req: Request, server: ReturnType<typeof Bun.serve>): Response {
+    if (!isPrivateNetworkPeer(server, req) || !isPrivateNetworkOrigin(req)) { // peer address cannot be spoofed; Origin check is defense in depth
+      return Response.json(
+        { error: 'Direct relay access disabled — only private network peers allowed', code: 'GATEWAY_ONLY' },
+        { status: 403 },
+      );
+    }
+
+    const wsUrl = new URL(req.url);
+    const callSessionId = wsUrl.searchParams.get('callSessionId');
+    if (!callSessionId) {
+      return new Response('Missing callSessionId', { status: 400 });
+    }
+    const upgraded = server.upgrade(req, { data: { callSessionId } });
+    if (!upgraded) {
+      return new Response('WebSocket upgrade failed', { status: 500 });
+    }
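+    return undefined as unknown as Response; // Bun handles the response after a successful upgrade; the RelayConnection is created in websocket.open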
+  }
+
+  private async handleTwilioWebhook(req: Request, path: string): Promise<Response | null> {
+    const twilioMatch = path.match(TWILIO_WEBHOOK_RE);
+    const gatewayTwilioMatch = !twilioMatch ? path.match(TWILIO_GATEWAY_WEBHOOK_RE) : null;
+    const resolvedTwilioSubpath = twilioMatch
+      ? twilioMatch[1]
+      : gatewayTwilioMatch
+        ? GATEWAY_SUBPATH_MAP[gatewayTwilioMatch[1]]
+        : null;
+    if (!resolvedTwilioSubpath || req.method !== 'POST') return null;
+
+    const twilioSubpath = resolvedTwilioSubpath;
+
+    if (GATEWAY_ONLY_BLOCKED_SUBPATHS.has(twilioSubpath)) {
+      return Response.json(
+        { error: 'Direct webhook access disabled. Use the gateway.', code: 'GATEWAY_ONLY' },
+        { status: 410 },
+      );
+    }
+
+    const validation = await validateTwilioWebhook(req);
+    if (validation instanceof Response) return validation;
+
+    const validatedReq = cloneRequestWithBody(req, validation.body);
+
+    if (twilioSubpath === 'voice-webhook') return await handleVoiceWebhook(validatedReq);
+    if (twilioSubpath === 'status') return await handleStatusCallback(validatedReq);
+    if (twilioSubpath === 'connect-action') return await handleConnectAction(validatedReq);
+
+    return null;
+  }
+
   /**
    * Dispatch a request to the appropriate endpoint handler.
-   * Used by both the new assistant-less routes (/v1/<endpoint>) and the
-   * legacy assistant-scoped routes (/v1/assistants/:assistantId/<endpoint>).
    */
   private async dispatchEndpoint(
     endpoint: string,
@@ -773,10 +490,8 @@ export class RuntimeHttpServer {
     url: URL,
     assistantId: string = 'self',
   ): Promise<Response> {
-    try {
-      if (endpoint === 'health' && req.method === 'GET') {
-        return this.handleHealth();
-      }
+    return withErrorHandling(endpoint, async () => {
+      if (endpoint === 'health' && req.method === 'GET') return handleHealth();
 
       if (endpoint === 'browser-relay/status' && req.method === 'GET') {
         return Response.json(extensionRelayServer.getStatus());
@@ -785,7 +500,7 @@ export class RuntimeHttpServer {
       if (endpoint === 'browser-relay/command' && req.method === 'POST') {
         try {
           const body = await req.json() as Record<string, unknown>;
-          const resp = await extensionRelayServer.sendCommand(body as any);
+          const resp = await extensionRelayServer.sendCommand(body as Omit<import('../browser-extension-relay/protocol.js').ExtensionCommand, 'id'>);
           return Response.json(resp);
         } catch (err) {
           return Response.json({ success: false, error: err instanceof Error ? err.message : String(err) }, { status: 500 });
@@ -825,13 +540,8 @@ export class RuntimeHttpServer {
         });
       }
 
-      if (endpoint === 'messages' && req.method === 'GET') {
-        return handleListMessages(url, this.interfacesDir);
-      }
-
-      if (endpoint === 'search' && req.method === 'GET') {
-        return handleSearchConversations(url);
-      }
+      if (endpoint === 'messages' && req.method === 'GET') return handleListMessages(url, this.interfacesDir);
+      if (endpoint === 'search' && req.method === 'GET') return handleSearchConversations(url);
 
       if (endpoint === 'messages' && req.method === 'POST') {
         return await handleSendMessage(req, {
@@ -840,19 +550,11 @@ export class RuntimeHttpServer {
         });
       }
 
-      if (endpoint === 'attachments' && req.method === 'POST') {
-        return await handleUploadAttachment(req);
-      }
-
-      if (endpoint === 'attachments' && req.method === 'DELETE') {
-        return await handleDeleteAttachment(req);
-      }
+      if (endpoint === 'attachments' && req.method === 'POST') return await handleUploadAttachment(req);
+      if (endpoint === 'attachments' && req.method === 'DELETE') return await handleDeleteAttachment(req);
 
-      // Match attachments/:attachmentId
       const attachmentMatch = endpoint.match(/^attachments\/([^/]+)$/);
-      if (attachmentMatch && req.method === 'GET') {
-        return handleGetAttachment(attachmentMatch[1]);
-      }
+      if (attachmentMatch && req.method === 'GET') return handleGetAttachment(attachmentMatch[1]);
 
       if (endpoint === 'suggestion' && req.method === 'GET') {
         return await handleGetSuggestion(url, {
@@ -862,530 +564,101 @@ export class RuntimeHttpServer {
       }
 
       if (endpoint === 'runs' && req.method === 'POST') {
-        if (!this.runOrchestrator) {
-          return Response.json({ error: 'Run orchestration not configured' }, { status: 503 });
-        }
+        if (!this.runOrchestrator) return Response.json({ error: 'Run orchestration not configured' }, { status: 503 });
         return await handleCreateRun(req, this.runOrchestrator);
       }
 
-      // Match runs/:runId, runs/:runId/decision, runs/:runId/trust-rule, runs/:runId/secret
       const runsMatch = endpoint.match(/^runs\/([^/]+)(\/decision|\/trust-rule|\/secret)?$/);
       if (runsMatch) {
-        if (!this.runOrchestrator) {
-          return Response.json({ error: 'Run orchestration not configured' }, { status: 503 });
-        }
+        if (!this.runOrchestrator) return Response.json({ error: 'Run orchestration not configured' }, { status: 503 });
         const runId = runsMatch[1];
-        if (runsMatch[2] === '/decision' && req.method === 'POST') {
-          return await handleRunDecision(runId, req, this.runOrchestrator);
-        }
-        if (runsMatch[2] === '/secret' && req.method === 'POST') {
-          return await handleRunSecret(runId, req, this.runOrchestrator);
-        }
+        if (runsMatch[2] === '/decision' && req.method === 'POST') return await handleRunDecision(runId, req, this.runOrchestrator);
+        if (runsMatch[2] === '/secret' && req.method === 'POST') return await handleRunSecret(runId, req, this.runOrchestrator);
         if (runsMatch[2] === '/trust-rule' && req.method === 'POST') {
           const run = this.runOrchestrator.getRun(runId);
-          if (!run) {
-            return Response.json({ error: 'Run not found' }, { status: 404 });
-          }
+          if (!run) return Response.json({ error: 'Run not found' }, { status: 404 });
           return await handleAddTrustRule(runId, req);
         }
-        if (req.method === 'GET') {
-          return handleGetRun(runId, this.runOrchestrator);
-        }
+        if (req.method === 'GET') return handleGetRun(runId, this.runOrchestrator);
       }
 
       const interfacesMatch = endpoint.match(/^interfaces\/(.+)$/);
-      if (interfacesMatch && req.method === 'GET') {
-        return this.handleGetInterface(interfacesMatch[1]);
-      }
+      if (interfacesMatch && req.method === 'GET') return this.handleGetInterface(interfacesMatch[1]);
 
-      if (endpoint === 'channels/conversation' && req.method === 'DELETE') {
-        return await handleDeleteConversation(req, assistantId);
-      }
+      if (endpoint === 'channels/conversation' && req.method === 'DELETE') return await handleDeleteConversation(req, assistantId);
 
       if (endpoint === 'channels/inbound' && req.method === 'POST') {
         const gatewayOriginSecret = getRuntimeGatewayOriginSecret();
-        return await handleChannelInbound(req, this.processMessage, this.bearerToken, this.runOrchestrator, assistantId, gatewayOriginSecret, this.approvalCopyGenerator);
-      }
-
-      if (endpoint === 'channels/delivery-ack' && req.method === 'POST') {
-        return await handleChannelDeliveryAck(req);
+        return await handleChannelInbound(req, this.processMessage, this.bearerToken, this.runOrchestrator, assistantId, gatewayOriginSecret, this.approvalCopyGenerator, this.approvalConversationGenerator);
       }
 
-      if (endpoint === 'channels/dead-letters' && req.method === 'GET') {
-        return handleListDeadLetters();
-      }
+      if (endpoint === 'channels/delivery-ack' && req.method === 'POST') return await handleChannelDeliveryAck(req);
+      if (endpoint === 'channels/dead-letters' && req.method === 'GET') return handleListDeadLetters();
+      if (endpoint === 'channels/replay' && req.method === 'POST') return await handleReplayDeadLetters(req);
 
-      if (endpoint === 'channels/replay' && req.method === 'POST') {
-        return await handleReplayDeadLetters(req);
-      }
+      if (endpoint === 'calls/start' && req.method === 'POST') return await handleStartCall(req, assistantId);
 
-      // ── Call API routes ───────────────────────────────────────────
-      if (endpoint === 'calls/start' && req.method === 'POST') {
-        return await handleStartCall(req, assistantId);
-      }
-
-      // Match calls/:callSessionId and calls/:callSessionId/cancel, calls/:callSessionId/answer, calls/:callSessionId/instruction
       const callsMatch = endpoint.match(/^calls\/([^/]+?)(\/cancel|\/answer|\/instruction)?$/);
       if (callsMatch) {
         const callSessionId = callsMatch[1];
-        // Skip known sub-paths that are handled elsewhere (twilio, relay)
         if (callSessionId !== 'twilio' && callSessionId !== 'relay' && callSessionId !== 'start') {
-          if (callsMatch[2] === '/cancel' && req.method === 'POST') {
-            return await handleCancelCall(req, callSessionId);
-          }
-          if (callsMatch[2] === '/answer' && req.method === 'POST') {
-            return await handleAnswerCall(req, callSessionId);
-          }
-          if (callsMatch[2] === '/instruction' && req.method === 'POST') {
-            return await handleInstructionCall(req, callSessionId);
-          }
-          if (!callsMatch[2] && req.method === 'GET') {
-            return handleGetCallStatus(callSessionId);
-          }
+          if (callsMatch[2] === '/cancel' && req.method === 'POST') return await handleCancelCall(req, callSessionId);
+          if (callsMatch[2] === '/answer' && req.method === 'POST') return await handleAnswerCall(req, callSessionId);
+          if (callsMatch[2] === '/instruction' && req.method === 'POST') return await handleInstructionCall(req, callSessionId);
+          if (!callsMatch[2] && req.method === 'GET') return handleGetCallStatus(callSessionId);
         }
       }
 
-      // ── Internal Twilio forwarding endpoints (gateway → runtime) ──
-      // These accept JSON payloads from the gateway (which already validated
-      // the Twilio signature) and reconstruct requests for the existing
-      // Twilio route handlers.
+      // Internal Twilio forwarding endpoints (gateway -> runtime)
       if (endpoint === 'internal/twilio/voice-webhook' && req.method === 'POST') {
         const json = await req.json() as { params: Record<string, string>; originalUrl?: string; assistantId?: string };
         const formBody = new URLSearchParams(json.params).toString();
-        // Reconstruct request URL: keep the original URL query string (callSessionId)
         const reconstructedUrl = json.originalUrl ?? req.url;
-        const fakeReq = new Request(reconstructedUrl, {
-          method: 'POST',
-          headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
-          body: formBody,
-        });
+        const fakeReq = new Request(reconstructedUrl, { method: 'POST', headers: { 'Content-Type': 'application/x-www-form-urlencoded' }, body: formBody });
         return await handleVoiceWebhook(fakeReq, json.assistantId);
       }
 
       if (endpoint === 'internal/twilio/status' && req.method === 'POST') {
         const json = await req.json() as { params: Record<string, string> };
         const formBody = new URLSearchParams(json.params).toString();
-        const fakeReq = new Request(req.url, {
-          method: 'POST',
-          headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
-          body: formBody,
-        });
+        const fakeReq = new Request(req.url, { method: 'POST', headers: { 'Content-Type': 'application/x-www-form-urlencoded' }, body: formBody });
         return await handleStatusCallback(fakeReq);
       }
 
       if (endpoint === 'internal/twilio/connect-action' && req.method === 'POST') {
         const json = await req.json() as { params: Record<string, string> };
         const formBody = new URLSearchParams(json.params).toString();
-        const fakeReq = new Request(req.url, {
-          method: 'POST',
-          headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
-          body: formBody,
-        });
+        const fakeReq = new Request(req.url, { method: 'POST', headers: { 'Content-Type': 'application/x-www-form-urlencoded' }, body: formBody });
         return await handleConnectAction(fakeReq);
       }
 
-      if (endpoint === 'identity' && req.method === 'GET') {
-        return this.handleGetIdentity();
-      }
+      if (endpoint === 'identity' && req.method === 'GET') return handleGetIdentity();
+      if (endpoint === 'events' && req.method === 'GET') return handleSubscribeAssistantEvents(req, url);
 
-      if (endpoint === 'events' && req.method === 'GET') {
-        return handleSubscribeAssistantEvents(req, url);
-      }
-
-      // ── Internal OAuth callback endpoint (gateway → runtime) ──
+      // Internal OAuth callback endpoint (gateway -> runtime)
       if (endpoint === 'internal/oauth/callback' && req.method === 'POST') {
         const json = await req.json() as { state: string; code?: string; error?: string };
-        if (!json.state) {
-          return Response.json({ error: 'Missing state parameter' }, { status: 400 });
-        }
+        if (!json.state) return Response.json({ error: 'Missing state parameter' }, { status: 400 });
         if (json.error) {
           const consumed = consumeCallbackError(json.state, json.error);
-          return consumed
-            ? Response.json({ ok: true })
-            : Response.json({ error: 'Unknown state' }, { status: 404 });
+          return consumed ? Response.json({ ok: true }) : Response.json({ error: 'Unknown state' }, { status: 404 });
         }
         if (json.code) {
           const consumed = consumeCallback(json.state, json.code);
-          return consumed
-            ? Response.json({ ok: true })
-            : Response.json({ error: 'Unknown state' }, { status: 404 });
+          return consumed ? Response.json({ ok: true }) : Response.json({ error: 'Unknown state' }, { status: 404 });
         }
         return Response.json({ error: 'Missing code or error parameter' }, { status: 400 });
       }
 
       return Response.json({ error: 'Not found', source: 'runtime' }, { status: 404 });
-    } catch (err) {
-      if (err instanceof IngressBlockedError) {
-        log.warn({ endpoint, detectedTypes: err.detectedTypes }, 'Blocked HTTP request containing secrets');
-        return Response.json({ error: err.message, code: err.code }, { status: 422 });
-      }
-      if (err instanceof ConfigError) {
-        log.warn({ err, endpoint }, 'Runtime HTTP config error');
-        return Response.json({ error: err.message, code: err.code }, { status: 422 });
-      }
-      log.error({ err, endpoint }, 'Runtime HTTP handler error');
-      const message = err instanceof Error ? err.message : 'Internal server error';
-      return Response.json({ error: message }, { status: 500 });
-    }
-  }
-
-  /**
-   * Periodically retry failed channel inbound events that have passed
-   * their exponential backoff delay.
-   */
-  private async sweepFailedEvents(): Promise<void> {
-    if (!this.processMessage) return;
-
-    const events = channelDeliveryStore.getRetryableEvents();
-    if (events.length === 0) return;
-
-    log.info({ count: events.length }, 'Retrying failed channel inbound events');
-
-    for (const event of events) {
-      if (!event.rawPayload) {
-        // No payload stored — can't replay, move to dead letter
-        channelDeliveryStore.recordProcessingFailure(
-          event.id,
-          new Error('No raw payload stored for replay'),
-        );
-        continue;
-      }
-
-      let payload: Record<string, unknown>;
-      try {
-        payload = JSON.parse(event.rawPayload) as Record<string, unknown>;
-      } catch {
-        channelDeliveryStore.recordProcessingFailure(
-          event.id,
-          new Error('Failed to parse stored raw payload'),
-        );
-        continue;
-      }
-
-      const content = typeof payload.content === 'string' ? payload.content.trim() : '';
-      const attachmentIds = Array.isArray(payload.attachmentIds) ? payload.attachmentIds as string[] : undefined;
-      const sourceChannel = parseChannelId(payload.sourceChannel);
-      if (!sourceChannel) {
-        channelDeliveryStore.recordProcessingFailure(
-          event.id,
-          new Error(`Invalid sourceChannel: ${String(payload.sourceChannel)}`),
-        );
-        continue;
-      }
-      const sourceMetadata = payload.sourceMetadata as Record<string, unknown> | undefined;
-      const assistantId = typeof payload.assistantId === 'string'
-        ? payload.assistantId
-        : undefined;
-      const guardianContext = parseGuardianRuntimeContext(payload.guardianCtx);
-
-      const metadataHintsRaw = sourceMetadata?.hints;
-      const metadataHints = Array.isArray(metadataHintsRaw)
-        ? metadataHintsRaw.filter((h): h is string => typeof h === 'string' && h.trim().length > 0)
-        : [];
-      const metadataUxBrief = typeof sourceMetadata?.uxBrief === 'string' && sourceMetadata.uxBrief.trim().length > 0
-        ? sourceMetadata.uxBrief.trim()
-        : undefined;
-
-      try {
-        const { messageId: userMessageId } = await this.processMessage(
-          event.conversationId,
-          content,
-          attachmentIds,
-          {
-            transport: {
-              channelId: sourceChannel,
-              hints: metadataHints.length > 0 ? metadataHints : undefined,
-              uxBrief: metadataUxBrief,
-            },
-            assistantId,
-            guardianContext,
-          },
-        );
-        channelDeliveryStore.linkMessage(event.id, userMessageId);
-        channelDeliveryStore.markProcessed(event.id);
-        log.info({ eventId: event.id }, 'Successfully replayed failed channel event');
-
-        const replyCallbackUrl = typeof payload.replyCallbackUrl === 'string'
-          ? payload.replyCallbackUrl
-          : undefined;
-        if (replyCallbackUrl) {
-          const externalChatId = typeof payload.externalChatId === 'string'
-            ? payload.externalChatId
-            : undefined;
-          if (externalChatId) {
-            await this.deliverReplyViaCallback(
-              event.conversationId,
-              externalChatId,
-              replyCallbackUrl,
-              assistantId,
-            );
-          }
-        }
-      } catch (err) {
-        log.error({ err, eventId: event.id }, 'Retry failed for channel event');
-        channelDeliveryStore.recordProcessingFailure(event.id, err);
-      }
-    }
-  }
-
-  private async deliverReplyViaCallback(
-    conversationId: string,
-    externalChatId: string,
-    callbackUrl: string,
-    assistantId?: string,
-  ): Promise<void> {
-    const msgs = conversationStore.getMessages(conversationId);
-    for (let i = msgs.length - 1; i >= 0; i--) {
-      if (msgs[i].role === 'assistant') {
-        let parsed: unknown;
-        try { parsed = JSON.parse(msgs[i].content); } catch { parsed = msgs[i].content; }
-        const rendered = renderHistoryContent(parsed);
-
-        const linked = attachmentsStore.getAttachmentMetadataForMessage(msgs[i].id);
-        const replyAttachments = linked.map((a) => ({
-          id: a.id,
-          filename: a.originalFilename,
-          mimeType: a.mimeType,
-          sizeBytes: a.sizeBytes,
-          kind: a.kind,
-        }));
-
-        if (rendered.text || replyAttachments.length > 0) {
-          await deliverChannelReply(callbackUrl, {
-            chatId: externalChatId,
-            text: rendered.text || undefined,
-            attachments: replyAttachments.length > 0 ? replyAttachments : undefined,
-            assistantId,
-          }, this.bearerToken);
-        }
-        break;
-      }
-    }
-  }
-
-  private handleGetIdentity(): Response {
-    const identityPath = getWorkspacePromptPath('IDENTITY.md');
-    if (!existsSync(identityPath)) {
-      return Response.json({ error: 'IDENTITY.md not found' }, { status: 404 });
-    }
-
-    const content = readFileSync(identityPath, 'utf-8');
-    const fields: Record<string, string> = {};
-    for (const line of content.split('\n')) {
-      const trimmed = line.trim();
-      const lower = trimmed.toLowerCase();
-      const extract = (prefix: string): string | null => {
-        if (!lower.startsWith(prefix)) return null;
-        return trimmed.split(':**').pop()?.trim() ?? null;
-      };
-
-      const name = extract('- **name:**');
-      if (name) { fields.name = name; continue; }
-      const role = extract('- **role:**');
-      if (role) { fields.role = role; continue; }
-      const personality = extract('- **personality:**') ?? extract('- **vibe:**');
-      if (personality) { fields.personality = personality; continue; }
-      const emoji = extract('- **emoji:**');
-      if (emoji) { fields.emoji = emoji; continue; }
-      const home = extract('- **home:**');
-      if (home) { fields.home = home; continue; }
-    }
-
-    // Read version from package.json
-    let version: string | undefined;
-    try {
-      const pkgPath = join(dirname(fileURLToPath(import.meta.url)), '../../package.json');
-      const pkg = JSON.parse(readFileSync(pkgPath, 'utf-8'));
-      version = pkg.version;
-    } catch {
-      // ignore
-    }
-
-    // Read createdAt from IDENTITY.md file birthtime
-    let createdAt: string | undefined;
-    try {
-      const stats = statSync(identityPath);
-      createdAt = stats.birthtime.toISOString();
-    } catch {
-      // ignore
-    }
-
-    // Read lockfile for assistantId, cloud, and originSystem
-    let assistantId: string | undefined;
-    let cloud: string | undefined;
-    let originSystem: string | undefined;
-    try {
-      const lockData = readLockfile();
-      const assistants = lockData?.assistants as Array<Record<string, unknown>> | undefined;
-      if (assistants && assistants.length > 0) {
-        // Use the most recently hatched assistant
-        const sorted = [...assistants].sort((a, b) => {
-          const dateA = new Date(a.hatchedAt as string || 0).getTime();
-          const dateB = new Date(b.hatchedAt as string || 0).getTime();
-          return dateB - dateA;
-        });
-        const latest = sorted[0];
-        assistantId = latest.assistantId as string | undefined;
-        cloud = latest.cloud as string | undefined;
-        originSystem = cloud === 'local' ? 'local' : cloud;
-      }
-    } catch {
-      // ignore — lockfile may not exist
-    }
-
-    return Response.json({
-      name: fields.name ?? '',
-      role: fields.role ?? '',
-      personality: fields.personality ?? '',
-      emoji: fields.emoji ?? '',
-      home: fields.home ?? '',
-      version,
-      assistantId,
-      createdAt,
-      originSystem,
-    });
-  }
-
-  private handleHealth(): Response {
-    return Response.json({
-      status: 'healthy',
-      timestamp: new Date().toISOString(),
-      disk: getDiskSpaceInfo(),
     });
   }
 
-  // ── Pairing HTTP handlers ─────────────────────────────────────────
-
-  /**
-   * POST /v1/pairing/register — Bearer-authenticated.
-   * macOS pre-registers a pairing request when the QR is displayed.
-   */
-  private async handlePairingRegister(req: Request): Promise<Response> {
-    try {
-      const body = await req.json() as Record<string, unknown>;
-      const pairingRequestId = typeof body.pairingRequestId === 'string' ? body.pairingRequestId : '';
-      const pairingSecret = typeof body.pairingSecret === 'string' ? body.pairingSecret : '';
-      const gatewayUrl = typeof body.gatewayUrl === 'string' ? body.gatewayUrl : '';
-      const localLanUrl = typeof body.localLanUrl === 'string' ? body.localLanUrl : null;
-
-      if (!pairingRequestId || !pairingSecret || !gatewayUrl) {
-        return Response.json({ error: 'Missing required fields: pairingRequestId, pairingSecret, gatewayUrl' }, { status: 400 });
-      }
-
-      const result = this.pairingStore.register({ pairingRequestId, pairingSecret, gatewayUrl, localLanUrl });
-      if (!result.ok) {
-        return Response.json({ error: 'Conflict: pairingRequestId exists with different secret' }, { status: 409 });
-      }
-
-      return Response.json({ ok: true });
-    } catch (err) {
-      log.error({ err }, 'Failed to register pairing request');
-      return Response.json({ error: 'Internal server error' }, { status: 500 });
-    }
-  }
-
-  /**
-   * POST /v1/pairing/request — Unauthenticated (secret-gated).
-   * iOS initiates a pairing handshake.
-   */
-  private async handlePairingRequest(req: Request): Promise<Response> {
-    try {
-      const body = await req.json() as Record<string, unknown>;
-      const pairingRequestId = typeof body.pairingRequestId === 'string' ? body.pairingRequestId : '';
-      const pairingSecret = typeof body.pairingSecret === 'string' ? body.pairingSecret : '';
-      const deviceId = typeof body.deviceId === 'string' ? body.deviceId.trim() : '';
-      const deviceName = typeof body.deviceName === 'string' ? body.deviceName.trim() : '';
-
-      // Redact secret from any potential logging of body
-      log.info({ pairingRequestId, deviceName, hasDeviceId: !!deviceId }, 'Pairing request received');
-
-      if (!deviceId || !deviceName) {
-        return Response.json({ error: 'Missing required fields: deviceId, deviceName' }, { status: 400 });
-      }
-
-      if (!pairingRequestId || !pairingSecret) {
-        return Response.json({ error: 'Missing required fields: pairingRequestId, pairingSecret' }, { status: 400 });
-      }
-
-      const result = this.pairingStore.beginRequest({ pairingRequestId, pairingSecret, deviceId, deviceName });
-      if (!result.ok) {
-        const statusCode = result.reason === 'invalid_secret' ? 403 : result.reason === 'not_found' ? 403 : 410;
-        return Response.json({ error: 'Forbidden' }, { status: statusCode });
-      }
-
-      const entry = result.entry;
-      const hashedDeviceId = hashDeviceId(deviceId);
-
-      // Auto-approve if device is in the allowlist
-      if (isDeviceApproved(hashedDeviceId) && this.bearerToken) {
-        refreshDevice(hashedDeviceId, deviceName);
-        this.pairingStore.approve(pairingRequestId, this.bearerToken);
-        log.info({ pairingRequestId, hashedDeviceId }, 'Auto-approved allowlisted device');
-        return Response.json({
-          status: 'approved',
-          bearerToken: this.bearerToken,
-          gatewayUrl: entry.gatewayUrl,
-          localLanUrl: entry.localLanUrl,
-        });
-      }
-
-      // Send IPC to macOS to show approval prompt
-      if (this.pairingBroadcast) {
-        this.pairingBroadcast({
-          type: 'pairing_approval_request',
-          pairingRequestId,
-          deviceId: hashedDeviceId,
-          deviceName,
-        });
-      }
-
-      return Response.json({ status: 'pending' });
-    } catch (err) {
-      log.error({ err }, 'Failed to process pairing request');
-      return Response.json({ error: 'Internal server error' }, { status: 500 });
-    }
-  }
-
-  /**
-   * GET /v1/pairing/status?id=<id>&secret=<secret> — Unauthenticated (secret-gated).
-   * iOS polls for approval status.
-   */
-  private handlePairingStatus(url: URL): Response {
-    const id = url.searchParams.get('id') ?? '';
-    // Note: secret is redacted from logs
-    const secret = url.searchParams.get('secret') ?? '';
-
-    if (!id || !secret) {
-      return Response.json({ error: 'Missing required params: id, secret' }, { status: 400 });
-    }
-
-    if (!this.pairingStore.validateSecret(id, secret)) {
-      return Response.json({ error: 'Forbidden' }, { status: 403 });
-    }
-
-    const entry = this.pairingStore.get(id);
-    if (!entry) {
-      return Response.json({ error: 'Not found' }, { status: 404 });
-    }
-
-    if (entry.status === 'approved') {
-      return Response.json({
-        status: 'approved',
-        bearerToken: entry.bearerToken,
-        gatewayUrl: entry.gatewayUrl,
-        localLanUrl: entry.localLanUrl,
-      });
-    }
-
-    return Response.json({ status: entry.status });
-  }
-
   private handleGetInterface(interfacePath: string): Response {
     if (!this.interfacesDir) {
       return Response.json({ error: 'Interface not found' }, { status: 404 });
     }
     const fullPath = resolve(this.interfacesDir, interfacePath);
-    // Enforce directory boundary so prefix-sibling paths (e.g. "interfaces-other/") are rejected
     if (
       (fullPath !== this.interfacesDir && !fullPath.startsWith(this.interfacesDir + '/')) ||
       !existsSync(fullPath)
diff --git a/assistant/src/runtime/middleware/auth.ts b/assistant/src/runtime/middleware/auth.ts
new file mode 100644
index 00000000000..9ee8baaa2b5
--- /dev/null
+++ b/assistant/src/runtime/middleware/auth.ts
@@ -0,0 +1,116 @@
+/**
+ * Auth middleware: bearer token validation, private network checks,
+ * and gateway-origin verification.
+ */
+
+import { timingSafeEqual } from 'node:crypto';
+
+/**
+ * Constant-time comparison of two bearer tokens to prevent timing attacks.
+ */
+export function verifyBearerToken(provided: string, expected: string): boolean {
+  const a = Buffer.from(provided);
+  const b = Buffer.from(expected);
+  if (a.length !== b.length) return false;
+  return timingSafeEqual(a, b);
+}
+
+/**
+ * Check if a hostname is a loopback address.
+ */
+export function isLoopbackHost(hostname: string): boolean {
+  return hostname === '127.0.0.1' || hostname === '::1' || hostname === 'localhost';
+}
+
+/**
+ * @internal Exported for testing.
+ *
+ * Determine whether an IP address string belongs to a private/internal
+ * network range:
+ *   - Loopback: 127.0.0.0/8, ::1
+ *   - RFC 1918: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
+ *   - Link-local: 169.254.0.0/16, fe80::/10
+ *   - IPv6 unique local: fc00::/7 (fc00:: through fdff::)
+ *   - IPv4-mapped IPv6 variants of all of the above (::ffff:x.x.x.x)
+ */
+export function isPrivateAddress(addr: string): boolean {
+  // Handle IPv4-mapped IPv6 (e.g. ::ffff:10.0.0.1) -- extract the IPv4 part
+  const v4Mapped = addr.match(/^::ffff:(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})$/i);
+  const normalized = v4Mapped ? v4Mapped[1] : addr;
+
+  // IPv4 checks
+  if (normalized.includes('.')) {
+    const parts = normalized.split('.').map(Number);
+    if (parts.length !== 4 || parts.some(p => isNaN(p) || p < 0 || p > 255)) return false;
+
+    // Loopback: 127.0.0.0/8
+    if (parts[0] === 127) return true;
+    // 10.0.0.0/8
+    if (parts[0] === 10) return true;
+    // 172.16.0.0/12 (172.16.x.x -- 172.31.x.x)
+    if (parts[0] === 172 && parts[1] >= 16 && parts[1] <= 31) return true;
+    // 192.168.0.0/16
+    if (parts[0] === 192 && parts[1] === 168) return true;
+    // Link-local: 169.254.0.0/16
+    if (parts[0] === 169 && parts[1] === 254) return true;
+
+    return false;
+  }
+
+  // IPv6 checks
+  const lower = normalized.toLowerCase();
+  // Loopback
+  if (lower === '::1') return true;
+  // Unique local: fc00::/7 (fc00:: through fdff::)
+  if (lower.startsWith('fc') || lower.startsWith('fd')) return true;
+  // Link-local: fe80::/10
+  if (lower.startsWith('fe80')) return true;
+
+  return false;
+}
+
+/**
+ * Check if the actual peer/remote address of a connection is from a
+ * private/internal network. Uses Bun's server.requestIP() to get the
+ * real peer address, which, unlike the Origin header, cannot be spoofed.
+ *
+ * Accepts loopback, RFC 1918 private IPv4, link-local, and RFC 4193
+ * unique-local IPv6 -- including their IPv4-mapped IPv6 forms. This
+ * supports container/pod deployments (e.g. Kubernetes sidecars) where
+ * gateway and runtime communicate over pod-internal private IPs.
+ */
+export function isPrivateNetworkPeer(server: { requestIP(req: Request): { address: string; family: string; port: number } | null }, req: Request): boolean {
+  const ip = server.requestIP(req);
+  if (!ip) return false;
+  return isPrivateAddress(ip.address);
+}
+
+/**
+ * Check if a request origin is from a private/internal network address.
+ * Extracts the hostname from the Origin header and validates it against
+ * isPrivateAddress(), consistent with the isPrivateNetworkPeer check.
+ */
+export function isPrivateNetworkOrigin(req: Request): boolean {
+  const origin = req.headers.get('origin');
+  // No origin header (e.g., server-initiated or same-origin) -- allow
+  if (!origin) return true;
+  try {
+    const url = new URL(origin);
+    const host = url.hostname;
+    if (host === 'localhost') return true;
+    // URL.hostname wraps IPv6 addresses in brackets (e.g. "[::1]") -- strip them
+    const rawHost = host.startsWith('[') && host.endsWith(']') ? host.slice(1, -1) : host;
+    return isPrivateAddress(rawHost);
+  } catch {
+    return false;
+  }
+}
+
+/**
+ * Extract the bearer token from the Authorization header without validating it.
+ * Returns the token string if the header uses the Bearer scheme, or null.
+ */
+export function extractBearerToken(req: Request): string | null {
+  const authHeader = req.headers.get('authorization');
+  return authHeader?.startsWith('Bearer ') ? authHeader.slice(7) : null;
+}
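
Review note on the IPv6 branch of `isPrivateAddress`: it matches prefixes as strings, so `startsWith('fe80')` covers only part of fe80::/10 (the range runs through febf::), and a colon-free string that begins with `fc` or `fd` also passes. The following is a minimal sketch, assuming one wanted to classify by the numeric value of the first hextet instead. `firstHextet` and `isPrivateIPv6` are hypothetical names that do not exist in this PR, and the sketch does not handle fully expanded loopback forms such as `0:0:0:0:0:0:0:1`.

```typescript
// Hypothetical helpers, not part of this PR: classify an IPv6 address by the
// numeric value of its first hextet rather than by string prefix.
function firstHextet(addr: string): number | null {
  // Only strings containing a colon are treated as IPv6 candidates.
  if (!addr.includes(':')) return null;
  const head = addr.split(':')[0];
  // A leading "::" means the first hextet is zero.
  if (head === '') return 0;
  // A hextet is one to four hex digits; anything else is rejected.
  if (!/^[0-9a-f]{1,4}$/i.test(head)) return null;
  return parseInt(head, 16);
}

function isPrivateIPv6(addr: string): boolean {
  // Loopback (compressed form only in this sketch).
  if (addr === '::1') return true;
  const h = firstHextet(addr);
  if (h === null) return false;
  // Unique local fc00::/7: the top 7 bits are 1111 110.
  if ((h & 0xfe00) === 0xfc00) return true;
  // Link-local fe80::/10: the top 10 bits are 1111 1110 10.
  if ((h & 0xffc0) === 0xfe80) return true;
  return false;
}
```

Masking the first hextet keeps the check tied to the CIDR prefix length, so `fe90::1` is accepted as link-local while `fec0::1` and a bare hostname such as `fdserver` are rejected.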
diff --git a/assistant/src/runtime/middleware/error-handler.ts b/assistant/src/runtime/middleware/error-handler.ts
new file mode 100644
index 00000000000..61ba2e6b50d
--- /dev/null
+++ b/assistant/src/runtime/middleware/error-handler.ts
@@ -0,0 +1,33 @@
+/**
+ * Centralized error handling for runtime HTTP request dispatch.
+ */
+
+import { ConfigError, IngressBlockedError } from '../../util/errors.js';
+import { getLogger } from '../../util/logger.js';
+
+const log = getLogger('runtime-http');
+
+/**
+ * Wrap an async endpoint handler with standard error handling.
+ * Catches IngressBlockedError (422), ConfigError (422), and generic errors (500).
+ */
+export async function withErrorHandling(
+  endpoint: string,
+  handler: () => Promise<Response>,
+): Promise<Response> {
+  try {
+    return await handler();
+  } catch (err) {
+    if (err instanceof IngressBlockedError) {
+      log.warn({ endpoint, detectedTypes: err.detectedTypes }, 'Blocked HTTP request containing secrets');
+      return Response.json({ error: err.message, code: err.code }, { status: 422 });
+    }
+    if (err instanceof ConfigError) {
+      log.warn({ err, endpoint }, 'Runtime HTTP config error');
+      return Response.json({ error: err.message, code: err.code }, { status: 422 });
+    }
+    log.error({ err, endpoint }, 'Runtime HTTP handler error');
+    const message = err instanceof Error ? err.message : 'Internal server error';
+    return Response.json({ error: message }, { status: 500 });
+  }
+}
diff --git a/assistant/src/runtime/middleware/twilio-validation.ts b/assistant/src/runtime/middleware/twilio-validation.ts
new file mode 100644
index 00000000000..c8b6553048a
--- /dev/null
+++ b/assistant/src/runtime/middleware/twilio-validation.ts
@@ -0,0 +1,127 @@
+/**
+ * Twilio webhook signature validation and related constants.
+ */
+
+import { getLogger } from '../../util/logger.js';
+import { isTwilioWebhookValidationDisabled } from '../../config/env.js';
+import { TwilioConversationRelayProvider } from '../../calls/twilio-provider.js';
+import { loadConfig } from '../../config/loader.js';
+import { getPublicBaseUrl } from '../../inbound/public-ingress-urls.js';
+
+const log = getLogger('runtime-http');
+
+/**
+ * Regex to extract the Twilio webhook subpath from both top-level and
+ * assistant-scoped route shapes:
+ *   /v1/calls/twilio/<subpath>
+ *   /v1/assistants/<id>/calls/twilio/<subpath>
+ */
+export const TWILIO_WEBHOOK_RE = /^\/v1\/(?:assistants\/[^/]+\/)?calls\/twilio\/(.+)$/;
+
+/**
+ * Gateway-compatible Twilio webhook paths:
+ *   /webhooks/twilio/<subpath>
+ *
+ * Maps gateway path segments to the internal subpath names used by the
+ * dispatcher below (e.g. "voice" -> "voice-webhook").
+ */
+export const TWILIO_GATEWAY_WEBHOOK_RE = /^\/webhooks\/twilio\/(.+)$/;
+export const GATEWAY_SUBPATH_MAP: Record<string, string> = {
+  voice: 'voice-webhook',
+  status: 'status',
+  'connect-action': 'connect-action',
+  sms: 'sms',
+};
+
+/**
+ * Direct Twilio webhook subpaths that are blocked in gateway_only mode.
+ * Includes all public-facing webhook paths (voice, status, connect-action, SMS)
+ * because the runtime must never serve as a direct ingress for external webhooks.
+ * Internal forwarding endpoints (gateway->runtime) are unaffected.
+ */
+export const GATEWAY_ONLY_BLOCKED_SUBPATHS = new Set(['voice-webhook', 'status', 'connect-action', 'sms']);
+
+/**
+ * Validate a Twilio webhook request's X-Twilio-Signature header.
+ *
+ * Returns the raw body text on success so callers can reconstruct the Request
+ * for downstream handlers (which also need to read the body).
+ * Returns a 403 Response if signature validation fails.
+ *
+ * Fail-closed: if the auth token is not configured, the request is rejected
+ * with 403 rather than silently skipping validation. An explicit local-dev
+ * bypass is available via TWILIO_WEBHOOK_VALIDATION_DISABLED=true.
+ */
+export async function validateTwilioWebhook(
+  req: Request,
+): Promise<{ body: string } | Response> {
+  const rawBody = await req.text();
+
+  // Allow explicit local-dev bypass -- must be exactly "true"
+  if (isTwilioWebhookValidationDisabled()) {
+    log.warn('Twilio webhook signature validation explicitly disabled via TWILIO_WEBHOOK_VALIDATION_DISABLED');
+    return { body: rawBody };
+  }
+
+  const authToken = TwilioConversationRelayProvider.getAuthToken();
+
+  // Fail-closed: reject if no auth token is configured
+  if (!authToken) {
+    log.error('Twilio auth token not configured — rejecting webhook request (fail-closed)');
+    return Response.json({ error: 'Forbidden' }, { status: 403 });
+  }
+
+  const signature = req.headers.get('x-twilio-signature');
+  if (!signature) {
+    log.warn('Twilio webhook request missing X-Twilio-Signature header');
+    return Response.json({ error: 'Forbidden' }, { status: 403 });
+  }
+
+  // Parse form-urlencoded body into key-value params for signature computation
+  const params: Record<string, string> = {};
+  const formData = new URLSearchParams(rawBody);
+  for (const [key, value] of formData.entries()) {
+    params[key] = value;
+  }
+
+  // Reconstruct the public-facing URL that Twilio signed against.
+  // Behind proxies/gateways, req.url is the local server URL (e.g.
+  // http://127.0.0.1:7821/...) which differs from the public URL Twilio
+  // used to compute the HMAC-SHA1 signature.
+  let publicBaseUrl: string | undefined;
+  try {
+    publicBaseUrl = getPublicBaseUrl(loadConfig());
+  } catch {
+    // No webhook base URL configured -- fall back to using req.url as-is
+  }
+  const parsedUrl = new URL(req.url);
+  const publicUrl = publicBaseUrl
+    ? publicBaseUrl + parsedUrl.pathname + parsedUrl.search
+    : req.url;
+
+  const isValid = TwilioConversationRelayProvider.verifyWebhookSignature(
+    publicUrl,
+    params,
+    signature,
+    authToken,
+  );
+
+  if (!isValid) {
+    log.warn('Twilio webhook signature validation failed');
+    return Response.json({ error: 'Forbidden' }, { status: 403 });
+  }
+
+  return { body: rawBody };
+}
+
+/**
+ * Re-create a Request with the same method, headers, and URL but with a
+ * pre-read body string so downstream handlers can call req.text() again.
+ */
+export function cloneRequestWithBody(original: Request, body: string): Request {
+  return new Request(original.url, {
+    method: original.method,
+    headers: original.headers,
+    body,
+  });
+}
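
Review note: `verifyWebhookSignature` is not part of this diff, so the following is only a minimal sketch of the check it presumably performs, assuming it follows Twilio's documented scheme for form-encoded webhooks (HMAC-SHA1 over the public URL followed by each POST parameter name and value, sorted by name, then base64). The names `computeTwilioSignature` and `verifySignatureSketch` are hypothetical and do not exist in the codebase.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

/** Compute the signature for a form POST under Twilio's documented scheme. */
function computeTwilioSignature(
  url: string,
  params: Record<string, string>,
  authToken: string,
): string {
  // Append each POST parameter, sorted by name, to the URL as name + value.
  const payload = Object.keys(params)
    .sort()
    .reduce((acc, key) => acc + key + params[key], url);
  return createHmac('sha1', authToken).update(payload, 'utf8').digest('base64');
}

/** Hypothetical stand-in for verifyWebhookSignature; compares in constant time. */
function verifySignatureSketch(
  url: string,
  params: Record<string, string>,
  signature: string,
  authToken: string,
): boolean {
  const expected = Buffer.from(computeTwilioSignature(url, params, authToken), 'utf8');
  const received = Buffer.from(signature, 'utf8');
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Because the URL is part of the signed payload, a signature computed against the public URL will not verify against the local `req.url`, which is why the middleware rebuilds the public-facing URL before calling the verifier.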
diff --git a/assistant/src/runtime/routes/channel-guardian-routes.ts b/assistant/src/runtime/routes/channel-guardian-routes.ts
index d0ac925bfb3..3547e6e9ec1 100644
--- a/assistant/src/runtime/routes/channel-guardian-routes.ts
+++ b/assistant/src/runtime/routes/channel-guardian-routes.ts
@@ -39,7 +39,6 @@ import {
   parseCallbackData,
   requiredDecisionKeywords,
   buildGuardianDenyContext,
-  buildPromptDeliveryFailureContext,
 } from './channel-route-shared.js';
 import { schedulePostDecisionDelivery } from './channel-delivery-routes.js';
 
diff --git a/assistant/src/runtime/routes/channel-inbound-routes.ts b/assistant/src/runtime/routes/channel-inbound-routes.ts
index b49801f8b58..3da02201fef 100644
--- a/assistant/src/runtime/routes/channel-inbound-routes.ts
+++ b/assistant/src/runtime/routes/channel-inbound-routes.ts
@@ -12,7 +12,6 @@ import * as externalConversationStore from '../../memory/external-conversation-s
 import { getPendingConfirmationsByConversation } from '../../memory/runs-store.js';
 import { checkIngressForSecrets } from '../../security/secret-ingress.js';
 import { IngressBlockedError } from '../../util/errors.js';
-import { getConfig } from '../../config/loader.js';
 import { getLogger } from '../../util/logger.js';
 import { findMember, updateLastSeen } from '../../memory/ingress-member-store.js';
 import {
@@ -44,7 +43,6 @@ import type {
   ApprovalCopyGenerator,
   ApprovalConversationGenerator,
 } from '../http-types.js';
-import type { GuardianRuntimeContext } from '../../daemon/session-runtime-assembly.js';
 import { composeApprovalMessageGenerative } from '../approval-message-composer.js';
 import { refreshThreadEscalation } from '../../memory/inbox-escalation-projection.js';
 import {
@@ -58,7 +56,7 @@ import {
   RUN_POLL_INTERVAL_MS,
   getEffectivePollMaxWait,
 } from './channel-route-shared.js';
-import { deliverReplyViaCallback, schedulePostDecisionDelivery } from './channel-delivery-routes.js';
+import { deliverReplyViaCallback } from './channel-delivery-routes.js';
 import { handleApprovalInterception, deliverGeneratedApprovalPrompt } from './channel-guardian-routes.js';
 
 const log = getLogger('runtime-http');
@@ -187,12 +185,11 @@ export async function handleChannelInbound(
   }
 
   // ── Ingress ACL enforcement ──
-  const inboxConfig = getConfig().assistantInbox;
   // Track the resolved member so the escalate branch can reference it after
   // recordInbound (where we have a conversationId).
   let resolvedMember: ReturnType<typeof findMember> = null;
 
-  if (inboxConfig.enabled && inboxConfig.memberAclEnabled && body.senderExternalUserId) {
+  if (body.senderExternalUserId) {
     resolvedMember = findMember({
       sourceChannel,
       externalUserId: body.senderExternalUserId,
@@ -889,7 +886,7 @@ function processChannelMessageWithApprovals(params: ApprovalProcessingParams): v
         assistantMessageChannel: sourceChannel,
       };
 
-      const run = await orchestrator.startRun(
+      const { run } = await orchestrator.startRun(
         conversationId,
         content,
         attachmentIds,
diff --git a/assistant/src/runtime/routes/identity-routes.ts b/assistant/src/runtime/routes/identity-routes.ts
new file mode 100644
index 00000000000..d57321399ff
--- /dev/null
+++ b/assistant/src/runtime/routes/identity-routes.ts
@@ -0,0 +1,126 @@
+/**
+ * Identity and health endpoint handlers.
+ */
+
+import { existsSync, readFileSync, statSync, statfsSync } from 'node:fs';
+import { join, dirname } from 'node:path';
+import { fileURLToPath } from 'node:url';
+import { getWorkspacePromptPath, readLockfile } from '../../util/platform.js';
+import { getBaseDataDir } from '../../config/env-registry.js';
+
+interface DiskSpaceInfo {
+  path: string;
+  totalMb: number;
+  usedMb: number;
+  freeMb: number;
+}
+
+function getDiskSpaceInfo(): DiskSpaceInfo | null {
+  try {
+    const baseDataDir = getBaseDataDir();
+    const diskPath = baseDataDir && existsSync(baseDataDir) ? baseDataDir : '/';
+    const stats = statfsSync(diskPath);
+    const totalBytes = stats.bsize * stats.blocks;
+    const freeBytes = stats.bsize * stats.bavail;
+    const bytesToMb = (b: number) => Math.round((b / (1024 * 1024)) * 100) / 100;
+    return {
+      path: diskPath,
+      totalMb: bytesToMb(totalBytes),
+      usedMb: bytesToMb(totalBytes - freeBytes),
+      freeMb: bytesToMb(freeBytes),
+    };
+  } catch {
+    return null;
+  }
+}
+
+export function handleHealth(): Response {
+  return Response.json({
+    status: 'healthy',
+    timestamp: new Date().toISOString(),
+    disk: getDiskSpaceInfo(),
+  });
+}
+
+export function handleGetIdentity(): Response {
+  const identityPath = getWorkspacePromptPath('IDENTITY.md');
+  if (!existsSync(identityPath)) {
+    return Response.json({ error: 'IDENTITY.md not found' }, { status: 404 });
+  }
+
+  const content = readFileSync(identityPath, 'utf-8');
+  const fields: Record<string, string> = {};
+  for (const line of content.split('\n')) {
+    const trimmed = line.trim();
+    const lower = trimmed.toLowerCase();
+    const extract = (prefix: string): string | null => {
+      if (!lower.startsWith(prefix)) return null;
+      return trimmed.split(':**').pop()?.trim() ?? null;
+    };
+
+    const name = extract('- **name:**');
+    if (name) { fields.name = name; continue; }
+    const role = extract('- **role:**');
+    if (role) { fields.role = role; continue; }
+    const personality = extract('- **personality:**') ?? extract('- **vibe:**');
+    if (personality) { fields.personality = personality; continue; }
+    const emoji = extract('- **emoji:**');
+    if (emoji) { fields.emoji = emoji; continue; }
+    const home = extract('- **home:**');
+    if (home) { fields.home = home; continue; }
+  }
+
+  // Read version from package.json
+  let version: string | undefined;
+  try {
+    const pkgPath = join(dirname(fileURLToPath(import.meta.url)), '../../../package.json');
+    const pkg = JSON.parse(readFileSync(pkgPath, 'utf-8'));
+    version = pkg.version;
+  } catch {
+    // ignore
+  }
+
+  // Read createdAt from IDENTITY.md file birthtime
+  let createdAt: string | undefined;
+  try {
+    const stats = statSync(identityPath);
+    createdAt = stats.birthtime.toISOString();
+  } catch {
+    // ignore
+  }
+
+  // Read lockfile for assistantId, cloud, and originSystem
+  let assistantId: string | undefined;
+  let cloud: string | undefined;
+  let originSystem: string | undefined;
+  try {
+    const lockData = readLockfile();
+    const assistants = lockData?.assistants as Array<Record<string, unknown>> | undefined;
+    if (assistants && assistants.length > 0) {
+      // Use the most recently hatched assistant
+      const sorted = [...assistants].sort((a, b) => {
+        const dateA = new Date(a.hatchedAt as string || 0).getTime();
+        const dateB = new Date(b.hatchedAt as string || 0).getTime();
+        return dateB - dateA;
+      });
+      const latest = sorted[0];
+      assistantId = latest.assistantId as string | undefined;
+      cloud = latest.cloud as string | undefined;
+      originSystem = cloud;
+    }
+  } catch {
+    // ignore -- lockfile may not exist
+  }
+
+  return Response.json({
+    name: fields.name ?? '',
+    role: fields.role ?? '',
+    personality: fields.personality ?? '',
+    emoji: fields.emoji ?? '',
+    home: fields.home ?? '',
+    version,
+    assistantId,
+    createdAt,
+    originSystem,
+  });
+}
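
Review note: the field extraction in `handleGetIdentity` can be exercised in isolation. The sketch below is an alternative formulation, not the code in this diff: it assumes the same `- **Key:** value` bullet format and the same five fields, but uses a regular expression instead of `split(':**')`. The name `parseIdentityFields` is hypothetical.

```typescript
// Fields recognised by the handler; "vibe" is an alias for "personality".
const KNOWN_FIELDS = new Set(['name', 'role', 'personality', 'emoji', 'home']);

/** Parse "- **Key:** value" bullet lines into a record of known identity fields. */
function parseIdentityFields(content: string): Record<string, string> {
  const fields: Record<string, string> = {};
  for (const line of content.split('\n')) {
    // Keys are matched case-insensitively; surrounding whitespace is ignored.
    const match = /^- \*\*([^:*]+):\*\*\s*(.*)$/.exec(line.trim());
    if (!match) continue;
    const [, rawKey = '', rawValue = ''] = match;
    const key = rawKey.trim().toLowerCase();
    const value = rawValue.trim();
    const normalized = key === 'vibe' ? 'personality' : key;
    // Skip unknown keys and empty values, as the handler does.
    if (!value || !KNOWN_FIELDS.has(normalized)) continue;
    fields[normalized] = value;
  }
  return fields;
}
```

As in the handler, a later line for the same field overwrites an earlier one, and a field with an empty value is left unset so the response falls back to `''`.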
diff --git a/assistant/src/runtime/routes/pairing-routes.ts b/assistant/src/runtime/routes/pairing-routes.ts
new file mode 100644
index 00000000000..f5160ad421c
--- /dev/null
+++ b/assistant/src/runtime/routes/pairing-routes.ts
@@ -0,0 +1,144 @@
+/**
+ * Pairing HTTP route handlers for device pairing flow.
+ */
+
+import { getLogger } from '../../util/logger.js';
+import { PairingStore } from '../../daemon/pairing-store.js';
+import {
+  isDeviceApproved,
+  refreshDevice,
+  hashDeviceId,
+} from '../../daemon/approved-devices-store.js';
+import type { ServerMessage } from '../../daemon/ipc-contract.js';
+
+const log = getLogger('runtime-http');
+
+export interface PairingHandlerContext {
+  pairingStore: PairingStore;
+  bearerToken: string | undefined;
+  pairingBroadcast?: (msg: ServerMessage) => void;
+}
+
+/**
+ * POST /v1/pairing/register -- Bearer-authenticated.
+ * macOS pre-registers a pairing request when the QR is displayed.
+ */
+export async function handlePairingRegister(req: Request, ctx: PairingHandlerContext): Promise<Response> {
+  try {
+    const body = await req.json() as Record<string, unknown>;
+    const pairingRequestId = typeof body.pairingRequestId === 'string' ? body.pairingRequestId : '';
+    const pairingSecret = typeof body.pairingSecret === 'string' ? body.pairingSecret : '';
+    const gatewayUrl = typeof body.gatewayUrl === 'string' ? body.gatewayUrl : '';
+    const localLanUrl = typeof body.localLanUrl === 'string' ? body.localLanUrl : null;
+
+    if (!pairingRequestId || !pairingSecret || !gatewayUrl) {
+      return Response.json({ error: 'Missing required fields: pairingRequestId, pairingSecret, gatewayUrl' }, { status: 400 });
+    }
+
+    const result = ctx.pairingStore.register({ pairingRequestId, pairingSecret, gatewayUrl, localLanUrl });
+    if (!result.ok) {
+      return Response.json({ error: 'Conflict: pairingRequestId exists with different secret' }, { status: 409 });
+    }
+
+    return Response.json({ ok: true });
+  } catch (err) {
+    log.error({ err }, 'Failed to register pairing request');
+    return Response.json({ error: 'Internal server error' }, { status: 500 });
+  }
+}
+
+/**
+ * POST /v1/pairing/request -- Unauthenticated (secret-gated).
+ * iOS initiates a pairing handshake.
+ */
+export async function handlePairingRequest(req: Request, ctx: PairingHandlerContext): Promise<Response> {
+  try {
+    const body = await req.json() as Record<string, unknown>;
+    const pairingRequestId = typeof body.pairingRequestId === 'string' ? body.pairingRequestId : '';
+    const pairingSecret = typeof body.pairingSecret === 'string' ? body.pairingSecret : '';
+    const deviceId = typeof body.deviceId === 'string' ? body.deviceId.trim() : '';
+    const deviceName = typeof body.deviceName === 'string' ? body.deviceName.trim() : '';
+
+    // Log only non-secret fields; pairingSecret and the raw deviceId are deliberately omitted
+    log.info({ pairingRequestId, deviceName, hasDeviceId: !!deviceId }, 'Pairing request received');
+
+    if (!deviceId || !deviceName) {
+      return Response.json({ error: 'Missing required fields: deviceId, deviceName' }, { status: 400 });
+    }
+
+    if (!pairingRequestId || !pairingSecret) {
+      return Response.json({ error: 'Missing required fields: pairingRequestId, pairingSecret' }, { status: 400 });
+    }
+
+    const result = ctx.pairingStore.beginRequest({ pairingRequestId, pairingSecret, deviceId, deviceName });
+    if (!result.ok) {
+      const statusCode = result.reason === 'invalid_secret' || result.reason === 'not_found' ? 403 : 410;
+      return Response.json({ error: 'Forbidden' }, { status: statusCode });
+    }
+
+    const entry = result.entry;
+    const hashedDeviceId = hashDeviceId(deviceId);
+
+    // Auto-approve if device is in the allowlist
+    if (isDeviceApproved(hashedDeviceId) && ctx.bearerToken) {
+      refreshDevice(hashedDeviceId, deviceName);
+      ctx.pairingStore.approve(pairingRequestId, ctx.bearerToken);
+      log.info({ pairingRequestId, hashedDeviceId }, 'Auto-approved allowlisted device');
+      return Response.json({
+        status: 'approved',
+        bearerToken: ctx.bearerToken,
+        gatewayUrl: entry.gatewayUrl,
+        localLanUrl: entry.localLanUrl,
+      });
+    }
+
+    // Send IPC to macOS to show approval prompt
+    if (ctx.pairingBroadcast) {
+      ctx.pairingBroadcast({
+        type: 'pairing_approval_request',
+        pairingRequestId,
+        deviceId: hashedDeviceId,
+        deviceName,
+      });
+    }
+
+    return Response.json({ status: 'pending' });
+  } catch (err) {
+    log.error({ err }, 'Failed to process pairing request');
+    return Response.json({ error: 'Internal server error' }, { status: 500 });
+  }
+}
+
+/**
+ * GET /v1/pairing/status?id=<id>&secret=<secret> -- Unauthenticated (secret-gated).
+ * iOS polls for approval status.
+ */
+export function handlePairingStatus(url: URL, ctx: PairingHandlerContext): Response {
+  const id = url.searchParams.get('id') ?? '';
+  // Note: secret is redacted from logs
+  const secret = url.searchParams.get('secret') ?? '';
+
+  if (!id || !secret) {
+    return Response.json({ error: 'Missing required params: id, secret' }, { status: 400 });
+  }
+
+  if (!ctx.pairingStore.validateSecret(id, secret)) {
+    return Response.json({ error: 'Forbidden' }, { status: 403 });
+  }
+
+  const entry = ctx.pairingStore.get(id);
+  if (!entry) {
+    return Response.json({ error: 'Not found' }, { status: 404 });
+  }
+
+  if (entry.status === 'approved') {
+    return Response.json({
+      status: 'approved',
+      bearerToken: entry.bearerToken,
+      gatewayUrl: entry.gatewayUrl,
+      localLanUrl: entry.localLanUrl,
+    });
+  }
+
+  return Response.json({ status: entry.status });
+}
diff --git a/assistant/src/runtime/routes/run-routes.ts b/assistant/src/runtime/routes/run-routes.ts
index 8f3951ac0e9..2a07fe60235 100644
--- a/assistant/src/runtime/routes/run-routes.ts
+++ b/assistant/src/runtime/routes/run-routes.ts
@@ -66,7 +66,7 @@ export async function handleCreateRun(
   const mapping = getOrCreateConversation(conversationKey);
 
   try {
-    const run = await runOrchestrator.startRun(
+    const { run } = await runOrchestrator.startRun(
       mapping.conversationId,
       content ?? '',
       hasAttachments ? attachmentIds : undefined,
diff --git a/assistant/src/runtime/run-orchestrator.ts b/assistant/src/runtime/run-orchestrator.ts
index 5f600726380..04b5ef6d47f 100644
--- a/assistant/src/runtime/run-orchestrator.ts
+++ b/assistant/src/runtime/run-orchestrator.ts
@@ -34,6 +34,29 @@ const log = getLogger('run-orchestrator');
 // Types
 // ---------------------------------------------------------------------------
 
+/**
+ * Real-time event sink for voice TTS streaming. When provided to startRun(),
+ * agent-loop events are forwarded here alongside the existing assistantEventHub
+ * publication. This enables voice relay to receive streaming text deltas for
+ * real-time text-to-speech without modifying the standard channel path.
+ */
+export interface VoiceRunEventSink {
+  onTextDelta(text: string): void;
+  onMessageComplete(): void;
+  onError(message: string): void;
+  onToolUse(toolName: string, input: Record<string, unknown>): void;
+}
+
+/**
+ * Handle returned by startRun() that allows callers to abort an in-flight
+ * run. Used by voice barge-in to cancel the current turn without crashing
+ * session state.
+ */
+export interface RunHandle {
+  run: Run;
+  abort: () => void;
+}
+
 interface PendingRunState {
   prompterRequestId: string;
   session: Session;
@@ -92,6 +115,19 @@ export interface RunStartOptions {
   commandIntent?: { type: string; payload?: string; languageCode?: string };
   /** Resolved channel context for this turn. */
   turnChannelContext?: TurnChannelContext;
+  /**
+   * When provided, agent-loop events are forwarded to this sink in real time.
+   * Used by voice relay for streaming TTS token delivery.
+   */
+  eventSink?: VoiceRunEventSink;
+  /**
+   * When true, any confirmation_request from the prompter is immediately
+   * auto-denied instead of being stored for client polling. Used by the
+   * voice path when forceStrictSideEffects is active: the voice transport
+   * has no interactive approval UI, so without this flag the run would
+   * stall for the full permission timeout (300s by default).
+   */
+  voiceAutoDenyConfirmations?: boolean;
 }
 
 // ---------------------------------------------------------------------------
@@ -116,13 +152,16 @@ export class RunOrchestrator {
   /**
    * Start a new run: persist the user message, create a run record,
    * and fire the agent loop in the background.
+   *
+   * Returns a RunHandle containing the Run record and an abort() function
+   * that can cancel the in-flight agent loop (e.g. for voice barge-in).
    */
   async startRun(
     conversationId: string,
     content: string,
     attachmentIds?: string[],
     options?: RunStartOptions,
-  ): Promise<Run> {
+  ): Promise<RunHandle> {
     // Block inbound content that contains secrets — mirrors the IPC check in sessions.ts
     const ingressCheck = checkIngressForSecrets(content);
     if (ingressCheck.blocked) {
@@ -202,9 +241,33 @@ export class RunOrchestrator {
     // When the prompter sends one of these, we record it in the run store so
     // the client can poll and submit a decision/secret via the respective endpoint.
     // Do NOT set hasNoClient — run sessions have a client (the HTTP caller).
+    const autoDeny = options?.voiceAutoDenyConfirmations === true;
     let lastError: string | null = null;
     session.updateClient((msg: ServerMessage) => {
       if (msg.type === 'confirmation_request') {
+        if (autoDeny) {
+          // Voice path with strict side effects: immediately deny the
+          // confirmation request so the agent loop resumes without
+          // waiting for the full permission timeout (300s). The voice
+          // transport has no interactive approval UI, so polling would
+          // just stall. Security is preserved — the tool call is denied.
+          log.info(
+            { runId: run.id, toolName: msg.toolName },
+            'Auto-denying confirmation request for voice turn (forceStrictSideEffects)',
+          );
+          session.handleConfirmationResponse(
+            msg.requestId,
+            'deny',
+            undefined,
+            undefined,
+            `Permission denied for "${msg.toolName}": this voice call does not have interactive approval capabilities. Side-effect tools are not available for non-guardian voice callers. In your next assistant reply, explain briefly that this action requires guardian-level access and cannot be performed during this call.`,
+          );
+          // Still publish to hub for observability, but skip run-store
+          // bookkeeping since the confirmation is already resolved.
+          publishToHub(msg);
+          return;
+        }
+
         runsStore.setRunConfirmation(run.id, {
           toolName: msg.toolName,
           toolUseId: msg.requestId,
@@ -256,6 +319,8 @@ export class RunOrchestrator {
       session.updateClient(() => {}, true);
     };
 
+    const eventSink = options?.eventSink;
+
     void (async () => {
       try {
         await session.runAgentLoop(content, messageId, (msg: ServerMessage) => {
@@ -270,6 +335,27 @@ export class RunOrchestrator {
           // prompter (confirmation_request). Both paths must publish so SSE
           // consumers receive the full response stream.
           publishToHub(msg);
+
+          // Forward voice-relevant events to the real-time event sink when
+          // provided. This runs in addition to (not instead of) the hub
+          // publication above so both paths remain active.
+          if (eventSink) {
+            if (msg.type === 'assistant_text_delta') {
+              eventSink.onTextDelta(msg.text);
+            } else if (msg.type === 'message_complete') {
+              eventSink.onMessageComplete();
+            } else if (msg.type === 'generation_cancelled') {
+              // Treat cancellation as a completed turn so the voice
+              // turnComplete promise settles instead of hanging forever.
+              eventSink.onMessageComplete();
+            } else if (msg.type === 'error') {
+              eventSink.onError(msg.message);
+            } else if (msg.type === 'session_error') {
+              eventSink.onError(msg.userMessage);
+            } else if (msg.type === 'tool_use_start') {
+              eventSink.onToolUse(msg.toolName, msg.input);
+            }
+          }
         });
         if (lastError) {
           log.error({ runId: run.id, error: lastError }, 'Run failed (error event from agent loop)');
@@ -281,12 +367,28 @@ export class RunOrchestrator {
         const message = err instanceof Error ? err.message : String(err);
         log.error({ err, runId: run.id }, 'Run failed');
         runsStore.failRun(run.id, message);
+        // Notify the voice event sink so the caller's turnComplete
+        // promise settles instead of hanging on unhandled exceptions.
+        if (eventSink) {
+          eventSink.onError(message);
+        }
       } finally {
         cleanup();
       }
     })();
 
-    return run;
+    return {
+      run,
+      // Scope the abort to this specific run by capturing the requestId.
+      // If the session has moved on to a new turn (different currentRequestId),
+      // this abort is stale and becomes a no-op — preventing voice barge-in
+      // from cancelling unrelated turns.
+      abort: () => {
+        if (session.currentRequestId === requestId) {
+          session.abort();
+        }
+      },
+    };
   }
 
   /** Read current run state from the store. */
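
Review note: the stale-abort guard returned from `startRun()` can be illustrated without the real `Session`. The sketch below is a simplified model under stated assumptions: `FakeSession`, `AbortHandle`, and `startTurn` are hypothetical names, and the real session does far more than track a request ID.

```typescript
/** Hypothetical minimal session: tracks the active request and counts aborts. */
class FakeSession {
  currentRequestId: string | null = null;
  abortCount = 0;

  abort(): void {
    this.abortCount += 1;
  }
}

interface AbortHandle {
  abort: () => void;
}

/** Begin a turn and return a handle whose abort() only affects that turn. */
function startTurn(session: FakeSession, requestId: string): AbortHandle {
  session.currentRequestId = requestId;
  return {
    abort: () => {
      // A handle from an earlier turn becomes a no-op once the session
      // has moved on to a different request.
      if (session.currentRequestId === requestId) {
        session.abort();
      }
    },
  };
}
```

With this shape, a voice barge-in that fires late against an old handle cannot cancel the turn that replaced it; only the handle for the currently active request reaches `session.abort()`.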
diff --git a/assistant/src/schedule/schedule-store.ts b/assistant/src/schedule/schedule-store.ts
index 04bcde897c1..b977ef643df 100644
--- a/assistant/src/schedule/schedule-store.ts
+++ b/assistant/src/schedule/schedule-store.ts
@@ -2,6 +2,7 @@ import { and, asc, desc, eq, lte } from 'drizzle-orm';
 import { v4 as uuid } from 'uuid';
 import { Cron } from 'croner';
 import { getDb } from '../memory/db.js';
+import { rawChanges } from '../memory/raw-query.js';
 import { scheduleJobs, scheduleRuns } from '../memory/schema.js';
 import { computeNextRunAt as computeNextRunAtEngine, isValidScheduleExpression } from './recurrence-engine.js';
 import { getLogger } from '../util/logger.js';
@@ -189,8 +190,8 @@ export function updateSchedule(
 
 export function deleteSchedule(id: string): boolean {
   const db = getDb();
-  const result = db.delete(scheduleJobs).where(eq(scheduleJobs.id, id)).run() as unknown as { changes?: number };
-  return (result.changes ?? 0) > 0;
+  db.delete(scheduleJobs).where(eq(scheduleJobs.id, id)).run();
+  return rawChanges() > 0;
 }
 
 /**
@@ -244,13 +245,13 @@ export function claimDueSchedules(now: number): ScheduleJob[] {
       updates.nextRunAt = newNextRunAt!;
     }
 
-    const result = db
+    db
       .update(scheduleJobs)
       .set(updates)
       .where(and(eq(scheduleJobs.id, row.id), eq(scheduleJobs.nextRunAt, row.nextRunAt)))
-      .run() as unknown as { changes?: number };
+      .run();
 
-    if ((result.changes ?? 0) === 0) continue;
+    if (rawChanges() === 0) continue;
 
     claimed.push(parseJobRow({
       ...row,
@@ -359,7 +360,14 @@ export function formatLocalDate(timestamp: number): string {
 export function describeCronExpression(expr: string): string {
   try {
     const cron = new Cron(expr, { maxRuns: 0 });
-    const p = (cron as unknown as { _states: { pattern: {
+    // Access Croner internal state to extract the parsed cron pattern.
+    // This is fragile but necessary — Croner doesn't expose a public API for this.
+    const cronInternal = cron as unknown as Record<string, unknown>;
+    const states = cronInternal._states;
+    if (!states || typeof states !== 'object') return expr;
+    const p = (states as Record<string, unknown>).pattern;
+    if (!p || typeof p !== 'object') return expr;
+    const pattern = p as {
       minute: number[];
       hour: number[];
       day: number[];
@@ -367,18 +375,18 @@ export function describeCronExpression(expr: string): string {
       dayOfWeek: number[];
       starDOM: boolean;
       starDOW: boolean;
-    } } })._states.pattern;
+    };
 
-    const activeMinutes = p.minute.reduce<number[]>((acc, v, i) => { if (v) acc.push(i); return acc; }, []);
-    const activeHours = p.hour.reduce<number[]>((acc, v, i) => { if (v) acc.push(i); return acc; }, []);
-    const activeDays = p.day.reduce<number[]>((acc, v, i) => { if (v) acc.push(i + 1); return acc; }, []);
-    const activeDOW = p.dayOfWeek.reduce<number[]>((acc, v, i) => { if (v) acc.push(i); return acc; }, []);
-    const activeMonths = p.month.reduce<number[]>((acc, v, i) => { if (v) acc.push(i + 1); return acc; }, []);
+    const activeMinutes = pattern.minute.reduce<number[]>((acc, v, i) => { if (v) acc.push(i); return acc; }, []);
+    const activeHours = pattern.hour.reduce<number[]>((acc, v, i) => { if (v) acc.push(i); return acc; }, []);
+    const activeDays = pattern.day.reduce<number[]>((acc, v, i) => { if (v) acc.push(i + 1); return acc; }, []);
+    const activeDOW = pattern.dayOfWeek.reduce<number[]>((acc, v, i) => { if (v) acc.push(i); return acc; }, []);
+    const activeMonths = pattern.month.reduce<number[]>((acc, v, i) => { if (v) acc.push(i + 1); return acc; }, []);
 
     const allMinutes = activeMinutes.length === 60;
     const allHours = activeHours.length === 24;
-    const allDays = p.starDOM;
-    const allDOW = p.starDOW;
+    const allDays = pattern.starDOM;
+    const allDOW = pattern.starDOW;
     const allMonths = activeMonths.length === 12;
 
     const fixedMinute = activeMinutes.length === 1;
diff --git a/assistant/src/skills/vellum-catalog-remote.ts b/assistant/src/skills/vellum-catalog-remote.ts
index 9571339ac15..877ab298ff6 100644
--- a/assistant/src/skills/vellum-catalog-remote.ts
+++ b/assistant/src/skills/vellum-catalog-remote.ts
@@ -1,13 +1,14 @@
 import { readFileSync } from 'node:fs';
 import { join } from 'node:path';
+import { gunzipSync } from 'node:zlib';
 
 import type { CatalogEntry } from '../tools/skills/vellum-catalog.js';
 import { getLogger } from '../util/logger.js';
+import { readPlatformToken } from '../util/platform.js';
 
 const log = getLogger('vellum-catalog-remote');
 
-const GITHUB_RAW_BASE =
-  'https://raw.githubusercontent.com/vellum-ai/vellum-assistant/main/assistant/src/config/vellum-skills';
+const PLATFORM_URL = process.env.VELLUM_ASSISTANT_PLATFORM_URL ?? 'https://assistant.vellum.ai';
 
 const CACHE_TTL_MS = 60 * 60 * 1000; // 1 hour
 
@@ -43,7 +44,20 @@ function getBundledSkillContent(skillId: string): string | null {
   }
 }
 
-/** Fetch catalog entries (cached, async). Falls back to bundled copy. */
+/** Build request headers, including platform token when available. */
+function buildPlatformHeaders(): Record<string, string> {
+  const headers: Record<string, string> = {};
+  const token = readPlatformToken();
+  if (token) {
+    headers['X-Session-Token'] = token;
+  }
+  return headers;
+}
+
+/**
+ * Fetch catalog entries from the platform API. Falls back to bundled copy.
+ * Reads the platform token from ~/.vellum/platform-token automatically.
+ */
 export async function fetchCatalogEntries(): Promise<CatalogEntry[]> {
   const now = Date.now();
   if (cachedEntries && now - cacheTimestamp < CACHE_TTL_MS) {
@@ -51,8 +65,9 @@ export async function fetchCatalogEntries(): Promise<CatalogEntry[]> {
   }
 
   try {
-    const url = `${GITHUB_RAW_BASE}/catalog.json`;
+    const url = `${PLATFORM_URL}/v1/skills/`;
     const response = await fetch(url, {
+      headers: buildPlatformHeaders(),
       signal: AbortSignal.timeout(5000),
     });
 
@@ -63,14 +78,14 @@ export async function fetchCatalogEntries(): Promise<CatalogEntry[]> {
     const manifest: CatalogManifest = await response.json();
     const skills = manifest.skills;
     if (!Array.isArray(skills) || skills.length === 0) {
-      throw new Error('Remote catalog has invalid or empty skills array');
+      throw new Error('Platform catalog has invalid or empty skills array');
     }
     cachedEntries = skills;
     cacheTimestamp = now;
-    log.info({ count: cachedEntries.length }, 'Fetched remote vellum-skills catalog');
+    log.info({ count: cachedEntries.length }, 'Fetched vellum-skills catalog from platform API');
     return cachedEntries;
   } catch (err) {
-    log.warn({ err }, 'Failed to fetch remote catalog, falling back to bundled copy');
+    log.warn({ err }, 'Failed to fetch catalog from platform API, falling back to bundled copy');
     const bundled = loadBundledCatalog();
     // Cache the bundled result too so we don't re-fetch on every call during outage
     cachedEntries = bundled;
@@ -79,28 +94,72 @@ export async function fetchCatalogEntries(): Promise<CatalogEntry[]> {
   }
 }
 
-/** Fetch a skill's SKILL.md content from GitHub. Falls back to bundled copy. */
+/**
+ * Extract SKILL.md content from a tar archive (uncompressed).
+ * Tar format: 512-byte header blocks followed by file data blocks.
+ */
+function extractSkillMdFromTar(tarBuffer: Buffer): string | null {
+  let offset = 0;
+  while (offset + 512 <= tarBuffer.length) {
+    const header = tarBuffer.subarray(offset, offset + 512);
+
+    // Stop at the first all-zero header block (archives end with two of them)
+    if (header.every((b) => b === 0)) break;
+
+    // Extract filename (bytes 0-99, null-terminated)
+    const nameEnd = header.indexOf(0, 0);
+    const name = header.subarray(0, Math.min(nameEnd >= 0 ? nameEnd : 100, 100)).toString('utf-8');
+
+    // Extract file size (bytes 124-135, octal string)
+    const sizeStr = header.subarray(124, 136).toString('utf-8').trim();
+    const size = parseInt(sizeStr, 8) || 0;
+
+    offset += 512; // move past header
+
+    if (name === 'SKILL.md' || name.endsWith('/SKILL.md')) {
+      return tarBuffer.subarray(offset, offset + size).toString('utf-8');
+    }
+
+    // Skip to next header (data blocks are padded to 512 bytes)
+    offset += Math.ceil(size / 512) * 512;
+  }
+  return null;
+}
+
+/**
+ * Fetch a skill's SKILL.md content from the platform tar API.
+ * GET /v1/skills/{skill_id}/ returns a tar.gz archive containing all skill files.
+ * Falls back to bundled copy on failure.
+ */
 export async function fetchSkillContent(skillId: string): Promise<string | null> {
   try {
-    const url = `${GITHUB_RAW_BASE}/${encodeURIComponent(skillId)}/SKILL.md`;
+    const url = `${PLATFORM_URL}/v1/skills/${encodeURIComponent(skillId)}/`;
     const response = await fetch(url, {
-      signal: AbortSignal.timeout(10000),
+      headers: buildPlatformHeaders(),
+      signal: AbortSignal.timeout(15000),
     });
 
     if (!response.ok) {
       throw new Error(`HTTP ${response.status}: ${response.statusText}`);
     }
 
-    const content = await response.text();
-    log.info({ skillId }, 'Fetched remote SKILL.md');
-    return content;
+    const gzipBuffer = Buffer.from(await response.arrayBuffer());
+    const tarBuffer = gunzipSync(gzipBuffer);
+    const skillMd = extractSkillMdFromTar(tarBuffer);
+
+    if (skillMd) {
+      return skillMd;
+    }
+
+    log.warn({ skillId }, 'SKILL.md not found in platform tar archive, falling back to bundled');
   } catch (err) {
-    log.warn({ err, skillId }, 'Failed to fetch remote SKILL.md, falling back to bundled copy');
-    return getBundledSkillContent(skillId);
+    log.warn({ err, skillId }, 'Failed to fetch skill content from platform API, falling back to bundled');
   }
+
+  return getBundledSkillContent(skillId);
 }
 
-/** Check if a skill ID exists in the remote catalog. */
+/** Check if a skill ID exists in the catalog. */
 export async function checkVellumSkill(skillId: string): Promise<boolean> {
   const entries = await fetchCatalogEntries();
   return entries.some((e) => e.id === skillId);
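Review note: the tar walk in `extractSkillMdFromTar` above can be exercised in isolation. The following is a minimal sketch, not the PR's code: `makeTar` and `findEntry` are hypothetical names, the archive has a single entry, and the header checksum is left unset because the parser above never reads it.

```typescript
import { gunzipSync, gzipSync } from 'node:zlib';

/** Build a one-entry tar archive in memory (hypothetical helper for illustration). */
function makeTar(name: string, content: string): Buffer {
  const data = Buffer.from(content, 'utf-8');
  const header = Buffer.alloc(512);
  header.write(name, 0, 100, 'utf-8');
  // Size field lives at bytes 124-135: 11 octal digits followed by a NUL.
  header.write(data.length.toString(8).padStart(11, '0'), 124, 11, 'ascii');
  // File data is padded up to a multiple of 512 bytes.
  const padded = Buffer.alloc(Math.ceil(data.length / 512) * 512);
  data.copy(padded);
  // Two all-zero blocks terminate the archive.
  return Buffer.concat([header, padded, Buffer.alloc(1024)]);
}

/** Walk 512-byte headers and return the first entry whose path ends in `wanted`. */
function findEntry(tar: Buffer, wanted: string): string | null {
  let offset = 0;
  while (offset + 512 <= tar.length) {
    const header = tar.subarray(offset, offset + 512);
    if (header.every((b) => b === 0)) break; // end-of-archive marker
    const nameEnd = header.indexOf(0);
    const nameLen = nameEnd >= 0 && nameEnd < 100 ? nameEnd : 100;
    const name = header.subarray(0, nameLen).toString('utf-8');
    // parseInt stops at the trailing NUL, so no explicit stripping is needed.
    const size = parseInt(header.subarray(124, 136).toString('ascii'), 8) || 0;
    offset += 512; // move past the header
    if (name === wanted || name.endsWith('/' + wanted)) {
      return tar.subarray(offset, offset + size).toString('utf-8');
    }
    offset += Math.ceil(size / 512) * 512; // skip padded data blocks
  }
  return null;
}

// Round-trip through gzip, mirroring the tar.gz response described above.
const archive = gzipSync(makeTar('demo/SKILL.md', '# Demo skill\n'));
console.log(findEntry(gunzipSync(archive), 'SKILL.md'));
```

Matching on the exact name or a `/`-prefixed suffix avoids picking up unrelated files such as `OTHER_SKILL.md`. Whether nested `SKILL.md` files should also match is a design choice for the PR author.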
diff --git a/assistant/src/tools/browser/browser-execution.ts b/assistant/src/tools/browser/browser-execution.ts
index 603a1272590..f67ac7103fa 100644
--- a/assistant/src/tools/browser/browser-execution.ts
+++ b/assistant/src/tools/browser/browser-execution.ts
@@ -93,6 +93,10 @@ export async function executeBrowserNavigate(
   input: Record<string, unknown>,
   context: ToolContext,
 ): Promise<ToolExecutionResult> {
+  if (context.signal?.aborted) {
+    return { content: 'Error: operation was cancelled', isError: true };
+  }
+
   const parsedUrl = parseUrl(input.url);
   if (!parsedUrl) {
     return { content: 'Error: url is required and must be a valid HTTP(S) URL', isError: true };
@@ -316,6 +320,7 @@ export async function executeBrowserNavigate(
       if (challenge?.type === 'captcha') {
         log.info('CAPTCHA detected, waiting up to 5s for auto-resolve');
         for (let i = 0; i < 5; i++) {
+          if (context.signal?.aborted) break;
           await new Promise((r) => setTimeout(r, 1000));
           const still = await detectCaptchaChallenge(page);
           if (!still) {
@@ -797,6 +802,10 @@ export async function executeBrowserWaitFor(
   input: Record<string, unknown>,
   context: ToolContext,
 ): Promise<ToolExecutionResult> {
+  if (context.signal?.aborted) {
+    return { content: 'Error: operation was cancelled', isError: true };
+  }
+
   const selector = typeof input.selector === 'string' && input.selector ? input.selector : null;
   const text = typeof input.text === 'string' && input.text ? input.text : null;
   const duration = typeof input.duration === 'number' ? input.duration : null;
diff --git a/assistant/src/tools/executor.ts b/assistant/src/tools/executor.ts
index e38c0af3380..41957c00f15 100644
--- a/assistant/src/tools/executor.ts
+++ b/assistant/src/tools/executor.ts
@@ -186,14 +186,14 @@ export class ToolExecutor {
 
     try {
       // Check permissions
-      const risk = await classifyRisk(name, input, context.workingDir);
+      const risk = await classifyRisk(name, input, context.workingDir, undefined, undefined, context.signal);
       riskLevel = risk;
 
       // Build principal context from tool metadata so policy rules can
       // distinguish skill-provided tools from core built-ins. Also includes
       // ephemeral rules when executing within a task run.
       const policyContext = buildPolicyContext(tool, context);
-      const result = await check(name, input, context.workingDir, policyContext);
+      const result = await check(name, input, context.workingDir, policyContext, undefined, context.signal);
 
       // Private threads force prompting for side-effect tools even when a
       // trust/allow rule would auto-allow. Deny decisions are preserved —
@@ -255,7 +255,7 @@ export class ToolExecutor {
         }
 
         // Need user approval
-        const allowlistOptions = await generateAllowlistOptions(name, input);
+        const allowlistOptions = await generateAllowlistOptions(name, input, context.signal);
         const scopeOptions = generateScopeOptions(context.workingDir, name);
 
         // Compute preview diff for file tools so the user sees what will change
@@ -313,6 +313,7 @@ export class ToolExecutor {
           context.conversationId,
           executionTarget,
           persistentDecisionsAllowed,
+          context.signal,
         );
 
         decision = response.decision;
@@ -632,6 +633,7 @@ export class ToolExecutor {
               context.conversationId,
               executionTarget,
               false, // no persistent decisions
+              context.signal,
             );
 
             if (response.decision === 'deny' || response.decision === 'always_deny') {
diff --git a/assistant/src/tools/network/web-fetch.ts b/assistant/src/tools/network/web-fetch.ts
index 06de358799a..20248bd39d3 100644
--- a/assistant/src/tools/network/web-fetch.ts
+++ b/assistant/src/tools/network/web-fetch.ts
@@ -63,6 +63,7 @@ type WebFetchRequestExecutor = (
 type ExecuteWebFetchOptions = {
   resolveHostAddresses?: ResolveHostAddresses;
   requestExecutor?: WebFetchRequestExecutor;
+  signal?: AbortSignal;
 };
 
 type NodeHttpResponseLike = {
@@ -451,6 +452,17 @@ export async function executeWebFetch(
     controller.abort();
   }, timeoutSeconds * 1000);
 
+  // Forward external cancellation signal to our controller
+  const externalSignal = options?.signal;
+  const onExternalAbort = () => controller.abort();
+  if (externalSignal) {
+    if (externalSignal.aborted) {
+      controller.abort();
+    } else {
+      externalSignal.addEventListener('abort', onExternalAbort, { once: true });
+    }
+  }
+
   try {
     log.debug({ url: safeRequestedUrl, timeoutSeconds, maxChars, startIndex, rawMode }, 'Fetching webpage');
 
@@ -651,6 +663,9 @@ export async function executeWebFetch(
     };
   } catch (err) {
     if (err instanceof Error && err.name === 'AbortError') {
+      if (externalSignal?.aborted) {
+        return { content: 'Error: web fetch was cancelled', isError: true };
+      }
       return { content: `Error: web fetch timed out after ${timeoutSeconds}s`, isError: true };
     }
 
@@ -659,6 +674,7 @@ export async function executeWebFetch(
     return { content: `Error: Web fetch failed: ${msg}`, isError: true };
   } finally {
     clearTimeout(timeoutHandle);
+    externalSignal?.removeEventListener('abort', onExternalAbort);
   }
 }
 
@@ -705,8 +721,8 @@ class WebFetchTool implements Tool {
     };
   }
 
-  async execute(input: Record<string, unknown>, _context: ToolContext): Promise<ToolExecutionResult> {
-    return executeWebFetch(input);
+  async execute(input: Record<string, unknown>, context: ToolContext): Promise<ToolExecutionResult> {
+    return executeWebFetch(input, { signal: context.signal });
   }
 }
 
diff --git a/assistant/src/tools/network/web-search.ts b/assistant/src/tools/network/web-search.ts
index 076b9051696..a012580a895 100644
--- a/assistant/src/tools/network/web-search.ts
+++ b/assistant/src/tools/network/web-search.ts
@@ -115,6 +115,7 @@ async function executeBraveSearch(
   offset: number,
   freshness: string | undefined,
   apiKey: string,
+  signal?: AbortSignal,
 ): Promise<ToolExecutionResult> {
   const params = new URLSearchParams({
     q: query,
@@ -136,6 +137,7 @@ async function executeBraveSearch(
         'Accept-Encoding': 'gzip',
         'X-Subscription-Token': apiKey,
       },
+      signal,
     });
 
     if (response.ok) {
@@ -170,6 +172,7 @@ async function executeBraveSearch(
 async function executePerplexitySearch(
   query: string,
   apiKey: string,
+  signal?: AbortSignal,
 ): Promise<ToolExecutionResult> {
   for (let attempt = 0; attempt <= DEFAULT_MAX_RETRIES; attempt++) {
     const response = await fetch(PERPLEXITY_API_URL, {
@@ -184,6 +187,7 @@ async function executePerplexitySearch(
           { role: 'user', content: query },
         ],
       }),
+      signal,
     });
 
     if (response.ok) {
@@ -249,7 +253,7 @@ class WebSearchTool implements Tool {
     };
   }
 
-  async execute(input: Record<string, unknown>, _context: ToolContext): Promise<ToolExecutionResult> {
+  async execute(input: Record<string, unknown>, context: ToolContext): Promise<ToolExecutionResult> {
     const query = input.query;
     if (!query || typeof query !== 'string') {
       return { content: 'Error: query is required and must be a string', isError: true };
@@ -281,10 +285,10 @@ class WebSearchTool implements Tool {
         const count = typeof input.count === 'number' ? Math.min(20, Math.max(1, Math.round(input.count))) : 10;
         const offset = typeof input.offset === 'number' ? Math.min(9, Math.max(0, Math.round(input.offset))) : 0;
         const freshness = typeof input.freshness === 'string' ? input.freshness : undefined;
-        return await executeBraveSearch(query, count, offset, freshness, apiKey);
+        return await executeBraveSearch(query, count, offset, freshness, apiKey, context.signal);
       }
 
-      return await executePerplexitySearch(query, apiKey);
+      return await executePerplexitySearch(query, apiKey, context.signal);
     } catch (err) {
       const msg = err instanceof Error ? err.message : String(err);
       log.error({ err }, 'Web search failed');
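Review note: the web-fetch change above forwards an external signal into an internal timeout controller, while web-search passes the signal straight to `fetch`. A minimal sketch of the forwarding pattern follows. `linkSignals` and its return shape are hypothetical and not part of this PR.

```typescript
/** Tie a timeout and an optional caller-supplied signal to one AbortController. */
function linkSignals(timeoutMs: number, external?: AbortSignal) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  const onExternalAbort = () => controller.abort();

  if (external?.aborted) {
    // The caller cancelled before we started: abort immediately.
    controller.abort();
  } else {
    external?.addEventListener('abort', onExternalAbort, { once: true });
  }

  return {
    signal: controller.signal,
    // Lets error handling distinguish caller cancellation from a timeout.
    wasCancelled: () => external?.aborted === true,
    // Call from a finally block so neither the timer nor the listener leaks.
    dispose: () => {
      clearTimeout(timer);
      external?.removeEventListener('abort', onExternalAbort);
    },
  };
}

// Usage: cancelling the caller's controller aborts the linked signal.
const caller = new AbortController();
const linked = linkSignals(60_000, caller.signal);
caller.abort();
console.log(linked.signal.aborted, linked.wasCancelled());
linked.dispose();
```

On runtimes that provide them, `AbortSignal.any([...])` combined with `AbortSignal.timeout(ms)` can express the same linkage, though telling a timeout apart from a cancellation then requires inspecting the abort reason.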
diff --git a/assistant/src/tools/terminal/evaluate-typescript.ts b/assistant/src/tools/terminal/evaluate-typescript.ts
index ab669a7d19f..4363db8bcb1 100644
--- a/assistant/src/tools/terminal/evaluate-typescript.ts
+++ b/assistant/src/tools/terminal/evaluate-typescript.ts
@@ -169,7 +169,7 @@ export class EvaluateTypescriptTool implements Tool {
     timeoutSec: number,
     timeoutMs: number,
     maxOutputChars: number,
-    _context: ToolContext,
+    context: ToolContext,
   ): Promise<EvalResult> {
     return new Promise<EvalResult>((resolve) => {
       const startTime = Date.now();
@@ -197,11 +197,24 @@ export class EvaluateTypescriptTool implements Tool {
         child.kill('SIGKILL');
       }, timeoutMs);
 
+      // Kill the child process when the caller's AbortSignal fires
+      const onAbort = () => {
+        child.kill('SIGKILL');
+      };
+      if (context.signal) {
+        if (context.signal.aborted) {
+          child.kill('SIGKILL');
+        } else {
+          context.signal.addEventListener('abort', onAbort, { once: true });
+        }
+      }
+
       child.stdout.on('data', (data: Buffer) => stdoutChunks.push(data));
       child.stderr.on('data', (data: Buffer) => stderrChunks.push(data));
 
       child.on('close', (code) => {
         clearTimeout(timer);
+        context.signal?.removeEventListener('abort', onAbort);
         const durationMs = Date.now() - startTime;
 
         let stdout = Buffer.concat(stdoutChunks).toString();
@@ -251,6 +264,7 @@ export class EvaluateTypescriptTool implements Tool {
 
       child.on('error', (err) => {
         clearTimeout(timer);
+        context.signal?.removeEventListener('abort', onAbort);
         const durationMs = Date.now() - startTime;
         resolve({
           ok: false,
diff --git a/assistant/src/util/log-redact.ts b/assistant/src/util/log-redact.ts
new file mode 100644
index 00000000000..a0f9b186b00
--- /dev/null
+++ b/assistant/src/util/log-redact.ts
@@ -0,0 +1,189 @@
+/**
+ * Pino log serializers that scrub sensitive data (bearer tokens, API keys,
+ * authorization headers) from logged values.  Applied to every pino instance
+ * so secrets never reach log files even when errors bubble up opaque objects.
+ *
+ * API-key patterns are intentionally duplicated from security/secret-scanner.ts
+ * rather than imported — the scanner carries entropy analysis, encoded-secret
+ * detection, and other heavyweight logic that a hot-path serializer should not
+ * pull in.
+ */
+
+// ---------------------------------------------------------------------------
+// Sensitive-value patterns (subset of secret-scanner PATTERNS)
+// ---------------------------------------------------------------------------
+
+const BEARER_RE = /Bearer [A-Za-z0-9._\-]+/g;
+
+const API_KEY_PATTERNS: RegExp[] = [
+  // AWS
+  /AKIA[0-9A-Z]{16}/g,
+  // GitHub
+  /gh[pousr]_[A-Za-z0-9_]{36,255}/g,
+  /github_pat_[A-Za-z0-9_]{22,255}/g,
+  // GitLab
+  /glpat-[A-Za-z0-9\-_]{20,}/g,
+  // Stripe
+  /sk_live_[A-Za-z0-9]{24,}/g,
+  /rk_live_[A-Za-z0-9]{24,}/g,
+  // Slack
+  /xoxb-[0-9]{10,}-[0-9]{10,}-[A-Za-z0-9]{24,}/g,
+  /xoxp-[0-9]{10,}-[0-9]{10,}-[0-9]{10,}-[a-f0-9]{32}/g,
+  // Anthropic
+  /sk-ant-[A-Za-z0-9\-_]{80,}/g,
+  // OpenAI
+  /sk-[A-Za-z0-9]{20}T3BlbkFJ[A-Za-z0-9]{20}/g,
+  /sk-proj-[A-Za-z0-9\-_]{40,}/g,
+  // Google
+  /AIza[A-Za-z0-9\-_]{35}/g,
+  /GOCSPX-[A-Za-z0-9\-_]{28}/g,
+  // SendGrid
+  /SG\.[A-Za-z0-9\-_]{22}\.[A-Za-z0-9\-_]{43}/g,
+  // Telegram bot token
+  /[0-9]{8,10}:[A-Za-z0-9_-]{35}/g,
+  // npm
+  /npm_[A-Za-z0-9]{36}/g,
+];
+
+// Header names whose values should always be fully redacted
+const SENSITIVE_HEADERS = new Set([
+  'authorization',
+  'proxy-authorization',
+  'cookie',
+  'set-cookie',
+  'x-api-key',
+  'x-auth-token',
+]);
+
+// ---------------------------------------------------------------------------
+// String redaction
+// ---------------------------------------------------------------------------
+
+function redactString(value: string): string {
+  let result = value;
+
+  // Redact bearer tokens
+  result = result.replace(BEARER_RE, 'Bearer [REDACTED]');
+
+  // Redact API key patterns
+  for (const pattern of API_KEY_PATTERNS) {
+    pattern.lastIndex = 0;
+    result = result.replace(pattern, '[REDACTED]');
+  }
+
+  return result;
+}
+
+// ---------------------------------------------------------------------------
+// Deep value redaction: walks objects/arrays and returns a redacted copy
+// ---------------------------------------------------------------------------
+
+function redactValue(value: unknown, depth: number): unknown {
+  if (depth > 8) return value;
+
+  if (typeof value === 'string') {
+    return redactString(value);
+  }
+
+  if (Array.isArray(value)) {
+    return value.map((item) => redactValue(item, depth + 1));
+  }
+
+  if (value != null && typeof value === 'object') {
+    const result: Record<string, unknown> = {};
+    for (const [key, val] of Object.entries(value as Record<string, unknown>)) {
+      const lowerKey = key.toLowerCase();
+      // Fully redact sensitive header values
+      if (SENSITIVE_HEADERS.has(lowerKey)) {
+        result[key] = '[REDACTED]';
+      } else {
+        result[key] = redactValue(val, depth + 1);
+      }
+    }
+    return result;
+  }
+
+  return value;
+}
+
+// ---------------------------------------------------------------------------
+// Error serialization — extracts non-enumerable Error fields and cause chain
+// ---------------------------------------------------------------------------
+
+function serializeError(err: unknown, depth: number): unknown {
+  if (depth > 8 || err == null) return err;
+
+  if (!(err instanceof Error)) {
+    return err;
+  }
+
+  const serialized: Record<string, unknown> = {
+    name: err.name,
+    message: err.message,
+  };
+
+  // AssistantError and subclasses carry a structured ErrorCode
+  if ('code' in err && typeof (err as { code: unknown }).code === 'string') {
+    serialized.code = (err as { code: string }).code;
+  }
+
+  if (err.stack) {
+    serialized.stack = err.stack;
+  }
+
+  // Walk the cause chain recursively
+  if (err.cause !== undefined) {
+    serialized.cause = serializeError(err.cause, depth + 1);
+  }
+
+  // Preserve any additional enumerable properties (e.g. provider, statusCode, toolName)
+  for (const [key, val] of Object.entries(err)) {
+    if (!(key in serialized)) {
+      serialized[key] = val;
+    }
+  }
+
+  return serialized;
+}
+
+// ---------------------------------------------------------------------------
+// Pino serializers
+// ---------------------------------------------------------------------------
+
+/**
+ * Pino serializer for the `err` binding — extracts non-enumerable Error fields
+ * (name, message, stack), structured codes, and cause chains, then redacts
+ * secrets from the result.
+ */
+function errSerializer(err: unknown): unknown {
+  return redactValue(serializeError(err, 0), 0);
+}
+
+/**
+ * Pino serializer for `req` (HTTP request objects) — redacts authorization
+ * headers and sensitive values in the URL/body.
+ */
+function reqSerializer(req: unknown): unknown {
+  return redactValue(req, 0);
+}
+
+/**
+ * Pino serializer for `res` (HTTP response objects) — redacts sensitive
+ * header values that may appear in response logs.
+ */
+function resSerializer(res: unknown): unknown {
+  return redactValue(res, 0);
+}
+
+/**
+ * Pino serializers config object. Pass this as the pino options `serializers`
+ * field on every logger instance.
+ */
+export const logSerializers: Record<string, (value: unknown) => unknown> = {
+  err: errSerializer,
+  req: reqSerializer,
+  res: resSerializer,
+};
+
+// Exported for testing
+export { redactString as _redactString, redactValue as _redactValue };
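Review note: the redaction walk in `log-redact.ts` above returns a copy rather than mutating its input. The following is a reduced sketch of the same idea, assuming only the bearer-token pattern and three of the header names. `scrub` and `SENSITIVE_KEYS` are hypothetical names, not the module's exports.

```typescript
const BEARER = /Bearer [A-Za-z0-9._\-]+/g;
const SENSITIVE_KEYS = new Set(['authorization', 'cookie', 'x-api-key']);

/** Return a redacted deep copy of `value`; the input is left untouched. */
function scrub(value: unknown, depth = 0): unknown {
  // Same depth cap as the diff above; deeper values pass through unchanged.
  if (depth > 8) return value;
  if (typeof value === 'string') return value.replace(BEARER, 'Bearer [REDACTED]');
  if (Array.isArray(value)) return value.map((item) => scrub(item, depth + 1));
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, val] of Object.entries(value)) {
      // Header-like keys are replaced wholesale, whatever their value holds.
      out[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? '[REDACTED]' : scrub(val, depth + 1);
    }
    return out;
  }
  return value;
}

const input = {
  headers: { Authorization: 'Bearer abc.def' },
  message: 'sent Bearer abc.def upstream',
};
const output = scrub(input) as { headers: { Authorization: string }; message: string };
console.log(output);
```

One point for the PR author to consider: values nested deeper than the cap are returned unredacted, so the cap trades completeness for bounded work on the logging hot path.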
diff --git a/assistant/src/util/logger.ts b/assistant/src/util/logger.ts
index 34551e99900..67311e175e7 100644
--- a/assistant/src/util/logger.ts
+++ b/assistant/src/util/logger.ts
@@ -3,7 +3,9 @@ import { join } from 'node:path';
 import { Writable } from 'node:stream';
 import pino from 'pino';
 import pinoPretty from 'pino-pretty';
+import { logSerializers } from './log-redact.js';
 import { getLogPath } from './platform.js';
+import { getDebugMode, getLogStderr, getDebugStdoutLogs } from '../config/env-registry.js';
 
 export type LogFileConfig = {
   dir: string | undefined;
@@ -55,7 +57,7 @@ let activeLogFileConfig: LogFileConfig | null = null;
 
 function buildRotatingLogger(config: LogFileConfig): pino.Logger {
   if (!config.dir) {
-    return pino({ name: 'assistant' }, pinoPretty({ destination: 1 }));
+    return pino({ name: 'assistant', serializers: logSerializers }, pinoPretty({ destination: 1 }));
   }
 
   if (!existsSync(config.dir)) {
@@ -64,17 +66,17 @@ function buildRotatingLogger(config: LogFileConfig): pino.Logger {
 
   const today = formatDate(new Date());
   const filePath = logFilePathForDate(config.dir, new Date());
-  const fileStream = pino.destination({ dest: filePath, sync: false, mkdir: true });
+  const fileStream = pino.destination({ dest: filePath, sync: false, mkdir: true, mode: 0o600 });
 
   activeLogDate = today;
   activeLogFileConfig = config;
 
-  const level = process.env.VELLUM_DEBUG === '1' ? 'debug' : 'info';
+  const level = getDebugMode() ? 'debug' : 'info';
 
-  if (process.env.VELLUM_DEBUG === '1') {
+  if (getDebugMode()) {
     const prettyStream = pinoPretty({ destination: 2 });
     return pino(
-      { name: 'assistant', level },
+      { name: 'assistant', level, serializers: logSerializers },
       pino.multistream([
         { stream: fileStream, level: 'info' as const },
         { stream: prettyStream, level: 'debug' as const },
@@ -83,7 +85,7 @@ function buildRotatingLogger(config: LogFileConfig): pino.Logger {
   }
 
   return pino(
-    { name: 'assistant', level },
+    { name: 'assistant', level, serializers: logSerializers },
     pino.multistream([
       { stream: fileStream, level: 'info' as const },
       { stream: pinoPretty({ destination: 1 }), level: 'info' as const },
@@ -118,38 +120,38 @@ function getRootLogger(): pino.Logger {
     const forceStderr =
       process.env.BUN_TEST === '1'
       || process.env.NODE_ENV === 'test'
-      || process.env.VELLUM_LOG_STDERR === '1';
+      || getLogStderr();
     if (forceStderr) {
       rootLogger = pino(
-        { level: process.env.VELLUM_DEBUG === '1' ? 'debug' : 'info' },
+        { level: getDebugMode() ? 'debug' : 'info', serializers: logSerializers },
         pino.destination(2),
       );
       return rootLogger;
     }
 
     try {
-      const fileStream = pino.destination({ dest: getLogPath(), sync: false, mkdir: true });
+      const fileStream = pino.destination({ dest: getLogPath(), sync: false, mkdir: true, mode: 0o600 });
 
-      if (process.env.VELLUM_DEBUG === '1') {
+      if (getDebugMode()) {
         const prettyStream = pinoPretty({ destination: 2 });
         const multi = pino.multistream([
           { stream: fileStream, level: 'info' as const },
           { stream: prettyStream, level: 'debug' as const },
         ]);
-        rootLogger = pino({ level: 'debug' }, multi);
-      } else if (process.env.DEBUG_STDOUT_LOGS === '1') {
+        rootLogger = pino({ level: 'debug', serializers: logSerializers }, multi);
+      } else if (getDebugStdoutLogs()) {
         rootLogger = pino(
-          { level: 'info' },
+          { level: 'info', serializers: logSerializers },
           pino.multistream([
             { stream: fileStream, level: 'info' as const },
             { stream: pinoPretty({ destination: 1 }), level: 'info' as const },
           ]),
         );
       } else {
-        rootLogger = pino({ level: 'info' }, fileStream);
+        rootLogger = pino({ level: 'info', serializers: logSerializers }, fileStream);
       }
     } catch {
-      rootLogger = pino({ level: process.env.VELLUM_DEBUG === '1' ? 'debug' : 'info' }, pinoPretty({ destination: 2 }));
+      rootLogger = pino({ level: getDebugMode() ? 'debug' : 'info', serializers: logSerializers }, pinoPretty({ destination: 2 }));
     }
   }
   return rootLogger;
@@ -157,7 +159,7 @@ function getRootLogger(): pino.Logger {
 
 /** Returns true when VELLUM_DEBUG=1 is set. */
 export function isDebug(): boolean {
-  return process.env.VELLUM_DEBUG === '1';
+  return getDebugMode();
 }
 
 /**
@@ -225,7 +227,7 @@ export function getCliLogger(name: string): pino.Logger {
     get(_target, prop, receiver) {
       if (!logger) {
         logger = pino(
-          { name, level: 'trace' },
+          { name, level: 'trace', serializers: logSerializers },
           pino.multistream([
             { stream: cliDestination(1, 49), level: 'trace' as const },
             { stream: cliDestination(2), level: 'error' as const },
diff --git a/assistant/src/util/object.ts b/assistant/src/util/object.ts
new file mode 100644
index 00000000000..7d8af715f5e
--- /dev/null
+++ b/assistant/src/util/object.ts
@@ -0,0 +1,3 @@
+export function isPlainObject(value: unknown): value is Record<string, unknown> {
+  return value != null && typeof value === 'object' && !Array.isArray(value);
+}
diff --git a/assistant/src/util/platform.ts b/assistant/src/util/platform.ts
index 2a7c14fbf45..3c0b637f30d 100644
--- a/assistant/src/util/platform.ts
+++ b/assistant/src/util/platform.ts
@@ -1,18 +1,14 @@
-import { mkdirSync, existsSync, statSync, unlinkSync, renameSync, readFileSync, writeFileSync, readdirSync, chmodSync } from 'node:fs';
-import { join, dirname } from 'node:path';
+import { mkdirSync, existsSync, statSync, unlinkSync, readFileSync, writeFileSync, chmodSync } from 'node:fs';
+import { join } from 'node:path';
 import { homedir } from 'node:os';
-/**
- * Stderr-only logger for migration code. Using the pino logger during
- * migration is unsafe because pino initialization calls ensureDataDir(),
- * which pre-creates workspace destination directories and causes migration
- * moves to no-op.
- */
-function migrationLog(level: 'info' | 'warn' | 'debug', msg: string, data?: Record<string, unknown>): void {
-  if (level === 'debug') return; // suppress debug-level migration noise
-  const prefix = level === 'warn' ? 'WARN' : 'INFO';
-  const extra = data ? ' ' + JSON.stringify(data) : '';
-  process.stderr.write(`[migration] ${prefix}: ${msg}${extra}\n`);
-}
+import {
+  getBaseDataDir,
+  getDaemonSocket,
+  getDaemonTcpPort,
+  getDaemonTcpEnabled,
+  getDaemonTcpHost,
+  getDaemonIosPairing,
+} from '../config/env-registry.js';
 
 export function isMacOS(): boolean {
   return process.platform === 'darwin';
@@ -52,7 +48,7 @@ export function getClipboardCommand(): string | null {
  * Returns null if neither file exists or both are malformed.
  */
 export function readLockfile(): Record<string, unknown> | null {
-  const base = process.env.BASE_DATA_DIR?.trim() || homedir();
+  const base = getBaseDataDir() || homedir();
   const candidates = [
     join(base, '.vellum.lock.json'),
     join(base, '.vellum.lockfile.json'),
@@ -104,7 +100,7 @@ export function normalizeAssistantId(assistantId: string): string {
  * Respects BASE_DATA_DIR for non-standard home directories.
  */
 export function writeLockfile(data: Record<string, unknown>): void {
-  const base = process.env.BASE_DATA_DIR?.trim() || homedir();
+  const base = getBaseDataDir() || homedir();
   writeFileSync(join(base, '.vellum.lock.json'), JSON.stringify(data, null, 2) + '\n');
 }
 
@@ -113,7 +109,7 @@ export function writeLockfile(data: Record<string, unknown>): void {
  * files, skills) and runtime files (socket, PID) live here.
  */
 export function getRootDir(): string {
-  return join(process.env.BASE_DATA_DIR?.trim() || homedir(), '.vellum');
+  return join(getBaseDataDir() || homedir(), '.vellum');
 }
 
 /**
@@ -154,7 +150,7 @@ export function getInterfacesDir(): string {
 }
 
 export function getSocketPath(): string {
-  const override = process.env.VELLUM_DAEMON_SOCKET?.trim();
+  const override = getDaemonSocket();
   if (override) {
     return expandHomePath(override);
   }
@@ -170,12 +166,7 @@ export function getSessionTokenPath(): string {
  * Reads VELLUM_DAEMON_TCP_PORT env var; defaults to 8765.
  */
 export function getTCPPort(): number {
-  const override = process.env.VELLUM_DAEMON_TCP_PORT?.trim();
-  if (override) {
-    const port = parseInt(override, 10);
-    if (!isNaN(port) && port > 0 && port <= 65535) return port;
-  }
-  return 8765;
+  return getDaemonTcpPort();
 }
 
 /**
@@ -190,9 +181,8 @@ export function getTCPPort(): number {
  * The macOS CLI (AssistantCli) also sets the env var for bundled-binary deployments.
  */
 export function isTCPEnabled(): boolean {
-  const override = process.env.VELLUM_DAEMON_TCP_ENABLED?.trim();
-  if (override === 'true' || override === '1') return true;
-  if (override === 'false' || override === '0') return false;
+  const envValue = getDaemonTcpEnabled();
+  if (envValue !== undefined) return envValue;
   return existsSync(join(getRootDir(), 'tcp-enabled'));
 }
 
@@ -204,7 +194,7 @@ export function isTCPEnabled(): boolean {
  *   3. Default: '127.0.0.1' (localhost only)
  */
 export function getTCPHost(): string {
-  const override = process.env.VELLUM_DAEMON_TCP_HOST?.trim();
+  const override = getDaemonTcpHost();
   if (override) return override;
   if (isIOSPairingEnabled()) return '0.0.0.0';
   return '127.0.0.1';
@@ -225,9 +215,8 @@ export function getTCPHost(): string {
  * access without exposing the daemon to the LAN.
  */
 export function isIOSPairingEnabled(): boolean {
-  const override = process.env.VELLUM_DAEMON_IOS_PAIRING?.trim();
-  if (override === 'true' || override === '1') return true;
-  if (override === 'false' || override === '0') return false;
+  const envValue = getDaemonIosPairing();
+  if (envValue !== undefined) return envValue;
   return existsSync(join(getRootDir(), 'ios-pairing-enabled'));
 }
 
@@ -235,6 +224,27 @@ export function getHttpTokenPath(): string {
   return join(getRootDir(), 'http-token');
 }
 
+/**
+ * Returns the path to the platform API token file (~/.vellum/platform-token).
+ * This token is the X-Session-Token used to authenticate with the Vellum
+ * Platform API (e.g. assistant.vellum.ai).
+ */
+export function getPlatformTokenPath(): string {
+  return join(getRootDir(), 'platform-token');
+}
+
+/**
+ * Read the platform API token from disk. Returns null if the file
+ * doesn't exist or can't be read.
+ */
+export function readPlatformToken(): string | null {
+  try {
+    return readFileSync(getPlatformTokenPath(), 'utf-8').trim();
+  } catch {
+    return null;
+  }
+}
+
 /**
  * Read the daemon session token from disk. Returns null if the file
  * doesn't exist or can't be read (daemon not running).
@@ -330,276 +340,9 @@ export function getWorkspacePromptPath(file: string): string {
   return join(getWorkspaceDir(), file);
 }
 
-/**
- * Idempotent move: relocates source to destination for migration.
- * - No-op if source is missing (already migrated or never existed).
- * - No-op if destination already exists (avoids clobbering).
- * - Creates destination parent directories as needed.
- * - Logs warning on failure instead of throwing.
- *
- * Exported for testing; not intended for general use outside migrations.
- */
-export function migratePath(source: string, destination: string): void {
-  if (!existsSync(source)) return;
-  if (existsSync(destination)) {
-    migrationLog('debug', 'Migration skipped: destination already exists', { source, destination });
-    return;
-  }
-  try {
-    const destDir = dirname(destination);
-    if (!existsSync(destDir)) {
-      mkdirSync(destDir, { recursive: true });
-    }
-    renameSync(source, destination);
-    migrationLog('info', 'Migrated path', { from: source, to: destination });
-  } catch (err) {
-    migrationLog('warn', 'Failed to migrate path', { err: String(err), from: source, to: destination });
-  }
-}
-
-/**
- * When migratePath skips config.json because the workspace copy already
- * exists, the legacy root config may still contain keys (e.g. slackWebhookUrl)
- * that were never written to the workspace config. This merges any missing
- * top-level keys from the legacy file into the workspace file so they are
- * not silently lost during upgrade.
- */
-function isPlainObject(v: unknown): v is Record<string, unknown> {
-  return v != null && typeof v === 'object' && !Array.isArray(v);
-}
-
-function mergeSkippedConfigKeys(legacyPath: string, workspacePath: string): void {
-  if (!existsSync(legacyPath) || !existsSync(workspacePath)) return;
-
-  let legacy: Record<string, unknown>;
-  let workspace: Record<string, unknown>;
-  try {
-    const legacyRaw = JSON.parse(readFileSync(legacyPath, 'utf-8'));
-    const workspaceRaw = JSON.parse(readFileSync(workspacePath, 'utf-8'));
-    if (!isPlainObject(legacyRaw) || !isPlainObject(workspaceRaw)) return;
-    legacy = legacyRaw;
-    workspace = workspaceRaw;
-  } catch {
-    return; // malformed JSON — skip silently
-  }
-
-  const merged: string[] = [];
-  for (const key of Object.keys(legacy)) {
-    if (!(key in workspace)) {
-      workspace[key] = legacy[key];
-      merged.push(key);
-    }
-  }
-
-  if (merged.length > 0) {
-    try {
-      writeFileSync(workspacePath, JSON.stringify(workspace, null, 2) + '\n');
-      // Remove merged keys from legacy config so they are not resurrected
-      // if a user later deletes them from the workspace config.
-      for (const key of merged) {
-        delete legacy[key];
-      }
-      if (Object.keys(legacy).length === 0) {
-        unlinkSync(legacyPath);
-      } else {
-        writeFileSync(legacyPath, JSON.stringify(legacy, null, 2) + '\n');
-      }
-      migrationLog('info', 'Merged legacy config keys into workspace config', { keys: merged });
-    } catch (err) {
-      migrationLog('warn', 'Failed to merge legacy config keys', { err: String(err), keys: merged });
-    }
-  }
-}
-
-/**
- * When migratePath skips the hooks directory because the workspace copy
- * already exists (e.g. pre-created by ensureDataDir), the legacy hooks
- * directory may still contain individual hook files/subdirectories that
- * were never moved. This merges any missing entries from the legacy
- * path into the workspace hooks path so they are not silently lost.
- */
-function mergeLegacyHooks(legacyDir: string, workspaceDir: string): void {
-  if (!existsSync(legacyDir) || !existsSync(workspaceDir)) return;
-
-  let entries: import('node:fs').Dirent[];
-  try {
-    entries = readdirSync(legacyDir, { withFileTypes: true });
-  } catch {
-    return;
-  }
-
-  for (const entry of entries) {
-    const src = join(legacyDir, entry.name);
-    const dest = join(workspaceDir, entry.name);
-    if (existsSync(dest)) {
-      // config.json needs a merge rather than a skip — the legacy file may
-      // contain hook enabled/settings entries that the workspace copy lacks.
-      if (entry.name === 'config.json') {
-        mergeHooksConfig(src, dest);
-      }
-      continue;
-    }
-    try {
-      renameSync(src, dest);
-      migrationLog('info', 'Merged legacy hook into workspace', { from: src, to: dest });
-    } catch (err) {
-      migrationLog('warn', 'Failed to merge legacy hook', { err: String(err), from: src, to: dest });
-    }
-  }
-}
-
-/**
- * Merge missing hook entries from a legacy hooks/config.json into the
- * workspace hooks/config.json. Only adds hooks that don't already exist
- * in the workspace config so user changes are never overwritten.
- */
-function mergeHooksConfig(legacyPath: string, workspacePath: string): void {
-  let legacy: Record<string, unknown>;
-  let workspace: Record<string, unknown>;
-  try {
-    const legacyRaw = JSON.parse(readFileSync(legacyPath, 'utf-8'));
-    const workspaceRaw = JSON.parse(readFileSync(workspacePath, 'utf-8'));
-    if (!isPlainObject(legacyRaw) || !isPlainObject(workspaceRaw)) return;
-    legacy = legacyRaw;
-    workspace = workspaceRaw;
-  } catch {
-    return;
-  }
-
-  const legacyHooks = legacy.hooks;
-  const wsHooks = workspace.hooks;
-  if (!isPlainObject(legacyHooks) || !isPlainObject(wsHooks)) return;
-
-  const merged: string[] = [];
-  for (const hookName of Object.keys(legacyHooks)) {
-    if (!(hookName in wsHooks)) {
-      wsHooks[hookName] = legacyHooks[hookName];
-      merged.push(hookName);
-    }
-  }
-
-  if (merged.length > 0) {
-    try {
-      writeFileSync(workspacePath, JSON.stringify(workspace, null, 2) + '\n');
-      // Remove merged hooks from legacy config to prevent resurrection
-      for (const hookName of merged) {
-        delete legacyHooks[hookName];
-      }
-      if (Object.keys(legacyHooks).length === 0) {
-        unlinkSync(legacyPath);
-      } else {
-        writeFileSync(legacyPath, JSON.stringify(legacy, null, 2) + '\n');
-      }
-      migrationLog('info', 'Merged legacy hooks config entries into workspace', { hooks: merged });
-    } catch (err) {
-      migrationLog('warn', 'Failed to merge legacy hooks config', { err: String(err), hooks: merged });
-    }
-  }
-}
-
-/**
- * When migratePath skips the skills directory because the workspace copy
- * already exists (e.g. pre-created by ensureDataDir), the legacy skills
- * directory may still contain individual skill subdirectories that were
- * never moved. This merges any missing skill subdirectories from the
- * legacy path into the workspace skills path so they are not stranded.
- */
-function mergeLegacySkills(legacyDir: string, workspaceDir: string): void {
-  if (!existsSync(legacyDir) || !existsSync(workspaceDir)) return;
-
-  let entries: import('node:fs').Dirent[];
-  try {
-    entries = readdirSync(legacyDir, { withFileTypes: true });
-  } catch {
-    return;
-  }
-
-  for (const entry of entries) {
-    const src = join(legacyDir, entry.name);
-    const dest = join(workspaceDir, entry.name);
-    if (existsSync(dest)) continue; // already present in workspace
-    try {
-      renameSync(src, dest);
-      migrationLog('info', 'Merged legacy skill into workspace', { from: src, to: dest });
-    } catch (err) {
-      migrationLog('warn', 'Failed to merge legacy skill', { err: String(err), from: src, to: dest });
-    }
-  }
-}
-
-/**
- * When migratePath skips the data directory because workspace/data already
- * exists (e.g. the user's project had a data/ folder that was extracted from
- * sandbox/fs), the legacy data directory may still contain internal state
- * subdirectories (db/, logs/, sandbox/, etc.) that need to be preserved.
- * This merges any missing entries from the legacy data path into workspace/data.
- */
-function mergeLegacyDataEntries(legacyDir: string, workspaceDir: string): void {
-  if (!existsSync(legacyDir) || !existsSync(workspaceDir)) return;
-
-  let entries: import('node:fs').Dirent[];
-  try {
-    entries = readdirSync(legacyDir, { withFileTypes: true });
-  } catch {
-    return;
-  }
-
-  for (const entry of entries) {
-    const src = join(legacyDir, entry.name);
-    const dest = join(workspaceDir, entry.name);
-    if (existsSync(dest)) continue; // already present in workspace
-    try {
-      renameSync(src, dest);
-      migrationLog('info', 'Merged legacy data entry into workspace', { from: src, to: dest });
-    } catch (err) {
-      migrationLog('warn', 'Failed to merge legacy data entry', { err: String(err), from: src, to: dest });
-    }
-  }
-}
-
-/**
- * Migrate from the flat ~/.vellum layout to the workspace-based layout.
- *
- * Step (a) is special: if the workspace dir doesn't exist yet but the old
- * sandbox working dir (data/sandbox/fs) does, its contents are "extracted"
- * to become the new workspace root via rename. All subsequent moves then
- * land inside that workspace directory.
- *
- * Idempotent: safe to call on every startup — already-migrated items are
- * skipped, and a second run is a no-op.
- */
-export function migrateToWorkspaceLayout(): void {
-  const root = getRootDir();
-  if (!existsSync(root)) return;
-
-  const ws = getWorkspaceDir();
-
-  // (a) Extract data/sandbox/fs -> workspace (only when workspace doesn't exist yet)
-  if (!existsSync(ws)) {
-    const sandboxFs = join(root, 'data', 'sandbox', 'fs');
-    if (existsSync(sandboxFs)) {
-      try {
-        renameSync(sandboxFs, ws);
-        migrationLog('info', 'Extracted sandbox/fs as workspace root', { from: sandboxFs, to: ws });
-      } catch (err) {
-        migrationLog('warn', 'Failed to extract sandbox/fs', { err: String(err), from: sandboxFs, to: ws });
-      }
-    }
-  }
-
-  // (b)-(h) Move legacy root-level items into workspace
-  migratePath(join(root, 'config.json'), join(ws, 'config.json'));
-  mergeSkippedConfigKeys(join(root, 'config.json'), join(ws, 'config.json'));
-  migratePath(join(root, 'data'), join(ws, 'data'));
-  mergeLegacyDataEntries(join(root, 'data'), join(ws, 'data'));
-  migratePath(join(root, 'hooks'), join(ws, 'hooks'));
-  mergeLegacyHooks(join(root, 'hooks'), join(ws, 'hooks'));
-  migratePath(join(root, 'IDENTITY.md'), join(ws, 'IDENTITY.md'));
-  migratePath(join(root, 'skills'), join(ws, 'skills'));
-  mergeLegacySkills(join(root, 'skills'), join(ws, 'skills'));
-  migratePath(join(root, 'SOUL.md'), join(ws, 'SOUL.md'));
-  migratePath(join(root, 'USER.md'), join(ws, 'USER.md'));
-}
+// Re-export migration functions so existing consumers don't break.
+export { migratePath, migrateToWorkspaceLayout } from '../migrations/workspace-layout.js';
+export { migrateToDataLayout } from '../migrations/data-layout.js';
 
 export function ensureDataDir(): void {
   const root = getRootDir();
@@ -636,67 +379,3 @@ export function ensureDataDir(): void {
     // Non-fatal: some filesystems don't support Unix permissions
   }
 }
-
-/**
- * Migrate files from the old flat ~/.vellum layout to the new structured
- * layout with data/ and protected/ subdirectories.
- *
- * Idempotent: skips items that have already been migrated.
- * Uses renameSync for atomic moves (same filesystem).
- */
-export function migrateToDataLayout(): void {
-  const root = getRootDir();
-  const data = join(root, 'data');
-
-  if (!existsSync(root)) return;
-
-  function migrateItem(oldPath: string, newPath: string): void {
-    if (!existsSync(oldPath)) return;
-    if (existsSync(newPath)) return;
-    try {
-      const newDir = dirname(newPath);
-      if (!existsSync(newDir)) {
-        mkdirSync(newDir, { recursive: true });
-      }
-      renameSync(oldPath, newPath);
-      migrationLog('info', 'Migrated path', { from: oldPath, to: newPath });
-    } catch (err) {
-      migrationLog('warn', 'Failed to migrate path', { err: String(err), from: oldPath, to: newPath });
-    }
-  }
-
-  // DB: ~/.vellum/data/assistant.db → ~/.vellum/data/db/assistant.db
-  migrateItem(join(data, 'assistant.db'), join(data, 'db', 'assistant.db'));
-  migrateItem(join(data, 'assistant.db-wal'), join(data, 'db', 'assistant.db-wal'));
-  migrateItem(join(data, 'assistant.db-shm'), join(data, 'db', 'assistant.db-shm'));
-
-  // Qdrant PID: ~/.vellum/qdrant.pid → ~/.vellum/data/qdrant/qdrant.pid
-  migrateItem(join(root, 'qdrant.pid'), join(data, 'qdrant', 'qdrant.pid'));
-
-  // Qdrant binary: ~/.vellum/bin/ → ~/.vellum/data/qdrant/bin/
-  migrateItem(join(root, 'bin'), join(data, 'qdrant', 'bin'));
-
-  // Logs: ~/.vellum/logs/ → ~/.vellum/data/logs/
-  migrateItem(join(root, 'logs'), join(data, 'logs'));
-
-  // Memory: ~/.vellum/memory/ → ~/.vellum/data/memory/
-  migrateItem(join(root, 'memory'), join(data, 'memory'));
-
-  // Apps: ~/.vellum/apps/ → ~/.vellum/data/apps/
-  migrateItem(join(root, 'apps'), join(data, 'apps'));
-
-  // Browser auth: ~/.vellum/browser-auth/ → ~/.vellum/data/browser-auth/
-  migrateItem(join(root, 'browser-auth'), join(data, 'browser-auth'));
-
-  // Browser profile: ~/.vellum/browser-profile/ → ~/.vellum/data/browser-profile/
-  migrateItem(join(root, 'browser-profile'), join(data, 'browser-profile'));
-
-  // History: ~/.vellum/history → ~/.vellum/data/history
-  migrateItem(join(root, 'history'), join(data, 'history'));
-
-  // Protected files: ~/.vellum/X → ~/.vellum/protected/X
-  const protectedDir = join(root, 'protected');
-  migrateItem(join(root, 'trust.json'), join(protectedDir, 'trust.json'));
-  migrateItem(join(root, 'keys.enc'), join(protectedDir, 'keys.enc'));
-  migrateItem(join(root, 'secret-allowlist.json'), join(protectedDir, 'secret-allowlist.json'));
-}
diff --git a/cli/package.json b/cli/package.json
index ed8ea4e0df8..624b54ab12e 100644
--- a/cli/package.json
+++ b/cli/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@vellumai/cli",
-  "version": "0.3.5",
+  "version": "0.3.6",
   "description": "CLI tools for vellum-assistant",
   "type": "module",
   "exports": {
diff --git a/clients/Package.swift b/clients/Package.swift
index fee3920e456..4555ff03284 100644
--- a/clients/Package.swift
+++ b/clients/Package.swift
@@ -1,7 +1,7 @@
 // swift-tools-version: 5.9
 import PackageDescription
 
-let appVersion = "0.3.5"
+let appVersion = "0.3.6"
 
 let package = Package(
     name: "vellum-assistant",
@@ -26,7 +26,8 @@ let package = Package(
     ],
     dependencies: [
         .package(url: "https://github.com/sparkle-project/Sparkle", from: "2.0.0"),
-        .package(url: "https://github.com/Picovoice/porcupine", from: "3.0.0"),
+        // Porcupine removed — iOS-only SPM package breaks on Xcode 26.2.
+        // TODO: Re-add via C SDK wrapper (lib/mac/) for macOS wake word support.
     ],
     targets: [
         .target(
@@ -49,7 +50,7 @@ let package = Package(
             dependencies: [
                 "VellumAssistantShared",
                 "Sparkle",
-                .product(name: "Porcupine", package: "porcupine"),
+                // Porcupine dep removed — stub engine doesn't need it
             ],
             path: "macos/vellum-assistant",
             exclude: ["Resources/Info.plist", "Resources/bg.png"],
diff --git a/clients/chrome-extension/background/worker.ts b/clients/chrome-extension/background/worker.ts
index 8fa3eeeb70d..25c701a7855 100644
--- a/clients/chrome-extension/background/worker.ts
+++ b/clients/chrome-extension/background/worker.ts
@@ -167,16 +167,66 @@ async function handleEvaluate(cmd: ExtensionCommand): Promise<DispatchResult> {
 
   const code = cmd.code;
 
-  // Inject into MAIN world so the script has access to the page's fetch,
-  // document, window — same as CDP Runtime.evaluate.
-  const results = await chrome.scripting.executeScript({
-    target: { tabId: cmd.tabId },
-    world: 'MAIN',
-    func: (src: string) => (0, eval)(src), // indirect eval runs in global scope
-    args: [code],
-  });
+  // Use chrome.debugger API (CDP Runtime.evaluate) for ALL evaluations.
+  //
+  // Why not chrome.scripting.executeScript?
+  //   1. MAIN world + eval/new Function is blocked by CSP on Instagram, Facebook,
+  //      TikTok, and many other sites.
+  //   2. MAIN world results don't serialize back through executeScript reliably
+  //      (returns null even when the script succeeds).
+  //   3. ISOLATED world can't access page JS globals or cookie-authenticated fetch().
+  //
+  // The debugger API operates at the browser engine level, bypassing ALL CSP
+  // restrictions while having full access to the page context (DOM, fetch,
+  // cookies, JS globals). It's the only approach that works universally.
+  //
+  // Trade-off: Chrome shows a yellow "debugging this tab" infobar while
+  // attached. We minimize this by attaching and detaching for each command.
+
+  // Attach debugger to the tab
+  try {
+    await chrome.debugger.attach({ tabId: cmd.tabId }, '1.3');
+  } catch (e) {
+    // Already attached is fine; match case-insensitively in case the wording varies
+    if (!(e instanceof Error && /already attached/i.test(e.message))) {
+      throw new Error(`Could not attach debugger: ${e instanceof Error ? e.message : String(e)}`);
+    }
+  }
 
-  return { result: results[0]?.result, tabId: cmd.tabId };
+  try {
+    // Wrap code in an IIFE so "return" statements work, unless it is already a
+    // self-executing function expression (double-wrapping drops the return value).
+    // The body sits on its own lines so a trailing "//" comment can't swallow "})()".
+    const trimmed = code.trim();
+    const isIIFE = /^\((?:async\s+)?function\s*\(/.test(trimmed) && trimmed.endsWith(')');
+    const wrapped = isIIFE ? trimmed : `(function(){\n${code}\n})()`;
+    const evalResult = await chrome.debugger.sendCommand(
+      { tabId: cmd.tabId },
+      'Runtime.evaluate',
+      {
+        expression: wrapped,
+        returnByValue: true,
+        awaitPromise: true,
+        userGesture: true,
+      },
+    ) as { result?: { value?: unknown; type?: string; description?: string }; exceptionDetails?: { text?: string; exception?: { description?: string } } };
+
+    if (evalResult.exceptionDetails) {
+      const errMsg = evalResult.exceptionDetails.exception?.description
+        ?? evalResult.exceptionDetails.text
+        ?? 'Unknown eval error';
+      throw new Error(errMsg);
+    }
+
+    return { result: evalResult.result?.value ?? null, tabId: cmd.tabId };
+  } finally {
+    // Detach debugger — minimizes the yellow infobar visibility
+    try {
+      await chrome.debugger.detach({ tabId: cmd.tabId });
+    } catch {
+      // Best effort
+    }
+  }
 }
 
 async function handleNavigate(cmd: ExtensionCommand): Promise<DispatchResult> {
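Note on the wrapping step in handleEvaluate: what follows is a minimal TypeScript sketch, not the extension's actual code. `wrapForEvaluate` is a hypothetical helper name introduced only for illustration. It reuses the IIFE check from the hunk and, as a small variation, places the body on its own lines so that a trailing `//` comment cannot comment out the closing `})()`.

```typescript
// Hypothetical helper (not present in worker.ts): isolates the wrapping
// decision so it can be exercised without a browser or the debugger API.
function wrapForEvaluate(code: string): string {
  const trimmed = code.trim();
  // Already a self-executing function expression: pass it through unchanged,
  // because wrapping it again would discard its return value.
  const isIIFE =
    /^\((?:async\s+)?function\s*\(/.test(trimmed) && trimmed.endsWith(')');
  if (isIIFE) return trimmed;
  // Otherwise wrap in a plain function so a top-level "return" is legal.
  // The newlines keep a trailing line comment from hiding the closing "})()".
  return `(function(){\n${code}\n})()`;
}

// Example inputs, assumed for illustration only.
const wrappedSnippet = wrapForEvaluate('return document.title // page title');
const passedThrough = wrapForEvaluate('(async function(){ return 1; })()');
console.log(wrappedSnippet);
console.log(passedThrough);
```

Under this check, arrow-function IIFEs such as `(() => { ... })()` are not recognized and would be wrapped again, so callers likely need an explicit `return` or the `function` form to get a value back.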
diff --git a/clients/chrome-extension/manifest.json b/clients/chrome-extension/manifest.json
index 81315396338..dd555c25676 100644
--- a/clients/chrome-extension/manifest.json
+++ b/clients/chrome-extension/manifest.json
@@ -9,7 +9,8 @@
     "activeTab",
     "scripting",
     "cookies",
-    "storage"
+    "storage",
+    "debugger"
   ],
   "host_permissions": [
     "<all_urls>"
diff --git a/clients/chrome-extension/popup/popup.ts b/clients/chrome-extension/popup/popup.ts
index d79cb767a46..30042b0cd7a 100644
--- a/clients/chrome-extension/popup/popup.ts
+++ b/clients/chrome-extension/popup/popup.ts
@@ -47,6 +47,8 @@ btnConnect.addEventListener('click', async () => {
     if (!isNaN(portNum) && portNum > 0 && portNum <= 65535) {
       storageUpdate.relayPort = portNum;
     }
+  } else {
+    await chrome.storage.local.remove('relayPort');
   }
   await chrome.storage.local.set(storageUpdate);
 
diff --git a/clients/ios/App/AppDelegate.swift b/clients/ios/App/AppDelegate.swift
index 6f3eee3ba10..8a28f10da60 100644
--- a/clients/ios/App/AppDelegate.swift
+++ b/clients/ios/App/AppDelegate.swift
@@ -90,6 +90,11 @@ class AppDelegate: NSObject, UIApplicationDelegate {
         // the new approval flow. Runs once; the flag persists across future launches.
         migrateToPairingV4IfNeeded()
 
+        // Separate migration: clear stale override values when the old toggle was OFF.
+        // This runs independently of the v4 migration so users who already completed
+        // v4 migration still get the override cleanup.
+        migratePairingOverridesIfNeeded()
+
         // Initial connect is handled by SceneDelegate.sceneWillEnterForeground, which fires
         // during launch and on every background→foreground transition. Calling connect() here
         // too would race with the scene's connect() since isConnected is false while in-flight.
@@ -179,7 +184,6 @@ class AppDelegate: NSObject, UIApplicationDelegate {
 
         // Clear dev pairing keys
         defaults.removeObject(forKey: "devLocalPairingEnabled")
-        defaults.removeObject(forKey: "iosPairingUseOverride")
 
         // Mark migration as done
         defaults.set(true, forKey: Self.pairingV4MigrationKey)
@@ -190,6 +194,36 @@ class AppDelegate: NSObject, UIApplicationDelegate {
         log.info("v4 pairing migration complete — legacy pairing state cleared")
     }
 
+    // MARK: - Pairing Override Migration
+
+    /// Key that tracks whether the pairing override migration has run.
+    private static let pairingOverrideMigrationKey = "pairing_override_migration_done"
+
+    /// Clears stale gateway/token override values when the old toggle was OFF.
+    /// Runs once; the flag persists across future launches.
+    ///
+    /// This is separate from the v4 migration so users who already completed
+    /// v4 migration still get the override cleanup.
+    private func migratePairingOverridesIfNeeded() {
+        let defaults = UserDefaults.standard
+        guard !defaults.bool(forKey: Self.pairingOverrideMigrationKey) else { return }
+
+        // Only clean up when the legacy toggle key is actually present.
+        // After M9 the toggle is no longer persisted, so absence means
+        // the user may have intentionally set overrides post-M9.
+        if defaults.object(forKey: "iosPairingUseOverride") != nil {
+            let overrideWasEnabled = defaults.bool(forKey: "iosPairingUseOverride")
+            if !overrideWasEnabled {
+                defaults.removeObject(forKey: PairingConfiguration.gatewayOverrideKey)
+                defaults.removeObject(forKey: PairingConfiguration.tokenOverrideKey)
+            }
+            // Clean up the legacy toggle key itself — no longer used.
+            defaults.removeObject(forKey: "iosPairingUseOverride")
+        }
+
+        defaults.set(true, forKey: Self.pairingOverrideMigrationKey)
+    }
+
     func application(
         _ application: UIApplication,
         configurationForConnecting connectingSceneSession: UISceneSession,
diff --git a/clients/ios/Views/Settings/QRPairingSheet.swift b/clients/ios/Views/Settings/QRPairingSheet.swift
index 5933b85e405..dbec4230128 100644
--- a/clients/ios/Views/Settings/QRPairingSheet.swift
+++ b/clients/ios/Views/Settings/QRPairingSheet.swift
@@ -19,6 +19,7 @@ struct QRPairingSheet: View {
     @State private var scannedPayload: DaemonQRPayloadV4?
     @State private var errorMessage: String?
     @State private var pollTimer: Timer?
+    @State private var pairingTask: Task<Void, Never>?
 
     enum PairingPhase {
         case scanning
@@ -52,6 +53,8 @@ struct QRPairingSheet: View {
             .toolbar {
                 ToolbarItem(placement: .cancellationAction) {
                     Button("Cancel") {
+                        pairingTask?.cancel()
+                        pairingTask = nil
                         stopPolling()
                         dismiss()
                     }
@@ -59,6 +62,8 @@ struct QRPairingSheet: View {
             }
         }
         .onDisappear {
+            pairingTask?.cancel()
+            pairingTask = nil
             stopPolling()
         }
     }
@@ -282,7 +287,7 @@ struct QRPairingSheet: View {
             return
         }
 
-        Task {
+        pairingTask = Task {
             // Try LAN first if available
             if let lanUrl = payload.localLanUrl,
                isAllowedLocalHttp(urlString: lanUrl, payload: payload) {
@@ -292,10 +297,13 @@ struct QRPairingSheet: View {
                     timeoutSeconds: 3
                 )
                 if let result = result {
-                    handlePairingResponse(result, payload: payload)
+                    handlePairingResponse(result, payload: payload, effectiveBaseURL: lanUrl)
                     return
                 }
-                // LAN failed, fall through to cloud gateway
+                // LAN failed — but if we were cancelled while waiting, stop here
+                // instead of falling through to the gateway path.
+                guard !Task.isCancelled else { return }
+                // Fall through to cloud gateway
             }
 
             // Cloud gateway
@@ -305,8 +313,8 @@ struct QRPairingSheet: View {
                 timeoutSeconds: 15
             )
             if let result = result {
-                handlePairingResponse(result, payload: payload)
-            } else {
+                handlePairingResponse(result, payload: payload, effectiveBaseURL: payload.gatewayURL)
+            } else if !Task.isCancelled {
                 await MainActor.run {
                     errorMessage = "Could not reach your Mac. Make sure the Vellum daemon is running."
                     phase = .error
@@ -337,7 +345,7 @@ struct QRPairingSheet: View {
         }
     }
 
-    private func handlePairingResponse(_ response: [String: Any], payload: DaemonQRPayloadV4) {
+    private func handlePairingResponse(_ response: [String: Any], payload: DaemonQRPayloadV4, effectiveBaseURL: String) {
         guard let status = response["status"] as? String else {
             errorMessage = "Unexpected response from Mac."
             phase = .error
@@ -363,7 +371,7 @@ struct QRPairingSheet: View {
 
         case "pending":
             phase = .waitingForApproval
-            startPolling(payload: payload)
+            startPolling(payload: payload, effectiveBaseURL: effectiveBaseURL)
 
         case "denied":
             errorMessage = "Pairing was denied on your Mac."
@@ -381,20 +389,12 @@ struct QRPairingSheet: View {
 
     // MARK: - Polling
 
-    private func startPolling(payload: DaemonQRPayloadV4) {
+    private func startPolling(payload: DaemonQRPayloadV4, effectiveBaseURL: String) {
         stopPolling()
 
-        // Determine which URL to poll
-        let baseURL: String
-        if let lanUrl = payload.localLanUrl, isAllowedLocalHttp(urlString: lanUrl, payload: payload) {
-            baseURL = lanUrl
-        } else {
-            baseURL = payload.gatewayURL
-        }
-
         pollTimer = Timer.scheduledTimer(withTimeInterval: 2.5, repeats: true) { _ in
             Task {
-                await pollPairingStatus(baseURL: baseURL, payload: payload)
+                await pollPairingStatus(baseURL: effectiveBaseURL, payload: payload)
             }
         }
     }
diff --git a/clients/macos/build.sh b/clients/macos/build.sh
index c85f3a1ca7a..924da51014c 100755
--- a/clients/macos/build.sh
+++ b/clients/macos/build.sh
@@ -40,6 +40,7 @@ APP_DIR="$SCRIPT_DIR/dist/$BUNDLE_DISPLAY_NAME.app"
 CONTENTS="$APP_DIR/Contents"
 MACOS_DIR="$CONTENTS/MacOS"
 RESOURCES_DIR="$CONTENTS/Resources"
+FRAMEWORKS_DIR="$CONTENTS/Frameworks"
 
 # Version (overridable via env for CI, defaults to Package.swift)
 if [ -z "${DISPLAY_VERSION:-}" ]; then
@@ -270,8 +271,19 @@ if [ -f "$SCRIPT_DIR/gateway-bin/vellum-gateway" ]; then
     fi
 fi
 
+# Also rebuild if Porcupine dylib changed or newly added
+PORCUPINE_CHECKOUT="$SCRIPT_DIR/../.build/checkouts/porcupine"
+if [ -d "$PORCUPINE_CHECKOUT/lib/mac" ]; then
+    ARCH=$(uname -m)
+    PORCUPINE_DYLIB_SRC="$PORCUPINE_CHECKOUT/lib/mac/$ARCH/libpv_porcupine.dylib"
+    if [ -f "$PORCUPINE_DYLIB_SRC" ]; then
+        if [ ! -f "$FRAMEWORKS_DIR/libpv_porcupine.dylib" ] || [ "$PORCUPINE_DYLIB_SRC" -nt "$FRAMEWORKS_DIR/libpv_porcupine.dylib" ]; then
+            NEEDS_REBUILD=true
+        fi
+    fi
+fi
+
 # Ensure .app bundle structure exists
-FRAMEWORKS_DIR="$CONTENTS/Frameworks"
 mkdir -p "$MACOS_DIR" "$RESOURCES_DIR" "$FRAMEWORKS_DIR"
 
 if [ "$NEEDS_REBUILD" = true ]; then
@@ -337,6 +349,56 @@ else
     echo "WARNING: Sparkle.framework not found at $SPARKLE_FW"
 fi
 
+# Bundle Porcupine dylib, model, and keyword files (if checkout exists)
+PORCUPINE_CHECKOUT="$SCRIPT_DIR/../.build/checkouts/porcupine"
+if [ -d "$PORCUPINE_CHECKOUT/lib/mac" ]; then
+    if [ "$CONFIG" = "release" ]; then
+        # Universal binary for release builds
+        PORCUPINE_ARM64="$PORCUPINE_CHECKOUT/lib/mac/arm64/libpv_porcupine.dylib"
+        PORCUPINE_X86_64="$PORCUPINE_CHECKOUT/lib/mac/x86_64/libpv_porcupine.dylib"
+        if [ -f "$PORCUPINE_ARM64" ] && [ -f "$PORCUPINE_X86_64" ]; then
+            if [ ! -f "$FRAMEWORKS_DIR/libpv_porcupine.dylib" ] || \
+               [ "$PORCUPINE_ARM64" -nt "$FRAMEWORKS_DIR/libpv_porcupine.dylib" ] || \
+               [ "$PORCUPINE_X86_64" -nt "$FRAMEWORKS_DIR/libpv_porcupine.dylib" ]; then
+                echo "Bundling Porcupine dylib (universal)..."
+                lipo -create "$PORCUPINE_ARM64" "$PORCUPINE_X86_64" -output "$FRAMEWORKS_DIR/libpv_porcupine.dylib"
+            fi
+        else
+            echo "WARNING: Porcupine dylibs not found for universal binary — skipping"
+        fi
+    else
+        # Arch-specific for debug builds
+        ARCH=$(uname -m)
+        PORCUPINE_DYLIB_SRC="$PORCUPINE_CHECKOUT/lib/mac/$ARCH/libpv_porcupine.dylib"
+        if [ -f "$PORCUPINE_DYLIB_SRC" ]; then
+            if [ ! -f "$FRAMEWORKS_DIR/libpv_porcupine.dylib" ] || [ "$PORCUPINE_DYLIB_SRC" -nt "$FRAMEWORKS_DIR/libpv_porcupine.dylib" ]; then
+                echo "Bundling Porcupine dylib..."
+                cp "$PORCUPINE_DYLIB_SRC" "$FRAMEWORKS_DIR/libpv_porcupine.dylib"
+            fi
+        else
+            echo "WARNING: Porcupine dylib not found at $PORCUPINE_DYLIB_SRC — skipping"
+        fi
+    fi
+else
+    echo "WARNING: Porcupine checkout not found at $PORCUPINE_CHECKOUT — skipping dylib bundling"
+fi
+
+# Bundle Porcupine model file
+if [ -f "$PORCUPINE_CHECKOUT/lib/common/porcupine_params.pv" ]; then
+    if [ ! -f "$RESOURCES_DIR/porcupine_params.pv" ] || [ "$PORCUPINE_CHECKOUT/lib/common/porcupine_params.pv" -nt "$RESOURCES_DIR/porcupine_params.pv" ]; then
+        echo "Bundling Porcupine model file..."
+        cp "$PORCUPINE_CHECKOUT/lib/common/porcupine_params.pv" "$RESOURCES_DIR/porcupine_params.pv"
+    fi
+fi
+
+# Bundle Porcupine keyword files
+if [ -d "$PORCUPINE_CHECKOUT/resources/keyword_files/mac" ]; then
+    mkdir -p "$RESOURCES_DIR/porcupine-keywords"
+    for ppn in "$PORCUPINE_CHECKOUT/resources/keyword_files/mac/"*.ppn; do
+        [ -f "$ppn" ] && cp "$ppn" "$RESOURCES_DIR/porcupine-keywords/"
+    done
+fi
+
 # Always refresh bundled skills in app bundle (skill assets change independently of binaries)
 if [ -d "$SCRIPT_DIR/daemon-bin/bundled-skills" ]; then
     rm -rf "$RESOURCES_DIR/bundled-skills"
@@ -530,6 +592,16 @@ if [ -d "$FRAMEWORKS_DIR/Sparkle.framework" ]; then
     echo "Sparkle.framework signed (including nested binaries)"
 fi
 
+# Sign Porcupine dylib
+if [ -f "$FRAMEWORKS_DIR/libpv_porcupine.dylib" ]; then
+    PV_SIGN_FLAGS=(--force --sign "$SIGN_IDENTITY")
+    if [ "$CONFIG" = "release" ] && [ "$SIGN_IDENTITY" != "-" ]; then
+        PV_SIGN_FLAGS+=(--timestamp --options runtime)
+    fi
+    codesign "${PV_SIGN_FLAGS[@]}" "$FRAMEWORKS_DIR/libpv_porcupine.dylib"
+    echo "Porcupine dylib signed"
+fi
+
 # Sign CLI binary
 if [ -f "$MACOS_DIR/vellum-cli" ]; then
     CLI_SIGN_FLAGS=(--force --sign "$SIGN_IDENTITY")
diff --git a/clients/macos/vellum-assistant/App/AppDelegate+MenuBar.swift b/clients/macos/vellum-assistant/App/AppDelegate+MenuBar.swift
index c4dd5a87776..e93a75c641c 100644
--- a/clients/macos/vellum-assistant/App/AppDelegate+MenuBar.swift
+++ b/clients/macos/vellum-assistant/App/AppDelegate+MenuBar.swift
@@ -16,6 +16,14 @@ extension AppDelegate {
             button.action = #selector(statusBarButtonClicked(_:))
             button.target = self
         }
+
+        // Start observing daemon connection state immediately so the icon
+        // reflects disconnected/connected before the main window opens.
+        connectionStatusCancellable = daemonClient.$isConnected
+            .receive(on: RunLoop.main)
+            .sink { [weak self] _ in
+                self?.updateMenuBarIcon()
+            }
     }
 
     func setupFileMenu() {
@@ -80,6 +88,7 @@ extension AppDelegate {
 
         let status = currentAssistantStatus
         let dotColor = status.statusColor
+        let dotAlpha = status.shouldPulse ? pulsePhase : 1.0
 
         let composited = NSImage(size: NSSize(width: iconSize, height: iconSize))
         composited.lockFocus()
@@ -94,14 +103,38 @@ extension AppDelegate {
         let dotRect = NSRect(x: dotX, y: dotY, width: dotSize, height: dotSize)
         NSColor.black.withAlphaComponent(0.5).setFill()
         NSBezierPath(ovalIn: dotRect.insetBy(dx: -0.5, dy: -0.5)).fill()
-        dotColor.setFill()
+        dotColor.withAlphaComponent(dotAlpha).setFill()
         NSBezierPath(ovalIn: dotRect).fill()
         composited.unlockFocus()
         composited.isTemplate = false
         button.image = composited
+
+        managePulseTimer(for: status)
+    }
+
+    /// Starts or stops the pulse timer based on the current status.
+    private func managePulseTimer(for status: AssistantStatus) {
+        if status.shouldPulse {
+            guard pulseTimer == nil else { return }
+            pulsePhase = 1.0
+            pulseTimer = Timer.scheduledTimer(withTimeInterval: 0.05, repeats: true) { [weak self] _ in
+                Task { @MainActor in
+                    guard let self, self.statusItem != nil, let button = self.statusItem.button else { return }
+                    // Sawtooth: fade from 1.0 down to 0.3 over ~0.7s (about 14 frames at 50ms), then snap back to 1.0
+                    self.pulsePhase -= 0.05
+                    if self.pulsePhase <= 0.3 { self.pulsePhase = 1.0 }
+                    self.configureMenuBarIcon(button)
+                }
+            }
+        } else {
+            pulseTimer?.invalidate()
+            pulseTimer = nil
+            pulsePhase = 1.0
+        }
     }
 
     var currentAssistantStatus: AssistantStatus {
+        if !daemonClient.isConnected { return .disconnected }
         guard let viewModel = mainWindow?.threadManager.activeViewModel else { return .idle }
         if let error = viewModel.errorText { return .error(error) }
         if viewModel.isThinking { return .thinking }
diff --git a/clients/macos/vellum-assistant/App/AppDelegate+Notifications.swift b/clients/macos/vellum-assistant/App/AppDelegate+Notifications.swift
index f3e0e7b2bc2..4405cc3441e 100644
--- a/clients/macos/vellum-assistant/App/AppDelegate+Notifications.swift
+++ b/clients/macos/vellum-assistant/App/AppDelegate+Notifications.swift
@@ -72,19 +72,19 @@ extension AppDelegate {
             options: []
         )
 
-        let viewQuickChatAction = UNNotificationAction(
-            identifier: "VIEW_QUICK_CHAT",
-            title: "View in Chat",
-            options: .foreground
+        let viewGuardianAction = UNNotificationAction(
+            identifier: "VIEW_GUARDIAN",
+            title: "View",
+            options: [.foreground]
         )
-        let quickChatCategory = UNNotificationCategory(
-            identifier: "QUICK_CHAT_RESPONSE",
-            actions: [viewQuickChatAction],
+        let guardianRequestCategory = UNNotificationCategory(
+            identifier: "GUARDIAN_REQUEST",
+            actions: [viewGuardianAction],
             intentIdentifiers: [],
             options: []
         )
 
-        center.setNotificationCategories([activityCategory, toolConfirmationCategory, rideShotgunCategory, voiceResponseCategory, quickChatCategory])
+        center.setNotificationCategories([activityCategory, toolConfirmationCategory, rideShotgunCategory, voiceResponseCategory, guardianRequestCategory])
     }
 
     func registerBundledFonts() {
@@ -160,12 +160,12 @@ extension AppDelegate: UNUserNotificationCenterDelegate {
             return
         }
 
-        // Handle quick chat response notifications — open the thread in the main window
-        if categoryId == "QUICK_CHAT_RESPONSE" {
+        // Handle guardian request notifications — open the guardian thread in the main window
+        if categoryId == "GUARDIAN_REQUEST" {
             let conversationId = response.notification.request.content.userInfo["conversationId"] as? String
             await MainActor.run {
                 guard !self.isAwaitingFirstLaunchReady else { return }
-                self.openQuickChatThread(conversationId: conversationId)
+                self.openConversationThread(conversationId: conversationId)
             }
             return
         }
diff --git a/clients/macos/vellum-assistant/App/AppDelegate+Sessions.swift b/clients/macos/vellum-assistant/App/AppDelegate+Sessions.swift
index 6cb6f416762..0f58c9ae047 100644
--- a/clients/macos/vellum-assistant/App/AppDelegate+Sessions.swift
+++ b/clients/macos/vellum-assistant/App/AppDelegate+Sessions.swift
@@ -240,171 +240,30 @@ extension AppDelegate {
         }
     }
 
-    // MARK: - Background Session (Quick Chat)
-
-    /// Starts a background session that sends a message to the daemon without
-    /// showing any UI. When the assistant responds, a macOS notification is
-    /// delivered. Multiple background sessions can run concurrently.
-    func startBackgroundSession(task: String, source: String) {
-        let sessionTask = task.trimmingCharacters(in: .whitespacesAndNewlines)
-        guard !sessionTask.isEmpty else { return }
-
-        Task { @MainActor in
-            // Ensure daemon connection
-            if !daemonClient.isConnected {
-                log.info("Daemon not connected, attempting to connect before background session")
-                do {
-                    try await daemonClient.connect()
-                    self.setupAmbientAgent()
-                } catch {
-                    log.error("Failed to connect to daemon for background session: \(error.localizedDescription)")
-                    self.deliverQuickChatErrorNotification(task: sessionTask)
-                    return
-                }
-            }
-
-            // Subscribe to daemon stream before sending task_submit
-            let messageStream = self.daemonClient.subscribe()
-
-            // Send task_submit
-            let screenBounds = CGDisplayBounds(CGMainDisplayID())
-            do {
-                try self.daemonClient.send(TaskSubmitMessage(
-                    task: sessionTask,
-                    screenWidth: Int(screenBounds.width),
-                    screenHeight: Int(screenBounds.height),
-                    attachments: nil,
-                    source: source
-                ))
-            } catch {
-                log.error("Failed to send background task submit: \(error)")
-                return
-            }
-
-            // Wait for task_routed, then listen for the response.
-            // Known limitation: IPCTaskRouted has no correlation/requestId field, so if two
-            // background sessions are started nearly simultaneously, one could capture the
-            // other's sessionId. The same limitation exists in startSession(). Adding a
-            // requestId to the IPC contract would fix this but requires a coordinated
-            // TypeScript + Swift change.
-            var routedMessage: TaskRoutedMessage?
-            for await message in messageStream {
-                if case .taskRouted(let routed) = message {
-                    routedMessage = routed
-                    break
-                }
-                if case .error(let err) = message {
-                    log.error("Background task routing failed: \(err.message, privacy: .private)")
-                    break
-                }
-            }
-
-            guard let routed = routedMessage else {
-                log.error("Background session: no routed message received")
-                return
-            }
-
-            let sessionId = routed.sessionId
-
-            // Create a thread in ThreadManager so the user can find it later
-            if let threadManager = self.mainWindow?.threadManager {
-                threadManager.createTaskRunThread(
-                    conversationId: sessionId,
-                    workItemId: "",
-                    title: String(sessionTask.prefix(50))
-                )
-            }
-
-            // Listen for the assistant response in the background
-            var accumulatedText = ""
-            for await message in messageStream {
-                switch message {
-                case .assistantTextDelta(let delta) where delta.sessionId == sessionId || delta.sessionId == nil:
-                    accumulatedText += delta.text
-
-                case .messageComplete(let complete) where complete.sessionId == sessionId:
-                    let responseText = accumulatedText.isEmpty ? "(No response)" : accumulatedText
-                    self.deliverQuickChatNotification(
-                        responseText: responseText,
-                        conversationId: sessionId
-                    )
-                    return
-
-                case .generationHandoff(let handoff) where handoff.sessionId == sessionId:
-                    let responseText = accumulatedText.isEmpty ? "(No response)" : accumulatedText
-                    self.deliverQuickChatNotification(
-                        responseText: responseText,
-                        conversationId: sessionId
-                    )
-                    return
-
-                case .cuError(let error) where error.sessionId == sessionId:
-                    self.deliverQuickChatNotification(
-                        responseText: "Error: \(error.message)",
-                        conversationId: sessionId
-                    )
-                    return
-
-                case .sessionError(let error) where error.sessionId == sessionId:
-                    self.deliverQuickChatNotification(
-                        responseText: "Error: \(error.userMessage)",
-                        conversationId: sessionId
-                    )
-                    return
-
-                default:
-                    break
-                }
-            }
-        }
-    }
-
-    private func deliverQuickChatNotification(responseText: String, conversationId: String) {
+    func deliverGuardianRequestNotification(title: String, questionText: String, conversationId: String) {
         let content = UNMutableNotificationContent()
-        content.title = "Quick Chat"
-        // Truncate long responses for the notification body
-        if responseText.count > 200 {
-            content.body = String(responseText.prefix(200)) + "..."
-        } else {
-            content.body = responseText
-        }
+        content.title = title
+        content.body = String(questionText.prefix(200))
         content.sound = .default
-        content.categoryIdentifier = "QUICK_CHAT_RESPONSE"
+        content.categoryIdentifier = "GUARDIAN_REQUEST"
         content.userInfo = ["conversationId": conversationId]
 
         let request = UNNotificationRequest(
-            identifier: "quick-chat-\(conversationId)",
-            content: content,
-            trigger: nil
-        )
-        UNUserNotificationCenter.current().add(request) { error in
-            if let error {
-                log.error("Failed to post quick chat notification: \(error.localizedDescription)")
-            }
-        }
-    }
-
-    private func deliverQuickChatErrorNotification(task: String) {
-        let content = UNMutableNotificationContent()
-        content.title = "Quick Chat"
-        content.body = "Could not connect to the assistant. Please try again."
-        content.sound = .default
-
-        let request = UNNotificationRequest(
-            identifier: "quick-chat-error-\(UUID().uuidString)",
+            identifier: "guardian-request-\(conversationId)",
             content: content,
             trigger: nil
         )
         UNUserNotificationCenter.current().add(request) { error in
             if let error {
-                log.error("Failed to post quick chat error notification: \(error.localizedDescription)")
+                log.error("Failed to post guardian request notification: \(error.localizedDescription)")
             }
         }
     }
 
     /// Opens the main window and navigates to the thread for the given conversation ID.
     /// Retries if the thread isn't populated yet (e.g., ThreadManager hasn't loaded it).
-    func openQuickChatThread(conversationId: String?) {
+    /// Used by Guardian Request and other notification deep links.
+    func openConversationThread(conversationId: String?) {
         showMainWindow()
         guard let conversationId else { return }
 
diff --git a/clients/macos/vellum-assistant/App/AppDelegate.swift b/clients/macos/vellum-assistant/App/AppDelegate.swift
index 936c1c9326a..fc7cda858d3 100644
--- a/clients/macos/vellum-assistant/App/AppDelegate.swift
+++ b/clients/macos/vellum-assistant/App/AppDelegate.swift
@@ -1,4 +1,5 @@
 import AppKit
+import Carbon
 import VellumAssistantShared
 import Combine
 import CoreText
@@ -12,12 +13,14 @@ enum AssistantStatus {
     case idle
     case thinking
     case error(String)
+    case disconnected
 
     var menuTitle: String {
         switch self {
         case .idle: return "Assistant is idle"
         case .thinking: return "Assistant is thinking..."
         case .error(let msg): return "Error: \(msg)"
+        case .disconnected: return "Disconnected from daemon"
         }
     }
 
@@ -26,6 +29,7 @@ enum AssistantStatus {
         case .idle: return .systemGray
         case .thinking: return .systemGreen
         case .error: return .systemRed
+        case .disconnected: return .systemOrange
         }
     }
 
@@ -38,6 +42,12 @@ enum AssistantStatus {
         image.unlockFocus()
         return image
     }
+
+    /// Whether the status dot should pulse (animate its opacity). Only `.thinking` pulses.
+    var shouldPulse: Bool {
+        if case .thinking = self { return true }
+        return false
+    }
 }
 
 enum InteractionType {
@@ -45,6 +55,35 @@ enum InteractionType {
     case textQA
 }
 
+/// Carbon event handler for the Quick Input hotkey (Cmd+/).
+/// Must be a free function because Carbon callbacks are C function pointers.
+private func quickInputHotKeyHandler(
+    _: EventHandlerCallRef?,
+    event: EventRef?,
+    _: UnsafeMutableRawPointer?
+) -> OSStatus {
+    guard let event else { return OSStatus(eventNotHandledErr) }
+
+    var hotKeyID = EventHotKeyID()
+    let status = GetEventParameter(
+        event,
+        EventParamName(kEventParamDirectObject),
+        EventParamType(typeEventHotKeyID),
+        nil,
+        MemoryLayout<EventHotKeyID>.size,
+        nil,
+        &hotKeyID
+    )
+    guard status == noErr, hotKeyID.id == 1 else { return OSStatus(eventNotHandledErr) }
+
+    Task { @MainActor in
+        guard let appDelegate = AppDelegate.shared,
+              !appDelegate.isAwaitingFirstLaunchReady else { return }
+        appDelegate.toggleQuickInput()
+    }
+    return noErr
+}
+
 @MainActor
 public final class AppDelegate: NSObject, NSApplicationDelegate {
     /// Shared reference — `NSApp.delegate as? AppDelegate` fails under
@@ -54,11 +93,9 @@ public final class AppDelegate: NSObject, NSApplicationDelegate {
 
     var statusItem: NSStatusItem!
     private var hotKeyMonitor: Any?
+    private var lastRegisteredGlobalHotkey: String?
+    private var globalHotkeyObserver: AnyCancellable?
     private var escapeMonitor: Any?
-    private var quickChatPanel: QuickChatPanel?
-    private var quickChatMonitor: Any?
-    private var quickChatLocalMonitor: Any?
-    private var lastRegisteredShortcut: String?
     var overlayWindow: SessionOverlayWindow?
     var currentSession: ComputerUseSession?
     var currentTextSession: TextSession?
@@ -69,6 +106,9 @@ public final class AppDelegate: NSObject, NSApplicationDelegate {
     private var wakeWordCoordinator: WakeWordCoordinator?
     private var voiceTranscriptionWindow: VoiceTranscriptionWindow?
     var thinkingWindow: ThinkingIndicatorWindow?
+    private var quickInputWindow: QuickInputWindow?
+    private var quickInputHotKeyRef: EventHotKeyRef?
+    private var quickInputEventHandlerRef: EventHandlerRef?
     public let services = AppServices()
     private let assistantCli = AssistantCli()
     public let updateManager = UpdateManager()
@@ -100,9 +140,11 @@ public final class AppDelegate: NSObject, NSApplicationDelegate {
     var galleryWindow: ComponentGalleryWindow?
     #endif
     private var windowObserver: Any?
-    private var quickChatShortcutObserver: AnyCancellable?
     private weak var recordingViewModel: ChatViewModel?
-    private var statusIconCancellable: AnyCancellable?
+    var statusIconCancellable: AnyCancellable?
+    var connectionStatusCancellable: AnyCancellable?
+    var pulseTimer: Timer?
+    var pulsePhase: CGFloat = 1.0
     var cachedSkills: [SkillInfo] = []
     var refreshSkillsTask: Task<Void, Never>?
     var cachedApps: [AppItem] = []
@@ -131,6 +173,12 @@ public final class AppDelegate: NSObject, NSApplicationDelegate {
         // renders EmptyView — we handle settings in the main window panel).
         UserDefaults.standard.removeObject(forKey: "NSWindow Frame com_apple_SwiftUI_Settings_window")
 
+        // Migration: clear stale pairing override values when the toggle was OFF.
+        // M9 removed the isOverrideEnabled gate, so any non-empty override now
+        // applies unconditionally. Without this cleanup, users who had override
+        // values typed in but the toggle OFF would see them silently activate.
+        migratePairingOverridesIfNeeded()
+
         if let envPath = FeatureFlagManager.findRepoEnvFile() {
             FeatureFlagManager.shared.loadFromFile(at: envPath)
         }
@@ -198,7 +246,6 @@ public final class AppDelegate: NSObject, NSApplicationDelegate {
         setupFileMenu()
         setupViewMenu()
         setupHotKey()
-        setupQuickChatHotKey()
         setupEscapeMonitor()
         setupVoiceInput()
         setupAmbientAgent()
@@ -343,23 +390,16 @@ public final class AppDelegate: NSObject, NSApplicationDelegate {
                 NSEvent.removeMonitor(hotKeyMonitor)
                 self.hotKeyMonitor = nil
             }
+            self.tearDownQuickInputMonitors()
+            quickInputWindow?.dismiss()
+            quickInputWindow = nil
+            lastRegisteredGlobalHotkey = nil
+            globalHotkeyObserver?.cancel()
+            globalHotkeyObserver = nil
             if let escapeMonitor {
                 NSEvent.removeMonitor(escapeMonitor)
                 self.escapeMonitor = nil
             }
-            if let quickChatMonitor {
-                NSEvent.removeMonitor(quickChatMonitor)
-                self.quickChatMonitor = nil
-            }
-            if let quickChatLocalMonitor {
-                NSEvent.removeMonitor(quickChatLocalMonitor)
-                self.quickChatLocalMonitor = nil
-            }
-            lastRegisteredShortcut = nil
-            quickChatShortcutObserver?.cancel()
-            quickChatShortcutObserver = nil
-            quickChatPanel?.dismiss()
-            quickChatPanel = nil
             voiceInput?.stop()
             voiceInput = nil
             wakeWordCoordinator = nil
@@ -371,6 +411,10 @@ public final class AppDelegate: NSObject, NSApplicationDelegate {
             }
             statusIconCancellable?.cancel()
             statusIconCancellable = nil
+            connectionStatusCancellable?.cancel()
+            connectionStatusCancellable = nil
+            pulseTimer?.invalidate()
+            pulseTimer = nil
 
             if let item = statusItem {
                 NSStatusBar.system.removeStatusItem(item)
@@ -456,23 +500,16 @@ public final class AppDelegate: NSObject, NSApplicationDelegate {
             NSEvent.removeMonitor(hotKeyMonitor)
             self.hotKeyMonitor = nil
         }
+        tearDownQuickInputMonitors()
+        quickInputWindow?.dismiss()
+        quickInputWindow = nil
+        lastRegisteredGlobalHotkey = nil
+        globalHotkeyObserver?.cancel()
+        globalHotkeyObserver = nil
         if let escapeMonitor {
             NSEvent.removeMonitor(escapeMonitor)
             self.escapeMonitor = nil
         }
-        if let quickChatMonitor {
-            NSEvent.removeMonitor(quickChatMonitor)
-            self.quickChatMonitor = nil
-        }
-        if let quickChatLocalMonitor {
-            NSEvent.removeMonitor(quickChatLocalMonitor)
-            self.quickChatLocalMonitor = nil
-        }
-        lastRegisteredShortcut = nil
-        quickChatShortcutObserver?.cancel()
-        quickChatShortcutObserver = nil
-        quickChatPanel?.dismiss()
-        quickChatPanel = nil
         voiceInput?.stop()
         voiceInput = nil
         wakeWordCoordinator = nil
@@ -484,6 +521,10 @@ public final class AppDelegate: NSObject, NSApplicationDelegate {
         }
         statusIconCancellable?.cancel()
         statusIconCancellable = nil
+        connectionStatusCancellable?.cancel()
+        connectionStatusCancellable = nil
+        pulseTimer?.invalidate()
+        pulseTimer = nil
 
         if let item = statusItem {
             NSStatusBar.system.removeStatusItem(item)
@@ -589,6 +630,37 @@ public final class AppDelegate: NSObject, NSApplicationDelegate {
         log.info("Configured HTTP transport for remote assistant \(assistant.assistantId) at \(runtimeUrl, privacy: .public)")
     }
 
+    // MARK: - Pairing Override Migration
+
+    /// Key that tracks whether the pairing override migration has run.
+    private static let pairingOverrideMigrationKey = "pairing_override_migration_done"
+
+    /// Clears stale gateway/token override values when the old toggle was OFF.
+    /// Runs once; the flag persists across future launches.
+    ///
+    /// Only acts when the legacy `iosPairingUseOverride` key is actually present.
+    /// After M9 the toggle is no longer persisted, so absence means the user
+    /// may have intentionally set overrides post-M9 — skip cleanup to preserve them.
+    private func migratePairingOverridesIfNeeded() {
+        let defaults = UserDefaults.standard
+        guard !defaults.bool(forKey: Self.pairingOverrideMigrationKey) else { return }
+
+        // Only clean up when the legacy toggle key is actually present.
+        // After M9 the toggle is no longer persisted, so absence means
+        // the user may have intentionally set overrides post-M9.
+        if defaults.object(forKey: "iosPairingUseOverride") != nil {
+            let overrideWasEnabled = defaults.bool(forKey: "iosPairingUseOverride")
+            if !overrideWasEnabled {
+                defaults.removeObject(forKey: PairingConfiguration.gatewayOverrideKey)
+                defaults.removeObject(forKey: PairingConfiguration.tokenOverrideKey)
+            }
+            // Clean up the legacy toggle key itself — no longer used.
+            defaults.removeObject(forKey: "iosPairingUseOverride")
+        }
+
+        defaults.set(true, forKey: Self.pairingOverrideMigrationKey)
+    }
+
     func setupDaemonClient() {
         guard !hasSetupDaemon else { return }
         hasSetupDaemon = true
@@ -700,10 +772,20 @@ public final class AppDelegate: NSObject, NSApplicationDelegate {
                 callSessionId: msg.callSessionId,
                 title: msg.title
             )
-            if let thread = self.mainWindow?.threadManager.threads.first(where: { $0.sessionId == msg.conversationId }) {
-                self.mainWindow?.threadManager.activeThreadId = thread.id
+            if NSApp.isActive {
+                // App is in foreground — select thread and show window immediately
+                if let thread = self.mainWindow?.threadManager.threads.first(where: { $0.sessionId == msg.conversationId }) {
+                    self.mainWindow?.threadManager.activeThreadId = thread.id
+                }
+                self.showMainWindow()
+            } else {
+                // App is backgrounded — post native notification
+                self.deliverGuardianRequestNotification(
+                    title: msg.title,
+                    questionText: msg.questionText,
+                    conversationId: msg.conversationId
+                )
             }
-            self.showMainWindow()
         }
 
         // Handle escalation: text_qa -> computer_use via computer_use_request_control
@@ -1060,90 +1142,110 @@ public final class AppDelegate: NSObject, NSApplicationDelegate {
     // MARK: - Hotkey
 
     private func setupHotKey() {
-        // Use NSEvent global monitor instead of Carbon RegisterEventHotKey (HotKey package).
-        // Carbon hotkeys consume the event globally, preventing other apps from seeing the
-        // keystroke. NSEvent.addGlobalMonitorForEvents observes without consuming, so Cmd+Shift+G
-        // still reaches the frontmost app.
-        hotKeyMonitor = NSEvent.addGlobalMonitorForEvents(matching: .keyDown) { [weak self] event in
-            guard event.modifierFlags.intersection(.deviceIndependentFlagsMask) == [.command, .shift],
-                  event.charactersIgnoringModifiers?.lowercased() == "g" else { return }
-            Task { @MainActor in
-                // Don't create the main window while the first-launch
-                // readiness gate is in progress.
-                guard self?.isAwaitingFirstLaunchReady != true else { return }
-                self?.showMainWindow()
-            }
-        }
-    }
+        registerGlobalHotkeyMonitor()
+        registerQuickInputMonitor()
 
-    private func setupQuickChatHotKey() {
-        registerQuickChatMonitor()
-
-        // Re-register the global hotkey whenever the user changes the shortcut
-        quickChatShortcutObserver = NotificationCenter.default
+        globalHotkeyObserver = NotificationCenter.default
             .publisher(for: UserDefaults.didChangeNotification)
             .receive(on: RunLoop.main)
             .sink { [weak self] _ in
-                self?.registerQuickChatMonitor()
+                self?.registerGlobalHotkeyMonitor()
             }
     }
 
-    /// Tears down the existing Quick Chat monitors and registers new ones
-    /// based on the current `quickChatShortcut` UserDefaults value.
-    /// Skips re-registration if the shortcut hasn't changed.
-    private func registerQuickChatMonitor() {
-        let shortcut = UserDefaults.standard.string(forKey: "quickChatShortcut") ?? "cmd+shift+space"
+    /// Registers a Carbon hotkey (Cmd+/) that intercepts system-wide,
+    /// before the frontmost app's menu system can consume it.
+    private func registerQuickInputMonitor() {
+        // Install Carbon event handler for hotkey events
+        var eventType = EventTypeSpec(eventClass: OSType(kEventClassKeyboard), eventKind: UInt32(kEventHotKeyPressed))
+        var handlerRef: EventHandlerRef?
+        InstallEventHandler(GetApplicationEventTarget(), quickInputHotKeyHandler, 1, &eventType, nil, &handlerRef)
+        quickInputEventHandlerRef = handlerRef
+
+        // Register Cmd+/ (keyCode 44) as a system-wide hotkey
+        let hotKeyID = EventHotKeyID(signature: OSType(0x564C_4D51), id: 1) // "VLMQ"
+        var hotKeyRef: EventHotKeyRef?
+        let status = RegisterEventHotKey(
+            UInt32(kVK_ANSI_Slash),
+            UInt32(cmdKey),
+            hotKeyID,
+            GetApplicationEventTarget(),
+            0,
+            &hotKeyRef
+        )
+        if status == noErr {
+            quickInputHotKeyRef = hotKeyRef
+            log.info("Quick Input: Carbon hotkey Cmd+/ registered successfully")
+        } else {
+            log.error("Quick Input: Failed to register Carbon hotkey, status: \(status)")
+        }
+    }
 
-        // Skip re-registration when the shortcut hasn't changed
-        if shortcut == lastRegisteredShortcut { return }
+    /// Removes the Carbon hotkey and event handler registrations.
+    private func tearDownQuickInputMonitors() {
+        if let ref = quickInputHotKeyRef {
+            UnregisterEventHotKey(ref)
+            quickInputHotKeyRef = nil
+        }
+        if let ref = quickInputEventHandlerRef {
+            RemoveEventHandler(ref)
+            quickInputEventHandlerRef = nil
+        }
+    }
 
-        if let existing = quickChatMonitor {
-            NSEvent.removeMonitor(existing)
-            quickChatMonitor = nil
+    func toggleQuickInput() {
+        if let window = quickInputWindow, window.isVisible {
+            window.dismiss()
+            return
+        }
+
+        let window = QuickInputWindow()
+        window.onSubmit = { [weak self] message in
+            self?.handleQuickInputSubmit(message)
         }
-        if let existing = quickChatLocalMonitor {
+        window.show()
+        quickInputWindow = window
+    }
+
+    private func handleQuickInputSubmit(_ message: String) {
+        showMainWindow()
+        guard let mainWindow else { return }
+        mainWindow.threadManager.createThread()
+        if let viewModel = mainWindow.activeViewModel {
+            viewModel.inputText = message
+            viewModel.sendMessage()
+        }
+    }
+
+    /// Tears down and re-registers the global "Open Vellum" hotkey based on
+    /// the current `globalHotkeyShortcut` UserDefaults value. Skips
+    /// re-registration if the shortcut hasn't changed.
+    private func registerGlobalHotkeyMonitor() {
+        let shortcut = UserDefaults.standard.string(forKey: "globalHotkeyShortcut") ?? "cmd+shift+g"
+
+        if shortcut == lastRegisteredGlobalHotkey { return }
+
+        if let existing = hotKeyMonitor {
             NSEvent.removeMonitor(existing)
-            quickChatLocalMonitor = nil
+            hotKeyMonitor = nil
         }
 
         let (targetModifiers, targetKey) = ShortcutHelper.parseShortcut(shortcut)
 
-        quickChatMonitor = NSEvent.addGlobalMonitorForEvents(matching: .keyDown) { [weak self] event in
+        // Use NSEvent global monitor instead of Carbon RegisterEventHotKey (HotKey package).
+        // Carbon hotkeys consume the event globally, preventing other apps from seeing the
+        // keystroke. NSEvent.addGlobalMonitorForEvents observes without consuming.
+        hotKeyMonitor = NSEvent.addGlobalMonitorForEvents(matching: .keyDown) { [weak self] event in
             let eventMods = event.modifierFlags.intersection(.deviceIndependentFlagsMask).subtracting(.numericPad)
             guard eventMods == targetModifiers,
                   event.charactersIgnoringModifiers?.lowercased() == targetKey.lowercased() else { return }
             Task { @MainActor in
-                self?.showQuickChat()
-            }
-        }
-
-        // Local monitor fires when the app itself is active (global monitor only
-        // fires when other apps are frontmost). Return nil to consume the event.
-        quickChatLocalMonitor = NSEvent.addLocalMonitorForEvents(matching: .keyDown) { [weak self] event in
-            let eventMods = event.modifierFlags.intersection(.deviceIndependentFlagsMask).subtracting(.numericPad)
-            guard eventMods == targetModifiers,
-                  event.charactersIgnoringModifiers?.lowercased() == targetKey.lowercased() else { return event }
-            Task { @MainActor in
-                self?.showQuickChat()
+                guard self?.isAwaitingFirstLaunchReady != true else { return }
+                self?.showMainWindow()
             }
-            return nil
-        }
-
-        lastRegisteredShortcut = shortcut
-    }
-
-    func showQuickChat() {
-        if let existing = quickChatPanel, existing.isVisible {
-            existing.dismiss()
-            return
         }
 
-        let panel = QuickChatPanel()
-        panel.onSubmit = { [weak self] message in
-            self?.startBackgroundSession(task: message, source: "quick_chat")
-        }
-        panel.show()
-        quickChatPanel = panel
+        lastRegisteredGlobalHotkey = shortcut
     }
 
     private func setupEscapeMonitor() {
@@ -1258,7 +1360,8 @@ public final class AppDelegate: NSObject, NSApplicationDelegate {
         }
 
         let sensitivity = UserDefaults.standard.float(forKey: "wakeWordSensitivity")
-        let engine = PorcupineWakeWordEngine(sensitivity: sensitivity > 0 ? sensitivity : 0.5)
+        let keyword = UserDefaults.standard.string(forKey: "wakeWordKeyword") ?? "computer"
+        let engine = PorcupineWakeWordEngine(sensitivity: sensitivity > 0 ? sensitivity : 0.5, keyword: keyword)
         let audioMonitor = AlwaysOnAudioMonitor(engine: engine)
 
         let coordinator = WakeWordCoordinator(
@@ -1516,22 +1619,18 @@ public final class AppDelegate: NSObject, NSApplicationDelegate {
         if let monitor = hotKeyMonitor {
             NSEvent.removeMonitor(monitor)
         }
+        tearDownQuickInputMonitors()
+        globalHotkeyObserver?.cancel()
         if let monitor = escapeMonitor {
             NSEvent.removeMonitor(monitor)
         }
-        if let monitor = quickChatMonitor {
-            NSEvent.removeMonitor(monitor)
-        }
-        if let monitor = quickChatLocalMonitor {
-            NSEvent.removeMonitor(monitor)
-        }
-        quickChatShortcutObserver?.cancel()
-        quickChatPanel?.dismiss()
-        quickChatPanel = nil
         if let observer = windowObserver {
             NotificationCenter.default.removeObserver(observer)
         }
         statusIconCancellable?.cancel()
+        connectionStatusCancellable?.cancel()
+        pulseTimer?.invalidate()
+        pulseTimer = nil
         voiceInput?.stop()
         ambientAgent.teardown()
         surfaceManager.dismissAll()
diff --git a/clients/macos/vellum-assistant/Features/Chat/ChatBubble.swift b/clients/macos/vellum-assistant/Features/Chat/ChatBubble.swift
index 41aa94369be..9ea8636e3e3 100644
--- a/clients/macos/vellum-assistant/Features/Chat/ChatBubble.swift
+++ b/clients/macos/vellum-assistant/Features/Chat/ChatBubble.swift
@@ -90,7 +90,7 @@ struct ChatBubble: View {
                         .offset(x: isUser ? -(24 + VSpacing.sm) : (24 + VSpacing.sm))
                 }
             }
-            .frame(maxWidth: 520, alignment: isUser ? .trailing : .leading)
+            .frame(maxWidth: message.isError ? .infinity : 520, alignment: isUser ? .trailing : .leading)
     }
 
     private var formattedTimestamp: String {
diff --git a/clients/macos/vellum-assistant/Features/Chat/ComposerView.swift b/clients/macos/vellum-assistant/Features/Chat/ComposerView.swift
index 492a91058c3..861942bc666 100644
--- a/clients/macos/vellum-assistant/Features/Chat/ComposerView.swift
+++ b/clients/macos/vellum-assistant/Features/Chat/ComposerView.swift
@@ -906,12 +906,14 @@ private final class CenteringClipView: NSClipView {
                let placeholder = textView.placeholderText,
                !placeholder.isEmpty {
                 let font = textView.font ?? NSFont.systemFont(ofSize: 13)
+                let paragraph = NSMutableParagraphStyle()
+                paragraph.lineBreakMode = .byTruncatingTail
                 let linePadding = textContainer.lineFragmentPadding
                 let availableWidth = max(0, textView.bounds.width - textView.textContainerInset.width * 2 - linePadding * 2)
                 let placeholderSize = (placeholder as NSString).boundingRect(
                     with: NSSize(width: availableWidth, height: .greatestFiniteMagnitude),
                     options: [.usesLineFragmentOrigin],
-                    attributes: [.font: font]
+                    attributes: [.font: font, .paragraphStyle: paragraph]
                 )
                 usedHeight = max(usedHeight, placeholderSize.height)
             }
diff --git a/clients/macos/vellum-assistant/Features/MainWindow/MainWindowView.swift b/clients/macos/vellum-assistant/Features/MainWindow/MainWindowView.swift
index 3ccb39620ba..f2d04760e03 100644
--- a/clients/macos/vellum-assistant/Features/MainWindow/MainWindowView.swift
+++ b/clients/macos/vellum-assistant/Features/MainWindow/MainWindowView.swift
@@ -988,10 +988,8 @@ struct MainWindowView: View {
             SidebarNavRow(icon: "sparkles", label: "Skills", isActive: windowState.activePanel == .agent) {
                 windowState.togglePanel(.agent)
             }
-            if FeatureFlagManager.shared.isEnabled(.assistantInboxEnabled) {
-                SidebarNavRow(icon: "tray.fill", label: "Inbox", isActive: windowState.activePanel == .assistantInbox) {
-                    windowState.togglePanel(.assistantInbox)
-                }
+            SidebarNavRow(icon: "tray.fill", label: "Inbox", isActive: windowState.activePanel == .assistantInbox) {
+                windowState.togglePanel(.assistantInbox)
             }
 
             // Divider between nav items and threads
@@ -1102,10 +1100,8 @@ struct MainWindowView: View {
             SidebarNavRow(icon: "sparkles", label: "Skills", isActive: windowState.activePanel == .agent, isExpanded: false) {
                 windowState.togglePanel(.agent)
             }
-            if FeatureFlagManager.shared.isEnabled(.assistantInboxEnabled) {
-                SidebarNavRow(icon: "tray.fill", label: "Inbox", isActive: windowState.activePanel == .assistantInbox, isExpanded: false) {
-                    windowState.togglePanel(.assistantInbox)
-                }
+            SidebarNavRow(icon: "tray.fill", label: "Inbox", isActive: windowState.activePanel == .assistantInbox, isExpanded: false) {
+                windowState.togglePanel(.assistantInbox)
             }
 
             VColor.divider
diff --git a/clients/macos/vellum-assistant/Features/MainWindow/Panels/AgentPanel.swift b/clients/macos/vellum-assistant/Features/MainWindow/Panels/AgentPanel.swift
index d5a86f3c7a1..a62e02ab8e8 100644
--- a/clients/macos/vellum-assistant/Features/MainWindow/Panels/AgentPanel.swift
+++ b/clients/macos/vellum-assistant/Features/MainWindow/Panels/AgentPanel.swift
@@ -141,19 +141,23 @@ struct AgentPanelContent: View {
     // MARK: - Available Skills Tab
 
     /// ClaWHub skills filtered to exclude already-installed ones, with local search and sort.
+    /// When filtering to Vellum-only, installed Vellum skills are kept so the catalog is always visible.
     private var availableClawhubSkills: [ClawhubSkillItem] {
         let installedNames = Set(skillsManager.skills.map(\.name))
-        var filtered = skillsManager.searchResults
-            .filter { !installedNames.contains($0.name) }
 
-        // Source filter
+        // Source filter — applied before installed-name exclusion so we can
+        // relax the exclusion for Vellum-only view.
+        var filtered: [ClawhubSkillItem]
         switch skillSourceFilter {
         case .all:
-            break
+            filtered = skillsManager.searchResults
+                .filter { !installedNames.contains($0.name) }
         case .vellum:
-            filtered = filtered.filter { $0.isVellum }
+            // Show all Vellum catalog skills even if already installed
+            filtered = skillsManager.searchResults.filter { $0.isVellum }
         case .community:
-            filtered = filtered.filter { !$0.isVellum }
+            filtered = skillsManager.searchResults
+                .filter { !$0.isVellum && !installedNames.contains($0.name) }
         }
 
         // Local fuzzy filter by name/description
@@ -269,7 +273,7 @@ struct AgentPanelContent: View {
                         title: hasActiveSearch ? "No matches in Available" : "No results",
                         subtitle: hasActiveSearch
                             ? "No available skills matched \"\(globalSkillSearchQuery)\""
-                            : "No \(skillSourceFilter.rawValue.lowercased()) skills found",
+                            : "No \(skillSourceFilter.rawValue) skills found",
                         icon: "magnifyingglass"
                     )
 
@@ -287,24 +291,26 @@ struct AgentPanelContent: View {
                 .frame(minHeight: 100)
             }
 
-            // Community disclaimer
-            VStack(spacing: VSpacing.sm) {
-                HStack(spacing: VSpacing.sm) {
-                    Image(systemName: "exclamationmark.shield.fill")
-                        .font(.system(size: 10))
-                        .foregroundColor(Amber._500)
-                    Text("Community skills are not verified by Vellum. Review before installing.")
-                        .font(VFont.caption)
-                        .foregroundColor(VColor.textMuted)
-                }
+            // Community disclaimer — hidden when filtering to Vellum-only
+            if skillSourceFilter != .vellum {
+                VStack(spacing: VSpacing.sm) {
+                    HStack(spacing: VSpacing.sm) {
+                        Image(systemName: "exclamationmark.shield.fill")
+                            .font(.system(size: 10))
+                            .foregroundColor(Amber._500)
+                        Text("Community skills are not verified by Vellum. Review before installing.")
+                            .font(VFont.caption)
+                            .foregroundColor(VColor.textMuted)
+                    }
 
-                HStack(spacing: VSpacing.sm) {
-                    Image(systemName: "sparkles")
-                        .font(.system(size: 10))
-                        .foregroundColor(VColor.accent)
-                    Text("Browse more on ClawhHub")
-                        .font(VFont.caption)
-                        .foregroundColor(VColor.textMuted)
+                    HStack(spacing: VSpacing.sm) {
+                        Image(systemName: "sparkles")
+                            .font(.system(size: 10))
+                            .foregroundColor(VColor.accent)
+                        Text("Browse more on ClawhHub")
+                            .font(VFont.caption)
+                            .foregroundColor(VColor.textMuted)
+                    }
                 }
             }
         }
@@ -352,6 +358,8 @@ struct AgentPanelContent: View {
     }
 
     private func clawhubSkillCard(_ skill: ClawhubSkillItem) -> some View {
+        let installedNames = Set(skillsManager.skills.map(\.name))
+        let isAlreadyInstalled = installedNames.contains(skill.name)
         let isInstalling = installingSlug == skill.slug
         let isNew = !skill.isVellum && skill.createdAt > 0 && Date().timeIntervalSince(
             Date(timeIntervalSince1970: Double(skill.createdAt) / 1000)
@@ -392,24 +400,30 @@ struct AgentPanelContent: View {
 
                 Spacer()
 
-                VButton(
-                    label: isInstalling ? "Installing..." : "Install",
-                    icon: isInstalling ? nil : "arrow.down.circle.fill",
-                    style: .primary,
-                    isDisabled: installingSlug != nil
-                ) {
-                    guard installingSlug == nil else { return }
-                    let attemptId = UUID()
-                    installingSlug = skill.slug
-                    installAttemptId = attemptId
-                    skillsManager.installSkill(slug: skill.slug)
-                    installTimeoutTask?.cancel()
-                    installTimeoutTask = Task {
-                        try? await Task.sleep(nanoseconds: 10_000_000_000)
-                        guard !Task.isCancelled else { return }
-                        if installingSlug == skill.slug && installAttemptId == attemptId {
-                            installingSlug = nil
-                            installAttemptId = nil
+                if isAlreadyInstalled {
+                    Text("Installed")
+                        .font(VFont.caption)
+                        .foregroundColor(VColor.success)
+                } else {
+                    VButton(
+                        label: isInstalling ? "Installing..." : "Install",
+                        icon: isInstalling ? nil : "arrow.down.circle.fill",
+                        style: .primary,
+                        isDisabled: installingSlug != nil
+                    ) {
+                        guard installingSlug == nil else { return }
+                        let attemptId = UUID()
+                        installingSlug = skill.slug
+                        installAttemptId = attemptId
+                        skillsManager.installSkill(slug: skill.slug)
+                        installTimeoutTask?.cancel()
+                        installTimeoutTask = Task {
+                            try? await Task.sleep(nanoseconds: 10_000_000_000)
+                            guard !Task.isCancelled else { return }
+                            if installingSlug == skill.slug && installAttemptId == attemptId {
+                                installingSlug = nil
+                                installAttemptId = nil
+                            }
                         }
                     }
                 }
@@ -823,11 +837,25 @@ struct AgentPanelContent: View {
                 }
                 .frame(minHeight: 100)
             } else {
-                Text("No skills installed")
-                    .font(VFont.caption)
-                    .foregroundColor(VColor.textMuted)
+                VStack(spacing: VSpacing.md) {
+                    Text("No skills installed")
+                        .font(VFont.caption)
+                        .foregroundColor(VColor.textMuted)
+                        .frame(maxWidth: .infinity, alignment: .leading)
+
+                    Button(action: { skillsManager.fetchSkills() }) {
+                        HStack(spacing: VSpacing.xs) {
+                            Image(systemName: "arrow.clockwise")
+                                .font(.system(size: 11))
+                            Text("Refresh")
+                                .font(VFont.caption)
+                        }
+                        .foregroundColor(VColor.accent)
+                    }
+                    .buttonStyle(.plain)
                     .frame(maxWidth: .infinity, alignment: .leading)
-                    .padding(.vertical, VSpacing.sm)
+                }
+                .padding(.vertical, VSpacing.sm)
             }
         } else {
             VStack(spacing: VSpacing.md) {
diff --git a/clients/macos/vellum-assistant/Features/MainWindow/Panels/SettingsPanel.swift b/clients/macos/vellum-assistant/Features/MainWindow/Panels/SettingsPanel.swift
index 516a1b9a128..bc57ed0d168 100644
--- a/clients/macos/vellum-assistant/Features/MainWindow/Panels/SettingsPanel.swift
+++ b/clients/macos/vellum-assistant/Features/MainWindow/Panels/SettingsPanel.swift
@@ -1183,9 +1183,8 @@ private struct ModelPickerItem: View {
     }
 }
 
-// MARK: - Environment Variables Sheet (Debug Only)
+// MARK: - Environment Variables Sheet
 
-#if DEBUG
 struct SettingsPanelEnvVarsSheet: View {
     let appEnvVars: [(String, String)]
     let daemonEnvVars: [(String, String)]
@@ -1243,7 +1242,6 @@ struct SettingsPanelEnvVarsSheet: View {
         }
     }
 }
-#endif
 
 struct SettingsPanel_Previews: PreviewProvider {
     static var previews: some View {
diff --git a/clients/macos/vellum-assistant/Features/QuickChat/QuickChatPanel.swift b/clients/macos/vellum-assistant/Features/QuickChat/QuickChatPanel.swift
deleted file mode 100644
index 8cbc12cdfa6..00000000000
--- a/clients/macos/vellum-assistant/Features/QuickChat/QuickChatPanel.swift
+++ /dev/null
@@ -1,224 +0,0 @@
-import AppKit
-import SwiftUI
-import VellumAssistantShared
-
-/// Borderless NSPanel subclass that can become key window.
-/// Without this override, borderless windows refuse key status
-/// and SwiftUI TextEditor won't accept keyboard input.
-private class KeyablePanel: NSPanel {
-    override var canBecomeKey: Bool { true }
-}
-
-/// A borderless, floating NSPanel that hosts the Quick Chat text editor.
-/// Appears centered on the active screen with a vibrancy/blur background.
-/// Dismisses itself when it resigns key window status.
-@MainActor
-final class QuickChatPanel {
-    private var panel: NSPanel?
-    private var toastPanel: NSPanel?
-    private var resignObserver: Any?
-    private var previousApp: NSRunningApplication?
-
-    /// Callback invoked when the user submits a message.
-    var onSubmit: ((String) -> Void)?
-
-    func show() {
-        // Remember the frontmost app so we can restore focus on dismiss
-        previousApp = NSWorkspace.shared.frontmostApplication
-
-        if let existing = panel {
-            existing.makeKeyAndOrderFront(nil)
-            NSApp.activate(ignoringOtherApps: true)
-            return
-        }
-
-        let view = QuickChatView(
-            onSubmit: { [weak self] message in
-                self?.onSubmit?(message)
-                // Capture position before dismissing so the toast appears in the same spot
-                let panelFrame = self?.panel?.frame
-                self?.dismiss()
-                if let frame = panelFrame {
-                    self?.showSentToast(near: frame)
-                }
-            },
-            onDismiss: { [weak self] in
-                self?.dismiss()
-            }
-        )
-
-        let hostingController = NSHostingController(rootView: view)
-
-        let panel = KeyablePanel(
-            contentRect: NSRect(x: 0, y: 0, width: 400, height: 60),
-            styleMask: [.borderless, .nonactivatingPanel],
-            backing: .buffered,
-            defer: false
-        )
-
-        panel.contentViewController = hostingController
-        panel.level = .floating
-        panel.isMovableByWindowBackground = true
-        panel.titleVisibility = .hidden
-        panel.titlebarAppearsTransparent = true
-        panel.isReleasedWhenClosed = false
-        panel.backgroundColor = .clear
-        panel.isOpaque = false
-        panel.hasShadow = true
-        panel.collectionBehavior = [.canJoinAllSpaces, .fullScreenAuxiliary]
-
-        // Center on the active screen
-        centerOnScreen(panel)
-
-        // Animate in: start transparent and slightly scaled down, then animate to full
-        panel.alphaValue = 0
-        panel.makeKeyAndOrderFront(nil)
-        NSApp.activate(ignoringOtherApps: true)
-
-        NSAnimationContext.runAnimationGroup { context in
-            context.duration = VAnimation.durationFast
-            context.timingFunction = CAMediaTimingFunction(name: .easeOut)
-            panel.animator().alphaValue = 1
-        }
-
-        // Dismiss when the panel loses focus
-        resignObserver = NotificationCenter.default.addObserver(
-            forName: NSWindow.didResignKeyNotification,
-            object: panel,
-            queue: .main
-        ) { [weak self] _ in
-            Task { @MainActor in
-                self?.dismiss()
-            }
-        }
-
-        self.panel = panel
-    }
-
-    func dismiss() {
-        if let resignObserver {
-            NotificationCenter.default.removeObserver(resignObserver)
-        }
-        resignObserver = nil
-
-        guard let panel else { return }
-
-        // Restore focus to the app that was active before Quick Chat appeared
-        let appToRestore = previousApp
-        previousApp = nil
-
-        // Animate out: fade to transparent, then close
-        NSAnimationContext.runAnimationGroup({ context in
-            context.duration = VAnimation.durationFast
-            context.timingFunction = CAMediaTimingFunction(name: .easeIn)
-            panel.animator().alphaValue = 0
-        }, completionHandler: { [weak self] in
-            panel.close()
-            self?.panel = nil
-            appToRestore?.activate()
-        })
-    }
-
-    var isVisible: Bool {
-        panel?.isVisible ?? false
-    }
-
-    // MARK: - Private
-
-    /// Shows a brief "Message sent" toast near the given frame, then auto-dismisses.
-    private func showSentToast(near frame: NSRect) {
-        // Close any existing toast immediately to prevent timer races
-        if let existing = toastPanel {
-            existing.close()
-            toastPanel = nil
-        }
-
-        let toastView = HStack(spacing: VSpacing.sm) {
-            Image(systemName: "checkmark.circle.fill")
-                .foregroundColor(VColor.success)
-            Text("Message sent")
-                .font(VFont.body)
-                .foregroundColor(VColor.textPrimary)
-        }
-        .padding(.horizontal, VSpacing.lg)
-        .padding(.vertical, VSpacing.md)
-        .background(
-            VisualEffectBlur(material: .hudWindow, blendingMode: .behindWindow)
-        )
-        .clipShape(RoundedRectangle(cornerRadius: VRadius.md))
-
-        let hosting = NSHostingController(rootView: toastView)
-
-        let toast = NSPanel(
-            contentRect: NSRect(x: 0, y: 0, width: 180, height: 40),
-            styleMask: [.borderless, .nonactivatingPanel],
-            backing: .buffered,
-            defer: false
-        )
-        toast.contentViewController = hosting
-        toast.level = .floating
-        toast.titleVisibility = .hidden
-        toast.titlebarAppearsTransparent = true
-        toast.isReleasedWhenClosed = false
-        toast.backgroundColor = .clear
-        toast.isOpaque = false
-        toast.hasShadow = true
-        toast.collectionBehavior = [.canJoinAllSpaces, .fullScreenAuxiliary]
-
-        // Position just below where the Quick Chat panel was
-        if let fittingSize = toast.contentView?.fittingSize {
-            let x = frame.midX - fittingSize.width / 2
-            let y = frame.origin.y - fittingSize.height - 8
-            toast.setFrame(
-                NSRect(x: x, y: y, width: fittingSize.width, height: fittingSize.height),
-                display: true
-            )
-        }
-
-        toast.alphaValue = 0
-        toast.orderFrontRegardless()
-
-        NSAnimationContext.runAnimationGroup { context in
-            context.duration = VAnimation.durationFast
-            context.timingFunction = CAMediaTimingFunction(name: .easeOut)
-            toast.animator().alphaValue = 1
-        }
-
-        self.toastPanel = toast
-
-        // Auto-dismiss after 1.5 seconds. Capture `toast` directly so it's
-        // always closed even if `self` (QuickChatPanel) is deallocated first.
-        DispatchQueue.main.asyncAfter(deadline: .now() + 1.5) { [weak self] in
-            NSAnimationContext.runAnimationGroup({ context in
-                context.duration = VAnimation.durationFast
-                context.timingFunction = CAMediaTimingFunction(name: .easeIn)
-                toast.animator().alphaValue = 0
-            }, completionHandler: {
-                toast.close()
-                if self?.toastPanel === toast {
-                    self?.toastPanel = nil
-                }
-            })
-        }
-    }
-
-    private func centerOnScreen(_ panel: NSPanel) {
-        let screen = NSScreen.main ?? NSScreen.screens.first
-        guard let screenFrame = screen?.visibleFrame else { return }
-
-        // Let the hosting controller size the panel to fit the content
-        if let fittingSize = panel.contentView?.fittingSize {
-            let width = max(fittingSize.width, 400)
-            let height = fittingSize.height
-            let x = screenFrame.midX - width / 2
-            // Position slightly above center (like Spotlight)
-            let y = screenFrame.midY + screenFrame.height * 0.1
-            panel.setFrame(
-                NSRect(x: x, y: y, width: width, height: height),
-                display: true
-            )
-        } else {
-            panel.center()
-        }
-    }
-}
diff --git a/clients/macos/vellum-assistant/Features/QuickChat/QuickChatView.swift b/clients/macos/vellum-assistant/Features/QuickInput/QuickInputView.swift
similarity index 57%
rename from clients/macos/vellum-assistant/Features/QuickChat/QuickChatView.swift
rename to clients/macos/vellum-assistant/Features/QuickInput/QuickInputView.swift
index f79858e4df3..3435cf2aced 100644
--- a/clients/macos/vellum-assistant/Features/QuickChat/QuickChatView.swift
+++ b/clients/macos/vellum-assistant/Features/QuickInput/QuickInputView.swift
@@ -1,59 +1,50 @@
 import SwiftUI
 import VellumAssistantShared
 
-struct QuickChatView: View {
+struct QuickInputView: View {
     let onSubmit: (String) -> Void
     let onDismiss: () -> Void
 
     @State private var text = ""
-    @State private var isPresented = false
     @FocusState private var isFocused: Bool
 
-    private let panelWidth: CGFloat = 400
-    private let minEditorHeight: CGFloat = 36
-    private let maxEditorHeight: CGFloat = 120
+    private let panelWidth: CGFloat = 500
 
     var body: some View {
-        VStack(spacing: 0) {
-            TextEditor(text: $text)
+        HStack(spacing: VSpacing.sm) {
+            TextField("Send a message...", text: $text)
                 .font(VFont.body)
                 .foregroundColor(VColor.textPrimary)
-                .scrollContentBackground(.hidden)
-                .padding(.horizontal, VSpacing.md)
-                .padding(.vertical, VSpacing.sm)
-                .frame(minHeight: minEditorHeight, maxHeight: maxEditorHeight)
-                .fixedSize(horizontal: false, vertical: true)
+                .textFieldStyle(.plain)
                 .focused($isFocused)
-                .overlay(alignment: .topLeading) {
-                    if text.isEmpty {
-                        Text("Type a message...")
-                            .font(VFont.body)
-                            .foregroundColor(VColor.textMuted)
-                            .padding(.horizontal, VSpacing.md + 5)
-                            .padding(.vertical, VSpacing.sm + 1)
-                            .allowsHitTesting(false)
-                    }
-                }
-                .onKeyPress(.return) {
+                .onSubmit {
                     submit()
-                    return .handled
                 }
                 .onKeyPress(.escape) {
                     onDismiss()
                     return .handled
                 }
+
+            Button(action: submit) {
+                Image(systemName: "arrow.up.circle.fill")
+                    .font(.system(size: 20))
+                    .foregroundColor(text.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty
+                        ? VColor.textMuted
+                        : VColor.sendButton)
+            }
+            .buttonStyle(.plain)
+            .disabled(text.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty)
+            .accessibilityLabel("Send message")
         }
+        .padding(.horizontal, VSpacing.lg)
+        .padding(.vertical, VSpacing.md)
         .frame(width: panelWidth)
         .background(
             VisualEffectBlur(material: .hudWindow, blendingMode: .behindWindow)
         )
         .clipShape(RoundedRectangle(cornerRadius: VRadius.lg))
-        .scaleEffect(isPresented ? 1.0 : 0.95)
         .onAppear {
             isFocused = true
-            withAnimation(VAnimation.fast) {
-                isPresented = true
-            }
         }
     }
 
@@ -84,12 +75,10 @@ struct VisualEffectBlur: NSViewRepresentable {
     }
 }
 
-// MARK: - Preview
-
-#Preview("QuickChatView") {
+#Preview("QuickInputView") {
     ZStack {
         Color.black.opacity(0.5).ignoresSafeArea()
-        QuickChatView(
+        QuickInputView(
             onSubmit: { message in
                 print("Submitted: \(message)")
             },
@@ -98,5 +87,5 @@ struct VisualEffectBlur: NSViewRepresentable {
             }
         )
     }
-    .frame(width: 500, height: 300)
+    .frame(width: 600, height: 200)
 }
diff --git a/clients/macos/vellum-assistant/Features/QuickInput/QuickInputWindow.swift b/clients/macos/vellum-assistant/Features/QuickInput/QuickInputWindow.swift
new file mode 100644
index 00000000000..572dc9338b3
--- /dev/null
+++ b/clients/macos/vellum-assistant/Features/QuickInput/QuickInputWindow.swift
@@ -0,0 +1,152 @@
+import AppKit
+import SwiftUI
+import VellumAssistantShared
+
+/// Borderless NSPanel subclass that can become key window.
+/// Without this override, borderless windows refuse key status
+/// and SwiftUI TextField won't accept keyboard input.
+private class KeyablePanel: NSPanel {
+    override var canBecomeKey: Bool { true }
+}
+
+/// A borderless, floating NSPanel that hosts the Quick Input text field.
+/// Appears centered on the active screen, slightly above center (Spotlight-style).
+/// Dismisses itself when it resigns key window status.
+@MainActor
+final class QuickInputWindow {
+    private var panel: NSPanel?
+    private var resignObserver: Any?
+    private var previousApp: NSRunningApplication?
+    private var isDismissing = false
+
+    /// Callback invoked when the user submits a message.
+    var onSubmit: ((String) -> Void)?
+
+    func show() {
+        // Remember the frontmost app so we can restore focus on dismiss
+        previousApp = NSWorkspace.shared.frontmostApplication
+
+        if let existing = panel {
+            existing.makeKeyAndOrderFront(nil)
+            NSApp.activate(ignoringOtherApps: true)
+            return
+        }
+
+        let view = QuickInputView(
+            onSubmit: { [weak self] message in
+                self?.onSubmit?(message)
+                self?.dismiss(restorePreviousApp: false)
+            },
+            onDismiss: { [weak self] in
+                self?.dismiss()
+            }
+        )
+
+        let hostingController = NSHostingController(rootView: view)
+
+        let panel = KeyablePanel(
+            contentRect: NSRect(x: 0, y: 0, width: 500, height: 48),
+            styleMask: [.borderless, .nonactivatingPanel],
+            backing: .buffered,
+            defer: false
+        )
+
+        panel.contentViewController = hostingController
+        panel.level = .floating
+        panel.isMovableByWindowBackground = true
+        panel.titleVisibility = .hidden
+        panel.titlebarAppearsTransparent = true
+        panel.isReleasedWhenClosed = false
+        panel.backgroundColor = .clear
+        panel.isOpaque = false
+        panel.hasShadow = true
+        panel.collectionBehavior = [.canJoinAllSpaces, .fullScreenAuxiliary]
+
+        // Center horizontally, ~1/3 from top vertically (Spotlight-style)
+        centerOnScreen(panel)
+
+        // Animate in
+        panel.alphaValue = 0
+        panel.makeKeyAndOrderFront(nil)
+        NSApp.activate(ignoringOtherApps: true)
+
+        NSAnimationContext.runAnimationGroup { context in
+            context.duration = VAnimation.durationFast
+            context.timingFunction = CAMediaTimingFunction(name: .easeOut)
+            panel.animator().alphaValue = 1
+        }
+
+        // Dismiss when the panel loses focus. Don't restore the previous
+        // app — the user clicked elsewhere, so that app already has focus.
+        resignObserver = NotificationCenter.default.addObserver(
+            forName: NSWindow.didResignKeyNotification,
+            object: panel,
+            queue: .main
+        ) { [weak self] _ in
+            Task { @MainActor in
+                self?.dismiss(restorePreviousApp: false)
+            }
+        }
+
+        self.panel = panel
+    }
+
+    func dismiss(restorePreviousApp: Bool = true) {
+        guard !isDismissing else { return }
+        isDismissing = true
+
+        if let resignObserver {
+            NotificationCenter.default.removeObserver(resignObserver)
+        }
+        resignObserver = nil
+
+        guard let panel else {
+            isDismissing = false
+            return
+        }
+
+        let appToRestore = restorePreviousApp ? previousApp : nil
+        previousApp = nil
+
+        NSAnimationContext.runAnimationGroup({ context in
+            context.duration = VAnimation.durationFast
+            context.timingFunction = CAMediaTimingFunction(name: .easeIn)
+            panel.animator().alphaValue = 0
+        }, completionHandler: { [weak self] in
+            panel.close()
+            self?.panel = nil
+            self?.isDismissing = false
+            appToRestore?.activate()
+        })
+    }
+
+    var isVisible: Bool {
+        panel?.isVisible ?? false
+    }
+
+    // MARK: - Private
+
+    private func centerOnScreen(_ panel: NSPanel) {
+        // Use the screen containing the mouse cursor so the panel appears
+        // on the active display, even when triggered from another app.
+        let mouseLocation = NSEvent.mouseLocation
+        let screen = NSScreen.screens.first(where: { $0.frame.contains(mouseLocation) })
+            ?? NSScreen.main
+            ?? NSScreen.screens.first
+        guard let screenFrame = screen?.visibleFrame else { return }
+
+        if let fittingSize = panel.contentView?.fittingSize {
+            let width = max(fittingSize.width, 500)
+            let height = fittingSize.height
+            let x = screenFrame.midX - width / 2
+            // Position ~1/3 from top (like Spotlight)
+            let y = screenFrame.midY + screenFrame.height * 0.15
+            panel.setFrame(
+                NSRect(x: x, y: y, width: width, height: height),
+                display: true
+            )
+        } else {
+            panel.center()
+        }
+    }
+}
diff --git a/clients/macos/vellum-assistant/Features/Settings/ApprovedDevicesSection.swift b/clients/macos/vellum-assistant/Features/Settings/ApprovedDevicesSection.swift
index 79ad6b9b146..26509e48330 100644
--- a/clients/macos/vellum-assistant/Features/Settings/ApprovedDevicesSection.swift
+++ b/clients/macos/vellum-assistant/Features/Settings/ApprovedDevicesSection.swift
@@ -69,8 +69,8 @@ struct ApprovedDevicesSection: View {
         }
     }
 
-    private func formattedDate(_ timestamp: Double) -> String {
-        let date = Date(timeIntervalSince1970: timestamp / 1000.0)
+    private func formattedDate(_ timestamp: Int) -> String {
+        let date = Date(timeIntervalSince1970: Double(timestamp) / 1000.0)
         let formatter = RelativeDateTimeFormatter()
         formatter.unitsStyle = .abbreviated
         return formatter.localizedString(for: date, relativeTo: Date())
diff --git a/clients/macos/vellum-assistant/Features/Settings/PairingApprovalWindow.swift b/clients/macos/vellum-assistant/Features/Settings/PairingApprovalWindow.swift
index 900ee80558e..0721d61f026 100644
--- a/clients/macos/vellum-assistant/Features/Settings/PairingApprovalWindow.swift
+++ b/clients/macos/vellum-assistant/Features/Settings/PairingApprovalWindow.swift
@@ -8,19 +8,32 @@ import VellumAssistantShared
 final class PairingApprovalWindow {
     private var window: NSWindow?
     private let daemonClient: DaemonClient
+    private var currentPairingRequestId: String?
+    private var responseSent: Bool = false
+    private var windowDelegate: WindowCloseDelegate?
 
     init(daemonClient: DaemonClient) {
         self.daemonClient = daemonClient
     }
 
     /// Show the pairing approval prompt for a specific device.
-    /// If a window is already showing, it is closed first (one prompt at a time).
+    /// If a window is already showing for a different request, it is closed first
+    /// (one prompt at a time) and a deny is sent for the superseded request.
+    /// If the same pairingRequestId is delivered again (daemon retry/rebroadcast),
+    /// the existing prompt is kept as-is — no deny is sent.
     func show(pairingRequestId: String, deviceName: String) {
-        // Close any existing prompt before showing a new one
+        // Same request ID redelivered (retry/rebroadcast) — keep current prompt.
+        if pairingRequestId == currentPairingRequestId, window != nil {
+            return
+        }
+
+        // Close any existing prompt before showing a new one.
+        // This will send a deny for the previous (different) request if unanswered.
         close()
 
         let view = PairingApprovalView(deviceName: deviceName) { [weak self] decision in
             guard let self else { return }
+            self.responseSent = true
             try? self.daemonClient.sendPairingApprovalResponse(
                 pairingRequestId: pairingRequestId,
                 decision: decision
@@ -43,10 +56,19 @@ final class PairingApprovalWindow {
         window.isReleasedWhenClosed = false
         window.center()
 
+        // Delegate catches X-button close and sends deny if no response was sent.
+        let delegate = WindowCloseDelegate { [weak self] in
+            self?.handleWindowClosed()
+        }
+        window.delegate = delegate
+        self.windowDelegate = delegate
+
         window.makeKeyAndOrderFront(nil)
         NSApp.activate(ignoringOtherApps: true)
 
         self.window = window
+        self.currentPairingRequestId = pairingRequestId
+        self.responseSent = false
     }
 
     var isVisible: Bool {
@@ -54,7 +76,45 @@ final class PairingApprovalWindow {
     }
 
     func close() {
+        denyIfNeeded()
         window?.close()
         window = nil
+        windowDelegate = nil
+    }
+
+    // MARK: - Private
+
+    /// Sends a deny for the current request if no explicit response has been sent yet.
+    private func denyIfNeeded() {
+        guard let requestId = currentPairingRequestId, !responseSent else { return }
+        responseSent = true
+        try? daemonClient.sendPairingApprovalResponse(
+            pairingRequestId: requestId,
+            decision: "deny"
+        )
+    }
+
+    /// Called by the window delegate when the window is about to close, whether via the X button or a programmatic close().
+    private func handleWindowClosed() {
+        denyIfNeeded()
+        window = nil
+        windowDelegate = nil
+    }
+}
+
+// MARK: - WindowCloseDelegate
+
+/// Lightweight NSWindowDelegate that forwards windowWillClose to a closure.
+private final class WindowCloseDelegate: NSObject, NSWindowDelegate {
+    private let onClose: @MainActor () -> Void
+
+    init(onClose: @escaping @MainActor () -> Void) {
+        self.onClose = onClose
+    }
+
+    func windowWillClose(_ notification: Notification) {
+        MainActor.assumeIsolated {
+            onClose()
+        }
     }
 }
diff --git a/clients/macos/vellum-assistant/Features/Settings/PairingQRCodeSheet.swift b/clients/macos/vellum-assistant/Features/Settings/PairingQRCodeSheet.swift
index 4374167f8a2..cd9accf10e0 100644
--- a/clients/macos/vellum-assistant/Features/Settings/PairingQRCodeSheet.swift
+++ b/clients/macos/vellum-assistant/Features/Settings/PairingQRCodeSheet.swift
@@ -17,7 +17,7 @@ struct PairingQRCodeSheet: View {
     @Environment(\.dismiss) var dismiss
 
     let gatewayUrl: String
-    let daemonClient: DaemonClient
+    let daemonClient: DaemonClient?
 
     @State private var hostId: String = ""
     @State private var pairingRequestId: String = UUID().uuidString
@@ -25,14 +25,26 @@ struct PairingQRCodeSheet: View {
     @State private var localLanUrl: String? = nil
     @State private var registrationState: RegistrationState = .idle
     @State private var registrationError: String? = nil
+    @State private var refreshTask: Task<Void, Never>? = nil
+    @State private var consecutiveRefreshFailures: Int = 0
+
+    /// Re-register every 4 minutes to stay ahead of the 5-minute TTL.
+    private static let refreshInterval: UInt64 = 4 * 60 * 1_000_000_000
 
     enum RegistrationState {
         case idle, registering, registered, failed
     }
 
+    /// The effective gateway URL for iOS to connect to. Prefers the configured
+    /// cloud gateway URL, falls back to the local LAN gateway address.
+    private var effectiveGatewayUrl: String {
+        if !gatewayUrl.isEmpty { return gatewayUrl }
+        return localLanUrl ?? ""
+    }
+
     /// Whether the configuration is sufficient for pairing.
     private var canGenerateQR: Bool {
-        !gatewayUrl.isEmpty && registrationState == .registered
+        !effectiveGatewayUrl.isEmpty && registrationState == .registered
     }
 
     var body: some View {
@@ -45,54 +57,58 @@ struct PairingQRCodeSheet: View {
                 Button("Done") { dismiss() }
             }
 
-            switch registrationState {
-            case .idle, .registering:
-                VStack(spacing: VSpacing.sm) {
-                    ProgressView()
-                        .controlSize(.large)
-                    Text("Registering pairing request...")
-                        .font(VFont.body)
-                        .foregroundColor(VColor.textSecondary)
-                }
-                .frame(width: 220, height: 220)
-
-            case .registered:
-                if let qrImage = generateQRImage() {
-                    Image(nsImage: qrImage)
-                        .resizable()
-                        .interpolation(.none)
-                        .scaledToFit()
-                        .frame(width: 220, height: 220)
-                        .padding(VSpacing.md)
-                        .background(Color.white)
-                        .cornerRadius(VRadius.md)
-                } else {
-                    errorContent("Failed to generate QR code.")
-                }
-
-            case .failed:
-                errorContent(registrationError ?? "Could not register pairing request. Ensure the daemon is running.")
-            }
+            if daemonClient == nil {
+                errorContent("Cannot generate QR code \u{2014} daemon not connected. Please wait for the daemon to start and try again.")
+            } else {
+                switch registrationState {
+                case .idle, .registering:
+                    VStack(spacing: VSpacing.sm) {
+                        ProgressView()
+                            .controlSize(.large)
+                        Text("Registering pairing request...")
+                            .font(VFont.body)
+                            .foregroundColor(VColor.textSecondary)
+                    }
+                    .frame(width: 220, height: 220)
+
+                case .registered:
+                    if let qrImage = generateQRImage() {
+                        Image(nsImage: qrImage)
+                            .resizable()
+                            .interpolation(.none)
+                            .scaledToFit()
+                            .frame(width: 220, height: 220)
+                            .padding(VSpacing.md)
+                            .background(Color.white)
+                            .cornerRadius(VRadius.md)
+                    } else {
+                        errorContent("Failed to generate QR code.")
+                    }
 
-            // State indicator
-            if canGenerateQR {
-                HStack(spacing: VSpacing.sm) {
-                    Image(systemName: "checkmark.circle.fill")
-                        .foregroundColor(VColor.success)
-                        .font(.system(size: 14))
-                    Text("Ready to pair with iOS")
-                        .font(VFont.body)
-                        .foregroundColor(VColor.success)
+                case .failed:
+                    errorContent(registrationError ?? "Could not register pairing request. Ensure the daemon is running.")
                 }
 
-                if localLanUrl != nil {
-                    HStack(spacing: VSpacing.xs) {
-                        Image(systemName: "wifi")
-                            .foregroundColor(VColor.textMuted)
-                            .font(.system(size: 12))
-                        Text("LAN pairing available")
-                            .font(VFont.caption)
-                            .foregroundColor(VColor.textMuted)
+                // State indicator
+                if canGenerateQR {
+                    HStack(spacing: VSpacing.sm) {
+                        Image(systemName: "checkmark.circle.fill")
+                            .foregroundColor(VColor.success)
+                            .font(.system(size: 14))
+                        Text("Ready to pair with iOS")
+                            .font(VFont.body)
+                            .foregroundColor(VColor.success)
+                    }
+
+                    if localLanUrl != nil {
+                        HStack(spacing: VSpacing.xs) {
+                            Image(systemName: "wifi")
+                                .foregroundColor(VColor.textMuted)
+                                .font(.system(size: 12))
+                            Text("LAN pairing available")
+                                .font(VFont.caption)
+                                .foregroundColor(VColor.textMuted)
+                        }
                     }
                 }
             }
@@ -102,11 +118,13 @@ struct PairingQRCodeSheet: View {
                 .foregroundColor(VColor.textSecondary)
                 .multilineTextAlignment(.center)
 
-            if registrationState == .failed {
+            if registrationState == .failed && daemonClient != nil {
                 Button("Retry") {
+                    consecutiveRefreshFailures = 0
                     pairingRequestId = UUID().uuidString
                     pairingSecret = Self.generatePairingSecret()
                     registerWithDaemon()
+                    startRefreshTimer()
                 }
                 .buttonStyle(.bordered)
             }
@@ -116,7 +134,12 @@ struct PairingQRCodeSheet: View {
         .onAppear {
             hostId = Self.computeHostId()
             localLanUrl = computeLocalLanUrl()
+            guard daemonClient != nil else { return }
             registerWithDaemon()
+            startRefreshTimer()
+        }
+        .onDisappear {
+            stopRefreshTimer()
         }
     }
 
@@ -133,18 +156,91 @@ struct PairingQRCodeSheet: View {
         .frame(width: 220, height: 220)
     }
 
+    // MARK: - Refresh Timer
+
+    private func startRefreshTimer() {
+        stopRefreshTimer()
+        refreshTask = Task { @MainActor in
+            while !Task.isCancelled {
+                try? await Task.sleep(nanoseconds: Self.refreshInterval)
+                guard !Task.isCancelled else { break }
+                // Generate new credentials into locals so the old QR stays visible
+                // while the re-registration HTTP request is in-flight.
+                let newRequestId = UUID().uuidString
+                let newSecret = Self.generatePairingSecret()
+                await refreshRegistration(newRequestId: newRequestId, newSecret: newSecret)
+            }
+        }
+    }
+
+    private func stopRefreshTimer() {
+        refreshTask?.cancel()
+        refreshTask = nil
+    }
+
     // MARK: - Registration
 
+    /// Resolve the daemon HTTP port. Prefers the IPC-reported value, falls back
+    /// to `RUNTIME_HTTP_PORT` env var, then the default `7821`.
+    private var resolvedHttpPort: Int {
+        if let port = daemonClient?.httpPort { return port }
+        let envPort = ProcessInfo.processInfo.environment["RUNTIME_HTTP_PORT"]
+            ?? getenv("RUNTIME_HTTP_PORT").flatMap({ String(cString: $0) })
+        return envPort.flatMap(Int.init) ?? 7821
+    }
+
     private func registerWithDaemon() {
         registrationState = .registering
         registrationError = nil
 
-        guard let port = daemonClient.httpPort else {
-            registrationState = .failed
-            registrationError = "Daemon HTTP server not running."
-            return
+        let port = resolvedHttpPort
+
+        let reqId = pairingRequestId
+        let secret = pairingSecret
+
+        Task {
+            let result = await performRegistrationRequest(port: port, requestId: reqId, secret: secret)
+            switch result {
+            case .success:
+                registrationState = .registered
+            case .failure(let error):
+                registrationState = .failed
+                registrationError = error.message
+            }
         }
+    }
 
+    /// Re-register with new credentials without disrupting the visible QR code.
+    /// pairingRequestId, pairingSecret, and registrationState are swapped together
+    /// only after the HTTP 200 response comes back. On failure the old QR stays visible.
+    private func refreshRegistration(newRequestId: String, newSecret: String) async {
+        let port = resolvedHttpPort
+
+        let result = await performRegistrationRequest(port: port, requestId: newRequestId, secret: newSecret)
+        switch result {
+        case .success:
+            pairingRequestId = newRequestId
+            pairingSecret = newSecret
+            registrationState = .registered
+            consecutiveRefreshFailures = 0
+        case .failure:
+            consecutiveRefreshFailures += 1
+            if consecutiveRefreshFailures >= 2 {
+                registrationState = .failed
+                registrationError = "Re-registration failed. Close and reopen to try again."
+                stopRefreshTimer()
+            }
+            // On first failure, keep old QR visible; the next timer tick will retry.
+        }
+    }
+
+    /// Error wrapper for registration request results.
+    private struct RegistrationRequestError: Error {
+        let message: String
+    }
+
+    /// Shared HTTP request logic for pairing registration.
+    private func performRegistrationRequest(port: Int, requestId: String, secret: String) async -> Result<Void, RegistrationRequestError> {
         let tokenPath = resolveHttpTokenPath()
         let bearerToken: String? = {
             guard let path = tokenPath else { return nil }
@@ -154,18 +250,16 @@ struct PairingQRCodeSheet: View {
         let url = URL(string: "http://localhost:\(port)/v1/pairing/register")!
 
         var body: [String: Any] = [
-            "pairingRequestId": pairingRequestId,
-            "pairingSecret": pairingSecret,
-            "gatewayUrl": gatewayUrl,
+            "pairingRequestId": requestId,
+            "pairingSecret": secret,
+            "gatewayUrl": effectiveGatewayUrl,
         ]
         if let lan = localLanUrl {
             body["localLanUrl"] = lan
         }
 
         guard let jsonData = try? JSONSerialization.data(withJSONObject: body) else {
-            registrationState = .failed
-            registrationError = "Failed to serialize registration payload."
-            return
+            return .failure(RegistrationRequestError(message: "Failed to serialize registration payload."))
         }
 
         var request = URLRequest(url: url)
@@ -176,20 +270,16 @@ struct PairingQRCodeSheet: View {
             request.setValue("Bearer \(token)", forHTTPHeaderField: "Authorization")
         }
 
-        Task {
-            do {
-                let (_, response) = try await URLSession.shared.data(for: request)
-                if let httpResponse = response as? HTTPURLResponse, httpResponse.statusCode == 200 {
-                    registrationState = .registered
-                } else {
-                    let statusCode = (response as? HTTPURLResponse)?.statusCode ?? 0
-                    registrationState = .failed
-                    registrationError = "Registration failed (HTTP \(statusCode))."
-                }
-            } catch {
-                registrationState = .failed
-                registrationError = "Could not reach daemon: \(error.localizedDescription)"
+        do {
+            let (_, response) = try await URLSession.shared.data(for: request)
+            if let httpResponse = response as? HTTPURLResponse, httpResponse.statusCode == 200 {
+                return .success(())
+            } else {
+                let statusCode = (response as? HTTPURLResponse)?.statusCode ?? 0
+                return .failure(RegistrationRequestError(message: "Registration failed (HTTP \(statusCode))."))
             }
+        } catch {
+            return .failure(RegistrationRequestError(message: "Could not reach daemon: \(error.localizedDescription)"))
         }
     }
 
@@ -215,7 +305,7 @@ struct PairingQRCodeSheet: View {
             "type": "vellum-daemon",
             "v": 4,
             "id": hostId,
-            "g": gatewayUrl,
+            "g": effectiveGatewayUrl,
             "pairingRequestId": pairingRequestId,
             "pairingSecret": pairingSecret,
         ]
diff --git a/clients/macos/vellum-assistant/Features/Settings/SettingsAdvancedTab.swift b/clients/macos/vellum-assistant/Features/Settings/SettingsAdvancedTab.swift
index 7d47e42bf57..6b492e01cc4 100644
--- a/clients/macos/vellum-assistant/Features/Settings/SettingsAdvancedTab.swift
+++ b/clients/macos/vellum-assistant/Features/Settings/SettingsAdvancedTab.swift
@@ -19,11 +19,12 @@ struct SettingsAdvancedTab: View {
     @State private var remoteIdentity: RemoteIdentityInfo?
     @State private var flagStates: [(flag: FeatureFlag, enabled: Bool)] = []
 
-    #if DEBUG
+    @State private var devModeTapCount: Int = 0
+    @State private var devModeMessage: String?
+
     @State private var showingEnvVars = false
     @State private var appEnvVars: [(String, String)] = []
     @State private var daemonEnvVars: [(String, String)] = []
-    #endif
 
     var body: some View {
         VStack(alignment: .leading, spacing: VSpacing.xl) {
@@ -35,9 +36,9 @@ struct SettingsAdvancedTab: View {
             hatchNewAssistantSection
             featureFlagSection
 
-            #if DEBUG
-            developerSection
-            #endif
+            if store.isDevMode {
+                developerSection
+            }
         }
         .onAppear {
             lockfileAssistants = LockfileAssistant.loadAll()
@@ -87,14 +88,12 @@ struct SettingsAdvancedTab: View {
             .frame(minWidth: 260)
             .interactiveDismissDisabled()
         }
-        #if DEBUG
         .sheet(isPresented: $showingEnvVars) {
             SettingsPanelEnvVarsSheet(appEnvVars: appEnvVars, daemonEnvVars: daemonEnvVars)
         }
         .onDisappear {
             daemonClient?.onEnvVarsResponse = nil
         }
-        #endif
     }
 
     // MARK: - Assistant Info
@@ -107,11 +106,32 @@ struct SettingsAdvancedTab: View {
 
             if let assistant = lockfileAssistants.first(where: { $0.assistantId == selectedAssistantId }) {
                 infoRow(label: "Assistant ID", value: assistant.assistantId, mono: true)
+                    .onTapGesture {
+                        devModeTapCount += 1
+                        if devModeTapCount >= 7 {
+                            store.toggleDevMode()
+                            devModeTapCount = 0
+                            devModeMessage = store.isDevMode
+                                ? "Dev mode enabled"
+                                : "Dev mode disabled"
+                            Task {
+                                try? await Task.sleep(nanoseconds: 2_000_000_000)
+                                devModeMessage = nil
+                            }
+                        }
+                    }
 
                 let home = assistant.home
                 homeRow(home: home)
             }
 
+            if let message = devModeMessage {
+                Text(message)
+                    .font(VFont.caption)
+                    .foregroundColor(VColor.accent)
+                    .transition(.opacity)
+            }
+
             // Process status (child view observes @Published changes)
             if let daemonClient {
                 DaemonStatusRows(daemonClient: daemonClient)
@@ -347,7 +367,7 @@ struct SettingsAdvancedTab: View {
 
     @ViewBuilder
     private var featureFlagSection: some View {
-        if FeatureFlagManager.shared.isEnabled(.featureFlagEditorEnabled) {
+        if store.isDevMode {
             VStack(alignment: .leading, spacing: VSpacing.md) {
                 Text("Feature Flags")
                     .font(VFont.sectionTitle)
@@ -371,9 +391,8 @@ struct SettingsAdvancedTab: View {
         }
     }
 
-    // MARK: - Developer (Debug Only)
+    // MARK: - Developer
 
-    #if DEBUG
     @ViewBuilder
     private var developerSection: some View {
         if daemonClient != nil {
@@ -413,7 +432,6 @@ struct SettingsAdvancedTab: View {
             .vCard(background: VColor.surfaceSubtle)
         }
     }
-    #endif
 }
 
 // MARK: - Daemon Status Rows
diff --git a/clients/macos/vellum-assistant/Features/Settings/SettingsAppearanceTab.swift b/clients/macos/vellum-assistant/Features/Settings/SettingsAppearanceTab.swift
index bbe8e43bfd9..cff9ac333dd 100644
--- a/clients/macos/vellum-assistant/Features/Settings/SettingsAppearanceTab.swift
+++ b/clients/macos/vellum-assistant/Features/Settings/SettingsAppearanceTab.swift
@@ -6,7 +6,7 @@ struct SettingsAppearanceTab: View {
     @ObservedObject var store: SettingsStore
     @AppStorage("themePreference") private var themePreference: String = "system"
     @State private var newAllowlistDomain = ""
-    @State private var isRecordingShortcut = false
+    @State private var isRecordingGlobalHotkey = false
     @State private var shortcutMonitor: Any?
     @State private var shortcutConflictWarning: String?
 
@@ -48,13 +48,13 @@ struct SettingsAppearanceTab: View {
                     .font(VFont.sectionTitle)
                     .foregroundColor(VColor.textPrimary)
 
-                // Quick Chat (configurable)
+                // Open Vellum (configurable)
                 HStack {
-                    Text("Quick Chat")
+                    Text("Open Vellum")
                         .font(VFont.body)
                         .foregroundColor(VColor.textSecondary)
                     Spacer()
-                    Text(ShortcutHelper.displayString(for: store.quickChatShortcut))
+                    Text(ShortcutHelper.displayString(for: store.globalHotkeyShortcut))
                         .font(VFont.mono)
                         .foregroundColor(VColor.textPrimary)
                         .padding(.horizontal, VSpacing.sm)
@@ -66,7 +66,7 @@ struct SettingsAppearanceTab: View {
                                 .stroke(VColor.surfaceBorder, lineWidth: 1)
                         )
 
-                    if isRecordingShortcut {
+                    if isRecordingGlobalHotkey {
                         VButton(label: "Press shortcut...", style: .tertiary) {
                             stopRecording()
                         }
@@ -82,25 +82,6 @@ struct SettingsAppearanceTab: View {
                         .font(VFont.caption)
                         .foregroundColor(VColor.warning)
                 }
-
-                // Open Vellum (fixed, non-editable)
-                HStack {
-                    Text("Open Vellum")
-                        .font(VFont.body)
-                        .foregroundColor(VColor.textSecondary)
-                    Spacer()
-                    Text("\u{2318}\u{21E7}G")
-                        .font(VFont.mono)
-                        .foregroundColor(VColor.textMuted)
-                        .padding(.horizontal, VSpacing.sm)
-                        .padding(.vertical, VSpacing.xs)
-                        .background(VColor.surface)
-                        .clipShape(RoundedRectangle(cornerRadius: VRadius.sm))
-                        .overlay(
-                            RoundedRectangle(cornerRadius: VRadius.sm)
-                                .stroke(VColor.surfaceBorder, lineWidth: 1)
-                        )
-                }
             }
             .padding(VSpacing.lg)
             .vCard(background: VColor.surfaceSubtle)
@@ -190,23 +171,18 @@ struct SettingsAppearanceTab: View {
 
     // MARK: - Shortcut Recording
 
-    /// The fixed "Open Vellum" shortcut tokens for order-independent conflict detection.
-    private static let openVellumShortcutTokens = ShortcutHelper.normalizeShortcut("cmd+shift+g")
-
     private func startRecording() {
-        isRecordingShortcut = true
+        isRecordingGlobalHotkey = true
         shortcutConflictWarning = nil
-        // Use local monitor so we capture key events while the settings window is focused
+
         shortcutMonitor = NSEvent.addLocalMonitorForEvents(matching: .keyDown) { event in
             let mods = event.modifierFlags.intersection(.deviceIndependentFlagsMask)
 
-            // Escape cancels recording without changing the shortcut
             if event.keyCode == 53 {
                 stopRecording()
                 return nil
             }
 
-            // Require at least one modifier key to form a valid global shortcut
             let hasModifier = mods.contains(.command) || mods.contains(.control)
                 || mods.contains(.option)
             guard hasModifier,
@@ -218,22 +194,15 @@ struct SettingsAppearanceTab: View {
                 from: mods, key: chars, keyCode: event.keyCode
             )
 
-            // Check for conflict with the fixed "Open Vellum" shortcut
-            if ShortcutHelper.normalizeShortcut(shortcut) == Self.openVellumShortcutTokens {
-                shortcutConflictWarning = "This shortcut conflicts with the Open Vellum shortcut (\u{2318}\u{21E7}G). Choose a different shortcut."
-                stopRecording()
-                return nil
-            }
-
             shortcutConflictWarning = nil
-            store.quickChatShortcut = shortcut
+            store.globalHotkeyShortcut = shortcut
             stopRecording()
-            return nil // consume the event
+            return nil
         }
     }
 
     private func stopRecording() {
-        isRecordingShortcut = false
+        isRecordingGlobalHotkey = false
         if let monitor = shortcutMonitor {
             NSEvent.removeMonitor(monitor)
             shortcutMonitor = nil
diff --git a/clients/macos/vellum-assistant/Features/Settings/SettingsConnectTab.swift b/clients/macos/vellum-assistant/Features/Settings/SettingsConnectTab.swift
index d0063d09bb0..bbe5d8b3d49 100644
--- a/clients/macos/vellum-assistant/Features/Settings/SettingsConnectTab.swift
+++ b/clients/macos/vellum-assistant/Features/Settings/SettingsConnectTab.swift
@@ -41,6 +41,9 @@ struct SettingsConnectTab: View {
     // Guardian copy state (tracks which channel's command was just copied)
     @State private var guardianCommandCopiedChannel: String?
 
+    // Token regeneration state
+    @State private var isRegeneratingToken: Bool = false
+
     // Override fields for power users / debugging
     @AppStorage(PairingConfiguration.gatewayOverrideKey) private var iosPairingGatewayOverride: String = ""
     @AppStorage(PairingConfiguration.tokenOverrideKey) private var iosPairingTokenOverride: String = ""
@@ -57,6 +60,7 @@ struct SettingsConnectTab: View {
         }
         .onAppear {
             Task { await authManager.checkSession() }
+            if store.isDevMode { Task { await store.checkVellumPlatform() } }
             store.refreshIngressConfig()
             store.refreshAssistantEmail()
             gatewayUrlText = store.ingressPublicBaseUrl
@@ -91,17 +95,37 @@ struct SettingsConnectTab: View {
             Text("This will replace the current bearer token and restart the daemon. Any paired devices will need to reconnect.")
         }
         .sheet(isPresented: $showingPairingQR) {
-            if let client = daemonClient {
-                PairingQRCodeSheet(
-                    gatewayUrl: store.resolvedIosGatewayUrl,
-                    daemonClient: client
-                )
-            }
+            PairingQRCodeSheet(
+                gatewayUrl: store.resolvedIosGatewayUrl,
+                daemonClient: daemonClient
+            )
         }
     }
 
     // MARK: - Vellum Section
 
+    private var platformHealthIcon: String {
+        if store.isCheckingVellumPlatform {
+            return "arrow.trianglehead.2.counterclockwise"
+        }
+        switch store.vellumPlatformReachable {
+        case .some(true): return "checkmark.circle.fill"
+        case .some(false): return "xmark.circle.fill"
+        case .none: return "questionmark.circle"
+        }
+    }
+
+    private var platformHealthIconColor: Color {
+        if store.isCheckingVellumPlatform {
+            return VColor.textMuted
+        }
+        switch store.vellumPlatformReachable {
+        case .some(true): return VColor.success
+        case .some(false): return VColor.error
+        case .none: return VColor.textMuted
+        }
+    }
+
     private var vellumSection: some View {
         VStack(alignment: .leading, spacing: VSpacing.md) {
             Text("Vellum")
@@ -147,15 +171,18 @@ struct SettingsConnectTab: View {
                     .foregroundColor(VColor.error)
             }
 
-            Divider().background(VColor.surfaceBorder)
+            if store.isDevMode {
+                Divider().background(VColor.surfaceBorder)
 
-            channelStatusRow(
-                label: "Platform URL",
-                icon: "link",
-                iconColor: VColor.textMuted,
-                value: AuthService.shared.baseURL,
-                valueFont: VFont.mono
-            )
+                channelStatusRow(
+                    label: "Platform URL",
+                    icon: platformHealthIcon,
+                    iconColor: platformHealthIconColor,
+                    value: AuthService.shared.baseURL,
+                    valueFont: VFont.mono
+                )
+                .help(store.vellumPlatformError ?? "")
+            }
         }
         .padding(VSpacing.lg)
         .frame(maxWidth: .infinity, alignment: .leading)
@@ -1178,10 +1205,12 @@ struct SettingsConnectTab: View {
             VButton(label: "Show QR Code", leftIcon: "qrcode", style: .primary) {
                 showingPairingQR = true
             }
+            .disabled(isRegeneratingToken)
 
             // Status line — use resolvedIosGatewayUrl for gateway (no I/O) and
             // cached bearerToken + override for token (avoids synchronous disk read).
-            let hasGateway = !store.resolvedIosGatewayUrl.isEmpty
+            // LAN pairing works without a cloud gateway URL.
+            let hasGateway = !store.resolvedIosGatewayUrl.isEmpty || LANIPHelper.currentLANAddress() != nil
             let trimmedOverrideToken = iosPairingTokenOverride.trimmingCharacters(in: .whitespacesAndNewlines)
             // Has a usable token — either the daemon file token or a non-empty override.
             let hasToken = !bearerToken.isEmpty || !trimmedOverrideToken.isEmpty
@@ -1190,7 +1219,16 @@ struct SettingsConnectTab: View {
             // is present and no override token has been entered.
             let tokenFromDaemon = !bearerToken.isEmpty && trimmedOverrideToken.isEmpty
 
-            if hasGateway && hasToken {
+            if isRegeneratingToken {
+                // "Restarting daemon..." — spinner while daemon restarts with new token
+                HStack(spacing: VSpacing.sm) {
+                    ProgressView()
+                        .controlSize(.small)
+                    Text("Restarting daemon with new token\u{2026}")
+                        .font(VFont.body)
+                        .foregroundColor(VColor.textSecondary)
+                }
+            } else if hasGateway && hasToken {
                 // "Ready to pair" — green checkmark + subtle regenerate (daemon token only)
                 HStack(spacing: VSpacing.sm) {
                     Image(systemName: "checkmark.circle.fill")
@@ -1454,10 +1492,29 @@ struct SettingsConnectTab: View {
         bearerToken = newToken
         // Kill the daemon so the health monitor restarts it with the new token.
         // The daemon only reads the token at startup, so a restart is required.
+        isRegeneratingToken = true
         let pidPath = resolvePidPath()
         if let pidStr = try? String(contentsOfFile: pidPath, encoding: .utf8).trimmingCharacters(in: .whitespacesAndNewlines),
            let pid = Int32(pidStr) {
             kill(pid, SIGTERM)
         }
+        // Wait for the daemon to restart and become reachable with the new token.
+        Task {
+            let port = ProcessInfo.processInfo.environment["RUNTIME_HTTP_PORT"]
+                .flatMap(Int.init) ?? 7821
+            let url = URL(string: "http://localhost:\(port)/healthz")!
+            var request = URLRequest(url: url)
+            request.setValue("Bearer \(newToken)", forHTTPHeaderField: "Authorization")
+            request.timeoutInterval = 2
+            for _ in 0..<30 { // 30 attempts: 1s sleep plus up to 2s request timeout each
+                try? await Task.sleep(nanoseconds: 1_000_000_000)
+                if let (_, response) = try? await URLSession.shared.data(for: request),
+                   let http = response as? HTTPURLResponse, http.statusCode == 200 {
+                    isRegeneratingToken = false
+                    return
+                }
+            }
+            isRegeneratingToken = false
+        }
     }
 }
diff --git a/clients/macos/vellum-assistant/Features/Settings/SettingsStore.swift b/clients/macos/vellum-assistant/Features/Settings/SettingsStore.swift
index fdf0738fceb..76415ed5483 100644
--- a/clients/macos/vellum-assistant/Features/Settings/SettingsStore.swift
+++ b/clients/macos/vellum-assistant/Features/Settings/SettingsStore.swift
@@ -59,7 +59,7 @@ public final class SettingsStore: ObservableObject {
 
     @Published var maxSteps: Double
     @Published var activityNotificationsEnabled: Bool
-    @Published var quickChatShortcut: String
+    @Published var globalHotkeyShortcut: String
 
     // MARK: - Media Embed Settings
 
@@ -144,6 +144,14 @@ public final class SettingsStore: ObservableObject {
     @Published var gatewayLastChecked: Date?
     @Published var isCheckingGateway: Bool = false
 
+    @Published var vellumPlatformReachable: Bool?
+    @Published var vellumPlatformError: String?
+    @Published var isCheckingVellumPlatform: Bool = false
+
+    // MARK: - Dev Mode
+
+    @Published var isDevMode: Bool
+
     // MARK: - Trust Rules Coordination
 
     /// Whether any settings surface currently has a trust rules sheet open.
@@ -244,7 +252,13 @@ public final class SettingsStore: ObservableObject {
         // Default to enabled for notifications
         self.activityNotificationsEnabled = UserDefaults.standard.object(forKey: "activityNotificationsEnabled") as? Bool ?? true
 
-        self.quickChatShortcut = UserDefaults.standard.string(forKey: "quickChatShortcut") ?? "cmd+shift+space"
+        self.globalHotkeyShortcut = UserDefaults.standard.string(forKey: "globalHotkeyShortcut") ?? "cmd+shift+g"
+
+        #if DEBUG
+        self.isDevMode = UserDefaults.standard.object(forKey: "devModeEnabled") as? Bool ?? true
+        #else
+        self.isDevMode = UserDefaults.standard.bool(forKey: "devModeEnabled")
+        #endif
 
         // Load media embed settings from workspace config
         let mediaSettings = Self.loadMediaEmbedSettings(from: configPath)
@@ -284,9 +298,14 @@ public final class SettingsStore: ObservableObject {
             .store(in: &cancellables)
 
         // Persist shortcut changes immediately so the hotkey re-registers without delay
-        $quickChatShortcut
+        $globalHotkeyShortcut
             .dropFirst()
-            .sink { value in UserDefaults.standard.set(value, forKey: "quickChatShortcut") }
+            .sink { value in UserDefaults.standard.set(value, forKey: "globalHotkeyShortcut") }
+            .store(in: &cancellables)
+
+        $isDevMode
+            .dropFirst()
+            .sink { value in UserDefaults.standard.set(value, forKey: "devModeEnabled") }
             .store(in: &cancellables)
 
         // Mirror DaemonClient's trust-rules-open flag so views can disable their buttons
@@ -1229,6 +1248,37 @@ public final class SettingsStore: ObservableObject {
         }
     }
 
+    // MARK: - Platform Health Check
+
+    func checkVellumPlatform() async {
+        isCheckingVellumPlatform = true
+        defer { isCheckingVellumPlatform = false }
+
+        let baseUrl = AuthService.shared.baseURL
+        let normalized = baseUrl.hasSuffix("/") ? String(baseUrl.dropLast()) : baseUrl
+        guard let url = URL(string: "\(normalized)/healthz") else {
+            vellumPlatformReachable = false
+            vellumPlatformError = "Invalid URL"
+            return
+        }
+        var request = URLRequest(url: url)
+        request.timeoutInterval = 5
+        do {
+            let (_, response) = try await URLSession.shared.data(for: request)
+            if let http = response as? HTTPURLResponse, (200..<300).contains(http.statusCode) {
+                vellumPlatformReachable = true
+                vellumPlatformError = nil
+            } else {
+                let code = (response as? HTTPURLResponse)?.statusCode ?? 0
+                vellumPlatformReachable = false
+                vellumPlatformError = "HTTP \(code)"
+            }
+        } catch {
+            vellumPlatformReachable = false
+            vellumPlatformError = error.localizedDescription
+        }
+    }
+
     // MARK: - Approved Devices
 
     @Published var approvedDevices: [ApprovedDevicesListResponseMessage.Device] = []
@@ -1243,18 +1293,24 @@ public final class SettingsStore: ObservableObject {
 
     func removeApprovedDevice(hashedDeviceId: String) {
         guard let daemonClient else { return }
-        daemonClient.onApprovedDeviceRemoveResponse = { [weak self] msg in
-            if msg.success {
-                self?.approvedDevices.removeAll { $0.hashedDeviceId == hashedDeviceId }
-            }
+        let removed = approvedDevices.filter { $0.hashedDeviceId == hashedDeviceId }
+        approvedDevices.removeAll { $0.hashedDeviceId == hashedDeviceId }
+        do {
+            try daemonClient.sendApprovedDeviceRemove(hashedDeviceId: hashedDeviceId)
+        } catch {
+            // IPC failed: restore the optimistically removed devices (appended, so order may change)
+            approvedDevices.append(contentsOf: removed)
         }
-        try? daemonClient.sendApprovedDeviceRemove(hashedDeviceId: hashedDeviceId)
     }
 
     func clearAllApprovedDevices() {
         guard let daemonClient else { return }
-        try? daemonClient.sendApprovedDevicesClear()
-        approvedDevices = []
+        do {
+            try daemonClient.sendApprovedDevicesClear()
+            approvedDevices = []
+        } catch {
+            // IPC failed — don't clear local state
+        }
     }
 
     // MARK: - Override Resolution
@@ -1275,6 +1331,12 @@ public final class SettingsStore: ObservableObject {
         return "http://\(ip):7830"
     }
 
+    // MARK: - Dev Mode Actions
+
+    func toggleDevMode() {
+        isDevMode.toggle()
+    }
+
     // MARK: - Model Actions
 
     func setModel(_ model: String) {
diff --git a/clients/macos/vellum-assistant/Features/Settings/WakeWordSettingsView.swift b/clients/macos/vellum-assistant/Features/Settings/WakeWordSettingsView.swift
index 44b622e0b0e..927e85de852 100644
--- a/clients/macos/vellum-assistant/Features/Settings/WakeWordSettingsView.swift
+++ b/clients/macos/vellum-assistant/Features/Settings/WakeWordSettingsView.swift
@@ -1,4 +1,5 @@
 import SwiftUI
+import VellumAssistantShared
 
 /// Wake word settings tab — enable/disable wake word listening,
 /// configure Picovoice access key, sensitivity, and conversation timeout.
@@ -6,6 +7,7 @@ struct WakeWordSettingsView: View {
     @AppStorage("wakeWordEnabled") private var wakeWordEnabled: Bool = false
     @AppStorage("wakeWordSensitivity") private var wakeWordSensitivity: Double = 0.5
     @AppStorage("wakeWordTimeoutSeconds") private var wakeWordTimeoutSeconds: Int = 30
+    @AppStorage("wakeWordKeyword") private var wakeWordKeyword: String = "computer"
 
     @State private var picovoiceKeyText: String = ""
 
@@ -13,6 +15,7 @@ struct WakeWordSettingsView: View {
         VStack(alignment: .leading, spacing: VSpacing.xl) {
             statusSection
             enableSection
+            keywordSection
             accessKeySection
             sensitivitySection
             timeoutSection
@@ -30,7 +33,7 @@ struct WakeWordSettingsView: View {
                 .font(.system(size: 14))
                 .foregroundColor(wakeWordEnabled ? VColor.success : VColor.textMuted)
 
-            Text(wakeWordEnabled ? "Listening for \"hey vellum\"" : "Wake word disabled")
+            Text(wakeWordEnabled ? "Listening for \"\(wakeWordKeyword)\"" : "Wake word disabled")
                 .font(VFont.body)
                 .foregroundColor(wakeWordEnabled ? VColor.textPrimary : VColor.textSecondary)
 
@@ -53,7 +56,7 @@ struct WakeWordSettingsView: View {
                     Text("Enable wake word listening")
                         .font(VFont.body)
                         .foregroundColor(VColor.textSecondary)
-                    Text("Activate the assistant by saying \"hey vellum\" instead of using a keyboard shortcut.")
+                    Text("Activate the assistant by saying the wake word instead of using a keyboard shortcut.")
                         .font(VFont.caption)
                         .foregroundColor(VColor.textMuted)
                 }
@@ -68,6 +71,45 @@ struct WakeWordSettingsView: View {
         .vCard(background: VColor.surfaceSubtle)
     }
 
+    // MARK: - Keyword
+
+    private var keywordSection: some View {
+        VStack(alignment: .leading, spacing: VSpacing.md) {
+            Text("Keyword")
+                .font(VFont.sectionTitle)
+                .foregroundColor(VColor.textPrimary)
+
+            HStack {
+                Text("Keyword")
+                    .font(VFont.body)
+                    .foregroundColor(VColor.textSecondary)
+                Spacer()
+                Picker("", selection: $wakeWordKeyword) {
+                    Text("Computer").tag("computer")
+                    Text("Jarvis").tag("jarvis")
+                    Text("Alexa").tag("alexa")
+                    Text("Hey Siri").tag("hey siri")
+                    Text("Picovoice").tag("picovoice")
+                    Text("Porcupine").tag("porcupine")
+                    Text("Terminator").tag("terminator")
+                    Text("Bumblebee").tag("bumblebee")
+                    Text("Blueberry").tag("blueberry")
+                    Text("Grapefruit").tag("grapefruit")
+                    Text("Grasshopper").tag("grasshopper")
+                }
+                .pickerStyle(.menu)
+                .frame(width: 160)
+                .accessibilityLabel("Wake word keyword")
+            }
+
+            Text("The keyword that triggers voice activation. Restart wake word listening for a change to take effect.")
+                .font(VFont.caption)
+                .foregroundColor(VColor.textMuted)
+        }
+        .padding(VSpacing.lg)
+        .vCard(background: VColor.surfaceSubtle)
+    }
+
     // MARK: - Access Key
 
     private var accessKeySection: some View {
diff --git a/clients/macos/vellum-assistant/Features/Voice/WakeWord/AlwaysOnAudioMonitor.swift b/clients/macos/vellum-assistant/Features/Voice/WakeWord/AlwaysOnAudioMonitor.swift
index 791f86a1918..7c2927ecb1c 100644
--- a/clients/macos/vellum-assistant/Features/Voice/WakeWord/AlwaysOnAudioMonitor.swift
+++ b/clients/macos/vellum-assistant/Features/Voice/WakeWord/AlwaysOnAudioMonitor.swift
@@ -96,18 +96,50 @@ final class AlwaysOnAudioMonitor: ObservableObject {
             throw AudioMonitorError.noInputChannels
         }
 
-        // Install a tap using the hardware format for low-latency capture.
-        // The buffer is available for the WakeWordEngine to consume via its
-        // own internal processing (the engine's start() primes it for detection).
+        // Porcupine requires 16 kHz mono Int16 PCM. Tap in the hardware format (avoids
+        // runtime assertions on some Macs), resample to 16 kHz Float32, then convert to Int16.
+        let targetFormat = AVAudioFormat(
+            commonFormat: .pcmFormatFloat32,
+            sampleRate: 16000,
+            channels: 1,
+            interleaved: false
+        )!
+
+        guard let converter = AVAudioConverter(from: hwFormat, to: targetFormat) else {
+            throw AudioMonitorError.converterCreationFailed
+        }
+
         inputNode.installTap(
             onBus: 0,
             bufferSize: Self.bufferSize,
             format: hwFormat
         ) { [weak self] buffer, _ in
             guard let self else { return }
-            guard let floatData = buffer.floatChannelData else { return }
-            let frameLength = Int(buffer.frameLength)
-            // Convert Float32 PCM to Int16 for Porcupine
+
+            // Resample hardware audio to 16kHz mono
+            let frameCapacity = AVAudioFrameCount(
+                ceil(Double(buffer.frameLength) * targetFormat.sampleRate / hwFormat.sampleRate)
+            )
+            guard let convertedBuffer = AVAudioPCMBuffer(
+                pcmFormat: targetFormat,
+                frameCapacity: frameCapacity
+            ) else { return }
+
+            var error: NSError?
+            var inputConsumed = false
+            let status = converter.convert(to: convertedBuffer, error: &error) { _, outStatus in
+                if inputConsumed {
+                    outStatus.pointee = .noDataNow
+                    return nil
+                }
+                inputConsumed = true
+                outStatus.pointee = .haveData
+                return buffer
+            }
+            guard status == .haveData || status == .inputRanDry else { return }
+
+            guard let floatData = convertedBuffer.floatChannelData else { return }
+            let frameLength = Int(convertedBuffer.frameLength)
             var int16Samples = [Int16](repeating: 0, count: frameLength)
             for i in 0..<frameLength {
                 let sample = max(-1.0, min(1.0, floatData[0][i]))
@@ -186,11 +218,14 @@ final class AlwaysOnAudioMonitor: ObservableObject {
 
 enum AudioMonitorError: LocalizedError {
     case noInputChannels
+    case converterCreationFailed
 
     var errorDescription: String? {
         switch self {
         case .noInputChannels:
             return "No audio input channels available — microphone may not be connected or permitted"
+        case .converterCreationFailed:
+            return "Failed to create audio format converter for 16kHz resampling"
         }
     }
 }
diff --git a/clients/macos/vellum-assistant/Features/Voice/WakeWord/PorcupineBinding.swift b/clients/macos/vellum-assistant/Features/Voice/WakeWord/PorcupineBinding.swift
new file mode 100644
index 00000000000..54ec638f6ae
--- /dev/null
+++ b/clients/macos/vellum-assistant/Features/Voice/WakeWord/PorcupineBinding.swift
@@ -0,0 +1,322 @@
+import Foundation
+import os
+
+/// Errors from the Porcupine C library binding.
+enum PorcupineBindingError: Error, CustomStringConvertible {
+    case loadFailed(String)
+    case symbolNotFound(String)
+    case outOfMemory(String, [String])
+    case ioError(String, [String])
+    case invalidArgument(String, [String])
+    case stopIteration(String, [String])
+    case keyError(String, [String])
+    case invalidState(String, [String])
+    case runtimeError(String, [String])
+    case activationError(String, [String])
+    case activationLimitReached(String, [String])
+    case activationThrottled(String, [String])
+    case activationRefused(String, [String])
+    case unknownError(Int32, String, [String])
+
+    var description: String {
+        switch self {
+        case .loadFailed(let msg):
+            return "PorcupineBinding load failed: \(msg)"
+        case .symbolNotFound(let sym):
+            return "PorcupineBinding symbol not found: \(sym)"
+        case .outOfMemory(let msg, let stack):
+            return "Porcupine out of memory: \(msg)\(formatStack(stack))"
+        case .ioError(let msg, let stack):
+            return "Porcupine IO error: \(msg)\(formatStack(stack))"
+        case .invalidArgument(let msg, let stack):
+            return "Porcupine invalid argument: \(msg)\(formatStack(stack))"
+        case .stopIteration(let msg, let stack):
+            return "Porcupine stop iteration: \(msg)\(formatStack(stack))"
+        case .keyError(let msg, let stack):
+            return "Porcupine key error: \(msg)\(formatStack(stack))"
+        case .invalidState(let msg, let stack):
+            return "Porcupine invalid state: \(msg)\(formatStack(stack))"
+        case .runtimeError(let msg, let stack):
+            return "Porcupine runtime error: \(msg)\(formatStack(stack))"
+        case .activationError(let msg, let stack):
+            return "Porcupine activation error: \(msg)\(formatStack(stack))"
+        case .activationLimitReached(let msg, let stack):
+            return "Porcupine activation limit reached: \(msg)\(formatStack(stack))"
+        case .activationThrottled(let msg, let stack):
+            return "Porcupine activation throttled: \(msg)\(formatStack(stack))"
+        case .activationRefused(let msg, let stack):
+            return "Porcupine activation refused: \(msg)\(formatStack(stack))"
+        case .unknownError(let code, let msg, let stack):
+            return "Porcupine unknown error (\(code)): \(msg)\(formatStack(stack))"
+        }
+    }
+
+    private func formatStack(_ stack: [String]) -> String {
+        guard !stack.isEmpty else { return "" }
+        return " | Error stack: " + stack.joined(separator: " -> ")
+    }
+}
+
+// MARK: - Function pointer typedefs
+
+private typealias PvPorcupineInitFunc = @convention(c) (
+    UnsafePointer<CChar>?,   // access_key
+    UnsafePointer<CChar>?,   // model_path
+    Int32,                    // num_keywords
+    UnsafeMutablePointer<UnsafePointer<CChar>?>?,  // keyword_paths
+    UnsafePointer<Float>?,   // sensitivities
+    UnsafeMutablePointer<OpaquePointer?>?           // object (out)
+) -> Int32
+
+private typealias PvPorcupineDeleteFunc = @convention(c) (
+    OpaquePointer?           // object
+) -> Void
+
+private typealias PvPorcupineProcessFunc = @convention(c) (
+    OpaquePointer?,          // object
+    UnsafePointer<Int16>?,   // pcm
+    UnsafeMutablePointer<Int32>?  // keyword_index (out)
+) -> Int32
+
+private typealias PvPorcupineFrameLengthFunc = @convention(c) () -> Int32
+
+private typealias PvSampleRateFunc = @convention(c) () -> Int32
+
+private typealias PvPorcupineVersionFunc = @convention(c) () -> UnsafePointer<CChar>?
+
+private typealias PvGetErrorStackFunc = @convention(c) (
+    UnsafeMutablePointer<UnsafeMutablePointer<UnsafeMutablePointer<CChar>?>?>?,  // message_stack (out)
+    UnsafeMutablePointer<Int32>?  // message_stack_depth (out)
+) -> Int32
+
+private typealias PvFreeErrorStackFunc = @convention(c) (
+    UnsafeMutablePointer<UnsafeMutablePointer<CChar>?>?  // message_stack
+) -> Void
+
+private typealias PvStatusToStringFunc = @convention(c) (
+    Int32  // status
+) -> UnsafePointer<CChar>?
+
+// MARK: - PorcupineBinding
+
+/// Swift wrapper around Porcupine's C API, loaded via `dlopen`/`dlsym`.
+///
+/// Loads `libpv_porcupine.dylib` from `Bundle.main.privateFrameworksPath` and
+/// resolves all function pointers at init time. Exposes a Swift-friendly
+/// interface for wake word detection.
+final class PorcupineBinding {
+
+    private static let logger = Logger(
+        subsystem: "com.vellum.vellum-assistant",
+        category: "PorcupineBinding"
+    )
+
+    // MARK: - Library handle & function pointers
+
+    private let libraryHandle: UnsafeMutableRawPointer
+    private let pvPorcupineInit: PvPorcupineInitFunc
+    private let pvPorcupineDelete: PvPorcupineDeleteFunc
+    private let pvPorcupineProcess: PvPorcupineProcessFunc
+    private let pvPorcupineFrameLength: PvPorcupineFrameLengthFunc
+    private let pvSampleRate: PvSampleRateFunc
+    private let pvPorcupineVersion: PvPorcupineVersionFunc
+    private let pvGetErrorStack: PvGetErrorStackFunc
+    private let pvFreeErrorStack: PvFreeErrorStackFunc
+    private let pvStatusToString: PvStatusToStringFunc
+
+    /// Opaque handle returned by `pv_porcupine_init`.
+    private var handle: OpaquePointer?
+
+    // MARK: - Init
+
+    /// Load `libpv_porcupine.dylib` from the given path and resolve all C symbols.
+    ///
+    /// - Parameter dylibPath: Absolute path to `libpv_porcupine.dylib`.
+    /// - Throws: `PorcupineBindingError.loadFailed` if `dlopen` fails,
+    ///           `PorcupineBindingError.symbolNotFound` if any symbol is missing.
+    init(dylibPath: String) throws {
+        guard let lib = dlopen(dylibPath, RTLD_NOW) else {
+            let err = dlerror().map { String(cString: $0) } ?? "unknown error"
+            throw PorcupineBindingError.loadFailed("dlopen failed for \(dylibPath): \(err)")
+        }
+        self.libraryHandle = lib
+
+        func resolve<T>(_ name: String) throws -> T {
+            guard let sym = dlsym(lib, name) else {
+                dlclose(lib)
+                throw PorcupineBindingError.symbolNotFound(name)
+            }
+            return unsafeBitCast(sym, to: T.self)
+        }
+
+        self.pvPorcupineInit = try resolve("pv_porcupine_init")
+        self.pvPorcupineDelete = try resolve("pv_porcupine_delete")
+        self.pvPorcupineProcess = try resolve("pv_porcupine_process")
+        self.pvPorcupineFrameLength = try resolve("pv_porcupine_frame_length")
+        self.pvSampleRate = try resolve("pv_sample_rate")
+        self.pvPorcupineVersion = try resolve("pv_porcupine_version")
+        self.pvGetErrorStack = try resolve("pv_get_error_stack")
+        self.pvFreeErrorStack = try resolve("pv_free_error_stack")
+        self.pvStatusToString = try resolve("pv_status_to_string")
+
+        // Smoke test: verify the dylib loaded correctly by reading its version
+        let ver = version
+        Self.logger.info("Loaded Porcupine dylib version \(ver) from \(dylibPath)")
+    }
+
+    deinit {
+        delete()
+        dlclose(libraryHandle)
+    }
+
+    // MARK: - Public interface
+
+    /// Initialize the Porcupine engine with the given parameters.
+    ///
+    /// - Parameters:
+    ///   - accessKey: Picovoice access key.
+    ///   - modelPath: Absolute path to the model `.pv` file.
+    ///   - keywordPaths: Absolute paths to keyword `.ppn` files.
+    ///   - sensitivities: Detection sensitivities in [0, 1], one per keyword.
+    /// - Throws: `PorcupineBindingError` on failure.
+    func initialize(
+        accessKey: String,
+        modelPath: String,
+        keywordPaths: [String],
+        sensitivities: [Float]
+    ) throws {
+        guard keywordPaths.count == sensitivities.count else {
+            throw PorcupineBindingError.invalidArgument(
+                "Number of keyword paths (\(keywordPaths.count)) does not match number of sensitivities (\(sensitivities.count))",
+                []
+            )
+        }
+
+        // Build a C-compatible array of keyword path strings using strdup
+        // (same pattern as the iOS binding)
+        var cKeywordPaths = keywordPaths.map { UnsafePointer<CChar>(strdup($0)) }
+        defer { cKeywordPaths.forEach { free(UnsafeMutablePointer(mutating: $0)) } }
+
+        var porcupineHandle: OpaquePointer?
+        let status = pvPorcupineInit(
+            accessKey,
+            modelPath,
+            Int32(keywordPaths.count),
+            &cKeywordPaths,
+            sensitivities,
+            &porcupineHandle
+        )
+
+        if status != 0 {
+            let messageStack = getErrorStack()
+            throw mapStatus(status, message: "pv_porcupine_init failed", stack: messageStack)
+        }
+
+        // Release any previously-initialized engine before overwriting the handle
+        delete()
+
+        self.handle = porcupineHandle
+        Self.logger.info("Porcupine engine initialized with \(keywordPaths.count) keyword(s)")
+    }
+
+    /// Process one frame of 16-bit PCM audio.
+    ///
+    /// - Parameter pcm: Audio samples; length must equal `frameLength`.
+    /// - Returns: Index of detected keyword (0-based), or -1 if none detected.
+    /// - Throws: `PorcupineBindingError` on failure.
+    func process(pcm: [Int16]) throws -> Int32 {
+        guard let handle = self.handle else {
+            throw PorcupineBindingError.invalidState("Porcupine not initialized", [])
+        }
+
+        guard pcm.count == Int(frameLength) else {
+            throw PorcupineBindingError.invalidArgument(
+                "PCM frame must contain exactly \(frameLength) samples, got \(pcm.count)",
+                []
+            )
+        }
+
+        var keywordIndex: Int32 = -1
+        let status = pvPorcupineProcess(handle, pcm, &keywordIndex)
+
+        if status != 0 {
+            let messageStack = getErrorStack()
+            throw mapStatus(status, message: "pv_porcupine_process failed", stack: messageStack)
+        }
+
+        return keywordIndex
+    }
+
+    /// Release the Porcupine engine. Safe to call multiple times.
+    func delete() {
+        if let handle = self.handle {
+            pvPorcupineDelete(handle)
+            self.handle = nil
+            Self.logger.info("Porcupine engine deleted")
+        }
+    }
+
+    // MARK: - Computed properties
+
+    /// Number of audio samples per frame expected by `process(pcm:)`.
+    var frameLength: Int32 {
+        pvPorcupineFrameLength()
+    }
+
+    /// Audio sample rate expected by the engine (typically 16000).
+    var sampleRate: Int32 {
+        pvSampleRate()
+    }
+
+    /// Porcupine library version string.
+    var version: String {
+        guard let cStr = pvPorcupineVersion() else { return "unknown" }
+        return String(cString: cStr)
+    }
+
+    // MARK: - Error handling
+
+    /// Retrieve the error message stack from Porcupine after a failed call.
+    private func getErrorStack() -> [String] {
+        var messageStackRef: UnsafeMutablePointer<UnsafeMutablePointer<CChar>?>?
+        var messageStackDepth: Int32 = 0
+        let status = pvGetErrorStack(&messageStackRef, &messageStackDepth)
+
+        guard status == 0, let stackPtr = messageStackRef else {
+            return []
+        }
+
+        var messages: [String] = []
+        for i in 0..<Int(messageStackDepth) {
+            if let msgPtr = stackPtr.advanced(by: i).pointee {
+                messages.append(String(cString: msgPtr))
+            }
+        }
+
+        pvFreeErrorStack(messageStackRef)
+        return messages
+    }
+
+    /// Map a `pv_status_t` integer to a `PorcupineBindingError`.
+    private func mapStatus(_ status: Int32, message: String, stack: [String]) -> PorcupineBindingError {
+        switch status {
+        case 1:  return .outOfMemory(message, stack)
+        case 2:  return .ioError(message, stack)
+        case 3:  return .invalidArgument(message, stack)
+        case 4:  return .stopIteration(message, stack)
+        case 5:  return .keyError(message, stack)
+        case 6:  return .invalidState(message, stack)
+        case 7:  return .runtimeError(message, stack)
+        case 8:  return .activationError(message, stack)
+        case 9:  return .activationLimitReached(message, stack)
+        case 10: return .activationThrottled(message, stack)
+        case 11: return .activationRefused(message, stack)
+        default:
+            var statusName = "unknown"
+            if let cStr = pvStatusToString(status) {
+                statusName = String(cString: cStr)
+            }
+            return .unknownError(status, "\(statusName): \(message)", stack)
+        }
+    }
+}
diff --git a/clients/macos/vellum-assistant/Features/Voice/WakeWord/PorcupineWakeWordEngine.swift b/clients/macos/vellum-assistant/Features/Voice/WakeWord/PorcupineWakeWordEngine.swift
index 113a79c9718..34a17ddaff9 100644
--- a/clients/macos/vellum-assistant/Features/Voice/WakeWord/PorcupineWakeWordEngine.swift
+++ b/clients/macos/vellum-assistant/Features/Voice/WakeWord/PorcupineWakeWordEngine.swift
@@ -3,12 +3,12 @@ import os
 
 private let log = Logger(subsystem: "com.vellum.vellum-assistant", category: "PorcupineWakeWordEngine")
 
-/// Placeholder wake word engine backed by Porcupine.
+/// Wake word engine backed by Porcupine's C SDK via `PorcupineBinding`.
 ///
-/// Currently a stub that conforms to `WakeWordEngine` so that
-/// `AlwaysOnAudioMonitor` and `WakeWordCoordinator` can be wired
-/// end-to-end. Swap in real Porcupine SDK calls when the dependency
-/// is integrated.
+/// Loads `libpv_porcupine.dylib` at runtime, resolves model and keyword
+/// files from the app bundle, and processes 16 kHz Int16 PCM audio in
+/// 512-sample frames. Thread-safe: `start()` and `stop()` are called from
+/// the main thread; `processAudioFrame(_:)` runs on the audio thread.
 final class PorcupineWakeWordEngine: WakeWordEngine {
 
     var onWakeWordDetected: ((Float) -> Void)?
@@ -18,24 +18,157 @@ final class PorcupineWakeWordEngine: WakeWordEngine {
     /// Detection sensitivity (0.0 = least sensitive, 1.0 = most sensitive).
     let sensitivity: Float
 
-    init(sensitivity: Float = 0.5) {
+    /// Built-in keyword name (e.g. "computer") or absolute path to a custom .ppn file.
+    let keyword: String
+
+    private var binding: PorcupineBinding?
+    private var frameBuffer: [Int16] = []
+    private var frameLength: Int = 512
+
+    /// Guards `binding` and `frameBuffer` for thread safety between
+    /// the main thread (`start`/`stop`) and the audio thread (`processAudioFrame`).
+    private var lock = os_unfair_lock()
+
+    /// Whether an error has already been logged during `processAudioFrame` to
+    /// avoid flooding the log on every frame.
+    private var hasLoggedProcessError = false
+
+    init(sensitivity: Float = 0.5, keyword: String = "computer") {
         self.sensitivity = sensitivity
+        self.keyword = keyword
     }
 
+    // MARK: - WakeWordEngine
+
     func start() throws {
         guard !isRunning else { return }
+
+        // 1. Access key
+        guard let accessKey = APIKeyManager.getKey(for: "picovoice") else {
+            log.warning("Picovoice access key not found in keychain — wake word detection disabled")
+            return
+        }
+
+        // 2. Dylib path
+        guard let frameworksPath = Bundle.main.privateFrameworksPath else {
+            log.warning("Bundle.main.privateFrameworksPath is nil — wake word detection disabled")
+            return
+        }
+        let dylibPath = (frameworksPath as NSString).appendingPathComponent("libpv_porcupine.dylib")
+        guard FileManager.default.fileExists(atPath: dylibPath) else {
+            log.warning("libpv_porcupine.dylib not found at \(dylibPath) — wake word detection disabled")
+            return
+        }
+
+        // 3. Create binding (loads dylib, resolves symbols)
+        let newBinding: PorcupineBinding
+        do {
+            newBinding = try PorcupineBinding(dylibPath: dylibPath)
+        } catch {
+            log.error("Failed to load PorcupineBinding: \(error)")
+            return
+        }
+
+        // 4. Model path
+        guard let resourceURL = Bundle.main.resourceURL else {
+            log.error("Bundle.main.resourceURL is nil — cannot locate Porcupine model")
+            return
+        }
+        let modelPath = resourceURL.appendingPathComponent("porcupine_params.pv").path
+        guard FileManager.default.fileExists(atPath: modelPath) else {
+            log.error("Porcupine model not found at \(modelPath)")
+            return
+        }
+
+        // 5. Keyword path
+        let keywordPath: String
+        let keywordDir = resourceURL.appendingPathComponent("porcupine-keywords")
+        let builtinPath = keywordDir.appendingPathComponent(self.keyword.lowercased() + "_mac.ppn").path
+        if FileManager.default.fileExists(atPath: builtinPath) {
+            keywordPath = builtinPath
+        } else if self.keyword.hasPrefix("/") && FileManager.default.fileExists(atPath: self.keyword) {
+            // Treat keyword as an absolute path to a custom .ppn file
+            keywordPath = self.keyword
+        } else {
+            log.error("Keyword file not found: tried \(builtinPath) and absolute path \(self.keyword)")
+            return
+        }
+
+        // 6. Initialize Porcupine engine
+        do {
+            try newBinding.initialize(
+                accessKey: accessKey,
+                modelPath: modelPath,
+                keywordPaths: [keywordPath],
+                sensitivities: [self.sensitivity]
+            )
+        } catch {
+            log.error("Failed to initialize Porcupine engine: \(error)")
+            return
+        }
+
+        // 7. Query actual frame length from binding
+        let actualFrameLength = Int(newBinding.frameLength)
+
+        // 8. Commit state
+        withLock {
+            self.binding = newBinding
+            self.frameBuffer = []
+            self.frameLength = actualFrameLength
+            self.hasLoggedProcessError = false
+        }
         isRunning = true
-        log.info("PorcupineWakeWordEngine started (sensitivity: \(self.sensitivity))")
+        log.info("PorcupineWakeWordEngine started (keyword: \(self.keyword), sensitivity: \(self.sensitivity), version: \(newBinding.version))")
     }
 
     func stop() {
         guard isRunning else { return }
+        withLock {
+            binding?.delete()
+            binding = nil
+            frameBuffer = []
+        }
         isRunning = false
         log.info("PorcupineWakeWordEngine stopped")
     }
 
-    /// Feed a buffer of 16-bit PCM audio samples for wake word detection.
-    func process(pcm: [Int16]) {
-        // Stub — real implementation will call Porcupine's process() here
+    // MARK: - Audio processing (audio thread)
+
+    func processAudioFrame(_ frame: [Int16]) {
+        var shouldNotify = false
+        withLock {
+            guard binding != nil else { return }
+            frameBuffer.append(contentsOf: frame)
+
+            while frameBuffer.count >= frameLength {
+                let chunk = Array(frameBuffer.prefix(frameLength))
+                frameBuffer.removeFirst(frameLength)
+
+                do {
+                    let keywordIndex = try binding!.process(pcm: chunk)
+                    if keywordIndex >= 0 {
+                        shouldNotify = true
+                    }
+                } catch {
+                    if !hasLoggedProcessError {
+                        hasLoggedProcessError = true
+                        log.error("Porcupine process error (further errors suppressed): \(error)")
+                    }
+                    // Stop processing further frames this call
+                    return
+                }
+            }
+        }
+        if shouldNotify {
+            onWakeWordDetected?(1.0) // binding.process returns only a keyword index, so report full confidence
+        }
+    }
+
+    // MARK: - Lock helpers
+
+    private func withLock<T>(_ body: () -> T) -> T {
+        os_unfair_lock_lock(&lock)
+        defer { os_unfair_lock_unlock(&lock) }
+        return body()
     }
 }
diff --git a/clients/macos/vellum-assistant/Features/Voice/WakeWord/WakeWordCoordinator.swift b/clients/macos/vellum-assistant/Features/Voice/WakeWord/WakeWordCoordinator.swift
index e458f5cca80..c3d3f396aba 100644
--- a/clients/macos/vellum-assistant/Features/Voice/WakeWord/WakeWordCoordinator.swift
+++ b/clients/macos/vellum-assistant/Features/Voice/WakeWord/WakeWordCoordinator.swift
@@ -1,5 +1,6 @@
 import Foundation
 import Combine
+import VellumAssistantShared
 import os
 
 private let log = Logger(subsystem: "com.vellum.vellum-assistant", category: "WakeWordCoordinator")
@@ -77,7 +78,7 @@ final class WakeWordCoordinator: ObservableObject {
 
         // Ignore if voice mode is already active
         guard voiceModeManager.state == .off else {
-            log.info("Wake word ignored — voice mode already active (state: \(String(describing: voiceModeManager.state)))")
+            log.info("Wake word ignored — voice mode already active (state: \(String(describing: self.voiceModeManager.state)))")
             return
         }
 
@@ -88,7 +89,6 @@ final class WakeWordCoordinator: ObservableObject {
         }
 
         log.info("Wake word detected — activating voice mode")
-        activatedViaWakeWord = true
 
         // 1. Play activation chime and show visual indicator
         WakeWordFeedback.playActivationChime()
@@ -107,6 +107,7 @@ final class WakeWordCoordinator: ObservableObject {
             audioMonitor.startMonitoring()
             return
         }
+        activatedViaWakeWord = true
         voiceModeManager.startListening()
     }
 
@@ -137,10 +138,10 @@ final class WakeWordCoordinator: ObservableObject {
                         if self.activatedViaWakeWord {
                             WakeWordFeedback.playDeactivationChime()
                             self.activationWindow.show(state: .listening)
-                            self.activatedViaWakeWord = false
                         }
                         self.audioMonitor.startMonitoring()
                     }
+                    self.activatedViaWakeWord = false  // always reset, regardless of setting
                 }
             }
     }
diff --git a/clients/macos/vellum-assistantTests/SettingsStoreOverrideResolutionTests.swift b/clients/macos/vellum-assistantTests/SettingsStoreOverrideResolutionTests.swift
index eb42b844b91..8008e58b414 100644
--- a/clients/macos/vellum-assistantTests/SettingsStoreOverrideResolutionTests.swift
+++ b/clients/macos/vellum-assistantTests/SettingsStoreOverrideResolutionTests.swift
@@ -7,7 +7,6 @@ final class SettingsStoreOverrideResolutionTests: XCTestCase {
     // Each test manipulates UserDefaults keys that the override resolution
     // reads. Clean up after each test to avoid cross-contamination.
     override func tearDown() {
-        UserDefaults.standard.removeObject(forKey: "iosPairingUseOverride")
         UserDefaults.standard.removeObject(forKey: "iosPairingGatewayOverride")
         UserDefaults.standard.removeObject(forKey: "iosPairingTokenOverride")
         super.tearDown()
@@ -15,30 +14,26 @@ final class SettingsStoreOverrideResolutionTests: XCTestCase {
 
     // MARK: - iOS Gateway URL
 
-    func testIosGatewayReturnsGlobalWhenOverrideOff() {
-        UserDefaults.standard.set(false, forKey: "iosPairingUseOverride")
+    func testIosGatewayReturnsOverrideWhenNonEmpty() {
         UserDefaults.standard.set("https://custom.example.com", forKey: "iosPairingGatewayOverride")
 
         let store = SettingsStore()
-        // Simulate global URL being set via IPC response
         store.ingressPublicBaseUrl = "https://global.example.com"
 
-        XCTAssertEqual(store.resolvedIosGatewayUrl, "https://global.example.com")
+        XCTAssertEqual(store.resolvedIosGatewayUrl, "https://custom.example.com")
     }
 
-    func testIosGatewayReturnsOverrideWhenOverrideOn() {
-        UserDefaults.standard.set(true, forKey: "iosPairingUseOverride")
-        UserDefaults.standard.set("https://custom.example.com", forKey: "iosPairingGatewayOverride")
+    func testIosGatewayFallsBackToGlobalWhenOverrideEmpty() {
+        UserDefaults.standard.set("", forKey: "iosPairingGatewayOverride")
 
         let store = SettingsStore()
         store.ingressPublicBaseUrl = "https://global.example.com"
 
-        XCTAssertEqual(store.resolvedIosGatewayUrl, "https://custom.example.com")
+        XCTAssertEqual(store.resolvedIosGatewayUrl, "https://global.example.com")
     }
 
-    func testIosGatewayFallsBackToGlobalWhenOverrideOnButEmpty() {
-        UserDefaults.standard.set(true, forKey: "iosPairingUseOverride")
-        UserDefaults.standard.set("", forKey: "iosPairingGatewayOverride")
+    func testIosGatewayFallsBackToGlobalWhenOverrideWhitespaceOnly() {
+        UserDefaults.standard.set("   ", forKey: "iosPairingGatewayOverride")
 
         let store = SettingsStore()
         store.ingressPublicBaseUrl = "https://global.example.com"
@@ -46,10 +41,8 @@ final class SettingsStoreOverrideResolutionTests: XCTestCase {
         XCTAssertEqual(store.resolvedIosGatewayUrl, "https://global.example.com")
     }
 
-    func testIosGatewayFallsBackToGlobalWhenOverrideOnAndWhitespaceOnly() {
-        UserDefaults.standard.set(true, forKey: "iosPairingUseOverride")
-        UserDefaults.standard.set("   ", forKey: "iosPairingGatewayOverride")
-
+    func testIosGatewayFallsBackToGlobalWhenOverrideAbsent() {
+        // No override key set at all
         let store = SettingsStore()
         store.ingressPublicBaseUrl = "https://global.example.com"
 
diff --git a/clients/shared/App/Auth/SessionTokenManager.swift b/clients/shared/App/Auth/SessionTokenManager.swift
index 7f1b939c8a5..129fac43ba9 100644
--- a/clients/shared/App/Auth/SessionTokenManager.swift
+++ b/clients/shared/App/Auth/SessionTokenManager.swift
@@ -8,23 +8,53 @@ public extension Notification.Name {
 /// Replaces the macOS-only `/usr/bin/security` CLI approach.
 /// Uses provider "session-token" to match the old keychain account name
 /// so existing macOS users' stored sessions are preserved after upgrade.
+///
+/// Also writes the token to `~/.vellum/platform-token` so the daemon can
+/// read it for authenticated platform API calls without IPC round-trips.
 public enum SessionTokenManager {
     private static let provider = "session-token"
 
+    /// Path to the platform token file the daemon reads.
+    private static var platformTokenPath: String {
+        resolveVellumDir() + "/platform-token"
+    }
+
     public static func getToken() -> String? {
         APIKeyManager.shared.getAPIKey(provider: provider)
     }
 
     public static func setToken(_ token: String) {
         _ = APIKeyManager.shared.setAPIKey(token, provider: provider)
+        writePlatformTokenFile(token)
         NotificationCenter.default.post(name: .sessionTokenDidChange, object: nil)
     }
 
     public static func deleteToken() {
         _ = APIKeyManager.shared.deleteAPIKey(provider: provider)
+        removePlatformTokenFile()
         NotificationCenter.default.post(name: .sessionTokenDidChange, object: nil)
     }
 
+    // MARK: - Platform token file bridge
+
+    private static func writePlatformTokenFile(_ token: String) {
+        let path = platformTokenPath
+        do {
+            try token.write(toFile: path, atomically: true, encoding: .utf8)
+            // Restrict permissions to owner-only (0600)
+            try FileManager.default.setAttributes(
+                [.posixPermissions: 0o600],
+                ofItemAtPath: path
+            )
+        } catch {
+            // Best-effort; daemon falls back to bundled catalog if token is unavailable
+        }
+    }
+
+    private static func removePlatformTokenFile() {
+        try? FileManager.default.removeItem(atPath: platformTokenPath)
+    }
+
     public static func getTokenAsync() async -> String? {
         await withCheckedContinuation { continuation in
             DispatchQueue.global(qos: .userInitiated).async {
diff --git a/clients/shared/IPC/DaemonConnection.swift b/clients/shared/IPC/DaemonConnection.swift
index b2e38061b58..0b21ba8378a 100644
--- a/clients/shared/IPC/DaemonConnection.swift
+++ b/clients/shared/IPC/DaemonConnection.swift
@@ -135,6 +135,8 @@ extension DaemonClient {
                             resumed = true
                             timeoutTask.cancel()
                             checkedContinuation.resume(throwing: NWError.posix(.ECANCELED))
+                        } else {
+                            self.scheduleReconnect()
                         }
 
                     case .waiting(let error):
diff --git a/clients/shared/IPC/Generated/IPCContractGenerated.swift b/clients/shared/IPC/Generated/IPCContractGenerated.swift
index cbb791a5518..854032e13d3 100644
--- a/clients/shared/IPC/Generated/IPCContractGenerated.swift
+++ b/clients/shared/IPC/Generated/IPCContractGenerated.swift
@@ -1911,13 +1911,15 @@ public struct IPCGuardianRequestThreadCreated: Codable, Sendable {
     public let requestId: String
     public let callSessionId: String
     public let title: String
+    public let questionText: String
 
-    public init(type: String, conversationId: String, requestId: String, callSessionId: String, title: String) {
+    public init(type: String, conversationId: String, requestId: String, callSessionId: String, title: String, questionText: String) {
         self.type = type
         self.conversationId = conversationId
         self.requestId = requestId
         self.callSessionId = callSessionId
         self.title = title
+        self.questionText = questionText
     }
 }
 
diff --git a/clients/shared/IPC/IPCMessages.swift b/clients/shared/IPC/IPCMessages.swift
index 2ede1f54e63..d35556d0a41 100644
--- a/clients/shared/IPC/IPCMessages.swift
+++ b/clients/shared/IPC/IPCMessages.swift
@@ -2775,7 +2775,7 @@ public struct ApprovedDevicesListResponseMessage: Decodable, Sendable {
     public struct Device: Decodable, Sendable {
         public let hashedDeviceId: String
         public let deviceName: String
-        public let lastPairedAt: Double
+        public let lastPairedAt: Int
     }
     public let devices: [Device]
 }
diff --git a/clients/shared/Utilities/FeatureFlagManager.swift b/clients/shared/Utilities/FeatureFlagManager.swift
index 5a87b1c6697..d4d29b89f32 100644
--- a/clients/shared/Utilities/FeatureFlagManager.swift
+++ b/clients/shared/Utilities/FeatureFlagManager.swift
@@ -8,7 +8,6 @@ public enum FeatureFlag: String, CaseIterable {
     case featureFlagEditorEnabled = "feature_flag_editor_enabled"
     case hatchNewAssistantEnabled = "hatch_new_assistant_enabled"
     case localHttpEnabled = "local_http_enabled"
-    case assistantInboxEnabled = "assistant_inbox_enabled"
 
     public var displayName: String {
         switch self {
@@ -17,7 +16,6 @@ public enum FeatureFlag: String, CaseIterable {
         case .featureFlagEditorEnabled: return "Feature Flag Editor Enabled"
         case .hatchNewAssistantEnabled: return "Hatch New Assistant Enabled"
         case .localHttpEnabled: return "Local HTTP Enabled"
-        case .assistantInboxEnabled: return "Assistant Inbox Enabled"
         }
     }
 }
diff --git a/gateway/package.json b/gateway/package.json
index f76f486c99b..4eb83705981 100644
--- a/gateway/package.json
+++ b/gateway/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@vellumai/vellum-gateway",
-  "version": "0.3.5",
+  "version": "0.3.6",
   "type": "module",
   "scripts": {
     "dev": "bun run --watch src/index.ts",
diff --git a/gateway/src/http/routes/pairing-proxy.ts b/gateway/src/http/routes/pairing-proxy.ts
index b94f0ab09c9..6863dcc81b0 100644
--- a/gateway/src/http/routes/pairing-proxy.ts
+++ b/gateway/src/http/routes/pairing-proxy.ts
@@ -12,6 +12,9 @@ import { getLogger } from "../../logger.js";
 
 const log = getLogger("pairing-proxy");
 
+/** 64 KB — pairing payloads are tiny JSON; cap well below maxWebhookPayloadBytes. */
+const MAX_PAIRING_PAYLOAD_BYTES = 64 * 1024;
+
 const HOP_BY_HOP_HEADERS = [
   "connection",
   "keep-alive",
@@ -65,8 +68,31 @@ export function createPairingProxyHandler(config: GatewayConfig) {
     }
 
     const hasBody = req.method !== "GET" && req.method !== "HEAD";
+
+    // Payload size guard — primary defense: reject via Content-Length before
+    // reading the body into memory. This is the main protection against
+    // oversized requests because Bun's Request.arrayBuffer() buffers the
+    // entire body with no streaming-limit API, so once we call it the
+    // memory is already allocated.
+    if (hasBody) {
+      const contentLength = req.headers.get("content-length");
+      if (contentLength) {
+        const declared = Number(contentLength);
+        if (declared > MAX_PAIRING_PAYLOAD_BYTES || Number.isNaN(declared)) {
+          log.warn({ contentLength }, "Pairing proxy payload too large (content-length)");
+          return Response.json({ error: "Payload too large" }, { status: 413 });
+        }
+      }
+    }
+
     const bodyBuffer = hasBody ? await req.arrayBuffer() : null;
+    // Belt-and-suspenders: verify actual size after read in case
+    // Content-Length was absent (chunked) or spoofed.
     if (bodyBuffer !== null) {
+      if (bodyBuffer.byteLength > MAX_PAIRING_PAYLOAD_BYTES) {
+        log.warn({ bodyLength: bodyBuffer.byteLength }, "Pairing proxy payload too large");
+        return Response.json({ error: "Payload too large" }, { status: 413 });
+      }
       reqHeaders.set("content-length", String(bodyBuffer.byteLength));
     }
 
diff --git a/gateway/src/http/routes/twilio-connect-action-webhook.ts b/gateway/src/http/routes/twilio-connect-action-webhook.ts
index ea3cd9f4a1f..1ac85af0e3e 100644
--- a/gateway/src/http/routes/twilio-connect-action-webhook.ts
+++ b/gateway/src/http/routes/twilio-connect-action-webhook.ts
@@ -1,6 +1,6 @@
 import type { GatewayConfig } from "../../config.js";
 import { getLogger } from "../../logger.js";
-import { forwardTwilioConnectActionWebhook } from "../../runtime/client.js";
+import { CircuitBreakerOpenError, forwardTwilioConnectActionWebhook } from "../../runtime/client.js";
 import { validateTwilioWebhookRequest } from "../../twilio/validate-webhook.js";
 
 const log = getLogger("twilio-connect-action-webhook");
@@ -20,6 +20,12 @@ export function createTwilioConnectActionWebhookHandler(config: GatewayConfig) {
         headers: runtimeResponse.headers,
       });
     } catch (err) {
+      if (err instanceof CircuitBreakerOpenError) {
+        return Response.json(
+          { error: "Service temporarily unavailable" },
+          { status: 503, headers: { "Retry-After": String(err.retryAfterSecs) } },
+        );
+      }
       log.error({ err }, "Failed to forward Twilio connect-action webhook to runtime");
       return Response.json({ error: "Internal server error" }, { status: 502 });
     }
diff --git a/gateway/src/http/routes/twilio-status-webhook.ts b/gateway/src/http/routes/twilio-status-webhook.ts
index 26fe56384f7..7610df51f4e 100644
--- a/gateway/src/http/routes/twilio-status-webhook.ts
+++ b/gateway/src/http/routes/twilio-status-webhook.ts
@@ -1,6 +1,6 @@
 import type { GatewayConfig } from "../../config.js";
 import { getLogger } from "../../logger.js";
-import { forwardTwilioStatusWebhook } from "../../runtime/client.js";
+import { CircuitBreakerOpenError, forwardTwilioStatusWebhook } from "../../runtime/client.js";
 import { validateTwilioWebhookRequest } from "../../twilio/validate-webhook.js";
 
 const log = getLogger("twilio-status-webhook");
@@ -23,6 +23,12 @@ export function createTwilioStatusWebhookHandler(config: GatewayConfig) {
         headers: runtimeResponse.headers,
       });
     } catch (err) {
+      if (err instanceof CircuitBreakerOpenError) {
+        return Response.json(
+          { error: "Service temporarily unavailable" },
+          { status: 503, headers: { "Retry-After": String(err.retryAfterSecs) } },
+        );
+      }
       log.error({ err }, "Failed to forward Twilio status webhook to runtime");
       return Response.json({ error: "Internal server error" }, { status: 502 });
     }
diff --git a/gateway/src/http/routes/twilio-voice-webhook.ts b/gateway/src/http/routes/twilio-voice-webhook.ts
index f88edd46993..0bd70746998 100644
--- a/gateway/src/http/routes/twilio-voice-webhook.ts
+++ b/gateway/src/http/routes/twilio-voice-webhook.ts
@@ -1,6 +1,6 @@
 import type { GatewayConfig } from "../../config.js";
 import { getLogger } from "../../logger.js";
-import { forwardTwilioVoiceWebhook } from "../../runtime/client.js";
+import { CircuitBreakerOpenError, forwardTwilioVoiceWebhook } from "../../runtime/client.js";
 import { resolveAssistant, resolveAssistantByPhoneNumber, isRejection } from "../../routing/resolve-assistant.js";
 import { validateTwilioWebhookRequest } from "../../twilio/validate-webhook.js";
 
@@ -69,6 +69,12 @@ export function createTwilioVoiceWebhookHandler(config: GatewayConfig) {
         headers: runtimeResponse.headers,
       });
     } catch (err) {
+      if (err instanceof CircuitBreakerOpenError) {
+        return Response.json(
+          { error: "Service temporarily unavailable" },
+          { status: 503, headers: { "Retry-After": String(err.retryAfterSecs) } },
+        );
+      }
       log.error({ err }, "Failed to forward Twilio voice webhook to runtime");
       return Response.json({ error: "Internal server error" }, { status: 502 });
     }
diff --git a/gateway/src/http/routes/whatsapp-deliver.test.ts b/gateway/src/http/routes/whatsapp-deliver.test.ts
index c20c95e9c74..5d0eb6e5164 100644
--- a/gateway/src/http/routes/whatsapp-deliver.test.ts
+++ b/gateway/src/http/routes/whatsapp-deliver.test.ts
@@ -275,7 +275,7 @@ describe("/deliver/whatsapp", () => {
 
   it("returns 502 when sendWhatsAppReply throws", async () => {
     // Override the mock to throw
-    const throwingMock = mock.module("../../whatsapp/send.js", () => ({
+    const _throwingMock = mock.module("../../whatsapp/send.js", () => ({
       sendWhatsAppReply: async () => {
         throw new Error("WhatsApp API failure");
       },
@@ -307,7 +307,7 @@ describe("/deliver/whatsapp", () => {
     expect(res.status).toBe(200);
 
     expect(sendWhatsAppReplyCalls).toHaveLength(1);
-    const [config, to, text] = sendWhatsAppReplyCalls[0] as [GatewayConfig, string, string];
+    const [_config, to, text] = sendWhatsAppReplyCalls[0] as [GatewayConfig, string, string];
     expect(to).toBe("+15559876543");
     expect(text).toBe("Test message");
   });
diff --git a/gateway/src/index.ts b/gateway/src/index.ts
index 08e3b36e735..129efeb639b 100644
--- a/gateway/src/index.ts
+++ b/gateway/src/index.ts
@@ -17,6 +17,7 @@ import { createWhatsAppDeliverHandler } from "./http/routes/whatsapp-deliver.js"
 import { createOAuthCallbackHandler } from "./http/routes/oauth-callback.js";
 import { createPairingProxyHandler } from "./http/routes/pairing-proxy.js";
 import { getLogger, initLogger } from "./logger.js";
+import { CircuitBreakerOpenError } from "./runtime/client.js";
 import { buildSchema } from "./schema.js";
 import { callTelegramApi } from "./telegram/api.js";
 import { reconcileTelegramWebhook } from "./telegram/webhook-manager.js";
@@ -64,6 +65,12 @@ function main() {
     port: config.port,
     websocket: getRelayWebsocketHandlers(),
     error(err) {
+      if (err instanceof CircuitBreakerOpenError) {
+        return Response.json(
+          { error: "Service temporarily unavailable — runtime is unreachable" },
+          { status: 503, headers: { "Retry-After": String(err.retryAfterSecs) } },
+        );
+      }
       log.error({ err }, "Unhandled gateway error");
       return Response.json({ error: "Internal server error" }, { status: 500 });
     },
diff --git a/gateway/src/log-redact.ts b/gateway/src/log-redact.ts
new file mode 100644
index 00000000000..fe42c6952bd
--- /dev/null
+++ b/gateway/src/log-redact.ts
@@ -0,0 +1,97 @@
+/**
+ * Pino log serializers that scrub sensitive data (bearer tokens, API keys,
+ * authorization headers) from logged values.
+ *
+ * This is a standalone copy for the gateway package — kept in sync with
+ * assistant/src/util/log-redact.ts. The gateway has no dependency on the
+ * assistant package, so we duplicate the lightweight serializer rather than
+ * adding a cross-package import.
+ */
+
+// ---------------------------------------------------------------------------
+// Sensitive-value patterns
+// ---------------------------------------------------------------------------
+
+const BEARER_RE = /Bearer [A-Za-z0-9._\-]+/g;
+
+const API_KEY_PATTERNS: RegExp[] = [
+  /AKIA[0-9A-Z]{16}/g,
+  /gh[pousr]_[A-Za-z0-9_]{36,255}/g,
+  /github_pat_[A-Za-z0-9_]{22,255}/g,
+  /glpat-[A-Za-z0-9\-_]{20,}/g,
+  /sk_live_[A-Za-z0-9]{24,}/g,
+  /rk_live_[A-Za-z0-9]{24,}/g,
+  /xoxb-[0-9]{10,}-[0-9]{10,}-[A-Za-z0-9]{24,}/g,
+  /xoxp-[0-9]{10,}-[0-9]{10,}-[0-9]{10,}-[a-f0-9]{32}/g,
+  /sk-ant-[A-Za-z0-9\-_]{80,}/g,
+  /sk-[A-Za-z0-9]{20}T3BlbkFJ[A-Za-z0-9]{20}/g,
+  /sk-proj-[A-Za-z0-9\-_]{40,}/g,
+  /AIza[A-Za-z0-9\-_]{35}/g,
+  /GOCSPX-[A-Za-z0-9\-_]{28}/g,
+  /SG\.[A-Za-z0-9\-_]{22}\.[A-Za-z0-9\-_]{43}/g,
+  /[0-9]{8,10}:[A-Za-z0-9_-]{35}/g,
+  /npm_[A-Za-z0-9]{36}/g,
+];
+
+const SENSITIVE_HEADERS = new Set([
+  "authorization",
+  "proxy-authorization",
+  "cookie",
+  "set-cookie",
+  "x-api-key",
+  "x-auth-token",
+]);
+
+// ---------------------------------------------------------------------------
+// String redaction
+// ---------------------------------------------------------------------------
+
+function redactString(value: string): string {
+  let result = value;
+  result = result.replace(BEARER_RE, "Bearer [REDACTED]");
+  for (const pattern of API_KEY_PATTERNS) {
+    pattern.lastIndex = 0;
+    result = result.replace(pattern, "[REDACTED]");
+  }
+  return result;
+}
+
+// ---------------------------------------------------------------------------
+// Deep value redaction
+// ---------------------------------------------------------------------------
+
+function redactValue(value: unknown, depth: number): unknown {
+  if (depth > 8) return value;
+
+  if (typeof value === "string") {
+    return redactString(value);
+  }
+
+  if (Array.isArray(value)) {
+    return value.map((item) => redactValue(item, depth + 1));
+  }
+
+  if (value !== null && typeof value === "object") {
+    const result: Record<string, unknown> = {};
+    for (const [key, val] of Object.entries(value as Record<string, unknown>)) {
+      if (SENSITIVE_HEADERS.has(key.toLowerCase())) {
+        result[key] = "[REDACTED]";
+      } else {
+        result[key] = redactValue(val, depth + 1);
+      }
+    }
+    return result;
+  }
+
+  return value;
+}
+
+// ---------------------------------------------------------------------------
+// Pino serializers
+// ---------------------------------------------------------------------------
+
+export const logSerializers: Record<string, (value: unknown) => unknown> = {
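+  err: (err) => redactValue(err instanceof Error ? { type: err.name, message: err.message, stack: err.stack, ...err } : err, 0), // message/stack are non-enumerable, so copy them explicitly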
+  req: (req) => redactValue(req, 0),
+  res: (res) => redactValue(res, 0),
+};
diff --git a/gateway/src/logger.ts b/gateway/src/logger.ts
index f09b1dc544b..00dd318d36a 100644
--- a/gateway/src/logger.ts
+++ b/gateway/src/logger.ts
@@ -2,6 +2,7 @@ import { existsSync, mkdirSync, readdirSync, unlinkSync } from "node:fs";
 import { join } from "node:path";
 import pino from "pino";
 import pinoPretty from "pino-pretty";
+import { logSerializers } from "./log-redact.js";
 
 export type LogFileConfig = {
   dir: string | undefined;
@@ -53,7 +54,7 @@ let activeConfig: LogFileConfig | null = null;
 
 function buildLogger(config: LogFileConfig | null): pino.Logger {
   if (!config?.dir) {
-    return pino({ name: "gateway" }, pinoPretty({ destination: 1 }));
+    return pino({ name: "gateway", serializers: logSerializers }, pinoPretty({ destination: 1 }));
   }
 
   if (!existsSync(config.dir)) {
@@ -62,13 +63,13 @@ function buildLogger(config: LogFileConfig | null): pino.Logger {
 
   const today = formatDate(new Date());
   const filePath = logFilePathForDate(config.dir, new Date());
-  const fileStream = pino.destination({ dest: filePath, sync: false, mkdir: true });
+  const fileStream = pino.destination({ dest: filePath, sync: false, mkdir: true, mode: 0o600 });
 
   activeLogDate = today;
   activeConfig = config;
 
   return pino(
-    { name: "gateway" },
+    { name: "gateway", serializers: logSerializers },
     pino.multistream([
       { stream: fileStream, level: "info" as const },
       { stream: pinoPretty({ destination: 1 }), level: "info" as const },
diff --git a/gateway/src/runtime/client.ts b/gateway/src/runtime/client.ts
index 6548ebc675e..4955c057265 100644
--- a/gateway/src/runtime/client.ts
+++ b/gateway/src/runtime/client.ts
@@ -5,6 +5,94 @@ import { getLogger } from "../logger.js";
 
 const log = getLogger("runtime-client");
 
+// ── Circuit breaker ──────────────────────────────────────────────────
+
+const enum CircuitState {
+  CLOSED = 0,
+  OPEN = 1,
+  HALF_OPEN = 2,
+}
+
+const CB_FAILURE_THRESHOLD = 5;
+const CB_COOLDOWN_MS = 30_000;
+
+/**
+ * Thrown when the circuit breaker is open. Callers should return 503
+ * with a Retry-After header derived from `retryAfterSecs`.
+ */
+export class CircuitBreakerOpenError extends Error {
+  readonly retryAfterSecs: number;
+  constructor(retryAfterSecs: number) {
+    super("Circuit breaker is open — runtime is unavailable");
+    this.name = "CircuitBreakerOpenError";
+    this.retryAfterSecs = retryAfterSecs;
+  }
+}
+
+let cbState: CircuitState = CircuitState.CLOSED;
+let cbConsecutiveFailures = 0;
+let cbOpenedAt = 0;
+
+function cbRetryAfterSecs(): number {
+  const elapsed = Date.now() - cbOpenedAt;
+  return Math.max(1, Math.ceil((CB_COOLDOWN_MS - elapsed) / 1000));
+}
+
+/**
+ * Check the circuit before making a request. Throws if open.
+ * Returns true when this is a half-open probe (caller must record outcome).
+ */
+function cbBeforeRequest(): boolean {
+  if (cbState === CircuitState.CLOSED) return false;
+
+  if (cbState === CircuitState.OPEN) {
+    if (Date.now() - cbOpenedAt >= CB_COOLDOWN_MS) {
+      cbState = CircuitState.HALF_OPEN;
+      log.info("Circuit breaker entering HALF_OPEN — allowing probe request");
+      return true;
+    }
+    throw new CircuitBreakerOpenError(cbRetryAfterSecs());
+  }
+
+  // HALF_OPEN: only one probe in flight; reject additional requests
+  throw new CircuitBreakerOpenError(cbRetryAfterSecs());
+}
+
+function cbOnSuccess(): void {
+  if (cbState !== CircuitState.CLOSED) {
+    log.info("Circuit breaker closing — runtime recovered");
+  }
+  cbState = CircuitState.CLOSED;
+  cbConsecutiveFailures = 0;
+}
+
+function cbOnFailure(): void {
+  cbConsecutiveFailures++;
+
+  if (cbState === CircuitState.HALF_OPEN) {
+    cbState = CircuitState.OPEN;
+    cbOpenedAt = Date.now();
+    log.warn({ failures: cbConsecutiveFailures }, "Circuit breaker re-opening after failed probe");
+    return;
+  }
+
+  if (cbConsecutiveFailures >= CB_FAILURE_THRESHOLD) {
+    cbState = CircuitState.OPEN;
+    cbOpenedAt = Date.now();
+    log.warn(
+      { failures: cbConsecutiveFailures },
+      "Circuit breaker opening — runtime appears down",
+    );
+  }
+}
+
+/** Exported for testing — resets circuit breaker to initial state. */
+export function _resetCircuitBreaker(): void {
+  cbState = CircuitState.CLOSED;
+  cbConsecutiveFailures = 0;
+  cbOpenedAt = 0;
+}
+
 /**
  * Header name used to prove a request originated from the gateway.
  * The value is the dedicated gateway-origin secret (or the bearer token as
@@ -92,6 +180,8 @@ export async function forwardToRuntime(
   payload: RuntimeInboundPayload,
   options?: ForwardOptions,
 ): Promise<RuntimeInboundResponse> {
+  cbBeforeRequest();
+
   const url = `${config.assistantRuntimeBaseUrl}/v1/assistants/${encodeURIComponent(assistantId)}/channels/inbound`;
 
   const extraHeaders: Record<string, string> = { "Content-Type": "application/json" };
@@ -122,6 +212,8 @@ export async function forwardToRuntime(
           { status: response.status, body, assistantId },
           "Runtime returned client error, not retrying",
         );
+        // 4xx = client error, not a daemon outage — don't trip the breaker
+        cbOnSuccess();
         throw new Error(`Runtime returned ${response.status}: ${body}`);
       }
 
@@ -140,6 +232,7 @@ export async function forwardToRuntime(
         { assistantId, eventId: result.eventId, duplicate: result.duplicate },
         "Runtime forward succeeded",
       );
+      cbOnSuccess();
       return result;
     } catch (err) {
       if (
@@ -156,6 +249,7 @@ export async function forwardToRuntime(
     }
   }
 
+  cbOnFailure();
   throw lastError ?? new Error("Runtime forward failed after retries");
 }
 
@@ -165,19 +259,30 @@ export async function resetConversation(
   sourceChannel: ChannelId,
   externalChatId: string,
 ): Promise<void> {
+  cbBeforeRequest();
+
   const url = `${config.assistantRuntimeBaseUrl}/v1/assistants/${encodeURIComponent(assistantId)}/channels/conversation`;
 
-  const response = await fetchImpl(url, {
-    method: "DELETE",
-    headers: runtimeHeaders(config, { "Content-Type": "application/json" }),
-    body: JSON.stringify({ sourceChannel, externalChatId }),
-    signal: AbortSignal.timeout(config.runtimeTimeoutMs),
-  });
+  let response: Response;
+  try {
+    response = await fetchImpl(url, {
+      method: "DELETE",
+      headers: runtimeHeaders(config, { "Content-Type": "application/json" }),
+      body: JSON.stringify({ sourceChannel, externalChatId }),
+      signal: AbortSignal.timeout(config.runtimeTimeoutMs),
+    });
+  } catch (err) {
+    cbOnFailure();
+    throw err;
+  }
 
   if (!response.ok) {
+    if (response.status >= 500) cbOnFailure(); else cbOnSuccess();
     const body = await response.text();
     throw new Error(`Reset conversation failed (${response.status}): ${body}`);
   }
+
+  cbOnSuccess();
 }
 
 export type UploadAttachmentInput = {
@@ -195,19 +300,29 @@ export async function downloadAttachment(
   assistantId: string,
   attachmentId: string,
 ): Promise<RuntimeAttachmentPayload> {
+  cbBeforeRequest();
+
   const url = `${config.assistantRuntimeBaseUrl}/v1/assistants/${encodeURIComponent(assistantId)}/attachments/${encodeURIComponent(attachmentId)}`;
 
-  const response = await fetchImpl(url, {
-    method: "GET",
-    headers: runtimeHeaders(config),
-    signal: AbortSignal.timeout(config.runtimeTimeoutMs),
-  });
+  let response: Response;
+  try {
+    response = await fetchImpl(url, {
+      method: "GET",
+      headers: runtimeHeaders(config),
+      signal: AbortSignal.timeout(config.runtimeTimeoutMs),
+    });
+  } catch (err) {
+    cbOnFailure();
+    throw err;
+  }
 
   if (!response.ok) {
+    if (response.status >= 500) cbOnFailure(); else cbOnSuccess();
     const body = await response.text();
     throw new Error(`Attachment download failed (${response.status}): ${body}`);
   }
 
+  cbOnSuccess();
   return (await response.json()) as RuntimeAttachmentPayload;
 }
 
@@ -219,19 +334,29 @@ export async function downloadAttachmentById(
   config: GatewayConfig,
   attachmentId: string,
 ): Promise<RuntimeAttachmentPayload> {
+  cbBeforeRequest();
+
   const url = `${config.assistantRuntimeBaseUrl}/v1/attachments/${encodeURIComponent(attachmentId)}`;
 
-  const response = await fetchImpl(url, {
-    method: "GET",
-    headers: runtimeHeaders(config),
-    signal: AbortSignal.timeout(config.runtimeTimeoutMs),
-  });
+  let response: Response;
+  try {
+    response = await fetchImpl(url, {
+      method: "GET",
+      headers: runtimeHeaders(config),
+      signal: AbortSignal.timeout(config.runtimeTimeoutMs),
+    });
+  } catch (err) {
+    cbOnFailure();
+    throw err;
+  }
 
   if (!response.ok) {
+    if (response.status >= 500) cbOnFailure(); else cbOnSuccess();
     const body = await response.text();
     throw new Error(`Attachment download failed (${response.status}): ${body}`);
   }
 
+  cbOnSuccess();
   return (await response.json()) as RuntimeAttachmentPayload;
 }
 
@@ -257,20 +382,29 @@ export async function forwardTwilioVoiceWebhook(
   originalUrl: string,
   assistantId?: string,
 ): Promise<TwilioForwardResponse> {
+  cbBeforeRequest();
+
   const url = `${config.assistantRuntimeBaseUrl}/v1/internal/twilio/voice-webhook`;
 
-  const response = await fetchImpl(url, {
-    method: "POST",
-    headers: runtimeHeaders(config, { "Content-Type": "application/json" }),
-    body: JSON.stringify({ params, originalUrl, assistantId }),
-    signal: AbortSignal.timeout(config.runtimeTimeoutMs),
-  });
+  let response: Response;
+  try {
+    response = await fetchImpl(url, {
+      method: "POST",
+      headers: runtimeHeaders(config, { "Content-Type": "application/json" }),
+      body: JSON.stringify({ params, originalUrl, assistantId }),
+      signal: AbortSignal.timeout(config.runtimeTimeoutMs),
+    });
+  } catch (err) {
+    cbOnFailure();
+    throw err;
+  }
 
+  if (response.status >= 500) cbOnFailure(); else cbOnSuccess();
   const body = await response.text();
   const headers: Record<string, string> = {};
   const contentType = response.headers.get("content-type");
   if (contentType) headers["Content-Type"] = contentType;
 
   return { status: response.status, body, headers };
 }
 
@@ -281,20 +415,29 @@ export async function forwardTwilioStatusWebhook(
   config: GatewayConfig,
   params: Record<string, string>,
 ): Promise<TwilioForwardResponse> {
+  cbBeforeRequest();
+
   const url = `${config.assistantRuntimeBaseUrl}/v1/internal/twilio/status`;
 
-  const response = await fetchImpl(url, {
-    method: "POST",
-    headers: runtimeHeaders(config, { "Content-Type": "application/json" }),
-    body: JSON.stringify({ params }),
-    signal: AbortSignal.timeout(config.runtimeTimeoutMs),
-  });
+  let response: Response;
+  try {
+    response = await fetchImpl(url, {
+      method: "POST",
+      headers: runtimeHeaders(config, { "Content-Type": "application/json" }),
+      body: JSON.stringify({ params }),
+      signal: AbortSignal.timeout(config.runtimeTimeoutMs),
+    });
+  } catch (err) {
+    cbOnFailure();
+    throw err;
+  }
 
+  if (response.status >= 500) cbOnFailure(); else cbOnSuccess();
   const body = await response.text();
   const headers: Record<string, string> = {};
   const contentType = response.headers.get("content-type");
   if (contentType) headers["Content-Type"] = contentType;
 
   return { status: response.status, body, headers };
 }
 
@@ -305,20 +448,29 @@ export async function forwardTwilioConnectActionWebhook(
   config: GatewayConfig,
   params: Record<string, string>,
 ): Promise<TwilioForwardResponse> {
+  cbBeforeRequest();
+
   const url = `${config.assistantRuntimeBaseUrl}/v1/internal/twilio/connect-action`;
 
-  const response = await fetchImpl(url, {
-    method: "POST",
-    headers: runtimeHeaders(config, { "Content-Type": "application/json" }),
-    body: JSON.stringify({ params }),
-    signal: AbortSignal.timeout(config.runtimeTimeoutMs),
-  });
+  let response: Response;
+  try {
+    response = await fetchImpl(url, {
+      method: "POST",
+      headers: runtimeHeaders(config, { "Content-Type": "application/json" }),
+      body: JSON.stringify({ params }),
+      signal: AbortSignal.timeout(config.runtimeTimeoutMs),
+    });
+  } catch (err) {
+    cbOnFailure();
+    throw err;
+  }
 
+  if (response.status >= 500) cbOnFailure(); else cbOnSuccess();
   const body = await response.text();
   const headers: Record<string, string> = {};
   const contentType = response.headers.get("content-type");
   if (contentType) headers["Content-Type"] = contentType;
 
   return { status: response.status, body, headers };
 }
 
@@ -327,14 +479,22 @@ export async function uploadAttachment(
   assistantId: string,
   input: UploadAttachmentInput,
 ): Promise<UploadAttachmentResponse> {
+  cbBeforeRequest();
+
   const url = `${config.assistantRuntimeBaseUrl}/v1/assistants/${encodeURIComponent(assistantId)}/attachments`;
 
-  const response = await fetchImpl(url, {
-    method: "POST",
-    headers: runtimeHeaders(config, { "Content-Type": "application/json" }),
-    body: JSON.stringify(input),
-    signal: AbortSignal.timeout(config.runtimeTimeoutMs),
-  });
+  let response: Response;
+  try {
+    response = await fetchImpl(url, {
+      method: "POST",
+      headers: runtimeHeaders(config, { "Content-Type": "application/json" }),
+      body: JSON.stringify(input),
+      signal: AbortSignal.timeout(config.runtimeTimeoutMs),
+    });
+  } catch (err) {
+    cbOnFailure();
+    throw err;
+  }
 
   if (!response.ok) {
     const body = await response.text();
@@ -342,13 +502,16 @@ export async function uploadAttachment(
     // extension, missing fields). Distinguish from transient 5xx/network errors
     // so callers can decide whether to skip or propagate.
     if (response.status >= 400 && response.status < 500) {
+      cbOnSuccess();
       throw new AttachmentValidationError(
         `Attachment rejected (${response.status}): ${body}`,
       );
     }
+    cbOnFailure();
     throw new Error(`Attachment upload failed (${response.status}): ${body}`);
   }
 
+  cbOnSuccess();
   return (await response.json()) as UploadAttachmentResponse;
 }
 
@@ -370,15 +533,24 @@ export async function forwardOAuthCallback(
   code?: string,
   error?: string,
 ): Promise<OAuthCallbackResponse> {
+  cbBeforeRequest();
+
   const url = `${config.assistantRuntimeBaseUrl}/v1/internal/oauth/callback`;
 
-  const response = await fetchImpl(url, {
-    method: "POST",
-    headers: runtimeHeaders(config, { "Content-Type": "application/json" }),
-    body: JSON.stringify({ state, code, error }),
-    signal: AbortSignal.timeout(config.runtimeTimeoutMs),
-  });
+  let response: Response;
+  try {
+    response = await fetchImpl(url, {
+      method: "POST",
+      headers: runtimeHeaders(config, { "Content-Type": "application/json" }),
+      body: JSON.stringify({ state, code, error }),
+      signal: AbortSignal.timeout(config.runtimeTimeoutMs),
+    });
+  } catch (err) {
+    cbOnFailure();
+    throw err;
+  }
 
+  if (response.status >= 500) cbOnFailure(); else cbOnSuccess();
   const body = await response.text();
   return { status: response.status, body };
 }
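
Reviewer note on the circuit breaker in `gateway/src/runtime/client.ts`: the state transitions (CLOSED, OPEN after 5 consecutive failures, one HALF_OPEN probe after the 30 s cooldown) are easier to check in isolation. The following is a minimal sketch, not the code in this PR. `SketchBreaker`, `BreakerOpenError`, `toUnavailable`, and the injected `now` clock are hypothetical names introduced only for illustration; the real module uses module-level state and `Date.now()`.

```typescript
// Hypothetical, simplified model of the breaker added in this diff.
// The clock is injected so transitions can be exercised without waiting.
type State = "CLOSED" | "OPEN" | "HALF_OPEN";

class BreakerOpenError extends Error {
  readonly retryAfterSecs: number;
  constructor(retryAfterSecs: number) {
    super("circuit open");
    this.name = "BreakerOpenError";
    this.retryAfterSecs = retryAfterSecs;
  }
}

class SketchBreaker {
  state: State = "CLOSED";
  private failures = 0;
  private openedAt = 0;
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(threshold: number, cooldownMs: number, now: () => number) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  /** Throws while open; lets exactly one probe through after the cooldown. */
  beforeRequest(): void {
    if (this.state === "CLOSED") return;
    const elapsed = this.now() - this.openedAt;
    if (this.state === "OPEN" && elapsed >= this.cooldownMs) {
      this.state = "HALF_OPEN";
      return;
    }
    // Still cooling down, or a probe is already in flight.
    const retryAfterSecs = Math.max(1, Math.ceil((this.cooldownMs - elapsed) / 1000));
    throw new BreakerOpenError(retryAfterSecs);
  }

  onSuccess(): void {
    this.state = "CLOSED";
    this.failures = 0;
  }

  onFailure(): void {
    this.failures++;
    // A failed probe re-opens immediately; otherwise open at the threshold.
    if (this.state === "HALF_OPEN" || this.failures >= this.threshold) {
      this.state = "OPEN";
      this.openedAt = this.now();
    }
  }
}

/** Maps a breaker rejection to a 503 + Retry-After shape; rethrows anything else. */
function toUnavailable(err: unknown): { status: number; headers: Record<string, string> } {
  if (err instanceof BreakerOpenError) {
    return { status: 503, headers: { "Retry-After": String(err.retryAfterSecs) } };
  }
  throw err;
}
```

A route handler would wrap its runtime call in `try`/`catch` and pass the caught error through something like `toUnavailable`, which matches the guidance in the `CircuitBreakerOpenError` doc comment. Note that in this model, as in the diff, a probe whose outcome is never recorded leaves the breaker in HALF_OPEN, so every call path needs to reach `onSuccess` or `onFailure`.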
diff --git a/meta/package.json b/meta/package.json
index f78a8dcae4c..89e7ff3d8b3 100644
--- a/meta/package.json
+++ b/meta/package.json
@@ -1,13 +1,13 @@
 {
   "name": "vellum",
-  "version": "0.3.5",
+  "version": "0.3.6",
   "description": "Install the full Vellum stack locally",
   "bin": {
     "vellum": "./bin/vellum.js"
   },
   "dependencies": {
-    "@vellumai/assistant": "0.3.5",
-    "@vellumai/cli": "0.3.5",
-    "@vellumai/vellum-gateway": "0.3.5"
+    "@vellumai/assistant": "0.3.6",
+    "@vellumai/cli": "0.3.6",
+    "@vellumai/vellum-gateway": "0.3.6"
   }
 }