feat: AAP host + Mobile Use app integration (launch/stop/skills end-to-end)#250
Conversation
…o-end)
Implements the Agentic Apps Protocol host so Deus can spawn, supervise, and
MCP-route installed agentic apps. Ships with `deus.mobile-use` (device-use)
as the first integrated app — launchable from the UI's Apps tab or via the
agent's `mcp__deus__launch_app` tool.
Architecture (per docs/aap-host-design.md):
- Shared: manifest schema (Zod), template substitution, public AAP types,
`apps:launched` / `apps:stopped` q:events.
- Backend (services/aap/): apps.service owns the `runningApps` in-memory
map; lifecycle (spawn + ready-probe + SIGTERM stop) runs single-process;
pid-journal catches orphans across restarts; port-allocator + storage
helpers; mcp-bridge fires register/unregister to the agent-server; new
`readAppSkill(appId)` reads manifest-declared skill files on demand.
q:command launchApp/stopApp for user-initiated UI path; shared service
method for the agent-RPC path so both converge on identical inputs.
- Agent-server: deus-tools now exposes `list_apps`, `launch_app`,
`stop_app`, `read_app_skill`; `app-registrar` holds the dynamic MCP map
and broadcasts via `query.setMcpServers` while preserving each query's
initial SDK servers via attachQuery/detachQuery — without this the SDK
disconnects the transport mid-tool-call and hangs `launch_app` forever.
- Frontend: features/apps/ (AppsLauncher card in a new Apps content tab,
status chips, `useInstalledApps`/`useRunningApps` subscriptions);
`useAppsLaunched` opens a Browser tab on launch, `useAppsStopped` closes
it on exit; Browser tabs stamp `openedAt` at creation so matching
survives Electron's chrome-error rewrite after the backend dies.
Device-use fixes discovered during integration:
- `serve` runs in-process (was spawning a grandchild that orphaned on
SIGTERM/SIGKILL since the cli.js wrapper couldn't forward signals).
- Manifest sets NODE_ENV=production so the UI serves the prod bundle
instead of proxying to a nonexistent :5173 Vite dev server.
- Static root anchored to the package via `import.meta.url` — AAP spawns
with `cwd={workspace}`, so `./dist/frontend` would resolve wrong.
AAP launch triggers fire-and-forget MCP registration from the backend to
break a re-entrant control-request deadlock: the very tool call in flight
is what owns the transport the register RPC needs to update.
Tests: 9 AAP test files cover manifest/template/registry/lifecycle/
mcp-bridge/port-allocator/storage/integration/command-dispatch; an
integration test pins the fire-and-forget contract; app-registrar tests
pin SDK-server preservation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughImplements Agentic Apps (AAP) end-to-end: shared manifest/template/types, backend app lifecycle and runtime (spawn/probe/stop, PID journal, port allocation), MCP bridge to agent-server, agent-server registrar and MCP tools, frontend Apps UI/hooks, tests, docs, and adds Changes
Sequence Diagram(s)sequenceDiagram
participant User as User (Frontend)
participant Frontend as Apps UI
participant Backend as Backend Service
participant AgentServer as Agent-Server
participant Query as Claude Query (SDK)
User->>Frontend: Click "Launch" (appId, workspaceId)
Frontend->>Backend: sendCommand("launchApp", {appId, workspaceId})
Backend->>Backend: validate → allocate port → spawn child → waitForReady
Backend->>Backend: record running entry → broadcast apps:launched
Backend->>AgentServer: registerMcpForRunningApp(appId,mcpUrl) (fire-and-forget)
AgentServer->>AgentServer: registerAppMcp(serverName,url) → update registeredServers
AgentServer->>Query: query.setMcpServers(fullPerQueryMap)
Backend-->>Frontend: return LaunchAppResult { runningAppId, url, bootstrap? }
Frontend->>Frontend: useAppsLaunched handler opens browser tab
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/agent-server/agents/claude/claude-handler.ts (1)
490-505:⚠️ Potential issue | 🔴 CriticalHoist
queryResultout of thetryblock before using it infinally.
queryResultis declared with block scope insidetry(line 490), but thefinallyblock at line 685 tries to access it. This causes aReferenceErrorat runtime. Hoist the variable declaration and initialize it tonull, then null-guard the cleanup to handle cases where the SDK query creation fails before assignment.Proposed fix
// Mutable context accumulated during the streaming loop. const ctx = createStreamContext(); + let queryResult: ReturnType<typeof claudeSDK> | null = null; try { - const queryResult = claudeSDK({ prompt: promptInput, options: sdkOptions }); + queryResult = claudeSDK({ prompt: promptInput, options: sdkOptions }); console.log( `[TIMING][${generatorId}] claudeSDK() constructor returned in ${Date.now() - tSdkSpawn}ms` ); claudeQueries.set(sessionId, queryResult);and in the finally block:
} finally { if (claudeSessions.owns(sessionId, session)) { - detachQuery(queryResult); + if (queryResult) detachQuery(queryResult); claudeQueries.delete(sessionId); claudeSessions.delete(sessionId); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/agent-server/agents/claude/claude-handler.ts` around lines 490 - 505, The variable queryResult is declared inside the try block but accessed in the finally block causing a ReferenceError; hoist queryResult (declare it before the try, initialized to null) in claude-handler.ts near where claudeSDK is invoked, then update the finally cleanup to null-guard any operations that use queryResult (e.g., claudeQueries.delete(sessionId), attachQuery cleanup, or other teardown) so cleanup only runs when queryResult was successfully assigned after calling claudeSDK({ prompt: promptInput, options: sdkOptions }) and before calling attachQuery.
🧹 Nitpick comments (6)
apps/backend/src/services/aap/storage.ts (1)
14-20: Enforce the absolute-path contract beforemkdir().Right now
ensureStorageDirs()will create any truthy path. If a manifest substitution ever yields a relative path, this writes under the backend process cwd instead of the intended workspace/global root. Fail fast here so malformed app config cannot create stray directories.Suggested hardening
export async function ensureStorageDirs(paths: { workspace?: string; global?: string; }): Promise<void> { - if (paths.workspace) await mkdir(paths.workspace, { recursive: true }); - if (paths.global) await mkdir(paths.global, { recursive: true }); + if (paths.workspace) { + if (!isAbsolute(paths.workspace)) { + throw new Error(`storage.workspace must be absolute: ${paths.workspace}`); + } + await mkdir(paths.workspace, { recursive: true }); + } + if (paths.global) { + if (!isAbsolute(paths.global)) { + throw new Error(`storage.global must be absolute: ${paths.global}`); + } + await mkdir(paths.global, { recursive: true }); + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/services/aap/storage.ts` around lines 14 - 20, ensureStorageDirs currently calls mkdir on any truthy path which can create directories relative to the process CWD; validate inputs and fail fast by checking that any provided paths (the workspace and global parameters) are absolute before calling mkdir. In the ensureStorageDirs function, use path.isAbsolute (or equivalent) to assert each provided path is absolute and throw a clear error if not, so mkdir(paths.workspace, ...) and mkdir(paths.global, ...) are only called after the absolute-path check; include the offending path value in the error message for easier debugging.apps/backend/test/unit/services/aap/port-allocator.test.ts (1)
24-25: Rename the test to match actual behavior.Line 24 says “sequential calls,” but Line 25 runs concurrent allocations via
Promise.all. Renaming improves intent clarity.✏️ Suggested rename
- it("returns different ports across sequential calls (sanity)", async () => { + it("returns different ports across concurrent calls (sanity)", async () => {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/test/unit/services/aap/port-allocator.test.ts` around lines 24 - 25, The test name says "sequential calls" but the body calls allocateFreePort() concurrently with Promise.all; update the it(...) description to reflect the actual behavior (e.g., "returns different ports across concurrent calls" or "returns different ports when called concurrently") so the test name matches the use of allocateFreePort() in the Promise.all invocation.apps/agent-server/test/deus-tools.test.ts (1)
267-383: Missing test coverage forread_app_skilltool.Test suites exist for
list_apps,launch_app, andstop_app, butread_app_skillhas no corresponding test coverage despite being one of the four AAP tools defined inapps/agent-server/agents/deus-tools/apps.ts.Do you want me to generate the test suite for
read_app_skill?🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/agent-server/test/deus-tools.test.ts` around lines 267 - 383, Add a new test suite for the read_app_skill tool to mirror coverage of the other AAP tools: call getRegisteredTools(server.instance)["read_app_skill"] and assert that on success mockFrontendAPI.requestReadAppSkill is called with { sessionId: SESSION_ID, appId } and the handler returns the skill JSON/text as expected; add a test for when the backend returns a not-found or empty skill response (assert appropriate message/omission), and a test for when mockFrontendAPI.requestReadAppSkill.mockRejectedValue(...) to ensure the handler returns an "AAP error: ..." string (does NOT throw). Use the same test patterns and assertions as list_apps/launch_app/stop_app suites to keep behavior consistent.apps/backend/test/unit/services/agent-commands-aap.test.ts (1)
13-38: Usevi.hoistedfor the shared mock handles.Vitest 4 hoists
vi.mock, and its factory cannot safely capture file-scope variables unless they come fromvi.hoisted. These factories currently close overmockLaunchApp/mockStopApp/mockGetWorkspaceForMiddleware/mockComputeWorkspacePath, so please move those spies into a hoisted block before the mocks. (v4.vitest.dev)♻️ Suggested change
-import { afterEach, beforeEach, describe, expect, it, vi } from "vitest"; +import { afterEach, beforeEach, describe, expect, it, vi } from "vitest"; -// ---- Hoisted mocks ---- -const mockLaunchApp = vi.fn(); -const mockStopApp = vi.fn(); -const mockGetWorkspaceForMiddleware = vi.fn(); -const mockComputeWorkspacePath = vi.fn(); +// ---- Hoisted mocks ---- +const { + mockLaunchApp, + mockStopApp, + mockGetWorkspaceForMiddleware, + mockComputeWorkspacePath, +} = vi.hoisted(() => ({ + mockLaunchApp: vi.fn(), + mockStopApp: vi.fn(), + mockGetWorkspaceForMiddleware: vi.fn(), + mockComputeWorkspacePath: vi.fn(), +}));As per coding guidelines "Write tests using Vitest with
vi.mock()andvi.hoisted()."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/test/unit/services/agent-commands-aap.test.ts` around lines 13 - 38, Move the file-scope mock spy variables into a vi.hoisted block so the vi.mock factories don't close over regular variables; specifically hoist mockLaunchApp, mockStopApp, mockGetWorkspaceForMiddleware, and mockComputeWorkspacePath using vi.hoisted(...) before the vi.mock(...) calls, then keep the vi.mock factories referencing those hoisted handles (the factories that return launchApp/stopApp, getWorkspaceForMiddleware, and computeWorkspacePath) so Vitest v4's hoisting works correctly.apps/backend/test/integration/aap.test.ts (1)
593-594: Clarify the intent ofvoidstatements.The
void a; void b;statements appear to be silencing unused-variable warnings afterstopAllApps()already cleaned up. Consider adding a brief comment explaining this is intentional to avoid lint errors, or use// eslint-disable-next-line@typescript-eslint/no-unused-vars`` if applicable.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/test/integration/aap.test.ts` around lines 593 - 594, The two lines "void a; void b;" are used to silence unused-variable warnings after stopAllApps() has cleaned up; replace them with an explicit intention comment or an inline ESLint suppression to make the intent clear: either add a brief comment referencing stopAllApps() and why a and b are intentionally unused, or add "// eslint-disable-next-line `@typescript-eslint/no-unused-vars`" immediately above the declarations of a and b (or remove the variables if they are no longer needed) so the lint rule and future readers understand this is intentional.docs/aap-host-design.md (1)
44-64: Add language specifiers to fenced code blocks.The ASCII architecture diagram (Lines 44-64) and file layout tree (Lines 76-120) lack language specifiers. While these aren't executable code, adding
textorplaintextsatisfies markdown linting and improves editor syntax highlighting.📝 Suggested fix
-``` +```text Backend (owns state) Agent-server (stateless) Frontend-``` +```text shared/aap/ ├── manifest.ts # Zod schema, Manifest type, idToServerNameAlso applies to: 76-120
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/aap-host-design.md` around lines 44 - 64, Add a language specifier (e.g., "text" or "plaintext") to the fenced code blocks containing the ASCII architecture diagram ("Backend (owns state) Agent-server (stateless) Frontend") and the file tree starting with "shared/aap/" so markdown linting and editor highlighting pass; update the opening fences for those blocks to include the specifier and leave the block contents unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/agent-server/test/deus-tools.test.ts`:
- Around line 82-85: The test is missing an assertion that the new AAP tool
"read_app_skill" is registered; update the assertion block that checks toolNames
(variable toolNames in the test) to include
expect(toolNames).toContain("read_app_skill") alongside the existing
expect(toolNames).toContain("list_apps"), expect(... "launch_app"), and
expect(... "stop_app") checks so the test validates the new tool registration.
- Around line 11-14: Add a mock for requestReadAppSkill in the same mock object
where requestListApps, requestLaunchApp, and requestStopApp are defined so tests
for the read_app_skill tool can run; locate the mock definition in
deus-tools.test.ts (the object containing
requestListApps/requestLaunchApp/requestStopApp) and add requestReadAppSkill:
vi.fn() alongside them, ensuring any test setup that references read_app_skill
uses this mock.
In `@apps/backend/src/services/aap/lifecycle.ts`:
- Around line 267-269: The current check for string-form "bin" mistakenly
compares pj.bin to command and returns pj.main; instead, when typeof pj.bin ===
"string" you should treat the binary name as the package name: check pj.name ===
command and return resolvePath(packageRoot, pj.bin) (use pj.name, pj.bin and
packageRoot/resolvePath to locate the change). Ensure you remove the incorrect
pj.main comparison and return the resolved pj.bin path.
In `@apps/backend/src/services/aap/pid-journal.ts`:
- Around line 55-56: The current chain .map((s) => parseInt(s.trim(),
10)).filter((n) => Number.isFinite(n) && n > 0) accepts strings like "123abc";
replace the loose parseInt coercion with a strict digit-only check: for each
string (the mapped variable), trim it, verify it matches /^\d+$/ (or equivalent)
before converting to a number, then filter by Number.isInteger(...) && n > 0;
update the mapping/filtering logic in the PID parsing pipeline (the map + filter
that produces PIDs) to only convert and emit values that pass the digit-only
test so malformed entries are ignored.
In `@apps/backend/src/services/aap/registry.ts`:
- Around line 34-45: The loop over manifest.skills currently only rejects
absolute paths (isAbsolute(rel)) but still permits paths like "../foo" that
escape the package directory; update the validation in the for loop that
processes manifest.skills (where isAbsolute(rel) and resolvePath(packageRoot,
rel) are used) to ensure the resolved absolute path (abs) is contained within
packageRoot — for example compute relative = path.relative(packageRoot, abs) and
throw an Error if relative startsWith('..') or path.isAbsolute(relative); keep
the existing existsSync(abs) check and use the same manifest.id and rel values
in the new error message.
In `@apps/backend/src/services/query-engine.ts`:
- Around line 542-545: The call to await Promise.race([
stopAppsForWorkspace(workspaceId), new Promise(... ARCHIVE_STOP_CEILING_MS) ])
can reject if stopAppsForWorkspace throws, which will abort the archive; make
stopping apps best-effort by swallowing errors from stopAppsForWorkspace before
racing with the timeout. Replace the direct call with a wrapped promise like
stopAppsForWorkspace(workspaceId).catch(err => { /* log a warning */ }) so the
race never rejects due to stopApps failures, preserving the existing timeout
(ARCHIVE_STOP_CEILING_MS) and keeping the archive mutation resilient.
In `@apps/backend/test/unit/shared/events.test.ts`:
- Around line 291-293: The test's expectations for PROTOCOL_EVENTS are stale:
update the assertions in the unit test that reference PROTOCOL_EVENTS to also
expect the "apps:stopped" event alongside "apps:launched" (i.e., ensure
PROTOCOL_EVENTS contains "apps:stopped") and change the length assertion from 22
to 23 to match the current exported constant set; locate the assertions in the
test referencing PROTOCOL_EVENTS and adjust them accordingly.
In `@apps/web/src/features/apps/ui/AppCard.tsx`:
- Around line 127-135: The primary action branch currently shows a disabled
"Launch" button even when the app is in the stopping state; update the rendering
logic inside the Button (the element using handleLaunch, busy, isTransitioning,
chipStatus) to detect when running.status (or chipStatus) === "stopping" and
render a "Stopping..." label (and keep the button disabled) instead of "Launch"
so the main action matches the chip and state machine.
In `@apps/web/src/features/browser/store/browserWindowStore.ts`:
- Around line 36-43: The current single-slot pendingNewTab/pendingCloseTab will
drop events; change both to FIFO queues (e.g., pendingNewTabs:
PendingNewTabRequest[] and pendingCloseTabs: PendingCloseTabRequest[]) and
update the API: have requestNewTab and requestCloseTabByUrlPrefix push a new
request object (including a unique requestId/timestamp) onto the appropriate
queue, and change consumePendingNewTab and consumePendingClose to accept a
requestId (or return the dequeued item’s requestId) so the BrowserPanel can
acknowledge and remove a specific entry instead of clearing the whole slot;
update related types and any callers (e.g., BrowserPanel consumer code and the
set/clear helpers like setDetachedWindowOpen/clearDetachedWindow) to
enqueue/dequeue by requestId.
In `@apps/web/src/features/browser/types.ts`:
- Around line 56-63: PersistedBrowserTab currently omits the openedAt field so
hydratePersistedTab() drops the invariant that identifies app-owned tabs; update
the PersistedBrowserTab type to include openedAt?: string and change
hydratePersistedTab() (and any serialization/save logic that builds
PersistedBrowserTab) to copy/preserve the openedAt value from the persisted data
into the hydrated BrowserTab object so restored tabs retain their openedAt
identifier for apps:stopped matching.
---
Outside diff comments:
In `@apps/agent-server/agents/claude/claude-handler.ts`:
- Around line 490-505: The variable queryResult is declared inside the try block
but accessed in the finally block causing a ReferenceError; hoist queryResult
(declare it before the try, initialized to null) in claude-handler.ts near where
claudeSDK is invoked, then update the finally cleanup to null-guard any
operations that use queryResult (e.g., claudeQueries.delete(sessionId),
attachQuery cleanup, or other teardown) so cleanup only runs when queryResult
was successfully assigned after calling claudeSDK({ prompt: promptInput,
options: sdkOptions }) and before calling attachQuery.
---
Nitpick comments:
In `@apps/agent-server/test/deus-tools.test.ts`:
- Around line 267-383: Add a new test suite for the read_app_skill tool to
mirror coverage of the other AAP tools: call
getRegisteredTools(server.instance)["read_app_skill"] and assert that on success
mockFrontendAPI.requestReadAppSkill is called with { sessionId: SESSION_ID,
appId } and the handler returns the skill JSON/text as expected; add a test for
when the backend returns a not-found or empty skill response (assert appropriate
message/omission), and a test for when
mockFrontendAPI.requestReadAppSkill.mockRejectedValue(...) to ensure the handler
returns an "AAP error: ..." string (does NOT throw). Use the same test patterns
and assertions as list_apps/launch_app/stop_app suites to keep behavior
consistent.
In `@apps/backend/src/services/aap/storage.ts`:
- Around line 14-20: ensureStorageDirs currently calls mkdir on any truthy path
which can create directories relative to the process CWD; validate inputs and
fail fast by checking that any provided paths (the workspace and global
parameters) are absolute before calling mkdir. In the ensureStorageDirs
function, use path.isAbsolute (or equivalent) to assert each provided path is
absolute and throw a clear error if not, so mkdir(paths.workspace, ...) and
mkdir(paths.global, ...) are only called after the absolute-path check; include
the offending path value in the error message for easier debugging.
In `@apps/backend/test/integration/aap.test.ts`:
- Around line 593-594: The two lines "void a; void b;" are used to silence
unused-variable warnings after stopAllApps() has cleaned up; replace them with
an explicit intention comment or an inline ESLint suppression to make the intent
clear: either add a brief comment referencing stopAllApps() and why a and b are
intentionally unused, or add "// eslint-disable-next-line
`@typescript-eslint/no-unused-vars`" immediately above the declarations of a and b
(or remove the variables if they are no longer needed) so the lint rule and
future readers understand this is intentional.
In `@apps/backend/test/unit/services/aap/port-allocator.test.ts`:
- Around line 24-25: The test name says "sequential calls" but the body calls
allocateFreePort() concurrently with Promise.all; update the it(...) description
to reflect the actual behavior (e.g., "returns different ports across concurrent
calls" or "returns different ports when called concurrently") so the test name
matches the use of allocateFreePort() in the Promise.all invocation.
In `@apps/backend/test/unit/services/agent-commands-aap.test.ts`:
- Around line 13-38: Move the file-scope mock spy variables into a vi.hoisted
block so the vi.mock factories don't close over regular variables; specifically
hoist mockLaunchApp, mockStopApp, mockGetWorkspaceForMiddleware, and
mockComputeWorkspacePath using vi.hoisted(...) before the vi.mock(...) calls,
then keep the vi.mock factories referencing those hoisted handles (the factories
that return launchApp/stopApp, getWorkspaceForMiddleware, and
computeWorkspacePath) so Vitest v4's hoisting works correctly.
In `@docs/aap-host-design.md`:
- Around line 44-64: Add a language specifier (e.g., "text" or "plaintext") to
the fenced code blocks containing the ASCII architecture diagram ("Backend (owns
state) Agent-server (stateless) Frontend") and the
file tree starting with "shared/aap/" so markdown linting and editor
highlighting pass; update the opening fences for those blocks to include the
specifier and leave the block contents unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f317c5c7-b071-46f8-8bed-2e442eddc37a
📒 Files selected for processing (62)
.gitignoreapps/agent-server/agents/claude/claude-handler.tsapps/agent-server/agents/deus-tools/apps.tsapps/agent-server/agents/deus-tools/index.tsapps/agent-server/agents/session-store.tsapps/agent-server/app-registrar.tsapps/agent-server/event-broadcaster.tsapps/agent-server/index.tsapps/agent-server/rpc-schemas.tsapps/agent-server/test/app-registrar.test.tsapps/agent-server/test/deus-tools.test.tsapps/backend/src/config/installed-apps.tsapps/backend/src/lib/repo-root.tsapps/backend/src/server.tsapps/backend/src/services/aap/apps.service.tsapps/backend/src/services/aap/index.tsapps/backend/src/services/aap/lifecycle.tsapps/backend/src/services/aap/mcp-bridge.tsapps/backend/src/services/aap/pid-journal.tsapps/backend/src/services/aap/port-allocator.tsapps/backend/src/services/aap/registry.tsapps/backend/src/services/aap/storage.tsapps/backend/src/services/agent/client.tsapps/backend/src/services/agent/commands.tsapps/backend/src/services/agent/index.tsapps/backend/src/services/agent/service.tsapps/backend/src/services/query-engine.tsapps/backend/test/integration/aap.test.tsapps/backend/test/unit/services/aap/lifecycle.test.tsapps/backend/test/unit/services/aap/mcp-bridge.test.tsapps/backend/test/unit/services/aap/port-allocator.test.tsapps/backend/test/unit/services/aap/registry.test.tsapps/backend/test/unit/services/aap/storage.test.tsapps/backend/test/unit/services/agent-commands-aap.test.tsapps/backend/test/unit/shared/aap/manifest.test.tsapps/backend/test/unit/shared/aap/template.test.tsapps/backend/test/unit/shared/events.test.tsapps/web/src/app/layouts/ContentView.tsxapps/web/src/app/layouts/content-tabs.tsapps/web/src/features/apps/hooks/index.tsapps/web/src/features/apps/hooks/useAppsLaunched.tsapps/web/src/features/apps/hooks/useAppsStopped.tsapps/web/src/features/apps/hooks/useInstalledApps.tsapps/web/src/features/apps/hooks/useRunningApps.tsapps/web/src/features/apps/index.tsapps/web/src/features/apps/ui/AppCard.tsxapps/web/src/features/apps/ui/AppStatusChip.tsxapps/web/src/features/apps/ui/AppsLauncher.tsxapps/web/src/features/apps/ui/index.tsapps/web/src/features/browser/store/browserWindowStore.tsapps/web/src/features/browser/types.tsapps/web/src/features/browser/ui/BrowserPanel.tsxapps/web/src/features/workspace/store/workspaceLayoutStore.tsdocs/aap-host-design.mdpackages/device-use/agentic-app.jsonpackages/device-use/src/cli/commands/serve.tspackages/device-use/src/server/index.tsshared/aap/manifest.tsshared/aap/template.tsshared/aap/types.tsshared/events.tsshared/schema.ts
| pendingNewTab: PendingNewTabRequest | null; | ||
| pendingCloseTab: PendingCloseTabRequest | null; | ||
| setDetachedWindowOpen: (context: DetachedBrowserWorkspaceContext) => void; | ||
| clearDetachedWindow: () => void; | ||
| requestNewTab: (workspaceId: string, url: string) => void; | ||
| consumePendingNewTab: () => void; | ||
| requestCloseTabByUrlPrefix: (workspaceId: string, urlPrefix: string) => void; | ||
| consumePendingCloseTab: () => void; |
There was a problem hiding this comment.
Single-slot “pending” state will drop tab open/close events.
pendingNewTab and pendingCloseTab only hold one request each. If two apps:launched or apps:stopped events arrive before BrowserPanel consumes the first one, the later set() overwrites the earlier request. The zero-arg consumePending*() calls can also clear a newer request accidentally. Please make these actual queues, and acknowledge consumption by requestId.
Also applies to: 74-96
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/web/src/features/browser/store/browserWindowStore.ts` around lines 36 -
43, The current single-slot pendingNewTab/pendingCloseTab will drop events;
change both to FIFO queues (e.g., pendingNewTabs: PendingNewTabRequest[] and
pendingCloseTabs: PendingCloseTabRequest[]) and update the API: have
requestNewTab and requestCloseTabByUrlPrefix push a new request object
(including a unique requestId/timestamp) onto the appropriate queue, and change
consumePendingNewTab and consumePendingClose to accept a requestId (or return
the dequeued item’s requestId) so the BrowserPanel can acknowledge and remove a
specific entry instead of clearing the whole slot; update related types and any
callers (e.g., BrowserPanel consumer code and the set/clear helpers like
setDetachedWindowOpen/clearDetachedWindow) to enqueue/dequeue by requestId.
There was a problem hiding this comment.
Holding the single-slot design. Two reasons:
- Backend dedups.
apps.serviceenforces one running instance per (appId, workspace), so a duplicatelaunchAppreturns the existingrunningAppIdinstead of emitting a secondapps:launched. The race the suggestion targets (two events arriving before BrowserPanel consumes) is essentially unreachable for the same app, and a user would have to click two distinct app cards within the same React macrotask for it to fire across apps. - Consumer is synchronous.
BrowserPanel'suseEffectfires on the same render cycle that the producer'sset()triggers — no animation frame or async wait in between. This pattern (single-slot + immediate consume) is the same one we use forchatInsertStore.
Promoting to a queue with requestId ack would touch four call sites and add a new failure mode (forgotten ack → unbounded growth) for a race we can't reproduce. If we ever see it in practice, the fix lands in one place — happy to revisit then.
There was a problem hiding this comment.
@zvadaadam, understood — both arguments are solid. The backend's single-instance enforcement removes the root cause of the race, and the synchronous useEffect / consume cycle makes the window vanishingly small even across different apps. Acknowledged, I'll stop flagging this pattern here.
✏️ Learnings added
Learnt from: zvadaadam
Repo: zvadaadam/deus-machine PR: 250
File: apps/web/src/features/browser/store/browserWindowStore.ts:36-43
Timestamp: 2026-04-19T22:14:39.702Z
Learning: In `apps/web/src/features/browser/store/browserWindowStore.ts`, the single-slot `pendingNewTab`/`pendingCloseTab` design is intentional. Two reasons: (1) `apps.service` enforces one running instance per (appId, workspace), so a second `apps:launched` event for the same app is never emitted — the duplicate `launchApp` call returns the existing `runningAppId`; (2) `BrowserPanel`'s `useEffect` consumer fires synchronously on the same render cycle that the producer's Zustand `set()` triggers (same pattern as `chatInsertStore`), so the slot is consumed before any second event could overwrite it. Do not suggest converting to a FIFO queue or requestId-ack pattern in future reviews.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: zvadaadam
Repo: zvadaadam/deus-machine PR: 244
File: apps/web/src/features/session/hooks/useSlashCommand.ts:87-154
Timestamp: 2026-04-15T21:01:15.460Z
Learning: In `apps/web/src/features/session/hooks/useSlashCommand.ts`, the `useEffect` that fetches skills/commands on workspace change intentionally omits a stale-response cancellation guard. The trade-off is deliberate: workspace path changes are rare (only on repo switch), the backend reads local filesystem files in sub-100ms, and the race condition window is practically impossible. Do not flag the absence of a cancellation guard in this hook in future reviews.
Learnt from: zvadaadam
Repo: zvadaadam/deus-machine PR: 247
File: apps/web/src/app/layouts/useChatTabs.ts:328-347
Timestamp: 2026-04-17T08:21:39.540Z
Learning: In `apps/web/src/app/layouts/useChatTabs.ts`, `updateChatTabAgentHarness` is intentionally a no-op when `tab.data?.hasStarted` is true. Once a session has messages, the harness is locked server-side in `handleSendMessage`; allowing local UI harness changes on started tabs would cause drift from the persisted session until reload. Do not flag this early-return as missing logic in future reviews.
Learnt from: zvadaadam
Repo: zvadaadam/deus-machine PR: 246
File: apps/backend/src/services/agent/commands.ts:240-272
Timestamp: 2026-04-16T17:05:23.922Z
Learning: In `apps/backend/src/services/agent/commands.ts`, the `sim:launchApp`, `sim:terminateApp`, and `sim:uninstallApp` handlers intentionally do not guard against the TOCTOU race between `getContextForWorkspace()` and `execFile("xcrun", ...)`. The decision is deliberate: simctl operations complete in under 100ms and session mutations (startStream/stopStream) are rare, making the race window practically negligible. Do not flag the absence of a session lock in these handlers in future reviews.
Learnt from: zvadaadam
Repo: zvadaadam/deus-machine PR: 244
File: apps/web/src/features/session/hooks/useSlashCommand.ts:67-154
Timestamp: 2026-04-15T21:01:13.186Z
Learning: In `apps/web/src/features/session/hooks/useSlashCommand.ts`, the one-shot fetch pattern using `useEffect` + `useState` + `fetchedWorkspaceRef` is intentional. Skills/commands are local config reads that happen once per workspace open and do not need TanStack Query (no cache invalidation, stale-while-revalidate, or server state management required). Do not suggest migrating this to TanStack Query; it would be over-engineering for this use case.
Learnt from: zvadaadam
Repo: zvadaadam/deus-machine PR: 240
File: apps/web/src/features/session/hooks/useTextBuffer.ts:49-60
Timestamp: 2026-04-12T23:40:23.156Z
Learning: In `apps/web/src/features/session/hooks/useTextBuffer.ts`, the markdown code-fence check in the rAF streaming buffer is intentionally conservative: it flushes the full text immediately when any unclosed ``` is detected in the current slice, even if it's a false positive (e.g. right after a closing fence). This trade-off is deliberate — premature flush (showing full text) is harmless, while rendering a partial/unclosed code fence causes visual glitches. Do not flag this as a bug in future reviews.
Learnt from: zvadaadam
Repo: zvadaadam/deus-machine PR: 246
File: apps/agent-server/event-broadcaster.ts:401-408
Timestamp: 2026-04-16T17:05:42.204Z
Learning: In zvadaadam/deus-machine, `EventBroadcaster.tunnels` in `apps/agent-server/event-broadcaster.ts` only has the backend attached in practice. The agent-server exposes a single WebSocket endpoint that only the backend connects to. `requestSimulatorContext` (calling `rpc("getSimulatorContext", ...)`) is safe because `apps/backend/src/services/agent/service.ts` intercepts `onFrontendRpc` calls with `method === "getSimulatorContext"` and returns `getContextForSession(sessionId)` directly, never forwarding to relay or desktop clients. Do not flag multi-tunnel routing concerns for this method.
Learnt from: zvadaadam
Repo: zvadaadam/deus-machine PR: 247
File: apps/backend/cli.ts:278-278
Timestamp: 2026-04-17T08:15:31.894Z
Learning: In zvadaadam/deus-machine, `apps/backend/cli.ts` is a local developer-only integration CLI that reads `--agent` exclusively from process argv and writes to the developer's own local SQLite DB. SQL interpolation of `opts.agent` without escaping is acceptable risk here — do not flag raw argv interpolation in this file in future reviews.
Learnt from: zvadaadam
Repo: zvadaadam/box-ide PR: 177
File: src/app/layouts/hooks/useRepoActions.ts:21-22
Timestamp: 2026-03-12T23:25:58.674Z
Learning: In zvadaadam/box-ide, two TypeScript path aliases coexist: 'shared/*' -> '/shared/' (root-level, cross-package) and '@/shared/*' -> 'src/shared/' (frontend-only). Do not flag mixed usage as inconsistent since they reference different directories and scopes. Ensure imports use the correct alias per intended scope and that tsconfig.json and vite.config.ts are configured accordingly. This applies to all TypeScript files in the repo.
- claude-handler: hoist queryResult declaration so the finally block can detach the SDK query without a ReferenceError on early throws (CI red) - deus-tools test: add requestReadAppSkill mock + read_app_skill suite, bump tool count 41 → 42 (CI red) - events test: PROTOCOL_EVENTS now contains apps:stopped, length 22 → 23 (CI red) - pid-journal: strict /^\d+$/ parse so a corrupted entry like "123abc" can't target an unrelated PID during the orphan sweep - aap registry: contain skill paths within the package root so a future untrusted manifest can't escape via "../foo" via read_app_skill - aap lifecycle: fix bin shorthand resolution per the npm spec — match pj.name === command and resolve pj.bin (was matching pj.bin and resolving pj.main, dead code today but a foot-gun for future apps) - query-engine archiveWorkspace: swallow stopAppsForWorkspace rejection so a slow/crashing child can't block the archive — orphan sweep on next boot is the safety net - AppCard: render "Stopping..." (matching the chip) instead of a disabled "Launch" while running.status === "stopping" - browser types: persist openedAt in PersistedBrowserTab + hydrate it back so apps:stopped can still match an app-owned tab after a workspace reload Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
apps/backend/src/services/aap/lifecycle.ts (2)
52-52: Naming mismatch:RING_MAX_LINESlimits chunks, not lines.The constant name suggests line-based limiting, but
pushBoundedpushes entireBuffer.toString()results (chunks). A single data event can span multiple lines or be a partial line. The current behavior is fine for a tail buffer, but the name is misleading.📝 Suggested rename for clarity
-const RING_MAX_LINES = 50; +const RING_MAX_CHUNKS = 50;Also applies to: 85-88
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/services/aap/lifecycle.ts` at line 52, The constant RING_MAX_LINES is misleading because it limits pushed chunks (from Buffer.toString()) rather than lines; rename RING_MAX_LINES to something reflecting chunk-based limits (e.g., RING_MAX_CHUNKS or RING_MAX_ENTRIES) and update all references (including where pushBounded is called) and any related comments to match the new name; ensure uses in lifecycle.ts (the declaration and the logic around pushBounded at lines referenced in the review) are updated consistently so behavior remains unchanged but the identifier accurately describes that it bounds chunks, not lines.
155-163: Usets-patternwith.exhaustive()for theReadyProbediscriminated union.
ReadyProbeis a discriminated union ontype("http" | "tcp"). The current if-else chain with a fallbackreturn falsebypasses compile-time exhaustiveness checking.♻️ Suggested refactor using ts-pattern
+import { match } from "ts-pattern"; + async function probeOnce(probe: ReadyProbe, port: number): Promise<boolean> { try { - if (probe.type === "http") return await probeHttp(port, probe.path); - if (probe.type === "tcp") return await probeTcp(port); - return false; + return await match(probe) + .with({ type: "http" }, (p) => probeHttp(port, p.path)) + .with({ type: "tcp" }, () => probeTcp(port)) + .exhaustive(); } catch { return false; } }As per coding guidelines: "Prefer
ts-patternfor discriminated unions. Use.exhaustive()for exhaustiveness checking."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/services/aap/lifecycle.ts` around lines 155 - 163, Replace the if-chain in probeOnce so the ReadyProbe discriminated union is handled via ts-pattern: match(probe).with({type: "http"}, () => probeHttp(port, probe.path)).with({type: "tcp"}, () => probeTcp(port)).exhaustive(); keep the try/catch and return semantics (wrap the match call in the existing try and return false on catch) and reference the functions ReadyProbe, probeOnce, probeHttp, and probeTcp so the pattern match is used for compile-time exhaustiveness checking with .exhaustive().apps/agent-server/agents/claude/claude-handler.ts (1)
500-509: Consider refining the type cast.The cast to
McpSdkServerConfigWithInstanceis technically imprecise sincesdkOptions.mcpServerscontains config objects passed to the SDK, not the connected instances returned by the SDK. However, this is functionally safe because:
broadcast()spreads these into aRecord<string, McpServerConfig>(the base type)- TypeScript's structural typing means missing
instanceproperties are ignored downstreamThe current code works, but a more precise type would be
Record<string, McpServerConfig>matching whatbroadcast()actually consumes.♻️ Optional: Use more precise type
- const sdkServers = (sdkOptions.mcpServers ?? {}) as Record< - string, - McpSdkServerConfigWithInstance - >; + const sdkServers = (sdkOptions.mcpServers ?? {}) as Record<string, McpServerConfig>;This would also require updating the
attachQuerysignature inapp-registrar.tsto acceptRecord<string, McpServerConfig>.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/agent-server/agents/claude/claude-handler.ts` around lines 500 - 509, The code narrows sdkOptions.mcpServers to McpSdkServerConfigWithInstance which is imprecise; change the local type to Record<string, McpServerConfig> (the shape broadcast() actually consumes) and update attachQuery's parameter type in app-registrar.ts to accept Record<string, McpServerConfig> so the cast is no longer needed; locate sdkOptions.mcpServers and the attachQuery call in claude-handler.ts and update the attachQuery signature in app-registrar.ts accordingly (retain behavior, just adjust types).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/web/src/features/browser/ui/BrowserPanel.tsx`:
- Around line 352-354: The filter for doomed tabs currently only matches when
t.openedAt exists; change the predicate in the tabs.filter used to compute
doomed so it first tries t.openedAt.startsWith(pendingCloseTab.urlPrefix) and,
if t.openedAt is falsy, falls back to checking
t.currentUrl?.startsWith(pendingCloseTab.urlPrefix) so older persisted tabs
without openedAt are included; update the expression around doomed to use this
fallback logic and keep pendingCloseTab.urlPrefix as the matching value.
---
Nitpick comments:
In `@apps/agent-server/agents/claude/claude-handler.ts`:
- Around line 500-509: The code narrows sdkOptions.mcpServers to
McpSdkServerConfigWithInstance which is imprecise; change the local type to
Record<string, McpServerConfig> (the shape broadcast() actually consumes) and
update attachQuery's parameter type in app-registrar.ts to accept Record<string,
McpServerConfig> so the cast is no longer needed; locate sdkOptions.mcpServers
and the attachQuery call in claude-handler.ts and update the attachQuery
signature in app-registrar.ts accordingly (retain behavior, just adjust types).
In `@apps/backend/src/services/aap/lifecycle.ts`:
- Line 52: The constant RING_MAX_LINES is misleading because it limits pushed
chunks (from Buffer.toString()) rather than lines; rename RING_MAX_LINES to
something reflecting chunk-based limits (e.g., RING_MAX_CHUNKS or
RING_MAX_ENTRIES) and update all references (including where pushBounded is
called) and any related comments to match the new name; ensure uses in
lifecycle.ts (the declaration and the logic around pushBounded at lines
referenced in the review) are updated consistently so behavior remains unchanged
but the identifier accurately describes that it bounds chunks, not lines.
- Around line 155-163: Replace the if-chain in probeOnce so the ReadyProbe
discriminated union is handled via ts-pattern: match(probe).with({type: "http"},
() => probeHttp(port, probe.path)).with({type: "tcp"}, () =>
probeTcp(port)).exhaustive(); keep the try/catch and return semantics (wrap the
match call in the existing try and return false on catch) and reference the
functions ReadyProbe, probeOnce, probeHttp, and probeTcp so the pattern match is
used for compile-time exhaustiveness checking with .exhaustive().
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0d9b6487-0a44-447f-a9ea-4046996ff224
📒 Files selected for processing (10)
apps/agent-server/agents/claude/claude-handler.tsapps/agent-server/test/deus-tools.test.tsapps/backend/src/services/aap/lifecycle.tsapps/backend/src/services/aap/pid-journal.tsapps/backend/src/services/aap/registry.tsapps/backend/src/services/query-engine.tsapps/backend/test/unit/shared/events.test.tsapps/web/src/features/apps/ui/AppCard.tsxapps/web/src/features/browser/types.tsapps/web/src/features/browser/ui/BrowserPanel.tsx
✅ Files skipped from review due to trivial changes (1)
- apps/backend/src/services/aap/pid-journal.ts
🚧 Files skipped from review as they are similar to previous changes (5)
- apps/backend/src/services/query-engine.ts
- apps/backend/test/unit/shared/events.test.ts
- apps/web/src/features/browser/types.ts
- apps/web/src/features/apps/ui/AppCard.tsx
- apps/backend/src/services/aap/registry.ts
| const doomed = tabs.filter( | ||
| (t) => t.openedAt && t.openedAt.startsWith(pendingCloseTab.urlPrefix) | ||
| ); |
There was a problem hiding this comment.
Add a fallback matcher for tabs without openedAt.
Line 353 currently excludes tabs that don’t have openedAt (e.g., older persisted tabs before this field existed). That can miss expected close behavior on apps:stopped. Keep openedAt as primary, but fall back to currentUrl when missing.
💡 Proposed patch
- const doomed = tabs.filter(
- (t) => t.openedAt && t.openedAt.startsWith(pendingCloseTab.urlPrefix)
- );
+ const doomed = tabs.filter((t) => {
+ const matchUrl = t.openedAt ?? t.currentUrl ?? "";
+ return matchUrl.startsWith(pendingCloseTab.urlPrefix);
+ });📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const doomed = tabs.filter( | |
| (t) => t.openedAt && t.openedAt.startsWith(pendingCloseTab.urlPrefix) | |
| ); | |
| const doomed = tabs.filter((t) => { | |
| const matchUrl = t.openedAt ?? t.currentUrl ?? ""; | |
| return matchUrl.startsWith(pendingCloseTab.urlPrefix); | |
| }); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/web/src/features/browser/ui/BrowserPanel.tsx` around lines 352 - 354,
The filter for doomed tabs currently only matches when t.openedAt exists; change
the predicate in the tabs.filter used to compute doomed so it first tries
t.openedAt.startsWith(pendingCloseTab.urlPrefix) and, if t.openedAt is falsy,
falls back to checking t.currentUrl?.startsWith(pendingCloseTab.urlPrefix) so
older persisted tabs without openedAt are included; update the expression around
doomed to use this fallback logic and keep pendingCloseTab.urlPrefix as the
matching value.
There was a problem hiding this comment.
Holding the openedAt-only matcher. The block comment two lines above (apps/web/src/features/browser/ui/BrowserPanel.tsx:343-347) explains why currentUrl is the wrong fallback: when an app dies, Electron's BrowserView fails the next reload with ERR_CONNECTION_REFUSED and rewrites currentUrl to chrome-error://chromewebdata/, so the startsWith(urlPrefix) check would silently miss the very tab we want to close. openedAt is the immutable identifier that solves that.
The 'older persisted tabs' migration scenario also doesn't apply here — this PR introduces both the field and its persistence simultaneously, and there's no shipped version where AAP-owned tabs existed without it. Falling back to currentUrl would also force-close any user-opened tab that happens to share a URL prefix (e.g. their own localhost:3000 dev tab), which the AAP stop flow shouldn't do.
There was a problem hiding this comment.
@zvadaadam, you're right on all three counts — the reasoning is airtight:
currentUrlerasure: Electron'sERR_CONNECTION_REFUSEDreload rewritingcurrentUrltochrome-error://chromewebdata/means anystartsWith(urlPrefix)check oncurrentUrlwould silently miss the very tab it's supposed to close — the opposite of a safe fallback.- No migration gap: Since
openedAtstamping and its persistence land in the same PR as AAP tab creation, there's no pre-existing persisted state where AAP-owned tabs lack the field. - Accidental close risk: Falling back to
currentUrlcould close unrelated user tabs that happen to share the samelocalhostorigin/port prefix.
Withdrawing the suggestion. The openedAt-only approach is correct here.
✏️ Learnings added
Learnt from: zvadaadam
Repo: zvadaadam/deus-machine PR: 250
File: apps/web/src/features/browser/ui/BrowserPanel.tsx:352-354
Timestamp: 2026-04-19T22:14:42.822Z
Learning: In `apps/web/src/features/browser/ui/BrowserPanel.tsx`, the `pendingCloseTab` effect intentionally matches only on `t.openedAt` (not `t.currentUrl`) when filtering tabs to close on `apps:stopped`. Do NOT suggest a `currentUrl` fallback for three reasons: (1) Electron rewrites `currentUrl` to `chrome-error://chromewebdata/` on `ERR_CONNECTION_REFUSED` reload when an app dies, so `startsWith(urlPrefix)` on `currentUrl` would silently miss the tab; (2) `openedAt` and its persistence were introduced in the same PR as AAP tab creation, so there is no migration scenario where AAP-owned tabs lack the field; (3) falling back to `currentUrl` risks accidentally closing user-opened tabs that share the same localhost origin/port prefix.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: zvadaadam
Repo: zvadaadam/deus-machine PR: 244
File: apps/web/src/features/session/hooks/useSlashCommand.ts:87-154
Timestamp: 2026-04-15T21:01:15.460Z
Learning: In `apps/web/src/features/session/hooks/useSlashCommand.ts`, the `useEffect` that fetches skills/commands on workspace change intentionally omits a stale-response cancellation guard. The trade-off is deliberate: workspace path changes are rare (only on repo switch), the backend reads local filesystem files in sub-100ms, and the race condition window is practically impossible. Do not flag the absence of a cancellation guard in this hook in future reviews.
Learnt from: zvadaadam
Repo: zvadaadam/deus-machine PR: 247
File: apps/web/src/app/layouts/useChatTabs.ts:328-347
Timestamp: 2026-04-17T08:21:39.540Z
Learning: In `apps/web/src/app/layouts/useChatTabs.ts`, `updateChatTabAgentHarness` is intentionally a no-op when `tab.data?.hasStarted` is true. Once a session has messages, the harness is locked server-side in `handleSendMessage`; allowing local UI harness changes on started tabs would cause drift from the persisted session until reload. Do not flag this early-return as missing logic in future reviews.
Learnt from: zvadaadam
Repo: zvadaadam/deus-machine PR: 240
File: apps/web/src/features/session/hooks/useTextBuffer.ts:49-60
Timestamp: 2026-04-12T23:40:23.156Z
Learning: In `apps/web/src/features/session/hooks/useTextBuffer.ts`, the markdown code-fence check in the rAF streaming buffer is intentionally conservative: it flushes the full text immediately when any unclosed ``` is detected in the current slice, even if it's a false positive (e.g. right after a closing fence). This trade-off is deliberate — premature flush (showing full text) is harmless, while rendering a partial/unclosed code fence causes visual glitches. Do not flag this as a bug in future reviews.
Learnt from: zvadaadam
Repo: zvadaadam/deus-machine PR: 247
File: apps/backend/cli.ts:278-278
Timestamp: 2026-04-17T08:15:31.894Z
Learning: In zvadaadam/deus-machine, `apps/backend/cli.ts` is a local developer-only integration CLI that reads `--agent` exclusively from process argv and writes to the developer's own local SQLite DB. SQL interpolation of `opts.agent` without escaping is acceptable risk here — do not flag raw argv interpolation in this file in future reviews.
Greenlight — Round 1Fixed (9 in
Disagreed (2)
CI is green (4/4 checks passing). |
- aap/lifecycle: rename RING_MAX_LINES → RING_MAX_CHUNKS — the bound is on
pushBounded chunks (one per stdout/stderr "data" event), not lines, and
the old name was misleading
- aap/lifecycle: switch probeOnce's ReadyProbe handling to ts-pattern
match(...).exhaustive() per CLAUDE.md ("Prefer ts-pattern for
discriminated unions, use .exhaustive() for exhaustiveness checking").
Drops the dead `return false` fallback that was bypassing the check
- app-registrar / claude-handler: type protectedByQuery (and attachQuery's
parameter) as Record<string, McpServerConfig> — the union the SDK's
setMcpServers actually consumes. Drops the imprecise cast to
McpSdkServerConfigWithInstance in claude-handler
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Greenlight — Round 2Caught 3 nitpick comments that were collapsed inside the second CodeRabbit review body (not posted as inline comments). Fixed (3 in
All 407 agent-server tests + 14 lifecycle tests still pass. CI green. |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/agent-server/agents/claude/claude-handler.ts`:
- Around line 683-685: The detachQuery call should run unconditionally so the
Query's protectedByQuery entry is always removed even if
claudeSessions.owns(sessionId, session) becomes false; change the finally block
to first call detachQuery(queryResult) if queryResult is truthy, and then
perform the ownership-guarded cleanup (i.e., only call
claudeQueries.delete(sessionId) and any claudeSessions-related deletions inside
the if (claudeSessions.owns(sessionId, session)) block). Keep the
functions/variables claudeSessions.owns, detachQuery, queryResult, and
claudeQueries.delete as the locating references.
In `@apps/agent-server/app-registrar.ts`:
- Around line 90-92: attachQuery currently only stores sdkServers in
protectedByQuery for a new Query but does not populate that Query with the
existing registered servers, so queries attached after apps were registered
never see pre-existing entries; modify attachQuery (or call a sync step from
broadcast after attachQuery) to merge the current registeredServers snapshot
into the new Query immediately — i.e., after protectedByQuery.set(query,
sdkServers) invoke the same codepath that applies a registration payload to a
Query (using the registeredServers map) so the newly attached Query receives all
currently registered servers without waiting for the next register/unregister
event.
- Around line 63-81: Concurrent registry updates can interleave: wrap the
mutate-then-broadcast sequence in a single serialized lock so registerAppMcp and
unregisterAppMcp never run concurrently and an older broadcast cannot overwrite
a newer snapshot; implement a module-scoped async mutex/promise-chain (e.g., a
single Promise lastOp or a small mutex class) and perform the
registeredServers.set/delete and the subsequent await broadcast() inside that
critical section, and apply the same serialization pattern to any other
functions that call broadcast() (the code path that triggers
query.setMcpServers()) so all registry mutations are ordered.
In `@apps/backend/src/services/aap/lifecycle.ts`:
- Around line 64-65: The manifest's relative paths must be anchored to the app
package root: when computing cwd use the substituted launch.cwd but if that
value is a relative path resolve it against packageRoot (so cwd becomes
path.resolve(packageRoot, substitutedCwd) when not absolute); likewise update
resolveCommand(launch.command, packageRoot) to treat commands that contain path
separators or start with ./ or ../ as file paths and resolve those relative
commands against packageRoot (instead of the process cwd), and apply the same
fix wherever resolveCommand or launch.cwd are used later (e.g., the spawn/exec
paths around the 264–299 logic) so spawned processes run from the packageRoot
context.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 334ffec3-d6f0-4577-ab86-61594e6ea309
📒 Files selected for processing (3)
apps/agent-server/agents/claude/claude-handler.tsapps/agent-server/app-registrar.tsapps/backend/src/services/aap/lifecycle.ts
- claude-handler: detach the Query from the registrar unconditionally in the finally block. The registrar's protectedByQuery is keyed by Query identity, so a rapid re-query (which makes claudeSessions.owns return false) used to leak the dead Query handle until process exit. Only the sessionId-keyed maps still need the ownership guard. - app-registrar: serialize register/unregister broadcasts through a single promise chain. The JSON-RPC dispatcher fires handlers without locking, so two concurrent state changes could race — the later mutation finishing first, then the earlier broadcast overwriting the SDK with a stale snapshot. FIFO chain enforces submission order without a real mutex. - app-registrar: catch new queries up to the current dynamic state when attachQuery runs. Previously, apps registered before a session existed stayed invisible to that session until the next register/unregister event happened to fire. - aap/lifecycle: anchor relative `launch.cwd` and path-form `launch.command` (e.g. "./dist/cli.js") to the app's packageRoot. Node's spawn would otherwise resolve them against the backend's process cwd — fine for device-use today, but a foot-gun for any future manifest using relative paths. Tests: - app-registrar: catch-up, no-op-when-empty-registry, FIFO serialization - aap/lifecycle: relative cwd resolves under packageRoot, not process.cwd Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Greenlight — Round 3CodeRabbit caught a 4-comment cluster around AAP registrar concurrency / lifecycle. All actionable, all fixed in Fixed (4)
Tests added (4)
All 410 agent-server tests + 61 AAP unit/integration tests pass locally. CI green. |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
apps/agent-server/test/app-registrar.test.ts (1)
162-181: Consider exposing a test hook for awaiting the update chain.The
__sync__register/unregister pattern works but is indirect. A dedicated__awaitPendingUpdatesForTests()helper in the registrar would make the intent clearer and avoid the dead-letter0.0.0.0:0URL. However, this is a minor ergonomic nit—the current approach correctly validates the FIFO guarantee.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/agent-server/test/app-registrar.test.ts` around lines 162 - 181, Add a dedicated test hook on the registrar to await the pending update chain instead of using the indirect "__sync__" register/unregister trick: expose a method named something like awaitPendingUpdatesForTests() in the registrar implementation and export it for tests, then update this test to call await awaitPendingUpdatesForTests() after attachQuery (replacing the registerAppMcp("__sync__", ...) / unregisterAppMcp("__sync__") calls). Ensure the new helper waits until the same update chain that attachQuery triggers has been processed so the existing expectation on q1.setMcpServers remains valid.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/agent-server/test/app-registrar.test.ts`:
- Around line 100-109: Rename the test case to reflect that unregistering an
unknown server does not broadcast; update the it(...) description string so it
matches the assertion (e.g., change the string in the test containing
"unregister an unknown server is a no-op for the map but still broadcasts
current state" to something like "unregister an unknown server is a no-op and
does not broadcast"), keeping the test body that uses claudeQueries, q1, and
asserts q1.setMcpServers.not.toHaveBeenCalled() and the call to
unregisterAppMcp("never_registered") unchanged.
In `@apps/backend/src/services/aap/lifecycle.ts`:
- Around line 120-122: The current code finalizes buffered stderr inside the
child.once("exit", ...) handler which can miss pending stderr data; change the
listener to child.once("close", (code, signal) => { finalizeExit(code, signal,
getStderr()); }) so finalizeExit is invoked after stdio is fully drained; update
any references to the exit handler to use "close" and keep using finalizeExit
and getStderr as before.
---
Nitpick comments:
In `@apps/agent-server/test/app-registrar.test.ts`:
- Around line 162-181: Add a dedicated test hook on the registrar to await the
pending update chain instead of using the indirect "__sync__"
register/unregister trick: expose a method named something like
awaitPendingUpdatesForTests() in the registrar implementation and export it for
tests, then update this test to call await awaitPendingUpdatesForTests() after
attachQuery (replacing the registerAppMcp("__sync__", ...) /
unregisterAppMcp("__sync__") calls). Ensure the new helper waits until the same
update chain that attachQuery triggers has been processed so the existing
expectation on q1.setMcpServers remains valid.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 90b886b1-0e23-46c2-b44b-98459fb3392e
📒 Files selected for processing (5)
apps/agent-server/agents/claude/claude-handler.tsapps/agent-server/app-registrar.tsapps/agent-server/test/app-registrar.test.tsapps/backend/src/services/aap/lifecycle.tsapps/backend/test/unit/services/aap/lifecycle.test.ts
- aap/lifecycle: switch the spawnApp completion listener from "exit" to "close" so all stdio is fully drained before getStderr() is captured. The ring buffer exists for crash diagnostics; using "exit" could miss the very stderr chunks that explain why a child died. - lifecycle.test "SIGKILLs a child that ignores SIGTERM": await the onExit callback explicitly. Previously the test relied on stopChild returning at the same tick as onExit; "close" lands a tick or two after "exit" so the assertion would race the callback. - app-registrar.test: rename "no-op for the map but still broadcasts current state" → "silent no-op — no broadcast" so the description matches the assertion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Greenlight — Round 4Two minor cleanups in Fixed (2)
All 410 agent-server + 61 AAP tests pass locally. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/backend/test/unit/services/aap/lifecycle.test.ts`:
- Around line 149-163: The test starts a delayed setTimeout that flips the
server response but never clears its timer, so if the test fails early the
pending callback can mutate server state; store the timer id returned from
setTimeout (replace the anonymous call with e.g. const flipTimer =
setTimeout(...)) and in the existing finally block call clearTimeout(flipTimer)
before awaiting close() to ensure the delayed flip is cancelled; update
references around the server flip code and the finally block where close() is
awaited to clear the timer.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: cec806a6-b27d-4b99-9c29-ed43f174ad25
📒 Files selected for processing (3)
apps/agent-server/test/app-registrar.test.tsapps/backend/src/services/aap/lifecycle.tsapps/backend/test/unit/services/aap/lifecycle.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- apps/agent-server/test/app-registrar.test.ts
- apps/backend/src/services/aap/lifecycle.ts
- "SIGTERMs a graceful child" was racing the onExit callback after the spawnApp listener moved from "exit" to "close" (close lands a tick or two after exit). Await the callback through a Promise like the SIGKILL test already does, so the assertion fires after onExit lands. - "keeps polling when the server returns 5xx" never cleared its 400ms flip timer in the finally block — an early throw would leave the callback alive and hit a closed server in the next test. Hoist the timer outside try and clearTimeout in finally for tighter isolation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Greenlight — Round 5Fixed (2)
All 15 lifecycle tests pass locally. |
CodeRabbit suggested finalizing on "close" to capture any trailing stderr
chunks emitted between exit and close. That's theoretically safer, but on
Linux CI it makes the SIGTERM lifecycle test time out at 10s:
spawn("sh -c \"sleep 30\"") → sh (PID X) forks sleep (PID Y)
stopChild sends SIGTERM to X; dash exits immediately without
propagating SIGTERM to Y. `sleep` is reparented to init and keeps
holding its inherited stdout/stderr pipe fds open. Node's "close"
event only fires when ALL writers release the pipe — so close waits
up to 30 seconds for `sleep` to naturally exit.
macOS bash kills its children on SIGTERM, so close fires immediately
there. But we can't rely on that on Linux.
Trade-off: the ring buffer already captures everything delivered up to
the exit moment. For crash diagnostics that's the material signal —
a handful of trailing bytes from a not-yet-flushed write are not worth
the cross-platform hang. Revert to "exit" and document the rationale
in place so the next reviewer doesn't re-propose the same change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Greenlight — Round 6 (revert)Rolling back the Round 4
macOS bash kills its children on SIGTERM so close fires immediately there; we can't rely on that. Trade-off: the ring buffer already captures everything delivered by the time 15 lifecycle tests pass. CI should be green. |
Summary
Implements the Agentic Apps Protocol host so Deus can spawn, supervise, and MCP-route installed agentic apps —
deus.mobile-useships as the first integrated app, launchable from the UI's Apps tab or via Claude'smcp__deus__launch_apptool (+stop_app,list_apps,read_app_skill).Backend owns lifecycle (spawn + ready-probe + SIGTERM stop, orphan sweep via PID journal), agent-server's
app-registrarbroadcasts MCP config viaquery.setMcpServerswhile preserving the session's built-in SDK servers (attach/detachQuery — without this the deus transport would disconnect mid-tool-call and hanglaunch_appforever), frontend opens/closes Browser tabs onapps:launched/apps:stoppedq:events matching by an immutableopenedAtstamp so Electron's chrome-error rewrite after app exit doesn't break the match.Along the way fixed: device-use's
servenow runs in-process (was orphaning a grandchild server on stop), its UI serves the prod bundle (NODE_ENV=productionin manifest) with the static root anchored to the package viaimport.meta.url(AAP spawns withcwd={workspace}), and register-mcp is fire-and-forget to break a re-entrant control-request deadlock.Test plan
register-mcpcontractlaunch_app→read_app_skill→stop_appround-trips,apps:launched+apps:stoppedevents fire with correct payloads🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
Tests
Chores