Merged
35 commits
5d1ec3e
feat(device-use): Phase 1 — scaffold v2 server + frontend
zvadaadam Apr 17, 2026
63b66ad
feat(device-use): Phase 2 — expand engine with xcodebuild, build wrap…
zvadaadam Apr 17, 2026
ac09d6d
feat(device-use): Phase 3 — REST API, MCP HTTP endpoint, WS event bus
zvadaadam Apr 17, 2026
c704fea
feat(device-use): Phase 4 — React viewer (TopBar, DeviceFrame, Sideba…
zvadaadam Apr 17, 2026
29a71d8
feat(device-use): Phase 5 — CLI cleanup (drop SDK, drop stream CLI, a…
zvadaadam Apr 17, 2026
28aabd2
feat(device-use): Phase 6 — docs, AAP manifest, composite run tool
zvadaadam Apr 17, 2026
c79baaa
test(device-use): add end-to-end server harness against real simulator
zvadaadam Apr 17, 2026
dd2bce5
fix(device-use): bind server to 0.0.0.0 so IPv6 orphans can't shadow us
zvadaadam Apr 17, 2026
740b865
fix(device-use): boot + set_active_simulator now start the MJPEG stream
zvadaadam Apr 17, 2026
4c8999e
feat(device-use): interactive tap + swipe on the stream
zvadaadam Apr 17, 2026
20582a3
fix(device-use): use canvas + binary WS for interactive stream
zvadaadam Apr 17, 2026
678a5ac
feat(device-use): auto-refresh the elements sidebar after each canvas…
zvadaadam Apr 17, 2026
43b0924
fix(device-use): ref-based tap uses tapEntry (id → label → coords fal…
zvadaadam Apr 17, 2026
5324c77
feat(device-use): hardware-button row + context-menu suppression
zvadaadam Apr 17, 2026
fd1bd9a
refactor(device-use): post-cleanup — remove dead code + tighten server
zvadaadam Apr 17, 2026
1f9c00e
refactor(device-use): drop Vol+/Vol− buttons — simbridge doesn't map …
zvadaadam Apr 17, 2026
ef6e7ca
fix(device-use): stop the 10s blink — use primitive deps in DeviceFrame
zvadaadam Apr 17, 2026
8f0fb70
fix(device-use): stop the blink at the source — sim-store preserves i…
zvadaadam Apr 17, 2026
01cc5bd
feat(device-use): keyboard input + CI workflow + long-press verified
zvadaadam Apr 17, 2026
1d9ed9e
docs(device-use): handoff section for the next PR (Deus AAP host)
zvadaadam Apr 18, 2026
f14d701
fix(device-use): address CodeRabbit review on PR #249
zvadaadam Apr 18, 2026
52307fa
chore(device-use): sync bun.lock with zustand workspace dep
zvadaadam Apr 18, 2026
248d281
fix(device-use): address CodeRabbit round 2 (3/4)
zvadaadam Apr 18, 2026
92c6ff1
docs(device-use): clarify launch.ready manifest path in handoff section
zvadaadam Apr 18, 2026
fa9fe35
docs(device-use): address CodeRabbit round 4 (5 doc nits)
zvadaadam Apr 18, 2026
7349537
docs(device-use): align manifest example with shipped agentic-app.json
zvadaadam Apr 18, 2026
0727f9f
style(device-use): adopt Deus design tokens (Jony-Ive dark theme)
zvadaadam Apr 18, 2026
da10f8b
feat(device-use): collapsible logs drawer (collapsed by default)
zvadaadam Apr 18, 2026
1209382
feat(device-use): themed Select (Radix UI) — replaces native <select>
zvadaadam Apr 18, 2026
f07de8f
fix(device-use): correct cache-control for prod static serving
zvadaadam Apr 18, 2026
84f06e5
feat(device-use): proper iPhone/iPad device frame (ported from apps/web)
zvadaadam Apr 18, 2026
5ff20ec
fix(device-use): drop overlay dynamic island/camera dot — stream alre…
zvadaadam Apr 18, 2026
ae01a44
fix(device-use): shell aspect ratio derived from stream so the screen…
zvadaadam Apr 18, 2026
a501569
refactor(device-use): collapsible sidebar with Panel wrapper
zvadaadam Apr 18, 2026
f07bf09
docs: video-use + AAP v1.1 design draft
zvadaadam Apr 18, 2026
59 changes: 59 additions & 0 deletions .github/workflows/device-use-e2e.yml
@@ -0,0 +1,59 @@
name: device-use e2e

# Exercises the full device-use v2 stack against a real iOS simulator:
# server spawn → /health → WS subscribe → REST + MCP tool-call →
# xcodebuild → install → launch → snapshot → tap → state persistence.
#
# Schedule only — macOS minutes are 10× Linux, so we don't run per-PR.
# Manual dispatch is available if you want to verify a branch before merge.
on:
  schedule:
    - cron: "0 7 * * *" # 07:00 UTC nightly
  workflow_dispatch:

jobs:
  e2e:
    runs-on: macos-15
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v4

      - name: Setup Bun
        uses: oven-sh/setup-bun@v2
        with:
          bun-version: latest

      - name: Install deps (also builds simbridge via prepare-device-use)
        run: bun install --frozen-lockfile

      - name: Pick + boot a simulator
        id: sim
        run: |
          UDID=$(xcrun simctl list devices available --json \
            | jq -r '.devices | to_entries[] | .value[] | select(.name | startswith("iPhone")) | select(.isAvailable == true) | .udid' \
            | head -1)
          if [ -z "$UDID" ]; then
            echo "::error::No iPhone simulator available on this runner"
            xcrun simctl list devices available
            exit 1
          fi
          echo "Picked simulator: $UDID"
          echo "udid=$UDID" >> "$GITHUB_OUTPUT"
          xcrun simctl boot "$UDID" || true
          # Wait until the sim actually reports Booted, max ~30s.
          for _ in $(seq 1 30); do
            STATE=$(xcrun simctl list devices | grep "$UDID" | grep -oE '\((Booted|Booting|Shutdown)\)' | tr -d '()')
            [ "$STATE" = "Booted" ] && break
            sleep 1
          done
          xcrun simctl list devices | grep "$UDID"

      - name: Run e2e against the real sim
        working-directory: packages/device-use
        env:
          E2E_SIM_UDID: ${{ steps.sim.outputs.udid }}
        run: bun scripts/e2e-server.ts

      - name: Shutdown simulator (cleanup)
        if: always()
        run: xcrun simctl shutdown "${{ steps.sim.outputs.udid }}" || true
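The simulator-selection step in the workflow above relies on a `jq` filter over the JSON that `xcrun simctl list devices available --json` prints. As a rough illustration of what that filter selects, here is a minimal TypeScript sketch. The `SimDevice` and `SimctlDeviceList` types and the `pickIphoneUdid` name are hypothetical and not part of this PR; only the `name`, `udid`, and `isAvailable` fields come from the filter itself. Runtime ordering may also differ from `jq`, which walks object keys in sorted order.

```typescript
// Hypothetical types: only the fields the jq filter reads, plus `state`.
interface SimDevice {
  name: string;
  udid: string;
  isAvailable: boolean;
  state: string;
}

interface SimctlDeviceList {
  // Keyed by runtime identifier; each runtime lists its devices.
  devices: Record<string, SimDevice[]>;
}

// Returns the first available device whose name starts with "iPhone",
// or undefined when no such device exists (the workflow exits 1 in that case).
function pickIphoneUdid(list: SimctlDeviceList): string | undefined {
  for (const runtimeDevices of Object.values(list.devices)) {
    for (const device of runtimeDevices) {
      if (device.name.startsWith("iPhone") && device.isAvailable === true) {
        return device.udid;
      }
    }
  }
  return undefined;
}
```

A script such as `scripts/e2e-server.ts` could use a helper like this as a fallback when `E2E_SIM_UDID` is unset, though whether it does so is not shown in this diff.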
96 changes: 89 additions & 7 deletions bun.lock

Large diffs are not rendered by default.

373 changes: 373 additions & 0 deletions docs/device-use-v2-design.md

Large diffs are not rendered by default.

515 changes: 515 additions & 0 deletions docs/video-use-and-aap-v1.1-design.md

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions package.json
@@ -101,12 +101,12 @@
"@xterm/addon-web-links": "^0.12.0",
"@xterm/xterm": "^5.5.0",
"agent-browser": "^0.21.4",
"device-use": "workspace:*",
"better-sqlite3": "^12.4.1",
"chokidar": "^4.0.0",
"class-variance-authority": "^0.7.1",
"clsx": "^2.1.1",
"cmdk": "^1.1.1",
"device-use": "workspace:*",
"electron-updater": "^6.0.0",
"framer-motion": "^12.23.24",
"hono": "^4.11.7",
@@ -132,7 +132,7 @@
"ts-pattern": "^5.9.0",
"ws": "^8.19.0",
"zod": "^4.3.6",
"zustand": "^5.0.8"
"zustand": "^5.0.12"
},
"devDependencies": {
"@electron-toolkit/tsconfig": "^2.0.0",
1 change: 1 addition & 0 deletions packages/device-use/.gitignore
@@ -2,6 +2,7 @@ node_modules/
dist/
bin/
.context/
.device-use/
native/.build/
native/.swiftpm/
test-apps/*/build/
106 changes: 66 additions & 40 deletions packages/device-use/AGENTS.md
@@ -2,61 +2,87 @@

Notes for Claude / agents working on this codebase.

## What this package is

**device-use v2** — a standalone iOS Simulator workbench. Ships:

- A Bun server (`src/server/`) hosting a React viewer at `/`, an MCP HTTP endpoint at `/mcp`, a WebSocket event bus at `/ws`, a REST API under `/api/`, and an MJPEG passthrough at `/stream.mjpeg`.
- A React SPA (`src/frontend/`) — phone frame + sim picker + project/scheme + ▶ Run + inspector + logs drawer.
- A stateless CLI (`src/cli/`) — per-command, imports engine directly, works without the server.
- A Swift engine (`native/simbridge`) — HID + accessibility + MJPEG via private CoreSimulator frameworks.

**CLI and server are peers**: both import `src/engine/` directly. Neither depends on the other.

## Layout

- `native/` — Swift package producing the `simbridge` binary. Uses private
CoreSimulator/AccessibilityPlatform frameworks. **Don't rewrite** without
very good reason; it's ~3k LOC of Swift doing work that can't be done
from JS.
- `src/engine/` — Pure primitives. No CLI/SDK imports. Tests here should be
fast and have no external dependencies.
- `src/cli/` — Hand-rolled arg parser + command registry. Each command
lives in `src/cli/commands/<name>.ts` with a zod schema and a handler.
- `src/sdk/` — Fluent `session()` builder. Used for programmatic automation.
- `skills/device-use/SKILL.md` — Gets copied to `~/.claude/skills/` by
`device-use install`.
```
packages/device-use/
├── native/            Swift simbridge — don't rewrite without good reason
├── src/
│   ├── engine/        Pure primitives (tests injectable executors/spawners)
│   ├── cli/           Stateless CLI: commands + registry + args parser
│   ├── server/        Bun.serve + Hono: tools, state, stream, mcp, ws
│   └── frontend/      Vite + React SPA served at /
├── scripts/           Build + compile + smoke tests
├── test/              Unit + integration tests (outside src/)
├── agentic-app.json   AAP manifest — consumed by host IDEs
└── skills/            Claude skill gets copied via `device-use install`
```

## Adding a tool

## Adding a command
Tools are the one surface the agent sees. Every tool is defined once in `src/server/tools.ts` and routed through `invokeTool` (`src/server/invoker.ts`) which emits `tool-event` frames to anyone listening on `/ws`.

1. Create `src/cli/commands/foo.ts` exporting `fooCommand: CommandDefinition<Params>`.
2. Register it in `src/cli/index.ts`.
3. Add examples to the help text + to `skills/device-use/SKILL.md`.
1. Define a tool in `src/server/tools.ts` using the `tool({ name, description, schema, handler })` factory. Schema is Zod; handler takes `(ctx: Context, params: z.infer<schema>)`.
2. Append it to the `TOOLS` array.
3. Add an integration test in `test/server.test.ts` — exercise via `invokeTool(ctx, "your_tool", params)` against a mock `CommandExecutor`.

## Build pipeline
The same tool is automatically visible via REST (`POST /api/tools/<name>`), MCP (`tools/call`), and the WS invoke frame. No separate wiring.
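To make the "define once, route through the invoker" idea concrete, here is a minimal self-contained sketch. It is not the code in `src/server/tools.ts` or `src/server/invoker.ts`: the `ToolDef` type, the `createInvoker` function, the `echoTool` example, and the `call-N` id format are all hypothetical, and a hand-written `parse` function stands in for the Zod schema so the block has no dependencies.

```typescript
// Hypothetical, simplified stand-ins for the tool factory and the invoker.
type ToolStatus = "started" | "completed" | "failed";

interface ToolEvent {
  type: "tool-event";
  id: string;
  at: number; // assumed to be a millisecond timestamp
  tool: string;
  params: unknown;
  status: ToolStatus;
  result?: unknown;
  error?: string;
}

interface ToolDef<P> {
  name: string;
  description: string;
  parse: (raw: unknown) => P; // stands in for a Zod schema's parse()
  handler: (params: P) => Promise<unknown>;
}

type AnyTool = ToolDef<any>;

// Builds a single entry point that validates params, runs the handler,
// and emits lifecycle events around the call.
function createInvoker(tools: AnyTool[], emit: (event: ToolEvent) => void) {
  const byName = new Map(tools.map((t) => [t.name, t] as const));
  let counter = 0;

  return async function invoke(name: string, raw: unknown): Promise<unknown> {
    const tool = byName.get(name);
    if (!tool) throw new Error(`unknown tool: ${name}`);

    const id = `call-${++counter}`;
    const base = { type: "tool-event" as const, id, tool: name, params: raw };
    emit({ ...base, at: Date.now(), status: "started" });
    try {
      const result = await tool.handler(tool.parse(raw));
      emit({ ...base, at: Date.now(), status: "completed", result });
      return result;
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      emit({ ...base, at: Date.now(), status: "failed", error });
      throw err;
    }
  };
}

// Example tool, purely illustrative.
const echoTool: ToolDef<{ text: string }> = {
  name: "echo",
  description: "Upper-cases the given text",
  parse: (raw) => {
    const candidate = raw as { text?: unknown } | null;
    if (!candidate || typeof candidate.text !== "string") {
      throw new Error("text must be a string");
    }
    return { text: candidate.text };
  },
  handler: async (params) => params.text.toUpperCase(),
};
```

Because REST, MCP, and WS handlers would all call the same `invoke` function in a design like this, every transport gets identical validation and identical event frames without extra wiring.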

- `bun run build:native` → Swift binary at `native/.build/release/simbridge`
- `bun run build:ts` → ESM bundles in `dist/` (CLI, SDK, engine)
- `bun run compile` → Single compiled executable at `bin/device-use`,
with `simbridge` copied to `bin/simbridge`
## Adding a CLI command

## simbridge path resolution
CLI is independent — doesn't route through the server.

The CLI finds `simbridge` in this order:
1. Create `src/cli/commands/foo.ts` exporting `fooCommand: CommandDefinition<Params>`.
2. Import + register in `src/cli/index.ts`.
3. Document in `skills/device-use/SKILL.md` (copied to `~/.claude/skills/` by `device-use install`).

1. `$DEVICE_USE_SIMBRIDGE` env override
2. Sibling of `process.execPath` (compiled-binary case)
3. Relative to `import.meta.url` (dev / bundled case)
## Event shape (`/ws`)

If you change packaging, make sure `findBridgePath()` in
`src/engine/simbridge.ts` still locates the binary.
Every tool invocation emits:

## JSON contract
```ts
{ type: "tool-event", id, at, tool, params, status: "started"|"completed"|"failed", result?, error? }
```

All commands return:
Long-running tools (`build`, `stream_logs`) additionally emit:

```json
{ "success": true, "command": "...", "data": ..., "message": "...", "nextSteps": [], "warnings": [] }
```ts
{ type: "tool-log", id, stream: "stdout"|"stderr", text }
```

Commands auto-switch to JSON when stdout is not a TTY. Agents should pipe
stderr to `/dev/null` — `simbridge` prints diagnostics there.
The `id` correlates across lifecycle events. We reuse this exact shape for MCP tool calls — no parallel schema.
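A client listening on `/ws` can use the shared `id` to fold lifecycle events and log lines into one record per call. The sketch below shows one way to do that. The field names come from the shapes above, but the concrete types (`id` as a string, `at` as a number) and the `summarize` helper are assumptions made for illustration.

```typescript
// Frame shapes as documented above; field types are assumed.
type ToolEvent = {
  type: "tool-event";
  id: string;
  at: number;
  tool: string;
  params: unknown;
  status: "started" | "completed" | "failed";
  result?: unknown;
  error?: string;
};

type ToolLog = {
  type: "tool-log";
  id: string;
  stream: "stdout" | "stderr";
  text: string;
};

type Frame = ToolEvent | ToolLog;

// Hypothetical per-call record built on the client side.
interface CallSummary {
  tool?: string;
  status?: ToolEvent["status"];
  stdout: string;
  stderr: string;
}

// Groups frames by id: the latest tool-event wins for status,
// and tool-log text is appended to the matching stream.
function summarize(frames: Frame[]): Map<string, CallSummary> {
  const calls = new Map<string, CallSummary>();
  for (const frame of frames) {
    let call = calls.get(frame.id);
    if (!call) {
      call = { stdout: "", stderr: "" };
      calls.set(frame.id, call);
    }
    if (frame.type === "tool-event") {
      call.tool = frame.tool;
      call.status = frame.status;
    } else {
      call[frame.stream] += frame.text;
    }
  }
  return calls;
}
```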

## Testing against the real simulator
## Build pipeline

Boot any iOS simulator and run `./bin/device-use doctor` to confirm
everything is wired up. Then:
- `bun run build:native` → `native/.build/release/simbridge`
- `bun run build:ts` → `dist/cli.js` + `dist/engine.js`
- `bun run build:frontend` → `dist/frontend/` (static SPA)
- `bun run build` → all of the above
- `bun run compile` → single compiled `bin/device-use` + copied `bin/simbridge`

```
./bin/device-use snapshot -i 2>/dev/null | jq .
./bin/device-use tap @e1
```
## simbridge path resolution

`src/engine/simbridge.ts → findBridgePath()` looks up in order:

1. `$DEVICE_USE_SIMBRIDGE` override
2. Sibling of `process.execPath` (compiled binary case)
3. Relative to `import.meta.url` (source / bundled case)
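The lookup order can be sketched as a pure function over injected inputs. This is not the body of `findBridgePath()`: the `resolveBridgePath` name, the `ResolveInputs` type, the choice to check existence for the override, and the relative path used for the third candidate are all assumptions for illustration.

```typescript
// Hypothetical inputs, injected so the logic can be tested without a filesystem.
interface ResolveInputs {
  envOverride?: string; // value of $DEVICE_USE_SIMBRIDGE, if set
  execPath: string; // process.execPath
  moduleDir: string; // directory derived from import.meta.url
  exists: (path: string) => boolean; // file-existence check
}

// Minimal POSIX-style dirname so the sketch needs no imports.
function dirname(path: string): string {
  const index = path.lastIndexOf("/");
  if (index === -1) return ".";
  if (index === 0) return "/";
  return path.slice(0, index);
}

// Tries each candidate in the documented order and returns the first that exists.
function resolveBridgePath(inputs: ResolveInputs): string | undefined {
  const candidates: Array<string | undefined> = [
    inputs.envOverride,
    `${dirname(inputs.execPath)}/simbridge`,
    // Assumed location relative to the module; the real offset is not shown in this diff.
    `${inputs.moduleDir}/../../native/.build/release/simbridge`,
  ];
  for (const candidate of candidates) {
    if (candidate && inputs.exists(candidate)) return candidate;
  }
  return undefined;
}
```

Injecting `exists` mirrors the "injectable executors" approach the engine uses elsewhere, which keeps tests fast and independent of the machine they run on.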

## Hard rules

- **Never bypass `invokeTool`.** If you're adding a code path that touches the engine from MCP/REST/WS, it goes through the invoker so events fire.
- **Never persist state outside `state.json`.** If you need a new persisted field, add it to `PersistedState` in `src/server/state.ts`.
- **`src/frontend/` must only talk HTTP/WS.** No engine imports from the browser.
- **Tests live in `test/`** — never colocate inside `src/`.
- **Stateless CLI stays stateless.** Per-command, no long-lived subprocess (except `serve`, which IS the long-lived subprocess and spawns the server).
133 changes: 80 additions & 53 deletions packages/device-use/README.md
@@ -1,82 +1,109 @@
# device-use

iOS Simulator automation for AI agents — CLI, SDK, and engine.
Standalone iOS Simulator workbench for humans and agents.

Ported from [`expo/agent-simulator`](https://github.com/expo/agent-simulator), refactored
for [Bun](https://bun.sh) with a single-file compiled binary.
- **Viewer** — open `localhost:3100` to see a live phone screen, boot sims, build + run your Xcode project, inspect the a11y tree.
- **MCP server** — `/mcp` exposes 23 tools (build, install, launch, tap, type, snapshot, …). Any MCP-speaking client (Claude Code, Claude Desktop, Cursor, …) can drive the simulator.
- **CLI** — `device-use list`, `device-use tap @e1`, `device-use serve`, etc. Works standalone, no server required.

## Install (from source)
Under the hood: a Bun server hosting a React viewer, an HTTP MCP transport, a WebSocket event bus, and a Swift `simbridge` binary that talks to private CoreSimulator + AccessibilityPlatform frameworks.

## Install from source

```bash
bun install
bun run build:native # Builds simbridge Swift binary (requires Xcode)
bun run compile # Produces ./bin/device-use + ./bin/simbridge
./bin/device-use install # Installs the Claude skill
bun run build:native # Builds the Swift simbridge binary (requires Xcode)
```

## Quick start
## Run the server

```bash
bun run dev # Hono server on 3100 (proxies to Vite for HMR)
bun run dev:frontend # Vite dev server on 5173 (second terminal)
# or, from the CLI:
bunx device-use serve --port 3100 --open
```

Open [http://localhost:3100](http://localhost:3100). Pick a simulator, paste a `.xcodeproj`/`.xcworkspace` path, click **▶ Run**.

For production: `bun run build && bun run start`.

## MCP endpoint

Point any MCP client at `http://localhost:3100/mcp`. Example — Claude Desktop's `.claude/mcp.json`:

```json
{
"mcpServers": {
"device-use": {
"type": "http",
"url": "http://localhost:3100/mcp"
}
}
}
```

Tools available: `list_devices`, `boot`, `set_active_simulator`, `set_active_project`, `get_project_info`, `build`, `install`, `launch_app`, `terminate_app`, `list_apps`, `app_state`, `snapshot`, `tap`, `type_text`, `swipe`, `press_button`, `screenshot`, `wait_for`, `open_url`, `grant_permission`, `stream_logs`, `stop_logs`, `get_state`.
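For clients that talk to `/mcp` directly, a tool call is a JSON-RPC 2.0 request with the method `tools/call`, as defined by the Model Context Protocol. The sketch below only builds the request body. The `toolsCallRequest` helper is hypothetical, and it uses `list_devices` with empty arguments because the argument schemas of the other tools are not shown in this diff. A real client would typically perform the MCP `initialize` handshake before sending this.

```typescript
// JSON-RPC 2.0 request envelope as used by MCP.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

// Builds the body for an MCP tools/call request.
function toolsCallRequest(
  id: number,
  name: string,
  args: Record<string, unknown>,
): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Example: ask the server for its simulator list.
const listDevicesBody = JSON.stringify(toolsCallRequest(1, "list_devices", {}));
```

The resulting string would be sent as the body of a `POST` to `http://localhost:3100/mcp` with a JSON content type.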

## CLI

```bash
device-use list # List simulators
device-use boot "iPhone 17 Pro" # Boot by name
device-use snapshot -i # Dump interactive UI with @refs
device-use snapshot -i # Accessibility tree with @refs
device-use tap @e1 # Tap by ref
device-use type "hello@example.com" # Type into focused field
device-use screenshot result.png # Capture screen
device-use serve --open # Start server + open viewer
device-use doctor # Verify environment
```

Full command list: `device-use help`.

## Architecture

- **`native/`** — Swift package (`simbridge`) that talks to private CoreSimulator
and AccessibilityPlatform frameworks. Handles HID injection, accessibility
queries, and MJPEG streaming.
- **`src/engine/`** — TypeScript primitives wrapping `xcrun simctl` and
`simbridge` IPC. No CLI or SDK imports.
- **`src/cli/`** — Hand-rolled CLI with flat commands and JSON-when-piped.
- **`src/sdk/`** — Fluent `session()` builder for programmatic automation.
- **`skills/device-use/SKILL.md`** — Claude Code skill definition.

## Commands

| Command | Purpose |
| ---------------------------- | --------------------------------------------------------------------------------- |
| `list` | List available simulators |
| `boot` / `shutdown` / `open` | Simulator lifecycle |
| `snapshot` | Accessibility tree with `@refs` (`-i` for interactive only, `--diff` for changes) |
| `screenshot` | PNG/JPEG capture, optionally base64 |
| `tap` | By `@ref`, `--id`, `--label`, or `-x -y` |
| `type` | Into focused field, optional `--submit` |
| `wait-for` | Poll until element appears/disappears |
| `stream` | MJPEG screen server (`enable`/`disable`/`status`) |
| `open-url` | Deep link / URL |
| `session` | Manage default simulator + ref state |
| `doctor` | Environment check |
| `install` | Verify setup + install Claude skill |

## SDK

```ts
import { session } from "device-use";

await session("iPhone 17 Pro").app("Maps").snapshot().tapOn("@e1").inputText("Coffee").run();
```text
┌──────────────────────────────────────────────────────────────────┐
│  packages/device-use/                                            │
│                                                                  │
│  native/       — Swift simbridge binary (unchanged)              │
│                                                                  │
│  src/engine/   — TS primitives: simctl + simbridge IPC +         │
│                  xcodebuild + logs. Pure, injectable executors.  │
│        ▲                                                         │
│        │ imported by                                             │
│        │                                                         │
│    ┌───┴────────────┐         ┌────────────────────────────┐     │
│    │ src/cli/       │         │ src/server/                │     │
│    │ per-command,   │         │ long-lived Bun.serve       │     │
│    │ stateless      │         │ /  /mcp  /ws  /health      │     │
│    └────────────────┘         │ /stream.mjpeg  /api/*      │     │
│                               └──────────┬─────────────────┘     │
│                                          │ serves                │
│                                          ▼                       │
│                               src/frontend/ (Vite + React)       │
│                               TopBar, DeviceFrame, Sidebar,      │
│                               LogsDrawer — WS client of server   │
└──────────────────────────────────────────────────────────────────┘
```

## Distribution
CLI and server share the engine but are independent peers — neither needs the other to work.

The compiled `./bin/device-use` is a single ~58 MB Bun executable. Ship
it alongside `./bin/simbridge` (~1.6 MB) — the CLI looks for `simbridge` as a
sibling of its own binary (or via `$DEVICE_USE_SIMBRIDGE` override).
## Agentic Apps Protocol

## Requirements
device-use is the reference implementation of an AAP app. The `agentic-app.json` at the package root declares how a host IDE (Deus, any MCP-speaking IDE) should launch and embed it.

- macOS 14+ with Xcode installed
- Bun 1.1+ (dev only — compiled binary has no runtime dep)

## Testing
## Develop

```bash
bun test # Unit tests
bun run typecheck # TS check
./bin/device-use doctor # End-to-end env check
bun test # 86 unit + integration tests, no real sim needed
bun run typecheck # tsc --noEmit
bun run build # simbridge + ts bundles + frontend bundle
bun run compile # single-file bin/device-use + bin/simbridge
```

Tests live in `test/` — never colocated with `src/`. `packages/device-use/scripts/ws-smoke.ts` is a manual WS sanity check (server must be running).

## License

MIT