feat(fork): autonomous TODO agent with live worker visibility and intervention#181
Introduces the main-process scaffolding for a new fork-local "TODO" feature
that drives Claude Code autonomously toward a user-defined goal until a
decisive verify command passes. This commit establishes the backend
surface — schema, supervisor, and tRPC router — without any renderer work
or existing-UI integration, so it can be iterated on and reviewed in
isolation.
Why this shape
--------------
- The supervisor is pure TypeScript in the main process, not a second
Claude Code. All creativity stays in one worker; "management" is
deterministic code. This avoids LLM-to-LLM communication, which the
research survey flagged as the biggest reliability sink for long-horizon
autonomous loops.
- The worker runs as interactive Claude Code inside a real PTY pane (same
infra the existing Run button uses), so users can watch it live and type
into it to intervene. Completion per turn is detected by idle timing on
the PTY data stream; decisive success is the exit code of the user's
verify command (e.g. `bun test`). LLM self-report is never trusted.
- Fork-conflict surface is kept to three 1-line edits in existing files
(trpc routers index, local-db schema.ts re-export, local-db schema
barrel). Everything else lives in new files under new directories.
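The per-turn idle detection and the exit-code verdict described above can be sketched roughly as follows. This is a minimal illustration, not the supervisor's actual code: `IdleTracker`, `verdictFromExitCode`, and the 5-second threshold used in the usage note are hypothetical names and values chosen for the example.

```typescript
// Hypothetical sketch: tracks when the PTY last produced data and reports
// "idle" once nothing has arrived for a configurable window.
class IdleTracker {
  private readonly idleMs: number;
  private lastDataAt: number;

  constructor(idleMs: number, startedAt: number) {
    this.idleMs = idleMs;
    this.lastDataAt = startedAt;
  }

  /** Call whenever the PTY data stream emits a chunk. */
  noteData(now: number): void {
    this.lastDataAt = now;
  }

  /** True once no data has arrived for at least `idleMs`. */
  isIdle(now: number): boolean {
    return now - this.lastDataAt >= this.idleMs;
  }
}

/** Only the verify command's exit code decides success; worker output is ignored. */
function verdictFromExitCode(exitCode: number | null): "pass" | "fail" {
  return exitCode === 0 ? "pass" : "fail";
}
```

With an assumed 5000 ms window, a chunk at t=1000 keeps the tracker busy until t=6000. A `null` exit code (for example, a verify process killed by a signal) is treated as a failure in this sketch.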
What lands here
---------------
- apps/desktop/plans/todo-agent-plan.md — full design doc covering goals,
non-goals, architecture, execution loop, intervention UX, UI surface,
fork-conflict strategy, data model, tRPC surface, phased delivery, and
unresolved questions.
- packages/local-db/src/schema/todo-sessions.ts — new `todo_sessions`
SQLite table (workspace-scoped, status machine, budget, verdict fields,
artifact path). Re-exported from schema.ts so drizzle-kit picks it up
without changing the drizzle.config.ts entry.
- apps/desktop/src/main/todo-agent/
  - types.ts: zod input schemas + shared constants.
  - session-store.ts: localDb-backed CRUD + EventEmitter fan-out, plus a
    worktree-path resolver for main-process callers.
  - supervisor.ts: singleton loop driver. Prepares artifacts
    (`.superset/todo/<id>/goal.md`), writes the iteration prompt into the
    worker PTY via the workspace terminal runtime, waits for idle, runs
    the verify command as a detached child process, applies futility (3x
    same failing test) and budget (iteration count, wall-clock) guards,
    and settles the session to done/failed/escalated/aborted. Also
    exposes abort() (sends double Ctrl-C to the pane) and sendInput()
    passthroughs.
  - trpc-router.ts: `todoAgent.*` router with create / list / get /
    attachPane / abort / sendInput, plus an observable-based
    subscribeState subscription (per the trpc-electron constraint
    documented in apps/desktop/AGENTS.md).
  - index.ts: barrel.
- apps/desktop/src/lib/trpc/routers/index.ts — register the new router
as `todoAgent` on the app router (import + one field, clearly fork-
marked).
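As a rough illustration of the futility and budget guards that supervisor.ts applies between iterations, the sketch below models them as pure functions. All names (`LoopState`, `Budget`, `recordFailure`, `checkGuards`) are hypothetical, and mapping an exhausted budget to `failed` is an assumption; the commit message only states which terminal statuses exist.

```typescript
// Hypothetical sketch of the per-iteration guard checks.
type GuardDecision = "continue" | "escalated" | "failed";

interface LoopState {
  iteration: number;
  startedAtMs: number;
  lastFailingId: string | null;
  sameFailureStreak: number;
}

interface Budget {
  maxIterations: number;
  maxWallClockMs: number;
}

/** Record one failed verify run and update the consecutive-failure streak. */
function recordFailure(state: LoopState, failingId: string | null): LoopState {
  const same = failingId !== null && failingId === state.lastFailingId;
  return {
    ...state,
    iteration: state.iteration + 1,
    lastFailingId: failingId,
    // An unidentifiable failure (null id) resets the streak.
    sameFailureStreak: failingId === null ? 0 : same ? state.sameFailureStreak + 1 : 1,
  };
}

/** Decide whether the loop may continue after a failed verify run. */
function checkGuards(state: LoopState, budget: Budget, nowMs: number): GuardDecision {
  if (state.sameFailureStreak >= 3) return "escalated"; // futility: 3x same failing test
  if (state.iteration >= budget.maxIterations) return "failed"; // assumption: budget -> failed
  if (nowMs - state.startedAtMs >= budget.maxWallClockMs) return "failed";
  return "continue";
}
```

Keeping the guards free of I/O like this would make them easy to unit-test independently of the PTY and the verify child process.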
Not yet in this commit
----------------------
- Renderer UI (TodoButton, TodoModal, TodoPanel) and the PresetsBar
integration point next to WorkspaceRunButton.
- Drizzle migration file. Per repo policy, migrations are generated by
running `bunx drizzle-kit generate` locally and never hand-written;
this will be generated when the feature is wired end-to-end.
- Stop-hook integration via `--settings`. v1 uses idle-detection to
stay decoupled from Claude Code CLI internals. Tracked as an
Unresolved item in the plan doc for v2.
Verified
--------
- `bun run typecheck` in apps/desktop — clean.
- `tsc --noEmit` in packages/local-db — clean.
Refs: apps/desktop/plans/todo-agent-plan.md
Adds the first user-facing surface of the autonomous TODO agent: a
compact TODO button placed immediately left of WorkspaceRunButton in
PresetsBar, plus the creation modal that collects the task details the
supervisor needs to start a run.
Scope of this commit
--------------------
Deliberately limited to session *creation*. Clicking the button opens a
modal, the user fills in the form, and submit creates a `todo_sessions`
row via `todoAgent.create`. The supervisor does not start executing yet;
pane attach + execution handoff lands in a follow-up commit along with
TodoPanel. This keeps each commit independently reviewable and
rollback-safe.
TodoButton (TodoButton/TodoButton.tsx)
--------------------------------------
- Small ghost-variant button with a list icon and "TODO" label, styled
  to sit naturally next to WorkspaceRunButton without visually competing
  with it.
- Polls `todoAgent.list` every 3s for the current workspace and shows a
  badge with the count of queued/preparing/running/verifying sessions so
  users can see at a glance that work is in flight.
- Opens the modal as local state; no global store needed.
TodoModal (TodoModal/TodoModal.tsx)
-----------------------------------
Form fields, each mapped 1:1 to the zod schema in
`main/todo-agent/types.ts`:
- Title (max 200)
- What should be done? (multiline, max 10k)
- Clear goal / acceptance criteria (multiline, required; this is the
  single most important input for making the loop terminate)
- Verify command (default `bun test`, exit code is the ground truth)
- Max iterations (default 10, capped at 100)
- Wall-clock minutes (default 30, capped at 240)
Submit calls `electronTrpc.todoAgent.create.useMutation` and invalidates
`todoAgent.list` so the button badge updates immediately. Success and
failure are surfaced via the existing sonner toast. Cancel and close
both reset the form.
Rendering changes
-----------------
- `PresetsBar.tsx` now imports TodoButton and renders it inside the
  existing `ml-auto flex items-center gap-1 shrink-0` wrapper,
  immediately before WorkspaceRunButton. The wrapper already handles
  spacing so no layout tweaks are needed.
- Both the TodoButton import line and the render line are isolated
  additions to keep upstream merge conflicts cheap.
Co-location
-----------
Component code follows the repo's folder-per-component convention under
`src/renderer/features/todo-agent/` so all fork-local feature code stays
in one directory and is easy to delete or rebase.
Verified
--------
- `bun run typecheck` in apps/desktop — clean.
Closes the v1 control loop for the autonomous TODO agent: users can now
start a queued session, watch it run in a normal workspace terminal
tab, abort it, and type interventions directly into the running worker.
TodoButton dropdown
-------------------
The primary click still opens the creation modal (fast path for the
common action), but a chevron next to the button now opens a
DropdownMenu with "New TODO…" and "Open panel" so users can reach the
sessions drawer without having to create a new task first. The button
group is rendered as a single fused control (rounded-r-none +
rounded-l-none) so it reads as one widget next to WorkspaceRunButton.
TodoPanel (TodoPanel/TodoPanel.tsx)
-----------------------------------
Right-side Sheet, 540px wide, 2-column layout:
- Left: scrollable list of sessions for the current workspace, polled
every 2s while the panel is open. Selection is local state.
- Right: detail view for the selected session — status, title,
description, goal, verify command, iteration/budget snapshot, last
verdict reason (as a max-h-40 scrollable pre block so long failure
logs don't blow up the layout).
Controls in the detail view:
- **Start** (visible only when status === "queued")
The handoff to the supervisor is done client-side in four steps so
it composes cleanly with existing workspace terminal infra instead
of adding new tab-creation primitives in the main process:
1. `useTabsStore.getState().addTab(workspaceId)` creates a new
terminal tab + pane in the Zustand store. The tab shows up in
the workspace tab bar like any other terminal, so anyone can
click over to watch the worker live.
2. `setTabAutoTitle(tabId, "TODO: …")` labels the tab so it is
easy to spot.
3. `launchCommandInPane` (same helper the existing agent launcher
uses) runs interactive `claude <prompt>` in the new pane,
passing the session-specific initial prompt that points at
`.superset/todo/<id>/goal.md` (written by the supervisor at
creation time).
4. `todoAgent.attachPane({ sessionId, tabId, paneId })` hands the
session over to the supervisor, which takes it from `queued`
to `running` and begins the idle-detect/verify loop.
- **Abort** (visible when active and already attached): calls
`todoAgent.abort` which double-Ctrl-C's the pane and marks the
session aborted.
- **Intervene input**: a small Input + Send button that writes text
directly into the worker PTY via `todoAgent.sendInput`. Enter
submits, shift+Enter does nothing (no multi-line for v1). This is
the explicit "you can intervene while it runs" surface the plan
doc promised; users can also just click over to the terminal tab
and type there, since it is a real PTY.
A small footer reminds users that the worker runs in a normal
workspace terminal tab and can be opened from the tab bar directly —
no special terminal embed is needed inside the panel itself for v1,
which avoids bringing in the heavy TerminalPane + registry UI.
Scope deliberately out of this commit
-------------------------------------
- No auto-start on create. The user must explicitly click Start from
the panel. This makes the handoff observable and keeps the modal
commit rollback-safe.
- No live PTY embed inside the panel. v1 relies on the workspace's
own tab bar for that. Can be added later if users want an
in-panel viewer.
- No queue UI. The supervisor already queues internally if a second
Start is pressed while another session runs, but there is no
renderer affordance to reorder yet.
Integration note
----------------
`addTab` is the underlying Zustand method; `addTerminalTab` only
exists on the agent-session-orchestrator adapter layer as a thin
wrapper. Calling `addTab` directly keeps this feature from depending
on the full AgentLaunchTabsAdapter plumbing.
Verified
--------
- `bun run typecheck` in apps/desktop — clean.
Adds migration 0049, auto-generated by `bunx drizzle-kit generate` in
packages/local-db against the new `todo_sessions` table that landed in
the backend-scaffold commit. Creates the table with all 22 columns, the
three indexes defined in the schema file (workspace / status /
created_at), and the two foreign keys (workspace_id → workspaces.id ON
DELETE CASCADE, project_id → projects.id ON DELETE SET NULL).
Per repo policy, migration SQL and snapshot files are never edited by
hand; they are regenerated from the schema source. The journal update is
part of the same generate run.
Required follow-up: the migration runs automatically on the next desktop
app start (local-db migrations apply on boot). No manual action needed
beyond relaunching the app.
guessFailingTest (apps/desktop/src/main/todo-agent/supervisor.ts)
-----------------------------------------------------------------
The previous heuristic was a single regex matching `FAIL|✗|×` and
would both miss common runners and return run-specific strings that
broke the "same failure 3 times in a row → escalate" check — a timing
suffix like "(12 ms)" changing between runs was enough to reset the
consecutive-failure counter and make the futility guard toothless.
The replacement:
- Strips ANSI escapes before matching, so colored runner output is
handled.
- Tries a prioritized list of line patterns covering bun test, vitest
(tree view + summary + inline), jest (FAIL + ✕), generic ✗, TAP /
node:test ("not ok 1 - …"), and playwright. Priority order matters
because some runners emit several matches per failure and we want
the most specific one first.
- Falls back to the first line containing "Error:" or "Assertion:" so
shell verify commands that are not test runners (build scripts,
type-checkers) still produce a stable identifier.
- Normalizes the returned id through `normalizeTestId`, which:
* drops "(NNN ms)" and "[NNN ms]" timing suffixes,
* collapses object hex addresses ("Foo@0x7f8b…") to "Foo@0x?",
* truncates wording-variant ": expected X to be Y" tails,
* caps length at 240 chars.
This is the part that actually makes the futility guard work: the
same logical failure now produces the same id across reruns even
if the runner prints slightly different noise.
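A minimal sketch of the normalization steps listed above, to make the "same logical failure, same id" property concrete. The regular expressions are illustrative guesses written for this example and are not copied from supervisor.ts; only the four documented behaviors (timing suffixes, hex addresses, ": expected …" tails, 240-char cap) are taken from the description.

```typescript
// Matches common ANSI SGR/CSI escape sequences such as "\u001b[31m".
const ANSI_PATTERN = /\u001b\[[0-9;]*[A-Za-z]/g;

function stripAnsi(text: string): string {
  return text.replace(ANSI_PATTERN, "");
}

/** Illustrative normalizer: reduces run-specific noise in a failing-test line. */
function normalizeTestId(raw: string): string {
  return stripAnsi(raw)
    // Drop "(12 ms)" and "[12 ms]" timing suffixes.
    .replace(/\s*[(\[]\s*\d+(?:\.\d+)?\s*ms\s*[)\]]/g, "")
    // Collapse object hex addresses: "Foo@0x7f8b12" -> "Foo@0x?".
    .replace(/@0x[0-9a-fA-F]+/g, "@0x?")
    // Truncate wording-variant ": expected X to be Y" tails.
    .replace(/:\s*expected\b.*$/i, "")
    .trim()
    // Cap the identifier length.
    .slice(0, 240);
}
```

Two reruns that differ only in timing, such as `✗ adds numbers (12 ms)` and `✗ adds numbers (340 ms)`, normalize to the same string, so the consecutive-failure counter is no longer reset by runner noise.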
Plan doc paths (apps/desktop/plans/todo-agent-plan.md)
------------------------------------------------------
Three references still pointed at the Postgres `packages/db` schema
from the original design sketch. The feature actually lives in
`packages/local-db` (SQLite, the desktop app's local store). Updated
both the "files touched" checklist and the inline data-model code
block so future readers of the plan don't hunt in the wrong package.
Verified
--------
- `bun run typecheck` in apps/desktop — clean.
Walkthrough
Adds a new autonomous TODO agent feature to the desktop app. It consists of the tRPC router mount; the main-process supervisor, session-store, and git-operation modules; UI components (button, modal, manager, sidebar); and the database schema definition.
Sequence Diagram(s)
sequenceDiagram
participant User as User
participant Renderer as Renderer
participant Main as Main process
participant DB as Local DB
participant Claude as Claude CLI
participant Verify as Verify script
User->>Renderer: Open TODO button / modal
Renderer->>Renderer: Submit session creation form
Renderer->>Main: todoAgent.create (tRPC)
Main->>DB: INSERT todo_sessions (queued)
Main->>Main: Create artifact directory
Renderer->>Renderer: Show success toast
User->>Renderer: Click session Start button
Renderer->>Main: todoAgent.start (tRPC)
Main->>DB: UPDATE status="preparing"
Main->>Main: Capture Git HEAD SHA
Main->>Main: Initialize stream events
Main->>DB: UPDATE status="running"
Main->>Main: Start supervisor loop (async)
Main->>Claude: spawn("claude", headless)
Claude->>Claude: Run Claude turn
Claude-->>Main: NDJSON stream output
Main->>Main: Parse stream (text, tools, results)
Main->>DB: UPDATE append stream events
Main->>DB: UPDATE iteration/cost/turns
Main->>Renderer: Emit subscribeStream event
Renderer->>Renderer: Update stream view
alt Verify command present
Main->>Verify: execute verifyCommand
Verify-->>Main: exit code / logs
Main->>DB: UPDATE status="done" or "failed"
Main->>Main: escalated after 3 consecutive failures
else No verify command
Main->>DB: UPDATE status="done"
end
Main->>DB: Emit subscribeState event
Renderer->>Renderer: Update session details (show verdict and cost)
Estimated code review effort: 4 (Complex), ~45 minutes
When a user created a TODO session against a workspace with
`type="branch"` (no worktree row), the supervisor threw
`todo-agent: workspace <id> has no worktree` at creation time and
refused to run.
The bug was that `resolveWorktreePath` only looked at the `worktrees`
table. Workspaces in this app come in two flavors:
- `type="worktree"`: backed by a real `worktrees.path`
- `type="branch"`: runs directly in the project's `mainRepoPath`, with
  no worktrees row at all
The existing terminal runtime already handles both via
`workspace-terminal-context.ts`, which falls back to
`projects.mainRepoPath` when no worktree row exists. The TODO agent now
follows the same resolution strategy: LEFT JOIN both `projects` and
`worktrees`, return `worktreePath ?? mainRepoPath`. The only
undefined-returning case is now "workspace row itself does not exist".
This unblocks session creation for any branch-type workspace. No schema
or API surface changes.
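The fallback itself is small. A sketch of just the resolution step, assuming the joined query yields a row shaped like the hypothetical `WorkspacePathRow` below (the real column names and the drizzle query are not shown in this commit message):

```typescript
// Hypothetical shape of one row from the LEFT JOIN over workspaces,
// worktrees, and projects. A missing worktree row surfaces as null.
interface WorkspacePathRow {
  worktreePath: string | null;
  mainRepoPath: string | null;
}

/** Prefer the worktree path; fall back to the project's main repo path. */
function resolveWorkspacePath(row: WorkspacePathRow | undefined): string | undefined {
  if (!row) return undefined; // the workspace row itself does not exist
  return row.worktreePath ?? row.mainRepoPath ?? undefined;
}
```

A `type="branch"` workspace reaches the second operand of `??` because its `worktreePath` is null, which is the case the old resolver rejected.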
Two changes that turn the TODO agent from a "code task + test gate"
feature into a general autonomous task runner that covers research and
investigation use cases, and aligns the UI language with the rest of
this fork-local feature.
1. Verify command is now optional
---------------------------------
Motivation: not every TODO has a sensible acceptance command. Research
tasks (e.g. "investigate this set of files and summarize a design
proposal"), code-reading tasks, and one-shot refactors do not have a
`bun test` that can decide "done". Forcing users to invent one made the
feature feel code-centric when it is really about autonomous execution
in general.
Behavior:
- `packages/local-db/src/schema/todo-sessions.ts`: `verify_command` is
now nullable. New migration `0050_todo_verify_optional.sql`
(drizzle-kit generated) applies the NOT NULL drop.
- `apps/desktop/src/main/todo-agent/types.ts`: `verifyCommand` is now
an optional zod string that transforms empty to undefined, so
trimming an empty input reliably reaches the supervisor as
"unset" rather than "empty string".
- `apps/desktop/src/main/todo-agent/supervisor.ts`: new branch at the
top of `runSession` for the "no verify" path — single-turn mode.
It writes the initial prompt once, waits for the worker PTY to go
idle, and marks the session `done` with a verdict message asking
the user to review the output in the worker terminal. No iteration
loop, no futility detection, no budget polling beyond the shared
wall-clock cap on the idle-wait. The user drives any follow-up
turns manually by typing into the same terminal tab.
- The existing iteration loop is preserved verbatim for sessions
that do have a verify command; only the branch above it is new.
- Goal doc and per-iteration prompts composed by the supervisor now
switch wording based on whether a verify command is set
(`renderGoalDoc` and `buildIterationPrompt`), and the panel's
Start handler does the same for the initial claude invocation.
Rationale for handling single-turn as its own branch rather than
capping the existing loop at 1: the loop's structure assumes a
verify-then-maybe-continue flow. Short-circuiting it keeps the
branching explicit and makes the "単発モード" (single-turn mode)
state machine readable at a glance in the supervisor.
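The "empty means unset" rule and the resulting mode switch can be illustrated without any schema library. The helpers below are a dependency-free stand-in for the zod transform described above; `emptyToUndefined` and `sessionMode` are hypothetical names, not functions from types.ts or supervisor.ts.

```typescript
type SessionMode = "single-turn" | "iteration";

/** Trim the input and collapse empty or whitespace-only values to undefined. */
function emptyToUndefined(value: string | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

/** No verify command means single-turn mode; otherwise the iteration loop runs. */
function sessionMode(verifyCommand: string | undefined): SessionMode {
  return emptyToUndefined(verifyCommand) === undefined ? "single-turn" : "iteration";
}
```

Normalizing at the input boundary means the supervisor only ever has to test for `undefined`, never for `""` or whitespace.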
2. UI localized to Japanese
---------------------------
This is a fork-local feature and the rest of the user's workflow is
in Japanese, so there is no reason for the TODO surface to be
English. Translated strings:
- `TodoButton/TodoButton.tsx`: tooltips, dropdown items
- `TodoModal/TodoModal.tsx`: dialog title/description, all form
labels, placeholders, helper text, buttons, toasts, and error
messages. The verify field is explicitly marked "(任意)" and its
helper text explains the empty-equals-single-turn behavior. The
budget fields (max iterations, wall-clock minutes) are now
conditionally rendered only when the verify field has a value,
since they have no meaning in single-turn mode.
- `TodoPanel/TodoPanel.tsx`: sheet header, session list empty
state, detail labels (ステータス/タイトル/やってほしいこと/
ゴール/Verify/予算/直近の結果), button labels (Start remains
in-English as a recognizable verb, 中断/送信 are translated),
intervene input placeholder, footer hint. The Verify field in the
detail view now shows "単発モード(verify なし)" when the session
was created without one, and the budget display adapts too.
- Toast messages (作成しました / 開始しました / 中断しました /
送信に失敗しました, etc.) and the error thrown by trpc `create`
when workspace path resolution fails.
- Supervisor-authored `goal.md` content and in-prompt wording are
also Japanese so the worker Claude speaks the same language as
the user.
Verified
--------
- `bun run typecheck` in apps/desktop — clean.
- Migration generated via `bunx drizzle-kit generate` in
packages/local-db (not hand-edited).
Two related fixes for a failure mode reported after the v1 rollout.
Symptom: on Start, the worker terminal tab showed
claude ".superset/todo/<id>/goal.md …"
⎿ Please run /login · API Error: 401 {authentication_error …}
and yet the TodoPanel flipped the session to `done · iter 1`. Two
distinct bugs were stacked here: (1) the command shape we sent to the
PTY was not the one the rest of the app uses, and (2) the supervisor
treated "the worker went idle" as "the worker finished successfully"
even when the idle was caused by an immediate authentication crash.
1. Use the canonical claude prompt command builder
---------------------------------------------------
`TodoPanel/TodoPanel.tsx`'s Start handler was building the launch
command by hand:
const command = `claude ${JSON.stringify(initialPrompt)}`;
That invocation was subtly wrong in two ways:
- It skipped `--dangerously-skip-permissions`, which is part of the
canonical claude command defined in
`packages/shared/src/builtin-terminal-agents.ts` and is included by
every other agent launch path in the app (Run button, tasks view,
agent preset menu). Bypassing it changes how claude-code boots and
how its interactive auth / tool-use prompts are handled.
- It passed the prompt as a single JSON-quoted positional arg instead
of using the heredoc-cat form produced by `buildPromptCommandString`
for the `argv` transport. The heredoc form is what the terminal
runtime's `~/.superset/bin` shim is designed to see, and it survives
multi-line prompts, quoting, and the wrapper's argument parsing.
Both problems go away by routing through `buildAgentPromptCommand`
from `@superset/shared/agent-command` with `agent: "claude"`, which
is the exact same code path the existing Run / task launches use.
The panel now calls:
buildAgentPromptCommand({
prompt: initialPrompt,
randomId: session.id,
agent: "claude",
})
and writes the resulting string through `launchCommandInPane`.
`session.id` is already a UUID so it is a fine `randomId` for the
delimiter.
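To show why a heredoc form is more robust than `JSON.stringify` quoting, here is a rough sketch of a heredoc-style command builder. It is an assumption-laden illustration only: `sketchHeredocCommand`, the `PROMPT_` delimiter prefix, and the exact command shape are invented for this example and are not the output of `buildAgentPromptCommand` or `buildPromptCommandString`.

```typescript
/**
 * Hypothetical sketch: wraps a prompt in a quoted heredoc so the shell
 * passes it through literally, whatever it contains.
 */
function sketchHeredocCommand(baseCommand: string, prompt: string, randomId: string): string {
  // Derive a delimiter from the id; non-alphanumerics are replaced so it is a plain word.
  const delimiter = `PROMPT_${randomId.replace(/[^A-Za-z0-9]/g, "_")}`;
  if (prompt.split("\n").includes(delimiter)) {
    throw new Error("prompt contains the heredoc delimiter");
  }
  // Quoting the delimiter ('...') disables $VAR and backtick expansion in the body.
  return `${baseCommand} "$(cat <<'${delimiter}'\n${prompt}\n${delimiter}\n)"`;
}
```

By contrast, a JSON-quoted argument is a double-quoted shell string, so `$HOME` or backticks inside the prompt would be expanded by the shell before claude ever sees them.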
2. Detect worker startup errors instead of marking them `done`
---------------------------------------------------------------
`supervisor.ts`'s `waitForIdle` used to return a plain `boolean` for
"did we reach idle?" and the single-turn path settled the session
with `status: "done"` as long as idle was reached. That is the wrong
contract: if the claude process prints an auth error and exits
(or sits at a login prompt), the PTY goes idle too, and the session
was being reported as successfully complete.
Changes:
- `waitForIdle` now accumulates PTY output into a ~16 KB ring buffer
during the wait and returns `{ idled, buffer }` instead of just
`idled`. The buffer is used only for post-hoc scanning; it is not
emitted anywhere.
- New `detectStartupError(buffer)` helper scans the captured text
(with ANSI stripped) for a small, deliberately conservative set of
fatal markers:
* `Please run /login`
* `authentication_error` / `Invalid authentication credentials`
* `claude: command not found` / `command not found: claude`
* `API Error: 5xx`
* `fatal:`
Each pattern maps to a Japanese, actionable `verdictReason`
explaining what went wrong. The set is intentionally narrow so we
do not confuse a normal test failure inside the worker's TUI with a
startup crash — those patterns never appear in healthy runs.
- Single-turn path now runs the detector immediately after idle. On a
hit, the session is moved to `failed` with the matched reason. On a
miss, the existing "done with review-the-terminal" verdict stands.
- Iteration-mode path runs the detector once, after the first
iteration's idle, before executing the verify command. This is the
only moment the detector adds value: running verify against a
worker that never actually booted would produce a misleading
"verify failed" verdict instead of the real reason. Subsequent
iterations are assumed to be live because the supervisor is still
feeding them follow-up prompts and the worker is clearly
processing.
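A compact sketch of the two pieces described above: a bounded buffer for PTY output and a conservative marker scan. The marker list mirrors the patterns named in this commit message, but the regular expressions, the English placeholder reasons, and the helper name `appendToRing` are assumptions made for illustration (the real `verdictReason` strings are Japanese).

```typescript
const MAX_BUFFER_CHARS = 16 * 1024; // roughly the ~16 KB window mentioned above

/** Append a chunk, keeping only the most recent `max` characters. */
function appendToRing(buffer: string, chunk: string, max: number = MAX_BUFFER_CHARS): string {
  const combined = buffer + chunk;
  return combined.length > max ? combined.slice(combined.length - max) : combined;
}

// Placeholder reasons; the first matching marker wins.
const STARTUP_ERROR_MARKERS: Array<{ pattern: RegExp; reason: string }> = [
  { pattern: /Please run \/login/, reason: "auth: login required" },
  { pattern: /authentication_error|Invalid authentication credentials/, reason: "auth: credentials rejected" },
  { pattern: /claude: command not found|command not found: claude/, reason: "claude binary not found" },
  { pattern: /API Error: 5\d\d/, reason: "API server error" },
  { pattern: /fatal:/, reason: "fatal error" },
];

/** Returns a reason string when the captured output contains a fatal marker, else null. */
function detectStartupError(buffer: string): string | null {
  const clean = buffer.replace(/\u001b\[[0-9;]*[A-Za-z]/g, ""); // strip ANSI escapes
  for (const marker of STARTUP_ERROR_MARKERS) {
    if (marker.pattern.test(clean)) return marker.reason;
  }
  return null;
}
```

Returning the reason rather than a boolean lets the caller settle the session as `failed` with an actionable message in one step.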
Behavior after this fix on the reported failure
------------------------------------------------
Starting a session against a workspace whose `claude` binary is not
authenticated will now:
1. Launch claude via the canonical preset command (same as Run
button). If the auth problem was an artifact of the hand-built
command shape, it may resolve on its own.
2. If claude still fails authentication, the session will show
`status: failed` with verdictReason
"Claude Code の認証に失敗しました(API Error 401)。ワーカーの
ターミナルで `/login` を実行してください。" instead of the
misleading `done · iter 1`.
Verified
--------
- `bun run typecheck` in apps/desktop — clean.
…rkspace list for TODO agent
This commit is the backend half of a broader redesign of the TODO
agent surface. Three discrete changes land here:
1. `goal` is now optional on todo_sessions
-------------------------------------------
Motivation: not every TODO has a crisp acceptance sentence. Research
and investigation tasks naturally use "done when the requested work
(description) is finished" as the implicit goal, and making users
invent a separate goal string was pure friction.
Changes:
- `packages/local-db/src/schema/todo-sessions.ts`: `goal` column drops
`.notNull()`. Migration `0051_todo_goal_optional.sql` is the drizzle-
kit generated table-recreate migration that removes the NOT NULL
constraint.
- `apps/desktop/src/main/todo-agent/types.ts`: `todoCreateInputSchema`'s
`goal` is now an optional trimmed string that transforms empty to
undefined — same pattern as `verifyCommand`.
- `apps/desktop/src/main/todo-agent/trpc-router.ts`: the `create`
mutation now inserts `goal ?? null` so an omitted goal becomes a DB
null, not an empty string.
- `apps/desktop/src/main/todo-agent/supervisor.ts`:
- `renderGoalDoc` now emits "(未指定。上記『やって欲しいこと』が
完了した時点で完了とみなす)" as the goal body when the session
has no explicit goal, so the file the worker reads still has a
coherent acceptance section.
- `buildIterationPrompt` composes a `goalClause` that says either
"ゴール(受け入れ条件)を達成することを目指してください" or
"『やって欲しいこと』が完了した時点で完了とみなしてください"
depending on whether `session.goal` is set, and threads that
clause through all three prompt shapes (single-turn, first
iteration with verify, retry iteration).
2. AI rewrite helper for the TODO creation form
------------------------------------------------
New backend for the sparkle/✨ button that the creation modal will get
in the follow-up commit. Click → send the field's current text to a
small model with a tight rewrite prompt → receive a cleaner, more
LLM-friendly version back.
Implementation notes:
- Reuses the existing `callSmallModel` plumbing from
`apps/desktop/src/lib/ai/call-small-model.ts` — the same path the
workspace auto-namer uses. Zero new credential handling, zero new
provider fallback logic, diagnostics integration for free.
- `apps/desktop/src/main/todo-agent/enhance-text.ts` exposes
`enhanceTodoText(rawText, kind)` where `kind` is `"description" |
"goal"`. Each kind has a dedicated Japanese system prompt baked in:
* description: "ユーザーが書いた雑な TODO の記述を、自律
コーディングエージェントが理解しやすい明確な指示に書き換える"
* goal: "雑なゴールを、検証可能な受け入れ条件に書き換える"
Both prompts explicitly say "元の意図を保つ" and "新しい要件を
追加しない" to prevent the model from hallucinating scope creep,
cap the output at ~1-6 lines, and return only the rewritten text
without any "Sure, here's the rewrite:" preambles.
- Invokes via `callSmallModel` → `generateText` from the Vercel AI SDK
directly, since the `model` passed to the invoke callback is a
`LanguageModel` from `@ai-sdk/anthropic` (for the Anthropic path,
`claude-haiku-4-5-20251001`) or `@ai-sdk/openai` (OpenAI path).
Both accept `generateText({ model, system, prompt })` uniformly,
so the branching in `ai-name.ts` isn't needed here.
- `describeEnhanceFailure(attempts)` turns the SmallModelAttempt[] into
a user-facing Japanese error string, honoring the same hierarchy
the workspace namer uses (expired > failed > unsupported >
missing-credentials).
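The failure hierarchy that `describeEnhanceFailure` honors can be pictured as a priority scan over the attempts. The sketch below is hypothetical: the real `SmallModelAttempt` type is not shown in this commit message, so `AttemptSketch` and its `outcome` field are stand-ins, and the function returns the winning outcome rather than a Japanese message.

```typescript
type AttemptOutcome = "expired" | "failed" | "unsupported" | "missing-credentials";

// Stand-in for SmallModelAttempt; the real shape may differ.
interface AttemptSketch {
  provider: string;
  outcome: AttemptOutcome;
}

// Most informative outcome first: expired > failed > unsupported > missing-credentials.
const OUTCOME_PRIORITY: AttemptOutcome[] = ["expired", "failed", "unsupported", "missing-credentials"];

/** Pick the highest-priority outcome present among the attempts, or null if there are none. */
function pickMostRelevantOutcome(attempts: AttemptSketch[]): AttemptOutcome | null {
  for (const outcome of OUTCOME_PRIORITY) {
    if (attempts.some((attempt) => attempt.outcome === outcome)) return outcome;
  }
  return null;
}
```

A caller would then map the selected outcome to a user-facing message, so one expired credential is reported in preference to another provider simply being unconfigured.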
New tRPC surface:
- `todoAgent.enhanceText` — `{ text, kind }` in, `{ text }` out.
Throws TRPCError(INTERNAL_SERVER_ERROR, <japanese message>) on any
failure so the renderer can surface it in a toast.
3. Cross-workspace session list for the Agent Manager view
----------------------------------------------------------
The existing `todoAgent.list` query is workspace-scoped. The
follow-up Agent-Manager-style view needs a single flat feed of all
TODO sessions grouped by workspace, so we can present something
closer to Antigravity's "all agents in one place" UX.
- `apps/desktop/src/main/todo-agent/session-store.ts` adds a new
`TodoSessionListEntry` type (session fields + workspaceName,
workspaceBranch, projectName) and a new `listAll()` method that
LEFT JOINs `workspaces` and `projects` for those labels, filters
out workspaces being deleted (`isNull(workspaces.deletingAt)`),
and orders by `createdAt DESC`.
- `apps/desktop/src/main/todo-agent/trpc-router.ts` exposes
`todoAgent.listAll` as a no-arg query returning that entry list.
- The existing `list` is kept for any callers that only need a single
workspace and will continue to back the per-workspace badge count
on the TODO button.
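Because `listAll` returns one flat, already-sorted feed, a consumer can group it by workspace in a single pass. The sketch below assumes a reduced entry shape (`EntrySketch`, including a `workspaceId` field) rather than the full `TodoSessionListEntry`, and `groupByWorkspace` is a hypothetical helper name.

```typescript
// Reduced stand-in for TodoSessionListEntry; only the fields needed for grouping.
interface EntrySketch {
  id: string;
  title: string;
  workspaceId: string;
  workspaceName: string | null;
  projectName: string | null;
}

interface WorkspaceGroup {
  workspaceId: string;
  label: string;
  sessions: EntrySketch[];
}

/** Group a flat feed by workspace, preserving the feed's order within each group. */
function groupByWorkspace(entries: EntrySketch[]): WorkspaceGroup[] {
  const groups = new Map<string, WorkspaceGroup>();
  for (const entry of entries) {
    let group = groups.get(entry.workspaceId);
    if (!group) {
      group = {
        workspaceId: entry.workspaceId,
        label: `${entry.projectName ?? "?"} / ${entry.workspaceName ?? "?"}`,
        sessions: [],
      };
      groups.set(entry.workspaceId, group);
    }
    group.sessions.push(entry);
  }
  // Map iteration follows insertion order, so groups appear in first-seen order.
  return Array.from(groups.values());
}
```

Since the input is ordered by `createdAt DESC`, each group's sessions stay newest-first without re-sorting, and only one query is needed regardless of the number of workspaces.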
Housekeeping
------------
`.gitignore` now includes `.superset/todo/` so TODO agent runtime
artifacts (goal.md + any per-session state files) stay out of git.
Verified
--------
- `bun run typecheck` in apps/desktop — clean.
- Migration generated via `bunx drizzle-kit generate` in
packages/local-db (not hand-edited).
TODO autonomous agent sessions write goal.md and per-session state files into `.superset/todo/<session-id>/` inside the worktree. These are runtime scratch data, not source — keep them out of git. Paired with the backend commit that introduced the directory.
…ew and AI enhance
Major UX reshape of the TODO autonomous agent surface, replacing the
drawer panel with a full-view Agent-Manager-style interface inspired by
Google Antigravity's Agent Manager, Cursor 2.0's agents sidebar, and
Factory Desktop's sessions layout. Clicking the TODO button now opens a
single-pane-of-glass view of every autonomous session across every
workspace, with session creation available from within.
Research backing this shape: [antigravity.google/docs/agent-manager],
[cursor.com/changelog/2-0], [factory.ai/product/desktop], and
[docs.devin.ai/release-notes] consistently converge on a 2-pane layout
— grouped session list + detail — with a primary "+ new" button in the
header and a workspace / project as the grouping axis. Goal-optional UX
is also standard across these tools (Antigravity treats the initial
prompt as the implicit goal, Devin derives session titles from the
first message).
TodoManager (new: TodoManager/TodoManager.tsx)
-----------------------------------------------
Full-screen Dialog, 95vw × 86vh, no background scroll. Two panes:
- **Header** (h-12) — "TODO Agent Manager" title, subtitle, primary
`+ 新しい TODO` button that opens the existing TodoModal with the
current workspaceId pre-filled, close button. No workspace switcher
here — the list is already cross-workspace.
- **Sidebar** (300px) — a filter input at the top (title /
description / workspace substring match) and a scrollable list
grouped by workspace. Each group has a sticky, uppercase-label header
showing "project / workspace" and the session count. Grouping is done
client-side from the flat `listAll` feed so we do not pay N queries.
Each row shows a status dot, title, and `status · iter N` subline.
Status dot colors follow the same convention used elsewhere in the
app — amber-pulse for running/verifying/preparing, emerald for done,
rose for failed/escalated, muted for queued/aborted.
- **Detail pane** — metadata header (status dot + status label +
workspace / project breadcrumb + title), action buttons (Start for
queued, 中断 for active), DetailBlock sections for やって欲しいこと /
ゴール / Verify / 予算 / 直近の結果 / 介入. ゴール shows
"未指定 ·『やって欲しいこと』の完了をゴールとみなします" when the
session was created without an explicit goal. Verify shows
"単発モード(verify なし)" similarly. The intervene row is a
single-line Input + Send button that passes Enter to
`todoAgent.sendInput`, which writes into the worker PTY.
Start handler (migrated from the old TodoPanel, essentially verbatim):
creates a terminal tab via `useTabsStore.getState().addTab`, renames it
with `TODO: <title>`, builds the initial prompt with the same
goal-optional awareness as the supervisor (`ゴール...を目指してください`
vs `『やって欲しいこと』が完了した時点で完了とみなしてください`),
launches via the canonical `buildAgentPromptCommand` from
`@superset/shared/agent-command`, then calls `todoAgent.attachPane` to
hand the session to the supervisor. On success it invalidates both
`listAll` and the per-workspace `list` so the badge count on the TODO
button updates.
Uses the new `TodoSessionListEntry` type (moved from session-store into
`main/todo-agent/types.ts` so the renderer can import it with a `type`
import — types are stripped at runtime, so it is safe despite the
main-process file location).
TodoButton (simplified: TodoButton/TodoButton.tsx)
---------------------------------------------------
Previously had a split button with a dropdown offering "新しい TODO"
and "Open panel". Now a single compact button: click → open
TodoManager. Session creation moved entirely inside the Manager, so the
dropdown is gone — the primary affordance matches the user's requested
"click TODO → see what exists first, create from there" flow. The
active-sessions counter badge is retained so users still see in-flight
work at a glance from the PresetsBar.
TodoModal (TodoModal/TodoModal.tsx)
------------------------------------
Two changes:
- **Goal is now optional.** Drops the goal field from the submit
validation (`canSubmit`), marks the field "(任意)" in its label, adds
placeholder guidance "(空欄なら『やって欲しいこと』の完了をゴールと
します)". Submit passes `goal: hasGoal ? goal.trim() : undefined` so an
empty field cleanly becomes a null in the DB via the zod transform.
- **AI enhance buttons on description and goal.** New `EnhanceButton`
component lives under `TodoModal/components/EnhanceButton/` (one level
of co-location since it is only used by TodoModal). It is a small
sparkle/✨ ghost button rendered to the right of each Label, taking
the current field value and a setter. Click calls
`todoAgent.enhanceText` (the new mutation added in the backend commit)
with `kind: "description" | "goal"`, and on success replaces the field
value with the returned text. Uses `HiMiniSparkles` from react-icons
with an `animate-pulse` while running, matches the common "Improve
writing" UX pattern seen in Raycast AI, Linear AI, Notion AI, and v0.
The button is disabled when the field is empty (so users get
deterministic text before the rewrite pass).
Removed
-------
- `TodoPanel/TodoPanel.tsx` and `TodoPanel/index.ts` — fully superseded
by TodoManager. No consumers remain.
Not in this commit
-------------------
- File tree / terminal / browser preview side pane (the Factory Desktop
pattern). The Manager deliberately delegates the live worker view to
the workspace's own terminal tab to keep the surface focused.
- Archive / delete / bulk operations. All rows are currently read-only
except for Start / 中断 / Send.
- Artifact panel (TODO list / plan / diff viewer). Phase 3 item.
- Session pinning or status-based reorder. Current sort is createdAt
DESC within each workspace group.
Verified
--------
- `bun run typecheck` in apps/desktop — clean.
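The client-side grouping described for the sidebar (one flat `listAll` feed, grouped by workspace, newest first within each group) could look roughly like the sketch below. The `SessionEntry` shape and the `groupByWorkspace` name are assumptions for illustration; the real `TodoSessionListEntry` carries more fields.

```typescript
// Hypothetical row shape; the real TodoSessionListEntry has more fields.
interface SessionEntry {
  id: string;
  workspaceId: string;
  workspaceName: string;
  title: string;
  createdAt: number; // epoch milliseconds
}

interface WorkspaceGroup {
  workspaceId: string;
  workspaceName: string;
  sessions: SessionEntry[];
}

// Group a flat feed by workspace in a single pass, then sort each
// group by createdAt DESC. Groups keep first-seen order.
function groupByWorkspace(entries: SessionEntry[]): WorkspaceGroup[] {
  const groups = new Map<string, WorkspaceGroup>();
  for (const entry of entries) {
    let group = groups.get(entry.workspaceId);
    if (!group) {
      group = {
        workspaceId: entry.workspaceId,
        workspaceName: entry.workspaceName,
        sessions: [],
      };
      groups.set(entry.workspaceId, group);
    }
    group.sessions.push(entry);
  }
  const result = Array.from(groups.values());
  result.forEach((group) => {
    group.sessions.sort((a, b) => b.createdAt - a.createdAt);
  });
  return result;
}
```

Because the grouping is a pure function of the feed, it can be recomputed on every poll without extra queries.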
The Agent Manager view was rendering at ~512px wide instead of 95vw
because shadcn's `DialogContent` default classes include
`max-w-[calc(100%-2rem)] ... sm:max-w-lg` (see
`packages/ui/src/components/ui/dialog.tsx:64`). On any screen >= 640px
(i.e. the entire desktop target of this app) the `sm:max-w-lg` rule
lives inside a `@media (min-width: 640px)` block and overrides the
base-layer `max-w-none` I was passing in the override className.
tailwind-merge resolves conflicts per variant, not across variants — it
sees `max-w-none` and `sm:max-w-lg` as two distinct utilities and keeps
both. To actually nullify the sm-level cap I need to override it with
the same variant prefix.
Fix: add `sm:max-w-none` alongside the base `max-w-none` in the
DialogContent className. Everything else (w-[95vw], h-[86vh], p-0,
gap-0, overflow-hidden) stays.
Verified:
- `bun run typecheck` in apps/desktop — clean.
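To make the per-variant behaviour concrete, here is a deliberately simplified model of it. This is not tailwind-merge's implementation: `mergeMaxWidth` and `splitVariant` are hypothetical helpers that only understand the `max-w-*` group and treat everything before the last `:` as the variant.

```typescript
// Simplified model of per-variant conflict resolution.
// NOT tailwind-merge's real code; only `max-w-*` is modelled.
function splitVariant(token: string): { variant: string; utility: string } {
  const cut = token.lastIndexOf(":");
  return cut === -1
    ? { variant: "", utility: token }
    : { variant: token.slice(0, cut), utility: token.slice(cut + 1) };
}

// Keep only the LAST max-w-* class for each variant prefix.
// Classes from other groups pass through untouched.
function mergeMaxWidth(classes: string): string {
  const tokens = classes.split(/\s+/).filter(Boolean);
  const lastIndexByVariant = new Map<string, number>();
  tokens.forEach((token, index) => {
    const { variant, utility } = splitVariant(token);
    if (utility.startsWith("max-w-")) lastIndexByVariant.set(variant, index);
  });
  return tokens
    .filter((token, index) => {
      const { variant, utility } = splitVariant(token);
      if (!utility.startsWith("max-w-")) return true;
      return lastIndexByVariant.get(variant) === index;
    })
    .join(" ");
}

const defaults = "max-w-[calc(100%-2rem)] sm:max-w-lg";

// Base-only override: the sm: cap survives because it is a different variant.
const broken = mergeMaxWidth(`${defaults} w-[95vw] max-w-none`);

// Override with the matching variant prefix: both caps are replaced.
const fixed = mergeMaxWidth(`${defaults} w-[95vw] max-w-none sm:max-w-none`);
```

In this model `broken` still contains `sm:max-w-lg`, while `fixed` does not, which mirrors the bug and the fix described above.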
Previous commit expanded the Agent Manager to 95vw after discovering
the sm:max-w-lg override issue — but on a typical laptop that is way
too big and crowds the app chrome. Dial it back to a bounded fixed
width with a viewport cap.
- DialogContent width: `w-[1080px] max-w-[calc(100vw-4rem)]` with the
matching `sm:max-w-[calc(100vw-4rem)]` override so shadcn's default
`sm:max-w-lg` (512px) stays disabled. 1080px is roughly the
Antigravity Agent Manager width on a 1440p monitor and leaves 32px
margins on narrower screens.
- DialogContent height: `h-[80vh] max-h-[760px]` so tall monitors get a
reasonable cap instead of stretching near-full-screen.
- Dialog layout switched to `flex flex-col` and the content grid is now
`flex-1 min-h-0` instead of `h-[calc(86vh-48px)]`. This adapts
automatically whether the content is limited by the viewport or by
max-h-[760px], and avoids the brittle subtraction math that breaks
when the header height changes.
- Sidebar narrowed from 300px to 260px to give the detail pane more
breathing room at the smaller overall width.
Verified:
- `bun run typecheck` in apps/desktop — clean.
Middle ground between the previous two attempts:
- 95vw (committed as 55a484c) was too wide on desktop
- 1080px × 80vh (f0031b3 / 25f0719) was too small
Settles at:
- width: `w-[1360px]` target with `max-w-[calc(100vw-2rem)]` cap (plus
the matching `sm:max-w-[calc(100vw-2rem)]` to keep shadcn's default
`sm:max-w-lg` from re-applying at the sm breakpoint). 1rem margin on
each side on narrower screens.
- height: `h-[85vh] max-h-[860px]` — a bit taller so the detail pane
comfortably shows header + description + goal + verify + budget +
verdict without scrolling for a typical session.
- sidebar: back to 300px (up from 260px) now that the overall width can
afford it. Matches the original design intent and the Antigravity /
Cursor sidebar widths the research reference had.
Verified: `bun run typecheck` in apps/desktop — clean (no code paths
changed, only class tokens).
…start, delete/rerun, collapsible groups
Seven related UX improvements for the Agent Manager view, addressing
the feedback from the first round of live usage:
1. Manager size: ~1.5×
-----------------------
Previous pass landed at `w-[1360px] h-[85vh] max-h-[860px]`. The
user found it too constrained for the cross-workspace view. Bumped
to `w-[2040px] max-w-[calc(100vw-2rem)] h-[92vh] max-h-[1290px]`
with sidebar widened from 300px to 340px to match. The width cap
means narrower laptops still get `viewport - 2rem` so it never
exceeds the screen.
2. "ターミナルを開く" button
----------------------------
New outline button in the detail-pane header whenever the session
has `attachedTabId`. Click → calls `useTabsStore.setActiveTab(
workspaceId, attachedTabId)` which makes the worker tab the active
tab in that workspace. If the session is in a *different* workspace
than the one the user is currently on, we still set the active tab
(so it sticks when the user navigates there) and show a toast
explaining they need to switch workspaces manually. Cross-workspace
navigation from the manager is a v2 item.
3. Background start
--------------------
Previously, clicking Start called `tabs.addTab(workspaceId)` which
also set the new tab as the active tab — stealing the user's focus
the moment they closed the dialog. Now the Start handler captures
`activeTabIds[workspaceId]` BEFORE calling addTab, and after the
launch finishes calls `setActiveTab(workspaceId, previousActiveTabId)`
to restore it. The worker keeps running in the background tab; the
user only sees it when they explicitly click "ターミナルを開く".
Toast message updated to "バックグラウンドで開始しました: <title>".
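A minimal sketch of the capture-then-restore pattern described above. The `TabStore` interface is a hypothetical stand-in for `useTabsStore`, and `launch` is treated as synchronous here for brevity; restoring in `finally` is a choice made for this sketch, not something the commit states.

```typescript
// Hypothetical minimal tab store; the real useTabsStore API may differ.
interface TabStore {
  activeTabIds: Record<string, string | undefined>;
  addTab(workspaceId: string): string; // creates a tab AND makes it active
  setActiveTab(workspaceId: string, tabId: string): void;
}

function startInBackground(
  store: TabStore,
  workspaceId: string,
  launch: (tabId: string) => void,
): string {
  // Capture the user's current tab BEFORE addTab steals focus.
  const previousActiveTabId = store.activeTabIds[workspaceId];
  const workerTabId = store.addTab(workspaceId);
  try {
    launch(workerTabId);
  } finally {
    // Restore focus even if the launch throws.
    if (previousActiveTabId !== undefined) {
      store.setActiveTab(workspaceId, previousActiveTabId);
    }
  }
  return workerTabId;
}
```

The ordering is the important part: reading `activeTabIds[workspaceId]` after `addTab` would return the worker tab itself.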
4. Collapsible workspace groups
-------------------------------
Sidebar group headers are now buttons. Each click toggles a local
`collapsedGroups: Set<string>` keyed by workspaceId. Collapsed
groups hide their session rows but keep showing the group header
with a chevron-right icon (vs chevron-down when expanded). State
is component-local — not persisted — which is fine for a dialog
that opens transiently.
5. Delete past sessions
-----------------------
New `todoAgent.delete(sessionId)` tRPC mutation:
- Calls `supervisor.abort` as a safety no-op (the supervisor's
`abort()` is already idempotent for non-active sessions).
- `store.remove(sessionId)` drops the DB row via drizzle delete.
- Best-effort `rmSync(.superset/todo/<id>, { recursive: true,
force: true })` wipes the artifact directory. Failure is
logged but does NOT fail the mutation — DB is the source of
truth.
UI: trash icon button in the detail header. First click switches
to an inline "本当に削除 / キャンセル" ("Really delete / Cancel")
confirmation to avoid accidental deletes. Disabled while the session
is active, so users cannot blow away a running worker without an
explicit abort; sessions still in `queued` state can be deleted.
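The ordering of the delete mutation (abort, drop the row, then best-effort artifact cleanup) might be sketched as follows. The `DeleteDeps` interface is hypothetical and injected only to keep the example self-contained; in the commit the cleanup step is an `rmSync` call on the artifact directory.

```typescript
// Hypothetical dependencies, injected so the sketch stays self-contained.
interface DeleteDeps {
  abort(sessionId: string): void; // idempotent for non-active sessions
  removeRow(sessionId: string): void; // DB delete; errors propagate
  removeArtifacts(sessionId: string): void; // e.g. wraps rmSync; may throw
  warn(message: string): void;
}

function deleteSession(deps: DeleteDeps, sessionId: string): { ok: true } {
  deps.abort(sessionId);
  // The DB is the source of truth, so a failure here fails the mutation.
  deps.removeRow(sessionId);
  try {
    deps.removeArtifacts(sessionId);
  } catch (err) {
    // Best effort: log the failure but still report success.
    deps.warn(`artifact cleanup failed for ${sessionId}: ${String(err)}`);
  }
  return { ok: true };
}
```

Only the filesystem step is wrapped in `try`/`catch`; a failed row delete is allowed to surface to the caller.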
6. Re-run past sessions
-----------------------
New `todoAgent.rerun(sessionId)` tRPC mutation: loads the source
session and creates a brand-new queued session that copies every
user-authored field (title, description, goal, verifyCommand,
maxIterations, maxWallClockSec, workspaceId, projectId) and resets
all execution state (status, phase, iteration, pane IDs, verdict,
completedAt). Calls `prepareArtifacts` so the new session has its
own `goal.md` under `.superset/todo/<new-id>/`.
UI: refresh-arrow icon button in the detail header, shown only for
"final" statuses (done / failed / escalated / aborted). On success
the new session appears at the top of the list via the listAll
refetch, and the user picks it up to Start.
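The copy-then-reset rule for re-runs could be expressed like this. The `TodoSession` shape below is an assumption reconstructed from the field names listed above; the real row type lives in the schema and may differ.

```typescript
type TodoStatus =
  | "queued" | "preparing" | "running" | "verifying"
  | "done" | "failed" | "escalated" | "aborted";

// Assumed row shape, reconstructed from the fields named in the commit.
interface TodoSession {
  id: string;
  workspaceId: string;
  projectId: string;
  title: string;
  description: string;
  goal: string | null;
  verifyCommand: string | null;
  maxIterations: number;
  maxWallClockSec: number;
  status: TodoStatus;
  phase: string | null;
  iteration: number;
  attachedPaneId: string | null;
  attachedTabId: string | null;
  verdictReason: string | null;
  createdAt: number;
  completedAt: number | null;
}

// Copy every user-authored field, reset all execution state.
function cloneForRerun(source: TodoSession, newId: string, now: number): TodoSession {
  return {
    id: newId,
    workspaceId: source.workspaceId,
    projectId: source.projectId,
    title: source.title,
    description: source.description,
    goal: source.goal,
    verifyCommand: source.verifyCommand,
    maxIterations: source.maxIterations,
    maxWallClockSec: source.maxWallClockSec,
    status: "queued",
    phase: null,
    iteration: 0,
    attachedPaneId: null,
    attachedTabId: null,
    verdictReason: null,
    createdAt: now,
    completedAt: null,
  };
}
```

Listing the fields explicitly, rather than spreading `source`, keeps a newly added execution-state column from being copied by accident.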
7. "新しい TODO" stacks on top of the Manager
---------------------------------------------
Previously the TodoModal was rendered INSIDE TodoManager as a
sibling Dialog. Two shadcn/Radix Dialogs at the same level caused
click-outside / focus interactions to interfere — the modal would
sometimes close the Manager underneath it or focus the wrong
portal layer.
Fix: lift the modal up to TodoButton so both dialogs are rendered
as siblings at the TodoButton top level. TodoManager gains a new
`onRequestNewTodo: () => void` prop that it calls when the user
clicks "+ 新しい TODO"; TodoButton hooks that to its own
`setModalOpen(true)` state. The Manager stays open underneath;
the modal opens on top cleanly. `projectId` is now threaded through
TodoButton → TodoModal again.
Files touched
-------------
- `apps/desktop/src/main/todo-agent/session-store.ts`: new
`remove(sessionId)` method wrapping drizzle delete.
- `apps/desktop/src/main/todo-agent/trpc-router.ts`: new `delete`
and `rerun` mutations with artifact cleanup and prepareArtifacts
plumbing. Imports `TODO_ARTIFACT_SUBDIR` from types.ts.
- `apps/desktop/src/renderer/features/todo-agent/TodoButton/TodoButton.tsx`:
owns both Manager and Modal open state; passes `onRequestNewTodo`
into Manager; renders TodoModal as a sibling Dialog.
- `apps/desktop/src/renderer/features/todo-agent/TodoManager/TodoManager.tsx`:
resized, collapsible groups, terminal-jump + delete + rerun
buttons in detail header, background-start active-tab restore,
`onDeleted` callback to clear selection after delete.
Verified
--------
- `bun run typecheck` in apps/desktop — clean.
…ew and real verdict text
Root-cause rewrite addressing three critical bugs from the first live-
usage round:
1. **"It shows Done while it is still running"** — the idle-window
heuristic mistook long-thinking / long-tool-running phases for turn
completion and flipped sessions to `done` while Claude was still
working.
2. **"The verdict is the fixed string 単発タスク完了... and Claude's
final response is not visible"** — the single-turn path wrote a static
placeholder instead of the real final assistant message.
3. **"The TODO terminal shows up among the regular tabs"** — the
previous architecture created a real workspace terminal tab (via
`useTabsStore.addTab`) to host the interactive claude PTY, so
worker tabs leaked into the workspace tab bar.
Codex was consulted for design (see the branch discussion), and its
recommendation was unambiguous: move to **headless Claude Code
(`claude -p --output-format stream-json`) instead of a PTY**. That
single change dissolves all three problems at once:
- completion judgment = child process exit. No more idle guessing.
- verdict text = `result.result` from the NDJSON stream. No more
PTY ANSI scraping.
- no PTY → no tab store involvement → no leaked workspace tab.
Intentionally NOT passing `--bare`: per the installed claude 2.1.109
help text, `--bare` forces ANTHROPIC_API_KEY and refuses OAuth /
keychain reads, which would break the user's Claude Max auth. Running
without `--bare` keeps keychain OAuth working and we still get full
control over every argument.
Main-process backend (apps/desktop/src/main/todo-agent/*)
---------------------------------------------------------
- `supervisor.ts`: complete rewrite.
- `runClaudeTurn(...)` spawns `claude -p --output-format stream-json
--verbose --include-partial-messages --permission-mode acceptEdits
[--resume <sessionId>] <prompt>` as a node `child_process.spawn`
under the worktree cwd. No PTY.
- Line-buffered NDJSON parser on stdout (`drainLines` + `handleLine`).
Each parsed record is classified by `classifyStreamJson`:
* `system/init` → captures `session_id` for `--resume`
* `assistant` → extracts the text portion(s) and emits as a
`assistant_text` event; tool uses become `tool_use` events
with a one-line summary of command/file_path/pattern input
* `user` → `tool_result` events for tool outputs (truncated)
* `result` → captures `result` (final text), `total_cost_usd`,
`num_turns`; promoted to DB columns
* `error` → `error` event
* unknown → `raw` fallback
- Turn completion is process exit. If exit != 0 and no `result`
seen, stderr tail becomes the verdictReason. Signal abort uses
SIGINT then a 1.5s SIGKILL fallback.
- Iteration loop preserved: verify fail → retry with `--resume
claudeSessionId` + failure tail in prompt. The futility guard (3x
same failing test → escalated), the wall-clock budget cap, and the
iteration budget are unchanged.
- New `queueIntervention(sessionId, data)`: sets
`pendingIntervention` on the DB row. Supervisor reads-then-clears
it at each turn boundary and prepends it to the next prompt —
this is the headless replacement for mid-stream PTY keystrokes.
- Single-turn mode (no verify): one iteration, the final
assistant text becomes the verdict, session goes to `done`.
- `session-store.ts`: adds per-session in-memory stream event buffer
(capped at 500 events, ring-style trim from head). New
`appendStreamEvents / getStreamEvents / clearStreamEvents /
subscribeStream` APIs backed by EventEmitter. `remove` now also
drops the stream buffer.
- `types.ts`: new `TodoStreamEvent`, `TodoStreamEventKind`, and
`TodoStreamUpdate` types describing the condensed events the UI
renders. Kept intentionally small (id, ts, iteration, kind, label,
text) so tRPC IPC stays lightweight.
- `trpc-router.ts`:
- `attachPane` removed. It made no sense in a paneless world.
- New `start(sessionId)` mutation: validates the session is in a
non-active state, flips to `preparing`, kicks off the supervisor
loop fire-and-forget.
- `sendInput` now has `queueIntervention` semantics: it writes to
the `pending_intervention` DB column, and the text is consumed at
the next turn boundary.
- New `getStream(sessionId)` query for the initial paint.
- New `subscribeStream(sessionId)` observable subscription for
live stream events. trpc-electron constraint satisfied via
`@trpc/server/observable` as documented in AGENTS.md.
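The line-buffered NDJSON handling described for `supervisor.ts` can be approximated as below. This is a sketch under assumptions: the function names echo the commit (`drainLines`) but the bodies are illustrative, the record fields (`type`, `subtype`, `session_id`, `result`) are taken from the commit text rather than verified against the CLI, and `assistant` / `user` records are left out for brevity.

```typescript
type StreamEventKind = "init" | "result" | "error" | "raw";

interface StreamEvent {
  kind: StreamEventKind;
  text: string;
}

// Split a growing stdout buffer into complete lines and return the
// unfinished tail so the caller can prepend it to the next chunk.
function drainLines(buffer: string): { lines: string[]; rest: string } {
  const parts = buffer.split("\n");
  const rest = parts.pop() ?? "";
  return { lines: parts.filter((line) => line.trim() !== ""), rest };
}

// Classify one NDJSON record. Field names are assumptions taken from
// the commit text, not from a verified schema.
function classifyLine(line: string): StreamEvent {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return { kind: "raw", text: line };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { kind: "raw", text: line };
  }
  const record = parsed as Record<string, unknown>;
  const type = record["type"];
  if (type === "system" && record["subtype"] === "init") {
    return { kind: "init", text: String(record["session_id"] ?? "") };
  }
  if (type === "result") {
    return { kind: "result", text: String(record["result"] ?? "") };
  }
  if (type === "error") {
    return { kind: "error", text: line };
  }
  return { kind: "raw", text: line };
}
```

A stdout `data` handler would append each chunk to the leftover `rest`, call `drainLines`, and pass every complete line to `classifyLine`; a record split across two chunks is simply held in `rest` until its newline arrives.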
Schema (packages/local-db/src/schema/todo-sessions.ts)
------------------------------------------------------
Five new columns:
- `claude_session_id` TEXT — captured from `system.init`; used as
`--resume` key for retry iterations so the same conversation state
persists across verify loops.
- `final_assistant_text` TEXT — the real final Claude response,
captured from `result.result`. Replaces the static placeholder.
- `total_cost_usd` REAL, `total_num_turns` INTEGER — aggregated
across iterations from `result` events; displayed in a "消費"
block in the Manager detail pane.
- `pending_intervention` TEXT — the intervention queue.
Migration `0052_todo_headless_fields.sql` generated by drizzle-kit,
not hand-edited. The legacy columns `attached_pane_id` and
`attached_tab_id` and the old `verify`/`verdict` fields are retained
for backwards compat with existing rows; they are no longer written
by the supervisor.
Renderer (apps/desktop/src/renderer/features/todo-agent/TodoManager/TodoManager.tsx)
------------------------------------------------------------------------------------
Fully rewritten detail pane. Major changes:
- **No more `launchCommandInPane` / `addTab` / pane bookkeeping.**
Start simply calls `todoAgent.start.mutate({ sessionId })` and
lets the main-process supervisor do the work. Zero workspace
tab bar involvement.
- **Live stream view INSIDE the Manager.** New `StreamView` +
`StreamEventRow` components render the parsed events as colored
bubbles:
* assistant_text → primary border/background
* tool_use → amber
* tool_result → emerald
* result → stronger emerald
* error → rose
* raw / user-prompt → neutral
Each row shows `[iter N] label` + wall-clock HH:MM:SS + wrapped
body text. Capped at 500 events client-side to mirror the server
buffer. This is the "terminal inside the Manager" the user asked
for — not a PTY but a structured, labeled view of exactly what
Claude is doing, which is more useful than raw ANSI anyway.
- **Subscription wiring.** Selection changes reset the local event
state; `todoAgent.getStream` paints the initial snapshot; then
`todoAgent.subscribeStream` streams appends live.
- **Timing block.** New `TimingBlock` showing 作成 / 開始 / 終了 /
実行時間 in a 4-column grid. Uses `formatTimestamp` (local
wall-clock) and `formatDuration` (秒 / 分秒 / 時間分). Duration
falls back to `Date.now()` when the session is still running so
users see the live elapsed time.
- **最終回答 block.** Dedicated panel that shows
`session.finalAssistantText` when present. This is the real last
Claude response, stored directly from `result.result` — no more
PTY scraping, no more static placeholder.
- **消費 block.** Shows `$X.XXXX` and `N turns` when cost / turn
count are available from any `result` event.
- **Intervention.** The intervene input is retained but its
semantics are now "次のターンに注入する指示" ("instruction to inject
into the next turn"). Button label is now キュー ("Queue"). Any
pending intervention is shown below the input as "予約済み: ..."
("Scheduled: ...") so the user can see what will reach Claude next.
- **Start / 中断 / 再実行 / 削除** (start / abort / re-run / delete)
actions are preserved. "Start" is enabled for fresh queued sessions
and also for failed / aborted / escalated ones, so users can re-run
a failed session in place. The "ターミナルを開く" ("Open terminal")
button is gone; the live stream view replaces it.
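For the timing block, a duration formatter with the three tiers mentioned above (秒 / 分秒 / 時間分) might look like this. The tier boundaries and the `elapsedMs` helper are assumptions for illustration; only the tier names and the fall-back to the current time come from the commit.

```typescript
// Format a duration in three tiers: seconds, minutes+seconds, hours+minutes.
function formatDuration(ms: number): string {
  const totalSec = Math.max(0, Math.floor(ms / 1000));
  if (totalSec < 60) return `${totalSec}秒`;
  const totalMin = Math.floor(totalSec / 60);
  if (totalMin < 60) return `${totalMin}分${totalSec % 60}秒`;
  return `${Math.floor(totalMin / 60)}時間${totalMin % 60}分`;
}

// Elapsed time for a session: falls back to `now` while it is still
// running, so the caller can show a live counter.
function elapsedMs(
  startedAt: number | null,
  completedAt: number | null,
  now: number,
): number | null {
  if (startedAt === null) return null;
  return (completedAt ?? now) - startedAt;
}
```

Passing `now` in as a parameter, rather than calling `Date.now()` inside, keeps both helpers pure and easy to test.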
.gitignore
----------
Adds `.claude/worktrees/` so nested Claude Code scratch dirs (some
of them are themselves git repositories) do not get sucked into
commits. Unrelated to TODO agent but landed in the same surgery.
Verified
--------
- `bun run typecheck` in apps/desktop — clean.
- Migration generated via `bunx drizzle-kit generate` in
packages/local-db (not hand-edited).
Sources
-------
- `claude -p / --output-format stream-json / --resume`:
https://code.claude.com/docs/en/headless
- `--bare` ANTHROPIC_API_KEY requirement: local
`claude --help` on installed CLI (v2.1.109), matching
https://code.claude.com/docs/en/cli-reference
- Observable-only tRPC subscriptions:
apps/desktop/AGENTS.md
…ise, abort race, timing tick)
Advisor review of 50c4641 flagged three blocking issues plus two cheap
UX nits. All five land in one commit since they are all in the same two
files and independently reviewable.
Blocking fixes
--------------
1. **`--permission-mode acceptEdits` can hang in headless `-p` mode.**
`acceptEdits` auto-approves Edit/Write, but Bash tool calls still
prompt for approval. In `-p` mode there is no one to grant that
approval, so the child process would sit forever waiting for a prompt
that never arrives — and our Promise, tied to process `close`, would
never resolve. The session stays in `running` with no way for the user
to know it is dead.
Changed to `--permission-mode bypassPermissions` in `runClaudeTurn()`.
This is the correct mode for fully autonomous operation — the user
already opted into "let Claude do whatever" by creating an autonomous
TODO, so bypassing all permission checks is the right default. Added an
inline comment explaining why `acceptEdits` is insufficient.
2. **Child `error` event without `close` left the Promise hung.**
Spawn failures like ENOENT (claude binary missing from PATH) or EACCES
are reported asynchronously via the `error` event AFTER the `spawn()`
call returns. Node does not guarantee a subsequent `close` in every
failure path, so the old implementation — which only set `errorText` in
the error handler and only resolved in the close handler — could hang
forever in exactly the production failure mode it was trying to report.
Introduced a single-shot `settle()` helper with a `settled` boolean
guard. Both the `error` and `close` handlers funnel through it, so
whichever fires first cleans up the abort listener, drains any residual
stdoutBuffer, and resolves the outer Promise. The `error` handler now
also composes a user-facing reason prefix (`claude プロセスエラー: ...`)
so the session flips to `failed` with an actionable verdictReason
instead of vanishing into a `running` limbo.
3. **Abort race overwrote `aborted` with `escalated`.**
The iteration loop's `if (ac.signal.aborted) break;` exited the while
loop, but execution fell through to the unconditional final
`store.update(... status: "escalated", verdictReason: "iteration 予算を使い切りました")`.
The abort handler had already written `status: "aborted"`, so the final
write silently mislabeled aborted sessions as escalated with the wrong
reason.
Wrapped the final update in `if (!ac.signal.aborted)` so the escalation
verdict only lands when we exhausted the budget cleanly. Abort now wins
the race deterministically.
Cheap UX fixes
---------------
4. **TimingBlock 実行時間 was not ticking live.**
The component only re-rendered when the session prop changed, which
happens on the 2-second `listAll` polling cadence. The 実行時間 counter
could lag by up to 2 seconds behind the wall clock while a session was
running.
Added a 1-second `setInterval` in `SessionDetail` that forces a
re-render via a throwaway `tick` state. The interval only runs while
`session.completedAt == null` — it auto-stops the moment a session
settles so finished rows do not pay re-render cost.
5. **`getStream` initial-paint query duplicated the subscription's own
initial emit.**
`subscribeStream` already sends the current in-memory buffer to new
subscribers on connect, so the separate `todoAgent.getStream.useQuery`
was delivering every event twice on mount. The client dedupe Set
absorbed it, but it was wasted IPC and obscured the data-flow model.
Removed the `getStream` call from `SessionDetail`. The subscription is
now the single source of truth for stream events. (The server-side
`getStream` route is left intact as a harmless read helper for
potential future use.)
Non-blocking items intentionally deferred
------------------------------------------
- `pending_intervention` read-then-clear race (narrow window, harmless
drop, user can re-queue).
- Queue drains after abort (probably intended behavior — the user
aborted session A, not session B which was queued separately).
- Claude Code Stop hooks via `--settings` for a second completion
source (v2 reliability boost).
- `--max-budget-usd` automatic cost cap.
- Stream events JSONL persistence for session replay.
Verified
--------
- `bun run typecheck` in apps/desktop — clean.
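The single-shot guard from blocking fix 2 reduces to a small helper. The `createSettle` name and signature are hypothetical; the commit only states that a `settle()` helper with a `settled` boolean is shared by the `error` and `close` handlers.

```typescript
// Build a settle function that runs cleanup and resolve at most once,
// no matter how many handlers call it.
function createSettle<T>(
  cleanup: () => void,
  resolve: (value: T) => void,
): (value: T) => void {
  let settled = false;
  return (value: T) => {
    if (settled) return; // a later `close` after `error` is ignored
    settled = true;
    cleanup();
    resolve(value);
  };
}

// Sketch of the wiring inside a Promise executor (assumed shape):
//
//   const settle = createSettle(removeAbortListener, resolve);
//   child.on("error", (err) => settle({ exitCode: null, error: err.message }));
//   child.on("close", (code) => settle({ exitCode: code, error: null }));
```

Whichever event fires first decides the outcome, so a spawn failure that never produces `close` still resolves the Promise.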
…rvention and markdown stream
Four related UX fixes to the TODO Agent Manager detail view after
seeing it in action:
1. Live stream kept growing past the visible area and the intervention
input fell off the bottom with no way to scroll to it.
2. Intervention / send button was not reachable once the stream had
enough content to push past the viewport.
3. Right side of the dialog was mostly empty whitespace because the
detail pane was capped at `max-w-5xl` (~1024px) inside a 2040px dialog,
wasting ~700px of horizontal real estate.
4. Claude's assistant messages rendered as raw text with newlines
instead of real markdown, so code blocks, lists, and headings all came
through unformatted.
New SessionDetail layout
------------------------
Complete restructure from one flex-column-inside-scroll-area to a
three-region layout that owns its own scroll containers:
┌─────────────────────────────────────────────┐
│ HEADER (fixed): title + status + actions +  │
│ timing block                                │
├─────────────────┬───────────────────────────┤
│ LEFT COL        │ RIGHT COL                 │
│ (scroll,        │ (scroll, fills height)    │
│ ~34% of pane)   │                           │
│ - description   │ CLAUDE の応答 /           │
│ - goal          │ ライブストリーム          │
│ - verify/budget │                           │
│ - 消費          │ [iter N] event bubbles    │
│ - 最終回答      │ ...                       │
│ - verify 失敗   │ (auto-scrolls)            │
├─────────────────┴───────────────────────────┤
│ FOOTER (fixed): intervention input + hint   │
└─────────────────────────────────────────────┘
Implementation:
- Outer ScrollArea wrapper removed from TodoManager around the detail
slot. SessionDetail now claims the full grid cell with
`flex flex-col h-full min-h-0` and manages its own internal scrolling.
The TodoManager's 2-column grid already had `flex-1 min-h-0` so the
flex math chains all the way up.
- `grid-cols-[minmax(380px,34%)_1fr]` in the body region: left column
has a minimum of 380px for the metadata to stay readable and grows to
34% on wide dialogs; right column (1fr) soaks up the rest for the
stream. On the current 1360–2040px dialog that gives the stream
~900–1350px of horizontal space — huge improvement over the previous
1024px cap.
- Left column is wrapped in ScrollArea so long descriptions / failure
logs scroll independently without pushing the stream.
- Right column stream is its own flex flex-col with a small sticky-ish
label row on top (shrink-0) and a flex-1 min-h-0 StreamView beneath,
so the stream ALWAYS fills the remaining height — this is what the
user actually wanted to see.
- Footer is shrink-0 border-t so the intervention input is always
anchored at the bottom of the pane no matter how much content piles
up above it.
StreamView: self-scrolling + auto-pin to bottom
------------------------------------------------
The previous StreamView was a flex column with a fixed `max-h-[50vh]`
inner scroll that caused the weird double-scroll behavior. It is now a
single-container, self-owning scroll surface (`h-full overflow-auto`)
that fills whatever vertical space its parent gives it.
Added auto-scroll-to-bottom with a pin-to-bottom ref so users who have
scrolled up to read earlier output do not get yanked back down on every
new event:
- `pinnedToBottomRef` starts `true`.
- `onScroll` recomputes distance-from-bottom; pin flips to `false` when
the user scrolls more than 40px up and back to `true` when they
return near the bottom.
- A useEffect on events.length scrolls to the bottom only when
`pinnedToBottomRef.current` is true. New events never interrupt a
scroll-up read.
Markdown rendering for Claude's responses
------------------------------------------
StreamEventRow now branches on event.kind:
- `assistant_text` and `result` events go through the shared
`MarkdownRenderer` at `renderer/components/MarkdownRenderer`, which
wraps react-markdown with remark-gfm, rehype-raw, and
rehype-sanitize. `scrollable: false` so it expands naturally inside
the event bubble.
- `tool_use`, `tool_result`, `error`, and raw log lines stay in a plain
`whitespace-pre-wrap font-mono` div so command strings, file paths,
and stack traces keep their literal layout.
The "最終回答" block on the left column also uses MarkdownRenderer now
so the summary view renders identically to the in-stream final message.
No new dependencies: react-markdown, remark-gfm, rehype-raw, and
rehype-sanitize were already apps/desktop deps, and the shared
MarkdownRenderer component already handles image safety, selection
menu, and theming. Zero surgery on the ui package.
Verified
--------
- `bun run typecheck` in apps/desktop — clean.
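The pin-to-bottom rule from the StreamView section boils down to one distance check. In this sketch the 40px threshold comes from the commit, while the `ScrollMetrics` shape and function names are assumptions that mirror the standard DOM scroll properties.

```typescript
const PIN_THRESHOLD_PX = 40;

// Mirrors the DOM properties of a scroll container.
interface ScrollMetrics {
  scrollTop: number;
  scrollHeight: number;
  clientHeight: number;
}

// True when the viewport is within `threshold` px of the bottom edge.
function isPinnedToBottom(
  metrics: ScrollMetrics,
  threshold: number = PIN_THRESHOLD_PX,
): boolean {
  const distanceFromBottom =
    metrics.scrollHeight - metrics.scrollTop - metrics.clientHeight;
  return distanceFromBottom <= threshold;
}

// Where to scroll after new events arrive: jump to the bottom only if
// the user was pinned, otherwise leave their position alone.
function nextScrollTop(metrics: ScrollMetrics, pinned: boolean): number {
  return pinned
    ? Math.max(0, metrics.scrollHeight - metrics.clientHeight)
    : metrics.scrollTop;
}
```

An `onScroll` handler would store the result of `isPinnedToBottom` in a ref, and the effect that runs on new events would apply `nextScrollTop` using that stored value.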
…opy buttons, rounded, flex layout
Five related improvements to the Agent Manager surface after another
round of live feedback on `bd0cd2cb6`:
1. **Footer clipped under heavy content**
The previous commit tried to pin the intervention input with a
`shrink-0 border-t` footer but still used a CSS grid
(`grid-cols-[minmax(380px,34%)_1fr]`) for the body region. In
some Chromium layout passes, grid rows inside a flex parent did
not compute their height from `flex-1 min-h-0` cleanly, so when
the stream view had enough content it pushed the footer out of
view and the user could not reach the input or hint.
Fix: converted the body region to **flex** (`flex flex-1
min-h-0` with a sized left column and a `flex-1 min-w-0 min-h-0`
right column). Flex chains height resolution deterministically
from `DialogContent` → TodoManager body → SessionDetail body →
StreamView. The footer is guaranteed visible regardless of
content height. Left column width is `w-[34%] min-w-[360px]
max-w-[520px]` so metadata stays readable on narrow screens and
does not hog space on wide ones.
2. **Sidebar collapse toggle**
New chevron button in the TodoManager header at the top-left.
Clicking it toggles `sidebarCollapsed: boolean` local state. The
sidebar wrapper is always mounted (no content remount flash on
reopen); its width transitions between `w-[320px]` and `w-0`
over 150ms with `overflow-hidden` so the collapsed state is
truly invisible and gives the detail pane the full remaining
width. `border-r-0` is applied when collapsed so the lingering
border does not create a 1px sliver on the left.
3. **Row kebab menu with rename / re-run / delete**
Every `SessionRow` in the sidebar now has an
`HiMiniEllipsisVertical` button in its top-right corner
(`opacity-0 group-hover:opacity-100` so it does not clutter
idle rows). It opens a DropdownMenu with:
- **リネーム** — starts inline rename via a small Input that
swaps in for the title, autoFocused, with Enter to commit
and Escape to cancel. Blur also commits. The new trpc
`todoAgent.updateTitle` mutation validates 1..200 chars,
writes `title` + `updatedAt` on the DB row, and invalidates
`listAll` + per-workspace `list` so the row re-renders.
- **タイトルをコピー** — copies just the session title to the
clipboard via the shared `copyToClipboard` helper.
- **同じ内容で再実行** — calls the existing `todoAgent.rerun`
mutation (already added in 70bce0d) to clone the session
with a fresh queued row.
- **削除** — if the session is currently active, aborts it
first (idempotent no-op otherwise), then calls `todoAgent.
delete`. Clears selection in the parent via `onDeleted` so
the detail pane does not keep showing a dead row.
Rows are now rounded (`rounded-lg`), use
`px-1.5 py-1 gap-0.5` spacing in each group, and hover states
match the rest of the app's sidebar.
4. **Copy buttons on content blocks**
New `<CopyIconButton value title label />` helper component:
small rounded-md ghost button with `HiMiniDocumentDuplicate`.
Writes the value to the system clipboard via
`navigator.clipboard.writeText` (same pattern
`ProblemsView.tsx:184` uses) and surfaces a toast.
Wired into:
- **`<DetailBlock>`** via a new optional `action` prop that
renders in the header row. Used on "最終回答" and "直近の
verify 失敗ログ" so the user can grab the full content
without manually selecting.
- **`StreamEventRow`** header: button is
`opacity-0 group-hover:opacity-100` next to the timestamp
so every event bubble can be copied with one click without
visual noise during read.
Final answer and failure log containers are now
`rounded-lg border border-border/40 bg-muted/40` to match the
stream event bubbles and the rest of the app's card style.
5. **Rounded polish to match app design language**
- DialogContent: `rounded-xl` on the dialog itself
- All header buttons: `rounded-md`
- Sidebar filter input: `rounded-md`
- Stream event bubbles: `rounded-lg`
- Final answer / failure log containers: `rounded-lg`
- Session rows: `rounded-lg`
- Copy buttons / kebab button: `rounded-md`
Backend
-------
- New `todoAgent.updateTitle(sessionId, title)` mutation.
Validates via zod (1..200 chars, trimmed), throws NOT_FOUND if
the session is missing, otherwise updates just the title via the
existing `sessionStore.update` helper (which also bumps
`updatedAt` and emits the state event for any live
subscriptions).
Verified
--------
- `bun run typecheck` in apps/desktop — clean.
…ative dates and panel-left icon
Five fixes in one round after seeing the previous polish commit in use:
1. **最終回答 + verify 失敗ログ (final answer + verify failure log)
still overflowed under the footer.** The flex restructure in c5cd524
helped, but the left column was still wrapped in shadcn `<ScrollArea>`.
ScrollArea's internal height plumbing (viewport `h-full` inside a
flex-col-inside-flex-col chain) did not always converge in this layout,
letting the content push the pinned intervention input off the dialog.
Replaced the left column's ScrollArea with a plain
`<div className="... min-h-0 overflow-y-auto">`. Added
`overflow-hidden` to the SessionDetail root, its body flex div, and the
right column + StreamView wrapper so any child that somehow grows
beyond its allotment gets clipped instead of pushing siblings. This is
belt-and-suspenders on top of the existing flex math and has been the
more reliable pattern for "pinned footer + scrolling body" in
Electron/Chromium.
2. **Stream history was lost when a session was not currently
running.** Events only lived in-memory (ring-buffered at 500 entries)
and were cleared on each new `runSession`, so past sessions, including
ones from a previous app launch, showed an empty stream view.
Persistence in `session-store.ts`:
- `appendStreamEvents` now also appends to
`{artifactPath}/stream.jsonl` via `appendFileSync`, one line per
event, with the full event shape (id, ts, iteration, kind, label,
text). The artifact dir is already per-session and gets cleaned up on
delete via the existing `rmSync(recursive: true)`, so persistence and
cleanup stay coupled.
- `getStreamEvents` now falls back to `loadStreamEventsFromDisk` when
the in-memory buffer is empty, parsing the JSONL and validating each
line defensively (malformed lines are silently skipped).
- No DB / schema changes. No new dependencies. Works on app restart.
The tRPC `subscribeStream` observable already seeds new subscribers by
calling `getStreamEvents()`, so historical events flow through the
existing subscription path without any client changes. Past sessions
just "come back to life" when selected.
3. **Row kebab button was aligned to the top and would collide with the
new relative date at bottom-right.** Restructured `SessionRow` from
"absolute-positioned kebab over the row" to "flex row with a dedicated
kebab column": the main button is `flex-1 min-w-0` and the kebab
trigger sits in a sibling `<div className="flex items-center pr-1">`
at the right edge. `items-center` vertically centers the kebab relative
to the full two-line row, and placing it in its own column guarantees
zero overlap with content inside the button.
4. **Relative time display on each sidebar row.** Added
`formatRelativeTime(ms)` helper: 今 / N分前 / N時間前 / N日前 /
Nヶ月前 / N年前 (now / N minutes ago / N hours ago / N days ago /
N months ago / N years ago). Rendered at bottom-right of each row in
`text-[10px] text-muted-foreground tabular-nums`. The status label gets
`flex-1` so it takes whatever horizontal room is left after the date,
and the date stays stuck to the right edge. The kebab is in its own
column, so hover-revealing it never covers the date.
5. **Sidebar toggle icon.** Replaced the makeshift chevron with the
lucide `LuPanelLeftOpen` / `LuPanelLeftClose` pair from
`react-icons/lu`, the same icon the app's other panel toggles use and
the one the user pointed at in screenshot 10. No new dep: `react-icons`
was already used heavily (WorkspaceListItem, SearchDialog, etc.), so
this is just picking a different icon from the same bundle.
Verified
--------
- `bun run typecheck` in apps/desktop: clean.
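The defensive JSONL replay described in item 2 of this commit could look roughly like the sketch below. The event fields follow the list given in the message (id, ts, iteration, kind, label, text); the function names, the field types, and the exact validation rules are assumptions for illustration, not the code in `session-store.ts`.

```typescript
// Hypothetical event shape: field names come from the commit message,
// the types are assumed.
interface StreamEvent {
  id: string;
  ts: number;
  iteration: number;
  kind: string;
  label: string;
  text: string;
}

// Type guard that accepts only objects carrying every expected field.
function isStreamEvent(value: unknown): value is StreamEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === "string" &&
    typeof v.ts === "number" &&
    typeof v.iteration === "number" &&
    typeof v.kind === "string" &&
    typeof v.label === "string" &&
    typeof v.text === "string"
  );
}

// Parse the contents of a stream.jsonl file, one event per line.
// Blank, malformed, or incomplete lines are skipped instead of throwing.
function parseStreamJsonl(contents: string): StreamEvent[] {
  const events: StreamEvent[] = [];
  for (const line of contents.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (isStreamEvent(parsed)) events.push(parsed);
    } catch {
      // A truncated line (for example after a crash mid-append) is ignored.
    }
  }
  return events;
}
```

Skipping bad lines rather than failing the whole load means a single partially written line cannot hide the rest of a session's history.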
… working tree, per-file diff)
Adds a third pane to the TODO Agent Manager that shows exactly what
the worker produced in a given session — git commits made since the
session started, current working tree state, and unified diffs for
any selected file or commit.
Why scoped to the session
-------------------------
The existing `ChangesView` right sidebar in the rest of the app is
hard-coded to show diffs against the default branch HEAD, which
conflates "what this TODO did" with "everything the user has been
working on in this worktree". To get a clean per-session view we
capture the git HEAD SHA the instant the supervisor starts a run
and use it as the range base for `git log <sha>..HEAD`. Commits
made by the user before the session started are excluded by
construction.
Schema
------
- `packages/local-db/src/schema/todo-sessions.ts`: new nullable
`startHeadSha: text("start_head_sha")` column. Nullable so
existing rows and freshly-created `queued` sessions without a
captured base do not break.
- `packages/local-db/drizzle/0053_todo_start_head_sha.sql`:
drizzle-kit-generated `ALTER TABLE ADD COLUMN` migration. SQLite
applies this without rewriting existing rows.
Backend
-------
- `apps/desktop/src/main/todo-agent/git-status.ts` (new):
- `getCurrentHeadSha(cwd)` — thin wrapper used by the supervisor
at run start.
- `getSessionGitSnapshot({ cwd, startHeadSha })` — runs
`rev-parse --abbrev-ref HEAD`, `rev-parse HEAD`,
`log <startSha>..HEAD --format=...` (when start SHA is set
and different from current), `status --porcelain=v1
--untracked-files=all`, and `rev-list --left-right --count
HEAD...@{u}` to return branch / commit list / working-tree
files (with stage distinction staged/unstaged/untracked) /
ahead-behind counters.
- `getSessionFileDiff({ cwd, startHeadSha, path, scope,
commitSha })` — unified diffs for four scopes: `session`
(startSHA..HEAD for that path), `staged` (git diff --cached),
`unstaged` (git diff), `commit` (git show for a single
commit's changes to that path).
- All calls go through `execGitWithShellPath` from
`lib/trpc/routers/workspaces/utils/git-client` so PATH is
resolved the same way the rest of the app's git layer does.
Read-only; no mutations.
- `apps/desktop/src/main/todo-agent/supervisor.ts`: at the top of
`runSession`, capture `const startHeadSha = await
getCurrentHeadSha(worktreePath)` and pass it to the initial
`store.update({ ..., startHeadSha })`. Done before the claude
subprocess is even spawned so the range anchor is accurate even
if the run fails immediately.
- `apps/desktop/src/main/todo-agent/trpc-router.ts`:
- New `todoAgent.gitSnapshot({ sessionId })` query.
- New `todoAgent.gitFileDiff({ sessionId, path, scope,
commitSha })` query.
- Both resolve the worktree path via the existing
`resolveWorktreePath(workspaceId)` and delegate to the helper.
- Also threaded `startHeadSha: null` through the existing
`create` and `rerun` store.insert calls to satisfy the new
NOT-OPTIONAL-at-insert type shape.
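The staged / unstaged / untracked split that `getSessionGitSnapshot` derives from `git status --porcelain=v1` can be illustrated with a small parser. This is a minimal sketch under stated assumptions: the names `parsePorcelainV1` and `WorkingTreeEntry` are hypothetical, and quoted paths (which git emits for unusual characters) are not handled.

```typescript
type Stage = "staged" | "unstaged" | "untracked";

interface WorkingTreeEntry {
  path: string;
  stage: Stage;
  code: string; // single status letter such as M, A, D, R, or ?
}

// Porcelain v1 lines look like "XY path": X is the index (staged) status,
// Y is the working-tree (unstaged) status, and "??" marks untracked files.
function parsePorcelainV1(output: string): WorkingTreeEntry[] {
  const entries: WorkingTreeEntry[] = [];
  for (const line of output.split("\n")) {
    if (line.length < 4) continue; // need "XY " plus at least one path char
    const x = line.charAt(0);
    const y = line.charAt(1);
    let path = line.slice(3);
    // Renames and copies are reported as "old -> new"; keep the new path.
    if (x === "R" || x === "C" || y === "R" || y === "C") {
      const arrow = path.indexOf(" -> ");
      if (arrow !== -1) path = path.slice(arrow + 4);
    }
    if (x === "?" && y === "?") {
      entries.push({ path, stage: "untracked", code: "?" });
      continue;
    }
    // A file can be both staged and modified again in the working tree,
    // in which case it yields one entry per stage.
    if (x !== " ") entries.push({ path, stage: "staged", code: x });
    if (y !== " ") entries.push({ path, stage: "unstaged", code: y });
  }
  return entries;
}
```

Emitting one entry per stage matches the UI description, where a click selects a file together with its stage before requesting the diff.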
Renderer
--------
- `apps/desktop/src/renderer/features/todo-agent/TodoManager/
ChangesSidebar/ChangesSidebar.tsx` (new):
- Header row with branch name + spinning refresh icon that
invalidates both the snapshot and the currently selected diff
query.
- "開始時 HEAD" block showing `startSha.slice(0,12) →
currentSha.slice(0,12)` and ahead/behind counts when set.
- "コミット (N)" collapsible section listing each new commit
with shortSha, subject, author, short relative date. Click
selects the commit and loads its diff via `gitFileDiff`
scope=`commit`.
- "ワーキングツリー (N)" collapsible section listing staged,
unstaged, and untracked files. Small colored status badge
(M/A/D/R/? with amber/emerald/rose/primary/muted colors).
Click selects file+stage and loads its diff via `gitFileDiff`
scope=`staged`|`unstaged`. Untracked files are shown but not
clickable for diff (no diff target).
- `DiffBlock` — monospace `<pre>` renderer with color-coded
lines (+/-, hunk headers, file headers). Wrapped in a
`max-h-[50vh] overflow-auto` so long diffs are scrollable
inside the sidebar without pushing other sections off-screen.
- `apps/desktop/src/renderer/features/todo-agent/TodoManager/
TodoManager.tsx`:
- New `changesSidebarCollapsed` state, kept in component-local
state.
- New `LuPanelRightOpen` / `LuPanelRightClose` toggle button
placed before the close (×) button in the header.
- Body flex now renders a third column after the detail pane:
`shrink-0 border-l min-h-0 overflow-hidden transition-[width]`
that swaps between `w-[380px]` and `w-0 border-l-0`. Mounts
`<ChangesSidebar sessionId workspaceId active />` where
`active` is true for queued/preparing/running/verifying so
the polling only runs while something meaningful can change.
Verified
--------
- `bun run typecheck` in apps/desktop — clean.
…se events
Two small but visible polish items from the review.
1. **AI enhance buttons now render as pure icon**
The little ✨ next to the やって欲しいこと / ゴール (what you want
done / goal) fields in the TODO creation modal used to render as
`[✨ AI]`, with the text label taking more horizontal room than the
input fields had to spare. Now it is a single 24×24 ghost button
holding only `HiMiniSparkles`; the running state swaps to
`animate-pulse` on the icon instead of a text change. The tooltip
carries both "AI で書き換える" (Rewrite with AI) and
"AI で書き換え中…" (Rewriting with AI…) for state clarity.
File: `apps/desktop/src/renderer/features/todo-agent/TodoModal/
components/EnhanceButton/EnhanceButton.tsx`
2. **Setup-phase events rendered in the Manager live stream**
The live stream was empty until Claude actually started producing
output. Users had no signal for what the supervisor was doing during
the (usually sub-second but sometimes noticeable) boot window:
resolving the worktree, capturing the git HEAD, deciding the run shape.
The sidebar now paints these upfront as `kind: "system_init"` events
with iteration 0 (so they visually anchor before any turn-1 events):
- "セットアップ — ワークスペースを解決しています…" (setup: resolving
the workspace)
- "worktree — <absolute path>"
- "開始時 HEAD — <12-char sha>" (HEAD at start)
- "verify — <command>" OR "モード — 単発タスク(外部 verify なし)"
(mode: one-shot task, no external verify)
- "予算 — N iter · M 分" (budget: N iterations, M minutes)
- "Claude — claude -p --output-format stream-json を起動します"
(launching the command)
Emitted via a new thin `appendSetupEvent(sessionId, label, text)`
helper in `supervisor.ts` that wraps
`getTodoSessionStore().appendStreamEvents`, so the setup events flow
through the existing in-memory + JSONL persistence + live subscription
pipeline for free.
Verified
--------
- `bun run typecheck` in apps/desktop: clean.
Adds an optional "新しい worktree を作成してそこで実行する" checkbox
to the TODO creation modal. When checked, submit runs a two-step
flow:
1. Create a new workspace for the same project via the existing
`workspaces.create` tRPC mutation, passing the TODO title +
description joined as the `prompt` field. `workspaces.create`
already handles AI branch name generation
(`generateBranchNameFromPrompt` in
`workspaces/utils/ai-branch-name.ts`) and AI workspace auto-name
(`attemptWorkspaceAutoRenameFromPrompt` in
`workspaces/utils/ai-name.ts`) internally, both going through
the same `callSmallModel` path the TODO text enhancer uses, so
the naming stays consistent across features without touching
any of the workspace-creation plumbing.
2. Take the newly-created `workspace.id` and thread it into the
usual `todoAgent.create` mutation as the target workspaceId.
The TODO session is now scoped to the fresh worktree and,
when Start is clicked, the supervisor captures that worktree's
git HEAD as `startHeadSha` so the Changes right sidebar
automatically reflects "everything this TODO produced" from
the first commit onward.
UI (TodoModal)
--------------
- New bordered "card" below the title field with a shadcn
`Checkbox` plus a short label and explanation. Uses the
`bg-primary/5 border-primary/40` treatment when checked so the
user sees at a glance that they are entering a different mode.
- Checkbox is disabled (`opacity-60`) and shows a muted
explanation when the current workspace has no projectId. This
happens for workspaces that predate the projects table or for
branch-type workspaces without a project binding — the tRPC
mutation requires `projectId`, so we fail fast with a clear
message instead of letting the request explode.
- Added a small ✨ `HiMiniSparkles` next to the label to signal
that the naming is AI-driven, matching the visual language of
the per-field enhance buttons that now also render as
icon-only sparkles.
- `reset()` in the modal clears the checkbox too, so reopening never
keeps the previous mode.
Flow (handleSubmit)
-------------------
- Default path (checkbox off): unchanged. Uses the current
workspaceId directly.
- New worktree path: calls `workspaces.create.mutateAsync`, waits
for the `{ workspace, ... }` result, extracts `workspace.id`
as `targetWorkspaceId`, then proceeds with the existing TODO
create mutation against that id. On success, toast says "新し
い worktree を作成して TODO セッションを紐付けました" so the
user knows both operations completed.
- Errors from either mutation surface via the existing toast
fallback — e.g. "このワークスペースにはプロジェクトが紐付いて
いないので新しい worktree を作成できません" when a user somehow
manages to submit with the checkbox enabled but no projectId.
Also in this commit
-------------------
- `cn` import added to TodoModal for the new conditional classes
on the checkbox card.
- `HiMiniSparkles` import added for the inline label icon.
Verified
--------
- `bun run typecheck` in apps/desktop — clean.
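The two-step submit flow described in this commit could be sketched as follows. The mutation shapes, the `SubmitDeps` interface, and the way title and description are joined into the prompt are all assumptions for illustration; the real `workspaces.create` and `todoAgent.create` inputs may differ.

```typescript
// Hypothetical shapes of the two mutations.
interface CreateWorkspaceInput {
  projectId: string;
  prompt: string;
}
interface CreateTodoInput {
  workspaceId: string;
  title: string;
  description: string;
}
interface SubmitDeps {
  createWorkspace(input: CreateWorkspaceInput): Promise<{ workspace: { id: string } }>;
  createTodo(input: CreateTodoInput): Promise<{ id: string }>;
}
interface SubmitArgs {
  currentWorkspaceId: string;
  projectId: string | null;
  title: string;
  description: string;
  useNewWorktree: boolean;
}

async function submitTodo(deps: SubmitDeps, args: SubmitArgs): Promise<{ id: string }> {
  // Default path: the TODO targets the workspace the modal was opened from.
  let targetWorkspaceId = args.currentWorkspaceId;
  if (args.useNewWorktree) {
    // Fail fast when there is no project to create a worktree for.
    if (args.projectId === null) {
      throw new Error("workspace has no project; cannot create a new worktree");
    }
    // Assumed join: title and description separated by a blank line.
    const prompt = [args.title, args.description]
      .filter((part) => part.trim() !== "")
      .join("\n\n");
    const { workspace } = await deps.createWorkspace({ projectId: args.projectId, prompt });
    targetWorkspaceId = workspace.id;
  }
  // Step two always runs against whichever workspace id was chosen above.
  return deps.createTodo({
    workspaceId: targetWorkspaceId,
    title: args.title,
    description: args.description,
  });
}
```

Passing the mutations in as `deps` keeps the ordering logic testable without a running tRPC client.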
…oModal
Two requests from the review:
1. **Reusable system prompt presets** that users can attach to new
TODOs at creation time, managed from a Settings row at the bottom
of the Agent Manager's left sidebar.
2. **TodoModal is too text-heavy** — simplified the copy so the form
matches the visual density of the rest of the app.
Schema + migration
------------------
- `packages/local-db/src/schema/todo-prompt-presets.ts` (new): new
`todo_prompt_presets` table (id, name, content, createdAt,
updatedAt) with `name` and `updatedAt` indexes.
- `packages/local-db/src/schema/todo-sessions.ts`: new nullable
`custom_system_prompt` column. Selected preset content is copied
into this column at session create time so later preset edits do
not retroactively change a session that has already run.
- `packages/local-db/src/schema/index.ts` + `schema.ts`: re-export
the new table so drizzle-kit picks it up via the existing root.
- `packages/local-db/drizzle/0054_todo_prompt_presets.sql`:
auto-generated migration (CREATE TABLE + ALTER TABLE ADD COLUMN).
Backend
-------
- `apps/desktop/src/main/todo-agent/types.ts`:
- `todoCreateInputSchema` gains optional `customSystemPrompt`
(trimmed, max 20k, empty→undefined).
- `todoPresetCreateInputSchema` and `todoPresetUpdateInputSchema`
new zod shapes for the CRUD endpoints.
- `apps/desktop/src/main/todo-agent/supervisor.ts`:
- `runClaudeTurn` params gain `customSystemPrompt: string | null`.
- When present it is threaded into the spawned claude args as
`--append-system-prompt <content>`. This composes with the
iteration prompt + `--resume` so every turn in the session
inherits the steering without re-injecting it in every prompt.
- The per-turn call site in `runSession` reads the session row
at turn boundary and passes `currentSession.customSystemPrompt
?? null`.
- `apps/desktop/src/main/todo-agent/trpc-router.ts`:
- `create` now persists `input.customSystemPrompt ?? null` on
the new DB column.
- `rerun` now copies `source.customSystemPrompt` into the clone
so re-running preserves the steering.
- New nested `todoAgent.presets` router with:
* `list` query (orderBy updatedAt desc)
* `create` mutation (inputs: name 1..120, content 1..20k)
* `update` mutation (inputs: id + name + content)
* `delete` mutation (inputs: id; returns ok boolean)
- All mutations run against `localDb` via drizzle directly —
presets are a tiny kv-ish table, no caching needed.
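How the stored `customSystemPrompt` might reach the CLI can be sketched as a pure argument builder. Only the flags named in these commit messages are used (`-p`, `--output-format stream-json`, `--resume`, `--append-system-prompt`); the function name, the parameter shape, and the omission of any other flags the real `runClaudeTurn` passes are assumptions.

```typescript
// Hypothetical per-turn parameters.
interface TurnParams {
  prompt: string;
  resumeSessionId: string | null;
  customSystemPrompt: string | null;
}

// Build the argv for one headless turn. Not a complete invocation:
// the real supervisor may pass additional flags.
function buildClaudeArgs(params: TurnParams): string[] {
  const args = ["-p", params.prompt, "--output-format", "stream-json"];
  if (params.resumeSessionId !== null) {
    // Later turns continue the same conversation.
    args.push("--resume", params.resumeSessionId);
  }
  const steering = params.customSystemPrompt?.trim() ?? "";
  if (steering !== "") {
    // Appended on every turn so the steering never has to be
    // repeated inside the iteration prompt itself.
    args.push("--append-system-prompt", steering);
  }
  return args;
}
```

Because the preset content is copied onto the session row at create time, the builder only ever sees the snapshot, not the live preset.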
Renderer
--------
- `apps/desktop/src/renderer/features/todo-agent/TodoManager/
PresetsDialog/PresetsDialog.tsx` (new):
- Full `Dialog` at 960×80vh with a 2-column layout: list of
presets on the left, edit form on the right.
- "新規プリセット" button at the top of the sidebar resets the
draft state and clears selection.
- Selecting a row populates the draft; editing flips a
`dirty` flag that gates the save button.
- Save routes to `create` or `update` depending on whether the
draft has an id; success toast on both paths.
- Delete uses the inline "本当に削除 / キャンセル" confirm
pattern already established in the SessionRow kebab menu.
- `apps/desktop/src/renderer/features/todo-agent/TodoManager/
TodoManager.tsx`:
- `presetsDialogOpen` state + mounted `<PresetsDialog>` as a
sibling Dialog inside the existing outer Dialog so it stacks
on top of the Manager the way `<TodoModal>` does.
- Left sidebar gains a `shrink-0 border-t` footer row with a
"設定 / プリセット" button using `HiMiniCog6Tooth`. Clicking
it opens PresetsDialog. The row mirrors the compact ghost-
link styling of the existing row controls.
TodoModal simplification + preset picker
----------------------------------------
- Removed the 5-line `DialogDescription` entirely. Users reached
the feature through the button's tooltip; the modal body needs
to carry only what is actionable.
- Title placeholder: "例: Issue #123 のログインリダイレクト問題を
修正" → "例: Issue #123 を修正" (half the width, same intent).
- Replaced the two-line "new worktree" card with a single-row
label that renders as a checkbox-styled button: "新しい
worktree を作成して実行" with a sparkle icon on the right.
Description text was the biggest offender; cut entirely. The
disabled state still shows via the muted opacity treatment.
- Description placeholder: long sentence → "やってほしい作業を
書く".
- Goal: "(任意)" moved into a compact `text-[10px]` suffix on
the label; placeholder shortened to "完了条件(空欄可)".
Textarea rows dropped from 3 to 2.
- Verify: same treatment. Placeholder → "例: bun test". Removed
the two-line explanation block below it entirely.
- New "システムプロンプト (任意)" row hosts a `PresetPicker`
trigger that renders the selected preset name + an inline
clear (×) button when set. Dropdown shows the full preset
list with name + first ~2 lines of content as a preview, plus
a "選択を解除" footer row and a hint when no presets exist.
Selected preset content is read at submit time and passed as
`customSystemPrompt` to the create mutation.
- All form inputs gained `rounded-md` to match the rest of the
app.
Verified
--------
- `bun run typecheck` in apps/desktop — clean.
…sist, atomic create, stranded cleanup)
Four fixes from the code-review round (rv-pr #181). All four were
classified "修正推奨" (fix recommended) after Codex pruned the false
positives.
Q3: abort guard after runVerify
-------------------------------
In the iteration loop, if the user pressed 中断 (Abort) while
`runVerify` was still executing, the verify child process died with an
AbortError and returned `{ passed: false, log: "AbortError: ..." }`.
The very next line wrote that "verify failed" verdict to the DB before
the loop's `break` on `ac.signal.aborted` could fire, so aborted
sessions ended up labeled "aborted" but carried a bogus
"verify failed: AbortError..." trail in the UI.
Fix in `supervisor.ts`: a one-liner `if (ac.signal.aborted) return;`
between `await runVerify(...)` and `appendVerifyEvent(...)`. Once the
user has aborted, we do not record the terminated verify at all;
`abort()` has already written the clean `aborted` state.
Q4: stream persistence no longer blocks the main process
--------------------------------------------------------
`persistStreamEvents` used to run per event:
- a synchronous `localDb.select()` to fetch the session row just for
`artifactPath` (even though it never changes during a run)
- a synchronous `fs.appendFileSync` for the JSONL append
Claude's stream fires dozens to hundreds of events per turn, so the
main-process event loop was blocked for several ms per event for the
duration of a run. In Electron the main process services IPC for the
renderer, so tab switches, tRPC calls, and terminal writes would
visibly stutter.
Rewritten in `session-store.ts`:
- New `artifactPathCache: Map<sessionId, absolutePath>`. The
supervisor calls `store.setArtifactPathCache(sessionId,
session0.artifactPath)` at the top of `runSession`, which also
pre-`mkdirSync`s the directory exactly once. `persistStreamEvents` now
reads from the cache; the DB fallback is kept only for
historical-session replay outside of an active run.
- `appendFileSync` → `appendFile` from `node:fs/promises`. Async I/O
so the main process thread stays free.
- New `persistQueues: Map<sessionId, Promise<void>>` chains the
per-session appends so bursty events do not race and write out of
order. Each subsequent append awaits the previous one via
`.then(...)`; failures are swallowed with a console.warn so one bad
append cannot poison the chain.
Net effect: CPU time per event drops by 10-100x, events in
`stream.jsonl` stay fully ordered, and the renderer no longer stutters
while a worker is chattering.
Q8: atomic create, no more half-written PENDING rows
----------------------------------------------------
Previously, `todoAgent.create` / `rerun`:
1. `store.insert({ ..., artifactPath: ".superset/todo/PENDING" })`
2. `prepareArtifacts(session)`: computes real path, mkdir, write
goal.md
3. `store.update(id, { artifactPath })`
If the app crashed between steps 1 and 2 (or 2 and 3), the DB was left
with a row whose `artifactPath` was literally
`.superset/todo/PENDING`. The next time the user clicked Start on that
row, the supervisor would try to read/write inside a bogus directory
and fail inscrutably. It was also a latent correctness hazard for any
downstream code that assumed artifactPath was an absolute path.
Split `prepareArtifacts` into two responsibilities:
- New `TodoSupervisor.computeArtifactPath({ sessionId, workspaceId })`:
pure path calculation, throws if the workspace has no resolvable path.
No fs side-effects.
- Existing `prepareArtifacts(session)` now expects the session's
`artifactPath` to already be set and simply `mkdirSync`s the directory
+ writes `goal.md`.
`trpc-router.ts` `create` / `rerun` now:
1. Generate a UUID up front (`randomUUID`)
2. `computeArtifactPath` with that UUID
3. `store.insert({ id, ..., artifactPath })`: one shot, final value
4. `prepareArtifacts(session)` to materialize the directory
A crash anywhere in this flow can still leave a queued session with no
artifact dir, but never a row with a broken synthetic path. And since
`prepareArtifacts` is now idempotent on the directory itself, Start can
recover by recreating the dir on first use if needed.
A2: stranded-session sweep on startup
-------------------------------------
If the previous process died while a session was `preparing`,
`running`, or `verifying`, the in-memory supervisor state is gone but
the DB row is still there. The UI would render that session forever as
"running" with a pulsing amber dot, and the only way out was to delete
it manually.
`TodoSessionStore`'s constructor now runs a single-shot
`rehydrateStrandedSessions()` that UPDATEs any row in those three
non-terminal states to `status: "failed"` + a clear verdictReason
("前回の実行が中断されました(アプリ再起動)。再実行するか削除して
ください。", i.e. "The previous run was interrupted (app restart).
Re-run it or delete it.") + `completedAt: Date.now()`. Only sessions
with `queued` status are left alone: those never started, so they
should stay queueable.
Constructor-time is the right injection point because the store is a
lazy singleton (`getTodoSessionStore()`), so the sweep runs exactly
once on first access, typically before the Agent Manager has mounted.
Verified
--------
- `bun run typecheck` in apps/desktop: clean.
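The per-session promise chain from Q4 can be sketched in isolation. This is a minimal illustration of the chaining idea, assuming a hypothetical `SerialAppendQueue` class; the real `persistQueues` map in `session-store.ts` may be structured differently.

```typescript
// Per-key write queue: each task waits for the previous task on the
// same key, so writes for one session never interleave or reorder.
class SerialAppendQueue {
  private readonly tails = new Map<string, Promise<void>>();

  enqueue(key: string, task: () => Promise<void>): Promise<void> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    const next = previous.then(task).catch((error: unknown) => {
      // Swallow the failure so one bad append cannot poison the chain.
      console.warn(`append failed for ${key}:`, error);
    });
    this.tails.set(key, next);
    return next;
  }
}
```

In the setting described above, the task would be something like `() => appendFile(path, line)` with `appendFile` from `node:fs/promises`, keyed by session id. Because the `.catch` sits on every link, a rejected append resolves the chain instead of rejecting it, and later appends still run in order.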
CI Lint was failing with 17 errors / 4 warnings on feat/todo-
autonomous-agent. Ran `bun run lint:fix` which auto-fixed 14 files
worth of Biome formatting / import ordering / style nits, then
resolved the 2 remaining errors manually.
Auto-fixed (14 files):
- apps/desktop/src/main/todo-agent/enhance-text.ts
- apps/desktop/src/main/todo-agent/git-status.ts
- apps/desktop/src/main/todo-agent/index.ts
- apps/desktop/src/main/todo-agent/session-store.ts
- apps/desktop/src/main/todo-agent/trpc-router.ts
- apps/desktop/src/main/todo-agent/types.ts
- apps/desktop/src/lib/trpc/routers/index.ts (Biome reordered the
fork-local TodoAgent import next to the other `main/*` import)
- apps/desktop/src/renderer/features/todo-agent/TodoManager/
PresetsDialog/PresetsDialog.tsx
- apps/desktop/src/renderer/features/todo-agent/TodoModal/
TodoModal.tsx
- apps/desktop/src/renderer/features/todo-agent/TodoModal/
components/EnhanceButton/EnhanceButton.tsx
- apps/desktop/src/renderer/screens/main/components/
WorkspaceView/ContentView/components/PresetsBar/PresetsBar.tsx
(just the Biome import-grouping rewrite triggered by the new
TodoButton import)
- a few more minor whitespace/formatting-only touches
Manual fixes:
1. **noUnusedFunctionParameters** in `ChangesSidebar.tsx`: the
`workspaceId` prop was declared but never read. Removed it from
both the `ChangesSidebarProps` interface and the TodoManager
call site. The component only needs `sessionId` + `active` —
workspace scoping is already handled server-side via the
session row lookup inside the `gitSnapshot` query.
2. **noControlCharactersInRegex** in `supervisor.ts`
`guessFailingTest`: the ANSI stripper used `/\x1b\[[0-9;]*m/g`
to remove ESC-based color escapes from verify command output.
Biome flags `\x1b` literals as suspicious (they often land in
regexes by mistake). Stripping real ANSI escapes is the entire
point here, so:
- Switched the escape to the equivalent Unicode form
`\u001B` (same byte, less alarming to Biome's default
pattern).
- Added a `biome-ignore` with an explanation so a future
contributor can see at a glance that the control char is
intentional.
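For reference, the ANSI stripper described in item 2 amounts to something like the sketch below. The function and constant names are hypothetical, and the wording of the `biome-ignore` explanation is illustrative.

```typescript
// Matches SGR color sequences such as ESC[31m or ESC[0;1m.
// biome-ignore lint/suspicious/noControlCharactersInRegex: stripping real ANSI escapes is the point
const ANSI_COLOR_PATTERN = /\u001B\[[0-9;]*m/g;

// Remove color escapes from verify command output before inspecting it.
function stripAnsiColors(text: string): string {
  return text.replace(ANSI_COLOR_PATTERN, "");
}
```

Text that merely looks like an escape, such as a literal `[0m` without the leading ESC character, is left untouched.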
Verified
--------
- `bun run lint` in apps/desktop — clean.
- `bun run typecheck` in apps/desktop — clean.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 72e446b474
Automated review on commit 72e446b flagged three issues. All three were
verified against the code and are real bugs; this commit fixes them.
1. [P1] Commit-scope file diffs returned blank (git-status.ts)
--------------------------------------------------------------
`getSessionFileDiff` with `scope: "commit"` unconditionally appended
`-- <path>` to the `git show` command. The ChangesSidebar sets
commit-row selections with `path: ""` because commit clicks are not
bound to a specific file; they should show the whole commit's patch.
With an empty path the command became `git show --format= <sha> -- ""`,
which Git rejects with "empty string is not a valid pathspec". `gitOut`
swallows the failure and returns an empty string, so commit diffs
silently rendered blank in the UI.
Fix: only append `--` + path when the path is non-empty. When `path` is
the empty string (the "whole commit" case) we now emit
`git show --format= <sha>`, which returns the full patch for every file
the commit touched. File-scoped commit diffs still work when the caller
actually provides a path.
2. [P1] Queue drain revived aborted sessions (supervisor.ts)
------------------------------------------------------------
`TodoSupervisor.start()` drains `this.queue` in a while loop once the
active run finishes. If the user aborted (or deleted) a session while
it was still waiting in the queue, its sessionId stayed in `this.queue`
unchanged. When the active run finished, the drain loop popped the
aborted sessionId and ran it anyway, reviving an already-terminal
session into execution.
Two complementary fixes:
- `abort(sessionId)` now proactively removes the sessionId from
`this.queue` via `splice(queueIdx, 1)` before touching the active run,
so the drain loop never sees it again.
- The drain loop now re-reads the session row from the store after
popping each id and `continue`s past any row whose status is already
terminal (`aborted` / `failed` / `done` / `escalated`). This catches
the abort race plus any status change made by another code path
(`delete`, `rerun`) while the id was waiting.
3. [P2] SessionDetail stream events leaked across selections
------------------------------------------------------------
The effect that resets `streamEvents` on selection change had `[]` as
its deps array, so it only ran once on initial mount. `SessionDetail`
is reused across selections (the parent just swaps its `session` prop),
so when the user clicked a different row the previous session's events
stayed in state and got appended to the new session's subscription
deliveries. The live stream panel showed a mix of two runs.
Fix: deps are now `[session.id]` so the reset fires on every selection
change. Added a `biome-ignore lint/correctness/useExhaustiveDependencies`
comment since `session.id` is a reset-on-change dep, not a value read
inside the body. Biome cannot see the difference and would otherwise
strip it.
Verified
--------
- `bun run lint` in apps/desktop: clean.
- `bun run typecheck` in apps/desktop: clean.
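The two complementary queue fixes from item 2 can be sketched with an in-memory stand-in. This is a simplified, synchronous illustration: `SessionQueue`, the status map, and the `run` callback are hypothetical, and the real `TodoSupervisor` reads rows from the session store and runs sessions asynchronously.

```typescript
type SessionStatus =
  | "queued"
  | "preparing"
  | "running"
  | "verifying"
  | "aborted"
  | "failed"
  | "done"
  | "escalated";

const TERMINAL_STATUSES: ReadonlySet<SessionStatus> = new Set<SessionStatus>([
  "aborted",
  "failed",
  "done",
  "escalated",
]);

class SessionQueue {
  private readonly queue: string[] = [];
  private readonly statuses: Map<string, SessionStatus>;

  constructor(statuses: Map<string, SessionStatus>) {
    this.statuses = statuses;
  }

  enqueue(sessionId: string): void {
    this.queue.push(sessionId);
  }

  // Fix 1: aborting also removes the id from the waiting queue.
  abort(sessionId: string): void {
    const queueIdx = this.queue.indexOf(sessionId);
    if (queueIdx !== -1) this.queue.splice(queueIdx, 1);
    this.statuses.set(sessionId, "aborted");
  }

  // Fix 2: re-read the status after popping and skip terminal rows.
  drain(run: (sessionId: string) => void): void {
    while (this.queue.length > 0) {
      const sessionId = this.queue.shift() as string;
      const status = this.statuses.get(sessionId);
      if (status === undefined || TERMINAL_STATUSES.has(status)) continue;
      run(sessionId);
    }
  }
}
```

Either fix alone closes the reported abort case; the status re-check additionally covers rows that were deleted or changed by another code path while waiting.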
Actionable comments posted: 12
🧹 Nitpick comments (2)
packages/local-db/drizzle/0052_todo_headless_fields.sql (1)
3-3: Recommend storing `total_cost_usd` in integer minor units rather than REAL. If costs are later aggregated or compared, floating point tends to introduce rounding error.
Storing the value as INTEGER (e.g. micro-USD or cents) is safer. Example: a precision-oriented schema proposal
-ALTER TABLE `todo_sessions` ADD `total_cost_usd` real;--> statement-breakpoint
+ALTER TABLE `todo_sessions` ADD `total_cost_microusd` integer;--> statement-breakpoint
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/local-db/drizzle/0052_todo_headless_fields.sql` at line 3, the current ALTER TABLE adds `total_cost_usd` to `todo_sessions` as REAL, but monetary amounts should be stored in integer minor units, so change the column type from REAL to INTEGER and regenerate the migration: the target is `todo_sessions.total_cost_usd`; make the schema change ALTER TABLE ... ADD `total_cost_usd` INTEGER NOT NULL DEFAULT 0 (or adjust nullability and the default to match requirements), and implement matching app-side logic that converts USD to cents or micro units when saving and reading (multiply by 100 or 1_000_000 on save, apply the inverse on read).
packages/local-db/src/schema/schema.ts (1)
459-462: Consolidating the re-exports in one place makes them easier to maintain.
`packages/local-db/src/schema/index.ts` also re-exports the same modules, so managing the public surface in a single place lowers the risk of future conflicts.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/local-db/src/schema/schema.ts` around lines 459 - 462, The two re-export lines (export * from "./todo-prompt-presets"; and export * from "./todo-sessions";) should be removed from this schema file and consolidated into the single public re-export barrel (index.ts) so all schema exports are managed in one place; update the central index.ts to re-export both todo-prompt-presets and todo-sessions (and add a brief comment explaining the barrel) and remove the duplicate exports here to avoid future conflicts.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/desktop/plans/todo-agent-plan.md`:
- Line 32: Several fenced code blocks in the Markdown (the triple-backtick
blocks) are missing language identifiers causing markdownlint MD040; update each
problematic code fence (the three backtick blocks referenced in the review) to
include an appropriate language tag such as "text" or "ts" (e.g., replace ```
with ```text or ```ts) so linting passes; ensure you update all occurrences
mentioned in the review comment so each code fence has a language specifier.
- Around line 13-20: The implementation model described in the plan (the bullets
"ライブ可視性", "信頼性", "逐次実行", "upstream とのマージ容易性") conflicts with the PR's actual
design (headless Claude NDJSON streaming + verify exit-code completion); update
todo-agent-plan.md to either rewrite these sections to describe the headless
NDJSON stream + verify-exit-code flow used by the current codebase (including
removing or adjusting PTY-resident/idle-detection language) or clearly mark the
document as obsolete and point to the new implementation notes; ensure the
revised text names the implemented mechanisms (NDJSON stream, verify exit code)
so future readers are not misled.
In `@apps/desktop/src/main/todo-agent/enhance-text.ts`:
- Around line 84-101: In describeEnhanceFailure, add explicit handling for
attempts whose outcome === "empty-result" (SmallModelAttempt) before falling
through to generic messages; return a clear Japanese message indicating the
model call succeeded but produced an empty response (e.g.,
"モデルは応答しましたが空の結果でした。再試行してください。") so users can distinguish "empty response" from
other failures and make retry decisions.
In `@apps/desktop/src/main/todo-agent/supervisor.ts`:
- Around line 156-163: The in-memory clear (store.clearStreamEvents(sessionId))
leaves the append-only persisted stream.jsonl intact causing old events to be
reloaded on retries; update supervisor startup to also truncate or rotate the
persisted stream file before a new run. Add or call a store-level method (e.g.
store.truncateStreamFile(sessionId) or store.rotateStreamFile(sessionId, runId))
before priming the cache (before/around the
store.setArtifactPathCache(sessionId, session0.artifactPath)) so the persisted
stream for the same sessionId is either truncated or writes go to a run-specific
file to avoid mixing previous run events. Ensure the new method is implemented
in the store backend to atomically truncate or rename the existing stream.jsonl
for that sessionId.
- Around line 83-93: The bug: aborted sessions remain in this.queue so start()
later pulls them and runSession() executes them; fix by 1) updating
abort(sessionId) to remove that id from this.queue (e.g., this.queue =
this.queue.filter(id => id !== sessionId)) and 2) adding a defensive guard at
the top of runSession(sessionId) to check the session's terminal status
(aborted/completed) and return immediately if terminal; reference methods:
abort, start, runSession, and the this.queue field.
- Around line 482-485: The spawn call creating `child = spawn("claude", args, {
cwd: params.cwd, env: process.env })` uses the raw process.env which fails when
Electron is launched from Finder; replace it to use the shell-resolved
environment helper (the same helper used for the git implementation, e.g.
resolveShellEnv/getShellEnvironment) so PATH and other shell startup changes are
respected before spawning `claude` (and apply the same change to the similar
spawn at the other location around lines 730-734); update the spawn options to
pass the resolved env object instead of process.env.
In `@apps/desktop/src/main/todo-agent/trpc-router.ts`:
- Around line 106-120: The router calls enhanceTodoText with two args but the
actual function signature is enhanceTodoText({ sessionId, kind, text }); fix by
making the router and schema match that signature: add sessionId to
todoEnhanceTextInputSchema and call enhanceTodoText({ sessionId:
input.sessionId, kind: input.kind, text: input.text }) (or alternatively change
enhanceTodoText to accept (text, kind) and update its callers); ensure the TRPC
input type and the callsite use the same shape and update any related
imports/types (enhanceTodoText and todoEnhanceTextInputSchema) accordingly.
In
`@apps/desktop/src/renderer/features/todo-agent/TodoManager/ChangesSidebar/ChangesSidebar.tsx`:
- Around line 45-55: The diffQuery call is sending path: "" when a commit is
selected which violates the server validator for gitFileDiff
(z.string().min(1)); update the argument construction in
electronTrpc.todoAgent.gitFileDiff.useQuery to branch based on selected.scope
(e.g., if selected?.scope === "commit" send the payload without path and include
commitSha/scope accordingly, otherwise include path as before), or populate a
non-empty path when required so the client payload matches the gitFileDiff input
schema; adjust the selected-based conditional used to build the query args in
ChangesSidebar (diffQuery) to ensure scope === "commit" uses the API shape the
router expects.
In `@apps/desktop/src/renderer/features/todo-agent/TodoManager/TodoManager.tsx`:
- Around line 280-284: The SessionDetail component reuses the same instance when
a different session is selected, so the previous session's streamEvents, input,
and delete-confirmation state carry over. Add a unique key where SessionDetail is
rendered to force a remount: specifically, add key={selected.id} to SessionDetail
where the current selected is passed (see selected / selected.id), and keep the
existing handler that calls setSelectedId(null) in onDeleted as is.
In
`@apps/desktop/src/renderer/features/todo-agent/TodoModal/components/EnhanceButton/EnhanceButton.tsx`:
- Around line 55-64: Enhance the accessibility of the icon-only button in the
EnhanceButton component by adding an explicit aria-label to the Button (use the
existing title prop or fallback to running ? "AI で書き換え中…" : "AI で書き換える") and
mark the HiMiniSparkles icon as non-interactive for assistive tech (e.g., add
aria-hidden="true" and focusable={false} to the icon element); update the Button
JSX where onClick={handleClick}, disabled={disabled}, title={...} is set to
include aria-label and update the HiMiniSparkles usage to include aria-hidden
and focusable props.
In `@apps/desktop/src/renderer/features/todo-agent/TodoModal/TodoModal.tsx`:
- Around line 123-150: The code currently calls
createWorkspaceMut.mutateAsync(...) to create a worktree and then
create.mutateAsync(...) to create a todo session, leaving an orphaned workspace
if the second call fails; update the flow so both operations are atomic: either
(A) move the worktree + todo creation into a single main-process mutation on the
backend (preferred) so one server-side transaction handles both, or (B)
implement compensation logic in the caller around createWorkspaceMut.mutateAsync
and create.mutateAsync — after createWorkspaceMut.mutateAsync returns a
result.workspace.id, call create.mutateAsync(...) and if that fails, reliably
delete the newly created workspace via the matching workspace delete API (use
the same workspace id returned), and surface the original error; reference
createWorkspaceMut.mutateAsync, create.mutateAsync, and targetWorkspaceId when
locating the code to change.
- Around line 355-383: The nested button inside the DropdownMenuTrigger must be
removed; in TodoModal replace the inner clear <button> (the one rendered when
selected) with a non-button interactive element (e.g., a <span> or <div> with
role="button" and tabIndex={0}) and wire its click and keyboard handlers to call
onSelect(null) while calling e.preventDefault()/e.stopPropagation() to avoid
triggering the parent trigger; keep the same classes, title="解除", and accessible
keyboard handling (Enter/Space) so the clear control remains focusable and
accessible without nesting a button inside the DropdownMenuTrigger's button.
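The supervisor.ts queue finding above (aborted sessions staying in the queue) can be illustrated with a minimal, self-contained sketch. The class name, the `SessionStatus` values `queued` and `running`, and the `executed` log are assumptions made for illustration; only `abort`, `start`, `runSession`, and the `queue` field are named in the review.

```ts
// Hypothetical stand-in for the supervisor; not the PR's actual class.
type SessionStatus = "queued" | "running" | "done" | "failed" | "aborted" | "escalated";

const TERMINAL: ReadonlySet<SessionStatus> = new Set<SessionStatus>([
  "done",
  "failed",
  "aborted",
  "escalated",
]);

class QueueSketch {
  private queue: string[] = [];
  private status = new Map<string, SessionStatus>();
  // Records which sessions were actually executed, for inspection.
  readonly executed: string[] = [];

  enqueue(sessionId: string): void {
    this.status.set(sessionId, "queued");
    this.queue.push(sessionId);
  }

  abort(sessionId: string): void {
    // Fix 1: drop the id from the queue so start() never picks it up.
    this.queue = this.queue.filter((id) => id !== sessionId);
    this.status.set(sessionId, "aborted");
  }

  start(): void {
    let next = this.queue.shift();
    while (next !== undefined) {
      this.runSession(next);
      next = this.queue.shift();
    }
  }

  private runSession(sessionId: string): void {
    // Fix 2: defensive guard against sessions that are already terminal.
    const current = this.status.get(sessionId);
    if (current !== undefined && TERMINAL.has(current)) return;
    this.status.set(sessionId, "running");
    this.executed.push(sessionId);
    this.status.set(sessionId, "done");
  }
}
```

The guard in `runSession` is redundant once `abort` filters the queue, but it protects against any other path that leaves a terminal session enqueued.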
---
Nitpick comments:
In `@packages/local-db/drizzle/0052_todo_headless_fields.sql`:
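- Line 3: The current ALTER TABLE adds `total_cost_usd` to `todo_sessions` as
REAL, but monetary amounts should be stored as integers in the smallest unit, so
change the column type from REAL to INTEGER and regenerate the migration. The
target is `todo_sessions.total_cost_usd`: make the schema change ALTER TABLE ...
ADD `total_cost_usd` INTEGER NOT NULL DEFAULT 0 (or adjust nullability and the
default to match requirements), and implement matching application logic that
converts USD to cents or micro-units on save and back on read (multiply by 100 or
1_000_000 when saving, apply the inverse when reading).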
In `@packages/local-db/src/schema/schema.ts`:
- Around line 459-462: The two re-export lines (export * from
"./todo-prompt-presets"; and export * from "./todo-sessions";) should be removed
from this schema file and consolidated into the single public re-export barrel
(index.ts) so all schema exports are managed in one place; update the central
index.ts to re-export both todo-prompt-presets and todo-sessions (and add a
brief comment explaining the barrel) and remove the duplicate exports here to
avoid future conflicts.
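For the `total_cost_usd` nitpick above, the save/read conversion could look roughly like the following. This is a sketch that assumes micro-dollars as the stored unit; the function names and the constant are hypothetical and do not come from the PR.

```ts
// Assumed unit: 1 USD = 1_000_000 micro-dollars, stored in an INTEGER column.
const MICROS_PER_USD = 1_000_000;

// Convert a USD amount to integer micro-dollars before writing to the database.
function usdToMicros(usd: number): number {
  if (!Number.isFinite(usd) || usd < 0) {
    throw new RangeError(`invalid USD amount: ${usd}`);
  }
  // Rounding absorbs binary floating-point noise such as 0.1 * 1e6.
  return Math.round(usd * MICROS_PER_USD);
}

// Convert stored micro-dollars back to USD for display.
function microsToUsd(micros: number): number {
  return micros / MICROS_PER_USD;
}

// Accumulating in integers avoids drift when many small costs are summed.
const totalMicros = usdToMicros(0.1) + usdToMicros(0.2);
```

Summing on the integer side keeps per-turn costs exact, which a REAL column cannot guarantee.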
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 533515b0-d81a-4ebe-a85e-9458b4c7f77a
📒 Files selected for processing (40)
- .gitignore
- apps/desktop/plans/todo-agent-plan.md
- apps/desktop/src/lib/trpc/routers/index.ts
- apps/desktop/src/main/todo-agent/enhance-text.ts
- apps/desktop/src/main/todo-agent/git-status.ts
- apps/desktop/src/main/todo-agent/index.ts
- apps/desktop/src/main/todo-agent/session-store.ts
- apps/desktop/src/main/todo-agent/supervisor.ts
- apps/desktop/src/main/todo-agent/trpc-router.ts
- apps/desktop/src/main/todo-agent/types.ts
- apps/desktop/src/renderer/features/todo-agent/TodoButton/TodoButton.tsx
- apps/desktop/src/renderer/features/todo-agent/TodoButton/index.ts
- apps/desktop/src/renderer/features/todo-agent/TodoManager/ChangesSidebar/ChangesSidebar.tsx
- apps/desktop/src/renderer/features/todo-agent/TodoManager/ChangesSidebar/index.ts
- apps/desktop/src/renderer/features/todo-agent/TodoManager/PresetsDialog/PresetsDialog.tsx
- apps/desktop/src/renderer/features/todo-agent/TodoManager/PresetsDialog/index.ts
- apps/desktop/src/renderer/features/todo-agent/TodoManager/TodoManager.tsx
- apps/desktop/src/renderer/features/todo-agent/TodoManager/index.ts
- apps/desktop/src/renderer/features/todo-agent/TodoModal/TodoModal.tsx
- apps/desktop/src/renderer/features/todo-agent/TodoModal/components/EnhanceButton/EnhanceButton.tsx
- apps/desktop/src/renderer/features/todo-agent/TodoModal/components/EnhanceButton/index.ts
- apps/desktop/src/renderer/features/todo-agent/TodoModal/index.ts
- apps/desktop/src/renderer/screens/main/components/WorkspaceView/ContentView/components/PresetsBar/PresetsBar.tsx
- packages/local-db/drizzle/0049_add_todo_sessions.sql
- packages/local-db/drizzle/0050_todo_verify_optional.sql
- packages/local-db/drizzle/0051_todo_goal_optional.sql
- packages/local-db/drizzle/0052_todo_headless_fields.sql
- packages/local-db/drizzle/0053_todo_start_head_sha.sql
- packages/local-db/drizzle/0054_todo_prompt_presets.sql
- packages/local-db/drizzle/meta/0049_snapshot.json
- packages/local-db/drizzle/meta/0050_snapshot.json
- packages/local-db/drizzle/meta/0051_snapshot.json
- packages/local-db/drizzle/meta/0052_snapshot.json
- packages/local-db/drizzle/meta/0053_snapshot.json
- packages/local-db/drizzle/meta/0054_snapshot.json
- packages/local-db/drizzle/meta/_journal.json
- packages/local-db/src/schema/index.ts
- packages/local-db/src/schema/schema.ts
- packages/local-db/src/schema/todo-prompt-presets.ts
- packages/local-db/src/schema/todo-sessions.ts
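| - Live visibility: the running worker is a real PTY rendered by the existing | ||
| `TerminalPane` component, so anyone can watch it or | ||
| type into it directly. | ||
| - Reliability: completion is decided by the exit code of a deterministic verify command, | ||
| never by the LLM's self-report. | ||
| - Sequential execution: only one task is active at a time; the rest are queued. | ||
| - Easy merging with upstream: all new code goes into new files / directories, | ||
| and changes to existing files are append-only and limited to three 1-line edits. |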
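The execution model in the implementation plan has drifted from the current implementation.
This document assumes an interactive worker resident in a PTY and a flow centered on idle detection, but this PR's implementation description leans toward a headless Claude NDJSON stream with completion decided by the verify exit code. As it stands, later readers are likely to maintain the code under the wrong assumptions, so either update the document to match the current implementation or mark it explicitly as obsolete.
Also applies to: 71-99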
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/desktop/plans/todo-agent-plan.md` around lines 13 - 20, The
implementation model described in the plan (the bullets "ライブ可視性", "信頼性", "逐次実行",
"upstream とのマージ容易性") conflicts with the PR's actual design (headless Claude
NDJSON streaming + verify exit-code completion); update todo-agent-plan.md to
either rewrite these sections to describe the headless NDJSON stream +
verify-exit-code flow used by the current codebase (including removing or
adjusting PTY-resident/idle-detection language) or clearly mark the document as
obsolete and point to the new implementation notes; ensure the revised text
names the implemented mechanisms (NDJSON stream, verify exit code) so future
readers are not misled.
|
|
||
| ## Architecture | ||
|
|
||
| ``` |
Add a language identifier to the code fence.
markdownlint reports MD040 here. Adding text or ts is enough to clear the warning.
Also applies to: 60-60, 145-145, 237-237
🧰 Tools
🪛 markdownlint-cli2 (0.22.0)
[warning] 32-32: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/desktop/plans/todo-agent-plan.md` at line 32, Several fenced code blocks
in the Markdown (the triple-backtick blocks) are missing language identifiers
causing markdownlint MD040; update each problematic code fence (the three
backtick blocks referenced in the review) to include an appropriate language tag
such as "text" or "ts" (e.g., replace ``` with ```text or ```ts) so linting
passes; ensure you update all occurrences mentioned in the review comment so
each code fence has a language specifier.
| export function describeEnhanceFailure(attempts: SmallModelAttempt[]): string { | ||
| for (let index = attempts.length - 1; index >= 0; index -= 1) { | ||
| const attempt = attempts[index]; | ||
| if (!attempt) continue; | ||
| if (attempt.outcome === "expired-credentials") { | ||
| return `${attempt.issue?.message ?? `${attempt.providerName} の認証が切れています`}。設定から再接続してください。`; | ||
| } | ||
| if (attempt.outcome === "failed") { | ||
| return `${attempt.providerName} での書き換えに失敗しました: ${attempt.issue?.message ?? attempt.reason ?? "unknown"}`; | ||
| } | ||
| if (attempt.outcome === "unsupported-credentials") { | ||
| return `${attempt.providerName} の認証種別が書き換えに対応していません。`; | ||
| } | ||
| } | ||
| if (attempts.every((a) => a.outcome === "missing-credentials")) { | ||
| return "AI 書き換えに使えるモデルアカウントが接続されていません。設定から Anthropic か OpenAI を接続してください。"; | ||
| } | ||
| return "AI 書き換えに失敗しました。"; |
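Handle empty-result explicitly.
A callSmallModel attempt can have the outcome empty-result, but it currently falls through to the generic message, so the user cannot tell whether the model call itself succeeded but returned an empty response or the run failed. Returning a dedicated message here makes the retry decision easier.
💡 Example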
export function describeEnhanceFailure(attempts: SmallModelAttempt[]): string {
for (let index = attempts.length - 1; index >= 0; index -= 1) {
const attempt = attempts[index];
if (!attempt) continue;
if (attempt.outcome === "expired-credentials") {
return `${attempt.issue?.message ?? `${attempt.providerName} の認証が切れています`}。設定から再接続してください。`;
}
+ if (attempt.outcome === "empty-result") {
+ return `${attempt.providerName} から空の結果が返されました。入力を少し具体化して再試行してください。`;
+ }
if (attempt.outcome === "failed") {
return `${attempt.providerName} での書き換えに失敗しました: ${attempt.issue?.message ?? attempt.reason ?? "unknown"}`;
}
if (attempt.outcome === "unsupported-credentials") {
return `${attempt.providerName} の認証種別が書き換えに対応していません。`;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/desktop/src/main/todo-agent/enhance-text.ts` around lines 84 - 101, In
describeEnhanceFailure, add explicit handling for attempts whose outcome ===
"empty-result" (SmallModelAttempt) before falling through to generic messages;
return a clear Japanese message indicating the model call succeeded but produced
an empty response (e.g., "モデルは応答しましたが空の結果でした。再試行してください。") so users can
distinguish "empty response" from other failures and make retry decisions.
| // Fresh in-memory buffer for this run. Old events from previous | ||
| // runs of the same session are cleared so the UI sees just the | ||
| // current attempt. | ||
| store.clearStreamEvents(sessionId); | ||
| // Prime the artifact-path cache so the hot stream-persist path | ||
| // does not need to do a synchronous SQLite read per event. | ||
| store.setArtifactPathCache(sessionId, session0.artifactPath); | ||
|
|
Stream history gets mixed when the same session is rerun.
Only the in-memory buffer is cleared here; the persisted stream.jsonl stays append-only. When a failed / aborted / escalated session is resumed under the same sessionId, events from the previous run are read back as well after a restart or on disk fallback. Truncate the existing stream file before rerunning, or split it into a separate file per run.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/desktop/src/main/todo-agent/supervisor.ts` around lines 156 - 163, The
in-memory clear (store.clearStreamEvents(sessionId)) leaves the append-only
persisted stream.jsonl intact causing old events to be reloaded on retries;
update supervisor startup to also truncate or rotate the persisted stream file
before a new run. Add or call a store-level method (e.g.
store.truncateStreamFile(sessionId) or store.rotateStreamFile(sessionId, runId))
before priming the cache (before/around the
store.setArtifactPathCache(sessionId, session0.artifactPath)) so the persisted
stream for the same sessionId is either truncated or writes go to a run-specific
file to avoid mixing previous run events. Ensure the new method is implemented
in the store backend to atomically truncate or rename the existing stream.jsonl
for that sessionId.
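A rotation step along these lines might be implemented as below. This is a sketch under stated assumptions: the stream is assumed to live at `<artifactDir>/stream.jsonl`, and `rotateStreamFile` with this signature is hypothetical (the review only suggests the name). It uses Node's `renameSync`, which is atomic when source and target are on the same filesystem.

```ts
import { existsSync, mkdtempSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical store helper: moves the previous run's stream aside so the
// next run starts with no persisted events under the default file name.
function rotateStreamFile(artifactDir: string, runId: string): string | null {
  const current = join(artifactDir, "stream.jsonl");
  if (!existsSync(current)) return null; // nothing persisted yet
  const rotated = join(artifactDir, `stream.${runId}.jsonl`);
  renameSync(current, rotated);
  return rotated;
}

// Usage sketch against a throwaway directory.
const dir = mkdtempSync(join(tmpdir(), "todo-stream-"));
writeFileSync(join(dir, "stream.jsonl"), '{"type":"old"}\n');
const rotatedPath = rotateStreamFile(dir, "run-2");
```

Rotating instead of truncating keeps the earlier run's events available for debugging while still preventing them from being reloaded into the new run.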
| const diffQuery = electronTrpc.todoAgent.gitFileDiff.useQuery( | ||
| selected | ||
| ? { | ||
| sessionId, | ||
| path: selected.path, | ||
| scope: selected.scope, | ||
| commitSha: selected.commitSha, | ||
| } | ||
| : { sessionId, path: "", scope: "session" as const }, | ||
| { enabled: !!selected, staleTime: 5_000 }, | ||
| ); |
🧩 Analysis chain
🏁 Script executed:
find apps/desktop -name "ChangesSidebar.tsx" -type f
Repository: MocA-Love/superset
Length of output: 153
🏁 Script executed:
find apps/desktop -name "trpc-router.ts" -type f
Repository: MocA-Love/superset
Length of output: 109
🏁 Script executed:
cat -n apps/desktop/src/renderer/features/todo-agent/TodoManager/ChangesSidebar/ChangesSidebar.tsx | head -200
Repository: MocA-Love/superset
Length of output: 7657
🏁 Script executed:
cat -n apps/desktop/src/main/todo-agent/trpc-router.ts | sed -n '350,390p'
Repository: MocA-Love/superset
Length of output: 1523
The diff query for a selected commit does not satisfy gitFileDiff's input constraints.
When a commit is selected, path: "" is sent (lines 176-183), but gitFileDiff at apps/desktop/src/main/todo-agent/trpc-router.ts:361 requires path: z.string().min(1). As it stands, the commit diff view fails with a tRPC validation error every time, so either align on an API that does not need path when scope === "commit", or branch the selection model on the client side.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@apps/desktop/src/renderer/features/todo-agent/TodoManager/ChangesSidebar/ChangesSidebar.tsx`
around lines 45 - 55, The diffQuery call is sending path: "" when a commit is
selected which violates the server validator for gitFileDiff
(z.string().min(1)); update the argument construction in
electronTrpc.todoAgent.gitFileDiff.useQuery to branch based on selected.scope
(e.g., if selected?.scope === "commit" send the payload without path and include
commitSha/scope accordingly, otherwise include path as before), or populate a
non-empty path when required so the client payload matches the gitFileDiff input
schema; adjust the selected-based conditional used to build the query args in
ChangesSidebar (diffQuery) to ensure scope === "commit" uses the API shape the
router expects.
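One way to branch the query arguments on scope is sketched below. The `DiffSelection` and `DiffQueryArgs` shapes are assumptions for illustration (the real types live in the sidebar and router), and the sketch presumes the router's input schema is relaxed so that `path` is optional for commit-scoped requests.

```ts
// Assumed client-side selection shape, based on the fields used in the snippet above.
type DiffSelection = {
  scope: "session" | "commit";
  path: string;
  commitSha?: string;
};

// Assumed payload shapes; the commit variant omits path when there is none.
type DiffQueryArgs =
  | { sessionId: string; scope: "session"; path: string }
  | { sessionId: string; scope: "commit"; commitSha: string; path?: string };

// Returns null when no valid query can be built, so the caller can disable the query.
function buildDiffQueryArgs(
  sessionId: string,
  selected: DiffSelection | null,
): DiffQueryArgs | null {
  if (!selected) return null;
  if (selected.scope === "commit") {
    if (!selected.commitSha) return null;
    // Never send an empty string: omit path entirely for whole-commit diffs.
    return selected.path
      ? { sessionId, scope: "commit", commitSha: selected.commitSha, path: selected.path }
      : { sessionId, scope: "commit", commitSha: selected.commitSha };
  }
  if (selected.path.length === 0) return null;
  return { sessionId, scope: "session", path: selected.path };
}
```

A null result would map to `enabled: false` on the query, so no placeholder payload with an empty path ever reaches the validator.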
| <Button | ||
| type="button" | ||
| size="sm" | ||
| variant="ghost" | ||
| className="h-6 w-6 p-0 rounded-md text-muted-foreground hover:text-primary" | ||
| onClick={handleClick} | ||
| disabled={disabled} | ||
| title={title ?? (running ? "AI で書き換え中…" : "AI で書き換える")} | ||
| > | ||
| <HiMiniSparkles className={cn("size-3.5", running && "animate-pulse")} /> |
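Give the icon button an accessible name.
The title attribute alone is not a reliable accessible name for assistive technology, so screen-reader users are not told what this button does. It is safer to add aria-label and exclude the icon from being read out.
♿ Suggested fix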
<Button
type="button"
size="sm"
variant="ghost"
className="h-6 w-6 p-0 rounded-md text-muted-foreground hover:text-primary"
onClick={handleClick}
disabled={disabled}
title={title ?? (running ? "AI で書き換え中…" : "AI で書き換える")}
+ aria-label={title ?? (running ? "AI で書き換え中" : "AI で書き換える")}
>
- <HiMiniSparkles className={cn("size-3.5", running && "animate-pulse")} />
+ <HiMiniSparkles
+ aria-hidden="true"
+ className={cn("size-3.5", running && "animate-pulse")}
+ />
</Button>
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| <Button | |
| type="button" | |
| size="sm" | |
| variant="ghost" | |
| className="h-6 w-6 p-0 rounded-md text-muted-foreground hover:text-primary" | |
| onClick={handleClick} | |
| disabled={disabled} | |
| title={title ?? (running ? "AI で書き換え中…" : "AI で書き換える")} | |
| > | |
| <HiMiniSparkles className={cn("size-3.5", running && "animate-pulse")} /> | |
| <Button | |
| type="button" | |
| size="sm" | |
| variant="ghost" | |
| className="h-6 w-6 p-0 rounded-md text-muted-foreground hover:text-primary" | |
| onClick={handleClick} | |
| disabled={disabled} | |
| title={title ?? (running ? "AI で書き換え中…" : "AI で書き換える")} | |
| aria-label={title ?? (running ? "AI で書き換え中" : "AI で書き換える")} | |
| > | |
| <HiMiniSparkles | |
| aria-hidden="true" | |
| className={cn("size-3.5", running && "animate-pulse")} | |
| /> |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@apps/desktop/src/renderer/features/todo-agent/TodoModal/components/EnhanceButton/EnhanceButton.tsx`
around lines 55 - 64, Enhance the accessibility of the icon-only button in the
EnhanceButton component by adding an explicit aria-label to the Button (use the
existing title prop or fallback to running ? "AI で書き換え中…" : "AI で書き換える") and
mark the HiMiniSparkles icon as non-interactive for assistive tech (e.g., add
aria-hidden="true" and focusable={false} to the icon element); update the Button
JSX where onClick={handleClick}, disabled={disabled}, title={...} is set to
include aria-label and update the HiMiniSparkles usage to include aria-hidden
and focusable props.
| let targetWorkspaceId = workspaceId; | ||
| if (createWorktree) { | ||
| if (!projectId) { | ||
| throw new Error( | ||
| "このワークスペースにはプロジェクトが紐付いていないので新しい worktree を作成できません", | ||
| ); | ||
| } | ||
| const namingPrompt = [title.trim(), description.trim()] | ||
| .filter(Boolean) | ||
| .join("\n\n"); | ||
| const result = await createWorkspaceMut.mutateAsync({ | ||
| projectId, | ||
| prompt: namingPrompt || title.trim(), | ||
| }); | ||
| targetWorkspaceId = result.workspace.id; | ||
| } | ||
|
|
||
| const created = await create.mutateAsync({ | ||
| workspaceId: targetWorkspaceId, | ||
| projectId, | ||
| title: title.trim(), | ||
| description: description.trim(), | ||
| goal: hasGoal ? goal.trim() : undefined, | ||
| verifyCommand: hasVerify ? verifyCommand.trim() : undefined, | ||
| maxIterations, | ||
| maxWallClockSec: maxMinutes * 60, | ||
| customSystemPrompt: selectedPreset?.content ?? undefined, | ||
| }); |
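Worktree creation and session creation are separate steps, so a failure leaves a partial success behind.
If todoAgent.create fails after workspaces.create has succeeded, only the new workspace/worktree remains and the user gets an error. These are two stages of non-idempotent side effects, so unless they are merged into a single main-process mutation or a compensating delete runs on failure, orphaned resources will result.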
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/desktop/src/renderer/features/todo-agent/TodoModal/TodoModal.tsx` around
lines 123 - 150, The code currently calls createWorkspaceMut.mutateAsync(...) to
create a worktree and then create.mutateAsync(...) to create a todo session,
leaving an orphaned workspace if the second call fails; update the flow so both
operations are atomic: either (A) move the worktree + todo creation into a
single main-process mutation on the backend (preferred) so one server-side
transaction handles both, or (B) implement compensation logic in the caller
around createWorkspaceMut.mutateAsync and create.mutateAsync — after
createWorkspaceMut.mutateAsync returns a result.workspace.id, call
create.mutateAsync(...) and if that fails, reliably delete the newly created
workspace via the matching workspace delete API (use the same workspace id
returned), and surface the original error; reference
createWorkspaceMut.mutateAsync, create.mutateAsync, and targetWorkspaceId when
locating the code to change.
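Option (B), the compensation path, can be sketched generically as follows. The helper name `createWithCompensation` and its three callbacks are hypothetical; in the modal they would wrap `createWorkspaceMut.mutateAsync`, `create.mutateAsync`, and whichever workspace delete API exists, which this sketch does not assume by name.

```ts
// Runs two dependent creation steps; if the second fails, undoes the first
// and rethrows the original error.
async function createWithCompensation<R, S>(
  createResource: () => Promise<R>,
  createSession: (resource: R) => Promise<S>,
  deleteResource: (resource: R) => Promise<void>,
): Promise<S> {
  const resource = await createResource();
  try {
    return await createSession(resource);
  } catch (error) {
    try {
      // Best-effort cleanup of the orphaned resource.
      await deleteResource(resource);
    } catch (cleanupError) {
      // Do not mask the original failure with the cleanup failure.
      console.error("compensating delete failed", cleanupError);
    }
    throw error;
  }
}
```

Compensation is weaker than a single main-process mutation because the cleanup itself can fail, which is why the review lists option (A) as preferred.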
| <DropdownMenuTrigger asChild> | ||
| <button | ||
| type="button" | ||
| className={cn( | ||
| "flex items-center gap-2 px-2.5 py-1.5 rounded-md border text-xs transition", | ||
| selected | ||
| ? "border-primary/40 bg-primary/5 text-foreground" | ||
| : "border-border/40 text-muted-foreground hover:bg-muted/40", | ||
| )} | ||
| > | ||
| <HiMiniSparkles className="size-3 text-primary/80" /> | ||
| <span className="flex-1 text-left truncate"> | ||
| {selected ? selected.name : "プリセットを選択(設定から管理)"} | ||
| </span> | ||
| {selected && ( | ||
| <button | ||
| type="button" | ||
| className="size-4 rounded-sm flex items-center justify-center hover:bg-background/80" | ||
| onClick={(e) => { | ||
| e.preventDefault(); | ||
| e.stopPropagation(); | ||
| onSelect(null); | ||
| }} | ||
| title="解除" | ||
| > | ||
| <HiMiniXMark className="size-3" /> | ||
| </button> | ||
| )} | ||
| </button> |
🧩 Analysis chain
🏁 Script executed:
cat -n apps/desktop/src/renderer/features/todo-agent/TodoModal/TodoModal.tsx | sed -n '350,390p'
Repository: MocA-Love/superset
Length of output: 1642
Remove the nested button element.
The clear button is a child of the DropdownMenuTrigger's button, which is invalid HTML. Per the HTML specification, a button element cannot contain interactive content such as another button. This structure makes keyboard operation and assistive technology (screen readers and the like) behave unreliably, and in some environments the parent trigger and the child button may both react.
💡 Suggested fix
- <DropdownMenuTrigger asChild>
- <button
- type="button"
- className={cn(
- "flex items-center gap-2 px-2.5 py-1.5 rounded-md border text-xs transition",
- selected
- ? "border-primary/40 bg-primary/5 text-foreground"
- : "border-border/40 text-muted-foreground hover:bg-muted/40",
- )}
- >
- <HiMiniSparkles className="size-3 text-primary/80" />
- <span className="flex-1 text-left truncate">
- {selected ? selected.name : "プリセットを選択(設定から管理)"}
- </span>
- {selected && (
- <button
- type="button"
- className="size-4 rounded-sm flex items-center justify-center hover:bg-background/80"
- onClick={(e) => {
- e.preventDefault();
- e.stopPropagation();
- onSelect(null);
- }}
- title="解除"
- >
- <HiMiniXMark className="size-3" />
- </button>
- )}
- </button>
- </DropdownMenuTrigger>
+ <div className="relative">
+ <DropdownMenuTrigger asChild>
+ <button
+ type="button"
+ className={cn(
+ "flex w-full items-center gap-2 rounded-md border px-2.5 py-1.5 pr-7 text-xs transition",
+ selected
+ ? "border-primary/40 bg-primary/5 text-foreground"
+ : "border-border/40 text-muted-foreground hover:bg-muted/40",
+ )}
+ >
+ <HiMiniSparkles className="size-3 text-primary/80" />
+ <span className="flex-1 truncate text-left">
+ {selected ? selected.name : "プリセットを選択(設定から管理)"}
+ </span>
+ </button>
+ </DropdownMenuTrigger>
+ {selected && (
+ <button
+ type="button"
+ className="absolute right-2 top-1/2 flex size-4 -translate-y-1/2 items-center justify-center rounded-sm hover:bg-background/80"
+ onClick={(e) => {
+ e.preventDefault();
+ e.stopPropagation();
+ onSelect(null);
+ }}
+ title="解除"
+ >
+ <HiMiniXMark className="size-3" />
+ </button>
+ )}
+ </div>
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| <DropdownMenuTrigger asChild> | |
| <button | |
| type="button" | |
| className={cn( | |
| "flex items-center gap-2 px-2.5 py-1.5 rounded-md border text-xs transition", | |
| selected | |
| ? "border-primary/40 bg-primary/5 text-foreground" | |
| : "border-border/40 text-muted-foreground hover:bg-muted/40", | |
| )} | |
| > | |
| <HiMiniSparkles className="size-3 text-primary/80" /> | |
| <span className="flex-1 text-left truncate"> | |
| {selected ? selected.name : "プリセットを選択(設定から管理)"} | |
| </span> | |
| {selected && ( | |
| <button | |
| type="button" | |
| className="size-4 rounded-sm flex items-center justify-center hover:bg-background/80" | |
| onClick={(e) => { | |
| e.preventDefault(); | |
| e.stopPropagation(); | |
| onSelect(null); | |
| }} | |
| title="解除" | |
| > | |
| <HiMiniXMark className="size-3" /> | |
| </button> | |
| )} | |
| </button> | |
| <div className="relative"> | |
| <DropdownMenuTrigger asChild> | |
| <button | |
| type="button" | |
| className={cn( | |
| "flex w-full items-center gap-2 rounded-md border px-2.5 py-1.5 pr-7 text-xs transition", | |
| selected | |
| ? "border-primary/40 bg-primary/5 text-foreground" | |
| : "border-border/40 text-muted-foreground hover:bg-muted/40", | |
| )} | |
| > | |
| <HiMiniSparkles className="size-3 text-primary/80" /> | |
| <span className="flex-1 truncate text-left"> | |
| {selected ? selected.name : "プリセットを選択(設定から管理)"} | |
| </span> | |
| </button> | |
| </DropdownMenuTrigger> | |
| {selected && ( | |
| <button | |
| type="button" | |
| className="absolute right-2 top-1/2 flex size-4 -translate-y-1/2 items-center justify-center rounded-sm hover:bg-background/80" | |
| onClick={(e) => { | |
| e.preventDefault(); | |
| e.stopPropagation(); | |
| onSelect(null); | |
| }} | |
| title="解除" | |
| > | |
| <HiMiniXMark className="size-3" /> | |
| </button> | |
| )} | |
| </div> |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/desktop/src/renderer/features/todo-agent/TodoModal/TodoModal.tsx` around
lines 355 - 383, The nested button inside the DropdownMenuTrigger must be
removed; in TodoModal replace the inner clear <button> (the one rendered when
selected) with a non-button interactive element (e.g., a <span> or <div> with
role="button" and tabIndex={0}) and wire its click and keyboard handlers to call
onSelect(null) while calling e.preventDefault()/e.stopPropagation() to avoid
triggering the parent trigger; keep the same classes, title="解除", and accessible
keyboard handling (Enter/Space) so the clear control remains focusable and
accessible without nesting a button inside the DropdownMenuTrigger's button.
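Overview
Introduces an autonomous TODO agent as a new fork-only feature. The user enters a task and a clear acceptance condition (the goal), and a Claude Code worker keeps running unattended until that goal is deterministically verified. The worker runs as interactive-mode Claude Code in a regular workspace terminal tab, so anyone can watch it live and type directly into it to intervene.
A new TODO button sits immediately to the left of the existing Run button in the workspace's PresetsBar.
Background and purpose
In the current workspace experience, the user has to stay and watch for as long as a Claude session is running. For long-running work (issue fixes, incremental refactors, iterative implementation), users want the following:
This PR ships the entire v1 loop for the use case above.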
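How it works
Key design decisions
- If `bun test` exits 0 the session is done; otherwise the tail of the failure log is fed back into the next turn.
- ...the session is set to `escalated` and stopped. Normalized test IDs drop ANSI codes, timings ("(12 ms)"), hex addresses, and trailing diff text, so IDs stay stable across reruns.
Fork-conflict surface
Intentionally minimal. All new code lives in new directories, and changes to existing files are limited to isolated additions.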
- `apps/desktop/src/lib/trpc/routers/index.ts`: `todoAgent`, +1 line
- `apps/desktop/src/renderer/screens/main/components/WorkspaceView/ContentView/components/PresetsBar/PresetsBar.tsx`: render `<TodoButton />`, +1 line
- `packages/local-db/src/schema/schema.ts`
- `packages/local-db/src/schema/index.ts`
Everything else is new files under `apps/desktop/src/main/todo-agent/`, `apps/desktop/src/renderer/features/todo-agent/`, and `packages/local-db/src/schema/todo-sessions.ts`.
What lands in this PR
Backend (main process)
- `apps/desktop/src/main/todo-agent/`
  - `types.ts`: zod input schemas, shared constants, event types
  - `session-store.ts`: localDb-backed CRUD, EventEmitter fan-out, worktree path resolution for the main process
  - `supervisor.ts`: singleton loop driver: artifact preparation, prompt assembly, PTY writes, idle waiting, running verify via child_process, futility and budget guards, abort / sendInput
  - `trpc-router.ts`: the `todoAgent.*` router. `subscribeState` is observable-based, following the `trpc-electron` constraint described in `apps/desktop/AGENTS.md`
  - `index.ts`: barrel
Schema
- `packages/local-db/src/schema/todo-sessions.ts`: new `todo_sessions` table (22 columns / 3 indexes / 2 FKs)
- `packages/local-db/drizzle/0049_add_todo_sessions.sql`: generated migration
- `packages/local-db/drizzle/meta/0049_snapshot.json`: drizzle snapshot
Renderer UI
- `apps/desktop/src/renderer/features/todo-agent/`
  - `TodoButton/TodoButton.tsx`: compact split button (clicking the body opens the creation modal; the ▾ dropdown offers "New TODO…" and "Open panel"), with a counter badge for the number of active sessions
  - `TodoModal/TodoModal.tsx`: creation form: title, description, goal, verify command, max iterations, wall-clock minutes
  - `TodoPanel/TodoPanel.tsx`: right-side Sheet drawer: session list plus detail view (Start / Abort / intervention input controls)
Plan doc
- `apps/desktop/plans/todo-agent-plan.md`: the full design document (goals / non-goals / architecture / execution loop / intervention UX / fork-conflict strategy / data model / tRPC surface / phased delivery / unresolved questions)
Commits
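Each commit is cut so that it can be reviewed and rolled back independently:
- `feat(fork): scaffold TODO autonomous agent backend`: plan doc + schema + main-process supervisor + tRPC router wiring
- `feat(fork): add TODO button and session creation modal`: the first user-facing surface (session creation only, no execution handoff)
- `feat(fork): add TodoPanel with execution handoff and intervention`: completes the v1 control loop: Start / Abort / intervention
- `chore(fork): generate drizzle migration for todo_sessions`: auto-generated SQL + snapshot
- `refactor(fork): harden TODO futility detection and fix plan doc paths`: failing-test extraction and normalization across multiple runners, so the "same failure 3 times" check actually works
v1 non-goals (intentionally out of scope)
- Automatic PR creation on `done`
- Stop hook integration via `--settings` (v1 uses idle detection to stay decoupled from Claude Code CLI internals; see "Unresolved" in the plan doc)
Test plan
- `bun run typecheck` passes in `apps/desktop` (verified locally)
- ...create `hello.txt` / verify is `test -f hello.txt`), a session can be created
- ...`claude "<initial prompt>"` runs
- ...reaches `done`
- ...settles at `escalated`
- ...becomes `aborted`
v1 known weaknesses
- Assumes `claude` is on PATH. The existing workspace terminal prepends `~/.superset/bin` to PATH, so this is not a problem in a normal environment.
- ...designed to pick up `Error:` lines.
Closes: n/a (fork-only feature)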
Refs: `apps/desktop/plans/todo-agent-plan.md`
Summary by CodeRabbit
New Features