feat(fork): autonomous TODO agent with live worker visibility and intervention#181

Merged
MocA-Love merged 27 commits into main from feat/todo-autonomous-agent
Apr 15, 2026
MocA-Love commented Apr 15, 2026

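Overview

This PR introduces an autonomous TODO agent as a fork-only feature. The user enters a task and a clear acceptance criterion (the goal), and a Claude Code worker keeps running unattended until that goal is verified deterministically. The worker runs as interactive Claude Code in an ordinary workspace terminal tab, so anyone can watch it live and intervene by typing directly into it.

A new TODO button sits immediately to the left of the existing Run button in the workspace PresetsBar.

Background and motivation

In the current workspace experience, the user has to watch a Claude session for as long as it runs. For long-running work (issue fixes, staged refactors, iterative implementation), users want to:

  1. Explain once what they want and what counts as "done"
  2. Step away and do something else
  3. Come back to a verdict decided by whether the real tests passed, not by the LLM insisting that it finished

This PR ships the full v1 loop for that use case.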

How it works

Renderer                                    Main process
────────                                    ────────────
TodoButton (PresetsBar)                     TodoSupervisor (singleton)
  └─ TodoModal ──► todoAgent.create ──────► INSERT row, write goal.md
TodoPanel                                   drain the queue
  ├─ session list (polling)                  ├─ write the prompt to the PTY
  ├─ Start: create pane → launch claude      ├─ wait for idle on data:${paneId}
  │        → todoAgent.attachPane ──────────►├─ run the verify command via child_process
  ├─ Abort / intervention input ────────────►├─ exit 0 → done, otherwise next iteration
  └─ state via observable subscription       │    futility: same failing test 3 times → escalated
                                             └─ budget: iteration count / wall-clock

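Key design decisions

  • The supervisor is plain TypeScript in the main process, not a second Claude Code instance. This eliminates LLM-to-LLM communication: all creative work is concentrated in a single worker, and "management" is handled by deterministic TS code.
  • The worker runs in a real PTY pane. It reuses the existing workspace terminal infrastructure as-is, with no special embedding mechanism. Because the worker tab is an ordinary terminal tab, anyone can open it from the tab bar to monitor or intervene.
  • The exit code of the verify command is the single source of truth for completion. The LLM's self-report is never trusted. If bun test exits 0 the session is done; otherwise the tail of the failure log is fed back into the next turn.
  • Futility detection: if the same failing test appears in 3 consecutive iterations, the session is marked escalated and stops. Normalized test IDs drop ANSI codes, timings ("(12 ms)"), hex addresses, and trailing diff wording, so IDs stay stable across reruns.
  • Context rot is addressed by running a fresh session per phase (budget caps plus the iteration loop).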
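
The decision rules above (exit code as the only path to done, a three-in-a-row futility stop, and budget caps) can be sketched as one pure function. This is a minimal illustration, not the code in supervisor.ts: the names `settleIteration`, `LoopState`, `Budget`, and `VerifyResult` are hypothetical, and mapping an exhausted budget to `failed` is an assumption.

```typescript
// Hypothetical types and names; the real supervisor.ts may be shaped differently.
type LoopStatus = "running" | "done" | "failed" | "escalated";

interface LoopState {
  iteration: number; // iterations completed so far
  startedAtMs: number; // wall-clock start of the session
  lastFailingTest: string | null; // normalized id from the previous iteration
  sameFailureStreak: number; // consecutive iterations with that same id
}

interface Budget {
  maxIterations: number;
  maxWallClockMs: number;
}

interface VerifyResult {
  exitCode: number;
  failingTestId: string | null; // assumed to be normalized already
}

const FUTILITY_LIMIT = 3;

function settleIteration(
  state: LoopState,
  budget: Budget,
  verify: VerifyResult,
  nowMs: number,
): { status: LoopStatus; next: LoopState } {
  const iteration = state.iteration + 1;

  // Exit code 0 is the only way to reach "done".
  if (verify.exitCode === 0) {
    const next: LoopState = { ...state, iteration, lastFailingTest: null, sameFailureStreak: 0 };
    return { status: "done", next };
  }

  // Count how many iterations in a row produced the same failing test id.
  const same = verify.failingTestId !== null && verify.failingTestId === state.lastFailingTest;
  const streak = same ? state.sameFailureStreak + 1 : 1;
  const next: LoopState = {
    ...state,
    iteration,
    lastFailingTest: verify.failingTestId,
    sameFailureStreak: streak,
  };

  if (streak >= FUTILITY_LIMIT) return { status: "escalated", next };

  // Assumption: running out of budget settles the session as "failed".
  const outOfIterations = iteration >= budget.maxIterations;
  const outOfTime = nowMs - state.startedAtMs >= budget.maxWallClockMs;
  if (outOfIterations || outOfTime) return { status: "failed", next };

  return { status: "running", next };
}
```

Keeping the decision pure (no PTY or process access) makes the state machine easy to unit test separately from the I/O that feeds it.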

Fork conflict surface

Deliberately minimal. All new code lives in new directories, and changes to existing files are limited to isolated additions.

File | Change
apps/desktop/src/lib/trpc/routers/index.ts | +1 import line, +1 line adding todoAgent to the router object
apps/desktop/src/renderer/screens/main/components/WorkspaceView/ContentView/components/PresetsBar/PresetsBar.tsx | +1 import line, +1 line rendering <TodoButton />
packages/local-db/src/schema/schema.ts | +1 line (re-export so drizzle-kit picks up the new table file)
packages/local-db/src/schema/index.ts | +1 line re-export

Everything else is new files under apps/desktop/src/main/todo-agent/ and apps/desktop/src/renderer/features/todo-agent/, plus packages/local-db/src/schema/todo-sessions.ts.

What this PR adds

Backend (main process)

  • apps/desktop/src/main/todo-agent/
    • types.ts - zod input schemas, shared constants, event types
    • session-store.ts - localDb-backed CRUD, EventEmitter fan-out, worktree path resolution for the main process
    • supervisor.ts - singleton loop driver: artifact preparation, prompt assembly, PTY writes, idle waiting, running verify via child_process, futility and budget guards, abort / sendInput
    • trpc-router.ts - the todoAgent.* router. Following the trpc-electron constraint documented in apps/desktop/AGENTS.md, subscribeState is observable-based
    • index.ts - barrel

Schema

  • packages/local-db/src/schema/todo-sessions.ts - new todo_sessions table (22 columns / 3 indexes / 2 FKs)
  • packages/local-db/drizzle/0049_add_todo_sessions.sql - generated migration
  • packages/local-db/drizzle/meta/0049_snapshot.json - drizzle snapshot

Renderer UI

  • apps/desktop/src/renderer/features/todo-agent/
    • TodoButton/TodoButton.tsx - compact split button (clicking the body opens the creation modal; the ▾ dropdown offers "New TODO…" and "Open panel"), with a counter badge showing the number of active sessions
    • TodoModal/TodoModal.tsx - creation form: title, description, goal, verify command, max iterations, wall-clock minutes
    • TodoPanel/TodoPanel.tsx - right-side Sheet drawer: session list plus detail view (Start / Abort / intervention input controls)

Plan doc

  • apps/desktop/plans/todo-agent-plan.md - the full design document (goals / non-goals / architecture / execution loop / intervention UX / fork-conflict strategy / data model / tRPC surface / phased rollout / unresolved items)

Commits

Each commit is cut so that it can be reviewed and rolled back independently:

  1. feat(fork): scaffold TODO autonomous agent backend - plan doc + schema + main-process supervisor + tRPC router wiring
  2. feat(fork): add TODO button and session creation modal - the first user-facing surface (session creation only; no execution handoff)
  3. feat(fork): add TodoPanel with execution handoff and intervention - completes the v1 control loop: Start / Abort / intervention
  4. chore(fork): generate drizzle migration for todo_sessions - auto-generated SQL + snapshot
  5. refactor(fork): harden TODO futility detection and fix plan doc paths - failing-test extraction and normalization across multiple runners, which makes the "same failure 3 times" check actually work

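Non-goals for v1 (deliberately out of scope)

  • Parallel task execution (the supervisor keeps an internal queue, but parallelism, including its UI, is deferred to v2 or later)
  • Running workers in Cloud / Modal sandboxes (v1 assumes a local worktree)
  • An LLM-as-judge secondary gate (the single source of truth is the verify command's exit code)
  • Automatic PR creation on done
  • Stop hook integration via --settings (v1 uses idle detection to stay decoupled from Claude Code CLI internals; see "Unresolved" in the plan doc)
  • Live PTY embedding inside the panel (v1 leaves this to the workspace's own tab bar; the worker tab is an ordinary terminal tab, so the existing UI already shows it)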

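Test plan

  • bun run typecheck passes in apps/desktop (confirmed locally)
  • Launching the desktop app applies migration 0049 at startup
  • Clicking the TODO button in PresetsBar opens the modal
  • A session can be created with a light goal (e.g. create hello.txt in the worktree root, with test -f hello.txt as the verify command)
  • Opening the panel and starting a queued session works
  • A new terminal tab appears and claude "<initial prompt>" runs
  • The worker iterates, verify runs, and the session eventually reaches done
  • With a verify command that deliberately fails, the session settles into escalated after 3 futility hits
  • Pressing Abort during a run sends Ctrl-C twice and the status becomes aborted
  • Typing into the intervention input during a run and sending it makes the text appear in the worker's terminal
  • Creating a second session while one is running → it is dequeued automatically once the first one finishes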

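Known weaknesses in v1

  • Assumes claude is on PATH. The existing workspace terminal prepends ~/.superset/bin to PATH, so this is not a problem in a typical environment.
  • Idle detection is a 5-second heuristic and may mistake a Claude that is thinking for a long time for a completed turn. A move to Stop hook integration is planned for v2.
  • Failing-test extraction covers the major runners, but exotic runners may be missed. As a fallback, the design picks up the first Error: line.

Closes: n/a (fork-only feature)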
Refs: apps/desktop/plans/todo-agent-plan.md

Summary by CodeRabbit

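New features

  • Adds the autonomous TODO agent feature: a new TODO button in the workspace presets bar. Users can create and run agent sessions by specifying a goal, with support for progress tracking, viewing Git changes, managing system prompt templates, and sending interactive input during execution.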

Introduces the main-process scaffolding for a new fork-local "TODO" feature
that drives Claude Code autonomously toward a user-defined goal until a
decisive verify command passes. This commit establishes the backend
surface — schema, supervisor, and tRPC router — without any renderer work
or existing-UI integration, so it can be iterated on and reviewed in
isolation.

Why this shape
--------------
- The supervisor is pure TypeScript in the main process, not a second
  Claude Code. All creativity stays in one worker; "management" is
  deterministic code. This avoids LLM-to-LLM communication, which the
  research survey flagged as the biggest reliability sink for long-horizon
  autonomous loops.
- The worker runs as interactive Claude Code inside a real PTY pane (same
  infra the existing Run button uses), so users can watch it live and type
  into it to intervene. Completion per turn is detected by idle timing on
  the PTY data stream; decisive success is the exit code of the user's
  verify command (e.g. `bun test`). LLM self-report is never trusted.
- Fork-conflict surface is kept to three 1-line edits in existing files
  (trpc routers index, local-db schema.ts re-export, local-db schema
  barrel). Everything else lives in new files under new directories.

What lands here
---------------
- apps/desktop/plans/todo-agent-plan.md — full design doc covering goals,
  non-goals, architecture, execution loop, intervention UX, UI surface,
  fork-conflict strategy, data model, tRPC surface, phased delivery, and
  unresolved questions.
- packages/local-db/src/schema/todo-sessions.ts — new `todo_sessions`
  SQLite table (workspace-scoped, status machine, budget, verdict fields,
  artifact path). Re-exported from schema.ts so drizzle-kit picks it up
  without changing the drizzle.config.ts entry.
- apps/desktop/src/main/todo-agent/
  - types.ts         zod input schemas + shared constants
  - session-store.ts localDb-backed CRUD + EventEmitter fan-out, plus a
                     worktree-path resolver for main-process callers.
  - supervisor.ts    Singleton loop driver: prepares artifacts
                     (`.superset/todo/<id>/goal.md`), writes the iteration
                     prompt into the worker PTY via the workspace
                     terminal runtime, waits for idle, runs the verify
                     command as a detached child process, applies
                     futility (3x same failing test) and budget
                     (iteration count, wall-clock) guards, and settles
                     the session to done/failed/escalated/aborted.
                     Also exposes abort() (sends double Ctrl-C to the
                     pane) and sendInput() passthroughs.
  - trpc-router.ts   `todoAgent.*` router: create / list / get /
                     attachPane / abort / sendInput + an observable-based
                     subscribeState subscription (per trpc-electron
                     constraint documented in apps/desktop/AGENTS.md).
  - index.ts         Barrel.
- apps/desktop/src/lib/trpc/routers/index.ts — register the new router
  as `todoAgent` on the app router (import + one field, clearly fork-
  marked).
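
As an illustration of "the exit code is the ground truth", here is a minimal sketch of a verify runner built on Node's `child_process`. It is not the supervisor's implementation: the commit describes a detached child process, while this sketch uses the synchronous `spawnSync` for brevity, and `runVerify`, `VerifyOutcome`, and the 4000-character tail size are assumptions made for the example.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical result shape for illustration only.
interface VerifyOutcome {
  passed: boolean;
  exitCode: number | null; // null when the process could not be spawned or was killed
  logTail: string; // last part of combined stdout + stderr, for the next prompt
}

function runVerify(command: string, cwd: string, tailChars = 4000): VerifyOutcome {
  // shell: true lets the user write the command as they would in a terminal.
  const result = spawnSync(command, { cwd, shell: true, encoding: "utf8" });
  const log = `${result.stdout ?? ""}${result.stderr ?? ""}`;
  return {
    passed: result.status === 0, // only exit code 0 counts as success
    exitCode: result.status,
    logTail: log.slice(-tailChars),
  };
}
```

For example, `runVerify("bun test", worktreePath)` would report `passed: true` only when the test run exits with 0; anything else yields a log tail that can be handed back to the worker.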

Not yet in this commit
----------------------
- Renderer UI (TodoButton, TodoModal, TodoPanel) and the PresetsBar
  integration point next to WorkspaceRunButton.
- Drizzle migration file. Per repo policy, migrations are generated by
  running `bunx drizzle-kit generate` locally and never hand-written;
  this will be generated when the feature is wired end-to-end.
- Stop-hook integration via `--settings`. v1 uses idle-detection to
  stay decoupled from Claude Code CLI internals. Tracked as an
  Unresolved item in the plan doc for v2.

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
- `tsc --noEmit` in packages/local-db — clean.

Refs: apps/desktop/plans/todo-agent-plan.md

Adds the first user-facing surface of the autonomous TODO agent: a
compact TODO button placed immediately left of WorkspaceRunButton in
PresetsBar, plus the creation modal that collects the task details the
supervisor needs to start a run.

Scope of this commit
--------------------
Deliberately limited to session *creation*. Clicking the button opens a
modal, the user fills in the form, and submit creates a `todo_sessions`
row via `todoAgent.create`. The supervisor does not start executing
yet — pane attach + execution handoff lands in a follow-up commit along
with TodoPanel. This keeps each commit independently reviewable and
rollback-safe.

TodoButton (TodoButton/TodoButton.tsx)
--------------------------------------
- Small ghost-variant button with a list icon and "TODO" label, styled
  to sit naturally next to WorkspaceRunButton without visually
  competing with it.
- Polls `todoAgent.list` every 3s for the current workspace and shows a
  badge with the count of queued/preparing/running/verifying sessions
  so users can see at a glance that work is in flight.
- Opens the modal as local state; no global store needed.
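
The badge count described above amounts to filtering the polled list by the four in-flight statuses. A minimal sketch, assuming a hypothetical `countActiveSessions` helper and a session shape reduced to its `status` field (the status names themselves come from this PR):

```typescript
type SessionStatus =
  | "queued"
  | "preparing"
  | "running"
  | "verifying"
  | "done"
  | "failed"
  | "escalated"
  | "aborted";

// Statuses that count as "work in flight" for the badge.
const ACTIVE_STATUSES: ReadonlySet<SessionStatus> = new Set<SessionStatus>([
  "queued",
  "preparing",
  "running",
  "verifying",
]);

// Hypothetical helper name; the real component may compute this inline.
function countActiveSessions(sessions: ReadonlyArray<{ status: SessionStatus }>): number {
  return sessions.filter((session) => ACTIVE_STATUSES.has(session.status)).length;
}
```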

TodoModal (TodoModal/TodoModal.tsx)
-----------------------------------
Form fields, each mapped 1:1 to the zod schema in
`main/todo-agent/types.ts`:
- Title (max 200)
- What should be done? (multiline, max 10k)
- Clear goal / acceptance criteria (multiline, required — this is the
  single most important input for making the loop terminate)
- Verify command (default `bun test`, exit code is the ground truth)
- Max iterations (default 10, capped at 100)
- Wall-clock minutes (default 30, capped at 240)

Submit calls `electronTrpc.todoAgent.create.useMutation` and invalidates
`todoAgent.list` so the button badge updates immediately. Success and
failure are surfaced via the existing sonner toast. Cancel and close
both reset the form.
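
The field constraints listed above can be illustrated without any library. The real code uses a zod schema in `main/todo-agent/types.ts`; the sketch below is a dependency-free stand-in, so `parseTodoForm` and its types are hypothetical, and two details are assumptions: numeric fields must be integers of at least 1, and values above the cap are rejected rather than clamped.

```typescript
interface TodoFormInput {
  title: string;
  description: string;
  goal: string;
  verifyCommand?: string;
  maxIterations?: number;
  wallClockMinutes?: number;
}

interface TodoFormValue {
  title: string;
  description: string;
  goal: string;
  verifyCommand: string;
  maxIterations: number;
  wallClockMinutes: number;
}

type ParseResult =
  | { ok: true; value: TodoFormValue }
  | { ok: false; errors: string[] };

function checkBoundedInt(name: string, value: number, max: number, errors: string[]): void {
  // Assumption: the lower bound is 1; the source only states the upper caps.
  if (!Number.isInteger(value) || value < 1 || value > max) {
    errors.push(`${name} must be an integer between 1 and ${max}`);
  }
}

function parseTodoForm(input: TodoFormInput): ParseResult {
  const errors: string[] = [];
  const value: TodoFormValue = {
    title: input.title.trim(),
    description: input.description,
    goal: input.goal.trim(),
    verifyCommand: input.verifyCommand?.trim() || "bun test", // default from the form
    maxIterations: input.maxIterations ?? 10,
    wallClockMinutes: input.wallClockMinutes ?? 30,
  };

  if (value.title.length > 200) errors.push("title must be at most 200 characters");
  if (value.description.length > 10_000) errors.push("description must be at most 10000 characters");
  if (value.goal.length === 0) errors.push("goal is required");
  checkBoundedInt("maxIterations", value.maxIterations, 100, errors);
  checkBoundedInt("wallClockMinutes", value.wallClockMinutes, 240, errors);

  if (errors.length > 0) {
    return { ok: false, errors };
  }
  return { ok: true, value };
}
```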

Rendering changes
-----------------
- `PresetsBar.tsx` now imports TodoButton and renders it inside the
  existing `ml-auto flex items-center gap-1 shrink-0` wrapper,
  immediately before WorkspaceRunButton. The wrapper already handles
  spacing so no layout tweaks are needed.
- Both the TodoButton import line and the render line are isolated
  additions to keep upstream merge conflicts cheap.

Co-location
-----------
Component code follows the repo's folder-per-component convention
under `src/renderer/features/todo-agent/` so all fork-local feature
code stays in one directory and is easy to delete or rebase.

Verified
--------
- `bun run typecheck` in apps/desktop — clean.

Closes the v1 control loop for the autonomous TODO agent: users can now
start a queued session, watch it run in a normal workspace terminal
tab, abort it, and type interventions directly into the running worker.

TodoButton dropdown
-------------------
The primary click still opens the creation modal (fast path for the
common action), but a chevron next to the button now opens a
DropdownMenu with "New TODO…" and "Open panel" so users can reach the
sessions drawer without having to create a new task first. The button
group is rendered as a single fused control (rounded-r-none +
rounded-l-none) so it reads as one widget next to WorkspaceRunButton.

TodoPanel (TodoPanel/TodoPanel.tsx)
-----------------------------------
Right-side Sheet, 540px wide, 2-column layout:
- Left: scrollable list of sessions for the current workspace, polled
  every 2s while the panel is open. Selection is local state.
- Right: detail view for the selected session — status, title,
  description, goal, verify command, iteration/budget snapshot, last
  verdict reason (as a max-h-40 scrollable pre block so long failure
  logs don't blow up the layout).

Controls in the detail view:
- **Start** (visible only when status === "queued")
  The handoff to the supervisor is done client-side in four steps so
  it composes cleanly with existing workspace terminal infra instead
  of adding new tab-creation primitives in the main process:
    1. `useTabsStore.getState().addTab(workspaceId)` creates a new
       terminal tab + pane in the Zustand store. The tab shows up in
       the workspace tab bar like any other terminal, so anyone can
       click over to watch the worker live.
    2. `setTabAutoTitle(tabId, "TODO: …")` labels the tab so it is
       easy to spot.
    3. `launchCommandInPane` (same helper the existing agent launcher
       uses) runs interactive `claude <prompt>` in the new pane,
       passing the session-specific initial prompt that points at
       `.superset/todo/<id>/goal.md` (written by the supervisor at
       creation time).
    4. `todoAgent.attachPane({ sessionId, tabId, paneId })` hands the
       session over to the supervisor, which takes it from `queued`
       to `running` and begins the idle-detect/verify loop.
- **Abort** (visible when active and already attached): calls
  `todoAgent.abort` which double-Ctrl-C's the pane and marks the
  session aborted.
- **Intervene input**: a small Input + Send button that writes text
  directly into the worker PTY via `todoAgent.sendInput`. Enter
  submits, shift+Enter does nothing (no multi-line for v1). This is
  the explicit "you can intervene while it runs" surface the plan
  doc promised; users can also just click over to the terminal tab
  and type there, since it is a real PTY.

A small footer reminds users that the worker runs in a normal
workspace terminal tab and can be opened from the tab bar directly —
no special terminal embed is needed inside the panel itself for v1,
which avoids bringing in the heavy TerminalPane + registry UI.

Scope deliberately out of this commit
-------------------------------------
- No auto-start on create. The user must explicitly click Start from
  the panel. This makes the handoff observable and keeps the modal
  commit rollback-safe.
- No live PTY embed inside the panel. v1 relies on the workspace's
  own tab bar for that. Can be added later if users want an
  in-panel viewer.
- No queue UI. The supervisor already queues internally if a second
  Start is pressed while another session runs, but there is no
  renderer affordance to reorder yet.

Integration note
----------------
`addTab` is the underlying Zustand method; `addTerminalTab` only
exists on the agent-session-orchestrator adapter layer as a thin
wrapper. Calling `addTab` directly keeps this feature from depending
on the full AgentLaunchTabsAdapter plumbing.

Verified
--------
- `bun run typecheck` in apps/desktop — clean.

Adds migration 0049, auto-generated by `bunx drizzle-kit generate` in
packages/local-db against the new `todo_sessions` table that landed in
the backend-scaffold commit. Creates the table with all 22 columns,
the three indexes defined in the schema file (workspace / status /
created_at), and the two foreign keys (workspace_id → workspaces.id
ON DELETE CASCADE, project_id → projects.id ON DELETE SET NULL).

Per repo policy, migration SQL and snapshot files are never edited by
hand; they are regenerated from the schema source. The journal update
is part of the same generate run.

Required follow-up: the migration runs automatically on the next
desktop app start (local-db migrations apply on boot). No manual
action needed beyond relaunching the app.

guessFailingTest (apps/desktop/src/main/todo-agent/supervisor.ts)
-----------------------------------------------------------------
The previous heuristic was a single regex matching `FAIL|✗|×` and
would both miss common runners and return run-specific strings that
broke the "same failure 3 times in a row → escalate" check — a timing
suffix like "(12 ms)" changing between runs was enough to reset the
consecutive-failure counter and make the futility guard toothless.

The replacement:

- Strips ANSI escapes before matching, so colored runner output is
  handled.
- Tries a prioritized list of line patterns covering bun test, vitest
  (tree view + summary + inline), jest (FAIL + ✕), generic ✗, TAP /
  node:test ("not ok 1 - …"), and playwright. Priority order matters
  because some runners emit several matches per failure and we want
  the most specific one first.
- Falls back to the first line containing "Error:" or "Assertion:" so
  shell verify commands that are not test runners (build scripts,
  type-checkers) still produce a stable identifier.
- Normalizes the returned id through `normalizeTestId`, which:
    * drops "(NNN ms)" and "[NNN ms]" timing suffixes,
    * collapses object hex addresses ("Foo@0x7f8b…") to "Foo@0x?",
    * truncates wording-variant ": expected X to be Y" tails,
    * caps length at 240 chars.
  This is the part that actually makes the futility guard work: the
  same logical failure now produces the same id across reruns even
  if the runner prints slightly different noise.
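
The normalization steps listed above can be sketched as a chain of regex replacements. This is an illustration of the described behavior, not the code in supervisor.ts: the exact patterns are assumptions, and only the function name `normalizeTestId` and the 240-character cap come from the commit message.

```typescript
// Matches common ANSI SGR/CSI escape sequences such as "\u001b[31m".
const ANSI_PATTERN = /\u001b\[[0-9;]*[A-Za-z]/g;

function stripAnsi(text: string): string {
  return text.replace(ANSI_PATTERN, "");
}

function normalizeTestId(raw: string): string {
  return stripAnsi(raw)
    // Drop "(12 ms)" and "[12 ms]" style timing suffixes.
    .replace(/\s*[(\[]\s*\d+(?:\.\d+)?\s*ms\s*[)\]]/g, "")
    // Collapse object hex addresses: "Foo@0x7f8b12" becomes "Foo@0x?".
    .replace(/@0x[0-9a-fA-F]+/g, "@0x?")
    // Truncate wording-variant tails such as ": expected X to be Y".
    .replace(/:\s*expected\b.*$/i, "")
    .trim()
    // Cap the length so ids stay cheap to store and compare.
    .slice(0, 240);
}
```

With this shape, two runs that differ only in timing, for example `adds numbers (12 ms)` and `adds numbers (340 ms)`, map to the same id, which is what the consecutive-failure counter needs.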

Plan doc paths (apps/desktop/plans/todo-agent-plan.md)
------------------------------------------------------
Three references still pointed at the Postgres `packages/db` schema
from the original design sketch. The feature actually lives in
`packages/local-db` (SQLite, the desktop app's local store). Updated
both the "files touched" checklist and the inline data-model code
block so future readers of the plan don't hunt in the wrong package.

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
coderabbitai Bot commented Apr 15, 2026

📥 Commits

Reviewing files that changed from the base of the PR and between 72e446b and a9bb0e8.

📒 Files selected for processing (3)
  • apps/desktop/src/main/todo-agent/git-status.ts
  • apps/desktop/src/main/todo-agent/supervisor.ts
  • apps/desktop/src/renderer/features/todo-agent/TodoManager/TodoManager.tsx
Walkthrough

Adds a new autonomous TODO agent feature to the desktop app. It consists of the tRPC router mount; the main-process supervisor, session-store, and Git operation modules; UI components (button, modal, manager, sidebar); and database schema definitions.

Changes

Cohort | File(s) | Summary
Configuration and docs | .gitignore, apps/desktop/plans/todo-agent-plan.md | Adds ignore patterns for local artifacts and defines the detailed v1 implementation plan.
tRPC router integration | apps/desktop/src/lib/trpc/routers/index.ts | Mounts the new todoAgent sub-router in createAppRouter().
Main process: text enhancement | apps/desktop/src/main/todo-agent/enhance-text.ts | Implements the AI rewrite flow for user text input (language template selection, small-model invocation, error handling).
Main process: Git inspection | apps/desktop/src/main/todo-agent/git-status.ts | Git operations for a session (HEAD SHA lookup, snapshot generation, diff retrieval, remote divergence calculation).
Main process: session store | apps/desktop/src/main/todo-agent/session-store.ts | Session persistence and subscription layer: in-memory buffer, JSONL stream, DB sync, worktree path resolution, recovery of interrupted sessions at startup.
Main process: supervisor | apps/desktop/src/main/todo-agent/supervisor.ts | Manages headless Claude CLI execution: session initialization, iteration control, stream parsing, verification runs, budget guards, user intervention.
Main process: tRPC router | apps/desktop/src/main/todo-agent/trpc-router.ts | CRUD, operation, and subscription APIs: session create/start/abort, text enhancement, Git snapshots, stream updates, and a nested router for preset management.
Main process: types and index | apps/desktop/src/main/todo-agent/types.ts, apps/desktop/src/main/todo-agent/index.ts | Shared type definitions (Zod schemas, phases, stream events) and barrel re-exports.
UI: button and modal | apps/desktop/src/renderer/features/todo-agent/TodoButton/*, apps/desktop/src/renderer/features/todo-agent/TodoModal/* | TODO start button for the workspace presets bar and the session creation form (preset selection, text enhancement).
UI: manager and changes sidebar | apps/desktop/src/renderer/features/todo-agent/TodoManager/*, apps/desktop/src/renderer/features/todo-agent/TodoManager/ChangesSidebar/*, apps/desktop/src/renderer/features/todo-agent/TodoManager/PresetsDialog/* | Session list, details, and actions; stream display; Git diff viewer; preset CRUD dialog.
UI: integration | apps/desktop/src/renderer/screens/main/components/WorkspaceView/ContentView/components/PresetsBar/PresetsBar.tsx | Places TodoButton in PresetsBar.
DB: migrations | packages/local-db/drizzle/0049_*.sql through 0054_*.sql | Creates the todo_sessions and todo_prompt_presets tables and adds columns (goal, headless, preset, starting SHA).
DB: snapshots and journal | packages/local-db/drizzle/meta/0049_*.json through 0054_*.json, packages/local-db/drizzle/meta/_journal.json | Updates the Drizzle migration snapshots and journal.
Local DB: schema definitions | packages/local-db/src/schema/todo-sessions.ts, packages/local-db/src/schema/todo-prompt-presets.ts, packages/local-db/src/schema/schema.ts, packages/local-db/src/schema/index.ts | Drizzle table definitions, type inference, re-exports.

Sequence Diagram(s)

sequenceDiagram
    participant User as User
    participant Renderer as Renderer
    participant Main as Main process
    participant DB as Local DB
    participant Claude as Claude CLI
    participant Verify as Verify script

    User->>Renderer: Open TODO button / modal
    Renderer->>Renderer: Submit session creation form
    Renderer->>Main: todoAgent.create (tRPC)
    Main->>DB: INSERT todo_sessions (queued)
    Main->>Main: Create artifact directory
    Renderer->>Renderer: Show success toast

    User->>Renderer: Click session Start button
    Renderer->>Main: todoAgent.start (tRPC)
    Main->>DB: UPDATE status="preparing"
    Main->>Main: Capture Git HEAD SHA
    Main->>Main: Initialize stream events
    Main->>DB: UPDATE status="running"
    Main->>Main: Start supervisor loop (async)

    Main->>Claude: spawn("claude", headless)
    Claude->>Claude: Run Claude turn
    Claude-->>Main: NDJSON stream output
    Main->>Main: Parse stream (text, tools, results)
    Main->>DB: UPDATE append stream events
    Main->>DB: UPDATE iteration/cost/turns
    Main->>Renderer: Emit subscribeStream event
    Renderer->>Renderer: Update stream view

    alt Verify command present
        Main->>Verify: execute verifyCommand
        Verify-->>Main: exit code / log
        Main->>DB: UPDATE status="done" or "failed"
        Main->>Main: escalated after 3 consecutive failures
    else No verify command
        Main->>DB: UPDATE status="done"
    end

    Main->>DB: Emit subscribeState event
    Renderer->>Renderer: Update session details (verdict, cost display)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 21.15% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name | Status | Explanation
Title check | ✅ Passed | The PR title "feat(fork): autonomous TODO agent with live worker visibility and intervention" accurately describes the added autonomous TODO agent feature and its live worker visibility and intervention capabilities, and clearly states the main purpose of the change set.
Description check | ✅ Passed | The PR description is thorough, covering overview, background, mechanism, design decisions, scope, contents, commit structure, non-goals, test plan, and known weaknesses. All sections of the required template (Description, Type of Change, Testing) are effectively present, notably with detailed Japanese documentation, an architecture diagram, and a test plan.

When a user created a TODO session against a workspace with
`type="branch"` (no worktree row), the supervisor threw
`todo-agent: workspace <id> has no worktree` at creation time and
refused to run.

The bug was that `resolveWorktreePath` only looked at the `worktrees`
table. Workspaces in this app come in two flavors:

- `type="worktree"` — backed by a real `worktrees.path`
- `type="branch"` — runs directly in the project's `mainRepoPath`,
  with no worktrees row at all

The existing terminal runtime already handles both via
`workspace-terminal-context.ts`, which falls back to
`projects.mainRepoPath` when no worktree row exists. The TODO agent now
follows the same resolution strategy: LEFT JOIN both `projects` and
`worktrees`, return `worktreePath ?? mainRepoPath`. The only
undefined-returning case is now "workspace row itself does not exist".
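
The resolution strategy described above boils down to `worktreePath ?? mainRepoPath`. The sketch below models it over in-memory rows instead of a real LEFT JOIN; the row shapes and field names (`projectId`, `worktreeId`) are hypothetical stand-ins for the actual schema, and only `resolveWorktreePath`, `mainRepoPath`, and the two workspace types come from the commit message.

```typescript
// Hypothetical row shapes; the real tables have more columns.
interface WorkspaceRow {
  id: string;
  type: "worktree" | "branch";
  projectId: string;
  worktreeId: string | null; // null for branch-type workspaces
}
interface ProjectRow {
  id: string;
  mainRepoPath: string;
}
interface WorktreeRow {
  id: string;
  path: string;
}
interface Tables {
  workspaces: WorkspaceRow[];
  projects: ProjectRow[];
  worktrees: WorktreeRow[];
}

function resolveWorktreePath(workspaceId: string, db: Tables): string | undefined {
  const workspace = db.workspaces.find((row) => row.id === workspaceId);
  if (!workspace) return undefined; // the workspace row itself does not exist

  // Both lookups behave like LEFT JOINs: a missing row is simply undefined.
  const worktree = db.worktrees.find((row) => row.id === workspace.worktreeId);
  const project = db.projects.find((row) => row.id === workspace.projectId);

  // Prefer the worktree path; fall back to the project's main repo path.
  return worktree?.path ?? project?.mainRepoPath;
}
```

In this simplified model a workspace whose project row is missing would also resolve to `undefined`; the commit message treats a missing workspace row as the only such case.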

This unblocks session creation for any branch-type workspace. No schema
or API surface changes.

Two changes that turn the TODO agent from a "code task + test gate"
feature into a general autonomous task runner that covers research and
investigation use cases, and aligns the UI language with the rest of
this fork-local feature.

1. Verify command is now optional
---------------------------------

Motivation: not every TODO has a sensible acceptance command. Research
tasks ("investigate this set of files and write up a design proposal"),
code-reading tasks, and one-shot refactors do not have a `bun test` that
can decide "done". Forcing users to invent one made the feature feel
code-centric when it is really about autonomous execution in general.

Behavior:

- `packages/local-db/src/schema/todo-sessions.ts`: `verify_command` is
  now nullable. New migration `0050_todo_verify_optional.sql`
  (drizzle-kit generated) applies the NOT NULL drop.
- `apps/desktop/src/main/todo-agent/types.ts`: `verifyCommand` is now
  an optional zod string that transforms empty to undefined, so
  trimming an empty input reliably reaches the supervisor as
  "unset" rather than "empty string".
- `apps/desktop/src/main/todo-agent/supervisor.ts`: new branch at the
  top of `runSession` for the "no verify" path — single-turn mode.
  It writes the initial prompt once, waits for the worker PTY to go
  idle, and marks the session `done` with a verdict message asking
  the user to review the output in the worker terminal. No iteration
  loop, no futility detection, no budget polling beyond the shared
  wall-clock cap on the idle-wait. The user drives any follow-up
  turns manually by typing into the same terminal tab.
- The existing iteration loop is preserved verbatim for sessions
  that do have a verify command; only the branch above it is new.
- Goal doc and per-iteration prompts composed by the supervisor now
  switch wording based on whether a verify command is set
  (`renderGoalDoc` and `buildIterationPrompt`), and the panel's
  Start handler does the same for the initial claude invocation.
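
The "empty means unset" rule and the resulting branch can be sketched as two small functions. This is an illustration under assumptions: the real code expresses the first step as a zod transform in types.ts and the second as a branch at the top of `runSession`, while `normalizeVerifyCommand`, `selectRunMode`, and the `RunMode` type are hypothetical names.

```typescript
// Empty or whitespace-only input reaches the supervisor as "unset", never as "".
function normalizeVerifyCommand(raw: string | undefined): string | undefined {
  const trimmed = raw?.trim();
  return trimmed ? trimmed : undefined;
}

type RunMode =
  | { kind: "single-turn" } // one prompt, wait for idle, then mark done for user review
  | { kind: "iterate"; verifyCommand: string }; // prompt, verify, repeat under budget

function selectRunMode(rawVerifyCommand: string | undefined): RunMode {
  const verifyCommand = normalizeVerifyCommand(rawVerifyCommand);
  if (verifyCommand === undefined) {
    return { kind: "single-turn" };
  }
  return { kind: "iterate", verifyCommand };
}
```

Modeling the mode as a tagged union keeps the two paths explicit, which matches the stated rationale for not reusing the loop with a cap of 1.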

Rationale for keeping single-turn as one iteration rather than
capping the existing loop at 1: the loop's structure assumes a
verify-then-maybe-continue flow. Short-circuiting it keeps the
branching explicit and makes the single-turn mode ("単発モード") state
machine readable at a glance in the supervisor.

2. UI localized to Japanese
---------------------------

This is a fork-local feature and the rest of the user's workflow is
in Japanese, so there is no reason for the TODO surface to be
English. Translated strings:

- `TodoButton/TodoButton.tsx`: tooltips, dropdown items
- `TodoModal/TodoModal.tsx`: dialog title/description, all form
  labels, placeholders, helper text, buttons, toasts, and error
  messages. The verify field is explicitly marked "(任意)"
  ("optional") and its helper text explains the
  empty-equals-single-turn behavior. The budget fields (max
  iterations, wall-clock minutes) are now conditionally rendered only
  when the verify field has a value, since they have no meaning in
  single-turn mode.
- `TodoPanel/TodoPanel.tsx`: sheet header, session list empty
  state, detail labels (ステータス/タイトル/やってほしいこと/
  ゴール/Verify/予算/直近の結果, i.e. status / title / what should
  be done / goal / verify / budget / latest result), button labels
  (Start remains in English as a recognizable verb; 中断 "abort" and
  送信 "send" are translated), intervene input placeholder, footer
  hint. The Verify field in the detail view now shows
  "単発モード(verify なし)" ("single-turn mode (no verify)") when the
  session was created without one, and the budget display adapts too.
- Toast messages (作成しました "created" / 開始しました "started" /
  中断しました "aborted" / 送信に失敗しました "failed to send", etc.)
  and the error thrown by trpc `create` when workspace path resolution
  fails.
- Supervisor-authored `goal.md` content and in-prompt wording are
  also Japanese so the worker Claude speaks the same language as
  the user.

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
- Migration generated via `bunx drizzle-kit generate` in
  packages/local-db (not hand-edited).

Two related fixes for a failure mode reported after the v1 rollout.
Symptom: on Start, the worker terminal tab showed

    claude ".superset/todo/<id>/goal.md …"
      ⎿ Please run /login · API Error: 401 {authentication_error …}

and yet the TodoPanel flipped the session to `done · iter 1`. Two
distinct bugs were stacked here: (1) the command shape we sent to the
PTY was not the one the rest of the app uses, and (2) the supervisor
treated "the worker went idle" as "the worker finished successfully"
even when the idle was caused by an immediate authentication crash.

1. Use the canonical claude prompt command builder
---------------------------------------------------

`TodoPanel/TodoPanel.tsx`'s Start handler was building the launch
command by hand:

    const command = `claude ${JSON.stringify(initialPrompt)}`;

That invocation was subtly wrong in two ways:

- It skipped `--dangerously-skip-permissions`, which is part of the
  canonical claude command defined in
  `packages/shared/src/builtin-terminal-agents.ts` and is included by
  every other agent launch path in the app (Run button, tasks view,
  agent preset menu). Bypassing it changes how claude-code boots and
  how its interactive auth / tool-use prompts are handled.
- It passed the prompt as a single JSON-quoted positional arg instead
  of using the heredoc-cat form produced by `buildPromptCommandString`
  for the `argv` transport. The heredoc form is what the terminal
  runtime's `~/.superset/bin` shim is designed to see, and it survives
  multi-line prompts, quoting, and the wrapper's argument parsing.

Both problems go away by routing through `buildAgentPromptCommand`
from `@superset/shared/agent-command` with `agent: "claude"`, which
is the exact same code path the existing Run / task launches use.
The panel now calls:

    buildAgentPromptCommand({
        prompt: initialPrompt,
        randomId: session.id,
        agent: "claude",
    })

and writes the resulting string through `launchCommandInPane`.
`session.id` is already a UUID so it is a fine `randomId` for the
delimiter.
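
To illustrate why the heredoc shape is more robust than a JSON-quoted positional argument, here is a minimal sketch. `buildHeredocCommand` is a hypothetical stand-in, not the real `buildAgentPromptCommand`; the delimiter naming and the exact flag placement are assumptions, since the string the shared builder emits is not shown here.

```typescript
// Hypothetical illustration only: NOT the real buildAgentPromptCommand.
// A quoted heredoc delimiter (<<'X') stops the shell from expanding
// $VARS, backticks, or quotes inside the prompt body.
function buildHeredocCommand(prompt: string, randomId: string): string {
  // Assumed delimiter scheme: a per-launch id makes a collision with
  // the prompt text very unlikely.
  const delimiter = `SUPERSET_PROMPT_${randomId.replace(/[^A-Za-z0-9]/g, "")}`;
  return [
    `claude --dangerously-skip-permissions "$(cat <<'${delimiter}'`,
    prompt,
    delimiter,
    `)"`,
  ].join("\n");
}

const sample = buildHeredocCommand('fix "quotes" and $VARS\nsecond line', "ab-12");
console.log(sample);
```

The prompt body is passed through untouched, so multi-line text and shell metacharacters need no escaping.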

2. Detect worker startup errors instead of marking them `done`
---------------------------------------------------------------

`supervisor.ts`'s `waitForIdle` used to return a plain `boolean` for
"did we reach idle?" and the single-turn path settled the session
with `status: "done"` as long as idle was reached. That is the wrong
contract: if the claude process prints an auth error and exits
(or sits at a login prompt), the PTY goes idle too, and the session
was being reported as successfully complete.

Changes:

- `waitForIdle` now accumulates PTY output into a ~16 KB ring buffer
  during the wait and returns `{ idled, buffer }` instead of just
  `idled`. The buffer is used only for post-hoc scanning; it is not
  emitted anywhere.
- New `detectStartupError(buffer)` helper scans the captured text
  (with ANSI stripped) for a small, deliberately conservative set of
  fatal markers:
    * `Please run /login`
    * `authentication_error` / `Invalid authentication credentials`
    * `claude: command not found` / `command not found: claude`
    * `API Error: 5xx`
    * `fatal:`
  Each pattern maps to a Japanese, actionable `verdictReason`
  explaining what went wrong. The set is intentionally narrow so we
  do not confuse a normal test failure inside the worker's TUI with a
  startup crash — those patterns never appear in healthy runs.
- Single-turn path now runs the detector immediately after idle. On a
  hit, the session is moved to `failed` with the matched reason. On a
  miss, the existing "done with review-the-terminal" verdict stands.
- Iteration-mode path runs the detector once, after the first
  iteration's idle, before executing the verify command. This is the
  only moment the detector adds value: running verify against a
  worker that never actually booted would produce a misleading
  "verify failed" verdict instead of the real reason. Subsequent
  iterations are assumed to be live because the supervisor is still
  feeding them follow-up prompts and the worker is clearly
  processing.
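
The detector described above can be sketched roughly as follows. The marker list follows this message; the ANSI-stripping regex and the English reason strings are assumptions for illustration (the real `verdictReason` values are Japanese).

```typescript
// Sketch only: reason strings and the ANSI regex are assumptions,
// not the supervisor's actual code.
const ANSI_PATTERN = /\u001b\[[0-9;?]*[ -\/]*[@-~]/g;

const FATAL_MARKERS: Array<{ pattern: RegExp; reason: string }> = [
  { pattern: /Please run \/login/, reason: "auth: run /login in the worker" },
  {
    pattern: /authentication_error|Invalid authentication credentials/,
    reason: "auth: credentials rejected",
  },
  {
    pattern: /claude: command not found|command not found: claude/,
    reason: "claude binary not found on PATH",
  },
  { pattern: /API Error: 5\d\d/, reason: "API server error" },
  { pattern: /fatal:/, reason: "fatal error during startup" },
];

// Returns the first matching reason, or null when the output looks healthy.
function detectStartupError(buffer: string): string | null {
  const text = buffer.replace(ANSI_PATTERN, "");
  for (const marker of FATAL_MARKERS) {
    if (marker.pattern.test(text)) return marker.reason;
  }
  return null;
}
```

Stripping ANSI first matters because a color code in the middle of a marker would otherwise defeat the match.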

Behavior after this fix on the reported failure
------------------------------------------------

Starting a session against a workspace whose `claude` binary is not
authenticated will now:

1. Launch claude via the canonical preset command (same as Run
   button). If the auth problem was an artifact of the hand-built
   command shape, it may resolve on its own.
2. If claude still fails authentication, the session will show
   `status: failed` with verdictReason
   "Claude Code の認証に失敗しました(API Error 401)。ワーカーの
   ターミナルで `/login` を実行してください。" instead of the
   misleading `done · iter 1`.

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
…rkspace list for TODO agent

This commit is the backend half of a broader redesign of the TODO
agent surface. Three discrete changes land here:

1. `goal` is now optional on todo_sessions
-------------------------------------------

Motivation: not every TODO has a crisp acceptance sentence. Research
and investigation tasks naturally treat "done once the requested work
(description) is finished" as the implicit goal, and making users
invent a separate goal string was pure friction.

Changes:
- `packages/local-db/src/schema/todo-sessions.ts`: `goal` column drops
  `.notNull()`. Migration `0051_todo_goal_optional.sql` is the drizzle-
  kit generated table-recreate migration that removes the NOT NULL
  constraint.
- `apps/desktop/src/main/todo-agent/types.ts`: `todoCreateInputSchema`'s
  `goal` is now an optional trimmed string that transforms empty to
  undefined — same pattern as `verifyCommand`.
- `apps/desktop/src/main/todo-agent/trpc-router.ts`: the `create`
  mutation now inserts `goal ?? null` so an omitted goal becomes a DB
  null, not an empty string.
- `apps/desktop/src/main/todo-agent/supervisor.ts`:
  - `renderGoalDoc` now emits "(未指定。上記『やって欲しいこと』が
    完了した時点で完了とみなす)" as the goal body when the session
    has no explicit goal, so the file the worker reads still has a
    coherent acceptance section.
  - `buildIterationPrompt` composes a `goalClause` that says either
    "ゴール(受け入れ条件)を達成することを目指してください" or
    "『やって欲しいこと』が完了した時点で完了とみなしてください"
    depending on whether `session.goal` is set, and threads that
    clause through all three prompt shapes (single-turn, first
    iteration with verify, retry iteration).
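
A minimal sketch of how the optional goal flows through these layers. The helper names are hypothetical and zod is deliberately left out; the two clause strings are the ones quoted above.

```typescript
// Sketch under assumptions: helper names are hypothetical.

// Mirrors the "trimmed string, empty becomes undefined" input rule.
function normalizeOptionalText(raw: string | undefined): string | undefined {
  const trimmed = (raw ?? "").trim();
  return trimmed.length > 0 ? trimmed : undefined;
}

// Mirrors `goal ?? null` in the create mutation: omitted goal -> DB null.
function toDbGoal(goal: string | undefined): string | null {
  return goal ?? null;
}

// Picks the prompt clause depending on whether an explicit goal exists.
function buildGoalClause(goal: string | null): string {
  return goal
    ? "ゴール(受け入れ条件)を達成することを目指してください"
    : "『やって欲しいこと』が完了した時点で完了とみなしてください";
}
```

Keeping "no goal" as a single null value means the supervisor only ever has to branch on one condition.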

2. AI rewrite helper for the TODO creation form
------------------------------------------------

New backend for the sparkle/✨ button that the creation modal will get
in the follow-up commit. Click → send the field's current text to a
small model with a tight rewrite prompt → receive a cleaner, more
LLM-friendly version back.

Implementation notes:
- Reuses the existing `callSmallModel` plumbing from
  `apps/desktop/src/lib/ai/call-small-model.ts` — the same path the
  workspace auto-namer uses. Zero new credential handling, zero new
  provider fallback logic, diagnostics integration for free.
- `apps/desktop/src/main/todo-agent/enhance-text.ts` exposes
  `enhanceTodoText(rawText, kind)` where `kind` is `"description" |
  "goal"`. Each kind has a dedicated Japanese system prompt baked in:
    * description: "ユーザーが書いた雑な TODO の記述を、自律
      コーディングエージェントが理解しやすい明確な指示に書き換える"
    * goal: "雑なゴールを、検証可能な受け入れ条件に書き換える"
  Both prompts explicitly say "元の意図を保つ" and "新しい要件を
  追加しない" to prevent the model from hallucinating scope creep,
  cap the output at ~1-6 lines, and return only the rewritten text
  without any "Sure, here's the rewrite:" preambles.
- Invokes via `callSmallModel` → `generateText` from the Vercel AI SDK
  directly, since the `model` passed to the invoke callback is a
  `LanguageModel` from `@ai-sdk/anthropic` (for the Anthropic path,
  `claude-haiku-4-5-20251001`) or `@ai-sdk/openai` (OpenAI path).
  Both accept `generateText({ model, system, prompt })` uniformly,
  so the branching in `ai-name.ts` isn't needed here.
- `describeEnhanceFailure(attempts)` turns the SmallModelAttempt[] into
  a user-facing Japanese error string, honoring the same hierarchy
  the workspace namer uses (expired > failed > unsupported >
  missing-credentials).
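
The failure hierarchy mentioned for `describeEnhanceFailure` could be sketched like this. The real `SmallModelAttempt` shape is not shown in this message, so `AttemptSketch` and its `outcome` field are hypothetical.

```typescript
// Hypothetical shapes: the real SmallModelAttempt type is not shown here.
type AttemptOutcome =
  | "expired"
  | "failed"
  | "unsupported"
  | "missing-credentials";

interface AttemptSketch {
  provider: string;
  outcome: AttemptOutcome;
}

// Earlier entries win: expired > failed > unsupported > missing-credentials.
const OUTCOME_PRIORITY: AttemptOutcome[] = [
  "expired",
  "failed",
  "unsupported",
  "missing-credentials",
];

// Returns the most important outcome across all attempts, or null if none.
function pickDominantOutcome(attempts: AttemptSketch[]): AttemptOutcome | null {
  for (const outcome of OUTCOME_PRIORITY) {
    if (attempts.some((attempt) => attempt.outcome === outcome)) return outcome;
  }
  return null;
}
```

The user-facing message would then be chosen from the dominant outcome rather than from whichever provider happened to be tried last.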

New tRPC surface:
- `todoAgent.enhanceText` — `{ text, kind }` in, `{ text }` out.
  Throws TRPCError(INTERNAL_SERVER_ERROR, <japanese message>) on any
  failure so the renderer can surface it in a toast.

3. Cross-workspace session list for the Agent Manager view
----------------------------------------------------------

The existing `todoAgent.list` query is workspace-scoped. The
follow-up Agent-Manager-style view needs a single flat feed of all
TODO sessions grouped by workspace, so we can present something
closer to Antigravity's "all agents in one place" UX.

- `apps/desktop/src/main/todo-agent/session-store.ts` adds a new
  `TodoSessionListEntry` type (session fields + workspaceName,
  workspaceBranch, projectName) and a new `listAll()` method that
  LEFT JOINs `workspaces` and `projects` for those labels, filters
  out workspaces being deleted (`isNull(workspaces.deletingAt)`),
  and orders by `createdAt DESC`.
- `apps/desktop/src/main/todo-agent/trpc-router.ts` exposes
  `todoAgent.listAll` as a no-arg query returning that entry list.
- The existing `list` is kept for any callers that only need a single
  workspace and will continue to back the per-workspace badge count
  on the TODO button.

Housekeeping
------------
`.gitignore` now includes `.superset/todo/` so TODO agent runtime
artifacts (goal.md + any per-session state files) stay out of git.

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
- Migration generated via `bunx drizzle-kit generate` in
  packages/local-db (not hand-edited).
TODO autonomous agent sessions write goal.md and per-session state
files into `.superset/todo/<session-id>/` inside the worktree.
These are runtime scratch data, not source — keep them out of git.
Paired with the backend commit that introduced the directory.
…ew and AI enhance

Major UX reshape of the TODO autonomous agent surface, replacing the
drawer panel with a full-view Agent-Manager-style interface inspired
by Google Antigravity's Agent Manager, Cursor 2.0's agents sidebar,
and Factory Desktop's sessions layout. Clicking the TODO button now
opens a single-pane-of-glass view of every autonomous session across
every workspace, with session creation available from within.

Research backing this shape: [antigravity.google/docs/agent-manager],
[cursor.com/changelog/2-0], [factory.ai/product/desktop], and
[docs.devin.ai/release-notes] consistently converge on a 2-pane
layout — grouped session list + detail — with a primary "+ new"
button in the header and a workspace / project as the grouping axis.
Goal-optional UX is also standard across these tools (Antigravity
treats the initial prompt as the implicit goal, Devin derives session
titles from the first message).

TodoManager (new: TodoManager/TodoManager.tsx)
-----------------------------------------------

Full-screen Dialog, 95vw × 86vh, no background scroll. A header above two panes:

- **Header** (h-12) — "TODO Agent Manager" title, subtitle, primary
  `+ 新しい TODO` button that opens the existing TodoModal with the
  current workspaceId pre-filled, close button. No workspace switcher
  here — the list is already cross-workspace.

- **Sidebar** (300px) — a filter input at the top (title /
  description / workspace substring match) and a scrollable list
  grouped by workspace. Each group has a sticky, uppercase-label
  header showing "project / workspace" and the session count.
  Grouping is done client-side from the flat `listAll` feed so we
  do not pay N queries. Each row shows a status dot, title, and
  `status · iter N` subline. Status dot colors follow the same
  convention used elsewhere in the app — amber-pulse for
  running/verifying/preparing, emerald for done, rose for
  failed/escalated, muted for queued/aborted.

- **Detail pane** — metadata header (status dot + status label +
  workspace / project breadcrumb + title), action buttons (Start
  for queued, 中断 for active), DetailBlock sections for やって欲しい
  こと / ゴール / Verify / 予算 / 直近の結果 / 介入. ゴール shows
  "未指定 ·『やって欲しいこと』の完了をゴールとみなします" when
  the session was created without an explicit goal. Verify shows
  "単発モード(verify なし)" similarly. The intervene row is a
  single-line Input + Send button; submitting (Enter or Send) calls
  `todoAgent.sendInput`, which writes into the worker PTY.
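
The client-side grouping of the flat `listAll` feed can be sketched as below. `EntrySketch` is a trimmed-down, hypothetical stand-in for `TodoSessionListEntry`, and the `"?"` fallback label is an assumption.

```typescript
// Hypothetical, trimmed-down stand-in for TodoSessionListEntry.
interface EntrySketch {
  id: string;
  workspaceId: string;
  workspaceName: string | null; // nullable because of the LEFT JOIN
  projectName: string | null;
  title: string;
}

interface GroupSketch {
  workspaceId: string;
  label: string;
  sessions: EntrySketch[];
}

// One pass over the feed; Map keeps first-seen order, and each group keeps
// the feed's own order (createdAt DESC) for its sessions.
function groupByWorkspace(entries: EntrySketch[]): GroupSketch[] {
  const groups = new Map<string, GroupSketch>();
  for (const entry of entries) {
    let group = groups.get(entry.workspaceId);
    if (!group) {
      group = {
        workspaceId: entry.workspaceId,
        label: `${entry.projectName ?? "?"} / ${entry.workspaceName ?? "?"}`,
        sessions: [],
      };
      groups.set(entry.workspaceId, group);
    }
    group.sessions.push(entry);
  }
  return Array.from(groups.values());
}
```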

Start handler (migrated from the old TodoPanel, essentially verbatim):
creates a terminal tab via `useTabsStore.getState().addTab`, renames
it with `TODO: <title>`, builds the initial prompt with the same
goal-optional awareness as the supervisor
(`ゴール...を目指してください` vs `『やって欲しいこと』が完了した時点
で完了とみなしてください`), launches via the canonical
`buildAgentPromptCommand` from `@superset/shared/agent-command`, then
calls `todoAgent.attachPane` to hand the session to the supervisor.
On success it invalidates both `listAll` and the per-workspace
`list` so the badge count on the TODO button updates.

Uses the new `TodoSessionListEntry` type (moved from session-store
into `main/todo-agent/types.ts` so the renderer can import it with a
`type` import — types are stripped at runtime, so it is safe despite
the main-process file location).

TodoButton (simplified: TodoButton/TodoButton.tsx)
---------------------------------------------------

Previously had a split button with a dropdown offering "新しい TODO"
and "Open panel". Now a single compact button: click → open
TodoManager. Session creation moved entirely inside the Manager, so
the dropdown is gone — the primary affordance matches the user's
requested "click TODO → see what exists first, create from there"
flow. The active-sessions counter badge is retained so users still
see in-flight work at a glance from the PresetsBar.

TodoModal (TodoModal/TodoModal.tsx)
------------------------------------

Two changes:

- **Goal is now optional.** Drops the goal field from the submit
  validation (`canSubmit`), marks the field "(任意)" in its label,
  adds placeholder guidance "(空欄なら『やって欲しいこと』の完了
  をゴールとします)". Submit passes
  `goal: hasGoal ? goal.trim() : undefined`, so an empty field
  cleanly becomes a null in the DB: the zod transform yields
  `undefined` and the `create` mutation inserts `goal ?? null`.

- **AI enhance buttons on description and goal.** New `EnhanceButton`
  component lives under `TodoModal/components/EnhanceButton/` (one
  level of co-location since it is only used by TodoModal). It is
  a small sparkle/✨ ghost button rendered to the right of each
  Label, taking the current field value and a setter. Click calls
  `todoAgent.enhanceText` (the new mutation added in the backend
  commit) with `kind: "description" | "goal"`, and on success
  replaces the field value with the returned text. Uses
  `HiMiniSparkles` from react-icons with an `animate-pulse` while
  running, matches the common "Improve writing" UX pattern seen in
  Raycast AI, Linear AI, Notion AI, and v0. The button is disabled
  when the field is empty, so the rewrite pass always starts from
  text the user actually wrote.

Removed
-------

- `TodoPanel/TodoPanel.tsx` and `TodoPanel/index.ts` — fully
  superseded by TodoManager. No consumers remain.

Not in this commit
-------------------

- File tree / terminal / browser preview side pane (the Factory
  Desktop pattern). The Manager deliberately delegates the live
  worker view to the workspace's own terminal tab to keep the
  surface focused.
- Archive / delete / bulk operations. All rows are currently
  read-only except for Start / 中断 / Send.
- Artifact panel (TODO list / plan / diff viewer). Phase 3 item.
- Session pinning or status-based reorder. Current sort is
  createdAt DESC within each workspace group.

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
The Agent Manager view was rendering at ~512px wide instead of 95vw
because shadcn's `DialogContent` default classes include
`max-w-[calc(100%-2rem)] ... sm:max-w-lg` (see
`packages/ui/src/components/ui/dialog.tsx:64`). On any screen >= 640px
(i.e. the entire desktop target of this app) the `sm:max-w-lg` rule
lives inside a `@media (min-width: 640px)` block and overrides the
base-layer `max-w-none` I was passing in the override className.

tailwind-merge resolves conflicts per variant, not across variants —
it sees `max-w-none` and `sm:max-w-lg` as two distinct utilities and
keeps both. To actually nullify the sm-level cap I need to override
it with the same variant prefix.

Fix: add `sm:max-w-none` alongside the base `max-w-none` in the
DialogContent className. Everything else (w-[95vw], h-[86vh], p-0,
gap-0, overflow-hidden) stays.

Verified:
- `bun run typecheck` in apps/desktop — clean.
Previous commit expanded the Agent Manager to 95vw after discovering
the sm:max-w-lg override issue — but on a typical laptop that is
way too big and crowds the app chrome. Dial it back to a bounded
fixed width with a viewport cap.

- DialogContent width: `w-[1080px] max-w-[calc(100vw-4rem)]` with
  the matching `sm:max-w-[calc(100vw-4rem)]` override so shadcn's
  default `sm:max-w-lg` (512px) stays disabled. 1080px is roughly
  the Antigravity Agent Manager width on a 1440p monitor, and the
  `calc(100vw-4rem)` cap leaves 32px margins on narrower screens.
- DialogContent height: `h-[80vh] max-h-[760px]` so tall monitors
  get a reasonable cap instead of stretching near-full-screen.
- Dialog layout switched to `flex flex-col` and the content grid
  is now `flex-1 min-h-0` instead of `h-[calc(86vh-48px)]`. This
  adapts automatically whether the content is limited by the
  viewport or by max-h-[760px], and avoids the brittle subtraction
  math that breaks when the header height changes.
- Sidebar narrowed from 300px to 260px to give the detail pane
  more breathing room at the smaller overall width.

Verified:
- `bun run typecheck` in apps/desktop — clean.
Middle ground between the previous two attempts:
- 95vw (committed as 55a484c) was too wide on desktop
- 1080px × 80vh (f0031b3 / 25f0719) was too small

Settles at:
- width: `w-[1360px]` target with `max-w-[calc(100vw-2rem)]` cap
  (plus the matching `sm:max-w-[calc(100vw-2rem)]` to keep shadcn's
  default `sm:max-w-lg` from re-applying at the sm breakpoint).
  1rem margin on each side on narrower screens.
- height: `h-[85vh] max-h-[860px]` — a bit taller so the detail
  pane comfortably shows header + description + goal + verify +
  budget + verdict without scrolling for a typical session.
- sidebar: back to 300px (up from 260px) now that the overall
  width can afford it. Matches the original design intent and the
  Antigravity / Cursor sidebar widths the research reference had.

Verified: `bun run typecheck` in apps/desktop — clean (no code
paths changed, only class tokens).
…start, delete/rerun, collapsible groups

Seven related UX improvements for the Agent Manager view, addressing
the feedback from the first round of live usage:

1. Manager size: ~1.5×
-----------------------
Previous pass landed at `w-[1360px] h-[85vh] max-h-[860px]`. The
user found it too constrained for the cross-workspace view. Bumped
to `w-[2040px] max-w-[calc(100vw-2rem)] h-[92vh] max-h-[1290px]`
with sidebar widened from 300px to 340px to match. The width cap
means narrower laptops still get `viewport - 2rem` so it never
exceeds the screen.

2. "ターミナルを開く" button
----------------------------
New outline button in the detail-pane header whenever the session
has `attachedTabId`. Click → calls `useTabsStore.setActiveTab(
workspaceId, attachedTabId)` which makes the worker tab the active
tab in that workspace. If the session is in a *different* workspace
than the one the user is currently on, we still set the active tab
(so it sticks when the user navigates there) and show a toast
explaining they need to switch workspaces manually. Cross-workspace
navigation from the manager is a v2 item.

3. Background start
--------------------
Previously, clicking Start called `tabs.addTab(workspaceId)` which
also set the new tab as the active tab — stealing the user's focus
the moment they closed the dialog. Now the Start handler captures
`activeTabIds[workspaceId]` BEFORE calling addTab, and after the
launch finishes calls `setActiveTab(workspaceId, previousActiveTabId)`
to restore it. The worker keeps running in the background tab; the
user only sees it when they explicitly click "ターミナルを開く".
Toast message updated to "バックグラウンドで開始しました: <title>".
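
The capture-then-restore flow can be sketched against a tiny in-memory store. `TabStoreSketch` is a hypothetical stand-in for the real tabs store; only the `activeTabIds`, `addTab`, and `setActiveTab` names come from this message.

```typescript
// Hypothetical stand-in for the real tabs store.
interface TabStoreSketch {
  activeTabIds: Record<string, string | undefined>;
  addTab(workspaceId: string): string;
  setActiveTab(workspaceId: string, tabId: string): void;
}

function createTabStoreSketch(): TabStoreSketch {
  let counter = 0;
  const store: TabStoreSketch = {
    activeTabIds: {},
    addTab(workspaceId: string): string {
      counter += 1;
      const tabId = `tab-${counter}`;
      store.activeTabIds[workspaceId] = tabId; // addTab steals focus
      return tabId;
    },
    setActiveTab(workspaceId: string, tabId: string): void {
      store.activeTabIds[workspaceId] = tabId;
    },
  };
  return store;
}

// Capture the active tab BEFORE addTab, launch, then restore it.
function startInBackground(
  store: TabStoreSketch,
  workspaceId: string,
  launch: (tabId: string) => void,
): string {
  const previousActiveTabId = store.activeTabIds[workspaceId];
  const workerTabId = store.addTab(workspaceId);
  launch(workerTabId);
  if (previousActiveTabId !== undefined) {
    store.setActiveTab(workspaceId, previousActiveTabId);
  }
  return workerTabId;
}
```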

4. Collapsible workspace groups
-------------------------------
Sidebar group headers are now buttons. Each click toggles a local
`collapsedGroups: Set<string>` keyed by workspaceId. Collapsed
groups hide their session rows but keep showing the group header
with a chevron-right icon (vs chevron-down when expanded). State
is component-local — not persisted — which is fine for a dialog
that opens transiently.
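
A minimal sketch of the toggle logic behind `collapsedGroups`, written as a pure function so it can back a React state setter. The function name is hypothetical.

```typescript
// Returns a NEW set so React state updates see a changed reference.
function toggleCollapsed(
  collapsed: ReadonlySet<string>,
  workspaceId: string,
): Set<string> {
  const next = new Set(collapsed);
  if (next.has(workspaceId)) {
    next.delete(workspaceId);
  } else {
    next.add(workspaceId);
  }
  return next;
}
```

In a component this would typically be used as `setCollapsedGroups((prev) => toggleCollapsed(prev, workspaceId))`.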

5. Delete past sessions
-----------------------
New `todoAgent.delete(sessionId)` tRPC mutation:
- Calls `supervisor.abort` as a safety no-op (the supervisor's
  `abort()` is already idempotent for non-active sessions).
- `store.remove(sessionId)` drops the DB row via drizzle delete.
- Best-effort `rmSync(.superset/todo/<id>, { recursive: true,
  force: true })` wipes the artifact directory. Failure is
  logged but does NOT fail the mutation — DB is the source of
  truth.
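
The ordering and the "cleanup failure must not fail the mutation" rule can be sketched with injected dependencies. `DeleteDepsSketch` is hypothetical; in the real mutation these would be the supervisor, the session store, and an `rmSync` call.

```typescript
// Hypothetical dependency bundle so the flow can be shown without Electron.
interface DeleteDepsSketch {
  abort(sessionId: string): void;
  removeRow(sessionId: string): void;
  removeArtifacts(sessionId: string): void;
  logWarning(message: string): void;
}

function deleteSession(deps: DeleteDepsSketch, sessionId: string): { ok: true } {
  deps.abort(sessionId); // safety no-op for sessions that are not active
  deps.removeRow(sessionId); // the DB row is the source of truth
  try {
    deps.removeArtifacts(sessionId); // best effort only
  } catch (error) {
    deps.logWarning(`artifact cleanup failed for ${sessionId}: ${String(error)}`);
  }
  return { ok: true };
}
```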

UI: trash icon button in the detail header. First click switches
to an inline "本当に削除 / キャンセル" confirmation to avoid
accidental deletes. Disabled while the session is active (any
in-flight status past `queued`) so users cannot blow away a running
worker without an explicit abort.

6. Re-run past sessions
-----------------------
New `todoAgent.rerun(sessionId)` tRPC mutation: loads the source
session and creates a brand-new queued session that copies every
user-authored field (title, description, goal, verifyCommand,
maxIterations, maxWallClockSec, workspaceId, projectId) and resets
all execution state (status, phase, iteration, pane IDs, verdict,
completedAt). Calls `prepareArtifacts` so the new session has its
own `goal.md` under `.superset/todo/<new-id>/`.

UI: refresh-arrow icon button in the detail header, shown only for
"final" statuses (done / failed / escalated / aborted). On success
the new session appears at the top of the list via the listAll
refetch, and the user picks it up to Start.
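
The copy-and-reset behavior of rerun can be sketched as a pure function. `SessionSketch` is a trimmed-down, hypothetical row shape; the real table has more execution-state columns (phase, pane IDs, and so on).

```typescript
// Hypothetical, trimmed-down session row.
interface SessionSketch {
  id: string;
  title: string;
  description: string;
  goal: string | null;
  verifyCommand: string | null;
  maxIterations: number;
  maxWallClockSec: number;
  workspaceId: string;
  projectId: string | null;
  status: "queued" | "running" | "done" | "failed" | "escalated" | "aborted";
  iteration: number;
  verdictReason: string | null;
  completedAt: number | null;
}

function cloneForRerun(source: SessionSketch, newId: string): SessionSketch {
  return {
    // user-authored fields are copied verbatim
    title: source.title,
    description: source.description,
    goal: source.goal,
    verifyCommand: source.verifyCommand,
    maxIterations: source.maxIterations,
    maxWallClockSec: source.maxWallClockSec,
    workspaceId: source.workspaceId,
    projectId: source.projectId,
    // execution state is reset
    id: newId,
    status: "queued",
    iteration: 0,
    verdictReason: null,
    completedAt: null,
  };
}
```

Listing the fields explicitly, rather than spreading `source`, makes it harder to carry stale execution state into the new row by accident.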

7. "新しい TODO" stacks on top of the Manager
---------------------------------------------
Previously the TodoModal was rendered INSIDE TodoManager as a
sibling Dialog. Two shadcn/Radix Dialogs at the same level caused
click-outside / focus interactions to interfere — the modal would
sometimes close the Manager underneath it or focus the wrong
portal layer.

Fix: lift the modal up to TodoButton so both dialogs are rendered
as siblings at the TodoButton top level. TodoManager gains a new
`onRequestNewTodo: () => void` prop that it calls when the user
clicks "+ 新しい TODO"; TodoButton hooks that to its own
`setModalOpen(true)` state. The Manager stays open underneath;
the modal opens on top cleanly. `projectId` is now threaded through
TodoButton → TodoModal again.

Files touched
-------------
- `apps/desktop/src/main/todo-agent/session-store.ts`: new
  `remove(sessionId)` method wrapping drizzle delete.
- `apps/desktop/src/main/todo-agent/trpc-router.ts`: new `delete`
  and `rerun` mutations with artifact cleanup and prepareArtifacts
  plumbing. Imports `TODO_ARTIFACT_SUBDIR` from types.ts.
- `apps/desktop/src/renderer/features/todo-agent/TodoButton/TodoButton.tsx`:
  owns both Manager and Modal open state; passes `onRequestNewTodo`
  into Manager; renders TodoModal as a sibling Dialog.
- `apps/desktop/src/renderer/features/todo-agent/TodoManager/TodoManager.tsx`:
  resized, collapsible groups, terminal-jump + delete + rerun
  buttons in detail header, background-start active-tab restore,
  `onDeleted` callback to clear selection after delete.

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
…ew and real verdict text

Root-cause rewrite addressing three critical bugs from the first live-
usage round:

1. **"It is marked Done even though it is still running."** The
   idle-window heuristic mistook long-thinking / long-tool-running
   phases for turn completion and flipped sessions to `done` while
   Claude was still working.
2. **"The verdict is the fixed string 単発タスク完了... and Claude's
   final response is not visible."** The single-turn path wrote a
   static placeholder instead of the real final assistant message.
3. **"The TODO terminal shows up among the normal tabs."** The
   previous architecture created a real workspace terminal tab (via
   `useTabsStore.addTab`) to host the interactive claude PTY, so
   worker tabs leaked into the workspace tab bar.

Codex was consulted on the design (see the branch discussion), and
its recommendation was unambiguous: move to **headless Claude Code
(`claude -p --output-format stream-json`) instead of a PTY**. That
single change dissolves all three problems at once:

- completion judgment = child process exit. No more idle guessing.
- verdict text = `result.result` from the NDJSON stream. No more
  PTY ANSI scraping.
- no PTY → no tab store involvement → no leaked workspace tab.

Intentionally NOT passing `--bare`: per the installed claude 2.1.109
help text, `--bare` forces ANTHROPIC_API_KEY and refuses OAuth /
keychain reads, which would break the user's Claude Max auth. Running
without `--bare` keeps keychain OAuth working and we still get full
control over every argument.

Main-process backend (apps/desktop/src/main/todo-agent/*)
---------------------------------------------------------

- `supervisor.ts`: complete rewrite.
  - `runClaudeTurn(...)` spawns `claude -p --output-format stream-json
    --verbose --include-partial-messages --permission-mode acceptEdits
    [--resume <sessionId>] <prompt>` as a node `child_process.spawn`
    under the worktree cwd. No PTY.
  - Line-buffered NDJSON parser on stdout (`drainLines` + `handleLine`).
    Each parsed record is classified by `classifyStreamJson`:
      * `system/init` → captures `session_id` for `--resume`
      * `assistant` → extracts the text portion(s) and emits as a
        `assistant_text` event; tool uses become `tool_use` events
        with a one-line summary of command/file_path/pattern input
      * `user` → `tool_result` events for tool outputs (truncated)
      * `result` → captures `result` (final text), `total_cost_usd`,
        `num_turns`; promoted to DB columns
      * `error` → `error` event
      * unknown → `raw` fallback
  - Turn completion is process exit. If exit != 0 and no `result`
    seen, stderr tail becomes the verdictReason. Signal abort uses
    SIGINT then a 1.5s SIGKILL fallback.
  - Iteration loop preserved: verify fail → retry with `--resume
    claudeSessionId` + failure tail in prompt. The futility guard
    (3x same failing test → escalated), the wall-clock budget cap,
    and the iteration budget are all unchanged.
  - New `queueIntervention(sessionId, data)`: sets
    `pendingIntervention` on the DB row. Supervisor reads-then-clears
    it at each turn boundary and prepends it to the next prompt —
    this is the headless replacement for mid-stream PTY keystrokes.
  - Single-turn mode (no verify): one iteration, the final
    assistant text becomes the verdict, session goes to `done`.

- `session-store.ts`: adds per-session in-memory stream event buffer
  (capped at 500 events, ring-style trim from head). New
  `appendStreamEvents / getStreamEvents / clearStreamEvents /
  subscribeStream` APIs backed by EventEmitter. `remove` now also
  drops the stream buffer.

- `types.ts`: new `TodoStreamEvent`, `TodoStreamEventKind`, and
  `TodoStreamUpdate` types describing the condensed events the UI
  renders. Kept intentionally small (id, ts, iteration, kind, label,
  text) so tRPC IPC stays lightweight.

- `trpc-router.ts`:
  - `attachPane` removed. It made no sense in a paneless world.
  - New `start(sessionId)` mutation: validates the session is in a
    non-active state, flips to `preparing`, kicks off the supervisor
    loop fire-and-forget.
  - `sendInput` now has `queueIntervention` semantics: it writes to
    the `pending_intervention` DB column, which is consumed at the
    next turn boundary.
  - New `getStream(sessionId)` query for the initial paint.
  - New `subscribeStream(sessionId)` observable subscription for
    live stream events. trpc-electron constraint satisfied via
    `@trpc/server/observable` as documented in AGENTS.md.
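
The line-buffered NDJSON handling described for `supervisor.ts` can be sketched as follows. This is a reduced illustration, not the real `drainLines` / `classifyStreamJson`: it only covers a few record types, and the record field names (`type`, `subtype`, `session_id`, `result`) are assumed from the descriptions above.

```typescript
type StreamKind = "init" | "result" | "error" | "raw";

interface StreamEventSketch {
  kind: StreamKind;
  text: string;
}

// Splits off complete lines; the trailing partial line is carried over.
function drainLines(buffer: string): { lines: string[]; rest: string } {
  const parts = buffer.split("\n");
  const rest = parts.pop() ?? "";
  return { lines: parts.filter((line) => line.trim().length > 0), rest };
}

// Classifies one NDJSON line; anything unparseable falls back to "raw".
function classifyLine(line: string): StreamEventSketch {
  let record: unknown;
  try {
    record = JSON.parse(line);
  } catch {
    return { kind: "raw", text: line };
  }
  if (typeof record !== "object" || record === null) {
    return { kind: "raw", text: line };
  }
  const rec = record as Record<string, unknown>;
  if (rec.type === "system" && rec.subtype === "init") {
    return { kind: "init", text: String(rec.session_id ?? "") };
  }
  if (rec.type === "result") {
    return { kind: "result", text: String(rec.result ?? "") };
  }
  if (rec.type === "error") {
    return { kind: "error", text: line };
  }
  return { kind: "raw", text: line };
}

// Simulates stdout chunks arriving with arbitrary boundaries.
function feedChunks(chunks: string[]): StreamEventSketch[] {
  let pending = "";
  const events: StreamEventSketch[] = [];
  for (const chunk of chunks) {
    const drained = drainLines(pending + chunk);
    pending = drained.rest;
    for (const line of drained.lines) events.push(classifyLine(line));
  }
  return events;
}
```

Carrying the partial tail between chunks is what makes the parser safe against a JSON record being split across two `data` callbacks.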

Schema (packages/local-db/src/schema/todo-sessions.ts)
------------------------------------------------------

Five new columns:
- `claude_session_id` TEXT — captured from `system.init`; used as
  `--resume` key for retry iterations so the same conversation state
  persists across verify loops.
- `final_assistant_text` TEXT — the real final Claude response,
  captured from `result.result`. Replaces the static placeholder.
- `total_cost_usd` REAL, `total_num_turns` INTEGER — aggregated
  across iterations from `result` events; displayed in a "消費"
  block in the Manager detail pane.
- `pending_intervention` TEXT — the intervention queue.

Migration `0052_todo_headless_fields.sql` generated by drizzle-kit,
not hand-edited. The legacy `attached_pane_id` and `attached_tab_id`
columns and the old `verify`/`verdict` fields are retained for
backwards compat with existing rows; they are no longer written
by the supervisor.

Renderer (apps/desktop/src/renderer/features/todo-agent/TodoManager/TodoManager.tsx)
------------------------------------------------------------------------------------

Fully rewritten detail pane. Major changes:

- **No more `launchCommandInPane` / `addTab` / pane bookkeeping.**
  Start simply calls `todoAgent.start.mutate({ sessionId })` and
  lets the main-process supervisor do the work. Zero workspace
  tab bar involvement.

- **Live stream view INSIDE the Manager.** New `StreamView` +
  `StreamEventRow` components render the parsed events as colored
  bubbles:
  * assistant_text → primary border/background
  * tool_use → amber
  * tool_result → emerald
  * result → stronger emerald
  * error → rose
  * raw / user-prompt → neutral
  Each row shows `[iter N] label` + wall-clock HH:MM:SS + wrapped
  body text. Capped at 500 events client-side to mirror the server
  buffer. This is the "terminal inside the Manager" the user asked
  for — not a PTY but a structured, labeled view of exactly what
  Claude is doing, which is more useful than raw ANSI anyway.

- **Subscription wiring.** Selection changes reset the local event
  state; `todoAgent.getStream` paints the initial snapshot; then
  `todoAgent.subscribeStream` streams appends live.

- **Timing block.** New `TimingBlock` showing 作成 / 開始 / 終了 /
  実行時間 in a 4-column grid. Uses `formatTimestamp` (local
  wall-clock) and `formatDuration` (秒 / 分秒 / 時間分). Duration
  falls back to `Date.now()` when the session is still running so
  users see the live elapsed time.

- **最終回答 block.** Dedicated panel that shows
  `session.finalAssistantText` when present. This is the real last
  Claude response, stored directly from `result.result` — no more
  PTY scraping, no more static placeholder.

- **消費 block.** Shows `$X.XXXX` and `N turns` when cost / turn
  count are available from any `result` event.

- **Intervention.** The intervene input is retained but its
  semantics are now "次のターンに注入する指示". Button label is
  now キュー. Any pending intervention is shown below the input as
  "予約済み: ..." so the user can see what will reach Claude next.

- **Start / 中断 / 再実行 / 削除** actions are preserved. "Start" is
  enabled for fresh queued sessions as well as failed / aborted /
  escalated ones (so users can re-run a failed session in place).
  The "ターミナルを開く" button is gone; the live stream view
  replaces it.
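
The 500-event cap used by the stream view (mirroring the server-side buffer) amounts to "append, then trim from the head". A minimal sketch, with a hypothetical helper name:

```typescript
const MAX_STREAM_EVENTS = 500;

// Appends incoming events and drops the oldest ones beyond the cap.
// Returns a new array so it can be used directly as React state.
function appendCapped<T>(
  events: readonly T[],
  incoming: readonly T[],
  cap: number = MAX_STREAM_EVENTS,
): T[] {
  const merged = events.concat(incoming);
  return merged.length > cap ? merged.slice(merged.length - cap) : merged;
}
```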

.gitignore
----------
Adds `.claude/worktrees/` so nested Claude Code scratch dirs (some
of them are themselves git repositories) do not get sucked into
commits. Unrelated to TODO agent but landed in the same surgery.

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
- Migration generated via `bunx drizzle-kit generate` in
  packages/local-db (not hand-edited).

Sources
-------
- `claude -p / --output-format stream-json / --resume`:
  https://code.claude.com/docs/en/headless
- `--bare` ANTHROPIC_API_KEY requirement: local
  `claude --help` on installed CLI (v2.1.109), matching
  https://code.claude.com/docs/en/cli-reference
- Observable-only tRPC subscriptions:
  apps/desktop/AGENTS.md
…ise, abort race, timing tick)

Advisor review of 50c4641 flagged three blocking issues plus two
cheap UX nits. All five land in one commit since they are all in the
same two files and independently reviewable.

Blocking fixes
--------------

1. **`--permission-mode acceptEdits` can hang in headless `-p` mode.**
   `acceptEdits` auto-approves Edit/Write, but Bash tool calls still
   prompt for approval. In `-p` mode there is no one to grant that
   approval, so the child process would sit forever waiting for an
   answer that never arrives, and our Promise, tied to process
   `close`, would never resolve. The session stays in `running` with
   no way for the user to know it is dead.

   Changed to `--permission-mode bypassPermissions` in
   `runClaudeTurn()`. This is the correct mode for fully autonomous
   operation — the user already opted into "let Claude do whatever"
   by creating an autonomous TODO, so bypassing all permission
   checks is the right default. Added an inline comment explaining
   why `acceptEdits` is insufficient.

2. **Child `error` event without `close` left the Promise hung.**
   Spawn failures like ENOENT (claude binary missing from PATH) or
   EACCES are reported asynchronously via the `error` event AFTER
   the `spawn()` call returns. Node does not guarantee a subsequent
   `close` in every failure path, so the old implementation — which
   only set `errorText` in the error handler and only resolved in
   the close handler — could hang forever in exactly the production
   failure mode it was trying to report.

   Introduced a single-shot `settle()` helper with a `settled`
   boolean guard. Both the `error` and `close` handlers funnel
   through it, so whichever fires first cleans up the abort
   listener, drains any residual stdoutBuffer, and resolves the
   outer Promise. The `error` handler now also composes a
   user-facing reason prefix (`claude プロセスエラー: ...`) so the
   session flips to `failed` with an actionable verdictReason
   instead of vanishing into a `running` limbo.

3. **Abort race overwrote `aborted` with `escalated`.**
   The iteration loop's `if (ac.signal.aborted) break;` exited the
   while loop, but execution fell through to the unconditional
   final `store.update(... status: "escalated",
   verdictReason: "iteration 予算を使い切りました")`. The abort
   handler had already written `status: "aborted"`, so the final
   write silently mislabeled aborted sessions as escalated with the
   wrong reason.

   Wrapped the final update in `if (!ac.signal.aborted)` so the
   escalation verdict only lands when we exhausted the budget
   cleanly. Abort now wins the race deterministically.
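
The single-shot `settle()` guard from fix 2 can be sketched roughly as follows. This is a minimal illustration, not the actual `runClaudeTurn()` code: `waitForChild`, `ChildLike`, `TurnResult`, and the `onCleanup` callback are hypothetical names, and the structural `ChildLike` type stands in for the child returned by `spawn()`.

```typescript
// Hypothetical structural type: anything with an `on(event, listener)`
// method, which is the part of a spawned child this sketch relies on.
interface ChildLike {
  on(event: string, listener: (...args: any[]) => void): unknown;
}

// Hypothetical result shape for one turn.
interface TurnResult {
  exitCode: number | null;
  errorText: string | null;
}

// Resolves exactly once, on whichever of `error` / `close` fires first.
function waitForChild(
  child: ChildLike,
  onCleanup: () => void = () => {},
): Promise<TurnResult> {
  return new Promise<TurnResult>((resolve) => {
    let settled = false;
    const settle = (result: TurnResult): void => {
      if (settled) return; // any later event is ignored
      settled = true;
      onCleanup(); // e.g. remove the abort listener, flush buffered stdout
      resolve(result);
    };
    // Spawn failures (ENOENT, EACCES) may arrive here with no `close` after.
    child.on("error", (err: Error) =>
      settle({ exitCode: null, errorText: `claude プロセスエラー: ${err.message}` }),
    );
    child.on("close", (code: number | null) =>
      settle({ exitCode: code, errorText: null }),
    );
  });
}
```

Because both handlers funnel through `settle`, the cleanup runs once and the outer Promise cannot stay pending when only `error` is emitted.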

Cheap UX fixes
---------------

4. **TimingBlock 実行時間 was not ticking live.** The component only
   re-rendered when the session prop changed, which happens on the
   2-second `listAll` polling cadence. The 実行時間 counter could
   lag by up to 2 seconds behind the wall clock while a session was
   running.

   Added a 1-second `setInterval` in `SessionDetail` that forces a
   re-render via a throwaway `tick` state. The interval only runs
   while `session.completedAt == null` — it auto-stops the moment
   a session settles so finished rows do not pay re-render cost.

5. **`getStream` initial-paint query duplicated the subscription's
   own initial emit.** `subscribeStream` already sends the current
   in-memory buffer to new subscribers on connect, so the separate
   `todoAgent.getStream.useQuery` was delivering every event twice
   on mount. The client dedupe Set absorbed it, but it was wasted
   IPC and obscured the data-flow model.

   Removed the `getStream` call from `SessionDetail`. The
   subscription is now the single source of truth for stream
   events. (The server-side `getStream` route is left intact as a
   harmless read helper for potential future use.)

Non-blocking items intentionally deferred
------------------------------------------

- `pending_intervention` read-then-clear race (narrow window,
  harmless drop, user can re-queue).
- Queue drains after abort (probably intended behavior — the user
  aborted session A, not session B which was queued separately).
- Claude Code Stop hooks via `--settings` for a second completion
  source (v2 reliability boost).
- `--max-budget-usd` automatic cost cap.
- Stream events JSONL persistence for session replay.

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
…rvention and markdown stream

Four related UX fixes to the TODO Agent Manager detail view after
seeing it in action:

1. Live stream kept growing past the visible area and the
   intervention input fell off the bottom with no way to scroll to
   it.
2. Intervention / send button was not reachable once the stream had
   enough content to push past the viewport.
3. Right side of the dialog was mostly empty whitespace because the
   detail pane was capped at `max-w-5xl` (~1024px) inside a 2040px
   dialog, wasting ~700px of horizontal real estate.
4. Claude's assistant messages rendered as raw text with newlines
   instead of real markdown, so code blocks, lists, and headings all
   came through unformatted.

New SessionDetail layout
------------------------

Complete restructure from one flex-column-inside-scroll-area to a
three-region layout that owns its own scroll containers:

  ┌─────────────────────────────────────────────┐
  │ HEADER (fixed): title + status + actions +  │
  │                 timing block                │
  ├─────────────────┬───────────────────────────┤
  │ LEFT COL        │ RIGHT COL                 │
  │ (scroll,        │ (scroll, fills height)    │
  │  ~34% of pane)  │                           │
  │ - description   │ CLAUDE の応答 /           │
  │ - goal          │ ライブストリーム          │
  │ - verify/budget │                           │
  │ - 消費          │ [iter N] event bubbles    │
  │ - 最終回答      │ ...                       │
  │ - verify 失敗   │ (auto-scrolls)            │
  ├─────────────────┴───────────────────────────┤
  │ FOOTER (fixed): intervention input + hint   │
  └─────────────────────────────────────────────┘

Implementation:

- Outer ScrollArea wrapper removed from TodoManager around the
  detail slot. SessionDetail now claims the full grid cell with
  `flex flex-col h-full min-h-0` and manages its own internal
  scrolling. The TodoManager's 2-column grid already had
  `flex-1 min-h-0` so the flex math chains all the way up.
- `grid-cols-[minmax(380px,34%)_1fr]` in the body region: left
  column has a minimum of 380px for the metadata to stay readable
  and grows to 34% on wide dialogs; right column (1fr) soaks up
  the rest for the stream. On the current 1360–2040px dialog that
  gives the stream ~900–1350px of horizontal space — huge
  improvement over the previous 1024px cap.
- Left column is wrapped in ScrollArea so long descriptions /
  failure logs scroll independently without pushing the stream.
- Right column stream is its own flex flex-col with a small
  sticky-ish label row on top (shrink-0) and a flex-1 min-h-0
  StreamView beneath, so the stream ALWAYS fills the remaining
  height — this is what the user actually wanted to see.
- Footer is shrink-0 border-t so the intervention input is always
  anchored at the bottom of the pane no matter how much content
  piles up above it.

StreamView: self-scrolling + auto-pin to bottom
------------------------------------------------

The previous StreamView was a flex column with a fixed
`max-h-[50vh]` inner scroll that caused the weird double-scroll
behavior. It is now a single-container, self-owning scroll surface
(`h-full overflow-auto`) that fills whatever vertical space its
parent gives it.

Added auto-scroll-to-bottom with a pin-to-bottom ref so users who
have scrolled up to read earlier output do not get yanked back down
on every new event:
- `pinnedToBottomRef` starts `true`.
- `onScroll` recomputes distance-from-bottom; pin flips to `false`
  when the user scrolls more than 40px up and back to `true` when
  they return near the bottom.
- A useEffect on events.length scrolls to the bottom only when
  `pinnedToBottomRef.current` is true. New events never interrupt
  a scroll-up read.
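
The pin decision above reduces to a distance-from-bottom check. A minimal sketch, assuming the 40px threshold from the description; `ScrollMetrics` and `isNearBottom` are illustrative names, not identifiers from the component:

```typescript
// Threshold taken from the description above.
const PIN_THRESHOLD_PX = 40;

// The three DOM scroll properties the check needs (hypothetical wrapper type).
interface ScrollMetrics {
  scrollTop: number;
  scrollHeight: number;
  clientHeight: number;
}

// True while the viewport is within `threshold` px of the bottom edge.
function isNearBottom(
  m: ScrollMetrics,
  threshold: number = PIN_THRESHOLD_PX,
): boolean {
  const distanceFromBottom = m.scrollHeight - m.scrollTop - m.clientHeight;
  return distanceFromBottom <= threshold;
}
```

In an `onScroll` handler the result would be written to `pinnedToBottomRef.current`, and the effect keyed on `events.length` would scroll only when that ref is `true`.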

Markdown rendering for Claude's responses
------------------------------------------

StreamEventRow now branches on event.kind:

- `assistant_text` and `result` events go through the shared
  `MarkdownRenderer` at `renderer/components/MarkdownRenderer`,
  which wraps react-markdown with remark-gfm, rehype-raw, and
  rehype-sanitize. `scrollable: false` so it expands naturally
  inside the event bubble.
- `tool_use`, `tool_result`, `error`, and raw log lines stay in a
  plain `whitespace-pre-wrap font-mono` div so command strings,
  file paths, and stack traces keep their literal layout.

The "最終回答" block on the left column also uses MarkdownRenderer
now so the summary view renders identically to the in-stream final
message.

No new dependencies: react-markdown, remark-gfm, rehype-raw, and
rehype-sanitize were already apps/desktop deps, and the shared
MarkdownRenderer component already handles image safety, selection
menu, and theming. Zero surgery on the ui package.

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
…opy buttons, rounded, flex layout

Five related improvements to the Agent Manager surface after another
round of live feedback on `bd0cd2cb6`:

1. **Footer clipped under heavy content**
   The previous commit tried to pin the intervention input with a
   `shrink-0 border-t` footer but still used a CSS grid
   (`grid-cols-[minmax(380px,34%)_1fr]`) for the body region. In
   some Chromium layout passes, grid rows inside a flex parent did
   not compute their height from `flex-1 min-h-0` cleanly, so when
   the stream view had enough content it pushed the footer out of
   view and the user could not reach the input or hint.

   Fix: converted the body region to **flex** (`flex flex-1
   min-h-0` with a sized left column and a `flex-1 min-w-0 min-h-0`
   right column). Flex chains height resolution deterministically
   from `DialogContent` → TodoManager body → SessionDetail body →
   StreamView. The footer is guaranteed visible regardless of
   content height. Left column width is `w-[34%] min-w-[360px]
   max-w-[520px]` so metadata stays readable on narrow screens and
   does not hog space on wide ones.

2. **Sidebar collapse toggle**
   New chevron button in the TodoManager header at the top-left.
   Clicking it toggles `sidebarCollapsed: boolean` local state. The
   sidebar wrapper is always mounted (no content remount flash on
   reopen); its width transitions between `w-[320px]` and `w-0`
   over 150ms with `overflow-hidden` so the collapsed state is
   truly invisible and gives the detail pane the full remaining
   width. `border-r-0` is applied when collapsed so the lingering
   border does not create a 1px sliver on the left.

3. **Row kebab menu with rename / re-run / delete**
   Every `SessionRow` in the sidebar now has an
   `HiMiniEllipsisVertical` button in its top-right corner
   (`opacity-0 group-hover:opacity-100` so it does not clutter
   idle rows). It opens a DropdownMenu with:
     - **リネーム** — starts inline rename via a small Input that
       swaps in for the title, autoFocused, with Enter to commit
       and Escape to cancel. Blur also commits. The new trpc
       `todoAgent.updateTitle` mutation validates 1..200 chars,
       writes `title` + `updatedAt` on the DB row, and invalidates
       `listAll` + per-workspace `list` so the row re-renders.
     - **タイトルをコピー** — copies just the session title to the
       clipboard via the shared `copyToClipboard` helper.
     - **同じ内容で再実行** — calls the existing `todoAgent.rerun`
       mutation (already added in 70bce0d) to clone the session
       with a fresh queued row.
     - **削除** — if the session is currently active, aborts it
       first (idempotent no-op otherwise), then calls `todoAgent.
       delete`. Clears selection in the parent via `onDeleted` so
       the detail pane does not keep showing a dead row.
   Rows are now rounded (`rounded-lg`), use
   `px-1.5 py-1 gap-0.5` spacing in each group, and hover states
   match the rest of the app's sidebar.

4. **Copy buttons on content blocks**
   New `<CopyIconButton value title label />` helper component:
   small rounded-md ghost button with `HiMiniDocumentDuplicate`.
   Writes the value to the system clipboard via
   `navigator.clipboard.writeText` (same pattern
   `ProblemsView.tsx:184` uses) and surfaces a toast.
   Wired into:
     - **`<DetailBlock>`** via a new optional `action` prop that
       renders in the header row. Used on "最終回答" and "直近の
       verify 失敗ログ" so the user can grab the full content
       without manually selecting.
     - **`StreamEventRow`** header: button is
       `opacity-0 group-hover:opacity-100` next to the timestamp
       so every event bubble can be copied with one click without
       visual noise during read.
   Final answer and failure log containers are now
   `rounded-lg border border-border/40 bg-muted/40` to match the
   stream event bubbles and the rest of the app's card style.

5. **Rounded polish to match app design language**
   - DialogContent: `rounded-xl` on the dialog itself
   - All header buttons: `rounded-md`
   - Sidebar filter input: `rounded-md`
   - Stream event bubbles: `rounded-lg`
   - Final answer / failure log containers: `rounded-lg`
   - Session rows: `rounded-lg`
   - Copy buttons / kebab button: `rounded-md`

Backend
-------
- New `todoAgent.updateTitle(sessionId, title)` mutation.
  Validates via zod (1..200 chars, trimmed), throws NOT_FOUND if
  the session is missing, otherwise updates just the title via the
  existing `sessionStore.update` helper (which also bumps
  `updatedAt` and emits the state event for any live
  subscriptions).

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
…ative dates and panel-left icon

Five fixes in one round after seeing the previous polish commit in use:

1. **最終回答 + verify 失敗ログ still overflowed under the footer.**
   The flex restructure in c5cd524 helped, but the left column
   was still wrapped in shadcn `<ScrollArea>`. ScrollArea's
   internal height plumbing (viewport h-full inside a flex-col-
   inside-flex-col chain) did not always converge in this layout,
   letting the content push the pinned intervention input off
   the dialog.

   Replaced the left column's ScrollArea with a plain
   `<div className="... min-h-0 overflow-y-auto">`. Added
   `overflow-hidden` to the SessionDetail root, its body flex
   div, and the right column + StreamView wrapper so any child
   that somehow grows beyond its allotment gets clipped instead
   of pushing siblings. This is belt-and-suspenders on top of the
   existing flex math and has been the more reliable pattern for
   "pinned footer + scrolling body" in Electron/Chromium.

2. **Stream history was lost when a session was not currently
   running.** Events only lived in-memory (ring-buffered at 500
   entries) and were cleared on each new `runSession`, so past
   sessions — including ones from a previous app launch — showed
   an empty stream view.

   Persistence in `session-store.ts`:
   - `appendStreamEvents` now also appends to
     `{artifactPath}/stream.jsonl` via `appendFileSync`, line per
     event, full event shape (id, ts, iteration, kind, label,
     text). The artifact dir is already per-session and gets
     cleaned up on delete via the existing `rmSync(recursive:
     true)`, so persistence and cleanup stay coupled.
   - `getStreamEvents` now falls back to
     `loadStreamEventsFromDisk` when the in-memory buffer is
     empty, parsing the JSONL and validating each line
     defensively (malformed lines are silently skipped).
   - No DB / schema changes. No new dependencies. Works on app
     restart.

   The tRPC `subscribeStream` observable already seeds new
   subscribers by calling `getStreamEvents()`, so historical
   events flow through the existing subscription path without any
   client changes. Past sessions just "come back to life" when
   selected.

3. **Row kebab button was aligned to the top and would collide
   with the new relative date at bottom-right.** Restructured
   `SessionRow` from "absolute-positioned kebab over the row" to
   "flex row with a dedicated kebab column": the main button is
   `flex-1 min-w-0` and the kebab trigger sits in a sibling
   `<div className="flex items-center pr-1">` at the right edge.
   `items-center` vertically centers the kebab relative to the
   full two-line row, and placing it in its own column guarantees
   zero overlap with content inside the button.

4. **Relative time display on each sidebar row.** Added
   `formatRelativeTime(ms)` helper: 今 / N分前 / N時間前 / N日前
   / Nヶ月前 / N年前. Rendered at bottom-right of each row in
   `text-[10px] text-muted-foreground tabular-nums`. The status
   label gets `flex-1` so it takes whatever horizontal room is
   left after the date, and the date stays stuck to the right
   edge. Kebab is in its own column so hover-revealing it never
   covers the date.

5. **Sidebar toggle icon.** Replaced the makeshift chevron with
   the lucide `LuPanelLeftOpen` / `LuPanelLeftClose` pair from
   `react-icons/lu` — the same icon the app's other panel
   toggles use and the one the user pointed at in screenshot
   10. No new dep: `react-icons` was already used heavily
   (WorkspaceListItem, SearchDialog, etc.) so this is just
   picking a different icon from the same bundle.
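
For item 2 above, the defensive JSONL replay could look roughly like the sketch below. The field names come from the event shape listed there; the field types, and the names `StreamEvent`, `isStreamEvent`, and `parseStreamJsonl`, are assumptions made for illustration rather than the real `loadStreamEventsFromDisk` code.

```typescript
// Field names from the commit message; the types are assumed.
interface StreamEvent {
  id: string;
  ts: number;
  iteration: number;
  kind: string;
  label: string;
  text: string;
}

// Runtime shape check so a corrupt line cannot leak into the UI.
function isStreamEvent(value: unknown): value is StreamEvent {
  if (typeof value !== "object" || value === null) return false;
  const e = value as Record<string, unknown>;
  return (
    typeof e.id === "string" &&
    typeof e.ts === "number" &&
    typeof e.iteration === "number" &&
    typeof e.kind === "string" &&
    typeof e.label === "string" &&
    typeof e.text === "string"
  );
}

// Parses the contents of a stream.jsonl file, one event per line.
// Blank, malformed, or wrongly shaped lines are skipped silently.
function parseStreamJsonl(raw: string): StreamEvent[] {
  const events: StreamEvent[] = [];
  for (const line of raw.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // e.g. a half-written last line after a crash
    }
    if (isStreamEvent(parsed)) events.push(parsed);
  }
  return events;
}
```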

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
… working tree, per-file diff)

Adds a third pane to the TODO Agent Manager that shows exactly what
the worker produced in a given session — git commits made since the
session started, current working tree state, and unified diffs for
any selected file or commit.

Why scoped to the session
-------------------------

The existing `ChangesView` right sidebar in the rest of the app is
hard-coded to show diffs against the default branch HEAD, which
conflates "what this TODO did" with "everything the user has been
working on in this worktree". To get a clean per-session view we
capture the git HEAD SHA the instant the supervisor starts a run
and use it as the range base for `git log <sha>..HEAD`. Commits
made by the user before the session started are excluded by
construction.

Schema
------

- `packages/local-db/src/schema/todo-sessions.ts`: new nullable
  `startHeadSha: text("start_head_sha")` column. Nullable so
  existing rows and freshly-created `queued` sessions without a
  captured base do not break.
- `packages/local-db/drizzle/0053_todo_start_head_sha.sql`:
  drizzle-kit generated `ALTER TABLE ADD COLUMN` migration. SQLite
  applies `ADD COLUMN` without rewriting existing rows.

Backend
-------

- `apps/desktop/src/main/todo-agent/git-status.ts` (new):
  - `getCurrentHeadSha(cwd)` — thin wrapper used by the supervisor
    at run start.
  - `getSessionGitSnapshot({ cwd, startHeadSha })` — runs
    `rev-parse --abbrev-ref HEAD`, `rev-parse HEAD`,
    `log <startSha>..HEAD --format=...` (when start SHA is set
    and different from current), `status --porcelain=v1
    --untracked-files=all`, and `rev-list --left-right --count
    HEAD...@{u}` to return branch / commit list / working-tree
    files (with stage distinction staged/unstaged/untracked) /
    ahead-behind counters.
  - `getSessionFileDiff({ cwd, startHeadSha, path, scope,
    commitSha })` — unified diffs for four scopes: `session`
    (startSHA..HEAD for that path), `staged` (git diff --cached),
    `unstaged` (git diff), `commit` (git show for a single
    commit's changes to that path).
  - All calls go through `execGitWithShellPath` from
    `lib/trpc/routers/workspaces/utils/git-client` so PATH is
    resolved the same way the rest of the app's git layer does.
    Read-only only; no mutations.
- `apps/desktop/src/main/todo-agent/supervisor.ts`: at the top of
  `runSession`, capture `const startHeadSha = await
  getCurrentHeadSha(worktreePath)` and pass it to the initial
  `store.update({ ..., startHeadSha })`. Done before the claude
  subprocess is even spawned so the range anchor is accurate even
  if the run fails immediately.
- `apps/desktop/src/main/todo-agent/trpc-router.ts`:
  - New `todoAgent.gitSnapshot({ sessionId })` query.
  - New `todoAgent.gitFileDiff({ sessionId, path, scope,
    commitSha })` query.
  - Both resolve the worktree path via the existing
    `resolveWorktreePath(workspaceId)` and delegate to the helper.
  - Also threaded `startHeadSha: null` through the existing
    `create` and `rerun` store.insert calls to satisfy the new
    NOT-OPTIONAL-at-insert type shape.
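
The staged / unstaged / untracked split mentioned for `getSessionGitSnapshot` can be derived from the two status columns of `git status --porcelain=v1`. The sketch below is an illustration under assumptions: `parsePorcelainV1` and `WorkingTreeFile` are hypothetical names, and paths that git quotes (special characters) are not unquoted here.

```typescript
type Stage = "staged" | "unstaged" | "untracked";

// Hypothetical row shape for the working-tree list.
interface WorkingTreeFile {
  path: string;
  stage: Stage;
  status: string; // single status letter such as M, A, D, R, or ?
}

// Each porcelain v1 line is "XY <path>": X is the index (staged) status,
// Y is the working-tree (unstaged) status, and "??" marks untracked files.
function parsePorcelainV1(output: string): WorkingTreeFile[] {
  const files: WorkingTreeFile[] = [];
  for (const line of output.split("\n")) {
    if (line.length < 4) continue; // need "XY " plus at least one path char
    const x = line.charAt(0);
    const y = line.charAt(1);
    let path = line.slice(3);
    // Renames and copies are printed as "old -> new"; keep the new path.
    if (x === "R" || x === "C" || y === "R" || y === "C") {
      const arrow = path.indexOf(" -> ");
      if (arrow >= 0) path = path.slice(arrow + 4);
    }
    if (x === "?" && y === "?") {
      files.push({ path, stage: "untracked", status: "?" });
      continue;
    }
    if (x === "!") continue; // ignored entries, only present with --ignored
    if (x !== " ") files.push({ path, stage: "staged", status: x });
    if (y !== " ") files.push({ path, stage: "unstaged", status: y });
  }
  return files;
}
```

A file that is partly staged (`MM`) yields two rows, one per stage, which fits a UI that selects file and stage together.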

Renderer
--------

- `apps/desktop/src/renderer/features/todo-agent/TodoManager/
  ChangesSidebar/ChangesSidebar.tsx` (new):
  - Header row with branch name + spinning refresh icon that
    invalidates both the snapshot and the currently selected diff
    query.
  - "開始時 HEAD" block showing `startSha.slice(0,12) →
    currentSha.slice(0,12)` and ahead/behind counts when set.
  - "コミット (N)" collapsible section listing each new commit
    with shortSha, subject, author, short relative date. Click
    selects the commit and loads its diff via `gitFileDiff`
    scope=`commit`.
  - "ワーキングツリー (N)" collapsible section listing staged,
    unstaged, and untracked files. Small colored status badge
    (M/A/D/R/? with amber/emerald/rose/primary/muted colors).
    Click selects file+stage and loads its diff via `gitFileDiff`
    scope=`staged`|`unstaged`. Untracked files are shown but not
    clickable for diff (no diff target).
  - `DiffBlock` — monospace `<pre>` renderer with color-coded
    lines (+/-, hunk headers, file headers). Wrapped in a
    `max-h-[50vh] overflow-auto` so long diffs are scrollable
    inside the sidebar without pushing other sections off-screen.
- `apps/desktop/src/renderer/features/todo-agent/TodoManager/
  TodoManager.tsx`:
  - New `changesSidebarCollapsed` state, held in component
    local state.
  - New `LuPanelRightOpen` / `LuPanelRightClose` toggle button
    placed before the close (×) button in the header.
  - Body flex now renders a third column after the detail pane:
    `shrink-0 border-l min-h-0 overflow-hidden transition-[width]`
    that swaps between `w-[380px]` and `w-0 border-l-0`. Mounts
    `<ChangesSidebar sessionId workspaceId active />` where
    `active` is true for queued/preparing/running/verifying so
    the polling only runs while something meaningful can change.
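
The color coding in `DiffBlock` comes down to classifying each line of a unified diff by its prefix. A rough sketch, with `DiffLineKind` and `classifyDiffLine` as illustrative names rather than identifiers from the component:

```typescript
type DiffLineKind = "file-header" | "hunk-header" | "added" | "removed" | "context";

// Classifies one line of unified diff output by prefix. File headers are
// tested before "+" / "-" because "+++ b/x" and "--- a/x" share those
// prefixes. Known limitation of this sketch: an added line whose content
// starts with "++ " (or a removed one starting with "-- ") is reported as
// a file header.
function classifyDiffLine(line: string): DiffLineKind {
  if (
    line.startsWith("diff --git ") ||
    line.startsWith("index ") ||
    line.startsWith("+++ ") ||
    line.startsWith("--- ")
  ) {
    return "file-header";
  }
  if (line.startsWith("@@")) return "hunk-header";
  if (line.startsWith("+")) return "added";
  if (line.startsWith("-")) return "removed";
  return "context";
}
```

Each kind would then map to a class name on the rendered line inside the `<pre>`.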

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
…se events

Two small but visible polish items from the review.

1. **AI enhance buttons now render as pure icon**

The little ✨ next to the やって欲しいこと / ゴール fields in the TODO
creation modal used to render as `[✨ AI]` with the text label taking
more horizontal room than the input fields had to spare. Now it is a
single 24×24 ghost button holding only `HiMiniSparkles`; the running
state swaps to `animate-pulse` on the icon instead of a text change.
Tooltip carries both "AI で書き換える" and "AI で書き換え中…" for
state clarity.

File: `apps/desktop/src/renderer/features/todo-agent/TodoModal/
components/EnhanceButton/EnhanceButton.tsx`

2. **Setup-phase events rendered in the Manager live stream**

The live stream was empty until Claude actually started producing
output. Users had no signal for what the supervisor was doing
during the (usually subsecond but sometimes notable) boot window:
resolving the worktree, capturing the git HEAD, deciding the run
shape. The stream now paints these upfront as
`kind: "system_init"` events with iteration 0 (so they visually
anchor before any turn-1 events):

- "セットアップ — ワークスペースを解決しています…"
- "worktree — <absolute path>"
- "開始時 HEAD — <12-char sha>"
- "verify — <command>"  OR  "モード — 単発タスク(外部 verify なし)"
- "予算 — N iter · M 分"
- "Claude — claude -p --output-format stream-json を起動します"

Emitted via a new thin `appendSetupEvent(sessionId, label, text)`
helper in `supervisor.ts` that wraps
`getTodoSessionStore().appendStreamEvents` so the setup events
flow through the existing in-memory + JSONL persistence + live
subscription pipeline for free.

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
Adds an optional "新しい worktree を作成してそこで実行する" checkbox
to the TODO creation modal. When checked, submit runs a two-step
flow:

1. Create a new workspace for the same project via the existing
   `workspaces.create` tRPC mutation, passing the TODO title +
   description joined as the `prompt` field. `workspaces.create`
   already handles AI branch name generation
   (`generateBranchNameFromPrompt` in
   `workspaces/utils/ai-branch-name.ts`) and AI workspace auto-name
   (`attemptWorkspaceAutoRenameFromPrompt` in
   `workspaces/utils/ai-name.ts`) internally, both going through
   the same `callSmallModel` path the TODO text enhancer uses, so
   the naming stays consistent across features without touching
   any of the workspace-creation plumbing.
2. Take the newly-created `workspace.id` and thread it into the
   usual `todoAgent.create` mutation as the target workspaceId.
   The TODO session is now scoped to the fresh worktree and,
   when Start is clicked, the supervisor captures that worktree's
   git HEAD as `startHeadSha` so the Changes right sidebar
   automatically reflects "everything this TODO produced" from
   the first commit onward.

UI (TodoModal)
--------------

- New bordered "card" below the title field with a shadcn
  `Checkbox` plus a short label and explanation. Uses the
  `bg-primary/5 border-primary/40` treatment when checked so the
  user sees at a glance that they are entering a different mode.
- Checkbox is disabled (`opacity-60`) and shows a muted
  explanation when the current workspace has no projectId. This
  happens for workspaces that predate the projects table or for
  branch-type workspaces without a project binding — the tRPC
  mutation requires `projectId`, so we fail fast with a clear
  message instead of letting the request explode.
- Added a small ✨ `HiMiniSparkles` next to the label to signal
  that the naming is AI-driven, matching the visual language of
  the per-field enhance buttons that now also render as
  icon-only sparkles.
- `reset()` in the modal clears the checkbox too so reopen never
  sticks the previous mode.

Flow (handleSubmit)
-------------------

- Default path (checkbox off): unchanged. Uses the current
  workspaceId directly.
- New worktree path: calls `workspaces.create.mutateAsync`, waits
  for the `{ workspace, ... }` result, extracts `workspace.id`
  as `targetWorkspaceId`, then proceeds with the existing TODO
  create mutation against that id. On success, toast says "新し
  い worktree を作成して TODO セッションを紐付けました" so the
  user knows both operations completed.
- Errors from either mutation surface via the existing toast
  fallback — e.g. "このワークスペースにはプロジェクトが紐付いて
  いないので新しい worktree を作成できません" when a user somehow
  manages to submit with the checkbox enabled but no projectId.
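
The two-step flow above can be sketched with the two mutations injected as plain async functions. Everything here is illustrative: `submitTodo`, `SubmitDeps`, the input shapes, and the blank-line separator used to join title and description are assumptions, not the real tRPC signatures.

```typescript
// Hypothetical stand-ins for `workspaces.create` and `todoAgent.create`.
interface SubmitDeps {
  createWorkspace(input: { projectId: string; prompt: string }): Promise<{ workspace: { id: string } }>;
  createTodo(input: { workspaceId: string; title: string; description: string }): Promise<{ id: string }>;
}

interface SubmitArgs {
  currentWorkspaceId: string;
  projectId: string | null;
  useNewWorktree: boolean;
  title: string;
  description: string;
}

async function submitTodo(deps: SubmitDeps, args: SubmitArgs): Promise<{ id: string }> {
  let targetWorkspaceId = args.currentWorkspaceId; // default path: unchanged
  if (args.useNewWorktree) {
    if (args.projectId === null) {
      // Fail fast instead of sending a request that cannot succeed.
      throw new Error(
        "このワークスペースにはプロジェクトが紐付いていないので新しい worktree を作成できません",
      );
    }
    // Step 1: create the workspace; its id becomes the TODO target.
    const { workspace } = await deps.createWorkspace({
      projectId: args.projectId,
      prompt: `${args.title}\n\n${args.description}`, // separator is assumed
    });
    targetWorkspaceId = workspace.id;
  }
  // Step 2: the usual TODO create, against whichever workspace was chosen.
  return deps.createTodo({
    workspaceId: targetWorkspaceId,
    title: args.title,
    description: args.description,
  });
}
```

If step 1 rejects, step 2 never runs, so a failed workspace creation does not leave a TODO row pointing at a missing worktree.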

Also in this commit
-------------------

- `cn` import added to TodoModal for the new conditional classes
  on the checkbox card.
- `HiMiniSparkles` import added for the inline label icon.

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
…oModal

Two requests from the review:

1. **Reusable system prompt presets** that users can attach to new
   TODOs at creation time, managed from a Settings row at the bottom
   of the Agent Manager's left sidebar.
2. **TodoModal is too text-heavy** — simplified the copy so the form
   matches the visual density of the rest of the app.

Schema + migration
------------------

- `packages/local-db/src/schema/todo-prompt-presets.ts` (new): new
  `todo_prompt_presets` table (id, name, content, createdAt,
  updatedAt) with `name` and `updatedAt` indexes.
- `packages/local-db/src/schema/todo-sessions.ts`: new nullable
  `custom_system_prompt` column. Selected preset content is copied
  into this column at session create time so later preset edits do
  not retroactively change a session that has already run.
- `packages/local-db/src/schema/index.ts` + `schema.ts`: re-export
  the new table so drizzle-kit picks it up via the existing root.
- `packages/local-db/drizzle/0054_todo_prompt_presets.sql`:
  auto-generated migration (CREATE TABLE + ALTER TABLE ADD COLUMN).

Backend
-------

- `apps/desktop/src/main/todo-agent/types.ts`:
  - `todoCreateInputSchema` gains optional `customSystemPrompt`
    (trimmed, max 20k, empty→undefined).
  - `todoPresetCreateInputSchema` and `todoPresetUpdateInputSchema`
    new zod shapes for the CRUD endpoints.
- `apps/desktop/src/main/todo-agent/supervisor.ts`:
  - `runClaudeTurn` params gain `customSystemPrompt: string | null`.
  - When present it is threaded into the spawned claude args as
    `--append-system-prompt <content>`. This composes with the
    iteration prompt + `--resume` so every turn in the session
    inherits the steering without re-injecting it in every prompt.
  - The per-turn call site in `runSession` reads the session row
    at turn boundary and passes `currentSession.customSystemPrompt
    ?? null`.
- `apps/desktop/src/main/todo-agent/trpc-router.ts`:
  - `create` now persists `input.customSystemPrompt ?? null` on
    the new DB column.
  - `rerun` now copies `source.customSystemPrompt` into the clone
    so re-running preserves the steering.
  - New nested `todoAgent.presets` router with:
    * `list` query (orderBy updatedAt desc)
    * `create` mutation (inputs: name 1..120, content 1..20k)
    * `update` mutation (inputs: id + name + content)
    * `delete` mutation (inputs: id; returns ok boolean)
  - All mutations run against `localDb` via drizzle directly —
    presets are a tiny kv-ish table, no caching needed.
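
How `customSystemPrompt` might reach the spawned process can be sketched as an argv builder. The flags are the ones named in these commit messages (`-p`, `--output-format stream-json`, `--permission-mode bypassPermissions`, `--resume`, `--append-system-prompt`); the function name, input shape, and argument order are assumptions, and the real `runClaudeTurn` may pass more.

```typescript
// Hypothetical input for one turn.
interface TurnArgsInput {
  prompt: string;
  resumeSessionId: string | null; // set from the second turn onward
  customSystemPrompt: string | null; // copied from the preset at create time
}

// Builds the argv array for one `claude` turn. Returning an array (rather
// than a single shell string) means the prompt text needs no shell quoting.
function buildClaudeArgs(input: TurnArgsInput): string[] {
  const args = [
    "-p",
    input.prompt,
    "--output-format",
    "stream-json",
    "--permission-mode",
    "bypassPermissions",
  ];
  if (input.resumeSessionId) {
    args.push("--resume", input.resumeSessionId);
  }
  if (input.customSystemPrompt) {
    args.push("--append-system-prompt", input.customSystemPrompt);
  }
  return args;
}
```

A `null` or empty preset adds nothing, so sessions created without a preset spawn exactly as before.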

Renderer
--------

- `apps/desktop/src/renderer/features/todo-agent/TodoManager/
  PresetsDialog/PresetsDialog.tsx` (new):
  - Full `Dialog` at 960×80vh with a 2-column layout: list of
    presets on the left, edit form on the right.
  - "新規プリセット" button at the top of the sidebar resets the
    draft state and clears selection.
  - Selecting a row populates the draft; editing flips a
    `dirty` flag that gates the save button.
  - Save routes to `create` or `update` depending on whether the
    draft has an id; success toast on both paths.
  - Delete uses the inline "本当に削除 / キャンセル" confirm
    pattern already established in the SessionRow kebab menu.
- `apps/desktop/src/renderer/features/todo-agent/TodoManager/
  TodoManager.tsx`:
  - `presetsDialogOpen` state + mounted `<PresetsDialog>` as a
    sibling Dialog inside the existing outer Dialog so it stacks
    on top of the Manager the way `<TodoModal>` does.
  - Left sidebar gains a `shrink-0 border-t` footer row with a
    "設定 / プリセット" button using `HiMiniCog6Tooth`. Clicking
    it opens PresetsDialog. The row mirrors the compact ghost-
    link styling of the existing row controls.

TodoModal simplification + preset picker
----------------------------------------

- Removed the 5-line `DialogDescription` entirely. Users reached
  the feature through the button's tooltip; the modal body needs
  to carry only what is actionable.
- Title placeholder: "例: Issue #123 のログインリダイレクト問題を
  修正" → "例: Issue #123 を修正" (half the width, same intent).
- Replaced the two-line "new worktree" card with a single-row
  label that renders as a checkbox-styled button: "新しい
  worktree を作成して実行" with a sparkle icon on the right.
  Description text was the biggest offender; cut entirely. The
  disabled state still shows via the muted opacity treatment.
- Description placeholder: long sentence → "やってほしい作業を
  書く".
- Goal: "(任意)" moved into a compact `text-[10px]` suffix on
  the label; placeholder shortened to "完了条件(空欄可)".
  Textarea rows dropped from 3 to 2.
- Verify: same treatment. Placeholder → "例: bun test". Removed
  the two-line explanation block below it entirely.
- New "システムプロンプト (任意)" row hosts a `PresetPicker`
  trigger that renders the selected preset name + an inline
  clear (×) button when set. Dropdown shows the full preset
  list with name + first ~2 lines of content as a preview, plus
  a "選択を解除" footer row and a hint when no presets exist.
  Selected preset content is read at submit time and passed as
  `customSystemPrompt` to the create mutation.
- All form inputs gained `rounded-md` to match the rest of the
  app.

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
…sist, atomic create, stranded cleanup)

Four fixes from the code-review round (rv-pr #181). All four were
classified "修正推奨" (fix recommended) after Codex pruned the
false positives.

Q3: abort guard after runVerify
--------------------------------

In the iteration loop, if the user pressed 中断 while `runVerify` was
still executing, the verify child process died with an AbortError
and returned `{ passed: false, log: "AbortError: ..." }`. The very
next line wrote that "verify failed" verdict to the DB before the
loop's `break` on `ac.signal.aborted` could fire, so aborted
sessions ended up labeled "aborted" but carried a bogus "verify
failed: AbortError..." trail in the UI.

Fix in `supervisor.ts`: a one-liner `if (ac.signal.aborted) return;`
between `await runVerify(...)` and `appendVerifyEvent(...)`. Once
the user has aborted, we do not record the terminated verify at
all — `abort()` has already written the clean `aborted` state.
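
A minimal sketch of the guard, under stated assumptions: `verifyStep`, the `VerifyResult` shape, and the injected callbacks are hypothetical names for illustration, not the actual contents of `supervisor.ts`. It shows the early return sitting between the verify call and the event write:

```ts
// Hypothetical shapes; the real supervisor types are not part of this text.
type VerifyResult = { passed: boolean; log: string };

async function verifyStep(
	// Anything with an `aborted` flag works, including a real AbortSignal.
	signal: { readonly aborted: boolean },
	runVerify: () => Promise<VerifyResult>,
	appendVerifyEvent: (result: VerifyResult) => void,
): Promise<"aborted" | "passed" | "failed"> {
	const result = await runVerify();
	// The abort may have landed while verify was still running. In that case
	// the result only describes the killed child process, so do not record it.
	if (signal.aborted) return "aborted";
	appendVerifyEvent(result);
	return result.passed ? "passed" : "failed";
}
```

Because the check runs after the `await`, it also covers aborts that arrive at any point during the verify command, not only ones that happened before it started.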

Q4: stream persistence no longer blocks the main process
--------------------------------------------------------

`persistStreamEvents` used to run per event:
- a synchronous `localDb.select()` to fetch the session row just for
  `artifactPath` (even though it never changes during a run)
- a synchronous `fs.appendFileSync` for the JSONL append

Claude's stream fires dozens to hundreds of events per turn, so
the main-process event loop was blocked for several ms per event
for the duration of a run. In Electron the main process handles IPC
for the renderer, so tab switches, tRPC calls, and terminal writes
would visibly stutter.

Rewritten in `session-store.ts`:

- New `artifactPathCache: Map<sessionId, absolutePath>`. The
  supervisor calls `store.setArtifactPathCache(sessionId,
  session0.artifactPath)` at the top of `runSession`, which also
  pre-`mkdirSync`s the directory exactly once. `persistStreamEvents`
  now reads from the cache; the DB fallback is kept only for
  historical-session replay outside of an active run.
- `appendFileSync` → `appendFile` from `node:fs/promises`. Async I/O
  so the main process thread stays free.
- New `persistQueues: Map<sessionId, Promise<void>>` chains the
  per-session appends so bursty events do not race and write out of
  order. Each subsequent append awaits the previous one via
  `.then(...)`; failures are swallowed with a console.warn so one
  bad append cannot poison the chain.

Net effect: main-thread time per event drops by 10-100x, events in
`stream.jsonl` stay in order, and the renderer no longer stutters
while a worker is chattering.
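
The per-session chaining described above can be sketched roughly as follows. This is an illustration only: `createPersistQueue` and the `Append` type are hypothetical, and `append` stands in for whatever async writer is used (the text names `appendFile` from `node:fs/promises`).

```ts
// Hypothetical writer signature; in the real code this role is played by an
// async file append.
type Append = (sessionId: string, line: string) => Promise<void>;

function createPersistQueue(append: Append) {
	// One promise "tail" per session; each new append waits for the previous one.
	const tails = new Map<string, Promise<void>>();

	return function enqueue(sessionId: string, line: string): Promise<void> {
		const prev = tails.get(sessionId) ?? Promise.resolve();
		const next = prev
			.then(() => append(sessionId, line))
			// Swallow the failure so one bad append cannot poison the chain.
			.catch((err: unknown) => {
				console.warn("stream append failed", err);
			});
		tails.set(sessionId, next);
		return next;
	};
}
```

Appends for different sessions stay independent because each session has its own tail; only writes to the same session are serialized.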

Q8: atomic create — no more half-written PENDING rows
-----------------------------------------------------

Previously, `todoAgent.create` / `rerun`:
1. `store.insert({ ..., artifactPath: ".superset/todo/PENDING" })`
2. `prepareArtifacts(session)` — computes real path, mkdir, write
   goal.md
3. `store.update(id, { artifactPath })`

If the app crashed between steps 1 and 2 (or 2 and 3), the DB was
left with a row whose `artifactPath` was literally
`.superset/todo/PENDING`. The next time the user clicked Start on
that row, the supervisor would try to read/write inside a bogus
directory and fail inscrutably. It was also a latent correctness
hazard for any downstream that assumed artifactPath was an absolute
path.

Split `prepareArtifacts` into two responsibilities:

- New `TodoSupervisor.computeArtifactPath({ sessionId, workspaceId
  })` — pure path calculation, throws if the workspace has no
  resolvable path. No fs side-effects.
- Existing `prepareArtifacts(session)` now expects the session's
  `artifactPath` to already be set and simply `mkdirSync`s the
  directory + writes `goal.md`.

`trpc-router.ts` `create` / `rerun` now:
1. Generate a UUID up front (`randomUUID`)
2. `computeArtifactPath` with that UUID
3. `store.insert({ id, ..., artifactPath })` — one shot, final value
4. `prepareArtifacts(session)` to materialize the directory

A crash anywhere in this flow can still leave a queued session with
no artifact dir, but never a row with a broken synthetic path. And
since `prepareArtifacts` is now idempotent on the directory itself,
Start can recover by recreating the dir on first use if needed.
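
As a rough sketch of the new ordering, assuming an in-memory `Map` in place of the real store and a made-up directory layout (`<workspace>/.superset/todo/<id>` is an assumption, as are all function and type names below):

```ts
import { randomUUID } from "node:crypto";
import { join } from "node:path";

// Hypothetical row shape for illustration.
type SessionRow = {
	id: string;
	workspaceId: string;
	artifactPath: string;
	status: "queued";
};

// Pure path calculation: no fs side effects, throws if the workspace has no path.
function computeArtifactPath(
	workspacePaths: Map<string, string>,
	workspaceId: string,
	sessionId: string,
): string {
	const root = workspacePaths.get(workspaceId);
	if (!root) throw new Error(`workspace ${workspaceId} has no resolvable path`);
	return join(root, ".superset", "todo", sessionId);
}

function createSession(
	rows: Map<string, SessionRow>,
	workspacePaths: Map<string, string>,
	workspaceId: string,
	prepareArtifacts: (row: SessionRow) => void,
): SessionRow {
	const id = randomUUID(); // 1. id is known before anything is written
	const artifactPath = computeArtifactPath(workspacePaths, workspaceId, id); // 2.
	const row: SessionRow = { id, workspaceId, artifactPath, status: "queued" };
	rows.set(id, row); // 3. single insert, final artifactPath
	prepareArtifacts(row); // 4. mkdir + goal.md, safe to repeat later
	return row;
}
```

If path calculation throws, nothing has been inserted yet, so no placeholder value can ever reach the table.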

A2: stranded-session sweep on startup
-------------------------------------

If the previous process died while a session was `preparing`,
`running`, or `verifying`, the in-memory supervisor state is gone
but the DB row is still there. The UI would render that session
forever as "running" with a pulsing amber dot, and the only way
out was to delete it manually.

`TodoSessionStore`'s constructor now runs a single-shot
`rehydrateStrandedSessions()` that UPDATEs any row in those three
non-terminal states to `status: "failed"` + a clear verdictReason
("前回の実行が中断されました(アプリ再起動)。再実行するか削除して
ください。") + `completedAt: Date.now()`. Only sessions with
`queued` status are left alone — those never started, so they
should stay queueable.

Constructor-time is the right injection point because the store is
a lazy singleton (`getTodoSessionStore()`), so the sweep runs
exactly once on first access — typically before the Agent Manager
has mounted.
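
A simplified sketch of the sweep, operating on an in-memory array rather than the real database. The type names and the English placeholder message are assumptions for illustration; the status values are the ones named in this PR.

```ts
type SweepStatus =
	| "queued"
	| "preparing"
	| "running"
	| "verifying"
	| "done"
	| "failed"
	| "aborted"
	| "escalated";

// Hypothetical row shape.
type SweepRow = {
	id: string;
	status: SweepStatus;
	verdictReason?: string;
	completedAt?: number;
};

// Non-terminal states that imply a supervisor was driving the session.
const STRANDED_STATUSES: ReadonlySet<SweepStatus> = new Set<SweepStatus>([
	"preparing",
	"running",
	"verifying",
]);

function rehydrateStrandedSessions(rows: SweepRow[], now: number): number {
	let swept = 0;
	for (const row of rows) {
		// `queued` rows never started, so they stay queueable.
		if (!STRANDED_STATUSES.has(row.status)) continue;
		row.status = "failed";
		row.verdictReason = "Previous run was interrupted (app restart).";
		row.completedAt = now;
		swept += 1;
	}
	return swept;
}
```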

Verified
--------
- `bun run typecheck` in apps/desktop — clean.
CI Lint was failing with 17 errors / 4 warnings on
`feat/todo-autonomous-agent`. Ran `bun run lint:fix`, which
auto-fixed 14 files' worth of Biome formatting / import ordering /
style nits, then resolved the 2 remaining errors manually.

Auto-fixed (14 files):
- apps/desktop/src/main/todo-agent/enhance-text.ts
- apps/desktop/src/main/todo-agent/git-status.ts
- apps/desktop/src/main/todo-agent/index.ts
- apps/desktop/src/main/todo-agent/session-store.ts
- apps/desktop/src/main/todo-agent/trpc-router.ts
- apps/desktop/src/main/todo-agent/types.ts
- apps/desktop/src/lib/trpc/routers/index.ts (Biome reordered the
  fork-local TodoAgent import next to the other `main/*` import)
- apps/desktop/src/renderer/features/todo-agent/TodoManager/
  PresetsDialog/PresetsDialog.tsx
- apps/desktop/src/renderer/features/todo-agent/TodoModal/
  TodoModal.tsx
- apps/desktop/src/renderer/features/todo-agent/TodoModal/
  components/EnhanceButton/EnhanceButton.tsx
- apps/desktop/src/renderer/screens/main/components/
  WorkspaceView/ContentView/components/PresetsBar/PresetsBar.tsx
  (just the Biome import-grouping rewrite triggered by the new
  TodoButton import)
- a few more minor whitespace/formatting-only touches

Manual fixes:

1. **noUnusedFunctionParameters** in `ChangesSidebar.tsx`: the
   `workspaceId` prop was declared but never read. Removed it from
   both the `ChangesSidebarProps` interface and the TodoManager
   call site. The component only needs `sessionId` + `active`;
   workspace scoping is already handled server-side via the
   session row lookup inside the `gitSnapshot` query.

2. **noControlCharactersInRegex** in `supervisor.ts`
   `guessFailingTest`: the ANSI stripper used `/\x1b\[[0-9;]*m/g`
   to remove ESC-based color escapes from verify command output.
   Biome flags `\x1b` literals as suspicious (they often land in
   regexes by mistake). Stripping real ANSI escapes is the entire
   point here, so:
   - Switched the escape to the equivalent Unicode form
     `\u001B` (same byte, less alarming to Biome's default
     pattern).
   - Added a `biome-ignore` with an explanation so a future
     contributor can see at a glance that the control char is
     intentional.
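
For illustration, the stripper might look roughly like this. `stripAnsiColors` is a hypothetical name; the source only states that the regex lives inside `guessFailingTest`, and the exact wording of the ignore comment is an assumption.

```ts
// biome-ignore lint/suspicious/noControlCharactersInRegex: matching the ESC byte is intentional, this strips ANSI color codes
const ANSI_COLOR = /\u001B\[[0-9;]*m/g;

// Removes SGR color sequences such as ESC[31m and ESC[0m from command output.
function stripAnsiColors(output: string): string {
	return output.replace(ANSI_COLOR, "");
}
```

The pattern only covers color (SGR) sequences ending in `m`; cursor-movement and other escape sequences are left untouched.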

Verified
--------
- `bun run lint` in apps/desktop — clean.
- `bun run typecheck` in apps/desktop — clean.
@MocA-Love MocA-Love marked this pull request as ready for review April 15, 2026 20:17

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 72e446b474


Comment thread apps/desktop/src/main/todo-agent/git-status.ts Outdated
Comment thread apps/desktop/src/main/todo-agent/supervisor.ts Outdated
Comment thread apps/desktop/src/renderer/features/todo-agent/TodoManager/TodoManager.tsx Outdated
Automated review on commit 72e446b flagged three issues. All three
were verified against the code and are real bugs; this commit fixes
them.

1. [P1] Commit-scope file diffs returned blank (git-status.ts)
--------------------------------------------------------------

`getSessionFileDiff` with `scope: "commit"` unconditionally appended
`-- <path>` to the `git show` command. The ChangesSidebar sets
commit-row selections with `path: ""` because commit clicks are not
bound to a specific file — they should show the whole commit's
patch. With an empty path the command became
`git show --format= <sha> -- ""`, which Git rejects with "empty
string is not a valid pathspec". `gitOut` swallows the failure and
returns an empty string, so commit diffs silently rendered blank in
the UI.

Fix: only append `--` + path when the path is non-empty. When
`path` is the empty string (the "whole commit" case) we now emit
`git show --format= <sha>` which returns the full patch for every
file the commit touched. File-scoped commit diffs still work when
the caller actually provides a path.
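
A minimal sketch of the argument construction, assuming a hypothetical helper name (`buildGitShowArgs`); the real `getSessionFileDiff` is not shown here:

```ts
// Builds the argv for `git show`; an empty path means "the whole commit".
function buildGitShowArgs(sha: string, path: string): string[] {
	const args = ["show", "--format=", sha];
	// Only add a pathspec when there is one; `-- ""` is rejected by Git.
	if (path !== "") args.push("--", path);
	return args;
}
```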

2. [P1] Queue drain revived aborted sessions (supervisor.ts)
------------------------------------------------------------

`TodoSupervisor.start()` drains `this.queue` in a while loop once
the active run finishes. If the user aborted (or deleted) a
session while it was still waiting in the queue, its sessionId
stayed in `this.queue` unchanged. When the active run finished
the drain loop popped the aborted sessionId and ran it anyway,
reviving an already-terminal session.

Two complementary fixes:

- `abort(sessionId)` now proactively removes the sessionId from
  `this.queue` via `splice(queueIdx, 1)` before touching the
  active run, so the drain loop never sees it again.
- The drain loop now re-reads the session row from the store after
  popping each id and `continue`s past any row whose status is
  already terminal (`aborted` / `failed` / `done` / `escalated`).
  This catches the abort race plus any status change made by
  another code path (`delete`, `rerun`) while the id was waiting.
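
The two fixes can be sketched together as below. This is a synchronous simplification with hypothetical function names; the real drain loop awaits each run and reads status from the session store.

```ts
type QueueStatus =
	| "queued"
	| "preparing"
	| "running"
	| "verifying"
	| "aborted"
	| "failed"
	| "done"
	| "escalated";

const TERMINAL_STATUSES: ReadonlySet<QueueStatus> = new Set<QueueStatus>([
	"aborted",
	"failed",
	"done",
	"escalated",
]);

// Fix 1: abort removes the id from the waiting queue up front.
function removeFromQueue(queue: string[], sessionId: string): void {
	const queueIdx = queue.indexOf(sessionId);
	if (queueIdx !== -1) queue.splice(queueIdx, 1);
}

// Fix 2: the drain loop re-reads status and skips anything already terminal.
function drainQueue(
	queue: string[],
	readStatus: (id: string) => QueueStatus | undefined,
	run: (id: string) => void,
): void {
	while (queue.length > 0) {
		const id = queue.shift();
		if (id === undefined) break;
		const status = readStatus(id);
		// Deleted row or terminal status: never revive it.
		if (status === undefined || TERMINAL_STATUSES.has(status)) continue;
		run(id);
	}
}
```

Either fix alone closes the reported bug; keeping both means a status change made by some other path while the id was waiting is still caught at drain time.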

3. [P2] SessionDetail stream events leaked across selections
------------------------------------------------------------

The effect that resets `streamEvents` on selection change had
`[]` as its deps array, so it only ran once on initial mount.
`SessionDetail` is reused across selections (the parent just
swaps its `session` prop), so when the user clicked a different
row the previous session's events stayed in state and got
appended to the new session's subscription deliveries — the live
stream panel showed a mix of two runs.

Fix: deps are now `[session.id]` so the reset fires on every
selection change. Added a `biome-ignore lint/correctness/
useExhaustiveDependencies` comment since `session.id` is a
reset-on-change dep, not a value read inside the body — Biome
cannot see the difference and would otherwise strip it.

Verified
--------
- `bun run lint` in apps/desktop — clean.
- `bun run typecheck` in apps/desktop — clean.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 12

🧹 Nitpick comments (2)
packages/local-db/drizzle/0052_todo_headless_fields.sql (1)

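3-3: Recommend storing `total_cost_usd` in integer minor units rather than as `REAL`.

If costs are later aggregated or compared, floating-point values tend to introduce rounding error. Storing them as INTEGER (e.g. micro-USD or cents) is safer.

Example: a precision-oriented schema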
-ALTER TABLE `todo_sessions` ADD `total_cost_usd` real;--> statement-breakpoint
+ALTER TABLE `todo_sessions` ADD `total_cost_microusd` integer;--> statement-breakpoint
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/local-db/drizzle/0052_todo_headless_fields.sql` at line 3, the
current ALTER TABLE adds `total_cost_usd` to `todo_sessions` as REAL, but
monetary amounts should be stored in integer minor units, so change the column
type from REAL to INTEGER and regenerate the migration: the target is
`todo_sessions.total_cost_usd`; change the schema to ALTER TABLE ... ADD
`total_cost_usd` INTEGER NOT NULL DEFAULT 0 (or adjust nullability and the
default to match requirements), and implement matching app-side logic that
converts USD to cents or micro-units on write (*100 or *1_000_000) and converts
back on read.
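
The write/read conversion suggested above could look roughly like this. The helper names are hypothetical, and the micro-USD unit is just one of the two options the comment mentions:

```ts
const MICRO_PER_USD = 1_000_000;

// Convert a USD float to integer micro-USD before storing it.
function usdToMicroUsd(usd: number): number {
	return Math.round(usd * MICRO_PER_USD);
}

// Convert stored micro-USD back to USD for display.
function microUsdToUsd(microUsd: number): number {
	return microUsd / MICRO_PER_USD;
}
```

Sums and comparisons are then done on the integer values, and the conversion back to a float happens only at the display edge.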
packages/local-db/src/schema/schema.ts (1)

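459-462: Consolidating the re-exports in one place would make maintenance easier.

packages/local-db/src/schema/index.ts also re-exports the same modules, so managing the public surface in one place lowers the risk of future conflicts.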

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/local-db/src/schema/schema.ts` around lines 459 - 462, The two
re-export lines (export * from "./todo-prompt-presets"; and export * from
"./todo-sessions";) should be removed from this schema file and consolidated
into the single public re-export barrel (index.ts) so all schema exports are
managed in one place; update the central index.ts to re-export both
todo-prompt-presets and todo-sessions (and add a brief comment explaining the
barrel) and remove the duplicate exports here to avoid future conflicts.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/desktop/plans/todo-agent-plan.md`:
- Line 32: Several fenced code blocks in the Markdown (the triple-backtick
blocks) are missing language identifiers causing markdownlint MD040; update each
problematic code fence (the three backtick blocks referenced in the review) to
include an appropriate language tag such as "text" or "ts" (e.g., replace ```
with ```text or ```ts) so linting passes; ensure you update all occurrences
mentioned in the review comment so each code fence has a language specifier.
- Around line 13-20: The implementation model described in the plan (the bullets
"ライブ可視性", "信頼性", "逐次実行", "upstream とのマージ容易性") conflicts with the PR's actual
design (headless Claude NDJSON streaming + verify exit-code completion); update
todo-agent-plan.md to either rewrite these sections to describe the headless
NDJSON stream + verify-exit-code flow used by the current codebase (including
removing or adjusting PTY-resident/idle-detection language) or clearly mark the
document as obsolete and point to the new implementation notes; ensure the
revised text names the implemented mechanisms (NDJSON stream, verify exit code)
so future readers are not misled.

In `@apps/desktop/src/main/todo-agent/enhance-text.ts`:
- Around line 84-101: In describeEnhanceFailure, add explicit handling for
attempts whose outcome === "empty-result" (SmallModelAttempt) before falling
through to generic messages; return a clear Japanese message indicating the
model call succeeded but produced an empty response (e.g.,
"モデルは応答しましたが空の結果でした。再試行してください。") so users can distinguish "empty response" from
other failures and make retry decisions.

In `@apps/desktop/src/main/todo-agent/supervisor.ts`:
- Around line 156-163: The in-memory clear (store.clearStreamEvents(sessionId))
leaves the append-only persisted stream.jsonl intact causing old events to be
reloaded on retries; update supervisor startup to also truncate or rotate the
persisted stream file before a new run. Add or call a store-level method (e.g.
store.truncateStreamFile(sessionId) or store.rotateStreamFile(sessionId, runId))
before priming the cache (before/around the
store.setArtifactPathCache(sessionId, session0.artifactPath)) so the persisted
stream for the same sessionId is either truncated or writes go to a run-specific
file to avoid mixing previous run events. Ensure the new method is implemented
in the store backend to atomically truncate or rename the existing stream.jsonl
for that sessionId.
- Around line 83-93: The bug: aborted sessions remain in this.queue so start()
later pulls them and runSession() executes them; fix by 1) updating
abort(sessionId) to remove that id from this.queue (e.g., this.queue =
this.queue.filter(id => id !== sessionId)) and 2) adding a defensive guard at
the top of runSession(sessionId) to check the session's terminal status
(aborted/completed) and return immediately if terminal; reference methods:
abort, start, runSession, and the this.queue field.
- Around line 482-485: The spawn call creating `child = spawn("claude", args, {
cwd: params.cwd, env: process.env })` uses the raw process.env which fails when
Electron is launched from Finder; replace it to use the shell-resolved
environment helper (the same helper used for the git implementation, e.g.
resolveShellEnv/getShellEnvironment) so PATH and other shell startup changes are
respected before spawning `claude` (and apply the same change to the similar
spawn at the other location around lines 730-734); update the spawn options to
pass the resolved env object instead of process.env.

In `@apps/desktop/src/main/todo-agent/trpc-router.ts`:
- Around line 106-120: The router calls enhanceTodoText with two args but the
actual function signature is enhanceTodoText({ sessionId, kind, text }); fix by
making the router and schema match that signature: add sessionId to
todoEnhanceTextInputSchema and call enhanceTodoText({ sessionId:
input.sessionId, kind: input.kind, text: input.text }) (or alternatively change
enhanceTodoText to accept (text, kind) and update its callers); ensure the TRPC
input type and the callsite use the same shape and update any related
imports/types (enhanceTodoText and todoEnhanceTextInputSchema) accordingly.

In
`@apps/desktop/src/renderer/features/todo-agent/TodoManager/ChangesSidebar/ChangesSidebar.tsx`:
- Around line 45-55: The diffQuery call is sending path: "" when a commit is
selected which violates the server validator for gitFileDiff
(z.string().min(1)); update the argument construction in
electronTrpc.todoAgent.gitFileDiff.useQuery to branch based on selected.scope
(e.g., if selected?.scope === "commit" send the payload without path and include
commitSha/scope accordingly, otherwise include path as before), or populate a
non-empty path when required so the client payload matches the gitFileDiff input
schema; adjust the selected-based conditional used to build the query args in
ChangesSidebar (diffQuery) to ensure scope === "commit" uses the API shape the
router expects.

In `@apps/desktop/src/renderer/features/todo-agent/TodoManager/TodoManager.tsx`:
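- Around line 280-284: The SessionDetail component is reused as the same
instance when a different session is selected, so it carries over the previous
session's streamEvents, input, and delete-confirmation state. Give SessionDetail
a unique key where it is rendered to force a remount; specifically, add
key={selected.id} to SessionDetail where the current selected is passed
(referencing selected / selected.id), and keep the existing onDeleted handler
that calls setSelectedId(null) as is.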

In
`@apps/desktop/src/renderer/features/todo-agent/TodoModal/components/EnhanceButton/EnhanceButton.tsx`:
- Around line 55-64: Enhance the accessibility of the icon-only button in the
EnhanceButton component by adding an explicit aria-label to the Button (use the
existing title prop or fallback to running ? "AI で書き換え中…" : "AI で書き換える") and
mark the HiMiniSparkles icon as non-interactive for assistive tech (e.g., add
aria-hidden="true" and focusable={false} to the icon element); update the Button
JSX where onClick={handleClick}, disabled={disabled}, title={...} is set to
include aria-label and update the HiMiniSparkles usage to include aria-hidden
and focusable props.

In `@apps/desktop/src/renderer/features/todo-agent/TodoModal/TodoModal.tsx`:
- Around line 123-150: The code currently calls
createWorkspaceMut.mutateAsync(...) to create a worktree and then
create.mutateAsync(...) to create a todo session, leaving an orphaned workspace
if the second call fails; update the flow so both operations are atomic: either
(A) move the worktree + todo creation into a single main-process mutation on the
backend (preferred) so one server-side transaction handles both, or (B)
implement compensation logic in the caller around createWorkspaceMut.mutateAsync
and create.mutateAsync — after createWorkspaceMut.mutateAsync returns a
result.workspace.id, call create.mutateAsync(...) and if that fails, reliably
delete the newly created workspace via the matching workspace delete API (use
the same workspace id returned), and surface the original error; reference
createWorkspaceMut.mutateAsync, create.mutateAsync, and targetWorkspaceId when
locating the code to change.
- Around line 355-383: The nested button inside the DropdownMenuTrigger must be
removed; in TodoModal replace the inner clear <button> (the one rendered when
selected) with a non-button interactive element (e.g., a <span> or <div> with
role="button" and tabIndex={0}) and wire its click and keyboard handlers to call
onSelect(null) while calling e.preventDefault()/e.stopPropagation() to avoid
triggering the parent trigger; keep the same classes, title="解除", and accessible
keyboard handling (Enter/Space) so the clear control remains focusable and
accessible without nesting a button inside the DropdownMenuTrigger's button.


📥 Commits

Reviewing files that changed from the base of the PR and between c9418eb and 72e446b.

📒 Files selected for processing (40)
  • .gitignore
  • apps/desktop/plans/todo-agent-plan.md
  • apps/desktop/src/lib/trpc/routers/index.ts
  • apps/desktop/src/main/todo-agent/enhance-text.ts
  • apps/desktop/src/main/todo-agent/git-status.ts
  • apps/desktop/src/main/todo-agent/index.ts
  • apps/desktop/src/main/todo-agent/session-store.ts
  • apps/desktop/src/main/todo-agent/supervisor.ts
  • apps/desktop/src/main/todo-agent/trpc-router.ts
  • apps/desktop/src/main/todo-agent/types.ts
  • apps/desktop/src/renderer/features/todo-agent/TodoButton/TodoButton.tsx
  • apps/desktop/src/renderer/features/todo-agent/TodoButton/index.ts
  • apps/desktop/src/renderer/features/todo-agent/TodoManager/ChangesSidebar/ChangesSidebar.tsx
  • apps/desktop/src/renderer/features/todo-agent/TodoManager/ChangesSidebar/index.ts
  • apps/desktop/src/renderer/features/todo-agent/TodoManager/PresetsDialog/PresetsDialog.tsx
  • apps/desktop/src/renderer/features/todo-agent/TodoManager/PresetsDialog/index.ts
  • apps/desktop/src/renderer/features/todo-agent/TodoManager/TodoManager.tsx
  • apps/desktop/src/renderer/features/todo-agent/TodoManager/index.ts
  • apps/desktop/src/renderer/features/todo-agent/TodoModal/TodoModal.tsx
  • apps/desktop/src/renderer/features/todo-agent/TodoModal/components/EnhanceButton/EnhanceButton.tsx
  • apps/desktop/src/renderer/features/todo-agent/TodoModal/components/EnhanceButton/index.ts
  • apps/desktop/src/renderer/features/todo-agent/TodoModal/index.ts
  • apps/desktop/src/renderer/screens/main/components/WorkspaceView/ContentView/components/PresetsBar/PresetsBar.tsx
  • packages/local-db/drizzle/0049_add_todo_sessions.sql
  • packages/local-db/drizzle/0050_todo_verify_optional.sql
  • packages/local-db/drizzle/0051_todo_goal_optional.sql
  • packages/local-db/drizzle/0052_todo_headless_fields.sql
  • packages/local-db/drizzle/0053_todo_start_head_sha.sql
  • packages/local-db/drizzle/0054_todo_prompt_presets.sql
  • packages/local-db/drizzle/meta/0049_snapshot.json
  • packages/local-db/drizzle/meta/0050_snapshot.json
  • packages/local-db/drizzle/meta/0051_snapshot.json
  • packages/local-db/drizzle/meta/0052_snapshot.json
  • packages/local-db/drizzle/meta/0053_snapshot.json
  • packages/local-db/drizzle/meta/0054_snapshot.json
  • packages/local-db/drizzle/meta/_journal.json
  • packages/local-db/src/schema/index.ts
  • packages/local-db/src/schema/schema.ts
  • packages/local-db/src/schema/todo-prompt-presets.ts
  • packages/local-db/src/schema/todo-sessions.ts

Comment on lines +13 to +20
- ライブ可視性: 実行中ワーカーは実際の PTY であり、既存の
`TerminalPane` コンポーネントで描画されるため、誰でも監視したり
直接入力したりできる。
- 信頼性: 完了判定は決定的な verify コマンドの終了コードで行い、
LLM の自己申告には依存しない。
- 逐次実行: 同時にアクティブなのは 1 タスクのみとし、それ以外はキューに入れる。
- upstream とのマージ容易性: 新規コードはすべて新しいファイル / ディレクトリに
置き、既存ファイルへの変更は追記のみ、かつ 1 行変更を 3 箇所に限定する。

⚠️ Potential issue | 🟡 Minor

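The execution model in the implementation plan has drifted from the current implementation.

This document assumes a PTY-resident interactive worker and a flow centered on idle detection, but the implementation described in this PR leans on a headless Claude NDJSON stream and completion based on the verify exit code. As it stands, later readers are likely to maintain the code under the wrong assumptions, so either update it to match the current implementation or mark it as obsolete.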

Also applies to: 71-99

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/desktop/plans/todo-agent-plan.md` around lines 13 - 20, The
implementation model described in the plan (the bullets "ライブ可視性", "信頼性", "逐次実行",
"upstream とのマージ容易性") conflicts with the PR's actual design (headless Claude
NDJSON streaming + verify exit-code completion); update todo-agent-plan.md to
either rewrite these sections to describe the headless NDJSON stream +
verify-exit-code flow used by the current codebase (including removing or
adjusting PTY-resident/idle-detection language) or clearly mark the document as
obsolete and point to the new implementation notes; ensure the revised text
names the implemented mechanisms (NDJSON stream, verify exit code) so future
readers are not misled.


## アーキテクチャ

```

⚠️ Potential issue | 🟡 Minor

Add a language to the code fence.

markdownlint reports MD040 here. Adding `text` or `ts` is enough to clear the warning.

Also applies to: 60-60, 145-145, 237-237

🧰 Tools
🪛 markdownlint-cli2 (0.22.0)

[warning] 32-32: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/desktop/plans/todo-agent-plan.md` at line 32, Several fenced code blocks
in the Markdown (the triple-backtick blocks) are missing language identifiers
causing markdownlint MD040; update each problematic code fence (the three
backtick blocks referenced in the review) to include an appropriate language tag
such as "text" or "ts" (e.g., replace ``` with ```text or ```ts) so linting
passes; ensure you update all occurrences mentioned in the review comment so
each code fence has a language specifier.

Comment on lines +84 to +101
export function describeEnhanceFailure(attempts: SmallModelAttempt[]): string {
	for (let index = attempts.length - 1; index >= 0; index -= 1) {
		const attempt = attempts[index];
		if (!attempt) continue;
		if (attempt.outcome === "expired-credentials") {
			return `${attempt.issue?.message ?? `${attempt.providerName} の認証が切れています`}。設定から再接続してください。`;
		}
		if (attempt.outcome === "failed") {
			return `${attempt.providerName} での書き換えに失敗しました: ${attempt.issue?.message ?? attempt.reason ?? "unknown"}`;
		}
		if (attempt.outcome === "unsupported-credentials") {
			return `${attempt.providerName} の認証種別が書き換えに対応していません。`;
		}
	}
	if (attempts.every((a) => a.outcome === "missing-credentials")) {
		return "AI 書き換えに使えるモデルアカウントが接続されていません。設定から Anthropic か OpenAI を接続してください。";
	}
	return "AI 書き換えに失敗しました。";

⚠️ Potential issue | 🟡 Minor

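Handle `empty-result` explicitly.

Attempts from `callSmallModel` can have an `empty-result` outcome, but it currently falls through to the generic message, so the user cannot tell whether the model call itself succeeded with an empty response or the execution failed. A dedicated message here would make the retry decision easier.

💡 Example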
 export function describeEnhanceFailure(attempts: SmallModelAttempt[]): string {
 	for (let index = attempts.length - 1; index >= 0; index -= 1) {
 		const attempt = attempts[index];
 		if (!attempt) continue;
 		if (attempt.outcome === "expired-credentials") {
 			return `${attempt.issue?.message ?? `${attempt.providerName} の認証が切れています`}。設定から再接続してください。`;
 		}
+		if (attempt.outcome === "empty-result") {
+			return `${attempt.providerName} から空の結果が返されました。入力を少し具体化して再試行してください。`;
+		}
 		if (attempt.outcome === "failed") {
 			return `${attempt.providerName} での書き換えに失敗しました: ${attempt.issue?.message ?? attempt.reason ?? "unknown"}`;
 		}
 		if (attempt.outcome === "unsupported-credentials") {
 			return `${attempt.providerName} の認証種別が書き換えに対応していません。`;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/desktop/src/main/todo-agent/enhance-text.ts` around lines 84 - 101, In
describeEnhanceFailure, add explicit handling for attempts whose outcome ===
"empty-result" (SmallModelAttempt) before falling through to generic messages;
return a clear Japanese message indicating the model call succeeded but produced
an empty response (e.g., "モデルは応答しましたが空の結果でした。再試行してください。") so users can
distinguish "empty response" from other failures and make retry decisions.

Comment thread apps/desktop/src/main/todo-agent/supervisor.ts
Comment on lines +156 to +163
// Fresh in-memory buffer for this run. Old events from previous
// runs of the same session are cleared so the UI sees just the
// current attempt.
store.clearStreamEvents(sessionId);
// Prime the artifact-path cache so the hot stream-persist path
// does not need to do a synchronous SQLite read per event.
store.setArtifactPathCache(sessionId, session0.artifactPath);


⚠️ Potential issue | 🟠 Major

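Stream history gets mixed when the same session is re-run.

Only the in-memory buffer is cleared here; the persisted `stream.jsonl` remains append-only. If a failed / aborted / escalated session is resumed under the same sessionId, the previous run's events are read back together with the new ones after a restart or on disk fallback. Either truncate the existing stream file before the re-run or write each run to a separate file.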

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/desktop/src/main/todo-agent/supervisor.ts` around lines 156 - 163, The
in-memory clear (store.clearStreamEvents(sessionId)) leaves the append-only
persisted stream.jsonl intact causing old events to be reloaded on retries;
update supervisor startup to also truncate or rotate the persisted stream file
before a new run. Add or call a store-level method (e.g.
store.truncateStreamFile(sessionId) or store.rotateStreamFile(sessionId, runId))
before priming the cache (before/around the
store.setArtifactPathCache(sessionId, session0.artifactPath)) so the persisted
stream for the same sessionId is either truncated or writes go to a run-specific
file to avoid mixing previous run events. Ensure the new method is implemented
in the store backend to atomically truncate or rename the existing stream.jsonl
for that sessionId.

Comment on lines +45 to +55
const diffQuery = electronTrpc.todoAgent.gitFileDiff.useQuery(
	selected
		? {
				sessionId,
				path: selected.path,
				scope: selected.scope,
				commitSha: selected.commitSha,
			}
		: { sessionId, path: "", scope: "session" as const },
	{ enabled: !!selected, staleTime: 5_000 },
);

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find apps/desktop -name "ChangesSidebar.tsx" -type f

Repository: MocA-Love/superset

Length of output: 153


🏁 Script executed:

find apps/desktop -name "trpc-router.ts" -type f

Repository: MocA-Love/superset

Length of output: 109


🏁 Script executed:

cat -n apps/desktop/src/renderer/features/todo-agent/TodoManager/ChangesSidebar/ChangesSidebar.tsx | head -200

Repository: MocA-Love/superset

Length of output: 7657


🏁 Script executed:

cat -n apps/desktop/src/main/todo-agent/trpc-router.ts | sed -n '350,390p'

Repository: MocA-Love/superset

Length of output: 1523


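The diff query issued when a commit is selected does not satisfy the `gitFileDiff` input constraints.

Selecting a commit sends `path: ""` (lines 176-183), but `gitFileDiff` at `apps/desktop/src/main/todo-agent/trpc-router.ts:361` requires `path: z.string().min(1)`. As written, every commit diff fails tRPC validation. Either align on an API that does not need `path` when `scope === "commit"`, or branch the selection model on the client.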

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/desktop/src/renderer/features/todo-agent/TodoManager/ChangesSidebar/ChangesSidebar.tsx`
around lines 45 - 55, The diffQuery call is sending path: "" when a commit is
selected which violates the server validator for gitFileDiff
(z.string().min(1)); update the argument construction in
electronTrpc.todoAgent.gitFileDiff.useQuery to branch based on selected.scope
(e.g., if selected?.scope === "commit" send the payload without path and include
commitSha/scope accordingly, otherwise include path as before), or populate a
non-empty path when required so the client payload matches the gitFileDiff input
schema; adjust the selected-based conditional used to build the query args in
ChangesSidebar (diffQuery) to ensure scope === "commit" uses the API shape the
router expects.
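
One way to keep the client payload valid is to build the query input in a small pure function and disable the query when no valid input exists. This is a sketch only: the `Selection` and `FileDiffInput` shapes are inferred from the snippet above, and `toFileDiffInput` is a hypothetical name.

```typescript
type DiffScope = "session" | "commit";

// Shape inferred from the fields passed to gitFileDiff in the snippet above.
interface Selection {
	path: string;
	scope: DiffScope;
	commitSha?: string;
}

interface FileDiffInput extends Selection {
	sessionId: string;
}

// Returns null when the selection cannot satisfy a `path: min(1)` validator,
// so the caller can skip the request instead of sending an invalid payload.
function toFileDiffInput(
	sessionId: string,
	selected: Selection | null,
): FileDiffInput | null {
	if (selected === null || selected.path.length === 0) return null;
	return { sessionId, ...selected };
}

// A commit row without a file path yields no request at all.
const commitOnly = toFileDiffInput("s1", { path: "", scope: "commit", commitSha: "abc123" });
// A file inside a commit yields a payload the validator accepts.
const fileInCommit = toFileDiffInput("s1", { path: "src/a.ts", scope: "commit", commitSha: "abc123" });
```

The caller could then pass `enabled: input !== null` to the query, which covers the commit-without-path case that currently fails validation. Showing a whole-commit diff would still need a separate server procedure or a relaxed schema.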

Comment on lines +55 to +64
<Button
type="button"
size="sm"
variant="ghost"
className="h-6 w-6 p-0 rounded-md text-muted-foreground hover:text-primary"
onClick={handleClick}
disabled={disabled}
title={title ?? (running ? "AI で書き換え中…" : "AI で書き換える")}
>
<HiMiniSparkles className={cn("size-3.5", running && "animate-pulse")} />

⚠️ Potential issue | 🟠 Major

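Give the icon button an accessible name.

A title attribute alone is not a reliable accessible name for assistive technology, so screen reader users may not learn what this button does. It is safer to add aria-label and exclude the icon from being announced.

♿ Suggested fix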
 		<Button
 			type="button"
 			size="sm"
 			variant="ghost"
 			className="h-6 w-6 p-0 rounded-md text-muted-foreground hover:text-primary"
 			onClick={handleClick}
 			disabled={disabled}
 			title={title ?? (running ? "AI で書き換え中…" : "AI で書き換える")}
+			aria-label={title ?? (running ? "AI で書き換え中" : "AI で書き換える")}
 		>
-			<HiMiniSparkles className={cn("size-3.5", running && "animate-pulse")} />
+			<HiMiniSparkles
+				aria-hidden="true"
+				className={cn("size-3.5", running && "animate-pulse")}
+			/>
 		</Button>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/desktop/src/renderer/features/todo-agent/TodoModal/components/EnhanceButton/EnhanceButton.tsx`
around lines 55 - 64, Enhance the accessibility of the icon-only button in the
EnhanceButton component by adding an explicit aria-label to the Button (use the
existing title prop or fallback to running ? "AI で書き換え中…" : "AI で書き換える") and
mark the HiMiniSparkles icon as non-interactive for assistive tech (e.g., add
aria-hidden="true" and focusable={false} to the icon element); update the Button
JSX where onClick={handleClick}, disabled={disabled}, title={...} is set to
include aria-label and update the HiMiniSparkles usage to include aria-hidden
and focusable props.

Comment on lines +123 to +150
let targetWorkspaceId = workspaceId;
if (createWorktree) {
if (!projectId) {
throw new Error(
"このワークスペースにはプロジェクトが紐付いていないので新しい worktree を作成できません",
);
}
const namingPrompt = [title.trim(), description.trim()]
.filter(Boolean)
.join("\n\n");
const result = await createWorkspaceMut.mutateAsync({
projectId,
prompt: namingPrompt || title.trim(),
});
targetWorkspaceId = result.workspace.id;
}

const created = await create.mutateAsync({
workspaceId: targetWorkspaceId,
projectId,
title: title.trim(),
description: description.trim(),
goal: hasGoal ? goal.trim() : undefined,
verifyCommand: hasVerify ? verifyCommand.trim() : undefined,
maxIterations,
maxWallClockSec: maxMinutes * 60,
customSystemPrompt: selectedPreset?.content ?? undefined,
});

⚠️ Potential issue | 🟠 Major

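Worktree creation and session creation are separate steps, so a failure leaves a partial success behind.

If todoAgent.create fails after workspaces.create has succeeded, only the new workspace/worktree remains and the user sees an error. Because two non-idempotent side effects run in sequence, orphaned resources will appear unless both are merged into a single main-process mutation or a compensating delete runs on failure.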

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/desktop/src/renderer/features/todo-agent/TodoModal/TodoModal.tsx` around
lines 123 - 150, The code currently calls createWorkspaceMut.mutateAsync(...) to
create a worktree and then create.mutateAsync(...) to create a todo session,
leaving an orphaned workspace if the second call fails; update the flow so both
operations are atomic: either (A) move the worktree + todo creation into a
single main-process mutation on the backend (preferred) so one server-side
transaction handles both, or (B) implement compensation logic in the caller
around createWorkspaceMut.mutateAsync and create.mutateAsync — after
createWorkspaceMut.mutateAsync returns a result.workspace.id, call
create.mutateAsync(...) and if that fails, reliably delete the newly created
workspace via the matching workspace delete API (use the same workspace id
returned), and surface the original error; reference
createWorkspaceMut.mutateAsync, create.mutateAsync, and targetWorkspaceId when
locating the code to change.
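
Option (B) can be sketched as a small wrapper. All three callbacks are hypothetical stand-ins for the real mutations (`createWorkspaceMut.mutateAsync`, `create.mutateAsync`, and whatever workspace delete API exists); none of their signatures are confirmed by this thread.

```typescript
// Runs the two side effects in order and rolls back the first one if the
// second fails. The original error is always the one that is rethrown.
async function createSessionInNewWorktree<W extends { id: string }, S>(
	createWorkspace: () => Promise<W>,
	createSession: (workspaceId: string) => Promise<S>,
	deleteWorkspace: (workspaceId: string) => Promise<void>,
): Promise<S> {
	const workspace = await createWorkspace();
	try {
		return await createSession(workspace.id);
	} catch (error) {
		// Best-effort cleanup: a failed delete must not mask the original error.
		await deleteWorkspace(workspace.id).catch(() => undefined);
		throw error;
	}
}
```

Compensation on the renderer side is still not atomic (the app can quit between the two calls), which is why the single main-process mutation in option (A) is the more robust choice.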

Comment on lines +355 to +383
<DropdownMenuTrigger asChild>
<button
type="button"
className={cn(
"flex items-center gap-2 px-2.5 py-1.5 rounded-md border text-xs transition",
selected
? "border-primary/40 bg-primary/5 text-foreground"
: "border-border/40 text-muted-foreground hover:bg-muted/40",
)}
>
<HiMiniSparkles className="size-3 text-primary/80" />
<span className="flex-1 text-left truncate">
{selected ? selected.name : "プリセットを選択(設定から管理)"}
</span>
{selected && (
<button
type="button"
className="size-4 rounded-sm flex items-center justify-center hover:bg-background/80"
onClick={(e) => {
e.preventDefault();
e.stopPropagation();
onSelect(null);
}}
title="解除"
>
<HiMiniXMark className="size-3" />
</button>
)}
</button>

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n apps/desktop/src/renderer/features/todo-agent/TodoModal/TodoModal.tsx | sed -n '350,390p'

Repository: MocA-Love/superset

Length of output: 1642


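Remove the nested button element.

The clear `button` is rendered as a child of the `DropdownMenuTrigger` `button`, which is invalid HTML. Per the HTML specification, a button element cannot contain interactive content such as another button. This structure makes keyboard handling and assistive technology (screen readers and the like) behave unreliably, and in some environments the parent trigger and the child button may both respond to the same interaction.

💡 Suggested fix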
-			<DropdownMenuTrigger asChild>
-				<button
-					type="button"
-					className={cn(
-						"flex items-center gap-2 px-2.5 py-1.5 rounded-md border text-xs transition",
-						selected
-							? "border-primary/40 bg-primary/5 text-foreground"
-							: "border-border/40 text-muted-foreground hover:bg-muted/40",
-					)}
-				>
-					<HiMiniSparkles className="size-3 text-primary/80" />
-					<span className="flex-1 text-left truncate">
-						{selected ? selected.name : "プリセットを選択(設定から管理)"}
-					</span>
-					{selected && (
-						<button
-							type="button"
-							className="size-4 rounded-sm flex items-center justify-center hover:bg-background/80"
-							onClick={(e) => {
-								e.preventDefault();
-								e.stopPropagation();
-								onSelect(null);
-							}}
-							title="解除"
-						>
-							<HiMiniXMark className="size-3" />
-						</button>
-					)}
-				</button>
-			</DropdownMenuTrigger>
+			<div className="relative">
+				<DropdownMenuTrigger asChild>
+					<button
+						type="button"
+						className={cn(
+							"flex w-full items-center gap-2 rounded-md border px-2.5 py-1.5 pr-7 text-xs transition",
+							selected
+								? "border-primary/40 bg-primary/5 text-foreground"
+								: "border-border/40 text-muted-foreground hover:bg-muted/40",
+						)}
+					>
+						<HiMiniSparkles className="size-3 text-primary/80" />
+						<span className="flex-1 truncate text-left">
+							{selected ? selected.name : "プリセットを選択(設定から管理)"}
+						</span>
+					</button>
+				</DropdownMenuTrigger>
+				{selected && (
+					<button
+						type="button"
+						className="absolute right-2 top-1/2 flex size-4 -translate-y-1/2 items-center justify-center rounded-sm hover:bg-background/80"
+						onClick={(e) => {
+							e.preventDefault();
+							e.stopPropagation();
+							onSelect(null);
+						}}
+						title="解除"
+					>
+						<HiMiniXMark className="size-3" />
+					</button>
+				)}
+			</div>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/desktop/src/renderer/features/todo-agent/TodoModal/TodoModal.tsx` around
lines 355 - 383, The nested button inside the DropdownMenuTrigger must be
removed; in TodoModal replace the inner clear <button> (the one rendered when
selected) with a non-button interactive element (e.g., a <span> or <div> with
role="button" and tabIndex={0}) and wire its click and keyboard handlers to call
onSelect(null) while calling e.preventDefault()/e.stopPropagation() to avoid
triggering the parent trigger; keep the same classes, title="解除", and accessible
keyboard handling (Enter/Space) so the clear control remains focusable and
accessible without nesting a button inside the DropdownMenuTrigger's button.
