5 changes: 4 additions & 1 deletion README.md
@@ -61,9 +61,12 @@ cd packages/cli && bun link && cd ../..

# "did you get that memo?" — verify bun / tmux / claude / git / gh + gh auth
mm doctor

# verify everything works — drive the file-mode dispatch loop end to end
mm verify-file-mode
```

Once `mm doctor` passes, `mm start` runs the dispatcher. To have middle come up on boot and restart on crash, run it under systemd/launchd — see **[`docs/daemon-as-a-service.md`](docs/daemon-as-a-service.md)**.
`mm doctor` checks your toolchain; `mm verify-file-mode` proves the dispatch loop itself runs end to end. See [Live-smoke verification](docs/dogfooding.md#live-smoke-verification) for what it covers and the opt-in `--live` real-GitHub smoke.

Configuration is optional — middle ships with working defaults. To override, drop a `~/.middle/config.toml` (defaults shown):

54 changes: 54 additions & 0 deletions docs/dogfooding.md
@@ -86,3 +86,57 @@ dispatch needs:

`mm init` is idempotent: a re-run with a matching `bootstrap.version` refreshes
skills/hooks but keeps the config and the existing state issue.

## Live-smoke verification

`mm verify-file-mode` proves the file-mode dispatch loop works end to end on your
machine. Run it after install and after any merge that touches the dispatcher,
the file gateways, the worktree machinery, or the Epic-file parser/renderer.

`mm verify-file-mode` (no flags) drives the **real** workflow over a throwaway
tmpdir repo: it authors an `epic_store="file"` Epic, dispatches it, parks it on a
question, answers via a file edit, resumes through the real file-watcher, and
checks the run reaches `completed` with the sub-issue checkbox flipped. It stubs
only the GitHub PR/comment boundary, so it needs no daemon, no `gh`, and no
network. This is the same drive CI runs on every commit to `main`.

`mm verify-file-mode --live --repo <owner/name>` runs that loop against **real
GitHub**: it authors an Epic on a fresh branch, dispatches a real agent, answers
any park, and asserts a draft PR opened with the sub-issue checkbox flipped. It
spends real tokens and minutes of wall-clock, so it is opt-in — run it after a
major merge, not on every commit. It is not in CI by design.

```bash
mm verify-file-mode # the local integration smoke (post-install)
mm verify-file-mode --live --repo you/middle-smoketest # the real-GitHub smoke (post-major-merge)
```

### Read a failure

Both modes print one line per phase — `init` → `author` → `dispatch` → `park` →
`answer` → `resume` → `complete` — each marked `PASS` or `FAIL` with its
wall-time, then a verdict line. On success the last line is `all sections pass.`;
on failure it is `FAIL: <section> — <reason>`, so the failing phase is the last
thing printed. The section that flips to `FAIL` tells you which seam broke: a
`dispatch` failure points to the engine or worktree, `resume` to the
file-watcher, and `complete` to the terminal finalize step.
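
For scripting around the smoke, the verdict line can be parsed mechanically. The following is a minimal sketch, assuming the output matches the format documented in this section (`all sections pass.` on success, a `FAIL:` line naming the section and reason on failure). `Verdict`, `parseVerdict`, `SEAMS`, and `seamFor` are hypothetical names for illustration, not part of `mm`.

```typescript
// Hypothetical helper: classify the last non-empty line of `mm verify-file-mode` output.
type Verdict = { ok: true } | { ok: false; section: string; reason: string };

// Section-to-seam hints, taken from the three examples given in this section.
const SEAMS: Record<string, string> = {
  dispatch: "engine or worktree",
  resume: "file-watcher",
  complete: "terminal finalize",
};

function parseVerdict(output: string): Verdict {
  const lines = output
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line !== "");
  const last = lines[lines.length - 1] ?? "";
  if (last === "all sections pass.") return { ok: true };
  // \u2014 is the dash character the documented FAIL line uses as its separator.
  const match = /^FAIL: (\S+) \u2014 (.*)$/.exec(last);
  if (match) {
    const [, section = "unknown", reason = ""] = match;
    return { ok: false, section, reason };
  }
  // Anything else is treated as a failure so a truncated run is never read as green.
  return { ok: false, section: "unknown", reason: `unrecognised verdict line: ${last}` };
}

function seamFor(section: string): string {
  return SEAMS[section] ?? "see the reason text";
}

console.log(parseVerdict("all sections pass.")); // { ok: true }
```

Treating an unrecognised last line as a failure is a deliberate choice in this sketch: a crashed or truncated run should not be mistaken for a pass.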

`--live` exits 0 only after it cleans up the test branch and PR. On failure it
**leaves** the branch and PR intact and prints their URLs — inspect those
artifacts, then delete them by hand once you have diagnosed the break.

### Set up a designated test repo for `--live`

`--live` needs a throwaway GitHub repo you can let an agent open PRs against. Set
one up once:

1. Create an empty repo, e.g. `you/middle-smoketest`, and clone it locally.
2. Bootstrap it in file mode: `mm init <path> --epic-store=file`. This stamps the
skills and hooks and registers the repo with the daemon in file mode (Epics
live in `planning/epics/`, not GitHub issues).
3. Confirm the install: `mm doctor` from the checkout reports the file-mode Epic
directory.

Then run `mm verify-file-mode --live --repo you/middle-smoketest --repo-path <path>`
(`--repo-path` defaults to the current directory). The command authors,
dispatches, and cleans up its own throwaway Epic each run.
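
The `--repo` value must be in `owner/name` form. A minimal sketch of that check, using the same pattern as the `--live` entry point added in this PR; `isOwnerSlashName` is a hypothetical name used only for illustration.

```typescript
// Hypothetical helper mirroring the `--repo` validation in runVerifyFileModeLive:
// exactly one slash, no whitespace, non-empty owner and name.
function isOwnerSlashName(repo: string | undefined): boolean {
  const trimmed = repo?.trim();
  return !!trimmed && /^[^/\s]+\/[^/\s]+$/.test(trimmed);
}

console.log(isOwnerSlashName("you/middle-smoketest")); // true
console.log(isOwnerSlashName("middle-smoketest")); // false: no owner
console.log(isOwnerSlashName("you/middle/smoketest")); // false: extra slash
```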
3 changes: 3 additions & 0 deletions docs/operator.md
@@ -155,6 +155,8 @@ mm doctor --fix # also append the bun PATH export to ~/.zshrc / ~/.bashrc

Each check is pass (`✓`), warn (`!`), or fail (`✗`). Warnings mean degraded-but-functional; the command exits non-zero only on a failure.
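
The exit rule can be sketched as follows. This is an illustration only: `CheckStatus` and `exitCodeFor` are hypothetical names, and the exit value `1` is an assumption, since the text only says non-zero.

```typescript
// Hypothetical model of the three check outcomes described above.
type CheckStatus = "pass" | "warn" | "fail";

// Warnings are degraded-but-functional, so only a `fail` changes the exit code.
// The concrete value 1 is assumed; the docs only promise "non-zero".
function exitCodeFor(statuses: CheckStatus[]): number {
  return statuses.some((status) => status === "fail") ? 1 : 0;
}

console.log(exitCodeFor(["pass", "warn", "pass"])); // 0
console.log(exitCodeFor(["pass", "warn", "fail"])); // 1
```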

`mm doctor` checks your *toolchain*; `mm verify-file-mode` checks the *file-mode dispatch loop* end to end. Run `mm verify-file-mode` after install and after a major merge; see [Live-smoke verification](dogfooding.md#live-smoke-verification) for what it covers, when to run `--live`, and how to read a failure.

## Back up and restore state

middle's SQLite database holds operational bookkeeping — workflow rows, the event log, rate-limit state. GitHub holds the work itself (issues, sub-issues, PRs), so a backup captures middle's state, never GitHub's.
@@ -195,6 +197,7 @@ Retention touches only middle's SQLite. `mm doctor`'s `database` line reports th
| `mm stop` | Stop the dispatcher |
| `mm status` | One-screen summary of repos and workflow states |
| `mm doctor [--fix]` | Full health check |
| `mm verify-file-mode [--live --repo <owner/name>]` | Verify the file-mode dispatch loop end to end (`--live` runs against real GitHub) |
| `mm dispatch <repo> <epic>` | Force-dispatch an Epic (or standalone issue) |
| `mm run-recommender <repo>` | Rank the backlog now (rewrites the state issue) |
| `mm pause <repo>` / `mm resume <repo>` | Pause / resume auto-dispatch for a repo |
302 changes: 302 additions & 0 deletions packages/cli/src/commands/verify-file-mode-live.ts
@@ -0,0 +1,302 @@
/**
* `mm verify-file-mode --live --repo <owner/name>` — the real-GitHub smoke: the
* test the autonomous flow never ran (Epic #208). It drives the full file-mode
* loop against a **live** GitHub repo — author an Epic file on a fresh branch,
* dispatch it, satisfy any park by editing the answer block, then assert a draft
* PR exists with the expected sub-issue checkbox flipped — and cleans up on
* success / leaves the artifacts (printing their URLs) on failure.
*
* This is **not** part of `bun test`: it needs real GitHub, a real agent, real
* tokens, and minutes of wall-clock, so it is an opt-in operator step on a
* manual/weekly cadence (see `docs/dogfooding.md`). The orchestration
* ({@link runLiveSmoke}) is fully unit-tested against an injected {@link LiveSmokeIO};
* the production IO ({@link makeLiveSmokeIO}) is the GitHub/daemon/git boundary CI
* cannot exercise; the recorded one-shot evidence run covers that boundary instead.
*/

import { runDispatch } from "./dispatch.ts";

/** An open PR as the smoke needs to reason about it. */
export type LivePr = {
number: number;
isDraft: boolean;
url: string;
};

/** Settled workflow state the smoke distinguishes: a question park vs anything terminal. */
export type SettledState = "completed" | "waiting-human" | "failed";

/**
* The GitHub/daemon/git boundary the live smoke drives. Every method is a real
* side-effect against the test repo; the orchestration ({@link runLiveSmoke}) is
* pure control flow over this seam, so it is unit-tested with a fake IO and the
* production impl ({@link makeLiveSmokeIO}) is the operator-run boundary.
*/
export type LiveSmokeIO = {
log: (line: string) => void;
/** Author the Epic file on a fresh branch in the test repo; returns its slug + branch URL. */
authorEpic: () => Promise<{ slug: string; branch: string; branchUrl: string }>;
/** Dispatch the Epic through the daemon and resolve once the row settles. */
dispatch: (slug: string) => Promise<SettledState>;
/** Fill in the open question's answer block on disk (the file-mode resume trigger). */
answerQuestion: (slug: string) => Promise<void>;
/** Wait for the daemon's file-watcher resume to drive the sub-issue checkbox to `[x]`. */
awaitResume: (slug: string) => Promise<void>;
/** The Epic's open draft PR, or null if none opened. */
findEpicPr: (slug: string) => Promise<LivePr | null>;
/** Whether sub-issue `id`'s checkbox is `[x]` on the PR head. */
isSubIssueChecked: (slug: string, pr: LivePr, id: number) => Promise<boolean>;
/** Tear the test branch + PR down (success only). */
cleanup: (slug: string, branch: string, pr: LivePr | null) => Promise<void>;
};

/**
* The live-smoke orchestration. Returns a process exit code (0 green / 1 failed).
* On success it cleans up; on **any** failure it leaves the surviving branch/PR
* intact and prints their URLs for operator inspection (never cleans up a
* failure — the artifacts are the diagnosis).
*/
export async function runLiveSmoke(io: LiveSmokeIO): Promise<number> {
io.log("authoring an Epic file on a fresh branch in the test repo…");
const { slug, branch, branchUrl } = await io.authorEpic();
io.log(`authored Epic '${slug}' on branch '${branch}'`);

io.log(`dispatching '${slug}' through the daemon…`);
const settled = await io.dispatch(slug);
io.log(`workflow settled: ${settled}`);
if (settled === "failed") {
io.log(`FAIL: dispatch failed. Surviving branch: ${branchUrl}`);
return 1;
}

if (settled === "waiting-human") {
io.log("parked — filling in the answer block to satisfy the park…");
await io.answerQuestion(slug);
io.log("waiting for the file-watcher resume to complete the sub-issue…");
await io.awaitResume(slug);
}

const pr = await io.findEpicPr(slug);
if (!pr) {
io.log(`FAIL: no draft PR opened on the test repo. Surviving branch: ${branchUrl}`);
return 1;
}
if (!pr.isDraft) {
io.log(`FAIL: PR #${pr.number} is not a draft. Surviving PR: ${pr.url}`);
return 1;
}

const checked = await io.isSubIssueChecked(slug, pr, 1);
if (!checked) {
io.log(`FAIL: sub-issue #1 checkbox not flipped on PR #${pr.number}. Surviving PR: ${pr.url}`);
return 1;
}

io.log(`PASS: draft PR #${pr.number} with sub-issue #1 checked — ${pr.url}`);
await io.cleanup(slug, branch, pr);
io.log("cleaned up the test branch + PR.");
return 0;
}

/** Options for {@link runVerifyFileModeLive}. */
export type LiveOptions = {
/** `owner/name` of the designated throwaway test repo. */
repo?: string;
/** Local checkout of the test repo (the daemon dispatches against it). Defaults to cwd. */
repoPath?: string;
/** Inject a fake IO (tests only); production builds {@link makeLiveSmokeIO}. */
io?: LiveSmokeIO;
};

/**
* Entry point for `mm verify-file-mode --live`. Validates `--repo`, builds the
* production IO (unless an `io` is injected for tests), and runs the smoke.
*/
export async function runVerifyFileModeLive(opts: LiveOptions = {}): Promise<number> {
const repo = opts.repo?.trim();
if (!repo || !/^[^/\s]+\/[^/\s]+$/.test(repo)) {
console.error(
"mm verify-file-mode --live: pass --repo <owner/name> for the designated test repo",
);
return 1;
}
const io = opts.io ?? makeLiveSmokeIO({ repo, repoPath: opts.repoPath ?? process.cwd() });
return runLiveSmoke(io);
}

// ── Production IO — the GitHub/daemon/git boundary (operator-run; not CI-tested) ──

/** Render the throwaway Epic file body for a `--live` run, keyed to `slug` (one sub-issue, one question). */
const EPIC_BODY = (slug: string): string =>
[
"<!-- middle:epic v1 -->",
"# feat: live-smoke verification probe",
"",
"## meta",
`slug: ${slug}`,
"adapter: claude",
"",
"## context",
"Throwaway Epic authored by `mm verify-file-mode --live` to prove the",
"file-mode dispatch loop opens a real PR end to end. Safe to delete.",
"",
"## acceptance criteria",
"- [ ] a draft PR opens for this Epic",
"",
"## sub-issues",
"<!-- middle:sub-issue id=1 -->",
"- [ ] **1 — touch a probe file** Create `verify-live-probe.txt` with any content, open the draft PR, and ask the operator to confirm before finishing.",
"<!-- /middle:sub-issue -->",
"",
"## conversation",
"",
].join("\n");

const ANSWER_TEXT = "Confirmed — finish the sub-issue and leave the PR as a draft.";

/** Run a `gh` subcommand, capturing stdout/stderr; returns `ok` instead of throwing so callers can branch on failure. */
async function gh(args: string[]): Promise<{ ok: boolean; stdout: string; stderr: string }> {
const proc = Bun.spawn(["gh", ...args], { stdout: "pipe", stderr: "pipe", stdin: "ignore" });
const [stdout, stderr] = await Promise.all([
new Response(proc.stdout).text(),
new Response(proc.stderr).text(),
]);
return { ok: (await proc.exited) === 0, stdout, stderr };
}

/** Run a git subcommand in `cwd`; throws with stderr on non-zero exit. */
async function git(cwd: string, args: string[]): Promise<void> {
const proc = Bun.spawn(["git", "-C", cwd, ...args], { stdout: "ignore", stderr: "pipe" });
// Drain stderr while waiting for exit so a chatty git cannot stall on a full pipe.
const [code, stderr] = await Promise.all([proc.exited, new Response(proc.stderr).text()]);
if (code !== 0) {
throw new Error(`git ${args.join(" ")}: ${stderr.trim()}`);
}
}

/**
* The real GitHub/daemon/git IO. Operator-run — this boundary is what `bun test`
* cannot exercise (real repo, real agent, real tokens). The recorded one-shot run
* against the designated test repo is the evidence; the orchestration above is
* what CI proves.
*/
export function makeLiveSmokeIO(cfg: { repo: string; repoPath: string }): LiveSmokeIO {

Owner Author


Decision: the `--live` evidence run is the operator step; headless ships code + deterministic tests. The Epic context states a headless run "could not create a throwaway GitHub repo or spawn a real agent", and the live run fundamentally needs a real agent to open a real PR. So this boundary IO is operator-run (not CI-tested, by design); the orchestration `runLiveSmoke` above is fully unit-tested. The smoke finds + cleans the agent's PR by its head branch `middle-issue-<slug>` because file-mode Epics have no issue number for gh's `closes #N` finder.

const { repo, repoPath } = cfg;
const stamp = Date.now();
const slug = `verify-smoke-${stamp}`;
// The branch the daemon's worktree opens its PR from: `middle-<unit>`, where
// unit is `issue-<epicRef>` (see worktree.ts `unitName`/`createWorktree`). The
// smoke finds + cleans the PR by this head branch — file-mode Epics have no
// issue number, so the gh `closes #N` finder (ghGitHub.findEpicPr) can't match.
const agentBranch = `middle-issue-${slug}`;
// The local seed branch the Epic file is authored on; never pushed (the daemon
// dispatches against the local checkout, so the Epic only needs to be on disk).
const seedBranch = `middle-smoke-${stamp}`;
const epicRelPath = `planning/epics/${slug}.md`;
const log = (line: string): void => console.log(`mm verify-file-mode --live: ${line}`);
const prUrl = (n: number): string => `https://github.com/${repo}/pull/${n}`;
const branchUrl = `https://github.com/${repo}/tree/${agentBranch}`;

return {
log,
async authorEpic() {
const { writeFileSync, mkdirSync } = await import("node:fs");
const { join, dirname } = await import("node:path");
const abs = join(repoPath, epicRelPath);
mkdirSync(dirname(abs), { recursive: true });
writeFileSync(abs, EPIC_BODY(slug));
// Seed the Epic on a fresh local branch; the daemon's worktree branches off
// this HEAD, so its checkout carries the Epic file. No push needed.
await git(repoPath, ["checkout", "-b", seedBranch]);
await git(repoPath, ["add", epicRelPath]);
await git(repoPath, ["commit", "-m", `chore: live-smoke Epic ${slug}`]);
return { slug, branch: seedBranch, branchUrl };
},
async dispatch(s) {
// runDispatch returns 0 when the workflow completes or parks; infer which by
// re-reading the Epic file for an open question (the file-mode park trace).
const code = await runDispatch(repoPath, s, {});
if (code !== 0) return "failed";
return (await hasOpenQuestion(repoPath, s)) ? "waiting-human" : "completed";
},
async answerQuestion(s) {
// The human-edit the file-watcher detects: fill the answer block on disk.
// The daemon reads the local checkout, so no push is needed.
await fillAnswerBlock(repoPath, s, ANSWER_TEXT);
},
async awaitResume(s) {
// The daemon's file-watcher polls on its cron; poll the PR until the
// sub-issue checkbox flips (or a generous deadline passes).
const deadline = Date.now() + 15 * 60_000;
while (Date.now() < deadline) {
const pr = await this.findEpicPr(s);
if (pr && (await this.isSubIssueChecked(s, pr, 1))) return;
await Bun.sleep(10_000);
}
log(`timed out after 15m waiting for the resume to flip the sub-issue checkbox`);
},
async findEpicPr() {
// Match by the agent's head branch (file-mode Epics have no issue number).
const res = await gh([
"pr",
"list",
"--repo",
repo,
"--head",
agentBranch,
"--state",
"open",
"--json",
"number,isDraft",
"--jq",
".[0] // empty",
]);
if (!res.ok || res.stdout.trim() === "") return null;
const pr = JSON.parse(res.stdout.trim()) as { number: number; isDraft: boolean };
return { number: pr.number, isDraft: pr.isDraft, url: prUrl(pr.number) };
},
async isSubIssueChecked(_s, _pr, id) {
// Read the Epic file at the agent branch head and parse the sub-issue's box.
const fileRes = await gh([
"api",
`repos/${repo}/contents/${epicRelPath}?ref=${agentBranch}`,
"--jq",
".content",
]);
if (!fileRes.ok) return false;
const text = Buffer.from(fileRes.stdout.trim(), "base64").toString("utf8");
const { parseEpicFile } =
await import("@middle/dispatcher/src/epic-store/epic-file/parser.ts");
const epic = parseEpicFile(text);
return epic.subIssues.find((sub) => sub.id === id)?.checked === true;
},
async cleanup(_s, _b, pr) {
// Close the agent PR and delete its remote branch; drop the local seed branch.
if (pr) await gh(["pr", "close", String(pr.number), "--repo", repo, "--delete-branch"]);
await git(repoPath, ["checkout", "-"]).catch(() => {});
await git(repoPath, ["branch", "-D", seedBranch]).catch(() => {});
},
};
}

/** Does the Epic file carry an open question? (the file-mode park trace). */
async function hasOpenQuestion(repoPath: string, slug: string): Promise<boolean> {
const { readEpicFile } = await import("@middle/dispatcher/src/epic-store/epic-file-io.ts");
const { join } = await import("node:path");
const epic = readEpicFile(join(repoPath, "planning", "epics"), slug);
return (epic?.conversation ?? []).some((e) => e.kind === "question" && e.status === "open");
}

/** Fill the open question's answer block on disk (the human-edit the watcher detects). */
async function fillAnswerBlock(repoPath: string, slug: string, answer: string): Promise<void> {
const { readEpicFile, writeEpicFile } =
await import("@middle/dispatcher/src/epic-store/epic-file-io.ts");
const { join } = await import("node:path");
const epicsDir = join(repoPath, "planning", "epics");
const epic = readEpicFile(epicsDir, slug);
if (!epic) throw new Error(`no Epic file for ${slug} to answer`);
writeEpicFile(epicsDir, slug, {
...epic,
conversation: epic.conversation.map((e) =>
e.kind === "question" && e.status === "open" ? { ...e, answer: { body: answer } } : e,
),
});
}
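
The header comment says the orchestration is unit-tested against an injected `LiveSmokeIO`. The test file is not shown in this view, so the following is only a sketch of what such a fake could look like. It re-declares the types and a condensed copy of `runLiveSmoke` (same control flow, shorter log lines) so that it runs standalone; `makeFakeIO`, its `overrides` parameter, `demo`, and the `example.invalid` URLs are hypothetical.

```typescript
type LivePr = { number: number; isDraft: boolean; url: string };
type SettledState = "completed" | "waiting-human" | "failed";
type LiveSmokeIO = {
  log: (line: string) => void;
  authorEpic: () => Promise<{ slug: string; branch: string; branchUrl: string }>;
  dispatch: (slug: string) => Promise<SettledState>;
  answerQuestion: (slug: string) => Promise<void>;
  awaitResume: (slug: string) => Promise<void>;
  findEpicPr: (slug: string) => Promise<LivePr | null>;
  isSubIssueChecked: (slug: string, pr: LivePr, id: number) => Promise<boolean>;
  cleanup: (slug: string, branch: string, pr: LivePr | null) => Promise<void>;
};

// Condensed copy of the orchestration: cleanup runs only on the success path.
async function runLiveSmoke(io: LiveSmokeIO): Promise<number> {
  const { slug, branch, branchUrl } = await io.authorEpic();
  const settled = await io.dispatch(slug);
  if (settled === "failed") {
    io.log(`FAIL: dispatch failed. Surviving branch: ${branchUrl}`);
    return 1;
  }
  if (settled === "waiting-human") {
    await io.answerQuestion(slug);
    await io.awaitResume(slug);
  }
  const pr = await io.findEpicPr(slug);
  if (!pr) {
    io.log(`FAIL: no draft PR opened. Surviving branch: ${branchUrl}`);
    return 1;
  }
  if (!pr.isDraft || !(await io.isSubIssueChecked(slug, pr, 1))) {
    io.log(`FAIL: PR #${pr.number} is not a draft or sub-issue #1 is unchecked: ${pr.url}`);
    return 1;
  }
  await io.cleanup(slug, branch, pr);
  return 0;
}

// Hypothetical fake: happy path by default, records which IO methods were called.
function makeFakeIO(overrides: Partial<LiveSmokeIO> = {}): { io: LiveSmokeIO; calls: string[] } {
  const calls: string[] = [];
  const pr: LivePr = { number: 7, isDraft: true, url: "https://example.invalid/pull/7" };
  const io: LiveSmokeIO = {
    log: () => {},
    authorEpic: async () => {
      calls.push("authorEpic");
      return { slug: "s", branch: "b", branchUrl: "https://example.invalid/tree/b" };
    },
    dispatch: async () => {
      calls.push("dispatch");
      return "waiting-human";
    },
    answerQuestion: async () => {
      calls.push("answerQuestion");
    },
    awaitResume: async () => {
      calls.push("awaitResume");
    },
    findEpicPr: async () => {
      calls.push("findEpicPr");
      return pr;
    },
    isSubIssueChecked: async () => {
      calls.push("isSubIssueChecked");
      return true;
    },
    cleanup: async () => {
      calls.push("cleanup");
    },
    ...overrides, // a test replaces only the step it wants to break
  };
  return { io, calls };
}

async function demo(): Promise<void> {
  const happy = makeFakeIO();
  console.log(await runLiveSmoke(happy.io), happy.calls.includes("cleanup")); // 0 true
  const noPr = makeFakeIO({ findEpicPr: async () => null });
  console.log(await runLiveSmoke(noPr.io), noPr.calls.includes("cleanup")); // 1 false
}
void demo();
```

Overriding a single method per test keeps each failure case to one line and makes the "never clean up a failure" rule easy to assert: `cleanup` should appear in `calls` only when the exit code is 0.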