fix(recommender): run on the daemon's engine, not a standalone second one by thejustinwalsh · Pull Request #150 · thejustinwalsh/middle

thejustinwalsh · 2026-05-25T17:10:53Z

The bug

The recommender never actually ran from the daemon — the autonomous loop's first link was broken end to end (found by dogfooding: mm start chose nothing for 30+ min; mm run-recommender failed with "port 4120 in use"; the daemon trigger returned 202 enqueued but created no workflow row, no tmux, no run).

Root cause: dispatchRecommender was the Phase-7 standalone path and never migrated to the daemon when dispatch was. It stands up its own HookServer (binds the dispatcher port → EADDRINUSE when the daemon is up) and its own embedded engine (a second in-process engine inside the daemon that never processes the job). Dispatch, by contrast, runs on the daemon's one long-lived engine. The asymmetry was the bug.

The fix — make the recommender use the daemon, exactly like dispatch

Recommender workflow: per-repo settings via an optional resolveRunSettings(repo) resolver (schemaPath/config/repoConfig/agentTimeoutMs), so one registration on the daemon's long-lived engine serves every managed repo — mirroring how the implementation workflow resolves per-repo. The static fields remain as the fallback for the standalone runner (backward-compatible).
main.ts: registers the recommender on the daemon's engine; runRecommenderForRepo now calls engine.start("recommender", …) on it, reusing the daemon's HookServer/sessionGate and dispatcherUrl, exactly like startDispatchImpl.
mm run-recommender: now a thin client — auto-starts the daemon (like mm dispatch) and POSTs /trigger/recommender. No standalone engine, no port clash.
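
As a rough illustration of the per-repo resolver with a static fallback, here is a minimal sketch. Only `resolveRunSettings` and the field names (`schemaPath`, `config`, `repoConfig`, `agentTimeoutMs`) come from the description above; the exact types, the `runSettings` helper signature, and the error message are assumptions and may not match recommender.ts.

```typescript
// Hypothetical shapes: field names are from the PR text, the types are assumed.
interface RecommenderRunSettings {
  schemaPath: string;
  config: Record<string, unknown>;
  repoConfig: Record<string, unknown>;
  agentTimeoutMs?: number;
}

interface RecommenderDeps {
  // Static fields: the fallback used by the standalone runner.
  schemaPath?: string;
  config?: Record<string, unknown>;
  repoConfig?: Record<string, unknown>;
  agentTimeoutMs?: number;
  // Per-repo resolver: set by the daemon so one registration serves every repo.
  resolveRunSettings?: (repo: string) => RecommenderRunSettings;
}

// Prefer the resolver; fall back to the static fields; fail loudly if neither is wired.
function runSettings(deps: RecommenderDeps, repo: string): RecommenderRunSettings {
  if (deps.resolveRunSettings) {
    return deps.resolveRunSettings(repo);
  }
  if (
    deps.schemaPath === undefined ||
    deps.config === undefined ||
    deps.repoConfig === undefined
  ) {
    // Assumed wording; the PR only says a guard exists for the no-settings wiring error.
    throw new Error(`recommender: no run settings wired for ${repo}`);
  }
  return {
    schemaPath: deps.schemaPath,
    config: deps.config,
    repoConfig: deps.repoConfig,
    agentTimeoutMs: deps.agentTimeoutMs,
  };
}
```

With this shape the daemon passes only `resolveRunSettings`, the standalone runner passes only the static fields, and a registration that passes neither fails on first use instead of producing a half-run.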

Integration-verified (per #143/#144)

recommender-workflow.test.ts — proves the daemon path RUNS: registers the workflow with resolveRunSettings on a real engine, drives it to completion, and asserts a recommender workflow row is created (the row the dead-engine path never made) + the per-repo resolver was used + auto-dispatch fired from the resolved config. Plus a guard test for the no-settings wiring error.
run-recommender.test.ts — rewritten for the thin-client contract: auto-starts the daemon when down, POSTs the trigger, relays a non-202 rejection, handles never-ready + unreachable.
Full suite 673 pass, lint/typecheck/format clean.

Verify

bun run typecheck && bun test
# live: mm start  (the cron now fires the recommender on the daemon's engine);
#       mm run-recommender <repo>  (thin client, auto-starts the daemon)

Follow-up still open: a healthy mm start should kick one recommender+auto-dispatch pass immediately rather than waiting for the first cron interval, and a failed recommender launch shouldn't burn the full interval (both noted while debugging).

Summary by CodeRabbit

New Features
- run-recommender now acts as a thin client to a long‑lived daemon: auto-starts, health-checks, and relays trigger requests; per-repo settings can be resolved at runtime with configurable timeouts.
Bug Fixes
- Improved handling of daemon connection/readiness and clearer exit codes on dispatcher rejections/failures.
Tests
- Expanded coverage for daemon client behavior, per‑repo resolution, timeouts, and failure cases.
API Changes
- Minor public API surface updates: new/exported types and an exported helper surfaced.

… one

The recommender never actually ran from the daemon. dispatchRecommender stood up its OWN HookServer (on the dispatcher port → EADDRINUSE for mm run-recommender while the daemon was up) AND its OWN embedded engine inside the daemon (a second in-process engine that never processed the job → 202 'enqueued' but no workflow row, no tmux, no run). Dispatch was migrated to the daemon's engine long ago; the recommender was left on the Phase-7 standalone path. The autonomous loop's first link was broken end to end.

- recommender workflow: per-repo settings via an optional resolveRunSettings(repo) resolver (schemaPath/config/repoConfig/agentTimeoutMs), so ONE registration on the daemon's long-lived engine serves every managed repo, mirroring how the implementation workflow resolves per-repo. Static fields stay as the fallback for the standalone runner (backward-compatible).
- main.ts: register the recommender on the daemon's engine; runRecommenderForRepo now calls engine.start("recommender", …) on that engine (reusing the daemon's HookServer/sessionGate + dispatcherUrl), exactly like startDispatchImpl.
- mm run-recommender: now a thin client that auto-starts the daemon (like mm dispatch) and POSTs /trigger/recommender; no more standalone engine/port clash.
- integration test: proves the daemon path RUNS the recommender on the engine and creates the recommender workflow row (the row the dead-engine path never made).

coderabbitai · 2026-05-25T17:11:04Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: d836465f-f79e-4844-a316-4c28547f04c4

📥 Commits

Reviewing files that changed from the base of the PR and between 8434ad5 and 720172b.

📒 Files selected for processing (1)

packages/dispatcher/src/workflows/recommender.ts

📝 Walkthrough

This PR migrates recommender execution from in-process dispatch to a daemon-based client model and registers the recommender workflow on the daemon engine with per-repo run-settings resolution and agent-timeout backstops.

Changes

Recommender daemon migration with per-repo config resolution

Layer / File(s)	Summary
CLI daemon client interface `packages/cli/src/commands/run-recommender.ts`, `packages/cli/test/run-recommender.test.ts`	`RunRecommenderOptions` gains `startDaemon`, `probeHealth`, `healthTimeoutMs`, and `trigger` injection points with default implementations for health probing and HTTP trigger. `runRecommender` now derives the daemon URL from `dispatcherPort`, auto-starts on health probe failure, polls for readiness, and returns exit code `0` only for HTTP `202`; tests verify daemon lifecycle and trigger behavior under up/down/error conditions.
Workflow per-repo configuration contract `packages/dispatcher/src/workflows/recommender.ts`, `packages/dispatcher/test/recommender-workflow.test.ts`	Adds exported `RecommenderRunSettings` and makes `RecommenderDeps` static fields optional while adding `resolveRunSettings?: (repo: string) => RecommenderRunSettings`. `createRecommenderWorkflow` resolves per-run settings, uses them in prompt construction, clamps agent stop-await with `MAX_AGENT_TIMEOUT_MS`, uses per-repo `repoConfig` for validation, and reads `autoDispatch` from per-repo config. Tests updated for daemon-path and timeout backstop.
Daemon engine registration and enqueue `packages/dispatcher/src/main.ts`	Imports expanded for recommender wiring. `runRecommenderForRepo` now enqueues work via `engine.start("recommender", ...)` with try/catch and 500-on-enqueue-failure. Registers `createRecommenderWorkflow(...)` on the daemon engine and wires DB, adapter resolver, session gate, tmux/worktree helpers, state issue gateway, `ghSurfaceProblem`, and auto-dispatch. Documentation updated to note long-lived bunqueue engine use.
Export ghSurfaceProblem helper `packages/dispatcher/src/recommender-run.ts`	`ghSurfaceProblem` is exported to allow injection into the daemon workflow.
Tests and timeout backstop updates `packages/dispatcher/test/recommender-workflow.test.ts`, `packages/cli/test/run-recommender.test.ts`	Workflow tests updated to expect `spawn-recommender-agent` timeout sized against the per-repo ceiling (`MAX_AGENT_TIMEOUT_MS`) and new daemon-path tests added covering resolveRunSettings success and wiring failure cases; CLI tests cover daemon lifecycle and trigger behaviors.

Sequence Diagram(s)

sequenceDiagram
  participant CLI
  participant Daemon
  participant DispatcherEngine
  participant RecommenderWorkflow
  CLI->>Daemon: GET /health (probeHealth)
  alt daemon down
    CLI->>Daemon: startDaemon()
    CLI->>Daemon: poll /health until ready (healthTimeoutMs)
  end
  CLI->>Daemon: POST /trigger/recommender (trigger with repoPath)
  Daemon->>DispatcherEngine: engine.start("recommender", { repo })
  DispatcherEngine->>RecommenderWorkflow: enqueue run -> runSettings(repo) -> spawn agent
  RecommenderWorkflow-->>DispatcherEngine: complete/failed
  Daemon-->>CLI: HTTP 202 / other status -> CLI maps to exit code
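
The client side of the sequence above can be sketched as follows. The injection points (`startDaemon`, `probeHealth`, `healthTimeoutMs`, `trigger`), `dispatcherPort`, the `/health` and `/trigger/recommender` paths, and "exit 0 only for 202" are taken from the walkthrough; the `127.0.0.1` host, the `pollIntervalMs` option, the callback signatures, and exit code `1` for every failure are assumptions. The real command also ships default implementations for probing and triggering, which this sketch leaves out so that it stays runnable without a network.

```typescript
interface RunRecommenderOptions {
  repoPath: string;
  dispatcherPort: number;
  healthTimeoutMs: number;
  startDaemon: () => Promise<void>;
  probeHealth: (url: string) => Promise<boolean>;
  // Resolves to the HTTP status of POST /trigger/recommender.
  trigger: (url: string, repoPath: string) => Promise<number>;
  pollIntervalMs?: number; // hypothetical knob, not named in the PR
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function runRecommender(opts: RunRecommenderOptions): Promise<number> {
  const base = `http://127.0.0.1:${opts.dispatcherPort}`; // assumed host
  const healthUrl = `${base}/health`;
  try {
    if (!(await opts.probeHealth(healthUrl))) {
      // Daemon is down: start it, then poll until it is ready or the timeout expires.
      await opts.startDaemon();
      const deadline = Date.now() + opts.healthTimeoutMs;
      let ready = await opts.probeHealth(healthUrl);
      while (!ready && Date.now() < deadline) {
        await sleep(opts.pollIntervalMs ?? 100);
        ready = await opts.probeHealth(healthUrl);
      }
      if (!ready) return 1; // never became ready
    }
    const status = await opts.trigger(`${base}/trigger/recommender`, opts.repoPath);
    return status === 202 ? 0 : 1; // only "enqueued" counts as success
  } catch {
    return 1; // daemon unreachable or failed to start
  }
}
```

Because every side effect is injected, a test can exercise the up, down, never-ready, and rejected cases with plain fakes and no real daemon.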

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

thejustinwalsh/middle#142: Related changes around daemon-triggered recommender dispatch and cron/managed-repo triggering that integrate with the same engine-driven recommender path.
thejustinwalsh/middle#123: Both PRs modify how agentTimeoutMs flows into the recommender workflow and adjust timeout/backstop behavior.
thejustinwalsh/middle#105: Introduced the /trigger/recommender endpoint this PR’s CLI targets; this PR extends that endpoint’s workflow with daemon-mode per-repo config resolution.

Suggested labels

ready-for-review

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: converting recommender execution from a standalone second engine to using the daemon's long-lived engine, which is the core bug fix across all modified files.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.

Comment @coderabbitai help to get the list of available commands and usage tips.

thejustinwalsh · 2026-05-25T17:11:14Z

@coderabbitai review

coderabbitai · 2026-05-25T17:11:19Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/dispatcher/src/workflows/recommender.ts`:
- Around line 516-522: The step timeout for "spawn-recommender-agent" uses the
static deps.agentTimeoutMs which is unset in daemon mode; change the timeout to
derive the agent timeout from the per-repo resolver first
(runSettings(repo).agentTimeoutMs) falling back to deps.agentTimeoutMs and then
to DEFAULT_AGENT_TIMEOUT_MS, and use that computed value in the timeout
expression along with launchTimeout (so the step timeout mirrors the same
per-repo agentTimeoutMs used earlier around awaitStop).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: efab13dc-bd2c-4397-b68f-3a66b01c98a1

📥 Commits

Reviewing files that changed from the base of the PR and between c7f2ed4 and 12c4300.

📒 Files selected for processing (6)

packages/cli/src/commands/run-recommender.ts
packages/cli/test/run-recommender.test.ts
packages/dispatcher/src/main.ts
packages/dispatcher/src/recommender-run.ts
packages/dispatcher/src/workflows/recommender.ts
packages/dispatcher/test/recommender-workflow.test.ts

…backstop covers

CodeRabbit on #150: the step's registration-time timeout read the static agentTimeoutMs (unset → 15min default in daemon mode) while awaitStop read the per-repo resolveRunSettings value, so a repo configured above the default would trip the generic step timeout before its own specific Stop-await. Add a MAX_AGENT_TIMEOUT_MS (30min) ceiling: clamp the per-repo awaitStop to it and size the step backstop to it, so the internal timeout always fires first.
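
The timeout arithmetic in this commit can be sketched as below. The constant names and the 15 and 30 minute values come from the commit message and the review comment; the helper names `awaitStopTimeoutMs` and `stepBackstopMs`, and the extra `marginMs`, are hypothetical and only illustrate why the specific Stop-await fires before the generic step timeout.

```typescript
const MINUTE_MS = 60_000;
const DEFAULT_AGENT_TIMEOUT_MS = 15 * MINUTE_MS; // daemon-mode default named in the commit
const MAX_AGENT_TIMEOUT_MS = 30 * MINUTE_MS; // ceiling added by the commit

// Run-time Stop-await: the per-repo value if set, else the default, never above the ceiling.
function awaitStopTimeoutMs(perRepoMs?: number): number {
  return Math.min(perRepoMs ?? DEFAULT_AGENT_TIMEOUT_MS, MAX_AGENT_TIMEOUT_MS);
}

// Registration-time step backstop: it cannot see per-repo values, so it is sized
// to the ceiling plus the launch timeout (and an assumed safety margin).
function stepBackstopMs(launchTimeoutMs: number, marginMs: number = 30_000): number {
  return MAX_AGENT_TIMEOUT_MS + launchTimeoutMs + marginMs;
}
```

For any per-repo value, `awaitStopTimeoutMs(x) + launchTimeoutMs` stays below `stepBackstopMs(launchTimeoutMs)`, which is the ordering the commit relies on.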

thejustinwalsh · 2026-05-25T17:20:24Z

@coderabbitai review

coderabbitai · 2026-05-25T17:20:29Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

packages/dispatcher/test/recommender-workflow.test.ts (1)

647-649: ⚡ Quick win

Tighten the wiring-guard assertion.

This path throws after prepare-shallow-worktree, so the DB row should end in compensated and the worktree should be gone. Allowing "failed" here would still pass if compensation regressed.

💡 Suggested fix

     const id = await runToEnd(broken);
-    // The guard fails the run (and compensation rolls the worktree back) rather
-    // than silently producing a half-run — exactly the failure mode we're fixing.
-    expect(["failed", "compensated"]).toContain(getWorkflow(db, id)!.state);
+    expect(getWorkflow(db, id)!.state).toBe("compensated");
+    expect(await listWorktrees({ repoPath, worktreeRoot })).toEqual([]);

Based on learnings: launching, waiting-human, rate-limited, and compensated are the DB-only workflow states.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/dispatcher/test/recommender-workflow.test.ts` around lines 647 -
649, Tighten the assertion so the test requires the workflow DB state to be
"compensated" (not allowing "failed") after the error thrown
post-prepare-shallow-worktree, and add an assertion that the worktree was
removed; update the expect that currently calls getWorkflow(db, id)!.state to
assert === "compensated" and add a check (using the existing worktree lookup
helper used elsewhere in this test file) that the worktree no longer exists for
id.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/dispatcher/src/workflows/recommender.ts`:
- Around line 409-414: The workflow repeatedly calls
runSettings(ctx.input.repo)/resolveRunSettings(repo) across build, spawn, verify
and auto-dispatch which lets live config changes mix settings mid-run; instead
call and await runSettings(ctx.input.repo) once at the start of the execution
(store it in a local const like resolvedRunSettings or runCfg) and replace all
subsequent calls (including the agentTimeout calculation and checks for
schemaPath/repoConfig/autoDispatch) to read from that single frozen object so
every step uses the same resolved settings for the entire run.

---

Nitpick comments:
In `@packages/dispatcher/test/recommender-workflow.test.ts`:
- Around line 647-649: Tighten the assertion so the test requires the workflow
DB state to be "compensated" (not allowing "failed") after the error thrown
post-prepare-shallow-worktree, and add an assertion that the worktree was
removed; update the expect that currently calls getWorkflow(db, id)!.state to
assert === "compensated" and add a check (using the existing worktree lookup
helper used elsewhere in this test file) that the worktree no longer exists for
id.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 18040610-fa04-4062-9e61-18962b4a7925

📥 Commits

Reviewing files that changed from the base of the PR and between 12c4300 and 8434ad5.

📒 Files selected for processing (2)

packages/dispatcher/src/workflows/recommender.ts
packages/dispatcher/test/recommender-workflow.test.ts

…steps

CodeRabbit on #150: resolveRunSettings was re-called in build/spawn/verify/trigger, so a live config edit mid-run could mix schemaPath/config/repoConfig/autoDispatch/agentTimeoutMs within one execution. Resolve once in build-prompt, return it on BuildPromptResult, and read it from ctx in the later steps.
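
A minimal sketch of the resolve-once pattern this commit describes, under stated assumptions: `BuildPromptResult` and `resolveRunSettings` are named in the commit, while the reduced `RunSettings` shape, the prompt text, and the `shouldAutoDispatch` step are invented for illustration and do not reflect the real workflow steps.

```typescript
// Reduced, hypothetical settings shape; the real one carries more fields.
interface RunSettings {
  schemaPath: string;
  autoDispatch: boolean;
  agentTimeoutMs: number;
}

interface BuildPromptResult {
  prompt: string;
  // Snapshot taken once; later steps read this instead of re-resolving.
  settings: Readonly<RunSettings>;
}

// First step: resolve exactly once and freeze a copy onto the step result.
function buildPrompt(
  repo: string,
  resolveRunSettings: (repo: string) => RunSettings,
): BuildPromptResult {
  const settings = Object.freeze({ ...resolveRunSettings(repo) });
  return {
    prompt: `Recommend next work for ${repo} (schema: ${settings.schemaPath})`,
    settings,
  };
}

// A later step: reads the snapshot carried from build-prompt, never the live config.
function shouldAutoDispatch(build: BuildPromptResult): boolean {
  return build.settings.autoDispatch;
}
```

If the live config changes between the two calls, `shouldAutoDispatch` still answers from the values captured in `buildPrompt`, so one execution never mixes old and new settings.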

thejustinwalsh · 2026-05-25T18:02:32Z

@coderabbitai review

coderabbitai · 2026-05-25T18:02:40Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

…l on a failed launch

Two follow-ups from #150's debugging:

- mm start now runs one recommender due-check pass immediately (any overdue managed repo fires now → auto-dispatch on completion) and nudges auto-dispatch for every managed repo, instead of idling until the first cron tick (up to a full interval). A fresh restart works in seconds, not 15 minutes.
- The cron stamps last_recommender_run before firing (overlap guard) but now ROLLS IT BACK on a failed launch (setLastRecommenderRun to the prior value), so a failure retries on the next tick rather than going quiet for a full interval.
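
The stamp-then-roll-back behaviour in the second bullet can be sketched like this. `setLastRecommenderRun` is named in the commit; the `getLastRecommenderRun` reader, the `RecommenderStampStore` interface, the `fireIfDue` function, and its return values are hypothetical stand-ins for whatever the daemon's cron actually uses.

```typescript
// Hypothetical store interface; only setLastRecommenderRun is named in the commit.
interface RecommenderStampStore {
  getLastRecommenderRun(repo: string): number | null;
  setLastRecommenderRun(repo: string, at: number | null): void;
}

type FireResult = "skipped" | "launched" | "rolled-back";

async function fireIfDue(
  store: RecommenderStampStore,
  repo: string,
  now: number,
  intervalMs: number,
  launch: (repo: string) => Promise<void>,
): Promise<FireResult> {
  const prior = store.getLastRecommenderRun(repo);
  if (prior !== null && now - prior < intervalMs) return "skipped"; // not due yet

  // Stamp before firing so an overlapping tick sees the repo as already handled.
  store.setLastRecommenderRun(repo, now);
  try {
    await launch(repo);
    return "launched";
  } catch {
    // Failed launch: restore the prior stamp so the next tick retries.
    store.setLastRecommenderRun(repo, prior);
    return "rolled-back";
  }
}
```

The same function could serve the startup pass in the first bullet: calling it once for every managed repo at boot fires any repo that is already overdue.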

coderabbitai Bot requested changes May 25, 2026

Comment thread packages/dispatcher/src/workflows/recommender.ts

coderabbitai Bot requested changes May 25, 2026

Comment thread packages/dispatcher/src/workflows/recommender.ts

coderabbitai Bot approved these changes May 25, 2026

thejustinwalsh merged commit 975a064 into main May 25, 2026
1 check passed

thejustinwalsh deleted the fix/recommender-on-daemon-engine branch May 25, 2026 18:08

thejustinwalsh mentioned this pull request May 25, 2026

fix(daemon): kick recommender on startup + don't burn the interval on a failed launch #151

Merged

3 tasks

coderabbitai Bot mentioned this pull request May 26, 2026

fix(dispatcher): resolve state-issue schema from the package, not per-repo #157

Merged

1 task

thejustinwalsh mentioned this pull request May 29, 2026

Make the CodexAdapter functionally dispatchable against live codex 0.133.0 #177

Closed

9 tasks

coderabbitai Bot mentioned this pull request Jun 4, 2026

fix(dispatcher): multi-repo coordination — close the real holes #229

Merged

3 tasks
