Skip to content

refactor(core): migrate mock.module() to spyOn() — single bun test invocation#1300

Closed
cryptobirr wants to merge 12 commits intocoleam00:devfrom
cryptobirr:archon/task-agent-skills-issue-grinder-1776661428960
Closed

refactor(core): migrate mock.module() to spyOn() — single bun test invocation#1300
cryptobirr wants to merge 12 commits intocoleam00:devfrom
cryptobirr:archon/task-agent-skills-issue-grinder-1776661428960

Conversation

@cryptobirr
Copy link
Copy Markdown

@cryptobirr cryptobirr commented Apr 20, 2026

Summary

This PR accumulates fixes from the issue grinder pipeline across multiple issues:

Fixes #966
Fixes #967
Fixes #969
Fixes #970
Fixes #971
Fixes #1237
Fixes #1266
Fixes #1302

Test plan

  • `bun run validate` passes (type-check, lint, format, all tests green)
  • 823 tests pass in single `bun test src/` invocation for `@archon/core`

🤖 Generated with Claude Code

Converts all internal module mock.module() calls to spyOn() with
mockRestore() so @archon/core tests can run as a single bun test
invocation instead of 10 separate batches.

Key changes:
- Removed mock.module() for all internal modules (../db/*, @archon/git,
  ../handlers/command-handler, ../workflows/store-adapter,
  ../config/config-loader, ../utils/worktree-sync, ./orchestrator,
  ./prompt-builder, ../utils/error*, ../services/*, ./connection) and
  replaced with spyOn() + mockRestore() in beforeEach/afterEach
- Converted mock.module('@archon/paths') to spyOn across all 15 test
  files so real exports (BUNDLED_IS_BINARY, ensureProjectStructure, etc.)
  remain accessible to transitively-imported real modules
- Added resetLogForTest() export to orchestrator-agent.ts to clear
  cachedLog between test files (same pattern as resetDatabase in
  connection.ts)
- Simplified test script to: bun test src/

Fixes coleam00#966

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 20, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Converted many tests in packages/core from Bun's process-wide mock.module() to per-test spyOn() with beforeEach/afterEach teardown; simplified core test script to bun test src/; added resetLogForTest() and conditional message persistence in orchestrator-agent; changed corrupt commands handling to surface parse errors; forced fresh session on a resumed loop iteration; added Claude warmup helper and server warm-up call.

Changes

Cohort / File(s) Summary
Test runner
packages/core/package.json
Replaced explicit enumerated bun test invocations with a single directory-scoped bun test src/.
Paths / logger test harness
packages/core/src/config/config-loader.test.ts, packages/core/src/utils/worktree-sync.test.ts, packages/core/src/services/title-generator.test.ts, packages/core/src/handlers/clone.test.ts
Replaced mock.module('@archon/paths') with per-test spyOn(paths, ...) setups for createLogger/path helpers and added afterEach restores.
DB connection & DB-layer tests
packages/core/src/db/...
adapters/postgres.test.ts, codebases.test.ts, conversations.test.ts, env-vars.test.ts, isolation-environments.test.ts, messages.test.ts, sessions.test.ts, workflow-events.test.ts, workflows.test.ts
Switched module-level mock.module('./connection') to runtime spyOn(connection, ...) for pool.query, getDialect, getDatabaseType, etc.; added beforeEach/afterEach spy lifecycle; added test asserting getCodebaseCommands rejects on corrupt JSON.
Handlers & command flow tests
packages/core/src/handlers/command-handler.test.ts, packages/core/src/handlers/clone.test.ts
Migrated many module-level mocks to per-test spyOn() for DB/service/path functions; introduced centralized setup/restore helpers and afterEach cleanup.
Operations tests
packages/core/src/operations/...
isolation-operations.test.ts, workflow-operations.test.ts
Replaced mock.module() stubs with spyOn() for @archon/git, @archon/paths, DB and service dependencies; added lifecycle restore to avoid cross-test pollution.
Orchestrator: code & tests
packages/core/src/orchestrator/...
orchestrator-agent.ts, orchestrator-agent.test.ts, orchestrator-isolation.test.ts, orchestrator.test.ts
Added exported resetLogForTest() to clear cached logger; handleMessage now persists user messages only for non-web adapters (errors logged, not thrown); extensive migration of orchestrator tests from mock.module() to per-test spyOn() with setup/restore.
Service tests
packages/core/src/services/...
cleanup-service.test.ts, title-generator.test.ts
Replaced module-level mocks for paths/git/DB with per-test spyOn() spies; added typed spy variables and afterEach restores.
Workflows: executor & tests
packages/workflows/src/dag-executor.ts, packages/workflows/src/dag-executor.test.ts, packages/core/src/workflows/store-adapter.test.ts
Changed loop session threading: resumed interactive loop's first resumed iteration now forces fresh session (resumeSessionId undefined); updated tests accordingly; store-adapter tests migrated to spyOn() for DB/config.
Server API & schema
packages/server/src/routes/api.ts, packages/server/src/routes/schemas/codebase.schemas.ts
When persisted commands JSON fails to parse, handlers now include an optional commandsParseError string in successful responses; schema extended to document optional commandsParseError.
UI text change
packages/web/src/components/dashboard/WorkflowRunCard.tsx
Reject action text and dialog now switch between "Request Changes" (when approval.onReject exists) and "Cancel" otherwise; confirm labels and dialog copy updated.
Claude provider warmup
packages/providers/src/claude/warm.ts, packages/providers/src/index.ts, packages/server/src/index.ts
Added warmClaudeSubprocess() that conditionally calls @anthropic-ai/claude-agent-sdk startup; re-exported it and invoked it in server startup as a non-blocking warmup with errors caught and logged.
Widespread test migrations
multiple orchestrator, agent, workflow, store tests (see raw summary)
Large-scale migration from mock.module() to spyOn() across many tests; added beforeEach/afterEach lifecycle spy setup and teardown to enable single-process test runs.

Sequence Diagram(s)

sequenceDiagram
    participant Server
    participant Providers as `@archon/providers`
    participant ClaudeSDK as anthropic/claude-agent-sdk

    Server->>Providers: import warmClaudeSubprocess
    Server->>Providers: warmClaudeSubprocess() (async, fire-and-forget)
    Providers->>ClaudeSDK: dynamic import('@anthropic-ai/claude-agent-sdk')
    alt startup exists
        ClaudeSDK->>ClaudeSDK: await startup()
        ClaudeSDK-->>Providers: resolved
        Providers-->>Server: log claude.warm_success (info)
    else startup missing
        ClaudeSDK-->>Providers: undefined/startup absent
        Providers-->>Server: log claude.warm_unavailable (debug)
    end
    opt error
        Providers-->>Server: log claude.warm_unexpected_error (warn/debug)
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Possibly related PRs

Poem

"I'm a rabbit in the test garden, quick and bright,
I swapped big mocks for spies and kept tests tight.
One bun command now runs the whole show,
Claude wakes with a whisper, logs in a row.
Fresh sessions and clearer errors — I hop with delight! 🐇"

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 55.56% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ⚠️ Warning PR description lacks required template sections: UX Journey, Architecture Diagram, Validation Evidence, Security Impact, Human Verification, Side Effects/Blast Radius, and Rollback Plan. Complete the required sections from the description template, or mark sections as 'N/A' if not applicable. Minimum: Validation Evidence, Security Impact, Human Verification, and Rollback Plan must be addressed.
Out of Scope Changes check ❓ Inconclusive Changes are tightly scoped to test refactoring and supporting changes. Additional scope includes: dag-executor loop session resumption test expectation update, codebases error handling change, server API response schema extension, and WorkflowRunCard UI conditional logic—all appear related to a broader feature/fix effort beyond the core mock.module migration. Clarify whether dag-executor, codebases error handling, server API schema, and WorkflowRunCard changes are intentionally part of this PR or should be separated into distinct PRs. Consider whether they should reference different/additional linked issues.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed Title clearly summarizes the main refactoring effort: migrating from mock.module() to spyOn() and consolidating test invocations into a single bun test command.
Linked Issues check ✅ Passed All primary objectives from issue #966 are met: mock.module() converted to spyOn() across 23 core test files, resetLogForTest() added following resetDatabase() pattern, package.json test script simplified to 'bun test src/', and acceptance criteria achieved (818 tests pass in single invocation with 0 failures).

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

When an interactive loop gate pauses for human input, the Claude SDK
session from iteration 1 may be idle for minutes or hours. Resuming it
on iteration 2+ causes error_during_execution.

Fix: treat the first resumed iteration (isLoopResume && i ===
startIteration) as needing a fresh session, same as i === 1. The user's
input already flows via \$LOOP_USER_INPUT, so session continuity is not
needed.

Fixes coleam00#1208

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
packages/core/src/services/title-generator.test.ts (1)

18-20: ⚠️ Potential issue | 🟠 Major

Migrate internal module mocks to spyOn() to fix test isolation violations.

Lines 18–20 and 40–42 mock internal modules using mock.module(). Per the coding guideline, this approach is prohibited for internal modules when other test files mock the same modules with different implementations. packages/core/src/workflows/store-adapter.test.ts (line 28) also mocks @archon/providers with a different implementation, creating cross-file test pollution that makes test execution order-dependent. Switch both mocks to spyOn().

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/services/title-generator.test.ts` around lines 18 - 20, The
test currently uses mock.module() to replace the internal module export
updateConversationTitle; replace that with a jest.spyOn on the actual module
export (the updateConversationTitle function from '../db/conversations') and
provide the same mock implementation via mockImplementation/mockResolvedValue so
mocks are local to the test runtime; ensure you restore the spy after each test
(use mockRestore or jest.restoreAllMocks in afterEach) to avoid cross-file
pollution—also update the other instance in this file (the second mock.module
call around lines 40–42) the same way so both internal-module mocks use spyOn
instead of mock.module.
packages/core/src/services/cleanup-service.test.ts (1)

35-47: ⚠️ Potential issue | 🟠 Major

Migrate mock.module() calls to spyOn() to prevent test isolation violations.

Both mock.module('../isolation', ...) and mock.module('@archon/isolation', ...) are used by other test files (orchestrator-agent.test.ts, orchestrator-isolation.test.ts, command-handler.test.ts) with different implementations. These conflicting mocks will interfere when tests run in a single bun test src/ invocation. Use spyOn() instead to isolate mocks to this test file, or split into separate bun test invocations per the coding guidelines on test isolation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/services/cleanup-service.test.ts` around lines 35 - 47, The
tests currently use mock.module(...) to override getIsolationProvider and
getPrState which causes cross-test interference; replace those mock.module calls
with spyOn(...) on the imported module functions (spyOn(require('../isolation'),
'getIsolationProvider') and spyOn(require('@archon/isolation'),
'getIsolationProvider') / spyOn(require('@archon/isolation'), 'getPrState')) and
have them return the same mock implementations (mockDestroy, mockGetPrState)
used now; ensure you restore or clear the spies in afterEach to avoid leaking
between tests.
packages/core/src/orchestrator/orchestrator-agent.test.ts (1)

94-110: ⚠️ Potential issue | 🟠 Major

Migrate the workflow mocks to spyOn() to improve test robustness.

These @archon/workflows/* modules have different mock implementations in orchestrator.test.ts (e.g., tool-formatter, workflow-discovery). While the package.json test script currently isolates these files in separate invocations (preventing cross-file pollution), using process-wide mock.module() creates fragility. Migrate to spyOn() with imports as namespaces in beforeEach / afterEach to make mocks reversible and test order-independent, matching the pattern already used for other mocked modules in this file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/orchestrator/orchestrator-agent.test.ts` around lines 94 -
110, Replace the process-wide mock.module calls for tool-formatter,
workflow-discovery, router, and executor with reversible spyOn-based mocks:
import each module as a namespace and in beforeEach create spies for
formatToolCall, discoverWorkflowsWithConfig, findWorkflow, and executeWorkflow
(using the existing mock implementations mockDiscoverWorkflowsWithConfig and
mockExecuteWorkflow where appropriate), and restore/reset those spies in
afterEach so tests are order-independent and mocks are reversible; update any
test references to call the spy instances instead of relying on mock.module
globals.
🧹 Nitpick comments (6)
packages/core/src/workflows/store-adapter.test.ts (2)

28-30: Migrate this mock.module('@archon/providers', ...) to spyOn().

Same concern as in title-generator.test.ts: @archon/providers is an internal monorepo package mocked by multiple test files with different implementations, which is the exact pollution pattern this PR eliminates for other modules. Since the test only uses getAgentProvider and only to assert typeof deps.getAgentProvider === 'function', a spyOn(providers, 'getAgentProvider') in beforeEach would suffice.

As per coding guidelines: "never use mock.module() for modules that another test file also mock.module()s with different implementations".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/workflows/store-adapter.test.ts` around lines 28 - 30,
Replace the module-level mock.module('@archon/providers', ...) with a spyOn in
the test setup: import the providers module used in the test and in beforeEach
call spyOn(providers, 'getAgentProvider') to provide the minimal implementation
needed (e.g., a stub function) so the assertion that typeof
deps.getAgentProvider === 'function' still passes; remove the mock.module call
and ensure spies are cleaned up in afterEach (restore the spy or call
restoreAllMocks) so other tests aren’t polluted.

32-125: LGTM on the spy migration.

The spyOn/mockRestore lifecycle is correctly wired for all 17 DB/config spies, and beforeEach re-establishes implementations after afterEach restores. One optional cleanup: the large block of individual spy*.mockRestore() calls could be collapsed into a spies: Array<ReturnType<typeof spyOn>> = [] pattern pushed to during beforeEach and iterated in afterEach — reduces boilerplate and prevents future drift when spies are added/removed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/workflows/store-adapter.test.ts` around lines 32 - 125, The
test file has many individual spy variables (e.g., spyDbWorkflowsCreateRun,
spyDbWorkflowsGetRun, spyDbWorkflowEventsCreate, spyConfigLoaderLoad) each
restored in afterEach; replace that boilerplate by collecting spyOn returns into
a single array (e.g., const spies: Array<ReturnType<typeof spyOn>> = []) inside
beforeEach as you create each spy (push each spy into spies) and then in
afterEach iterate spies.forEach(s => s.mockRestore()); keep the same spyOn calls
and mock implementations (mockCreateWorkflowRun, mockGetWorkflowRun, etc.) but
stop declaring/restore-ing each spy individually.
packages/core/src/services/cleanup-service.test.ts (1)

95-189: LGTM on the git/db spy setup.

The 24-spy beforeEach/afterEach pair is mechanical but correct. Consider collapsing into an array-tracked pattern (see suggestion on store-adapter.test.ts) to reduce maintenance cost — this file is the largest user of the pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/services/cleanup-service.test.ts` around lines 95 - 189,
The beforeEach/afterEach block for creating and restoring 24 spies is
repetitive; refactor by collecting spy setup info into an array and iterating to
create/restore them. Create an array of descriptors (e.g., {obj: git, method:
'execFileAsync', mock: mockExecFileAsync, varName: 'spyGitExecFileAsync'}) and
in beforeEach loop call spyOn(descriptor.obj,
descriptor.method).mockImplementation(descriptor.mock) (or mockReturnValue where
appropriate) storing each spy in a map/array; in afterEach iterate the same
collection and call mockRestore() on each stored spy. Update references to the
existing spy variables (spyPathsCreateLogger, spyGitExecFileAsync,
spyDbIsolationListAllActive, etc.) to read from the new collection/map so tests
still access the spies via their original identifiers.
packages/core/src/config/config-loader.test.ts (1)

25-29: Self-mock of module under test is unusual but acceptable.

mock.module('./config-loader', ...) at module load replaces readConfigFile/writeConfigFile on the SUT itself while preserving other exports via spread. Unlike the other flagged cases, nothing else imports ./config-loader as './config-loader' from this directory, so cross-file pollution is unlikely. Still, a cleaner alternative would be to export readConfigFile/writeConfigFile via an injected dependency (or simply spyOn(realConfigLoader, 'readConfigFile') since realConfigLoader is already imported as a namespace at line 24) — that removes the only remaining mock.module from this file and keeps the PR's pattern consistent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/config/config-loader.test.ts` around lines 25 - 29, The
test currently uses mock.module('./config-loader', ...) to replace
readConfigFile and writeConfigFile on the module under test; replace this
self-mocking with spies on the imported namespace instead: remove the
mock.module call and use spyOn(realConfigLoader, 'readConfigFile') and
spyOn(realConfigLoader, 'writeConfigFile') (or inject the functions as
dependencies) so the rest of the module's real exports remain intact without
self-mocking and match the project's mocking pattern.
packages/core/src/orchestrator/orchestrator.test.ts (2)

311-317: Inline mock() inside setupSpies() is unreachable for assertions.

The inline mock for storeAdapter.createWorkflowDeps is created fresh on every setupSpies() call and never stored, so no test can assert on its call args/counts. It also allocates a new function each beforeEach. If you ever need to verify createWorkflowDeps interactions, promote it to a module-level const mockCreateWorkflowDeps = mock(...) (parallel to the other mocks) and reuse it here. Currently fine because no test needs it, flagging as optional.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/orchestrator/orchestrator.test.ts` around lines 311 - 317,
The inline mock passed to spyOn for storeAdapter.createWorkflowDeps inside
setupSpies is recreated on every beforeEach and not exposed for assertions;
extract it to a module-level constant (e.g., mockCreateWorkflowDeps) alongside
the other mocks and then use that constant in spyStoreAdapterCreate =
spyOn(storeAdapter,
'createWorkflowDeps').mockImplementation(mockCreateWorkflowDeps) so tests can
inspect call counts/args and avoid allocating a new function each setup; update
references to setupSpies, spyStoreAdapterCreate, and
storeAdapter.createWorkflowDeps accordingly.

531-535: Redundant restoreSpies() at the top of beforeEach.

afterEach already restores spies after every test, so calling restoreSpies() again before setupSpies() is a no-op on the first run (optional chaining) and double-restore on later runs. Safe, but it muddies the lifecycle and can mask ordering bugs. Consider removing it and relying on the afterEach hook.

♻️ Proposed simplification
   beforeEach(() => {
-    restoreSpies();
     setupSpies();
     resetLogForTest();
     clearAllMocks();
     platform = new MockPlatformAdapter();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/orchestrator/orchestrator.test.ts` around lines 531 - 535,
Remove the redundant call to restoreSpies() at the start of the beforeEach
block: rely on the afterEach hook to restore spies and keep beforeEach only
calling setupSpies(), resetLogForTest(), and clearAllMocks(); update the
beforeEach that currently references restoreSpies() so it no longer invokes
restoreSpies() before setupSpies(), leaving restore/teardown responsibility to
the afterEach cleanup.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/core/package.json`:
- Line 25: The test script in package.json currently runs multiple bun test
invocations and hard-codes some orchestrator files; change the "test" npm script
to a single package-local bun test invocation that covers all test files
deterministically (e.g., one bun test that targets src/** and
src/orchestrator/**/*.test.ts together instead of multiple commands), so update
the "test" script entry to run tests in one process and include the entire
orchestrator test directory rather than enumerating files.

In `@packages/core/src/db/env-vars.test.ts`:
- Around line 17-24: The module-scoped cached logger (cachedLog) in env-vars.ts
persists across tests so the spy installed by spyOn(paths, 'createLogger') is
bypassed; add a small test-only API to clear that state (e.g., export a
resetCachedLog or clearCachedLogger function in env-vars.ts that sets cachedLog
= undefined) and call it from tests' beforeEach so getLog() will re-create the
logger and invoke paths.createLogger spy, or alternatively remove the
module-scoped cachedLog and make getLog() retrieve/create the logger on each
call to avoid cross-test caching.

In `@packages/core/src/operations/workflow-operations.test.ts`:
- Around line 36-68: The module-level cached logger (cachedLog) in
workflow-operations.ts is not cleared between tests, causing getLog() to return
a stale logger across test cases; add and export a resetLogForTest() function
(following orchestrator-agent.ts pattern) that sets cachedLog back to
undefined/null, then call resetLogForTest() in the test file's afterEach()
alongside the existing mock restores so each test's spyOn(paths.createLogger)
can take effect.

In `@packages/core/src/orchestrator/orchestrator-agent.test.ts`:
- Around line 196-337: The test suite leaks a cached logger across tests; call
resetLogForTest() at the start of beforeEach and again at the end of afterEach
to clear the module-level logger cache used by orchestrator-agent tests (mirror
the pattern used in orchestrator.test.ts). Locate the existing beforeEach and
afterEach blocks in orchestrator-agent.test.ts and invoke resetLogForTest() (the
exported helper named resetLogForTest) immediately before setting up spies in
beforeEach and immediately after restoring spies in afterEach so mocked loggers
do not persist between tests.

In `@packages/core/src/utils/worktree-sync.test.ts`:
- Around line 33-36: The module-level cached logger in worktree-sync.ts causes
tests to bypass spyOn(paths, 'createLogger') after the first initialization; add
a test-only reset function resetWorktreeSyncLogForTest() in worktree-sync.ts
that clears the cached logger used by syncArchonToWorktree(), then call
resetWorktreeSyncLogForTest() in beforeEach and afterEach of the test (alongside
spyPathsCreateLogger setup/restore) so each test run re-triggers createLogger
and the spy is effective.

---

Outside diff comments:
In `@packages/core/src/orchestrator/orchestrator-agent.test.ts`:
- Around line 94-110: Replace the process-wide mock.module calls for
tool-formatter, workflow-discovery, router, and executor with reversible
spyOn-based mocks: import each module as a namespace and in beforeEach create
spies for formatToolCall, discoverWorkflowsWithConfig, findWorkflow, and
executeWorkflow (using the existing mock implementations
mockDiscoverWorkflowsWithConfig and mockExecuteWorkflow where appropriate), and
restore/reset those spies in afterEach so tests are order-independent and mocks
are reversible; update any test references to call the spy instances instead of
relying on mock.module globals.

In `@packages/core/src/services/cleanup-service.test.ts`:
- Around line 35-47: The tests currently use mock.module(...) to override
getIsolationProvider and getPrState which causes cross-test interference;
replace those mock.module calls with spyOn(...) on the imported module functions
(spyOn(require('../isolation'), 'getIsolationProvider') and
spyOn(require('@archon/isolation'), 'getIsolationProvider') /
spyOn(require('@archon/isolation'), 'getPrState')) and have them return the same
mock implementations (mockDestroy, mockGetPrState) used now; ensure you restore
or clear the spies in afterEach to avoid leaking between tests.

In `@packages/core/src/services/title-generator.test.ts`:
- Around line 18-20: The test currently uses mock.module() to replace the
internal module export updateConversationTitle; replace that with a jest.spyOn
on the actual module export (the updateConversationTitle function from
'../db/conversations') and provide the same mock implementation via
mockImplementation/mockResolvedValue so mocks are local to the test runtime;
ensure you restore the spy after each test (use mockRestore or
jest.restoreAllMocks in afterEach) to avoid cross-file pollution—also update the
other instance in this file (the second mock.module call around lines 40–42) the
same way so both internal-module mocks use spyOn instead of mock.module.

---

Nitpick comments:
In `@packages/core/src/config/config-loader.test.ts`:
- Around line 25-29: The test currently uses mock.module('./config-loader', ...)
to replace readConfigFile and writeConfigFile on the module under test; replace
this self-mocking with spies on the imported namespace instead: remove the
mock.module call and use spyOn(realConfigLoader, 'readConfigFile') and
spyOn(realConfigLoader, 'writeConfigFile') (or inject the functions as
dependencies) so the rest of the module's real exports remain intact without
self-mocking and match the project's mocking pattern.

In `@packages/core/src/orchestrator/orchestrator.test.ts`:
- Around line 311-317: The inline mock passed to spyOn for
storeAdapter.createWorkflowDeps inside setupSpies is recreated on every
beforeEach and not exposed for assertions; extract it to a module-level constant
(e.g., mockCreateWorkflowDeps) alongside the other mocks and then use that
constant in spyStoreAdapterCreate = spyOn(storeAdapter,
'createWorkflowDeps').mockImplementation(mockCreateWorkflowDeps) so tests can
inspect call counts/args and avoid allocating a new function each setup; update
references to setupSpies, spyStoreAdapterCreate, and
storeAdapter.createWorkflowDeps accordingly.
- Around line 531-535: Remove the redundant call to restoreSpies() at the start
of the beforeEach block: rely on the afterEach hook to restore spies and keep
beforeEach only calling setupSpies(), resetLogForTest(), and clearAllMocks();
update the beforeEach that currently references restoreSpies() so it no longer
invokes restoreSpies() before setupSpies(), leaving restore/teardown
responsibility to the afterEach cleanup.

In `@packages/core/src/services/cleanup-service.test.ts`:
- Around line 95-189: The beforeEach/afterEach block for creating and restoring
24 spies is repetitive; refactor by collecting spy setup info into an array and
iterating to create/restore them. Create an array of descriptors (e.g., {obj:
git, method: 'execFileAsync', mock: mockExecFileAsync, varName:
'spyGitExecFileAsync'}) and in beforeEach loop call spyOn(descriptor.obj,
descriptor.method).mockImplementation(descriptor.mock) (or mockReturnValue where
appropriate) storing each spy in a map/array; in afterEach iterate the same
collection and call mockRestore() on each stored spy. Update references to the
existing spy variables (spyPathsCreateLogger, spyGitExecFileAsync,
spyDbIsolationListAllActive, etc.) to read from the new collection/map so tests
still access the spies via their original identifiers.

In `@packages/core/src/workflows/store-adapter.test.ts`:
- Around line 28-30: Replace the module-level mock.module('@archon/providers',
...) with a spyOn in the test setup: import the providers module used in the
test and in beforeEach call spyOn(providers, 'getAgentProvider') to provide the
minimal implementation needed (e.g., a stub function) so the assertion that
typeof deps.getAgentProvider === 'function' still passes; remove the mock.module
call and ensure spies are cleaned up in afterEach (restore the spy or call
restoreAllMocks) so other tests aren’t polluted.
- Around line 32-125: The test file has many individual spy variables (e.g.,
spyDbWorkflowsCreateRun, spyDbWorkflowsGetRun, spyDbWorkflowEventsCreate,
spyConfigLoaderLoad) each restored in afterEach; replace that boilerplate by
collecting spyOn returns into a single array (e.g., const spies:
Array<ReturnType<typeof spyOn>> = []) inside beforeEach as you create each spy
(push each spy into spies) and then in afterEach iterate spies.forEach(s =>
s.mockRestore()); keep the same spyOn calls and mock implementations
(mockCreateWorkflowRun, mockGetWorkflowRun, etc.) but stop declaring/restore-ing
each spy individually.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 77ff685d-3085-4673-a5bc-df767503b54f

📥 Commits

Reviewing files that changed from the base of the PR and between ec5e5a5 and e90acb3.

📒 Files selected for processing (23)
  • packages/core/package.json
  • packages/core/src/config/config-loader.test.ts
  • packages/core/src/db/adapters/postgres.test.ts
  • packages/core/src/db/codebases.test.ts
  • packages/core/src/db/conversations.test.ts
  • packages/core/src/db/env-vars.test.ts
  • packages/core/src/db/isolation-environments.test.ts
  • packages/core/src/db/messages.test.ts
  • packages/core/src/db/sessions.test.ts
  • packages/core/src/db/workflow-events.test.ts
  • packages/core/src/db/workflows.test.ts
  • packages/core/src/handlers/clone.test.ts
  • packages/core/src/handlers/command-handler.test.ts
  • packages/core/src/operations/isolation-operations.test.ts
  • packages/core/src/operations/workflow-operations.test.ts
  • packages/core/src/orchestrator/orchestrator-agent.test.ts
  • packages/core/src/orchestrator/orchestrator-agent.ts
  • packages/core/src/orchestrator/orchestrator-isolation.test.ts
  • packages/core/src/orchestrator/orchestrator.test.ts
  • packages/core/src/services/cleanup-service.test.ts
  • packages/core/src/services/title-generator.test.ts
  • packages/core/src/utils/worktree-sync.test.ts
  • packages/core/src/workflows/store-adapter.test.ts

Comment thread packages/core/package.json Outdated
Comment on lines +17 to +24
spyPathsCreateLogger = spyOn(paths, 'createLogger').mockReturnValue({
info: mock(() => {}),
warn: mock(() => {}),
error: mock(() => {}),
debug: mock(() => {}),
trace: mock(() => {}),
fatal: mock(() => {}),
} as ReturnType<typeof paths.createLogger>);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check whether packages/core/src/db/env-vars.ts creates or caches a logger during module evaluation.
# Expected: createLogger is only called lazily after test beforeEach runs, or the module exposes a test reset for cached loggers.

rg -n -C3 --type=ts '\bcreateLogger\s*\(|\bgetLog\s*\(|\b(?:const|let)\s+\w*[Ll]og\w*\b' packages/core/src/db/env-vars.ts

Repository: coleam00/Archon

Length of output: 859


🏁 Script executed:

cat -n packages/core/src/db/env-vars.test.ts

Repository: coleam00/Archon

Length of output: 3410


Clear the cached logger between tests to ensure spy isolation.

The logger spy is installed in beforeEach, but env-vars.ts caches the logger in module-scoped cachedLog via getLog(). Once populated on the first test run, cachedLog persists across subsequent tests; later tests return the stale cached logger instead of calling the newly installed spy.

Export a test reset function from env-vars.ts to clear cachedLog, or refactor to avoid module-scoped state for logger access. This violates test isolation requirements.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/db/env-vars.test.ts` around lines 17 - 24, The
module-scoped cached logger (cachedLog) in env-vars.ts persists across tests so
the spy installed by spyOn(paths, 'createLogger') is bypassed; add a small
test-only API to clear that state (e.g., export a resetCachedLog or
clearCachedLogger function in env-vars.ts that sets cachedLog = undefined) and
call it from tests' beforeEach so getLog() will re-create the logger and invoke
paths.createLogger spy, or alternatively remove the module-scoped cachedLog and
make getLog() retrieve/create the logger on each call to avoid cross-test
caching.

Comment on lines 36 to +68
const { approveWorkflow, rejectWorkflow, getWorkflowStatus, resumeWorkflow, abandonWorkflow } =
await import('./workflow-operations');

// Set up spies after the module is imported
beforeEach(() => {
spyPathsCreateLogger = spyOn(paths, 'createLogger').mockReturnValue(
mockLogger as ReturnType<typeof paths.createLogger>
);
spyDbWorkflowsGetRun = spyOn(dbWorkflows, 'getWorkflowRun').mockImplementation(
mockGetWorkflowRun
);
spyDbWorkflowsListRuns = spyOn(dbWorkflows, 'listWorkflowRuns').mockImplementation(
mockListWorkflowRuns
);
spyDbWorkflowsUpdateRun = spyOn(dbWorkflows, 'updateWorkflowRun').mockImplementation(
mockUpdateWorkflowRun
);
spyDbWorkflowsCancelRun = spyOn(dbWorkflows, 'cancelWorkflowRun').mockImplementation(
mockCancelWorkflowRun
);
spyDbWorkflowEventsCreate = spyOn(dbWorkflowEvents, 'createWorkflowEvent').mockImplementation(
mockCreateWorkflowEvent
);
});

afterEach(() => {
spyPathsCreateLogger.mockRestore();
spyDbWorkflowsGetRun.mockRestore();
spyDbWorkflowsListRuns.mockRestore();
spyDbWorkflowsUpdateRun.mockRestore();
spyDbWorkflowsCancelRun.mockRestore();
spyDbWorkflowEventsCreate.mockRestore();
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check that workflow-operations uses the lazy getLog pattern and doesn't cache createLogger at module load
rg -nP --type=ts -C3 "createLogger|cachedLog|getLog" packages/core/src/operations/workflow-operations.ts

Repository: coleam00/Archon

Length of output: 1784


🏁 Script executed:

# Check the full test file to see if cachedLog is reset in afterEach or elsewhere
head -100 packages/core/src/operations/workflow-operations.test.ts | tail -70

Repository: coleam00/Archon

Length of output: 2223


🏁 Script executed:

# Look for any reset mechanism or cleanup for cachedLog
rg -n "cachedLog|resetLog|reset.*Log" packages/core/src/operations/workflow-operations.test.ts

Repository: coleam00/Archon

Length of output: 41


🏁 Script executed:

# Check if cachedLog is exported from workflow-operations for testing
rg -n "export.*cachedLog|cachedLog" packages/core/src/operations/workflow-operations.ts

Repository: coleam00/Archon

Length of output: 205


🏁 Script executed:

# Search for a resetLogForTest pattern mentioned in learnings or across the codebase
rg -n "resetLogForTest|resetLog.*Test" packages/core/src/

Repository: coleam00/Archon

Length of output: 389


Reset cachedLog in afterEach to prevent logger pollution across tests.

The lazy getLog() pattern defers createLogger() until first use, which works with the top-level import. However, workflow-operations.ts caches the logger in module-scoped cachedLog and never resets it between tests. When tests run together in a single bun test src/ invocation:

  1. Test A calls getLog() → caches the first mock logger in cachedLog
  2. Test B's beforeEach sets up a new spyOn(), but cachedLog still holds test A's logger instance
  3. Test B calls getLog() → returns the stale cached logger, bypassing the new mock

This breaks isolation: assertions on mockLogger.error (lines 320–323) may fail or pass unexpectedly depending on test execution order.

Export a resetLogForTest() function (following the pattern in orchestrator-agent.ts line 65) and call it in afterEach() to clear cachedLog between tests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/operations/workflow-operations.test.ts` around lines 36 -
68, The module-level cached logger (cachedLog) in workflow-operations.ts is not
cleared between tests, causing getLog() to return a stale logger across test
cases; add and export a resetLogForTest() function (following
orchestrator-agent.ts pattern) that sets cachedLog back to undefined/null, then
call resetLogForTest() in the test file's afterEach() alongside the existing
mock restores so each test's spyOn(paths.createLogger) can take effect.

Comment on lines 196 to +337
import { parseOrchestratorCommands, handleMessage } from './orchestrator-agent';

// ─── Spy setup / teardown ─────────────────────────────────────────────────────

beforeEach(() => {
spyPathsCreateLogger = spyOn(paths, 'createLogger').mockReturnValue(mockLogger);
spyPathsGetArchonWorkspacesPath = spyOn(paths, 'getArchonWorkspacesPath').mockReturnValue(
'/home/test/.archon/workspaces'
);
spyPathsGetArchonHome = spyOn(paths, 'getArchonHome').mockReturnValue('/home/test/.archon');
spyDbConversationsGetOrCreate = spyOn(
dbConversations,
'getOrCreateConversation'
).mockImplementation(mockGetOrCreateConversation);
spyDbConversationsGetByPlatformId = spyOn(
dbConversations,
'getConversationByPlatformId'
).mockImplementation(mockGetConversationByPlatformId);
spyDbConversationsUpdate = spyOn(dbConversations, 'updateConversation').mockImplementation(
mockUpdateConversation
);
spyDbConversationsTouch = spyOn(dbConversations, 'touchConversation').mockImplementation(
mockTouchConversation
);
spyDbCodebasesGet = spyOn(dbCodebases, 'getCodebase').mockImplementation(mockGetCodebase);
spyDbCodebasesList = spyOn(dbCodebases, 'listCodebases').mockImplementation(mockListCodebases);
spyDbCodebasesCreate = spyOn(dbCodebases, 'createCodebase').mockImplementation(
mockCreateCodebase
);
spyDbSessionsGetActive = spyOn(dbSessions, 'getActiveSession').mockImplementation(
mockGetActiveSession
);
spyDbSessionsUpdate = spyOn(dbSessions, 'updateSession').mockImplementation(mockUpdateSession);
spyDbSessionsTransition = spyOn(dbSessions, 'transitionSession').mockImplementation(
mockTransitionSession
);
spyCommandHandlerParse = spyOn(commandHandler, 'parseCommand').mockImplementation(
mockParseCommand
);
spyCommandHandlerHandle = spyOn(commandHandler, 'handleCommand').mockImplementation(
mockHandleCommand
);
spyDbEnvVarsGet = spyOn(dbEnvVars, 'getCodebaseEnvVars').mockImplementation(
mockGetCodebaseEnvVars
);
spyErrorFormatterClassify = spyOn(errorFormatter, 'classifyAndFormatError').mockImplementation(
mock((err: Error) => `Error: ${err.message}`)
);
spyErrorUtilsToError = spyOn(errorUtils, 'toError').mockImplementation(
mock((e: unknown) => (e instanceof Error ? e : new Error(String(e))))
);
spyStoreAdapterCreate = spyOn(storeAdapter, 'createWorkflowDeps').mockImplementation(
mock(() => ({}))
);
spyDbWorkflowsGetPaused = spyOn(dbWorkflows, 'getPausedWorkflowRun').mockImplementation(
mockGetPausedWorkflowRun
);
spyDbWorkflowsFindResumable = spyOn(
dbWorkflows,
'findResumableRunByParentConversation'
).mockImplementation(mockFindResumableRunByParentConversation);
spyDbWorkflowsUpdate = spyOn(dbWorkflows, 'updateWorkflowRun').mockImplementation(
mockUpdateWorkflowRun
);
spyDbWorkflowEventsCreate = spyOn(dbWorkflowEvents, 'createWorkflowEvent').mockImplementation(
mockCreateWorkflowEvent
);
spyConfigLoaderLoad = spyOn(configLoader, 'loadConfig').mockImplementation(mockLoadConfig);
spyTitleGeneratorGenerate = spyOn(titleGenerator, 'generateAndSetTitle').mockImplementation(
mockGenerateAndSetTitle
);
spyOrchestratorValidate = spyOn(
orchestratorModule,
'validateAndResolveIsolation'
).mockImplementation(mockValidateAndResolveIsolation);
spyOrchestratorDispatch = spyOn(
orchestratorModule,
'dispatchBackgroundWorkflow'
).mockImplementation(mockDispatchBackgroundWorkflow);
spyPromptBuilderOrchestrator = spyOn(promptBuilder, 'buildOrchestratorPrompt').mockImplementation(
mockBuildOrchestratorPrompt
);
spyPromptBuilderProjectScoped = spyOn(
promptBuilder,
'buildProjectScopedPrompt'
).mockImplementation(mockBuildProjectScopedPrompt);
spyPromptBuilderFormatWorkflowContext = spyOn(
promptBuilder,
'formatWorkflowContextSection'
).mockImplementation(mockFormatWorkflowContextSection);
spyDbMessagesGetRecent = spyOn(dbMessages, 'getRecentWorkflowResultMessages').mockImplementation(
mockGetRecentWorkflowResultMessages
);
spyDbMessagesAdd = spyOn(dbMessages, 'addMessage').mockImplementation(mockAddMessage);
spyDbMessagesList = spyOn(dbMessages, 'listMessages').mockImplementation(mockListMessages);
spyWorktreeSyncSync = spyOn(worktreeSync, 'syncArchonToWorktree').mockImplementation(
mockSyncArchonToWorktree
);
spyGitSyncWorkspace = spyOn(git, 'syncWorkspace').mockImplementation(mockSyncWorkspace);
spyGitToRepoPath = spyOn(git, 'toRepoPath').mockImplementation(
(p: string) => p as ReturnType<typeof git.toRepoPath>
);
});

afterEach(() => {
spyPathsCreateLogger.mockRestore();
spyPathsGetArchonWorkspacesPath.mockRestore();
spyPathsGetArchonHome.mockRestore();
spyDbConversationsGetOrCreate.mockRestore();
spyDbConversationsGetByPlatformId.mockRestore();
spyDbConversationsUpdate.mockRestore();
spyDbConversationsTouch.mockRestore();
spyDbCodebasesGet.mockRestore();
spyDbCodebasesList.mockRestore();
spyDbCodebasesCreate.mockRestore();
spyDbSessionsGetActive.mockRestore();
spyDbSessionsUpdate.mockRestore();
spyDbSessionsTransition.mockRestore();
spyCommandHandlerParse.mockRestore();
spyCommandHandlerHandle.mockRestore();
spyDbEnvVarsGet.mockRestore();
spyErrorFormatterClassify.mockRestore();
spyErrorUtilsToError.mockRestore();
spyStoreAdapterCreate.mockRestore();
spyDbWorkflowsGetPaused.mockRestore();
spyDbWorkflowsFindResumable.mockRestore();
spyDbWorkflowsUpdate.mockRestore();
spyDbWorkflowEventsCreate.mockRestore();
spyConfigLoaderLoad.mockRestore();
spyTitleGeneratorGenerate.mockRestore();
spyOrchestratorValidate.mockRestore();
spyOrchestratorDispatch.mockRestore();
spyPromptBuilderOrchestrator.mockRestore();
spyPromptBuilderProjectScoped.mockRestore();
spyPromptBuilderFormatWorkflowContext.mockRestore();
spyDbMessagesGetRecent.mockRestore();
spyDbMessagesAdd.mockRestore();
spyDbMessagesList.mockRestore();
spyWorktreeSyncSync.mockRestore();
spyGitSyncWorkspace.mockRestore();
spyGitToRepoPath.mockRestore();
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm resetLogForTest exists and is used anywhere that imports orchestrator-agent in tests.
rg -n -C3 "\bresetLogForTest\b|createLogger|cached.*log|let log" packages/core/src/orchestrator/orchestrator-agent.ts packages/core/src/orchestrator/*.test.ts

Repository: coleam00/Archon

Length of output: 5688


Call resetLogForTest() in both beforeEach and afterEach to prevent test pollution.

The orchestrator-agent module maintains a cached logger that persists across tests. Even though this file restores the @archon/paths.createLogger spy, the cached logger remains and leaks mocked instances into other test suites. The pattern is already established in orchestrator.test.ts (which imports and calls resetLogForTest())—apply the same fix here.

Proposed fix
-import { parseOrchestratorCommands, handleMessage } from './orchestrator-agent';
+import { parseOrchestratorCommands, handleMessage, resetLogForTest } from './orchestrator-agent';
@@
 beforeEach(() => {
+  resetLogForTest();
   spyPathsCreateLogger = spyOn(paths, 'createLogger').mockReturnValue(mockLogger);
@@
 afterEach(() => {
+  resetLogForTest();
   spyPathsCreateLogger.mockRestore();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/orchestrator/orchestrator-agent.test.ts` around lines 196 -
337, The test suite leaks a cached logger across tests; call resetLogForTest()
at the start of beforeEach and again at the end of afterEach to clear the
module-level logger cache used by orchestrator-agent tests (mirror the pattern
used in orchestrator.test.ts). Locate the existing beforeEach and afterEach
blocks in orchestrator-agent.test.ts and invoke resetLogForTest() (the exported
helper named resetLogForTest) immediately before setting up spies in beforeEach
and immediately after restoring spies in afterEach so mocked loggers do not
persist between tests.

Comment on lines 33 to +36
beforeEach(() => {
spyPathsCreateLogger = spyOn(paths, 'createLogger').mockReturnValue(
mockLogger as ReturnType<typeof paths.createLogger>
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether worktree-sync already has a cached logger reset helper and where syncArchonToWorktree is called.
rg -n -C3 'cachedLog|reset.*Log.*ForTest|reset.*Worktree|syncArchonToWorktree\s*\(' packages/core/src

Repository: coleam00/Archon

Length of output: 31324


Add test-only reset helper to worktree-sync.ts for logger cache isolation.

spyOn(paths, 'createLogger') is restored in afterEach, but syncArchonToWorktree() uses a module-level cached logger. Once initialized, that cache persists across tests in a single process and bypasses the spy in subsequent test runs.

This pattern is already established in the codebase at packages/core/src/orchestrator/orchestrator-agent.ts (lines 64–67). Apply the same fix: add resetWorktreeSyncLogForTest() and call it before and after each test.

Isolation fix
 // packages/core/src/utils/worktree-sync.ts
 let cachedLog: ReturnType<typeof createLogger> | undefined;
 function getLog(): ReturnType<typeof createLogger> {
   if (!cachedLog) cachedLog = createLogger('worktree-sync');
   return cachedLog;
 }
+
+export function resetWorktreeSyncLogForTest(): void {
+  cachedLog = undefined;
+}
 // packages/core/src/utils/worktree-sync.test.ts
-import { syncArchonToWorktree } from './worktree-sync';
+import { resetWorktreeSyncLogForTest, syncArchonToWorktree } from './worktree-sync';

 describe('syncArchonToWorktree', () => {
   beforeEach(() => {
+    resetWorktreeSyncLogForTest();
     spyPathsCreateLogger = spyOn(paths, 'createLogger').mockReturnValue(
       mockLogger as ReturnType<typeof paths.createLogger>
     );
   afterEach(() => {
     spyPathsCreateLogger.mockRestore();
+    resetWorktreeSyncLogForTest();
     isWorktreePathSpy.mockRestore();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
beforeEach(() => {
spyPathsCreateLogger = spyOn(paths, 'createLogger').mockReturnValue(
mockLogger as ReturnType<typeof paths.createLogger>
);
import type { paths } from '@archon/paths';
import { resetWorktreeSyncLogForTest, syncArchonToWorktree } from './worktree-sync';
describe('syncArchonToWorktree', () => {
beforeEach(() => {
resetWorktreeSyncLogForTest();
spyPathsCreateLogger = spyOn(paths, 'createLogger').mockReturnValue(
mockLogger as ReturnType<typeof paths.createLogger>
);
});
afterEach(() => {
spyPathsCreateLogger.mockRestore();
resetWorktreeSyncLogForTest();
isWorktreePathSpy.mockRestore();
});
Suggested change
beforeEach(() => {
spyPathsCreateLogger = spyOn(paths, 'createLogger').mockReturnValue(
mockLogger as ReturnType<typeof paths.createLogger>
);
let cachedLog: ReturnType<typeof createLogger> | undefined;
function getLog(): ReturnType<typeof createLogger> {
if (!cachedLog) cachedLog = createLogger('worktree-sync');
return cachedLog;
}
export function resetWorktreeSyncLogForTest(): void {
cachedLog = undefined;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/utils/worktree-sync.test.ts` around lines 33 - 36, The
module-level cached logger in worktree-sync.ts causes tests to bypass
spyOn(paths, 'createLogger') after the first initialization; add a test-only
reset function resetWorktreeSyncLogForTest() in worktree-sync.ts that clears the
cached logger used by syncArchonToWorktree(), then call
resetWorktreeSyncLogForTest() in beforeEach and afterEach of the test (alongside
spyPathsCreateLogger setup/restore) so each test run re-triggers createLogger
and the spy is effective.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/workflows/src/dag-executor.test.ts (1)

19-27: ⚠️ Potential issue | 🟠 Major

Replace this internal module mock with spyOn().

This file still uses mock.module('@archon/paths'), which keeps a process-wide internal module replacement in a suite intended to run under one bun test src/ invocation. Convert these exports to spies and restore them so real @archon/paths exports remain available to transitive imports.

Refactor shape
 import * as git from '@archon/git';
+import * as paths from '@archon/paths';

 // --- Mock logger (MUST come before imports of modules under test) ---

 const mockLogFn = mock(() => {});
@@
-mock.module('@archon/paths', () => ({
-  createLogger: mock(() => mockLogger),
-  getCommandFolderSearchPaths: (folder?: string) => {
+let createLoggerSpy: ReturnType<typeof spyOn>;
+let commandFolderPathsSpy: ReturnType<typeof spyOn>;
+let defaultCommandsPathSpy: ReturnType<typeof spyOn>;
+
+beforeEach(() => {
+  createLoggerSpy = spyOn(paths, 'createLogger').mockReturnValue(mockLogger);
+  commandFolderPathsSpy = spyOn(paths, 'getCommandFolderSearchPaths').mockImplementation((folder?: string) => {
     const paths = ['.archon/commands'];
     if (folder) paths.unshift(folder);
     return paths;
-  },
-  getDefaultCommandsPath: () => '/nonexistent/defaults',
-}));
+  });
+  defaultCommandsPathSpy = spyOn(paths, 'getDefaultCommandsPath').mockReturnValue('/nonexistent/defaults');
+});
+
+afterEach(() => {
+  createLoggerSpy.mockRestore();
+  commandFolderPathsSpy.mockRestore();
+  defaultCommandsPathSpy.mockRestore();
+});

As per coding guidelines, **/*.test.{ts,tsx}: “Test isolation: Use spyOn() for internal modules; never use mock.module() for modules that another test file also mock.module()s with different implementations.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/workflows/src/dag-executor.test.ts` around lines 19 - 27, The test
currently replaces the entire `@archon/paths` module with mock.module(...) which
causes a process-wide replacement; instead import the real module (e.g. import *
as paths from '@archon/paths') and replace only the specific exports with spies:
spyOn(paths, 'createLogger').mockReturnValue(mockLogger), spyOn(paths,
'getCommandFolderSearchPaths').mockImplementation((folder?) => { const pathsArr
= ['.archon/commands']; if (folder) pathsArr.unshift(folder); return pathsArr;
}), and spyOn(paths,
'getDefaultCommandsPath').mockReturnValue('/nonexistent/defaults'); ensure you
restore the spies after each test (restoreIndividual spies or call
vi.restoreAllMocks()/afterEach cleanup) so transitive imports still see the real
`@archon/paths` in other tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/workflows/src/dag-executor.test.ts`:
- Around line 3844-3846: The test currently asserts that the resumed interactive
loop uses a fresh session by checking mockSendQueryDag.mock.calls[0][2]
(sessionArg) is undefined; change this to avoid locking behavior: either assert
that sessionArg equals the stored gate session ID used by the executor (i.e.,
expect sessionArg toMatch or equal the persisted session token the test sets up)
if session continuity is the intended behavior, or, if a fresh session is
required, replace the sessionArg undefined check with an assertion that the new
prompt passed to mockSendQueryDag (mockSendQueryDag.mock.calls[0][1]) contains
the prior loop output so the test verifies continuity of context. Ensure you
update the assertion around mockSendQueryDag calls and remove the rigid
toBeUndefined() check.

In `@packages/workflows/src/dag-executor.ts`:
- Around line 1582-1584: The code currently sets needsFreshSession =
loop.fresh_context || i === 1 || (isLoopResume && i === startIteration) which
forces resumeSessionId to undefined for the first resumed iteration; change the
logic so resumed interactive-loop iterations retain currentSessionId (i.e.,
remove the isLoopResume && i === startIteration branch from the fresh-session
condition). Update the same logic in the related block (the section around the
other resume/session handling at the 1593-1605 area) so both places compute
needsFreshSession without treating isLoopResume as forcing a fresh session,
ensuring resumeSessionId remains currentSessionId for resumed feedback turns
(refer to symbols: loop, needsFreshSession, resumeSessionId, currentSessionId,
isLoopResume, startIteration).

---

Outside diff comments:
In `@packages/workflows/src/dag-executor.test.ts`:
- Around line 19-27: The test currently replaces the entire `@archon/paths` module
with mock.module(...) which causes a process-wide replacement; instead import
the real module (e.g. import * as paths from '@archon/paths') and replace only
the specific exports with spies: spyOn(paths,
'createLogger').mockReturnValue(mockLogger), spyOn(paths,
'getCommandFolderSearchPaths').mockImplementation((folder?) => { const pathsArr
= ['.archon/commands']; if (folder) pathsArr.unshift(folder); return pathsArr;
}), and spyOn(paths,
'getDefaultCommandsPath').mockReturnValue('/nonexistent/defaults'); ensure you
restore the spies after each test (restoreIndividual spies or call
vi.restoreAllMocks()/afterEach cleanup) so transitive imports still see the real
`@archon/paths` in other tests.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: df94e682-a000-4341-8f8e-1c63334195da

📥 Commits

Reviewing files that changed from the base of the PR and between e90acb3 and 158ec01.

📒 Files selected for processing (2)
  • packages/workflows/src/dag-executor.test.ts
  • packages/workflows/src/dag-executor.ts

Comment on lines +3844 to +3846
// Should use a fresh session (not resume the stale one from before the human wait)
const sessionArg = mockSendQueryDag.mock.calls[0][2] as string | undefined;
expect(sessionArg).toBe('loop-session-1');
expect(sessionArg).toBeUndefined();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don’t lock in fresh-session resume behavior here.

This assertion now codifies the resumed interactive loop losing the paused session context. If the production fix keeps session continuity, this should expect the stored gate session; if a fresh session is truly required, the test should also assert that the prior loop output is included in the new prompt.

Proposed expectation with session continuity
-      // Should use a fresh session (not resume the stale one from before the human wait)
+      // Should resume the paused loop session so feedback applies to the prior output
       const sessionArg = mockSendQueryDag.mock.calls[0][2] as string | undefined;
-      expect(sessionArg).toBeUndefined();
+      expect(sessionArg).toBe('loop-session-1');
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/workflows/src/dag-executor.test.ts` around lines 3844 - 3846, The
test currently asserts that the resumed interactive loop uses a fresh session by
checking mockSendQueryDag.mock.calls[0][2] (sessionArg) is undefined; change
this to avoid locking behavior: either assert that sessionArg equals the stored
gate session ID used by the executor (i.e., expect sessionArg toMatch or equal
the persisted session token the test sets up) if session continuity is the
intended behavior, or, if a fresh session is required, replace the sessionArg
undefined check with an assertion that the new prompt passed to mockSendQueryDag
(mockSendQueryDag.mock.calls[0][1]) contains the prior loop output so the test
verifies continuity of context. Ensure you update the assertion around
mockSendQueryDag calls and remove the rigid toBeUndefined() check.

Comment on lines +1582 to 1584
const needsFreshSession =
loop.fresh_context || i === 1 || (isLoopResume && i === startIteration);
const resumeSessionId = needsFreshSession ? undefined : currentSessionId;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Keep the stored loop session for resumed feedback turns.

Line 1583 forces resumeSessionId to undefined on the first resumed interactive-loop iteration, so the provider starts a new session while the prompt only injects $LOOP_USER_INPUT. Feedback like “add error handling” then has no prior plan/output context to apply to. Unless the prior loop output is explicitly included in the fresh prompt, keep resuming currentSessionId.

Proposed fix
-    const needsFreshSession =
-      loop.fresh_context || i === 1 || (isLoopResume && i === startIteration);
+    const needsFreshSession = loop.fresh_context || i === 1;
     const resumeSessionId = needsFreshSession ? undefined : currentSessionId;

Also applies to: 1593-1605

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/workflows/src/dag-executor.ts` around lines 1582 - 1584, The code
currently sets needsFreshSession = loop.fresh_context || i === 1 ||
(isLoopResume && i === startIteration) which forces resumeSessionId to undefined
for the first resumed iteration; change the logic so resumed interactive-loop
iterations retain currentSessionId (i.e., remove the isLoopResume && i ===
startIteration branch from the fresh-session condition). Update the same logic
in the related block (the section around the other resume/session handling at
the 1593-1605 area) so both places compute needsFreshSession without treating
isLoopResume as forcing a fresh session, ensuring resumeSessionId remains
currentSessionId for resumed feedback turns (refer to symbols: loop,
needsFreshSession, resumeSessionId, currentSessionId, isLoopResume,
startIteration).

@Wirasm
Copy link
Copy Markdown
Collaborator

Wirasm commented Apr 20, 2026

@cryptobirr this PR looks similar to the already-merged #1269 (merged April 17, 2026). Please check whether your change is still needed — rebase on main and confirm.

@Wirasm
Copy link
Copy Markdown
Collaborator

Wirasm commented Apr 20, 2026

Hi @cryptobirr — thanks for opening this PR.

This repository uses a PR template at .github/pull_request_template.md with several required sections. A few of them appear to be empty or placeholder here:

  • Security Impact
  • Human Verification
  • Side Effects / Blast Radius
  • Rollback Plan

Could you fill those out (even briefly)? The template helps reviewers understand scope, risk, and rollback — it speeds up review significantly.

If a section genuinely doesn't apply, just write "N/A" in it rather than leaving it blank.

User and others added 2 commits April 20, 2026 00:23
… empty object

Surface corrupt `codebases.commands` JSON to callers rather than swallowing
parse errors and returning `{}`. `getCodebaseCommands()` now throws, and
the GET /api/codebases list and detail endpoints include a `commandsParseError`
warning field so clients can distinguish missing commands from unreadable data.

Fixes coleam00#967

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
User and others added 2 commits April 20, 2026 00:26
Now that all mock.module() calls are migrated to spyOn(), @archon/core
runs cleanly in a single process. Removes the 4-batch workaround that
was needed to avoid module-cache pollution between test files.

Fixes coleam00#966

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…_reject config

When an approval node has `on_reject` set (onRejectPrompt in metadata),
the button reads "Request Changes" and the confirmation dialog explains
the workflow will retry with feedback. Without on_reject the button reads
"Cancel" and explains the run will be cancelled.

Fixes coleam00#969

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/core/src/db/codebases.ts (1)

62-73: ⚠️ Potential issue | 🟡 Minor

Throwing here breaks the /commands handler and registerCommand — only adapters and API clone are safe.

Switching getCodebaseCommands from "log + return {}" to throwing propagates to some callers but not all:

Safe (already wrapped):

  • Adapters (github/gitlab/gitea) catch the error and log gracefully before continuing
  • cloneRepository calls from api.ts are wrapped in try/catch (1534–1548)

At risk:

  • /commands handler in orchestrator-agent.ts (line 718) calls handleCommand without try/catch; if the commands JSON is corrupt, the error propagates and the platform.sendMessage call fails instead of returning "No commands registered."
  • registerCommand (codebases.ts:80) calls getCodebaseCommands to perform a read-modify-write; if the codebase has corrupt JSON, the function throws and the RMW pattern breaks. This prevents healing a corrupted commands JSON by registering a new command.

If surfacing corruption is intentional, either:

  1. Add try/catch in orchestrator-agent.ts around the handleCommand('commands') case to return a graceful fallback message, or
  2. Expose a tryGetCodebaseCommands() variant that returns { commands: {}, parseError? } for mutating callers like registerCommand.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/db/codebases.ts` around lines 62 - 73, The change that
makes getCodebaseCommands throw on corrupt JSON breaks callers that expect a
safe fallback; implement a safe variant tryGetCodebaseCommands in codebases.ts
that returns { commands: {}, parseError?: Error } (or similar) without throwing,
update registerCommand to call tryGetCodebaseCommands for its read-modify-write,
and additionally wrap the 'commands' branch in orchestrator-agent.ts's
handleCommand call in a try/catch to catch parse errors and return the graceful
"No commands registered." fallback (log parseError) so the /commands handler and
registerCommand no longer fail when JSON is corrupt.
🧹 Nitpick comments (1)
packages/server/src/routes/api.ts (1)

1478-1489: Align log event name with the {domain}.{action}_{state} convention.

corrupted_commands_json (also used at line 1511) doesn't follow the structured logging convention used elsewhere in this file (api.workflow_run_resume_failed, etc.) or the paired event already emitted by the db layer (db.codebase_commands_json_parse_failed in packages/core/src/db/codebases.ts). Suggest something like api.codebase_commands_parse_failed for both occurrences.

As per coding guidelines: "Structured logging with Pino: use event naming format {domain}.{action}_{state} (e.g., session.create_started, workflow.step_completed)".

Proposed rename
-            getLog().error({ err: parseErr, codebaseId: cb.id }, 'corrupted_commands_json');
+            getLog().error({ err: parseErr, codebaseId: cb.id }, 'api.codebase_commands_parse_failed');
-          getLog().error({ err: parseErr, codebaseId: codebase.id }, 'corrupted_commands_json');
+          getLog().error({ err: parseErr, codebaseId: codebase.id }, 'api.codebase_commands_parse_failed');
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/server/src/routes/api.ts` around lines 1478 - 1489, The log event
name "corrupted_commands_json" used in the getLog().error call when JSON.parse
fails should follow the {domain}.{action}_{state} convention; update the
getLog().error invocation(s) that currently pass 'corrupted_commands_json' (the
block that parses commands and the other occurrence later) to use a structured
name such as "api.codebase_commands_parse_failed" and keep the same context
object ({ err: parseErr, codebaseId: cb.id }) so the event aligns with
db.codebase_commands_json_parse_failed and the project's logging conventions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@packages/core/src/db/codebases.ts`:
- Around line 62-73: The change that makes getCodebaseCommands throw on corrupt
JSON breaks callers that expect a safe fallback; implement a safe variant
tryGetCodebaseCommands in codebases.ts that returns { commands: {}, parseError?:
Error } (or similar) without throwing, update registerCommand to call
tryGetCodebaseCommands for its read-modify-write, and additionally wrap the
'commands' branch in orchestrator-agent.ts's handleCommand call in a try/catch
to catch parse errors and return the graceful "No commands registered." fallback
(log parseError) so the /commands handler and registerCommand no longer fail
when JSON is corrupt.

---

Nitpick comments:
In `@packages/server/src/routes/api.ts`:
- Around line 1478-1489: The log event name "corrupted_commands_json" used in
the getLog().error call when JSON.parse fails should follow the
{domain}.{action}_{state} convention; update the getLog().error invocation(s)
that currently pass 'corrupted_commands_json' (the block that parses commands
and the other occurrence later) to use a structured name such as
"api.codebase_commands_parse_failed" and keep the same context object ({ err:
parseErr, codebaseId: cb.id }) so the event aligns with
db.codebase_commands_json_parse_failed and the project's logging conventions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 52e58a31-b3d0-4f91-bf37-cc4b88243a3c

📥 Commits

Reviewing files that changed from the base of the PR and between 158ec01 and 3f04931.

📒 Files selected for processing (5)
  • packages/core/src/db/codebases.test.ts
  • packages/core/src/db/codebases.ts
  • packages/core/src/orchestrator/orchestrator-agent.ts
  • packages/server/src/routes/api.ts
  • packages/server/src/routes/schemas/codebase.schemas.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/core/src/orchestrator/orchestrator-agent.ts
  • packages/core/src/db/codebases.test.ts

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
packages/web/src/components/dashboard/WorkflowRunCard.tsx (1)

164-166: Optional: extract a typed approval accessor to avoid repeated inline casts.

run.metadata.approval is accessed with an inline cast here ({ onRejectPrompt?: string }) and again at lines 263–267 ({ message?: string }) and line 259 for nullish check. Since metadata is typed as z.record(z.unknown()) on the server side, a small local helper / typed interface would centralize the shape and make future fields (e.g., onRejectPrompt, message) safer to extend.

interface ApprovalMetadata {
  message?: string;
  onRejectPrompt?: string;
}
const approval = run.metadata.approval as ApprovalMetadata | undefined;
const hasOnReject = approval?.onRejectPrompt !== undefined;

Not blocking — the current code is correct.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/components/dashboard/WorkflowRunCard.tsx` around lines 164 -
166, Extract a small typed accessor for run.metadata.approval to avoid repeated
inline casting: define an ApprovalMetadata interface (with message?: string and
onRejectPrompt?: string), create a local const approval = run.metadata.approval
as ApprovalMetadata | undefined, then update usages (currently the hasOnReject
calculation and the checks at the places referencing message/onRejectPrompt) to
use approval?.onRejectPrompt and approval?.message instead of repeating inline
casts; this centralizes the shape and makes future fields safer to extend.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@packages/web/src/components/dashboard/WorkflowRunCard.tsx`:
- Around line 164-166: Extract a small typed accessor for run.metadata.approval
to avoid repeated inline casting: define an ApprovalMetadata interface (with
message?: string and onRejectPrompt?: string), create a local const approval =
run.metadata.approval as ApprovalMetadata | undefined, then update usages
(currently the hasOnReject calculation and the checks at the places referencing
message/onRejectPrompt) to use approval?.onRejectPrompt and approval?.message
instead of repeating inline casts; this centralizes the shape and makes future
fields safer to extend.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d79a1f81-4dea-4373-beef-83457c1ed77d

📥 Commits

Reviewing files that changed from the base of the PR and between 3f04931 and 6be65c3.

📒 Files selected for processing (2)
  • packages/core/package.json
  • packages/web/src/components/dashboard/WorkflowRunCard.tsx
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/core/package.json

…artup()

Calls startup() from @anthropic-ai/claude-agent-sdk v0.2.89+ non-blocking
at server init to pre-warm the Claude Code subprocess, reducing first-query
latency by ~20x. Failure is non-fatal — logs a warning and continues.

startup() exists at runtime but is not yet declared in the SDK type
declarations; cast is documented in warm.ts.

Fixes coleam00#970

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three independent gaps combined to make node deletion appear broken:

1. Auto-select on drop: onDrop now calls onNodeSelect(id) so keyboard
   deletion works immediately after drag-and-drop without a click first.

2. Delete button always visible: moved the Delete Node button from the
   buried Advanced tab to the inspector panel header as a trash icon,
   visible for all node types including Bash nodes that had no delete
   mechanism at all.

3. Backspace key: added Backspace alongside Delete in useBuilderKeyboard
   for broader OS compatibility (macOS users expect Backspace to delete).

Fixes coleam00#971

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/providers/src/claude/warm.ts`:
- Around line 18-25: The log messages append prose to the event name; change
them to structured Pino calls that keep the message as the event key and put
explanations in the metadata object: replace log.debug('claude.warm_skipped —
startup() not available in this SDK version') with log.debug({ reason:
'startup() not available in this SDK version' }, 'claude.warm_skipped'), keep
log.info('claude.warm_completed') as the event key (or add metadata if needed),
and replace log.warn({ err }, 'claude.warm_failed — continuing without
pre-warm') with log.warn({ err, reason: 'continuing without pre-warm' },
'claude.warm_failed'); locate these changes around the sdk.startup check and the
catch block where sdk.startup() is called.

In `@packages/server/src/index.ts`:
- Around line 691-694: The Claude warm-up is always invoked but should only run
when Claude credentials are present; update the start-up logic around
warmClaudeSubprocess() to check the existing hasClaudeCredentials flag (or the
function/property that indicates Claude availability) before calling it.
Concretely, wrap the warmClaudeSubprocess().catch(...) call in an if
(hasClaudeCredentials) { ... } block (or equivalent guard used elsewhere in
startServer) so warmClaudeSubprocess is not invoked for Codex-only deployments,
and keep the same error handling using getLog().debug inside that guarded
branch.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6d37818e-462c-49c5-8f1e-df326349c2af

📥 Commits

Reviewing files that changed from the base of the PR and between 6be65c3 and af22d7e.

📒 Files selected for processing (3)
  • packages/providers/src/claude/warm.ts
  • packages/providers/src/index.ts
  • packages/server/src/index.ts

Comment on lines +18 to +25
if (typeof sdk.startup !== 'function') {
log.debug('claude.warm_skipped — startup() not available in this SDK version');
return;
}
await sdk.startup();
log.info('claude.warm_completed');
} catch (err) {
log.warn({ err }, 'claude.warm_failed — continuing without pre-warm');
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Keep Pino messages as structured event names.

Lines 19 and 25 append prose to the log event, which makes exact event filtering harder. Move the explanation into fields/comments and keep the message as the event key.

🪵 Proposed logging cleanup
     if (typeof sdk.startup !== 'function') {
-      log.debug('claude.warm_skipped — startup() not available in this SDK version');
+      log.debug({ reason: 'startup_unavailable' }, 'claude.warm_skipped');
       return;
     }
     await sdk.startup();
     log.info('claude.warm_completed');
   } catch (err) {
-    log.warn({ err }, 'claude.warm_failed — continuing without pre-warm');
+    // Best-effort warm-up: startup continues without Claude pre-warm.
+    log.warn({ err }, 'claude.warm_failed');
   }

As per coding guidelines, “Structured logging with Pino: use event naming format {domain}.{action}_{state}.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/providers/src/claude/warm.ts` around lines 18 - 25, The log messages
append prose to the event name; change them to structured Pino calls that keep
the message as the event key and put explanations in the metadata object:
replace log.debug('claude.warm_skipped — startup() not available in this SDK
version') with log.debug({ reason: 'startup() not available in this SDK version'
}, 'claude.warm_skipped'), keep log.info('claude.warm_completed') as the event
key (or add metadata if needed), and replace log.warn({ err },
'claude.warm_failed — continuing without pre-warm') with log.warn({ err, reason:
'continuing without pre-warm' }, 'claude.warm_failed'); locate these changes
around the sdk.startup check and the catch block where sdk.startup() is called.

Comment on lines +691 to +694
// Non-blocking: pre-warm Claude subprocess to reduce first-query latency
warmClaudeSubprocess().catch((err: unknown) => {
getLog().debug({ err }, 'claude.warm_unexpected_error');
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Gate Claude warm-up behind Claude credentials.

startServer() supports Codex-only deployments, but this unconditionally loads and starts the Claude warm-up path even when hasClaudeCredentials is false. That can create noisy startup warnings or unnecessary subprocess work for servers that intentionally only configure Codex.

🛡️ Proposed guard
   // Non-blocking: pre-warm Claude subprocess to reduce first-query latency
-  warmClaudeSubprocess().catch((err: unknown) => {
-    getLog().debug({ err }, 'claude.warm_unexpected_error');
-  });
+  if (hasClaudeCredentials) {
+    warmClaudeSubprocess().catch((err: unknown) => {
+      getLog().debug({ err }, 'claude.warm_unexpected_error');
+    });
+  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Non-blocking: pre-warm Claude subprocess to reduce first-query latency
warmClaudeSubprocess().catch((err: unknown) => {
getLog().debug({ err }, 'claude.warm_unexpected_error');
});
// Non-blocking: pre-warm Claude subprocess to reduce first-query latency
if (hasClaudeCredentials) {
warmClaudeSubprocess().catch((err: unknown) => {
getLog().debug({ err }, 'claude.warm_unexpected_error');
});
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/server/src/index.ts` around lines 691 - 694, The Claude warm-up is
always invoked but should only run when Claude credentials are present; update
the start-up logic around warmClaudeSubprocess() to check the existing
hasClaudeCredentials flag (or the function/property that indicates Claude
availability) before calling it. Concretely, wrap the
warmClaudeSubprocess().catch(...) call in an if (hasClaudeCredentials) { ... }
block (or equivalent guard used elsewhere in startServer) so
warmClaudeSubprocess is not invoked for Codex-only deployments, and keep the
same error handling using getLog().debug inside that guarded branch.

…leam00#972)

Adds `loop_until` and `max_iterations` fields to `workflowDefinitionSchema`.
Changes `executeDagWorkflow` return type from `string | undefined` to
`{ summary, nodeOutputs }` so the outer `executeWorkflow` loop can
evaluate the condition expression against node outputs after each DAG pass.
The while-loop iterates up to `max_iterations` (default 5, max 50), breaking
early when `evaluateCondition(loop_until, nodeOutputs)` returns true or when
the condition is unparseable. Updates all callers and test assertions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ng on retry (coleam00#1266)

Fixes Crash Class A: when a Codex subprocess crashes (SIGTERM), reusing
the same AbortSignal in subsequent retries caused the newly spawned
process to be killed immediately (since the signal was already aborted).

Each retry iteration now gets a fresh AbortController whose signal is
chained from the caller's AbortSignal via a once-listener. The
per-attempt controller is aborted after the attempt (success or failure)
to release any listeners and free resources.

Also updates streamCodexEvents to use the per-attempt signal (not the
caller signal), and adds a comment explaining that the "Reading prompt
from stdin..." SIGTERM message is Codex CLI's startup banner rather than
an indication of where execution was at crash time.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…to fix Docker clone target (coleam00#1237)

In Docker (ARCHON_DOCKER=true), getArchonWorkspacesPath() correctly
returns /.archon/workspaces, but the orchestrator system prompt used a
hardcoded ~/.archon/workspaces string which the shell expands to
$HOME/.archon/workspaces — an ephemeral path that disappears on
container restart.

Adds a workspacesPath parameter to buildOrchestratorPrompt,
buildProjectScopedPrompt, buildRoutingRules, and
buildRoutingRulesWithProject (default: ~/.archon/workspaces for
non-Docker environments). The orchestrator now resolves the path via
getArchonWorkspacesPath() at call time and passes it into the prompt
builders, so clone instructions and working-directory references in the
system prompt reflect the correct persistent path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ding dotenv (0) output (coleam00#1302)

strip-cwd-env.ts: add quiet: true to CWD file parsing so dotenv's
"injecting env (0) from .env" output is suppressed; after the delete
loop emit a single "[archon] stripped N keys from <cwd> (<files>)"
line only when keys were actually stripped.

cli.ts and server/index.ts: add quiet: true to ~/.archon/.env loads;
emit "[archon] loaded N keys from ~/.archon/.env" only when the file
has entries.

Fixes coleam00#1302

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Wirasm
Copy link
Copy Markdown
Collaborator

Wirasm commented Apr 20, 2026

@cryptobirr, this is some genuinely valuable contribution, but please split the pr up, one pr per concern and i will review each pr. current state will not merge

thank you

@b1skit
Copy link
Copy Markdown

b1skit commented Apr 28, 2026

@Wirasm @cryptobirr I've hit #1237, posting here as it's still open but doesn't seem to have been included in the individual PRs?

I think it might be the same underlying issue for #1452 ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment