
feat: contextFiles — per-project prompt context injection#1

Merged: mhooooo merged 5 commits into dev from feat/context-files-prompt-injection on Apr 12, 2026

mhooooo commented Apr 12, 2026

Summary

  • Add contextFiles to RepoConfig/MergedConfig so any project can declare files injected into the conversation prompt
  • Two-layer path security: reject absolute paths + .. at merge time, resolve-check at read time
  • 20K char cap, warn-level log on missing files, per-file provenance headings
  • Use case: moo-second-brain injects SOUL.md/USER.md/MEMORY.md for Slack JARVIS identity
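
The two-layer path check in the summary can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `isSafeContextPath` and `resolveInsideRoot` are hypothetical names, and the sketch does not follow symlinks (a real implementation might also compare real paths).

```typescript
import * as path from "node:path";

// Layer 1 (merge time): syntactic rejection of absolute paths and ".." segments.
// Hypothetical helper; the PR only says mergeRepoConfig() performs this validation.
function isSafeContextPath(relPath: string): boolean {
  if (relPath.length === 0) return false;
  if (path.isAbsolute(relPath) || path.win32.isAbsolute(relPath)) return false;
  return !relPath.split(/[\\/]+/).includes("..");
}

// Layer 2 (read time): resolve against the project root and confirm the
// resolved path still lies strictly inside that root. Returns null otherwise.
function resolveInsideRoot(root: string, relPath: string): string | null {
  const absRoot = path.resolve(root);
  const resolved = path.resolve(absRoot, relPath);
  const rel = path.relative(absRoot, resolved);
  const escapes =
    rel === "" ||
    rel === ".." ||
    rel.startsWith(".." + path.sep) ||
    path.isAbsolute(rel);
  return escapes ? null : resolved;
}
```

Running both checks is redundant by design: the first catches bad config early, the second guards the actual file read even if a path reaches it by another route.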

Test plan

  • bun run type-check passes
  • bun run test passes
  • Config loader tests: propagation, absolute path rejection, .. rejection, backward compat
  • Prompt builder tests: content included/excluded, ordering after routing rules
  • Manual Slack smoke test after Archon server restart

Closes mhooooo/moo-second-brain#57

🤖 Generated with Claude Code

mhooooo and others added 5 commits April 11, 2026 07:07
…lack

When the Claude Code SDK rejects a resume attempt with "No conversation
found" (the SDK session ID is gone), the orchestrator now transparently
resets the session and retries the query instead of surfacing an error
that the user has to /reset manually.

Also accepts bare 'reset' without the leading slash on Slack, since Slack
intercepts /reset as its own slash command.
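
A minimal sketch of the bare-command normalization, assuming the command set contains only `reset` (the only bare command this message confirms) and that the platform is identified by a plain string. `normalizeCommand` is a hypothetical name.

```typescript
// Assumption: only 'reset' is confirmed by the commit message.
const SLACK_BARE_COMMANDS: ReadonlySet<string> = new Set(["reset"]);

// Rewrites a bare command to its slash form, but only on Slack and only
// when the whole message is the command. Everything else passes through.
function normalizeCommand(text: string, platform: string): string {
  const candidate = text.trim().toLowerCase();
  if (platform === "slack" && SLACK_BARE_COMMANDS.has(candidate)) {
    return `/${candidate}`;
  }
  return text;
}
```

Matching the whole trimmed message keeps ordinary sentences that merely contain the word "reset" from being treated as commands.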

Changes:
- claude.ts: classify stale_session as a non-retryable error class
  (checked before 'crash' — specific wins over generic); export
  STALE_SESSION_PATTERNS as the single source of truth for both the
  classifier and the orchestrator's isStaleSessionError() helper
- session-transitions.ts: new 'stale-session-cleared' transition
  (deactivates — next message creates a fresh session)
- orchestrator-agent.ts: isStaleSessionError() helper; SLACK_BARE_COMMANDS
  normalization scoped to Slack platform only; handleStreamMode and
  handleBatchMode wrap their AI query loops in runStreamQuery() /
  runBatchQuery() functions so a catch block can reset sessionForQuery
  and re-run with the fresh session ID; state reset before retry
  (allMessages/allChunks/assistantMessages/commandDetected) so partial
  content from the failed attempt never bleeds into the fresh response
- claude.test.ts: stale_session classification tests, including priority
  over 'crash' on overlapping error messages, and .cause assertions
- orchestrator.test.ts: parameterized stream/batch retry tests covering
  successful reset+retry, no-third-retry guard, null-session skip, and
  fresh session ID assertion on retry
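
The reset-and-retry flow described above might look roughly like this. It is a sketch under stated assumptions: the pattern list holds only the one message quoted in this commit, and `queryWithStaleSessionRetry`, `RunQuery`, and `clearSession` are hypothetical names standing in for the runStreamQuery()/runBatchQuery() wrappers and the session transition.

```typescript
// Assumption: the real STALE_SESSION_PATTERNS may contain more entries.
const STALE_SESSION_PATTERNS: readonly string[] = ["No conversation found"];

function isStaleSessionError(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return STALE_SESSION_PATTERNS.some((p) => message.includes(p));
}

// Hypothetical shape: one query attempt, resuming `sessionId` if non-null.
type RunQuery<T> = (sessionId: string | null) => Promise<T>;

async function queryWithStaleSessionRetry<T>(
  sessionId: string | null,
  run: RunQuery<T>,
  clearSession: () => Promise<void>,
): Promise<T> {
  try {
    return await run(sessionId);
  } catch (err) {
    // Skip the retry when nothing was being resumed or the error is unrelated.
    if (sessionId === null || !isStaleSessionError(err)) throw err;
    await clearSession(); // e.g. record the 'stale-session-cleared' transition
    return run(null); // single retry; a second failure propagates to the caller
  }
}
```

Because each attempt is a separate call to `run`, state accumulated by the failed attempt stays local to that call and cannot leak into the retry.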

Ported from the dynamous/remote-coding-agent fork (commit 229217cf) with
the following intentional deltas against the new v0.3.5 base:
- Dropped defaultCodebase auto-scoping block (Patch 1 not carried —
  CONFIG-REPLACEABLE per investigation verdict)
- Slack bare-command normalization scoped to Slack platform only
  (fork shipped unscoped initially; this change came in a later
  review-findings sub-commit)
- runStreamQuery/runBatchQuery keep upstream's 4-arg aiClient.sendQuery
  signature including requestOptions (fork was on pre-v0.3.2 3-arg shape)
- Upstream deterministic command list preserved (help, status, reset,
  workflow, register-project, update-project, remove-project, commands,
  init, worktree) — fork only had 5
- No CHANGELOG / bun.lock / package.json / docs changes — those will
  be rebuilt on top of the v0.3.5 base

Upstream-PR candidate for coleam00/Archon.
… tool

The previous /invoke-workflow text-sentinel approach was unreliable —
Claude would emit the sentinel inconsistently (mid-response, inside code
blocks, with extra text, or not at all). The fallback post-loop regex
parser caught some cases but left a persistent failure mode where
workflows either didn't dispatch or dispatched at the wrong time.

Migrate to an in-process MCP server exposing invoke_workflow as a real
typed tool call. Claude now dispatches workflows by calling a function
with structured parameters, which is deterministic and reliable.
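
To illustrate the typed, fire-and-forget tool call without pulling in the SDK, here is a dependency-free sketch. The commit itself uses createSdkMcpServer with a zod schema; the hand-written validation below is a stand-in for that schema, and `parseInvokeWorkflowArgs`, `handleInvokeWorkflow`, and the `Dispatch` type are hypothetical.

```typescript
interface InvokeWorkflowArgs {
  workflow_name: string;
  project_name: string;
  task_description: string;
}

// Stand-in for the zod schema: every field must be a non-empty string.
function parseInvokeWorkflowArgs(input: unknown): InvokeWorkflowArgs {
  if (typeof input !== "object" || input === null) {
    throw new Error("arguments must be an object");
  }
  const rec = input as Record<string, unknown>;
  const read = (key: keyof InvokeWorkflowArgs): string => {
    const value = rec[key];
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error(`${key} must be a non-empty string`);
    }
    return value;
  };
  return {
    workflow_name: read("workflow_name"),
    project_name: read("project_name"),
    task_description: read("task_description"),
  };
}

type Dispatch = (args: InvokeWorkflowArgs) => Promise<void>;

// Fire-and-forget: start the dispatch, route failures to onError, and return
// a short confirmation right away so the conversation turn can end.
function handleInvokeWorkflow(
  input: unknown,
  dispatch: Dispatch,
  onError: (err: unknown) => void,
): string {
  const args = parseInvokeWorkflowArgs(input);
  void dispatch(args).catch(onError);
  return `Workflow "${args.workflow_name}" dispatched for project "${args.project_name}".`;
}
```

Attaching `.catch` matters in a fire-and-forget handler: without it, a failed dispatch would surface as an unhandled promise rejection.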

Changes:
- packages/core/src/orchestrator/workflow-tool.ts (NEW): buildWorkflowMcpServer
  factory using createSdkMcpServer. Registers invoke_workflow tool with
  zod schema for workflow_name / project_name / task_description. Tool is
  fire-and-forget — it kicks off dispatchOrchestratorWorkflow via the
  injected dispatch callback and returns immediately so the conversation
  turn can end cleanly.
- packages/core/src/orchestrator/codebase-utils.ts (NEW): findCodebaseByName
  helper — org-qualified and case-insensitive project matching, extracted
  from orchestrator-agent.ts to eliminate duplication between workflow-tool.ts
  and the register-project handler.
- packages/core/src/orchestrator/workflow-tool.test.ts (NEW): 8 tests
  covering server shape, error paths, dispatch happy path, error handling,
  case-insensitive project matching, org-qualified matching, and zod
  validation of task_description.
- orchestrator-agent.ts:
  - handleStreamMode and handleBatchMode each build a workflowMcpServer
    at entry via buildWorkflowMcpServer({ ... dispatch }), then pass it
    via requestOptions.mcpServers['archon-tools'] to aiClient.sendQuery.
    Caller-provided requestOptions are merged, not overwritten, so outer
    MCP config still works.
  - /invoke-workflow text-sentinel detection removed from both stream
    and batch post-loop command parsers. /register-project still uses
    the text sentinel since it needs inline-parseable user-visible output.
  - handleWorkflowInvocationResult function deleted (dead code after
    sentinel removal).
  - issueContext parameter renamed to _issueContext in handleStreamMode
    and handleBatchMode to document that it's unused — issue context now
    travels through the task_description field of the tool call instead.
  - Imports: buildWorkflowMcpServer and findCodebaseByName added.
- prompt-builder.ts: router description rewritten to describe the
  invoke_workflow tool interface (tool parameters) instead of the
  text-sentinel command syntax.
- orchestrator-agent.test.ts: workflow-tool module mock added.
- prompt-builder.test.ts: assertions updated to match the new
  tool-based routing instructions.
- error-formatter.ts: "Use /reset" → "Use `reset`" for the session-error
  fallback message (Slack intercepts /reset as its own slash command;
  bare 'reset' is accepted by the orchestrator after commit 1's
  SLACK_BARE_COMMANDS normalization).
- error-formatter.test.ts: test expectation updated.
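
A possible shape for the findCodebaseByName helper, as a sketch only. The commit says matching is org-qualified and case-insensitive; the exact precedence below (full-name match first, then bare repo name against the last segment of `org/repo`) is an assumption.

```typescript
// Works with any record that has a `name`, e.g. "archon" or "coleam00/Archon".
function findCodebaseByName<T extends { name: string }>(
  codebases: readonly T[],
  query: string,
): T | undefined {
  const wanted = query.trim().toLowerCase();
  if (wanted === "") return undefined;

  // Prefer a case-insensitive match on the full (possibly org-qualified) name.
  const exact = codebases.find((c) => c.name.toLowerCase() === wanted);
  if (exact !== undefined) return exact;

  // Otherwise compare the bare name against the last segment of "org/repo".
  return codebases.find((c) => c.name.toLowerCase().split("/").pop() === wanted);
}
```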

Ported from dynamous/remote-coding-agent fork commit 3df00e1b with the
following intentional deltas:
- Dropped: Slack thinking indicator (⏳ emoji) — the feature was broken
  in v0.2 (emoji flashed on then off because the wrapper's await fn()
  returned immediately on a fire-and-forget handler) and not worth
  carrying forward without a fix.
- Kept: upstream's 4-arg aiClient.sendQuery signature with requestOptions;
  MCP server merged into caller-provided requestOptions rather than
  replacing them.
- Kept: upstream's longer deterministic command list (help, status,
  reset, workflow, register-project, update-project, remove-project,
  commands, init, worktree) — fork only had 5.
- Skipped: CHANGELOG, CLAUDE.md, docs/adapters/slack.md changes —
  upstream docs have diverged; docs will be rebuilt on top of v0.3.5.
- Skipped: packages/core/package.json MCP SDK dependency bump — will
  resolve naturally via bun install once the runtime is assembled.
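
The "merged, not overwritten" behaviour for requestOptions can be sketched as below. `RequestOptions` here is a simplified hypothetical shape, not the real type; only the `mcpServers['archon-tools']` key comes from this commit.

```typescript
// Simplified stand-in for the real request options type.
interface RequestOptions {
  mcpServers?: Record<string, unknown>;
  [key: string]: unknown;
}

// Returns a new options object: caller fields and caller MCP servers are
// preserved, and the workflow server is added under 'archon-tools'.
function withWorkflowServer(
  requestOptions: RequestOptions | undefined,
  workflowServer: unknown,
): RequestOptions {
  return {
    ...requestOptions,
    mcpServers: {
      ...requestOptions?.mcpServers,
      "archon-tools": workflowServer,
    },
  };
}
```

Note that in this sketch a caller-supplied server also named `archon-tools` would be replaced, since the workflow server is spread last.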

Upstream-PR candidate for coleam00/Archon.

feat: supervised-autonomous Slack "go" dispatch trigger

Implements the approval gate for JARVIS's three-axis trust model:
heartbeat.py classifies GitHub issues and posts workflow proposals as
Slack DMs; Moo replies "go" (or "go #N") to authorize dispatch without
leaving Slack. One "go" = one dispatch. Silence means deny. Proposals
expire after a 24h TTL.

This is the JARVIS-specific delta of the fork — not upstreamable.
coleam00/Archon has no proposal/approval/supervised concept, and its
routing philosophy (user → Archon → workflow directly) doesn't match
JARVIS's heartbeat-classifier → proposal → human-approval model.

Changes:
- packages/server/src/proposals.ts (NEW): in-memory ProposalQueue with
  prune-on-read TTL expiry, enqueue/getPending/getAll/markDispatched
  methods. Singleton exported as proposalQueue. 104 lines.
- packages/server/src/proposals.test.ts (NEW): 10 unit tests, 100%
  coverage — TTL expiry, duplicate dispatch prevention, channel
  scoping, pending/dispatched filtering, issue-number lookup.
- packages/server/src/routes/api.ts: POST /api/proposals endpoint.
  Called by heartbeat.py after the Slack triage DM. Validates required
  fields (channelId, workflowName, codebaseName, userMessage) +
  optional issueNumber/branchName. Returns 201 with proposal id +
  expiresAt. Coexists with upstream's new registerOpenApiRoute update-
  check route from v0.3.3.
- packages/core/src/orchestrator/orchestrator-agent.ts:
  dispatchApprovedWorkflow() exported — takes pre-approved workflow +
  codebase + user message + isolation hints, bypasses the AI router
  (no Claude turn), resolves the codebase via findCodebaseByName,
  calls dispatchOrchestratorWorkflow directly.
- packages/core/src/index.ts: re-export dispatchApprovedWorkflow.
- packages/server/src/index.ts: "go" pre-check in the Slack onMessage
  handler — regex /^go(\s+#?(\d+))?$/i runs BEFORE normal message
  processing. If matched: reads pending proposals for the channel,
  resolves which to approve (all, or #N), markDispatched BEFORE
  starting (idempotency guard), fires each dispatch through
  lockManager.acquireLock + dispatchApprovedWorkflow, sends status
  messages for empty-queue / already-dispatched / no-match cases.
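
A compact sketch of an in-memory queue with prune-on-read TTL expiry, in the spirit of proposals.ts. The method names come from this commit; the field layout, id format, and injectable clock are assumptions made for illustration and testing.

```typescript
interface Proposal {
  id: string;
  channelId: string;
  workflowName: string;
  codebaseName: string;
  userMessage: string;
  issueNumber?: number;
  branchName?: string;
  expiresAt: number; // epoch milliseconds
  dispatched: boolean;
}

type NewProposal = Omit<Proposal, "id" | "expiresAt" | "dispatched">;

class ProposalQueue {
  private readonly proposals = new Map<string, Proposal>();
  private readonly ttlMs: number;
  private readonly now: () => number;
  private nextId = 1;

  constructor(ttlMs: number = 24 * 60 * 60 * 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  enqueue(input: NewProposal): Proposal {
    this.prune();
    const proposal: Proposal = {
      ...input,
      id: `proposal-${this.nextId++}`, // hypothetical id scheme
      expiresAt: this.now() + this.ttlMs,
      dispatched: false,
    };
    this.proposals.set(proposal.id, proposal);
    return proposal;
  }

  getAll(channelId: string): Proposal[] {
    this.prune();
    return Array.from(this.proposals.values()).filter((p) => p.channelId === channelId);
  }

  getPending(channelId: string): Proposal[] {
    return this.getAll(channelId).filter((p) => !p.dispatched);
  }

  // Returns false when the proposal is unknown, expired, or already dispatched.
  markDispatched(id: string): boolean {
    this.prune();
    const proposal = this.proposals.get(id);
    if (proposal === undefined || proposal.dispatched) return false;
    proposal.dispatched = true;
    return true;
  }

  // Prune-on-read: expired entries are dropped whenever the queue is touched.
  private prune(): void {
    const now = this.now();
    for (const [id, p] of Array.from(this.proposals)) {
      if (p.expiresAt <= now) this.proposals.delete(id);
    }
  }
}
```

Pruning on access avoids a background timer, at the cost of expired entries lingering in memory until the next call.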

Trust model: the approval gate is here. Slack adapter whitelist
enforces who can reach this point. markDispatched happens before
execution, so re-sending "go" returns a status confirmation instead of
double-dispatching. Proposals expire after TTL — no standing approvals.
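
The "go" pre-check and the mark-before-dispatch guard could be sketched like this. The regex is the one quoted above; `parseGo`, `approveProposals`, and the callback shapes are hypothetical, and trimming the message before matching is an assumption.

```typescript
const GO_PATTERN = /^go(\s+#?(\d+))?$/i;

type GoCommand = { kind: "all" } | { kind: "issue"; issueNumber: number };

// Returns null for anything that is not exactly "go" or "go #N" / "go N".
function parseGo(message: string): GoCommand | null {
  const match = GO_PATTERN.exec(message.trim());
  if (match === null) return null;
  const digits = match[2];
  return digits === undefined
    ? { kind: "all" }
    : { kind: "issue", issueNumber: Number(digits) };
}

interface PendingProposal {
  id: string;
  issueNumber?: number;
}

// Marks each selected proposal as dispatched BEFORE starting it, so a
// repeated "go" finds nothing left to start. Returns how many were started.
function approveProposals(
  command: GoCommand,
  pending: readonly PendingProposal[],
  markDispatched: (id: string) => boolean,
  dispatch: (proposal: PendingProposal) => void,
): number {
  const issue = command.kind === "issue" ? command.issueNumber : undefined;
  const selected = pending.filter((p) => issue === undefined || p.issueNumber === issue);
  let started = 0;
  for (const proposal of selected) {
    if (!markDispatched(proposal.id)) continue; // already dispatched or expired
    dispatch(proposal);
    started++;
  }
  return started;
}
```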

Ported from dynamous/remote-coding-agent fork commit 69f24fdd with the
following integration detail:
- The "go" handler block was mis-merged by git (left adjacent to the
  GitLab adapter init block, which wasn't its correct location). The
  block was removed from there and manually injected into the Slack
  onMessage callback at the correct pre-check point (after content is
  extracted from the bot mention strip, before thread context check
  and handleMessage dispatch).
- The upstream registerOpenApiRoute update-check route (new in v0.3.3)
  and the fork's app.post('/api/proposals') route both land at the end
  of registerRoutes() — both are preserved, patch's route appended
  after upstream's.
- createMessageErrorHandler, getLog, lockManager, slackAdapter, and
  conversationId are all already in the enclosing scope of the Slack
  onMessage callback — no new imports required in server/index.ts.

PERMANENT fork delta — NOT an upstream-PR candidate.

fix(tests): align carried tests with v0.3.5 base

Four independent test fixes required to make the carried commits
(stale sessions + invoke_workflow MCP tool) pass on the v0.3.5 base.
Grouped as one cleanup commit rather than splitting across fixups
because the changes span commit 1 and commit 2's test files and would
create misleading squash semantics if backfilled.

Fixes:

1. claude.test.ts: re-nest classifySubprocessError inside describe('ClaudeClient')
   The cherry-pick auto-merge closed ClaudeClient before the fork's
   new classifySubprocessError describe block, which in turn contains
   upstream's pre-spawn env leak gate tests that reference `client`
   (declared inside ClaudeClient). Result: ReferenceError: client is
   not defined on 4 env-leak-gate tests. Moved the closing }); of
   ClaudeClient from line 1006 to end-of-file, effectively nesting
   classifySubprocessError + pre-spawn env leak gate back inside.

2. orchestrator.test.ts: update stale session retry test expected arg
   The fork's version expected transitionSession to be called with
   the platform conversation ID ('chat-456'). Upstream's convention
   (line 832's 'first-message' transition) uses the DB conversation
   ID. My stale-session-cleared transition was auto-merged to follow
   upstream's convention (conversation.id), so update the two failing
   retry tests to expect mockConversation.id instead of 'chat-456'.

3. orchestrator.test.ts: add Slack platform mock to bare command tests
   Commit 1 scopes SLACK_BARE_COMMANDS normalization to Slack only
   (per the fork's later review-findings sub-commit). The fork's
   tests were written when normalization was unscoped. Added a
   beforeEach in the bare command normalization describe to mock
   platform.getPlatformType() = 'slack'.

4. orchestrator.test.ts: delete obsolete /invoke-workflow text-sentinel tests
   Commit 2 removed text-sentinel detection from handleStreamMode
   and handleBatchMode, and the handleWorkflowInvocationResult
   function was deleted entirely. Tests that exercised text-sentinel
   dispatch behavior (3 in stream mode, 5 in workflow routing via AI,
   8 total) are testing dead code. Deleted them rather than
   converting to MCP-tool tests — MCP dispatch is already covered by
   workflow-tool.test.ts (added by commit 2).

After this commit: full test suite exit 0 across all 9 packages.

Add `contextFiles` to RepoConfig/MergedConfig so any project can declare
files whose content gets injected into the conversation prompt when the
conversation is scoped to that project.

Use case: moo-second-brain injects SOUL.md/USER.md/MEMORY.md so Slack
JARVIS starts with identity context instead of generic Archon personality.

- contextFiles on RepoConfig + MergedConfig interfaces
- mergeRepoConfig() propagation with path validation (rejects absolute + ..)
- buildProjectScopedPrompt() accepts optional contextContent
- orchestrator-agent reads files from codebase.default_cwd with
  defense-in-depth resolve check, 20K char cap, warn on missing
- Config loader tests (propagation, path rejection, backward compat)
- Prompt builder tests (content included/excluded, ordering)
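
How the context content might be assembled, as a hedged sketch. Assumptions not confirmed by this commit: the 20K cap is treated as 20,000 characters applied to the combined content, the provenance heading is `## <path>`, and file access is injected through a `read` callback (returning undefined for a missing file) so the sketch stays free of filesystem calls. `buildContextContent` is a hypothetical name.

```typescript
const MAX_CONTEXT_CHARS = 20000; // assumption: "20K" means 20,000, total

function buildContextContent(
  files: readonly string[],
  read: (relPath: string) => string | undefined,
  warn: (message: string) => void,
  maxChars: number = MAX_CONTEXT_CHARS,
): string {
  let out = "";
  for (const relPath of files) {
    if (out.length >= maxChars) break; // cap reached, skip remaining files
    const body = read(relPath);
    if (body === undefined) {
      warn(`context file not found: ${relPath}`); // missing files are skipped
      continue;
    }
    // Per-file provenance heading so the model can tell the sources apart.
    const section = `## ${relPath}\n\n${body.trim()}\n`;
    out += (out === "" ? "" : "\n") + section;
  }
  return out.slice(0, maxChars);
}
```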

Closes mhooooo/moo-second-brain#57

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

mhooooo merged commit 110eddf into dev on Apr 12, 2026
mhooooo deleted the feat/context-files-prompt-injection branch on Apr 12, 2026, 22:17
mhooooo added a commit that referenced this pull request Apr 12, 2026
* feat: auto-reset stale Claude SDK sessions + accept bare 'reset' on Slack

* feat: replace /invoke-workflow text sentinel with invoke_workflow MCP tool

* feat: supervised-autonomous Slack "go" dispatch trigger

* fix(tests): align carried tests with v0.3.5 base

* feat: contextFiles — per-project prompt context injection

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mhooooo pushed a commit that referenced this pull request Apr 25, 2026
…gent) (coleam00#1270)

* feat(providers): add Pi community provider (@mariozechner/pi-coding-agent)

Introduces Pi as the first community provider under the Phase 2 registry,
registered with builtIn: false. Wraps Pi's full coding-agent harness the
same way ClaudeProvider wraps @anthropic-ai/claude-agent-sdk and
CodexProvider wraps @openai/codex-sdk.

- PiProvider implements IAgentProvider; fresh AgentSession per sendQuery call
- AsyncQueue bridges Pi's callback-based session.subscribe() to Archon's
  AsyncGenerator<MessageChunk> contract
- Server-safe: AuthStorage.inMemory + SessionManager.inMemory +
  SettingsManager.inMemory + DefaultResourceLoader with all no* flags —
  no filesystem access, no cross-request state
- API key seeded per-call from options.env → process.env fallback
- Model refs: '<pi-provider-id>/<model-id>' (e.g. google/gemini-2.5-pro,
  openrouter/qwen/qwen3-coder) with syntactic compatibility check
- registerPiProvider() wired at CLI, server, and config-loader entrypoints,
  kept separate from registerBuiltinProviders() since builtIn: false is
  load-bearing for the community-provider validation story
- All 12 capability flags declared false in v1 — dag-executor warnings fire
  honestly for any unmapped nodeConfig field
- 58 new tests covering event mapping, async-queue semantics, model-ref
  parsing, defensive config parsing, registry integration
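
The callback-to-generator bridge mentioned above can be sketched with a small queue. This is an illustrative version only, not the adapter's AsyncQueue: it omits error propagation and assumes a single consumer.

```typescript
class AsyncQueue<T> implements AsyncIterable<T> {
  private readonly items: T[] = [];
  private readonly waiters: Array<(result: IteratorResult<T>) => void> = [];
  private closed = false;

  // Called from the callback side (e.g. inside a subscribe() handler).
  push(item: T): void {
    if (this.closed) return;
    const waiter = this.waiters.shift();
    if (waiter !== undefined) waiter({ value: item, done: false });
    else this.items.push(item);
  }

  // Signals end of stream; buffered items are still delivered first.
  close(): void {
    this.closed = true;
    for (const waiter of this.waiters.splice(0)) {
      waiter({ value: undefined, done: true });
    }
  }

  async *[Symbol.asyncIterator](): AsyncGenerator<T> {
    while (true) {
      if (this.items.length > 0) {
        yield this.items.shift() as T;
        continue;
      }
      if (this.closed) return;
      // Nothing buffered: park until push() or close() wakes us up.
      const next = await new Promise<IteratorResult<T>>((resolve) => {
        this.waiters.push(resolve);
      });
      if (next.done) return;
      yield next.value;
    }
  }
}
```

A producer calls `push` for each event and `close` at the end, while the consumer simply uses `for await (const chunk of queue)`.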

Supported Pi providers (v1): anthropic, openai, google, groq, mistral,
cerebras, xai, openrouter, huggingface. Extend PI_PROVIDER_ENV_VARS as
needed.
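
A sketch of parsing the `<pi-provider-id>/<model-id>` model refs. The provider list is the one given above; rejecting unknown providers at parse time is an assumption (the commit only mentions a syntactic compatibility check), and `parsePiModelRef` is a hypothetical name.

```typescript
const PI_PROVIDERS = [
  "anthropic", "openai", "google", "groq", "mistral",
  "cerebras", "xai", "openrouter", "huggingface",
] as const;

type PiProviderId = (typeof PI_PROVIDERS)[number];

interface PiModelRef {
  provider: PiProviderId;
  modelId: string;
}

// Splits on the FIRST slash only, so model ids may themselves contain
// slashes (e.g. "openrouter/qwen/qwen3-coder").
function parsePiModelRef(ref: string): PiModelRef | null {
  const slash = ref.indexOf("/");
  if (slash <= 0 || slash === ref.length - 1) return null;
  const provider = ref.slice(0, slash);
  const modelId = ref.slice(slash + 1);
  if (!(PI_PROVIDERS as readonly string[]).includes(provider)) return null;
  return { provider: provider as PiProviderId, modelId };
}
```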

Out of scope (v1): session resume, MCP, hooks, skills mapping, thinking
level mapping, structured output, OAuth flows, model catalog validation.
These remain false on PI_CAPABILITIES until intentionally wired.

* feat(providers/pi): read ~/.pi/agent/auth.json for OAuth + api_key passthrough

Replaces the v1 env-var-only auth flow with AuthStorage.create(), which
reads ~/.pi/agent/auth.json. This transparently picks up credentials the
user has populated via `pi` → `/login` (OAuth subscriptions: Claude
Pro/Max, ChatGPT Plus, GitHub Copilot, Gemini CLI, Antigravity) or by
editing the file directly.

Env-var behavior preserved: when ANTHROPIC_API_KEY / GEMINI_API_KEY /
etc. is set (in process.env or per-request options.env), the adapter
calls setRuntimeApiKey which is priority #1 in Pi's resolution chain.
Auth.json entries are priority #2-#3. Pi's internal env-var fallback
remains priority #4 as a safety net.
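
The resolution order can be illustrated with a simplified model. This is not Pi's implementation: the env-var map lists only the two names quoted above (the google to GEMINI_API_KEY pairing is an assumption), auth.json is reduced to a provider-to-credential map, and Pi's own internal env-var fallback is not modeled. `resolveApiKey` is a hypothetical name.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: a two-entry excerpt; the real PI_PROVIDER_ENV_VARS covers more.
const PROVIDER_ENV_VARS = new Map<string, string>([
  ["anthropic", "ANTHROPIC_API_KEY"],
  ["google", "GEMINI_API_KEY"],
]);

type KeySource = "request-env" | "process-env" | "auth-json" | "none";

interface ResolvedKey {
  key?: string;
  source: KeySource;
}

function resolveApiKey(
  provider: string,
  requestEnv: Env,
  processEnv: Env,
  authJson: ReadonlyMap<string, string>,
): ResolvedKey {
  const envVar = PROVIDER_ENV_VARS.get(provider);
  if (envVar !== undefined) {
    // An env var wins over auth.json; per-request env beats process env.
    const fromRequest = requestEnv[envVar];
    if (fromRequest) return { key: fromRequest, source: "request-env" };
    const fromProcess = processEnv[envVar];
    if (fromProcess) return { key: fromProcess, source: "process-env" };
  }
  // Next: credentials the user stored through the Pi CLI.
  const stored = authJson.get(provider);
  if (stored) return { key: stored, source: "auth-json" };
  return { source: "none" };
}
```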

Archon does not implement OAuth flows itself — it only rides on creds
the user created via the Pi CLI. OAuth refresh still happens inside Pi
(auth-storage.ts:369-413) under a file lock; concurrent refreshes
between the Pi CLI and Archon are race-safe by Pi's own design.

- Fail-fast error now mentions both the env-var path and `pi /login`
- 2 new tests: OAuth cred from auth.json; env var wins over auth.json
- 12 existing tests still pass (env-var-only path unchanged)

CI compatibility: no auth.json in CI, no change — env-var (secrets)
flows through Pi's getEnvApiKey fallback identically to v1.

* test(e2e): add Pi provider smoke test workflow

Mirrors e2e-claude-smoke.yaml: single prompt node + bash assert.
Targets `anthropic/claude-haiku-4-5` via `provider: pi`; works in CI
(ANTHROPIC_API_KEY secret) and locally (user's `pi /login` OAuth).

Verified locally with an Anthropic OAuth subscription — full run takes
~4s from session_started to assert PASS, exercising the async-queue
bridge and agent_end → result-chunk assembly under real Pi event timing.

Not yet wired into .github/workflows/e2e-smoke.yml — separate PR once
this lands, to keep the Pi provider PR minimal.

* feat(providers/pi): v2 — thinkingLevel, tool restrictions, systemPrompt

Extends the Pi adapter with three node-level translations, flipping the
corresponding capability flags from false → true so the dag-executor no
longer emits warnings for these fields on Pi nodes.

1. effort / thinking → Pi thinkingLevel (options-translator.ts)
   - Archon EffortLevel enum: low|medium|high|max (from
     packages/workflows/src/schemas/dag-node.ts). `max` maps to Pi's
     `xhigh` since Archon's enum lacks it.
   - Pi-native strings (minimal, xhigh, off) also accepted for
     programmatic callers bypassing the schema.
   - `off` on either field → no thinkingLevel (Pi's implicit off).
   - Claude-shape object `thinking: {type:'enabled', budget_tokens:N}`
     yields a system warning and is not applied.

2. allowed_tools / denied_tools → filtered Pi built-in tools
   - Supports all 7 Pi tools: read, bash, edit, write, grep, find, ls.
   - Case-insensitive normalization.
   - Empty `allowed_tools: []` means no tools (LLM-only), matching
     e2e-claude-smoke's idiom.
   - Unknown names (Claude-specific like `WebFetch`) collected and
     surfaced as a system warning; ignored tools don't fail the run.

3. systemPrompt (AgentRequestOptions + nodeConfig.systemPrompt)
   - Threaded through `DefaultResourceLoader({systemPrompt})`; Pi's
     default prompt is replaced entirely. Request-level wins over
     node-level.
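
A rough sketch of the first two translations, under stated assumptions: `toThinkingLevel` and `filterTools` are hypothetical names, `low`/`medium`/`high` are assumed to pass through unchanged, and an undefined `allowed_tools` is assumed to mean "start from all seven tools". The real options-translator.ts may differ.

```typescript
// Hypothetical sketch; not the actual options-translator.ts.
type PiThinkingLevel = "minimal" | "low" | "medium" | "high" | "xhigh";

const PI_LEVELS: readonly string[] = ["minimal", "low", "medium", "high", "xhigh"];
const PI_TOOLS: readonly string[] = ["read", "bash", "edit", "write", "grep", "find", "ls"];

function toThinkingLevel(value: unknown): { level?: PiThinkingLevel; warning?: string } {
  if (value === undefined || value === null) return {};
  if (typeof value !== "string") {
    // Claude-shape objects are reported and not applied.
    return { warning: "object-shaped thinking config is not applied on Pi" };
  }
  if (value === "off") return {};                   // Pi's implicit off
  if (value === "max") return { level: "xhigh" };   // Archon's max maps to Pi's xhigh
  if (PI_LEVELS.indexOf(value) !== -1) return { level: value as PiThinkingLevel };
  return { warning: `unknown thinking level: ${value}` };
}

function filterTools(
  allowed?: string[],
  denied?: string[]
): { tools: string[]; unknown: string[] } {
  const unknown: string[] = [];
  // Lowercase each name; collect names that are not Pi built-ins.
  const normalize = (names: string[]): string[] => {
    const out: string[] = [];
    for (const name of names) {
      const lower = name.toLowerCase();
      if (PI_TOOLS.indexOf(lower) !== -1) out.push(lower);
      else unknown.push(name);
    }
    return out;
  };
  // Assumption: no allow-list means all built-ins; an empty list means none.
  const base: string[] = allowed === undefined ? PI_TOOLS.slice() : normalize(allowed);
  const deny = normalize(denied ?? []);
  return { tools: base.filter((t) => deny.indexOf(t) === -1), unknown };
}
```

Returning the unknown names instead of throwing matches the behaviour described above, where ignored tools produce a warning and the run continues.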

Capability flag changes:
- thinkingControl: false → true
- effortControl:   false → true
- toolRestrictions: false → true

Package delta:
- +1 direct dep: @sinclair/typebox (Pi types reference it; adding as
  direct dep resolves the TS portable-type error).
- +1 test file: options-translator.test.ts (19 tests, 100% coverage).
- provider.test.ts extended with 11 new tests covering all three paths.
- registry.test.ts updated: capability assertion reflects new flags.

Live-verified: `bun run cli workflow run e2e-pi-smoke --no-worktree`
succeeds in 1.2s with thinkingLevel=low, toolCount=0. Smoke YAML updated
to use `effort: low` (schema-valid) + `allowed_tools: []` (LLM-only).

* test(e2e): add comprehensive Pi smoke covering every CI-compatible node type

Exercises every node type Archon supports under `provider: pi`, except
`approval:` (pauses for human input, incompatible with CI):
  1. prompt   — inline AI prompt
  2. command  — named command file (uses e2e-echo-command.md)
  3. loop     — bounded iterative AI prompt (max_iterations: 2)
  4. bash     — shell script with JSON output
  5. script   — bun runtime (echo-args.js)
  6. script   — uv / Python runtime (echo-py.py)

Plus DAG features on top of Pi:
  - depends_on + $nodeId.output substitution
  - when: conditional with JSON dot-access
  - trigger_rule: all_success merge
  - final assert node validates every upstream output is non-empty

Complements the minimal e2e-pi-smoke.yaml — that stays as the fast-path
smoke for connectivity checks; this one is the broader surface coverage.

Verified locally end-to-end against Anthropic OAuth (pi /login): PASS,
all 9 non-final nodes produce output, assert succeeds.

* feat(providers/pi): resolve Archon `skills:` names to Pi skill paths

Flips capabilities.skills: false → true by translating Archon's name-based
`skills:` nodeConfig (e.g. `skills: [agent-browser]`) to absolute directory
paths Pi's DefaultResourceLoader can consume via additionalSkillPaths.

Search order for each skill name (first match wins):
  1. <cwd>/.agents/skills/<name>/      — project-local, agentskills.io
  2. <cwd>/.claude/skills/<name>/      — project-local, Claude convention
  3. ~/.agents/skills/<name>/          — user-global, agentskills.io
  4. ~/.claude/skills/<name>/          — user-global, Claude convention

A directory resolves only if it contains a SKILL.md. Unresolved names are
collected and surfaced as a system-chunk warning (e.g. "Pi could not
resolve skill names: foo, bar. Searched .agents/skills and .claude/skills
(project + user-global)."), matching the semantics of "requested but not
found" without aborting the run.
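
The search order and the SKILL.md check can be sketched as a pure function. This is an illustration only: `resolveSkills` and the injected `hasSkillFile` predicate are hypothetical, and the real resolver reads the filesystem directly.

```typescript
// Hypothetical sketch of the first-match-wins lookup described above.
// hasSkillFile(dir) stands in for "dir contains a SKILL.md".
function resolveSkills(
  names: string[],
  cwd: string,
  home: string,
  hasSkillFile: (dir: string) => boolean
): { resolved: string[]; missing: string[] } {
  const resolved: string[] = [];
  const missing: string[] = [];
  // Deduplicate names so a skill listed twice resolves once.
  for (const name of Array.from(new Set(names))) {
    const candidates = [
      `${cwd}/.agents/skills/${name}`,  // 1. project-local, agentskills.io
      `${cwd}/.claude/skills/${name}`,  // 2. project-local, Claude convention
      `${home}/.agents/skills/${name}`, // 3. user-global, agentskills.io
      `${home}/.claude/skills/${name}`, // 4. user-global, Claude convention
    ];
    const hit = candidates.find((dir) => hasSkillFile(dir));
    if (hit) resolved.push(hit);
    else missing.push(name); // surfaced as a warning, not an error
  }
  return { resolved, missing };
}
```

Injecting the predicate keeps the sketch testable without a synthetic home directory; the actual tests stage one via `process.env.HOME` as noted below.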

Pi's buildSystemPrompt auto-appends the agentskills.io XML block for each
loaded skill, so the model sees them — no separate prompt injection needed
(Pi differs from Claude here; Claude wraps in an AgentDefinition with a
preloaded prompt, Pi uses an XML block in the system prompt).

Ancestor directory traversal above cwd is deliberately skipped in this
pass — matches the Pi provider's cwd-bound scope and avoids ambiguity
about which repo's skills win when Archon runs from a subdirectory.

Bun's os.homedir() bypasses the HOME env var; the resolver uses
`process.env.HOME ?? homedir()` so tests can stage a synthetic home dir.

Tests:
- 11 new tests in options-translator.test.ts cover project/user, .agents/
  vs .claude/, project-wins-over-user, SKILL.md presence check, dedup,
  missing-name collection.
- 2 new integration tests in provider.test.ts cover the missing-skill
  warning path and the "no skills configured → no additionalSkillPaths"
  path.
- registry.test.ts updated to assert skills: true in capabilities.

Live-verified locally: `.claude/skills/archon-dev/SKILL.md` resolves,
pi.session_started log shows `skillCount: 1, missingSkillCount: 0`,
smoke workflow passes in 1.2s.

* feat(providers/pi): session resume via Pi session store

Flips capabilities.sessionResume: false → true. Pi now persists sessions
under ~/.pi/agent/sessions/<encoded-cwd>/<uuid>.jsonl by default — same
pattern Claude and Codex use for their respective stores, same blast
radius as those providers.

Flow:
  - No resumeSessionId → SessionManager.create(cwd) (fresh, persisted)
  - resumeSessionId + match in SessionManager.list(cwd) → open(path)
  - resumeSessionId + no match → fresh session + system warning
    ("⚠️ Could not resume Pi session. Starting fresh conversation.")
    Matches Codex's resume_thread_failed fallback at
    packages/providers/src/codex/provider.ts:553-558.

The sessionId flows back to Archon via the terminal `result` chunk —
bridgeSession annotates it with session.sessionId unconditionally so
Archon's orchestrator can persist it and pass it as resumeSessionId on
the next turn. Same mechanism used for Claude/Codex.

Cross-cwd resume (e.g. worktree switch) is deliberately not supported in
this pass: list(cwd) scans only the current cwd's session dir. A workflow
that changes cwd mid-run lands on a fresh session, which matches Pi's
mental model.

Bridge sessionId annotation uses session.sessionId, which Pi always
populates (UUID) — so no special-case for inMemory sessions is needed.

Factored the resolver into session-resolver.ts (5 unit tests):
  - no id → create
  - id + match → open
  - id + no match → create with resumeFailed: true
  - list() throws → resumeFailed: true (graceful)
  - empty-string id → treated as "no resume requested"
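
The five resolver cases above can be sketched as follows. The `SessionStore` shape and `planSession` name are hypothetical; Pi's real SessionManager API may differ (and is likely asynchronous), so this only illustrates the decision logic.

```typescript
// Hypothetical shapes for illustration; not Pi's SessionManager API.
interface StoredSession { id: string; path: string }
interface SessionStore { list(cwd: string): StoredSession[] }

type SessionPlan =
  | { action: "create"; resumeFailed: boolean }
  | { action: "open"; path: string; resumeFailed: false };

function planSession(
  store: SessionStore,
  cwd: string,
  resumeSessionId?: string
): SessionPlan {
  // Undefined or empty string: no resume was requested.
  if (!resumeSessionId) return { action: "create", resumeFailed: false };
  try {
    // Only sessions recorded for the current cwd are considered.
    const match = store.list(cwd).find((s) => s.id === resumeSessionId);
    if (match) return { action: "open", path: match.path, resumeFailed: false };
  } catch {
    // A failing list() is treated like "no match" (graceful fallback).
  }
  // Resume was requested but could not be honoured: start fresh and warn.
  return { action: "create", resumeFailed: true };
}
```

A caller would emit the "Could not resume Pi session" warning whenever `resumeFailed` is true.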

Integration tests in provider.test.ts add 3 cases:
  - resume-not-found yields warning + calls create
  - resume-match calls open with the file path, no warning
  - result chunk always carries sessionId

Verified live end-to-end against Anthropic OAuth:
  - first call → sessionId 019d...; model replies "noted"
  - second call with that sessionId → "resumed: true" in logs; model
    correctly recalls prior turn ("Crimson.")
  - bogus sessionId → "⚠️ Could not resume..." warning + fresh UUID

* refactor(providers,core): generalize community-provider registration

Addresses the community-pattern regression flagged in the PR coleam00#1270 review:
a second community provider should require editing only its own directory,
not seven files across providers/ + core/ + cli/ + server/.

Three changes:

1. Drop typed `pi` slot from AssistantDefaultsConfig + AssistantDefaults.
   Community providers live behind the generic `[string]` index that
   `ProviderDefaultsMap` was explicitly designed to provide. The typed
   claude/codex slots stay — they give IDE autocomplete for built-in
   config access without `as` casts, which was the whole reason the
   intersection exists. Community providers parse their own config via
   Record<string, unknown> anyway, so the typed slot added no real
   parser safety.

2. Loop-based getDefaults + mergeAssistantDefaults. No more hardcoded
   `pi: {}` spreads. getDefaults() seeds from `getRegisteredProviders()`;
   mergeAssistantDefaults clones every slot present in `base`. Adding a
   new provider requires zero edits to this function.

3. New `registerCommunityProviders()` aggregator in registry.ts.
   Entrypoints (CLI, server, config-loader) call ONE function after
   `registerBuiltinProviders()` rather than one call per community
   provider. Adding a new community provider is now a single-line edit
   to registerCommunityProviders().

This makes Pi (and future community providers) actually behave like
Phase 2 (coleam00#1195) advertised: drop the implementation under
packages/providers/src/community/<id>/, export a `register<Id>Provider`,
add one line to the aggregator.
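
A minimal sketch of the aggregator pattern, assuming a simple map-backed registry. The registry internals and the `ProviderFactory` type here are hypothetical; only the function names `registerCommunityProviders`, `getRegisteredProviders`, and `getDefaults` come from the description above.

```typescript
// Hypothetical registry internals for illustration.
type ProviderFactory = () => { id: string };
const registry = new Map<string, ProviderFactory>();

function registerProvider(id: string, factory: ProviderFactory): void {
  // Idempotent: registering the same id twice is a no-op.
  if (!registry.has(id)) registry.set(id, factory);
}

// Each community provider exports one register function.
function registerPiProvider(): void {
  registerProvider("pi", () => ({ id: "pi" }));
}

// Entrypoints call this once; a new community provider adds one line here.
function registerCommunityProviders(): void {
  registerPiProvider();
}

function getRegisteredProviders(): string[] {
  return Array.from(registry.keys());
}

// Loop-based defaults: one empty slot per registered provider, no hardcoded ids.
function getDefaults(): Record<string, Record<string, unknown>> {
  const defaults: Record<string, Record<string, unknown>> = {};
  for (const id of getRegisteredProviders()) defaults[id] = {};
  return defaults;
}
```

Because `getDefaults` seeds from the registry, adding a provider never requires touching the defaults code.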

Tests:
- New `registerCommunityProviders` suite (2 tests: registers pi,
  idempotent).
- config-loader.test updated: assert built-in slots explicitly rather
  than exhaustive map shape.

No functional change for Pi end-users. Purely structural.

* fix(providers/pi,core): correctness + hygiene fixes from PR coleam00#1270 review

Addresses six of the review's important findings, all within the same
PR branch:

1. envInjection: false → true
   The provider reads requestOptions.env on every call (for API-key
   passthrough). Declaring the capability false caused a spurious
   dag-executor warning for every Pi user who configured codebase env
   vars — which is the MAIN auth path. Flipping to true removes the
   false positive.

2. toSafeAssistantDefaults: denylist → allowlist
   The old shape deleted `additionalDirectories`, `settingSources`,
   `codexBinaryPath` before sending defaults to the web UI. Any future
   sensitive provider field (OAuth token, absolute path, internal
   metadata) would silently leak via the `[key: string]: unknown` index
   signature. New SAFE_ASSISTANT_FIELDS map lists exactly what to
   expose per provider; unknown providers get an empty allowlist so
   the web UI sees "provider exists" but no config details.

3. AsyncQueue single-consumer invariant
   The type was documented single-consumer but unenforced. A second
   `for await` would silently race with the first over buffer +
   waiters. Added a synchronous guard in Symbol.asyncIterator that
   throws on second call — copy-paste mistakes now fail fast with a
   clear message instead of dropping items.

4. session.dispose() / session.abort() silent catches
   Both catch blocks now log at debug via a module-scoped logger so
   SDK regressions surface without polluting normal output.

5. Type scripted events as AgentSessionEvent in provider.test.ts
   Was `Record<string, unknown>` — Pi field renames would silently
   keep tests passing. Now typed against Pi's actual event union.

6. Leaked /tmp/pi-research/... path in provider.ts comment
   Local-machine path that crept in during research. Replaced with
   the upstream GitHub URL (matches convention at provider.ts:110).
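
The denylist-to-allowlist change in item 2 can be sketched like this. The field names inside the map are placeholders chosen for illustration; only the `SAFE_ASSISTANT_FIELDS` and `toSafeAssistantDefaults` names come from the text above.

```typescript
// Hypothetical allowlist contents; the real map lists different fields.
const SAFE_ASSISTANT_FIELDS: Record<string, readonly string[]> = {
  claude: ["model"],
  codex: ["model"],
};

function toSafeAssistantDefaults(
  defaults: Record<string, Record<string, unknown>>
): Record<string, Record<string, unknown>> {
  const safe: Record<string, Record<string, unknown>> = {};
  for (const provider of Object.keys(defaults)) {
    // Unknown providers get an empty allowlist: visible, but no config details.
    const allowed = SAFE_ASSISTANT_FIELDS[provider] ?? [];
    const out: Record<string, unknown> = {};
    for (const field of allowed) {
      if (field in defaults[provider]) out[field] = defaults[provider][field];
    }
    safe[provider] = out;
  }
  return safe;
}
```

With an allowlist, a newly added sensitive field stays hidden until someone deliberately lists it, which is the failure mode the denylist could not guard against.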

Plus review-flagged simplifications:
  - Extract lookupPiModel wrapper — isolates the `as unknown as` cast
    behind one searchable name.
  - Hoist QueueItem → BridgeQueueItem at module scope (export'd for
    test visibility; not used externally yet but enables unit testing
    the mapping in isolation if needed later).
  - getRegisteredProviderNames: remove side-effecting registration
    calls. `loadConfig()` already bootstraps the registry before any
    caller can observe this helper — the hidden coupling was
    misleading.

Plus missing-coverage tests from the review (pr-test-analyzer):
  - session.prompt() rejection → error surfaces to consumer
  - pre-aborted signal → session.abort() called
  - mid-stream abort → session.abort() called
  - modelFallbackMessage → system chunk yielded
  - AsyncQueue second-consumer → throws synchronously
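
For the single-consumer invariant, a simplified queue with the synchronous guard might look like the following. This is a sketch of the technique, not the adapter's actual AsyncQueue; the `close` method and error message are assumptions.

```typescript
// Simplified single-consumer async queue; illustrative only.
class AsyncQueue<T> implements AsyncIterable<T> {
  private buffer: T[] = [];
  private waiters: Array<(result: IteratorResult<T>) => void> = [];
  private closed = false;
  private consumed = false;

  push(item: T): void {
    if (this.closed) return;
    const waiter = this.waiters.shift();
    if (waiter) waiter({ value: item, done: false }); // hand off to a pending next()
    else this.buffer.push(item);
  }

  close(): void {
    this.closed = true;
    for (const waiter of this.waiters.splice(0)) {
      waiter({ value: undefined, done: true });
    }
  }

  [Symbol.asyncIterator](): AsyncIterator<T> {
    // Guard runs synchronously, before any item can be dropped or raced over.
    if (this.consumed) throw new Error("AsyncQueue supports a single consumer");
    this.consumed = true;
    return {
      next: (): Promise<IteratorResult<T>> => {
        if (this.buffer.length > 0) {
          return Promise.resolve({ value: this.buffer.shift() as T, done: false });
        }
        if (this.closed) return Promise.resolve({ value: undefined, done: true });
        return new Promise((resolve) => this.waiters.push(resolve));
      },
    };
  }
}
```

A second `for await` over the same instance calls `Symbol.asyncIterator` again and therefore throws immediately rather than silently competing for items.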

No behavioral changes for end users beyond the envInjection warning
fix.

* docs: Pi provider + community-provider contributor guide

Addresses the PR coleam00#1270 review's docs-impact findings: the original Pi
PR had no user-facing or contributor-facing documentation, and
architecture.md still referenced the pre-Phase-2 factory.ts pattern
(factory.ts was deleted in coleam00#1195).

1. packages/docs-web/src/content/docs/reference/architecture.md
   - Replace stale factory.ts references with the registry pattern.
   - Update inline IAgentProvider block: add getCapabilities, add
     options parameter.
   - Rewrite MessageChunk block as the actual discriminated union
     (was a placeholder with optional fields that didn't match the
     current type).
   - "Adding a New AI Agent Provider" checklist now distinguishes
     built-in (register in registerBuiltinProviders) from community
     (separate guide). Links to the new contributor guide.

2. packages/docs-web/src/content/docs/contributing/adding-a-community-provider.md (new)
   - Step-by-step guide using Pi as the reference implementation.
   - Covers: directory layout, capability discipline (start false,
     flip one at a time), provider class skeleton, registration via
     aggregator, test isolation (Bun mock.module pollution), what
     NOT to do (no edits to AssistantDefaultsConfig, no direct
     registerProvider from entrypoints, no overclaiming capabilities).

3. packages/docs-web/src/content/docs/getting-started/ai-assistants.md
   - New "Pi (Community Provider)" section: install, OAuth +
     API-key table per Pi backend, model ref format, workflow
     examples, capability matrix showing what Pi supports (session
     resume, tool restrictions, effort/thinking, skills, system
     prompt, envInjection) and what it doesn't (MCP, hooks,
     structured output, cost control, fallback model, sandbox).

4. .env.example
   - New Pi section with commented env vars for each supported
     backend (ANTHROPIC_API_KEY through HUGGINGFACE_API_KEY), each
     paired with its Pi provider id. OAuth flow (pi /login → auth.json)
     is explicitly called out — Archon reads that file too.

5. CHANGELOG.md
   - Unreleased entry for Pi, registerCommunityProviders aggregator,
     and the new contributor guide.
mhooooo pushed a commit that referenced this pull request Apr 26, 2026
…nv gaps, add good-practices + troubleshooting (coleam00#1363)

* fix(skill/when): document the full `when:` operator set and compound expressions

The skill reference previously stated "operators: ==, != only" which is
materially wrong — the condition evaluator supports ==, !=, <, >, <=, >=
plus && / || compound expressions with && binding tighter than ||, plus
dot-notation JSON field access. An agent authoring a workflow from the
skill would think half the operators don't exist.

Replaces the single-sentence section with a structured reference covering:
- All six comparison operators (string and numeric modes)
- Compound expressions with precedence rules and short-circuit eval
- JSON dot notation semantics and failure modes
- The fail-closed rules in full (invalid expression, non-numeric side,
  missing field, skipped upstream)

Grounded in packages/workflows/src/condition-evaluator.ts.
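
To illustrate the precedence and fail-closed rules, here is a deliberately tiny evaluator. It is not the code in condition-evaluator.ts: it assumes operands are already substituted plain values, ignores quoting and dot notation, and uses hypothetical function names.

```typescript
// Toy evaluator: shows && binding tighter than || and fail-closed comparisons.
function compare(left: string, op: string, right: string): boolean {
  if (op === "==") return left === right; // string mode
  if (op === "!=") return left !== right;
  // Numeric mode: any non-numeric side fails closed.
  const a = Number(left);
  const b = Number(right);
  if (left === "" || right === "" || Number.isNaN(a) || Number.isNaN(b)) return false;
  switch (op) {
    case "<": return a < b;
    case ">": return a > b;
    case "<=": return a <= b;
    case ">=": return a >= b;
    default: return false;
  }
}

function evalComparison(expr: string): boolean {
  const m = expr.trim().match(/^(.+?)\s*(==|!=|<=|>=|<|>)\s*(.+)$/);
  if (!m) return false; // invalid expression fails closed
  return compare(m[1].trim(), m[2], m[3].trim());
}

function evaluate(expr: string): boolean {
  // Split on || first, then && inside each group, so && binds tighter.
  // some/every stop early, which gives short-circuit evaluation.
  return expr
    .split("||")
    .some((group) => group.split("&&").every((part) => evalComparison(part)));
}
```

For example, `evaluate("1 == 2 && 1 == 1 || 3 > 2")` groups as `(1 == 2 && 1 == 1) || (3 > 2)` and is true; if `||` bound tighter it would be false.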

* feat(skill): document Approval and Cancel node types

Approval and cancel nodes are first-class DAG node types (approval since the
workflow lifecycle work in coleam00#871, cancel as a guarded-exit primitive) but the
skill never described either one. An agent reading the skill and asked to
"add a review gate before implementation" or "stop the workflow if the input
is unsafe" would fall back to bash + exit 1, losing the proper semantics
(cancelled vs. failed, on_reject AI rework, web UI auto-resume).

Approval node coverage (references/workflow-dag.md, SKILL.md):
- Full configuration block with message, capture_response, on_reject
- The interactive: true workflow-level requirement for web UI delivery
- Approve/reject commands across all platforms (CLI, slash, natural
  language) and the capture_response → $node-id.output flow
- Ignored-fields list + the on_reject.prompt AI sub-node exception

Cancel node coverage (references/workflow-dag.md, SKILL.md):
- Single-field schema (cancel: "<reason>")
- Lifecycle: cancelled (not failed); in-flight parallel nodes stopped;
  no DAG auto-resume path
- The "cancel: vs bash-exit-1" decision rule (expected precondition miss
  vs. check itself failing)
- Two canonical patterns — upstream-classification gate, pre-expensive-step
  gate

Validation-rules list updated to enumerate approval/cancel constraints
(message non-empty, on_reject.max_attempts range 1-10, cancel reason
non-empty), plus a forward note that script: joins the mutually-exclusive
set once PR coleam00#1362 lands.

Placement in both files is after the Loop section and before the validation
section, so this commit stays additive with respect to PR coleam00#1362's Script
node insertion between Bash and Loop — rebase is clean.

* feat(skill): document workflow-level fields beyond name/provider/model

The skill's Schema section previously showed only name, description, provider,
and model at the workflow level, which is little more than a stub. Agents asked to
"use the 1M-context Claude beta" or "run this under a network sandbox" or
"add a fallback model in case Opus rate-limits" had no way to discover
that any of these fields existed at the workflow level.

Adds a comprehensive Workflow-Level Fields section covering:
- Core: name, description, provider, model, interactive (with explicit
  callout that interactive: true is REQUIRED for approval/loop gates on
  web UI — a common footgun)
- Isolation: worktree.enabled for pin-on/pin-off (the only worktree field
  at workflow level; baseBranch/copyFiles/path/initSubmodules are
  config.yaml only, so a cross-reference points there)
- Claude SDK advanced: effort, thinking, fallbackModel, betas, sandbox,
  with explicit per-node-only exceptions (maxBudgetUsd, systemPrompt)
- Codex-specific: modelReasoningEffort (with note that it's NOT the same
  as Claude's effort — this has confused users), webSearchMode,
  additionalDirectories
- A complete worked example combining sandbox + approval + interactive

All fields cross-referenced against packages/workflows/src/schemas/workflow.ts
and packages/workflows/src/schemas/dag-node.ts.

* feat(skill/loop): document interactive loops and gate_message

Interactive loop nodes pause between iterations for human feedback via
/workflow approve — used by archon-piv-loop and archon-interactive-prd.
The skill's Loop Nodes section previously omitted both interactive: true
and gate_message entirely, so an agent writing a guided-refinement
workflow wouldn't know the feature exists or that gate_message is
required at parse time.

Adds:
- interactive and gate_message rows to the config table (marking
  gate_message as required when interactive: true — enforced by the
  loader's superRefine)
- A dedicated "Interactive Loops" subsection explaining the 6-step
  iterate-pause-approve-resume flow
- Explicit call-out that $LOOP_USER_INPUT populates ONLY on the first
  iteration of a resumed session — easy to miss and a common surprise
- Workflow-level interactive: true requirement for web UI delivery
  (loader warning otherwise) so the full-flow example is complete
- Note that until_bash substitution DOES shell-quote $nodeId.output
  (unlike script bodies) — called out since the audit surfaced this
  inconsistency

* fix(skill/cli): complete the CLI command reference with missing lifecycle commands

The CLI reference previously documented only list, run, cleanup, validate,
complete, version, setup, and chat — missing nearly every workflow
lifecycle command an agent needs to operate a paused, failed, or stuck
run. The interactive-workflows reference assumed these commands existed
without actually documenting them.

Adds full documentation for:
- archon workflow status — show running workflow(s)
- archon workflow approve <run-id> [comment] — resume approval gate
  (also populates $LOOP_USER_INPUT on interactive loops and the gate
  node's output when capture_response: true)
- archon workflow reject <run-id> [reason] — reject gate; cancels or
  triggers on_reject rework depending on node config
- archon workflow cancel <run-id> — terminate running/paused with
  in-flight subprocess kill
- archon workflow abandon <run-id> — mark stuck row cancelled without
  subprocess kill (for orphan-cleanup after server crashes — matches
  the coleam00#1216 precedent)
- archon workflow resume <run-id> [message] — force-resume specific
  run (auto-resume is default; this is for explicit override)
- archon workflow cleanup [days] — disk hygiene for old terminal runs
  (with explicit callout that it does NOT transition 'running' rows,
  a common confusion)
- archon workflow event emit — used inside loop prompts for state
  signalling; documented so agents don't invent their own mechanism
- archon continue <branch> [flags] [msg] — iterative-session entry
  point with --workflow and --no-context flags

Also:
- Adds --allow-env-keys flag to the `workflow run` flag table with
  audit-log context and the env-leak-gate remediation use case
- Adds an "Auto-resume without --resume" note disambiguating when
  --resume is needed vs. when auto-resume handles it
- Adds --include-closed flag to `isolation cleanup`, which was
  previously missing; converts the flag list to a structured table
- Explains the cancel/abandon distinction (live subprocess vs. orphan)

All grounded in packages/cli/src/commands/workflow.ts, continue.ts,
and isolation.ts.

* feat(skill/repo-init): add scripts/ and state/, three-path env model, per-project env injection

The repo-init reference was missing two first-class .archon/ directories
(scripts/ since v0.3.3, state/ since the workflow-state feature) and had
nothing to say about env — the #1 thing a user hits on first-run when
their repo has a .env file with API keys.

Directory tree updates:
- Adds .archon/scripts/ with the extension->runtime rule (.ts/.js -> bun,
  .py -> uv) so agents know where to put named scripts referenced by
  script: nodes.
- Adds .archon/state/ with explicit "always gitignore" callout — these
  are runtime artifacts, not source. Previously undocumented in the skill.
- Adds .archon/.env (repo-scoped Archon env) and distinguishes it from
  the target repo's top-level .env.
- Adds a "What each directory is for" list so the structure isn't just
  a tree with no narrative.

.gitignore guidance:
- state/ and .env added as must-gitignore (state/ matches CLAUDE.md and
  reference/archon-directories.md — skill was lagging).
- mcp/ demoted to conditional — gitignore only if you hardcode secrets.

New "Three-Path Env Model" section:
- ~/.archon/.env (trusted, user), <cwd>/.archon/.env (trusted, repo),
  <cwd>/.env (UNTRUSTED, target project — stripped from subprocess env).
- Precedence (override: true across archon-owned paths) and the
  observable [archon] loaded N keys / stripped K keys log lines so
  operators can verify what actually happened.
- Decision tree for where to put API keys vs. target-project env vs.
  things Archon shouldn't touch.
- Links to archon setup --scope home|project with --force for writing
  to the right file with timestamped backups.
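
The three-path model can be pictured with a small merge function. This is a sketch under assumptions: `buildSubprocessEnv` is a hypothetical name, the repo-scoped file is assumed to be applied after the home-scoped one, and stripping is assumed to work by key name.

```typescript
type Env = Record<string, string>;

// Hypothetical sketch of the three-path model; not Archon's loader.
function buildSubprocessEnv(
  processEnv: Env, // environment at boot (may contain leaked target-repo keys)
  homeEnv: Env,    // ~/.archon/.env      (trusted, user)
  repoEnv: Env,    // <cwd>/.archon/.env  (trusted, repo)
  targetEnv: Env   // <cwd>/.env          (untrusted, target project)
): { env: Env; stripped: string[] } {
  const env: Env = { ...processEnv };
  const stripped: string[] = [];
  // Remove every key the untrusted target .env defines.
  for (const key of Object.keys(targetEnv)) {
    if (key in env) {
      delete env[key];
      stripped.push(key);
    }
  }
  // Archon-owned files override what is left; assumption: repo wins over home.
  Object.assign(env, homeEnv, repoEnv);
  return { env, stripped };
}
```

The returned `stripped` list corresponds to the kind of information the `[archon] ... stripped K keys` log line reports.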

New "Per-Project Env Injection" section:
- Documents both managed surfaces: .archon/config.yaml env: block
  (git-committed, $REF expansion) and Web UI Settings → Projects →
  Env Vars (DB-stored, never returned over API).
- Names every execution surface that receives the injected vars:
  Claude/Codex/Pi subprocess, bash: nodes, script: nodes, and direct
  codebase-scoped chat.
- Documents the env-leak gate with all 5 remediation paths so an agent
  hitting "Cannot register: env has sensitive keys" knows the options.

Grounded in CHANGELOG v0.3.7 (three-path env + setup flags), v0.3.0
(env-leak gate), and reference/security.md on the docs site.

* fix(skill/authoring-commands): correct override paths and add home-scoped commands

The file-location and discovery sections described an override layout that
does not match the actual resolver. It showed:

  .archon/commands/defaults/archon-assist.md  # Overrides the bundled

and claimed `.archon/commands/defaults/` was where repo-level overrides
lived. In fact the resolver (executor-shared.ts:152-200 + command-
validation.ts) walks `.archon/commands/` 1 level deep and uses basename
matching — putting `archon-assist.md` at the top of `.archon/commands/`
is the canonical way to override the bundled version. The `defaults/`
subfolder is an Archon-internal convention for shipping bundled defaults,
not a user-facing override pattern.

Also, home-scoped commands (`~/.archon/commands/`, shipped in v0.3.7)
were completely absent — agents authoring personal helpers wouldn't
know they could live at the user level and be shared across every repo.

Changes:
- File Location section now shows all three discovery scopes (repo,
  home, bundled) with precedence ordering and 1-level subfolder rules
- Duplicate-basename rule documented as a user error surface
- Discovery and Priority section rewritten with accurate 3-step lookup
  order — no more references to the nonexistent defaults/ override path
- Adds the Web UI "Global (~/.archon/commands/)" palette label note so
  users authoring helpers for the builder know what to expect
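
The lookup described above (basename matching, at most one subfolder level, scopes checked in order) can be sketched in memory. `resolveCommand` and `CommandScope` are hypothetical, and the repo, home, bundled ordering is assumed to be the precedence order.

```typescript
// Hypothetical in-memory model; the real resolver walks the filesystem.
interface CommandScope {
  label: string;   // e.g. "repo", "home", "bundled"
  files: string[]; // paths relative to that scope's commands directory
}

function resolveCommand(
  name: string,
  scopes: CommandScope[]
): { scope: string; file: string } | undefined {
  for (const scope of scopes) {
    const hit = scope.files.find((file) => {
      const parts = file.split("/");
      // At most one subfolder deep, matched by basename.
      return parts.length <= 2 && parts[parts.length - 1] === `${name}.md`;
    });
    if (hit) return { scope: scope.label, file: hit };
  }
  return undefined;
}
```

Under this model a top-level `archon-assist.md` in the repo scope shadows the bundled `defaults/archon-assist.md` purely because the repo scope is checked first.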

No code changes — this is a pure fix of stale/incorrect skill reference
material.

* feat(skill): add workflow good-practices and troubleshooting reference pages

Closes two gaps from the audit. The skill previously had zero guidance on
designing multi-node workflows (what to avoid, what to reach for first,
how to structure artifact chains) and zero guidance on where to look
when things go wrong (log paths, env-leak gate remediations, orphan-row
cleanup, resume semantics).

New references/good-practices.md (9 Good Practices + 7 Anti-Patterns):

- Use deterministic nodes (bash:/script:) for deterministic work, AI for
  reasoning — the single biggest quality lever
- output_format required whenever downstream when: reads a field — the
  most common source of "workflow silently routes wrong"
- trigger_rule: none_failed_min_one_success after conditional branches —
  the classic bug where all_success fails because a skipped when:-gated
  branch doesn't count as a success
- context: fresh requires artifacts for state passing — commands must
  explicitly "read $ARTIFACTS_DIR/..." when downstream of fresh
- Cheap models (haiku) for glue, strong for substance
- Workflow descriptions as routing affordances
- Validate (archon validate workflows) + smoke-run before shipping
- Artifact-chain-first design
- worktree.enabled: true for code-changing workflows (reversibility)
- Anti-patterns with before/after YAML examples for each (AI-for-tests,
  free-form when: matching, context: fresh without artifacts, long flat
  AI-node layers, secrets in YAML, retry on loop nodes, tiny
  max_iterations, missing workflow-level interactive:, tool-restricted
  MCP nodes)
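
The trigger_rule point above is easiest to see in a small model. The status and rule names follow the text; `shouldRun` itself is a hypothetical helper, not the dag-executor's implementation.

```typescript
type NodeStatus = "success" | "failed" | "skipped";
type TriggerRule = "all_success" | "none_failed_min_one_success";

// Hypothetical helper: decides whether a merge node runs given upstream statuses.
function shouldRun(rule: TriggerRule, upstream: NodeStatus[]): boolean {
  const succeeded = upstream.filter((s) => s === "success").length;
  const failed = upstream.filter((s) => s === "failed").length;
  if (rule === "all_success") {
    // A skipped upstream does not count as a success, so this rule rejects it.
    return succeeded === upstream.length;
  }
  // Skipped branches are tolerated as long as nothing failed and something ran.
  return failed === 0 && succeeded >= 1;
}
```

After a `when:`-gated branch is skipped, the upstream statuses look like `["success", "skipped"]`: `all_success` refuses to run the merge node while `none_failed_min_one_success` lets it proceed.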

New references/troubleshooting.md:

- Log location (~/.archon/workspaces/<owner>/<repo>/logs/<run-id>.jsonl)
  with jq recipes for common queries (last assistant message, failed
  events, full stream)
- Artifact location for cross-node handoff debugging
- 9 Common Failure Modes, each with root cause + concrete fix:
  - $BASE_BRANCH unresolvable
  - Env-leak gate (5 remediations)
  - Claude/Codex binary not found (compiled-binary-only)
  - "running" forever (AI working / orphan / idle_timeout)
  - Mid-workflow failure and auto-resume semantics
  - Approval gate missing on web UI (workflow-level interactive:)
  - MCP plugin connection noise (filtered by design)
  - Empty $nodeId.output / field access (4 causes)
- Diagnostic command cheat sheet (list, status, isolation list, validate,
  tail-log, --verbose, LOG_LEVEL=debug)
- Escalation protocol (version + validate + log tail + CHANGELOG + issue)

SKILL.md routing table now dispatches "Workflow good practices /
anti-patterns" and "Troubleshoot a failing / stuck workflow" to the new
references so an agent can find them without having to know they exist.

* docs(book): update node-types coverage from four to all seven

The book is the curated first-contact reading path (landing page → "Get
Started" → /book/). Both dag-workflows.md and quick-reference.md were
stuck on "four node types" — missing script, approval, and cancel. A user
reading the book as their first introduction would form an incomplete
mental model, then find three more node types in the reference section
later with no explanation of when they arrived.

book/dag-workflows.md:
- "four node types" → "seven node types. Exactly one mode field is
  required per node"
- Table now lists Command, Prompt, Bash, Script, Loop, Approval, Cancel
  with one-line "when to use" for each, and cross-links to the dedicated
  guide pages for Script / Loop / Approval
- New sections below the table for Script (inline + named examples with
  runtime and deps), Approval (with the interactive: true workflow-level
  note that's easy to miss), and Cancel (guarded-exit pattern) — keeping
  the existing narrative shape for Bash and Loop

book/quick-reference.md:
- Node Options table now includes script, approval, cancel rows
- agents row added (inline sub-agents, Claude-only)
- New "Script-specific fields" and "Approval-specific fields" subsections
  so the cheat-sheet is actually complete rather than pointing users
  elsewhere for the required constraints
- Retry row callout that loop nodes hard-error on retry — previously
  omitted
- bash timeout note widened to cover script timeout (same semantics)

Both files are docs-web content; the CI build on the docs-script-nodes
PR (coleam00#1362) previously validated the Starlight build path with a similar
table addition, so this should render clean.

* fix(skill/cli): remove nonexistent `archon workflow cancel`, fix workflow status jq recipe

Two accuracy issues from the PR code-reviewer (comment 4311243858).

C1: `archon workflow cancel <run-id>` does NOT exist as a CLI subcommand.
The switch at packages/cli/src/cli.ts:318-485 dispatches on list / run /
status / resume / abandon / approve / reject / cleanup / event — running
`archon workflow cancel` hits the default case and exits with "Unknown
workflow subcommand: cancel" (cli.ts:478-484). Active cancellation is
only available via:
  - /workflow cancel <run-id> chat slash command (all platforms)
  - Cancel button on the Web UI dashboard
  - POST /api/workflows/runs/{runId}/cancel REST endpoint

cli-commands.md: removed the `### archon workflow cancel <run-id>`
subsection; kept the `abandon` subsection but made it explicit that
abandon does NOT kill a subprocess. Added a call-out box at the bottom
of the abandon section explaining where to go for actual cancellation.

troubleshooting.md "running forever" section: split the original
cancel-vs-abandon advice into three bullets — Web UI / CLI abandon (for
orphans, no subprocess kill) / chat `/workflow cancel` (for live runs
that need interruption). Added an explicit "there is no archon workflow
cancel CLI subcommand" parenthetical since the wrong command was being
suggested in flow.

I1: the `archon workflow list --json` diagnostic used an incorrect jq
filter. workflow list's --json output (workflow.ts:185-219) has shape
{ workflows: [{ name, description, provider?, model?, ... }], errors: [...] }
with no `runs` field, so `jq '.workflows[] | select(.runs)'` returns empty
unconditionally. Replaced with `archon workflow status --json | jq '.runs[]'`,
which matches the actual shape of workflowStatusCommand at
workflow.ts:852+ ({ runs: WorkflowRun[] }). Also tightened the narration
to distinguish JSON from human-readable status output.

No change to the commit history in this PR — these are follow-up fixes
to claims I introduced in earlier commits of this branch (f10b989 for
C1, 66d2b86 for I1).

* fix(skill): remove env-leak gate references (feature was removed in provider extraction)

C2 from the PR code-reviewer (comment 4311243858). The pre-spawn env-leak
gate was removed from the codebase during the provider-extraction refactor
— see TODO(coleam00#1135) at packages/providers/src/claude/provider.ts:908. Zero
hits for --allow-env-keys / allowEnvKeys / allow_env_keys / allow_target_repo_keys
across packages/. The CLI's parseArgs (cli.ts:182-208) has no
--allow-env-keys option, and because parseArgs uses strict: false, an
unknown --allow-env-keys would be silently ignored rather than error.
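
The strict-vs-lenient difference can be sketched as follows, assuming the CLI's parseArgs is Node's `util.parseArgs` (the commit message does not say so explicitly). The option table and the `parseCli` wrapper are illustrative, not the CLI's real ones:

```typescript
import { parseArgs } from 'node:util';

// Assumption: a stand-in option table; the real CLI defines its own.
function parseCli(args: string[], strict: boolean) {
  return parseArgs({
    args,
    options: { json: { type: 'boolean' } },
    strict,
    allowPositionals: true,
  });
}

// Non-strict: the unknown flag is accepted as a boolean and nothing throws.
const lenient = parseCli(['workflow', 'run', '--allow-env-keys'], false);

// Strict: the same input is rejected with an unknown-option error.
let strictError: unknown;
try {
  parseCli(['workflow', 'run', '--allow-env-keys'], true);
} catch (err) {
  strictError = err;
}
```

In non-strict mode the unknown flag still lands in `values`, but since no code reads it, the effect for the user is the same as if it had been ignored.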

What remains accurate and is NOT touched:
- Three-Path Env Model section (user/repo archon-owned envs are loaded;
  target repo <cwd>/.env keys are stripped from process.env at boot)
  still correctly describes current behavior, grounded in
  packages/paths/src/strip-cwd-env.ts + env-integration.test.ts
- Per-Project Env Injection section (Option 1: .archon/config.yaml env:
  block; Option 2: Web UI Settings → Projects → Env Vars) is unchanged —
  both remain the sanctioned way to get env vars into subprocesses
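
The strip behavior described above can be illustrated with a small sketch. This is not the code in packages/paths/src/strip-cwd-env.ts: both helpers are hypothetical, the .env parsing handles only simple `KEY=value` lines, and the sketch returns a copy instead of mutating `process.env`:

```typescript
// Hypothetical helper: collect KEY names from simple KEY=value lines,
// skipping blank lines and comments.
function dotenvKeys(content: string): string[] {
  const keys: string[] = [];
  for (const rawLine of content.split('\n')) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;
    const match = /^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=/.exec(line);
    const key = match?.[1];
    if (key) keys.push(key);
  }
  return keys;
}

// Hypothetical helper: return a copy of `env` without the listed keys.
function stripKeys(
  env: Record<string, string | undefined>,
  keysToStrip: string[],
): Record<string, string | undefined> {
  const result = { ...env };
  for (const key of keysToStrip) delete result[key];
  return result;
}
```

For example, `stripKeys(env, dotenvKeys(cwdEnvFileContent))` would drop every key the target repo's .env declares while leaving unrelated variables in place.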

Removed claims (all three files):
- cli-commands.md: --allow-env-keys flag row in the workflow run flags
  table
- repo-init.md: the "Env-leak gate" subsection at the end of Per-Project
  Env Injection listing 5 remediations (all of which reference UI/CLI/
  config surfaces that don't exist). Replaced with a succinct callout
  that explains the actual current behavior — target repo .env keys are
  stripped, workflows that need those values should use managed
  injection — so the reader still gets the "where to put my env vars"
  answer
- troubleshooting.md: the "Cannot register: codebase has sensitive env
  keys" section (error message that can no longer be emitted)

If the env-leak gate is ever resurrected per TODO(coleam00#1135), the docs can be
re-added then. The CHANGELOG v0.3.0 entry describing the gate is a
historical record of past behavior and does not need to be rewritten.

* fix(skill/troubleshooting): correct JSONL event type names and field name

C3 from the PR code-reviewer (comment 4311243858). The troubleshooting
reference's event-types table used _started / _completed / _failed
suffixes, but packages/workflows/src/logger.ts:19-30 shows the actual
WorkflowEvent.type enum is:

  workflow_start | workflow_complete | workflow_error |
  assistant | tool | validation |
  node_start | node_complete | node_skipped | node_error

The second jq recipe also queried `.event` but the discriminator is `.type`.
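
As a sketch of what a correct selector looks like, the enum above can be written as a TypeScript union and used to filter a JSONL log. The `WorkflowEvent` shape beyond `type`, the `errorEvents` helper, and the sample log are assumptions for illustration:

```typescript
// Event type names as listed above (packages/workflows/src/logger.ts:19-30).
type WorkflowEventType =
  | 'workflow_start' | 'workflow_complete' | 'workflow_error'
  | 'assistant' | 'tool' | 'validation'
  | 'node_start' | 'node_complete' | 'node_skipped' | 'node_error';

// Assumed minimal event shape: only the `type` discriminator is relied on.
interface WorkflowEvent {
  type: WorkflowEventType;
  [key: string]: unknown;
}

// Parse JSONL text and keep events whose type ends in "_error", roughly what
// a jq select on `.type` would do.
function errorEvents(jsonl: string): WorkflowEvent[] {
  return jsonl
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line) as WorkflowEvent)
    .filter((event) => event.type.endsWith('_error'));
}

const sampleLog = [
  '{"type":"workflow_start"}',
  '{"type":"node_start"}',
  '{"type":"node_error"}',
  '{"type":"workflow_error"}',
].join('\n');
```

Filtering on `.event` or on a `_failed` suffix would match nothing in such a log, which is why the old recipe silently returned no rows.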

Fixes:
- Event table: renamed columns (_started → _start, _completed → _complete,
  _failed → _error). Explicitly called out the field name as `type` so the
  reader knows what jq selector to use
- Replaced the "tool_use / tool_result" row with a single `tool` row and
  listed its actual payload fields (tool_name, tool_input, duration_ms,
  tokens) — tool_use/tool_result are SDK message kinds that appear within
  the AI stream, not top-level log event types
- Added a `validation` row (was missing; it's emitted by workflow-level
  validation calls with `check` and `result` fields)
- Removed `retry_attempt` row — this event type is not emitted to the
  JSONL file. Retry bookkeeping goes through pino logs, not the workflow
  log file
- Added an explicit callout that loop_iteration_started /
  loop_iteration_completed (and other emitter-only events) go through
  the workflow event emitter + DB workflow_events table, NOT the JSONL
  file. Pointed readers to the DB or Web UI for loop-level detail. This
  distinguishes the two parallel event systems — easy to conflate
  (store.ts:11-17 uses _started/_completed/_failed for the DB side,
  logger.ts uses _start/_complete/_error for JSONL)
- Fixed the "all failed events" jq recipe: .event → .type and _failed → _error
- Minor cleanup: the inline "tool_use events" mention in the "running
  forever" section said the wrong event name — updated to "tool or
  assistant events in the tail"
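
Because the two naming schemes are easy to conflate, a small translation sketch may help when moving between DB rows and JSONL lines. The suffix pairs come from the bullets above; the helper itself is hypothetical and says nothing about whether a given event is actually written to the JSONL file:

```typescript
// Suffix pairs from the commit message: DB side (store.ts) on the left,
// JSONL side (logger.ts) on the right.
const DB_TO_JSONL_SUFFIX: ReadonlyArray<readonly [string, string]> = [
  ['_started', '_start'],
  ['_completed', '_complete'],
  ['_failed', '_error'],
];

// Hypothetical helper: translate a DB-style event name to its JSONL spelling.
// Names without a known suffix are returned unchanged.
function toJsonlEventName(dbName: string): string {
  for (const [dbSuffix, jsonlSuffix] of DB_TO_JSONL_SUFFIX) {
    if (dbName.endsWith(dbSuffix)) {
      return dbName.slice(0, -dbSuffix.length) + jsonlSuffix;
    }
  }
  return dbName;
}
```

Note that emitter-only events such as loop_iteration_started have no JSONL counterpart at all, so a translated name is only meaningful for event types that appear in the logger.ts enum.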

Grounded in packages/workflows/src/logger.ts (canonical JSONL event
shape) and packages/workflows/src/store.ts (the parallel DB event
naming, which the reviewer correctly flagged as different and worth
keeping distinct).

* fix(skill): two stragglers from the code-reviewer audit

Cleanup of two references that slipped through the earlier C1 and C3 fixes:

- references/troubleshooting.md:126: `node_failed` → `node_error`
  (the "Node output is empty" diagnostics section references the JSONL
  log, which uses the logger.ts enum, not the DB workflow_events table,
  which does use `node_failed`). The C3 fix corrected the event table
  and one jq recipe but missed this inline mention.

- references/interactive-workflows.md:106: removed `archon workflow
  cancel <run-id>` (nonexistent CLI subcommand) from the
  troubleshooting bullet. This predates the hardening PR but fell
  within the C1 remediation scope. Replaced with the
  correct triage: reject (approval gate only) vs abandon (orphan
  cleanup, no subprocess kill) vs chat /workflow cancel (actual
  subprocess termination).

Grounded in the same sources as the earlier C1/C3 commits:
packages/cli/src/cli.ts:318-485 (no cancel case) and
packages/workflows/src/logger.ts:19-30 (JSONL type enum).

* feat(skill): point to archon.diy as the canonical docs source

The skill had no reference to archon.diy (the live docs site built from
packages/docs-web/). Several reference files said "see the docs site"
without naming the URL, leaving the agent to guess or grep the repo for
the hostname. An agent with the skill loaded should know that when the
distilled reference pages don't cover a case, the full canonical docs
are one WebFetch away.

SKILL.md: new "Richer Context: archon.diy" section between Routing and
Running Workflows. Covers:
- When to reach for the live docs (longer examples, tutorial framing,
  features the skill only mentions in passing, "where's that
  documented?" user questions)
- URL map — 13 starting points covering getting-started, book (tutorial
  series), guides/ (authoring + per-node-type + per-node-feature),
  reference/ (variables, CLI, security, architecture, configuration,
  troubleshooting), adapters/, deployment/
- Precedence: skill refs first (context-cheap, tuned for agents), docs
  site as escalation. Prevents agents defaulting to WebFetch when a
  local skill ref already covers the answer

Also upgrades the 5 existing generic "docs site" mentions across
reference files to concrete archon.diy URLs with anchor fragments where
helpful:
- good-practices.md: Inline sub-agents pattern → archon.diy/guides/
  authoring-workflows/#inline-sub-agents
- troubleshooting.md: "Install page on the docs site" → archon.diy/
  getting-started/installation/
- workflow-dag.md: "Workflow Description Best Practices" → anchor link;
  sandbox schema reference → archon.diy/guides/authoring-workflows/
  #claude-sdk-advanced-options
- repo-init.md: Security Model reference → archon.diy/reference/
  security/#target-repo-env-isolation (deep-link into the section that
  covers the <cwd>/.env strip behavior)

URL source of truth: astro.config.mjs:5 (site: 'https://archon.diy').
URL structure mirrors packages/docs-web/src/content/docs/<section>/
<page>.md — verified by the 62 pages the docs build produces.
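
The path-to-URL mirroring can be sketched as a small mapping function. The site origin and content root come from the text above; the helper name, the trailing-slash convention (inferred from the URLs listed earlier), and the handling of `index` pages are assumptions:

```typescript
// Site origin from astro.config.mjs:5 as quoted above.
const SITE = 'https://archon.diy';
const CONTENT_ROOT = 'packages/docs-web/src/content/docs/';

// Hypothetical helper: map a content file path to its assumed public URL.
function docsUrl(contentPath: string): string {
  if (!contentPath.startsWith(CONTENT_ROOT)) {
    throw new Error(`not a docs content path: ${contentPath}`);
  }
  const slug = contentPath
    .slice(CONTENT_ROOT.length)
    .replace(/\.mdx?$/, '') // drop the .md / .mdx extension
    .replace(/(^|\/)index$/, ''); // assumed: index pages map to their folder
  return slug === '' ? `${SITE}/` : `${SITE}/${slug}/`;
}
```

Under these assumptions, the installation page's content file maps to the same archon.diy/getting-started/installation/ URL that troubleshooting.md now links to.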