feat: contextFiles — per-project prompt context injection #1
Merged
Conversation
Add `contextFiles` to RepoConfig/MergedConfig so any project can declare
files whose content gets injected into the conversation prompt when the
conversation is scoped to that project.
Use case: moo-second-brain injects SOUL.md/USER.md/MEMORY.md so Slack
JARVIS starts with identity context instead of generic Archon personality.
- contextFiles on RepoConfig + MergedConfig interfaces
- mergeRepoConfig() propagation with path validation (rejects absolute + ..)
- buildProjectScopedPrompt() accepts optional contextContent
- orchestrator-agent reads files from codebase.default_cwd with
  defense-in-depth resolve check, 20K char cap, warn on missing
- Config loader tests (propagation, path rejection, backward compat)
- Prompt builder tests (content included/excluded, ordering)
Closes mhooooo/moo-second-brain#57
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mhooooo added a commit that referenced this pull request · Apr 12, 2026
* feat: auto-reset stale Claude SDK sessions + accept bare 'reset' on Slack
When the Claude Code SDK rejects a resume attempt with "No conversation
found" (the SDK session ID is gone), the orchestrator now transparently
resets the session and retries the query instead of surfacing an error
that the user has to /reset manually.
Also accepts bare 'reset' without the leading slash on Slack, since Slack
intercepts /reset as its own slash command.
Changes:
- claude.ts: classify stale_session as a non-retryable error class
(checked before 'crash' — specific wins over generic); export
STALE_SESSION_PATTERNS as the single source of truth for both the
classifier and the orchestrator's isStaleSessionError() helper
- session-transitions.ts: new 'stale-session-cleared' transition
(deactivates — next message creates a fresh session)
- orchestrator-agent.ts: isStaleSessionError() helper; SLACK_BARE_COMMANDS
normalization scoped to Slack platform only; handleStreamMode and
handleBatchMode wrap their AI query loops in runStreamQuery() /
runBatchQuery() functions so a catch block can reset sessionForQuery
and re-run with the fresh session ID; state reset before retry
(allMessages/allChunks/assistantMessages/commandDetected) so partial
content from the failed attempt never bleeds into the fresh response
- claude.test.ts: stale_session classification tests, including priority
over 'crash' on overlapping error messages, and .cause assertions
- orchestrator.test.ts: parameterized stream/batch retry tests covering
successful reset+retry, no-third-retry guard, null-session skip, and
fresh session ID assertion on retry
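The shared-pattern design described above might look like the following sketch. The pattern list and exact function shapes are assumptions for illustration, not the actual claude.ts/orchestrator-agent.ts code:

```typescript
// Single source of truth: both the subprocess-error classifier and the
// orchestrator-side helper test against the same pattern list, so a new
// stale-session message only needs to be added in one place.
const STALE_SESSION_PATTERNS: RegExp[] = [
  /no conversation found/i, // SDK rejected the resume — the session ID is gone
];

// Checked BEFORE the generic 'crash' class: specific wins over generic.
function classifySubprocessError(stderr: string): "stale_session" | "crash" {
  if (STALE_SESSION_PATTERNS.some((p) => p.test(stderr))) return "stale_session";
  return "crash";
}

// Orchestrator-side check used by the retry wrapper around the query loop.
function isStaleSessionError(err: unknown): boolean {
  const msg = err instanceof Error ? err.message : String(err);
  return STALE_SESSION_PATTERNS.some((p) => p.test(msg));
}
```

The ordering matters because stale-session stderr can also match crash-shaped patterns; classifying it first is what makes "specific wins over generic" hold.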
Ported from the dynamous/remote-coding-agent fork (commit 229217cf) with
the following intentional deltas against the new v0.3.5 base:
- Dropped defaultCodebase auto-scoping block (Patch 1 not carried —
CONFIG-REPLACEABLE per investigation verdict)
- Slack bare-command normalization scoped to Slack platform only
(fork shipped unscoped initially; this change came in a later
review-findings sub-commit)
- runStreamQuery/runBatchQuery keep upstream's 4-arg aiClient.sendQuery
signature including requestOptions (fork was on pre-v0.3.2 3-arg shape)
- Upstream deterministic command list preserved (help, status, reset,
workflow, register-project, update-project, remove-project, commands,
init, worktree) — fork only had 5
- No CHANGELOG / bun.lock / package.json / docs changes — those will
be rebuilt on top of the v0.3.5 base
Upstream-PR candidate for coleam00/Archon.
* feat: replace /invoke-workflow text sentinel with invoke_workflow MCP tool
The previous /invoke-workflow text-sentinel approach was unreliable —
Claude would emit the sentinel inconsistently (mid-response, inside code
blocks, with extra text, or not at all). The fallback post-loop regex
parser caught some cases but left a persistent failure mode where
workflows either didn't dispatch or dispatched at the wrong time.
Migrate to an in-process MCP server exposing invoke_workflow as a real
typed tool call. Claude now dispatches workflows by calling a function
with structured parameters, which is deterministic and reliable.
Changes:
- packages/core/src/orchestrator/workflow-tool.ts (NEW): buildWorkflowMcpServer
factory using createSdkMcpServer. Registers invoke_workflow tool with
zod schema for workflow_name / project_name / task_description. Tool is
fire-and-forget — it kicks off dispatchOrchestratorWorkflow via the
injected dispatch callback and returns immediately so the conversation
turn can end cleanly.
- packages/core/src/orchestrator/codebase-utils.ts (NEW): findCodebaseByName
helper — org-qualified and case-insensitive project matching, extracted
from orchestrator-agent.ts to eliminate duplication between workflow-tool.ts
and the register-project handler.
- packages/core/src/orchestrator/workflow-tool.test.ts (NEW): 8 tests
covering server shape, error paths, dispatch happy path, error handling,
case-insensitive project matching, org-qualified matching, and zod
validation of task_description.
- orchestrator-agent.ts:
- handleStreamMode and handleBatchMode each build a workflowMcpServer
at entry via buildWorkflowMcpServer({ ... dispatch }), then pass it
via requestOptions.mcpServers['archon-tools'] to aiClient.sendQuery.
Caller-provided requestOptions are merged, not overwritten, so outer
MCP config still works.
- /invoke-workflow text-sentinel detection removed from both stream
and batch post-loop command parsers. /register-project still uses
the text sentinel since it needs inline-parseable user-visible output.
- handleWorkflowInvocationResult function deleted (dead code after
sentinel removal).
- issueContext parameter renamed to _issueContext in handleStreamMode
and handleBatchMode to document that it's unused — issue context now
travels through the task_description field of the tool call instead.
- Imports: buildWorkflowMcpServer and findCodebaseByName added.
- prompt-builder.ts: router description rewritten to describe the
invoke_workflow tool interface (tool parameters) instead of the
text-sentinel command syntax.
- orchestrator-agent.test.ts: workflow-tool module mock added.
- prompt-builder.test.ts: assertions updated to match the new
tool-based routing instructions.
- error-formatter.ts: "Use /reset" → "Use `reset`" for the session-error
fallback message (Slack intercepts /reset as its own slash command;
bare 'reset' is accepted by the orchestrator after commit 1's
SLACK_BARE_COMMANDS normalization).
- error-formatter.test.ts: test expectation updated.
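The matching behavior of the extracted findCodebaseByName helper could be sketched like this — the Codebase shape and field names here are hypothetical, chosen only to illustrate org-qualified plus case-insensitive lookup:

```typescript
// Hypothetical shape for illustration; the real type lives in Archon core.
interface Codebase {
  name: string; // e.g. "Archon"
  org?: string; // e.g. "coleam00"
}

// Accepts either a bare project name or an org-qualified "org/project",
// matching case-insensitively in both forms.
function findCodebaseByName(codebases: Codebase[], query: string): Codebase | undefined {
  const q = query.toLowerCase();
  return codebases.find((c) => {
    const plain = c.name.toLowerCase();
    const qualified = c.org ? `${c.org.toLowerCase()}/${plain}` : plain;
    return q === plain || q === qualified;
  });
}
```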
Ported from dynamous/remote-coding-agent fork commit 3df00e1b with the
following intentional deltas:
- Dropped: Slack thinking indicator (⏳ emoji) — the feature was broken
in v0.2 (emoji flashed on then off because the wrapper's await fn()
returned immediately on a fire-and-forget handler) and not worth
carrying forward without a fix.
- Kept: upstream's 4-arg aiClient.sendQuery signature with requestOptions;
MCP server merged into caller-provided requestOptions rather than
replacing them.
- Kept: upstream's longer deterministic command list (help, status,
reset, workflow, register-project, update-project, remove-project,
commands, init, worktree) — fork only had 5.
- Skipped: CHANGELOG, CLAUDE.md, docs/adapters/slack.md changes —
upstream docs have diverged; docs will be rebuilt on top of v0.3.5.
- Skipped: packages/core/package.json MCP SDK dependency bump — will
resolve naturally via bun install once the runtime is assembled.
Upstream-PR candidate for coleam00/Archon.
* feat: supervised-autonomous Slack "go" dispatch trigger
Implements the approval gate for JARVIS's three-axis trust model:
heartbeat.py classifies GitHub issues and posts workflow proposals as
Slack DMs; Moo replies "go" (or "go #N") to authorize dispatch without
leaving Slack. One "go" = one dispatch. Silence means deny. Proposals
TTL 24h.
This is the JARVIS-specific delta of the fork — not upstreamable.
coleam00/Archon has no proposal/approval/supervised concept, and its
routing philosophy (user → Archon → workflow directly) doesn't match
JARVIS's heartbeat-classifier → proposal → human-approval model.
Changes:
- packages/server/src/proposals.ts (NEW): in-memory ProposalQueue with
prune-on-read TTL expiry, enqueue/getPending/getAll/markDispatched
methods. Singleton exported as proposalQueue. 104 lines.
- packages/server/src/proposals.test.ts (NEW): 10 unit tests, 100%
coverage — TTL expiry, duplicate dispatch prevention, channel
scoping, pending/dispatched filtering, issue-number lookup.
- packages/server/src/routes/api.ts: POST /api/proposals endpoint.
Called by heartbeat.py after the Slack triage DM. Validates required
fields (channelId, workflowName, codebaseName, userMessage) +
optional issueNumber/branchName. Returns 201 with proposal id +
expiresAt. Coexists with upstream's new registerOpenApiRoute update-
check route from v0.3.3.
- packages/core/src/orchestrator/orchestrator-agent.ts:
dispatchApprovedWorkflow() exported — takes pre-approved workflow +
codebase + user message + isolation hints, bypasses the AI router
(no Claude turn), resolves the codebase via findCodebaseByName,
calls dispatchOrchestratorWorkflow directly.
- packages/core/src/index.ts: re-export dispatchApprovedWorkflow.
- packages/server/src/index.ts: "go" pre-check in the Slack onMessage
handler — regex /^go(\s+#?(\d+))?$/i runs BEFORE normal message
processing. If matched: reads pending proposals for the channel,
resolves which to approve (all, or #N), markDispatched BEFORE
starting (idempotency guard), fires each dispatch through
lockManager.acquireLock + dispatchApprovedWorkflow, sends status
messages for empty-queue / already-dispatched / no-match cases.
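A minimal sketch of the ProposalQueue semantics described above — prune-on-read TTL plus a dispatch-once guard. Field names and the exact API are assumptions; only the behaviors (lazy expiry, duplicate-dispatch prevention, channel scoping) come from the commit message:

```typescript
interface Proposal {
  id: number;
  channelId: string;
  issueNumber?: number;
  dispatched: boolean;
  expiresAt: number; // epoch ms
}

class ProposalQueue {
  private items: Proposal[] = [];
  private nextId = 1;

  enqueue(
    p: Omit<Proposal, "id" | "dispatched" | "expiresAt">,
    ttlMs = 24 * 60 * 60 * 1000, // 24h TTL from the commit message
  ): Proposal {
    const proposal: Proposal = { ...p, id: this.nextId++, dispatched: false, expiresAt: Date.now() + ttlMs };
    this.items.push(proposal);
    return proposal;
  }

  // TTL is enforced lazily: expired proposals vanish on the next read.
  private prune(): void {
    const now = Date.now();
    this.items = this.items.filter((p) => p.expiresAt > now);
  }

  getPending(channelId: string): Proposal[] {
    this.prune();
    return this.items.filter((p) => p.channelId === channelId && !p.dispatched);
  }

  // Returns false if already dispatched — the idempotency guard that makes a
  // re-sent "go" a no-op instead of a double dispatch.
  markDispatched(id: number): boolean {
    this.prune();
    const p = this.items.find((x) => x.id === id);
    if (!p || p.dispatched) return false;
    p.dispatched = true;
    return true;
  }
}
```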
Trust model: the approval gate is here. Slack adapter whitelist
enforces who can reach this point. markDispatched happens before
execution, so re-sending "go" returns a status confirmation instead of
double-dispatching. Proposals expire after TTL — no standing approvals.
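The "go" pre-check described above can be reduced to a small sketch. The regex is the one from the commit message; the resolution logic (bare "go" approves all pending, "go #N" approves the matching issue) is an illustrative simplification of the real handler:

```typescript
const GO_PATTERN = /^go(\s+#?(\d+))?$/i;

interface PendingProposal {
  id: number;
  issueNumber?: number;
}

// Returns the proposals a "go" message authorizes, or null when the message
// is not an approval and should fall through to normal message handling.
function resolveGo(message: string, pending: PendingProposal[]): PendingProposal[] | null {
  const m = GO_PATTERN.exec(message.trim());
  if (!m) return null;       // not "go" — normal processing continues
  if (!m[2]) return pending; // bare "go" → everything pending in the channel
  const n = Number(m[2]);    // "go #N" / "go N" → only the matching issue
  return pending.filter((p) => p.issueNumber === n);
}
```

In the real handler each resolved proposal is marked dispatched before the dispatch starts, which is what turns a repeated "go" into a status message rather than a second run.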
Ported from dynamous/remote-coding-agent fork commit 69f24fdd with the
following integration detail:
- The "go" handler block was mis-merged by git (left adjacent to the
GitLab adapter init block, which wasn't its correct location). The
block was removed from there and manually injected into the Slack
onMessage callback at the correct pre-check point (after content is
extracted from the bot mention strip, before thread context check
and handleMessage dispatch).
- The upstream registerOpenApiRoute update-check route (new in v0.3.3)
and the fork's app.post('/api/proposals') route both land at the end
of registerRoutes() — both are preserved, patch's route appended
after upstream's.
- createMessageErrorHandler, getLog, lockManager, slackAdapter, and
conversationId are all already in the enclosing scope of the Slack
onMessage callback — no new imports required in server/index.ts.
PERMANENT fork delta — NOT an upstream-PR candidate.
* fix(tests): align carried tests with v0.3.5 base
Four independent test fixes required to make the carried commits
(stale sessions + invoke_workflow MCP tool) pass on the v0.3.5 base.
Grouped as one cleanup commit rather than splitting across fixups
because the changes span commit 1 and commit 2's test files and would
create misleading squash semantics if backfilled.
Fixes:
1. claude.test.ts: re-nest classifySubprocessError inside describe('ClaudeClient')
The cherry-pick auto-merge closed ClaudeClient before the fork's
new classifySubprocessError describe block, which in turn contains
upstream's pre-spawn env leak gate tests that reference `client`
(declared inside ClaudeClient). Result: ReferenceError: client is
not defined on 4 env-leak-gate tests. Moved the closing }); of
ClaudeClient from line 1006 to end-of-file, effectively nesting
classifySubprocessError + pre-spawn env leak gate back inside.
2. orchestrator.test.ts: update stale session retry test expected arg
The fork's version expected transitionSession to be called with
the platform conversation ID ('chat-456'). Upstream's convention
(line 832's 'first-message' transition) uses the DB conversation
ID. My stale-session-cleared transition was auto-merged to follow
upstream's convention (conversation.id), so update the two failing
retry tests to expect mockConversation.id instead of 'chat-456'.
3. orchestrator.test.ts: add Slack platform mock to bare command tests
Commit 1 scopes SLACK_BARE_COMMANDS normalization to Slack only
(per the fork's later review-findings sub-commit). The fork's
tests were written when normalization was unscoped. Added a
beforeEach in the bare command normalization describe to mock
platform.getPlatformType() = 'slack'.
4. orchestrator.test.ts: delete obsolete /invoke-workflow text-sentinel tests
Commit 2 removed text-sentinel detection from handleStreamMode
and handleBatchMode, and the handleWorkflowInvocationResult
function was deleted entirely. Tests that exercised text-sentinel
dispatch behavior (3 in stream mode, 5 in workflow routing via AI,
8 total) are testing dead code. Deleted them rather than
converting to MCP-tool tests — MCP dispatch is already covered by
workflow-tool.test.ts (added by commit 2).
After this commit: full test suite exit 0 across all 9 packages.
* feat: contextFiles — per-project prompt context injection
Add `contextFiles` to RepoConfig/MergedConfig so any project can declare
files whose content gets injected into the conversation prompt when the
conversation is scoped to that project.
Use case: moo-second-brain injects SOUL.md/USER.md/MEMORY.md so Slack
JARVIS starts with identity context instead of generic Archon personality.
- contextFiles on RepoConfig + MergedConfig interfaces
- mergeRepoConfig() propagation with path validation (rejects absolute + ..)
- buildProjectScopedPrompt() accepts optional contextContent
- orchestrator-agent reads files from codebase.default_cwd with
defense-in-depth resolve check, 20K char cap, warn on missing
- Config loader tests (propagation, path rejection, backward compat)
- Prompt builder tests (content included/excluded, ordering)
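The two-layer path handling above might look like this sketch — function names are hypothetical; only the rules (relative-only with no `..` at merge time, resolved-path containment check plus a 20K char cap at read time) come from the description:

```typescript
import { resolve, isAbsolute, sep } from "node:path";

const MAX_CONTEXT_CHARS = 20_000; // cap from the commit message

// Layer 1 (config merge): relative paths only, no parent-directory escapes.
function isValidContextPath(p: string): boolean {
  return !isAbsolute(p) && !p.split(/[\\/]/).includes("..");
}

// Layer 2 (defense in depth, read time): the resolved absolute path must
// still sit under the project root, even if layer 1 was bypassed.
function resolveWithinRoot(root: string, p: string): string | null {
  const abs = resolve(root, p);
  return abs === root || abs.startsWith(root + sep) ? abs : null;
}

// Truncate injected context so one oversized file can't dominate the prompt.
function capContext(content: string): string {
  return content.length > MAX_CONTEXT_CHARS ? content.slice(0, MAX_CONTEXT_CHARS) : content;
}
```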
Closes mhooooo/moo-second-brain#57
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mhooooo pushed a commit that referenced this pull request · Apr 25, 2026
…gent) (coleam00#1270) * feat(providers): add Pi community provider (@mariozechner/pi-coding-agent) Introduces Pi as the first community provider under the Phase 2 registry, registered with builtIn: false. Wraps Pi's full coding-agent harness the same way ClaudeProvider wraps @anthropic-ai/claude-agent-sdk and CodexProvider wraps @openai/codex-sdk. - PiProvider implements IAgentProvider; fresh AgentSession per sendQuery call - AsyncQueue bridges Pi's callback-based session.subscribe() to Archon's AsyncGenerator<MessageChunk> contract - Server-safe: AuthStorage.inMemory + SessionManager.inMemory + SettingsManager.inMemory + DefaultResourceLoader with all no* flags — no filesystem access, no cross-request state - API key seeded per-call from options.env → process.env fallback - Model refs: '<pi-provider-id>/<model-id>' (e.g. google/gemini-2.5-pro, openrouter/qwen/qwen3-coder) with syntactic compatibility check - registerPiProvider() wired at CLI, server, and config-loader entrypoints, kept separate from registerBuiltinProviders() since builtIn: false is load-bearing for the community-provider validation story - All 12 capability flags declared false in v1 — dag-executor warnings fire honestly for any unmapped nodeConfig field - 58 new tests covering event mapping, async-queue semantics, model-ref parsing, defensive config parsing, registry integration Supported Pi providers (v1): anthropic, openai, google, groq, mistral, cerebras, xai, openrouter, huggingface. Extend PI_PROVIDER_ENV_VARS as needed. Out of scope (v1): session resume, MCP, hooks, skills mapping, thinking level mapping, structured output, OAuth flows, model catalog validation. These remain false on PI_CAPABILITIES until intentionally wired. * feat(providers/pi): read ~/.pi/agent/auth.json for OAuth + api_key passthrough Replaces the v1 env-var-only auth flow with AuthStorage.create(), which reads ~/.pi/agent/auth.json. 
This transparently picks up credentials the user has populated via `pi` → `/login` (OAuth subscriptions: Claude Pro/Max, ChatGPT Plus, GitHub Copilot, Gemini CLI, Antigravity) or by editing the file directly. Env-var behavior preserved: when ANTHROPIC_API_KEY / GEMINI_API_KEY / etc. is set (in process.env or per-request options.env), the adapter calls setRuntimeApiKey which is priority #1 in Pi's resolution chain. Auth.json entries are priority coleam00#2-coleam00#3. Pi's internal env-var fallback remains priority coleam00#4 as a safety net. Archon does not implement OAuth flows itself — it only rides on creds the user created via the Pi CLI. OAuth refresh still happens inside Pi (auth-storage.ts:369-413) under a file lock; concurrent refreshes between the Pi CLI and Archon are race-safe by Pi's own design. - Fail-fast error now mentions both the env-var path and `pi /login` - 2 new tests: OAuth cred from auth.json; env var wins over auth.json - 12 existing tests still pass (env-var-only path unchanged) CI compatibility: no auth.json in CI, no change — env-var (secrets) flows through Pi's getEnvApiKey fallback identically to v1. * test(e2e): add Pi provider smoke test workflow Mirrors e2e-claude-smoke.yaml: single prompt node + bash assert. Targets `anthropic/claude-haiku-4-5` via `provider: pi`; works in CI (ANTHROPIC_API_KEY secret) and locally (user's `pi /login` OAuth). Verified locally with an Anthropic OAuth subscription — full run takes ~4s from session_started to assert PASS, exercising the async-queue bridge and agent_end → result-chunk assembly under real Pi event timing. Not yet wired into .github/workflows/e2e-smoke.yml — separate PR once this lands, to keep the Pi provider PR minimal. * feat(providers/pi): v2 — thinkingLevel, tool restrictions, systemPrompt Extends the Pi adapter with three node-level translations, flipping the corresponding capability flags from false → true so the dag-executor no longer emits warnings for these fields on Pi nodes. 1. 
effort / thinking → Pi thinkingLevel (options-translator.ts) - Archon EffortLevel enum: low|medium|high|max (from packages/workflows/src/schemas/dag-node.ts). `max` maps to Pi's `xhigh` since Archon's enum lacks it. - Pi-native strings (minimal, xhigh, off) also accepted for programmatic callers bypassing the schema. - `off` on either field → no thinkingLevel (Pi's implicit off). - Claude-shape object `thinking: {type:'enabled', budget_tokens:N}` yields a system warning and is not applied. 2. allowed_tools / denied_tools → filtered Pi built-in tools - Supports all 7 Pi tools: read, bash, edit, write, grep, find, ls. - Case-insensitive normalization. - Empty `allowed_tools: []` means no tools (LLM-only), matching e2e-claude-smoke's idiom. - Unknown names (Claude-specific like `WebFetch`) collected and surfaced as a system warning; ignored tools don't fail the run. 3. systemPrompt (AgentRequestOptions + nodeConfig.systemPrompt) - Threaded through `DefaultResourceLoader({systemPrompt})`; Pi's default prompt is replaced entirely. Request-level wins over node-level. Capability flag changes: - thinkingControl: false → true - effortControl: false → true - toolRestrictions: false → true Package delta: - +1 direct dep: @sinclair/typebox (Pi types reference it; adding as direct dep resolves the TS portable-type error). - +1 test file: options-translator.test.ts (19 tests, 100% coverage). - provider.test.ts extended with 11 new tests covering all three paths. - registry.test.ts updated: capability assertion reflects new flags. Live-verified: `bun run cli workflow run e2e-pi-smoke --no-worktree` succeeds in 1.2s with thinkingLevel=low, toolCount=0. Smoke YAML updated to use `effort: low` (schema-valid) + `allowed_tools: []` (LLM-only). * test(e2e): add comprehensive Pi smoke covering every CI-compatible node type Exercises every node type Archon supports under `provider: pi`, except `approval:` (pauses for human input, incompatible with CI): 1. prompt — inline AI prompt 2. 
command — named command file (uses e2e-echo-command.md) 3. loop — bounded iterative AI prompt (max_iterations: 2) 4. bash — shell script with JSON output 5. script — bun runtime (echo-args.js) 6. script — uv / Python runtime (echo-py.py) Plus DAG features on top of Pi: - depends_on + $nodeId.output substitution - when: conditional with JSON dot-access - trigger_rule: all_success merge - final assert node validates every upstream output is non-empty Complements the minimal e2e-pi-smoke.yaml — that stays as the fast-path smoke for connectivity checks; this one is the broader surface coverage. Verified locally end-to-end against Anthropic OAuth (pi /login): PASS, all 9 non-final nodes produce output, assert succeeds. * feat(providers/pi): resolve Archon `skills:` names to Pi skill paths Flips capabilities.skills: false → true by translating Archon's name-based `skills:` nodeConfig (e.g. `skills: [agent-browser]`) to absolute directory paths Pi's DefaultResourceLoader can consume via additionalSkillPaths. Search order for each skill name (first match wins): 1. <cwd>/.agents/skills/<name>/ — project-local, agentskills.io 2. <cwd>/.claude/skills/<name>/ — project-local, Claude convention 3. ~/.agents/skills/<name>/ — user-global, agentskills.io 4. ~/.claude/skills/<name>/ — user-global, Claude convention A directory resolves only if it contains a SKILL.md. Unresolved names are collected and surfaced as a system-chunk warning (e.g. "Pi could not resolve skill names: foo, bar. Searched .agents/skills and .claude/skills (project + user-global)."), matching the semantic of "requested but not found" without aborting the run. Pi's buildSystemPrompt auto-appends the agentskills.io XML block for each loaded skill, so the model sees them — no separate prompt injection needed (Pi differs from Claude here; Claude wraps in an AgentDefinition with a preloaded prompt, Pi uses XML block in system prompt). 
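The four-path search order above can be sketched as a small resolver. This is an illustration only, assuming the names shown here; the SKILL.md existence check is injected as a predicate so the logic is testable without a filesystem.

```typescript
// Illustrative sketch of the skill-name search order described above.
// `hasSkillMd` stands in for a filesystem check; real code would stat
// <dir>/SKILL.md. Function and parameter names are hypothetical.
function resolveSkill(
  name: string,
  cwd: string,
  home: string,
  hasSkillMd: (dir: string) => boolean,
): string | undefined {
  const candidates = [
    `${cwd}/.agents/skills/${name}`,  // 1. project-local, agentskills.io
    `${cwd}/.claude/skills/${name}`,  // 2. project-local, Claude convention
    `${home}/.agents/skills/${name}`, // 3. user-global, agentskills.io
    `${home}/.claude/skills/${name}`, // 4. user-global, Claude convention
  ];
  // First match wins; undefined -> name is collected into the system warning.
  return candidates.find(hasSkillMd);
}
```

Unresolved names accumulate in a list rather than aborting, matching the "requested but not found" warning semantics described above.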
Ancestor directory traversal above cwd is deliberately skipped in this pass — matches the Pi provider's cwd-bound scope and avoids ambiguity about which repo's skills win when Archon runs from a subdirectory. Bun's os.homedir() bypasses the HOME env var; the resolver uses `process.env.HOME ?? homedir()` so tests can stage a synthetic home dir. Tests: - 11 new tests in options-translator.test.ts cover project/user, .agents/ vs .claude/, project-wins-over-user, SKILL.md presence check, dedup, missing-name collection. - 2 new integration tests in provider.test.ts cover the missing-skill warning path and the "no skills configured → no additionalSkillPaths" path. - registry.test.ts updated to assert skills: true in capabilities. Live-verified locally: `.claude/skills/archon-dev/SKILL.md` resolves, pi.session_started log shows `skillCount: 1, missingSkillCount: 0`, smoke workflow passes in 1.2s. * feat(providers/pi): session resume via Pi session store Flips capabilities.sessionResume: false → true. Pi now persists sessions under ~/.pi/agent/sessions/<encoded-cwd>/<uuid>.jsonl by default — same pattern Claude and Codex use for their respective stores, same blast radius as those providers. Flow: - No resumeSessionId → SessionManager.create(cwd) (fresh, persisted) - resumeSessionId + match in SessionManager.list(cwd) → open(path) - resumeSessionId + no match → fresh session + system warning ("⚠️ Could not resume Pi session. Starting fresh conversation.") Matches Codex's resume_thread_failed fallback at packages/providers/src/codex/provider.ts:553-558. The sessionId flows back to Archon via the terminal `result` chunk — bridgeSession annotates it with session.sessionId unconditionally so Archon's orchestrator can persist it and pass it as resumeSessionId on the next turn. Same mechanism used for Claude/Codex. Cross-cwd resume (e.g. worktree switch) is deliberately not supported in this pass: list(cwd) scans only the current cwd's session dir. 
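The flow above can be sketched as a small resolver with the session store injected. This is an illustrative shape only; the `SessionStore` interface and its method names are assumptions for the sketch, not Pi's actual SessionManager API.

```typescript
// Sketch of the resume flow described above, with the store injected so each
// branch is testable. Names are hypothetical.
interface SessionStore {
  list(cwd: string): { id: string; path: string }[];
  open(path: string): string;   // returns a session handle (string for brevity)
  create(cwd: string): string;
}

function resolveSession(
  store: SessionStore,
  cwd: string,
  resumeSessionId?: string,
): { session: string; resumeFailed: boolean } {
  // Empty string counts as "no resume requested".
  if (!resumeSessionId) return { session: store.create(cwd), resumeFailed: false };
  let match;
  try {
    match = store.list(cwd).find((s) => s.id === resumeSessionId);
  } catch {
    // list() threw: degrade gracefully to a fresh session.
    return { session: store.create(cwd), resumeFailed: true };
  }
  // No match: fresh session; caller emits the "Could not resume" warning.
  if (!match) return { session: store.create(cwd), resumeFailed: true };
  return { session: store.open(match.path), resumeFailed: false };
}
```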
A workflow that changes cwd mid-run lands on a fresh session, which matches Pi's mental model. Bridge sessionId annotation uses session.sessionId, which Pi always populates (UUID) — so no special-case for inMemory sessions is needed. Factored the resolver into session-resolver.ts (5 unit tests): - no id → create - id + match → open - id + no match → create with resumeFailed: true - list() throws → resumeFailed: true (graceful) - empty-string id → treated as "no resume requested" Integration tests in provider.test.ts add 3 cases: - resume-not-found yields warning + calls create - resume-match calls open with the file path, no warning - result chunk always carries sessionId Verified live end-to-end against Anthropic OAuth: - first call → sessionId 019d...; model replies "noted" - second call with that sessionId → "resumed: true" in logs; model correctly recalls prior turn ("Crimson.") - bogus sessionId → "⚠️ Could not resume..." warning + fresh UUID * refactor(providers,core): generalize community-provider registration Addresses the community-pattern regression flagged in the PR coleam00#1270 review: a second community provider should require editing only its own directory, not seven files across providers/ + core/ + cli/ + server/. Three changes: 1. Drop typed `pi` slot from AssistantDefaultsConfig + AssistantDefaults. Community providers live behind the generic `[string]` index that `ProviderDefaultsMap` was explicitly designed to provide. The typed claude/codex slots stay — they give IDE autocomplete for built-in config access without `as` casts, which was the whole reason the intersection exists. Community providers parse their own config via Record<string, unknown> anyway, so the typed slot added no real parser safety. 2. Loop-based getDefaults + mergeAssistantDefaults. No more hardcoded `pi: {}` spreads. getDefaults() seeds from `getRegisteredProviders()`; mergeAssistantDefaults clones every slot present in `base`. 
Adding a new provider requires zero edits to this function. 3. New `registerCommunityProviders()` aggregator in registry.ts. Entrypoints (CLI, server, config-loader) call ONE function after `registerBuiltinProviders()` rather than one call per community provider. Adding a new community provider is now a single-line edit to registerCommunityProviders(). This makes Pi (and future community providers) actually behave like Phase 2 (coleam00#1195) advertised: drop the implementation under packages/providers/src/community/<id>/, export a `register<Id>Provider`, add one line to the aggregator. Tests: - New `registerCommunityProviders` suite (2 tests: registers pi, idempotent). - config-loader.test updated: assert built-in slots explicitly rather than exhaustive map shape. No functional change for Pi end-users. Purely structural. * fix(providers/pi,core): correctness + hygiene fixes from PR coleam00#1270 review Addresses six of the review's important findings, all within the same PR branch: 1. envInjection: false → true The provider reads requestOptions.env on every call (for API-key passthrough). Declaring the capability false caused a spurious dag-executor warning for every Pi user who configured codebase env vars — which is the MAIN auth path. Flipping to true removes the false positive. 2. toSafeAssistantDefaults: denylist → allowlist The old shape deleted `additionalDirectories`, `settingSources`, `codexBinaryPath` before sending defaults to the web UI. Any future sensitive provider field (OAuth token, absolute path, internal metadata) would silently leak via the `[key: string]: unknown` index signature. New SAFE_ASSISTANT_FIELDS map lists exactly what to expose per provider; unknown providers get an empty allowlist so the web UI sees "provider exists" but no config details. 3. AsyncQueue single-consumer invariant The type was documented single-consumer but unenforced. A second `for await` would silently race with the first over buffer + waiters. 
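The single-consumer invariant just described can be enforced with a synchronous check in `Symbol.asyncIterator`. A minimal sketch of such a guard follows; it assumes a simple promise-based waiter handoff and is illustrative, not the actual bridge implementation.

```typescript
// Minimal single-consumer async queue sketch. The guard makes a second
// `for await` fail fast instead of silently racing over buffer + waiters.
class AsyncQueue<T> {
  private buffer: T[] = [];
  private waiter: ((r: IteratorResult<T>) => void) | null = null;
  private consumed = false;
  private closed = false;

  push(item: T): void {
    const w = this.waiter;
    if (w) { this.waiter = null; w({ value: item, done: false }); }
    else this.buffer.push(item);
  }

  end(): void {
    this.closed = true;
    const w = this.waiter;
    if (w) { this.waiter = null; w({ value: undefined, done: true } as IteratorResult<T>); }
  }

  [Symbol.asyncIterator](): AsyncIterator<T> {
    // Enforced synchronously: copy-paste mistakes surface immediately.
    if (this.consumed) throw new Error("AsyncQueue is single-consumer: iterator already taken");
    this.consumed = true;
    return {
      next: (): Promise<IteratorResult<T>> => {
        if (this.buffer.length > 0) return Promise.resolve({ value: this.buffer.shift() as T, done: false });
        if (this.closed) return Promise.resolve({ value: undefined, done: true } as IteratorResult<T>);
        return new Promise((resolve) => { this.waiter = resolve; });
      },
    };
  }
}
```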
Added a synchronous guard in Symbol.asyncIterator that throws on second call — copy-paste mistakes now fail fast with a clear message instead of dropping items. 4. session.dispose() / session.abort() silent catches Both catch blocks now log at debug via a module-scoped logger so SDK regressions surface without polluting normal output. 5. Type scripted events as AgentSessionEvent in provider.test.ts Was `Record<string, unknown>` — Pi field renames would silently keep tests passing. Now typed against Pi's actual event union. 6. Leaked /tmp/pi-research/... path in provider.ts comment Local-machine path that crept in during research. Replaced with the upstream GitHub URL (matches convention at provider.ts:110). Plus review-flagged simplifications: - Extract lookupPiModel wrapper — isolates the `as unknown as` cast behind one searchable name. - Hoist QueueItem → BridgeQueueItem at module scope (export'd for test visibility; not used externally yet but enables unit testing the mapping in isolation if needed later). - getRegisteredProviderNames: remove side-effecting registration calls. `loadConfig()` already bootstraps the registry before any caller can observe this helper — the hidden coupling was misleading. Plus missing-coverage tests from the review (pr-test-analyzer): - session.prompt() rejection → error surfaces to consumer - pre-aborted signal → session.abort() called - mid-stream abort → session.abort() called - modelFallbackMessage → system chunk yielded - AsyncQueue second-consumer → throws synchronously No behavioral changes for end users beyond the envInjection warning fix. * docs: Pi provider + community-provider contributor guide Addresses the PR coleam00#1270 review's docs-impact findings: the original Pi PR had no user-facing or contributor-facing documentation, and architecture.md still referenced the pre-Phase-2 factory.ts pattern (factory.ts was deleted in coleam00#1195). 1. 
packages/docs-web/src/content/docs/reference/architecture.md - Replace stale factory.ts references with the registry pattern. - Update inline IAgentProvider block: add getCapabilities, add options parameter. - Rewrite MessageChunk block as the actual discriminated union (was a placeholder with optional fields that didn't match the current type). - "Adding a New AI Agent Provider" checklist now distinguishes built-in (register in registerBuiltinProviders) from community (separate guide). Links to the new contributor guide. 2. packages/docs-web/src/content/docs/contributing/adding-a-community-provider.md (new) - Step-by-step guide using Pi as the reference implementation. - Covers: directory layout, capability discipline (start false, flip one at a time), provider class skeleton, registration via aggregator, test isolation (Bun mock.module pollution), what NOT to do (no edits to AssistantDefaultsConfig, no direct registerProvider from entrypoints, no overclaiming capabilities). 3. packages/docs-web/src/content/docs/getting-started/ai-assistants.md - New "Pi (Community Provider)" section: install, OAuth + API-key table per Pi backend, model ref format, workflow examples, capability matrix showing what Pi supports (session resume, tool restrictions, effort/thinking, skills, system prompt, envInjection) and what it doesn't (MCP, hooks, structured output, cost control, fallback model, sandbox). 4. .env.example - New Pi section with commented env vars for each supported backend (ANTHROPIC_API_KEY through HUGGINGFACE_API_KEY), each paired with its Pi provider id. OAuth flow (pi /login → auth.json) is explicitly called out — Archon reads that file too. 5. CHANGELOG.md - Unreleased entry for Pi, registerCommunityProviders aggregator, and the new contributor guide.
mhooooo pushed a commit that referenced this pull request · Apr 26, 2026
…nv gaps, add good-practices + troubleshooting (coleam00#1363) * fix(skill/when): document the full `when:` operator set and compound expressions The skill reference previously stated "operators: ==, != only" which is materially wrong — the condition evaluator supports ==, !=, <, >, <=, >= plus && / || compound expressions with && binding tighter than ||, plus dot-notation JSON field access. An agent authoring a workflow from the skill would think half the operators don't exist. Replaces the single-sentence section with a structured reference covering: - All six comparison operators (string and numeric modes) - Compound expressions with precedence rules and short-circuit eval - JSON dot notation semantics and failure modes - The fail-closed rules in full (invalid expression, non-numeric side, missing field, skipped upstream) Grounded in packages/workflows/src/condition-evaluator.ts. * feat(skill): document Approval and Cancel node types Approval and cancel nodes are first-class DAG node types (approval since the workflow lifecycle work in coleam00#871, cancel as a guarded-exit primitive) but the skill never described either one. An agent reading the skill and asked to "add a review gate before implementation" or "stop the workflow if the input is unsafe" would fall back to bash + exit 1, losing the proper semantics (cancelled vs. failed, on_reject AI rework, web UI auto-resume). 
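The operator set and precedence rules documented in the `when:` fix above can be sketched as a toy evaluator. This is an illustration of the documented semantics only (six comparison operators, `&&` binding tighter than `||`, fail-closed on anything unparseable), not the actual condition-evaluator.ts; dot-notation JSON access is replaced here by an injected `resolve` lookup.

```typescript
// Toy evaluator for the documented `when:` semantics. `resolve` stands in
// for dot-notation field access; all names here are illustrative.
function evalWhen(expr: string, resolve: (ref: string) => string | undefined): boolean {
  const comparison = (c: string): boolean => {
    const m = c.match(/^\s*(\S+)\s*(==|!=|<=|>=|<|>)\s*(\S+)\s*$/);
    if (!m) return false; // fail closed: invalid expression
    const [, lhsRef, op, rhsRaw] = m;
    const lhs = resolve(lhsRef);
    if (lhs === undefined) return false; // fail closed: missing field / skipped upstream
    const rhs = rhsRaw.replace(/^['"]|['"]$/g, "");
    if (op === "==") return lhs === rhs;
    if (op === "!=") return lhs !== rhs;
    const ln = Number(lhs), rn = Number(rhs);
    if (Number.isNaN(ln) || Number.isNaN(rn)) return false; // fail closed: non-numeric side
    return op === "<" ? ln < rn : op === ">" ? ln > rn : op === "<=" ? ln <= rn : ln >= rn;
  };
  // `&&` binds tighter than `||`: split on || first, then && within each
  // clause. some/every give short-circuit evaluation.
  return expr.split("||").some((clause) => clause.split("&&").every(comparison));
}
```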
Approval node coverage (references/workflow-dag.md, SKILL.md): - Full configuration block with message, capture_response, on_reject - The interactive: true workflow-level requirement for web UI delivery - Approve/reject commands across all platforms (CLI, slash, natural language) and the capture_response → $node-id.output flow - Ignored-fields list + the on_reject.prompt AI sub-node exception Cancel node coverage (references/workflow-dag.md, SKILL.md): - Single-field schema (cancel: "<reason>") - Lifecycle: cancelled (not failed); in-flight parallel nodes stopped; no DAG auto-resume path - The "cancel: vs bash-exit-1" decision rule (expected precondition miss vs. check itself failing) - Two canonical patterns — upstream-classification gate, pre-expensive-step gate Validation-rules list updated to enumerate approval/cancel constraints (message non-empty, on_reject.max_attempts range 1-10, cancel reason non-empty), plus a forward note that script: joins the mutually-exclusive set once PR coleam00#1362 lands. Placement in both files is after the Loop section and before the validation section, so this commit stays additive with respect to PR coleam00#1362's Script node insertion between Bash and Loop — rebase is clean. * feat(skill): document workflow-level fields beyond name/provider/model The skill's Schema section previously showed only name, description, provider, and model at the workflow level — which is most of a stub. Agents asked to "use the 1M-context Claude beta" or "run this under a network sandbox" or "add a fallback model in case Opus rate-limits" had no way to discover that any of these fields existed at the workflow level. 
Adds a comprehensive Workflow-Level Fields section covering: - Core: name, description, provider, model, interactive (with explicit callout that interactive: true is REQUIRED for approval/loop gates on web UI — a common footgun) - Isolation: worktree.enabled for pin-on/pin-off (the only worktree field at workflow level; baseBranch/copyFiles/path/initSubmodules are config.yaml only, so a cross-reference points there) - Claude SDK advanced: effort, thinking, fallbackModel, betas, sandbox, with explicit per-node-only exceptions (maxBudgetUsd, systemPrompt) - Codex-specific: modelReasoningEffort (with note that it's NOT the same as Claude's effort — this has confused users), webSearchMode, additionalDirectories - A complete worked example combining sandbox + approval + interactive All fields cross-referenced against packages/workflows/src/schemas/workflow.ts and packages/workflows/src/schemas/dag-node.ts. * feat(skill/loop): document interactive loops and gate_message Interactive loop nodes pause between iterations for human feedback via /workflow approve — used by archon-piv-loop and archon-interactive-prd. The skill's Loop Nodes section previously omitted both interactive: true and gate_message entirely, so an agent writing a guided-refinement workflow wouldn't know the feature exists or that gate_message is required at parse time. 
Adds: - interactive and gate_message rows to the config table (marking gate_message as required when interactive: true — enforced by the loader's superRefine) - A dedicated "Interactive Loops" subsection explaining the 6-step iterate-pause-approve-resume flow - Explicit call-out that $LOOP_USER_INPUT populates ONLY on the first iteration of a resumed session — easy to miss and a common surprise - Workflow-level interactive: true requirement for web UI delivery (loader warning otherwise) so the full-flow example is complete - Note that until_bash substitution DOES shell-quote $nodeId.output (unlike script bodies) — called out since the audit surfaced this inconsistency * fix(skill/cli): complete the CLI command reference with missing lifecycle commands The CLI reference previously documented only list, run, cleanup, validate, complete, version, setup, and chat — missing nearly every workflow lifecycle command an agent needs to operate a paused, failed, or stuck run. The interactive-workflows reference assumed these commands existed without actually documenting them. 
Adds full documentation for: - archon workflow status — show running workflow(s) - archon workflow approve <run-id> [comment] — resume approval gate (also populates $LOOP_USER_INPUT on interactive loops and the gate node's output when capture_response: true) - archon workflow reject <run-id> [reason] — reject gate; cancels or triggers on_reject rework depending on node config - archon workflow cancel <run-id> — terminate running/paused with in-flight subprocess kill - archon workflow abandon <run-id> — mark stuck row cancelled without subprocess kill (for orphan-cleanup after server crashes — matches the coleam00#1216 precedent) - archon workflow resume <run-id> [message] — force-resume specific run (auto-resume is default; this is for explicit override) - archon workflow cleanup [days] — disk hygiene for old terminal runs (with explicit callout that it does NOT transition 'running' rows, a common confusion) - archon workflow event emit — used inside loop prompts for state signalling; documented so agents don't invent their own mechanism - archon continue <branch> [flags] [msg] — iterative-session entry point with --workflow and --no-context flags Also: - Adds --allow-env-keys flag to the `workflow run` flag table with audit-log context and the env-leak-gate remediation use case - Adds an "Auto-resume without --resume" note disambiguating when --resume is needed vs. when auto-resume handles it - Adds --include-closed flag to `isolation cleanup`, which was previously missing; converts the flag list to a structured table - Explains the cancel/abandon distinction (live subprocess vs. orphan) All grounded in packages/cli/src/commands/workflow.ts, continue.ts, and isolation.ts. 
* feat(skill/repo-init): add scripts/ and state/, three-path env model, per-project env injection The repo-init reference was missing two first-class .archon/ directories (scripts/ since v0.3.3, state/ since the workflow-state feature) and had nothing to say about env — the #1 thing a user hits on first-run when their repo has a .env file with API keys. Directory tree updates: - Adds .archon/scripts/ with the extension->runtime rule (.ts/.js -> bun, .py -> uv) so agents know where to put named scripts referenced by script: nodes. - Adds .archon/state/ with explicit "always gitignore" callout — these are runtime artifacts, not source. Previously undocumented in the skill. - Adds .archon/.env (repo-scoped Archon env) and distinguishes it from the target repo's top-level .env. - Adds a "What each directory is for" list so the structure isn't just a tree with no narrative. .gitignore guidance: - state/ and .env added as must-gitignore (state/ matches CLAUDE.md and reference/archon-directories.md — skill was lagging). - mcp/ demoted to conditional — gitignore only if you hardcode secrets. New "Three-Path Env Model" section: - ~/.archon/.env (trusted, user), <cwd>/.archon/.env (trusted, repo), <cwd>/.env (UNTRUSTED, target project — stripped from subprocess env). - Precedence (override: true across archon-owned paths) and the observable [archon] loaded N keys / stripped K keys log lines so operators can verify what actually happened. - Decision tree for where to put API keys vs. target-project env vs. things Archon shouldn't touch. - Links to archon setup --scope home|project with --force for writing to the right file with timestamped backups. New "Per-Project Env Injection" section: - Documents both managed surfaces: .archon/config.yaml env: block (git-committed, $REF expansion) and Web UI Settings → Projects → Env Vars (DB-stored, never returned over API). 
- Names every execution surface that receives the injected vars: Claude/Codex/Pi subprocess, bash: nodes, script: nodes, and direct codebase-scoped chat. - Documents the env-leak gate with all 5 remediation paths so an agent hitting "Cannot register: env has sensitive keys" knows the options. Grounded in CHANGELOG v0.3.7 (three-path env + setup flags), v0.3.0 (env-leak gate), and reference/security.md on the docs site. * fix(skill/authoring-commands): correct override paths and add home-scoped commands The file-location and discovery sections described an override layout that does not match the actual resolver. It showed: .archon/commands/defaults/archon-assist.md # Overrides the bundled and claimed `.archon/commands/defaults/` was where repo-level overrides lived. In fact the resolver (executor-shared.ts:152-200 + command-validation.ts) walks `.archon/commands/` 1 level deep and uses basename matching — putting `archon-assist.md` at the top of `.archon/commands/` is the canonical way to override the bundled version. The `defaults/` subfolder is an Archon-internal convention for shipping bundled defaults, not a user-facing override pattern. Also, home-scoped commands (`~/.archon/commands/`, shipped in v0.3.7) were completely absent — agents authoring personal helpers wouldn't know they could live at the user level and be shared across every repo. Changes: - File Location section now shows all three discovery scopes (repo, home, bundled) with precedence ordering and 1-level subfolder rules - Duplicate-basename rule documented as a user error surface - Discovery and Priority section rewritten with accurate 3-step lookup order — no more references to the nonexistent defaults/ override path - Adds the Web UI "Global (~/.archon/commands/)" palette label note so users authoring helpers for the builder know what to expect No code changes — this is a pure fix of stale/incorrect skill reference material.
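The three-path env precedence described in the repo-init changes above can be sketched as a merge order. This is an illustrative shape under the stated rules (archon-owned paths override, target-repo `.env` keys are stripped rather than loaded); the function and parameter names are hypothetical, not the actual loader.

```typescript
// Sketch of the three-path env model: ~/.archon/.env (trusted, user) then
// <cwd>/.archon/.env (trusted, repo) override process env; keys found in the
// target repo's own <cwd>/.env (UNTRUSTED) are stripped from the result.
function buildSubprocessEnv(
  userEnv: Record<string, string>,        // ~/.archon/.env
  repoArchonEnv: Record<string, string>,  // <cwd>/.archon/.env
  targetRepoEnvKeys: string[],            // keys present in <cwd>/.env
  processEnv: Record<string, string>,
): Record<string, string> {
  // override: true across archon-owned paths, repo-scoped last (wins).
  const merged = { ...processEnv, ...userEnv, ...repoArchonEnv };
  for (const key of targetRepoEnvKeys) delete merged[key]; // strip, never load
  return merged;
}
```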
* feat(skill): add workflow good-practices and troubleshooting reference pages Closes two gaps from the audit. The skill previously had zero guidance on designing multi-node workflows (what to avoid, what to reach for first, how to structure artifact chains) and zero guidance on where to look when things go wrong (log paths, env-leak gate remediations, orphan-row cleanup, resume semantics). New references/good-practices.md (9 Good Practices + 7 Anti-Patterns): - Use deterministic nodes (bash:/script:) for deterministic work, AI for reasoning — the single biggest quality lever - output_format required whenever downstream when: reads a field — the most common source of "workflow silently routes wrong" - trigger_rule: none_failed_min_one_success after conditional branches — the classic bug where all_success fails because a skipped when:-gated branch doesn't count as a success - context: fresh requires artifacts for state passing — commands must explicitly "read $ARTIFACTS_DIR/..." when downstream of fresh - Cheap models (haiku) for glue, strong for substance - Workflow descriptions as routing affordances - Validate (archon validate workflows) + smoke-run before shipping - Artifact-chain-first design - worktree.enabled: true for code-changing workflows (reversibility) - Anti-patterns with before/after YAML examples for each (AI-for-tests, free-form when: matching, context: fresh without artifacts, long flat AI-node layers, secrets in YAML, retry on loop nodes, tiny max_iterations, missing workflow-level interactive:, tool-restricted MCP nodes) New references/troubleshooting.md: - Log location (~/.archon/workspaces/<owner>/<repo>/logs/<run-id>.jsonl) with jq recipes for common queries (last assistant message, failed events, full stream) - Artifact location for cross-node handoff debugging - 9 Common Failure Modes, each with root cause + concrete fix: - $BASE_BRANCH unresolvable - Env-leak gate (5 remediations) - Claude/Codex binary not found (compiled-binary-only) - 
"running" forever (AI working / orphan / idle_timeout) - Mid-workflow failure and auto-resume semantics - Approval gate missing on web UI (workflow-level interactive:) - MCP plugin connection noise (filtered by design) - Empty $nodeId.output / field access (4 causes) - Diagnostic command cheat sheet (list, status, isolation list, validate, tail-log, --verbose, LOG_LEVEL=debug) - Escalation protocol (version + validate + log tail + CHANGELOG + issue) SKILL.md routing table now dispatches "Workflow good practices / anti-patterns" and "Troubleshoot a failing / stuck workflow" to the new references so an agent can find them without having to know they exist. * docs(book): update node-types coverage from four to all seven The book is the curated first-contact reading path (landing page → "Get Started" → /book/). Both dag-workflows.md and quick-reference.md were stuck on "four node types" — missing script, approval, and cancel. A user reading the book as their first introduction would form an incomplete mental model, then find three more node types in the reference section later with no explanation of when they arrived. book/dag-workflows.md: - "four node types" → "seven node types. 
Exactly one mode field is required per node" - Table now lists Command, Prompt, Bash, Script, Loop, Approval, Cancel with one-line "when to use" for each, and cross-links to the dedicated guide pages for Script / Loop / Approval - New sections below the table for Script (inline + named examples with runtime and deps), Approval (with the interactive: true workflow-level note that's easy to miss), and Cancel (guarded-exit pattern) — keeping the existing narrative shape for Bash and Loop book/quick-reference.md: - Node Options table now includes script, approval, cancel rows - agents row added (inline sub-agents, Claude-only) - New "Script-specific fields" and "Approval-specific fields" subsections so the cheat-sheet is actually complete rather than pointing users elsewhere for the required constraints - Retry row callout that loop nodes hard-error on retry — previously omitted - bash timeout note widened to cover script timeout (same semantics) Both files are docs-web content; the CI build on the docs-script-nodes PR (coleam00#1362) previously validated the Starlight build path with a similar table addition, so this should render clean. * fix(skill/cli): remove nonexistent `archon workflow cancel`, fix workflow status jq recipe Two accuracy issues from the PR code-reviewer (comment 4311243858). C1: `archon workflow cancel <run-id>` does NOT exist as a CLI subcommand. The switch at packages/cli/src/cli.ts:318-485 dispatches on list / run / status / resume / abandon / approve / reject / cleanup / event — running `archon workflow cancel` hits the default case and exits with "Unknown workflow subcommand: cancel" (cli.ts:478-484).
Active cancellation is only available via: - /workflow cancel <run-id> chat slash command (all platforms) - Cancel button on the Web UI dashboard - POST /api/workflows/runs/{runId}/cancel REST endpoint cli-commands.md: removed the `### archon workflow cancel <run-id>` subsection; kept the `abandon` subsection but made it explicit that abandon does NOT kill a subprocess. Added a call-out box at the bottom of the abandon section explaining where to go for actual cancellation. troubleshooting.md "running forever" section: split the original cancel-vs-abandon advice into three bullets — Web UI / CLI abandon (for orphans, no subprocess kill) / chat `/workflow cancel` (for live runs that need interruption). Added an explicit "there is no archon workflow cancel CLI subcommand" parenthetical since the wrong command was being suggested in flow. I1: the `archon workflow list --json` diagnostic used an incorrect jq filter. workflow list's --json output (workflow.ts:185-219) has shape { workflows: [{ name, description, provider?, model?, ... }], errors: [...] } with no `runs` field — `jq '.workflows[] | select(.runs)'` returns empty unconditionally. Replaced with `archon workflow status --json | jq '.runs[]'`, which matches the actual shape of workflowStatusCommand at workflow.ts:852+ ({ runs: WorkflowRun[] }). Also tightened the narration to distinguish JSON from human-readable status output. No change to the commit history in this PR — these are follow-up fixes to claims I introduced in earlier commits of this branch (f10b989 for C1, 66d2b86 for I1). * fix(skill): remove env-leak gate references (feature was removed in provider extraction) C2 from the PR code-reviewer (comment 4311243858). The pre-spawn env-leak gate was removed from the codebase during the provider-extraction refactor — see TODO(coleam00#1135) at packages/providers/src/claude/provider.ts:908.
Zero hits for --allow-env-keys / allowEnvKeys / allow_env_keys / allow_target_repo_keys across packages/. The CLI's parseArgs (cli.ts:182-208) has no --allow-env-keys option, and because parseArgs uses strict: false, an unknown --allow-env-keys would be silently ignored rather than error. What remains accurate and is NOT touched: - Three-Path Env Model section (user/repo archon-owned envs are loaded; target repo <cwd>/.env keys are stripped from process.env at boot) still correctly describes current behavior, grounded in packages/paths/src/strip-cwd-env.ts + env-integration.test.ts - Per-Project Env Injection section (Option 1: .archon/config.yaml env: block; Option 2: Web UI Settings → Projects → Env Vars) is unchanged — both remain the sanctioned way to get env vars into subprocesses Removed claims (all three files): - cli-commands.md: --allow-env-keys flag row in the workflow run flags table - repo-init.md: the "Env-leak gate" subsection at the end of Per-Project Env Injection listing 5 remediations (all of which reference UI/CLI/ config surfaces that don't exist). Replaced with a succinct callout that explains the actual current behavior — target repo .env keys are stripped, workflows that need those values should use managed injection — so the reader still gets the "where to put my env vars" answer - troubleshooting.md: the "Cannot register: codebase has sensitive env keys" section (error message that can no longer be emitted) If the env-leak gate is ever resurrected per TODO(coleam00#1135), the docs can be re-added then. The CHANGELOG v0.3.0 entry describing the gate is a historical record of past behavior and does not need to be rewritten. * fix(skill/troubleshooting): correct JSONL event type names and field name C3 from the PR code-reviewer (comment 4311243858). 
The troubleshooting reference's event-types table used _started / _completed / _failed suffixes, but packages/workflows/src/logger.ts:19-30 shows the actual WorkflowEvent.type enum is: workflow_start | workflow_complete | workflow_error | assistant | tool | validation | node_start | node_complete | node_skipped | node_error The second jq recipe also queried `.event` but the discriminator is `.type`. Fixes: - Event table: renamed columns (_started → _start, _completed → _complete, _failed → _error). Explicitly called out the field name as `type` so the reader knows what jq selector to use - Replaced the "tool_use / tool_result" row with a single `tool` row and listed its actual payload fields (tool_name, tool_input, duration_ms, tokens) — tool_use/tool_result are SDK message kinds that appear within the AI stream, not top-level log event types - Added a `validation` row (was missing; it's emitted by workflow-level validation calls with `check` and `result` fields) - Removed `retry_attempt` row — this event type is not emitted to the JSONL file. Retry bookkeeping goes through pino logs, not the workflow log file - Added an explicit callout that loop_iteration_started / loop_iteration_completed (and other emitter-only events) go through the workflow event emitter + DB workflow_events table, NOT the JSONL file. Pointed readers to the DB or Web UI for loop-level detail. 
This distinguishes the two parallel event systems — easy to conflate (store.ts:11-17 uses _started/_completed/_failed for the DB side, logger.ts uses _start/_complete/_error for JSONL)
- Fixed the "all failed events" jq recipe: .event → .type and _failed → _error
- Minor cleanup: the inline "tool_use events" mention in the "running forever" section said the wrong event name — updated to "tool or assistant events in the tail"

Grounded in packages/workflows/src/logger.ts (canonical JSONL event shape) and packages/workflows/src/store.ts (the parallel DB event naming, which the reviewer correctly flagged as different and worth keeping distinct).

* fix(skill): two stragglers from the code-reviewer audit

Cleanup of two references that slipped through the earlier C1 and C3 fixes:
- references/troubleshooting.md:126: `node_failed` → `node_error` (the "Node output is empty" diagnostics section references the JSONL log, which uses the logger.ts enum — not the DB workflow_events table, which does use `node_failed`). The C3 fix corrected the event table and one jq recipe but missed this inline mention.
- references/interactive-workflows.md:106: removed `archon workflow cancel <run-id>` (nonexistent CLI subcommand) from the troubleshooting bullet. This was pre-existing before the hardening PR but fell within the C1 remediation scope. Replaced with the correct triage: reject (approval gate only) vs abandon (orphan cleanup, no subprocess kill) vs chat /workflow cancel (actual subprocess termination).

Grounded in the same sources as the earlier C1/C3 commits: packages/cli/src/cli.ts:318-485 (no cancel case) and packages/workflows/src/logger.ts:19-30 (JSONL type enum).

* feat(skill): point to archon.diy as the canonical docs source

The skill had no reference to archon.diy (the live docs site built from packages/docs-web/). Several reference files said "see the docs site" without naming the URL, leaving the agent to guess or grep the repo for the hostname.
An agent with the skill loaded should know that when the distilled reference pages don't cover a case, the full canonical docs are one WebFetch away.

SKILL.md: new "Richer Context: archon.diy" section between Routing and Running Workflows. Covers:
- When to reach for the live docs (longer examples, tutorial framing, features the skill only mentions in passing, "where's that documented?" user questions)
- URL map — 13 starting points covering getting-started, book (tutorial series), guides/ (authoring + per-node-type + per-node-feature), reference/ (variables, CLI, security, architecture, configuration, troubleshooting), adapters/, deployment/
- Precedence: skill refs first (context-cheap, tuned for agents), docs site as escalation. Prevents agents defaulting to WebFetch when a local skill ref already covers the answer

Also upgrades the 5 existing generic "docs site" mentions across reference files to concrete archon.diy URLs with anchor fragments where helpful:
- good-practices.md: Inline sub-agents pattern → archon.diy/guides/authoring-workflows/#inline-sub-agents
- troubleshooting.md: "Install page on the docs site" → archon.diy/getting-started/installation/
- workflow-dag.md: "Workflow Description Best Practices" → anchor link; sandbox schema reference → archon.diy/guides/authoring-workflows/#claude-sdk-advanced-options
- repo-init.md: Security Model reference → archon.diy/reference/security/#target-repo-env-isolation (deep-link into the section that covers the <cwd>/.env strip behavior)

URL source of truth: astro.config.mjs:5 (site: 'https://archon.diy'). URL structure mirrors packages/docs-web/src/content/docs/<section>/<page>.md — verified by the 62 pages the docs build produces.
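The two parallel event vocabularies flagged by the reviewer can be sketched as separate type unions. The JSONL members are quoted verbatim in the commit above; the DB-side list is an assumption beyond the _started/_completed/_failed suffix pattern and the loop_iteration_* names that the commit mentions:

```typescript
// JSONL log events (logger.ts enum, as quoted above): _start/_complete/_error.
type JsonlEventType =
  | "workflow_start" | "workflow_complete" | "workflow_error"
  | "assistant" | "tool" | "validation"
  | "node_start" | "node_complete" | "node_skipped" | "node_error";

// DB workflow_events rows (store.ts): _started/_completed/_failed.
// Exact member list is illustrative, not attested.
type DbEventType =
  | "node_started" | "node_completed" | "node_failed"
  | "loop_iteration_started" | "loop_iteration_completed";

// Keeping the unions separate turns conflation into a compile error:
const jsonlFailure: JsonlEventType = "node_error"; // OK
const dbFailure: DbEventType = "node_failed";      // OK
// const wrong: JsonlEventType = "node_failed";    // would not compile

console.log(jsonlFailure, dbFailure);
```

Modeling the two sides as distinct unions is one way to make the "easy to conflate" distinction the commits keep correcting hold at the type level.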
Summary
Adds `contextFiles` to `RepoConfig`/`MergedConfig` so any project can declare files injected into the conversation prompt. … at merge time, resolve-check at read time.

Test plan
- `bun run type-check` passes
- `bun run test` passes
- … rejection, backward compat

Closes mhooooo/moo-second-brain#57
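A minimal sketch of the merge-time vs read-time split described above — only the `contextFiles` field name and the `RepoConfig`/`MergedConfig` type names come from this PR; the surrounding fields, function names, and merge semantics are assumptions:

```typescript
import { existsSync } from "node:fs";

// Hypothetical shapes: only `contextFiles` is attested by the PR summary.
interface RepoConfig {
  contextFiles?: string[]; // files a project injects into the conversation prompt
}
interface MergedConfig {
  contextFiles: string[];
}

// "At merge time": project entries are appended to the base config and
// de-duplicated once, when configs are combined.
function mergeContextFiles(base: MergedConfig, repo: RepoConfig): MergedConfig {
  return {
    contextFiles: [...new Set([...base.contextFiles, ...(repo.contextFiles ?? [])])],
  };
}

// "Resolve-check at read time": missing files are skipped when the prompt is
// built, rather than failing the earlier merge step.
function readableContextFiles(config: MergedConfig): string[] {
  return config.contextFiles.filter((path) => existsSync(path));
}

const merged = mergeContextFiles(
  { contextFiles: ["AGENTS.md"] },
  { contextFiles: ["AGENTS.md", "docs/context.md"] },
);
console.log(merged.contextFiles); // → ["AGENTS.md", "docs/context.md"]
```

Deferring the existence check to read time keeps config merging pure and lets a declared-but-absent file degrade gracefully instead of rejecting the whole config.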
🤖 Generated with Claude Code