fix(workflows): make archon-adversarial-dev sed usage portable and add regression test by LaplaceYoung · Pull Request #1 · LaplaceYoung/Archon

LaplaceYoung · 2026-04-22T04:58:43Z

Motivation

Ensure the archon-adversarial-dev workflow init step works on environments where sed -i is non-portable by using an atomic temp-file + mv replacement.
Prevent regressions by asserting the portable flow is present and disallowing sed -i variants in bundled workflow content.
Improve cross-platform robustness for workflow initialization that writes state.json.

Description

Replaced non-portable sed -i call in .archon/workflows/defaults/archon-adversarial-dev.yaml with a temp-file flow using STATE_TMP, sed ... > "$STATE_TMP", and mv "$STATE_TMP" "$ARTIFACTS/state.json".
Added a regression test in packages/workflows/src/defaults/bundled-defaults.test.ts that requires the STATE_TMP pattern, the redirected sed invocation, the mv step, and asserts that no sed -i variants appear via a regex.
Small test-suite-related adjustments only; no production API or runtime behavior changes beyond the workflow YAML replacement.

Testing

Ran bun test packages/workflows/src/defaults/bundled-defaults.test.ts, which failed in this environment due to module resolution error Cannot find module '@archon/paths' and therefore could not complete the test run.
Ran bun --filter @archon/workflows test, which likewise failed with Cannot find module '@archon/paths' and did not produce passing results here.
The changes are unit-test covered by the new assertions; full CI in the repository (with correct workspace resolution) is expected to run the suite successfully.

…gent) (coleam00#1270) * feat(providers): add Pi community provider (@mariozechner/pi-coding-agent) Introduces Pi as the first community provider under the Phase 2 registry, registered with builtIn: false. Wraps Pi's full coding-agent harness the same way ClaudeProvider wraps @anthropic-ai/claude-agent-sdk and CodexProvider wraps @openai/codex-sdk. - PiProvider implements IAgentProvider; fresh AgentSession per sendQuery call - AsyncQueue bridges Pi's callback-based session.subscribe() to Archon's AsyncGenerator<MessageChunk> contract - Server-safe: AuthStorage.inMemory + SessionManager.inMemory + SettingsManager.inMemory + DefaultResourceLoader with all no* flags — no filesystem access, no cross-request state - API key seeded per-call from options.env → process.env fallback - Model refs: '<pi-provider-id>/<model-id>' (e.g. google/gemini-2.5-pro, openrouter/qwen/qwen3-coder) with syntactic compatibility check - registerPiProvider() wired at CLI, server, and config-loader entrypoints, kept separate from registerBuiltinProviders() since builtIn: false is load-bearing for the community-provider validation story - All 12 capability flags declared false in v1 — dag-executor warnings fire honestly for any unmapped nodeConfig field - 58 new tests covering event mapping, async-queue semantics, model-ref parsing, defensive config parsing, registry integration Supported Pi providers (v1): anthropic, openai, google, groq, mistral, cerebras, xai, openrouter, huggingface. Extend PI_PROVIDER_ENV_VARS as needed. Out of scope (v1): session resume, MCP, hooks, skills mapping, thinking level mapping, structured output, OAuth flows, model catalog validation. These remain false on PI_CAPABILITIES until intentionally wired. * feat(providers/pi): read ~/.pi/agent/auth.json for OAuth + api_key passthrough Replaces the v1 env-var-only auth flow with AuthStorage.create(), which reads ~/.pi/agent/auth.json. This transparently picks up credentials the user has populated via `pi` → `/login` (OAuth subscriptions: Claude Pro/Max, ChatGPT Plus, GitHub Copilot, Gemini CLI, Antigravity) or by editing the file directly. Env-var behavior preserved: when ANTHROPIC_API_KEY / GEMINI_API_KEY / etc. is set (in process.env or per-request options.env), the adapter calls setRuntimeApiKey which is priority #1 in Pi's resolution chain. Auth.json entries are priority coleam00#2-coleam00#3. Pi's internal env-var fallback remains priority coleam00#4 as a safety net. Archon does not implement OAuth flows itself — it only rides on creds the user created via the Pi CLI. OAuth refresh still happens inside Pi (auth-storage.ts:369-413) under a file lock; concurrent refreshes between the Pi CLI and Archon are race-safe by Pi's own design. - Fail-fast error now mentions both the env-var path and `pi /login` - 2 new tests: OAuth cred from auth.json; env var wins over auth.json - 12 existing tests still pass (env-var-only path unchanged) CI compatibility: no auth.json in CI, no change — env-var (secrets) flows through Pi's getEnvApiKey fallback identically to v1. * test(e2e): add Pi provider smoke test workflow Mirrors e2e-claude-smoke.yaml: single prompt node + bash assert. Targets `anthropic/claude-haiku-4-5` via `provider: pi`; works in CI (ANTHROPIC_API_KEY secret) and locally (user's `pi /login` OAuth). Verified locally with an Anthropic OAuth subscription — full run takes ~4s from session_started to assert PASS, exercising the async-queue bridge and agent_end → result-chunk assembly under real Pi event timing. Not yet wired into .github/workflows/e2e-smoke.yml — separate PR once this lands, to keep the Pi provider PR minimal. * feat(providers/pi): v2 — thinkingLevel, tool restrictions, systemPrompt Extends the Pi adapter with three node-level translations, flipping the corresponding capability flags from false → true so the dag-executor no longer emits warnings for these fields on Pi nodes. 1. effort / thinking → Pi thinkingLevel (options-translator.ts) - Archon EffortLevel enum: low|medium|high|max (from packages/workflows/src/schemas/dag-node.ts). `max` maps to Pi's `xhigh` since Archon's enum lacks it. - Pi-native strings (minimal, xhigh, off) also accepted for programmatic callers bypassing the schema. - `off` on either field → no thinkingLevel (Pi's implicit off). - Claude-shape object `thinking: {type:'enabled', budget_tokens:N}` yields a system warning and is not applied. 2. allowed_tools / denied_tools → filtered Pi built-in tools - Supports all 7 Pi tools: read, bash, edit, write, grep, find, ls. - Case-insensitive normalization. - Empty `allowed_tools: []` means no tools (LLM-only), matching e2e-claude-smoke's idiom. - Unknown names (Claude-specific like `WebFetch`) collected and surfaced as a system warning; ignored tools don't fail the run. 3. systemPrompt (AgentRequestOptions + nodeConfig.systemPrompt) - Threaded through `DefaultResourceLoader({systemPrompt})`; Pi's default prompt is replaced entirely. Request-level wins over node-level. Capability flag changes: - thinkingControl: false → true - effortControl: false → true - toolRestrictions: false → true Package delta: - +1 direct dep: @sinclair/typebox (Pi types reference it; adding as direct dep resolves the TS portable-type error). - +1 test file: options-translator.test.ts (19 tests, 100% coverage). - provider.test.ts extended with 11 new tests covering all three paths. - registry.test.ts updated: capability assertion reflects new flags. Live-verified: `bun run cli workflow run e2e-pi-smoke --no-worktree` succeeds in 1.2s with thinkingLevel=low, toolCount=0. Smoke YAML updated to use `effort: low` (schema-valid) + `allowed_tools: []` (LLM-only). * test(e2e): add comprehensive Pi smoke covering every CI-compatible node type Exercises every node type Archon supports under `provider: pi`, except `approval:` (pauses for human input, incompatible with CI): 1. prompt — inline AI prompt 2. command — named command file (uses e2e-echo-command.md) 3. loop — bounded iterative AI prompt (max_iterations: 2) 4. bash — shell script with JSON output 5. script — bun runtime (echo-args.js) 6. script — uv / Python runtime (echo-py.py) Plus DAG features on top of Pi: - depends_on + $nodeId.output substitution - when: conditional with JSON dot-access - trigger_rule: all_success merge - final assert node validates every upstream output is non-empty Complements the minimal e2e-pi-smoke.yaml — that stays as the fast-path smoke for connectivity checks; this one is the broader surface coverage. Verified locally end-to-end against Anthropic OAuth (pi /login): PASS, all 9 non-final nodes produce output, assert succeeds. * feat(providers/pi): resolve Archon `skills:` names to Pi skill paths Flips capabilities.skills: false → true by translating Archon's name-based `skills:` nodeConfig (e.g. `skills: [agent-browser]`) to absolute directory paths Pi's DefaultResourceLoader can consume via additionalSkillPaths. Search order for each skill name (first match wins): 1. <cwd>/.agents/skills/<name>/ — project-local, agentskills.io 2. <cwd>/.claude/skills/<name>/ — project-local, Claude convention 3. ~/.agents/skills/<name>/ — user-global, agentskills.io 4. ~/.claude/skills/<name>/ — user-global, Claude convention A directory resolves only if it contains a SKILL.md. Unresolved names are collected and surfaced as a system-chunk warning (e.g. "Pi could not resolve skill names: foo, bar. Searched .agents/skills and .claude/skills (project + user-global)."), matching the semantic of "requested but not found" without aborting the run. Pi's buildSystemPrompt auto-appends the agentskills.io XML block for each loaded skill, so the model sees them — no separate prompt injection needed (Pi differs from Claude here; Claude wraps in an AgentDefinition with a preloaded prompt, Pi uses XML block in system prompt). Ancestor directory traversal above cwd is deliberately skipped in this pass — matches the Pi provider's cwd-bound scope and avoids ambiguity about which repo's skills win when Archon runs from a subdirectory. Bun's os.homedir() bypasses the HOME env var; the resolver uses `process.env.HOME ?? homedir()` so tests can stage a synthetic home dir. Tests: - 11 new tests in options-translator.test.ts cover project/user, .agents/ vs .claude/, project-wins-over-user, SKILL.md presence check, dedup, missing-name collection. - 2 new integration tests in provider.test.ts cover the missing-skill warning path and the "no skills configured → no additionalSkillPaths" path. - registry.test.ts updated to assert skills: true in capabilities. Live-verified locally: `.claude/skills/archon-dev/SKILL.md` resolves, pi.session_started log shows `skillCount: 1, missingSkillCount: 0`, smoke workflow passes in 1.2s. * feat(providers/pi): session resume via Pi session store Flips capabilities.sessionResume: false → true. Pi now persists sessions under ~/.pi/agent/sessions/<encoded-cwd>/<uuid>.jsonl by default — same pattern Claude and Codex use for their respective stores, same blast radius as those providers. Flow: - No resumeSessionId → SessionManager.create(cwd) (fresh, persisted) - resumeSessionId + match in SessionManager.list(cwd) → open(path) - resumeSessionId + no match → fresh session + system warning ("⚠️ Could not resume Pi session. Starting fresh conversation.") Matches Codex's resume_thread_failed fallback at packages/providers/src/codex/provider.ts:553-558. The sessionId flows back to Archon via the terminal `result` chunk — bridgeSession annotates it with session.sessionId unconditionally so Archon's orchestrator can persist it and pass it as resumeSessionId on the next turn. Same mechanism used for Claude/Codex. Cross-cwd resume (e.g. worktree switch) is deliberately not supported in this pass: list(cwd) scans only the current cwd's session dir. A workflow that changes cwd mid-run lands on a fresh session, which matches Pi's mental model. Bridge sessionId annotation uses session.sessionId, which Pi always populates (UUID) — so no special-case for inMemory sessions is needed. Factored the resolver into session-resolver.ts (5 unit tests): - no id → create - id + match → open - id + no match → create with resumeFailed: true - list() throws → resumeFailed: true (graceful) - empty-string id → treated as "no resume requested" Integration tests in provider.test.ts add 3 cases: - resume-not-found yields warning + calls create - resume-match calls open with the file path, no warning - result chunk always carries sessionId Verified live end-to-end against Anthropic OAuth: - first call → sessionId 019d...; model replies "noted" - second call with that sessionId → "resumed: true" in logs; model correctly recalls prior turn ("Crimson.") - bogus sessionId → "⚠️ Could not resume..." warning + fresh UUID * refactor(providers,core): generalize community-provider registration Addresses the community-pattern regression flagged in the PR coleam00#1270 review: a second community provider should require editing only its own directory, not seven files across providers/ + core/ + cli/ + server/. Three changes: 1. Drop typed `pi` slot from AssistantDefaultsConfig + AssistantDefaults. Community providers live behind the generic `[string]` index that `ProviderDefaultsMap` was explicitly designed to provide. The typed claude/codex slots stay — they give IDE autocomplete for built-in config access without `as` casts, which was the whole reason the intersection exists. Community providers parse their own config via Record<string, unknown> anyway, so the typed slot added no real parser safety. 2. Loop-based getDefaults + mergeAssistantDefaults. No more hardcoded `pi: {}` spreads. getDefaults() seeds from `getRegisteredProviders()`; mergeAssistantDefaults clones every slot present in `base`. Adding a new provider requires zero edits to this function. 3. New `registerCommunityProviders()` aggregator in registry.ts. Entrypoints (CLI, server, config-loader) call ONE function after `registerBuiltinProviders()` rather than one call per community provider. Adding a new community provider is now a single-line edit to registerCommunityProviders(). This makes Pi (and future community providers) actually behave like Phase 2 (coleam00#1195) advertised: drop the implementation under packages/providers/src/community/<id>/, export a `register<Id>Provider`, add one line to the aggregator. Tests: - New `registerCommunityProviders` suite (2 tests: registers pi, idempotent). - config-loader.test updated: assert built-in slots explicitly rather than exhaustive map shape. No functional change for Pi end-users. Purely structural. * fix(providers/pi,core): correctness + hygiene fixes from PR coleam00#1270 review Addresses six of the review's important findings, all within the same PR branch: 1. envInjection: false → true The provider reads requestOptions.env on every call (for API-key passthrough). Declaring the capability false caused a spurious dag-executor warning for every Pi user who configured codebase env vars — which is the MAIN auth path. Flipping to true removes the false positive. 2. toSafeAssistantDefaults: denylist → allowlist The old shape deleted `additionalDirectories`, `settingSources`, `codexBinaryPath` before sending defaults to the web UI. Any future sensitive provider field (OAuth token, absolute path, internal metadata) would silently leak via the `[key: string]: unknown` index signature. New SAFE_ASSISTANT_FIELDS map lists exactly what to expose per provider; unknown providers get an empty allowlist so the web UI sees "provider exists" but no config details. 3. AsyncQueue single-consumer invariant The type was documented single-consumer but unenforced. A second `for await` would silently race with the first over buffer + waiters. Added a synchronous guard in Symbol.asyncIterator that throws on second call — copy-paste mistakes now fail fast with a clear message instead of dropping items. 4. session.dispose() / session.abort() silent catches Both catch blocks now log at debug via a module-scoped logger so SDK regressions surface without polluting normal output. 5. Type scripted events as AgentSessionEvent in provider.test.ts Was `Record<string, unknown>` — Pi field renames would silently keep tests passing. Now typed against Pi's actual event union. 6. Leaked /tmp/pi-research/... path in provider.ts comment Local-machine path that crept in during research. Replaced with the upstream GitHub URL (matches convention at provider.ts:110). Plus review-flagged simplifications: - Extract lookupPiModel wrapper — isolates the `as unknown as` cast behind one searchable name. - Hoist QueueItem → BridgeQueueItem at module scope (export'd for test visibility; not used externally yet but enables unit testing the mapping in isolation if needed later). - getRegisteredProviderNames: remove side-effecting registration calls. `loadConfig()` already bootstraps the registry before any caller can observe this helper — the hidden coupling was misleading. Plus missing-coverage tests from the review (pr-test-analyzer): - session.prompt() rejection → error surfaces to consumer - pre-aborted signal → session.abort() called - mid-stream abort → session.abort() called - modelFallbackMessage → system chunk yielded - AsyncQueue second-consumer → throws synchronously No behavioral changes for end users beyond the envInjection warning fix. * docs: Pi provider + community-provider contributor guide Addresses the PR coleam00#1270 review's docs-impact findings: the original Pi PR had no user-facing or contributor-facing documentation, and architecture.md still referenced the pre-Phase-2 factory.ts pattern (factory.ts was deleted in coleam00#1195). 1. packages/docs-web/src/content/docs/reference/architecture.md - Replace stale factory.ts references with the registry pattern. - Update inline IAgentProvider block: add getCapabilities, add options parameter. - Rewrite MessageChunk block as the actual discriminated union (was a placeholder with optional fields that didn't match the current type). - "Adding a New AI Agent Provider" checklist now distinguishes built-in (register in registerBuiltinProviders) from community (separate guide). Links to the new contributor guide. 2. packages/docs-web/src/content/docs/contributing/adding-a-community-provider.md (new) - Step-by-step guide using Pi as the reference implementation. - Covers: directory layout, capability discipline (start false, flip one at a time), provider class skeleton, registration via aggregator, test isolation (Bun mock.module pollution), what NOT to do (no edits to AssistantDefaultsConfig, no direct registerProvider from entrypoints, no overclaiming capabilities). 3. packages/docs-web/src/content/docs/getting-started/ai-assistants.md - New "Pi (Community Provider)" section: install, OAuth + API-key table per Pi backend, model ref format, workflow examples, capability matrix showing what Pi supports (session resume, tool restrictions, effort/thinking, skills, system prompt, envInjection) and what it doesn't (MCP, hooks, structured output, cost control, fallback model, sandbox). 4. .env.example - New Pi section with commented env vars for each supported backend (ANTHROPIC_API_KEY through HUGGINGFACE_API_KEY), each paired with its Pi provider id. OAuth flow (pi /login → auth.json) is explicitly called out — Archon reads that file too. 5. CHANGELOG.md - Unreleased entry for Pi, registerCommunityProviders aggregator, and the new contributor guide.

fix(workflows): harden adversarial-dev sed portability checks

b3b201b

LaplaceYoung added the codex label Apr 22, 2026 — with ChatGPT Codex Connector

LaplaceYoung merged commit 8e4b866 into dev Apr 22, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(workflows): make archon-adversarial-dev sed usage portable and add regression test#1

fix(workflows): make archon-adversarial-dev sed usage portable and add regression test#1
LaplaceYoung merged 1 commit intodevfrom
young/fix-issues-in-pr-and-comment

LaplaceYoung commented Apr 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

LaplaceYoung commented Apr 22, 2026

Motivation

Description

Testing

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant