Skip to content

chore: cherry-pick upstream Tier 6 workflow polish (7 commits) — batch 7#12

Merged
prospapledge88 merged 9 commits intodevfrom
chore/cherry-pick-upstream-tier6-workflow-polish
May 5, 2026
Merged

chore: cherry-pick upstream Tier 6 workflow polish (7 commits) — batch 7#12
prospapledge88 merged 9 commits intodevfrom
chore/cherry-pick-upstream-tier6-workflow-polish

Conversation

@prospapledge88
Copy link
Copy Markdown
Owner

Summary

Cherry-picks 7 workflow-polish commits (Tier 6) from coleam00/archon upstream/dev. Builds on PRs #6/#7/#8/#9/#10/#11; cumulative absorption now 57 / 110 upstream commits.

Picks (chronological)

Upstream Description
817186d4 Make archon-adversarial-dev init sed portable on macOS (no sed -i) (coleam00#1155)
46671c46 Filter user-plugin MCP failure noise out of workflow warnings (coleam00#1327)
d1a7c96f Test coverage for anyFailed status derivation in DAG executor (coleam00#1403)
3a291b48 Migrate archon-piv-loop plan handoff to $ARTIFACTS_DIR/progress.txt (coleam00#1398)
87234c0b Switch eight bundled-workflow Opus pins from claude-opus-4-5-20250929 to opus[1m] alias (coleam00#1395)
f342d059 Fix approval gate state-machine bypass after reject-with-redraft on resume (coleam00#1435)
dc83efb2 Concise failure messages for bash/script nodes (coleam00#1389, coleam00#1393)

Skipped / already-absorbed

  • 4c6ddd99 (fail loudly on SDK isError) — already absorbed in earlier batch
  • 7ea32141 (initialize options.hooks before YAML merge) — already absorbed
  • bc25deef (detect completion in any XML tag) — already absorbed
  • f094f2ad was applied but partially: the path migration was already covered by the fork's prior 8295ece7 resolution, so the net diff to archon-piv-loop.yaml is small.

Pre-existing fork conflict markers cleaned up

.archon/workflows/defaults/archon-piv-loop.yaml had two unresolved <<<<<<< HEAD ... 8295ece7 conflict markers committed to dev that survived from an earlier cherry-pick. They surfaced again as nested conflicts during this batch and have now been resolved in favor of the safer 8295ece7 version (explicit git add path/... instead of git add -A, plus the Never stage guidance). The two markers should never have been committed in the first place — flagging here so you're aware. Section diff after resolution:

  • Step 4 (Phase 3 commit guidance): now reads **Never stage**: ... or anything under $ARTIFACTS_DIR.\n\nTrack progress in $ARTIFACTS_DIR/progress.txt:
  • Step 5 (post-validation fixes): now uses explicit git add path/to/file1 ... + git status --porcelain + git diff --cached --quiet || git commit ... (was git add -A && git commit ... 2>/dev/null || true)

Conflict-resolution notes

  • packages/workflows/src/dag-executor.ts (6fea3926) — fork imports join from path, upstream adds isAbsolute and resolve as resolvePath. Combined: import { isAbsolute, join, resolve as resolvePath } from 'path'.
  • packages/workflows/src/dag-executor.test.ts (6fea3926) — fork's shouldContinueStreamingForStatus describe block conflicted with upstream's three new MCP-filter describe blocks. Resolution: kept fork's existing block (terminated cleanly with });), then appended the upstream parseMcpFailureServerNames, loadConfiguredMcpServerNames, and executeDagWorkflow -- MCP failure filtering describes.
  • packages/workflows/src/dag-executor.test.ts (a57d6281) — upstream's added block tried to also re-add shouldContinueStreamingForStatus (fork already has it from an earlier batch). Resolution: dropped the duplicate, kept the new executeDagWorkflow -- final status derivation describe.
  • .archon/workflows/defaults/archon-piv-loop.yaml (f094f2ad) — see pre-existing conflict markers above.
  • packages/workflows/src/defaults/bundled-defaults.generated.ts — auto-merge conflicts from every YAML touch were resolved by taking upstream's bytes during the picks, then re-running bun run generate:bundled at the end so the file matches the fork's actual default set. Final regen lands as a separate commit (b9bb2860).
  • packages/docs-web/src/content/docs/guides/script-nodes.md (fccfe423) — upstream wants to modify the file, but the fork hasn't absorbed 46874cab which creates it. Dropped the docs change rather than create a half-merged docs page.
  • CHANGELOG.md (multiple picks) — git checkout --ours for each conflict, then a single batched entry added at the end.

Validation

bun run validate passes (with the same baseline-equivalent test failures as dev):

  • check:bundled ✅ (36 commands, 22 workflows up to date — 22 because the regen consolidates Opus pins)
  • check:bundled-skill ✅ (21 files match)
  • type-check ✅ (all 10 packages)
  • lint ✅ (zero warnings)
  • format:check
  • test baseline-equivalent on macOS local: 25 fail in @archon/core + 1 fail in @archon/workflows (script-node .env leak — macOS-specific; doesn't fire on Linux CI). Linux CI is expected to show the alternative baseline 25 + 2 fail in @archon/workflows (substituteWorkflowVariables prefix tests pre-dating the prompt-injection sanitizer wrap), per the saved memory documenting the platform split. Net: baseline-equivalent on both platforms. New: the 7 cherry-picks together add ~25 new tests (mostly in dag-executor.test.ts), all passing.

Test plan

  • bun run validate clean (baseline-equivalent failures, same as dev)
  • All 7 picks applied (3 already-absorbed picks dropped cleanly)
  • Pre-existing archon-piv-loop.yaml conflict markers resolved
  • Bundled defaults regenerated after Opus alias + sed fix + piv-loop migration
  • CI green on push
  • Post-merge spot-check: trigger an approval-gate workflow with reject + redraft + resume; verify the redraft prompt re-runs (not a silent approve)

🤖 Generated with Claude Code

LaplaceYoung and others added 9 commits May 6, 2026 07:33
coleam00#1155)

* fix(workflows): make adversarial init sed portable on macOS

* chore: regenerate bundled-defaults after adversarial-dev sed fix

Sync generated bundle with the new temp-file sed pattern in
archon-adversarial-dev.yaml so check:bundled passes and binary
distributions ship the macOS-safe version.

---------

Co-authored-by: laplace young <yangqk12@whu.edu.cn>
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
…coleam00#1327)

* fix(workflows): filter user-plugin MCP noise out of workflow warnings

Before this change, the dag-executor surfaced every entry in the Claude
SDK's "MCP server connection failed: …" system message to the user. That
message includes user-level plugin MCPs inherited from ~/.claude/ (e.g.
`telegram`) that fail to connect in the headless workflow subprocess —
they're non-actionable noise for the workflow author.

Fix:
- Pre-compute the set of workflow-configured MCP server names per node
  by parsing the `mcp:` config file once at the start of
  executeNodeInternal. No caller-facing API change; no duplication of
  the provider's env-var expansion logic (we only need the keys).
- Split the system-message handler: the `MCP server connection failed:`
  path now surfaces only the subset of failing names that match the
  node's configured set; user-plugin failures are debug-logged as
  `dag.mcp_plugin_connection_suppressed`. The `⚠️` branch is unchanged.

Supersedes coleam00#1134 (closed as stale — the Windows HOME fix in that PR was
already shipped via coleam00#1302, and the claude.ts enabledPlugins change
targeted a file that has since moved into @archon/providers).

Credits @MrFadiAi for identifying and reporting the underlying issue.

Co-authored-by: Fadi Ai <MrFadiAi@users.noreply.github.com>

* fix(workflows): preserve MCP failure status in filtered message + observability

Address review feedback on PR coleam00#1327:

- parseMcpFailureServerNames now returns {name, segment} entries so the
  forwarded "MCP server connection failed: ..." message preserves the
  per-server status detail (e.g. "(timeout)", "(disconnected)") that the
  bare-name reconstruction was dropping.
- loadConfiguredMcpServerNames now debug-logs read failures as
  dag.mcp_filter_config_read_failed instead of swallowing them silently.
  A transient EMFILE/EBUSY at filter time would otherwise silently
  reclassify a real workflow-MCP failure as plugin noise.
- Add 4 integration tests through executeDagWorkflow covering the mixed
  workflow/plugin split, all-plugin suppression, no-mcp:-config nodes,
  and the unchanged ⚠️ warning path.
- Drop a WHAT comment above configuredMcpNames and a temporal phrase
  ("before this filter landed") that would rot on merge.
- Document the filtering boundary in guides/mcp-servers.md and add a
  troubleshooting row for users debugging silently-suppressed plugin MCPs.

---------

Co-authored-by: Fadi Ai <MrFadiAi@users.noreply.github.com>
…cutor (coleam00#1403)

PIV Task 1: Adds three new tests in a dedicated describe block
'executeDagWorkflow -- final status derivation' covering the anyFailed
branch (dag-executor.ts ~line 2956) that previously had no direct test:
- one success + one independent failure calls failWorkflowRun (not completeWorkflowRun)
- multiple successes + one failure calls failWorkflowRun (not completeWorkflowRun)
- trigger_rule: none_failed skips dependent node but anyFailed still marks run failed

Fixes coleam00#1381.
…m00#1398)

* fix(workflow): migrate piv-loop plan handoff to $ARTIFACTS_DIR (coleam00#1380)

The create-plan node used a relative path (.claude/archon/plans/{slug}.plan.md)
that the AI agent would sometimes write to a different location, breaking all
downstream nodes that glob for the plan file. Migrated all plan/progress file
references to $ARTIFACTS_DIR/plan.md and $ARTIFACTS_DIR/progress.txt, matching
the pattern used by archon-fix-github-issue and other workflows.

Changes:
- Replace slug-based plan path with $ARTIFACTS_DIR/plan.md in create-plan node
- Replace ls -t glob discovery with direct $ARTIFACTS_DIR/plan.md reads in
  refine-plan, code-review, and fix-feedback nodes
- Replace empty-string guard with file-existence check in implement-setup bash
- Migrate progress.txt references in implement loop to $ARTIFACTS_DIR/
- Add explicit plan/progress paths in finalize node
- Regenerated bundled-defaults.generated.ts

Fixes coleam00#1380

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(workflow): address review findings in archon-piv-loop

- Rename 'Step 2: Write the Plan' to 'Step 2: Plan File Location' to
  eliminate the duplicate heading that collided with Step 3's identical
  title in the create-plan node
- Guard implement-setup against a 0-task plan file: exit 1 with a
  clear error when no '### Task N:' sections are found, preventing a
  silent no-op implement loop
- Remove 2>/dev/null from code-review commit so pre-commit hook failures
  and other stderr are visible to the agent instead of silently swallowed
- Replace '|| true' on git push in finalize with an explicit WARNING echo
  so push failures (auth, upstream conflict, no remote) surface to the
  agent rather than being silently ignored
- Regenerate bundled-defaults.generated.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore(workflows): regenerate bundled defaults to match opus[1m] alias

The bundle was stale relative to the YAML sources after coleam00#1395 merged —
check:bundled was failing CI. Regenerated; no YAML edits.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…#1395)

Anthropic's Opus 4.7 landed 2026-04-16; on the Anthropic API, opus /
opus[1m] now resolve to 4.7 with a 1M context window at standard
pricing. Using the alias instead of the hard-pinned claude-opus-4-6[1m]
lets bundled default workflows auto-track the recommended Opus version.

No explicit effort is set, so nodes inherit the per-model default
(xhigh on 4.7, high on 4.6).
…ume (coleam00#1435)

* fix(workflows): approval gate bypass after reject-with-redraft on resume

When an approval node was rejected with on_reject.prompt, the synthetic
PromptNode built to run the on_reject prompt reused the approval gate's
own node ID. executeNodeInternal then wrote a node_completed event with
that ID, causing getCompletedDagNodeOutputs to treat the gate as already
completed on the next resume — bypassing the human gate entirely.

Fix: give the synthetic node the ID `${node.id}:on_reject` so its
node_completed event has a distinct step_name that won't match the
approval gate slot in priorCompletedNodes.

Adds a regression test asserting no node_completed event with the
approval gate's ID is written during on_reject execution.

Fixes coleam00#1429

* test(workflows): add positive assertion and SSE side-effect comment for on_reject synthetic node

Add complementary positive assertion to the regression test to verify that
node_completed is written exactly once with step_name 'review:on_reject',
ensuring future refactors that suppress the event entirely would be caught.

Add inline comment in executeApprovalNode documenting the known SSE side-effect:
node_started/node_completed events with nodeId='review:on_reject' flow through
the SSE pipeline into the web UI, resulting in a transient phantom node in the
execution view. This is cosmetic-only — the human gate contract is preserved.

* simplify: reduce duplicate cast pattern in on_reject test assertions
…m00#1389) (coleam00#1393)

* fix(workflows): concise failure messages for bash/script nodes (coleam00#1389)

When a `bash:` or `script:` node fails, the executor was embedding
`err.message` verbatim into the user-visible error. For inline scripts run
via `bash -c <body>` / `bun -e <body>`, Node's `promisify(execFile)` puts
the entire substituted script body into `err.message`, `err.cmd`, and the
first line of `err.stack` — and the Pino log serialized all three, repeating
the body ~3× in one structured log line. The actionable diagnostic was
buried under kilobytes of echoed source.

Route both catch-block default branches through a new
`formatSubprocessFailure(err, label)` helper in `executor-shared.ts` that:

- strips the `Command failed: <cmd>` prefix line (which carries the body)
- prefers `err.stderr` — Bun/bash emit the actionable diagnostic there
- tail-truncates at 2 KB with a `…[truncated]` marker
- returns a controlled `logFields` subset (`exitCode`, `killed`, `stderrTail`)
  so Pino never re-serializes `err.message` / `err.stack` / `err.cmd`

Also drops the script-node `stderrHint` concatenation — stderr is already
handled by the helper, so the previous code appended it twice.

Timeout / ENOENT / EACCES branches are preserved verbatim.

Fixes coleam00#1389

* fix(workflows): address PR review feedback for coleam00#1389

- Run formatSubprocessFailure unconditionally so timeout / ENOENT / EACCES
  branches also get sanitized log fields (the timeout message also embeds
  the `Command failed: bash -c <body>` line).
- Drop `errType: err.constructor.name` (always 'Error' in production) and
  replace with `nodeType: 'bash' | 'script'` for actual discriminating value.
- Replace chained `||` + ternary in diagnostic selection with explicit
  if/else for readability.
- Simplify exit suffix guard: `err.code != null` instead of double typeof.
- Make stderrTail emptiness check explicit.
- Drop `RawSubprocessError` export (no external consumer) and widen `code`
  to `number | string | null` to mirror Node's ExecFileException.
- Tighten test length bound to <2100 (was <4000 / <2200) so a doubling of
  SUBPROCESS_ERROR_MAX_CHARS would actually trip the assertion.
- Replace broad regex assertion with `.toContain('[eval]')` — the location
  marker is the strongest signal that diagnostic content survived.
- Strip issue-number citations from describe block and test comments.
- Update script-nodes.md to distinguish stderr behavior on success vs
  failure paths.
- Add CHANGELOG Unreleased / Fixed entry for the user-visible change.
@prospapledge88 prospapledge88 merged commit 6013c03 into dev May 5, 2026
1 of 2 checks passed
@prospapledge88 prospapledge88 deleted the chore/cherry-pick-upstream-tier6-workflow-polish branch May 5, 2026 21:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants