fix: Windows HOME resolution and MCP plugin warning suppression by MrFadiAi · Pull Request #1134 · coleam00/Archon

MrFadiAi · 2026-04-12T18:49:30Z

Summary

Fix Windows path resolution bug where HOME environment variable is unset, causing CLI and server to fail loading ~/.archon/.env configuration
Suppress non-fatal MCP connection failures from user-level plugins (e.g., telegram) in headless workflow subprocesses — only surface warnings for workflow-configured MCP servers

Changes

Windows HOME Resolution (`@archon/cli`, `@archon/server`)

Replace process.env.HOME with getArchonHome() from @archon/paths for cross-platform config path resolution in both CLI and server entry points
On Windows, HOME is often unset — getArchonHome() correctly falls back to USERPROFILE or HOMEDRIVE+HOMEPATH

MCP Plugin Warning Suppression (`@archon/workflows`, `@archon/core`)

Track workflow-configured MCP server names in the DAG executor
Filter MCP connection failure messages to only surface warnings for servers explicitly configured via mcp: in workflow YAML
User-level plugin MCP failures (e.g., telegram) are logged at debug level instead of warn and not sent to the user
Pass settings: { enabledPlugins: {} } to Claude Agent SDK subprocess to attempt disabling user plugins

Test plan

Verified on Windows 11 where HOME is unset — CLI and server now correctly load ~/.archon/.env
Verified archon workflow run archon-assist "Say hello" completes without telegram MCP warning
TypeScript type-check passes for all packages
ESLint + Prettier pass (pre-commit hook)

🤖 Generated with Claude Code

Summary by CodeRabbit

Bug Fixes
- Improved MCP server connection error handling by distinguishing between workflow-level and plugin-level failures; only workflow failures are reported to users, while plugin failures are logged internally.
New Features
- Claude agent subprocess now exclusively uses explicitly configured MCP servers for enhanced predictability and control.

- Use getArchonHome() instead of process.env.HOME for cross-platform config path resolution in CLI and server entry points - Suppress non-fatal MCP connection failures from user-level plugins (e.g., telegram) in headless workflow subprocesses - Only surface MCP failures for workflow-configured servers Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai · 2026-04-12T18:49:44Z

📝 Walkthrough

Walkthrough

Updates env file path resolution across CLI, server, and tests to use a centralized getArchonHome() function. Disables user plugin execution in Claude subprocess by adding plugin settings. Modifies workflow executor to differentiate and filter MCP server connection failures between workflow-specific and plugin-specific sources.

Changes

Cohort / File(s)	Summary
Environment Path Standardization `packages/cli/src/cli.ts`, `packages/cli/src/cli.test.ts`, `packages/server/src/index.ts`	Replaced manual home directory resolution with centralized `getArchonHome()` function for computing the global `~/.archon/.env` file path. Updated imports and adjusted test setup to reflect the new path resolution method.
Claude Provider Configuration `packages/core/src/providers/claude.ts`	Added `settings: { enabledPlugins: {} }` to Claude Agent SDK options to disable user and plugin execution within the subprocess, restricting activity to supplied MCP servers only.
MCP Failure Filtering `packages/workflows/src/dag-executor.ts`	Implemented differentiation between workflow-specific and plugin-specific MCP server connection failures in system message handling. Only workflow MCP failures are surfaced to users via `safeSendMessage`; plugin failures are logged at debug level and suppressed from user output.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

fix: remove CWD env stripping, load ~/.archon/.env with override for binary support #1045: Modifies CLI and server env file loading to use getArchonHome() and alters override semantics for the global env file.
fix: strip CWD .env leak, enable platform adapters in serve, add first-event timeout #1092: Updates the same env-loading code paths in packages/cli/src/cli.ts and packages/server/src/index.ts for global env file resolution.

Poem

🐰 Paths now home, with getArchon's grace,
MCP failures sorted to their rightful place—
Plugins hush while workflows speak,
Claude's plugins rest, no longer meek!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The description covers the main problem, solution, and test plan, but is missing critical template sections including architecture diagrams, validation evidence, security impact, compatibility analysis, human verification details, and rollback plan.	Complete the description template by adding Architecture Diagrams (Before/After), Validation Evidence (test command outputs), Security Impact assessment, Compatibility/Migration details, Human Verification scope, Side Effects/Blast Radius analysis, and Rollback Plan.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically identifies the two main changes: Windows HOME resolution fix and MCP plugin warning suppression, matching the core objectives.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

packages/cli/src/cli.test.ts (1)

241-246: ⚠️ Potential issue | 🟠 Major

Test is environment-dependent and can be flaky.

This path now targets the real user ~/.archon/.env when present, so test outcomes can vary by machine/user file contents. Prefer a temp .env fixture to keep this deterministic.

Proposed deterministic test adjustment

   it('should allow ~/.archon/.env to override Bun-auto-loaded vars via override:true', async () => {
     const { config } = await import('dotenv');
-    const { resolve } = await import('path');
-    const { existsSync } = await import('fs');
+    const { resolve, join } = await import('path');
+    const { mkdtemp, writeFile, rm } = await import('fs/promises');
+    const { tmpdir } = await import('os');

     // Simulate Bun auto-loading a stale value
     process.env.TEST_ARCHON_OVERRIDE = 'from-cwd-env';

-    // Write a temporary env content and load with override
-    const globalEnvPath = resolve(getArchonHome(), '.env');
-    if (existsSync(globalEnvPath)) {
-      const result = config({ path: globalEnvPath, override: true });
-      // If ~/.archon/.env exists and has DATABASE_URL, it should override
-      expect(result.error).toBeUndefined();
-    }
+    const tempDir = await mkdtemp(join(tmpdir(), 'archon-cli-test-'));
+    const testEnvPath = resolve(tempDir, '.env');
+    await writeFile(testEnvPath, 'TEST_ARCHON_OVERRIDE=from-archon-env\n', 'utf8');
+    const result = config({ path: testEnvPath, override: true });
+    expect(result.error).toBeUndefined();
+    expect(process.env.TEST_ARCHON_OVERRIDE).toBe('from-archon-env');
+    await rm(tempDir, { recursive: true, force: true });

     // Clean up
     delete process.env.TEST_ARCHON_OVERRIDE;
   });

As per coding guidelines: "Keep tests deterministic — no flaky timing or network dependence without guardrails; ensure local validation commands (bun run validate) map directly to CI expectations (Determinism + Reproducibility)".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/cli/src/cli.test.ts` around lines 241 - 246, The test reads the real
user home (~/.archon/.env) via getArchonHome() and becomes flaky; instead create
a temporary directory and a temp .env fixture, write the expected DATABASE_URL
into that file, mock or override getArchonHome() to return the temp dir (or set
the env value used by getArchonHome()), then call resolve(getArchonHome(),
'.env') and load with config({ path: ..., override: true }) and assert
result.error is undefined; finally clean up the temp file/dir. Target symbols:
globalEnvPath, getArchonHome, resolve, existsSync, and config.

🧹 Nitpick comments (2)

packages/core/src/providers/claude.ts (1)

388-391: Scope plugin suppression to workflow/headless calls only.

This hard-codes plugin disabling for every ClaudeProvider call. If this provider is reused outside DAG/headless runs, it can unintentionally disable user-intended plugins there too. Prefer a typed AgentRequestOptions flag (for example, disableUserPlugins?: boolean) and apply settings.enabledPlugins only when that flag is set by workflow execution paths.

Proposed direction

-        settings: { enabledPlugins: {} },
+        ...(requestOptions?.disableUserPlugins === true
+          ? { settings: { enabledPlugins: {} } }
+          : {}),

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/providers/claude.ts` around lines 388 - 391, The provider
currently unconditionally injects settings.enabledPlugins into every
ClaudeProvider call; change this to a conditional based on a new optional
boolean flag on the request type (e.g., add disableUserPlugins?: boolean to
AgentRequestOptions) and only set settings: { enabledPlugins: {} } on the
outgoing request when AgentRequestOptions.disableUserPlugins === true (default
false). Update the types for AgentRequestOptions, the call site(s) that create
requests for workflow/headless runs to set disableUserPlugins: true, and the
ClaudeProvider request-building logic (referencing ClaudeProvider and
settings.enabledPlugins) so normal non-workflow calls keep user plugins enabled.

packages/workflows/src/dag-executor.ts (1)

1037-1071: Minor duplication in filtering logic.

The code filters failedEntries twice with the same split(' (')[0] operation. While this is readable and the arrays are small, a single-pass partition would be slightly more efficient.

♻️ Optional: single-pass partition

-          const workflowFailures = failedEntries.filter(entry => {
-            const name = entry.split(' (')[0];
-            return workflowMcpNames.has(name);
-          });
-
-          if (workflowFailures.length > 0) {
+          const workflowFailures: string[] = [];
+          const pluginFailures: string[] = [];
+          for (const entry of failedEntries) {
+            const name = entry.split(' (')[0];
+            if (workflowMcpNames.has(name)) {
+              workflowFailures.push(entry);
+            } else {
+              pluginFailures.push(entry);
+            }
+          }
+
+          if (workflowFailures.length > 0) {
             const workflowMsg = `MCP server connection failed: ${workflowFailures.join(', ')}`;
             getLog().warn(
               { nodeId: node.id, mcpStatus: workflowMsg },
               'dag.mcp_server_connection_failed'
             );
             const delivered = await safeSendMessage(
               platform,
               conversationId,
               workflowMsg,
               nodeContext
             );
             if (!delivered) {
               getLog().error(
                 { nodeId: node.id, mcpStatus: workflowMsg, workflowRunId: workflowRun.id },
                 'dag.mcp_connection_failure_delivery_failed'
               );
             }
           }
-          // User-plugin MCP failures: log at debug, don't surface to user
-          const pluginFailures = failedEntries.filter(entry => {
-            const name = entry.split(' (')[0];
-            return !workflowMcpNames.has(name);
-          });
+
           if (pluginFailures.length > 0) {

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/workflows/src/dag-executor.ts` around lines 1037 - 1071, The code
currently computes workflowFailures and pluginFailures by filtering
failedEntries twice using entry.split(' (')[0]; change this to a single-pass
partition over failedEntries (e.g., a for loop or Array.reduce) that extracts
const name = entry.split(' (')[0] once and pushes the entry into either
workflowFailures or pluginFailures based on workflowMcpNames.has(name); keep the
subsequent logic that logs workflowFailures, calls safeSendMessage, and logs
delivery errors unchanged, only replace the duplicate filtering with the
single-pass partition that produces workflowFailures and pluginFailures.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@packages/cli/src/cli.test.ts`:
- Around line 241-246: The test reads the real user home (~/.archon/.env) via
getArchonHome() and becomes flaky; instead create a temporary directory and a
temp .env fixture, write the expected DATABASE_URL into that file, mock or
override getArchonHome() to return the temp dir (or set the env value used by
getArchonHome()), then call resolve(getArchonHome(), '.env') and load with
config({ path: ..., override: true }) and assert result.error is undefined;
finally clean up the temp file/dir. Target symbols: globalEnvPath,
getArchonHome, resolve, existsSync, and config.

---

Nitpick comments:
In `@packages/core/src/providers/claude.ts`:
- Around line 388-391: The provider currently unconditionally injects
settings.enabledPlugins into every ClaudeProvider call; change this to a
conditional based on a new optional boolean flag on the request type (e.g., add
disableUserPlugins?: boolean to AgentRequestOptions) and only set settings: {
enabledPlugins: {} } on the outgoing request when
AgentRequestOptions.disableUserPlugins === true (default false). Update the
types for AgentRequestOptions, the call site(s) that create requests for
workflow/headless runs to set disableUserPlugins: true, and the ClaudeProvider
request-building logic (referencing ClaudeProvider and settings.enabledPlugins)
so normal non-workflow calls keep user plugins enabled.

In `@packages/workflows/src/dag-executor.ts`:
- Around line 1037-1071: The code currently computes workflowFailures and
pluginFailures by filtering failedEntries twice using entry.split(' (')[0];
change this to a single-pass partition over failedEntries (e.g., a for loop or
Array.reduce) that extracts const name = entry.split(' (')[0] once and pushes
the entry into either workflowFailures or pluginFailures based on
workflowMcpNames.has(name); keep the subsequent logic that logs
workflowFailures, calls safeSendMessage, and logs delivery errors unchanged,
only replace the duplicate filtering with the single-pass partition that
produces workflowFailures and pluginFailures.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c58d79d8-8e7b-46a7-a670-0e0fe9c298c3

📥 Commits

Reviewing files that changed from the base of the PR and between eb75ab6 and 43c0b97.

📒 Files selected for processing (5)

packages/cli/src/cli.test.ts
packages/cli/src/cli.ts
packages/core/src/providers/claude.ts
packages/server/src/index.ts
packages/workflows/src/dag-executor.ts

Wirasm · 2026-04-21T08:56:00Z

Thanks for raising this, @MrFadiAi — closing in favor of a fresh fix on current dev.

This PR is 9 days stale and all three changes have drifted:

Windows HOME fix — already shipped. packages/cli/src/cli.ts and packages/server/src/index.ts no longer reference process.env.HOME; PR CLI silently strips repo-local .env vars; logs say (0) from .env instead of stripped N keys #1302 moved env loading into loadArchonEnv() (@archon/paths/env-loader), which uses getArchonHome() internally and handles the Windows case (USERPROFILE / HOMEDRIVE+HOMEPATH / ARCHON_HOME override). No further change needed.
settings: { enabledPlugins: {} } in claude.ts — file moved. packages/core/src/providers/claude.ts was extracted to packages/providers/src/claude/provider.ts during the provider registry refactor, so the patch can't apply.
MCP plugin warning suppression in dag-executor.ts — core idea is good, but the surrounding code has been touched by feat(isolation,workflows): worktree location + per-workflow isolation policy #1310 and feat(paths,workflows): unify ~/.archon/{workflows,commands,scripts} + drop globalSearchPath (closes #1136) #1315 and needs a manual rebase.

The MCP noise filter (point 3) is the piece still worth landing. We'll re-do it on current dev with tests and post the follow-up PR here — you'll be credited via Co-authored-by: on the commit.

@MrFadiAi

…#1327) * fix(workflows): filter user-plugin MCP noise out of workflow warnings Before this change, the dag-executor surfaced every entry in the Claude SDK's "MCP server connection failed: …" system message to the user. That message includes user-level plugin MCPs inherited from ~/.claude/ (e.g. `telegram`) that fail to connect in the headless workflow subprocess — they're non-actionable noise for the workflow author. Fix: - Pre-compute the set of workflow-configured MCP server names per node by parsing the `mcp:` config file once at the start of executeNodeInternal. No caller-facing API change; no duplication of the provider's env-var expansion logic (we only need the keys). - Split the system-message handler: the `MCP server connection failed:` path now surfaces only the subset of failing names that match the node's configured set; user-plugin failures are debug-logged as `dag.mcp_plugin_connection_suppressed`. The `⚠️` branch is unchanged. Supersedes #1134 (closed as stale — the Windows HOME fix in that PR was already shipped via #1302, and the claude.ts enabledPlugins change targeted a file that has since moved into @archon/providers). Credits @MrFadiAi for identifying and reporting the underlying issue. Co-authored-by: Fadi Ai <MrFadiAi@users.noreply.github.com> * fix(workflows): preserve MCP failure status in filtered message + observability Address review feedback on PR #1327: - parseMcpFailureServerNames now returns {name, segment} entries so the forwarded "MCP server connection failed: ..." message preserves the per-server status detail (e.g. "(timeout)", "(disconnected)") that the bare-name reconstruction was dropping. - loadConfiguredMcpServerNames now debug-logs read failures as dag.mcp_filter_config_read_failed instead of swallowing them silently. A transient EMFILE/EBUSY at filter time would otherwise silently reclassify a real workflow-MCP failure as plugin noise. - Add 4 integration tests through executeDagWorkflow covering the mixed workflow/plugin split, all-plugin suppression, no-mcp:-config nodes, and the unchanged ⚠️ warning path. - Drop a WHAT comment above configuredMcpNames and a temporal phrase ("before this filter landed") that would rot on merge. - Document the filtering boundary in guides/mcp-servers.md and add a troubleshooting row for users debugging silently-suppressed plugin MCPs. --------- Co-authored-by: Fadi Ai <MrFadiAi@users.noreply.github.com>

coderabbitai Bot reviewed Apr 12, 2026

View reviewed changes

Wirasm closed this Apr 21, 2026

Wirasm mentioned this pull request Apr 21, 2026

fix(workflows): filter user-plugin MCP noise out of workflow warnings #1327

Merged

3 tasks

Wirasm mentioned this pull request Apr 22, 2026

Release 0.3.7 #1352

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: Windows HOME resolution and MCP plugin warning suppression#1134

fix: Windows HOME resolution and MCP plugin warning suppression#1134
MrFadiAi wants to merge 1 commit intocoleam00:devfrom
MrFadiAi:fix/windows-home-mcp-warnings

MrFadiAi commented Apr 12, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 12, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Wirasm commented Apr 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

MrFadiAi commented Apr 12, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Windows HOME Resolution (@archon/cli, @archon/server)

MCP Plugin Warning Suppression (@archon/workflows, @archon/core)

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Wirasm commented Apr 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

MrFadiAi commented Apr 12, 2026 •

edited by coderabbitai Bot

Loading

Windows HOME Resolution (`@archon/cli`, `@archon/server`)

MCP Plugin Warning Suppression (`@archon/workflows`, `@archon/core`)

coderabbitai Bot commented Apr 12, 2026 •

edited

Loading