fix: Windows HOME resolution and MCP plugin warning suppression#1134
fix: Windows HOME resolution and MCP plugin warning suppression#1134MrFadiAi wants to merge 1 commit intocoleam00:devfrom
Conversation
- Use getArchonHome() instead of process.env.HOME for cross-platform config path resolution in CLI and server entry points - Suppress non-fatal MCP connection failures from user-level plugins (e.g., telegram) in headless workflow subprocesses - Only surface MCP failures for workflow-configured servers Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 WalkthroughWalkthroughUpdates env file path resolution across CLI, server, and tests to use a centralized Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/cli/src/cli.test.ts (1)
241-246:⚠️ Potential issue | 🟠 MajorTest is environment-dependent and can be flaky.
This path now targets the real user
~/.archon/.envwhen present, so test outcomes can vary by machine/user file contents. Prefer a temp.envfixture to keep this deterministic.Proposed deterministic test adjustment
it('should allow ~/.archon/.env to override Bun-auto-loaded vars via override:true', async () => { const { config } = await import('dotenv'); - const { resolve } = await import('path'); - const { existsSync } = await import('fs'); + const { resolve, join } = await import('path'); + const { mkdtemp, writeFile, rm } = await import('fs/promises'); + const { tmpdir } = await import('os'); // Simulate Bun auto-loading a stale value process.env.TEST_ARCHON_OVERRIDE = 'from-cwd-env'; - // Write a temporary env content and load with override - const globalEnvPath = resolve(getArchonHome(), '.env'); - if (existsSync(globalEnvPath)) { - const result = config({ path: globalEnvPath, override: true }); - // If ~/.archon/.env exists and has DATABASE_URL, it should override - expect(result.error).toBeUndefined(); - } + const tempDir = await mkdtemp(join(tmpdir(), 'archon-cli-test-')); + const testEnvPath = resolve(tempDir, '.env'); + await writeFile(testEnvPath, 'TEST_ARCHON_OVERRIDE=from-archon-env\n', 'utf8'); + const result = config({ path: testEnvPath, override: true }); + expect(result.error).toBeUndefined(); + expect(process.env.TEST_ARCHON_OVERRIDE).toBe('from-archon-env'); + await rm(tempDir, { recursive: true, force: true }); // Clean up delete process.env.TEST_ARCHON_OVERRIDE; });As per coding guidelines: "Keep tests deterministic — no flaky timing or network dependence without guardrails; ensure local validation commands (
bun run validate) map directly to CI expectations (Determinism + Reproducibility)".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/cli/src/cli.test.ts` around lines 241 - 246, The test reads the real user home (~/.archon/.env) via getArchonHome() and becomes flaky; instead create a temporary directory and a temp .env fixture, write the expected DATABASE_URL into that file, mock or override getArchonHome() to return the temp dir (or set the env value used by getArchonHome()), then call resolve(getArchonHome(), '.env') and load with config({ path: ..., override: true }) and assert result.error is undefined; finally clean up the temp file/dir. Target symbols: globalEnvPath, getArchonHome, resolve, existsSync, and config.
🧹 Nitpick comments (2)
packages/core/src/providers/claude.ts (1)
388-391: Scope plugin suppression to workflow/headless calls only.This hard-codes plugin disabling for every
ClaudeProvidercall. If this provider is reused outside DAG/headless runs, it can unintentionally disable user-intended plugins there too. Prefer a typedAgentRequestOptionsflag (for example,disableUserPlugins?: boolean) and applysettings.enabledPluginsonly when that flag is set by workflow execution paths.Proposed direction
- settings: { enabledPlugins: {} }, + ...(requestOptions?.disableUserPlugins === true + ? { settings: { enabledPlugins: {} } } + : {}),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/core/src/providers/claude.ts` around lines 388 - 391, The provider currently unconditionally injects settings.enabledPlugins into every ClaudeProvider call; change this to a conditional based on a new optional boolean flag on the request type (e.g., add disableUserPlugins?: boolean to AgentRequestOptions) and only set settings: { enabledPlugins: {} } on the outgoing request when AgentRequestOptions.disableUserPlugins === true (default false). Update the types for AgentRequestOptions, the call site(s) that create requests for workflow/headless runs to set disableUserPlugins: true, and the ClaudeProvider request-building logic (referencing ClaudeProvider and settings.enabledPlugins) so normal non-workflow calls keep user plugins enabled.packages/workflows/src/dag-executor.ts (1)
1037-1071: Minor duplication in filtering logic.The code filters
failedEntriestwice with the samesplit(' (')[0]operation. While this is readable and the arrays are small, a single-pass partition would be slightly more efficient.♻️ Optional: single-pass partition
- const workflowFailures = failedEntries.filter(entry => { - const name = entry.split(' (')[0]; - return workflowMcpNames.has(name); - }); - - if (workflowFailures.length > 0) { + const workflowFailures: string[] = []; + const pluginFailures: string[] = []; + for (const entry of failedEntries) { + const name = entry.split(' (')[0]; + if (workflowMcpNames.has(name)) { + workflowFailures.push(entry); + } else { + pluginFailures.push(entry); + } + } + + if (workflowFailures.length > 0) { const workflowMsg = `MCP server connection failed: ${workflowFailures.join(', ')}`; getLog().warn( { nodeId: node.id, mcpStatus: workflowMsg }, 'dag.mcp_server_connection_failed' ); const delivered = await safeSendMessage( platform, conversationId, workflowMsg, nodeContext ); if (!delivered) { getLog().error( { nodeId: node.id, mcpStatus: workflowMsg, workflowRunId: workflowRun.id }, 'dag.mcp_connection_failure_delivery_failed' ); } } - // User-plugin MCP failures: log at debug, don't surface to user - const pluginFailures = failedEntries.filter(entry => { - const name = entry.split(' (')[0]; - return !workflowMcpNames.has(name); - }); + if (pluginFailures.length > 0) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/workflows/src/dag-executor.ts` around lines 1037 - 1071, The code currently computes workflowFailures and pluginFailures by filtering failedEntries twice using entry.split(' (')[0]; change this to a single-pass partition over failedEntries (e.g., a for loop or Array.reduce) that extracts const name = entry.split(' (')[0] once and pushes the entry into either workflowFailures or pluginFailures based on workflowMcpNames.has(name); keep the subsequent logic that logs workflowFailures, calls safeSendMessage, and logs delivery errors unchanged, only replace the duplicate filtering with the single-pass partition that produces workflowFailures and pluginFailures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@packages/cli/src/cli.test.ts`:
- Around line 241-246: The test reads the real user home (~/.archon/.env) via
getArchonHome() and becomes flaky; instead create a temporary directory and a
temp .env fixture, write the expected DATABASE_URL into that file, mock or
override getArchonHome() to return the temp dir (or set the env value used by
getArchonHome()), then call resolve(getArchonHome(), '.env') and load with
config({ path: ..., override: true }) and assert result.error is undefined;
finally clean up the temp file/dir. Target symbols: globalEnvPath,
getArchonHome, resolve, existsSync, and config.
---
Nitpick comments:
In `@packages/core/src/providers/claude.ts`:
- Around line 388-391: The provider currently unconditionally injects
settings.enabledPlugins into every ClaudeProvider call; change this to a
conditional based on a new optional boolean flag on the request type (e.g., add
disableUserPlugins?: boolean to AgentRequestOptions) and only set settings: {
enabledPlugins: {} } on the outgoing request when
AgentRequestOptions.disableUserPlugins === true (default false). Update the
types for AgentRequestOptions, the call site(s) that create requests for
workflow/headless runs to set disableUserPlugins: true, and the ClaudeProvider
request-building logic (referencing ClaudeProvider and settings.enabledPlugins)
so normal non-workflow calls keep user plugins enabled.
In `@packages/workflows/src/dag-executor.ts`:
- Around line 1037-1071: The code currently computes workflowFailures and
pluginFailures by filtering failedEntries twice using entry.split(' (')[0];
change this to a single-pass partition over failedEntries (e.g., a for loop or
Array.reduce) that extracts const name = entry.split(' (')[0] once and pushes
the entry into either workflowFailures or pluginFailures based on
workflowMcpNames.has(name); keep the subsequent logic that logs
workflowFailures, calls safeSendMessage, and logs delivery errors unchanged,
only replace the duplicate filtering with the single-pass partition that
produces workflowFailures and pluginFailures.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c58d79d8-8e7b-46a7-a670-0e0fe9c298c3
📒 Files selected for processing (5)
packages/cli/src/cli.test.tspackages/cli/src/cli.tspackages/core/src/providers/claude.tspackages/server/src/index.tspackages/workflows/src/dag-executor.ts
|
Thanks for raising this, @MrFadiAi — closing in favor of a fresh fix on current This PR is 9 days stale and all three changes have drifted:
The MCP noise filter (point 3) is the piece still worth landing. We'll re-do it on current |
…#1327) * fix(workflows): filter user-plugin MCP noise out of workflow warnings Before this change, the dag-executor surfaced every entry in the Claude SDK's "MCP server connection failed: …" system message to the user. That message includes user-level plugin MCPs inherited from ~/.claude/ (e.g. `telegram`) that fail to connect in the headless workflow subprocess — they're non-actionable noise for the workflow author. Fix: - Pre-compute the set of workflow-configured MCP server names per node by parsing the `mcp:` config file once at the start of executeNodeInternal. No caller-facing API change; no duplication of the provider's env-var expansion logic (we only need the keys). - Split the system-message handler: the `MCP server connection failed:` path now surfaces only the subset of failing names that match the node's configured set; user-plugin failures are debug-logged as `dag.mcp_plugin_connection_suppressed`. The `⚠️ ` branch is unchanged. Supersedes #1134 (closed as stale — the Windows HOME fix in that PR was already shipped via #1302, and the claude.ts enabledPlugins change targeted a file that has since moved into @archon/providers). Credits @MrFadiAi for identifying and reporting the underlying issue. Co-authored-by: Fadi Ai <MrFadiAi@users.noreply.github.com> * fix(workflows): preserve MCP failure status in filtered message + observability Address review feedback on PR #1327: - parseMcpFailureServerNames now returns {name, segment} entries so the forwarded "MCP server connection failed: ..." message preserves the per-server status detail (e.g. "(timeout)", "(disconnected)") that the bare-name reconstruction was dropping. - loadConfiguredMcpServerNames now debug-logs read failures as dag.mcp_filter_config_read_failed instead of swallowing them silently. A transient EMFILE/EBUSY at filter time would otherwise silently reclassify a real workflow-MCP failure as plugin noise. - Add 4 integration tests through executeDagWorkflow covering the mixed workflow/plugin split, all-plugin suppression, no-mcp:-config nodes, and the unchanged⚠️ warning path. - Drop a WHAT comment above configuredMcpNames and a temporal phrase ("before this filter landed") that would rot on merge. - Document the filtering boundary in guides/mcp-servers.md and add a troubleshooting row for users debugging silently-suppressed plugin MCPs. --------- Co-authored-by: Fadi Ai <MrFadiAi@users.noreply.github.com>
Summary
HOMEenvironment variable is unset, causing CLI and server to fail loading~/.archon/.envconfigurationChanges
Windows HOME Resolution (
@archon/cli,@archon/server)process.env.HOMEwithgetArchonHome()from@archon/pathsfor cross-platform config path resolution in both CLI and server entry pointsHOMEis often unset —getArchonHome()correctly falls back toUSERPROFILEorHOMEDRIVE+HOMEPATHMCP Plugin Warning Suppression (
@archon/workflows,@archon/core)mcp:in workflow YAMLdebuglevel instead ofwarnand not sent to the usersettings: { enabledPlugins: {} }to Claude Agent SDK subprocess to attempt disabling user pluginsTest plan
HOMEis unset — CLI and server now correctly load~/.archon/.envarchon workflow run archon-assist "Say hello"completes without telegram MCP warning🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
New Features