fix(#67): validate cwd before subprocess.run to fix Windows WinError 267 by Knapp-Kevin · Pull Request #84 · BicameralAI/bicameral-mcp

Knapp-Kevin · 2026-04-28T19:30:57Z

Closes #67. Partially addresses #70 (the alpha_flow / phase3 / reset clusters in #70 cited #67 as a likely shared root cause; this PR confirms that and clears the WinError 267 portion).

Summary

~38 tests on Windows failed uniformly with NotADirectoryError [WinError 267] The directory name is invalid raised from subprocess.py:1554 (CreateProcess). Root cause: several subprocess wrappers passed cwd=Path(repo_path).resolve() without validating that repo_path was non-empty or pointed at an existing directory.

POSIX silently degraded to the test runner's CWD when repo_path was empty (often a real git repo, so the call appeared to "work" with garbage SHAs — a different bug class). Windows raised NotADirectoryError, which wasn't in the wrappers' except tuples, crashing the test session.

Fix (defense in depth)

handlers/ingest.py — inject ctx.repo_path into the payload before calling ledger.ingest_payload so the adapter never sees payload['repo'] == ''.
ledger/adapter.py::ingest_payload — fall back to ctx.repo_path when payload.repo is empty, and skip resolve_head entirely when no repo is known.
ledger/status.py::resolve_ref — validate that Path(repo_path).resolve() exists and is a directory before invoking subprocess; catch NotADirectoryError as a final safety net.
code_locator_runtime.py::_git_stdout — same validation pattern.
ledger/status.py + ledger/adapter.py — broaden every except (subprocess.TimeoutExpired, FileNotFoundError) tuple to include NotADirectoryError.

Tests

tests/test_subprocess_cwd_safety.py — 11 tests:

resolve_ref returns None on empty / nonexistent / file repo_path
_git_stdout returns "" on the same set of bad inputs
Empty ref short-circuits to None
Happy path (real git repo) still resolves HEAD to a 40-char SHA
Source-level invariant (parametrized): every typed-tuple except clause around subprocess.run in ledger/status.py, ledger/adapter.py, and code_locator_runtime.py must include NotADirectoryError. This static check fails the build if a future commit removes the guard from any affected file.

Local Windows verification

Test file	Before	After
`test_alpha_flow.py`	0/7 (all WinError 267)	5/7 (2 remaining are drift-detection logic, not subprocess)
`test_reset.py`	0/4 (all WinError 267)	4/4 ✅
`test_phase3_integration.py`	0/6 (all WinError 267)	0/6 (different failure class — code_locator index empty in fixture)
`test_subprocess_cwd_safety.py`	n/a	11/11 ✅

The WinError 267 cluster is gone. The remaining alpha_flow / phase3 / v0420_history failures are now visible as their own bug classes (drift logic, code_locator fixture state, HistoryDecision schema), which is the intended outcome — issue #70 (umbrella) can be re-triaged accordingly.

Test plan

resolve_ref no longer crashes on empty repo_path
_git_stdout no longer crashes on empty repo_path
Real git repos still resolve HEAD correctly
All 11 new safety tests pass on Windows
No POSIX behaviour change (the validation succeeds for any real directory; the new NotADirectoryError handling is a no-op when not raised)
CI Linux green

…s WinError 267 Issue BicameralAI#67: ~38 tests on Windows failed uniformly with ``NotADirectoryError [WinError 267] The directory name is invalid`` raised from ``subprocess.py:1554`` (CreateProcess). Root cause: several subprocess wrappers passed ``cwd=Path(repo_path).resolve()`` without validating ``repo_path`` was non-empty or pointed at an existing directory. POSIX silently degraded to the test runner's CWD when ``repo_path`` was empty (often a real git repo, so the call appeared to "work" with garbage SHAs — a different bug class). Windows raised ``NotADirectoryError`` from CreateProcess, which wasn't caught by the ``except (subprocess.TimeoutExpired, FileNotFoundError)`` tuples, crashing the test session. Fixes (defense in depth): 1. ``handlers/ingest.py`` — inject ``ctx.repo_path`` into the payload before calling ``ledger.ingest_payload`` so the adapter never sees ``payload['repo'] == ''``. 2. ``ledger/adapter.py::ingest_payload`` — fall back to ``ctx.repo_path`` when payload.repo is empty, and skip ``resolve_head`` entirely when no repo is known. 3. ``ledger/status.py::resolve_ref`` — validate that ``Path(repo_path).resolve()`` exists and is a directory before invoking subprocess. Add ``NotADirectoryError`` to the except tuple as a final safety net. 4. ``code_locator_runtime.py::_git_stdout`` — same pattern: validate the cwd is a real directory and catch ``NotADirectoryError``. 5. ``ledger/status.py`` and ``ledger/adapter.py`` — broaden every ``except (subprocess.TimeoutExpired, FileNotFoundError)`` tuple to include ``NotADirectoryError``. This is the cross-cutting "never let a bad cwd escalate" rule. Tests: - ``tests/test_subprocess_cwd_safety.py`` (11 tests): - ``resolve_ref`` returns ``None`` on empty / nonexistent / file repo_path - ``_git_stdout`` returns ``""`` on the same set of bad inputs - empty ``ref`` short-circuits to ``None`` - happy path (real git repo) still resolves HEAD to a 40-char SHA - source-level invariant: every typed-tuple except clause around ``subprocess.run`` must include ``NotADirectoryError``. This static check fails the build if a future commit removes the guard from any of the affected files. - Local Windows verification: - ``test_alpha_flow.py``: 0/7 → 5/7 passing (the 2 remaining are drift-detection logic bugs, NOT subprocess crashes — fall under issue BicameralAI#70). - ``test_reset.py``: 0/4 → 4/4 passing. - ``test_phase3_integration.py``: 0/6 → 0/6 (different failure class now — code_locator index empty in the fixture). - ``test_v0420_history.py``: 7/9 (2 failures are HistoryDecision schema mismatch, also not subprocess-related). The WinError 267 cluster is gone. The remaining alpha_flow / phase3 / v0420_history failures are now visible as their own bug classes, which is the intended outcome — issue BicameralAI#70 (umbrella) can be re-triaged accordingly.

coderabbitai · 2026-04-28T19:31:04Z

Warning

Rate limit exceeded

@Knapp-Kevin has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 51 minutes and 54 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fdb10b93-0e19-4958-a428-6708fe3be0f7

📥 Commits

Reviewing files that changed from the base of the PR and between a340750 and 8a007f8.

📒 Files selected for processing (5)

code_locator_runtime.py
handlers/ingest.py
ledger/adapter.py
ledger/status.py
tests/test_subprocess_cwd_safety.py

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

The new dev integration workflow ("everything pushes and merges to dev first, then PRs from dev to main upon Jin's approval") needs CI to run on PRs targeting dev — not just main. Without this, retargeted PRs (#73, #79–#84) never get a green badge and have to be merged on local verification only. Updates 3 workflows: MCP Regression Tests, Preflight Eval, Schema Persistence. All other path filters retained. Direct push to dev (not via PR) — no CI exists yet to run on this file's own PR (chicken-and-egg). Subsequent PRs to dev will inherit the new triggers.

…-to-dev labeller (#102) * chore: add ruff + mypy lint stack + Windows test matrix + secret scan + merged-to-dev labeller (CI Phase 1) Implements Phase 1 of docs/DEV_CYCLE.md §4.5.4 per plan-ci-phase-1.md (rev 2, PASS verdict). Five atomic changes land together so the new CI gates light up on the next PR run: 1. pyproject.toml — declare ruff>=0.5.0 + mypy>=1.10.0 in [project.optional-dependencies].test, plus minimal [tool.ruff] / [tool.mypy] config. Lint scope: E/F/W/I/B/UP. Tests/scripts get per-file-ignores so day-one CI is green. Mypy is lenient (ignore_missing_imports, warn_return_any=false) with per-module ignore_errors=true overrides for the 16 noisiest modules — full type coverage chipped away in follow-up PRs. 2. .github/workflows/test-mcp-regression.yml — convert single-runner job to ubuntu-latest + windows-latest matrix with fail-fast: false and a job-level timeout-minutes: 20. The pull_request: trigger is left untouched (no types: added). BICAMERAL_SKIP_CONSENT_NOTICE='1' added to job env so non-interactive CI doesn't stall on the consent prompt. Windows is expected green given the fcntl + subprocess fixes already on dev (#80, #84). 3. .github/workflows/lint-and-typecheck.yml (new) — ruff check + ruff format --check + mypy on pull_request to main/dev. 4. .github/workflows/secret-scan.yml (new) — gitleaks/gitleaks-action@v2 with fetch-depth: 0 so the diff range is fully scannable. Triggers on pull_request to main/dev. 5. .github/workflows/label-merged-to-dev.yml (new — separate workflow, NOT a job in test-mcp-regression.yml). Triggered only on pull_request: branches: [dev], types: [closed] with if: github.event.pull_request.merged == true. Minimal permissions (issues: write, pull-requests: read). actions/github-script@v7 parses GitHub close-keywords from the PR body and applies the merged-to-dev label to each referenced issue. This is the audit V1 fix — keeping the labeller in its own file means test-mcp-regression.yml's existing trigger semantics cannot regress. Branch-protection rules to require these checks remain a manual GitHub UI step (admin-only) — see PR description. Lint hygiene fixes shipped alongside the workflow plumbing: - handlers/update.py: add `from pathlib import Path` (was used unimported). - ledger/status.py: drop unused line_count local. - ledger/queries.py: noqa-annotate the intentional non-top-level import. - 213 ruff --fix auto-corrections across the tree (sorted imports, dropped unused imports, datetime.UTC, PEP 585/604 annotation modernisation, etc.). Refs: docs/DEV_CYCLE.md §4.5.4 Phase 1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: ruff format pass Apply ruff format across the tree to satisfy `ruff format --check .` in the new lint-and-typecheck workflow. No semantic changes — pure whitespace, line wrapping, and trailing-comma normalisation. Split from the previous CI Phase 1 commit so the workflow plumbing diff stays readable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ci): trufflehog instead of gitleaks (org license) + Linux-only eval steps Two CI failures on PR #102's first run: 1. Gitleaks fails with "missing license. Go grab one at gitleaks.io" — gitleaks-action@v2 requires a paid license for organizations as of the 2023 breaking update. Switch to trufflesecurity/trufflehog@main, which is free for all repos and has equivalent detection coverage. Use --only-verified to keep noise low. 2. Windows matrix job fails on the Generate E2E report step ("No artifacts found at .../test-results/e2e — run Phase 3 tests first"). The medusa corpus and M1 adversarial eval are Linux-only by design (bash shell, ANTHROPIC_API_KEY-gated, large corpus clone). Gate the corpus clone, the M1 secret probe, and the M1 adversarial step plus the Generate E2E report step on matrix.os == 'ubuntu-latest'. The Windows job continues to run the full pytest suite (the actual regression value) plus uploads its own artifacts via the matrix-suffixed name. Artifact name now includes matrix.os so both runs upload distinct results without overwriting each other. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: ruff format inbound from #100 merge The fixed test_desync_scenarios.py from PR #100 wasn't ruff-formatted (ruff didn't exist in CI when #100 ran). After merging dev forward, apply the format pass. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Knapp-Kevin temporarily deployed to ci-test April 28, 2026 19:31 — with GitHub Actions Inactive

Knapp-Kevin mentioned this pull request Apr 28, 2026

AssertionError cluster: ~20 tests across ast_diff/alpha_*/desync_scenarios may share root causes (Windows test env or stale fixtures) #70

Closed

Knapp-Kevin changed the base branch from main to dev April 28, 2026 20:23

Knapp-Kevin added the bug Something isn't working label Apr 28, 2026

Knapp-Kevin merged commit f94d74c into BicameralAI:dev Apr 28, 2026
4 checks passed

This was referenced Apr 29, 2026

chore: merge main into dev (v0.13.3 telemetry refactor → dev) #94

Merged

chore: CI Phase 1 — Windows matrix + ruff/mypy + secret scan + merged-to-dev labeller #102

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(#67): validate cwd before subprocess.run to fix Windows WinError 267#84

fix(#67): validate cwd before subprocess.run to fix Windows WinError 267#84
Knapp-Kevin merged 1 commit into
BicameralAI:devfrom
Knapp-Kevin:fix/67-subprocess-cwd-windows

Knapp-Kevin commented Apr 28, 2026

Uh oh!

coderabbitai Bot commented Apr 28, 2026

Rate limit exceeded

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Knapp-Kevin commented Apr 28, 2026

Summary

Fix (defense in depth)

Tests

Local Windows verification

Test plan

Uh oh!

coderabbitai Bot commented Apr 28, 2026

Rate limit exceeded

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant