fix(#67): validate cwd before subprocess.run to fix Windows WinError 267#84
Conversation
…s WinError 267 Issue BicameralAI#67: ~38 tests on Windows failed uniformly with ``NotADirectoryError [WinError 267] The directory name is invalid`` raised from ``subprocess.py:1554`` (CreateProcess). Root cause: several subprocess wrappers passed ``cwd=Path(repo_path).resolve()`` without validating ``repo_path`` was non-empty or pointed at an existing directory. POSIX silently degraded to the test runner's CWD when ``repo_path`` was empty (often a real git repo, so the call appeared to "work" with garbage SHAs — a different bug class). Windows raised ``NotADirectoryError`` from CreateProcess, which wasn't caught by the ``except (subprocess.TimeoutExpired, FileNotFoundError)`` tuples, crashing the test session. Fixes (defense in depth): 1. ``handlers/ingest.py`` — inject ``ctx.repo_path`` into the payload before calling ``ledger.ingest_payload`` so the adapter never sees ``payload['repo'] == ''``. 2. ``ledger/adapter.py::ingest_payload`` — fall back to ``ctx.repo_path`` when payload.repo is empty, and skip ``resolve_head`` entirely when no repo is known. 3. ``ledger/status.py::resolve_ref`` — validate that ``Path(repo_path).resolve()`` exists and is a directory before invoking subprocess. Add ``NotADirectoryError`` to the except tuple as a final safety net. 4. ``code_locator_runtime.py::_git_stdout`` — same pattern: validate the cwd is a real directory and catch ``NotADirectoryError``. 5. ``ledger/status.py`` and ``ledger/adapter.py`` — broaden every ``except (subprocess.TimeoutExpired, FileNotFoundError)`` tuple to include ``NotADirectoryError``. This is the cross-cutting "never let a bad cwd escalate" rule. Tests: - ``tests/test_subprocess_cwd_safety.py`` (11 tests): - ``resolve_ref`` returns ``None`` on empty / nonexistent / file repo_path - ``_git_stdout`` returns ``""`` on the same set of bad inputs - empty ``ref`` short-circuits to ``None`` - happy path (real git repo) still resolves HEAD to a 40-char SHA - source-level invariant: every typed-tuple except clause around ``subprocess.run`` must include ``NotADirectoryError``. This static check fails the build if a future commit removes the guard from any of the affected files. - Local Windows verification: - ``test_alpha_flow.py``: 0/7 → 5/7 passing (the 2 remaining are drift-detection logic bugs, NOT subprocess crashes — fall under issue BicameralAI#70). - ``test_reset.py``: 0/4 → 4/4 passing. - ``test_phase3_integration.py``: 0/6 → 0/6 (different failure class now — code_locator index empty in the fixture). - ``test_v0420_history.py``: 7/9 (2 failures are HistoryDecision schema mismatch, also not subprocess-related). The WinError 267 cluster is gone. The remaining alpha_flow / phase3 / v0420_history failures are now visible as their own bug classes, which is the intended outcome — issue BicameralAI#70 (umbrella) can be re-triaged accordingly.
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
The new dev integration workflow ("everything pushes and merges to dev
first, then PRs from dev to main upon Jin's approval") needs CI to run
on PRs targeting dev — not just main. Without this, retargeted PRs
(#73, #79–#84) never get a green badge and have to be merged on local
verification only.
Updates 3 workflows: MCP Regression Tests, Preflight Eval, Schema
Persistence. All other path filters retained.
Direct push to dev (not via PR) — no CI exists yet to run on this
file's own PR (chicken-and-egg). Subsequent PRs to dev will inherit
the new triggers.
…-to-dev labeller (#102) * chore: add ruff + mypy lint stack + Windows test matrix + secret scan + merged-to-dev labeller (CI Phase 1) Implements Phase 1 of docs/DEV_CYCLE.md §4.5.4 per plan-ci-phase-1.md (rev 2, PASS verdict). Five atomic changes land together so the new CI gates light up on the next PR run: 1. pyproject.toml — declare ruff>=0.5.0 + mypy>=1.10.0 in [project.optional-dependencies].test, plus minimal [tool.ruff] / [tool.mypy] config. Lint scope: E/F/W/I/B/UP. Tests/scripts get per-file-ignores so day-one CI is green. Mypy is lenient (ignore_missing_imports, warn_return_any=false) with per-module ignore_errors=true overrides for the 16 noisiest modules — full type coverage chipped away in follow-up PRs. 2. .github/workflows/test-mcp-regression.yml — convert single-runner job to ubuntu-latest + windows-latest matrix with fail-fast: false and a job-level timeout-minutes: 20. The pull_request: trigger is left untouched (no types: added). BICAMERAL_SKIP_CONSENT_NOTICE='1' added to job env so non-interactive CI doesn't stall on the consent prompt. Windows is expected green given the fcntl + subprocess fixes already on dev (#80, #84). 3. .github/workflows/lint-and-typecheck.yml (new) — ruff check + ruff format --check + mypy on pull_request to main/dev. 4. .github/workflows/secret-scan.yml (new) — gitleaks/gitleaks-action@v2 with fetch-depth: 0 so the diff range is fully scannable. Triggers on pull_request to main/dev. 5. .github/workflows/label-merged-to-dev.yml (new — separate workflow, NOT a job in test-mcp-regression.yml). Triggered only on pull_request: branches: [dev], types: [closed] with if: github.event.pull_request.merged == true. Minimal permissions (issues: write, pull-requests: read). actions/github-script@v7 parses GitHub close-keywords from the PR body and applies the merged-to-dev label to each referenced issue. This is the audit V1 fix — keeping the labeller in its own file means test-mcp-regression.yml's existing trigger semantics cannot regress. Branch-protection rules to require these checks remain a manual GitHub UI step (admin-only) — see PR description. Lint hygiene fixes shipped alongside the workflow plumbing: - handlers/update.py: add `from pathlib import Path` (was used unimported). - ledger/status.py: drop unused line_count local. - ledger/queries.py: noqa-annotate the intentional non-top-level import. - 213 ruff --fix auto-corrections across the tree (sorted imports, dropped unused imports, datetime.UTC, PEP 585/604 annotation modernisation, etc.). Refs: docs/DEV_CYCLE.md §4.5.4 Phase 1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: ruff format pass Apply ruff format across the tree to satisfy `ruff format --check .` in the new lint-and-typecheck workflow. No semantic changes — pure whitespace, line wrapping, and trailing-comma normalisation. Split from the previous CI Phase 1 commit so the workflow plumbing diff stays readable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ci): trufflehog instead of gitleaks (org license) + Linux-only eval steps Two CI failures on PR #102's first run: 1. Gitleaks fails with "missing license. Go grab one at gitleaks.io" — gitleaks-action@v2 requires a paid license for organizations as of the 2023 breaking update. Switch to trufflesecurity/trufflehog@main, which is free for all repos and has equivalent detection coverage. Use --only-verified to keep noise low. 2. Windows matrix job fails on the Generate E2E report step ("No artifacts found at .../test-results/e2e — run Phase 3 tests first"). The medusa corpus and M1 adversarial eval are Linux-only by design (bash shell, ANTHROPIC_API_KEY-gated, large corpus clone). Gate the corpus clone, the M1 secret probe, and the M1 adversarial step plus the Generate E2E report step on matrix.os == 'ubuntu-latest'. The Windows job continues to run the full pytest suite (the actual regression value) plus uploads its own artifacts via the matrix-suffixed name. Artifact name now includes matrix.os so both runs upload distinct results without overwriting each other. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: ruff format inbound from #100 merge The fixed test_desync_scenarios.py from PR #100 wasn't ruff-formatted (ruff didn't exist in CI when #100 ran). After merging dev forward, apply the format pass. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes #67. Partially addresses #70 (the alpha_flow / phase3 / reset clusters in #70 cited #67 as a likely shared root cause; this PR confirms that and clears the WinError 267 portion).
Summary
~38 tests on Windows failed uniformly with
NotADirectoryError [WinError 267] The directory name is invalidraised fromsubprocess.py:1554(CreateProcess). Root cause: several subprocess wrappers passedcwd=Path(repo_path).resolve()without validating thatrepo_pathwas non-empty or pointed at an existing directory.POSIX silently degraded to the test runner's CWD when
repo_pathwas empty (often a real git repo, so the call appeared to "work" with garbage SHAs — a different bug class). Windows raisedNotADirectoryError, which wasn't in the wrappers'excepttuples, crashing the test session.Fix (defense in depth)
handlers/ingest.py— injectctx.repo_pathinto the payload before callingledger.ingest_payloadso the adapter never seespayload['repo'] == ''.ledger/adapter.py::ingest_payload— fall back toctx.repo_pathwhenpayload.repois empty, and skipresolve_headentirely when no repo is known.ledger/status.py::resolve_ref— validate thatPath(repo_path).resolve()exists and is a directory before invoking subprocess; catchNotADirectoryErroras a final safety net.code_locator_runtime.py::_git_stdout— same validation pattern.ledger/status.py+ledger/adapter.py— broaden everyexcept (subprocess.TimeoutExpired, FileNotFoundError)tuple to includeNotADirectoryError.Tests
tests/test_subprocess_cwd_safety.py— 11 tests:resolve_refreturnsNoneon empty / nonexistent / file repo_path_git_stdoutreturns""on the same set of bad inputsrefshort-circuits toNoneexceptclause aroundsubprocess.runinledger/status.py,ledger/adapter.py, andcode_locator_runtime.pymust includeNotADirectoryError. This static check fails the build if a future commit removes the guard from any affected file.Local Windows verification
test_alpha_flow.pytest_reset.pytest_phase3_integration.pytest_subprocess_cwd_safety.pyThe WinError 267 cluster is gone. The remaining alpha_flow / phase3 / v0420_history failures are now visible as their own bug classes (drift logic, code_locator fixture state, HistoryDecision schema), which is the intended outcome — issue #70 (umbrella) can be re-triaged accordingly.
Test plan
resolve_refno longer crashes on emptyrepo_path_git_stdoutno longer crashes on emptyrepo_pathNotADirectoryErrorhandling is a no-op when not raised)