Skip to content

fix(#67): validate cwd before subprocess.run to fix Windows WinError 267#84

Merged
Knapp-Kevin merged 1 commit into
BicameralAI:devfrom
Knapp-Kevin:fix/67-subprocess-cwd-windows
Apr 28, 2026
Merged

fix(#67): validate cwd before subprocess.run to fix Windows WinError 267#84
Knapp-Kevin merged 1 commit into
BicameralAI:devfrom
Knapp-Kevin:fix/67-subprocess-cwd-windows

Conversation

@Knapp-Kevin

Copy link
Copy Markdown
Collaborator

Closes #67. Partially addresses #70 (the alpha_flow / phase3 / reset clusters in #70 cited #67 as a likely shared root cause; this PR confirms that and clears the WinError 267 portion).

Summary

~38 tests on Windows failed uniformly with NotADirectoryError [WinError 267] The directory name is invalid raised from subprocess.py:1554 (CreateProcess). Root cause: several subprocess wrappers passed cwd=Path(repo_path).resolve() without validating that repo_path was non-empty or pointed at an existing directory.

POSIX silently degraded to the test runner's CWD when repo_path was empty (often a real git repo, so the call appeared to "work" with garbage SHAs — a different bug class). Windows raised NotADirectoryError, which wasn't in the wrappers' except tuples, crashing the test session.

Fix (defense in depth)

  1. handlers/ingest.py — inject ctx.repo_path into the payload before calling ledger.ingest_payload so the adapter never sees payload['repo'] == ''.
  2. ledger/adapter.py::ingest_payload — fall back to ctx.repo_path when payload.repo is empty, and skip resolve_head entirely when no repo is known.
  3. ledger/status.py::resolve_ref — validate that Path(repo_path).resolve() exists and is a directory before invoking subprocess; catch NotADirectoryError as a final safety net.
  4. code_locator_runtime.py::_git_stdout — same validation pattern.
  5. ledger/status.py + ledger/adapter.py — broaden every except (subprocess.TimeoutExpired, FileNotFoundError) tuple to include NotADirectoryError.

Tests

tests/test_subprocess_cwd_safety.py — 11 tests:

  • resolve_ref returns None on empty / nonexistent / file repo_path
  • _git_stdout returns "" on the same set of bad inputs
  • Empty ref short-circuits to None
  • Happy path (real git repo) still resolves HEAD to a 40-char SHA
  • Source-level invariant (parametrized): every typed-tuple except clause around subprocess.run in ledger/status.py, ledger/adapter.py, and code_locator_runtime.py must include NotADirectoryError. This static check fails the build if a future commit removes the guard from any affected file.

Local Windows verification

Test file Before After
test_alpha_flow.py 0/7 (all WinError 267) 5/7 (2 remaining are drift-detection logic, not subprocess)
test_reset.py 0/4 (all WinError 267) 4/4 ✅
test_phase3_integration.py 0/6 (all WinError 267) 0/6 (different failure class — code_locator index empty in fixture)
test_subprocess_cwd_safety.py n/a 11/11 ✅

The WinError 267 cluster is gone. The remaining alpha_flow / phase3 / v0420_history failures are now visible as their own bug classes (drift logic, code_locator fixture state, HistoryDecision schema), which is the intended outcome — issue #70 (umbrella) can be re-triaged accordingly.

Test plan

  • resolve_ref no longer crashes on empty repo_path
  • _git_stdout no longer crashes on empty repo_path
  • Real git repos still resolve HEAD correctly
  • All 11 new safety tests pass on Windows
  • No POSIX behaviour change (the validation succeeds for any real directory; the new NotADirectoryError handling is a no-op when not raised)
  • CI Linux green

…s WinError 267

Issue BicameralAI#67: ~38 tests on Windows failed uniformly with
``NotADirectoryError [WinError 267] The directory name is invalid``
raised from ``subprocess.py:1554`` (CreateProcess). Root cause:
several subprocess wrappers passed ``cwd=Path(repo_path).resolve()``
without validating ``repo_path`` was non-empty or pointed at an
existing directory.

POSIX silently degraded to the test runner's CWD when ``repo_path``
was empty (often a real git repo, so the call appeared to "work" with
garbage SHAs — a different bug class). Windows raised
``NotADirectoryError`` from CreateProcess, which wasn't caught by the
``except (subprocess.TimeoutExpired, FileNotFoundError)`` tuples,
crashing the test session.

Fixes (defense in depth):

1. ``handlers/ingest.py`` — inject ``ctx.repo_path`` into the payload
   before calling ``ledger.ingest_payload`` so the adapter never sees
   ``payload['repo'] == ''``.

2. ``ledger/adapter.py::ingest_payload`` — fall back to
   ``ctx.repo_path`` when payload.repo is empty, and skip
   ``resolve_head`` entirely when no repo is known.

3. ``ledger/status.py::resolve_ref`` — validate that
   ``Path(repo_path).resolve()`` exists and is a directory before
   invoking subprocess. Add ``NotADirectoryError`` to the except
   tuple as a final safety net.

4. ``code_locator_runtime.py::_git_stdout`` — same pattern: validate
   the cwd is a real directory and catch ``NotADirectoryError``.

5. ``ledger/status.py`` and ``ledger/adapter.py`` — broaden every
   ``except (subprocess.TimeoutExpired, FileNotFoundError)`` tuple
   to include ``NotADirectoryError``. This is the cross-cutting
   "never let a bad cwd escalate" rule.

Tests:

- ``tests/test_subprocess_cwd_safety.py`` (11 tests):
  - ``resolve_ref`` returns ``None`` on empty / nonexistent / file
    repo_path
  - ``_git_stdout`` returns ``""`` on the same set of bad inputs
  - empty ``ref`` short-circuits to ``None``
  - happy path (real git repo) still resolves HEAD to a 40-char SHA
  - source-level invariant: every typed-tuple except clause around
    ``subprocess.run`` must include ``NotADirectoryError``. This
    static check fails the build if a future commit removes the
    guard from any of the affected files.

- Local Windows verification:
  - ``test_alpha_flow.py``: 0/7 → 5/7 passing (the 2 remaining are
    drift-detection logic bugs, NOT subprocess crashes — fall under
    issue BicameralAI#70).
  - ``test_reset.py``: 0/4 → 4/4 passing.
  - ``test_phase3_integration.py``: 0/6 → 0/6 (different failure
    class now — code_locator index empty in the fixture).
  - ``test_v0420_history.py``: 7/9 (2 failures are HistoryDecision
    schema mismatch, also not subprocess-related).

The WinError 267 cluster is gone. The remaining alpha_flow / phase3 /
v0420_history failures are now visible as their own bug classes,
which is the intended outcome — issue BicameralAI#70 (umbrella) can be
re-triaged accordingly.
@coderabbitai

coderabbitai Bot commented Apr 28, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@Knapp-Kevin has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 51 minutes and 54 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fdb10b93-0e19-4958-a428-6708fe3be0f7

📥 Commits

Reviewing files that changed from the base of the PR and between a340750 and 8a007f8.

📒 Files selected for processing (5)
  • code_locator_runtime.py
  • handlers/ingest.py
  • ledger/adapter.py
  • ledger/status.py
  • tests/test_subprocess_cwd_safety.py
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Knapp-Kevin Knapp-Kevin changed the base branch from main to dev April 28, 2026 20:23
@Knapp-Kevin Knapp-Kevin added the bug Something isn't working label Apr 28, 2026
Knapp-Kevin added a commit that referenced this pull request Apr 28, 2026
The new dev integration workflow ("everything pushes and merges to dev
first, then PRs from dev to main upon Jin's approval") needs CI to run
on PRs targeting dev — not just main. Without this, retargeted PRs
(#73, #79#84) never get a green badge and have to be merged on local
verification only.

Updates 3 workflows: MCP Regression Tests, Preflight Eval, Schema
Persistence. All other path filters retained.

Direct push to dev (not via PR) — no CI exists yet to run on this
file's own PR (chicken-and-egg). Subsequent PRs to dev will inherit
the new triggers.
@Knapp-Kevin Knapp-Kevin merged commit f94d74c into BicameralAI:dev Apr 28, 2026
4 checks passed
Knapp-Kevin added a commit that referenced this pull request Apr 29, 2026
…-to-dev labeller (#102)

* chore: add ruff + mypy lint stack + Windows test matrix + secret scan + merged-to-dev labeller (CI Phase 1)

Implements Phase 1 of docs/DEV_CYCLE.md §4.5.4 per plan-ci-phase-1.md (rev 2,
PASS verdict). Five atomic changes land together so the new CI gates light up
on the next PR run:

1. pyproject.toml — declare ruff>=0.5.0 + mypy>=1.10.0 in
   [project.optional-dependencies].test, plus minimal [tool.ruff] /
   [tool.mypy] config. Lint scope: E/F/W/I/B/UP. Tests/scripts get
   per-file-ignores so day-one CI is green. Mypy is lenient
   (ignore_missing_imports, warn_return_any=false) with per-module
   ignore_errors=true overrides for the 16 noisiest modules — full type
   coverage chipped away in follow-up PRs.

2. .github/workflows/test-mcp-regression.yml — convert single-runner job
   to ubuntu-latest + windows-latest matrix with fail-fast: false and a
   job-level timeout-minutes: 20. The pull_request: trigger is left
   untouched (no types: added). BICAMERAL_SKIP_CONSENT_NOTICE='1' added
   to job env so non-interactive CI doesn't stall on the consent prompt.
   Windows is expected green given the fcntl + subprocess fixes already
   on dev (#80, #84).

3. .github/workflows/lint-and-typecheck.yml (new) — ruff check +
   ruff format --check + mypy on pull_request to main/dev.

4. .github/workflows/secret-scan.yml (new) — gitleaks/gitleaks-action@v2
   with fetch-depth: 0 so the diff range is fully scannable. Triggers on
   pull_request to main/dev.

5. .github/workflows/label-merged-to-dev.yml (new — separate workflow,
   NOT a job in test-mcp-regression.yml). Triggered only on
   pull_request: branches: [dev], types: [closed] with
   if: github.event.pull_request.merged == true. Minimal permissions
   (issues: write, pull-requests: read). actions/github-script@v7 parses
   GitHub close-keywords from the PR body and applies the merged-to-dev
   label to each referenced issue. This is the audit V1 fix — keeping
   the labeller in its own file means test-mcp-regression.yml's existing
   trigger semantics cannot regress.

Branch-protection rules to require these checks remain a manual GitHub
UI step (admin-only) — see PR description.

Lint hygiene fixes shipped alongside the workflow plumbing:
- handlers/update.py: add `from pathlib import Path` (was used unimported).
- ledger/status.py: drop unused line_count local.
- ledger/queries.py: noqa-annotate the intentional non-top-level import.
- 213 ruff --fix auto-corrections across the tree (sorted imports, dropped
  unused imports, datetime.UTC, PEP 585/604 annotation modernisation, etc.).

Refs: docs/DEV_CYCLE.md §4.5.4 Phase 1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: ruff format pass

Apply ruff format across the tree to satisfy `ruff format --check .` in
the new lint-and-typecheck workflow. No semantic changes — pure
whitespace, line wrapping, and trailing-comma normalisation.

Split from the previous CI Phase 1 commit so the workflow plumbing diff
stays readable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ci): trufflehog instead of gitleaks (org license) + Linux-only eval steps

Two CI failures on PR #102's first run:

1. Gitleaks fails with "missing license. Go grab one at gitleaks.io" —
   gitleaks-action@v2 requires a paid license for organizations as of
   the 2023 breaking update. Switch to trufflesecurity/trufflehog@main,
   which is free for all repos and has equivalent detection coverage.
   Use --only-verified to keep noise low.

2. Windows matrix job fails on the Generate E2E report step ("No artifacts
   found at .../test-results/e2e — run Phase 3 tests first"). The medusa
   corpus and M1 adversarial eval are Linux-only by design (bash shell,
   ANTHROPIC_API_KEY-gated, large corpus clone). Gate the corpus clone,
   the M1 secret probe, and the M1 adversarial step plus the Generate
   E2E report step on matrix.os == 'ubuntu-latest'. The Windows job
   continues to run the full pytest suite (the actual regression value)
   plus uploads its own artifacts via the matrix-suffixed name.

Artifact name now includes matrix.os so both runs upload distinct
results without overwriting each other.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: ruff format inbound from #100 merge

The fixed test_desync_scenarios.py from PR #100 wasn't ruff-formatted
(ruff didn't exist in CI when #100 ran). After merging dev forward,
apply the format pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Test suite: 38 NotADirectoryError [WinError 267] failures on Windows from subprocess.py:1554

1 participant