fix: stale test cluster - 5 orthogonal fixes (#70) #100

Merged: Knapp-Kevin merged 1 commit into BicameralAI:dev from Knapp-Kevin:fix/issue-70-test-cluster on Apr 29, 2026

@Knapp-Kevin (Collaborator)

Summary

Resolves the 9-test AssertionError cluster from #70 via 5 orthogonal fixes (zero file overlap):

  1. server.py: strip "SurrealDB" jargon from the bicameral.reset description (jargon-hygiene contract).
  2. tests/test_bind.py: mock ledger.status.get_git_content in test_bind_idempotent and test_bind_status_transition so the bind handler's file-existence check is satisfied without real git content.
  3. tests/test_desync_scenarios.py: test_scenario_06_code_added_ungrounded_resolvable now refreshes ctx.authoritative_sha to the new HEAD after the in-test commit (via object.__setattr__, since BicameralContext is a frozen dataclass).
  4. tests/test_sync_middleware.py: test_ensure_skips_link_commit_when_already_synced now monkeypatches the module-level _LAST_SYNCED_SHA instead of ctx._sync_state (which the middleware never reads).
  5. tests/test_v0420_history.py: assertions updated from the singular dec.fulfillment to the plural dec.fulfillments list contract per the v0.4.20 API.
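
Fix 3 relies on a detail of frozen dataclasses that is easy to miss: normal attribute assignment raises, but `object.__setattr__` bypasses the guard. The sketch below illustrates that mechanism only. `Context` and `refresh_sha` are hypothetical stand-ins, not the real `BicameralContext` or any helper from this repository.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical stand-in for BicameralContext; only one field is modeled."""

    authoritative_sha: str


def refresh_sha(ctx: Context, new_head: str) -> None:
    # Bypass the frozen dataclass's __setattr__ guard. Reasonable inside a
    # test that needs to simulate a new HEAD; not a pattern for product code.
    object.__setattr__(ctx, "authoritative_sha", new_head)


ctx = Context(authoritative_sha="aaa111")

try:
    ctx.authoritative_sha = "bbb222"  # plain assignment is rejected
    assignment_blocked = False
except FrozenInstanceError:
    assignment_blocked = True

# Simulate "the test made a commit, so HEAD moved".
refresh_sha(ctx, "bbb222")
```

An alternative would be `dataclasses.replace(ctx, authoritative_sha=...)`, which returns a new instance; that only helps if every holder of the old context can be handed the new one, which is presumably why the test mutates in place.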

Verification

pytest tests/test_v0417_jargon_hygiene.py tests/test_bind.py tests/test_desync_scenarios.py tests/test_sync_middleware.py tests/test_v0420_history.py -v

All 9 previously-failing tests now pass. Two pre-existing failures remain (test_no_backend_jargon_in_skill_files, test_bind_success_with_explicit_lines) — both reproduce on upstream/dev without our changes and are out of scope for this PR.

No product behavior change. Audit verdict was PASS (.agent/staging/AUDIT_REPORT_70.md).

Closes #70

- server.py: strip "SurrealDB" jargon from bicameral.reset description
- test_bind.py: mock get_git_content for idempotency + status transition tests
- test_desync_scenarios.py: refresh ctx.authoritative_sha post-commit
- test_sync_middleware.py: patch module-level _LAST_SYNCED_SHA, not ctx state
- test_v0420_history.py: update assertions to plural `fulfillments` list contract

All 5 fixes are orthogonal (zero file overlap). 9 previously-failing tests
now pass. No product behavior change.
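
The test_sync_middleware.py fix listed above comes down to patching the state the code under test actually reads. A minimal sketch, assuming a middleware that caches the last synced SHA in a module-level global: the module, `ensure_synced`, and the `last_synced_sha` key are all hypothetical, and only the name `_LAST_SYNCED_SHA` comes from this PR. It uses `unittest.mock.patch.object` so it runs without pytest; in a pytest test, `monkeypatch.setattr(module, "_LAST_SYNCED_SHA", sha)` plays the same role.

```python
import types
from unittest import mock

# Build a tiny throwaway module so the example is self-contained.
middleware = types.ModuleType("fake_sync_middleware")
exec(
    '''
_LAST_SYNCED_SHA = None
calls = []

def ensure_synced(ctx, head_sha):
    """Skip the (pretend) link-commit step when head_sha was already synced."""
    global _LAST_SYNCED_SHA
    if _LAST_SYNCED_SHA == head_sha:
        return "skipped"
    calls.append(head_sha)  # stands in for the expensive sync work
    _LAST_SYNCED_SHA = head_sha
    return "linked"
''',
    middleware.__dict__,
)

# Hypothetical context object: the middleware never reads _sync_state.
ctx = types.SimpleNamespace(_sync_state={"last_synced_sha": "abc123"})

# State on ctx alone changes nothing, so the sync still runs.
with mock.patch.object(middleware, "_LAST_SYNCED_SHA", None):
    result_ctx_only = middleware.ensure_synced(ctx, "abc123")

# Patching the module-level global is what exercises the skip path.
with mock.patch.object(middleware, "_LAST_SYNCED_SHA", "abc123"):
    result_patched = middleware.ensure_synced(ctx, "abc123")
```

Because `patch.object` restores the original value on exit, the global does not leak between tests, which is the usual hazard of module-level caches.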

Closes BicameralAI#70

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Apr 29, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

Knapp-Kevin merged commit ffbf39b into BicameralAI:dev on Apr 29, 2026
2 checks passed
Knapp-Kevin added a commit to Knapp-Kevin/bicameral-mcp that referenced this pull request Apr 29, 2026
The fixed test_desync_scenarios.py from PR BicameralAI#100 wasn't ruff-formatted
(ruff didn't exist in CI when BicameralAI#100 ran). After merging dev forward,
apply the format pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Knapp-Kevin added a commit that referenced this pull request Apr 29, 2026
…-to-dev labeller (#102)

* chore: add ruff + mypy lint stack + Windows test matrix + secret scan + merged-to-dev labeller (CI Phase 1)

Implements Phase 1 of docs/DEV_CYCLE.md §4.5.4 per plan-ci-phase-1.md (rev 2,
PASS verdict). Five atomic changes land together so the new CI gates light up
on the next PR run:

1. pyproject.toml — declare ruff>=0.5.0 + mypy>=1.10.0 in
   [project.optional-dependencies].test, plus minimal [tool.ruff] /
   [tool.mypy] config. Lint scope: E/F/W/I/B/UP. Tests/scripts get
   per-file-ignores so day-one CI is green. Mypy is lenient
   (ignore_missing_imports, warn_return_any=false) with per-module
   ignore_errors=true overrides for the 16 noisiest modules — full type
   coverage chipped away in follow-up PRs.

2. .github/workflows/test-mcp-regression.yml — convert single-runner job
   to ubuntu-latest + windows-latest matrix with fail-fast: false and a
   job-level timeout-minutes: 20. The pull_request: trigger is left
   untouched (no types: added). BICAMERAL_SKIP_CONSENT_NOTICE='1' added
   to job env so non-interactive CI doesn't stall on the consent prompt.
   Windows is expected green given the fcntl + subprocess fixes already
   on dev (#80, #84).

3. .github/workflows/lint-and-typecheck.yml (new) — ruff check +
   ruff format --check + mypy on pull_request to main/dev.

4. .github/workflows/secret-scan.yml (new) — gitleaks/gitleaks-action@v2
   with fetch-depth: 0 so the diff range is fully scannable. Triggers on
   pull_request to main/dev.

5. .github/workflows/label-merged-to-dev.yml (new — separate workflow,
   NOT a job in test-mcp-regression.yml). Triggered only on
   pull_request: branches: [dev], types: [closed] with
   if: github.event.pull_request.merged == true. Minimal permissions
   (issues: write, pull-requests: read). actions/github-script@v7 parses
   GitHub close-keywords from the PR body and applies the merged-to-dev
   label to each referenced issue. This is the audit V1 fix — keeping
   the labeller in its own file means test-mcp-regression.yml's existing
   trigger semantics cannot regress.
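
The close-keyword parsing described in item 5 can be illustrated in a few lines. The workflow itself runs JavaScript through actions/github-script; the Python below is only a sketch of the parsing idea, and the regex and the `referenced_issues` name are assumptions rather than the workflow's actual code.

```python
import re

# GitHub's documented closing keywords: close/closes/closed, fix/fixes/fixed,
# resolve/resolves/resolved, optionally followed by a colon, then an issue
# reference of the form "#123" or "owner/repo#123".
CLOSE_KEYWORD_RE = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\b:?\s+(?:[\w.-]+/[\w.-]+)?#(\d+)",
    re.IGNORECASE,
)


def referenced_issues(pr_body: str) -> list[int]:
    """Return issue numbers named by close-keywords, in order, without repeats."""
    seen: list[int] = []
    for match in CLOSE_KEYWORD_RE.finditer(pr_body):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen


body = (
    "Resolves the 9-test cluster from #70 via 5 orthogonal fixes.\n"
    "\n"
    "Closes #70\n"
    "fixes: #12, see also #99\n"
)
issues = referenced_issues(body)
```

Note that the first sentence does not match: "Resolves" is not immediately followed by an issue reference, so only the explicit `Closes #70` and `fixes: #12` lines count, and the bare mention of `#99` is ignored.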

Branch-protection rules to require these checks remain a manual GitHub
UI step (admin-only) — see PR description.

Lint hygiene fixes shipped alongside the workflow plumbing:
- handlers/update.py: add `from pathlib import Path` (was used unimported).
- ledger/status.py: drop unused line_count local.
- ledger/queries.py: noqa-annotate the intentional non-top-level import.
- 213 ruff --fix auto-corrections across the tree (sorted imports, dropped
  unused imports, datetime.UTC, PEP 585/604 annotation modernisation, etc.).

Refs: docs/DEV_CYCLE.md §4.5.4 Phase 1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: ruff format pass

Apply ruff format across the tree to satisfy `ruff format --check .` in
the new lint-and-typecheck workflow. No semantic changes — pure
whitespace, line wrapping, and trailing-comma normalisation.

Split from the previous CI Phase 1 commit so the workflow plumbing diff
stays readable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ci): trufflehog instead of gitleaks (org license) + Linux-only eval steps

Two CI failures on PR #102's first run:

1. Gitleaks fails with "missing license. Go grab one at gitleaks.io" —
   gitleaks-action@v2 requires a paid license for organizations as of
   the 2023 breaking update. Switch to trufflesecurity/trufflehog@main,
   which is free for all repos and has equivalent detection coverage.
   Use --only-verified to keep noise low.

2. Windows matrix job fails on the Generate E2E report step ("No artifacts
   found at .../test-results/e2e — run Phase 3 tests first"). The medusa
   corpus and M1 adversarial eval are Linux-only by design (bash shell,
   ANTHROPIC_API_KEY-gated, large corpus clone). Gate the corpus clone,
   the M1 secret probe, and the M1 adversarial step plus the Generate
   E2E report step on matrix.os == 'ubuntu-latest'. The Windows job
   continues to run the full pytest suite (the actual regression value)
   plus uploads its own artifacts via the matrix-suffixed name.

Artifact name now includes matrix.os so both runs upload distinct
results without overwriting each other.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: ruff format inbound from #100 merge

The fixed test_desync_scenarios.py from PR #100 wasn't ruff-formatted
(ruff didn't exist in CI when #100 ran). After merging dev forward,
apply the format pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Knapp-Kevin added the flow:feature label (Standard feature/fix PR targeting BicameralAI/dev, the default flow) on Apr 29, 2026