chore: CI Phase 1 — Windows matrix + ruff/mypy + secret scan + merged-to-dev labeller#102

Merged
Knapp-Kevin merged 5 commits into BicameralAI:dev from Knapp-Kevin:chore/ci-phase-1-tier-1-gates
Apr 29, 2026

@Knapp-Kevin

Summary

Implements CI Phase 1 from docs/DEV_CYCLE.md §4.5.4 per plan-ci-phase-1.md (rev 2 — PASS verdict, see AUDIT_REPORT_CI_PHASE_1.md). Five atomic changes shipped as one PR so all Tier 1 gates light up together on the next PR run.

  • Phase 1 — pyproject.toml: declare ruff>=0.5.0 + mypy>=1.10.0 in [project.optional-dependencies].test; add minimal [tool.ruff] (E/F/W/I/B/UP) and [tool.mypy] (lenient: ignore_missing_imports=true, warn_return_any=false) blocks. Day-one CI is kept green via per-file-ignores for tests/** and scripts/**, plus [[tool.mypy.overrides]] ignore_errors = true for 16 noisy modules.
  • Phase 2 — test-mcp-regression.yml: convert single-runner job to [ubuntu-latest, windows-latest] matrix with fail-fast: false and job-level timeout-minutes: 20. The pull_request: trigger is untouched (no types: added — audit V1 fix preserved). Adds BICAMERAL_SKIP_CONSENT_NOTICE='1' so non-interactive CI doesn't stall.
  • Phase 3 — lint-and-typecheck.yml (new): three steps — ruff check ., ruff format --check ., mypy .. Triggers on pull_request: branches: [main, dev].
  • Phase 4 — secret-scan.yml (new): single gitleaks/gitleaks-action@v2 job with actions/checkout@v4 + fetch-depth: 0. Triggers on pull_request: branches: [main, dev].
  • Phase 5 — label-merged-to-dev.yml (new, separate workflow file — NOT a job in test-mcp-regression.yml): pull_request: branches: [dev], types: [closed], if: github.event.pull_request.merged == true, minimal permissions (issues: write, pull-requests: read), actions/github-script@v7 regex-based labeller per the plan.
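
For illustration, the close-keyword matching that the Phase 5 labeller performs can be sketched in Python. This is only a sketch: the real workflow runs JavaScript inside actions/github-script@v7, and the regex and the `referenced_issues` helper below are assumptions rather than the code shipped in label-merged-to-dev.yml.

```python
import re

# GitHub's documented close keywords: close/closes/closed, fix/fixes/fixed,
# resolve/resolves/resolved. An optional colon may follow the keyword.
# The exact pattern used by the real labeller is not shown in this PR,
# so this regex is an assumption.
CLOSE_KEYWORD_RE = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?):?\s+#(\d+)",
    re.IGNORECASE,
)


def referenced_issues(pr_body: str) -> list[int]:
    """Return issue numbers referenced by close keywords, in first-seen order."""
    seen: dict[int, None] = {}
    for match in CLOSE_KEYWORD_RE.finditer(pr_body or ""):
        seen.setdefault(int(match.group(1)), None)
    return list(seen)
```

A workflow following this approach would apply the merged-to-dev label to each returned issue number. Plain mentions such as "see #99" carry no close keyword and are ignored.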

Two commits:

  1. ab36d85: workflow plumbing + lint enablement (213 ruff --fix auto-corrections + 3 manual fixes for the residual F821 / E402 / F841 errors).
  2. f10ec05: ruff format pass (125 files reformatted; whitespace, line wrapping, and trailing-comma normalisation only, no semantic changes).

Required follow-up — branch protection (manual)

After this PR lands, a repo admin must configure branch-protection rules on dev to require:

  • Lint & Type Check / ruff + mypy
  • MCP Regression Suite (ubuntu-latest)
  • MCP Regression Suite (windows-latest)
  • Schema Persistence Tests / Schema Persistence Smoke Test
  • Secret Scan / Gitleaks

Branch protection cannot be encoded in this PR — it requires admin access to the GitHub UI. See plan-ci-phase-1.md §"Branch protection (manual)".

Test plan

  • ruff check . — green (All checks passed!)
  • ruff format --check . — green (153 files already formatted)
  • mypy . — green (Success: no issues found in 80 source files)
  • pytest tests/test_alpha_contract.py — 5 passed locally
  • CI shows two MCP Regression Suite jobs (ubuntu + windows). Windows matrix is expected green given the fcntl + subprocess Windows fixes already on dev: #80 (fix(#74): make events.writer cross-platform, POSIX fcntl + Windows msvcrt) and #84 (fix(#67): validate cwd before subprocess.run to fix Windows WinError 267).
  • CI shows new Lint & Type Check / ruff + mypy job — green.
  • CI shows new Secret Scan / Gitleaks job — green.
  • After merge: open a small PR with Closes #X body, merge it, observe merged-to-dev label appear on #X. Regression workflow MUST NOT re-fire on close.

Notes

  • Mypy day-one: 75 errors across 16 modules suppressed via [[tool.mypy.overrides]] ignore_errors = true to keep CI green. Strict typing is a follow-up project tracked separately — do not remove suppressions without first fixing the underlying errors.
  • Ruff day-one: 251 of 272 initial findings auto-fixed via ruff --fix. The remaining 21 (mostly F821 / F841 / E712 in tests) are masked via [tool.ruff.lint.per-file-ignores] for tests/** and scripts/**. Three production-code findings fixed manually (handlers/update.py, ledger/queries.py, ledger/status.py).

References

  • Plan: plan-ci-phase-1.md (rev 2)
  • Audit: .agent/staging/AUDIT_REPORT_CI_PHASE_1.md (PASS verdict, V1 remediation verified)
  • Spec: docs/DEV_CYCLE.md §4.5.4 Phase 1

🤖 Generated with Claude Code

Knapp-Kevin and others added 2 commits April 29, 2026 12:38
… + merged-to-dev labeller (CI Phase 1)

Implements Phase 1 of docs/DEV_CYCLE.md §4.5.4 per plan-ci-phase-1.md (rev 2,
PASS verdict). Five atomic changes land together so the new CI gates light up
on the next PR run:

1. pyproject.toml — declare ruff>=0.5.0 + mypy>=1.10.0 in
   [project.optional-dependencies].test, plus minimal [tool.ruff] /
   [tool.mypy] config. Lint scope: E/F/W/I/B/UP. Tests/scripts get
   per-file-ignores so day-one CI is green. Mypy is lenient
   (ignore_missing_imports, warn_return_any=false) with per-module
   ignore_errors=true overrides for the 16 noisiest modules — full type
   coverage chipped away in follow-up PRs.

2. .github/workflows/test-mcp-regression.yml — convert single-runner job
   to ubuntu-latest + windows-latest matrix with fail-fast: false and a
   job-level timeout-minutes: 20. The pull_request: trigger is left
   untouched (no types: added). BICAMERAL_SKIP_CONSENT_NOTICE='1' added
   to job env so non-interactive CI doesn't stall on the consent prompt.
   Windows is expected green given the fcntl + subprocess fixes already
   on dev (BicameralAI#80, BicameralAI#84).

3. .github/workflows/lint-and-typecheck.yml (new) — ruff check +
   ruff format --check + mypy on pull_request to main/dev.

4. .github/workflows/secret-scan.yml (new) — gitleaks/gitleaks-action@v2
   with fetch-depth: 0 so the diff range is fully scannable. Triggers on
   pull_request to main/dev.

5. .github/workflows/label-merged-to-dev.yml (new — separate workflow,
   NOT a job in test-mcp-regression.yml). Triggered only on
   pull_request: branches: [dev], types: [closed] with
   if: github.event.pull_request.merged == true. Minimal permissions
   (issues: write, pull-requests: read). actions/github-script@v7 parses
   GitHub close-keywords from the PR body and applies the merged-to-dev
   label to each referenced issue. This is the audit V1 fix — keeping
   the labeller in its own file means test-mcp-regression.yml's existing
   trigger semantics cannot regress.

Branch-protection rules to require these checks remain a manual GitHub
UI step (admin-only) — see PR description.

Lint hygiene fixes shipped alongside the workflow plumbing:
- handlers/update.py: add `from pathlib import Path` (was used unimported).
- ledger/status.py: drop unused line_count local.
- ledger/queries.py: noqa-annotate the intentional non-top-level import.
- 213 ruff --fix auto-corrections across the tree (sorted imports, dropped
  unused imports, datetime.UTC, PEP 585/604 annotation modernisation, etc.).

Refs: docs/DEV_CYCLE.md §4.5.4 Phase 1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Apply ruff format across the tree to satisfy `ruff format --check .` in
the new lint-and-typecheck workflow. No semantic changes — pure
whitespace, line wrapping, and trailing-comma normalisation.

Split from the previous CI Phase 1 commit so the workflow plumbing diff
stays readable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 29, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ae7c1ecb-9211-4e4c-877b-ff15c920b64f

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

…al steps

Two CI failures on PR BicameralAI#102's first run:

1. Gitleaks fails with "missing license. Go grab one at gitleaks.io" —
   gitleaks-action@v2 requires a paid license for organizations as of
   the 2023 breaking update. Switch to trufflesecurity/trufflehog@main,
   which is free for all repos and has equivalent detection coverage.
   Use --only-verified to keep noise low.

2. Windows matrix job fails on the Generate E2E report step ("No artifacts
   found at .../test-results/e2e — run Phase 3 tests first"). The medusa
   corpus and M1 adversarial eval are Linux-only by design (bash shell,
   ANTHROPIC_API_KEY-gated, large corpus clone). Gate the corpus clone,
   the M1 secret probe, and the M1 adversarial step plus the Generate
   E2E report step on matrix.os == 'ubuntu-latest'. The Windows job
   continues to run the full pytest suite (the actual regression value)
   plus uploads its own artifacts via the matrix-suffixed name.

Artifact name now includes matrix.os so both runs upload distinct
results without overwriting each other.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Knapp-Kevin and others added 2 commits April 29, 2026 12:51
The fixed test_desync_scenarios.py from PR BicameralAI#100 wasn't ruff-formatted
(ruff didn't exist in CI when BicameralAI#100 ran). After merging dev forward,
apply the format pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Knapp-Kevin merged commit 4bbe57d into BicameralAI:dev Apr 29, 2026
7 checks passed
Knapp-Kevin added a commit to Knapp-Kevin/bicameral-mcp that referenced this pull request Apr 29, 2026
The Tier 1 lint gate from BicameralAI#102 caught 32 stylistic findings on this
branch (22 in the new test files plus 10 in pre-existing files):
- timezone.utc → datetime.UTC alias (UP017; datetime.UTC was added in Python 3.11)
- import sorting (I001)
- 12 files needing ruff format

All auto-fixable. No behavior change. 28 telemetry tests still pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Knapp-Kevin added the flow:feature label (Standard feature/fix PR targeting BicameralAI/dev, the default flow) Apr 29, 2026
Knapp-Kevin added a commit that referenced this pull request Apr 29, 2026
* feat: preflight telemetry capture loop pieces 1–4 (v0.15.0, #65)

Adds opt-in local-only preflight telemetry — captures preflight events
and downstream tool engagement for failure-mode triage. Default off;
hashed by default; raw via separate env var.

New module: preflight_telemetry.py
  - Salt at ~/.bicameral/salt (mode 0o600), per-install, race-safe init
  - hash_topic, hash_file_paths (order-independent set hash)
  - new_preflight_id (UUIDv4)
  - write_preflight_event, write_engagement (JSONL append, mode 0o600)
  - _maybe_rotate (50MB / 30 days, keeps last 5)
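
One way an order-independent, salted set hash such as hash_file_paths could work is sketched below. The real preflight_telemetry.py is not shown here, so the signature, the choice of SHA-256, and the length-prefix framing are all assumptions.

```python
import hashlib
from collections.abc import Iterable


def hash_file_paths_sketch(paths: Iterable[str], salt: bytes) -> str:
    """Hash a set of paths so that input order and duplicates do not matter."""
    digest = hashlib.sha256(salt)
    # Deduplicate and sort so every ordering of the same set hashes identically.
    for path in sorted(set(paths)):
        encoded = path.encode("utf-8")
        # Length-prefix each entry so ["ab", "c"] and ["a", "bc"] differ.
        digest.update(len(encoded).to_bytes(4, "big"))
        digest.update(encoded)
    return digest.hexdigest()
```

Because the per-install salt is mixed in first, the same file set yields different digests on different installs, which is what keeps hashed telemetry from being trivially correlated across machines.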

preflight_id plumb-through:
  - PreflightResponse, LinkCommitResponse, BindResponse, RatifyResponse
    gain optional preflight_id: str | None field
  - update.py dict returns also gain preflight_id key (11 sites)
  - server.py inputSchema for affected tools accepts optional preflight_id

Pieces 5 (SessionEnd reconciliation skill) and 6 (triage CLI) are
deferred to follow-up plans #65-pt2 and #65-pt3.

Closes #65 (pieces 1–4)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: ruff check --fix + format pass

The Tier 1 lint gate from #102 caught 32 stylistic findings on this
branch (22 in the new test files plus 10 in pre-existing files):
- timezone.utc → datetime.UTC alias (UP017; datetime.UTC was added in Python 3.11)
- import sorting (I001)
- 12 files needing ruff format

All auto-fixable. No behavior change. 28 telemetry tests still pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(types): correct return type on local_counters._open_for_append_secure

mypy flagged the os.PathLike return type as incompatible with the
actual BufferedWriter from os.fdopen. Use typing.IO[bytes] which is
what the with-block consumes anyway. Pure type fix; no behavior change.
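
The shape of this type fix can be sketched as follows. The body of the real local_counters._open_for_append_secure is not shown in this commit, so the function below is a hypothetical stand-in that only demonstrates why `typing.IO[bytes]` matches what `os.fdopen` returns in binary append mode.

```python
import os
from typing import IO


def open_for_append_secure_sketch(path: str) -> IO[bytes]:
    """Open path for binary append, creating it with mode 0o600 if missing."""
    flags = os.O_WRONLY | os.O_APPEND | os.O_CREAT
    fd = os.open(path, flags, 0o600)
    # os.fdopen in "ab" mode returns a buffered binary writer, which
    # satisfies IO[bytes]; annotating it as os.PathLike would be wrong.
    return os.fdopen(fd, "ab")
```

Callers can use the result directly in a with-block and write bytes to it, for example one JSONL record per call.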

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Knapp-Kevin added a commit that referenced this pull request Apr 29, 2026
Fast-follow lint hygiene PR after #96 merged with 8 ruff failures still on its HEAD. Dev's ruff+mypy gate (#102) was red on 5f773e6; this PR clears it.

Re-applies the same fixes (4 files in tests/eval/ + tests/test_ephemeral_authoritative.py) directly against current dev. Zero behavioural changes.

Refs #96, #102.
jinhongkuan added a commit that referenced this pull request May 3, 2026
…om dev

Curated v0 subset of dev's divergence onto triage-from-dev. v1 work
(codegenome/, governance/, semantic-status pre-classifier, HITL bypass,
LLM drift judge — issues #44, #60, #61, #109, #110, #112) intentionally
held back per DEV_CYCLE.md §10.5.1 eligibility ("not triage-eligible:
schema-migrating changes, breaking public-API changes, multi-PR feature
epics").

CI workflows
- `.github/workflows/v0-user-flow-e2e.yml` — assertions + manual demo
  recording job for the v0 user-flow e2e harness (#108). Pairs with the
  e2e harness commits already on triage (a50d723, 697dc6e, f97ddab,
  e961cad, 17907fb, 82a493e, cf48270, 975dc83, e72a418).
- `.github/workflows/lint-and-typecheck.yml` — Tier-1 PR gate per
  DEV_CYCLE §4.5.1 (ruff + mypy).
- `.github/workflows/secret-scan.yml` — Tier-1 PR gate.
- `.github/workflows/label-merged-to-dev.yml` — auto-applies the
  `merged-to-dev` label on merge (CI Phase 1, #102).
- `.github/workflows/test-mcp-regression.yml` — Windows matrix added
  (existing file updated).

Demo recording
- `tests/e2e/record_demo.sh` — non-interactive demo recorder.
- `tests/e2e/demo_renderer.py` — overlay renderer.
- `tests/e2e/prompts/composite-demo.md` — single-session three-scene
  composite script (PM ingest + dev preflight/edit/commit + PM history).
- `tests/e2e/README.md` — design notes for the e2e harness.
- `docs/demos/README.md` — demos index.
- `docs/demos/v0-userflow-e2e.md` — v0 user-flow demo doc.
- `.gitignore` — excludes `docs/demos/**/*.mp4` (artifacts uploaded via
  GitHub Actions, not git).

Dev-cycle reference docs
- `docs/DEV_CYCLE.md` — the canonical dev cycle reference (#93). Defines
  the triage lane this PR follows (§10.5).
- `docs/guides/README.md`, `docs/training/README.md` — scaffolding
  alongside the dev-cycle docs.

Why bulk-copy instead of cherry-pick: 50+ candidate dev commits diverged
substantially from triage's pre-§10.5 SHAs and prior triage-adapt
workarounds (preflight_telemetry imports, schema migrations gated on
codegenome). A clean snapshot of each file from origin/dev avoids
fighting historical SHA churn while preserving the v0 content
faithfully. §10.5.3 anticipates this (the lane "carries some commits
with different SHAs … sunk cost from the lane's pre-§10.5 era").

Skipped from dev's divergence (held for next major or held permanently):
- v1 architecture: codegenome/, governance/, classify/heuristic.py
  semantic pre-classifier (Layer A Phase 1)
- #65 preflight telemetry capture loop (depends on v1 escalation
  feedback substrate)
- #76, #77 decision_level dashboard surfacing + classifier (deferred
  pending separate review)
- #48, #49 pre-push drift hook + sticky drift PR comment (deferred
  pending separate review)
- #97 event vocabulary extension (deferred — discussed separately)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jinhongkuan jinhongkuan mentioned this pull request May 3, 2026
15 tasks