diff --git a/GOVERNANCE.md b/GOVERNANCE.md index 129d1ef1..57e5f69c 100644 --- a/GOVERNANCE.md +++ b/GOVERNANCE.md @@ -556,3 +556,43 @@ than renumbering the rest. change-history folders. The modified-OpenSpec workflow at `openspec/README.md` documents the divergence from upstream. + +29. **Backlog files are scoped.** Zeta has two backlog + files; each one holds a specific class of work. + Cross-contamination is a smell. + + - **`docs/BACKLOG.md`** — general engineering + backlog. New features, refactors, factory / + tooling / skill work, UX / DX items, research + that doesn't tie to a specific security + control. Default destination for almost + everything. + - **`docs/security/SECURITY-BACKLOG.md`** — + deferred SECURITY CONTROLS only. Partner + document to `docs/security/V1-SECURITY-GOALS.md`. + Items here must be genuine security controls + that fail one of the three v1.0 gates + (plausible adversary / quiet failure / bounded + cost). Infrastructure, CI, skill gaps, editor + tooling, lint configs — all belong in + `BACKLOG.md`, not here. + + **Enforcement.** The `backlog-scrum-master` + (Leilani) owns the audit cadence — every round + or two, walk both backlogs and flag any entry + that belongs in the other. Mis-filed entries get + moved, not duplicated. + + **Why.** Security-audit readers need + `SECURITY-BACKLOG.md` to be a clean list of + deferred security controls — that's how it + feeds SDL attestations (SDL-CHECKLIST.md) and + threat-model reviews. Letting factory / + engineering items drift in dilutes the + signal and makes future security audits harder. + + **Detection aid.** When unsure, ask: "would an + external auditor reading this list of deferred + items to assess my v1.0 security posture expect + to see this entry?" If no, it's + `docs/BACKLOG.md`. diff --git a/docs/BACKLOG.md b/docs/BACKLOG.md index 36cdc51e..0b622e4c 100644 --- a/docs/BACKLOG.md +++ b/docs/BACKLOG.md @@ -94,6 +94,90 @@ within each priority tier. side (Window.fs wiring pending). Target: measured numbers in `docs/BENCHMARKS.md` by end of round 20. +## P1 — Factory / static-analysis / tooling (round-33 surface) + +- [ ] **`openspec-gap-finder` skill** (round 32 ask). Viktor + (spec-zealot) reviews spec-to-code alignment for an existing + capability but doesn't scan the repo for capabilities shipped + without a spec. Needs a new skill parallel to + `skill-gap-finder` — spec-zealot role wears both via the + Aarav pattern. GOVERNANCE §28 enforcement depends on this. +- [ ] **`static-analysis-gap-finder` skill** (round 33 ask). + Aaron: "we need another gap analysis tool around static + analysis and linting and tools and rules we maybe missing." + Parallel to `openspec-gap-finder` + `skill-gap-finder`. + Spec-zealot role wears all three gap-finders. Proactive + lint discovery: enumerate committed languages/surfaces + + check whether a matching linter is on the lint gate. +- [ ] **Crank lint configurations to HIGH across the board.** + Aaron: "in general when there is static analysis + configuration or linting things of that nature we want to + crank it up to high." Round-33 baseline is mid-stringency; + post-33 pass researches each tool's recommended-strict + preset and adopts. Trigger: round 34, after round-33 Track D + baseline has proven itself stable. +- [ ] **documentation-agent cadence.** Add documentation-agent + to `factory-audit`'s every-10-rounds walk scope. Each walk: + scan comments in config files + doc-to-code alignment + + retired-file references. Round 33 surfaced multiple stale- + comment issues that would have been caught by a scheduled + doc-state audit. +- [ ] **Declarative-manifest setup matching `../scratch`'s + tiered shape.** Zeta's `tools/setup/manifests/` is + declarative-ish (`apt.txt`, `brew.txt`, etc.) but flat. + `../scratch`'s `declarative/` has tiered profiles + (`min`/`runner`/`quality`/`all`) per platform per tool. + Push one incremental step per round — split `brew.txt` into + tiers, then `.dotnet-tools` / `.bun-global` formats, etc. +- [ ] **Upstream sync script + `references/upstreams/` + population.** `references/reference-sources.json` manifest + exists; sync script referenced in README ("when it exists") + does not. Build the script so `./tools/setup/sync- + upstreams.sh` (or equivalent) reads the manifest, clones + each upstream into `references/upstreams//`, keeps + them fresh. +- [ ] **`upstream-comparative-analysis` skill.** Aaron round 33: + "add a skill around upstream reference comparative analysis + that will allow us to get ideas and such and compare our + code with upstreams for inspiration." Skill procedure: pick + a Zeta subsystem + matching upstream, produce a parallel + walk highlighting shape / contract / perf / correctness + differences. Owned by `tech-radar-owner` role; feeds + `docs/TECH-RADAR.md` Adopt/Trial/Assess/Hold moves. +- [ ] **Upstream categorisation audit (multi-round).** Aaron + round 33: "we probably need some upstream maintenance to + make sure all the categories are good and correct and make + sense for our project because we were not thinking as deeply + as we are now so there might be better categorisation." The + `categories` field in `references/reference-sources.json` + and the grouping prose in `references/README.md` were + written incrementally and may no longer reflect Zeta's + current mental model. Walk every upstream, re-read our + current research notes, propose better categories. Spread + across multiple rounds (long task requiring many internal + searches); each round picks a sub-tree. +- [ ] **Post-install repo automation: runtime choice open + (research needed).** `tools/setup/install.sh` (bash) owns + bootstrap; post-install polyglot automation (format-repo, + coverage-collect, benchmark-compare, lint orchestration) + benefits from one cross-platform runtime. Aaron: "IDK if + I want to stick with [Bun+TS] which is why i want the + research." Candidates: Bun/Node, Deno, Python (on PATH + via mise), .NET console tools, others. Write a design doc + comparing cold-start, install weight, type safety, + ecosystem breadth. Trigger: first post-install automation + task that would need cross-platform scripting. +- [ ] **Devcontainer / Codespaces image (GOVERNANCE §24 + third leg).** Two-leg parity (dev + CI) is today; + devcontainer closes three-leg parity for first external + contributor onboarding. Dockerfile calls + `tools/setup/install.sh`. +- [ ] **Windows CI matrix + Windows install path + (`tools/setup/windows.ps1`).** Trigger: stable green on + mac + linux for N rounds, OR named Windows consumer. + Windows install will be PowerShell, not Git Bash (Git Bash + is not guaranteed installed). + ## P1 — CI / DX follow-ups (after round-29 anchor) - [ ] **Full mise migration.** Round 29 adopts `.mise.toml` diff --git a/docs/security/SECURITY-BACKLOG.md b/docs/security/SECURITY-BACKLOG.md index 482f0faa..db584536 100644 --- a/docs/security/SECURITY-BACKLOG.md +++ b/docs/security/SECURITY-BACKLOG.md @@ -146,136 +146,20 @@ cost) but is still worth shipping eventually. - **Rough cost estimate:** M - **Priority:** P1 -### Devcontainer / Codespaces image (GOVERNANCE §24 third leg) - -- **Why deferred:** two-leg parity (dev + CI) is the v1.0 floor; - devcontainer is a consumer-experience boost, not a security - control per se. -- **Trigger to revisit:** first external contributor onboards - and friction measures the gap, OR Codespaces becomes the - default onboarding path. -- **Rough cost estimate:** S -- **Priority:** P2 - -### documentation-agent cadence - -- **Why deferred:** the documentation-agent skill only triggers - when someone spots drift. Round 33 surfaced multiple stale- - comment / out-of-date-documentation issues that would have - been caught by a scheduled doc-state audit (e.g., my own - `.markdownlint-cli2.jsonc` comment drifted from the actual - suppression strategy in one round). Aaron round 33: "is the - documentation person looking for out of date documentation - on any kind of cadence?" -- **Trigger to revisit:** round 34 factory-improvement slot. - Add documentation-agent to `factory-audit`'s every-10-rounds - walk scope. Each walk: scan comments in config files + - doc-to-code alignment + retired-file references. -- **Rough cost estimate:** S (just add to cadence) -- **Priority:** P1 - -### Post-install repo automation: runtime choice open (research needed) - -- **Why deferred:** `tools/setup/install.sh` (bash) owns bootstrap - because it can't assume any runtime pre-install. After install, - Zeta's eventual polyglot repo-automation surface (format-repo, - coverage-collect, benchmark-compare, lint orchestration) - benefits from a single cross-platform runtime. Aaron's prior - choice on `../SQLSharp` + `../scratch` was Bun + TypeScript + - package.json — it worked, but Aaron is explicitly NOT committed - to repeating it: "IDK if I want to stick with this which is - why i want the research." Candidates worth comparing include - Bun/Node, Deno, Python (already on PATH post-install via mise), - .NET console tools (already on PATH), or something else - entirely. The constraint is "one runtime handles - everything polyglot-post-install, no bash+pwsh parallel twins." -- **Trigger to revisit:** first post-install automation task that - would need cross-platform scripting (benchmark comparison, - coverage aggregation, format-repo across .fs/.cs/.md/etc.). - Research round before committing: write a design doc comparing - candidate runtimes across cold-start cost, install weight, type - safety, ecosystem breadth for lint/format/test orchestration, - and maintainability. -- **Rough cost estimate:** S (research doc) + M (first shipped - automation in chosen runtime) -- **Priority:** P2 (on-demand; not blocking) -- **Note:** install.sh stays bash regardless — bootstrap can't - depend on its own output. - -### `static-analysis-gap-finder` skill (missing-lint-tool detection) - -- **Why deferred:** Round 33 Track D surfaced that Zeta had no - markdown / workflow / shell linter until this round. Nobody - was proactively scanning for missing linters. Aaron: "we need - another gap analysis tool around static analysis and linting - and tools and rules we maybe missing." Parallel to - `openspec-gap-finder` and `skill-gap-finder`; owned by the - `spec-zealot` role (Viktor) along with those other gap-finders - — one role wearing multiple gap-finding skills, Aarav pattern. - Proactive lint discovery: on a new-project template, this - skill would enumerate committed languages/surfaces + check - whether a matching linter is on the lint gate. -- **Trigger to revisit:** round 34 factory-improvement slot. -- **Rough cost estimate:** M -- **Priority:** P1 - -### Crank lint configurations to HIGH across the board - -- **Why deferred:** Aaron round 33: "in general when there is - static analysis configuration or linting things of that nature - we want to crank it up to high." Current configs are mid- - stringency (several relaxed rules in markdownlint, severity- - floor-style on shellcheck, default ruleset on actionlint). - A post-round-33 pass should research each tool's - recommended-strict preset and adopt. -- **Trigger to revisit:** round 34, after round-33 Track D - baseline has proven itself stable. -- **Rough cost estimate:** S-per-tool, L cumulative -- **Priority:** P1 - -### `openspec-gap-finder` skill (missing-spec detection) - -- **Why deferred:** Viktor (spec-zealot) reviews spec-to-code - alignment for an existing capability but doesn't scan the repo - for capabilities shipped without a spec. Aaron's round-32 - observation: "someone should be responsible for looking for - missing openspec, do we already have one that looks for ones - that are out of sync, we should be checking these too every - now and then as well." Gap is parallel to `skill-gap-finder` - (finds absent skills); needs `openspec-gap-finder` (finds - absent specs + flags drift between spec and shipped artefact). -- **Trigger to revisit:** round 33 factory-improvement slot. -- **Rough cost estimate:** M (skill + skill-creator workflow + - first audit run) -- **Priority:** P1 (GOVERNANCE §28 enforcement depends on it) - -### Declarative-manifest setup (match `../scratch`'s shape) - -- **Why deferred:** Zeta's `tools/setup/manifests/` is already - declarative-ish (`apt.txt`, `brew.txt`, `dotnet-tools.txt`, - `verifiers.txt`) but flat and un-tiered. `../scratch`'s - `declarative/` directory has tiered profiles - (`min`/`runner`/`quality`/`all`) per platform per tool, - matching dotnet's `.dotnet-tools` / `.dotnet-workloads` / - Brewfile / Caskfile / `.apt` / `.uv-tools` / `.bun-global` - formats. Aaron's round-32 ask: "push hard each sprint" on - closing this gap. -- **Trigger to revisit:** every round, track at least one - incremental step toward the scratch shape — e.g., split - `brew.txt` into min/runtimes/quality tiers this round, - convert `dotnet-tools.txt` to `.dotnet-tools` next. -- **Rough cost estimate:** L (incremental over 4-6 rounds) -- **Priority:** P1 (ratchet toward v1.0) - -### Windows CI matrix - -- **Why deferred:** stable green on mac + linux is the immediate - target. Windows expands surface by 50% of CI-minute budget - for a v1.0 consumer base that is predominantly mac + linux. -- **Trigger to revisit:** named Windows consumer, OR 4 weeks - stable on the current two-OS matrix. -- **Rough cost estimate:** M (install.sh Windows path + CI job) -- **Priority:** P2 + ### Signed-commit requirement on `main`