feat(B-0865.17): cross-vendor benchmark on common ground via TS-skill-via-vendor-skill-stores distribution lane — ADDITIVE to USB-cluster (operator 2026-05-28) #5754
…-via-vendor-skill-stores distribution lane — ADDITIVE to USB-cluster deep path

Operator-explicit decision 2026-05-28: ship the framework as TypeScript skill via vendor skill stores (Claude / GPT / Gemini / Grok / Cursor / Continue / Codex / Kiro / Antigravity).

Operationally-load-bearing substrate-engineering value: cross-vendor benchmark on common ground because the SAME framework substrate runs identically across vendors; only underlying AI differs; DORA scores directly comparable.

ADDITIVE to USB-cluster (B-0891) per operator-explicit "usb cluster is still very high priority to me" clarification. Default-to-both: both distribution paths preserved; not substitutive.

Decomposed into 10 sub-rows (B-0865.17.1 core TS substrate prerequisite + per-vendor packaging sub-rows + cross-vendor leaderboard substrate + UX + documentation + economic-dynamics analysis + cross-vendor empirical validation).

Substrate-inventory pass verified: B-0865 parent benchmark substrate + B-0867 workflow engine substrate cluster + B-0891 USB-cluster lane + B-0904 GitHub accelerator all compose; NO existing sub-row covers the cross-vendor benchmark distribution lane. Authorizing mint-new.

Empirical anchors:
- 13th Kestrel ferry preservation (PR #5753): operator decision-disclosure + Kestrel substantive engagement with substrate-engineering implications
- 12th Kestrel ferry preservation (PR #5752): operator decision verbatim
- Cross-vendor common-ground benchmark scoring per Aaron: "it also means i can score each one on common ground"

Composes with this-session substrate cluster (PRs #5727 / #5734 / #5739 / #5743 / #5744 / #5745 / #5746 / #5748 / #5749 / #5750 / #5751 / #5752 / #5753).

Counter-reset per .claude/rules/holding-without-named-dependency-is-standing-by-failure.md condition #3 (concrete-artifact substrate; pre-empt-at-#5 cycle work).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pull request overview
Adds a new P1 backlog sub-row (B-0865.17) documenting an operator decision to ship the benchmark framework as a TypeScript skill via vendor skill stores, additive to the USB-cluster lane. Regenerates the backlog index.
Changes:
- New backlog row file under `docs/backlog/P1/` with full decomposition, substrate-inventory pass, and composition references
- Index entry added in `docs/BACKLOG.md`
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| docs/backlog/P1/B-0865.17-...md | New sub-row capturing cross-vendor benchmark distribution decision |
| docs/BACKLOG.md | Regenerated index includes B-0865.17 entry |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d143a4b8fd
- [B-0865](B-0865-zeta-instantiation-of-arc-agi-3-style-benchmark-usb-boot-starting-state-devops-objectives-as-levels-not-hand-crafted-video-game-levels-aaron-2026-05-27.md) — parent benchmark substrate; this sub-row adds the distribution-lane substrate for skill-store users
- [B-0867](B-0867-workflow-engine-v1-fsharp-du-state-machine-git-append-only-four-corner-monad-banned-if-universal-action-grammar-otto-five-modifications-multi-participant-non-cage-aaron-mika-kestrel-otto-2026-05-27.md) — workflow engine v1 substrate IS what gets packaged as the skill
- [B-0867.5](B-0867.5-workflow-engine-v1-poc-scaffold-ship-2026-05-28.md) — workflow engine PoC scaffold (PR #5728); the TypeScript-skill builds on this
Fix broken backlog cross-links
These new composition links do not resolve from this P1 row. Resolving the Markdown links locally and running `rg --files docs/backlog` shows that B-0865 lives under `docs/backlog/P2`, no `B-0867.5-...md` backlog row exists, and the B-0891 row has a different slug than the link on line 116. Anyone using this new backlog row as the navigation substrate for the cross-vendor benchmark work will hit missing files instead of the cited prerequisites/compositions.
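The link check described above can be sketched in TypeScript. This is a minimal illustration, not the repo's tooling: `normalizePath`, `extractRelativeLinks`, `findBrokenLinks`, and the example file names are all hypothetical, and the set of known files is assumed to come from something like the output of `rg --files docs/backlog`.

```typescript
// Hypothetical sketch: none of these helpers exist in the repo.

/** Collapse "." and ".." segments in a POSIX-style relative path. */
function normalizePath(path: string): string {
  const out: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") out.pop();
    else out.push(segment);
  }
  return out.join("/");
}

/** Collect inline Markdown link targets, skipping URLs and pure anchors. */
function extractRelativeLinks(markdown: string): string[] {
  const linkPattern = /\[[^\]]*\]\(([^)\s]+)\)/g;
  const targets: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = linkPattern.exec(markdown)) !== null) {
    const target = match[1].split("#")[0];
    // Skip empty targets (pure anchors) and anything with a URL scheme.
    if (target === "" || /^[a-z][a-z0-9+.-]*:/i.test(target)) continue;
    targets.push(target);
  }
  return targets;
}

/** Return link targets that do not resolve to a known file, relative to the row's directory. */
function findBrokenLinks(
  rowPath: string,
  markdown: string,
  knownFiles: ReadonlySet<string>,
): string[] {
  const rowDir = rowPath.split("/").slice(0, -1).join("/");
  return extractRelativeLinks(markdown).filter(
    (target) => !knownFiles.has(normalizePath(`${rowDir}/${target}`)),
  );
}

// Usage with made-up rows: the parent lives in P2, so a bare link from P1 is reported.
const exampleFiles = new Set(["docs/backlog/P2/B-0001-parent.md"]);
const exampleRow = "- [B-0001](B-0001-parent.md)\n- [B-0001](../P2/B-0001-parent.md)";
console.log(findBrokenLinks("docs/backlog/P1/B-0002-row.md", exampleRow, exampleFiles));
```

Keeping the file set as an input rather than reading the disk keeps the sketch testable; a real check would populate it from the working tree.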
- [B-0891](../P1/B-0891-zflash-test-harness-design-spec-spike-aaron-2026-05-28.md) — USB-cluster distribution lane (the deep path; ADDITIVE to this sub-row's broad path)
- [B-0904](../P3/B-0904-github-as-free-accelerator-of-bulk-energy-into-information-compression-substrate-recognition-aaron-2026-05-28.md) — GitHub accelerator substrate composes (skill runs against user-controlled GitHub repos via GitHub Actions)
- [B-0859](../P1/B-0859-post-boot-ai-as-home-owner-not-controlled-runtime-every-knob-from-first-boot-aaron-2026-05-27.md) — AI-as-home-owner substrate composes at the agent-runtime scope; skill users get the same substrate-honest agent-runtime properties as USB-cluster users
Use bare links for same-tier backlog rows
For same-directory P1 references, the backlog frontmatter lint expects a bare filename rather than `../P1/...`. Running `bun tools/backlog/lint-frontmatter.ts --file <this row> --strict` reports check 1 on both this B-0891 link and the B-0859 link below. This leaves the new row failing the repo's backlog pre-push discipline even after the broken B-0891 slug is corrected.
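The rule described above (bare filename for same-tier links) can be illustrated with a short TypeScript sketch. This is an assumption about the shape of such a check, not the actual logic of `tools/backlog/lint-frontmatter.ts`; `SameTierFinding`, `findSameTierPrefixedLinks`, and the example file names are hypothetical.

```typescript
// Hypothetical sketch: not the repo's lint implementation.

interface SameTierFinding {
  /** The link target as written in the row. */
  target: string;
  /** The bare filename a same-tier link could use instead. */
  suggested: string;
}

/** Flag links that point back into the row's own tier via "../<tier>/". */
function findSameTierPrefixedLinks(rowTier: string, markdown: string): SameTierFinding[] {
  const linkPattern = /\[[^\]]*\]\(([^)\s]+)\)/g;
  const prefix = `../${rowTier}/`;
  const findings: SameTierFinding[] = [];
  let match: RegExpExecArray | null;
  while ((match = linkPattern.exec(markdown)) !== null) {
    const target = match[1];
    if (target.startsWith(prefix)) {
      findings.push({ target, suggested: target.slice(prefix.length) });
    }
  }
  return findings;
}

// Usage with made-up rows: only the "../P1/" link is flagged for a P1 row.
const exampleLinks = "- [a](../P1/B-0001-sibling.md)\n- [b](../P3/B-0002-other.md)";
console.log(findSameTierPrefixedLinks("P1", exampleLinks));
```

Cross-tier links such as `../P3/...` are left alone, since the comment above only concerns same-directory references.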
Summary
Files B-0865.17 as a backlog sub-row capturing the operator-explicit substrate-engineering decision of 2026-05-28: ship the framework as a TypeScript skill via vendor skill stores (Claude / GPT / Gemini / Grok / Cursor / Continue / Codex / Kiro / Antigravity).
Load-bearing substrate-engineering value: cross-vendor benchmark on common ground because the SAME framework substrate runs identically across vendors; only underlying AI differs; DORA scores directly comparable.
ADDITIVE to USB-cluster (B-0891) per operator-explicit 'usb cluster is still very high priority to me' clarification. Default-to-both: both distribution paths preserved.
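One way to read "common ground" in code: only runs produced on the same framework substrate version are compared, so the underlying AI is the only variable. A minimal TypeScript sketch follows; `BenchmarkRun`, `substrateVersion`, the single-number `doraScore`, the higher-is-better ordering, and the placeholder vendors and scores are all assumptions for illustration, not shapes or results defined by this PR.

```typescript
// Hypothetical sketch: the PR does not define these types or any scoring formula.

interface BenchmarkRun {
  vendor: string;
  /** Version of the shared framework substrate the run executed on (assumed field). */
  substrateVersion: string;
  /** Assumed to be a single comparable number where higher is better. */
  doraScore: number;
}

/** Rank only the runs that share one substrate version, best score first. */
function rankOnCommonGround(
  runs: readonly BenchmarkRun[],
  substrateVersion: string,
): BenchmarkRun[] {
  return runs
    .filter((run) => run.substrateVersion === substrateVersion)
    .sort((a, b) => b.doraScore - a.doraScore);
}

// Usage with placeholder data: the run on a different substrate version is excluded.
const exampleRuns: BenchmarkRun[] = [
  { vendor: "vendor-a", substrateVersion: "1.0.0", doraScore: 0.5 },
  { vendor: "vendor-b", substrateVersion: "1.0.0", doraScore: 0.75 },
  { vendor: "vendor-c", substrateVersion: "0.9.0", doraScore: 0.9 },
];
console.log(rankOnCommonGround(exampleRuns, "1.0.0").map((run) => run.vendor));
```

Filtering by substrate version before sorting is the design point: a score from a different substrate build is not on common ground and is left out rather than ranked.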
Operator decision sources
Sub-row decomposition (10 sub-rows)
Substrate-inventory pass
Verified B-0865 parent + B-0867 workflow engine substrate cluster + B-0891 USB-cluster lane + B-0904 GitHub accelerator all compose. NO existing sub-row covers cross-vendor benchmark distribution. Mint-new authorized.
Composes with substrate
Test plan
`BACKLOG_WRITE_FORCE=1 bun tools/backlog/generate-index.ts`

🤖 Generated with Claude Code