From 0a1b18d0063868793f96b68dab0110749e6f1397 Mon Sep 17 00:00:00 2001 From: Lior Date: Wed, 27 May 2026 15:46:50 -0400 Subject: [PATCH] docs(archive): Batch archive of 20 PRs --- ...pty-systemd-path-broke-clear-nmtui-ping.md | 114 +++++++++ ...o-submit-back-telemetry-fixes-from-in-t.md | 60 +++++ ...cosystem-as-force-multipliers-behind-ze.md | 27 ++ ...vicetitan-route-plug-into-existing-cont.md | 27 ++ ...a-native-scheduler-first-dst-ai-aware-c.md | 27 ++ ...meta-playbook-control-structure-injecti.md | 31 +++ ...-device-partprobe-bare-partprobe-was-hi.md | 88 +++++++ ...able-controllable-cluster-fabric-device.md | 25 ++ ...5-etcd-less-options-kine-adapter-family.md | 31 +++ ...ress-post-merge-worktree-hygiene-review.md | 57 +++++ ...-convergence-push-down-ai-processing-di.md | 33 +++ ...ls-wire-bash-retirement-inventory-guard.md | 171 +++++++++++++ ...a-grok-2026-05-25-substrate-batch-local.md | 114 +++++++++ ...very-repair-jsonl-sessions-corrupted-by.md | 238 ++++++++++++++++++ ...on-agent-claude-code-session-recovery-l.md | 109 ++++++++ ...organization-package-architecture-slice.md | 145 +++++++++++ ...tached-worktree-and-clone-hygiene-in-ba.md | 43 ++++ ...int-md032-prose-arithmetic-joiner-recur.md | 49 ++++ ...tier-2-docker-desktop-dev-experience-wo.md | 74 ++++++ ...entic-org-gate-commands-with-hat-policy.md | 89 +++++++ 20 files changed, 1552 insertions(+) create mode 100644 docs/pr-discussions/PR-5047-fix-b-0754-iter-2-empty-systemd-path-broke-clear-nmtui-ping.md create mode 100644 docs/pr-discussions/PR-5048-backlog-b-0762-ai-auto-submit-back-telemetry-fixes-from-in-t.md create mode 100644 docs/pr-discussions/PR-5050-backlog-b-0764-cncf-ecosystem-as-force-multipliers-behind-ze.md create mode 100644 docs/pr-discussions/PR-5051-backlog-b-0765-p1-servicetitan-route-plug-into-existing-cont.md create mode 100644 docs/pr-discussions/PR-5053-backlog-b-0767-p1-zeta-native-scheduler-first-dst-ai-aware-c.md create mode 100644 
docs/pr-discussions/PR-5055-backlog-b-0769-p1-vc-meta-playbook-control-structure-injecti.md create mode 100644 docs/pr-discussions/PR-5057-fix-b-0754-iter-3-per-device-partprobe-bare-partprobe-was-hi.md create mode 100644 docs/pr-discussions/PR-5059-backlog-b-0772-observable-controllable-cluster-fabric-device.md create mode 100644 docs/pr-discussions/PR-5061-backlog-b-0774-b-0775-etcd-less-options-kine-adapter-family.md create mode 100644 docs/pr-discussions/PR-5063-fix-pr-5020-address-post-merge-worktree-hygiene-review.md create mode 100644 docs/pr-discussions/PR-5066-backlog-b-0779-ai-nas-convergence-push-down-ai-processing-di.md create mode 100644 docs/pr-discussions/PR-5067-tools-wire-bash-retirement-inventory-guard.md create mode 100644 docs/pr-discussions/PR-5068-research-backlog-mika-grok-2026-05-25-substrate-batch-local.md create mode 100644 docs/pr-discussions/PR-5069-feat-claude-code-recovery-repair-jsonl-sessions-corrupted-by.md create mode 100644 docs/pr-discussions/PR-5070-backlog-b-0788-agent-on-agent-claude-code-session-recovery-l.md create mode 100644 docs/pr-discussions/PR-5071-build-agentic-organization-package-architecture-slice.md create mode 100644 docs/pr-discussions/PR-5072-fix-gemini-enforce-detached-worktree-and-clone-hygiene-in-ba.md create mode 100644 docs/pr-discussions/PR-5073-docs-shadow-markdownlint-md032-prose-arithmetic-joiner-recur.md create mode 100644 docs/pr-discussions/PR-5074-docs-persona-max-add-tier-2-docker-desktop-dev-experience-wo.md create mode 100644 docs/pr-discussions/PR-5075-feat-agentic-org-gate-commands-with-hat-policy.md diff --git a/docs/pr-discussions/PR-5047-fix-b-0754-iter-2-empty-systemd-path-broke-clear-nmtui-ping.md b/docs/pr-discussions/PR-5047-fix-b-0754-iter-2-empty-systemd-path-broke-clear-nmtui-ping.md new file mode 100644 index 0000000000..3f7d8e6545 --- /dev/null +++ b/docs/pr-discussions/PR-5047-fix-b-0754-iter-2-empty-systemd-path-broke-clear-nmtui-ping.md @@ -0,0 +1,114 @@ +--- +pr_number: 5047 
+title: "fix(B-0754 iter-2): empty systemd PATH broke clear+nmtui+ping+systemctl on real hardware" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T00:10:32Z" +merged_at: "2026-05-26T00:22:03Z" +closed_at: "2026-05-26T00:22:04Z" +head_ref: "otto-cli/b0754-fix-systemd-path-iter2-2026-05-25" +base_ref: "main" +archived_at: "2026-05-27T19:46:43Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5047: fix(B-0754 iter-2): empty systemd PATH broke clear+nmtui+ping+systemctl on real hardware + +## PR description + +## Iteration 1 result (real-hardware test, Aaron 2026-05-25) + +Photo evidence on the cluster node screen after booting the v1 ISO: + +- `clear: command not found` (line 40 + line 77) — the role-prompt and banner sections +- `nmtui: command not found` — when ethernet-DHCP wait expired and the wifi-fallback fired +- Drop-to-shell worked correctly — operator landed at a working root prompt with the recovery hints intact + +The substrate-honest failure path validated: the script degraded gracefully and the operator could still complete the install manually (`nmtui` + `zeta-install control-plane` from the recovery shell). But the load-bearing zero-typing-automation flow didn't reach the end. **This PR is the fix so iteration 2 completes unattended.** + +## Root cause + +NixOS systemd services get a **minimal PATH** by default. The first-boot script's bare commands (`clear`, `nmtui`, `ping`, `systemctl`, plus every command zeta-install.sh would reach for — `lsblk`, `sgdisk`, `mkfs.fat`, `mkfs.ext4`, `mount`, `partprobe`, etc.) all need either explicit absolute paths OR a configured Environment block on the systemd unit. The interactive-shell PATH that 'just works' for SSH or tty2 login is NOT inherited by Type=idle systemd services. + +The reason only `clear` and `nmtui` were observed: `nmtui` blocked first; the rest never executed in the failed path. + +## Fix (defense in depth) + +### 1.
systemd unit Environment block (load-bearing) + +`configuration.nix`: explicit `PATH` + `TERM` on the zeta-first-boot service. Covers every current AND future bare command: + +```nix +environment = { + PATH = "/run/current-system/sw/bin:/run/current-system/sw/sbin:/run/wrappers/bin"; + TERM = "linux"; +}; +``` + +### 2. Script-level belt-and-suspenders + +`zeta-first-boot.sh`: + +- Replace `clear || true` (×2) with `printf '\\033c' || true` — ANSI 'reset terminal' escape; no external command dependency +- Change `nmtui` invocation to `/run/current-system/sw/bin/nmtui` (absolute path) + +Even if the systemd Environment is overridden by some future change, these two failure modes stay fixed. + +## Composes with + +- B-0759 first-time-CLI-user persona — drop-to-shell with recovery hints worked exactly as designed; the persona-aligned error path was substrate-honest +- B-0760 USB-as-repair-tool — same systemd-PATH discipline applies to every command the repair flow will invoke +- B-0761 reference architecture — this is iteration N of N for the AI-native cluster-bootstrap reference; bandwidth payoff across every future install + +## Test plan + +- [x] `bash -n` syntax check on edited zeta-first-boot.sh +- [ ] CI rebuilds ISO via build-ai-cluster-iso.yml (auto-triggers on `full-ai-cluster/usb-nixos-installer/**` path) +- [ ] Aaron reflashes via zflash + boots cluster node + observes unattended install completes end-to-end +- [ ] CI green + +## Reviews + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T00:12:22Z) + +## Pull request overview + +Fixes the installer ISO’s first-boot automation failing under NixOS systemd’s minimal default `PATH`, so the unattended “zero-typing” flow can complete on real hardware. + +**Changes:** +- Replaced `clear` with an ANSI terminal reset escape to remove reliance on `clear` being in `PATH`. +- Invoked `nmtui` via an absolute path to avoid `PATH`-inheritance issues. 
+- Set explicit `PATH` and `TERM` in the `zeta-first-boot` systemd unit to cover current and future bare command usage in the first-boot flow. + +### Reviewed changes + +Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment. + +| File | Description | +| ---- | ----------- | +| full-ai-cluster/usb-nixos-installer/zeta-first-boot.sh | Removes dependency on `clear` being available and pins `nmtui` to an absolute path for robustness under systemd. | +| full-ai-cluster/usb-nixos-installer/nixos/installer/configuration.nix | Adds explicit `PATH`/`TERM` to the first-boot systemd unit to ensure all invoked tools are discoverable in the unit environment. | + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T00:21:32Z) + +## Pull request overview + +Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments. + +## Review threads + +### Thread 1: full-ai-cluster/usb-nixos-installer/zeta-first-boot.sh:120 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T00:12:22Z): + +The comment mentions “the env-var below” as a defense, but there isn’t an environment variable being set in the script at/after this point. Consider rewording to reference the systemd unit Environment block (and/or explicitly exporting a PATH in the script if that’s what you meant) so the rationale stays accurate for future readers. + +## General comments + +### @chatgpt-codex-connector (2026-05-26T00:19:05Z) + +You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). + +### @AceHack (2026-05-26T00:19:11Z) + +Fixed in `fd160ff1`: reworded the comment to reference the systemd unit's `environment.PATH` override (set in `configuration.nix` on `systemd.services.zeta-first-boot.environment.PATH` via `lib.mkForce`) rather than implying an env-var is set in the shell script. Thanks for the catch. 
diff --git a/docs/pr-discussions/PR-5048-backlog-b-0762-ai-auto-submit-back-telemetry-fixes-from-in-t.md b/docs/pr-discussions/PR-5048-backlog-b-0762-ai-auto-submit-back-telemetry-fixes-from-in-t.md new file mode 100644 index 0000000000..3a19f2320f --- /dev/null +++ b/docs/pr-discussions/PR-5048-backlog-b-0762-ai-auto-submit-back-telemetry-fixes-from-in-t.md @@ -0,0 +1,60 @@ +--- +pr_number: 5048 +title: "backlog(B-0762): AI auto-submit-back telemetry + fixes from in-the-wild installs \u2014 adoption-cost-to-zero flywheel" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T00:12:34Z" +merged_at: "2026-05-26T00:14:55Z" +closed_at: "2026-05-26T00:14:55Z" +head_ref: "otto-cli/b0762-ai-auto-submit-back-telemetry-flywheel-2026-05-25" +base_ref: "main" +archived_at: "2026-05-27T19:46:42Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5048: backlog(B-0762): AI auto-submit-back telemetry + fixes from in-the-wild installs — adoption-cost-to-zero flywheel + +## PR description + +Aaron 2026-05-25 named the adoption-scaling mechanism: each in-the-wild install opt-in submits success/failure telemetry; failures trigger LLM-generated minimal-diff PRs; pattern-matched fixes auto-merge; next operator on similar hardware pays 0 cost. Network-effect substrate; ARC-AGI parallel (B-0761) provides the training data + benchmark catalog. Composes with B-0743 / B-0754-B-0761. + +## Reviews + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T00:15:24Z) + +## Pull request overview + +Adds a new P2 backlog row (B-0762) describing an install-time opt-in telemetry loop that can auto-generate minimal-diff fix PRs from in-the-wild failures, aiming to drive “adoption cost → 0” via a network-effect feedback cycle. + +**Changes:** +- Added backlog row **B-0762** capturing the telemetry + auto-fix flywheel concept, acceptance criteria, and privacy/security floor. +- Updated `docs/BACKLOG.md` index to include **B-0762** under P2. 
+ +### Reviewed changes + +Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments. + +| File | Description | +| ---- | ----------- | +| docs/backlog/P2/B-0762-ai-auto-submit-back-telemetry-fixes-from-in-the-wild-installs-adoption-cost-to-zero-flywheel-aaron-2026-05-25.md | New backlog row defining the telemetry→LLM PR→auto-merge gating loop and acceptance criteria. | +| docs/BACKLOG.md | Adds B-0762 to the generated backlog index. | + +## Review threads + +### Thread 1: docs/backlog/P2/B-0762-ai-auto-submit-back-telemetry-fixes-from-in-the-wild-installs-adoption-cost-to-zero-flywheel-aaron-2026-05-25.md:14 (unresolved) + +**@copilot-pull-request-reviewer** (2026-05-26T00:15:24Z): + +`composes_with` references `B-0743`, but there is no corresponding backlog row (`id: B-0743`) anywhere in the repo. Either add the missing backlog row (or land it in the same PR/stack) or drop this reference to avoid dangling IDs. + +### Thread 2: docs/backlog/P2/B-0762-ai-auto-submit-back-telemetry-fixes-from-in-the-wild-installs-adoption-cost-to-zero-flywheel-aaron-2026-05-25.md:19 (unresolved) + +**@copilot-pull-request-reviewer** (2026-05-26T00:15:24Z): + +`composes_with` includes `B-0758` and `B-0760`, but there are no corresponding backlog rows (`id: B-0758` / `id: B-0760`) in the repo. Add those rows (or update to the correct existing IDs) so backlog cross-references stay resolvable. + +## General comments + +### @chatgpt-codex-connector (2026-05-26T00:12:38Z) + +You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). 
diff --git a/docs/pr-discussions/PR-5050-backlog-b-0764-cncf-ecosystem-as-force-multipliers-behind-ze.md b/docs/pr-discussions/PR-5050-backlog-b-0764-cncf-ecosystem-as-force-multipliers-behind-ze.md new file mode 100644 index 0000000000..26f542e0fa --- /dev/null +++ b/docs/pr-discussions/PR-5050-backlog-b-0764-cncf-ecosystem-as-force-multipliers-behind-ze.md @@ -0,0 +1,27 @@ +--- +pr_number: 5050 +title: "backlog(B-0764): CNCF ecosystem as force multipliers behind Zeta interfaces (KEDA, DAPR, OPA, OAM/KubeVela)" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T00:16:48Z" +merged_at: "2026-05-26T00:18:54Z" +closed_at: "2026-05-26T00:18:54Z" +head_ref: "otto-cli/b0764-cncf-ecosystem-force-multipliers-2026-05-25" +base_ref: "main" +archived_at: "2026-05-27T19:46:41Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5050: backlog(B-0764): CNCF ecosystem as force multipliers behind Zeta interfaces (KEDA, DAPR, OPA, OAM/KubeVela) + +## PR description + +Aaron 2026-05-25 named the CNCF-ecosystem-as-force-multipliers pattern: adopt KEDA + DAPR + OPA + OAM/KubeVela + Crossplane + kro + Cilium + Knative + OpenTelemetry + Rook/Ceph as plugins behind Zeta interfaces. Composed via Ace (PM-of-PMs) + ontology negotiation (B-0741). Zeta owns the stable operator contract; CNCF projects ship the substrate underneath; operator gets ecosystem momentum + maintenance distribution for free. + +Disambiguates OPA (Open Policy Agent / Rego) vs OAM (Open Application Model). Composes with B-0741 / B-0747 / B-0748 / B-0749 / B-0761 / B-0762 / B-0763. + +## General comments + +### @chatgpt-codex-connector (2026-05-26T00:16:51Z) + +You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). 
diff --git a/docs/pr-discussions/PR-5051-backlog-b-0765-p1-servicetitan-route-plug-into-existing-cont.md b/docs/pr-discussions/PR-5051-backlog-b-0765-p1-servicetitan-route-plug-into-existing-cont.md new file mode 100644 index 0000000000..224e2264e5 --- /dev/null +++ b/docs/pr-discussions/PR-5051-backlog-b-0765-p1-servicetitan-route-plug-into-existing-cont.md @@ -0,0 +1,27 @@ +--- +pr_number: 5051 +title: "backlog(B-0765 P1): ServiceTitan route \u2014 plug into existing control interfaces; ontology negotiation at standards layer" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T00:20:16Z" +merged_at: "2026-05-26T00:21:49Z" +closed_at: "2026-05-26T00:21:49Z" +head_ref: "otto-cli/b0765-service-titan-route-standards-layer-2026-05-25" +base_ref: "main" +archived_at: "2026-05-27T19:46:41Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5051: backlog(B-0765 P1): ServiceTitan route — plug into existing control interfaces; ontology negotiation at standards layer + +## PR description + +Aaron 2026-05-25 named the ServiceTitan strategic principle as substrate-engineering policy: every cluster-install substrate decision filters through 'are we inventing or adopting?'. Prefer adopting existing standards (k8s CRDs, OAM Components, Crossplane Compositions, Helm 3 OCI, ArgoCD, Flux, OpenTelemetry, OPA Rego, DAPR Components, NixOS) over inventing parallel substrate. Sharpens (not retracts) B-0763 + B-0764. Ontology negotiation at the STANDARDS LAYER (not per-project) is the load-bearing leverage point. + +P1 because it's the strategic filter shape every future cluster-install row should pass through. Composes with B-0741 / B-0744 / B-0747 / B-0748 / B-0749 / B-0754 / B-0759 / B-0761 / B-0762 / B-0763 / B-0764. + +## General comments + +### @chatgpt-codex-connector (2026-05-26T00:20:21Z) + +You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). 
diff --git a/docs/pr-discussions/PR-5053-backlog-b-0767-p1-zeta-native-scheduler-first-dst-ai-aware-c.md b/docs/pr-discussions/PR-5053-backlog-b-0767-p1-zeta-native-scheduler-first-dst-ai-aware-c.md new file mode 100644 index 0000000000..0f9832b163 --- /dev/null +++ b/docs/pr-discussions/PR-5053-backlog-b-0767-p1-zeta-native-scheduler-first-dst-ai-aware-c.md @@ -0,0 +1,27 @@ +--- +pr_number: 5053 +title: "backlog(B-0767 P1): Zeta-native scheduler first \u2014 DST + AI-aware cluster management" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T00:23:39Z" +merged_at: "2026-05-26T00:25:44Z" +closed_at: "2026-05-26T00:25:44Z" +head_ref: "otto-cli/b0767-zeta-native-scheduler-first-dst-ai-aware-2026-05-25" +base_ref: "main" +archived_at: "2026-05-27T19:46:40Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5053: backlog(B-0767 P1): Zeta-native scheduler first — DST + AI-aware cluster management + +## PR description + +Aaron 2026-05-25 sequencing call on B-0766 wave order: scheduler is load-bearing enough on (1) DST grounding for the whole cluster + (2) AI-aware scheduling (GPU topology, model locality, workload class, energy cost) that it should be Wave 1, not Wave 2. Binary-compatible via Pod.spec.schedulerName: zeta-scheduler (per B-0765 ServiceTitan route). Sub-waves A-E from baseline through DBSP+Bayesian + multi-objective. + +First concrete sub-row of B-0766. Composes with B-0428 (F# fork) + B-0741 / B-0747 / B-0754 / B-0761 / B-0762 / B-0763 / B-0764 / B-0765 / B-0766. + +## General comments + +### @chatgpt-codex-connector (2026-05-26T00:23:43Z) + +You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). 
diff --git a/docs/pr-discussions/PR-5055-backlog-b-0769-p1-vc-meta-playbook-control-structure-injecti.md b/docs/pr-discussions/PR-5055-backlog-b-0769-p1-vc-meta-playbook-control-structure-injecti.md new file mode 100644 index 0000000000..b44be4ae0a --- /dev/null +++ b/docs/pr-discussions/PR-5055-backlog-b-0769-p1-vc-meta-playbook-control-structure-injecti.md @@ -0,0 +1,31 @@ +--- +pr_number: 5055 +title: "backlog(B-0769 P1): VC meta-playbook \u2014 control-structure injection around capital flow + adj-EBITDA financial-engineering layer (substrate-honest variant)" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T00:27:03Z" +merged_at: "2026-05-26T00:28:52Z" +closed_at: "2026-05-26T00:28:52Z" +head_ref: "otto-cli/b0769-vc-meta-playbook-control-structure-injection-substrate-honest-2026-05-25" +base_ref: "main" +archived_at: "2026-05-27T19:46:39Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5055: backlog(B-0769 P1): VC meta-playbook — control-structure injection around capital flow + adj-EBITDA financial-engineering layer (substrate-honest variant) + +## PR description + +Aaron 2026-05-25 named the meta-strategy under B-0765 + B-0768: VC meta-playbook (control-structure injection around capital flow in verticals) IS the playbook ServiceTitan executed. Plus sharpening 'with adjusted ebita for specific investor profiles'. + +Substrate-honest variant: same control-structure position; opposite value-flow direction. Operator keeps value (open reference per B-0761, vendor-swap per B-0763, opt-in open telemetry per B-0762, binary-compat per B-0766). Failure-mode table + ethical-floor rules guard against extractive drift. + +Financial-engineering layer: adjusted EBITDA shaped per investor profile (seed/A → growth equity → pre-IPO → PE → strategic → public markets); substrate-honest variant = TRUE adjustments + single GAAP→non-GAAP bridge + auditor-survivable. ServiceTitan empirical anchor. 
+ +Composes with B-0741 / B-0754 / B-0759 / B-0761-B-0768 + 5 ethical-floor rules. + +## General comments + +### @chatgpt-codex-connector (2026-05-26T00:27:07Z) + +You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). diff --git a/docs/pr-discussions/PR-5057-fix-b-0754-iter-3-per-device-partprobe-bare-partprobe-was-hi.md b/docs/pr-discussions/PR-5057-fix-b-0754-iter-3-per-device-partprobe-bare-partprobe-was-hi.md new file mode 100644 index 0000000000..48f6747324 --- /dev/null +++ b/docs/pr-discussions/PR-5057-fix-b-0754-iter-3-per-device-partprobe-bare-partprobe-was-hi.md @@ -0,0 +1,88 @@ +--- +pr_number: 5057 +title: "fix(B-0754 iter-3): per-device partprobe \u2014 bare partprobe was hitting the boot USB (/dev/sda) + bailing" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T00:54:23Z" +merged_at: "2026-05-26T01:43:12Z" +closed_at: "2026-05-26T01:43:12Z" +head_ref: "otto-cli/b0754-iter3-partprobe-per-device-2026-05-25" +base_ref: "main" +archived_at: "2026-05-27T19:46:38Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5057: fix(B-0754 iter-3): per-device partprobe — bare partprobe was hitting the boot USB (/dev/sda) + bailing + +## PR description + +## Iteration 2 result (cluster node 1, real-hardware test, Aaron 2026-05-25) + +Photo evidence on PC 1 shows iter-2 reached 98% of the install path on first try: + +- Wifi connected +- Banner shown +- Greedy N-disk enum: both Crucial CT1000P3PSSD8 NVMes correctly identified with serials +- Plan presented (BOOT nvme0n1: ESP 1G + root 256G + longhorn1 rest; DATA nvme1n1: whole disk longhorn2) +- ZETA_AUTO_CONFIRM=WIPE bypass: WORKED +- wipefs + sgdisk on both NVMes: SUCCESS +- GPT partition creation on both NVMes: SUCCESS +- **Then**: `Error: Partition(s) 1 on /dev/sda have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use` 
+- drop_to_shell fired correctly with recovery hints + +## Root cause + +zeta-install.sh called bare `sudo partprobe` (no args). partprobe with no args probes EVERY block device the kernel knows about. Linux exposes USB mass-storage as `/dev/sda` when no SATA disks present. The booted live ISO has mounted partitions on /dev/sda; partprobe refuses to refresh (rightfully); returns non-zero; `set -euo pipefail` bails. + +The greedy N-disk enum ALREADY correctly excluded the USB (TRAN=usb filter). We never partitioned /dev/sda. The partprobe call was the only blanket-all-devices invocation in the whole script. + +## Fix + +Per-device partprobe on BOOT_DISK + each DATA_DISKS entry. Never blanket. Never touch /dev/sda. + +Aaron 2026-05-25: 'i would rather do it right so it's not ambigious for future me / users' — the script now operates on explicit-target devices throughout, no blanket-system-wide invocations remaining. + +## Test plan + +- [x] `bash -n` syntax check +- [ ] CI rebuilds ISO via build-ai-cluster-iso.yml +- [ ] Aaron reflashes via zflash + boots cluster node 1 + observes unattended install reaches end-to-end (cluster member after reboot) +- [ ] CI green + +## Reviews + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T00:56:30Z) + +## Pull request overview + +This PR updates the AI-cluster USB NixOS installer script to avoid calling `partprobe` without arguments (which probes every block device, including the live-boot USB), and instead refreshes the kernel partition table only for the disks that were just partitioned. + +**Changes:** +- Replace blanket `sudo partprobe` with per-device `partprobe` calls for `BOOT_DISK` and each `DATA_DISKS` entry. +- Add inline rationale documenting why blanket probing is unsafe in this installer context. + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T01:11:51Z) + +## Pull request overview + +Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments. 
+ +## Review threads + +### Thread 1: full-ai-cluster/usb-nixos-installer/zeta-install.sh:171 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T00:56:29Z): + +The comment says the kernel exposes the boot USB as `/dev/sda`, but the assigned `/dev/sdX` letter isn't guaranteed across hardware/boot order. Rewording this to something like “often /dev/sdX (commonly /dev/sda)” would keep the rationale accurate. + +### Thread 2: full-ai-cluster/usb-nixos-installer/zeta-install.sh:175 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T00:56:30Z): + +With `set -euo pipefail`, a `partprobe` failure here will abort the install. Adding an explicit failure handler (e.g., `... || bail "partprobe failed for $disk ..."`) would make the abort message more immediately actionable by identifying the disk and suggesting next steps. + +## General comments + +### @chatgpt-codex-connector (2026-05-26T00:54:27Z) + +You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). 
diff --git a/docs/pr-discussions/PR-5059-backlog-b-0772-observable-controllable-cluster-fabric-device.md b/docs/pr-discussions/PR-5059-backlog-b-0772-observable-controllable-cluster-fabric-device.md new file mode 100644 index 0000000000..83cea91df3 --- /dev/null +++ b/docs/pr-discussions/PR-5059-backlog-b-0772-observable-controllable-cluster-fabric-device.md @@ -0,0 +1,25 @@ +--- +pr_number: 5059 +title: "backlog(B-0772): observable+controllable cluster fabric \u2014 device plugins + Reticulum + bidirectional polyglot Rx" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T01:04:37Z" +merged_at: "2026-05-26T01:06:34Z" +closed_at: "2026-05-26T01:06:34Z" +head_ref: "otto-cli/b0772-observable-cluster-fabric-device-plugins-reticulum-rx-bidirectional-2026-05-25" +base_ref: "main" +archived_at: "2026-05-27T19:46:37Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5059: backlog(B-0772): observable+controllable cluster fabric — device plugins + Reticulum + bidirectional polyglot Rx + +## PR description + +Aaron 2026-05-25: 'rmember eventually we want to use the device plusings over npu gpu audio etc... and reticulum like alljoyn making everything iobervable in rx in every language' + 'and you emit to interact with the devices'. + +Three threads composed: universal device plugins + Reticulum mesh (AllJoyn-successor) + polyglot Rx with bidirectional Observer/Observable duality. + +Operator code in F# / TS / Rust / Python / Java / Swift / Kotlin — same algebra. Subscribe to device events; emit device commands; compose both into auto-control loops (e.g., scheduler subscribes to load → emits placement commands; GPU power loop subscribes to inference latency → emits power-limit commands). + +Composes with B-0289 / B-0428 / B-0763 / B-0764 / B-0765 / B-0766 / B-0767 / B-0770 / B-0771. 
diff --git a/docs/pr-discussions/PR-5061-backlog-b-0774-b-0775-etcd-less-options-kine-adapter-family.md b/docs/pr-discussions/PR-5061-backlog-b-0774-b-0775-etcd-less-options-kine-adapter-family.md new file mode 100644 index 0000000000..5531af9226 --- /dev/null +++ b/docs/pr-discussions/PR-5061-backlog-b-0774-b-0775-etcd-less-options-kine-adapter-family.md @@ -0,0 +1,31 @@ +--- +pr_number: 5061 +title: "backlog(B-0774 + B-0775): etcd-less options (kine adapter family) + HA-k8s-that-scales-beyond-etcd (CockroachDB / NATS super-cluster / Karmada / cell-based)" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T01:09:09Z" +merged_at: "2026-05-26T01:10:26Z" +closed_at: "2026-05-26T01:10:26Z" +head_ref: "otto-cli/b0774-b0775-etcdless-and-scale-options-2026-05-25" +base_ref: "main" +archived_at: "2026-05-27T19:46:37Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5061: backlog(B-0774 + B-0775): etcd-less options (kine adapter family) + HA-k8s-that-scales-beyond-etcd (CockroachDB / NATS super-cluster / Karmada / cell-based) + +## PR description + +Two related questions from Aaron mid-iter-3-wait, bundled into one PR (both backlog rows; no code): + +**B-0774** ('are there etcdless'): kine is the load-bearing standard interface; operator chooses backend (Dqlite via microk8s; SQLite / Postgres / MySQL / NATS via k3s+kine; Zeta-native DBSP+Raft via B-0766 wave 4). NATS JetStream backend particularly composes with B-0772 Rx fabric + B-0289 Reticulum + B-0773 digital twin. + +**B-0775** ('ha installs of kubernets that scales better'): scale ceiling depends on ARCHITECTURE not just BACKEND. Per-tier recommendation table covering 1-5 / 5-50 / 50-500 / 500-5000 / 5000+ / multi-region / edge / multi-tenant. Options include kine+CockroachDB, NATS super-cluster, Karmada (CNCF graduated), KubeStellar, vCluster, Cluster API, cell-based custom, Zeta-native cell-based (B-0766 wave 4+ endgame). + +Both sharpen B-0756 HA control-plane. 
Compose with B-0289 / B-0763 / B-0764 / B-0765 / B-0772 / B-0773. + +## General comments + +### @chatgpt-codex-connector (2026-05-26T01:09:13Z) + +You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). diff --git a/docs/pr-discussions/PR-5063-fix-pr-5020-address-post-merge-worktree-hygiene-review.md b/docs/pr-discussions/PR-5063-fix-pr-5020-address-post-merge-worktree-hygiene-review.md new file mode 100644 index 0000000000..418280a5c9 --- /dev/null +++ b/docs/pr-discussions/PR-5063-fix-pr-5020-address-post-merge-worktree-hygiene-review.md @@ -0,0 +1,57 @@ +--- +pr_number: 5063 +title: "fix(PR-5020): address post-merge worktree hygiene review" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T01:13:43Z" +merged_at: "2026-05-26T01:16:15Z" +closed_at: "2026-05-26T01:16:15Z" +head_ref: "claim/task-pr5020-post-merge-review-followup-codex-2026-05-26" +base_ref: "main" +archived_at: "2026-05-27T19:46:36Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5063: fix(PR-5020): address post-merge worktree hygiene review + +## PR description + +## Summary +- remove the not-yet-main-visible B-0751 row from B-0750 frontmatter graph links while keeping PR #5019 as prose substrate +- make the current-state worktree hygiene rule use role-reference prose for the empirical quote +- align the main-branch detector comment with its OK-on-success command output + +## Checks +- git diff --check HEAD~1..HEAD +- bun tools/backlog/generate-index.ts --check +- bun x markdownlint-cli2 .claude/rules/agent-worktree-hygiene-never-hold-main-never-step-on-operator-cleanup-on-pr-merge.md docs/backlog/P2/B-0750-agent-worktree-hygiene-rule-landing-plus-mechanization-target-cleanup-tooling-plus-worktree-pool-primitive-aaron-2026-05-25.md docs/BACKLOG.md +- rg '^<<<<<<<|^=======|^>>>>>>>' 
.claude/rules/agent-worktree-hygiene-never-hold-main-never-step-on-operator-cleanup-on-pr-merge.md docs/backlog/P2/B-0750-agent-worktree-hygiene-rule-landing-plus-mechanization-target-cleanup-tooling-plus-worktree-pool-primitive-aaron-2026-05-25.md docs/BACKLOG.md + +## Reviews + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T01:15:01Z) + +## Pull request overview + +This PR tightens post-merge worktree hygiene documentation by keeping B-0750’s graph links limited to main-visible backlog rows while preserving PR #5019 as prose context. + +**Changes:** +- Removes B-0751 from B-0750 frontmatter `composes_with` while retaining PR #5019 references in prose. +- Updates the generated backlog index title to match the B-0750 row title. +- Aligns the worktree hygiene rule’s detector comments and empirical-anchor prose with current-state role-reference wording. + +### Reviewed changes + +Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments. + +| File | Description | +| ---- | ----------- | +| `docs/backlog/P2/B-0750-agent-worktree-hygiene-rule-landing-plus-mechanization-target-cleanup-tooling-plus-worktree-pool-primitive-aaron-2026-05-25.md` | Updates B-0750 graph/prose references around PR #5019 and B-0751 visibility. | +| `docs/BACKLOG.md` | Regenerates the B-0750 index entry from the updated row title. | +| `.claude/rules/agent-worktree-hygiene-never-hold-main-never-step-on-operator-cleanup-on-pr-merge.md` | Clarifies expected command output and replaces direct quote with role-reference current-state prose. | + +## General comments + +### @chatgpt-codex-connector (2026-05-26T01:13:49Z) + +You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). 
diff --git a/docs/pr-discussions/PR-5066-backlog-b-0779-ai-nas-convergence-push-down-ai-processing-di.md b/docs/pr-discussions/PR-5066-backlog-b-0779-ai-nas-convergence-push-down-ai-processing-di.md new file mode 100644 index 0000000000..4b5155e387 --- /dev/null +++ b/docs/pr-discussions/PR-5066-backlog-b-0779-ai-nas-convergence-push-down-ai-processing-di.md @@ -0,0 +1,33 @@ +--- +pr_number: 5066 +title: "backlog(B-0779): AI NAS convergence \u2014 push-down AI processing directly to NAS (data-gravity at storage layer)" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T01:21:23Z" +merged_at: "2026-05-26T01:22:53Z" +closed_at: "2026-05-26T01:22:53Z" +head_ref: "otto-cli/b0779-ai-nas-convergence-pushdown-ai-to-storage-2026-05-25" +base_ref: "main" +archived_at: "2026-05-27T19:46:35Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5066: backlog(B-0779): AI NAS convergence — push-down AI processing directly to NAS (data-gravity at storage layer) + +## PR description + +Aaron 2026-05-25 sharpening B-0778: AI NAS = devices fusing storage + shared-memory NPU+iGPU+CPU compute on one chassis (UGREEN DXP, QNAP TS-AI, Synology, TerraMaster, Asustor Flashstor, ZimaCube, Aoostar WTR, CWWK, DIY ITX). Inference runs ON the NAS where data lives — no PCIe / network copy. + +**Load-bearing principle**: PUSH-DOWN AI TO STORAGE. Same pattern as NATS JetStream pushdown predicates + Zeta-native scheduler data-gravity (B-0767) + PostgreSQL pushdown + Hadoop 'ship code to data' + Spark locality. All instantiate bandwidth-engineering at every layer where compute + data can be co-located. + +Industry-sharp positioning (per B-0777): **storage-class-compute / in-storage processing / near-data computing** — established academic vocabulary; emerging product category; Zeta substrate composes naturally. + +Eliminates a tier; cheaper BOM ($1500-3000 vs $3000-5000 discrete-tier); better inference perf-per-watt for sub-30B models. 
+ +Composes with B-0754 / B-0755 / B-0758 / B-0759 / B-0760 / B-0763 / B-0767 / B-0768 (Itron-mode greenfield AI-NAS market opportunity) / B-0771 / B-0772 / B-0773 / B-0775 / B-0776 / B-0778. + +## General comments + +### @chatgpt-codex-connector (2026-05-26T01:21:27Z) + +You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). diff --git a/docs/pr-discussions/PR-5067-tools-wire-bash-retirement-inventory-guard.md b/docs/pr-discussions/PR-5067-tools-wire-bash-retirement-inventory-guard.md new file mode 100644 index 0000000000..cc08e68b67 --- /dev/null +++ b/docs/pr-discussions/PR-5067-tools-wire-bash-retirement-inventory-guard.md @@ -0,0 +1,171 @@ +--- +pr_number: 5067 +title: "tools: wire bash retirement inventory guard" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T01:25:21Z" +merged_at: "2026-05-26T02:51:20Z" +closed_at: "2026-05-26T02:51:20Z" +head_ref: "claim/task-bash-retirement-inventory-wire-20260512" +base_ref: "main" +archived_at: "2026-05-27T19:46:34Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5067: tools: wire bash retirement inventory guard + +## PR description + +## Summary +- add the bash-retirement inventory guard to package scripts +- wire the guard into the GitHub Actions gate +- update the TypeScript/Bun migration trajectory and release the scoped claim + +## Verification +- bun run hygiene:check-bash-retirement-inventory +- git diff --check origin/claim/task-bash-retirement-inventory-wire-20260512..HEAD + +## Reviews + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T01:27:37Z) + +## Pull request overview + +Wires the existing bash-retirement inventory guard into repo tooling so new non-allowlisted `.sh` entrypoints are prevented from landing (via a package script and a required gate workflow job), and updates the TypeScript/Bun migration trajectory resume to reflect the new enforcement 
surface. + +**Changes:** +- Adds `hygiene:check-bash-retirement-inventory` to `package.json` to run the inventory guard in `--enforce` mode. +- Adds a new `lint-bash-retirement-inventory` job to `.github/workflows/gate.yml` to run the guard in CI. +- Updates the TypeScript/Bun migration trajectory resume with the new wiring and an additional landed-slice entry. + +### Reviewed changes + +Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments. + +| File | Description | +| ---- | ----------- | +| package.json | Adds a Bun script entry for enforcing the bash-retirement inventory guard. | +| docs/trajectories/typescript-bun-migration/RESUME.md | Updates the migration resume to mention the new guard wiring and record the guard’s landing. | +| .github/workflows/gate.yml | Adds a gate lint job to enforce the bash-retirement inventory guard in CI. | + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T01:38:52Z) + +## Pull request overview + +Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments. + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T01:54:18Z) + +## Pull request overview + +Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments. + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T02:39:54Z) + +## Pull request overview + +Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment. 
+ +### COMMENTED — @AceHack (2026-05-26T02:50:34Z) + +_(no body)_ + +### COMMENTED — @AceHack (2026-05-26T02:50:35Z) + +_(no body)_ + +### COMMENTED — @AceHack (2026-05-26T02:50:37Z) + +_(no body)_ + +### COMMENTED — @AceHack (2026-05-26T02:50:39Z) + +_(no body)_ + +### COMMENTED — @AceHack (2026-05-26T02:50:41Z) + +_(no body)_ + +### COMMENTED — @AceHack (2026-05-26T02:50:43Z) + +_(no body)_ + +## Review threads + +### Thread 1: .github/workflows/gate.yml:591 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T01:27:36Z): + +The new gate job runs `./tools/setup/install.sh` without the cache + retry wrapper used by other lint jobs (e.g., `lint-shell`, `lint-workflows`). Given this workflow’s own comments about transient CDN failures and the need for retries, this job is likely to be flaky (or time out) on cold runners. Align this job with the established pattern: add the shared cache step and run install.sh inside the same retry/backoff loop (and size `timeout-minutes` accordingly). + +**@AceHack** (2026-05-26T02:50:34Z): + +Addressed at the current PR head: the bash-retirement inventory job now uses the shared install cache shape and wraps install.sh in the five-attempt retry/backoff loop before running the guard. Resolving. + +Provenance: surface=codex-background-service; origin=codex-launchd-loop; run_id=20260526T024748Z. + +### Thread 2: docs/trajectories/typescript-bun-migration/RESUME.md:10 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T01:27:37Z): + +This file was edited to reflect new wiring (package script + `gate.yml` job), but `**Last updated**` is still 2026-05-12. Update the timestamp so the resume’s metadata matches the actual change date; otherwise it reads as if these additions existed on 2026-05-12. + +**@AceHack** (2026-05-26T02:50:35Z): + +Addressed at the current PR head: the trajectory resume timestamp is updated to 2026-05-26 and the current state names the wire-in PR. Resolving. 
+ +Provenance: surface=codex-background-service; origin=codex-launchd-loop; run_id=20260526T024748Z. + +### Thread 3: tools/hygiene/check-bash-retirement-inventory.ts:44 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T01:38:51Z): + +The allowlist now includes non-setup paths under `tools/kiro/**`, but this script’s narrative and output still describe the retained surface as “setup/bootstrap only” (see the file header comment and `renderReport()` OK message). Please update the wording (and the corresponding test assertion) to reflect the expanded intent (e.g., include launchd-bootstrap) so the guard’s output doesn’t mislead future readers. + +**@AceHack** (2026-05-26T02:50:37Z): + +Addressed at the current PR head: the retained-scope wording now includes setup/bootstrap, launchd-bootstrap, and the Kiro wrapper, and the report/usage text share the exported scope constant. Resolving. + +Provenance: surface=codex-background-service; origin=codex-launchd-loop; run_id=20260526T024748Z. + +### Thread 4: .github/workflows/gate.yml:576 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T01:38:52Z): + +This job header comment says the retained `.sh` surface is “setup/bootstrap only”, but the inventory allowlist now includes launchd-bootstrap scripts under `tools/kiro/**`. Please update the comment to match the actual enforced policy (otherwise the workflow docs drift from the guard’s behavior). + +**@AceHack** (2026-05-26T02:50:38Z): + +Addressed at the current PR head: the workflow job comment now describes the enforced retained shell surface as setup/bootstrap, launchd-bootstrap, and the Kiro loop wrapper only. Resolving. + +Provenance: surface=codex-background-service; origin=codex-launchd-loop; run_id=20260526T024748Z. + +### Thread 5: .github/workflows/gate.yml:616 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T01:38:52Z): + +These comments introduce direct name attribution ("Aaron 2026-04-28 …") in a current-state workflow file. 
Per the repo’s operational standing rule “No name attribution in code, docs, or skills” (docs/AGENT-BEST-PRACTICES.md, Operational standing rules), please rewrite to role-refs (e.g., “the human maintainer”) or remove the attribution while keeping the rationale. + +**@AceHack** (2026-05-26T02:50:41Z): + +Addressed at the current PR head: the workflow comments use role-based maintainer wording and no direct name attribution. Resolving. + +Provenance: surface=codex-background-service; origin=codex-launchd-loop; run_id=20260526T024748Z. + +### Thread 6: tools/hygiene/check-bash-retirement-inventory.ts:138 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:39:53Z): + +P2 (maintainability): The scope descriptor string ("setup/bootstrap/launchd-bootstrap/Kiro-wrapper") is duplicated across renderReport() and usage(), and also mirrored in tests. Consider extracting it to a single constant so future scope changes don’t require multi-surface string edits (and reduce risk of doc/test drift). + +**@AceHack** (2026-05-26T02:50:43Z): + +Addressed at the current PR head: RETAINED_BASH_SCOPE is now the single scope descriptor reused by renderReport(), usage(), and the test assertions. Resolving. + +Provenance: surface=codex-background-service; origin=codex-launchd-loop; run_id=20260526T024748Z. + +## General comments + +### @chatgpt-codex-connector (2026-05-26T01:25:25Z) + +You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). 
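
The single-constant refactor discussed in Thread 6 can be pictured with a small sketch. Only the name `RETAINED_BASH_SCOPE` and the scope string are taken from the thread; the signatures and message wording of `renderReport()` and `usage()` below are hypothetical stand-ins, not the code in `check-bash-retirement-inventory.ts`.

```typescript
// Scope string quoted in Thread 6; defined once so every surface stays in sync.
const RETAINED_BASH_SCOPE = "setup/bootstrap/launchd-bootstrap/Kiro-wrapper";

// Hypothetical report renderer: the real function's signature is not shown in the thread.
function renderReport(violations: string[]): string {
  if (violations.length === 0) {
    return `OK: retained .sh surface is ${RETAINED_BASH_SCOPE} only`;
  }
  return [
    `FAIL: ${violations.length} .sh file(s) outside ${RETAINED_BASH_SCOPE}:`,
    ...violations.map((v) => `  - ${v}`),
  ].join("\n");
}

// Hypothetical usage text that reuses the same descriptor.
function usage(): string {
  return `Checks that every .sh entrypoint falls within: ${RETAINED_BASH_SCOPE}`;
}
```

With this shape, a test can assert against the constant instead of a copied literal, so a future scope change is a one-line edit.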
diff --git a/docs/pr-discussions/PR-5068-research-backlog-mika-grok-2026-05-25-substrate-batch-local.md b/docs/pr-discussions/PR-5068-research-backlog-mika-grok-2026-05-25-substrate-batch-local.md new file mode 100644 index 0000000000..6266a6c182 --- /dev/null +++ b/docs/pr-discussions/PR-5068-research-backlog-mika-grok-2026-05-25-substrate-batch-local.md @@ -0,0 +1,114 @@ +--- +pr_number: 5068 +title: "research+backlog: Mika-Grok 2026-05-25 substrate batch \u2014 Local Loop / F# type system as universe / DIO / eliminate tool wars" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T01:49:14Z" +merged_at: "2026-05-26T02:57:26Z" +closed_at: "2026-05-26T02:57:26Z" +head_ref: "otto-cli/mika-grok-2026-05-25-substrate-batch-local-loop-fsharp-universe-dio-tool-wars" +base_ref: "main" +archived_at: "2026-05-27T19:46:34Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5068: research+backlog: Mika-Grok 2026-05-25 substrate batch — Local Loop / F# type system as universe / DIO / eliminate tool wars + +## PR description + +Aaron-Mika-Grok 2026-05-25 mid-iter-3-CI-wait. Aaron forwarded; verbatim preservation in docs/research/. + +Four substantively-new substrate items: + +- **B-0780 P1** Local Loop — deterministic simulation testing of K8s deployments; LexisNexis (Spark-on-K8s fork) empirical anchor; three-tier testing (pure-code / Docker / CI); Argo CD App-of-Apps as packages.json. Deep motivation underneath B-0767 scheduler. +- **B-0781 P1** F# type system as universe boundary — every YAML / Nix / K8s / Argo / NATS config becomes first-class strongly-typed F#; Caché analog at cluster scope. Composes with everything. +- **B-0782 P2** DIO + CEO-scale — Distributed Intelligent Organization per company; CEO scales by speaking ontology, not implementation. Composes with B-0769 substrate-honest VC variant. +- **B-0783 P2** Eliminate tool wars — sharpening of B-0759 persona framing; NOT 'humans do less' but 'humans refocus intention to what matters'. 
+ +Composes with B-0428 / B-0741 / B-0747 / B-0754 / B-0759 / B-0763 / B-0765 / B-0766 / B-0767 / B-0769 / B-0772 / B-0773 / B-0775 / B-0776 / B-0777. Mika cross-AI validation of substrate cluster direction. + +## Reviews + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T01:54:07Z) + +## Pull request overview + +This PR adds a verbatim research transcript (Aaron ↔ Mika via Grok, 2026-05-25) and files four newly-extracted substrate items into the per-row backlog, then updates the generated backlog index to include the new rows. + +**Changes:** +- Add preserved research transcript under `docs/research/` and extract “substrate-engineering items” for backlog filing. +- Add new backlog rows B-0780..B-0783 (Local Loop, F# universe boundary, DIO, and “eliminate tool wars” framing). +- Update `docs/BACKLOG.md` to index the new rows. + +### Reviewed changes + +Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments. + +
+| File | Description | +| ---- | ----------- | +| docs/research/2026-05-25-aaron-mika-grok-nats-jetstream-deterministic-scheduler-local-loop-lexisnexis-fsharp-type-system-as-universe-dio-eliminate-tool-wars-aaron-forwarded.md | Adds verbatim preservation + extracted substrate items list for filing. | +| docs/backlog/P1/B-0780-local-loop-deterministic-simulation-testing-of-kubernetes-deployments-lexisnexis-lineage-three-tier-testing-argocd-apps-as-packages-aaron-mika-2026-05-25.md | New P1 backlog row defining “Local Loop” deterministic K8s simulation/testing substrate. | +| docs/backlog/P1/B-0781-f-sharp-type-system-as-universe-boundary-every-yaml-nix-kubernetes-argocd-nats-config-becomes-first-class-fsharp-cachet-analog-aaron-mika-2026-05-25.md | New P1 backlog row for “F# type system as universe boundary” concept + Caché anchor. | +| docs/backlog/P2/B-0782-distributed-intelligent-organization-dio-per-company-on-distributed-intelligence-database-ceo-of-30-companies-scales-by-speaking-ontology-aaron-mika-2026-05-25.md | New P2 backlog row defining DIO framing and CEO-scale operator pattern. | +| docs/backlog/P2/B-0783-eliminate-tool-wars-sharpening-of-b0759-first-time-cli-user-persona-not-humans-do-less-but-humans-refocus-intention-aaron-mika-2026-05-25.md | New P2 backlog row sharpening persona/framing from “humans do less” to “eliminate tool wars”. | +| docs/BACKLOG.md | Updates generated index to include B-0780..B-0783. | +
+ +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T02:01:56Z) + +## Pull request overview + +Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments. + +## Review threads + +### Thread 1: docs/backlog/P2/B-0783-eliminate-tool-wars-sharpening-of-b0759-first-time-cli-user-persona-not-humans-do-less-but-humans-refocus-intention-aaron-mika-2026-05-25.md:5 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T01:54:05Z): + +P1: The YAML `title` value ends with a literal double-quote, but it is not YAML-quoted. `tools/backlog/generate-index.ts` strips a trailing `"` from unquoted values, so the generated `docs/BACKLOG.md` entry loses the closing quote. Wrap the entire title in single quotes (or avoid ending the title with `"`) so the index generator preserves the intended text. + +### Thread 2: docs/BACKLOG.md:714 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T01:54:06Z): + +P1: This index entry appears truncated (missing the trailing `"` at the end of the title) compared to the per-row file's `title`. Since `docs/BACKLOG.md` is auto-generated, fix the row frontmatter so the generator preserves the full title, then re-run the backlog index generator to update this line. + +### Thread 3: docs/backlog/P1/B-0781-f-sharp-type-system-as-universe-boundary-every-yaml-nix-kubernetes-argocd-nats-config-becomes-first-class-fsharp-cachet-analog-aaron-mika-2026-05-25.md:6 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T01:54:06Z): + +P2: The frontmatter title says “Cachet analog”, but the body immediately corrects the product name to InterSystems Caché. Consider updating the title (and, if practical, the filename slug/index entry) to use the corrected “Caché” spelling so the index/search surface doesn’t perpetuate the typo. 
+ +### Thread 4: docs/backlog/P1/B-0781-f-sharp-type-system-as-universe-boundary-every-yaml-nix-kubernetes-argocd-nats-config-becomes-first-class-fsharp-cachet-analog-aaron-mika-2026-05-25.md:186 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T01:54:06Z): + +P1: The acceptance item “Backward-compat: existing YAML / Nix / etc. config still consumable” conflicts with the repo’s stated principle “Cutting-edge over legacy-compat… no pattern is owed its backward-compat debt” (docs/CONFLICT-RESOLUTION.md:53-54). If you want a migration story, consider framing it as an explicit one-way conversion / staged cutover plan rather than a backward-compatibility guarantee. + +### Thread 5: docs/backlog/P1/B-0780-local-loop-deterministic-simulation-testing-of-kubernetes-deployments-lexisnexis-lineage-three-tier-testing-argocd-apps-as-packages-aaron-mika-2026-05-25.md:164 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T01:54:07Z): + +P1: The F# example under “Three-tier test harness API” won’t compile as written because `Local Loop.simulate` contains a space (F# identifiers can’t include spaces). If this is intended to be a compilable reference pattern, rename it to a valid identifier (e.g., `LocalLoop.simulate` / `LocalLoop.Simulate`, etc.) to avoid misleading future implementers. + +### Thread 6: docs/BACKLOG.md:717 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:01:55Z): + +The new B-0783 entry is missing its closing quote/parenthesis at the end (and therefore the markdown line is unterminated). This will break the rendering of docs/BACKLOG.md around this section; please close the quoted phrase so the line is valid markdown. + +### Thread 7: docs/backlog/P1/B-0784-distributed-fsharp-type-negotiation-as-consensus-and-governance-namespace-scoped-strictness-aaron-mika-2026-05-25.md:14 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:01:55Z): + +Frontmatter `composes_with` lists `B-0428` twice. 
Please remove the duplicate to keep dependency metadata unambiguous (and avoid duplicate rendering in generated indexes). + +## General comments + +### @chatgpt-codex-connector (2026-05-26T01:49:20Z) + +You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). diff --git a/docs/pr-discussions/PR-5069-feat-claude-code-recovery-repair-jsonl-sessions-corrupted-by.md b/docs/pr-discussions/PR-5069-feat-claude-code-recovery-repair-jsonl-sessions-corrupted-by.md new file mode 100644 index 0000000000..8ab3104081 --- /dev/null +++ b/docs/pr-discussions/PR-5069-feat-claude-code-recovery-repair-jsonl-sessions-corrupted-by.md @@ -0,0 +1,238 @@ +--- +pr_number: 5069 +title: "feat(claude-code-recovery): repair JSONL sessions corrupted by oversize images" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T02:21:27Z" +merged_at: "2026-05-26T02:36:35Z" +closed_at: "2026-05-26T02:36:35Z" +head_ref: "otto-cli/claude-code-jsonl-recovery-script-2026-05-25" +base_ref: "main" +archived_at: "2026-05-27T19:46:33Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5069: feat(claude-code-recovery): repair JSONL sessions corrupted by oversize images + +## PR description + +## Summary + +Repair Claude Code sessions that won't reopen because a pasted image +ballooned a single JSONL line past the harness session-load limit. + +Born from a live incident: session `c2b77530` ("Assemble declarative +infrastructure files for Zeta") had a 13.4 MB line at 15230 from a +9.8 MB PNG attachment — wouldn't reopen, hours of work at risk. The +recovery procedure is mechanical but easy to get wrong; this lands +it as a reusable tool + skill so the next operator hit doesn't lose +substrate. 
+ +## What lands + +- `tools/claude-code-recovery/repair-jsonl-strip-images.ts` — Bun-hosted + TypeScript, dry-run by default, refuses to write if any post-edit + line fails to parse +- `tools/claude-code-recovery/README.md` — usage + the why +- `.claude/skills/claude-session-recovery/SKILL.md` — triggers on + "session won't open", "image too large", "edit out the image", etc. + +## Per-image vs per-line threshold + +The script splits two thresholds intentionally: + +- `--max-line-bytes` (default 10 MB) — which JSONL lines to inspect +- `--max-image-bytes` (default = `--max-line-bytes`) — which individual + images on those lines to strip + +**Small images on an oversize line are preserved.** Only images whose +base64 length individually exceeds `--max-image-bytes` are removed. +This keeps the strip surgical: if a corrupted turn contained a few +small thumbnails plus one 13 MB screenshot, only the screenshot is +removed. + +## Operator-runs discipline + +The Claude Code auto-mode classifier blocks the agent from running +`--apply` directly. This is correct per +`.claude/rules/classifier-bypass-research-do-not-deploy-without-zeta-safer-floor.md` +— `.jsonl` edits are self-modification territory. + +The skill walks the agent through scan + dry-run + handing `--apply` +to the operator. The script's built-in backup +(`.bak-YYYY-MM-DD-`) makes the operation reversible. +No `.claude/settings.json` `_*_acceptance` block is needed for +one-off recoveries — the operator-runs split keeps authority anchored +to the operator without ceremonial scaffolding. 
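
The per-line versus per-image split described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `repair-jsonl-strip-images.ts`: the `Block` shapes, the name `stripOversizeImages`, the numeric value used for "10 MB", and the strict `>` comparison are all hypothetical. Only the two thresholds, the default of `--max-image-bytes` following `--max-line-bytes`, and the "small images survive" behavior come from the description.

```typescript
// Hypothetical content-block shapes; the real session schema may differ.
interface ImageBlock {
  type: "image";
  source: { data: string }; // base64 payload
}

interface TextBlock {
  type: "text";
  text: string;
}

type Block = ImageBlock | TextBlock;

// "10 MB" default from the description; the exact number is an assumption.
const DEFAULT_MAX_LINE_BYTES = 10_000_000;

/**
 * Returns the blocks to keep for one JSONL line.
 * Lines at or under maxLineBytes are left untouched. On oversize lines, only
 * images whose base64 length exceeds maxImageBytes are dropped.
 */
function stripOversizeImages(
  lineBytes: number,
  blocks: Block[],
  maxLineBytes: number = DEFAULT_MAX_LINE_BYTES,
  maxImageBytes: number = maxLineBytes,
): Block[] {
  if (lineBytes <= maxLineBytes) return blocks;
  return blocks.filter(
    (b) => b.type !== "image" || b.source.data.length <= maxImageBytes,
  );
}
```

Raising `maxImageBytes` above the size of the largest image leaves every block in place, which mirrors the discriminator check listed in the test plan.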
+ +## Test plan + +- [x] `--scan` correctly finds c2b77530, only flags lines above the threshold +- [x] Dry-run on c2b77530 with default thresholds: identifies the 13 MB image, would free 13.1 MB (53 MB → 40 MB) +- [x] Per-image threshold discriminator: raising `--max-image-bytes 20000000` correctly preserves the 13 MB image (proves small images would survive) +- [x] `--help` output shows the new flag +- [ ] (operator) Run `--apply` against c2b77530 to actually recover the session +- [ ] (operator) Confirm session reopens in Claude Code after the strip +- [ ] Skill activates correctly on future trigger phrases ("session won't open" / "image too large" / etc.) + +## Composes with + +- `.claude/rules/classifier-bypass-research-do-not-deploy-without-zeta-safer-floor.md` — the operator-runs gate that mandates the agent/operator split +- `.claude/rules/human-audit-and-legal-risk-acceptance-pattern-in-settings.md` — the four-field attribution scaffold (would apply if this becomes recurring enough to want automation past operator-runs) +- User-scope memory file `feedback_claude_code_session_jsonl_oversize_image_self_heal_recovery_pattern_classifier_blocks_claude_operator_runs_2026_05_25.md` (the self-heal pattern + the c2b77530 anchor) + +🤖 Generated with [Claude Code](https://claude.com/claude-code) + +## Reviews + +### COMMENTED — @github-advanced-security (2026-05-26T02:23:42Z) + +_(no body)_ + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T02:24:49Z) + +## Pull request overview + +Adds a Bun/TypeScript recovery utility and companion documentation/skill to repair Claude Code session JSONL files that can’t be reopened due to oversize pasted image attachments (base64 ballooning a single JSONL line past the harness load limit). + +**Changes:** +- Introduces `repair-jsonl-strip-images.ts` to scan session JSONL files for oversize lines and strip only oversize image blocks (dry-run by default, backup + validation on apply). 
+- Documents usage, thresholds, and the agent/operator split for safe `--apply` execution. +- Adds a `claude-session-recovery` skill to guide operators through scan → dry-run → operator-run apply. + +### Reviewed changes + +Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments. + +| File | Description | +| ---- | ----------- | +| tools/claude-code-recovery/repair-jsonl-strip-images.ts | New CLI tool to scan/repair Claude Code session JSONL lines corrupted by oversize image blocks. | +| tools/claude-code-recovery/README.md | Usage and operational guidance for applying the recovery tool safely. | +| .claude/skills/claude-session-recovery/SKILL.md | New skill to route “session won’t open due to image” incidents into the scripted recovery procedure. | + +### COMMENTED — @github-advanced-security (2026-05-26T02:29:59Z) + +_(no body)_ + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T02:31:08Z) + +## Pull request overview + +Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments. + +## Review threads + +### Thread 1: tools/claude-code-recovery/repair-jsonl-strip-images.ts:298 (resolved) + +**@github-advanced-security** (2026-05-26T02:23:41Z): + +## CodeQL / Potential file system race condition + +The file may have changed since it [was checked](1). + +[Show more details](https://github.com/Lucent-Financial-Group/Zeta/security/code-scanning/93) + +### Thread 2: tools/claude-code-recovery/repair-jsonl-strip-images.ts:373 (resolved) + +**@github-advanced-security** (2026-05-26T02:23:42Z): + +## CodeQL / Potential file system race condition + +The file may have changed since it [was checked](1). +The file may have changed since it [was checked](2). 
+ +[Show more details](https://github.com/Lucent-Financial-Group/Zeta/security/code-scanning/94) + +### Thread 3: tools/claude-code-recovery/repair-jsonl-strip-images.ts:50 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:24:47Z): + +This tool hard-codes a maintainer-specific Claude project slug (and embeds a user handle) as DEFAULT_SLUG. That makes the default behavior fail for any operator whose repo path differs, and it leaks machine-specific identity into a reusable tool. Prefer deriving the slug from the current repo absolute path (Claude’s slugging rule) or require --slug explicitly and print a clear error showing how to find the correct slug. + +### Thread 4: tools/claude-code-recovery/repair-jsonl-strip-images.ts:90 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:24:47Z): + +Argument parsing accepts flags without values (e.g., "--slug" at end) and then proceeds with undefined/NaN values (Number(undefined) => NaN), which will cause confusing behavior later. Add validation that every value-taking flag has a following non-flag token, and reject invalid numbers; also consider allowing numeric underscores by normalizing (remove '_' / ',' before Number) since the help/comments show underscore-separated literals. + +### Thread 5: tools/claude-code-recovery/repair-jsonl-strip-images.ts:312 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:24:48Z): + +Repair mode inspects lines with length >= --max-line-bytes, but scan mode flags only lines with length > --max-line-bytes. This threshold mismatch means a line exactly at the limit will be processed in repair but never appear in scan output. Make the comparison consistent (either both '>' or both '>=' / '<='). 
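
The concerns in Threads 4 and 5 (flags without values, `NaN` thresholds, and a `>` versus `>=` mismatch between scan and repair) could be addressed along these lines. This is a sketch only: `parseByteCount` and `isOversize` are hypothetical helper names, and choosing strict `>` for both modes is an assumption, since the threads do not say which comparison was kept.

```typescript
/**
 * Parses a byte-count flag value such as "10_000_000" or "10,000,000".
 * Throws when the value is missing, looks like another flag, or is not a
 * non-negative integer after separators are removed.
 */
function parseByteCount(flag: string, raw: string | undefined): number {
  if (raw === undefined || raw.startsWith("--")) {
    throw new Error(`${flag} requires a value`);
  }
  const normalized = raw.replace(/[_,]/g, "");
  if (!/^\d+$/.test(normalized)) {
    throw new Error(`${flag} expects a non-negative integer, got "${raw}"`);
  }
  return Number(normalized);
}

/** Single predicate shared by scan and repair so both agree at the boundary. */
function isOversize(lineBytes: number, maxLineBytes: number): boolean {
  return lineBytes > maxLineBytes;
}
```

Routing both modes through one predicate means a line exactly at the limit is treated the same way everywhere, whichever comparison is chosen.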
+ +### Thread 6: tools/claude-code-recovery/repair-jsonl-strip-images.ts:385 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:24:48Z): + +Most tools/ scripts in this repo export a main(argv) and only call process.exit when run as the entrypoint (if (import.meta.main)). This file unconditionally calls main() and exits from inside helpers, which makes it harder to import/reuse/test and diverges from established tooling structure. Refactor to export main(argv) -> exit code, and guard the process.exit call behind import.meta.main. + +### Thread 7: .claude/skills/claude-session-recovery/SKILL.md:7 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:24:48Z): + +Skill frontmatter description is far above the repo’s carved-sentence guidance (one sentence, <120 chars preferred, <150 target) and risks being truncated/dropped by the skill router budget. Carve description down to a single routing-quality sentence and move procedural detail into the skill body. + +### Thread 8: .claude/skills/claude-session-recovery/SKILL.md:34 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:24:48Z): + +This grep example hard-codes a maintainer-specific project slug; on any other machine the slug will differ and the command will miss sessions. Use a placeholder (or instruct deriving the slug from the repo path) so the skill remains usable for other operators. + +### Thread 9: tools/claude-code-recovery/README.md:50 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:24:48Z): + +The README documents a specific operator’s slug as the default. Since Claude’s project slug depends on the operator’s absolute repo path, this default will be wrong for most users. Prefer documenting the slug as "" (or explain how to compute/find it), and consider removing the hard-coded default from the tool so docs and behavior stay consistent. 
+ +### Thread 10: tools/claude-code-recovery/repair-jsonl-strip-images.ts:327 (resolved) + +**@github-advanced-security** (2026-05-26T02:29:59Z): + +## CodeQL / Potential file system race condition + +The file may have changed since it [was checked](1). + +[Show more details](https://github.com/Lucent-Financial-Group/Zeta/security/code-scanning/95) + +### Thread 11: tools/claude-code-recovery/repair-jsonl-strip-images.ts:352 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:31:06Z): + +The `--max-line-bytes` / `--max-image-bytes` flags are documented as byte thresholds, but this code compares against `ln.length` (UTF-16 code units) rather than the UTF-8 byte length. For non-ASCII content, this will misclassify lines near the threshold and make `--scan` / `--session` inconsistent with the stated units. Use `Buffer.byteLength(ln, "utf8")` (and similarly when computing line lengths in `scanFile`). + +### Thread 12: tools/claude-code-recovery/repair-jsonl-strip-images.ts:272 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:31:07Z): + +`scanFile` computes per-line “bytes” using string indices from `readFileSync(path, "utf8")`, which are not byte counts for UTF-8. Since the tool is explicitly thresholding in bytes, read the file as a `Buffer` (no encoding) and count bytes by scanning for `0x0A`, or compute each line’s size via `Buffer.byteLength` on slices. + +### Thread 13: tools/claude-code-recovery/repair-jsonl-strip-images.ts:405 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:31:07Z): + +`newContent.length` is used as “bytes” in the size delta log, but `.length` is characters (UTF-16 code units), while `sizeBefore` is bytes from `statSync`. Use `Buffer.byteLength(newContent, "utf8")` for accurate byte reporting (and to avoid negative/incorrect “saves …” output when non-ASCII appears). 
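
Threads 11 to 13 all turn on the difference between a string's `.length` (UTF-16 code units) and its UTF-8 byte length. The sketch below illustrates that difference and the "scan for `0x0A`" approach to per-line sizes. It uses the standard `TextEncoder` in place of the `Buffer.byteLength(ln, "utf8")` call the reviewer suggests, purely to keep the example free of Node-specific imports; the helper names `utf8ByteLength` and `lineByteSizes` are hypothetical.

```typescript
const encoder = new TextEncoder();

/** UTF-8 byte length of a string, as opposed to its UTF-16 length. */
function utf8ByteLength(s: string): number {
  return encoder.encode(s).length;
}

/** Byte size of each line in a raw buffer, split on 0x0A (newline excluded). */
function lineByteSizes(bytes: Uint8Array): number[] {
  const sizes: number[] = [];
  let start = 0;
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] === 0x0a) {
      sizes.push(i - start);
      start = i + 1;
    }
  }
  // Final line when the buffer does not end with a newline.
  if (start < bytes.length) sizes.push(bytes.length - start);
  return sizes;
}
```

For pure-ASCII content the two measures agree, which is why the mismatch only shows up on lines containing non-ASCII text near the threshold.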
+ +### Thread 14: tools/claude-code-recovery/repair-jsonl-strip-images.ts:429 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:31:07Z): + +`--apply` writes in-place via `writeFileSync(path, newContent)`. If the process is interrupted mid-write (crash / out-of-disk), this can leave the session file truncated/corrupted (even though a backup exists). Prefer writing to a temp file in the same directory and `renameSync` to replace atomically (and preserve original file mode/ownership as needed). + +### Thread 15: tools/claude-code-recovery/repair-jsonl-strip-images.ts:256 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:31:08Z): + +`processLine` says it annotates “the first container that lost images”, but the current logic annotates `attachment.prompt` whenever it exists, even if the actual drops happened in `message.content` or top-level `content`. This can add a synthetic text block to an unrelated container and fail to annotate the container that was actually modified. Track which container(s) had drops (e.g., have `apply(...)` return a boolean / record the first modified container reference) and call `annotate` on that specific container. + +### Thread 16: .claude/skills/claude-session-recovery/SKILL.md:3 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:31:08Z): + +Skill frontmatter `description` is significantly longer than the repo guidance to carve skill descriptions to <120 chars (see docs/ops/COST-REDUCTION-LESSONS.md:53-55 / B-0347). Consider shrinking this to a single carved sentence and move the trigger-phrase list into the body (it’s already present under “Specific trigger phrases”). + +## General comments + +### @chatgpt-codex-connector (2026-05-26T02:21:32Z) + +You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). 
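
The temp-file-plus-rename approach suggested in Thread 14 might look roughly like this. It is a minimal sketch, not the tool's implementation: `writeFileAtomic` is a hypothetical name, the temp-file naming scheme is an assumption, and ownership preservation is not attempted.

```typescript
import { renameSync, statSync, writeFileSync } from "node:fs";
import { basename, dirname, join } from "node:path";

/**
 * Replaces `path` with `content` by writing a sibling temp file and renaming
 * it over the target, so readers see either the old or the new file.
 */
function writeFileAtomic(path: string, content: string): void {
  // The temp file lives beside the target so the rename stays on one filesystem.
  const tmp = join(dirname(path), `.${basename(path)}.tmp-${Date.now()}`);

  let mode = 0o600; // fallback when the target does not exist yet
  try {
    mode = statSync(path).mode & 0o777; // carry over the original permission bits
  } catch {
    // Target missing: keep the fallback mode.
  }

  // The requested mode is still subject to the process umask.
  writeFileSync(tmp, content, { encoding: "utf8", mode });
  renameSync(tmp, path);
}
```

If the process dies during `writeFileSync`, the original session file is untouched and only the temp file needs cleaning up. The built-in backup described in the PR would still be worth keeping as a second line of defense.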
diff --git a/docs/pr-discussions/PR-5070-backlog-b-0788-agent-on-agent-claude-code-session-recovery-l.md b/docs/pr-discussions/PR-5070-backlog-b-0788-agent-on-agent-claude-code-session-recovery-l.md new file mode 100644 index 0000000000..2a9ceefa51 --- /dev/null +++ b/docs/pr-discussions/PR-5070-backlog-b-0788-agent-on-agent-claude-code-session-recovery-l.md @@ -0,0 +1,109 @@ +--- +pr_number: 5070 +title: "backlog(B-0788): agent-on-agent Claude Code session recovery \u2014 lift operator-runs gate once safety substrate stronger than classifier" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T02:39:39Z" +merged_at: "2026-05-26T02:45:18Z" +closed_at: "2026-05-26T02:45:18Z" +head_ref: "otto-cli/b0788-agent-on-agent-session-recovery-2026-05-25" +base_ref: "main" +archived_at: "2026-05-27T19:46:32Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5070: backlog(B-0788): agent-on-agent Claude Code session recovery — lift operator-runs gate once safety substrate stronger than classifier + +## PR description + +## Summary + +Files B-0788 — the P3 deferred row capturing the eventual-state +design for cross-agent Claude Code session recovery without a +human in the loop. Composes with PR #5069 (which shipped the +operator-runs interim). + +Aaron 2026-05-26 explicit authorization: *"this is something we +can allow other agents to run on each other even tually so we +need no human operator"* + agreed with the composed design. + +## Why deferred + +All three lift criteria from `.claude/rules/classifier-bypass-research-do-not-deploy-without-zeta-safer-floor.md` must hold first: + +1. Zeta safety substrate on `.jsonl` edits stronger than the Anthropic classifier +2. Knights Guild ratification (per B-0628) +3. Maintainer-collective re-authorization after substrate is in place + +None are met today; row tracks the design so it's not lost. 
+ +## Eventual shape (3 components) + +- **settings.json `_session_recovery_acceptance` block** (four-field attribution per `human-audit-and-legal-risk-acceptance-pattern-in-settings.md`) +- **`session-recovery-request` bus envelope topic** for inter-Otto coordination +- **Recovery audit log** (DV2.0 hub-satellite shape — `recovery-event` hub + `recovery-strip-detail` satellite) + +Full design in the row body. + +## Test plan + +- [x] B-0788 ID free on origin/main (highest claimed: B-0787 in-flight via PR #5068) +- [x] BACKLOG.md regenerated via `bun tools/backlog/generate-index.ts` +- [x] Row follows P3 frontmatter convention (id/title/status/priority/size/created/authors/composes_with/depends_on/labels) +- [x] `composes_with` populated with B-0628 + B-0664; `depends_on` left empty (no specific row to point at for "stronger than classifier" criterion yet) +- [x] Row body lists premature-action risks so the deferral is substrate-honest + +🤖 Generated with [Claude Code](https://claude.com/claude-code) + +## Reviews + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T02:42:10Z) + +## Pull request overview + +Adds a new P3 backlog row (B-0788) documenting the *deferred* eventual-state design for agent-on-agent Claude Code session recovery (lifting the operator-runs gate only after explicit safety-substrate lift criteria are met), and refreshes the generated backlog index accordingly. + +**Changes:** +- Added `docs/backlog/P3/B-0788-…` capturing lift criteria + 3-component eventual design (settings acceptance, bus envelope, audit log). +- Regenerated `docs/BACKLOG.md` to include the new row (and any newly discovered P3 rows in that range). + +### Reviewed changes + +Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments. 
+ +| File | Description | +| ---- | ----------- | +| docs/backlog/P3/B-0788-agent-on-agent-claude-code-session-recovery-lift-operator-runs-gate-once-zeta-safety-substrate-stronger-than-classifier-2026-05-25.md | New P3 backlog row documenting deferred design + lift criteria for agent-on-agent session recovery. | +| docs/BACKLOG.md | Updated generated index to include B-0788 (and surrounding P3 rows). | + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T02:45:26Z) + +## Pull request overview + +Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment. + +## Review threads + +### Thread 1: docs/backlog/P3/B-0788-agent-on-agent-claude-code-session-recovery-lift-operator-runs-gate-once-zeta-safety-substrate-stronger-than-classifier-2026-05-25.md:9 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:42:10Z): + +`tools/backlog/README.md` documents `last_updated` as a required frontmatter field (and used to track edits). This row is missing it; for a new row it should generally match `created` until the next edit. + +### Thread 2: docs/backlog/P3/B-0788-agent-on-agent-claude-code-session-recovery-lift-operator-runs-gate-once-zeta-safety-substrate-stronger-than-classifier-2026-05-25.md:5 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:42:10Z): + +The row body repeatedly describes this as deferred until lift criteria are met (e.g., “this row stays in P3 deferred state”), but the frontmatter `status` is `open`. Consider aligning the frontmatter to match the semantics (e.g., `status: deferred`) or adjusting the prose if `open` is intentional. + +### Thread 3: docs/backlog/P3/B-0788-agent-on-agent-claude-code-session-recovery-lift-operator-runs-gate-once-zeta-safety-substrate-stronger-than-classifier-2026-05-25.md:9 (unresolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:45:26Z): + +`last_updated` should reflect the most recent content edit date. 
The row body references events on 2026-05-26 (e.g., PR #5069 merge time and the 2026-05-26 authorization), so leaving `last_updated: 2026-05-25` violates the per-row backlog convention that `last_updated` is updated on every content edit. + +## General comments + +### @chatgpt-codex-connector (2026-05-26T02:39:42Z) + +You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). diff --git a/docs/pr-discussions/PR-5071-build-agentic-organization-package-architecture-slice.md b/docs/pr-discussions/PR-5071-build-agentic-organization-package-architecture-slice.md new file mode 100644 index 0000000000..87e80cafdf --- /dev/null +++ b/docs/pr-discussions/PR-5071-build-agentic-organization-package-architecture-slice.md @@ -0,0 +1,145 @@ +--- +pr_number: 5071 +title: "Build Agentic Organization package architecture slice" +author: "maximdolphin" +state: "MERGED" +created_at: "2026-05-26T02:41:02Z" +merged_at: "2026-05-26T02:58:03Z" +closed_at: "2026-05-26T02:58:03Z" +head_ref: "codex/agentic-org-package-ca-clean" +base_ref: "main" +archived_at: "2026-05-27T19:46:31Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5071: Build Agentic Organization package architecture slice + +## PR description + +## Summary +- Adds the package-first Agentic Organization implementation slice on the Node/Nest-compatible TypeScript stack. +- Wires command outcome persistence through generic ports with Cockroach as a replaceable adapter, including idempotency race handling. +- Adds worker/event ingestion, NATS/Cockroach boundaries, observability contracts, OpenSpec scenarios, and docs aligned to the Agentic Organization north star. +- Adds the first `@agentic-org/policy` package and gates command execution through generic `CommandAuthorizationPort` / `HatAuthorityPort` checks before idempotency lookup, handler dispatch, or persistence. 
+- Documents remaining policy-visibility work: durable denial observation and allowed policy-decision projection into audit/outbox envelopes before real API/MCP/Hermes/Temporal/Dapr entrypoints are exposed. + +## Validation +- npm test +- npm run typecheck +- git diff --check +- subagent review: architecture/SOLID/vendor boundaries passed +- subagent review: correctness/TDD/policy gate behavior passed +- subagent review: mission/docs/north-star alignment passed + +## Reviews + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T02:44:43Z) + +## Pull request overview + +This PR expands the Agentic Organization TypeScript “package-first” slice by adding inbound event ingestion (with inbox dedupe + reaction plan persistence), a worker host/app runtime composition layer, and governance/observability contracts to keep adapters behind ports and keep tests out of production source trees. + +**Changes:** +- Adds event ingestion processor + stores (in-memory + Cockroach adapter) to support inbox receipts, payload-conflict detection, and reaction plan persistence behind generic ports. +- Adds `@agentic-org/workers` run-once worker host and an `apps/workers` runtime shell that composes worker + NATS consumer loops and records telemetry. +- Extends governance checks (dependency boundaries + source layout), observability helpers (worker cycle + NATS consumer batch attributes), and updates OpenSpec/docs to match the new slice contracts. + +### Reviewed changes + +Copilot reviewed 67 out of 67 changed files in this pull request and generated 2 comments. + +
+| File | Description |
+| ---- | ----------- |
+| openspec/specs/agentic-organization/spec.md | Adds scenarios for effect-returning command handlers, vendor boundaries, source layout governance, event ingestion/NATS consumer behavior, and worker/runtime expectations. |
+| agentic-organization/tsconfig.json | Includes `apps/**/*.ts` in typechecking. |
+| agentic-organization/packages/workers/test/worker-host.test.ts | Adds worker-host tests covering bounded cycles, idle, degraded lanes, and per-event failure handling. |
+| agentic-organization/packages/workers/src/worker-host.ts | Introduces run-once worker host composing outbox publishing + inbound ingestion via ports. |
+| agentic-organization/packages/workers/src/index.ts | Exports worker host types/constructors. |
+| agentic-organization/packages/state/src/index.ts | Re-exports event-ingestion store types and in-memory store factory. |
+| agentic-organization/packages/state/src/in-memory-organization-store.ts | Switches to single `recordCommandOutcome` persistence port (idempotency + effects). |
+| agentic-organization/packages/state/src/event-ingestion-store.ts | Adds generic event-ingestion store port + in-memory implementation for receipts/reaction plans. |
+| agentic-organization/packages/state-cockroach/test/cockroach-schema.test.ts | Updates schema test to include inbox receipts + reaction plan tables. |
+| agentic-organization/packages/state-cockroach/test/cockroach-outbox-event-source.test.ts | Fixes test import path to use `../src/...`. |
+| agentic-organization/packages/state-cockroach/test/cockroach-event-ingestion-store.test.ts | Adds Cockroach event-ingestion store tests for claim/transaction batching + null normalization. |
+| agentic-organization/packages/state-cockroach/test/cockroach-command-state-store.test.ts | Adds Cockroach command outcome transaction batching + replay/conflict behavior tests. |
+| agentic-organization/packages/state-cockroach/src/index.ts | Exports Cockroach event-ingestion store adapter. |
+| agentic-organization/packages/state-cockroach/src/cockroach-schema.ts | Adds inbox receipts and reaction plans tables to Cockroach schema generation. |
+| agentic-organization/packages/state-cockroach/src/cockroach-event-ingestion-store.ts | Implements Cockroach-backed inbox receipt + reaction plan persistence behind SQL executor seam. |
+| agentic-organization/packages/state-cockroach/src/cockroach-command-state-store.ts | Refactors command persistence to `recordCommandOutcome` with transaction boundary + claim semantics. |
+| agentic-organization/packages/state-cockroach/src/cockroach-command-state-store.test.ts | Removes old in-src test (migrated to `test/`). |
+| agentic-organization/packages/state-cockroach/migrations/0001_agentic_org_core_state.sql | Extends migration with inbox receipts + reaction plans tables. |
+| agentic-organization/packages/runtime/test/event-ingestion.test.ts | Adds tests for inbox dedupe, payload conflicts, and pending-receipt retry behavior. |
+| agentic-organization/packages/runtime/test/event-automation.test.ts | Fixes test import path to use `../src/...`. |
+| agentic-organization/packages/runtime/src/reaction-plan.ts | Re-exports reaction plan types from domain instead of defining duplicates. |
+| agentic-organization/packages/runtime/src/index.ts | Exports event-ingestion processor public API. |
+| agentic-organization/packages/runtime/src/event-ingestion.ts | Adds runtime event ingestion processor: inbox check → rule eval → record outcome via store port. |
+| agentic-organization/packages/README.md | Updates package list, slice flow, and validation commands to include workers + inbound ingestion. |
+| agentic-organization/packages/observability/test/workflow-visibility.test.ts | Fixes test import path to use `../src/...`. |
+| agentic-organization/packages/observability/test/span-attributes.test.ts | Fixes test import path to use `../src/...`. |
+| agentic-organization/packages/observability/test/nats-consumer-attributes.test.ts | Adds tests for NATS consumer batch attribute projection. |
+| agentic-organization/packages/observability/src/worker-cycle-attributes.ts | Adds worker-cycle attribute projection helper. |
+| agentic-organization/packages/observability/src/nats-consumer-attributes.ts | Adds NATS consumer batch attribute projection helper. |
+| agentic-organization/packages/observability/src/index.ts | Exports new observability helpers. |
+| agentic-organization/packages/messaging/test/subject-builder.test.ts | Fixes test import path to use `../src/...`. |
+| agentic-organization/packages/messaging/test/outbox-publisher.test.ts | Updates expected messaging domain + subject shape; fixes import path. |
+| agentic-organization/packages/messaging/src/outbox-publisher.ts | Splits messaging domains (supervisor signal vs work item) and updates resolver. |
+| agentic-organization/packages/messaging-nats/test/nats-jetstream-event-publisher.test.ts | Fixes test import path; updates subject expectation. |
+| agentic-organization/packages/messaging-nats/test/nats-jetstream-event-consumer.test.ts | Adds tests for NATS consumer ack/terminate/DLQ/nack behavior and counters. |
+| agentic-organization/packages/messaging-nats/src/nats-jetstream-event-consumer.ts | Adds NATS JetStream consumer adapter: decode → ingest → ack/nack/terminate + DLQ policy. |
+| agentic-organization/packages/messaging-nats/src/index.ts | Exports new consumer adapter API. |
+| agentic-organization/packages/governance/test/package-dependency-boundaries.test.ts | Adds governance tests for dependency boundaries + source layout + app boundary. |
+| agentic-organization/packages/governance/src/package-dependency-boundaries.ts | Adds source layout validation and expands boundary rule taxonomy. |
+| agentic-organization/packages/governance/src/package-dependency-boundaries.test.ts | Removes old in-src test (migrated to `test/`). |
+| agentic-organization/packages/governance/src/index.ts | Exports new governance APIs/types. |
+| agentic-organization/packages/domain/test/work-item-state-machine.test.ts | Fixes test import path to use `../src/...`. |
+| agentic-organization/packages/domain/test/hat-communication-brief.test.ts | Fixes test import paths to use `../src/...`. |
+| agentic-organization/packages/domain/test/event-envelope.test.ts | Fixes test import path to use `../src/...`. |
+| agentic-organization/packages/domain/src/reaction-plan.ts | Introduces domain-owned reaction plan types/enums. |
+| agentic-organization/packages/domain/src/index.ts | Exports reaction plan types/enums. |
+| agentic-organization/packages/application/test/send-supervisor-signal.test.ts | Updates test to assert effect-returning handler outcome (no direct state writes). |
+| agentic-organization/packages/application/test/command-pipeline.test.ts | Adds pipeline tests for outcome-port persistence, replay/conflict outcomes, and failure behavior. |
+| agentic-organization/packages/application/src/ports.ts | Introduces `recordCommandOutcome` port and `CommandEffects` contract. |
+| agentic-organization/packages/application/src/index.ts | Exports new handler outcome/ports/types. |
+| agentic-organization/packages/application/src/handlers/send-supervisor-signal.ts | Refactors handler to return `{ result, effects }` instead of writing state. |
+| agentic-organization/packages/application/src/command-pipeline.ts | Refactors pipeline to persist via `recordCommandOutcome` and handle replay/conflict outcomes. |
+| agentic-organization/packages/application/src/command-pipeline.test.ts | Removes old in-src test (migrated to `test/`). |
+| agentic-organization/packages/application/src/command-handler-registry.ts | Updates handler contract to return typed outcome (result + effects). |
+| agentic-organization/package.json | Updates test script glob to new `test/` layout and includes app tests. |
+| agentic-organization/docs/V0_EXECUTABLE_CONTRACT.md | Refines V0 contract language to route capability inputs through supervisor signals/triage. |
+| agentic-organization/docs/TECHNICAL_CA_PACKAGE_ARCHITECTURE.md | Documents new vendor-port boundaries, ingestion/worker runtime behavior, and env contract. |
+| agentic-organization/docs/README.md | Adds North Star alignment checkpoint doc to docs index. |
+| agentic-organization/docs/NORTH_STAR_ALIGNMENT_CHECKPOINT.md | Adds alignment checkpoint doc capturing drift list and priorities. |
+| agentic-organization/docs/FIRST_IMPLEMENTATION_SLICE.md | Updates slice description to include outcome persistence, ingestion, workers app shell, and telemetry. |
+| agentic-organization/apps/workers/test/worker-runtime.test.ts | Adds tests for worker runtime loop orchestration + telemetry + degraded conditions. |
+| agentic-organization/apps/workers/test/worker-config.test.ts | Adds tests for typed env parsing + validation errors. |
+| agentic-organization/apps/workers/test/worker-composition.test.ts | Adds tests for composition root factory wiring. |
+| agentic-organization/apps/workers/src/worker-runtime.ts | Adds workers runtime host: runs loops, records telemetry, returns healthy/degraded with failures. |
+| agentic-organization/apps/workers/src/index.ts | Exports app runtime/config/composition APIs. |
+| agentic-organization/apps/workers/src/config.ts | Implements typed env parsing for worker runtime config. |
+| agentic-organization/apps/workers/src/composition.ts | Adds composition root that binds config + ports to runtime. |
+| agentic-organization/apps/workers/README.md | Documents responsibilities/boundaries/config contract for `apps/workers`. |
+
+ +## Review threads + +### Thread 1: agentic-organization/packages/messaging-nats/src/nats-jetstream-event-consumer.ts:201 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:44:43Z): + +P0: `terminateWithDeadLetter` (and its callers) don’t handle failures from `deadLetterPublisher.publish(...)` or `message.terminate()`. If either throws, the batch aborts and the message is left neither acked/nacked/terminated, which can stall delivery and starve later messages. Consider wrapping DLQ+terminate in a try/catch and falling back to `negativeAcknowledge()` (and counter updates) when dead-lettering/termination fails. + +### Thread 2: agentic-organization/packages/state-cockroach/src/cockroach-event-ingestion-store.ts:105 (resolved) + +**@copilot-pull-request-reviewer** (2026-05-26T02:44:43Z): + +P0: After inserting reaction plans, the code always returns `Processed` without verifying that `MarkInboxReceiptProcessed` actually marked the receipt (the UPDATE can no-op if another transaction completed the receipt first). That can lead to reaction plans being inserted even though the receipt wasn’t successfully claimed/completed by this transaction. Consider making the UPDATE return a row (e.g., `RETURNING event_id`) and treating a missing row as a duplicate/claim-loss outcome (and ensuring the transaction rolls back in that case). + +## General comments + +### @chatgpt-codex-connector (2026-05-26T02:41:07Z) + +Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. +Credits must be used to enable repository wide code reviews. 
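The fallback suggested in Thread 1 above (dead-letter, then terminate, and negatively acknowledge if either step throws) could be sketched as follows. The method names `publish`, `terminate`, and `negativeAcknowledge` come from the review comment; everything else is an assumption made for illustration, including the interface shapes, the `Promise`-returning signatures, the parameter order, and the `BatchCounters` fields. None of it is the adapter's actual code.

```typescript
// Hypothetical shapes for illustration; the real adapter's types may differ.
interface ConsumedMessage {
  terminate(): Promise<void>;
  negativeAcknowledge(): Promise<void>;
}

interface DeadLetterPublisher {
  publish(payload: Uint8Array, reason: string): Promise<void>;
}

interface BatchCounters {
  terminated: number;
  negativelyAcknowledged: number;
  deadLetterFailures: number;
}

/**
 * Dead-letter a message and terminate it. If either step throws, fall back to
 * a negative acknowledgement so the message is not left unsettled and the
 * rest of the batch can continue.
 */
async function terminateWithDeadLetter(
  message: ConsumedMessage,
  payload: Uint8Array,
  reason: string,
  deadLetterPublisher: DeadLetterPublisher,
  counters: BatchCounters,
): Promise<void> {
  try {
    await deadLetterPublisher.publish(payload, reason);
    await message.terminate();
    counters.terminated += 1;
    return;
  } catch {
    counters.deadLetterFailures += 1;
  }

  try {
    await message.negativeAcknowledge();
    counters.negativelyAcknowledged += 1;
  } catch {
    // Both settlement paths failed; swallow the error so the batch keeps
    // going and rely on broker-side redelivery for this message.
  }
}
```

One trade-off in this shape: if `publish` succeeds but `terminate` throws, the message is redelivered and may be dead-lettered a second time, so a dead-letter consumer would need to tolerate duplicates.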
diff --git a/docs/pr-discussions/PR-5072-fix-gemini-enforce-detached-worktree-and-clone-hygiene-in-ba.md b/docs/pr-discussions/PR-5072-fix-gemini-enforce-detached-worktree-and-clone-hygiene-in-ba.md new file mode 100644 index 0000000000..ebecef7260 --- /dev/null +++ b/docs/pr-discussions/PR-5072-fix-gemini-enforce-detached-worktree-and-clone-hygiene-in-ba.md @@ -0,0 +1,43 @@ +--- +pr_number: 5072 +title: "fix(gemini): enforce detached worktree and clone hygiene in background loop" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T02:44:30Z" +merged_at: "2026-05-26T02:47:42Z" +closed_at: "2026-05-26T02:47:43Z" +head_ref: "lior/fix-loop-tick-hygiene" +base_ref: "main" +archived_at: "2026-05-27T19:46:30Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5072: fix(gemini): enforce detached worktree and clone hygiene in background loop + +## PR description + +## Summary + +This PR hardens the Lior (Gemini) background loop runner prompt configuration to strictly enforce detached worktree safety (B-0750) and per-agent isolated clone architecture (B-0751). + +## Changes + +- `.gemini/bin/lior-loop-tick.ts`: Updated prompt text to: + - Require all git operations to run inside isolated detached worktrees (`git worktree add --detach origin/main`). + - Ban local modifications/commits directly on the contested root checkout or on `main`. + - Introduce explicit checkout and push patterns for PR generation. + - Formally instruct compliance with the per-agent isolated clone directory layout at `/private/tmp/zeta-clones/lior-antigravity/` per B-0751. + +Co-Authored-By: Gemini + +## Reviews + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T02:44:35Z) + +Copilot wasn't able to review any files in this pull request. + +## General comments + +### @chatgpt-codex-connector (2026-05-26T02:44:35Z) + +You have reached your Codex usage limits for code reviews. 
You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). diff --git a/docs/pr-discussions/PR-5073-docs-shadow-markdownlint-md032-prose-arithmetic-joiner-recur.md b/docs/pr-discussions/PR-5073-docs-shadow-markdownlint-md032-prose-arithmetic-joiner-recur.md new file mode 100644 index 0000000000..3d49670272 --- /dev/null +++ b/docs/pr-discussions/PR-5073-docs-shadow-markdownlint-md032-prose-arithmetic-joiner-recur.md @@ -0,0 +1,49 @@ +--- +pr_number: 5073 +title: "docs(shadow): markdownlint MD032 prose-arithmetic-joiner recurring failure class" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T02:57:00Z" +merged_at: "2026-05-26T02:58:14Z" +closed_at: "2026-05-26T02:58:14Z" +head_ref: "otto-cli/shadow-md032-prose-joiner-2026-05-25" +base_ref: "main" +archived_at: "2026-05-27T19:46:29Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5073: docs(shadow): markdownlint MD032 prose-arithmetic-joiner recurring failure class + +## PR description + +## Summary + +Per Aaron 2026-05-25 framing — *"reoccuring failures belong in shadow logs for class identification"* — this PR lands the shadow-lesson log for the markdownlint MD032 prose-arithmetic-joiner failure class observed on PR #5068. + +## Empirical anchor + +PR #5068 (Mika substrate batch) hit 8 MD032 errors. Direct inspection showed **4 of the 8 were not real list-bullet typos** — they were prose continuations where `+` or `-` landed at the start of a wrap line. markdownlint can't distinguish prose-`+` from list-`+`; operator-side rewrap is the simplest-first mitigation. 
+ +## What ships + +- `docs/research/2026-05-25-shadow-lesson-log-markdownlint-md032-prose-arithmetic-joiner-recurring-class.md` — the lesson log with failure-class definition, 4 empirical examples, 4 ranked mitigations (simplest-first), and the next-step trigger (promote to tooling mitigation only if class recurs 2+ more times in 30 days) + +## Composes with + +- `.claude/rules/blocked-green-ci-investigate-threads.md` empirical FP catalog +- `.claude/rules/all-complexity-is-accidental-in-greenfield.md` (simplest-first response) +- Sibling shadow-lesson-logs under `docs/research/2026-05-{07,13,14}-shadow-lesson-log-*.md` + +## Test plan + +- [x] markdownlint clean on the new file +- [x] Filename follows existing `YYYY-MM-DD-shadow-lesson-log-*.md` convention +- [ ] CI passes (gate workflow + CodeQL) + +🤖 Generated with [Claude Code](https://claude.com/claude-code) + +## General comments + +### @chatgpt-codex-connector (2026-05-26T02:57:04Z) + +You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). 
diff --git a/docs/pr-discussions/PR-5074-docs-persona-max-add-tier-2-docker-desktop-dev-experience-wo.md b/docs/pr-discussions/PR-5074-docs-persona-max-add-tier-2-docker-desktop-dev-experience-wo.md new file mode 100644 index 0000000000..abe80864ec --- /dev/null +++ b/docs/pr-discussions/PR-5074-docs-persona-max-add-tier-2-docker-desktop-dev-experience-wo.md @@ -0,0 +1,74 @@ +--- +pr_number: 5074 +title: "docs(persona/max): add tier-2 Docker Desktop dev-experience workstream" +author: "AceHack" +state: "MERGED" +created_at: "2026-05-26T03:16:05Z" +merged_at: "2026-05-26T03:17:46Z" +closed_at: "2026-05-26T03:17:46Z" +head_ref: "otto-cli/max-persona-tier2-docker-desktop-workstream-2026-05-25" +base_ref: "main" +archived_at: "2026-05-27T19:46:29Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5074: docs(persona/max): add tier-2 Docker Desktop dev-experience workstream + +## PR description + +## Summary + +Adds Max's near-term primary workstream — **owning the tier-2 Docker Desktop + Kubernetes dev-experience** for the Zeta cluster substrate — to his persona files. + +Aaron 2026-05-25: *"okay can you make a pr and up maxes persona with this updated responsiblity and i'll send him a link once it's on master."* This PR puts the updated persona on `main` so Aaron can send Max a permanent URL. 
+ +## Workstream scope (added to PERSONA.md + STARTING-POINT.md) + +- **Tier-2 in the three-tier testing story** (B-0780): middle tier between Aaron+Otto's pure-code F# Local Loop (tier-1) and the full real cluster (tier-3 already shipping per B-0754) +- **Sub-scopes**: Argo CD sync-wave debugging; OTel observability matching CNI mesh shape; 30+ chart coverage matrix; CI testing on kind/k3d + GitHub workflows; `zeta dev up` developer-facing surface +- **Topology**: Docker Desktop ships native multi-node kind (1–10 node slider; default 3 for consensus-quorum testing); multi-cluster federation lives in CI, not always-on locally +- **Touch ID / biometrics**: Max gets the zflash Touch ID + PAM pattern for all privileged macOS operations; never types passwords for admin +- **Skills-and-scripts encoding contract**: every Docker Desktop UX interaction ends as TS script (per Rule 0), Claude Code skill, or backlog row — nothing lost in chat + +## Files changed + +- `memory/persona/max/PERSONA.md` — new "Current focus" section with workstream contract + 5 sub-scopes + topology substrate + Touch ID integration + skills-and-scripts contract + composes-with cluster +- `memory/persona/max/STARTING-POINT.md` — new "Current focus" section with 8-file cold-boot reading list + applicable disciplines + 5 concrete first deliverables + updated 30-day success metrics + +`NOTEBOOK.md` not modified — Max's own running notes surface. 
+ +## Test plan + +- [x] markdownlint clean +- [x] No new files (only edits to existing persona files) +- [x] Cross-references resolve (B-0780, B-0759, B-0770, B-0776, B-0786, zflash-setup.ts, zeta-install.sh, .claude/rules/*) +- [ ] CI passes (gate workflow + CodeQL) + +🤖 Generated with [Claude Code](https://claude.com/claude-code) + +## Reviews + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T03:16:59Z) + +## Pull request overview + +Documentation-only PR adding Max's near-term workstream (tier-2 Docker Desktop + Kubernetes dev-experience) to his persona files under `memory/persona/max/`. Both files are on the history-surface allowlist (`memory/**`), so persona name attribution is in-policy. + +**Changes:** +- Adds "Current focus" section to `PERSONA.md` describing the tier-2 workstream, sub-scopes, topology, Touch ID integration, and skills/scripts encoding contract +- Adds matching "Current focus" section to `STARTING-POINT.md` with cold-boot reading list, applicable disciplines, first deliverables, and updated 30-day success metrics + +### Reviewed changes + +Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments. + +| File | Description | +| ---- | ----------- | +| memory/persona/max/PERSONA.md | New "Current focus — tier-2 Docker Desktop dev-experience workstream" section with tier table, sub-scopes, topology, Touch ID pattern, encoding contract, and composes-with links | +| memory/persona/max/STARTING-POINT.md | New "Current focus" section with cold-boot reading list, disciplines, concrete first deliverables, and 30-day success metrics | + +## General comments + +### @chatgpt-codex-connector (2026-05-26T03:16:10Z) + +You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage). 
diff --git a/docs/pr-discussions/PR-5075-feat-agentic-org-gate-commands-with-hat-policy.md b/docs/pr-discussions/PR-5075-feat-agentic-org-gate-commands-with-hat-policy.md new file mode 100644 index 0000000000..cf555f2016 --- /dev/null +++ b/docs/pr-discussions/PR-5075-feat-agentic-org-gate-commands-with-hat-policy.md @@ -0,0 +1,89 @@ +--- +pr_number: 5075 +title: "feat(agentic-org): gate commands with hat policy" +author: "maximdolphin" +state: "MERGED" +created_at: "2026-05-26T03:21:23Z" +merged_at: "2026-05-26T03:38:27Z" +closed_at: "2026-05-26T03:38:27Z" +head_ref: "codex/agentic-org-policy-gate" +base_ref: "main" +archived_at: "2026-05-27T19:46:28Z" +archive_tool: "tools/pr-preservation/archive-pr.ts" +--- + +# PR #5075: feat(agentic-org): gate commands with hat policy + +## PR description + +## Summary +- Adds the first `@agentic-org/policy` package with generic `CommandAuthorizationPort` and `HatAuthorityPort` contracts. +- Gates the application command pipeline before idempotency lookup, handler dispatch, or persistence, returning typed `policy_denied` errors for inactive hat authority. +- Strengthens governance boundaries so policy code stays adapter/vendor-free, and updates Agentic Organization docs/OpenSpec to mark denial observation and allowed decision projection as the next visibility slice. + +## Validation +- `npm test` +- `npm run typecheck` +- `git diff --check origin/main...HEAD` +- Subagent review: architecture/SOLID/vendor boundaries passed +- Subagent review: correctness/TDD/policy gate behavior passed +- Subagent review: north-star/docs alignment passed + +## Reviews + +### COMMENTED — @copilot-pull-request-reviewer (2026-05-26T03:25:41Z) + +## Pull request overview + +Introduces a first-cut policy boundary for Agentic Organization command execution by adding a dedicated `policy` package with authorization ports and enforcing a policy gate at the start of the application command pipeline (before idempotency, dispatch, or persistence). 
This aligns the implementation with governance goals to keep policy logic vendor/adapter-free while making policy denials typed and observable at the command boundary. + +**Changes:** +- Added `@agentic-org/policy` contracts + a default `createCommandAuthorizationPort` implementation backed by `HatAuthorityPort`, with tests for allow/deny behavior. +- Updated the application command pipeline to require `CommandAuthorizationPort` and reject denied commands before any idempotency lookup, handler dispatch, or state persistence. +- Strengthened governance dependency-boundary checks (including policy package constraints) and updated OpenSpec + architecture docs to reflect the new policy gate slice. + +### Reviewed changes + +Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments. + +
+<details>
+<summary>Show a summary per file</summary>
+
+| File | Description |
+| ---- | ----------- |
+| openspec/specs/agentic-organization/spec.md | Specifies that command authorization runs before any command effects and adds policy boundary assertions. |
+| agentic-organization/packages/README.md | Updates package inventory and first-slice pipeline narrative to include policy gating. |
+| agentic-organization/packages/policy/src/index.ts | Adds policy ports, decision types, and a default command authorization implementation. |
+| agentic-organization/packages/policy/test/command-authorization.test.ts | Tests allow/deny mapping from hat authority decisions to policy decisions. |
+| agentic-organization/packages/application/src/command-pipeline.ts | Gates command execution via policy authorization before idempotency/dispatch/persistence. |
+| agentic-organization/packages/application/src/command-result.ts | Extends command errors with `policy_denied` and policy metadata fields. |
+| agentic-organization/packages/application/test/command-pipeline.test.ts | Updates pipeline tests to provide authorization port and adds an early policy-denial test. |
+| agentic-organization/packages/governance/src/package-dependency-boundaries.ts | Adds `Policy` boundary rule identifier. |
+| agentic-organization/packages/governance/test/package-dependency-boundaries.test.ts | Enforces policy package import constraints and expands forbidden vendor/import fragments. |
+| agentic-organization/docs/V0_SCHEMA_AND_COMMANDS.md | Updates command contract narrative to reflect policy-derived authorization context. |
+| agentic-organization/docs/V0_POLICY_AND_RUNTIME_BOUNDARIES.md | Updates policy/runtime boundary doc to include command authorization gating and denial visibility notes. |
+| agentic-organization/docs/TECHNICAL_CA_PACKAGE_ARCHITECTURE.md | Documents the new policy port boundary and the “deny before effects” execution rule. |
+| agentic-organization/docs/NORTH_STAR_ALIGNMENT_CHECKPOINT.md | Moves policy/hat authority from “gap” to a checkpoint with remaining follow-ons. |
+| agentic-organization/docs/FIRST_IMPLEMENTATION_SLICE.md | Updates slice plan to include policy gate and governance constraints for the new package. |
+</details>
+
+## Review threads
+
+### Thread 1: agentic-organization/docs/V0_SCHEMA_AND_COMMANDS.md:262 (resolved)
+
+**@copilot-pull-request-reviewer** (2026-05-26T03:25:40Z):
+
+This new bullet refers to `actor` / hat assignment / scope, but the preceding bullets in this same section still describe actor identity as flattened `actorAgentId` / `actorHatAssignmentId`. Elsewhere in this repo the command contract is modeled as `actor: { agentId, hatAssignmentId }`; consider updating this section to use one consistent shape (or explicitly state when/where the flattened fields exist).
+
+### Thread 2: agentic-organization/packages/application/test/command-pipeline.test.ts:208 (resolved)
+
+**@copilot-pull-request-reviewer** (2026-05-26T03:25:41Z):
+
+This test validates early rejection and the policy decision id, but it doesn’t assert that the pipeline plumbs through `policyVersion` and the typed denial `reason` onto the returned `CommandResult.error`. Adding assertions here would prevent regressions in the new error contract.
+
+## General comments
+
+### @chatgpt-codex-connector (2026-05-26T03:21:27Z)
+
+Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
+Credits must be used to enable repository wide code reviews.
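
Editorial sketch (not part of the archived PR): the "deny before effects" ordering from the PR summary, and the error fields that Thread 2 asks the test to assert, could look roughly like the following. Only the names `CommandAuthorizationPort`, `HatAuthorityPort`, `createCommandAuthorizationPort`, `policy_denied`, `policyVersion`, `reason`, and the `actor: { agentId, hatAssignmentId }` shape come from the discussion above. Every other field, signature, and helper (`isHatActive`, `decisionId`, `createCommandPipeline`, the `trace` array, the `"hat_inactive"` reason string) is a hypothetical stand-in, and the real pipeline may well be asynchronous.

```typescript
// Hypothetical shapes; only the port names and the error fields named in the
// review threads are taken from the PR text. Everything else is assumed.
type Actor = { agentId: string; hatAssignmentId: string };
type Command = { commandId: string; type: string; actor: Actor };

type PolicyDecision =
  | { allowed: true; decisionId: string; policyVersion: string }
  | { allowed: false; decisionId: string; policyVersion: string; reason: string };

interface HatAuthorityPort {
  // Assumed method: reports whether a hat assignment currently carries authority.
  isHatActive(hatAssignmentId: string): boolean;
}

interface CommandAuthorizationPort {
  authorize(command: Command): PolicyDecision;
}

type CommandResult =
  | { ok: true; commandId: string }
  | {
      ok: false;
      error: { code: "policy_denied"; decisionId: string; policyVersion: string; reason: string };
    };

// Maps a hat-authority answer onto an allow/deny policy decision.
function createCommandAuthorizationPort(
  hats: HatAuthorityPort,
  policyVersion: string = "v0",
): CommandAuthorizationPort {
  return {
    authorize(command: Command): PolicyDecision {
      const decisionId = `decision-${command.commandId}`;
      if (hats.isHatActive(command.actor.hatAssignmentId)) {
        return { allowed: true, decisionId, policyVersion };
      }
      return { allowed: false, decisionId, policyVersion, reason: "hat_inactive" };
    },
  };
}

// The gate runs first: a denied command never reaches the idempotency lookup,
// the handler, or persistence. `trace` records which stages actually ran.
function createCommandPipeline(deps: {
  authorization: CommandAuthorizationPort;
  handle: (command: Command) => void;
  trace: string[];
}): (command: Command) => CommandResult {
  const completed = new Map<string, CommandResult>();
  return (command: Command): CommandResult => {
    const decision = deps.authorization.authorize(command);
    if (!decision.allowed) {
      deps.trace.push("denied");
      return {
        ok: false,
        error: {
          code: "policy_denied",
          decisionId: decision.decisionId,
          policyVersion: decision.policyVersion,
          reason: decision.reason,
        },
      };
    }
    deps.trace.push("idempotency");
    const cached = completed.get(command.commandId);
    if (cached) return cached;
    deps.trace.push("dispatch");
    deps.handle(command);
    deps.trace.push("persist");
    const result: CommandResult = { ok: true, commandId: command.commandId };
    completed.set(command.commandId, result);
    return result;
  };
}
```

Under these assumptions, a test in the spirit of Thread 2 would run a command whose hat is inactive and check `error.policyVersion` and `error.reason` in addition to the decision id, while also confirming that no later pipeline stage was recorded.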