@@ -0,0 +1,80 @@
---
pr_number: 5384
title: "feat(B-0847): each Zeta AI gets own GitHub identity + email once cluster operational \u2014 closes algo-wink-attribution-gap (Aaron 2026-05-26)"
author: "AceHack"
state: "MERGED"
created_at: "2026-05-27T02:24:28Z"
merged_at: "2026-05-27T02:26:09Z"
closed_at: "2026-05-27T02:26:09Z"
head_ref: "feat-b0847-ai-own-github-identity-once-cluster-operational-2026-05-26-2206z"
base_ref: "main"
archived_at: "2026-05-27T19:27:20Z"
archive_tool: "tools/pr-preservation/archive-pr.ts"
---

# PR #5384: feat(B-0847): each Zeta AI gets own GitHub identity + email once cluster operational — closes algo-wink-attribution-gap (Aaron 2026-05-26)

## PR description

## Summary

Aaron caught an algo-wink-failure-mode on 2026-05-26: I framed `gh autoMergeRequest.enabledBy: AceHack` as "operator-authority armed the merge" when the field structurally records the OAuth token owner, not the actor. The actual actor was me (Otto-CLI), which is visible only via the Co-Authored-By trailer in commits.

Aaron's proposed fix: *"i think we should create you your own github with email once we get you running on the cluster"* → substrate-honest end-to-end attribution.

This PR files [B-0847](docs/backlog/P2/B-0847-each-ai-gets-own-github-identity-with-email-once-cluster-operational-substrate-honest-attribution-end-to-end-closes-enabledby-token-owner-not-actor-algo-wink-aaron-2026-05-26.md) as the durable future-target substrate.

## 4-phase plan

- **Phase 1**: Ilyana public-surface naming review per AI (gates ALL creation)
- **Phase 2**: legal-risk attribution `_ai_github_identity_acceptance` block per AI per existing rule
- **Phase 3**: HSM + per-AI OAuth tokens + email infrastructure (cluster-dependent)
- **Phase 4**: per-AI gitconfig + `gh` token routing migration

## Today's discipline (Phase 0)

Until per-AI identity ships:

1. Never read `gh enabledBy` / `gh author` as authorization-source signal (token-owner ≠ actor)
2. Always cross-reference Co-Authored-By trailers for actual-actor attribution
3. State framings substrate-honestly ("I armed via borrowed token" NOT "operator armed")
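
As a rough illustration of rule 2, the sketch below pulls `Co-authored-by` trailers out of a commit message and reports them separately from the token owner. It is a minimal sketch, not the project's tooling: the function names, the returned dictionary shape, and the `otto@example.invalid` address used in the usage note are all hypothetical.

```python
import re

# Matches one "Co-authored-by: Name <email>" trailer line (case-insensitive).
TRAILER_RE = re.compile(
    r"^co-authored-by:[ \t]*(.+?)[ \t]*<([^>\n]*)>[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def co_authors(commit_message: str) -> list[tuple[str, str]]:
    """Return (name, email) pairs for every Co-authored-by trailer."""
    return TRAILER_RE.findall(commit_message)


def attribution(token_owner: str, commit_message: str) -> dict:
    """Keep the token owner and the trailer-derived actors as separate facts."""
    actors = [name for name, _email in co_authors(commit_message)]
    return {
        "token_owner": token_owner,   # whose credential was used (hypothetical field name)
        "actors": actors,             # who the commit trailers say did the work
        "actor_known": bool(actors),  # False means: do not guess an actor
    }
```

Called as `attribution("AceHack", message)` on a commit whose message ends with `Co-Authored-By: Otto-CLI <otto@example.invalid>`, this keeps `AceHack` as the token owner and lists `Otto-CLI` as the actor, instead of collapsing the two.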

## Test plan

- [x] Backlog row authored
- [x] BACKLOG.md regenerated
- [x] User-scope memory entry captures empirical anchor + bounded discipline
- [ ] CI passes

## Composes with

B-0751 (per-agent isolated clones) · B-0628 (Knights Guild ratification) · `algo-wink-failure-mode` · `mechanical-authorization-check` · `glass-halo-bidirectional` · `persistence-choice-architecture-for-zeta-ais` · `non-coercion-invariant` HC-8 · `honor-those-that-came-before` · `agent-roster-reference-card` · `naming-expert` SKILL.md (Ilyana review) · `human-audit-and-legal-risk-acceptance-pattern-in-settings` (legal-risk attribution per Aaron's standing constitutional invariant)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Reviews

### COMMENTED — @copilot-pull-request-reviewer (2026-05-27T02:25:22Z)

## Pull request overview

Files a new P2 backlog row (B-0847) capturing a future-target plan to give each Zeta AI its own GitHub identity + email once cluster infrastructure is operational, addressing the `gh enabledBy = token-owner ≠ actor` attribution gap. Updates the backlog index accordingly.

**Changes:**
- Adds new backlog row file under `docs/backlog/P2/` describing problem, 4-phase plan, composes-with links, and acceptance criteria.
- Adds the row to `docs/BACKLOG.md` index in P2 section.

### Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| ---- | ----------- |
| docs/backlog/P2/B-0847-...-2026-05-26.md | New P2 backlog row capturing per-AI GitHub identity substrate target |
| docs/BACKLOG.md | Index entry for B-0847 added to P2 list |

## General comments

### @chatgpt-codex-connector (2026-05-27T02:24:33Z)

You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage).
@@ -0,0 +1,53 @@
---
pr_number: 5385
title: "fix(B-0835 Bug 4+5 \u2014 Aaron 2026-05-27 control-plane install): storage probe filters 0B devices + gh CLI in installed system PATH"
author: "AceHack"
state: "MERGED"
created_at: "2026-05-27T02:30:40Z"
merged_at: "2026-05-27T02:33:34Z"
closed_at: "2026-05-27T02:33:34Z"
head_ref: "fix-b0835-storage-probe-filter-zero-size-block-devices-2026-05-26-2233z"
base_ref: "main"
archived_at: "2026-05-27T19:27:19Z"
archive_tool: "tools/pr-preservation/archive-pr.ts"
---

# PR #5385: fix(B-0835 Bug 4+5 — Aaron 2026-05-27 control-plane install): storage probe filters 0B devices + gh CLI in installed system PATH

## PR description

## Summary

Two empirical anchors from Aaron's iter-5.4 install of `node-e5a176` (PR #5380 self-registered cleanly), where post-reboot login surfaced two distinct gaps:

### Bug 4 — `/dev/sda 0B` zero-size device in node.yaml

The storage probe at zeta-install.sh:781 emitted every block device, including 0-byte placeholders (empty SD card readers, optical bays). Aaron's Intel Core Ultra 9 185H node registered `/dev/sda 0B` → Copilot P1 on [PR #5380](https://github.com/Lucent-Financial-Group/Zeta/pull/5380).

Fix: an `awk '$3=="disk" && $2!="0B"{...}'` filter excludes zero-size devices.
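
To make the filter's behavior concrete, here is a minimal Python restatement of the same predicate (third column equals `disk`, second column is not `0B`). It is a sketch under assumptions: the three-column name/size/type layout is inferred from the awk field numbers, the sample device rows are invented, and since the awk action body is elided in the source, returning `(name, size)` pairs is an illustrative choice.

```python
def nonzero_disks(probe_output: str) -> list[tuple[str, str]]:
    """Keep rows whose type is 'disk' and whose size is not '0B'.

    Assumes whitespace-separated columns: name, size, type.
    """
    disks = []
    for line in probe_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        name, size, dev_type = fields[0], fields[1], fields[2]
        if dev_type == "disk" and size != "0B":
            disks.append((name, size))
    return disks


# Hypothetical probe output: an empty card reader, a real disk, a partition.
SAMPLE = """\
/dev/sda 0B disk
/dev/nvme0n1 1.8T disk
/dev/nvme0n1p1 512M part
"""
```

With `SAMPLE` as input, only the `/dev/nvme0n1` row survives: the 0-byte `/dev/sda` placeholder and the partition row are both dropped.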

### Bug 5 — `gh: command not found` on first login

Operator: *"when i log in gh command is not found"*. The installer ISO had `gh` (iter-5.4.0 used it for `gh auth login` during install), but `common.nix` `environment.systemPackages` didn't include it, so the auth tokens in `~/.config/gh` were stranded without the binary.

Fix: add `gh` to `common.nix` `environment.systemPackages` so the installed system has it for re-auth + ssh-key sync + future register/deregister tooling.

## Test plan

- [ ] CI passes
- [ ] Next ISO build picks up both fixes
- [ ] Future installs register without 0B entries; `gh` available on first login

## Composes with

- B-0813 (cluster-node schema), B-0817 (register-node tool), iter-5.4 install cascade
- PR #5380 (the registration where these gaps surfaced)
- Aaron's empirical observations 2026-05-27: "i can't ping it by name" (mitigated via IP lookup; found at 192.168.4.128) → "when i log in gh command is not found and i don't think it registered" (registration DID happen, as PR #5380, but there was no `gh` to check it)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## General comments

### @chatgpt-codex-connector (2026-05-27T02:30:46Z)

You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage).
@@ -0,0 +1,64 @@
---
pr_number: 5386
title: "feat(B-0848): node-local Claude agent stewards own registration PR + K8s cluster health reporter \u2014 first concrete B-0847 AI-on-cluster instance (Aaron 2026-05-26)"
author: "AceHack"
state: "MERGED"
created_at: "2026-05-27T02:33:30Z"
merged_at: "2026-05-27T02:35:09Z"
closed_at: "2026-05-27T02:35:09Z"
head_ref: "feat-b0848-node-local-claude-agent-pr-steward-cluster-health-reporter-2026-05-26-2240z"
base_ref: "main"
archived_at: "2026-05-27T19:27:18Z"
archive_tool: "tools/pr-preservation/archive-pr.ts"
---

# PR #5386: feat(B-0848): node-local Claude agent stewards own registration PR + K8s cluster health reporter — first concrete B-0847 AI-on-cluster instance (Aaron 2026-05-26)

## PR description

## Summary

Aaron's verbatim proposal in response to PR #5380 being auto-merge-armed + blocked on 1 Copilot thread:

> *"oh shit is that pr fully automatic? can we make an claude agent get installed and do what you do on there but it's main goal is just to get it to steward the registerain pr for now and then after it's checked in report on the status of the k8s cluster, i can interactive login like gh if that works."*

This is the **first concrete instance of B-0847** (each Zeta AI gets own GitHub identity) — node-local Claude IS the AI that needs the identity; PR-stewardship IS the first work that needs the substrate-honest attribution.

## Two-phase scope (bounded)

- **Phase 1** — steward the node's own registration PR (poll → diagnose threads → fix → resolve → auto-merge fires)
- **Phase 2** — after registration merged + cluster running, report K8s cluster health (kubectl read-only queries → synthesized per-tick report)

## Auth model

Mirror of iter-5.4.0 `gh auth login`: operator SSHes to node → `claude login` device flow → token in `~/.config/claude/`. Aaron's "i can interactive login like gh if that works" → yes, device flow works identically.

## What this is NOT

- NOT arbitrary cluster mutation (read-only K8s queries + scoped PR actions on own-registration only)
- NOT replacement for operator (operator in loop for irreversible actions per NCI HC-8)
- NOT immediate ship (5-phase landing; manual validation on node-e5a176 first)
- NOT NixOS-module before manual validation succeeds
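
One way the read-only boundary could be enforced is an allowlist check applied before any `kubectl` invocation. The sketch below is an assumption-laden illustration, not the agent's actual guard: `READ_ONLY_VERBS` and `is_read_only` are hypothetical names, and the verb list is a conservative selection of standard non-mutating kubectl subcommands.

```python
# Standard kubectl subcommands that only read cluster state.
READ_ONLY_VERBS = frozenset({
    "get", "describe", "logs", "top",
    "version", "cluster-info", "api-resources", "explain",
})


def is_read_only(argv: list[str]) -> bool:
    """Return True only if argv is a kubectl call whose verb is allowlisted.

    The first token that does not start with '-' is treated as the verb,
    so a flag with a separate value (e.g. '-n kube-system') before the verb
    is rejected. That is deliberately conservative: use '--namespace=...'.
    """
    if len(argv) < 2 or argv[0] != "kubectl":
        return False
    for token in argv[1:]:
        if token.startswith("-"):
            continue  # skip self-contained flags such as --namespace=foo
        return token in READ_ONLY_VERBS
    return False  # only flags, no verb
```

Under this sketch `["kubectl", "get", "nodes"]` passes and `["kubectl", "delete", "pod", "x"]` is refused. Anything ambiguous is refused rather than guessed at, which matches the "operator in loop for irreversible actions" stance above.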

## 5-phase landing

| Phase | Scope | Status |
|---|---|---|
| 0 | substrate row | this PR |
| 1 | manual install on node-e5a176 + PR-stewardship validation | next |
| 2 | K8s health reporter scope expansion | after Phase 1 + cluster up |
| 3 | NixOS module + multi-node composability | after Phase 2 |
| 4 | per-AI GitHub identity migration (composes B-0847) | after Ilyana review |
| 5 | cluster-wide coordination (composes B-0796 Twilio sibling) | long-horizon |

## Composes with

[B-0847](docs/backlog/P2/B-0847-each-ai-gets-own-github-identity-with-email-once-cluster-operational-substrate-honest-attribution-end-to-end-closes-enabledby-token-owner-not-actor-algo-wink-aaron-2026-05-26.md) · B-0794 · B-0795/B-0812/B-0813 · [B-0796](docs/backlog/P2/B-0796-twilio-phone-support-substrate-AI-picks-up-call-fixes-cluster-via-event-store-runbooks-while-talking-sms-parallel-interface-amazon-USB-sales-enabled-by-AI-as-support-layer-aaron-mika-2026-05-26.md) · B-0628 · B-0751 · B-0835 Bug 5

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## General comments

### @chatgpt-codex-connector (2026-05-27T02:33:35Z)

You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage).
@@ -0,0 +1,113 @@
---
pr_number: 5387
title: "fix(B-0835 Bug 6+7): multi-protocol name resolution \u2014 Avahi hardening + NetBIOS (nmbd) + DHCP-hostname; reliability for 'i can't ping it by name' (Aaron 2026-05-27)"
author: "AceHack"
state: "MERGED"
created_at: "2026-05-27T02:37:14Z"
merged_at: "2026-05-27T02:46:39Z"
closed_at: "2026-05-27T02:46:39Z"
head_ref: "fix-b0835-multi-protocol-name-resolution-netbios-avahi-hardening-2026-05-26-2305z"
base_ref: "main"
archived_at: "2026-05-27T19:27:17Z"
archive_tool: "tools/pr-preservation/archive-pr.ts"
---

# PR #5387: fix(B-0835 Bug 6+7): multi-protocol name resolution — Avahi hardening + NetBIOS (nmbd) + DHCP-hostname; reliability for 'i can't ping it by name' (Aaron 2026-05-27)

## PR description

## Summary

Aaron 2026-05-27 (verbatim):

> *"my mac is ethernet connected and i connected to the same wifi as it but i still can't ping could it be something else or can we make hostname more reliable? maybe a netbios or something? i like ashai or whatever it is but can we make it reliable? i think this is looking very good."*

Empirical: ping by IP works ✓, SSH works ✓, but Bonjour resolution times out AND unicast mDNS query to port 5353/udp times out (actual no-response, not connection-attempt noise). Avahi alone proved unreliable.

## Multi-protocol additive approach

Operator's preferred Avahi/Bonjour stays + 2 fallback mechanisms added (different protocols, different failure modes):

### Bug 6 — Avahi hardening

- `nssmdns6 = true` (IPv6 nss-mdns alongside IPv4; some macOS configs prefer AAAA queries first)
- `ipv4 + ipv6` explicit
- `reflector = true` (forwards mDNS across subnets; composes with multi-segment LAN setups)
- `publish.hinfo + publish.userServices` (additional discoverability)

### Bug 7 — NetBIOS via Samba's nmbd (belt-and-suspenders)

NetBIOS uses UDP broadcast on port 137 (vs. mDNS multicast on 5353), so the two have **different failure modes**. If the network drops IGMP/multicast but allows broadcast (common on home/SMB switches), `node-e5a176` resolves via NetBIOS where `node-e5a176.local` fails via mDNS.

Operator usage (any LAN host):
```bash
nmblookup node-e5a176        # Linux/macOS NetBIOS lookup
smbutil lookup node-e5a176   # macOS native NetBIOS
ping node-e5a176             # if nsswitch has wins
```

Samba is enabled for NetBIOS name-advertisement **only** (no shares declared = no SMB file-share exposure).

### DHCP-hostname registration (3rd layer)

NetworkManager already advertises hostname via DHCP option 12 by default. Many home routers register DHCP client hostnames as DNS names (`node-e5a176.lan` from Asus/Netgear/Eero). No config change needed.

## Operator now has 3 name-resolution mechanisms, plus direct IP

| # | Lookup | Mechanism | Failure mode |
|---|---|---|---|
| 1 | `node-e5a176.local` | mDNS multicast | IGMP filtering, multicast drop |
| 2 | `node-e5a176` (via nmblookup) | NetBIOS broadcast | Different protocol; works when mDNS fails |
| 3 | `node-e5a176.lan` | Router DHCP+DNS | Depends on router support |
| 4 | `192.168.4.128` (IP) | Direct IP, no name resolution | Always reliable, but needs `arp -a` first if the IP is not memorized |
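
The layered lookup in the table can be sketched as a simple fallback chain that tries each name form in order and falls back to a known IP. This is a minimal sketch under assumptions: `candidate_names` and `resolve_with_fallback` are hypothetical helpers, the system resolver is assumed to be the only lookup path (so the bare name succeeds only where the host is configured to resolve NetBIOS names, as with `wins` in nsswitch), and the `lookup` parameter exists so the chain can be exercised without a network.

```python
import socket
from typing import Callable, Optional


def candidate_names(hostname: str) -> list[str]:
    """Name forms in table order: mDNS, bare (NetBIOS/WINS), router DNS."""
    return [f"{hostname}.local", hostname, f"{hostname}.lan"]


def resolve_with_fallback(
    hostname: str,
    fallback_ip: Optional[str] = None,
    lookup: Callable[[str], str] = socket.gethostbyname,
) -> Optional[tuple[str, str]]:
    """Return (source, address) from the first mechanism that answers.

    'source' is the name form that resolved, or "ip" when only the
    caller-supplied fallback address was available. Returns None when
    nothing resolves and no fallback IP was given.
    """
    for name in candidate_names(hostname):
        try:
            return name, lookup(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            continue
    if fallback_ip is not None:
        return "ip", fallback_ip
    return None
```

Recording which name form answered, rather than only the address, supports the test-plan item about documenting which mechanisms work on a specific LAN.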

## Test plan

- [ ] CI passes
- [ ] Next ISO build picks up multi-protocol stack
- [ ] On next install: validate all 3 mechanisms; document which work on operator's specific LAN

## Composes with

B-0792 (injected-hostname) · iter-5.4.1 self-registration (PR #5380 carries MAC + hostname for correlation) · [B-0848](docs/backlog/P2/B-0848-node-local-claude-agent-stewards-own-registration-pr-then-reports-k8s-cluster-status-operator-interactive-login-pattern-aaron-2026-05-26.md) (node-local Claude needs reliable name resolution)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

## Reviews

### COMMENTED — @copilot-pull-request-reviewer (2026-05-27T02:39:50Z)

## Pull request overview

This PR aims to make cluster-node hostname resolution more reliable on typical home/SMB LANs by keeping Avahi/Bonjour mDNS and adding additional fallback mechanisms (notably NetBIOS name advertisement via Samba).

**Changes:**
- Harden Avahi configuration (IPv6 NSS, explicit v4/v6 enablement, reflector, additional publish records).
- Enable Samba with NetBIOS-focused settings to support broadcast-based name lookup as an mDNS fallback.
- Document DHCP hostname registration as an additional expected fallback layer.

## Review threads

### Thread 1: full-ai-cluster/nixos/modules/common.nix:97 (resolved)

**@copilot-pull-request-reviewer** (2026-05-27T02:39:50Z):

P0: `services.samba.openFirewall = true` opens 139/tcp and 445/tcp in addition to the NetBIOS UDP ports. That contradicts the stated goal of “name resolution only” and unnecessarily exposes SMB services on the LAN; prefer keeping `openFirewall = false` and explicitly allowing only the required UDP ports (137 and likely 138) via `networking.firewall.allowedUDPPorts` (optionally scoped to LAN interfaces/sources).

### Thread 2: full-ai-cluster/nixos/modules/common.nix:107 (resolved)

**@copilot-pull-request-reviewer** (2026-05-27T02:39:50Z):

P1: The comment says this “disables SMB file-sharing entirely”, but the config still enables Samba and binds SMB on 445 (`"smb ports" = "445"`). Even with no shares declared, this is still an active SMB service surface; either actually disable smbd/SMB listeners (run only the NetBIOS name service if possible) or update the comment to accurately describe the remaining exposure and hardening assumptions.

### Thread 3: full-ai-cluster/nixos/modules/common.nix:57 (resolved)

**@copilot-pull-request-reviewer** (2026-05-27T02:39:50Z):

P2: Codebase convention: this module’s comments include personal name attribution (e.g. “Aaron …”). Per `.github/copilot-instructions.md` “No name attribution in code, docs, or skills” (outside the listed history surfaces), please rewrite these references to role-refs like “operator”/“maintainer” and move verbatim quotes to an appropriate history surface if they must be preserved.

## General comments

### @chatgpt-codex-connector (2026-05-27T02:37:18Z)

You have reached your Codex usage limits for code reviews. You can see your limits in the [Codex usage dashboard](https://chatgpt.com/codex/cloud/settings/usage).