
fix(B-0792 iter-5.2.2): hostname auto-gen at install-time NOT flash-time (multi-node reuse fix) + login-banner shows hostname pre-login (Aaron 2026-05-26)#5113

Merged
AceHack merged 1 commit into main from otto-cli/iter522-move-auto-hostname-from-flash-time-to-install-time-2026-05-26
May 26, 2026

@AceHack AceHack commented May 26, 2026

Two Aaron 2026-05-26 empirical observations:

(1) Hostname auto-gen moves from flash-time to install-time (reverts the iter-5.2.1 flash-time approach). Aaron: 'wait zflash has a hard coded name? i was thinking it would be auto generated on each machine so i can't use that same usb twice?' Flash-time auto-gen baked one name into the USB, so every install from that USB got the same hostname and collided on mDNS. Fix: when --host is not passed, zflash does NOT write zeta-hostname.txt to the ESP; zeta-install.sh generates a fresh node-<6hex> on the node for each install. The same USB now installs N nodes with N unique hostnames.

(2) Login banner shows hostname + ssh hint pre-login. Aaron: 'i mean i see a login but no hostname until after i login can you update to show hostname before i login' The new nixos/modules/login-banner.nix sets services.getty.greetingLine + services.getty.helpLine to show a photo-friendly banner with the hostname + ssh-from-Mac hint BEFORE the login: prompt. common.nix imports it, so every host picks it up transitively.

Composes with iter-5.1+5.2+5.2.1 substrate already on main. zflash --host pikachu still works (operator override path).

…tall-time + login-banner shows hostname pre-login (Aaron 2026-05-26)

Two fixes addressing maintainer 2026-05-26 empirical observations:

(1) Move hostname auto-generation from flash-time to install-time
    (REVERTS iter-5.2.1 flash-time auto-gen):

The maintainer 2026-05-26: "wait zflash has a hard coded name? i was
thinking it would be auto generated on each machine so i can't use
that same usb twice?"

iter-5.2.1 baked the auto-name into the USB ESP at flash time → every
install from the same USB inherited the SAME hostname → mDNS collision
when reused across machines. Defeats the multi-node-from-single-USB use
case the maintainer wanted.

Fix: when --host is NOT passed to zflash, DON'T write
zeta-hostname.txt to the ESP. zeta-install.sh now generates a fresh
random node-<6hex> ON THE NODE at install time via head -c 3
/dev/urandom | xxd -p. Each install from the same USB gets a
different hostname.
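
As an illustration only, the shell pipeline above (3 random bytes rendered as 6 hex digits, prefixed with node-) can be sketched in TypeScript. The function name generateNodeHostname is hypothetical and does not come from zeta-install.sh; the random source is a parameter so the formatting can be checked deterministically.

```typescript
// Illustrative TypeScript equivalent of `head -c 3 /dev/urandom | xxd -p`.
// generateNodeHostname is a hypothetical name, not taken from the installer.
function generateNodeHostname(
  // Assumption: a global Web Crypto `crypto` object is available (Bun, modern Node).
  randomBytes: Uint8Array = crypto.getRandomValues(new Uint8Array(3)),
): string {
  if (randomBytes.length !== 3) {
    throw new Error("expected exactly 3 random bytes (6 hex digits)");
  }
  // Two lowercase hex digits per byte, zero-padded, as xxd -p prints them.
  const hex = Array.from(randomBytes, (b) => b.toString(16).padStart(2, "0")).join("");
  return `node-${hex}`;
}
```

Three bytes give 16,777,216 possible names, so a collision between a handful of nodes is unlikely but not impossible.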

zflash --host pikachu still works the same way (operator override).
The previous "log auto-name pre-flash so operator knows what to ssh
to" UX is traded for multi-node correctness; operator now reads
the auto-name from the cluster console's pre-login banner (see (2)
below) OR from mDNS scan.
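
The split of responsibility described above (zflash decides whether a hostname file exists on the ESP; the installer decides the final name) can be sketched as two small pure functions. All names here are hypothetical, and the file format (a bare hostname, with an empty file treated as absent) is an assumption rather than something this commit message states.

```typescript
// Hypothetical names throughout; the real zflash.ts and zeta-install.sh
// logic is not shown in this commit message.

/** Flash side: contents of zeta-hostname.txt, or null to write no file. */
function espHostnameFile(hostOverride: string | null): string | null {
  // No --host: write nothing, so each install generates its own name.
  return hostOverride === null ? null : `${hostOverride}\n`;
}

/** Install side: the hostname a single install ends up with. */
function resolveInstallHostname(
  espFileContents: string | null,
  generate: () => string, // e.g. a node-<6hex> generator run on the node
): string {
  // Assumption: an empty or whitespace-only file counts as "no override".
  const injected = espFileContents?.trim() ?? "";
  return injected !== "" ? injected : generate();
}
```

With no override, two installs from the same USB call generate() twice and get two names; with --host pikachu, both installs resolve to pikachu, which is the operator's explicit choice.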

(2) Login banner shows hostname + ssh hint BEFORE login:

The maintainer 2026-05-26: "i mean i see a login but no hostname
until after i login can you update to show hostname before i login"

NixOS default getty shows just "login:" without prominent hostname.
New nixos/modules/login-banner.nix sets services.getty.greetingLine +
services.getty.helpLine to display:

  ╭─────────────────────────────────────────────────────────╮
  │  ZETA CLUSTER NODE                                      │
  │  Hostname:  <hostname>                                  │
  │  SSH from operator Mac:                                 │
  │    ssh zeta@<hostname>.local                            │
  │  Console login:                                         │
  │    user:     zeta                                       │
  │    password: zeta-change-me  (rotate after first login) │
  ╰─────────────────────────────────────────────────────────╯
  <hostname> login:

Photo-friendly per the maintainer's discipline: "whenever i have to
ferry commands by reading and typing i'm going to avoid it like the
plague and try to get like pictures and auto run and short commands
pre built in".

common.nix imports login-banner.nix so EVERY host (control-plane,
worker-gpu, worker-template, future configs) gets the pre-login
banner via the existing import chain.

Changes:

- full-ai-cluster/tools/zflash.ts: removed iter-5.2.1
  flash-time auto-gen; replaced with informational message when no
  --host passed pointing at on-node install-time generation +
  pre-login banner display
- full-ai-cluster/usb-nixos-installer/zeta-install.sh: when no
  zeta-hostname.txt on ESP, generate node-<6hex> on-node via
  head -c 3 /dev/urandom | xxd -p; write it to
  /mnt/etc/zeta/cluster-node-id, where the injected-hostname.nix
  module picks it up
- full-ai-cluster/nixos/modules/login-banner.nix (NEW): getty
  greetingLine + helpLine customization
- full-ai-cluster/nixos/modules/common.nix: imports
  login-banner.nix

Composes with iter-5.1 + iter-5.2 + iter-5.2.1 substrate already on
main; iter-5.2.2 supersedes iter-5.2.1's flash-time-auto-gen
mechanism with install-time generation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 26, 2026 06:34
@AceHack AceHack enabled auto-merge (squash) May 26, 2026 06:34
@AceHack AceHack merged commit 7d19b95 into main May 26, 2026
33 of 34 checks passed
@AceHack AceHack deleted the otto-cli/iter522-move-auto-hostname-from-flash-time-to-install-time-2026-05-26 branch May 26, 2026 06:38
Copilot AI left a comment

Pull request overview

This PR updates the full-ai-cluster install substrate to avoid hostname collisions when reusing the same USB across multiple machines, and improves console UX by showing the hostname before login.

Changes:

  • Move auto-generated hostname creation from zflash flash-time to zeta-install.sh install-time (per-node/per-install unique hostname when --host isn’t provided).
  • Add a NixOS getty login banner module to display hostname + SSH hint pre-login, and import it via common.nix.
  • Update zflash messaging to reflect the new install-time hostname behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| full-ai-cluster/usb-nixos-installer/zeta-install.sh | Generates a per-install random node-<6hex> hostname when no ESP hostname file exists, and writes it into the injected-hostname substrate. |
| full-ai-cluster/tools/zflash.ts | Removes flash-time hostname auto-gen and adjusts operator messaging for the install-time generation path. |
| full-ai-cluster/nixos/modules/login-banner.nix | Adds a getty greeting/help banner to show hostname + SSH hint before the login prompt. |
| full-ai-cluster/nixos/modules/common.nix | Imports the new login-banner module so all cluster hosts get the pre-login banner. |

// the auto-name from the cluster console's login banner (printed
// pre-login per iter-5.2.2 login-banner module) OR from mDNS scan.
// Right trade for multi-node correctness.
if (hostOverride === null && willInject) {
`\niter-5.2.2: --host not specified; zeta-install.sh on-node will\n` +
` auto-generate a unique node-<6hex> hostname per-install.\n` +
` Pre-login banner on first boot displays the chosen hostname\n` +
` + IP (per iter-5.2.2 NixOS login-banner module).\n` +
Comment on lines +4 to +5
# hostname (+ primary IP + ssh-from-Mac hint) is visible BEFORE the
# operator logs in at the console.
Comment on lines +42 to +43
# \\n in literal NixOS string becomes "\n" in /etc/issue, which
# agetty expands to the system hostname at runtime.
AceHack added a commit that referenced this pull request May 26, 2026
… + rows_filed_24h (Aaron 2026-05-26 — "per agent so we can see helath like per trajectory") (#5115)

Aaron 2026-05-26 substrate-engineering concern:

> 'we need to make sure that decopose is happening an on going
> backlog log or else infinate backlog is just infnate debt'

> 'the decompose to action is what i want background to show
> with stats over time on the github page we have for plant
> metrics that and also prs, i want that per agent so we can
> see helath like per trajectory'

Extends tools/dashboard/generate-metrics.ts to surface per-agent
PR-shipping rate + decompose-to-action ratio in demo/metrics.json
(consumed by the Zeta Factory Dashboard at
lucent-financial-group.github.io/Zeta/demo/index.html).

Three new per-agent fields:

  prs_merged_24h           — PRs this agent merged in 24h window
  rows_filed_24h           — PRs whose title matches `backlog(B-NNNN`
                             (row-filing-only PRs, NOT action-on-rows)
  decompose_to_action_ratio — (prs_merged - rows_filed) / max(rows_filed, 1)
                             → impl-PRs per row-filing-PR
                             → >=1 = strong action-on-rows discipline
                             → <1  = filing rows faster than shipping
                                     them = debt-accumulation signal
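
The ratio defined above is a one-line computation. A minimal sketch follows, assuming a hypothetical AgentWindowCounts shape; the field names in generate-metrics.ts may differ.

```typescript
// Hypothetical input shape; only the formula itself comes from this commit message.
interface AgentWindowCounts {
  prsMerged24h: number; // all PRs the agent merged in the window
  rowsFiled24h: number; // subset whose title marks them as row-filing PRs
}

/** (prs_merged - rows_filed) / max(rows_filed, 1): impl PRs per row-filing PR. */
function decomposeToActionRatio({ prsMerged24h, rowsFiled24h }: AgentWindowCounts): number {
  const implementationPrs = prsMerged24h - rowsFiled24h;
  // max(…, 1) avoids division by zero when no rows were filed.
  return implementationPrs / Math.max(rowsFiled24h, 1);
}
```

For example, 57 merged PRs of which 30 are row-filing gives 27 / 30 = 0.9, and 6 merged PRs with no row-filing gives 6 / 1 = 6.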

Attribution via branch-prefix lookup (BRANCH_PREFIX_TO_AGENT) per
.claude/rules/agent-roster-reference-card.md lane discipline:
otto-cli/ + otto-desktop/ + otto-vscode/ + otto/ → Otto;
alexa-kiro/ + alexa/ → Alexa; riven-cursor/ + riven/ → Riven;
vera-codex/ + vera/ → Vera; lior-antigravity/ + lior-gemini/ +
lior/ → Lior. PRs from non-prefixed branches are attributed to the
'Unknown' bucket (operator-auditable as a missing-attribution surface).
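
The prefix lookup can be sketched as below. The name BRANCH_PREFIX_TO_AGENT and the prefix list come from this commit message; representing it as an array of pairs and the helper name agentForBranch are assumptions made for illustration.

```typescript
// Prefixes as listed in this commit message; the data structure is an assumption.
const BRANCH_PREFIX_TO_AGENT: ReadonlyArray<readonly [prefix: string, agent: string]> = [
  ["otto-cli/", "Otto"], ["otto-desktop/", "Otto"], ["otto-vscode/", "Otto"], ["otto/", "Otto"],
  ["alexa-kiro/", "Alexa"], ["alexa/", "Alexa"],
  ["riven-cursor/", "Riven"], ["riven/", "Riven"],
  ["vera-codex/", "Vera"], ["vera/", "Vera"],
  ["lior-antigravity/", "Lior"], ["lior-gemini/", "Lior"], ["lior/", "Lior"],
];

/** Hypothetical helper: map a PR head branch to an agent, else "Unknown". */
function agentForBranch(branch: string): string {
  for (const [prefix, agent] of BRANCH_PREFIX_TO_AGENT) {
    // Every prefix ends in "/", so "otto/" cannot accidentally match "otto-cli/…".
    if (branch.startsWith(prefix)) return agent;
  }
  return "Unknown";
}
```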

EMPIRICAL validation 2026-05-26 (live run):

  Otto:  57 PRs / 30 row-filing → ratio = 0.9 (nearly 1:1; debt signal!)
  Lior:   6 PRs / 0 row-filing  → ratio = 6   (all action)
  Others: 0/0/0 (quiet 24h window)

Otto ratio 0.9 EMPIRICALLY VALIDATES Aaron's concern — this
session filed 6 substantive rows (B-0791..B-0794, B-0796, B-0797)
+ shipped 4 implementation PRs (#5103 iter-5.1+5.2, #5107 iter-5.2.1,
#5113 iter-5.2.2, #5110 draft) — ratio < 1. The metric now exposes
the pattern continuously.

Dashboard HTML render of these new fields is follow-on substrate
(small UI work). The data layer is the load-bearing first step;
operator + Mika can read demo/metrics.json directly until UI lands.

Substrate-honest note: the dashboard generation itself happens on
the autonomous-loop cron tick (per B-0414); per-agent stats will
update on every tick going forward. Time-series tracking (today's
metric vs 7d-ago, 30d-ago) is separate substrate (would need to
preserve historical metrics.json snapshots; deferred to follow-on
iteration).

Composes with .claude/rules/agent-roster-reference-card.md
(branch-prefix attribution),
.claude/rules/holding-without-named-dependency-is-standing-by-failure.md
(decompose-to-action discipline), B-0797 (autonomous-loop
sometimes-task; same substrate-engineering direction).

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 26, 2026
…atches dropped iter-N modules before ~15min Nix build (Aaron 2026-05-26) (#5116)

Aaron 2026-05-26: 'start wroking on the ci stuff while we iterate so
you can start iterating without me' + 'any parts we can test in
siolate are candidates for more unit like tests instead of full
integration tests'.

This PR ships #1 of an ascending test-substrate cascade:

  #1 Source-substrate audit (this PR; ~1s; preflight)
  #2 Unit tests for zflash.ts + shell-logic (next PR)
  #3 ISO content audit (via 7z list; after ISO build)
  #4 NixOS test framework (full VM boot + install round-trip)
  #5 End-to-end CI workflow (hardware-class regression)

The maintainer's 2026-05-26 USB flash empirically surfaced two
related bugs that the audit catches:

(1) Workflow trigger-path filter on build-ai-cluster-iso.yml was
    `nixos/modules/disko-shapes/**` only — missed iter-5.2 (PR
    #5103 added injected-hostname.nix) + iter-5.2.2 (PR #5113
    added login-banner.nix). Result: CI didn't rebuild the ISO
    when those modules landed; operator downloaded an older ISO
    via `gh run download` that lacked the iter-5.x substrate.

(2) Even after broadening trigger paths, source-substrate audit
    is a FLOOR: catches "module file in repo but iter-N sentinel
    accidentally dropped in a fix-fwd" + "module file removed
    by mistake". Pure source-level grep; runs in ~1s; no Nix
    build needed.

Changes:

- NEW tools/ci/audit-installer-substrate.ts (~250 LOC TS):
  - REQUIRED_FILES list (10 expected installer-substrate paths)
  - REQUIRED_SENTINELS list (5 file→sentinel-strings assertions)
  - Exit codes: 0 pass / 1 missing file / 2 missing sentinel
  - Runs locally + in CI: bun tools/ci/audit-installer-substrate.ts
  - Empirical pass on current main substrate

- BROADENED .github/workflows/build-ai-cluster-iso.yml triggers:
  full-ai-cluster/nixos/disko-shapes/** → full-ai-cluster/nixos/**
  + full-ai-cluster/tools/** + tools/ci/audit-installer-substrate.ts

- ADDED preflight audit step BEFORE the ~15min nix build (fails
  fast if substrate is incomplete; saves CI minutes when iter-N
  modules accidentally get dropped)
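
Why the narrower trigger missed login-banner.nix can be shown with a deliberately simplified matcher. It handles only the two pattern shapes used in this message (an exact path, or a directory followed by /**); real workflow path filters support more glob syntax, and matchesTriggerPath is a hypothetical name.

```typescript
/** Simplified path filter: supports "dir/**" and exact paths only (an assumption). */
function matchesTriggerPath(changedPath: string, patterns: readonly string[]): boolean {
  return patterns.some((pattern) =>
    pattern.endsWith("/**")
      ? changedPath.startsWith(pattern.slice(0, -2)) // keep the trailing "/"
      : changedPath === pattern,
  );
}

const before = ["full-ai-cluster/nixos/disko-shapes/**"];
const after = [
  "full-ai-cluster/nixos/**",
  "full-ai-cluster/tools/**",
  "tools/ci/audit-installer-substrate.ts",
];
const changed = "full-ai-cluster/nixos/modules/login-banner.nix";

console.log(matchesTriggerPath(changed, before)); // false: no ISO rebuild
console.log(matchesTriggerPath(changed, after)); // true: ISO rebuild triggered
```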

Audits performed:

  REQUIRED_FILES (10):
    zeta-install.sh, zeta-first-boot.sh, installer/configuration.nix,
    initial-password.nix, operator-ssh-keys.nix, operator-ssh-keys.txt,
    common.nix, injected-hostname.nix, login-banner.nix, zflash.ts

  REQUIRED_SENTINELS (5 file→list pairs):
    zeta-install.sh:    Step 6.5/6.6/6.7 markers, iter-5.2.2,
                        /dev/urandom
    zeta-first-boot.sh: ETHERNET_WAIT_SECS, nmtui, zeta-install
    common.nix:         imports of injected-hostname.nix +
                        login-banner.nix, services.avahi, nssmdns4
    injected-hostname.nix: cluster-node-id, networking.hostName,
                           lib.mkOverride
    login-banner.nix:   getty greetingLine + helpLine, Hostname:,
                        ssh zeta@

Adding new iter-N modules: append path to REQUIRED_FILES + sentinels
to REQUIRED_SENTINELS in the audit tool. Future-Otto reads this
header to discover the pattern.
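
The shape of such an audit can be sketched as a pure function over file contents. REQUIRED_FILES, REQUIRED_SENTINELS, and the exit codes 0/1/2 come from this commit message; the function name auditSubstrate, the in-memory file map, and the rule that a missing file takes precedence over a missing sentinel are assumptions, not a description of the real tool.

```typescript
type FileContents = ReadonlyMap<string, string>; // path -> file text (assumed input)

interface AuditResult {
  exitCode: 0 | 1 | 2; // 0 pass / 1 missing file / 2 missing sentinel
  problems: string[];
}

function auditSubstrate(
  files: FileContents,
  requiredFiles: readonly string[],
  requiredSentinels: Readonly<Record<string, readonly string[]>>,
): AuditResult {
  const missingFiles = requiredFiles.filter((path) => !files.has(path));
  if (missingFiles.length > 0) {
    return { exitCode: 1, problems: missingFiles.map((p) => `missing file: ${p}`) };
  }
  const problems: string[] = [];
  for (const [path, sentinels] of Object.entries(requiredSentinels)) {
    const text = files.get(path) ?? "";
    for (const sentinel of sentinels) {
      // Plain substring check, matching the "pure source-level grep" description.
      if (!text.includes(sentinel)) problems.push(`${path}: missing sentinel "${sentinel}"`);
    }
  }
  return problems.length > 0 ? { exitCode: 2, problems } : { exitCode: 0, problems: [] };
}
```

Keeping the check free of file-system calls makes it unit-testable; a thin wrapper would read the files and pass the result's exitCode to the process.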

Follow-on PRs in the test-substrate cascade (per Aaron's direction):

  - Unit tests for zflash.ts parseArgs + RFC1123 validation +
    mountEsp method-selection (Bun test runner; no I/O)
  - Docker-based zeta-install.sh test (mocked /dev devices +
    mocked /iso + /tmp/zeta-boot-esp; tests Step 6.6 + 6.7 logic
    without VM boot)
  - ISO content audit (7z list of built ISO; verifies expected
    paths + boot config; runs AFTER nix build, before artifact
    upload)
  - NixOS test framework (full QEMU VM boot + install round-trip;
    asserts pre-login banner, ssh-zero-typing, NM-profile
    persistence, hostname auto-gen)
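
For the RFC1123 item above, the kind of check such unit tests would exercise can be pictured as a single-label validator: 1 to 63 characters, letters, digits, and hyphens only, with no leading or trailing hyphen. This is a sketch under that reading of RFC 1123, not the zflash.ts implementation; isValidHostnameLabel is a hypothetical name.

```typescript
/** Hypothetical validator for one hostname label (no dots), per RFC 1123 rules. */
function isValidHostnameLabel(label: string): boolean {
  // First and last characters must be alphanumeric; up to 61 characters in
  // between may also be hyphens, giving a maximum total length of 63.
  return /^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/i.test(label);
}
```

Both an operator override such as pikachu and an auto-generated node-<6hex> name satisfy this rule, while names containing underscores or starting with a hyphen do not.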

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>