From e64facac699232d2ba60312c5b37a6a3fe0529d4 Mon Sep 17 00:00:00 2001
From: Lior <lior@zeta.dev>
Date: Tue, 26 May 2026 00:57:12 -0400
Subject: [PATCH 1/2] backlog(B-0771): re-land audio codecs working (DAW-ready)
 + Intel NPU/VPU exposed + ONNX as operator contract

Re-land of stale-DIRTY PR #5058 (Tier-3 per pr-triage-tiers).
Same B-0771 row (269 lines) from PR #5058 head cf9f8e2fc; BACKLOG.md
regenerated; pre-emptive MD022 fix (heading wrap across lines collapsed
to single line to satisfy blanks-around-headings).

Co-Authored-By: Claude <noreply@anthropic.com>
---
 docs/BACKLOG.md                               |   1 +
 ...r-daw-and-ai-workloads-aaron-2026-05-25.md | 269 ++++++++++++++++++
 2 files changed, 270 insertions(+)
 create mode 100644 docs/backlog/P2/B-0771-audio-codecs-working-plus-intel-npu-vpu-exposed-for-daw-and-ai-workloads-aaron-2026-05-25.md

diff --git a/docs/BACKLOG.md b/docs/BACKLOG.md
index f39ab2a938..abb825d231 100644
--- a/docs/BACKLOG.md
+++ b/docs/BACKLOG.md
@@ -714,6 +714,7 @@ are closed (status: closed in frontmatter)._
 - [ ] **[B-0762](backlog/P2/B-0762-ai-auto-submit-back-telemetry-fixes-from-in-the-wild-installs-adoption-cost-to-zero-flywheel-aaron-2026-05-25.md)** AI auto-submit-back telemetry + fixes from in-the-wild installs — adoption-cost-to-zero flywheel
 - [ ] **[B-0764](backlog/P2/B-0764-cncf-ecosystem-as-force-multipliers-behind-zeta-interfaces-keda-dapr-opa-oam-kubevela-plus-ace-and-ontology-negotiation-aaron-2026-05-25.md)** CNCF ecosystem as force multipliers behind Zeta interfaces — KEDA, DAPR, OPA, OAM/KubeVela + Ace + ontology negotiation
 - [ ] **[B-0770](backlog/P2/B-0770-gl-inet-comet-pro-ip-kvm-integration-remote-bios-to-cluster-member-zero-physical-access-aaron-2026-05-25.md)** GL.iNet Comet Pro IP-KVM integration — remote BIOS-to-cluster-member; zero-physical-access cluster bring-up + repair
+- [ ] **[B-0771](backlog/P2/B-0771-audio-codecs-working-plus-intel-npu-vpu-exposed-for-daw-and-ai-workloads-aaron-2026-05-25.md)** Audio codecs working (DAW-ready) + Intel NPU/VPU exposed as cluster compute resource — beyond cosmetic firmware fix
 - [ ] **[B-0772](backlog/P2/B-0772-observable-cluster-fabric-universal-device-plugins-plus-reticulum-mesh-plus-polyglot-rx-streams-aaron-2026-05-25.md)** Observable cluster fabric — universal device plugins (NPU/GPU/audio/etc.) + Reticulum mesh (AllJoyn-successor) + polyglot Rx streams in every language
 - [ ] **[B-0774](backlog/P2/B-0774-etcdless-options-kine-adapter-dqlite-postgres-nats-zeta-native-dbsp-aaron-2026-05-25.md)** Etcd-less k8s options — kine adapter family (SQLite/Postgres/MySQL/NATS/Dqlite) + Zeta-native DBSP+Raft endgame
 - [ ] **[B-0775](backlog/P2/B-0775-ha-kubernetes-that-scales-beyond-etcd-cockroach-nats-supercluster-karmada-cluster-api-cell-based-aaron-2026-05-25.md)** HA Kubernetes that scales beyond etcd — CockroachDB / NATS super-cluster / Karmada / KubeStellar / Cluster API / cell-based architecture
diff --git a/docs/backlog/P2/B-0771-audio-codecs-working-plus-intel-npu-vpu-exposed-for-daw-and-ai-workloads-aaron-2026-05-25.md b/docs/backlog/P2/B-0771-audio-codecs-working-plus-intel-npu-vpu-exposed-for-daw-and-ai-workloads-aaron-2026-05-25.md
new file mode 100644
index 0000000000..2059f6182f
--- /dev/null
+++ b/docs/backlog/P2/B-0771-audio-codecs-working-plus-intel-npu-vpu-exposed-for-daw-and-ai-workloads-aaron-2026-05-25.md
@@ -0,0 +1,269 @@
+---
+id: B-0771
+priority: P2
+status: open
+title: Audio codecs working (DAW-ready) + Intel NPU/VPU exposed as cluster compute resource — beyond cosmetic firmware fix
+effort: M
+ask: aaron 2026-05-25
+created: 2026-05-25
+last_updated: 2026-05-25
+depends_on:
+  - B-0754
+composes_with:
+  - B-0755
+  - B-0759
+  - B-0761
+  - B-0763
+  - B-0764
+  - B-0767
+tags: [cluster, audio, daw, npu, vpu, intel, openvino, ai-workload, pipewire, alsa, jack]
+---
+
+## Problem
+
+Aaron 2026-05-25 mid-iter-3-prep, extending the audio-firmware
+cleanup scope: *"i'd like the sound codecs workikng and npus for
+use like by daw and others."*
+
+B-0754 iter-3 PR (#5057) bundles `hardware.enableRedistributableFirmware = true`
+into the installer ISO, which silences the Intel SoF `ASoC: failed
+to instantiate card -2` boot-time warning by giving the firmware
+probe what it needs. But that's only the FIRMWARE LAYER. To
+actually use audio (DAW workloads) + NPU (AI inference workloads)
+on Zeta cluster nodes, more substrate is needed:
+
+| Layer | What iter-3 fixes | What this row adds |
+|---|---|---|
+| Kernel | Firmware blobs reachable; SoF + intel_vpu can probe cleanly | Confirm intel_vpu module loaded; expose `/dev/accel/accel0` for NPU |
+| Audio stack | (nothing) | ALSA configured; PipeWire (default modern stack) or PulseAudio; JACK for DAW-class real-time audio; per-role config |
+| NPU userspace | (nothing) | intel-npu-driver userspace package; OpenVINO 2024.0+ runtime; per-workload library access |
+| Container access | (nothing) | k8s device plugin for NPU (similar to nvidia-device-plugin); makes `intel.com/npu: 1` requestable in Pod specs |
+| Per-role substrate | (nothing) | New host config variants: `workstation` (audio + NPU + GPU + GUI); existing roles get NPU-only by default if NPU hardware present |
+| Operator UX | (nothing) | Documented: "use Zeta cluster node as a DAW workstation"; "schedule AI inference on NPU instead of GPU when latency-acceptable" |
+
+## Target
+
+Three coherent capability deliveries:
+
+### Capability 1: Audio works end-to-end on cluster nodes (DAW-ready)
+
+An operator can use a Zeta cluster node as a workstation that runs
+a DAW (Ableton Live via Wine, Reaper, Ardour, Bitwig) with real-time
+audio. This covers audio output (HDMI / analog / S/PDIF / USB),
+microphone input, and a low-latency PipeWire/JACK substrate.
+
+### Capability 2: NPU exposed as a first-class cluster compute resource
+
+Operator can deploy AI workloads (inference primarily; some
+training where NPU substrate supports) that request NPU compute
+via standard k8s resource-requests (`intel.com/npu: 1` or similar
+per device plugin convention). Workloads use OpenVINO 2024.0+
+runtime to access NPU. Scheduler (B-0767) is NPU-aware for
+multi-NPU nodes.
+
+### Capability 3: Per-role substrate handles audio + NPU declaratively
+
+New host role `workstation` (composes with B-0755 role taxonomy)
+that includes audio stack + NPU + GPU + GUI for nodes that double
+as user workstations. Existing roles (`control-plane`,
+`worker-gpu`, etc.) get NPU support (no audio stack) enabled
+automatically when NPU hardware is detected via lspci at install time.
+
+## Acceptance
+
+### Capability 1: Audio
+
+- [ ] `modules/audio-stack.nix` ships PipeWire as the default
+      audio server (the current NixOS recommendation; it also
+      serves PulseAudio and JACK clients through compatibility layers)
+- [ ] ALSA + UCM profiles configured per detected codec
+      (SoundWire codecs auto-recognized via topology files
+      shipped in sof-firmware)
+- [ ] Real-time audio support: `security.rtkit.enable = true`;
+      audio group membership for operator user; rtprio limits
+      adjusted for JACK / Pro Audio
+- [ ] Optional JACK2 + jack2-dbus for DAW workloads that
+      require explicit JACK (most modern DAWs work over
+      PipeWire's JACK shim; explicit JACK for hardcore
+      Pro Audio cases)
+- [ ] Test: play a tone + record from mic on a Zeta cluster
+      node; document the test recipe; per-hardware-class
+      compatibility matrix
+- [ ] DAW-specific recipes: documented bootstrap for Ableton
+      Live (via WINE/Bottles), Reaper (native Linux), Ardour
+      (native), Bitwig (native); per-DAW USB-audio-interface
+      compatibility notes
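+
+The audio items above could be sketched as a NixOS module roughly as
+follows. This is a minimal sketch, not the shipped
+`modules/audio-stack.nix`: the user name `operator` is a placeholder,
+and the rtprio and memlock values are common pro-audio starting
+points (assumptions), not tuned numbers.
+
+```nix
+# Hypothetical sketch of modules/audio-stack.nix.
+{ ... }:
+{
+  # rtkit lets PipeWire acquire real-time scheduling without root.
+  security.rtkit.enable = true;
+
+  services.pipewire = {
+    enable = true;
+    alsa.enable = true;   # ALSA clients route through PipeWire
+    pulse.enable = true;  # PulseAudio protocol compatibility
+    jack.enable = true;   # JACK API compatibility for DAWs
+  };
+
+  # Raise real-time limits for members of the audio group.
+  security.pam.loginLimits = [
+    { domain = "@audio"; type = "-"; item = "rtprio"; value = "95"; }
+    { domain = "@audio"; type = "-"; item = "memlock"; value = "unlimited"; }
+  ];
+
+  # "operator" is a placeholder; the user is assumed to be defined elsewhere.
+  users.users.operator.extraGroups = [ "audio" ];
+}
+```
+
+Explicit JACK2 is left out of the sketch because the acceptance item
+above treats it as optional.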
+
+### Capability 2: NPU
+
+- [ ] `modules/intel-npu.nix` ships:
+      - `boot.kernelModules = [ "intel_vpu" ];` (mainline
+        kernel 6.5+ has the driver)
+      - `hardware.firmware = with pkgs; [ intel-npu-driver-firmware ];`
+        (Intel NPU firmware blobs)
+      - Non-root user access to `/dev/accel/accel0` via
+        udev rule + group membership
+- [ ] `modules/openvino.nix` ships OpenVINO 2024.0+ runtime
+      with NPU plugin enabled
+- [ ] k8s NPU device plugin: ships intel/intel-device-plugins-
+      for-kubernetes (specifically the GPU + VPU plugins);
+      registers `intel.com/npu` resource per node with NPU
+- [ ] Per-pod NPU access: documented Pod spec for requesting
+      `intel.com/npu: 1` + mounting OpenVINO runtime
+- [ ] Test: run a simple OpenVINO inference workload
+      (resnet50 image classification) on NPU vs CPU; show
+      latency + power benefit
+- [ ] Auto-detection: install-time `lspci | grep -i "Neural"`
+      check; if NPU present, enable NPU modules in installed
+      host config
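+
+A hedged sketch of what `modules/intel-npu.nix` might contain for the
+kernel-module and device-permission items above. The udev rule and the
+`render` group are assumed to mirror the convention in Intel's Linux
+NPU driver README; the `operator` user name is a placeholder, and
+firmware packaging is omitted because the package name depends on
+the nixpkgs revision in use.
+
+```nix
+# Hypothetical sketch of modules/intel-npu.nix.
+{ ... }:
+{
+  # Load the NPU kernel driver at boot.
+  boot.kernelModules = [ "intel_vpu" ];
+
+  # Make /dev/accel/accel* accessible to the render group.
+  services.udev.extraRules = ''
+    SUBSYSTEM=="accel", KERNEL=="accel*", GROUP="render", MODE="0660"
+  '';
+
+  # "operator" is a placeholder; the user is assumed to be defined elsewhere.
+  users.users.operator.extraGroups = [ "render" ];
+}
+```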
+
+### Capability 3: Per-role substrate
+
+- [ ] New `nixos/hosts/workstation/configuration.nix` host
+      config: composes modules/audio-stack.nix + modules/
+      intel-npu.nix + modules/desktop-environment.nix (GNOME
+      or KDE Plasma) + modules/k3s-agent.nix (so workstation
+      is also a cluster member)
+- [ ] flake.nix `nixosConfigurations.workstation = mkSystem
+      { modules = [ ... ]; };` entry
+- [ ] B-0754 v1 role keystroke prompt extended: add 'k' for
+      workstation (composes with B-0755 role taxonomy
+      expansion); other role options unchanged
+- [ ] Existing roles (`control-plane`, `worker-gpu`) get
+      NPU support enabled at install time if an NPU is detected
+      (audio stack OFF by default unless workstation role)
+- [ ] Documented: when to pick workstation role (DAW user;
+      AI engineer wanting NPU on local box; cluster member
+      that's also a daily-driver)
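+
+The workstation composition above might look like the following host
+file. The relative import paths are an assumption about where
+`modules/` sits relative to `nixos/hosts/workstation/`; adjust them
+to the actual repository layout.
+
+```nix
+# Hypothetical sketch of nixos/hosts/workstation/configuration.nix.
+{ ... }:
+{
+  imports = [
+    ../../modules/audio-stack.nix          # PipeWire, rtkit, limits
+    ../../modules/intel-npu.nix            # NPU driver and device access
+    ../../modules/desktop-environment.nix  # GNOME or KDE Plasma
+    ../../modules/k3s-agent.nix            # keeps the node a cluster member
+  ];
+}
+```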
+
+### Capability 4: Scheduler awareness (composes with B-0767)
+
+- [ ] `Zeta.K8s.Scheduler` NPU-aware plugin (per B-0767
+      sub-wave B GPU topology + sub-wave C model locality):
+      - NPU-class workloads prefer nodes with available NPU
+      - Workload-class fitness: small models (sub-1B params)
+        + low-latency requirements → NPU; large models → GPU
+      - Latency-vs-throughput trade-off awareness
+
+## ServiceTitan-route composition (B-0765 / B-0763)
+
+Inference access is layered on existing standards:
+
+| Layer | Standard | Role |
+|---|---|---|
+| **Model format** | **ONNX** (Open Neural Network Exchange) | Operator-facing portability format; cross-framework (PyTorch / TF / scikit-learn / XGBoost export to ONNX); one model definition runs everywhere Zeta supports |
+| **Inference runtime** | **ONNX Runtime** (Microsoft-led) | Cross-platform engine that abstracts hardware backends via Execution Providers |
+| **ONNX Runtime EPs** | OpenVINO EP (Intel CPU/GPU/NPU); CUDA EP + TensorRT EP (NVIDIA); ROCm EP + MIGraphX EP (AMD); CoreML EP (Apple); DirectML EP (Windows); default CPU EP (fallback) | One ONNX model → ONNX Runtime → best-fit EP per node hardware |
+| **Native vendor runtimes** | OpenVINO (Intel); TensorRT (NVIDIA); MIGraphX (AMD); Core ML (Apple) | Highest perf for vendor-specific workloads; Zeta exposes for operators who want max perf at cost of portability |
+| **k8s device plugin** | intel-device-plugins-for-kubernetes; nvidia-device-plugin; amd-device-plugin | Existing CNCF/vendor standards adopted per B-0764 |
+| **PipeWire / ALSA / JACK** | Linux audio standards | Operators using vanilla audio software unchanged |
+
+**Substrate-honest layering**: ONNX is the operator's contract;
+runtime selection is Zeta's substrate decision (scheduler-driven
+per B-0767). Operator deploys an ONNX model + resource
+requirements; Zeta picks the best Execution Provider per node
+hardware automatically. Operator can override with explicit
+vendor-native runtime (OpenVINO IR, TensorRT engine, MIGraphX
+MXR) for max perf when willing to lose portability.
+
+### Why ONNX as operator contract (not OpenVINO or TensorRT directly)
+
+| Choice | Portability | Perf ceiling | Operator lock-in |
+|---|---|---|---|
+| **ONNX → ONNX Runtime → vendor EP** | High | High (within EP capabilities) | None — model moves anywhere |
+| OpenVINO IR (operator-side) | Low (Intel-only) | Highest on Intel | High (Intel-locked) |
+| TensorRT engine (operator-side) | None (NVIDIA-only + per-GPU-compiled) | Highest on NVIDIA | High (NVIDIA-locked) |
+| PyTorch model (operator-side) | High via torch.compile | Variable | Medium (torch ecosystem) |
+| TensorFlow SavedModel (operator-side) | High | Variable | Medium (TF ecosystem) |
+
+ONNX wins the operator-contract slot because it preserves
+operator optionality (per B-0763 negotiation-high-seat) while
+allowing Zeta to pick the best execution path per node. Operators
+wanting max perf can OPT INTO vendor-native runtime; default
+contract stays portable.
+
+Per B-0763 vendor swap: alternative inference runtimes (MLIR-
+based, TVM, IREE, ONNX MLIR), alternative NPU hardware (AMD
+XDNA, Apple Neural Engine via Asahi, Hailo-8/15, etc.), and
+alternative scheduler-policy backends all fit the same
+`Zeta.AI.Inference` interface; operators swap via config change
+without rewriting application code.
+
+## Hardware compatibility matrix
+
+| Hardware class | Audio | NPU | Workstation-role-suitable |
+|---|---|---|---|
+| Intel Meteor Lake (Core Ultra 1st gen) | SoF + SoundWire | Intel NPU 3 (~11 TOPS INT8) | Yes |
+| Intel Lunar Lake (Core Ultra 200V) | SoF + SoundWire | Intel NPU 4 (48 TOPS INT8) | Yes |
+| Intel Arrow Lake (Core Ultra 200S desktop) | HD Audio | Intel NPU 3 (13 TOPS) | Yes |
+| AMD Ryzen AI 300 series | HD Audio | XDNA NPU (50 TOPS) | Yes (different driver substrate; out of scope v1) |
+| Apple Silicon (M-series) | Apple Audio | Neural Engine | Asahi Linux scope; out of scope v1 |
+| NVIDIA + dedicated GPU only | HD Audio | (none — GPU does the inference) | Yes (no NPU; GPU schedules) |
+
+V1 scope: Intel Meteor Lake / Lunar Lake / Arrow Lake (Aaron's
+hardware family). AMD XDNA + Apple Neural Engine as separate
+sub-rows when those operators show up.
+
+## Composes with
+
+- B-0754 — installer ISO (the iter-3 firmware fix is the
+  prerequisite this row builds on)
+- B-0755 — role taxonomy expansion (workstation role lands here)
+- B-0759 — first-time-CLI-user persona (DAW operators are a
+  persona variant; AI engineers running OpenVINO are another)
+- B-0761 — open reference architecture (NPU + audio support
+  becomes part of the AI-cluster reference; ARC-AGI scenarios
+  can include NPU-vs-GPU latency benchmarks)
+- B-0763 — cloud-native plugins fit Zeta interfaces
+  (`Zeta.Compute.NPU` interface accommodates Intel + AMD +
+  Apple + future NPU vendors)
+- B-0764 — CNCF force multipliers (intel-device-plugins-for-
+  kubernetes is the existing standard adopted)
+- B-0767 — Zeta-native scheduler (NPU-awareness in scheduler
+  sub-wave B/C)
+
+## Why not just use kube-scheduler's default device-plugin support?
+
+
+Default kube-scheduler treats device plugins as opaque
+countable resources. For NPU specifically, you want:
+
+- NPU-vs-GPU placement based on workload class (small +
+  latency-sensitive → NPU; large → GPU)
+- Model locality (NPUs have on-chip cache; warm models stay
+  put)
+- Power-cost tier (NPU is ~10x more power-efficient than GPU
+  for small models; cluster-level power optimization wants
+  to prefer NPU when possible)
+- Multi-NPU topology awareness (some boards have multiple
+  NPUs; Zeta scheduler can co-locate multi-NPU workloads)
+
+This is exactly the value B-0767 Zeta-native scheduler adds.
+NPU awareness is one of its first concrete sub-wave-B/C
+plugins.
+
+## Out of scope
+
+- AMD XDNA NPU support — separate row when an operator
+  shows up with that hardware
+- Apple Neural Engine via Asahi Linux — separate row; Asahi
+  Linux substrate is its own track
+- Audio-over-network (Dante / AVB / AES67) for studio-grade
+  multi-machine audio — separate scope; pro-audio-cluster row
+- VST / AU plugin substrate (running Windows VSTs via WINE,
+  Mac AU plugins via emulation) — separate per-DAW scope
+- Sample library distribution + asset management for DAWs —
+  separate scope
+- Headphone-spatial-audio / Dolby Atmos / HRTF processing —
+  separate scope
+
+## Origin
+
+Aaron 2026-05-25 mid-iter-3-prep, extending the cosmetic
+firmware fix to the broader scope: cluster nodes should
+support DAW workloads (audio working) + NPU compute (for AI
+inference) — not just be silent-cluster-workers. This makes
+Zeta cluster substrate suitable for AI engineers + creative
+professionals using their cluster nodes as workstations too,
+which composes naturally with the AI-cluster-substrate value
+prop the reference architecture (B-0761) demonstrates.

From df7b1315dadccb0036d97805364ab0faeccb8150 Mon Sep 17 00:00:00 2001
From: Lior <lior@zeta.dev>
Date: Tue, 26 May 2026 00:58:58 -0400
Subject: [PATCH 2/2] fix(B-0771): collapse MD012 double blank introduced by
 MD022 heading-wrap fix

Co-Authored-By: Claude <noreply@anthropic.com>
---
 ...-npu-vpu-exposed-for-daw-and-ai-workloads-aaron-2026-05-25.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/docs/backlog/P2/B-0771-audio-codecs-working-plus-intel-npu-vpu-exposed-for-daw-and-ai-workloads-aaron-2026-05-25.md b/docs/backlog/P2/B-0771-audio-codecs-working-plus-intel-npu-vpu-exposed-for-daw-and-ai-workloads-aaron-2026-05-25.md
index 2059f6182f..a2b80f3c9f 100644
--- a/docs/backlog/P2/B-0771-audio-codecs-working-plus-intel-npu-vpu-exposed-for-daw-and-ai-workloads-aaron-2026-05-25.md
+++ b/docs/backlog/P2/B-0771-audio-codecs-working-plus-intel-npu-vpu-exposed-for-daw-and-ai-workloads-aaron-2026-05-25.md
@@ -224,7 +224,6 @@ sub-rows when those operators show up.
 
 ## Why not just use kube-scheduler's default device-plugin support?
 
-
 Default kube-scheduler treats device plugins as opaque
 countable resources. For NPU specifically, you want: