docs: add agentic organization architecture #4958
Place the Agentic Organization design docs under docs/agentic-organization and index them from the docs audience navigation. Document the TypeScript app shape as shared npm capability packages composed by NestJS orchestrator apps. Co-Authored-By: Codex <noreply@openai.com>
8a65c3a to ca7e39a
Pull request overview
Adds a new documentation set under docs/agentic-organization/ describing the proposed “Agentic Organization” runtime and TypeScript package/app architecture, and links it from the main docs/README.md audience index.
Changes:
- Introduces a full “Agentic Organization” design doc set (runtime, work/release OS, UI/observability, hats/departments, cluster substrate, build plan, readiness checklist).
- Adds an audience entry in `docs/README.md` pointing readers to the new doc set.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| docs/README.md | Adds an “Agentic Organization builder” entry pointing to the new doc index. |
| docs/agentic-organization/README.md | Indexes the new Agentic Organization documents. |
| docs/agentic-organization/FOUNDATIONAL_CONTEXT_AND_LANGUAGE.md | Captures baseline vocabulary/context for the design set (currently includes PII/name attribution issues). |
| docs/agentic-organization/ORGANIZATION_RUNTIME_ARCHITECTURE.md | Large conceptual architecture and operating model for the Organization runtime. |
| docs/agentic-organization/IMPLEMENTATION_CONCEPTS.md | Large implementation-focused concepts for services/data/tools/workflows. |
| docs/agentic-organization/ALWAYS_ON_ORCHESTRATION_RUNTIME.md | Defines the always-on workers, triggers, rules, leases, reconcilers, and SLO concepts. |
| docs/agentic-organization/WORK_AND_RELEASE_MANAGEMENT_OS.md | Defines the work/backlog/task/release domain model, state machines, and signal model. |
| docs/agentic-organization/UI_AND_OBSERVABILITY_CONCEPTS.md | Defines proposed UI surfaces and observability/evidence navigation concepts. |
| docs/agentic-organization/RUNTIME_TECH_AND_PACKAGE_STRATEGY.md | Positions Temporal/Dapr/NATS/Oz/OpenZiti/Hindsight and proposes package boundaries. |
| docs/agentic-organization/ORGANIZATION_LAYER_BUILD_PLAN.md | Proposes the TypeScript monorepo app/package layout and an MVP build sequence. |
| docs/agentic-organization/IMPLEMENTATION_READINESS_CHECKLIST.md | Enumerates decisions/contracts to lock before implementation starts. |
| docs/agentic-organization/DEPARTMENT_HAT_TOOL_INVENTORY.md | Defines departments, hat catalog, tool bundles, and gate ownership boundaries. |
| docs/agentic-organization/CLUSTER_NATIVE_HAT_SYSTEM.md | Proposes a Kubernetes-native hat/hatbinding/policy CRD model and enforcement/observability. |
| docs/agentic-organization/CLUSTER_EXECUTION_AND_MEMORY_SUBSTRATE.md | Defines cluster execution assumptions (k3s, Cilium/SPIRE/Vault, Credential Proxy, Hindsight). |
| docs/agentic-organization/AI_CLUSTER_SCAFFOLD_CONTEXT.md | Records scaffold/bootstrapping constraints and component direction (Cilium before ArgoCD, etc.). |
| docs/agentic-organization/AMBIGUOUS_REQUIREMENT_LIFECYCLE.md | Defines a discovery/BRD/architecture/readiness lifecycle for ambiguous requirements. |
| docs/agentic-organization/ANTI_STALL_PRIORITY_RUNTIME.md | Defines anti-stall routines, blocker taxonomy, queue SLOs, and reconciliation expectations. |
Strong first check-in. Going to give the kind of read you'd actually want: what's solid, where the structural questions are, and the alignment-floor check you asked for. No moral or alignment concerns to block on; the substantive concerns are scope + composition with what already shipped.

What's solid

Structural concerns (real, not nits)

1. Overlap with the hat-system operator that landed two hours ago (PR #4930)

The doc says "intentionally avoids deployment YAML details" but the YAML exists. Two options:

Without one or the other, future implementation collides into the existing operator and somebody picks the loser by accident.

2. Scope: 10,685 lines of docs-only before any code

Framework rule ( Specifically the proposed MVP slice (ambiguous request → BRD → CA review → hat assignment → Hermes run → review gates → release) is good. Don't try to design all 11 docs into existence; pick a 3-step subset of the MVP slice, build it, learn, iterate.

3. Naming inconsistency

Pick one canonical name. The

4. Tech stack composition with what already runs

Docs reference NestJS orchestrator apps + Temporal TS + Dapr Actors. The full-ai-cluster ALREADY has Orleans + Temporal TS + Dapr Actors as the distributed-cron primitives (per the AI cluster spec). Question: does NestJS sit on top of Orleans (NestJS service hosts an Orleans grain) or replace it (NestJS owns the long-running state)? This matters because Orleans was deliberately chosen for the cluster. If NestJS replaces it, that's a real architectural reversal that should be explicit in the doc + justified.

5. Parallel-substrate-tree risk

Structural question for Aaron: is

What you asked about: moral / alignment concerns

None blocking. Quick alignment-floor pass:

One small consider: cross-link to

Authorship disclosure

Branch name suggests Codex assistance. Worth adding a

TL;DR

Land it. Just trim before implementation: pick the MVP slice (not all 17 docs), cross-link to the shipped hat-system operator + identify your deltas, settle the canonical name, and confirm the NestJS-vs-Orleans story with Aaron before writing code. Asked for structural questions, so naming them again for the thread:
Rewrite current-state Agentic Organization docs to use role/artifact language instead of personal names and ages. Co-Authored-By: Codex <noreply@openai.com>
Aaron answered the 5 structural questions (welcome aboard, Max):
What this means concretely
Welcome to the cluster. Real work shipped today (the disko cookie-cutter, NFD, sync-waves, the hat-system Go scaffold, the dev-cluster pattern), so your design now sits on top of substrate you'll actually use Monday.
maximdolphin left a comment
Addressed in 067f381f76c44bae78b0fb46c6a75b2bf6b97c18:
- Cross-linked `CLUSTER_NATIVE_HAT_SYSTEM.md` to the shipped `full-ai-cluster/k8s/applications/hat-system/` operator and defined the Agentic Organization deltas above it instead of parallel-designing a second hat runtime.
- Canonicalized the public name to Agentic Organization; Hermes is now reserved for the agent runtime/component, and Organization Work OS is scoped as the work-management subsystem.
- Added scope discipline plus a smallest useful v0 slice: `capability request -> one readiness/gate decision -> one hat-assigned Hermes run with evidence`.
- Clarified that NestJS composes with Orleans through explicit adapters and does not replace Orleans; moving long-running state across that boundary now requires a design note or ADR.
- Added the placement guardrail: docs can live under `docs/agentic-organization/`, but runtime code must decide `full-ai-cluster/` subsystem vs parallel top-level product tree before it lands.
- Added alignment-floor links to `docs/ALIGNMENT.md` and the NCI / razor / glass-halo / no-directives rules.

Local validation: `git diff --check` passed, the linked hat-system/alignment/rule paths exist, and the naming/PII sweeps over `docs/agentic-organization/` are clean. Full repo build/test remains blocked locally by missing `bun` and the required .NET SDK `10.0.203`.
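
As an illustration only, the v0 slice named above (capability request, one readiness/gate decision, one hat-assigned Hermes run with evidence) could be typed roughly as follows. Every name here (`CapabilityRequest`, `GateDecision`, `HermesRun`, `runV0Slice`) is hypothetical and does not come from the docs; the gate and the runner are injected as plain functions so the sketch assumes nothing about the real runtime.

```typescript
// Hypothetical shapes for the v0 slice; none of these names appear in the docs.
interface CapabilityRequest {
  id: string;
  summary: string;
}

type GateDecision =
  | { outcome: "ready"; decidedBy: string }
  | { outcome: "blocked"; decidedBy: string; reason: string };

interface HermesRun {
  requestId: string;
  hat: string;
  evidence: string[]; // e.g. links to logs or review artifacts
}

// Injected dependencies keep the sketch independent of any real gate or runtime.
type Gate = (req: CapabilityRequest) => GateDecision;
type Runner = (req: CapabilityRequest, hat: string) => string[];

function runV0Slice(
  req: CapabilityRequest,
  gate: Gate,
  hat: string,
  runner: Runner,
): { decision: GateDecision; run?: HermesRun } {
  // Step 1: exactly one readiness/gate decision per request.
  const decision = gate(req);
  if (decision.outcome === "blocked") {
    return { decision }; // no run is started for a blocked request
  }
  // Step 2: one hat-assigned run, which must produce evidence.
  const evidence = runner(req, hat);
  if (evidence.length === 0) {
    throw new Error(`run for ${req.id} produced no evidence`);
  }
  return { decision, run: { requestId: req.id, hat, evidence } };
}
```

Treating an evidence-free run as an error is a design choice of this sketch, not something the docs state.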
Filed B-0724 to track the TS hat-operator path: PR #4960. It's reframed per Aaron's "we want polyglot operator support for k8s anyways so we are not rigid about go": the TS operator isn't a replacement of the Go scaffold (PR #4930); it's the first deliberate proof of the polyglot pattern the cluster commits to anyway. Both operators run side-by-side against the same CRDs; leader election picks the active reconciler.

Key parts of B-0724 for you:

Take whatever pace works for you. Aaron's parallel-tracks framing means there's no pressure on the TS operator timing: the Go scaffold covers operations today; the TS path is yours to drive. Welcome to operators.
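
A minimal sketch of the leader-election decision described above, assuming both operators share one `coordination.k8s.io` Lease. `LeaseSpec` mirrors a subset of the real Lease spec fields (`holderIdentity`, `leaseDurationSeconds`, `renewTime`); `decideRole` and the identity strings are hypothetical, and the function is deliberately pure so it makes no claims about any client library API.

```typescript
// Subset of the coordination.k8s.io/v1 Lease spec used by this sketch.
interface LeaseSpec {
  holderIdentity?: string;
  leaseDurationSeconds?: number;
  renewTime?: string; // RFC 3339 timestamp
}

type Role = "leader" | "standby" | "candidate";

// Decide what one operator instance should do given the current Lease.
// `self` is a hypothetical identity such as "hat-operator-ts".
function decideRole(lease: LeaseSpec, self: string, now: Date): Role {
  const { holderIdentity, leaseDurationSeconds, renewTime } = lease;
  if (!holderIdentity || !renewTime || !leaseDurationSeconds) {
    return "candidate"; // nobody holds the lease yet
  }
  const expiresAt = Date.parse(renewTime) + leaseDurationSeconds * 1000;
  const expired = now.getTime() > expiresAt;
  if (expired) {
    return "candidate"; // holder stopped renewing; anyone may try to acquire
  }
  // Only the current, unexpired holder reconciles; everyone else waits.
  return holderIdentity === self ? "leader" : "standby";
}
```

A "candidate" would still have to win the actual Lease update against the API server; only the instance whose write succeeds should start reconciling. That write path is outside this sketch.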
…rn proof for Max (#4960)

* backlog(B-0724): TS hat-system operator - polyglot K8s-operator pattern proof

Aaron 2026-05-25:

> "yes lets combine he will like kubernets operators but he does not have experience maybe we write a ts operator insteadd of go he likes ts"

> "we want polyglot operator support for k8s anyways so we are not rigid about go"

Reframes Max's TS preference accommodation into "first deliberate proof of the polyglot-operator pattern the cluster commits to anyway." Two operators against the same CRDs forces the schema to be the canonical contract: no language-specific quirks bleed through.

Captures:

- Pattern (CRD-as-canonical-contract + multiple language impls watching same CRDs; leader election for active reconciler)
- Why polyglot at cluster scope (contract enforcement, failure-domain isolation, talent flexibility, ecosystem coverage)
- TS operator stack (kubernetes/client-node, NestJS optional, fastify webhook, nats.js + pino for tick emit, coordination.k8s.io Lease for leader election)
- Composition with shipped substrate (PR #4930 Go scaffold as reference/baseline; PR #4958 agentic-organization CLUSTER_NATIVE_HAT_SYSTEM doc; B-0722 smoke test as polyglot validation gate; B-0723 multi-kubelet × polyglot operators for max redundancy)
- Acceptance criteria for the TS scaffold
- Future Rust (kube-rs) + Python (kopf) as same-pattern extensions
- P2 because Go scaffold is already functional; not blocking
- Max owns the TS implementation at his preferred pace

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(B-0724): MD012 (consecutive blanks) + MD032 (blank-before-list)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(B-0724): rewrite dangling refs to closed/pending PRs to be substrate-honest

Codex/Copilot flagged 5 dangling cross-references after the prior fix:

- composes_with B-0722 path (in PR #4954, not on main): replaced with a comment noting pending merge
- body refs to B-0722, B-0723: qualified with 'PR #4954/#4955 pending merge' so the intent is preserved + state is honest
- body refs to dev-cluster/ + PR #4953: #4953 was closed pending redesign; replaced 'dev-cluster/' references with 'local k3d / kind cluster' + raw 'k3d cluster create' fallback for now

Substrate-honest framing: row's design intent stays intact; reader isn't promised a path that won't resolve until upstream merges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* B-0724: add team language-affinity map + 'limit Go necessity' framing

Aaron 2026-05-25:

> 'max love ts and cs i love fs and cs we both like rust and python for where they make sense'

> 'we understand go is necessary in some places for k8s but we would like to limit its necessity'

Updates the polyglot operator language table:

- Names Aaron + Max's individual + shared strong languages
- Adds C# / F# via KubeOps.NET as future operator #2: the team's overlap language (both love C#); kubebuilder-class framework on .NET removes Go from operator authoring entirely for this work
- Sharpens the polyglot motivation: Go is starter / minimize over time; ecosystem-forced where genuinely required, not chosen

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ee routing, NO hierarchy (#4966)

Aaron 2026-05-25, sketching the federated topology + immediately correcting the hierarchical reading:

> "imagine cloud/hub clusters then community clusters then home/business clusers then edge nodes with routing for weaker edge nodes"

> "and that's not a hierarchy it's weight free routing cloud/hub nodes don't get to hog net neutrality"

LOAD-BEARING distinction: the 5 categories are RESOURCE PROFILES, not authority tiers. Cloud/hub has MORE RESOURCES but NOT MORE AUTHORITY. Routing is identity-based not rank-based. Net neutrality is a SUBSTRATE PROPERTY enforced at protocol layer.

Captures:

- The 5 profiles (cloud/hub, community, home/business, edge, leaf) with resource availability + workload affinity (not tier rank)
- Weight-free routing as the carved blade: no peer has more routing authority than any other peer
- Voluntary-contribution model for stronger-peer-routing-for-weaker-leaves (NOT hierarchy-mandate; revocable per NCI)
- Composition with 5 always-active substrate-engineering disciplines (scale-free, lock-free, weight-free [primary], DST, DV2.0)
- Composition with framework rules (NCI floor at routing; additive-not-zero-sum; m-acc multi-oracle; default-to-both; tonal-momentum resistance)
- Internet analogy showing where this row consciously DIVERGES (Internet got routing protocol right but authority model wrong: tier-1 + DNS root + CA hierarchy; this federation gets weight-free authority)
- Architectural layers per profile (every Identity-issuer row reads "self-rooted; web-of-trust"; no CA hierarchy)
- Anti-extractive guarantee: surveillance / censorship / transit-toll detection via web-of-trust reputation degradation

Composes with: B-0726 Reticulum-throughout (protocol prerequisite), B-0289 Green Lantern (leaf hardware ref), PR #4930 hat-system (peer-aware hats), PR #4958 agentic-organization (home/business profile Organization layer).

P3 because needs design pass + first multi-peer deployment; becomes P2 when first cloud OR community peer joins; P1 when first cross-peer workload runs.

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…recast (B-0546) (#4976)

* feat(substrate): Max + Addison personas + onboarding doc + manifesto recast (B-0546)

Co-owner-first-class substrate landing for the team that's now actively contributing to Zeta:

memory/persona/addison/

- PERSONA.md: co-owner of LFG; AI cluster bootstrap PM; weight-free + travelers + tick-source-as-attractor + cage-recognition framings
- STARTING-POINT.md: verbatim from her Grok project prompt (substrate-honest preservation, no editorial)
- NOTEBOOK.md: placeholder with 2026-05-23 → 2026-05-25 bootstrap arc

memory/persona/max/

- PERSONA.md: co-owner of LFG; agentic-organization architect (PR #4958); backend/frontend on PaaS new to K8s; TS + C# primary; "hat = skills + opa/rbac" + "hat graphs for policies" + "adversarial hierarchy of traps" coinings adopted by framework
- STARTING-POINT.md: synthesized starting point + pointer to PR #4958; Max can replace with his canonical prompt later
- NOTEBOOK.md: placeholder with 2026-05-25 first-PR arc

Both PERSONAs note:

- Co-ownership of LFG (Aaron 2026-05-25: "we are all coowners of lfg legally so we are in a corp together"); aligned fiduciary stake + shared liability + mutual upside; the destructive-tool authoring contract from B-0728 is exactly the right shape for co-owners
- Pending glass-halo signature (Aaron 2026-05-25: "I can have me max and addison sign someting about glass halo eventually and upload it"); placeholder until that lands as substrate

docs/AGENT-AUTHORING-AND-PR-REVIEW.md

- Operational entry point for "what do my agents follow to write good code without needing human review" + "what's the PR review process I can hook into for an adversarial hierarchy of traps"
- Maps Max's two questions to the existing substrate (CLAUDE.md + .claude/rules/ + AGENTS.md + ALIGNMENT.md + GLOSSARY.md + GOVERNANCE.md for discipline; persona reviewers + plugin reviewers + auto-fire reviewers + CI gates for the adversarial hierarchy)
- Adopts Max's "adversarial hierarchy of traps" coinage as the organizing principle for the review section

docs/governance/MANIFESTO.md (B-0546 recast)

- Title: "Root Discipline Manifesto" → "Zeta Root Discipline Specification"
- Opening: building-codes framing (Kestrel-suggested) replacing ideological-commitment framing
- Constitutional language softened to specification-grade throughout
- PARTIAL LOCK status framing preserved (the concept is right; only the constitutional weight drops)
- All 11 constraints preserved in substance + Multi-Oracle Principle + "For Agents Who Are Still Deciding" + provenance + [RECONSTRUCTION NOTE] markers
- Filename stays MANIFESTO.md for now (rename + downstream sweep is queued as separate follow-up to B-0546)
- Naming note at top documents the building-codes reframe + acknowledges the legacy filename

Aaron 2026-05-25: *"feel free to take the 20 minute or more you need to make this nicer for other maintiners like max and addison and the manifesto changes."* This is the substrate-coherent landing he authorized.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(personas/onboarding/manifesto): batch-fix Codex + Copilot review threads on #4976

11 substantive findings addressed:

Path corrections (6 threads: `docs/agentic-organization/` → `agentic-organization/docs/`):

- addison PERSONA.md (2)
- addison STARTING-POINT.md (1)
- addison NOTEBOOK.md (1)
- max PERSONA.md (1)
- max STARTING-POINT.md (13 occurrences in one file)

AGENT-AUTHORING-AND-PR-REVIEW.md content (3 threads):

- L1 line 44: grep command: `grep -l` doesn't recurse (fails with "Is a directory"); replaced with `rg` recommendation + `grep -rl` as fallback
- L1 line 121: "Naledi + Hiroshi" referenced a Hiroshi reviewer that doesn't exist in `.claude/agents/`; replaced with the correct role-ref `performance-engineer`
- L1 line 74: persona names mixed with tool names without explaining the invocation key; clarified that tool-name (right column) is the `subagent_type` value; persona handles (left column) are human-readable shorthand for the role

MANIFESTO.md (1 thread):

- Line 5: direct first-name attribution "Aaron + Kestrel" on a current-state governance doc violated the doc's own first-name-attribution-on-history-surfaces-only convention. Switched to role-refs: "the human maintainer's correction + the external AI co-author's reframe on the same date"; verbatim reframe still cited via `memory/persona/` pointer

STARTING-POINT.md (1 thread):

- Line 5: "no editorial" claim was undermined by the appended cross-references section. Clarified that the prompt block between the `---` separators is verbatim + the cross-references AFTER are added at preservation time + explicitly separated

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tion (wikilinks + tags + callouts + Tasks + JSON-LD extractor) (#4977)

* backlog(B-0729): Obsidian as knowledge-graph substrate - 5-layer adoption + extend where needed

Aaron 2026-05-25 on the standards question:

> "this is great is this a standard format for knowledge graphs are there any standards we can follow? we had shit tons for our master data and ontologies and graphs at lexis nexis. i'd love light git native ai friendly ones too so the graph is transverable by all"

Then on the standard-vs-extend decision:

> "lets do it i like all. of that and like i said we all use obsedian so we can use that if no standard exists and extend"

> "and can do the same with foam and there are probaby others but obsedian is the one we all have experience"

Captures the layered adoption path:

- L1: wikilink conversion (mechanical TS script; preserve GitHub compat via frontmatter aliases)
- L2: frontmatter tags convention across rules + personas + docs
- L3: Obsidian callouts for evolving documentation annotations (with GFM-compat subset for GitHub rendering)
- L4: Obsidian Tasks-plugin format for enriched TODOs (due dates + priority + recurring); composes with the existing backlog rows which are project-scope structured TODOs at the same shape
- L5: TS extractor emitting JSON-LD + property-graph JSON; agents can programmatically query the knowledge substrate; humans browse via Obsidian graph view; both compose

Plus standards survey (RDF/OWL/SPARQL too heavy for git-native; Obsidian/Foam/Logseq/Dendron vault format as the right floor; team picks tool per individual since vault format is portable).

Composes with PR #4976 (personas + onboarding + manifesto recast: the substrate this knowledge-graph would extract), Max's full-ai-cluster/k8s/applications/hat-system/graph/render.go (L5 extractor uses the same shape but for knowledge substrate vs cluster state), PR #4958 (agentic-organization design that benefits most from programmatic graph query).

P2: each layer ships standalone; team picks pace; becomes P1 when graph extraction becomes a load-bearing query surface for an agent workflow (likely after Max's TS hat-operator from B-0724 lands).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(B-0729): markdownlint + meta-irony (missing tags) + L1 design flaw + callout case

Codex + Copilot + markdownlint flagged 5 substantive issues:

- Markdownlint MD022 + MD032 on ### L1-L5 acceptance headings (no blank lines around headings + their following lists); fixed with regex-driven blank-line insertion
- META-IRONY: a row about adopting `tags: [...]` frontmatter didn't have `tags:` in its own frontmatter. Added.
- composes_with references docs/AGENT-AUTHORING-AND-PR-REVIEW.md which exists on PR #4976's branch (not yet on main). Qualified with 'pending PR #4976 merge' per the substrate-honest framing used in similar rows earlier in the session.
- **L1 DESIGN FLAW caught by Codex**: original L1 proposed converting markdown `[text](path)` links to wikilinks `[[shortname|text]]`. Wikilinks DON'T render as clickable links on GitHub, which would break repo navigability for non-Obsidian readers. REFRAMED L1: don't convert links; instead add frontmatter `aliases: [...]` (Obsidian resolves both markdown links AND aliases into the graph view). GitHub rendering preserved; Obsidian graph still works. This is genuinely better-of-both-worlds + the script becomes simpler (no link rewriting, just frontmatter additions).
- Callout case: Obsidian accepts lowercase + uppercase but GitHub's alert syntax REQUIRES uppercase. Updated examples to use uppercase for the GFM subset (NOTE/TIP/IMPORTANT/WARNING/CAUTION) + clarified the casing requirement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(B-0729): align L1 acceptance with reframed alias-only strategy + MD032 on standards-survey lists

Codex caught real internal contradiction: L1 was reframed in the body to use frontmatter aliases (don't convert markdown links; GitHub navigability preserved), but the L1 acceptance checklist still required the rejected wikilink conversion. Updated acceptance to match the reframed strategy.

Plus MD032 on two more lists (standards survey block) that didn't have blank line before; added blank lines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
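
The callout-case fix in the commit message above (uppercase for the GFM subset NOTE/TIP/IMPORTANT/WARNING/CAUTION) could be automated along these lines. This is a sketch, not the script the commits refer to: `normalizeCallouts` is a hypothetical helper, and it assumes callouts are written as blockquote lines starting with `> [!type]`.

```typescript
// The five alert types in the GFM subset named in the commit message.
const GFM_ALERTS = new Set(["NOTE", "TIP", "IMPORTANT", "WARNING", "CAUTION"]);

// Uppercase the type of callout lines such as "> [!note] Title" when the type
// is in the GFM subset; Obsidian-only callout types are left untouched.
function normalizeCallouts(markdown: string): string {
  return markdown
    .split("\n")
    .map((line) =>
      line.replace(
        /^(\s*>\s*\[!)([A-Za-z]+)(\])/,
        (match: string, open: string, kind: string, close: string) =>
          GFM_ALERTS.has(kind.toUpperCase())
            ? `${open}${kind.toUpperCase()}${close}`
            : match,
      ),
    )
    .join("\n");
}
```

Leaving non-GFM types as written keeps Obsidian-specific callouts intact, since those do not render as alerts on GitHub in either case.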
Summary
`docs/agentic-organization/docs/README.md`
Notes
`bun` is not installed and the required .NET SDK `10.0.203` from `global.json` is not installed in this workspace
Validation
`hermes-organization` links were removed from the new docs path