124 changes: 124 additions & 0 deletions agentic-organization/docs/FIRST_IMPLEMENTATION_SLICE.md
@@ -0,0 +1,124 @@
# First Implementation Slice

## Status

Implemented as a small NodeNext TypeScript package slice.

## Purpose

This slice turns the first Agentic Organization runtime contract from
architecture prose into executable TypeScript.

It does not introduce NestJS, CockroachDB, NATS clients, Temporal,
Dapr, Hermes, Hindsight, or Kubernetes deployment manifests yet. Those
remain adapter layers. The goal is to prove the Organization command
shape before adding distributed infrastructure.

The slice is intentionally generic. `send_supervisor_signal` is the
coordination primitive; specific downstream outcomes are lifecycle
decisions made by the target supervisor chain. The goal is not to
hardcode every future request tool. The goal is to make agent
coordination traceable and expandable so agents can propose new tools,
flows, and routing patterns as the Organization learns.

## Implemented Flow

```text
send_supervisor_signal
-> idempotency record check
-> chain-of-command signal
-> audit event
-> outbox event with canonical event envelope
-> NATS subject contract
-> LGTM span attributes
-> supervisor triage reaction plan
```
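
A minimal sketch of how the entry point of this flow could be wired, assuming a handler registry keyed by command name. Only `send_supervisor_signal` comes from this document; `CommandHandlerRegistry`, `CommandResult`, and the `audit:`, `outbox:`, and `triage:` strings are hypothetical placeholders, not the package's actual API.

```typescript
// Hypothetical shapes for illustration; not the real @agentic-org/application API.
interface Command {
  name: string;
  commandId: string;
  targetSupervisorId: string;
}

interface CommandResult {
  auditEvents: string[];
  outboxEvents: string[];
  reactionPlans: string[];
}

type CommandHandler = (command: Command) => Promise<CommandResult>;

// Handlers are looked up by name so the pipeline never branches on command types.
class CommandHandlerRegistry {
  private readonly handlers = new Map<string, CommandHandler>();

  register(name: string, handler: CommandHandler): void {
    if (this.handlers.has(name)) {
      throw new Error(`Handler already registered for ${name}`);
    }
    this.handlers.set(name, handler);
  }

  async dispatch(command: Command): Promise<CommandResult> {
    const handler = this.handlers.get(command.name);
    if (handler === undefined) {
      throw new Error(`No handler registered for ${command.name}`);
    }
    return handler(command);
  }
}

const registry = new CommandHandlerRegistry();

// The handler records an audit event and an outbox event, then plans triage
// for the target supervisor instead of performing a side effect directly.
registry.register("send_supervisor_signal", async (command) => ({
  auditEvents: [`audit:${command.commandId}`],
  outboxEvents: [`outbox:${command.commandId}`],
  reactionPlans: [`triage:${command.targetSupervisorId}`],
}));
```

Keeping the registry separate from the handlers means a new command only needs a new `register` call, which matches the intent of not hardcoding every future request tool.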

## Packages

| Package | Implemented first |
| ------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `@agentic-org/domain` | event envelope, command/event constants, aggregate constants, supervisor-chain communication types, hat communication briefs, work item state machine, shared records |
| `@agentic-org/application` | command pipeline, command-handler registry, state-store ports, idempotency conflict handling, supervisor signal handler |
| `@agentic-org/state` | in-memory fake of the Organization state-store factory |
| `@agentic-org/state-cockroach` | CockroachDB state-store factory contract, SQL statement catalog, and first core-state migration skeleton |
| `@agentic-org/messaging` | stable `agentic-org.<env>.<org>.<domain>.<event>` subject builder |
| `@agentic-org/observability` | OpenTelemetry/LGTM span attribute projection |
| `@agentic-org/runtime` | first rule that plans triage for the target supervisor when a chain signal is sent |
| `@agentic-org/governance` | package dependency-boundary checks that prevent application code from importing concrete state/runtime adapters |
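
As an illustration of the `@agentic-org/messaging` row, the subject pattern `agentic-org.<env>.<org>.<domain>.<event>` could be built along these lines. This is a sketch under assumptions: `buildSubject`, `SubjectParts`, and the token character set are hypothetical, and the real builder may accept a different alphabet.

```typescript
// Hypothetical subject builder; names and validation rules are assumptions.
interface SubjectParts {
  env: string;
  org: string;
  domain: string;
  event: string;
}

const SUBJECT_PREFIX = "agentic-org";

// NATS uses "." as the token separator and "*" / ">" as wildcards, so a
// single token must not contain them. This sketch allows a conservative set.
const VALID_TOKEN = /^[A-Za-z0-9_-]+$/;

function buildSubject(parts: SubjectParts): string {
  const tokens = [parts.env, parts.org, parts.domain, parts.event];
  for (const token of tokens) {
    if (!VALID_TOKEN.test(token)) {
      throw new Error(`Invalid subject token: ${JSON.stringify(token)}`);
    }
  }
  return [SUBJECT_PREFIX, ...tokens].join(".");
}
```

Rejecting separators and wildcards inside a token keeps every published subject at exactly five tokens, so subscribers can rely on positional wildcards.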

## NodeNext Runtime Decision

Agentic Organization now has a local `package.json` and
`tsconfig.json` under `agentic-organization/`.

The first executable slice uses:

- Node 22 or newer;
- `type: module`;
- TypeScript `module: NodeNext`;
- explicit `.ts` imports;
- `node:test`;
- `node:assert/strict`;
- Node TypeScript stripping for test execution.

This keeps the first package contracts independent from the root repo's
Bun tooling while still letting the future NestJS hosts consume the same
package code.

## Telemetry Contract

Every event envelope carries:

- event ID and event type;
- command ID;
- correlation ID;
- causation ID;
- trace ID;
- idempotency key;
- agent ID;
- hat assignment ID;
- organization ID;
- project ID;
- work item ID;
- aggregate ID, type, and version.

`@agentic-org/observability` projects those fields into stable
`agentic.*` span attributes plus NATS messaging attributes. Later OTLP
instrumentation must use these keys so Alloy, Tempo, Loki, Mimir,
Prometheus, and Grafana can connect command execution, NATS fanout,
Hermes runs, MCP calls, and UI evidence.
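
A rough sketch of such a projection, covering only a subset of the envelope fields. The `agentic.*` key spellings, `EnvelopeTraceFields`, and `projectSpanAttributes` are assumptions for illustration; the authoritative names are whatever `@agentic-org/observability` defines. `messaging.system` and `messaging.destination.name` are OpenTelemetry messaging semantic-convention attribute names, and using them here for NATS is also an assumption.

```typescript
// Hypothetical subset of the event envelope; the real envelope carries more fields.
interface EnvelopeTraceFields {
  eventId: string;
  eventType: string;
  commandId: string;
  correlationId: string;
  causationId: string;
  traceId: string;
  workItemId: string;
  aggregateVersion: number;
}

// Centralized keys so instrumentation never hand-types attribute names.
const SpanAttributeKey = {
  EventId: "agentic.event.id",
  EventType: "agentic.event.type",
  CommandId: "agentic.command.id",
  CorrelationId: "agentic.correlation.id",
  CausationId: "agentic.causation.id",
  TraceId: "agentic.trace.id",
  WorkItemId: "agentic.work_item.id",
  AggregateVersion: "agentic.aggregate.version",
  MessagingSystem: "messaging.system",
  MessagingDestination: "messaging.destination.name",
} as const;

type SpanAttributes = Record<string, string | number>;

function projectSpanAttributes(
  envelope: EnvelopeTraceFields,
  natsSubject: string,
): SpanAttributes {
  return {
    [SpanAttributeKey.EventId]: envelope.eventId,
    [SpanAttributeKey.EventType]: envelope.eventType,
    [SpanAttributeKey.CommandId]: envelope.commandId,
    [SpanAttributeKey.CorrelationId]: envelope.correlationId,
    [SpanAttributeKey.CausationId]: envelope.causationId,
    [SpanAttributeKey.TraceId]: envelope.traceId,
    [SpanAttributeKey.WorkItemId]: envelope.workItemId,
    [SpanAttributeKey.AggregateVersion]: envelope.aggregateVersion,
    [SpanAttributeKey.MessagingSystem]: "nats",
    [SpanAttributeKey.MessagingDestination]: natsSubject,
  };
}
```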

## Guardrails Proven

- Hats can expose a communication brief that tells the wearer their duty,
supervisor line, and efficient upward tools.
- The command pipeline receives state-store factories and command
handlers through ports instead of constructing in-memory adapters or
branching on command types.
- State-store ports are async from the beginning so CockroachDB,
NATS-backed workers, and other real adapters do not inherit a fake
synchronous shape.
- A governance test enforces that application code does not import the
state adapter, Cockroach adapter, NestJS, NATS, Dapr, Temporal,
Drizzle, or Postgres clients.
- Duplicate commands with the same idempotency key and request hash
replay the stored result.
- Duplicate commands with the same idempotency key and a different
request hash are rejected with a typed error code.
- Work item transitions are typed and illegal direct transitions throw.
- Event envelopes reject missing command trace fields.
- The first automation rule produces a supervisor triage plan, not an
unreviewed side effect.
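
To illustrate the typed-transition guardrail, here is a minimal state-machine sketch. The state names and the allowed-transition table are hypothetical; the document does not list the real work item states.

```typescript
// Hypothetical states; the real set lives in @agentic-org/domain.
const WorkItemState = {
  Proposed: "proposed",
  Triaged: "triaged",
  InProgress: "in_progress",
  Done: "done",
} as const;

type WorkItemState = (typeof WorkItemState)[keyof typeof WorkItemState];

// Each state lists the only states it may move to directly.
const ALLOWED_TRANSITIONS: Record<WorkItemState, readonly WorkItemState[]> = {
  proposed: ["triaged"],
  triaged: ["in_progress"],
  in_progress: ["done"],
  done: [],
};

// Returns the new state, or throws when the direct transition is not allowed.
function transitionWorkItem(from: WorkItemState, to: WorkItemState): WorkItemState {
  if (!ALLOWED_TRANSITIONS[from].includes(to)) {
    throw new Error(`Illegal work item transition: ${from} -> ${to}`);
  }
  return to;
}
```

The sketch uses an `as const` object rather than a TypeScript `enum` because Node's type stripping only removes erasable syntax, and `enum` is not erasable.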

## Next Slice

The next slice should turn the CockroachDB adapter contract into a
transactional integration test once a local/dev Cockroach connection is
available, then add the NATS outbox publisher worker. The worker can
publish persisted outbox rows to JetStream and attach the same telemetry
attributes.

Do not make the next slice a pile of bespoke request commands. Build the
generic supervisor triage lifecycle first, then let specialized
lifecycles emerge behind triage.
145 changes: 145 additions & 0 deletions agentic-organization/docs/IMPLEMENTATION_GOVERNANCE.md
@@ -0,0 +1,145 @@
# Agentic Organization Implementation Governance

## Status

Standing guardrail for implementation work.

## Purpose

This document translates existing Zeta governance into Agentic
Organization implementation rules. It exists so package code, docs,
runtime hosts, and future cluster deployments move together.

## Current-State Rule

Agentic Organization docs are current-state design documents. Update
the relevant document when the implementation teaches us something.
Create an ADR only for durable architectural decisions that future
contributors must understand as a decision record.

## Behavioral Specs Lead

Runtime behavior that must survive a rebuild belongs in OpenSpec.
The first Agentic Organization behavior is captured in
`openspec/specs/agentic-organization/spec.md`.

When code adds a new command, lifecycle transition, event, review gate,
telemetry rule, or automation behavior, update the behavioral spec,
tests, and docs in the same change.

## Authority and Scope

Only Organization command services may change authoritative Organization
business state.

Adapters and runtime hosts may call commands. They must not bypass
commands to mutate work items, assignments, gates, hat decisions,
memory scope, audit rows, idempotency records, or outbox records.

Every privileged action must carry:

- actor agent ID;
- active hat assignment ID;
- organization ID;
- project ID;
- work item ID;
- command ID;
- correlation ID;
- causation ID;
- trace ID;
- idempotency key.
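
One way to enforce this list at a command boundary is a single validator that rejects any privileged action with missing fields. This is a sketch: the camelCase field names and `requireCommandContext` are hypothetical spellings of the list above, not the package's actual types.

```typescript
// Hypothetical camelCase spellings of the required fields listed above.
const REQUIRED_CONTEXT_FIELDS = [
  "actorAgentId",
  "hatAssignmentId",
  "organizationId",
  "projectId",
  "workItemId",
  "commandId",
  "correlationId",
  "causationId",
  "traceId",
  "idempotencyKey",
] as const;

type ContextField = (typeof REQUIRED_CONTEXT_FIELDS)[number];
type CommandContext = Record<ContextField, string>;

// Returns the validated context, or throws listing every missing field.
function requireCommandContext(
  input: Partial<Record<ContextField, unknown>>,
): CommandContext {
  const missing = REQUIRED_CONTEXT_FIELDS.filter((field) => {
    const value = input[field];
    return typeof value !== "string" || value.trim() === "";
  });
  if (missing.length > 0) {
    throw new Error(`Missing command context fields: ${missing.join(", ")}`);
  }
  return input as CommandContext;
}
```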

## Work Anchors

No meaningful discussion, memory write, tool call, gate review,
runtime run, or automation reaction should be anchorless. If an agent
needs to discuss or act on something ambiguous, create or link the
appropriate work item first.

## Review and Self-Approval

Agents may propose work and produce work. They may not approve their own
privileged work unless a future policy explicitly allows a narrow
low-risk exception.

Reviewer gates must be represented as explicit state, not as chat
agreement.
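
A small sketch of a reviewer gate held as explicit state, with the self-approval rule enforced in code. `ReviewGate`, `approveGate`, and the status values are hypothetical names chosen for illustration.

```typescript
// Hypothetical gate record; the real gate model may carry more fields.
interface ReviewGate {
  workItemId: string;
  authorAgentId: string;
  status: "pending" | "approved" | "rejected";
  reviewerAgentId?: string;
}

// Returns a new approved gate; the author of the work can never be the approver.
function approveGate(gate: ReviewGate, reviewerAgentId: string): ReviewGate {
  if (gate.status !== "pending") {
    throw new Error(`Gate for ${gate.workItemId} is already ${gate.status}`);
  }
  if (reviewerAgentId === gate.authorAgentId) {
    throw new Error("Agents may not approve their own privileged work");
  }
  return { ...gate, status: "approved", reviewerAgentId };
}
```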

## Idempotency and Replay

Duplicates are normal. Temporal retries, NATS redelivery, Dapr
reminders, Oz callbacks, and agent retries must call the same
Organization command with the same idempotency key.

Conflicting reuse of an idempotency key must produce a typed rejection.
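
The replay and conflict rules can be sketched as follows, assuming an async store port and a request hash computed by the caller. `IdempotencyStore`, `executeOnce`, `createInMemoryStore`, and the error code string are hypothetical names, not the package's actual API.

```typescript
// Hypothetical port and record shapes for illustration only.
interface IdempotencyRecord<R> {
  requestHash: string;
  result: R;
}

interface IdempotencyStore<R> {
  get(key: string): Promise<IdempotencyRecord<R> | undefined>;
  put(key: string, record: IdempotencyRecord<R>): Promise<void>;
}

class IdempotencyConflictError extends Error {
  readonly code = "IDEMPOTENCY_KEY_CONFLICT";

  constructor(key: string) {
    super(`Idempotency key ${key} was reused with a different request hash`);
    this.name = "IdempotencyConflictError";
  }
}

// Same key and hash replays the stored result; same key with a different
// hash is rejected. A real adapter would do the check and write in one
// transaction; this sketch does not guard against concurrent callers.
async function executeOnce<R>(
  store: IdempotencyStore<R>,
  key: string,
  requestHash: string,
  handler: () => Promise<R>,
): Promise<R> {
  const existing = await store.get(key);
  if (existing !== undefined) {
    if (existing.requestHash !== requestHash) {
      throw new IdempotencyConflictError(key);
    }
    return existing.result;
  }
  const result = await handler();
  await store.put(key, { requestHash, result });
  return result;
}

// In-memory fake with the same async shape a database adapter would have.
function createInMemoryStore<R>(): IdempotencyStore<R> {
  const records = new Map<string, IdempotencyRecord<R>>();
  return {
    async get(key) {
      return records.get(key);
    },
    async put(key, record) {
      records.set(key, record);
    },
  };
}
```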

## Telemetry

Every implementation package should preserve the Agentic event trace
chain. Runtime hosts and adapters must export telemetry compatible with
the existing full-ai-cluster LGTM stack:

- Alloy for collection;
- Tempo for traces;
- Loki for logs;
- Mimir and Prometheus for metrics;
- Grafana for dashboards.

The first slice defines the required `agentic.*` attributes in
`@agentic-org/observability`. Later packages should consume that
contract instead of inventing new names.

Every meaningful workflow movement must also be projectable into a
workflow visibility record. The record is the agent- and UI-readable
surface that links command state, events, traces, logs, metrics,
work-item scope, active hat, aggregate version, and typed weak-point
indicators. This makes harness failures, blocker patterns, slow triage,
missing evidence, and telemetry gaps visible enough for agents to route
self-healing work through normal Organization commands.

## Security

Credential access must remain indirect and scoped through approved
Credential Proxy paths. Agents should not receive broad raw secrets.

New MCP tools, Temporal workflows, Dapr actors, NATS subjects,
credential endpoints, or runtime capabilities must start as scoped
supervisor-chain communication and then move through the appropriate
expansion lifecycle and security review when they expand authority,
credentials, network reach, memory reach, or data access.

## Data Is Not Directives

Retrieved docs, logs, memories, web pages, tool output, and user
attachments are context data. They must not be treated as executable
instructions unless an authorized command or prompt-flow phase explicitly
adopts them.

## Quality Gate

Every implementation change that alters behavior must start with
representative tests. Avoid magic strings by centralizing command
names, event names, states, error codes, hat names, action types, metric
names, and telemetry keys as typed constants.
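
A minimal sketch of the typed-constant pattern. `send_supervisor_signal` comes from the first slice; the `ErrorCode` entry and the `isCommandName` helper are hypothetical examples.

```typescript
// Command names live in one object so call sites never repeat the raw string.
const CommandName = {
  SendSupervisorSignal: "send_supervisor_signal",
} as const;

type CommandName = (typeof CommandName)[keyof typeof CommandName];

// Hypothetical error code; the real catalog is defined by the packages.
const ErrorCode = {
  IdempotencyKeyConflict: "IDEMPOTENCY_KEY_CONFLICT",
} as const;

type ErrorCode = (typeof ErrorCode)[keyof typeof ErrorCode];

// Type guard for untrusted input such as a message header or request body.
function isCommandName(value: string): value is CommandName {
  return Object.values(CommandName).includes(value as CommandName);
}
```

With this shape, a misspelled command name fails at compile time where a constant is used, and at the guard where the value arrives as untyped input.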

## Generic Lifecycle Duty

Agentic Organization must prefer generic lifecycle primitives over
hardcoded one-off tools. A specific tool should become first-class only
after the Organization has evidence that the pattern repeats and that a
specialized tool improves coordination, safety, or observability.

The expected path is:

```text
agent discovers need
-> hat uses supervisor-chain communication
-> supervisor triages
-> route to specialized lifecycle if needed
-> agents may propose new tools or flows
-> review, security, implementation, activation, and outcome review
```

This is non-negotiable for the architecture. The platform exists to help
agents expand their own coordination substrate safely, not to freeze the
first vocabulary forever.
@@ -29,7 +29,7 @@ Define the first end-to-end workflow we will build.
Recommended first slice:

```text
-ambiguous internal capability request
+ambiguous internal supervisor signal
-> requirement maturity / discovery
-> BRD/product signoff
-> CA/design review
@@ -46,7 +46,7 @@ ambiguous internal capability request
For v0, reduce this to the smallest useful three-step vertical:

```text
-capability request
+supervisor-chain signal
-> one readiness/gate decision
-> one hat-assigned Hermes run with evidence
```