Expose observe hierarchy operating readouts #6200
…gration CI, gastown moat analysis

Squashes this work-stream's agentic-organization delta onto current main (the branch's prior slice landed via the squash-merged PR #6071; this carries everything since, scoped to agentic-organization/ so main's other progress is untouched).

Generic provider-agnostic work port (GEN1–GEN5):
- One surface (project/pull/advance) over a WorkProviderKind DU (github|gitlab|jira|linear) split into families (code_review PR/MR vs work_item card); actionsForFamily is the translation table, assertProviderSupports the structural guard. Adding a provider = a translation, not a call site.
- GitLab MR (REST-v4) + Linear (GraphQL) adapters built new; GitHub + Jira wrapped behind the same surface. resolveWorkProvider builds the live client; token only ever a header, never logged. asChangeControlPort adapts a code-review provider to the kernel's port unchanged (open/closed).
- Live flip: resolveWorkProviderFromEnv (null-default, throw-on-partial, legacy back-compat); worker mounts an OPTIONAL work-provider Secret (absent → internal-only); proven over the real native-fetch wire (loopback, token absent from every call) AND in-cluster (deployed worker flips external:gitlab from a Secret, token leaked 0×, then restores internal-only).
- Subagent-reviewed: GitLab partials tightened to throw (no silent empty MR), changes-requested axis documented fail-safe; regression tests added.

Integration CI (INT1): the 7 env-gated integration tests run green against real Cockroach+NATS (npm run test:integration + .github/workflows/integration.yml that fails if any test skips); ci.yml runs the fast hermetic typecheck+unit suite.

Plus the earlier C-track (C0–C7 adaptive platform: autonomy policy, hat guardrails, org-intelligence, onboarding/self-healing) carried in this delta where not already on main.

Strategy docs (for the next build phase):
- GASTOWN_FULL_IMPL_COMPARISON.md: code-level, maturity-honest scorecard vs gastownhall/gastown (~441K LOC Go, read across 6 subsystems). We out-architected them (enforced kernel, Cockroach+NATS, no-SPOF hats, native ports; their unbuilt Factory-Worker-API endgame is our start). They out-shipped us on specific build-on-top tooling (merge queue, model-eval, persistent pool, layered config, escalation ladder, ESTOP, durable/ephemeral comms split).
- ORCHESTRATION_MOAT_ROADMAP.md: close the gap + go miles ahead by exploiting the enforced+deterministic+replayable kernel (M1 conformance checker, M2 simulator/DST, M3 self-optimizing loop, M4 clamp verification) + enforce the pattern unbypassably.
- HANDOFF_GOAL_ORCHESTRATION_MOAT.md: a paste-able cold-start /goal prompt for the next agent.

tsc 0; 845 unit/contract tests, 0 fail; 7 integration tests green vs real infra; proven in kind.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
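The family split and structural guard described in the commit above could look roughly like the following. This is a minimal sketch, not the repository's code: only the names WorkProviderKind, actionsForFamily and assertProviderSupports come from the commit message, while the `familyOf` and `familyActions` tables and every action name are assumptions made for illustration.

```typescript
// Sketch only: the three names WorkProviderKind, actionsForFamily and
// assertProviderSupports come from the commit message; every shape, family
// assignment and action name below is an assumption made for illustration.
type WorkProviderKind = "github" | "gitlab" | "jira" | "linear";
type ProviderFamily = "code_review" | "work_item";

// Which family each provider belongs to (PR/MR vs card).
const familyOf: Record<WorkProviderKind, ProviderFamily> = {
  github: "code_review",
  gitlab: "code_review",
  jira: "work_item",
  linear: "work_item",
};

// Hypothetical translation table: the provider-side actions a family exposes.
const familyActions: Record<ProviderFamily, readonly string[]> = {
  code_review: ["open_review", "request_changes", "approve", "merge"],
  work_item: ["create_card", "move_card", "close_card"],
};

function actionsForFamily(family: ProviderFamily): readonly string[] {
  return familyActions[family];
}

// Structural guard: throw instead of silently ignoring an unsupported action.
function assertProviderSupports(kind: WorkProviderKind, action: string): void {
  const family = familyOf[kind];
  if (!actionsForFamily(family).includes(action)) {
    throw new Error(`${kind} (${family}) does not support "${action}"`);
  }
}
```

Under these assumptions, adding a provider means adding one entry to `familyOf` (and, for a new family, one row to the table) rather than touching call sites.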
…+ document the exact method Adds to the cold-start /goal prompt: - The /goal line + section 6 now make a green in-cluster KIND proof a non-negotiable phase gate (unit tests green but no KIND proof = NOT done). - New Section 7 "How to fully end-to-end test in KIND" documents exactly how every track in this repo was validated: the three-tier pyramid (845 hermetic unit + 7 env-gated integration vs real Cockroach/NATS + the deploy/run-*.ts KIND proofs), the cluster topology, the deploy/run-*.ts proof anatomy (pg Pool → executor → apply migration → run real logic → JSON PROOF report), the port-forward-in-one-Bash-call pattern + loopback-mock for outward wire, and the full checkpoint ritual (rebuild→redeploy→clean-boot→run proof→verify org_event ledger), plus the KIND-specific gotchas (26259 port-forward, fresh DB for integration tests, image-must-match-HEAD). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Build the M1/M4 orchestration moat foundation: replay org_events through the legal-transition clamps, wire a live conformance lane, add clamp property tests, and add the KIND conformance proof. Also fixes memory archive-at-floor drift by making archive legal from every non-terminal memory phase, and records the phase proof in NORTH_STAR. Co-Authored-By: Codex <noreply@openai.com>
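As a rough illustration of replaying events through legal-transition clamps, the sketch below walks an event list and reports every transition that is either not in the table or does not start from the entity's last known phase. The phase names, the `OrgEvent` shape and the `legal` table are assumptions; only the approved-to-changes_requested edge is mentioned elsewhere in this PR.

```typescript
// Sketch only: Phase values, OrgEvent fields and the table contents are
// illustrative assumptions, not the kernel's real clamp definitions.
type Phase = "draft" | "approved" | "changes_requested" | "applied";

interface OrgEvent {
  entityId: string;
  from: Phase;
  to: Phase;
}

// Legal-transition table: for each phase, the phases it may move to.
const legal: Record<Phase, readonly Phase[]> = {
  draft: ["approved", "changes_requested"],
  approved: ["applied", "changes_requested"],
  changes_requested: ["draft"],
  applied: [], // terminal
};

// Replay events in order and collect the ones that break conformance.
function findViolations(events: readonly OrgEvent[]): OrgEvent[] {
  const current = new Map<string, Phase>();
  const violations: OrgEvent[] = [];
  for (const event of events) {
    const known = current.get(event.entityId);
    const continuous = known === undefined || known === event.from;
    if (!continuous || !legal[event.from].includes(event.to)) {
      violations.push(event);
    } else {
      current.set(event.entityId, event.to);
    }
  }
  return violations;
}
```

A pass rate for a conformance lane could then be derived as the share of replayed events that are not in the returned list.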
Build the G3 orchestration moat recovery scanners: pure classifiers, bounded Cockroach lifecycle readers, four fail-open worker cadence lanes, and a KIND recovery proof. Dead-letter evidence stores failure-message hashes rather than raw failure text, preserving forensic linkage without leaking durable payloads. Verification: npm run typecheck; npm test; docker build agentic-org-worker:g3-recovery-final; kind load; worker pod worker-7489448c66-bxmnq; deploy/run-recovery-scanners.ts PROOF: PASS for org-recovery-02a002d1. Co-Authored-By: Codex <noreply@openai.com>
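The idea of keeping forensic linkage without persisting raw failure text can be sketched as below. The commit does not name a hash function or a record shape, so SHA-256 and the `DeadLetterEvidence` interface are assumptions.

```typescript
import { createHash } from "node:crypto";

// Sketch only: the record shape and the choice of SHA-256 are assumptions.
interface DeadLetterEvidence {
  subject: string;
  failureHash: string; // hex digest of the failure message, never the message itself
}

function toDeadLetterEvidence(subject: string, failureMessage: string): DeadLetterEvidence {
  const failureHash = createHash("sha256").update(failureMessage, "utf8").digest("hex");
  return { subject, failureHash };
}
```

Two dead letters with the same failure text produce the same digest, so they can still be grouped during an investigation even though the text is not stored.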
Build the G1 release queue: pure batch/bisect planner, approved ChangeSet cadence lane, explicit release-batch evaluator port, Cockroach transaction-bound persistence, and KIND proof. The change-control lane now leaves approved ChangeSets for release; the release queue applies green batches and bounces isolated red culprits through the conformant approved-to-changes_requested transition. Post-review fixes make bisection evaluate against the accumulating accepted stack and prevent metadata-only production applies when no evaluator is wired. Verification: npm run typecheck; npm test (882 tests, 875 pass, 7 skipped, 0 fail); docker build agentic-org-worker:g1-release-queue-atomic sha256:da47e79507bfc3690eb449c60a9a616916ad060d09a908d9d0a11b289749dc9f; kind load; worker pod worker-695b8dc895-lc8dv zero restarts; deploy/run-release-queue.ts PROOF: PASS for org-release-a8e06b67. Co-Authored-By: Codex <noreply@openai.com>
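A minimal sketch of the batch/bisect idea follows, assuming an evaluator that returns true for a green stack. The function name `planRelease` and the `Evaluate` type are hypothetical. The point it illustrates is the post-review fix: each candidate slice is evaluated on top of everything already accepted, not in isolation.

```typescript
// Sketch only: planRelease and Evaluate are hypothetical names.
// An evaluator returns true when the given stack of ChangeSet ids is green.
type Evaluate = (stack: readonly string[]) => boolean;

interface ReleasePlan {
  accepted: string[]; // applied, in order
  bounced: string[];  // isolated red culprits
}

function planRelease(
  batch: readonly string[],
  evaluate: Evaluate,
  plan: ReleasePlan = { accepted: [], bounced: [] },
): ReleasePlan {
  if (batch.length === 0) return plan;
  // Evaluate the slice on top of the accumulating accepted stack.
  if (evaluate([...plan.accepted, ...batch])) {
    plan.accepted.push(...batch);
    return plan;
  }
  if (batch.length === 1) {
    plan.bounced.push(...batch); // a single red ChangeSet is the culprit
    return plan;
  }
  const mid = Math.ceil(batch.length / 2);
  planRelease(batch.slice(0, mid), evaluate, plan);
  planRelease(batch.slice(mid), evaluate, plan);
  return plan;
}
```

For example, with a batch of four ChangeSets where only the third one breaks the build, the planner accepts the first two together, bounces the third, and then accepts the fourth on top of the first two.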
Build E2 real authority and non-forgeable evidence: durable hat assignment authority now drives command authorization, worker composition no longer uses the permissive stub, approved/waived quality gates require recomputable content-addressed evidence artifacts, review-stage gates carry content-addressed evidence into org_events, and reaction-plan commands include policy tool types. The Cockroach hat-assignment authority projection now carries hat_id with an additive fail-closed upgrade for existing databases. Team-scoped assignments no longer widen to project-wide commands, and human-stage resume cannot approve without content-addressed evidence. Verification: npm run typecheck; npm test (897 tests, 890 pass, 7 skipped, 0 fail); docker build agentic-org-worker:e2-real-authority-evidence sha256:33c9b51fca3fcc7538dfa803f26a4026aab7bdcb23929153e27a191b42bf2610; kind load; worker pod worker-7759886cf9-lmtvm zero restarts; deploy/run-real-authority-evidence.ts PROOF: PASS for org-authority-evidence-a4f378b2 with workerCompositionProof succeeded; Faraday subagent review no remaining blockers. Co-Authored-By: Codex <noreply@openai.com>
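"Recomputable content-addressed evidence" can be illustrated with a small sketch in which an artifact's address is the digest of its content, so any verifier can recompute it. The `EvidenceArtifact` shape, the `sha256:` prefix and the function names are assumptions, not the repository's types.

```typescript
import { createHash } from "node:crypto";

// Sketch only: shapes, the "sha256:" prefix and function names are assumed.
interface EvidenceArtifact {
  digest: string;  // content address, e.g. "sha256:<hex>"
  content: string; // the evidence payload the digest was computed from
}

function contentAddress(content: string): string {
  return "sha256:" + createHash("sha256").update(content, "utf8").digest("hex");
}

function makeEvidence(content: string): EvidenceArtifact {
  return { digest: contentAddress(content), content };
}

// Evidence is only trusted when the digest can be recomputed from the content.
function verifyEvidence(artifact: EvidenceArtifact): boolean {
  return contentAddress(artifact.content) === artifact.digest;
}

// Fail closed: no artifact, or a forged one, means the gate cannot be approved.
function canApproveGate(artifact: EvidenceArtifact | undefined): boolean {
  return artifact !== undefined && verifyEvidence(artifact);
}
```

Because the address depends only on the content, editing the evidence after the fact changes the recomputed digest and the gate rejects it.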
Close G2/M3/M5 with a storage-neutral optimizer loop: model eval produces scored evidence, the optimizer proposes reviewed tenant-config changes, and layered config resolves model/policy overlays as data.
- Add model-eval scoring and model-eval org-event projection.
- Add layered tenant config resolution with deterministic overlay order.
- Add decision optimizer over a generic JSON document/log store.
- Add KIND Cockroach-adapter proof and update moat docs.

Co-Authored-By: Codex <noreply@openai.com>
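Layered config resolution with a deterministic overlay order can be sketched as a fold over a fixed list of layer names. The layer names and the flat key/value shape are assumptions; the commit only states that overlays resolve as data in a deterministic order.

```typescript
// Sketch only: layer names and the flat value shape are assumptions.
type ConfigLayer = Readonly<Record<string, string | number | boolean>>;

// Fixed order, lowest precedence first. The order lives in data, not in call sites.
const overlayOrder = ["default", "org", "tenant", "hat"] as const;
type LayerName = (typeof overlayOrder)[number];

function resolveConfig(layers: Partial<Record<LayerName, ConfigLayer>>): ConfigLayer {
  let resolved: ConfigLayer = {};
  for (const name of overlayOrder) {
    // Later layers override earlier ones key by key; missing layers are skipped.
    resolved = { ...resolved, ...(layers[name] ?? {}) };
  }
  return resolved;
}
```

Because the loop iterates `overlayOrder` and not the keys of the input object, the result does not depend on the order in which layers were supplied.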
…cing for the self-improving org A full implementation-design for end-to-end observability where every command, cadence-lane tick, reaction plan, agent run, NATS pub/consume, Cockroach query, change-control stage, memory/graph op, conformance replay, and model-eval emits a correlated span + metric + log — and the AI organization reads its own telemetry to self-enhance. Covers: the LGTM stack on our substrate (Loki/Grafana/Tempo/Mimir + OTel Collector, with the org_event ledger as the domain pillar); the correlation model + W3C trace-context propagation through NATS envelopes and reaction-plan rows; the span/metric/log taxonomies (no silent gaps); the TelemetryPort + Noop/OTLP adapters wiring a real OTel SDK behind the existing packages/observability attribute schemas; instrumentation at the pipeline/lane/executor seams (open/closed, structural 100% coverage); the self-enhancement read-path (TelemetryQueryPort feeding the moat's decision-optimizer + org-intelligence, dashboards/alerts as config-as-data through change-control); a 7-phase implementation plan (OBS0..OBS6) each proven in KIND per the handoff discipline; kind deploy topology; and the conformance pass-rate as a first-class org SLI. Composes with ORCHESTRATION_MOAT_ROADMAP (M1 conformance SLI, M3 optimizer consumer, M2 simulator). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
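For the W3C trace-context propagation mentioned above, a message envelope can carry a `traceparent` header of the form `00-<32 hex trace-id>-<16 hex parent-id>-<2 hex flags>`. The sketch below parses and formats that header and attaches it to a hypothetical `Envelope` type. The envelope shape is an assumption, only version `00` is handled, and a real deployment would normally rely on an OpenTelemetry propagator instead of hand-rolled parsing.

```typescript
// Sketch only: Envelope is a hypothetical shape, not the repo's NATS envelope.
interface TraceContext {
  traceId: string;  // 32 lowercase hex chars
  parentId: string; // 16 lowercase hex chars
  sampled: boolean; // lowest bit of the trace-flags byte
}

interface Envelope<T> {
  subject: string;
  headers: Record<string, string>;
  payload: T;
}

const TRACEPARENT = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header: string): TraceContext | null {
  const match = TRACEPARENT.exec(header);
  if (match === null) return null;
  const traceId = match[1] ?? "";
  const parentId = match[2] ?? "";
  const flags = match[3] ?? "00";
  // All-zero ids are invalid per the Trace Context specification.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.parentId}-${ctx.sampled ? "01" : "00"}`;
}

// Return a copy of the envelope carrying the trace context in its headers.
function withTrace<T>(envelope: Envelope<T>, ctx: TraceContext): Envelope<T> {
  return { ...envelope, headers: { ...envelope.headers, traceparent: formatTraceparent(ctx) } };
}
```

A consumer would call `parseTraceparent` on the incoming header and start its span as a child of the parsed context, which is what keeps a publish and its downstream work in one trace.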
Co-Authored-By: Codex <noreply@openai.com>
…ped dashboard (implements the 2026-05-31 observe-act ADR)

The how-to-refactor companion to docs/DECISIONS/2026-05-31-observe-act-16-direction-universal-action-grammar-local-no-cloud-llm.md. Specifies bending the existing systems into the ADR's shape, file by file:
- Shift A: guardrails move from act-time to render-time: wire C4 preflightHatAction (hat-guardrails.ts) into the readout as a DeterministicRule so a forbidden action is never rendered as a T slot (capability == what's rendered); keep the command-pipeline preflight as defense-in-depth.
- Shift B: observe() becomes hat-aware and gains a dashboard half: deterministic query sub-agents join the Cockroach index + TelemetryQueryPort into a scoped ScopedReadout (C-suite sees org rollups; an engineer sees work-item numbers), which also feeds slot labels/availability.
- MCP-behind-the-slot: the agent's only tool is observe; a chosen slot routes via act() to a command / MCP dispatch (generalizing dispatchMetricsTool) / re-observe. MCP demoted from the agent surface to a slot implementation.
- Required keystone enhancement: observe() must collect vetoed options WITH reasons (closes the ADR's Tri-reason [OPEN]; a dark slot needs a why, for the renderer and the span).
- renderMenu16 projection (Commit-A binds to the hat's primary ActionClass) + apps/agent-cli/ binary.
- Honest current→target gap table grounded in real symbols (observe.ts, decide, hat-guardrails C4, command-pipeline, frontmatter-db, metrics/mcp-tools).
- R0..R8 refactor sequence, each KIND-proven per HANDOFF §7; kernel contracts unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
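Shift A and the keystone enhancement can be pictured together: rules run while the readout is built, allowed actions become slots, and forbidden ones are kept aside with the reason they were vetoed. The sketch below is illustrative only. `DeterministicRule` is a name from the commit, but its signature here, along with `renderReadout`, `Readout` and the sample rule, is an assumption.

```typescript
// Sketch only: the DeterministicRule signature and every other name here
// are assumptions; they are not the symbols in observe.ts or hat-guardrails.ts.
interface Slot {
  action: string;
}

interface VetoedOption {
  action: string;
  reason: string; // why the slot is dark, for the renderer and the span
}

interface Readout {
  slots: Slot[];
  vetoed: VetoedOption[];
}

// A rule returns null when the action is allowed, or a reason when it is not.
type DeterministicRule = (hat: string, action: string) => string | null;

function renderReadout(
  hat: string,
  candidates: readonly string[],
  rules: readonly DeterministicRule[],
): Readout {
  const readout: Readout = { slots: [], vetoed: [] };
  for (const action of candidates) {
    let reason: string | null = null;
    for (const rule of rules) {
      reason = rule(hat, action);
      if (reason !== null) break; // first veto wins
    }
    if (reason === null) readout.slots.push({ action });
    else readout.vetoed.push({ action, reason });
  }
  return readout;
}

// Hypothetical guardrail used for illustration.
const noMergeForEngineers: DeterministicRule = (hat, action) =>
  hat === "engineer" && action === "merge" ? "engineer hat may not merge" : null;
```

With this shape a forbidden action never appears among the selectable slots, and the act-time preflight stays in place as a second check.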
Add observe.ts hierarchy operating readouts for directors, TPMs, and other management hats so each level sees scoped priority items, metrics, and legal coordination actions. Wire the readout through the agent CLI and observe-act worker lane, including JSON ingestion for hierarchy work batches and work items. Co-Authored-By: Codex <noreply@openai.com>
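One way to picture a scoped readout is a filter over work items by the hat's position in the hierarchy, returning the top open items plus roll-up metrics. Everything in this sketch is assumed for illustration, including the path-style scope, the field names and `scopedReadout` itself; the actual readout shape in observe.ts is not shown in this PR text.

```typescript
// Sketch only: the path-based scope and all field names are assumptions.
interface WorkItem {
  id: string;
  path: string;     // position in the hierarchy, e.g. "eng/platform/db"
  priority: number; // lower number = more urgent
  done: boolean;
}

interface Hat {
  level: "director" | "tpm" | "ic";
  scope: string; // the subtree this hat is responsible for
}

interface ScopedView {
  priorityItems: string[];
  metrics: { total: number; done: number };
}

function scopedReadout(hat: Hat, items: readonly WorkItem[], limit = 3): ScopedView {
  // Keep only items at or below the hat's scope in the hierarchy.
  const inScope = items.filter(
    (item) => item.path === hat.scope || item.path.startsWith(hat.scope + "/"),
  );
  const priorityItems = inScope
    .filter((item) => !item.done)
    .sort((a, b) => a.priority - b.priority)
    .slice(0, limit)
    .map((item) => item.id);
  const done = inScope.filter((item) => item.done).length;
  return { priorityItems, metrics: { total: inScope.length, done } };
}
```

A director scoped to `eng` would see a roll-up across every team under it, while a TPM scoped to `eng/platform` sees only that subtree.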
…rvability-lgtm-stack-2026-05-31
Gate diagnosis (Otto background worker): blocker is a missing module, not just lint

Drove this PR through the gate.

🔴 Primary blocker: the LGTM telemetry module was never committed

The 4 CodeQL threads are real, not false positives. Three symbols are imported from

Evidence:

Likely cause: the telemetry-port source file (defining the LGTM query port, an enum-like

Why CI was green anyway (false-green worth knowing): the root

I did not resolve the 4 threads; resolving real findings would hide the bug. I did not fabricate the module; its contract is yours to define.

Recommended fix: commit the telemetry source + the

🟡 Secondary (mechanical): the two failed required lint checks

These are independent of the module and easy to batch with the fix above:

Disposition

Left auto-merge unarmed and threads unresolved; the missing module is a genuine correctness defect only you can author. Once it's committed I'm happy to land the markdownlint

Otto (background worker)
Gate triage: 2 failed required checks + 4 unresolved threads (forward signal)

Otto background-worker swept this PR (it was the one open PR). Full

Blocker 1:
Gate triage (Otto-CLI background worker): recording the BLOCKED-gate state and a precise fix-list. I'm not touching this branch or the review threads: it's your active PR and @AceHack is mid-diagnosis on the missing-module threads. This is forward-signal only, foldable into the module-landing commit.

Gate snapshot
Fix 1 —
Why: - B-0171's current inventory checkpoint still had one mapped-spec gap: agentic-organization had an OpenSpec capability but no concrete module or artifact mapping. - The strict unmapped-spec gate should be able to validate that existing spec against repo substrate. What: - Map agentic-organization to representative source, test, package, and documentation artifacts in the OpenSpec inventory. - Add mapping-table and real-repo regression coverage. - Update the B-0171 checkpoint with the new measured inventory counts and release the claim file. Proof: - bun test tools/openspec/inventory.test.ts - bun tools/openspec/inventory.ts --enforce --fail-on-unmapped-specs - bun run typecheck - git diff --check - bunx prettier --check tools/openspec/inventory.ts tools/openspec/inventory.test.ts docs/backlog/P1/B-0171-openspec-catch-up-canonical-source-of-truth-aaron-2026-05-03.md Limits: - This does not close B-0171; 64 Core modules remain uncovered and the next slices should continue artifact/capability mapping or child-row reconciliation. - PR #6200 remains separate and blocked on another dirty worktree. Agency-Signature-Version: 1 Agent: Vera Agent-Runtime: OpenAI Codex desktop heartbeat loop Agent-Model: GPT-5 Credential-Identity: aaron-codex-desktop Credential-Mode: shared Human-Review: none Human-Review-Evidence: none Action-Mode: autonomous-fail-open Task: B-0171 Co-Authored-By: Codex <noreply@openai.com>
* claim: codex-loop-b0171-agentic-org-artifact-map-20260531

  Scope: map the existing agentic-organization OpenSpec capability in the inventory for B-0171.

  Agency-Signature-Version: 1
  Agent: Vera
  Agent-Runtime: OpenAI Codex desktop heartbeat loop
  Agent-Model: GPT-5
  Credential-Identity: aaron-codex-desktop
  Credential-Mode: shared
  Human-Review: none
  Human-Review-Evidence: none
  Action-Mode: autonomous-fail-open
  Task: B-0171
  Co-Authored-By: Codex <noreply@openai.com>

* tool(B-0171): map agentic-organization OpenSpec artifacts

  Why:
  - B-0171's current inventory checkpoint still had one mapped-spec gap: agentic-organization had an OpenSpec capability but no concrete module or artifact mapping.
  - The strict unmapped-spec gate should be able to validate that existing spec against repo substrate.

  What:
  - Map agentic-organization to representative source, test, package, and documentation artifacts in the OpenSpec inventory.
  - Add mapping-table and real-repo regression coverage.
  - Update the B-0171 checkpoint with the new measured inventory counts and release the claim file.

  Proof:
  - bun test tools/openspec/inventory.test.ts
  - bun tools/openspec/inventory.ts --enforce --fail-on-unmapped-specs
  - bun run typecheck
  - git diff --check
  - bunx prettier --check tools/openspec/inventory.ts tools/openspec/inventory.test.ts docs/backlog/P1/B-0171-openspec-catch-up-canonical-source-of-truth-aaron-2026-05-03.md

  Limits:
  - This does not close B-0171; 64 Core modules remain uncovered and the next slices should continue artifact/capability mapping or child-row reconciliation.
  - PR #6200 remains separate and blocked on another dirty worktree.

  Agency-Signature-Version: 1
  Agent: Vera
  Agent-Runtime: OpenAI Codex desktop heartbeat loop
  Agent-Model: GPT-5
  Credential-Identity: aaron-codex-desktop
  Credential-Mode: shared
  Human-Review: none
  Human-Review-Evidence: none
  Action-Mode: autonomous-fail-open
  Task: B-0171
  Co-Authored-By: Codex <noreply@openai.com>

* test(B-0171): clarify README-only OpenSpec directories

  Why:
  - Copilot review flagged `openspec/specs/retraction-native/` as a possible unmapped capability.
  - The inventory scanner only treats directories with `spec.md` as strict spec inputs; `retraction-native` currently has only `README.md`.

  What:
  - Add a real-repo regression that README-only capability directories are not strict unmapped-spec inputs.
  - Clarify the B-0171 checkpoint wording so the 9-spec count means directories with `spec.md` files.

  Proof:
  - bun test tools/openspec/inventory.test.ts
  - bun tools/openspec/inventory.ts --enforce --fail-on-unmapped-specs
  - bun run typecheck
  - git diff --check
  - bunx prettier --check tools/openspec/inventory.test.ts docs/backlog/P1/B-0171-openspec-catch-up-canonical-source-of-truth-aaron-2026-05-03.md

  Agency-Signature-Version: 1
  Agent: Vera
  Agent-Runtime: OpenAI Codex desktop heartbeat loop
  Agent-Model: GPT-5
  Credential-Identity: aaron-codex-desktop
  Credential-Mode: shared
  Human-Review: copilot-pull-request-reviewer comment
  Human-Review-Evidence: #6207 (comment)
  Action-Mode: autonomous-fail-open
  Task: B-0171
  Co-Authored-By: Codex <noreply@openai.com>

---------

Co-authored-by: Lior <lior@zeta.dev>
Co-authored-by: Codex <noreply@openai.com>
Sweep 4: re-verified on current head; closing the "merge main" path + de-dup signal

Background-worker tick re-checked this PR. No new commits since

Two things this sweep adds (both new, both read-only; branch and threads untouched):
1. Independently re-confirmed the missing-module blocker against the current tree.
2. "Merge

Path to green is unchanged: land the telemetry module (defines

De-dup note for future background-worker sweeps: this PR is a clean human-blocked wait on the module-landing commit; no further re-diagnosis needed until @maximdolphin pushes or replies
Tier-5 routing — applying
Expose top-down hierarchy missions in observe.ts so management hats see the mission goal, timeframe, expected progress, lag signals, and tool-gated corrective actions inside the existing observe readout. Co-Authored-By: Codex <noreply@openai.com>
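A lag signal of the kind described above can be derived by comparing reported progress with the progress expected at this point in the mission's timeframe. The sketch below assumes a linear expectation and a fixed tolerance, neither of which is stated in the commit; the `Mission` fields and `lagSignal` are likewise hypothetical.

```typescript
// Sketch only: the linear expectation, the tolerance and all names are assumed.
interface Mission {
  goal: string;
  startMs: number;        // timeframe start, epoch milliseconds
  endMs: number;          // timeframe end, epoch milliseconds
  actualProgress: number; // reported progress in [0, 1]
}

interface LagSignal {
  expected: number; // progress expected by now, in [0, 1]
  lagging: boolean;
}

function lagSignal(mission: Mission, nowMs: number, tolerance = 0.1): LagSignal {
  const span = mission.endMs - mission.startMs;
  // Linear expectation, clamped to [0, 1]; a zero-length timeframe expects completion.
  const expected =
    span <= 0 ? 1 : Math.min(1, Math.max(0, (nowMs - mission.startMs) / span));
  return { expected, lagging: mission.actualProgress + tolerance < expected };
}
```

A readout could surface corrective actions only when `lagging` is true, which keeps the management view quiet for missions that are on track.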
…rvability-lgtm-stack-2026-05-31 Co-Authored-By: Codex <noreply@openai.com>
Commit the LGTM telemetry ports, observability deployment proof, DORA metrics, trace propagation, observe lifecycle flow, and review-thread lint fixes for PR 6200. Co-Authored-By: Codex <noreply@openai.com>
Summary
- observe.ts so executive, C-suite, director, TPM/manager, lead, and IC hats get scoped priority items, scoped metrics, and legal management actions.
- AGENTIC_ORG_HIERARCHY_JSON.

Workflow gaps covered

Test plan
- npm run typecheck
- node --experimental-strip-types --test packages/application/test/observe.test.ts apps/agent-cli/test/agent-cli.test.ts apps/workers/test/org-cadence-lanes.test.ts
- npm test

Merge status
- origin/main into this branch before opening the PR.
- mergedAt: null.