Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,104 @@
---
id: B-0036
priority: P3
status: open
title: "GOVERNANCE.md §33 archive-header backfill on 26 pre-existing courier-ferry research docs + wire `tools/hygiene/check-archive-header-section33.sh` to CI gate.yml lint job"
tier: hygiene-tooling-and-substrate-discipline
effort: M
ask: Otto observation 2026-04-26 — §33 archive header was the most-common review finding across the 11-Amara-refinement courier-ferry lineage this session (PRs #560 / #562 / #563 / #565 / #566 / #568 / #569 / #570 / #553 each retrofitted post-review). Per Otto-346 (recurring pattern → substrate primitive missing) + Otto-341 (mechanism over vigilance), the structural fix is a CI lint that catches the violation pre-merge.
created: 2026-04-26
last_updated: 2026-04-26
composes_with: [feedback_otto_346_dependency_symbiosis_is_human_anchoring_via_upstream_contribution_good_citizenship_dont_blaze_past_2026_04_26.md, feedback_otto_341_lint_suppression_is_self_deception_noise_signal_or_underlying_fix_greenfield_large_refactors_welcome_training_data_human_shortcut_bias_2026_04_26.md, GOVERNANCE.md-section-33-archive-header-discipline, tools/hygiene/check-tick-history-order.sh]
tags: [hygiene-tooling, lint-discipline, otto-346-recurring-pattern-to-substrate-primitive, governance-section33, courier-ferry-imports, archive-header-discipline, mechanism-over-vigilance]
---

# B-0036 — §33 Archive-Header Backfill + CI Wire

## Origin

Otto observation across this session's 11-Amara-refinement courier-ferry research-doc lineage: GOVERNANCE.md §33 archive-header missing was the **most-common review finding** across the 11 PRs. Each PR was retrofitted with the 4-field header AFTER the review caught it.

The fix-shape Otto already shipped in this session: a hygiene tool `tools/hygiene/check-archive-header-section33.sh` that catches the violation before merge.

## What this row addresses

Two sequential sub-tasks:

### Sub-task 1: Backfill 26 pre-existing courier-ferry research docs

The lint tool when run on `main` finds **26 violations** in pre-existing courier-ferry research docs. Each needs the 4-field §33 header (Scope / Attribution / Operational status / Non-fusion disclaimer) added to the first 20 lines.

Files affected (as of 2026-04-26 main):

- `docs/research/codex-cli-first-class-2026-04-23.md`
- `docs/research/drift-taxonomy-bootstrap-precursor-2026-04-22.md`
- `docs/research/dst-accepted-boundaries.md`
- `docs/research/dst-compliance-criteria.md`
- `docs/research/gemini-cli-capability-map.md`
- `docs/research/grok-cli-capability-map.md`
- `docs/research/maji-formal-operational-model-amara-courier-ferry-2026-04-26.md`
- `docs/research/maji-messiah-spectre-aperiodic-monotile-amara-third-courier-ferry-2026-04-26.md`
- `docs/research/memory-reconciliation-algorithm-design-2026-04-24.md`
- `docs/research/meta-pixel-perfect-text-to-image-youtube-wink-2026-04-22.md`
- `docs/research/muratori-zeta-pattern-mapping-2026-04-23.md` (only `Non-fusion disclaimer:` missing)
- `docs/research/openai-codex-cli-capability-map.md`
- `docs/research/openai-deep-ingest-cross-substrate-readability-2026-04-22.md`
- `docs/research/oracle-scoring-v0-design-addressing-aminata-critical-2026-04-23.md` (only `Non-fusion disclaimer:` missing)
- `docs/research/provenance-aware-claim-veracity-detector-2026-04-23.md` (only `Non-fusion disclaimer:` missing)
- `docs/research/quantum-sensing-low-snr-detection-and-analogy-boundaries-2026-04-23.md` (only `Non-fusion disclaimer:` missing)
- `docs/research/superfluid-ai-github-funding-survival-bayesian-belief-propagation-amara-seventh-courier-ferry-2026-04-26.md`
- `docs/research/superfluid-ai-language-gravity-austrian-economics-amara-eighth-courier-ferry-2026-04-26.md`
- `docs/research/superfluid-ai-rigorous-mathematical-formalization-amara-fifth-courier-ferry-2026-04-26.md`
- `docs/research/test-classification.md` (3 of 4 labels missing)
- (...and ~6 more — full list via running the lint tool)

Note: some docs (Maji formal model, Spectre-Messiah, Superfluid AI fifth/seventh/eighth) are listed because the lint runs against `main` which doesn't have the PR-side edits yet — those docs DO have §33 headers in the PRs that are landing right now. Re-run the lint after the in-flight PRs merge to get the accurate residual list.

Backfill PR shape: dedicated PR adding §33 headers to all residual docs. Effort: M (1-3 days; mechanical work + per-doc judgment on what each Scope/Attribution should say).

### Sub-task 2: Wire to CI as enforcing lint

After Sub-task 1 lands and the lint reports 0 violations on `main`, wire the lint into `.github/workflows/gate.yml` as a new lint job (alongside `lint (markdownlint)`, `lint (shellcheck)`, `lint (actionlint)`, etc.) so future courier-ferry imports cannot land without §33 headers.

The lint script already exists at `tools/hygiene/check-archive-header-section33.sh`. Wiring is a small workflow-yml addition:

```yaml
- name: lint (archive header §33)
run: tools/hygiene/check-archive-header-section33.sh
```

This blocks the recurring-pattern at the structural layer: the tool catches violations pre-merge instead of waiting for human / advisory-AI review on each PR.

## Composition with existing factory substrate

- **Otto-346** (dependency symbiosis is human-anchoring; recurring inline pattern = signal substrate primitive missing): this row IS the Otto-346 application. The recurring §33 review-finding pattern → substrate primitive (the lint tool) → eventual CI enforcement.
- **Otto-341** (lint suppression is self-deception; mechanism over vigilance): the goal is mechanism (CI-enforced), not vigilance (each agent remembering the §33 discipline).
- **Otto-229** (tick-history append-only): same shape — recurring discipline-violation became a `check-tick-history-order.sh` CI lint after the bug pattern was identified. This row applies the same template.
- **Otto-238** (retractability; visible reversal not silent fix): the backfill PR landing first preserves the lineage of which docs needed retrofit; CI enforcement second prevents future violations without changing past rows.

## Why P3

- Not blocking any current PR merge
- The lint tool exists already; CI enforcement is the structural improvement
- Backfill is mechanical; can be batched into a single PR when ready

## Test plan (when picked up)

- Sub-task 1: run `tools/hygiene/check-archive-header-section33.sh` on the backfill branch; expect exit 0 (no violations)
- Sub-task 2: confirm gate.yml addition fires on a synthetic-test PR adding a courier-ferry doc WITHOUT §33 header — should fail; then add header and verify success
- Aminata adversarial review: does the lint catch all attack-shapes? E.g., a doc with bold-styled `**Scope**:` (which the #570 P0 finding showed is wrong); a doc with `Scope:` in line 21+ (out of 20-line bound)
- F1/F2/F3 filter pass

## What this row does NOT do

- Does NOT auto-fix existing docs (the lint reports; the backfill PR fixes mechanically)
- Does NOT enforce §33 on docs OUTSIDE `docs/research/**` (other surfaces have different governance)
- Does NOT pre-commit to the exact wording of each §33 header field (that's per-doc author judgment)
- Does NOT replace human review entirely; lint catches structural violation, review still catches content quality

## Owed work after this row is picked up

1. Backfill PR (Sub-task 1): adds §33 headers to all residual courier-ferry research docs
2. CI wire PR (Sub-task 2): adds the lint job to gate.yml
3. Update `docs/research/README.md` (if exists) to mention the §33 discipline + lint
4. Otto-346 substrate file may want a cross-reference to this row as a concrete instance of the principle in action
151 changes: 151 additions & 0 deletions tools/hygiene/check-archive-header-section33.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,151 @@
#!/usr/bin/env bash
#
# tools/hygiene/check-archive-header-section33.sh — validates that
# courier-ferry / external-conversation imports under `docs/research/**`
# carry the 4-field archive boundary header in the first 20 lines per
# GOVERNANCE.md §33.
#
# Why this exists (Otto-346 pattern; observed 2026-04-26):
# The §33 archive header was the most-common review finding across
# the 11-Amara-refinement courier-ferry lineage this session: PR #560
# / #562 / #563 / #565 / #566 / #568 / #569 / #570 / #553 each had
# to be retrofitted with the header AFTER review. Recurring identical
# review finding = signal that the discipline lacks automated
# enforcement.
#
# Per Otto-346 (recurring pattern → substrate primitive missing) +
# Otto-341 (mechanism over vigilance), the right shape is a CI lint
# check that fails the build when a courier-ferry import lands
# without the §33 header — instead of waiting for human / advisory-AI
# review to flag it on every doc.
#
# What this checks:
# For every file under `docs/research/**.md` that matches the
# courier-ferry import pattern (filename or content contains
# "courier-ferry" / "cross-substrate" / "external conversation"):
# - First 20 lines contain ALL four required §33 labels:
# * `Scope:` (literal label, NOT bold-styled `**Scope**:`)
# * `Attribution:`
# * `Operational status:`
# * `Non-fusion disclaimer:`
# - Reports every failing file with a per-file diagnostic line, then
# a summary line with the total count. Multi-violation reporting is
# intentional: agents can fix all violations in a single pass instead
# of running the lint repeatedly to discover them serially.
Comment on lines +31 to +34
# - Exits non-zero on any failure
Comment thread
AceHack marked this conversation as resolved.
#
# What this does NOT do:
# - Does NOT validate the CONTENT of each header field (that's a
# judgment call the author makes)
# - Does NOT auto-fix; the fix is the author's responsibility (the
# CI failure points at the missing labels)
# - Does NOT enforce §33 on docs OUTSIDE `docs/research/**` (other
# surfaces have different governance per AGENTS.md)
Comment thread
AceHack marked this conversation as resolved.
#
# Composes with:
# - GOVERNANCE.md §33 (the rule this lints)
# - tools/hygiene/check-tick-history-order.sh (pattern: structural-
# prevention via lint, not vigilance)
# - .github/workflows/gate.yml (wired as a lint job)
#
# Self-test:
# $ tools/hygiene/check-archive-header-section33.sh
# → exit 0 if all courier-ferry research docs have §33 headers
# → exit 1 with diagnostic if any are missing

set -euo pipefail

REPO_ROOT="$(git rev-parse --show-toplevel 2>/dev/null || pwd)"
RESEARCH_DIR="${REPO_ROOT}/docs/research"

if [[ ! -d "$RESEARCH_DIR" ]]; then
echo "OK: docs/research/ does not exist; nothing to check"
exit 0
fi

# Required §33 labels as literal strings. Bold-styled forms like
# `**Scope**:` are NOT acceptable per the discovery in PR #570 P0:
# header-format linting may not recognize bold-styled labels.
required_labels=(
"Scope:"
"Attribution:"
"Operational status:"
"Non-fusion disclaimer:"
)

# A courier-ferry / external-conversation import is identified by
# specific structural-marker patterns in filename or first-20-line
# content. Patterns are role-ref-based (NOT name-attribution) per
# the "No name attribution in code, docs, or skills" rule:
# we look for the structural-shape markers like 'courier-ferry'
# and 'cross-substrate', not personal names. Empty match = file is
# NOT in scope, skip silently.
#
# Content signals are scoped to the first 20 lines (the §33 header
# region itself) to AVOID false positives where a doc merely
# mentions an external system in its body. The narrow lookback
# also makes the lint cheaper.
is_courier_ferry_import() {
local file="$1"
# Filename signals — structural markers only (no personal names)
if [[ "$file" =~ courier-ferry|cross-substrate|external-import|cross-ferry ]]; then
return 0
fi
# Content signals scanned in the §33 header region (first 20 lines).
# Patterns target structural phrases — courier-ferry process,
# external-conversation status — NOT mere mentions of external
# systems. Matches like 'chatgpt' / 'google search ai' alone are
# too broad and produce false positives on internal research docs
# (Copilot P0 finding: PR #571 review).
if head -20 "$file" 2>/dev/null | grep -qiE 'courier.ferry|external conversation|external collaborator|external research agent|courier-ferry capture'; then
return 0
fi
return 1
}

violations=0
violation_files=()

Comment on lines +106 to +108
# Iterate all .md files under docs/research/ recursively. The
# enforcement scope must match the documented scope ('docs/research/**');
# subdirectories like docs/research/claims/ exist today and any
# courier-ferry doc placed in one would bypass a single-level glob.
# Codex P2 finding (PR #571 review): use recursive walk via 'find'
# instead of '*.md' single-level glob.
shopt -s nullglob
while IFS= read -r -d '' file; do
if ! is_courier_ferry_import "$file"; then
continue
fi

header_region=$(head -20 "$file")
missing=()
for label in "${required_labels[@]}"; do
if ! echo "$header_region" | grep -qF "$label"; then
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Match §33 labels as raw line prefixes

The current check uses grep -qF "$label" against the header block, so any line that merely contains the substring (for example **Scope:** or prose/code snippets containing Scope:) is accepted. This contradicts the stated requirement that labels be in literal non-bold form and lets formatting drift bypass the lint, so a file can pass CI while still violating the intended §33 header shape.

Useful? React with 👍 / 👎.

missing+=("$label")
fi
done

if [[ ${#missing[@]} -gt 0 ]]; then
violations=$((violations + 1))
violation_files+=("$file")
echo "VIOLATION: ${file#"$REPO_ROOT/"} missing §33 labels: ${missing[*]}" >&2
fi
done < <(find "$RESEARCH_DIR" -type f -name '*.md' -print0)

if [[ $violations -gt 0 ]]; then
echo "" >&2
echo "FAIL: $violations courier-ferry research-doc(s) missing GOVERNANCE.md §33 archive header(s)" >&2
echo "" >&2
echo "Required header (literal label form, NOT bold-styled) in first 20 lines:" >&2
echo " Scope: <one-line scope>" >&2
echo " Attribution: <named entities + first-name attribution per Otto-279>" >&2
echo " Operational status: <research-grade vs operational-policy>" >&2
echo " Non-fusion disclaimer: <attribution boundary preservation>" >&2
echo "" >&2
echo "Pattern reference: see PR #570 / #566 / #563 §33-header fixes for examples." >&2
exit 1
fi

echo "OK: all courier-ferry research docs have §33 archive headers"
exit 0
Loading