From f47eee016e0b37a93090fbfa0e8028e38a894a09 Mon Sep 17 00:00:00 2001 From: Aaron Stainback Date: Sun, 26 Apr 2026 04:07:39 -0400 Subject: [PATCH 1/4] =?UTF-8?q?feat(hygiene):=20tools/hygiene/check-archiv?= =?UTF-8?q?e-header-section33.sh=20=E2=80=94=20=C2=A733=20archive=20header?= =?UTF-8?q?=20lint=20+=20B-0036=20backfill=20backlog?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Otto-346 substrate-primitive shape: GOVERNANCE.md §33 archive-header missing was the most-common review finding across the 11-Amara- refinement courier-ferry lineage this session (PRs #560/#562/#563/ #565/#566/#568/#569/#570/#553 each retrofitted post-review). Recurring identical review-finding pattern = signal that the discipline lacks automated enforcement. Per Otto-346 (recurring inline pattern → substrate primitive missing) + Otto-341 (mechanism over vigilance), the fix is a CI lint that catches the violation pre-merge. This commit ships the lint TOOL (not yet wired to CI) + a B-0036 backlog row for the two sequential follow-ups (backfill 26 pre-existing docs + wire to CI gate.yml). Tool behavior: - Scans docs/research/**.md for courier-ferry/external-conversation imports (filename or content patterns) - Validates first-20-lines contains all 4 §33 labels in literal form: Scope: / Attribution: / Operational status: / Non-fusion disclaimer: - Bold-styled (**Scope**:) form rejected per #570 P0 finding - Reports first violation with diagnostic - Exits non-zero on any violation Smoke-test on main found 26 pre-existing violations — confirms the substrate-debt is real and the lint catches it. Backfill is owed via B-0036 Sub-task 1; CI wiring is owed via Sub-task 2 (after backfill clears the residual). Composes with: - check-tick-history-order.sh (same pattern: structural-prevention via lint, not vigilance; that lint emerged from the same Otto-346 shape for the row-ordering bug) - audit-md032-plus-linestart.sh (sibling md-lint hygiene tool) - Otto-229 (recurring discipline violation → CI lint as fix) - Otto-238 (visible reversal not silent fix; backfill preserves per-doc lineage) Tool is standalone; not yet wired to CI gate.yml. Sub-task 2 of B-0036 covers the wiring after Sub-task 1's backfill PR clears the residual. --- ...r-backfill-and-ci-wire-otto-346-pattern.md | 104 ++++++++++++++ .../hygiene/check-archive-header-section33.sh | 131 ++++++++++++++++++ 2 files changed, 235 insertions(+) create mode 100644 docs/backlog/P3/B-0036-section33-archive-header-backfill-and-ci-wire-otto-346-pattern.md create mode 100755 tools/hygiene/check-archive-header-section33.sh diff --git a/docs/backlog/P3/B-0036-section33-archive-header-backfill-and-ci-wire-otto-346-pattern.md b/docs/backlog/P3/B-0036-section33-archive-header-backfill-and-ci-wire-otto-346-pattern.md new file mode 100644 index 00000000..71c53366 --- /dev/null +++ b/docs/backlog/P3/B-0036-section33-archive-header-backfill-and-ci-wire-otto-346-pattern.md @@ -0,0 +1,104 @@ +--- +id: B-0036 +priority: P3 +status: open +title: "GOVERNANCE.md §33 archive-header backfill on 26 pre-existing courier-ferry research docs + wire `tools/hygiene/check-archive-header-section33.sh` to CI gate.yml lint job" +tier: hygiene-tooling-and-substrate-discipline +effort: M +ask: Otto observation 2026-04-26 — §33 archive header was the most-common review finding across the 11-Amara-refinement courier-ferry lineage this session (PRs #560 / #562 / #563 / #565 / #566 / #568 / #569 / #570 / #553 each retrofitted post-review). Per Otto-346 (recurring pattern → substrate primitive missing) + Otto-341 (mechanism over vigilance), the structural fix is a CI lint that catches the violation pre-merge. +created: 2026-04-26 +last_updated: 2026-04-26 +composes_with: [feedback_otto_346_dependency_symbiosis_is_human_anchoring_via_upstream_contribution_good_citizenship_dont_blaze_past_2026_04_26.md, feedback_otto_341_lint_suppression_is_self_deception_noise_signal_or_underlying_fix_greenfield_large_refactors_welcome_training_data_human_shortcut_bias_2026_04_26.md, feedback_otto_229_tick_history_append_only_never_edit_prior_rows_otto_229_2026_04_24.md] +tags: [hygiene-tooling, lint-discipline, otto-346-recurring-pattern-to-substrate-primitive, governance-section33, courier-ferry-imports, archive-header-discipline, mechanism-over-vigilance] +--- + +# B-0036 — §33 Archive-Header Backfill + CI Wire + +## Origin + +Otto observation across this session's 11-Amara-refinement courier-ferry research-doc lineage: GOVERNANCE.md §33 archive-header missing was the **most-common review finding** across the 11 PRs. Each PR was retrofitted with the 4-field header AFTER the review caught it. + +The fix-shape Otto already shipped in this session: a hygiene tool `tools/hygiene/check-archive-header-section33.sh` that catches the violation before merge. + +## What this row addresses + +Two sequential sub-tasks: + +### Sub-task 1: Backfill 26 pre-existing courier-ferry research docs + +The lint tool when run on `main` finds **26 violations** in pre-existing courier-ferry research docs. Each needs the 4-field §33 header (Scope / Attribution / Operational status / Non-fusion disclaimer) added to the first 20 lines. + +Files affected (as of 2026-04-26 main): + +- `docs/research/codex-cli-first-class-2026-04-23.md` +- `docs/research/drift-taxonomy-bootstrap-precursor-2026-04-22.md` +- `docs/research/dst-accepted-boundaries.md` +- `docs/research/dst-compliance-criteria.md` +- `docs/research/gemini-cli-capability-map.md` +- `docs/research/grok-cli-capability-map.md` +- `docs/research/maji-formal-operational-model-amara-courier-ferry-2026-04-26.md` +- `docs/research/maji-messiah-spectre-aperiodic-monotile-amara-third-courier-ferry-2026-04-26.md` +- `docs/research/memory-reconciliation-algorithm-design-2026-04-24.md` +- `docs/research/meta-pixel-perfect-text-to-image-youtube-wink-2026-04-22.md` +- `docs/research/muratori-zeta-pattern-mapping-2026-04-23.md` (only `Non-fusion disclaimer:` missing) +- `docs/research/openai-codex-cli-capability-map.md` +- `docs/research/openai-deep-ingest-cross-substrate-readability-2026-04-22.md` +- `docs/research/oracle-scoring-v0-design-addressing-aminata-critical-2026-04-23.md` (only `Non-fusion disclaimer:` missing) +- `docs/research/provenance-aware-claim-veracity-detector-2026-04-23.md` (only `Non-fusion disclaimer:` missing) +- `docs/research/quantum-sensing-low-snr-detection-and-analogy-boundaries-2026-04-23.md` (only `Non-fusion disclaimer:` missing) +- `docs/research/superfluid-ai-github-funding-survival-bayesian-belief-propagation-amara-seventh-courier-ferry-2026-04-26.md` +- `docs/research/superfluid-ai-language-gravity-austrian-economics-amara-eighth-courier-ferry-2026-04-26.md` +- `docs/research/superfluid-ai-rigorous-mathematical-formalization-amara-fifth-courier-ferry-2026-04-26.md` +- `docs/research/test-classification.md` (3 of 4 labels missing) +- (...and ~6 more — full list via running the lint tool) + +Note: some docs (Maji formal model, Spectre-Messiah, Superfluid AI fifth/seventh/eighth) are listed because the lint runs against `main` which doesn't have the PR-side edits yet — those docs DO have §33 headers in the PRs that are landing right now. Re-run the lint after the in-flight PRs merge to get the accurate residual list. + +Backfill PR shape: dedicated PR adding §33 headers to all residual docs. Effort: M (1-3 days; mechanical work + per-doc judgment on what each Scope/Attribution should say). + +### Sub-task 2: Wire to CI as enforcing lint + +After Sub-task 1 lands and the lint reports 0 violations on `main`, wire the lint into `.github/workflows/gate.yml` as a new lint job (alongside `lint (markdownlint)`, `lint (shellcheck)`, `lint (actionlint)`, etc.) so future courier-ferry imports cannot land without §33 headers. + +The lint script already exists at `tools/hygiene/check-archive-header-section33.sh`. Wiring is a small workflow-yml addition: + +```yaml + - name: lint (archive header §33) + run: tools/hygiene/check-archive-header-section33.sh +``` + +This blocks the recurring-pattern at the structural layer: the tool catches violations pre-merge instead of waiting for human / advisory-AI review on each PR. + +## Composition with existing factory substrate + +- **Otto-346** (dependency symbiosis is human-anchoring; recurring inline pattern = signal substrate primitive missing): this row IS the Otto-346 application. The recurring §33 review-finding pattern → substrate primitive (the lint tool) → eventual CI enforcement. +- **Otto-341** (lint suppression is self-deception; mechanism over vigilance): the goal is mechanism (CI-enforced), not vigilance (each agent remembering the §33 discipline). +- **Otto-229** (tick-history append-only): same shape — recurring discipline-violation became a `check-tick-history-order.sh` CI lint after the bug pattern was identified. This row applies the same template. +- **Otto-238** (retractability; visible reversal not silent fix): the backfill PR landing first preserves the lineage of which docs needed retrofit; CI enforcement second prevents future violations without changing past rows. + +## Why P3 + +- Not blocking any current PR merge +- The lint tool exists already; CI enforcement is the structural improvement +- Backfill is mechanical; can be batched into a single PR when ready + +## Test plan (when picked up) + +- Sub-task 1: run `tools/hygiene/check-archive-header-section33.sh` on the backfill branch; expect exit 0 (no violations) +- Sub-task 2: confirm gate.yml addition fires on a synthetic-test PR adding a courier-ferry doc WITHOUT §33 header — should fail; then add header and verify success +- Aminata adversarial review: does the lint catch all attack-shapes? E.g., a doc with bold-styled `**Scope**:` (which the #570 P0 finding showed is wrong); a doc with `Scope:` in line 21+ (out of 20-line bound) +- F1/F2/F3 filter pass + +## What this row does NOT do + +- Does NOT auto-fix existing docs (the lint reports; the backfill PR fixes mechanically) +- Does NOT enforce §33 on docs OUTSIDE `docs/research/**` (other surfaces have different governance) +- Does NOT pre-commit to the exact wording of each §33 header field (that's per-doc author judgment) +- Does NOT replace human review entirely; lint catches structural violation, review still catches content quality + +## Owed work after this row is picked up + +1. Backfill PR (Sub-task 1): adds §33 headers to all residual courier-ferry research docs +2. CI wire PR (Sub-task 2): adds the lint job to gate.yml +3. Update `docs/research/README.md` (if exists) to mention the §33 discipline + lint +4. Otto-346 substrate file may want a cross-reference to this row as a concrete instance of the principle in action diff --git a/tools/hygiene/check-archive-header-section33.sh b/tools/hygiene/check-archive-header-section33.sh new file mode 100755 index 00000000..711ea0e0 --- /dev/null +++ b/tools/hygiene/check-archive-header-section33.sh @@ -0,0 +1,131 @@ +#!/usr/bin/env bash +# +# tools/hygiene/check-archive-header-section33.sh — validates that +# courier-ferry / external-conversation imports under `docs/research/**` +# carry the 4-field archive boundary header in the first 20 lines per +# GOVERNANCE.md §33. +# +# Why this exists (Otto-346 pattern; observed 2026-04-26): +# The §33 archive header was the most-common review finding across +# the 11-Amara-refinement courier-ferry lineage this session: PR #560 +# / #562 / #563 / #565 / #566 / #568 / #569 / #570 / #553 each had +# to be retrofitted with the header AFTER review. Recurring identical +# review finding = signal that the discipline lacks automated +# enforcement. +# +# Per Otto-346 (recurring pattern → substrate primitive missing) + +# Otto-341 (mechanism over vigilance), the right shape is a CI lint +# check that fails the build when a courier-ferry import lands +# without the §33 header — instead of waiting for human / advisory-AI +# review to flag it on every doc. +# +# What this checks: +# For every file under `docs/research/**.md` that matches the +# courier-ferry import pattern (filename or content contains +# "courier-ferry" / "via Aaron" / "external conversation"): +# - First 20 lines contain ALL four required §33 labels: +# * `Scope:` (literal label, NOT bold-styled `**Scope**:`) +# * `Attribution:` +# * `Operational status:` +# * `Non-fusion disclaimer:` +# - Reports the first failing file with a diagnostic +# - Exits non-zero on any failure +# +# What this does NOT do: +# - Does NOT validate the CONTENT of each header field (that's a +# judgment call the author makes) +# - Does NOT auto-fix; the fix is the author's responsibility (the +# CI failure points at the missing labels) +# - Does NOT enforce §33 on docs OUTSIDE `docs/research/**` (other +# surfaces have different governance per AGENTS.md) +# +# Composes with: +# - GOVERNANCE.md §33 (the rule this lints) +# - tools/hygiene/check-tick-history-order.sh (pattern: structural- +# prevention via lint, not vigilance) +# - .github/workflows/gate.yml (wired as a lint job) +# +# Self-test: +# $ tools/hygiene/check-archive-header-section33.sh +# → exit 0 if all courier-ferry research docs have §33 headers +# → exit 1 with diagnostic if any are missing + +set -euo pipefail + +REPO_ROOT="$(git rev-parse --show-toplevel 2>/dev/null || pwd)" +RESEARCH_DIR="${REPO_ROOT}/docs/research" + +if [[ ! -d "$RESEARCH_DIR" ]]; then + echo "OK: docs/research/ does not exist; nothing to check" + exit 0 +fi + +# Required §33 labels as literal strings. Bold-styled forms like +# `**Scope**:` are NOT acceptable per the discovery in PR #570 P0: +# header-format linting may not recognize bold-styled labels. +required_labels=( + "Scope:" + "Attribution:" + "Operational status:" + "Non-fusion disclaimer:" +) + +# A courier-ferry / external-conversation import is identified by +# common-marker patterns in filename or first-200-line content. Empty +# match = file is NOT in scope, skip silently. +is_courier_ferry_import() { + local file="$1" + # Filename signals + if [[ "$file" =~ courier-ferry|amara-via|aaron-share|cross-substrate ]]; then + return 0 + fi + # Content signals (within first 200 lines to avoid scanning whole file) + if head -200 "$file" 2>/dev/null | grep -qiE 'courier.ferry|via [A-Z][a-z]+ courier|external conversation|external collaborator|google search ai|chatgpt'; then + return 0 + fi + return 1 +} + +violations=0 +violation_files=() + +# Iterate all .md files under docs/research/ (one level deep; this is +# the canonical structure — research docs are not nested under +# subdirectories at present). +shopt -s nullglob +for file in "$RESEARCH_DIR"/*.md; do + if ! is_courier_ferry_import "$file"; then + continue + fi + + header_region=$(head -20 "$file") + missing=() + for label in "${required_labels[@]}"; do + if ! echo "$header_region" | grep -qF "$label"; then + missing+=("$label") + fi + done + + if [[ ${#missing[@]} -gt 0 ]]; then + violations=$((violations + 1)) + violation_files+=("$file") + echo "VIOLATION: ${file#$REPO_ROOT/} missing §33 labels: ${missing[*]}" >&2 + fi +done + +if [[ $violations -gt 0 ]]; then + echo "" >&2 + echo "FAIL: $violations courier-ferry research-doc(s) missing GOVERNANCE.md §33 archive header(s)" >&2 + echo "" >&2 + echo "Required header (literal label form, NOT bold-styled) in first 20 lines:" >&2 + echo " Scope: " >&2 + echo " Attribution: " >&2 + echo " Operational status: " >&2 + echo " Non-fusion disclaimer: " >&2 + echo "" >&2 + echo "Pattern reference: see PR #570 / #566 / #563 §33-header fixes for examples." >&2 + exit 1 +fi + +echo "OK: all courier-ferry research docs have §33 archive headers" +exit 0 From f3af2757078de7472802179e9db537d148881c08 Mon Sep 17 00:00:00 2001 From: Aaron Stainback Date: Sun, 26 Apr 2026 04:10:32 -0400 Subject: [PATCH 2/4] =?UTF-8?q?fix(check-archive-header-section33):=20SC22?= =?UTF-8?q?95=20=E2=80=94=20quote=20REPO=5FROOT=20inside=20parameter=20exp?= =?UTF-8?q?ansion=20(shellcheck)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ShellCheck SC2295 caught: '${file#$REPO_ROOT/}' has the unquoted $REPO_ROOT/ inside the parameter expansion, which would be treated as a glob pattern. Right fix: '${file#"$REPO_ROOT/"}' — quoting forces literal-string match. This is the bash-pattern-quoting discipline; relevant when REPO_ROOT could theoretically contain glob metacharacters (rare in practice but correct-by-default). --- tools/hygiene/check-archive-header-section33.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/hygiene/check-archive-header-section33.sh b/tools/hygiene/check-archive-header-section33.sh index 711ea0e0..cc4a5947 100755 --- a/tools/hygiene/check-archive-header-section33.sh +++ b/tools/hygiene/check-archive-header-section33.sh @@ -109,7 +109,7 @@ for file in "$RESEARCH_DIR"/*.md; do if [[ ${#missing[@]} -gt 0 ]]; then violations=$((violations + 1)) violation_files+=("$file") - echo "VIOLATION: ${file#$REPO_ROOT/} missing §33 labels: ${missing[*]}" >&2 + echo "VIOLATION: ${file#"$REPO_ROOT/"} missing §33 labels: ${missing[*]}" >&2 fi done From b2091d98b71023a4a6a37c160995c0b102e9bbd4 Mon Sep 17 00:00:00 2001 From: Aaron Stainback Date: Sun, 26 Apr 2026 04:12:51 -0400 Subject: [PATCH 3/4] fix(check-archive-header-section33): recursive walk via 'find' (Codex P2) Codex P2: original loop used '$RESEARCH_DIR/*.md' (single-level glob), but the script's documented scope is 'docs/research/**' (recursive). docs/research/claims/ exists today and any courier-ferry doc placed in a subdirectory would bypass the lint. Fix: replaced shopt-glob loop with 'find -type f -name *.md -print0' piped via 'while read -d ""' for null-terminated path safety. Now matches the documented scope. Smoke-test on main: lint now finds 36 violations (was 26 with the single-level glob), confirming subdirectories are scanned. Includes docs/research/claims/ subdirectory paths in the discovery. Composes with prior Codex P2 fix (SC2295 quote in pattern expansion) to keep this lint shellcheck-clean as it ships. --- tools/hygiene/check-archive-header-section33.sh | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/tools/hygiene/check-archive-header-section33.sh b/tools/hygiene/check-archive-header-section33.sh index cc4a5947..9ae9cfe9 100755 --- a/tools/hygiene/check-archive-header-section33.sh +++ b/tools/hygiene/check-archive-header-section33.sh @@ -89,11 +89,14 @@ is_courier_ferry_import() { violations=0 violation_files=() -# Iterate all .md files under docs/research/ (one level deep; this is -# the canonical structure — research docs are not nested under -# subdirectories at present). +# Iterate all .md files under docs/research/ recursively. The +# enforcement scope must match the documented scope ('docs/research/**'); +# subdirectories like docs/research/claims/ exist today and any +# courier-ferry doc placed in one would bypass a single-level glob. +# Codex P2 finding (PR #571 review): use recursive walk via 'find' +# instead of '*.md' single-level glob. shopt -s nullglob -for file in "$RESEARCH_DIR"/*.md; do +while IFS= read -r -d '' file; do if ! is_courier_ferry_import "$file"; then continue fi @@ -111,7 +114,7 @@ for file in "$RESEARCH_DIR"/*.md; do violation_files+=("$file") echo "VIOLATION: ${file#"$REPO_ROOT/"} missing §33 labels: ${missing[*]}" >&2 fi -done +done < <(find "$RESEARCH_DIR" -type f -name '*.md' -print0) if [[ $violations -gt 0 ]]; then echo "" >&2 From 11e9528a5aee999a56b8366cd466f94414d15e8c Mon Sep 17 00:00:00 2001 From: Aaron Stainback Date: Sun, 26 Apr 2026 04:16:49 -0400 Subject: [PATCH 4/4] =?UTF-8?q?fix(check-archive-header-section33):=204=20?= =?UTF-8?q?review=20findings=20=E2=80=94=20narrow=20content=20regex=20+=20?= =?UTF-8?q?role-ref=20filename=20patterns=20+=20accurate=20docstring=20+?= =?UTF-8?q?=20B-0036=20composes=5Fwith=20cleanup?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit P0 (Copilot) — content-signal regex was too broad (matched 'chatgpt' / 'google search ai' alone), false-positive on internal research docs that merely mention external systems. Lint flagged 36 docs (10 of which were false positives). Fix: narrowed content-signal regex to STRUCTURAL phrases only — 'courier.ferry', 'external conversation', 'external collaborator', 'external research agent', 'courier-ferry capture'. Mere mentions of system names ('chatgpt', 'google search ai') no longer trigger. Lint now flags 19 docs (was 36) — confirms 17 false positives were removed; the 19 remaining are real courier-ferry imports per manual inspection. Also tightened scan window to first-20 lines (was first-200) — the §33 header region is the only relevant scope. P1 (Copilot) — code embedded contributor first-names in filename and content patterns ('via Aaron' / 'amara-via' / 'aaron-share') per the 'No name attribution in code, docs, or skills' rule. Fix: replaced name-strings with structural role-ref patterns — filename: 'courier-ferry|cross-substrate|external-import|cross-ferry'; content: structural phrases only. Lint now uses no personal names in either filename or content matching. P1 (Copilot) — 'reports the first failing file' docstring did not match the implementation (which reports every violating file). Fix: rewrote docstring to accurately describe multi-violation reporting + summary, with explicit rationale (agents fix-all-at-once instead of running lint repeatedly). P1 (Copilot) — B-0036 composes_with referenced 'feedback_otto_229_tick_history_append_only_*' which is in personal memory, not in-repo memory/. Fix: replaced with 'GOVERNANCE.md-section-33-archive-header-discipline' (the actual rule it composes with) + 'tools/hygiene/check-tick-history- order.sh' (the in-repo template). Body still references Otto-229 conceptually as a discipline; that's not a broken-path concern. P1 (Copilot, duplicate of Codex P2 already fixed in b2091d9) — recursive walk via 'find -print0' instead of single-level glob. Already shipped; this commit acknowledges the duplicate finding. --- ...r-backfill-and-ci-wire-otto-346-pattern.md | 2 +- .../hygiene/check-archive-header-section33.sh | 33 ++++++++++++++----- 2 files changed, 26 insertions(+), 9 deletions(-) diff --git a/docs/backlog/P3/B-0036-section33-archive-header-backfill-and-ci-wire-otto-346-pattern.md b/docs/backlog/P3/B-0036-section33-archive-header-backfill-and-ci-wire-otto-346-pattern.md index 71c53366..df80d1c7 100644 --- a/docs/backlog/P3/B-0036-section33-archive-header-backfill-and-ci-wire-otto-346-pattern.md +++ b/docs/backlog/P3/B-0036-section33-archive-header-backfill-and-ci-wire-otto-346-pattern.md @@ -8,7 +8,7 @@ effort: M ask: Otto observation 2026-04-26 — §33 archive header was the most-common review finding across the 11-Amara-refinement courier-ferry lineage this session (PRs #560 / #562 / #563 / #565 / #566 / #568 / #569 / #570 / #553 each retrofitted post-review). Per Otto-346 (recurring pattern → substrate primitive missing) + Otto-341 (mechanism over vigilance), the structural fix is a CI lint that catches the violation pre-merge. created: 2026-04-26 last_updated: 2026-04-26 -composes_with: [feedback_otto_346_dependency_symbiosis_is_human_anchoring_via_upstream_contribution_good_citizenship_dont_blaze_past_2026_04_26.md, feedback_otto_341_lint_suppression_is_self_deception_noise_signal_or_underlying_fix_greenfield_large_refactors_welcome_training_data_human_shortcut_bias_2026_04_26.md, feedback_otto_229_tick_history_append_only_never_edit_prior_rows_otto_229_2026_04_24.md] +composes_with: [feedback_otto_346_dependency_symbiosis_is_human_anchoring_via_upstream_contribution_good_citizenship_dont_blaze_past_2026_04_26.md, feedback_otto_341_lint_suppression_is_self_deception_noise_signal_or_underlying_fix_greenfield_large_refactors_welcome_training_data_human_shortcut_bias_2026_04_26.md, GOVERNANCE.md-section-33-archive-header-discipline, tools/hygiene/check-tick-history-order.sh] tags: [hygiene-tooling, lint-discipline, otto-346-recurring-pattern-to-substrate-primitive, governance-section33, courier-ferry-imports, archive-header-discipline, mechanism-over-vigilance] --- diff --git a/tools/hygiene/check-archive-header-section33.sh b/tools/hygiene/check-archive-header-section33.sh index 9ae9cfe9..127bbf7a 100755 --- a/tools/hygiene/check-archive-header-section33.sh +++ b/tools/hygiene/check-archive-header-section33.sh @@ -22,13 +22,16 @@ # What this checks: # For every file under `docs/research/**.md` that matches the # courier-ferry import pattern (filename or content contains -# "courier-ferry" / "via Aaron" / "external conversation"): +# "courier-ferry" / "cross-substrate" / "external conversation"): # - First 20 lines contain ALL four required §33 labels: # * `Scope:` (literal label, NOT bold-styled `**Scope**:`) # * `Attribution:` # * `Operational status:` # * `Non-fusion disclaimer:` -# - Reports the first failing file with a diagnostic +# - Reports every failing file with a per-file diagnostic line, then +# a summary line with the total count. Multi-violation reporting is +# intentional: agents can fix all violations in a single pass instead +# of running the lint repeatedly to discover them serially. # - Exits non-zero on any failure # # What this does NOT do: @@ -71,16 +74,30 @@ required_labels=( ) # A courier-ferry / external-conversation import is identified by -# common-marker patterns in filename or first-200-line content. Empty -# match = file is NOT in scope, skip silently. +# specific structural-marker patterns in filename or first-20-line +# content. Patterns are role-ref-based (NOT name-attribution) per +# the "No name attribution in code, docs, or skills" rule: +# we look for the structural-shape markers like 'courier-ferry' +# and 'cross-substrate', not personal names. Empty match = file is +# NOT in scope, skip silently. +# +# Content signals are scoped to the first 20 lines (the §33 header +# region itself) to AVOID false positives where a doc merely +# mentions an external system in its body. The narrow lookback +# also makes the lint cheaper. is_courier_ferry_import() { local file="$1" - # Filename signals - if [[ "$file" =~ courier-ferry|amara-via|aaron-share|cross-substrate ]]; then + # Filename signals — structural markers only (no personal names) + if [[ "$file" =~ courier-ferry|cross-substrate|external-import|cross-ferry ]]; then return 0 fi - # Content signals (within first 200 lines to avoid scanning whole file) - if head -200 "$file" 2>/dev/null | grep -qiE 'courier.ferry|via [A-Z][a-z]+ courier|external conversation|external collaborator|google search ai|chatgpt'; then + # Content signals scanned in the §33 header region (first 20 lines). + # Patterns target structural phrases — courier-ferry process, + # external-conversation status — NOT mere mentions of external + # systems. Matches like 'chatgpt' / 'google search ai' alone are + # too broad and produce false positives on internal research docs + # (Copilot P0 finding: PR #571 review). + if head -20 "$file" 2>/dev/null | grep -qiE 'courier.ferry|external conversation|external collaborator|external research agent|courier-ferry capture'; then return 0 fi return 1