@@ -0,0 +1,112 @@
---
id: B-0070
priority: P2
slug: orphan-role-ref-detector-lint
status: backlog
created: 2026-04-28
maintainer: aaron
ownership: otto
title: Orphan role-ref detector lint — catch ferry-N without named source on code surfaces (Aaron 2026-04-28)
---
Comment on lines +1 to +10
Copilot AI Apr 28, 2026

P1: The YAML frontmatter here doesn’t match the documented per-row backlog schema in tools/backlog/README.md (e.g., status is documented as open/closed/superseded-by-*/deferred, and last_updated is marked required). Either adjust this row to the documented schema (add last_updated, use a documented status value), or update the schema docs/tooling in the same change-set so the repo has a single source of truth.

# B-0070 — Orphan role-ref detector lint

## Why

The human maintainer 2026-04-28 (verbatim, /btw aside during PR #24
drain):

> "not sure if you can update to find things like that that don't make
> sense in the future like look for courrier-ferrrrry or whatever IDK
> just thinking out out for your future self and the review agentsd"

Aaron caught a recurring failure mode: when named attribution is stripped
from code-surface text per the Otto-279 history-surface-only rule, the
mechanical replacement leaves orphan role-refs, phrases whose named source
is gone and which therefore carry no semantic weight. A lint should catch
this pattern at write-time, before it ships.

Documented in `memory/feedback_orphan_role_ref_after_name_stripping_aaron_2026_04_28.md`.

## What

Lint that scans code-surface files (excluding history-surfaces) for:

1. **Orphan role-ref forms** — text like `courier-ferry-N`, `ferry-N`,
`ferry-N's` without a resolvable named source nearby. These are
the over-stripped attributions that should EITHER be removed
entirely OR replaced with a self-contained principle name.

2. **Un-stripped name attribution on code-surface** — text like
`Amara ferry-N`, `Grok ferry-N`, `Gemini ferry-N`, `Per <Name>
2026-MM-DD` on code-surface files (`tools/`, behavioural `docs/`,
`.claude/skills/`). Should be moved to a history-surface OR
replaced with role-ref AND self-contained principle name.

Scope:

- **Apply to:** `tools/**` (excluding `tools/lean4/.lake/`),
behavioural docs in `docs/` (excluding history surfaces),
`.claude/skills/**/SKILL.md` (skill bodies),
`src/**`, `*.fsproj`, `*.csproj`
- **Exclude (history surfaces per Otto-279):** `memory/**`,
Comment on lines +46 to +52
Copilot AI Apr 28, 2026

P1: The scan scope includes very broad roots (tools/**, docs/**, src/**) but doesn’t explicitly exclude references/upstreams/**, which the repo’s standing rules require excluding from file-iteration scans due to size/noise (docs/AGENT-BEST-PRACTICES.md “Exclude references/upstreams/ from every file-iteration command”). Please add that exclusion to the planned lint’s scope/implementation notes to avoid a slow, noisy audit.

`docs/research/**`, `docs/aurora/**`, `docs/ROUND-HISTORY.md`,
`docs/DECISIONS/**`, `docs/hygiene-history/**`,
`docs/pr-preservation/**`, `docs/pr-discussions/**`, commit
messages
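
The include/exclude lists above can be read as a single path predicate. The following is a minimal Python sketch under stated assumptions, not the planned bash implementation: `is_code_surface` is a hypothetical helper name, paths are assumed to be repo-relative with forward slashes, and every non-history file under `docs/` is treated as a behavioural doc, which the row itself does not guarantee.

```python
from pathlib import PurePosixPath

# Prefixes and files copied from the Scope list; the grouping is illustrative.
HISTORY_PREFIXES = (
    "memory/",
    "docs/research/",
    "docs/aurora/",
    "docs/DECISIONS/",
    "docs/hygiene-history/",
    "docs/pr-preservation/",
    "docs/pr-discussions/",
)
HISTORY_FILES = ("docs/ROUND-HISTORY.md",)
SKIPPED_PREFIXES = ("tools/lean4/.lake/",)
CODE_PREFIXES = ("tools/", "docs/", "src/")
PROJECT_SUFFIXES = (".fsproj", ".csproj")


def is_code_surface(path: str) -> bool:
    """Return True if a repo-relative POSIX path should be linted."""
    # Exclusions win over inclusions, so history surfaces are checked first.
    if path in HISTORY_FILES or path.startswith(HISTORY_PREFIXES + SKIPPED_PREFIXES):
        return False
    # Under .claude/skills/ only the skill bodies (SKILL.md) are in scope.
    if path.startswith(".claude/skills/"):
        return PurePosixPath(path).name == "SKILL.md"
    return path.startswith(CODE_PREFIXES) or path.endswith(PROJECT_SUFFIXES)
```

Checking exclusions first keeps the `docs/` include from swallowing the history surfaces that live under the same root. Commit messages are not files, so they fall outside this predicate entirely.
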
Comment on lines +55 to +56
Copilot AI Apr 28, 2026

P1 (xref / rule alignment): This list says docs/pr-discussions/** is a “history surface per Otto-279”, but docs/AGENT-BEST-PRACTICES.md’s Otto-279 history-surface list does not include docs/pr-discussions/** (it lists docs/pr-preservation/**). Either remove docs/pr-discussions/** from the “per Otto-279” claim, or update the cited rule so the surfaces list is consistent.

Suggested change, from:
`docs/pr-preservation/**`, `docs/pr-discussions/**`, commit
messages
to:
`docs/pr-preservation/**`, commit messages

## How

Initial implementation: bash script under `tools/hygiene/` matching
the existing audit-* pattern. Wired into the CI gate as a soft-fail (warn,
don't block) at first, the same way
`audit-memory-index-duplicates.sh` started before being promoted to
hard-fail.
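
The soft-fail to hard-fail promotion comes down to how a finding count maps to an exit status. A minimal sketch of that mapping, written in Python for illustration only (the row plans a bash script); `exit_code` and its `hard_fail` flag are hypothetical names:

```python
def exit_code(finding_count: int, hard_fail: bool = False) -> int:
    """Map a finding count to a process exit status.

    Soft-fail (the initial mode) always returns 0, so CI can print warnings
    without blocking. Hard-fail returns 1 as soon as anything was found.
    """
    if finding_count > 0 and hard_fail:
        return 1
    return 0
```

Under this shape, promoting the lint later is a one-flag change in the CI invocation rather than a rewrite of the detector.
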

Detector regex (initial):

```
# Orphan role-ref (no resolvable named source)
\bcourier-ferry-\d+\b
\bferry-\d+\b
\bferry-\d+'s?\b

# Un-stripped name attribution on code-surface
\b(Amara|Grok|Gemini|Codex|Cursor|Aaron|Otto)\s+ferry-\d+\b
\bPer\s+(Amara|Grok|Gemini|Codex|Cursor|Aaron|Otto)\s+2026-
```
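
One property of these patterns is worth making explicit: the bare `\bferry-\d+\b` pattern also matches inside `courier-ferry-12`, `ferry-8's`, and `Amara ferry-12`, so a detector has to test the named forms first to label a hit correctly. The sketch below shows that ordering in Python, whose `re` module accepts `\b` and `\d` as written. It is an illustration under assumptions, not the planned script: `classify` is a hypothetical helper, and it returns one label per line even if a line contains both forms.

```python
import re

# Name list copied from the detector regex above.
NAMES = ("Amara", "Grok", "Gemini", "Codex", "Cursor", "Aaron", "Otto")
NAME_ALT = "|".join(NAMES)

# Un-stripped name attribution: "<Name> ferry-N" or "Per <Name> 2026-...".
NAMED = re.compile(
    rf"\b(?:{NAME_ALT})\s+ferry-\d+\b|\bPer\s+(?:{NAME_ALT})\s+2026-"
)
# Orphan role-ref: one pattern covers courier-ferry-N, ferry-N and ferry-N's,
# because the possessive and prefixed forms both contain a plain ferry-N.
ORPHAN = re.compile(r"\b(?:courier-)?ferry-\d+\b")


def classify(line: str):
    """Return 'named', 'orphan', or None for one line of text."""
    if NAMED.search(line):  # checked first; ORPHAN would match these too
        return "named"
    if ORPHAN.search(line):
        return "orphan"
    return None
```

For example, `classify("# courier-ferry-12 absorb")` yields `"orphan"`, while `classify("Gemini ferry-8's example draft")` yields `"named"`.
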
Comment on lines +66 to +77
Copilot AI Apr 28, 2026

P1 (shell portability): The proposed detector regexes use \b and \d, which aren’t POSIX ERE and won’t work with common grep -E/awk (notably on macOS). Since this row proposes implementing the lint as a bash script, specify an engine that supports these tokens (e.g., ripgrep PCRE2) or rewrite the patterns in portable ERE form (e.g., [0-9]+ and explicit boundary handling).

Output shape: per-finding row with file:line:column, the matched
text, and a fix-suggestion (one of: remove attribution clause / move
to history surface / replace with self-contained principle name).
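
A possible rendering of that per-finding row, sketched in Python. The `Finding` class, the `scan_line` helper, and the exact separator characters are assumptions for illustration; the row only fixes which fields appear.

```python
import re
from dataclasses import dataclass

# The three fix-suggestions named in the output shape.
FIX_SUGGESTIONS = (
    "remove attribution clause",
    "move to history surface",
    "replace with self-contained principle name",
)


@dataclass(frozen=True)
class Finding:
    path: str
    line: int        # 1-based line number
    column: int      # 1-based column of the first matched character
    matched: str
    suggestion: str

    def render(self) -> str:
        return f"{self.path}:{self.line}:{self.column}: {self.matched} | fix: {self.suggestion}"


def scan_line(path: str, line_no: int, text: str, pattern, suggestion: str) -> list:
    """Return one Finding per match of a compiled pattern in a single line."""
    return [
        Finding(path, line_no, m.start() + 1, m.group(0), suggestion)
        for m in pattern.finditer(text)
    ]
```

Keeping `file:line:column` at the front of the row lets most editors and CI log viewers jump straight to the hit.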

## Composes with

- **Otto-279 history-surface carve-out** at
`docs/AGENT-BEST-PRACTICES.md` ~287-348 — defines WHICH surfaces
get named attribution
- **`memory/feedback_orphan_role_ref_after_name_stripping_aaron_2026_04_28.md`** — the substrate
capturing the failure mode
- **`prompt-protector` skill** — invisible-Unicode lint shape;
orphan-role-ref lint would compose at the same write-time-scan
layer
- **`tools/hygiene/audit-memory-index-duplicates.sh`** — pattern
template for the audit-script shape
- **task #296** (commit-message-shape skill update) — the skill body
is also code-surface; lint catches inadvertent named-attribution in
the skill prose

## Cadence

When other 0/0/0 work clears OR when an orphan role-ref ships in a PR
that the lint would have caught (whichever fires first). Composes
with the `skill-improver` workflow — when `commit-message-shape` skill
is next updated (task #296), bundle the lint with it.

## Provenance

- Aaron 2026-04-28 verbatim (above) during PR #24 review
- Pattern empirically caught: PR #24 had 4 orphan role-refs after
mechanical name-strip; cleanup was reactive, not preventive
- Companion memory:
`memory/feedback_orphan_role_ref_after_name_stripping_aaron_2026_04_28.md`
@@ -0,0 +1,131 @@
---
name: Orphan role-ref after name-stripping — Aaron 2026-04-28 — when stripping named attribution leaves a role-ref that no longer makes sense, REMOVE the comment / attribution-clause entirely instead of leaving the orphan
description: 'Aaron 2026-04-28 caught a recurring failure mode in name-attribution corrections: when the original code/comment/doc mentions a named source ("Amara ferry-12") and the role-ref discipline strips the name, the resulting orphan ("courier-ferry-12 absorb") does not carry the same semantic weight. There are two paths forward: (a) recover the named source on a history surface, or (b) remove the comment / attribution-clause entirely. The middle ground (orphan role-ref) is worse than either. Verbatim feedback from Aaron, 2026-04-28, in PR #24 review.'
type: feedback
---

# Orphan role-ref after name-stripping

## Verbatim quote (Aaron 2026-04-28)

> "courier-ferry-5 absorb this does not really make sense with amamras
> name, we could remove the comment all together"

> "not sure if you can update to find things like that that don't make
> sense in the future like look for courrier-ferrrrry or whatever IDK
> just thinking out out for your future self and the review agentsd"

## The pattern

When applying the Otto-279 history-surface-vs-code-surface discipline to
strip named attribution from code (scripts, behavioural docs, public
prose), the mechanical replacement `<Name> ferry-N` → `courier-ferry-N
absorb` produces an **orphan role-ref**: a phrase that points at a
substrate source-anchor whose source-name has been removed.

Examples caught in PR #24:

| Original (history-surface OK) | Mechanical strip (orphan) | Better path |
|------------------------------|---------------------------------------|-------------|
| `Amara ferry-12` | `courier-ferry-12 absorb` | Remove the parenthetical; the class name stands alone |
| `Grok ferry-16 invariant` | `courier-ferry-16 absorb invariant` | Use the principle name directly: "Substrate Truth Principle invariant" |
| `Per Amara ferry-7 evidence-pointer rule` | `Per courier-ferry-7 absorb evidence-pointer rule` | Drop "Per ferry-N" entirely; the rule is in the spec |
| `Gemini ferry-8's example draft` | `courier-ferry-8 absorb example draft` | Replace with role-ref class: "any external example draft" |

The orphan form fails because:

1. **Numbered ferry IDs are meaningful only with the named source.**
"ferry-12" is Amara-specific terminology in this factory; without
"Amara" it's just a number with no resolvable referent.
2. **The role-ref form `courier-ferry-N` is verbose without adding
meaning.** Readers who don't know the substrate vocabulary see noise.
3. **Removing the substrate-source-anchor entirely is usually OK** —
the technical content (class name, principle name, rule shape)
stands on its own. The named source belongs in commit-message
trailers / history-surface docs / memory files, not in code
comments.

## The discipline

When stripping named attribution from a code comment / FAIL message /
script header:

1. **First check:** does the resulting text still make sense without
the named source?
2. **If yes** (e.g., the principle name is self-explanatory) → the
strip is fine
3. **If no** (orphan role-ref, missing referent) → remove the
attribution clause entirely. Don't keep half-attribution.

## Detection (future structural fix)

Aaron's framing 2026-04-28: *"not sure if you can update to find things
like that that don't make sense in the future ... for your future self
and the review agents"* — suggesting a lint that catches the pattern.

Candidate detector regex (for code-surface files only — `tools/`,
`docs/` excluding history-surfaces, behavioural docs):

```
\bcourier-ferry-\d+\b
\bferry-\d+\b
\bferry-\d+'s?\b
```

Plus the inverse: `\b<Person>\s+ferry-\d+\b` (Amara, Grok, Gemini, etc.
followed by ferry-N) to catch un-stripped name attribution that should
have been stripped on a code surface.
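
The `<Person>` placeholder in the inverse pattern has to be expanded into a concrete alternation before it can be used. A small Python sketch of that expansion follows; `named_ferry_pattern` is a hypothetical helper, and the name list passed in is whatever roster the lint ends up maintaining.

```python
import re


def named_ferry_pattern(names):
    """Expand the <Person> placeholder into a compiled alternation."""
    if not names:
        raise ValueError("at least one name is required")
    # re.escape keeps any punctuation in a name from being read as regex syntax.
    alternation = "|".join(re.escape(name) for name in names)
    return re.compile(rf"\b(?:{alternation})\s+ferry-\d+\b")


pattern = named_ferry_pattern(["Amara", "Grok", "Gemini"])
match = pattern.search("Grok ferry-16 invariant")
```

Here `match.group(0)` is `"Grok ferry-16"`, while the stripped form `courier-ferry-16 absorb invariant` produces no match, so the two detectors stay distinguishable.
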
Comment on lines +66 to +77
Copilot AI Apr 28, 2026

P1 (shell portability): The “Candidate detector regex” examples here use \b and \d, which are PCRE tokens and can be misleading if the eventual implementation uses grep -E/awk (the repo’s hygiene scripts frequently target macOS portability). Consider either rewriting these patterns in portable ERE form, or explicitly noting the intended engine/tool (e.g., ripgrep PCRE2) so future implementers don’t copy/paste a non-working regex into a bash hygiene script.

The lint composes with the `prompt-protector` skill's invisible-Unicode
lint shape (write-time scan). Backlog candidate: B-NNNN — extend the
existing `audit-*` scripts under `tools/hygiene/` to flag these
patterns with a fix-suggestion: "remove the attribution clause OR
Comment on lines +80 to +82
Copilot AI Apr 28, 2026

P1: This memory still uses the placeholder “Backlog candidate: B-NNNN” even though this PR introduces the concrete backlog row B-0070. Update this reference to B-0070 (and ideally link to docs/backlog/P2/B-0070-orphan-role-ref-detector-lint-aaron-2026-04-28.md) so readers can jump straight to the actionable follow-up.

Suggested change, from:
lint shape (write-time scan). Backlog candidate: B-NNNN — extend the
existing `audit-*` scripts under `tools/hygiene/` to flag these
patterns with a fix-suggestion: "remove the attribution clause OR
to:
lint shape (write-time scan). Backlog follow-up: `B-0070`
(`docs/backlog/P2/B-0070-orphan-role-ref-detector-lint-aaron-2026-04-28.md`)
— extend the existing `audit-*` scripts under `tools/hygiene/` to flag
these patterns with a fix-suggestion: "remove the attribution clause OR

move to history-surface OR replace with a self-contained principle
name."

## What this does NOT mean

- Does NOT mean named attribution is forbidden everywhere — it's
the correct framing on history surfaces (`memory/`,
`docs/research/`, `docs/ROUND-HISTORY.md`, `docs/DECISIONS/`,
hygiene-history, commit messages) per the Otto-279 carve-out at
`docs/AGENT-BEST-PRACTICES.md` "history-surface name attribution
exemption" section.
- Does NOT mean automatic strip-attribution scripts are dangerous
— they're useful when paired with a downstream check that catches
orphans.
- Does NOT mean every cross-source citation needs to be removed —
citations to canonical principles (e.g., "Substrate Truth Principle",
"Otto-279 carve-out") that have their own resolvable name are fine
on code surfaces.

## Composition with prior substrate

- **Otto-279** history-surface name-attribution carve-out at
`docs/AGENT-BEST-PRACTICES.md` ~287-348 — the rule that defines
WHICH surfaces get named attribution
- **`feedback_otto_357_no_directives_aaron_makes_autonomy_first_class_accountability_mine_2026_04_27.md`**
— the pre-write self-scan rule for forbidden-token detection;
this orphan-role-ref rule is the same shape (write-time scan)
applied to a different category
- **Otto-341 mechanism-over-vigilance** — the lint detector composes
with the discipline; vigilance-only enforcement is structurally
insufficient
- **`prompt-protector` skill** — invisible-Unicode lint shape;
orphan-role-ref lint would compose at the same write-time-scan
layer

## Triggers for retrieval

- Aaron 2026-04-28: "courier-ferry-5 absorb this does not really make
sense with amamras name"
- Aaron 2026-04-28: "look for courrier-ferrrrry or whatever IDK just
thinking out out for your future self and the review agentsd"
- Pattern: orphan role-ref after name-stripping
- Detection regex: `\bcourier-ferry-\d+\b`, `\bferry-\d+\b` on
code-surface files
- Better path when stripping name from `<Name> ferry-N`: remove the
attribution clause entirely OR replace with self-contained principle
name
- Composes with Otto-279 carve-out + Otto-357 pre-write self-scan +
Otto-341 mechanism-over-vigilance + prompt-protector skill