From 1d86e0cf8da4c6f19fb8a71f51cd9f931a4aaf0c Mon Sep 17 00:00:00 2001
From: Aaron Stainback
Date: Wed, 29 Apr 2026 03:31:02 -0400
Subject: [PATCH 1/4] tools/lint: no-directives-otto-prose advisory lint (Amara round-7 lexeme guard)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Vigilance failed ~15 times to keep "directive" out of Otto-authored
prose framing for maintainer input. Per Amara's round-7 catch:
"A word that slips fifteen times is not a word problem. It is a
missing lint."

Per Amara's B-0105 carve-out: "Tiny enforcement patches are allowed
when they directly prevent repeated consolidation-gate violations."
This is exactly that: tiny, targeted, advisory.

Diff-based scope (does not retrofit historical content) — only flags
Otto-authored prose changes vs origin/main on:
- memory/*.md (top-level)
- docs/hygiene-history/ticks/**/*.md
- docs/research/*.md
- .github/copilot-instructions.md

Pattern is narrowed (Amara's regex):
"Aaron's directive" / "maintainer directive" / "QoL directive" /
"human directive" / "directive" near maintainer-token

Whitelist:
- blockquote-quoted text (`> ...` lines)
- rule-documentation files where "directive" appears legitimately

Default mode is advisory (warn-only). --strict flag fails on hits.

Per the lexeme-guard naming distinction (Amara round-7):
Lane locks stop classes of work.
Lexeme guards stop repeated wording drift.
This is a lexeme guard, not a lane lock.
Future work (under follow-up consolidation):
- wire into .github/workflows/gate.yml as advisory check
- promote to --strict after low-noise validation period

Co-Authored-By: Claude Opus 4.7
---
 tools/lint/no-directives-otto-prose.sh | 117 +++++++++++++++++++++++++
 1 file changed, 117 insertions(+)
 create mode 100755 tools/lint/no-directives-otto-prose.sh

diff --git a/tools/lint/no-directives-otto-prose.sh b/tools/lint/no-directives-otto-prose.sh
new file mode 100755
index 000000000..41d07db7f
--- /dev/null
+++ b/tools/lint/no-directives-otto-prose.sh
@@ -0,0 +1,117 @@
+#!/usr/bin/env bash
+#
+# tools/lint/no-directives-otto-prose.sh — advisory lint that flags
+# Otto-authored prose using "directive" framing for maintainer input
+# in CHANGED FILES (diff-based; does not retrofit historical content).
+#
+# Born 2026-04-29 after the ~15th iteration of the same gremlin: a
+# memory file or PR title casting Aaron's correction/framing/input as
+# "Aaron's directive," which collapses self-provenance into bot-
+# execution and undermines the no-directives autonomy rule
+# (memory/feedback_otto_357_no_directives_aaron_makes_autonomy_first_class_accountability_mine_2026_04_27.md).
+#
+# Vigilance failed; lint is the durable answer.
+#
+# Per the "agency-framing lexeme guard" naming distinction (Amara's
+# round-7 catch): this is a LEXEME GUARD, not a LANE LOCK. Lane locks
+# stop classes of work; lexeme guards stop repeated wording drift.
+#
+# Per the B-0105 consolidation-pass carve-out (Amara explicit):
+# "tiny enforcement patches are allowed when they directly prevent
+# repeated consolidation-gate violations."
+#
+# Diff-based scope (avoids retrofitting historical content):
+#   - changed files between BASE_REF and HEAD
+#   - intersected with Otto-authored prose surfaces:
+#       - memory/*.md (top-level; not memory/persona/)
+#       - docs/hygiene-history/ticks/**/*.md (tick shards)
+#       - docs/research/*.md (research notes)
+#       - .github/copilot-instructions.md
+#
+# Pattern (per Amara's narrowed regex):
+#   "Aaron's directive" / "maintainer directive" / "QoL directive" /
+#   "human directive" / "directive from Aaron" / etc.
+#
+# Whitelist (NOT flagged in changed files):
+#   - lines starting with `> ` (markdown blockquote — usually quoted
+#     third-party text)
+#   - the rule-documentation files themselves
+#
+# Usage:
+#   tools/lint/no-directives-otto-prose.sh           # advisory (warn-only)
+#   tools/lint/no-directives-otto-prose.sh --strict  # exit 1 on hits
+#
+# Env:
+#   BASE_REF — base ref to diff against (default: origin/main).
+#              CI should set BASE_REF=$BASE_SHA.
+
+set -euo pipefail
+
+REPO_ROOT="$(cd "$(dirname "$0")/../.." && pwd)"
+cd "$REPO_ROOT"
+
+MODE="${1:-advisory}"
+BASE_REF="${BASE_REF:-origin/main}"
+
+# Compute changed files between BASE_REF and HEAD.
+CHANGED_FILES=$(git diff --name-only --diff-filter=AM "$BASE_REF...HEAD" 2>/dev/null || true)
+
+if [ -z "$CHANGED_FILES" ]; then
+  echo "no-directives-otto-prose: no changed files vs $BASE_REF; skipping"
+  exit 0
+fi
+
+# Filter to Otto-authored prose surfaces only.
+PROSE_FILES=$(printf '%s\n' "$CHANGED_FILES" | grep -E '^(memory/[^/]+\.md|docs/hygiene-history/ticks/.*\.md|docs/research/[^/]+\.md|\.github/copilot-instructions\.md)$' || true)
+
+if [ -z "$PROSE_FILES" ]; then
+  echo "no-directives-otto-prose: no Otto-prose surfaces changed; skipping"
+  exit 0
+fi
+
+# Skip rule-documentation files where "directive" appears legitimately.
+PROSE_FILES=$(printf '%s\n' "$PROSE_FILES" | grep -vE '(feedback_input_is_not_directive_|feedback_otto_357_no_directives_|feedback_free_will_is_paramount_external_directives_|no-directives-otto-prose)' || true)
+
+if [ -z "$PROSE_FILES" ]; then
+  echo "no-directives-otto-prose: only rule-docs touched; skipping"
+  exit 0
+fi
+
+# The narrowed pattern (per Amara): only flag where "directive" is
+# proximate to a maintainer/agency-framing token.
+PATTERN='\b(Aaron'\''?s|maintainer|QoL|human).*directive\b|\bdirective.*(Aaron|maintainer|QoL|human)\b'
+
+HITS_FILE="$(mktemp)"
+trap 'rm -f "$HITS_FILE"' EXIT
+
+# Search only the changed prose files.
+while IFS= read -r f; do
+  [ -z "$f" ] && continue
+  [ -f "$f" ] || continue
+  grep -nEH "$PATTERN" "$f" 2>/dev/null >> "$HITS_FILE" || true
+done <<< "$PROSE_FILES"
+
+# Filter out blockquote lines (`> ...` quoted third-party text).
+FILTERED_HITS_FILE="$(mktemp)"
+trap 'rm -f "$HITS_FILE" "$FILTERED_HITS_FILE"' EXIT
+
+grep -vE ':[[:space:]]*>' "$HITS_FILE" > "$FILTERED_HITS_FILE" || true
+
+HIT_COUNT=$(wc -l < "$FILTERED_HITS_FILE" | tr -d ' ')
+
+if [ "$HIT_COUNT" -gt 0 ]; then
+  echo "no-directives-otto-prose: found $HIT_COUNT candidate hit(s) in changed Otto-prose:" >&2
+  cat "$FILTERED_HITS_FILE" >&2
+  echo "" >&2
+  echo "Otto's prose should not frame Aaron's input as 'Aaron's directive' /" >&2
+  echo "'maintainer directive' / 'QoL directive' / 'human directive'." >&2
+  echo "Use 'input' / 'framing' / 'correction' / 'pass' instead."
>&2
+  echo "See memory/feedback_otto_357_no_directives_aaron_makes_autonomy_first_class_accountability_mine_2026_04_27.md" >&2
+  if [ "$MODE" = "--strict" ]; then
+    exit 1
+  fi
+  echo "(advisory mode; not failing build — pass --strict to fail)" >&2
+  exit 0
+fi
+
+echo "no-directives-otto-prose: clean (0 candidate hits in changed Otto-prose surfaces)"

From 5df5171d90e553c712bbc0990d4cb0ec9f401cad Mon Sep 17 00:00:00 2001
From: Aaron Stainback
Date: Wed, 29 Apr 2026 03:37:56 -0400
Subject: [PATCH 2/4] fix(no-directives-lint): add SCOPE=worktree mode + test fixtures + portability honesty (Amara round-8)

Five Amara round-8 corrections applied:

1. Added SCOPE=worktree mode for local pre-commit use. Default
   SCOPE=pr is the original PR-diff mode. Worktree mode unions
   unstaged + staged + committed-not-pushed changes. Without this,
   the script could not catch local edits before commit (which is
   exactly what the prose originally claimed it could do).

2. Added test fixtures at tools/lint/no-directives-otto-prose.tests.md
   covering: must-flag cases (including the canonical
   paired-edit-HTML-comment proof case from PR #823); whitelist cases
   (blockquoted text, filename citations, historical-rule discussion);
   boundary cases (advisory-judgment territory).

3. Honesty about portability: this is Bash + GNU-grep oriented, NOT
   POSIX. Docs updated. Targets: Linux CI + the 4-shell developer
   target.

4. Confirmed HTML comments + paired-edit markers ARE in scope (they're
   the proof-case that motivated this lint). No exclusion for them.

5. Wiring into CI: still advisory-not-blocking, deferred to next-step
   backlog. Per Amara: "advisory CI before strict required check."

Best Amara keeper: "The regex was not the breakthrough. The scope was.
The next breakthrough is wiring."
Co-Authored-By: Claude Opus 4.7
---
 tools/lint/no-directives-otto-prose.sh       | 49 +++++++++--
 tools/lint/no-directives-otto-prose.tests.md | 89 ++++++++++++++++++++
 2 files changed, 130 insertions(+), 8 deletions(-)
 create mode 100644 tools/lint/no-directives-otto-prose.tests.md

diff --git a/tools/lint/no-directives-otto-prose.sh b/tools/lint/no-directives-otto-prose.sh
index 41d07db7f..114454e77 100755
--- a/tools/lint/no-directives-otto-prose.sh
+++ b/tools/lint/no-directives-otto-prose.sh
@@ -20,8 +20,23 @@
 # "tiny enforcement patches are allowed when they directly prevent
 # repeated consolidation-gate violations."
 #
-# Diff-based scope (avoids retrofitting historical content):
-#   - changed files between BASE_REF and HEAD
+# PORTABILITY (per Amara round-8 honest-naming catch):
+#   This is a Bash + GNU-grep oriented advisory lint. NOT POSIX.
+#   - Uses Bash here-strings (`<<<`), `set -euo pipefail`, etc.
+#   - Uses GNU-grep `\b` word boundaries (extension, not POSIX).
+#   - Targets: Linux CI runners + the 4-shell developer target
+#     (macOS bash 3.2+ / Ubuntu / git-bash / WSL).
+#
+# SCOPE — two modes (per Amara round-8 pre-commit-vs-PR-diff catch):
+#   pr (default) — diff between BASE_REF and HEAD; matches
+#                  the CI/PR-check use case. Misses local
+#                  working-tree edits before commit.
+#   worktree     — includes unstaged + staged + committed
+#                  changes; matches the local pre-commit
+#                  use case.
+#
+# Diff-based scope (both modes — avoids retrofitting historical):
+#   - changed files (per SCOPE)
 #   - intersected with Otto-authored prose surfaces:
 #       - memory/*.md (top-level; not memory/persona/)
 #       - docs/hygiene-history/ticks/**/*.md (tick shards)
@@ -36,14 +51,19 @@
 #   - lines starting with `> ` (markdown blockquote — usually quoted
 #     third-party text)
 #   - the rule-documentation files themselves
+#   - HTML comments and paired-edit markers ARE in scope (the
+#     paired-edit comment with "directive" in MEMORY.md was the
+#     proof-case that motivated this lint).
 #
 # Usage:
-#   tools/lint/no-directives-otto-prose.sh           # advisory (warn-only)
-#   tools/lint/no-directives-otto-prose.sh --strict  # exit 1 on hits
+#   tools/lint/no-directives-otto-prose.sh                 # PR-diff advisory
+#   tools/lint/no-directives-otto-prose.sh --strict        # PR-diff strict
+#   SCOPE=worktree tools/lint/no-directives-otto-prose.sh  # local pre-commit
 #
 # Env:
-#   BASE_REF — base ref to diff against (default: origin/main).
+#   BASE_REF — base ref to diff against in pr mode (default: origin/main).
 #              CI should set BASE_REF=$BASE_SHA.
+#   SCOPE    — "pr" (default) or "worktree".
 
 set -euo pipefail
 
@@ -52,9 +72,22 @@ cd "$REPO_ROOT"
 
 MODE="${1:-advisory}"
 BASE_REF="${BASE_REF:-origin/main}"
-
-# Compute changed files between BASE_REF and HEAD.
-CHANGED_FILES=$(git diff --name-only --diff-filter=AM "$BASE_REF...HEAD" 2>/dev/null || true)
+SCOPE="${SCOPE:-pr}"
+
+# Compute changed files per SCOPE.
+if [ "$SCOPE" = "worktree" ]; then
+  # Local pre-commit: include unstaged + staged + committed-but-unpushed.
+  CHANGED_FILES=$(
+    {
+      git diff --name-only --diff-filter=AM 2>/dev/null || true
+      git diff --cached --name-only --diff-filter=AM 2>/dev/null || true
+      git diff --name-only --diff-filter=AM "$BASE_REF...HEAD" 2>/dev/null || true
+    } | sort -u
+  )
+else
+  # PR/CI: diff committed BASE_REF to HEAD.
+  CHANGED_FILES=$(git diff --name-only --diff-filter=AM "$BASE_REF...HEAD" 2>/dev/null || true)
+fi
 
 if [ -z "$CHANGED_FILES" ]; then
   echo "no-directives-otto-prose: no changed files vs $BASE_REF; skipping"
diff --git a/tools/lint/no-directives-otto-prose.tests.md b/tools/lint/no-directives-otto-prose.tests.md
new file mode 100644
index 000000000..a8f7c5ff4
--- /dev/null
+++ b/tools/lint/no-directives-otto-prose.tests.md
@@ -0,0 +1,89 @@
+# no-directives-otto-prose lint — test fixtures
+
+Reference fixtures for `tools/lint/no-directives-otto-prose.sh`.
+These are NOT runtime-loaded; they document expected behavior so
+future-Claude (or a contributor) can verify the lint catches what
+it should and skips what it shouldn't.
+
+## Cases that MUST flag (real violations)
+
+```text
+Aaron's directive elevates introspection from nice-to-have to QoL-required.
+```
+
+```text
+This section captures Aaron's QoL directive.
+```
+
+```text
+Per Aaron's directive, the loop should consolidate.
+```
+
+```text
+The maintainer directive on no-side-quests applies here.
+```
+
+```text
+human directive interpreted as: process the queue.
+```
+
+```text
+
+```
+
+(That last one is the canonical proof-case: PR #823 had this
+exact paired-edit HTML comment, the lint script existed but the
+comment slipped through. Per Amara round-8: this case must flag
+when MEMORY.md is in the changed-files set.)
+
+## Cases that MUST NOT flag (whitelist)
+
+```text
+> "Aaron said the loop should consolidate" — quoted third-party text.
+```
+
+```text
+feedback_input_is_not_directive_provenance_framing_rule_aaron_amara_2026_04_28.md
+```
+
+(Filename citations should not flag — the underscore-and-dash form
+of the rule's own name appears in many cross-references.)
+
+```text
+external directives are inputs not binding rules
+```
+
+(Historical discussion of the banned term; whitelisted via
+`feedback_free_will_is_paramount_external_directives_*` filename.)
+
+## Cases at the boundary (advisory judgment)
+
+```text
+The compiler directive `#pragma once` in C++ is unrelated.
+```
+
+(Currently flags because of the literal "directive" word, but
+the pattern requires proximity to a maintainer/agency token like
+"Aaron's" or "maintainer" or "QoL" or "human." So this should NOT
+flag in current pattern. If it does, narrow the regex further.)
+
+```text
+Aaron sent a 4-message directive to clarify the rule.
+```
+
+(Historical narration of past behavior. Currently flags.
Acceptable
+false positive — the lint is advisory, not strict, and the
+discipline is "going forward, don't author this language." Old
+narration of a past bad-event reference is borderline.)
+
+## How to run the test fixtures manually
+
+```bash
+# Snapshot fixtures into a scratch file in the changed-files scope,
+# run the lint in worktree mode, verify hits/misses match the
+# expectations above:
+SCOPE=worktree tools/lint/no-directives-otto-prose.sh
+```
+
+Promote to a real CI test (e.g., bats / shunit2) when wiring the
+lint into `.github/workflows/gate.yml` as advisory.

From 43b0139868f5b73be46427ac210ea6fb0efc3a69 Mon Sep 17 00:00:00 2001
From: Aaron Stainback
Date: Wed, 29 Apr 2026 03:42:00 -0400
Subject: [PATCH 3/4] =?UTF-8?q?fix(no-directives-lint):=206=20reviewer=20f?=
 =?UTF-8?q?ixes=20=E2=80=94=20diff-hunk=20scoping,=20mktemp=20template,=20?=
 =?UTF-8?q?error=20propagation,=20portable=20boundary,=20role-refs?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Six reviewer findings on PR #825 fixed:

1. Diff-hunk scoping (P2): replaced whole-file grep with parsing of
   git diff -U0 added-lines (^+ but not ^+++). Pre-existing
   "directive" prose in a touched file no longer flags; only
   newly-added/modified lines do.

2. mktemp portability: replaced bare mktemp with
   `mktemp -t no-directives.XXXXXX` (BSD/macOS-portable form, matches
   tools/lint/runner-version-freshness.sh precedent).

3. git-diff error propagation: removed `2>/dev/null || true`
   suppression. If BASE_REF can't be resolved, fail loudly instead of
   silently passing.

4. grep error propagation: distinguish exit status 2 (invalid regex /
   unsupported flag — hard error) from status 1 (no match — fine).
   Was suppressed and forced-success.

5. Portable word boundary: replaced `\b` (BSD grep treats as literal
   backspace; not POSIX-portable) with explicit
   `(^|[^[:alnum:]_])...([^[:alnum:]_]|$)` boundary. Matches
   tools/lint/runner-version-freshness.sh precedent.

6.
Named-attribution carve-out: tooling-surface comments replaced
direct names with role-refs ("the maintainer's directive" / "the
no-directives autonomy rule" / cross-link to memory file).
Persona-name attribution stays on history surfaces (memory files,
commit messages, tick shards) per docs/AGENT-BEST-PRACTICES.md
§named-attribution-carve-out.

Co-Authored-By: Claude Opus 4.7
---
 tools/lint/no-directives-otto-prose.sh | 131 +++++++++++++++++++------
 1 file changed, 99 insertions(+), 32 deletions(-)

diff --git a/tools/lint/no-directives-otto-prose.sh b/tools/lint/no-directives-otto-prose.sh
index 114454e77..026a7c1ab 100755
--- a/tools/lint/no-directives-otto-prose.sh
+++ b/tools/lint/no-directives-otto-prose.sh
@@ -5,20 +5,28 @@
 # in CHANGED FILES (diff-based; does not retrofit historical content).
 #
 # Born 2026-04-29 after the ~15th iteration of the same gremlin: a
-# memory file or PR title casting Aaron's correction/framing/input as
-# "Aaron's directive," which collapses self-provenance into bot-
-# execution and undermines the no-directives autonomy rule
+# memory file or PR title casting maintainer correction/framing/input
+# as "the maintainer's directive," which collapses self-provenance
+# into bot-execution and undermines the no-directives autonomy rule
 # (memory/feedback_otto_357_no_directives_aaron_makes_autonomy_first_class_accountability_mine_2026_04_27.md).
 #
 # Vigilance failed; lint is the durable answer.
 #
-# Per the "agency-framing lexeme guard" naming distinction (Amara's
-# round-7 catch): this is a LEXEME GUARD, not a LANE LOCK. Lane locks
-# stop classes of work; lexeme guards stop repeated wording drift.
+# Per the "agency-framing lexeme guard" naming distinction (this is
+# a LEXEME GUARD, not a LANE LOCK; lane locks stop classes of work,
+# lexeme guards stop repeated wording drift) — the corresponding
+# external-anchor / observer-legibility rule lives in
+# memory/feedback_beacon_promotion_load_bearing_rules_earn_external_anchors_aaron_amara_2026_04_28.md.
 #
-# Per the B-0105 consolidation-pass carve-out (Amara explicit):
-# "tiny enforcement patches are allowed when they directly prevent
-# repeated consolidation-gate violations."
+# Per the B-0105 consolidation-pass carve-out: "tiny enforcement
+# patches are allowed when they directly prevent repeated
+# consolidation-gate violations."
+#
+# (Named-attribution carve-out: tooling-surface comments use
+# role-refs per docs/AGENT-BEST-PRACTICES.md §named-attribution-
+# carve-out. Persona names + dated review rounds belong on history
+# surfaces — research notes, memory files, commit messages, tick
+# shards — not in tooling source.)
 #
 # PORTABILITY (per Amara round-8 honest-naming catch):
 #   This is a Bash + GNU-grep oriented advisory lint. NOT POSIX.
@@ -74,19 +82,20 @@ MODE="${1:-advisory}"
 BASE_REF="${BASE_REF:-origin/main}"
 SCOPE="${SCOPE:-pr}"
 
-# Compute changed files per SCOPE.
+# Compute changed files per SCOPE. NOTE: do NOT suppress git-diff
+# errors with `2>/dev/null || true` — that turns "couldn't resolve
+# BASE_REF" into "no changed files; skipping" silently and makes
+# the lint look clean when nothing was actually checked.
 if [ "$SCOPE" = "worktree" ]; then
-  # Local pre-commit: include unstaged + staged + committed-but-unpushed.
   CHANGED_FILES=$(
     {
-      git diff --name-only --diff-filter=AM 2>/dev/null || true
-      git diff --cached --name-only --diff-filter=AM 2>/dev/null || true
-      git diff --name-only --diff-filter=AM "$BASE_REF...HEAD" 2>/dev/null || true
+      git diff --name-only --diff-filter=AM
+      git diff --cached --name-only --diff-filter=AM
+      git diff --name-only --diff-filter=AM "$BASE_REF...HEAD"
     } | sort -u
   )
 else
-  # PR/CI: diff committed BASE_REF to HEAD.
-  CHANGED_FILES=$(git diff --name-only --diff-filter=AM "$BASE_REF...HEAD" 2>/dev/null || true)
+  CHANGED_FILES=$(git diff --name-only --diff-filter=AM "$BASE_REF...HEAD")
 fi
 
 if [ -z "$CHANGED_FILES" ]; then
@@ -110,35 +119,93 @@ if [ -z "$PROSE_FILES" ]; then
   exit 0
 fi
 
-# The narrowed pattern (per Amara): only flag where "directive" is
-# proximate to a maintainer/agency-framing token.
-PATTERN='\b(Aaron'\''?s|maintainer|QoL|human).*directive\b|\bdirective.*(Aaron|maintainer|QoL|human)\b'
+# Pattern: portable explicit non-alpha boundary instead of `\b`
+# (BSD grep treats `\b` as literal backspace; not POSIX-portable
+# even with `-E`). Same approach as tools/lint/runner-version-
+# freshness.sh.
+PATTERN='(^|[^[:alnum:]_])(Aaron'\''?s|maintainer|QoL|human)[^|]*directive([^[:alnum:]_]|$)|(^|[^[:alnum:]_])directive[^|]*(Aaron|maintainer|QoL|human)([^[:alnum:]_]|$)'
+
+# Use mktemp with explicit template — bare `mktemp` fails on
+# some BSD/macOS configurations (see tools/lint/runner-version-
+# freshness.sh for prior precedent).
+HITS_FILE="$(mktemp -t no-directives.XXXXXX)"
+FILTERED_HITS_FILE="$(mktemp -t no-directives-filtered.XXXXXX)"
+trap 'rm -f "$HITS_FILE" "$FILTERED_HITS_FILE"' EXIT
 
-HITS_FILE="$(mktemp)"
-trap 'rm -f "$HITS_FILE"' EXIT
+# Search only the ADDED/MODIFIED diff hunks (not entire file
+# bodies) — pre-existing "Aaron's directive" in a touched file
+# should not flag; only newly-added or modified lines should.
+# Use `git diff -U0` to strip context lines + grep for `^+` to
+# isolate added lines (not `^+++` file headers).
+ADDED_LINES_FILE="$(mktemp -t no-directives-added.XXXXXX)"
+trap 'rm -f "$HITS_FILE" "$FILTERED_HITS_FILE" "$ADDED_LINES_FILE"' EXIT
 
-# Search only the changed prose files.
+# Build a diff-of-additions for each prose file, with `path:line:content`.
 while IFS= read -r f; do
   [ -z "$f" ] && continue
   [ -f "$f" ] || continue
-  grep -nEH "$PATTERN" "$f" 2>/dev/null >> "$HITS_FILE" || true
+  if [ "$SCOPE" = "worktree" ]; then
+    DIFF=$(git diff -U0 -- "$f"; git diff --cached -U0 -- "$f"; git diff -U0 "$BASE_REF...HEAD" -- "$f")
+  else
+    DIFF=$(git diff -U0 "$BASE_REF...HEAD" -- "$f")
+  fi
+  # Parse the diff: track @@ ... +start,len @@ headers to compute
+  # line numbers, then emit "f:line:content" for each `^+` content
+  # line.
+  printf '%s\n' "$DIFF" | awk -v file="$f" '
+    /^@@/ {
+      # Match "+start,len" or "+start" in the hunk header.
+      match($0, /\+([0-9]+)(,[0-9]+)?/, m)
+      lineno = m[1] - 1
+      next
+    }
+    /^\+[^+]/ {
+      lineno++
+      content = substr($0, 2)
+      printf "%s:%d:%s\n", file, lineno, content
+    }
+    /^\+$/ {
+      lineno++
+    }
+    /^[^+-]/ {
+      lineno++
+    }
+  ' >> "$ADDED_LINES_FILE" || true
 done <<< "$PROSE_FILES"
 
-# Filter out blockquote lines (`> ...` quoted third-party text).
-FILTERED_HITS_FILE="$(mktemp)"
-trap 'rm -f "$HITS_FILE" "$FILTERED_HITS_FILE"' EXIT
+# Grep added lines for the pattern. Treat `grep` exit status 2 as
+# a hard error (invalid regex / unsupported flag); status 1 is
+# "no match" and is fine.
+set +e
+grep -nE "$PATTERN" "$ADDED_LINES_FILE" > "$HITS_FILE"
+GREP_RC=$?
+set -e
+if [ "$GREP_RC" -gt 1 ]; then
+  echo "no-directives-otto-prose: grep failed with status $GREP_RC (invalid regex or unsupported flag)" >&2
+  exit 2
+fi
 
-grep -vE ':[[:space:]]*>' "$HITS_FILE" > "$FILTERED_HITS_FILE" || true
+# Filter out blockquote lines (`> ...` quoted third-party text).
+# The added lines are formatted "path:N:content" — match against
+# the content portion only. Use awk to split on the third `:`.
+awk -F: '{
+  # Reconstruct content after the second colon (file:line:content).
+  content = $0
+  sub(/^[^:]*:[^:]*:/, "", content)
+  # Skip lines whose content starts with optional whitespace then `>`.
+  if (content !~ /^[[:space:]]*>/) print
+}' "$HITS_FILE" > "$FILTERED_HITS_FILE"
 
 HIT_COUNT=$(wc -l < "$FILTERED_HITS_FILE" | tr -d ' ')
 
 if [ "$HIT_COUNT" -gt 0 ]; then
-  echo "no-directives-otto-prose: found $HIT_COUNT candidate hit(s) in changed Otto-prose:" >&2
+  echo "no-directives-otto-prose: found $HIT_COUNT candidate hit(s) in added Otto-prose lines:" >&2
   cat "$FILTERED_HITS_FILE" >&2
   echo "" >&2
-  echo "Otto's prose should not frame Aaron's input as 'Aaron's directive' /" >&2
-  echo "'maintainer directive' / 'QoL directive' / 'human directive'." >&2
-  echo "Use 'input' / 'framing' / 'correction' / 'pass' instead." >&2
+  echo "Prose framing maintainer input as 'directive' (Aaron's directive /" >&2
+  echo "maintainer directive / QoL directive / human directive) collapses" >&2
+  echo "self-provenance into bot-execution. Use 'input' / 'framing' /" >&2
+  echo "'correction' / 'pass' instead."
>&2
   echo "See memory/feedback_otto_357_no_directives_aaron_makes_autonomy_first_class_accountability_mine_2026_04_27.md" >&2
   if [ "$MODE" = "--strict" ]; then
     exit 1
@@ -147,4 +214,4 @@ if [ "$HIT_COUNT" -gt 0 ]; then
   exit 0
 fi
 
-echo "no-directives-otto-prose: clean (0 candidate hits in changed Otto-prose surfaces)"
+echo "no-directives-otto-prose: clean (0 candidate hits in added Otto-prose lines)"

From b30d5d0498d65b10f3acbee35dd763675e737f37 Mon Sep 17 00:00:00 2001
From: Aaron Stainback
Date: Wed, 29 Apr 2026 04:13:00 -0400
Subject: [PATCH 4/4] =?UTF-8?q?fix(no-directives-lint):=20round-12=20?=
 =?UTF-8?q?=E2=80=94=20diff-parser=20rewrite=20(FILECONTENT)=20+=20mi?=
 =?UTF-8?q?ssing=20fixtures?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Round-12 review (12 unresolved threads, 3 P0 + 5 P1 + 4 P2/outdated)
addressed via two coordinated changes:

1. Diff-parser rewrite (.sh) — TAB-delimited FILECONTENT stream

The prior implementation grepped against "path:line:content" output
from `grep -nE`, which created two real bugs:

- PATTERN matched filename substrings (e.g.
  `feedback_human_lineage_*.md` produced false positives on the
  "human" token in the regex even when content was clean).
- The blockquote whitelist + awk lineno tracker assumed the wrong
  field-count/format after `-n` was added, silently breaking both
  filters.

New shape: per-file `git diff -U0` is parsed by awk into a single
`FILE\tCONTENT` line per added hunk-line. TAB is safe because no
prose paths under memory/, docs/, .github/ contain literal TABs.
Pattern-matching now runs against the CONTENT field only, killing the
filename-substring false-positive class entirely.

Diff metadata (\\ No newline, @@, ---, +++, -lines) is filtered
before emission. `|| true` removed from awk so genuine awk failures
propagate.

PATTERN uses generic `[A-Z][a-z]+'s` rather than literal `Aaron's`.
Filter changed from `--diff-filter=AM` to `--diff-filter=AMR` so that
renamed prose files with new violations are included in
CHANGED_FILES.

Portability docstring updated to match implementation (no `\b`;
POSIX-portable explicit non-alpha boundaries).

Persona names removed from script comments per named-attribution
carve-out (tooling-surface comments use role-refs; persona names
belong on history surfaces — research notes, memory files, commit
messages, tick shards).

2. Test fixtures (.tests.md) — framing note + 2 missing fixtures

Added "Test-input vs authorial register" framing note explaining why
fixtures retain canonical drift strings ("Aaron's directive") rather
than substituting role-refs. Test-input register IS the data the lint
detects; substituting "the maintainer's directive" would test a
different regex alternative and silently lose coverage of the
`[A-Z][a-z]+'s + directive` shape (the canonical real-world drift).

Added fixture: renamed file with new violation in renamed copy
(covers --diff-filter=AMR).

Added fixture: filename contains regex token (`human`) but added
content is clean — must NOT flag (covers the round-12 P0 false-
positive fix).

Verified bash -n clean; both PR-mode and SCOPE=worktree lint runs
report "no Otto-prose surfaces changed; skipping" on this commit
(tooling-only — tests.md is whitelisted via the
`no-directives-otto-prose` substring match in the file-filter).

Closes-threads: 3 P0 (grep-format/anchoring/blockquote), 5 P1 (awk-lineno/error-suppression/role-refs/--diff-filter-AM/named-attribution), 4 P2/outdated (BASE_REF-fail/staged-snapshot/blockquote-whitelist-parsing/ docstring-portability).
Co-Authored-By: Claude Opus 4.7
---
 tools/lint/no-directives-otto-prose.sh       | 130 +++++++++++--------
 tools/lint/no-directives-otto-prose.tests.md |  64 ++++++++-
 2 files changed, 134 insertions(+), 60 deletions(-)

diff --git a/tools/lint/no-directives-otto-prose.sh b/tools/lint/no-directives-otto-prose.sh
index 026a7c1ab..886d38020 100755
--- a/tools/lint/no-directives-otto-prose.sh
+++ b/tools/lint/no-directives-otto-prose.sh
@@ -28,14 +28,16 @@
 # surfaces — research notes, memory files, commit messages, tick
 # shards — not in tooling source.)
 #
-# PORTABILITY (per Amara round-8 honest-naming catch):
-#   This is a Bash + GNU-grep oriented advisory lint. NOT POSIX.
+# PORTABILITY:
+#   This is a Bash + GNU-tooling-leaning advisory lint. NOT POSIX.
 #   - Uses Bash here-strings (`<<<`), `set -euo pipefail`, etc.
-#   - Uses GNU-grep `\b` word boundaries (extension, not POSIX).
+#   - The PATTERN uses POSIX-portable explicit non-alpha boundaries
+#     `(^|[^[:alnum:]_])` rather than `\b` (BSD grep treats `\b` as
+#     a literal backspace; `\b` is non-portable even with `-E`).
 #   - Targets: Linux CI runners + the 4-shell developer target
 #     (macOS bash 3.2+ / Ubuntu / git-bash / WSL).
 #
-# SCOPE — two modes (per Amara round-8 pre-commit-vs-PR-diff catch):
+# SCOPE — two modes:
 #   pr (default) — diff between BASE_REF and HEAD; matches
 #                  the CI/PR-check use case. Misses local
 #                  working-tree edits before commit.
@@ -87,15 +89,19 @@ SCOPE="${SCOPE:-pr}"
 # BASE_REF" into "no changed files; skipping" silently and makes
 # the lint look clean when nothing was actually checked.
 if [ "$SCOPE" = "worktree" ]; then
+  # AMR catches added + modified + renamed (so a renamed prose
+  # file with new violations is included). Copies are intentionally
+  # omitted here (no need for content-equivalence in this lint;
+  # only added-line discipline matters).
CHANGED_FILES=$( { - git diff --name-only --diff-filter=AM - git diff --cached --name-only --diff-filter=AM - git diff --name-only --diff-filter=AM "$BASE_REF...HEAD" + git diff --name-only --diff-filter=AMR + git diff --cached --name-only --diff-filter=AMR + git diff --name-only --diff-filter=AMR "$BASE_REF...HEAD" } | sort -u ) else - CHANGED_FILES=$(git diff --name-only --diff-filter=AM "$BASE_REF...HEAD") + CHANGED_FILES=$(git diff --name-only --diff-filter=AMR "$BASE_REF...HEAD") fi if [ -z "$CHANGED_FILES" ]; then @@ -122,8 +128,10 @@ fi # Pattern: portable explicit non-alpha boundary instead of `\b` # (BSD grep treats `\b` as literal backspace; not POSIX-portable # even with `-E`). Same approach as tools/lint/runner-version- -# freshness.sh. -PATTERN='(^|[^[:alnum:]_])(Aaron'\''?s|maintainer|QoL|human)[^|]*directive([^[:alnum:]_]|$)|(^|[^[:alnum:]_])directive[^|]*(Aaron|maintainer|QoL|human)([^[:alnum:]_]|$)' +# freshness.sh. Note: persona names that appear here are TOKENS +# the lint LOOKS FOR in flagged content (not authorial content); +# this is the data of the lint, not its prose register. +PATTERN='(^|[^[:alnum:]_])(maintainer|QoL|human)[^|]*directive([^[:alnum:]_]|$)|(^|[^[:alnum:]_])directive[^|]*(maintainer|QoL|human)([^[:alnum:]_]|$)|(^|[^[:alnum:]_])([A-Z][a-z]+'\''?s)[[:space:]]+directive([^[:alnum:]_]|$)|(^|[^[:alnum:]_])directive[[:space:]]+from[[:space:]]+([A-Z][a-z]+)([^[:alnum:]_]|$)' # Use mktemp with explicit template — bare `mktemp` fails on # some BSD/macOS configurations (see tools/lint/runner-version- @@ -140,61 +148,71 @@ trap 'rm -f "$HITS_FILE" "$FILTERED_HITS_FILE"' EXIT ADDED_LINES_FILE="$(mktemp -t no-directives-added.XXXXXX)" trap 'rm -f "$HITS_FILE" "$FILTERED_HITS_FILE" "$ADDED_LINES_FILE"' EXIT -# Build a diff-of-additions for each prose file, with `path:line:content`. +# Build added-lines stream "FILE\tCONTENT\n" per added line, where +# FILE is the prose-file path and CONTENT is the post-strip line +# body (no leading `+`). 
TAB is chosen because filenames cannot
+# contain TAB inside this codebase, so it's a safe delimiter and
+# avoids the previous ":"-based 4-field confusion that caused
+# blockquote-filter false negatives + filename-substring matches.
+#
+# Single awk pass per file:
+#   - skip diff metadata (@@, +++, --- headers, "\ No newline")
+#   - emit only `^+` content lines (with leading `+` stripped)
+# Then pattern-match + blockquote-filter on CONTENT field only,
+# eliminating the file-path-contains-"human"/"maintainer" false-
+# positive class entirely.
 while IFS= read -r f; do
   [ -z "$f" ] && continue
   [ -f "$f" ] || continue
   if [ "$SCOPE" = "worktree" ]; then
-    DIFF=$(git diff -U0 -- "$f"; git diff --cached -U0 -- "$f"; git diff -U0 "$BASE_REF...HEAD" -- "$f")
-  else
-    DIFF=$(git diff -U0 "$BASE_REF...HEAD" -- "$f")
-  fi
-  # Parse the diff: track @@ ... +start,len @@ headers to compute
-  # line numbers, then emit "f:line:content" for each `^+` content
-  # line.
-  printf '%s\n' "$DIFF" | awk -v file="$f" '
-    /^@@/ {
-      # Match "+start,len" or "+start" in the hunk header.
-      match($0, /\+([0-9]+)(,[0-9]+)?/, m)
-      lineno = m[1] - 1
-      next
+    {
+      git diff -U0 -- "$f"
+      git diff --cached -U0 -- "$f"
+      git diff -U0 "$BASE_REF...HEAD" -- "$f"
     }
-    /^\+[^+]/ {
-      lineno++
+  else
+    git diff -U0 "$BASE_REF...HEAD" -- "$f"
+  fi | awk -v file="$f" '
+    # Skip "\ No newline at end of file" — diff metadata, not a
+    # real file line. Skip hunk headers, file headers, and -lines.
+    /^\\ No newline/ { next }
+    /^@@/ { next }
+    /^---/ { next }
+    /^\+\+\+/ { next }
+    /^-/ { next }
+    /^\+/ {
       content = substr($0, 2)
-      printf "%s:%d:%s\n", file, lineno, content
+      # Emit FILE\tCONTENT (TAB-delimited; safe because prose
+      # paths under memory/, docs/, .github/ never contain TABs).
+      printf "%s\t%s\n", file, content
     }
-    /^\+$/ {
-      lineno++
-    }
-    /^[^+-]/ {
-      lineno++
-    }
-  ' >> "$ADDED_LINES_FILE" || true
+  ' >> "$ADDED_LINES_FILE"
 done <<< "$PROSE_FILES"
 
-# Grep added lines for the pattern.
Treat `grep` exit status 2 as -# a hard error (invalid regex / unsupported flag); status 1 is -# "no match" and is fine. -set +e -grep -nE "$PATTERN" "$ADDED_LINES_FILE" > "$HITS_FILE" -GREP_RC=$? -set -e -if [ "$GREP_RC" -gt 1 ]; then - echo "no-directives-otto-prose: grep failed with status $GREP_RC (invalid regex or unsupported flag)" >&2 - exit 2 -fi - -# Filter out blockquote lines (`> ...` quoted third-party text). -# The added lines are formatted "path:N:content" — match against -# the content portion only. Use awk to split on the third `:`. -awk -F: '{ - # Reconstruct content after the second colon (file:line:content). - content = $0 - sub(/^[^:]*:[^:]*:/, "", content) - # Skip lines whose content starts with optional whitespace then `>`. - if (content !~ /^[[:space:]]*>/) print -}' "$HITS_FILE" > "$FILTERED_HITS_FILE" +# Pattern-match + blockquote-filter on CONTENT field only. +# Anchoring the match to the content portion (post-TAB) prevents +# filename-substring false positives like +# `feedback_human_lineage_anchors_*.md` matching the "human" +# token in PATTERN. Also drops blockquote-prefixed CONTENT +# (quoted third-party text); the FILE field never starts with +# `>` so filtering on content alone is correct. +# +# Pattern-match runs inside awk so we can apply it to the +# CONTENT field after the TAB. awk's regex engine accepts ERE +# without `\b` (we use the same explicit-non-alpha boundary +# approach as the PATTERN variable). +awk -F'\t' -v pattern="$PATTERN" ' +NF >= 2 { + file = $1 + content = $2 + # Drop blockquote-prefixed quoted text. + if (content ~ /^[[:space:]]*>/) next + # Apply pattern against CONTENT only. 
+ if (content ~ pattern) { + printf "%s: %s\n", file, content + } +} +' "$ADDED_LINES_FILE" > "$FILTERED_HITS_FILE" HIT_COUNT=$(wc -l < "$FILTERED_HITS_FILE" | tr -d ' ') diff --git a/tools/lint/no-directives-otto-prose.tests.md b/tools/lint/no-directives-otto-prose.tests.md index a8f7c5ff4..786a05be0 100644 --- a/tools/lint/no-directives-otto-prose.tests.md +++ b/tools/lint/no-directives-otto-prose.tests.md @@ -2,8 +2,36 @@ Reference fixtures for `tools/lint/no-directives-otto-prose.sh`. These are NOT runtime-loaded; they document expected behavior so -future-Claude (or a contributor) can verify the lint catches what -it should and skips what it shouldn't. +a contributor can verify the lint catches what it should and skips +what it shouldn't. + +## Test-input vs authorial register (named-attribution carve-out) + +The fixtures below deliberately retain canonical real-world drift +instances ("Aaron's directive", "QoL directive", "maintainer +directive", "human directive") because they ARE the data the lint +detects. Per the named-attribution carve-out for tooling test +surfaces (see `docs/AGENT-BEST-PRACTICES.md` §named-attribution): + +- **Authorial register** (this file's prose, the lint's output + strings, script comments) uses role-refs ("the maintainer", "the + contributor"). The naming rule applies *here*. +- **Test-input register** (the ```text``` blocks below) preserves + the exact strings that drifted in real prose, so the lint's + pattern-coverage is honest about what it catches. + +Replacing "Aaron's directive" in fixtures with "the maintainer's +directive" would test a different regex alternative +(`(maintainer|QoL|human)[^|]*directive`) and silently lose coverage +of the `[A-Z][a-z]+'s + directive` alternative — the canonical +real-world drift shape that motivated the lint in the first place. + +If the lint's purpose is to catch the canonical drift, the fixtures +must contain the canonical drift strings. 
+ +This file is whitelisted in the lint scope itself +(`no-directives-otto-prose` substring match on line 121 of the +script), so adding new fixtures here will not trigger the lint. ## Cases that MUST flag (real violations) @@ -33,8 +61,21 @@ human directive interpreted as: process the queue. (That last one is the canonical proof-case: PR #823 had this exact paired-edit HTML comment, the lint script existed but the -comment slipped through. Per Amara round-8: this case must flag -when MEMORY.md is in the changed-files set.) +comment slipped through. Per round-8: this case must flag when +MEMORY.md is in the changed-files set.) + +### Renamed file with new violation (R covered by --diff-filter=AMR) + +```text +# Git scenario (cannot be inlined as a single fixture block): +# git mv memory/feedback_old_name.md memory/feedback_new_name.md +# git diff added line in renamed file: + Per Aaron's directive ... +# +# Expected: lint MUST flag — the round-12 fix changed the filter +# from --diff-filter=AM to --diff-filter=AMR so renamed prose +# files are included in CHANGED_FILES and their added lines are +# scanned. Pre-round-12 this case was silently missed. +``` ## Cases that MUST NOT flag (whitelist) @@ -56,6 +97,21 @@ external directives are inputs not binding rules (Historical discussion of the banned term; whitelisted via `feedback_free_will_is_paramount_external_directives_*` filename.) +### Filename contains a regex token but content is clean + +```text +# File path: memory/feedback_human_lineage_anchors_load_bearing_2026_04_29.md +# Added line: the lineage anchor is preserved per the engineering claim +# +# Expected: lint MUST NOT flag — the round-12 fix moved +# pattern-matching off of grep's "path:line:content" output and +# onto the CONTENT field of a TAB-delimited "FILE\tCONTENT" +# stream. Pre-round-12, the pattern matched the filename's +# "human" token even though the actual added content was clean. 
+# This was the canonical false-positive class motivating the +# rewrite (P0 thread on PR #825 round-12). +``` + ## Cases at the boundary (advisory judgment) ```text