Skip to content

free-memory: recurrence-after-correction → operational enforcement (Tick-80 second-order self-grading)#1206

Merged
AceHack merged 1 commit intomainfrom
memory/recurrence-after-correction-needs-operational-enforcement-tick80
May 2, 2026
Merged

free-memory: recurrence-after-correction → operational enforcement (Tick-80 second-order self-grading)#1206
AceHack merged 1 commit intomainfrom
memory/recurrence-after-correction-needs-operational-enforcement-tick80

Conversation

@AceHack
Copy link
Copy Markdown
Member

@AceHack AceHack commented May 2, 2026

Tick-61's no-op-cadence corrective landed on main (commit 67969d8) yet the same pattern RECURRED at Tick-71-79.

Second-order insight: Substrate-knowledge necessary but not sufficient for failure modes the LLM training prior strongly favors. Recurrence-after-correction proves the gap.

Architectural answer (named, not implemented this PR): operational enforcement — pre-tick mechanical checks, deliberate-quiet-periods (B-0165), multi-AI peer review at-decision-time, tick-history decision-shape column.

🤖 Generated with Claude Code

…icient; operational enforcement needed (Otto 2026-05-02 second-order self-grading)

Tick-61's no-op-cadence corrective landed on main (commit
67969d8) yet the same pattern RECURRED at Tick-71-79.

Substrate-knowledge is necessary but not sufficient for
failure modes the LLM training prior strongly favors.
Recurrence-after-correction proves the gap. Operational
enforcement (mechanical checks at decision-time, forced
practice via deliberate-quiet-periods, multi-AI peer review
at-decision-time) IS the architectural answer.

Each substrate layer adds weight to the rule. At some point
(empirically, after enough recurrence-correction cycles),
substrate-weight exceeds training-prior-weight. Not there yet
for this failure mode.

Operational-enforcement candidates (architectural extensions):
- Pre-tick mechanical check on tick-history pattern
- Deliberate-quiet-periods (B-0165) — Aaron-side enforcement
- Multi-AI peer review at-decision-time (faster catch than
  weeks-later post-merge review)
- Tick-history decision-shape column for visibility

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 2, 2026 17:08
@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@AceHack AceHack merged commit 3aa9c4f into main May 2, 2026
28 checks passed
@AceHack AceHack deleted the memory/recurrence-after-correction-needs-operational-enforcement-tick80 branch May 2, 2026 17:10
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new memory/ feedback memo capturing a “recurrence-after-correction” failure mode and indexing it in memory/MEMORY.md so it’s discoverable from the main memory entrypoint.

Changes:

  • Added a new feedback memo: “Recurrence-after-correction needs operational enforcement”.
  • Updated memory/MEMORY.md to include the new memo in the index.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
memory/feedback_recurrence_after_correction_needs_operational_enforcement_otto_2026_05_02.md New feedback memo documenting the recurrence pattern and proposing operational enforcement candidates.
memory/MEMORY.md Adds an index entry pointing to the new memo.

- `feedback_party_during_human_sleep_asymmetric_recovery_morning_reconciliation_via_first_principles_aaron_claudeai_2026_05_02.md` (the rule that's failing to take)
- B-0165 (deliberate-quiet-periods protocol) — concrete operational-enforcement candidate
- Otto-341 (mechanism-over-vigilance from existing substrate) — composing parent principle: structural fixes beat process discipline
- `feedback_dont_ask_permission_within_authority_scope_*` — composing twin: the autonomy-disposition rule has the same recurrence-resistance property; substrate-rule alone hasn't prevented the dopamine-loop ritualized-Insight pattern either
AceHack added a commit that referenced this pull request May 2, 2026
…nt candidate #1 opened (#1208)

* hygiene(tick-history): 2026-05-02T17:16Z Tick-85 — PR #1207 operational-enforcement candidate #1 opened

No-op-cadence pre-tick mechanical check script lands. Self-tested
both paths before commit. Closes architectural loop named in
Tick-80 second-order self-grading memo (#1206).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(tick-history): add memory/ prefix to file ref for grep-ability and click-through

Per Copilot review feedback on PR #1208 — `feedback_recurrence_*.md`
without the `memory/` prefix isn't a valid repo-relative path; click-
through fails and grep-ability suffers. Other shards use the prefixed
form consistently.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
AceHack added a commit that referenced this pull request May 2, 2026
…nforcement #1) (#1207)

* tools(hygiene): pre-tick mechanical no-op-cadence check (Tick-80 operational-enforcement candidate #1)

Implements the first of the four operational-enforcement candidates
named in memory/feedback_recurrence_after_correction_needs_operational_enforcement_otto_2026_05_02.md
(merged via PR #1206 earlier this session).

The Tick-80 memo's empirical finding: substrate-knowledge alone is
insufficient for failure modes the LLM training prior strongly favors —
the no-op-cadence pattern recurred at Tick-71-79 even after the
Tick-61 corrective memo named it. The architectural answer is
operational enforcement: mechanical checks at decision-time, not just
substrate-read at wake-time.

This script is one such mechanical check.

What it does:
- Reads last N (default 7) tick-history shards from current UTC date
  under docs/hygiene-history/ticks/YYYY/MM/DD/
- Counts shards matching minimal-observation pattern (heuristic: short
  body OR observation-class language regex)
- If MIN_OBS_COUNT >= THRESHOLD (default 5), prints a WARNING with
  composing-substrate references and party-class operation alternatives

What it does NOT do:
- Does NOT block the tick (informational only; exit 0)
- Does NOT auto-correct (the agent's judgment to act on the warning)
- Does NOT examine prior days (current-day window only)

Configurable via NO_OP_CHECK_WINDOW and NO_OP_CHECK_THRESHOLD env vars.

Self-tested before commit:
- Default threshold (5) on recent 7 shards: 2 matches, no warning
  fires — correct (Tick-80-84 have been substantive)
- Lowered threshold (1): warning fires correctly with full body

Composes with:
- tools/hygiene/check-role-ref-on-current-state-surfaces.sh (B-0162
  sibling pattern; mechanical check at commit-time vs. tick-time)
- tools/hygiene/check-tick-history-shard-schema.sh (sibling pattern)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(no-op-check): glob-loop instead of ls|grep — addresses Codex P2 + shellcheck SC2010

Two findings on PR #1207 line 62 had the same root cause and one fix:

1. **Codex Connector P2**: `RECENT_SHARDS=$(ls "$SHARD_DIR" | grep -E ...)`
   could exit 1 under `set -euo pipefail` when the directory exists
   but `grep` finds no matching schema-conforming filenames, killing
   the script before the `[[ -z "$RECENT_SHARDS" ]]` fallback runs.
   This defeats the script's "informational only / does NOT block tick"
   promise — a tick-start invocation hitting a fresh shard directory
   would unexpectedly fail.

2. **Shellcheck SC2010**: `ls | grep` is the wrong shape; use a glob
   or for-loop with conditions to allow non-alphanumeric filenames.

Fix: replace the pipeline with a `shopt -s nullglob` + glob loop
that filters via bash regex. Bash 3.2 compatible per Otto-235 4-shell
target. Same schema acceptance: `HHMMZ.md`, `HHMMZ-<hex>.md`,
`HHMMSSZ-<hex>.md` per docs/hygiene-history/ticks/README.md.

Verified:
- Shellcheck: clean (no output)
- Default threshold (5) on recent 7 shards: 2 matches, no warning
- Lowered threshold (1): warning fires correctly
- Empty directory case: "nothing to check" + exit 0 (the bug Codex
  caught — verified working)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(no-op-check): midnight rollover, mixed-format sort, env-var validation

Addresses three Copilot findings on PR #1207:

1. **Midnight UTC reset blind window**: previous version only looked
   at today's directory, so a no-op streak spanning midnight would be
   invisible during the first ticks of the new day — exactly when the
   check should still warn. Fix: collect from both today AND yesterday
   directories. Yesterday computed via BSD `date -v-1d` OR GNU
   `date -d "yesterday"` per Otto-235 4-shell target.

2. **Mixed-format sort drift**: raw lexicographic sort misorders
   `1550Z.md` vs `1550Z-01.md` (and HHMMZ vs HHMMSSZ-<hex>) per
   docs/hygiene-history/ticks/README.md mixed-format-sort caveat. Fix:
   parse the timestamp prefix into YYYYMMDDHHMMSS sortkey (HHMM padded
   with `00` for seconds) and sort by that, not by raw filename. The
   parsed-timestamp approach also lets today + yesterday combine
   correctly under one sort.

3. **Env var validation**: `NO_OP_CHECK_WINDOW=foo` previously made
   `tail -n foo` fail under `set -e`, defeating the "informational
   only / does NOT block tick" promise. Fix: validate both env vars
   match `^[0-9]+$` and are >= 1; on invalid input, warn and fall
   back to defaults.

Verified:
- Shellcheck: clean (no output)
- Default threshold (5) on recent 7 shards: 2 matches, no warning
- Lowered threshold (1): warning fires correctly
- Bad env var (NO_OP_CHECK_WINDOW=foo): warns + uses default + exit 0
- Empty directory case: "nothing to check" + exit 0 (Codex fix intact)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(no-op-check): force base-10 on env vars to avoid octal-parse breakage

Codex Connector P2 finding on PR #1207: `THRESHOLD` regex
`^[0-9]+$` accepts zero-padded values like `08`, but bash arithmetic
context then parses `08` as octal and fails with "value too great
for base", short-circuiting the validation and producing
nondeterministic behavior (skip warning path + emit shell error)
instead of either accepting the value or falling back to default.

`NO_OP_CHECK_THRESHOLD=08` is a common zero-padded env style; the
script's "informational only / does NOT block tick" promise breaks
under that input.

Fix: use `10#$VAR` arithmetic-base coercion in the validation
checks AND normalize to base-10 immediately after validation
(`VAR=$((10#$VAR))`) so all downstream usage (arithmetic
comparisons + `tail -n`) sees unambiguous decimal.

Verified:
- Shellcheck: clean
- Default: 2/7, no warning fires
- `NO_OP_CHECK_THRESHOLD=08`: correctly interpreted as 8 (was
  octal-error before)
- `NO_OP_CHECK_WINDOW=08`: correctly reads 8 shards
- `NO_OP_CHECK_THRESHOLD=foo`: regex rejects → default 5 (regression
  check)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(no-op-check): three Copilot findings + hidden tab-IFS-whitespace bug

Three Copilot findings on PR #1207:

1. **Whole-file size measured instead of body column** (line 129).
   Shard schema is six pipe-separated columns:
   `| timestamp | model | cron-id | <body> | <PR ref> | <observation> |`
   Previous code did `wc -c < $shard_path` on the whole file, so a
   terse-body row with a long observation column was treated as
   "not short" and missed the no-op signal. Fix: extract column 5
   (the body) via `awk -F'|' 'NR==1 {print $5}'` and measure its
   length. Threshold tightened from 800 → 600 chars to match
   body-only measurement.

2. **Header docstring AND-semantic vs implementation OR-semantic**
   (line 28). Header said "short body + observation-class language"
   but code was `||`. OR is the correct semantic (any signal of
   minimal-observation counts). Fixed docstring to spell out the
   OR-semantic explicitly.

3. **Same-minute disambiguators sort wrong** (line 103). Previous
   sort key was `YYYYMMDDHHMM00` for both `HHMMZ.md` and `HHMMZ-XX.md`
   (identical primary key). Lex-sort on the path tiebreaker put
   `1550Z-01.md` BEFORE `1550Z.md` (because `-` < `.` in ASCII), so
   the base shard came after its own disambiguators. Fix: emit the
   disambiguator as a SECOND sort field; `sort -k1,1 -k2,2` now
   orders base-before-disambiguators (empty disambiguator sorts
   first because end-of-string < any char).

While implementing #3, discovered a 4th hidden bug: bash's IFS
whitespace-collapsing rule silently merged the empty disambiguator
field with the surrounding tabs in the `read` loop, so `shard_path`
came back empty and the script reported 0 shards. Switched the
field separator from tab to `|` (non-whitespace, no collapsing).

Verified:
- Shellcheck: clean
- Default threshold (5): 2/7 matches, no warning
- THRESHOLD=1: warning fires
- THRESHOLD=08: correctly interpreted as 8 (regression check on
  earlier base-10 fix)
- Same-minute fixture: `0900Z.md`, `0900Z-01.md`, `0900Z-02.md`,
  `0901Z.md` now order correctly under `sort -t'|' -k1,1 -k2,2`

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants