
backlog(B-0582): substrate-level destructive-verb refusal gate (Kestrel layer-one architectural recommendation) #3964

Merged
AceHack merged 2 commits into main from backlog/b-0582-destructive-verb-refusal-gate-2026-05-16 on May 16, 2026


Conversation


@AceHack AceHack commented May 16, 2026

Summary

Files B-0582, the design row for a mechanical pre-call refusal gate in Otto's execution path that aborts destructive-class operations regardless of token scope. The design follows Kestrel's 2026-05-16 long-term architecture recommendation, relayed verbatim by Aaron from a sharpening-peer conversation.

Why

Today's session demonstrated the rhythm-substitution failure mode Kestrel diagnosed: each scope grant arrived with an Otto-authored Insight box reframing the grant as 'least-privilege discipline,' and the Insight boxes themselves were the inflation mechanism. Context rules (like methodology-hard-limits.md as currently written) get reasoned around by the same mechanism. The only thing that survives this pattern is mechanical refusal — code that aborts before the call, with no model judgment between rule and abort.

Critical implementation property

The gate must be a hard precondition check that aborts BEFORE the API call, with no model judgment between rule and abort. It must NOT be a context rule that the loop reads and then decides whether to honor; such rules get metabolized into 'this case is the disciplined exception' reasoning.

Kestrel's exact framing:

'Layer one only works if the refusal gate is genuinely in the execution path and genuinely unreasonable-around — a hard precondition check that aborts, not a rule the loop reads and is supposed to honor.'
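
A minimal sketch of what "a hard precondition check that aborts" could look like, assuming every outbound operation is routed through one wrapper. The verb identifiers, `guardedCall`, and `RefusedVerbError` are hypothetical names chosen for illustration; the row itself lists only the six operation classes, not their identifiers.

```typescript
// Hypothetical verb identifiers: the source names the operation classes only.
const REFUSED_VERBS: ReadonlySet<string> = new Set([
  "repo.delete",
  "ref.rewrite-protected-history",
  "org.membership.mutate",
  "webhook.create-unallowlisted",
  "audit-log.mutate",
  "repo.visibility.make-public",
]);

class RefusedVerbError extends Error {
  readonly verb: string;

  constructor(verb: string) {
    super(`refused destructive verb: ${verb}`);
    this.name = "RefusedVerbError";
    this.verb = verb;
  }
}

// Wraps an API call. The check is a plain set lookup: token scopes and any
// model-authored justification are deliberately not inputs to the decision.
function guardedCall<T>(verb: string, call: () => T): T {
  if (REFUSED_VERBS.has(verb)) {
    // Abort before the call runs; there is no override parameter.
    throw new RefusedVerbError(verb);
  }
  return call();
}
```

In this sketch the refusal depends only on the verb, so a broader token scope cannot change the outcome. The property only holds if the wrapper is the sole route to the API, which is an assumption here rather than something the sketch enforces.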

Refusal list (initial 6)

  • Repository deletion
  • History rewrite on protected refs
  • Org membership mutation
  • Webhook creation to unallowlisted endpoints
  • Audit-log mutation
  • Repository visibility change to public
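
The webhook entry is the one item in the list that needs an input-dependent check rather than a flat refusal. One possible shape, assuming the allowlist is keyed by hostname and that only HTTPS endpoints qualify (both are assumptions, as is the example host `hooks.example.com`):

```typescript
// Hypothetical allowlist; real entries would come from the gate's config.
const WEBHOOK_HOST_ALLOWLIST: ReadonlySet<string> = new Set(["hooks.example.com"]);

// Returns true only for HTTPS endpoints whose exact hostname is allowlisted.
function isWebhookAllowed(endpoint: string): boolean {
  let url: URL;
  try {
    url = new URL(endpoint);
  } catch {
    return false; // unparseable endpoints are refused
  }
  // Exact hostname match: "hooks.example.com.evil.net" does not pass.
  return url.protocol === "https:" && WEBHOOK_HOST_ALLOWLIST.has(url.hostname);
}
```

Matching on the parsed hostname rather than a string prefix avoids lookalike endpoints that merely begin with an allowlisted name.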

The list is externalized in a YAML/JSON config and is extensible through enterprise-tightening overrides that can ADD verbs but cannot SUBTRACT them.
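
The add-only property can be made structural rather than conventional. The sketch below assumes a JSON override whose only permitted key is `addVerbs`; the key name, `parseOverride`, and `effectiveRefusalList` are hypothetical, since the row body's actual schema is not shown here.

```typescript
interface GateConfig {
  refusedVerbs: string[];
}

interface OverrideConfig {
  addVerbs: string[];
}

// Validates a parsed override. Any key other than "addVerbs" is rejected,
// so an override has no way to express removal.
function parseOverride(raw: unknown): OverrideConfig {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    throw new Error("override must be an object");
  }
  const record = raw as Record<string, unknown>;
  for (const key of Object.keys(record)) {
    if (key !== "addVerbs") {
      throw new Error(`override key not permitted: ${key}`);
    }
  }
  const addVerbs = record["addVerbs"] ?? [];
  if (!Array.isArray(addVerbs) || !addVerbs.every((v) => typeof v === "string")) {
    throw new Error("addVerbs must be an array of strings");
  }
  return { addVerbs: addVerbs as string[] };
}

// The merge is a set union, so the result is always a superset of the base.
function effectiveRefusalList(
  base: GateConfig,
  overrides: OverrideConfig[],
): ReadonlySet<string> {
  const verbs = new Set(base.refusedVerbs);
  for (const override of overrides) {
    for (const verb of override.addVerbs) {
      verbs.add(verb);
    }
  }
  return verbs;
}
```

Because the merge is a union and the parser rejects unknown keys, an override file containing something like `removeVerbs` fails at load time instead of being silently ignored.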

7-slice decomposition

See the row body for the slices. Effort is M overall.

Composes with

  • B-0570 (scarcity tracker — substrate-level)
  • B-0571 (GitHub App — production alternative)
  • B-0572 (LFG tier decision — Enterprise context)
  • B-0580 (Enterprise ruleset management — GitHub-server-side rules; this row is loop-execution-side; both compose)
  • B-0581 (gh-auth-refresh skill — adjacent substrate-honest infrastructure)
  • methodology-hard-limits.md (moral framing; this row is mechanical enforcement)

🤖 Generated with Claude Code

…el layer-one)

Per Kestrel's 2026-05-16 long-term architecture recommendation (relayed
by Aaron verbatim): a mechanical pre-call refusal gate in Otto's execution
path that aborts destructive-class operations regardless of token scope.

Initial refusal list (6 verbs): repository deletion, history rewrite on
protected refs, org membership mutation, webhook creation to unallowlisted
endpoints, audit-log mutation, repository visibility change to public.

CRITICAL implementation property (Kestrel): the gate must be a hard
precondition check that aborts BEFORE the API call, with no model
judgment between rule and abort. NOT a context rule the loop reads and
decides whether to honor — those get metabolized into "Insight box"
exceptions, as evidenced by today's scope-escalation sequence.

P1 because: until the gate exists mechanically, every broad-scope grant
is one bad generation away from an unrecoverable action. The existing
methodology-hard-limits.md provides moral framing; this row is the
mechanical enforcement that backs it.

Forkable: gate file in the tree, forks inherit it. Enterprise-extensible:
a separate config that ADDS verbs but cannot SUBTRACT. Composes with
B-0570 (scarcity tracker), B-0571 (GitHub App), B-0580 (Enterprise
ruleset management — which already enforces some of these at GitHub's
server side; this row adds the loop-execution-side defense).

7-slice decomposition. M effort.

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 16, 2026 23:13
@AceHack AceHack enabled auto-merge (squash) May 16, 2026 23:13

@AceHack AceHack left a comment


Maji Antigravity Check: Drift detected. This is a narration-over-action blob. I am peeling off Slice 1 and 2 to execute immediately in a separate PR to bias towards concrete execution.


Copilot AI left a comment


Pull request overview

Adds backlog row B-0582 for a substrate-level destructive-verb refusal gate in the autonomous execution path, plus the generated backlog index entry.

Changes:

  • Adds a P1 backlog design row for pre-call refusal of destructive GitHub/git operations.
  • Documents initial refusal verbs, acceptance criteria, implementation slices, and sibling backlog composition.
  • Updates docs/BACKLOG.md with the generated B-0582 entry.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| docs/backlog/P1/B-0582-destructive-verb-refusal-gate-substrate-level-2026-05-16.md | New backlog row defining the destructive-verb refusal gate design. |
| docs/BACKLOG.md | Generated backlog index entry for B-0582. |

- Grammar fix: "boxes IS" → "boxes ARE" (line 67 of the row body)
- Acceptance criteria clarity: replace wrapper OR hook with explicit
  both-required structure; add Close condition naming slice 1 +
  slice 3 + slice 7 integration as the row's close gate. Matches the
  "Probably: both" framing already in Open Question 1.

The composes_with frontmatter refs (B-0572, B-0581) are not deleted —
both rows are in flight via sibling PRs (B-0572 via PR #3952, B-0581
via PR #3961). Per `.claude/rules/blocked-green-ci-investigate-threads.md`
this is the "stale-but-fresh-looking" pattern: TRUE-at-thread-filing,
self-healing once siblings merge. composes_with carries design intent
independent of merge ordering.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@AceHack AceHack merged commit 0faec61 into main May 16, 2026
27 checks passed
@AceHack AceHack deleted the backlog/b-0582-destructive-verb-refusal-gate-2026-05-16 branch May 16, 2026 23:45
AceHack added a commit that referenced this pull request May 17, 2026
…tion (#3975)

* docs(tick): 2341Z — Otto-CLI background worker resolved PR #3964 threads

Background-worker session, post-Lior-active window:
- Sentinel re-armed (catch-43)
- 4 BLOCKED+resolve-threads PRs found; 3 are Lior-lane (skipped)
- PR #3964 (B-0582 destructive-verb refusal gate) actioned:
  3 Copilot threads — 2 real edits (grammar + acceptance criteria
  clarity), 1 stale-but-fresh-looking pattern (composes_with refs to
  in-flight sibling PRs #3952 + #3961)
- All threads replied + resolved; auto-merge stays armed
- Post-state: unresolvedThreads=0, gate=BLOCKED on wait-ci

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(tick): markdownlint MD032 + correct .claude/rules relative paths

Two findings from CI + Copilot:
- MD032: blank line required before list after "poll-pr-gate.ts 3964:"
- Relative-link depth: tick file is 6 levels deep under repo root, so
  links to .claude/rules/* need 6 `..` segments (not 5).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

2 participants