feat(context): preserve verbatim tail-anchor across compaction #29918
Merged
When `_maybeCompact` runs in the middle of a long agent work span, the
LLM-produced summary doesn't always preserve the model's most recent
self-narration ("Next step: ...", "About to ..."). The post-compaction
model then falls back to a "where am I?" recovery shape — re-checking
NOW.md / scratch / signals — and drifts off the active thread.
Force-keep the last assistant text from the compactable region by
splicing it verbatim into the summary message as a tag-wrapped
`<verbatim_tail>...</verbatim_tail>` block. Hard cap at 1500 chars,
clamped from the START so the END (where "next step" lines land) wins.
Idempotent: a prior tail block in `existingSummary` is replaced, not
stacked, so successive compactions don't accumulate stale tails.
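A minimal sketch of the clamp-from-START and replace-not-stack behavior described above. The function names, the blank-line separator, and the choice not to count the `[...truncated]` prefix toward the cap are assumptions for illustration; only the tag name and the 1500-char cap come from this PR.

```typescript
// Hypothetical helpers; the real code lives in window-manager.ts and may differ.
const TAIL_OPEN = "<verbatim_tail>";
const TAIL_CLOSE = "</verbatim_tail>";
const TRUNCATION_PREFIX = "[...truncated]";

// Clamp from the START so the END (where "next step" lines land) survives.
// Assumption: the prefix is added on top of the maxChars that are kept.
function clampTailFromStart(text: string, maxChars = 1500): string {
  const trimmed = text.trim();
  if (trimmed.length <= maxChars) return trimmed;
  return TRUNCATION_PREFIX + trimmed.slice(trimmed.length - maxChars);
}

// Idempotent splice: drop any prior tail block, then append exactly one.
function spliceTailAnchor(summary: string, tailText: string): string {
  const priorTail = /\s*<verbatim_tail>[\s\S]*?<\/verbatim_tail>/g;
  const base = summary.replace(priorTail, "").trimEnd();
  return `${base}\n\n${TAIL_OPEN}\n${tailText}\n${TAIL_CLOSE}`;
}
```

Splicing a second time onto the output of the first call leaves a single `<verbatim_tail>` block holding only the newer text, which is the property that keeps stale tails from accumulating.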
The spliced summary flows through both `createContextSummaryMessage`
(the in-context user-role message) and the `summaryText` returned in
the result, so `updateConversationContextWindow` (DB persistence) and
the `context_compacted` client event both surface the anchored tail.
Tests: 2 new tail-anchor unit tests + 3 integration tests covering the
splice, the no-eligible-text fallback, and the long-tail clamp.
dvargasfuertes
approved these changes
May 8, 2026
Contributor
Oh nice catch if true
What
Force-keep the last assistant text from the compactable region by splicing it verbatim into the post-compaction summary message as a tag-wrapped `<verbatim_tail>…</verbatim_tail>` block.
Why
When `_maybeCompact` runs in the middle of a long agent work span, the LLM-produced summary doesn't reliably preserve the model's most recent self-narration ("Next step: …", "About to …"). The post-compaction model then falls back to a "where am I?" recovery shape (re-checking `NOW.md` / `scratch/` / `signals/`) and drifts off the active thread.

This was the structural root cause of the May 7 surface-action conversation drift: while iterating on PR #29895, compaction fired at 20:58:31 UTC. The assistant's prior turn had stated "next step: file the SSE followup as promised", but that statement landed in the compactable region and was summarized away. The post-compaction model fell into a heartbeat-shaped recovery (`ls signals/`, `ls scratch/`, `cat HEARTBEAT.md`, scan PRs) and drifted onto unrelated bot feedback on PRs #6227 and #29900.
How
In `assistant/src/context/window-manager.ts`:
- `extractTailAssistantText(messages, maxChars=1500)`: walks `compactableMessages` backward and returns the trimmed text of the most recent assistant message that has at least one non-empty text block. Skips `tool_use` / `tool_result` / image / unknown blocks. When the tail exceeds `maxChars`, clamps from the START with a `[...truncated]` prefix so the END (where "next step" lines land) wins. Returns `null` when nothing eligible is found; the caller treats that as "no anchor to splice".
- `appendTailAnchorToSummary(summary, tailText)`: idempotent splice. The tag-wrapped block (`<verbatim_tail>`) is structurally distinct from any `##` section the LLM might produce, so it survives `clampSummaryAtSectionBoundary` (which only runs on the LLM summary, before the splice). If the existing summary already contains a `<verbatim_tail>` block (e.g. carried forward as `existingSummary` from a prior compaction), it's replaced rather than stacked.
- `_maybeCompact`: extracts `tailAnchorText` from `compactableMessages`, splices via `appendTailAnchorToSummary` when non-null, and falls through otherwise. The spliced summary flows into both `createContextSummaryMessage` (in-context user-role message) and the `summaryText` returned in the result, so `updateConversationContextWindow` (DB persistence) and the `context_compacted` client event both see the anchored tail.
Tests
3 new test groups in `assistant/src/__tests__/context-window-manager.test.ts`, all passing:
- `describe("extractTailAssistantText")`, 5 tests: returns last assistant text; null when none; skips tool_use-only assistants; clamps long text preserving END; ignores whitespace-only.
- `describe("appendTailAnchorToSummary")`, 2 tests: appends tag-wrapped block; idempotent (replaces prior tail).
- `describe("compaction tail-anchor")`, 3 integration tests via `ContextWindowManager.maybeCompact`: splices the last compactable assistant text into the summary message + `summaryText`; omits gracefully when compactable region has only tool_use assistants; clamps long tails preserving the END marker.
Full suite: 61/61 pass on `bun test src/__tests__/context-window-manager.test.ts`.
Notes
- […] `SUMMARY_PROMPT_FALLBACK` and `prompts/compact.md`) is worth considering as a follow-up; they're orthogonal: the prompt change asks the LLM nicely, the verbatim splice guarantees it.
- […] `summaryText`, so observability/debug consumers (DB row, `context_compacted` event) match what's in-context.
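For reference, the backward walk described under How can be sketched roughly as below. The `Block` and `Message` shapes and the function name are simplified, hypothetical stand-ins rather than the real types in `window-manager.ts`. Joining multiple text blocks with a newline is also an assumption, and the length clamp is left out for brevity.

```typescript
// Hypothetical, simplified message shapes for illustration only.
interface Block {
  type: string; // "text", "tool_use", "tool_result", "image", ...
  text?: string;
}

interface Message {
  role: "user" | "assistant";
  content: Block[];
}

// Walk backward and return the trimmed text of the most recent assistant
// message that has at least one non-empty text block; null if none exists.
function findTailAssistantText(messages: Message[]): string | null {
  for (const message of [...messages].reverse()) {
    if (message.role !== "assistant") continue;
    const text = message.content
      // Only text blocks count; tool_use, tool_result, image, and unknown
      // block types are skipped.
      .filter((block) => block.type === "text" && typeof block.text === "string")
      .map((block) => (block.text as string).trim())
      .filter((part) => part.length > 0)
      .join("\n"); // assumption: multiple text blocks are newline-joined
    if (text.length > 0) return text;
  }
  return null; // caller treats this as "no anchor to splice"
}
```

An assistant message made up only of `tool_use` blocks or whitespace-only text is passed over, so the walk keeps going until it finds earlier narration or runs out of messages.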