feat: sandcastle refinement loop with critic-based convergence by jerome-benoit · Pull Request #111 · jerome-benoit/sap-ai-provider

jerome-benoit · 2026-05-04T23:26:15Z

Description

Replace single-pass implement→review→merge with a modular iterative implement↔critic refinement loop. Each task gets its own parallel sandbox with convergence detection, quality ratchet, and automated PR creation.

Architecture

Planner (opus) → selects issues
  For each issue (parallel, max 3):
    Sandbox with implement↔critic loop:
      Implementer (sonnet) → codes + commits + pushes
      Critic (sonnet) → structured findings JSON (nonce-tagged)
      Dedup (context-hash) → convergence check
      Quality ratchet → rollback on regression
      Best-state checkpoint → restore optimal intermediate
      Validation-in-loop (ARCS) → deterministic convergence
    Finalize: validate → rebase → PR (draft if non-converged)
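
The loop in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the code in `refinement-loop.ts`: the `refine` function, the `step` callback, the `IterationResult` shape, and the severity weights are all hypothetical. Combining "tests pass" with "no CRITICAL/HIGH findings" into a single convergence condition is also an assumption.

```typescript
// Hypothetical types: names and shapes are illustrative, not taken from the PR.
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

interface Finding {
  severity: Severity;
  message: string;
}

interface IterationResult {
  sha: string; // commit produced by the implementer
  findings: Finding[]; // critic output for that commit
  testsPass: boolean; // validation-in-loop result
}

interface LoopOutcome {
  converged: boolean;
  bestSha: string | null;
  iterations: number;
}

// Assumed weights; a lower total score means a better state.
const WEIGHTS: Record<Severity, number> = { CRITICAL: 100, HIGH: 25, MEDIUM: 5, LOW: 1 };

function score(findings: Finding[]): number {
  return findings.reduce((sum, f) => sum + WEIGHTS[f.severity], 0);
}

function hasBlocking(findings: Finding[]): boolean {
  return findings.some((f) => f.severity === "CRITICAL" || f.severity === "HIGH");
}

async function refine(
  step: (iteration: number, baseSha: string | null) => Promise<IterationResult>,
  maxIterations: number,
): Promise<LoopOutcome> {
  let bestSha: string | null = null;
  let bestScore = Number.POSITIVE_INFINITY;

  for (let i = 1; i <= maxIterations; i++) {
    // Every step starts from the best state so far, so a regression is never built upon.
    const result = await step(i, bestSha);
    const current = score(result.findings);

    // Quality ratchet: accept the new state only if it is not worse than the best seen.
    if (current <= bestScore) {
      bestScore = current;
      bestSha = result.sha;
    }

    // Converge only when validation passes and no CRITICAL/HIGH finding remains.
    if (result.testsPass && !hasBlocking(result.findings)) {
      return { converged: true, bestSha: result.sha, iterations: i };
    }
  }

  // Budget exhausted: report the best intermediate state (a draft PR in the design above).
  return { converged: false, bestSha, iterations: maxIterations };
}
```

In this sketch the rollback is implicit: a regressed commit is simply never passed back to `step` as the base for the next iteration.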

Key Design Decisions

Flat iteration budget (50/round): ARCS, SWE-Agent, and AutoCodeRover all use a flat budget
Context-hash dedup (SHA-256 over ±3 lines of context): robust to line drift, following the CodeQL/Qodana pattern
Severity-weighted convergence: refuses to converge while CRITICAL/HIGH findings persist (OpenHands)
Best-state tracking: resets to the best intermediate state on non-convergence (SWE-Agent)
Validation-in-loop: deterministic convergence when tests pass (ARCS)
Async subprocess execution: `util.promisify(execFile)` keeps the event loop unblocked so sandboxes run in parallel
Nonce-tagged critic output: prevents injection from code content
One PR per task: no batch merge, each issue gets its own PR
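
To illustrate the context-hash dedup decision, here is a minimal sketch that fingerprints a finding by hashing the three lines before and after it rather than its line number. The function names, the `ruleId` field, and the whitespace trimming are assumptions made for illustration; the PR's actual hashing inputs may differ.

```typescript
import { createHash } from "node:crypto";

// Number of lines taken on each side of the finding (the "±3 lines" above).
const CONTEXT_RADIUS = 3;

// Hypothetical helper: fingerprint a finding by its surrounding source text.
function contextHash(fileLines: string[], line: number, ruleId: string): string {
  const index = line - 1; // findings are assumed to use 1-based line numbers
  const start = Math.max(0, index - CONTEXT_RADIUS);
  const end = Math.min(fileLines.length, index + CONTEXT_RADIUS + 1);
  // Trim each line so that re-indentation alone does not change the fingerprint.
  const context = fileLines
    .slice(start, end)
    .map((l) => l.trim())
    .join("\n");
  return createHash("sha256").update(`${ruleId}\n${context}`).digest("hex");
}

// Hypothetical helper: keep only findings whose fingerprint has not been seen before.
function dedupe<T extends { line: number; ruleId: string }>(
  findings: T[],
  fileLines: string[],
  seen: Set<string>,
): T[] {
  const fresh: T[] = [];
  for (const finding of findings) {
    const hash = contextHash(fileLines, finding.line, finding.ruleId);
    if (!seen.has(hash)) {
      seen.add(hash);
      fresh.push(finding);
    }
  }
  return fresh;
}
```

Because the fingerprint depends on content instead of position, a finding that moves from line 6 to line 10 after unrelated lines are inserted above it keeps the same hash and is not reported twice.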

Modules

| File | Lines | Responsibility |
| --- | --- | --- |
| `constants.ts` | 64 | Shared constants + `execFileAsync` + `getHeadSha` + `toErrorMessage` |
| `types.ts` | 83 | Zod schemas + exported interfaces + `parseFindingsSafe` |
| `concurrency-pool.ts` | 69 | O(1) FIFO semaphore (linked list) |
| `task-source.ts` | 248 | `TaskSource` interface + `GithubIssueSource` (fetch + sanitize + plan) |
| `refinement-loop.ts` | 580 | Core loop: implement↔critic + dedup + ratchet + convergence |
| `finalizer.ts` | 281 | Validate + retry + rebase + push + PR creation |
| `main.ts` | 103 | Thin orchestrator: discover → pool → loop → finalize |
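
As an illustration of the "O(1) FIFO semaphore (linked list)" listed for `concurrency-pool.ts`, the sketch below shows one way such a pool could look. The class name, method names, and hand-off strategy are assumptions; this is not the module's actual code.

```typescript
// A waiter is a node in a singly linked queue of pending acquisitions.
interface Waiter {
  resolve: () => void;
  next: Waiter | null;
}

// Hypothetical pool: at most `limit` tasks run at once, the rest wait in FIFO order.
class ConcurrencyPool {
  private readonly limit: number;
  private active = 0;
  private head: Waiter | null = null;
  private tail: Waiter | null = null;

  constructor(limit: number) {
    if (!Number.isInteger(limit) || limit < 1) {
      throw new RangeError("limit must be a positive integer");
    }
    this.limit = limit;
  }

  private acquire(): Promise<void> {
    if (this.active < this.limit) {
      this.active++;
      return Promise.resolve();
    }
    // Enqueue at the tail in O(1); no array shifting is involved.
    return new Promise<void>((resolve) => {
      const waiter: Waiter = { resolve, next: null };
      if (this.tail) this.tail.next = waiter;
      else this.head = waiter;
      this.tail = waiter;
    });
  }

  private release(): void {
    const waiter = this.head;
    if (waiter) {
      // Hand the slot straight to the oldest waiter; `active` stays unchanged.
      this.head = waiter.next;
      if (!this.head) this.tail = null;
      waiter.resolve();
    } else {
      this.active--;
    }
  }

  // Run `task` once a slot is free, releasing the slot even if the task throws.
  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}
```

With the "max 3" parallelism from the architecture section, an orchestrator would create `new ConcurrencyPool(3)` and wrap each per-issue sandbox in `pool.run(...)`.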

Prompts

| Prompt | Role | Key rules |
| --- | --- | --- |
| `plan-prompt.md` | Issue selection | Prefer single-file scope, exclude blocked |
| `implement-prompt.md` | Code + commit + push | Cross-validate findings, full validation before push |
| `critic-prompt.md` | Structured review | ≤5 HIGH/CRIT findings, nonce-tagged JSON, known decisions blocklist |
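
The nonce-tagged JSON rule for the critic can be illustrated with the sketch below. The tag format `<findings-NONCE>`, the `Finding` fields, and the function names are hypothetical. The PR validates findings with Zod schemas and `parseFindingsSafe`, which this example replaces with a hand-written check so that it stays dependency-free.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical finding shape; the real schema lives in the PR's `types.ts`.
interface Finding {
  severity: "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";
  file: string;
  line: number;
  message: string;
}

const SEVERITIES: readonly string[] = ["CRITICAL", "HIGH", "MEDIUM", "LOW"];

// A fresh nonce per critic run; reviewed code cannot know it in advance.
function newNonce(): string {
  return randomBytes(16).toString("hex");
}

function isFinding(value: unknown): value is Finding {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.severity === "string" &&
    SEVERITIES.includes(v.severity) &&
    typeof v.file === "string" &&
    typeof v.line === "number" &&
    Number.isInteger(v.line) &&
    typeof v.message === "string"
  );
}

// Accept only JSON wrapped in tags that carry this run's nonce (assumed tag format).
function extractFindings(output: string, nonce: string): Finding[] | null {
  const open = `<findings-${nonce}>`;
  const close = `</findings-${nonce}>`;
  const start = output.indexOf(open);
  const end = output.lastIndexOf(close);
  if (start === -1 || end === -1 || end < start) return null;

  let parsed: unknown;
  try {
    parsed = JSON.parse(output.slice(start + open.length, end));
  } catch {
    return null;
  }
  if (!Array.isArray(parsed)) return null;
  return parsed.every(isFinding) ? parsed : null;
}
```

A findings block that appears inside the reviewed code, for example in a comment crafted to look like critic output, carries the wrong nonce or none at all, so the extractor ignores it.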

Type of Change

New feature (non-breaking change that adds functionality)
Refactoring (no functional changes)

Checklist

I have run `npm run type-check && npm run test && npm run prettier-check && npm run lint`
I have run `npm run build && npm run check-build && npm run build:v2 && npm run check-build:v2`
My changes follow the existing code style
E2E tested locally (planner + parallel implementers started successfully)

Related Issues

Fixes #110

jerome-benoit merged 32 commits into `main` from `feat/sandcastle-refinement-loop`


2 participants
