feat: sandcastle refinement loop with critic-based convergence #111

Merged
jerome-benoit merged 32 commits into main from feat/sandcastle-refinement-loop on May 5, 2026

jerome-benoit (Owner) commented May 4, 2026

Description

Replace the single-pass implement→review→merge pipeline with a modular, iterative implement↔critic refinement loop. Each task runs in its own sandbox, in parallel with the others, with convergence detection, a quality ratchet, and automated PR creation.

Architecture

Planner (opus) → selects issues
  For each issue (parallel, max 3):
    Sandbox with implement↔critic loop:
      Implementer (sonnet) → codes + commits + pushes
      Critic (sonnet) → structured findings JSON (nonce-tagged)
      Dedup (context-hash) → convergence check
      Quality ratchet → rollback on regression
      Best-state checkpoint → restore optimal intermediate
      Validation-in-loop (ARCS) → deterministic convergence
    Finalize: validate → rebase → PR (draft if non-converged)
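
The control flow above can be sketched roughly as follows. This is a minimal illustration, not the code in refinement-loop.ts: the `Finding` shape, the severity weights, and the `refine` function are all hypothetical, the steps are modeled as synchronous callbacks for brevity (the PR runs them as async subprocesses), and the quality-ratchet rollback and dedup steps are omitted.

```typescript
// Hypothetical types; the real schemas live in types.ts and are not shown in this PR.
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

interface Finding {
  id: string;
  severity: Severity;
}

interface LoopDeps {
  implement: (previous: Finding[]) => string; // returns a commit SHA
  critique: (sha: string) => Finding[];
  validate: (sha: string) => boolean; // true when tests pass
}

interface LoopResult {
  converged: boolean;
  sha: string | null;
  iterations: number;
}

// Illustrative weights only; lower total score is better.
const WEIGHTS: Record<Severity, number> = { CRITICAL: 8, HIGH: 4, MEDIUM: 2, LOW: 1 };

function score(findings: Finding[]): number {
  return findings.reduce((sum, f) => sum + WEIGHTS[f.severity], 0);
}

function hasBlocking(findings: Finding[]): boolean {
  return findings.some((f) => f.severity === "CRITICAL" || f.severity === "HIGH");
}

function refine(deps: LoopDeps, maxIterations: number): LoopResult {
  let findings: Finding[] = [];
  let best: { sha: string; score: number } | null = null;

  for (let i = 1; i <= maxIterations; i++) {
    const sha = deps.implement(findings);
    findings = deps.critique(sha);
    const current = score(findings);

    // Best-state checkpoint: remember the lowest-scoring commit seen so far.
    if (best === null || current < best.score) {
      best = { sha, score: current };
    }

    // Converge only if no CRITICAL/HIGH finding remains and validation passes.
    if (!hasBlocking(findings) && deps.validate(sha)) {
      return { converged: true, sha, iterations: i };
    }
  }

  // Budget exhausted: fall back to the best intermediate state.
  return { converged: false, sha: best ? best.sha : null, iterations: maxIterations };
}
```

In this sketch a non-converged result still carries the best SHA, which is the state a finalizer could open as a draft PR.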

Key Design Decisions

  • Flat iteration budget (50/round) — evidence: ARCS, SWE-Agent, and AutoCodeRover all use flat budgets
  • Context-hash dedup (±3 lines SHA-256) — drift-safe, CodeQL/Qodana pattern
  • Severity-weighted convergence — refuses convergence if CRITICAL/HIGH persist (OpenHands)
  • Best-state tracking — resets to best intermediate on non-convergence (SWE-Agent)
  • Validation-in-loop — deterministic convergence when tests pass (ARCS)
  • Async subprocess execution — util.promisify(execFile) unblocks the event loop for true parallelism
  • Nonce-tagged critic output — prevents injection from code content
  • One PR per task — no batch merge, each issue gets its own PR
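
As an illustration of the context-hash idea, the sketch below hashes a flagged line together with up to three lines on each side, so the key stays the same when unrelated edits shift the line number. `contextHash` and its whitespace-trimming rule are assumptions made for this example; the PR does not show the actual implementation.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: SHA-256 over the flagged line plus up to `radius`
// neighbouring lines on each side. `index` is 0-based. Lines are trimmed so
// that pure indentation changes do not alter the hash (an assumption).
function contextHash(lines: string[], index: number, radius = 3): string {
  const start = Math.max(0, index - radius);
  const end = Math.min(lines.length, index + radius + 1);
  const context = lines
    .slice(start, end)
    .map((line) => line.trim())
    .join("\n");
  return createHash("sha256").update(context).digest("hex");
}

// Example: the same finding before and after two lines are added at the top.
const before = ["a", "b", "c", "d", "e", "f", "g", "h"];
const after = ["x", "y", ...before];

const seen = new Set<string>([contextHash(before, 4)]);
const isDuplicate = seen.has(contextHash(after, 6)); // line moved from index 4 to 6
console.log(isDuplicate);
```

A finding whose key is already in the set can be treated as a repeat; under this scheme an edit inside the seven-line window produces a new key.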

Modules

| File | Lines | Responsibility |
| --- | --- | --- |
| constants.ts | 64 | Shared constants + execFileAsync + getHeadSha + toErrorMessage |
| types.ts | 83 | Zod schemas + exported interfaces + parseFindingsSafe |
| concurrency-pool.ts | 69 | O(1) FIFO semaphore (linked list) |
| task-source.ts | 248 | TaskSource interface + GithubIssueSource (fetch + sanitize + plan) |
| refinement-loop.ts | 580 | Core loop: implement↔critic + dedup + ratchet + convergence |
| finalizer.ts | 281 | Validate + retry + rebase + push + PR creation |
| main.ts | 103 | Thin orchestrator: discover → pool → loop → finalize |
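
For readers unfamiliar with the pattern, a FIFO semaphore with O(1) enqueue and dequeue can be sketched with a singly linked list of waiters. The class below is a hypothetical illustration, not the contents of concurrency-pool.ts; the names `Semaphore`, `active`, and `waiting` are made up for this example.

```typescript
interface Waiter {
  resolve: () => void;
  next: Waiter | null;
}

class Semaphore {
  private readonly limit: number;
  private head: Waiter | null = null; // oldest waiter
  private tail: Waiter | null = null; // newest waiter
  private running = 0;
  private queued = 0;

  constructor(limit: number) {
    this.limit = limit;
  }

  get active(): number {
    return this.running;
  }

  get waiting(): number {
    return this.queued;
  }

  // Resolves immediately if a slot is free, otherwise appends to the list tail.
  acquire(): Promise<void> {
    if (this.running < this.limit) {
      this.running++;
      return Promise.resolve();
    }
    return new Promise<void>((resolve) => {
      const node: Waiter = { resolve, next: null };
      if (this.tail) this.tail.next = node;
      else this.head = node;
      this.tail = node;
      this.queued++;
    });
  }

  // Hands the slot to the oldest waiter, or frees it if nobody is waiting.
  release(): void {
    const node = this.head;
    if (node) {
      this.head = node.next;
      if (this.head === null) this.tail = null;
      this.queued--;
      node.resolve(); // slot transfers directly, so `running` is unchanged
    } else if (this.running > 0) {
      this.running--;
    }
  }
}
```

A caller would typically `await pool.acquire()`, run the task inside `try`, and call `pool.release()` in `finally`, so that a failing task still frees its slot.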

Prompts

| Prompt | Role | Key rules |
| --- | --- | --- |
| plan-prompt.md | Issue selection | Prefer single-file scope, exclude blocked |
| implement-prompt.md | Code + commit + push | Cross-validate findings, full validation before push |
| critic-prompt.md | Structured review | ≤5 HIGH/CRIT findings, nonce-tagged JSON, known decisions blocklist |
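
One way to picture nonce-tagged output: the orchestrator generates a random nonce per critic run, asks the critic to wrap its JSON in a tag containing that nonce, and accepts only content inside that exact tag. The tag format `<findings-NONCE>` and the helpers below are assumptions for illustration; the PR does not specify the real format.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical: a fresh, unguessable nonce for each critic invocation.
function makeNonce(): string {
  return randomBytes(8).toString("hex");
}

// Returns the parsed JSON found between the nonce-tagged delimiters,
// or null if the tags are missing or the payload is not valid JSON.
function extractTagged(output: string, nonce: string): unknown {
  const open = `<findings-${nonce}>`;
  const close = `</findings-${nonce}>`;
  const start = output.indexOf(open);
  const end = output.lastIndexOf(close);
  if (start === -1 || end === -1 || end < start) return null;
  try {
    return JSON.parse(output.slice(start + open.length, end));
  } catch {
    return null;
  }
}

// Code under review that imitates a findings block cannot know the nonce,
// so its fake block is ignored.
const nonce = makeNonce();
const criticOutput = [
  'Quoted source: <findings-0000>[{"severity":"LOW"}]</findings-0000>',
  `<findings-${nonce}>[{"severity":"HIGH","message":"unchecked error"}]</findings-${nonce}>`,
].join("\n");

console.log(extractTagged(criticOutput, nonce));
```

The parsed value is still untrusted input at this point; a schema check would be the natural next step before using it.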

Type of Change

  • New feature (non-breaking change that adds functionality)
  • Refactoring (no functional changes)

Checklist

  • I have run npm run type-check && npm run test && npm run prettier-check && npm run lint
  • I have run npm run build && npm run check-build && npm run build:v2 && npm run check-build:v2
  • My changes follow the existing code style
  • E2E tested locally (planner + parallel implementers started successfully)

Related Issues

Fixes #110 ([Feature Request]: Implement/review refinement loop with deterministic convergence)
