t1097: Add prompt-repeat retry strategy to dispatch.sh#1631

Merged
marcusquinn merged 2 commits into main from feature/t1097
Feb 18, 2026
Conversation

@marcusquinn
Owner

Summary

Add a prompt-repeat retry strategy to the supervisor dispatch pipeline that retries failed tasks with a reinforced prompt at the same model tier before escalating to a more expensive model.

Problem

Many worker failures are due to insufficient prompt clarity (e.g., worker exits without emitting FULL_LOOP_COMPLETE, does not create a PR, produces trivial output) rather than model capability limitations. The current flow immediately escalates to a higher-tier model on retry, which is ~2-5x more expensive and often unnecessary.

Solution

Insert a prompt-repeat step before model escalation in the pulse retry handler:

  1. should_prompt_repeat() -- checks eligibility: config toggle, not already attempted, retryable failure, pattern tracker data
  2. do_prompt_repeat() -- dispatches with reinforced prompt at same tier, reuses existing worktree
  3. Pattern tracking -- records SUCCESS_PATTERN/FAILURE_PATTERN with prompt_repeat tag for data-driven routing
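
The ordering described in the steps above can be sketched roughly as follows. This is a minimal illustration, not the code from dispatch.sh or pulse.sh: the function names `choose_retry_action` and `is_retryable_failure` and the failure-reason strings are hypothetical, and the pattern tracker check is left out.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the retry ordering; names below are invented for illustration.

# Config toggle named in the PR; defaults to enabled.
SUPERVISOR_PROMPT_REPEAT_ENABLED="${SUPERVISOR_PROMPT_REPEAT_ENABLED:-true}"

# Assumed set of retryable failure reasons (not taken from the real scripts).
is_retryable_failure() {
    case "$1" in
    no_completion_signal | no_pr_created | trivial_output) return 0 ;;
    *) return 1 ;;
    esac
}

# Decide the next retry action.
# Args: $1 = failure reason, $2 = prompt_repeat_done flag (0 or 1)
# Prints "prompt_repeat" or "escalate".
choose_retry_action() {
    local reason="$1" already_done="$2"
    if [[ "$SUPERVISOR_PROMPT_REPEAT_ENABLED" == "true" ]] &&
        [[ "$already_done" -eq 0 ]] &&
        is_retryable_failure "$reason"; then
        echo "prompt_repeat"
    else
        echo "escalate"
    fi
}
```

A caller would branch on the printed value, for example `case "$(choose_retry_action "$reason" "$done_flag")" in prompt_repeat) ... ;; esac`, so escalation stays the fallback whenever any eligibility check fails.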

Files Changed

  • .agents/scripts/supervisor/dispatch.sh -- new functions: should_prompt_repeat(), mark_prompt_repeat_done(), build_prompt_repeat_prompt(), do_prompt_repeat()
  • .agents/scripts/supervisor/pulse.sh -- integrate prompt-repeat before model escalation in Phase 1 retry handler, add pattern tracking for outcomes
  • .agents/scripts/supervisor/database.sh -- add prompt_repeat_done column to tasks table (schema + migration)
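
For readers unfamiliar with this kind of migration, a re-runnable column addition might look like the sketch below. It assumes the supervisor database is SQLite and is reachable through the `sqlite3` CLI, which this PR does not state; the function name `add_prompt_repeat_column` and the `INTEGER DEFAULT 0` column type are also assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an idempotent migration; assumes SQLite and the sqlite3 CLI.

# Add prompt_repeat_done to the tasks table unless it already exists.
# Args: $1 = path to the database file
add_prompt_repeat_column() {
    local db_path="$1"
    # PRAGMA table_info prints one row per column: cid|name|type|notnull|dflt_value|pk
    if sqlite3 "$db_path" "PRAGMA table_info(tasks);" |
        cut -d'|' -f2 | grep -qx 'prompt_repeat_done'; then
        return 0 # column already present; nothing to do
    fi
    sqlite3 "$db_path" "ALTER TABLE tasks ADD COLUMN prompt_repeat_done INTEGER DEFAULT 0;"
}
```

Checking for the column first lets the migration run on every startup without failing on databases that were already upgraded.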

Configuration

  • SUPERVISOR_PROMPT_REPEAT_ENABLED=true (default) -- set to false to disable
  • Self-tuning via pattern tracker: accumulates data on which task types benefit

Ref #1622

Before escalating to a more expensive model tier on worker failure,
retry with a reinforced prompt at the same tier. Many failures are
due to insufficient prompt clarity (missing completion signals, no PR
created) rather than model capability limitations.

Strategy: should_prompt_repeat() checks eligibility (config toggle,
not already attempted, retryable failure, pattern tracker data),
then do_prompt_repeat() dispatches with build_prompt_repeat_prompt()
which doubles down on requirements and failure-specific guidance.

Pattern tracker integration: records SUCCESS_PATTERN/FAILURE_PATTERN
with prompt_repeat tag so future routing decisions are data-driven.
With 3+ samples and <25% success rate, prompt-repeat is auto-skipped.

Configurable via SUPERVISOR_PROMPT_REPEAT_ENABLED (default: true).
DB migration adds prompt_repeat_done column to tasks table.
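
The auto-skip rule in the commit message (3+ samples and <25% success rate) can be expressed with integer arithmetic alone. The sketch below is illustrative only; `should_skip_prompt_repeat` is a hypothetical name and the real check in dispatch.sh may be structured differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the auto-skip threshold; not the code from dispatch.sh.

# Returns 0 (skip prompt-repeat) when there are at least 3 samples
# and the success rate is strictly below 25%; returns 1 otherwise.
# Args: $1 = success count, $2 = failure count
should_skip_prompt_repeat() {
    local successes="$1" failures="$2"
    local total=$((successes + failures))
    # successes/total < 25/100, rearranged to avoid floating point.
    if ((total >= 3)) && ((successes * 100 < total * 25)); then
        return 0
    fi
    return 1
}
```

With these rules, 0 successes out of 3 samples skips prompt-repeat, while 1 success out of 4 (exactly 25%) does not, and fewer than 3 samples never triggers the skip.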

@marcusquinn marked this pull request as ready for review February 18, 2026 02:47

@gemini-code-assist

Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a strategic enhancement to the supervisor's task retry mechanism. Instead of immediately escalating to a more expensive model upon task failure, the system will now first attempt a 'prompt-repeat' retry. This involves re-dispatching the task with a more explicit and reinforced prompt at the same model tier. This change is designed to address common failures caused by insufficient prompt clarity, thereby improving task completion rates and reducing operational costs by optimizing model usage based on learned patterns.

Highlights

  • New Prompt-Repeat Retry Strategy: Introduced a 'prompt-repeat' retry strategy to re-attempt failed tasks with a reinforced prompt at the same model tier, aiming to reduce costs by avoiding immediate escalation to more expensive models.
  • Eligibility and Reinforced Prompt Generation: Implemented should_prompt_repeat() to check task eligibility based on configuration, previous attempts, retryable failure reasons, and pattern tracker data. A build_prompt_repeat_prompt() function was added to construct failure-specific reinforced prompts.
  • Integration into Supervisor Pipeline: The new do_prompt_repeat() function dispatches eligible tasks with reinforced prompts. This strategy is integrated into the supervisor's pulse retry handler, executing before any model escalation.
  • Pattern Tracking for Optimization: Enhanced pattern tracking to record outcomes (success/failure) of prompt-repeat attempts, enabling data-driven optimization and routing decisions for future tasks.
  • Database Schema Update: Modified the tasks database schema by adding a prompt_repeat_done column and its migration, ensuring that a task does not undergo prompt-repeat multiple times.
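
As a rough illustration of a failure-specific reinforced prompt, the sketch below appends guidance chosen by failure reason to the original prompt. It is not the actual build_prompt_repeat_prompt(): the function name `build_reinforced_prompt`, the failure-reason strings, and the wording of the guidance are all assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of reinforced prompt construction; wording and names are invented.

# Build a reinforced prompt.
# Args: $1 = original prompt text, $2 = failure reason
build_reinforced_prompt() {
    local original_prompt="$1" failure_reason="$2" guidance
    case "$failure_reason" in
    no_completion_signal) guidance="Emit FULL_LOOP_COMPLETE once all work is finished." ;;
    no_pr_created) guidance="Create a pull request before exiting." ;;
    *) guidance="Re-read the requirements and complete every step." ;;
    esac
    # Keep the original prompt first, then restate the requirement that was missed.
    printf '%s\n\nPREVIOUS ATTEMPT FAILED (%s).\nREQUIREMENT: %s\n' \
        "$original_prompt" "$failure_reason" "$guidance"
}
```

Keeping the original prompt intact and appending the reminder means the retry at the same tier differs from the first attempt only in the added emphasis.
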
Changelog
  • .agents/scripts/supervisor/database.sh
    • Added prompt_repeat_done column to the tasks table.
    • Included a database migration script to add the new column.
  • .agents/scripts/supervisor/dispatch.sh
    • Added should_prompt_repeat() function to determine if a task is eligible for prompt-repeat.
    • Added mark_prompt_repeat_done() function to flag tasks that have undergone a prompt-repeat attempt.
    • Added build_prompt_repeat_prompt() function to generate a reinforced prompt based on task details and failure reason.
    • Added do_prompt_repeat() function to execute the prompt-repeat retry, dispatching the task with the reinforced prompt.
  • .agents/scripts/supervisor/pulse.sh
    • Modified the task retry handler to check for prompt-repeat eligibility before model escalation.
    • Integrated calls to do_prompt_repeat() for eligible tasks.
    • Added logic to record prompt-repeat success and failure outcomes in the pattern tracker.
Activity
  • No human activity has been recorded on this pull request yet.

@github-actions

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 22 code smells

[INFO] Recent monitoring activity:
Wed Feb 18 02:47:31 UTC 2026: Code review monitoring started
Wed Feb 18 02:47:31 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 22

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 22
  • VULNERABILITIES: 0

Generated on: Wed Feb 18 02:47:33 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a 'prompt-repeat' retry strategy, which is a sensible approach to handling certain types of task failures before escalating to more expensive models. The implementation is well-structured across dispatch.sh, pulse.sh, and database.sh. The new functions in dispatch.sh are robust, using safe patterns for command construction and background execution. The integration into pulse.sh correctly places the new logic before model escalation.

My review includes two suggestions for improvement: one for code deduplication in pulse.sh to enhance maintainability, and another for a minor performance optimization in dispatch.sh by avoiding a redundant command execution.

Comment on lines 632 to 635
pr_success=$("$pattern_helper" stats 2>/dev/null |
grep -c 'prompt_repeat.*SUCCESS\|SUCCESS.*prompt_repeat' 2>/dev/null || echo "0")
pr_failure=$("$pattern_helper" stats 2>/dev/null |
grep -c 'prompt_repeat.*FAILURE\|FAILURE.*prompt_repeat' 2>/dev/null || echo "0")


medium

The "$pattern_helper" stats command is executed twice here. To improve efficiency, especially if the helper script has a non-trivial execution time, you could call it once and store its output in a variable. Then, you can pipe the variable's content to grep for both success and failure counts.

Suggested change
- pr_success=$("$pattern_helper" stats 2>/dev/null |
-     grep -c 'prompt_repeat.*SUCCESS\|SUCCESS.*prompt_repeat' 2>/dev/null || echo "0")
- pr_failure=$("$pattern_helper" stats 2>/dev/null |
-     grep -c 'prompt_repeat.*FAILURE\|FAILURE.*prompt_repeat' 2>/dev/null || echo "0")
+ local stats_output
+ stats_output=$("$pattern_helper" stats 2>/dev/null)
+ pr_success=$(echo "$stats_output" | grep -c 'prompt_repeat.*SUCCESS\|SUCCESS.*prompt_repeat' 2>/dev/null || echo "0")
+ pr_failure=$(echo "$stats_output" | grep -c 'prompt_repeat.*FAILURE\|FAILURE.*prompt_repeat' 2>/dev/null || echo "0")
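
A self-contained sketch of the same idea, calling the stats source once and counting matches from the cached text, is shown here. The stub `fake_pattern_stats`, its output format, and the helper `count_matches` are hypothetical stand-ins for `"$pattern_helper" stats`. One detail worth noting: `grep -c` already prints `0` while exiting non-zero when nothing matches, so this sketch uses `|| true` instead of echoing a fallback `0`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: cache stats output once, then count matches from the cached text.

# Stand-in for "$pattern_helper" stats; this output format is invented.
fake_pattern_stats() {
    printf '%s\n' \
        'SUCCESS_PATTERN tags=prompt_repeat task=t1' \
        'FAILURE_PATTERN tags=prompt_repeat task=t2' \
        'SUCCESS_PATTERN tags=model_escalation task=t3' \
        'FAILURE_PATTERN tags=prompt_repeat task=t4'
}

# Count lines of $1 that match the extended regex $2.
count_matches() {
    local text="$1" pattern="$2" count
    # grep -c prints the count (including 0) itself; "|| true" only
    # swallows the non-zero exit status of the no-match case.
    count=$(printf '%s\n' "$text" | grep -cE "$pattern" || true)
    echo "${count:-0}"
}

stats_output=$(fake_pattern_stats)
pr_success=$(count_matches "$stats_output" 'prompt_repeat.*SUCCESS|SUCCESS.*prompt_repeat')
pr_failure=$(count_matches "$stats_output" 'prompt_repeat.*FAILURE|FAILURE.*prompt_repeat')
```

Using `grep -E` with a plain `|` avoids relying on the `\|` alternation extension of basic regular expressions.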

Owner Author


Fixed in 66ed098. Cached the stats output in a local variable so it's only called once, then piped to both grep commands.

Comment on lines 308 to 310
local tid_pr_done_retry
tid_pr_done_retry=$(db "$SUPERVISOR_DB" "SELECT COALESCE(prompt_repeat_done, 0) FROM tasks WHERE id = '$(sql_escape "$tid")';" 2>/dev/null || echo "0")
if [[ "$tid_pr_done_retry" -ge 1 ]]; then


medium

This block of code to check if a prompt-repeat was attempted is duplicated from lines 275-277. According to the general repository rules, repeated logic should be extracted into an internal helper function to improve maintainability.

For example, you could create a function like _was_prompt_repeat_attempted() at the top of the script:

# Internal helper to check if a task has had a prompt-repeat attempt.
# Args: $1 = task_id
# Returns: 0 if attempted, 1 if not.
_was_prompt_repeat_attempted() {
    local task_id="$1"
    local prompt_repeat_done
    prompt_repeat_done=$(db "$SUPERVISOR_DB" "SELECT COALESCE(prompt_repeat_done, 0) FROM tasks WHERE id = '$(sql_escape "$task_id")';" 2>/dev/null || echo "0")
    if [[ "$prompt_repeat_done" -ge 1 ]]; then
        return 0
    fi
    return 1
}

Then you can replace both occurrences with a simple call:

if _was_prompt_repeat_attempted "$tid"; then
    # ... logic ...
fi
References
  1. In shell scripts, extract repeated logic into an internal helper function to improve maintainability. This applies even for standalone scripts where external source dependencies are avoided.

Owner Author


Fixed in 66ed098. Extracted _was_prompt_repeat_attempted() helper function at the top of pulse.sh and replaced both inline DB queries with calls to it.

@github-actions

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 22 code smells

[INFO] Recent monitoring activity:
Wed Feb 18 03:07:03 UTC 2026: Code review monitoring started
Wed Feb 18 03:07:04 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 22

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 22
  • VULNERABILITIES: 0

Generated on: Wed Feb 18 03:07:06 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@marcusquinn marcusquinn merged commit 756b93d into main Feb 18, 2026
19 checks passed
@marcusquinn marcusquinn deleted the feature/t1097 branch February 18, 2026 03:16
marcusquinn added a commit that referenced this pull request Feb 18, 2026
…d t1094.1 (t1107)

All t1094 deliverables confirmed implemented:
- Prompt strategy tracking → t1095 (pr:#1629)
- Output quality gradient + failure categorization → t1096 (pr:#1632)
- Token usage → t1095 (pr:#1629)
- A/B comparison → t1098+t1099 (pr:#1637, pr:#1634)
- Prompt-repeat strategy → t1097 (pr:#1631)
- Build-agent reference → t1094.1 (pr:#1633)

t1094 parent ready to be marked complete: verified:2026-02-18