feat: add uncertainty decision framework for headless workers (t176) #656
Add structured guidance for headless workers on when to make autonomous decisions vs flag uncertainty and exit cleanly. Workers previously had only a vague instruction to 'make a reasonable decision'; this gives them a concrete decision tree with specific proceed/exit criteria.
Changes across three layers:
- supervisor-helper.sh: Inject framework into every worker dispatch prompt
- full-loop.md: Add rule 7 with proceed/exit criteria to headless rules
- headless-dispatch.md: Add full Worker Uncertainty Framework section with decision tree, tables, examples, and supervisor integration notes
Summary of Changes
Hello @marcusquinn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request addresses the challenge of autonomous AI workers encountering ambiguity by introducing a decision-making framework. Its purpose is to let headless agents distinguish between situations where they can confidently proceed and those requiring a clean exit to prevent errors or wasted effort. This enhancement is expected to improve worker reliability and efficiency by reducing risky autonomous decisions and unnecessary task retries.
🔍 Code Quality Report
[MONITOR] Code Review Monitoring Report
[INFO] Latest Quality Status:
[INFO] Recent monitoring activity:
📈 Current Quality Metrics
Generated on: Sun Feb 8 20:01:37 UTC 2026
Generated by AI DevOps Framework Code Review Monitoring
Walkthrough
The changes introduce a new "Uncertainty Decision Framework" (t176) across headless worker documentation and prompts, providing explicit guidance for autonomous decision-making versus blocking/exit scenarios. Updates span command scripts, prompt construction, and tool documentation, with a related evaluation task marked as blocked.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks | ✅ 3 passed
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @.agents/scripts/commands/full-loop.md:
- Around line 187-206: Replace the duplicated "Uncertainty decision framework
(t176)" block with a single-line cross-reference to the canonical "Worker
Uncertainty Framework" in headless-dispatch.md; specifically remove the full
decision-tree content and instead add the suggested sentence referencing the
Worker Uncertainty Framework (headless-dispatch.md#worker-uncertainty-framework)
and instruct readers to document their decision or exit reason in the commit
message or output log, so the section titled "Uncertainty decision framework
(t176)" points to the authoritative source rather than duplicating it.
In @.agents/scripts/supervisor-helper.sh:
- Around line 2583-2615: The supervisor prompt currently asks workers to print
"BLOCKED:" but that marker isn't deterministically parsed by the evaluator;
update the prompt string (variable prompt) to require a structured blocker token
(e.g., TASK_BLOCKED:<JSON with reason, file, suggested_action>) and require the
worker to exit with a distinct non-zero code (e.g., exit 75) so the evaluator
can reliably detect blocked runs; then update the evaluator code paths that
parse worker output (functions like evaluate_worker and extract_log_metadata) to
explicitly look for TASK_BLOCKED and the chosen exit code and treat those runs
as "blocked" (not retriable) so retries won't be triggered.
🧹 Nitpick comments (2)
TODO.md (1)
129-130: Consider adding explicit dependency on t176.
The BLOCKED note captures a meta/circular issue: the task to fix `ambiguous_skipped_ai` evaluation is itself blocked by the `ambiguous_skipped_ai` outcome. Since t176 (this PR) adds uncertainty guidance to help workers know when to exit vs. retry, it seems t175 should logically wait for t176 to be merged before re-attempting dispatch.
Consider adding `blocked-by:t176` to make the dependency explicit, or add a note like "Retry after t176 uncertainty framework is deployed" to clarify next steps.
📋 Suggested update to clarify dependency
-- [ ] t175 Fix `ambiguous_skipped_ai` evaluation — add better heuristic signals `#bug` `#supervisor` ~1h (ai:40m) ref:GH#644 assignee:marcusquinn started:2026-02-08T19:38:50Z logged:2026-02-08
-  - Notes: Recurring evaluation outcome across batches. Evaluator can't determine success/failure, skips AI eval, defaults to retry. Add heuristics: check for commits on branch, check for uncommitted changes in worktree. BLOCKED: Re-prompt dispatch failed: ambiguous_skipped_ai
+- [ ] t175 Fix `ambiguous_skipped_ai` evaluation — add better heuristic signals `#bug` `#supervisor` ~1h (ai:40m) ref:GH#644 assignee:marcusquinn started:2026-02-08T19:38:50Z logged:2026-02-08 blocked-by:t176
+  - Notes: Recurring evaluation outcome across batches. Evaluator can't determine success/failure, skips AI eval, defaults to retry. Add heuristics: check for commits on branch, check for uncommitted changes in worktree. BLOCKED: Re-prompt dispatch failed: ambiguous_skipped_ai. Retry after t176 uncertainty framework is deployed.
.agents/tools/ai-assistants/headless-dispatch.md (1)
460-528: Consider cross-referencing instead of duplicating framework details.
Since both `full-loop.md` (rule 7) and `headless-dispatch.md` (this section) document the same uncertainty framework, consider whether the full framework should live in one authoritative location with the other file referencing it.
For example:
- Keep the detailed framework in `headless-dispatch.md` (tool documentation)
- In `full-loop.md` rule 7, provide a summary with a reference: "See .agents/tools/ai-assistants/headless-dispatch.md#worker-uncertainty-framework for the full decision tree"
This follows the coding guideline to use file:line references instead of duplicating authoritative content, and ensures a single source of truth that's easier to maintain.
As per coding guidelines: Use code examples only when authoritative, otherwise use file:line references.
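The heuristics named in the t175 note above (commits on the branch, uncommitted changes in the worktree) could be sketched roughly as follows. This is a minimal illustration, not the supervisor's actual evaluator: the function name `worktree_progress_signal`, its arguments, and the `commits`/`dirty`/`none` labels are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and output labels are assumptions, not from the PR).
# Prints "commits" if HEAD has commits that base_ref lacks, "dirty" if the
# worktree has uncommitted or untracked changes, otherwise "none".
worktree_progress_signal() {
  local worktree="$1" base_ref="$2" ahead

  # Count commits reachable from HEAD but not from the base ref.
  ahead=$(git -C "$worktree" rev-list --count "${base_ref}..HEAD" 2>/dev/null) || ahead=0

  if [ "$ahead" -gt 0 ]; then
    echo "commits"
  elif [ -n "$(git -C "$worktree" status --porcelain 2>/dev/null)" ]; then
    # Any porcelain output means staged, unstaged, or untracked changes.
    echo "dirty"
  else
    echo "none"
  fi
}
```

An evaluator could call `worktree_progress_signal "$worktree_path" main` and treat `none` as a hint that the worker produced nothing worth retrying on; how those signals map to retry decisions is left open here.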
7. **Uncertainty decision framework** (t176) - When facing ambiguity, use this decision tree:

**PROCEED autonomously** (document decision in commit message):
- Multiple valid approaches exist but all achieve the goal — pick the simplest
- Style/naming choices are ambiguous — follow existing codebase conventions
- Task description is slightly vague but intent is clear from context
- Choosing between equivalent libraries/patterns — match project precedent
- Minor scope questions (e.g., fix adjacent issue?) — stay focused on assigned task

**EXIT cleanly** (include clear explanation in output):
- Task description contradicts what you find in the codebase
- Completing the task requires breaking changes to public APIs or shared interfaces
- The task is already done or obsolete
- Required dependencies, credentials, or services are missing and cannot be inferred
- The task requires architectural decisions that affect other tasks
- Unsure whether to create vs modify a file, and getting it wrong risks data loss

When proceeding, document the choice: `feat: add retry logic (chose exponential backoff — matches existing patterns)`
When exiting, be specific: `BLOCKED: Task says 'update auth endpoint' but 3 exist (JWT, OAuth, API key). Need clarification.`
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
🏁 Script executed:
# Check the full-loop.md file to understand context
wc -l .agents/scripts/commands/full-loop.md
Repository: marcusquinn/aidevops
Length of output: 105
🏁 Script executed:
# Run the suggested search script to check for duplicates
rg -i "proceed autonomously|exit cleanly|uncertainty|decision framework" .agents/ \
-g '*.md' -C 2
# Check for similar proceed/exit criteria
rg -i "multiple valid approaches|style.*naming.*ambiguous|breaking.*api.*change|architectural decision" .agents/ \
-g '*.md' -C 2
# Check for similar headless worker guidance
rg -i "headless.*worker.*guidance|worker.*proceed|worker.*exit" .agents/ \
-g '*.md' -C 2
Repository: marcusquinn/aidevops
Length of output: 14501
🏁 Script executed:
# Check AGENTS.md specifically for decision-making or uncertainty guidance
cat -n .agents/AGENTS.md | head -150
Repository: marcusquinn/aidevops
Length of output: 8918
🏁 Script executed:
# List all .md files in .agents/ to understand structure
find .agents -name "*.md" -type f | head -20
Repository: marcusquinn/aidevops
Length of output: 704
Consolidate framework reference — duplicates existing content in headless-dispatch.md
This uncertainty decision framework is already fully documented as "Worker Uncertainty Framework" in .agents/tools/ai-assistants/headless-dispatch.md, including the decision tree, proceed/exit criteria, and supervisor integration loop. Repeating the entire framework here violates the single source of truth principle.
Replace the duplicate with a cross-reference:
7. **Uncertainty decision framework** (t176) - When facing ambiguous decisions, consult the [Worker Uncertainty Framework](/.agents/tools/ai-assistants/headless-dispatch.md#worker-uncertainty-framework) in headless-dispatch.md. Document your decision choice or exit reason in the commit message or output log.
This maintains the guidance while eliminating duplication and directing readers to the authoritative, detailed source.
🧰 Tools
🪛 LanguageTool
[style] ~194-~194: Consider using a different verb for a more formal wording.
Context: ...edent - Minor scope questions (e.g., fix adjacent issue?) — stay focused on assi...
(FIX_RESOLVE)
# t173: Explicit worker restriction — prevents TODO.md race condition
# t176: Uncertainty decision framework for headless workers
prompt="$prompt

## MANDATORY Worker Restrictions (t173)
- Do NOT edit, commit, or push TODO.md — the supervisor owns all TODO.md updates.
- Do NOT edit todo/PLANS.md or todo/tasks/* — these are supervisor-managed.
- Report status via exit code, log output, and PR creation only.
- Put task notes in commit messages or PR body, never in TODO.md."
- Put task notes in commit messages or PR body, never in TODO.md.

## Uncertainty Decision Framework (t176)
You are a headless worker with no human at the terminal. Use this framework when uncertain:

**PROCEED autonomously when:**
- Multiple valid approaches exist but all achieve the goal (pick the simplest)
- Style/naming choices are ambiguous (follow existing conventions in the codebase)
- Task description is slightly vague but intent is clear from context
- You need to choose between equivalent libraries/patterns (match project precedent)
- Minor scope questions (e.g., should I also fix this adjacent issue?) — stay focused on the assigned task

**FLAG uncertainty and exit cleanly when:**
- The task description contradicts what you find in the codebase
- Completing the task would require breaking changes to public APIs or shared interfaces
- You discover the task is already done or obsolete
- Required dependencies, credentials, or services are missing and cannot be inferred
- The task requires decisions that would significantly affect architecture or other tasks
- You are unsure whether a file should be created vs modified, and getting it wrong would cause data loss

**When you proceed autonomously**, document your decision in the commit message:
\`feat: add retry logic (chose exponential backoff over linear — matches existing patterns in src/utils/retry.ts)\`

**When you exit due to uncertainty**, include a clear explanation in your final output:
\`BLOCKED: Task says 'update the auth endpoint' but there are 3 auth endpoints (JWT, OAuth, API key). Need clarification on which one.\`"
Add a deterministic “blocked” signal to avoid retry loops
The prompt instructs workers to output BLOCKED: and “exit cleanly,” but the supervisor’s evaluator doesn’t appear to deterministically consume that marker. If it’s not parsed, uncertain exits can be misclassified as clean retries. Please verify parsing and add a structured marker/exit code that the evaluator explicitly handles.
🔧 Suggested prompt tweak (paired with evaluator handling)
-**When you exit due to uncertainty**, include a clear explanation in your final output:
-`BLOCKED: Task says 'update the auth endpoint' but there are 3 auth endpoints (JWT, OAuth, API key). Need clarification on which one.`
+**When you exit due to uncertainty**, emit a machine-readable marker (and exit non-zero) so the supervisor can block deterministically:
+`TASK_BLOCKED: Task says 'update the auth endpoint' but there are 3 auth endpoints (JWT, OAuth, API key). Need clarification on which one.`
#!/bin/bash
# Verify whether any component parses BLOCKED/TASK_BLOCKED or similar markers.
rg -n "BLOCKED:|TASK_BLOCKED|uncertainty decision framework|Uncertainty Decision Framework"
rg -n "FULL_LOOP_COMPLETE|TASK_COMPLETE|EXIT:" .agents/scripts
rg -n "evaluate_worker|extract_log_metadata" .agents/scripts
As per coding guidelines, Automation scripts - focus on: Reliability and robustness; Proper exit codes.
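As a rough sketch of the evaluator side of this suggestion, a classifier could require both the `TASK_BLOCKED:` marker and the distinct exit code before treating a run as blocked. The marker and exit code 75 come from the suggestion above; the function names `classify_worker_run` and `blocked_reason` and the `blocked`/`complete`/`retry` labels are hypothetical and are not taken from `evaluate_worker` or `extract_log_metadata`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not the supervisor's real evaluator.
# Exit code 75 and the TASK_BLOCKED marker follow the review suggestion.
BLOCKED_EXIT_CODE=75

# Usage: classify_worker_run <exit_code> <log_file>
# Prints one of: blocked, complete, retry (labels are assumptions).
classify_worker_run() {
  local exit_code="$1" log_file="$2"

  # A run counts as blocked only when both signals agree: the dedicated
  # exit code and a marker line at the start of a log line.
  if [ "$exit_code" -eq "$BLOCKED_EXIT_CODE" ] &&
    grep -q '^TASK_BLOCKED:' "$log_file"; then
    echo "blocked"
  elif [ "$exit_code" -eq 0 ]; then
    echo "complete"
  else
    echo "retry"
  fi
}

# Usage: blocked_reason <log_file>
# Prints the text after the last TASK_BLOCKED: marker, if any.
blocked_reason() {
  sed -n 's/^TASK_BLOCKED:[[:space:]]*//p' "$1" | tail -n 1
}
```

Requiring both signals is one possible design choice: a stray marker in otherwise successful output, or an unrelated failure that happens to exit 75, would not be classified as blocked.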
Code Review
This pull request introduces a well-defined uncertainty decision framework for headless workers, which is a great improvement for making them more robust and predictable. The changes are clearly documented in full-loop.md and headless-dispatch.md, and the framework is correctly injected into the worker prompt in supervisor-helper.sh.
My review includes one suggestion in supervisor-helper.sh to improve code maintainability by using a heredoc for the large multi-line string. This aligns with general shell scripting best practices for readability, as encouraged by the repository's style guide which focuses on shell scripting quality.
For better readability and maintainability, consider using a heredoc to manage this large block of text. This approach is cleaner than a multi-line string assignment, as it avoids the need to escape special characters like backticks (`). Using `+=` for string concatenation is also a more modern and readable pattern than `var="$var..."`.
# Leading blank lines keep the new sections separate from the existing prompt text.
prompt+=$(cat <<-'EOF'


## MANDATORY Worker Restrictions (t173)
- Do NOT edit, commit, or push TODO.md — the supervisor owns all TODO.md updates.
- Do NOT edit todo/PLANS.md or todo/tasks/* — these are supervisor-managed.
- Report status via exit code, log output, and PR creation only.
- Put task notes in commit messages or PR body, never in TODO.md.

## Uncertainty Decision Framework (t176)
You are a headless worker with no human at the terminal. Use this framework when uncertain:

**PROCEED autonomously when:**
- Multiple valid approaches exist but all achieve the goal (pick the simplest)
- Style/naming choices are ambiguous (follow existing conventions in the codebase)
- Task description is slightly vague but intent is clear from context
- You need to choose between equivalent libraries/patterns (match project precedent)
- Minor scope questions (e.g., should I also fix this adjacent issue?) — stay focused on the assigned task

**FLAG uncertainty and exit cleanly when:**
- The task description contradicts what you find in the codebase
- Completing the task would require breaking changes to public APIs or shared interfaces
- You discover the task is already done or obsolete
- Required dependencies, credentials, or services are missing and cannot be inferred
- The task requires decisions that would significantly affect architecture or other tasks
- You are unsure whether a file should be created vs modified, and getting it wrong would cause data loss

**When you proceed autonomously**, document your decision in the commit message:
`feat: add retry logic (chose exponential backoff over linear — matches existing patterns in src/utils/retry.ts)`

**When you exit due to uncertainty**, include a clear explanation in your final output:
`BLOCKED: Task says 'update the auth endpoint' but there are 3 auth endpoints (JWT, OAuth, API key). Need clarification on which one.`
EOF
)
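A small runnable sketch of the heredoc pattern on a shortened prompt may help when checking its behavior. The prompt text here is illustrative only, and the sketch uses the portable `prompt=$prompt$(...)` form; in bash, `prompt+=$(...)` behaves the same way.

```shell
#!/usr/bin/env bash
# Illustrative prompt text only; not the real dispatch prompt.
prompt="Task: t176"

# The quoted delimiter ('EOF') disables expansion inside the heredoc, so
# backticks and dollar signs stay literal and need no escaping.
# Command substitution strips trailing newlines but keeps the leading
# blank lines, which separate the appended section from the existing text.
prompt=$prompt$(cat <<'EOF'


## Uncertainty Decision Framework (t176)
When you exit due to uncertainty, explain why in your final output:
`BLOCKED: <reason>`
EOF
)

printf '%s\n' "$prompt"
```

Note that the `<<-` form used in the suggestion strips leading tabs only, not spaces, so it matters only when the heredoc body is tab-indented.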


Summary
`build_dispatch_cmd()` in supervisor-helper.sh
Problem
Workers previously had only a vague instruction: "If you encounter ambiguity, make a reasonable decision and document it in a commit message. If truly blocked, exit cleanly." This led to:
Solution
A three-question decision tree:
With specific examples for both categories (proceed: style choices, vague descriptions, equivalent patterns; exit: contradictions, breaking API changes, missing credentials, architectural decisions).
Files Changed
.agents/scripts/supervisor-helper.sh
.agents/scripts/commands/full-loop.md
.agents/tools/ai-assistants/headless-dispatch.md
Testing