t1408.4: Add batch execution strategies for task decomposition dispatch#3000
marcusquinn merged 1 commit into main
Conversation
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the task decomposition pipeline by introducing sophisticated batch execution strategies. It provides mechanisms to intelligently order and dispatch subtasks, allowing for either depth-first completion of task branches or breadth-first progress across multiple branches. This improvement aims to optimize resource utilization and streamline the execution of complex, interdependent tasks, making the overall system more efficient and robust in handling decomposed work.

Highlights
Changelog
🔍 Code Quality Report — generated on Fri Mar 6 16:13:59 UTC 2026 by AI DevOps Framework Code Review Monitoring.
Code Review
This pull request introduces batch execution strategies for task decomposition, which is a great enhancement. However, it introduces significant security vulnerabilities and logic flaws. Specifically, the circular dependency detection is incorrectly implemented, leading to both missed cycles and false positives. Additionally, several grep calls are vulnerable to regex and argument injection, which could be used to bypass validation checks. Beyond security, the new shell script has performance bottlenecks due to inefficient jq usage within shell loops, and there are minor documentation issues including markdown formatting and a broken link.
```shell
get_dispatchable_tasks() {
    local all_tasks="$1"

    # Filter to pending tasks first
    local pending_tasks
    pending_tasks=$(echo "$all_tasks" | jq '[.[] | select(.status == "pending")]')

    local result="[]"
    local task_count
    task_count=$(echo "$pending_tasks" | jq 'length')

    local i=0
    while [[ $i -lt "$task_count" ]]; do
        local task
        task=$(echo "$pending_tasks" | jq ".[$i]")

        if is_task_unblocked "$task" "$all_tasks"; then
            result=$(echo "$result" | jq --argjson task "$task" '. + [$task]')
        fi

        i=$((i + 1))
    done

    echo "$result"
    return 0
}
```
The get_dispatchable_tasks function is inefficient because it uses a shell while loop that calls is_task_unblocked on each iteration, which in turn repeatedly invokes jq. This will perform poorly with many tasks. The entire logic can be consolidated into a single, much more performant jq command. With this change, the is_task_unblocked function (lines 208-232) is no longer needed and can be removed.
Suggested change:

```shell
get_dispatchable_tasks() {
    local all_tasks="$1"
    # This single jq command is much more efficient.
    # 1. It creates a map of task statuses for quick lookups.
    # 2. It filters for "pending" tasks.
    # 3. For each pending task, it checks if all its blockers are "completed".
    echo "$all_tasks" | jq '
        # Create a lookup map of task IDs to statuses for efficient checking.
        (map({(.id): .status}) | add) as $statuses
        # Filter for pending tasks whose blockers are all completed.
        | [
            .[]
            | select(.status == "pending")
            | select((.blocked_by // []) | all($statuses[.] == "completed"))
          ]
    '
    return 0
}
```
References
- Consolidate multiple 'jq' calls into a single pass where possible to improve performance and script efficiency.
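As a quick sanity check of the pattern the suggestion relies on (an id-to-status lookup map plus `all` over `blocked_by`), the filter can be exercised standalone. The task shapes below are illustrative, not taken from the repo:

```shell
# Illustrative input: t1 done, t2 blocked only by t1, t3 blocked by pending t2.
tasks='[
  {"id":"t1","status":"completed"},
  {"id":"t2","status":"pending","blocked_by":["t1"]},
  {"id":"t3","status":"pending","blocked_by":["t2"]}
]'
# Same shape of filter as the suggestion: build an id->status map, then keep
# pending tasks whose blockers are all completed.
echo "$tasks" | jq -c '
  (map({(.id): .status}) | add) as $statuses
  | [ .[]
      | select(.status == "pending")
      | select((.blocked_by // []) | all($statuses[.] == "completed")) ]
  | map(.id)
'
# Prints ["t2"]: t3 stays out because its blocker t2 is still pending.
```

Tasks with no `blocked_by` field pass trivially, since `all` over an empty array is true.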
```shell
order_depth_first() {
    local tasks="$1"
    local concurrency="$2"

    local grouped
    grouped=$(group_by_parent "$tasks")

    # Get sorted branch keys (parent IDs) for deterministic ordering
    local branches
    branches=$(echo "$grouped" | jq -r 'keys | sort | .[]')

    local batches="[]"

    local branch
    while IFS= read -r branch; do
        [[ -z "$branch" ]] && continue

        # Get task IDs for this branch, sorted by ID for deterministic order
        local branch_task_ids
        branch_task_ids=$(echo "$grouped" | jq -r --arg b "$branch" \
            '.[$b] | sort_by(.id) | .[].id')

        # Split into batches of $concurrency size
        local batch="[]"
        local batch_count=0

        local task_id
        while IFS= read -r task_id; do
            [[ -z "$task_id" ]] && continue

            batch=$(echo "$batch" | jq --arg id "$task_id" '. + [$id]')
            batch_count=$((batch_count + 1))

            if [[ "$batch_count" -ge "$concurrency" ]]; then
                batches=$(echo "$batches" | jq --argjson b "$batch" '. + [$b]')
                batch="[]"
                batch_count=0
            fi
        done <<<"$branch_task_ids"

        # Add remaining tasks in the last partial batch
        if [[ "$batch_count" -gt 0 ]]; then
            batches=$(echo "$batches" | jq --argjson b "$batch" '. + [$b]')
        fi
    done <<<"$branches"

    echo "$batches"
    return 0
}
```
This function is inefficient because it uses a shell loop to build batches, calling jq on each iteration to append to a JSON array. This pattern of batch=$(echo "$batch" | jq ...) is very slow. The entire batching logic can be performed much more efficiently within a single jq command, similar to the implementation of order_breadth_first.
Suggested change:

```shell
order_depth_first() {
    local tasks="$1"
    local concurrency="$2"
    local grouped
    grouped=$(echo "$(group_by_parent "$tasks")")
    # Single jq pass: for each branch (sorted by parent ID), collect its
    # task IDs in deterministic order and split them into chunks of $c.
    echo "$grouped" | jq --argjson c "$concurrency" '
        . as $g
        | keys | sort
        | map(
            ($g[.] | sort_by(.id) | map(.id)) as $ids
            | [range(0; ($ids | length); $c)]
            | map(. as $i | $ids[$i:$i + $c])
          )
        | flatten(1)
    '
    return 0
}
```
References
- Consolidate multiple 'jq' calls into a single pass where possible to improve performance and script efficiency.
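The review refers to `order_breadth_first`, which is not shown in this diff. For illustration only — not the PR's actual code, and assuming each task object carries `.id` and `.parent` fields — a round-robin interleave across branches followed by chunking can likewise be done in one jq pass:

```shell
# Hypothetical sketch: breadth-first ordering in a single jq pass.
# Assumes non-empty input where each task has .id and .parent.
order_breadth_first_sketch() {
    local tasks="$1" concurrency="$2"
    echo "$tasks" | jq -c --argjson c "$concurrency" '
        group_by(.parent)                    # one sub-array per branch
        | map(sort_by(.id) | map(.id))       # deterministic IDs per branch
        | (map(length) | max) as $rounds     # longest branch drives the rounds
        | [ range(0; $rounds) as $r | .[] | .[$r] // empty ]  # interleave
        | [ range(0; length; $c) as $i | .[$i:$i + $c] ]      # chunk
    '
}

tasks='[{"id":"a1","parent":"a"},{"id":"a2","parent":"a"},{"id":"b1","parent":"b"}]'
order_breadth_first_sketch "$tasks" 2
# Prints [["a1","b1"],["a2"]]: one task from each branch first, then the rest.
```

The `// empty` drops the holes left by shorter branches during the interleave, so batches stay densely packed.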
```shell
local blocker_id
while IFS= read -r blocker_id; do
    [[ -z "$blocker_id" ]] && continue
    if ! echo "$all_ids" | grep -qx "$blocker_id"; then
```
The use of `grep -qx "$variable"` without the `-F` (fixed strings) flag and the `--` (end of options) separator is vulnerable to both regex and argument injection.

1. **Regex injection**: Without `-F`, the input is treated as a regular expression. An attacker can provide a task ID like `.*` to bypass existence checks (line 505) or cause false positives in cycle detection (line 532).
2. **Argument injection**: If a task ID starts with a hyphen (e.g., `-v`), it will be interpreted as an option by `grep`, which can break the script's logic or be used to read arbitrary files if an attacker can inject options like `-f`.

**Remediation**: Always use `grep -Fqx -- "$variable"` when performing literal string matches in shell scripts.
Suggested change:

```shell
if ! echo "$all_ids" | grep -Fqx -- "$blocker_id"; then
```
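The bypass described above is easy to reproduce in isolation; the IDs below are made up for demonstration:

```shell
# An attacker-supplied "task id" that is really a regex.
all_ids=$'t1\nt2'
blocker='.*'

# Without -F, grep treats the id as a pattern: ".*" matches every line,
# so the existence check passes for an id that does not exist.
if echo "$all_ids" | grep -qx "$blocker"; then
    echo "unsafe: phantom blocker accepted"
fi

# With -F and --, the id is matched literally and correctly rejected.
if ! echo "$all_ids" | grep -Fqx -- "$blocker"; then
    echo "safe: literal match fails as expected"
fi
```

The `--` also neutralizes IDs beginning with `-`, which would otherwise be parsed as `grep` options.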
```shell
while [[ $depth -lt 20 ]]; do
    local deps
    deps=$(echo "$TASKS_JSON" | jq -r --arg id "$current" \
        '.[] | select(.id == $id) | .blocked_by // [] | .[]' 2>/dev/null)

    if [[ -z "$deps" ]]; then
        break
    fi

    local dep
    while IFS= read -r dep; do
        [[ -z "$dep" ]] && continue
        if echo "$visited" | grep -qx "$dep"; then
            has_cycle=true
            result=$(echo "$result" | jq --arg id "$task_id" --arg dep "$dep" \
                '.valid = false | .errors += ["Circular dependency detected: \($id) -> ... -> \($dep)"]')
            break 2
        fi
        visited="${visited}"$'\n'"${dep}"
        current="$dep"
    done <<<"$deps"

    depth=$((depth + 1))
done
```
The circular dependency detection logic in `cmd_validate` is fundamentally flawed and will produce both false negatives and false positives.

- **False negatives (missed cycles)**: The nested loop structure (lines 520-543) does not correctly implement a depth-first search (DFS). The inner loop (line 530) iterates over all dependencies of a task but overwrites the `current` variable (line 539) on each iteration, so the outer loop only continues from the last dependency processed. Cycles in other branches of the dependency graph are ignored.
- **False positives (incorrect cycle detection)**: The `visited` list (line 516) is shared across all branches of a task and is never reset when backtracking. In a diamond-shaped dependency graph (e.g., A depends on B and C, and both B and C depend on D), the script incorrectly flags a circular dependency when it encounters D the second time, even though no cycle exists.
- **Depth limit**: The hardcoded depth limit of 20 (line 520) is an arbitrary restriction that may fail for complex task graphs, although it is intended as a safeguard.

To fix this, implement a proper recursive DFS or a stack-based traversal that correctly manages the visited set for the current path and a processed set to avoid redundant work.
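One way to meet this requirement in a single jq pass — a sketch under the assumption that tasks carry `id` and an optional `blocked_by` array, not the PR's actual implementation — is a Kahn-style peel: repeatedly resolve tasks whose blockers are all satisfied, and anything left over is part of a cycle (or references a missing task):

```shell
# Sketch: flag unresolvable tasks (cycles or dangling blocked_by refs)
# without per-task shell loops. Illustrative only, not the PR's code.
detect_unresolvable() {
    local tasks_json="$1"
    echo "$tasks_json" | jq -c '
        {remaining: ., done: []}
        | until(
            # Stop when no remaining task has all of its blockers resolved.
            (.done as $d
             | [.remaining[] | select(((.blocked_by // []) - $d) == [])]
             | length) == 0;
            . as $s
            | ([$s.remaining[]
                | select(((.blocked_by // []) - $s.done) == []) | .id]) as $ready
            | {remaining: [$s.remaining[] | select(.id | IN($ready[]) | not)],
               done: ($s.done + $ready)}
          )
        | [.remaining[].id]   # IDs stuck in a cycle or with missing blockers
    '
}

# Diamond graph (A needs B and C; B and C both need D): no false positive.
diamond='[{"id":"A","blocked_by":["B","C"]},{"id":"B","blocked_by":["D"]},
          {"id":"C","blocked_by":["D"]},{"id":"D"}]'
detect_unresolvable "$diamond"   # []

# Genuine two-node cycle: both IDs are reported.
cycle='[{"id":"X","blocked_by":["Y"]},{"id":"Y","blocked_by":["X"]}]'
detect_unresolvable "$cycle"     # ["X","Y"]
```

Unlike a path-tracking DFS, this does not name the specific cycle edge, but it cannot produce diamond-graph false positives and needs no arbitrary depth cap.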
```markdown
**Configuration**: `BATCH_STRATEGY` env var (default: `depth-first`). Can be overridden per-repo via bundle config or per-task via the decomposition pipeline.

**Full docs**: `scripts/commands/pulse.md` "Batch execution strategies" section, `scripts/batch-strategy-helper.sh help`
```
The documentation link scripts/commands/pulse.md appears to be incorrect. Based on the project structure, it should likely be .agents/scripts/commands/pulse.md to ensure it resolves correctly. This can cause confusion and make it harder for developers to find the relevant documentation.
Suggested change:

```markdown
**Full docs**: `.agents/scripts/commands/pulse.md` "Batch execution strategies" section, `scripts/batch-strategy-helper.sh help`
```
```shell
    return 1
fi
# Validate it's valid JSON
if ! echo "$TASKS_JSON" | jq empty 2>/dev/null; then
```
The use of 2>/dev/null here to suppress jq errors violates the project's general rule to avoid blanket error suppression. While you handle the error case, hiding jq's own error message makes debugging harder if the issue is not just invalid JSON but something more subtle. This pattern appears several times in the script (e.g., lines 213, 224, 500, 523). Please remove the error suppression. For validation, jq . >/dev/null is a good alternative that silences stdout but preserves stderr.
Suggested change:

```shell
if ! echo "$TASKS_JSON" | jq . >/dev/null; then
```
References
- Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.
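The difference is easy to see with a deliberately truncated document; the sample JSON here is made up:

```shell
# The sample JSON is deliberately truncated to trigger a parse error.
bad_json='{"id": '

# Blanket suppression: the reason for failure is discarded entirely.
if ! echo "$bad_json" | jq empty 2>/dev/null; then
    echo "invalid JSON (no diagnostic available)"
fi

# Suppressing stdout only: validation still fails, but jq writes its
# parse error to stderr, so logs keep the actual cause.
if ! echo "$bad_json" | jq . >/dev/null; then
    echo "invalid JSON (see jq error on stderr)"
fi
```

Both forms detect the bad input; only the second leaves a diagnostic behind when something subtler than a syntax error goes wrong.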
```markdown
### Batch Strategies for Decomposed Tasks (t1408.4)
```
The new section "Batch Strategies for Decomposed Tasks" and the preceding blank line have been inserted in the middle of a Markdown table. This breaks the table's rendering, causing subsequent rows to be displayed as plain text. The new section should be moved to after the table to ensure correct formatting.
Merge conflict detected after recent merges to main. Dispatching worker to rebase and resolve conflicts. |
…ispatch

Add batch-strategy-helper.sh with depth-first and breadth-first batch ordering for dispatching decomposed subtasks. Depth-first (default) completes all subtasks under one branch before starting the next; breadth-first spreads progress evenly across branches.

The helper integrates with existing pulse concurrency controls:

- Respects MAX_WORKERS and available worker slots
- Honors blocked_by dependencies (blocked tasks excluded from batches)
- Validates dependency graphs (detects cycles, missing refs, excessive depth)
- Configurable concurrency limit per batch (max 8)
- macOS bash 3.2 compatible (no associative arrays)

Updated pulse.md, orchestration.md, and headless-dispatch.md with batch strategy guidance and usage examples.

Closes #2987
Force-pushed 0fa843f to fffeda2.
🔍 Code Quality Report — generated on Fri Mar 6 16:43:05 UTC 2026 by AI DevOps Framework Code Review Monitoring.
…3033) Consolidate shell loops into single jq passes for performance and fix flawed circular dependency detection from PR #3000 review feedback.

Changes:

- Replace `get_dispatchable_tasks` O(n*m) shell loop with a single jq pass
- Remove now-unused `is_task_unblocked` function
- Replace `order_depth_first` shell loop with a single jq pass
- Rewrite circular dependency detection with a proper stack-based DFS using jq `until()`, fixing false positives on diamond graphs and false negatives from incomplete traversal
- Remove blanket `2>/dev/null` from jq calls (preserve stderr for debugging)

Closes #3030

Address all 5 findings from the PR #3000 Gemini review:

- HIGH: Refactor `get_dispatchable_tasks` to a single jq pass, eliminating O(n*m) shell-loop jq invocations. Remove the now-unused `is_task_unblocked` function.
- HIGH: Refactor `order_depth_first` to a single jq pass, replacing the inefficient pattern of appending to JSON arrays in a shell loop.
- MEDIUM: Fix grep injection vulnerabilities — replaced all bare `grep -qx` calls with jq-based validation (grep calls eliminated entirely by moving validation logic into jq).
- MEDIUM: Fix flawed circular dependency detection that produced false positives on diamond graphs and false negatives on multi-branch cycles. Replaced with a proper DFS using an explicit stack and path tracking, implemented entirely in jq.
- MEDIUM: Replace `2>/dev/null` with `>/dev/null` on the jq validation call to preserve stderr for debugging while still suppressing stdout.

Closes #3030

The 'Batch Strategies for Decomposed Tasks' section (from PR #3000) was inserted in the middle of the Decision Table, splitting two rows ('Cron' and 'Migration') from the rest of the table. This caused those rows to render as plain text instead of table cells. Move the two orphaned rows back into the table and place the Batch Strategies section after the complete table.

Closes #3031

…spatch.md (#3065) PR #3000 inserted the 'Batch Strategies for Decomposed Tasks' section in the middle of the Decision Table, leaving two rows (Cron and Migration scenarios) stranded after the section. This broke table rendering, causing those rows to display as plain text. Moves the two orphaned rows back into the table before the section break.

Closes #3031



Summary

- Add `batch-strategy-helper.sh` with depth-first and breadth-first batch ordering for dispatching decomposed subtasks from the task decomposition pipeline (t1408)
- Integrates with existing pulse concurrency controls (`MAX_WORKERS`, `blocked_by:` dependencies, `BATCH_STRATEGY` env var)
- Updated `pulse.md`, `orchestration.md`, and `headless-dispatch.md` with batch strategy guidance, usage examples, and configuration docs

Details

New file: `.agents/scripts/batch-strategy-helper.sh` — deterministic helper with three commands:

- `order`: Generate the full batch ordering for all pending tasks
- `next-batch`: Get the next batch of tasks ready for immediate dispatch
- `validate`: Validate the task dependency graph (detect cycles, missing deps, excessive depth)

Strategies:

- `depth-first` (default): completes all subtasks under one branch before starting the next
- `breadth-first`: spreads progress evenly across branches

Integration points:

- `next-batch` when dispatching decomposed subtasks
- `blocked_by:` dependencies — blocked tasks excluded from batches
- `MAX_BATCH_SIZE` (8)

Testing

Tested with multiple scenarios (including missing `blocked_by` refs).

Closes #2987