
t1311: Evaluate oh-my-pi swarm DAG patterns for supervisor dispatch#2235

Merged
marcusquinn merged 1 commit into main from feature/t1311
Feb 25, 2026

Conversation

@marcusquinn
Owner

@marcusquinn marcusquinn commented Feb 24, 2026

Summary

  • Comprehensive research comparing oh-my-pi's swarm DAG orchestration with our supervisor dispatch system
  • Identifies 6 gaps: no graph construction, no parallel execution waves, no cycle detection, unused blocks: field, no batch-level dependency ordering, no shared workspace communication
  • Proposes 4 concrete enhancements (~14h total effort) that graft graph-based dependency resolution onto the existing TODO.md architecture

Key Findings

| Capability | oh-my-pi | aidevops | Gap |
| --- | --- | --- | --- |
| Graph construction | `Map<string, Set<string>>` | String matching via grep | High |
| Cycle detection | Kahn's algorithm | None | High |
| Execution waves | Topological sort | Flat queue | High |
| Model routing | Single model per swarm | AI-classified per-task tier | aidevops superior |
| Retry/escalation | None | Prompt-repeat + model escalation | aidevops superior |
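For concreteness, the adjacency-list shape oh-my-pi keeps in memory can be approximated from TODO.md in a single grep/sed/jq pass. This is a hedged sketch: the sample tasks are hypothetical, and the pipeline is illustrative rather than the repo's actual parser.

```shell
# Hypothetical TODO.md fragment using the blocked-by: convention quoted in this PR.
todo='- [ ] t002 implement parser blocked-by:t001
- [ ] t004 integrate blocked-by:t002,t003'

# One pass: extract "task deps" pairs, then fold into { task: [deps...] } with jq.
graph=$(printf '%s\n' "$todo" |
  grep -E '^[[:space:]]*- \[ \] t[0-9]+.*blocked-by:' |
  sed -nE 's/^[[:space:]]*- \[ \] (t[0-9]+(\.[0-9]+)*).*blocked-by:([^ ]+).*/\1 \3/p' |
  jq -Rcn '[inputs | split(" ") | {(.[0]): (.[1] | split(","))}] | add // {}')
echo "$graph"
```

Unlike repeated grep lookups, this pays the parsing cost once and yields a structure that cycle detection and wave computation can consume directly.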

Proposed Enhancements

  1. P0: Graph-based dependency resolution (~4h) — replace grep with adjacency list + Kahn's cycle detection
  2. P1: Execution wave computation for batches (~6h) — topological sort for parallel dispatch
  3. P2: Make blocks: functional (~1h) — inverse of blocked-by:, like oh-my-pi's reports_to
  4. P2: Dependency visualization (~3h) — supervisor-helper.sh dag command
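The P0 cycle-detection step can be sketched entirely in jq by repeatedly peeling zero-in-degree nodes (the fixed point of Kahn's algorithm): whatever cannot be peeled is a cycle member. This is an illustrative sketch, not the repo's implementation; `detect_cycles` is a hypothetical name, and deps that never appear as task keys are treated as already satisfied.

```shell
# Input: JSON graph { task: [deps...] } on stdin.
# Output: JSON array of cycle members ([] if acyclic).
detect_cycles() {
  jq -c '
    def peel:
      # Nodes with no remaining deps are removable this round.
      (to_entries | map(select(.value == [])) | map(.key)) as $free
      | if $free == [] then keys          # nothing removable: remaining keys are in cycles
        else (with_entries(select([.key] | inside($free) | not)
                           | .value |= map(select([.] | inside($free) | not)))
              | peel)
        end;
    . as $g
    # Drop references to tasks that never appear as keys (treated as satisfied).
    | with_entries(.value |= map(select(. as $d | $g | has($d))))
    | peel'
}
```

Example: `{"t1":["t2"],"t2":["t1"]}` reports both tasks; a linear chain reports none.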

Deliverable

Research document: todo/tasks/t1311-swarm-dag-research.md

Ref #2135

Summary by CodeRabbit

  • Documentation
    • Added research documentation on swarm DAG patterns and architectural analysis for task dispatch optimization.

…h (t1311)

Compare oh-my-pi YAML-defined swarm orchestration (reports_to/waits_for,
Kahn's algorithm topological sort, execution waves) with our TODO.md
blocked-by system. Identifies 6 gaps and proposes 4 concrete enhancements:

1. Graph-based dependency resolution (replace grep with adjacency list + cycle detection)
2. Execution wave computation for batches (topological sort for parallel dispatch)
3. Make blocks: field functional (inverse of blocked-by, like oh-my-pi reports_to)
4. Dependency visualization (dag subcommand for debugging)

Key insight: graft graph-based dependency resolution onto existing TODO.md
architecture rather than replicate oh-my-pi's YAML pipeline model.

Ref #2135
@gemini-code-assist

Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a comprehensive research document that critically evaluates the current supervisor dispatch system's approach to task dependencies against the more robust DAG orchestration capabilities of oh-my-pi's swarm extension. It highlights significant limitations in the supervisor's ability to manage complex task flows, such as detecting circular dependencies or executing tasks in parallel waves. The PR then proposes a set of targeted, high-impact enhancements designed to graft advanced graph-based dependency resolution onto the existing architecture, aiming to improve task scheduling efficiency, prevent deadlocks, and provide better visibility into task relationships without a full system rewrite.

Highlights

  • Comprehensive Research Document Added: A detailed research report, 't1311-swarm-dag-research.md', has been added, comparing oh-my-pi's swarm DAG orchestration with the existing supervisor dispatch system.
  • Identified Gaps in Supervisor Dispatch: The report identifies 6 key gaps in the current supervisor's dependency handling, including the lack of graph construction, parallel execution waves, and cycle detection.
  • Proposed Enhancements for Dependency Resolution: Four concrete enhancements are proposed to integrate graph-based dependency resolution into the supervisor, focusing on preventing deadlocks, maximizing parallelism, and improving observability.
  • Implementation Plan and Effort Estimates: Each proposed enhancement includes an implementation outline, estimated effort, and a recommended priority and order for development.
Changelog
  • todo/tasks/t1311-swarm-dag-research.md
    • Added a new research document comparing oh-my-pi's swarm DAG orchestration with the supervisor dispatch system.
    • Documented 6 specific gaps in the supervisor's current dependency management capabilities.
    • Proposed 4 concrete enhancements to introduce graph-based dependency resolution, execution wave computation, functional 'blocks:' field, and dependency visualization.
    • Included detailed implementation plans, effort estimates, and priority rankings for each proposed enhancement.
Activity
  • No human activity has occurred on this pull request yet.

@coderabbitai
Contributor

coderabbitai bot commented Feb 24, 2026

Walkthrough

A research document analyzing supervisor dispatch architecture, comparing current implementation with oh-my-pi Swarm patterns, identifying gaps (no graph, no waves, no shared workspace), and proposing four graph-based dependency resolution enhancements with concrete integration points and implementation effort estimates.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Swarm DAG Research & Analysis**<br>`todo/tasks/t1311-swarm-dag-research.md` | Research document introducing comparative analysis of swarm DAG patterns; outlines gaps in current supervisor architecture (string-based blockers, missing wave execution, no reports_to, no execution modes); proposes four prioritized enhancements: graph-based dependency resolution, execution wave computation, functional blocks normalization, and DAG visualization via new supervisor-helper.sh commands. Details implementation plan with file modifications, effort estimates, and integration points in todo-sync.sh and pulse.sh. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~35 minutes

Poem

📊 A DAG of dreams takes shape on page,
Swarms dancing through a graph-ruled stage,
From TODO strings to waves so bright,
Dependencies drawn and sorted right! 🌊✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and specifically references the main objective: evaluating oh-my-pi swarm DAG patterns for supervisor dispatch, which is the core focus of the research document added. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@github-actions

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 58 code smells

[INFO] Recent monitoring activity:
Tue Feb 24 20:51:50 UTC 2026: Code review monitoring started
Tue Feb 24 20:51:51 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 58

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 58
  • VULNERABILITIES: 0

Generated on: Tue Feb 24 20:51:53 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@sonarqubecloud
@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request adds a comprehensive research document comparing the oh-my-pi swarm DAG orchestration with the existing supervisor dispatch system. The report is well-structured, identifying key gaps and proposing concrete enhancements with code snippets. My review focuses on improving the proposed implementation for building the dependency graph, suggesting a more performant approach to avoid inefficiencies in the proposed shell script.

Comment on lines +238 to +360
```bash
# Build adjacency list from TODO.md blocked-by: fields
# Output: JSON object { "t002": ["t001", "t003"], "t004": ["t002"] }
build_task_dependency_graph() {
    local todo_file="$1"
    local graph="{}"

    while IFS= read -r line; do
        local task_id blocked_by
        task_id=$(printf '%s' "$line" | grep -oE 't[0-9]+(\.[0-9]+)*' | head -1)
        blocked_by=$(printf '%s' "$line" | grep -oE 'blocked-by:[^ ]+' | head -1 | sed 's/blocked-by://')
        [[ -z "$task_id" || -z "$blocked_by" ]] && continue

        # Add to graph as JSON
        local deps_json
        deps_json=$(printf '%s' "$blocked_by" | tr ',' '\n' | jq -R . | jq -s .)
        graph=$(printf '%s' "$graph" | jq --arg id "$task_id" --argjson deps "$deps_json" '. + {($id): $deps}')
    done < <(grep -E '^\s*- \[ \] t[0-9]+.*blocked-by:' "$todo_file" || true)

    printf '%s' "$graph"
    return 0
}
```

Add cycle detection:

```bash
# Detect cycles using iterative DFS (Kahn's algorithm in shell)
# Returns: comma-separated cycle members, or empty if acyclic
detect_dependency_cycles() {
    local graph_json="$1"
    # ... Kahn's algorithm implementation ...
    # If sorted_count < total_nodes, return unsorted nodes (cycle members)
}
```

**Files to modify:**
- `.agents/scripts/supervisor/todo-sync.sh` — add graph builder + cycle detection
- `.agents/scripts/supervisor/pulse.sh` — call graph builder in Phase 0.5d, log cycles as warnings

**Effort:** ~4h
**Value:** Prevents silent deadlocks, enables wave computation (Enhancement 2)

### Enhancement 2: Execution Wave Computation for Batches (HIGH PRIORITY)

**What:** When a batch is created with inter-dependent tasks, compute execution waves and dispatch wave-by-wave instead of flat queue order.

**Implementation:**

Add `compute_batch_waves()` to `batch.sh`:

```bash
# Compute execution waves from batch task dependencies
# Input: batch_id
# Output: JSON array of arrays [["t001","t002"], ["t003"], ["t004","t005"]]
compute_batch_waves() {
    local batch_id="$1"

    # Get all tasks in batch with their blocked-by deps
    local tasks_json
    tasks_json=$(db -json "$SUPERVISOR_DB" "
        SELECT t.id, t.description
        FROM batch_tasks bt
        JOIN tasks t ON bt.task_id = t.id
        WHERE bt.batch_id = '$(sql_escape "$batch_id")'
        ORDER BY bt.position;
    ")

    # Build dependency graph from TODO.md blocked-by fields
    # ... use build_task_dependency_graph() ...

    # Compute waves via topological sort
    # ... Kahn's algorithm adapted for shell ...

    # Store waves in batch metadata
    db "$SUPERVISOR_DB" "UPDATE batches SET waves = '...' WHERE id = '...';"
}
```

Modify `cmd_next()` to respect wave ordering:

```bash
# In cmd_next(), after fetching candidates:
# If batch has computed waves, only return tasks from the current wave
# (i.e., tasks whose wave predecessors are all complete)
```
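The wave-gating logic described in that comment could look like the following. This is a hedged sketch: `current_wave_tasks` and the JSON shapes are illustrative names, not existing helpers in the supervisor scripts.

```shell
# Given waves (JSON array of arrays) and completed task ids (JSON array),
# print the still-pending tasks of the earliest unfinished wave, one per line.
current_wave_tasks() {
  local waves_json="$1" done_json="$2"
  jq -nr --argjson waves "$waves_json" --argjson done "$done_json" '
    ($waves
     | map(select(any(.[]; [.] | inside($done) | not)))   # waves with pending work
     | first // [])
    | map(select([.] | inside($done) | not))              # drop already-completed tasks
    | .[]'
}
```

With waves `[["t001","t002"],["t003"]]` and `t001` done, only `t002` is dispatchable; `t003` unlocks once wave 1 completes.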

**Files to modify:**
- `.agents/scripts/supervisor/batch.sh` — add wave computation
- `.agents/scripts/supervisor/state.sh` — modify `cmd_next()` to respect waves
- `.agents/scripts/supervisor/database.sh` — add `waves` column to batches table

**Effort:** ~6h
**Value:** Enables diamond/fan-out/fan-in patterns; maximizes parallelism within dependency constraints

### Enhancement 3: Make `blocks:` Functional (LOW EFFORT, MEDIUM VALUE)

**What:** Parse the existing `blocks:` field in TODO.md as the inverse of `blocked-by:`. When building the dependency graph, treat `blocks:t003` on task t001 as equivalent to `blocked-by:t001` on task t003.

**Implementation:**

In `build_task_dependency_graph()`, add a second pass:

```bash
# Second pass: process blocks: fields (inverse direction)
while IFS= read -r line; do
    local task_id blocks_field
    task_id=$(printf '%s' "$line" | grep -oE 't[0-9]+(\.[0-9]+)*' | head -1)
    blocks_field=$(printf '%s' "$line" | grep -oE 'blocks:[^ ]+' | head -1 | sed 's/blocks://')
    [[ -z "$task_id" || -z "$blocks_field" ]] && continue

    # For each blocked task, add this task as a dependency
    IFS=',' read -ra blocked_tasks <<< "$blocks_field"
    for blocked_id in "${blocked_tasks[@]}"; do
        # Add task_id to blocked_id's dependency set
        graph=$(printf '%s' "$graph" | jq --arg id "$blocked_id" --arg dep "$task_id" \
            'if .[$id] then .[$id] += [$dep] else . + {($id): [$dep]} end')
    done
done < <(grep -E '^\s*- \[ \] t[0-9]+.*blocks:' "$todo_file" || true)
```

This is the exact equivalent of oh-my-pi's `reports_to` → `waits_for` normalization (`dag.ts:33-40`).


medium

The proposed implementation for build_task_dependency_graph is inefficient in two ways:

  1. Inefficient Parsing: It uses multiple processes (printf, grep, head, sed) inside a while loop for each line, which is slow for large files.
  2. Inefficient JSON construction: It rebuilds the entire JSON graph object (graph=$(... | jq ...) ) for every single task dependency. This has a quadratic complexity (O(n^2)) and will perform poorly with many tasks.

A more performant approach is to use a single processing pipeline that streams data through jq once at the end. This avoids both expensive per-line process forking and inefficient in-loop JSON manipulation.

```bash
build_task_dependency_graph() {
    local todo_file="$1"

    # Efficiently build the dependency graph by streaming data through a single jq process,
    # rather than repeatedly parsing a large JSON object inside shell loops.
    {
        # Pass 1: process blocked-by: fields
        grep -E '^\s*- \[ \] t[0-9]+.*blocked-by:' "$todo_file" 2>/dev/null |
            sed -nE 's/^\s*- \[ \] (t[0-9]+(\.[0-9]+)*).*blocked-by:([^ ]+).*/\1 \3/p' |
            while IFS=' ' read -r task_id blocked_by; do
                [[ -z "$task_id" ]] && continue
                printf '%s' "$blocked_by" | tr ',' '\n' | while IFS= read -r dep; do
                    [[ -z "$dep" ]] && continue
                    jq -cn --arg task "$task_id" --arg dep "$dep" '{$task: [$dep]}'
                done
            done

        # Pass 2: process blocks: fields (inverse direction)
        grep -E '^\s*- \[ \] t[0-9]+.*blocks:' "$todo_file" 2>/dev/null |
            sed -nE 's/^\s*- \[ \] (t[0-9]+(\.[0-9]+)*).*blocks:([^ ]+).*/\1 \3/p' |
            while IFS=' ' read -r task_id blocks_field; do
                [[ -z "$task_id" ]] && continue
                printf '%s' "$blocks_field" | tr ',' '\n' | while IFS= read -r blocked_id; do
                    [[ -z "$blocked_id" ]] && continue
                    jq -cn --arg task "$blocked_id" --arg dep "$task_id" '{$task: [$dep]}'
                done
            done
    } | jq -s '
        reduce .[] as $item ({};
            ($item | keys[0]) as $key |
            .[$key] += $item[$key]
        ) | . as $merged | reduce (keys_unsorted[]) as $k ($merged; .[$k] |= unique) // {}
    '
}
```


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (2)
todo/tasks/t1311-swarm-dag-research.md (2)

265-271: detect_dependency_cycles() is an empty stub — the P0 enhancement's core value is undelivered until this is implemented.

The comment # ... Kahn's algorithm implementation ... is the entire body. Since this is explicitly the foundation for Enhancements 2 and 4, and the primary motivation for Enhancement 1 is preventing silent deadlocks, the actual algorithm should be fleshed out before the function is wired into pulse.sh. A complete bash/jq implementation of Kahn's cycle detection is straightforward:

Kahn's algorithm computes in-degree for each node, seeds a queue with all zero-in-degree nodes, processes each node by decrementing its dependents' in-degrees (re-queuing any that hit zero), and signals a cycle when processed_count < total_nodes.

💡 Sketch of a bash+jq implementation:

```bash
detect_dependency_cycles() {
    local graph_json="$1"
    # Compute in-degrees: for each node, count how many nodes list it as a dep
    local all_nodes in_degree sorted_count=0
    all_nodes=$(printf '%s' "$graph_json" | jq -r 'keys[]')
    local total
    total=$(printf '%s' "$graph_json" | jq 'keys | length')

    # Use jq to produce {node: in-degree} map
    local in_deg_json
    in_deg_json=$(printf '%s' "$graph_json" | jq '
        reduce to_entries[] as $e (
            (keys | map({(.): 0}) | add) // {};
            reduce $e.value[] as $dep (.; .[$dep] += 1)
        )')

    # Process iteratively — seed queue with in-degree 0 nodes
    local queue remaining="$graph_json"
    queue=$(printf '%s' "$in_deg_json" | jq -r 'to_entries[] | select(.value==0) | .key')

    while IFS= read -r node; do
        (( sorted_count++ ))
        # Decrement deps' in-degrees
        local deps
        deps=$(printf '%s' "$graph_json" | jq -r --arg n "$node" '.[$n] // [] | .[]' 2>/dev/null || true)
        # ... update in_deg_json and re-queue zero-in-degree nodes ...
    done <<< "$queue"

    if (( sorted_count < total )); then
        # Return unprocessed nodes as cycle members
        printf '%s' "$graph_json" | jq -r --argjson processed "$sorted_count" 'keys[]' | tail -n "+$((sorted_count+1))"
    fi
}
```

Would you like me to generate a complete, tested implementation and open a tracking issue for the P0 enhancement?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@todo/tasks/t1311-swarm-dag-research.md` around lines 265 - 271, The function
detect_dependency_cycles() is an empty stub and must implement Kahn's algorithm
to detect cycles; replace the placeholder with a bash + jq implementation that:
parses the input graph_json (JSON map of node -> [deps]), computes initial
in-degree map (in_deg_json) for all nodes, seeds a queue with nodes of in-degree
zero, iteratively pops nodes (incrementing sorted_count), for each popped node
reads its dependents from graph_json and decrements their in-degree in
in_deg_json, enqueues any that reach zero, and after the loop compares
sorted_count to total nodes and prints a comma-separated list of unprocessed
nodes when sorted_count < total; reference detect_dependency_cycles() to locate
where to implement and ensure robust handling of missing keys and empty arrays
with jq and safe IFS/while reads.

241-260: jq '. + {($id): $deps}' silently overwrites duplicate task entries — consider a merge instead.

If task_id already exists in $graph (e.g., from a malformed or duplicated TODO.md line), the previous dependency array is discarded with no warning. A safer idiom that merges and deduplicates:

-        graph=$(printf '%s' "$graph" | jq --arg id "$task_id" --argjson deps "$deps_json" '. + {($id): $deps}')
+        graph=$(printf '%s' "$graph" | jq --arg id "$task_id" --argjson deps "$deps_json" \
+            '.[$id] = ((.[$id] // []) + $deps | unique)')

Additionally, each loop iteration spawns three subshells (grep, grep|sed, jq|jq|jq) plus a final jq — for large TODO files this will be measurably slow. A single-pass jq invocation fed the raw grep output would eliminate the per-line subprocess cost entirely, or python3 -c could build the entire graph in one pass.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@todo/tasks/t1311-swarm-dag-research.md` around lines 241 - 260, The
build_task_dependency_graph function uses jq '. + {($id): $deps}' which silently
overwrites existing entries; change the update to merge and deduplicate
dependencies (e.g., read existing array and set .[$id] = ((.[$id] // [] ) +
$deps) | unique) so duplicate task lines append/uniquify instead of replacing,
and refactor the loop to avoid per-line subprocesses by passing the entire
grepped input into a single jq/pipeline (or a single python3 -c processing step)
that extracts task_id and blocked-by for every matching line and builds the
final JSON graph in one pass to eliminate repeated grep/sed/jq invocations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@todo/tasks/t1311-swarm-dag-research.md`:
- Around line 68-74: The fenced pseudocode block labeled "execution-model" is
missing a language identifier which triggers MD040; edit the fenced code block
in todo/tasks/t1311-swarm-dag-research.md (the block that starts with the three
backticks and the for-each iteration pseudocode) and add a language specifier
such as "text" or "plaintext" immediately after the opening ``` so it becomes
```text (or ```plaintext) to satisfy the linter.
- Line 4: In the todo/tasks/t1311-swarm-dag-research.md file the "Source:" line
contains an absolute local path exposing a developer's home directory; replace
that absolute path with a repo-relative path (e.g., packages/swarm-extension/)
or a generic reference/remote URL (e.g.,
github.com/your-org/your-repo/packages/swarm-extension) and avoid embedding
usernames or machine paths in the "Source:" entry to comply with the guideline
about placeholders and not hardcoding sensitive info.
- Around line 341-357: The graph builder appends duplicate dependencies when
both "blocks:" and "blocked-by:" exist causing inflated in-degrees; modify the
jq update in the loop that reads blocks_field (the graph variable update using
jq with --arg id "$blocked_id" --arg dep "$task_id") to deduplicate the
dependency array after appending (use jq's unique on the array, e.g. replace the
current '.[$id] += [$dep] else . + {($id): [$dep]}' branch with one that appends
then | unique) so that the dependency list for each node (referenced via $id)
never contains duplicate entries.
- Around line 293-314: compute_batch_waves() currently uses
build_task_dependency_graph() which returns a global graph; before running the
Kahn/topological-sort step you must filter that graph to only the task IDs
present in the batch (the IDs collected into tasks_json / batch_tasks), removing
any edges to/from tasks not in the batch so waves are computed purely within the
batch. Concretely: extract the set of batch task IDs from the SELECT result,
call build_task_dependency_graph() to get the global mapping, produce a subgraph
containing only nodes whose IDs are in that set and only edges between those
nodes, then run your existing Kahn-style wave computation on that filtered graph
and persist the resulting waves to the batches.waves field.


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5859f3c and d679320.

📒 Files selected for processing (1)
  • todo/tasks/t1311-swarm-dag-research.md

# t1311: Swarm DAG Patterns for Supervisor Dispatch — Research Report

**Date:** 2026-02-24
**Source:** `/Users/marcusquinn/Git/oh-my-pi/packages/swarm-extension/`


⚠️ Potential issue | 🟡 Minor

Replace the absolute local path with a repo-relative or generic reference.

The Source: line embeds a developer's home directory path (/Users/marcusquinn/Git/...), leaking a local username and machine layout into a committed document. Replace with a project-relative or remote reference instead, e.g.:

-**Source:** `/Users/marcusquinn/Git/oh-my-pi/packages/swarm-extension/`
+**Source:** `oh-my-pi` repo — `packages/swarm-extension/` (https://github.com/<org>/oh-my-pi/tree/main/packages/swarm-extension)

As per coding guidelines: "Use placeholders in examples and note secure storage location instead of hardcoding sensitive information" (**/*.{md,sh,js,ts,py}).


Comment on lines +68 to +74
```
for each iteration (0..targetCount):
for each wave (0..waves.length):
Promise.all(wave.map(agent => executeSwarmAgent(agent)))
// All agents in wave run in parallel
// Wave N+1 starts only after wave N completes
```


⚠️ Potential issue | 🟡 Minor

Add a language identifier to the fenced code block to satisfy MD040 / Codacy.

The execution-model block at Line 68 is missing a language specifier. Use text or plaintext since it's pseudocode:

-```
+```text
 for each iteration (0..targetCount):
🧰 Tools
🪛 GitHub Check: Codacy Static Code Analysis

[notice] 68-68: todo/tasks/t1311-swarm-dag-research.md#L68
Fenced code blocks should have a language specified

🪛 markdownlint-cli2 (0.21.0)

[warning] 68-68: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


Comment on lines +293 to +314
```bash
compute_batch_waves() {
    local batch_id="$1"

    # Get all tasks in batch with their blocked-by deps
    local tasks_json
    tasks_json=$(db -json "$SUPERVISOR_DB" "
        SELECT t.id, t.description
        FROM batch_tasks bt
        JOIN tasks t ON bt.task_id = t.id
        WHERE bt.batch_id = '$(sql_escape "$batch_id")'
        ORDER BY bt.position;
    ")

    # Build dependency graph from TODO.md blocked-by fields
    # ... use build_task_dependency_graph() ...

    # Compute waves via topological sort
    # ... Kahn's algorithm adapted for shell ...

    # Store waves in batch metadata
    db "$SUPERVISOR_DB" "UPDATE batches SET waves = '...' WHERE id = '...';"
}
```


⚠️ Potential issue | 🟡 Minor

compute_batch_waves() sketch omits the cross-batch graph filtering step — implementation will need it.

The SQL retrieves only t.id and t.description; blocked-by: data is in TODO.md, not in the tasks table. build_task_dependency_graph() correctly parses TODO.md, but it returns the global dependency graph for all tasks, not just those in the target batch. Without a filter step, compute_batch_waves() will either (a) compute waves across tasks outside the batch, potentially creating phantom wave-blocking relationships, or (b) fail to find nodes for batch-scoped task IDs in the global graph.

The implementation should explicitly filter the global graph down to the set of task IDs in the batch before running topological sort:

     # Build dependency graph from TODO.md blocked-by fields
-    # ... use build_task_dependency_graph() ...
+    local full_graph batch_task_ids subgraph
+    full_graph=$(build_task_dependency_graph "$TODO_FILE")
+    batch_task_ids=$(printf '%s' "$tasks_json" | jq '[.[].id]')
+    # Filter graph to batch members only (exclude cross-batch deps as external)
+    subgraph=$(printf '%s' "$full_graph" | jq --argjson ids "$batch_task_ids" \
+        'with_entries(select(.key as $k | $ids | index($k) != null)
+         | .value |= map(select(. as $d | $ids | index($d) != null)))')
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

```bash
compute_batch_waves() {
    local batch_id="$1"
    # Get all tasks in batch with their blocked-by deps
    local tasks_json
    tasks_json=$(db -json "$SUPERVISOR_DB" "
        SELECT t.id, t.description
        FROM batch_tasks bt
        JOIN tasks t ON bt.task_id = t.id
        WHERE bt.batch_id = '$(sql_escape "$batch_id")'
        ORDER BY bt.position;
    ")

    # Build dependency graph from TODO.md blocked-by fields
    local full_graph batch_task_ids subgraph
    full_graph=$(build_task_dependency_graph "$TODO_FILE")
    batch_task_ids=$(printf '%s' "$tasks_json" | jq '[.[].id]')
    # Filter graph to batch members only (exclude cross-batch deps as external)
    subgraph=$(printf '%s' "$full_graph" | jq --argjson ids "$batch_task_ids" \
        'with_entries(select(.key as $k | $ids | index($k) != null)
         | .value |= map(select(. as $d | $ids | index($d) != null)))')

    # Compute waves via topological sort
    # ... Kahn's algorithm adapted for shell ...

    # Store waves in batch metadata
    db "$SUPERVISOR_DB" "UPDATE batches SET waves = '...' WHERE id = '...';"
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@todo/tasks/t1311-swarm-dag-research.md` around lines 293 - 314,
compute_batch_waves() currently uses build_task_dependency_graph() which returns
a global graph; before running the Kahn/topological-sort step you must filter
that graph to only the task IDs present in the batch (the IDs collected into
tasks_json / batch_tasks), removing any edges to/from tasks not in the batch so
waves are computed purely within the batch. Concretely: extract the set of batch
task IDs from the SELECT result, call build_task_dependency_graph() to get the
global mapping, produce a subgraph containing only nodes whose IDs are in that
set and only edges between those nodes, then run your existing Kahn-style wave
computation on that filtered graph and persist the resulting waves to the
batches.waves field.
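
As a concrete check of the proposed filter, the jq expression can be exercised against a toy global graph (illustrative task IDs, assuming `jq` is on PATH):

```shell
# Toy global graph: t3 depends on in-batch t1 plus out-of-batch t9
full_graph='{"t1":[],"t2":["t1"],"t3":["t1","t9"],"t9":[]}'
batch_task_ids='["t1","t2","t3"]'
# Keep only batch members as nodes, and drop edges to tasks outside the batch
subgraph=$(printf '%s' "$full_graph" | jq -c --argjson ids "$batch_task_ids" \
    'with_entries(select(.key as $k | $ids | index($k) != null)
     | .value |= map(select(. as $d | $ids | index($d) != null)))')
echo "$subgraph"   # → {"t1":[],"t2":["t1"],"t3":["t1"]}
```

The external `t9` node disappears and `t3`'s edge to it is dropped, so the subsequent topological sort sees a graph closed over batch members only.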

Comment on lines +341 to +357
```bash
# Second pass: process blocks: fields (inverse direction)
while IFS= read -r line; do
local task_id blocks_field
task_id=$(printf '%s' "$line" | grep -oE 't[0-9]+(\.[0-9]+)*' | head -1)
blocks_field=$(printf '%s' "$line" | grep -oE 'blocks:[^ ]+' | head -1 | sed 's/blocks://')
[[ -z "$task_id" || -z "$blocks_field" ]] && continue

# For each blocked task, add this task as a dependency
IFS=',' read -ra blocked_tasks <<< "$blocks_field"
for blocked_id in "${blocked_tasks[@]}"; do
# Add task_id to blocked_id's dependency set
graph=$(printf '%s' "$graph" | jq --arg id "$blocked_id" --arg dep "$task_id" \
'if .[$id] then .[$id] += [$dep] else . + {($id): [$dep]} end')
done
done < <(grep -E '^\s*- \[ \] t[0-9]+.*blocks:' "$todo_file" || true)
```
⚠️ Potential issue | 🟡 Minor

Duplicate edges from overlapping blocks:/blocked-by: pairs will corrupt in-degree counts in Kahn's algorithm.

If both t001 blocks:t003 and t003 blocked-by:t001 are present (a redundant but valid authoring pattern), t001 is appended to t003's dependency list twice. Kahn's algorithm computes in-degree per node; a duplicate edge inflates that count, preventing the node from ever reaching in-degree 0 and falsely reporting it as part of a cycle.

Apply unique after the append, consistent with the fix suggested for Enhancement 1's graph builder:

```diff
-        graph=$(printf '%s' "$graph" | jq --arg id "$blocked_id" --arg dep "$task_id" \
-            'if .[$id] then .[$id] += [$dep] else . + {($id): [$dep]} end')
+        graph=$(printf '%s' "$graph" | jq --arg id "$blocked_id" --arg dep "$task_id" \
+            '.[$id] = ((.[$id] // []) + [$dep] | unique)')
```
📝 Committable suggestion


Suggested change

```bash
# Second pass: process blocks: fields (inverse direction)
while IFS= read -r line; do
    local task_id blocks_field
    task_id=$(printf '%s' "$line" | grep -oE 't[0-9]+(\.[0-9]+)*' | head -1)
    blocks_field=$(printf '%s' "$line" | grep -oE 'blocks:[^ ]+' | head -1 | sed 's/blocks://')
    [[ -z "$task_id" || -z "$blocks_field" ]] && continue

    # For each blocked task, add this task as a dependency
    IFS=',' read -ra blocked_tasks <<< "$blocks_field"
    for blocked_id in "${blocked_tasks[@]}"; do
        # Add task_id to blocked_id's dependency set (deduplicated)
        graph=$(printf '%s' "$graph" | jq --arg id "$blocked_id" --arg dep "$task_id" \
            '.[$id] = ((.[$id] // []) + [$dep] | unique)')
    done
done < <(grep -E '^\s*- \[ \] t[0-9]+.*blocks:' "$todo_file" || true)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@todo/tasks/t1311-swarm-dag-research.md` around lines 341 - 357, The graph
builder appends duplicate dependencies when both "blocks:" and "blocked-by:"
exist causing inflated in-degrees; modify the jq update in the loop that reads
blocks_field (the graph variable update using jq with --arg id "$blocked_id"
--arg dep "$task_id") to deduplicate the dependency array after appending (use
jq's unique on the array, e.g. replace the current '.[$id] += [$dep] else . +
{($id): [$dep]}' branch with one that appends then | unique) so that the
dependency list for each node (referenced via $id) never contains duplicate
entries.
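
The duplicate-edge behaviour is easy to demonstrate with toy IDs (assuming `jq` is available): appending the same dependency twice with the original expression yields a doubled edge, while the `unique` variant keeps it single:

```shell
# Redundant authoring: "t001 blocks:t003" and "t003 blocked-by:t001" both add this edge
bad='{}'
for dep in t001 t001; do
    # Original expression: blind append, no deduplication
    bad=$(printf '%s' "$bad" | jq -c --arg id "t003" --arg dep "$dep" \
        'if .[$id] then .[$id] += [$dep] else . + {($id): [$dep]} end')
done
echo "$bad"     # → {"t003":["t001","t001"]}  (in-degree 2, never reaches 0 after one removal)

graph='{}'
for dep in t001 t001; do
    # Fixed expression: append then deduplicate
    graph=$(printf '%s' "$graph" | jq -c --arg id "t003" --arg dep "$dep" \
        '.[$id] = ((.[$id] // []) + [$dep] | unique)')
done
echo "$graph"   # → {"t003":["t001"]}
```

With the doubled edge, completing `t001` decrements `t003`'s in-degree only once, leaving it at 1 and making Kahn's algorithm misreport `t003` as cyclic; the deduplicated graph avoids this.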

@marcusquinn marcusquinn merged commit cb72371 into main Feb 25, 2026
22 of 23 checks passed
@marcusquinn marcusquinn deleted the feature/t1311 branch February 25, 2026 00:30
marcusquinn added a commit that referenced this pull request Feb 25, 2026
Review the original opus-tier research (PR #2235) in light of the
AI-first supervisor migration (t1312-t1321). Key findings:

- Original gap analysis remains accurate post-migration
- Enhancement 1 (graph-based deps) elevated: graph as AI context
  improves dispatch decisions more than mechanical execution control
- Enhancement 2 (waves) deprioritized: AI approximates wave behavior
- Enhancement 4 (visualization) elevated: AI decisions need observability
- Concrete evidence: t1311 itself was stuck by malformed blocked-by
  that a graph validator would have caught in one pulse

Recommend: implement Enhancement 1+3 as single PR (~3.5h).

Ref #2135
alex-solovyev added a commit that referenced this pull request Feb 27, 2026