-
Notifications
You must be signed in to change notification settings - Fork 35
t1311: Post-migration review of swarm DAG research #2306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -65,7 +65,7 @@ Result: `string[][]` — array of waves, each wave is an array of agent names th | |
|
|
||
| `pipeline.ts` drives execution: | ||
|
|
||
| ``` | ||
| ```text | ||
| for each iteration (0..targetCount): | ||
| for each wave (0..waves.length): | ||
| Promise.all(wave.map(agent => executeSwarmAgent(agent))) | ||
|
|
@@ -475,3 +475,91 @@ oh-my-pi's swarm is a **purpose-built orchestration engine** for multi-agent wor | |
| The right approach is not to replicate oh-my-pi's architecture, but to **graft its graph-based dependency resolution onto our existing task management model**. Enhancements 1-4 do exactly this: they add a dependency graph layer that's built from TODO.md's `blocked-by:`/`blocks:` fields, used for cycle detection and wave computation, and integrated into the existing dispatch pipeline. | ||
|
|
||
| The supervisor's strengths — AI-classified model routing, prompt-repeat retry, quality gates, cross-repo fairness, adaptive concurrency — are capabilities oh-my-pi lacks entirely. The combination of graph-based dependency resolution with AI-driven dispatch decisions would be more powerful than either system alone. | ||
|
|
||
| --- | ||
|
|
||
| ## 9. Post-Migration Review (2026-02-25) | ||
|
|
||
| **Reviewer:** AI (opus-tier review of prior opus-tier research) | ||
| **Context:** The supervisor underwent a major AI-first migration (t1312-t1321, 8 PRs merged) between the original research (2026-02-24) and this review. The codebase grew from 35,741 to 36,526 lines. This section evaluates whether the proposed enhancements remain valid. | ||
|
|
||
| ### 9.1 What Changed Since the Research | ||
|
|
||
| 1. **AI-first decision migration completed.** Decision logic in dispatch, pulse, deploy, evaluate, sanity-check, self-heal, routine-scheduler, and issue-sync was migrated from hardcoded heuristics to the gather-decide-execute AI pipeline (`ai-lifecycle.sh`, `ai-reason.sh`). The supervisor now asks an AI model "what should I do?" rather than following case/if-else trees. | ||
|
|
||
| 2. **`auto_unblock_resolved_tasks()` is unchanged.** The function at `todo-sync.sh:1157-1291` still uses the same grep-based string matching approach documented in Section 2.2. No graph construction, no cycle detection, no wave computation was added. | ||
|
|
||
| 3. **`blocked-by:` usage is extensive.** TODO.md currently has ~261 lines with `blocked-by:` fields and ~9 with `blocks:`. The dependency system is the primary mechanism for task ordering. | ||
|
|
||
| 4. **Supervisor line count increased** (35,741 -> 36,526, +785 lines). The AI migration replaced decision logic with AI calls (prompt construction + response parsing) at roughly equal line count, plus new modules were added. | ||
|
|
||
| ### 9.2 Enhancement Reassessment | ||
|
|
||
| #### Enhancement 1: Graph-Based Dependency Resolution — STILL VALID, PRIORITY ELEVATED | ||
|
|
||
| The AI-first migration makes this **more** important, not less. The AI lifecycle engine now makes dispatch decisions, but it operates on the same flat task list with no graph context. When `ai-reason.sh` builds a prompt asking "which task should I dispatch next?", it has no visibility into: | ||
|
|
||
| - Whether dispatching task X would create a deadlock (circular dependency) | ||
| - Which tasks form a parallelizable wave | ||
| - Whether a blocked task's entire dependency chain is stalled | ||
|
|
||
| Providing the AI with a dependency graph as structured context would improve its dispatch decisions. Instead of the AI inferring dependencies from task descriptions, it would receive `{"waves": [["t001","t002"], ["t003"]], "cycles": [], "critical_path": ["t001","t003","t005"]}` as input. | ||
|
|
||
| **Revised approach:** Rather than replacing `auto_unblock_resolved_tasks()` with a shell-based graph builder (as originally proposed), the graph should be built as **AI context** — a structured summary fed to `ai-context.sh` for dispatch decisions. The unblocking logic can remain as-is (it works correctly), but the graph provides the AI with a global view for smarter ordering. | ||
|
|
||
| **Revised effort:** ~3h (simpler than original — build graph for context, not for execution control) | ||
|
|
||
| #### Enhancement 2: Execution Waves for Batches — DEPRIORITIZED | ||
|
|
||
| The AI-first migration partially addresses this. The AI lifecycle engine can now reason about batch ordering: "these 3 tasks have no dependencies on each other, dispatch them in parallel." The AI doesn't need explicit wave computation if it receives the dependency graph (Enhancement 1) as context. | ||
|
|
||
| However, the AI can only suggest — the mechanical dispatch still processes tasks one-at-a-time per pulse. True parallel wave execution would require changes to the dispatch loop itself, which is mechanical plumbing. | ||
|
|
||
| **Revised priority:** P2 (was P1). The AI can approximate wave behavior with graph context. Explicit wave computation is an optimization, not a correctness fix. | ||
|
|
||
| #### Enhancement 3: Make `blocks:` Functional — STILL VALID, TRIVIAL | ||
|
|
||
| Only 9 lines in TODO.md use `blocks:` vs 261 using `blocked-by:`. The field is rarely used because it's informational-only. Making it functional in the graph builder (Enhancement 1) is a ~30-minute addition. No change to assessment. | ||
|
|
||
| **Revised effort:** ~30min (bundled with Enhancement 1) | ||
|
|
||
| #### Enhancement 4: Dependency Visualization — STILL VALID, HIGHER VALUE | ||
|
|
||
| With the AI-first migration, the supervisor's decision-making is less transparent (AI reasoning vs deterministic code paths). A `dag` visualization command becomes more valuable for debugging: "why did the AI dispatch task X before task Y?" The graph shows the dependency structure the AI was reasoning about. | ||
|
|
||
| **Revised effort:** ~2h (simpler with jq-based graph from Enhancement 1) | ||
|
|
||
| ### 9.3 Revised Priority Matrix | ||
|
|
||
| | # | Enhancement | Effort | Value | Priority | Change | | ||
| |---|------------|--------|-------|----------|--------| | ||
| | 1 | Graph-based dependency resolution (as AI context) | ~3h | Prevents deadlocks, improves AI dispatch decisions | **P0** | Approach changed: graph as AI context, not execution control | | ||
| | 3 | Make `blocks:` functional | ~30min | Natural authoring, bundled with #1 | **P0** | Bundled with #1 | | ||
| | 4 | Dependency visualization (`dag` command) | ~2h | Debugging AI decisions, observability | **P1** | Elevated from P2 | | ||
| | 2 | Execution waves for batches | ~6h | Maximizes parallelism | **P2** | Deprioritized: AI approximates this | | ||
| | **Total** | | **~11.5h** | | | -2.5h from original | | ||
|
|
||
| ### 9.4 Cycle Detection: Concrete Evidence of Need | ||
|
|
||
| The supervisor comments on issue #2135 show a real example of the cycle detection gap: t1311 itself was stuck for days because of a "malformed blocked-by field (backtick character)" — a data quality issue that a graph builder with validation would have caught immediately. The supervisor's AI made 8 separate priority adjustment recommendations about this single task without detecting the root cause (malformed dependency reference). | ||
|
|
||
| A graph builder that validates all `blocked-by:` references point to existing task IDs would have flagged this in the first pulse. | ||
|
|
||
| ### 9.5 Recommendation | ||
|
|
||
| **Implement Enhancement 1 + 3 as a single PR (~3.5h).** This provides: | ||
|
|
||
| 1. A `build_task_dependency_graph()` function that constructs a JSON adjacency list from `blocked-by:` and `blocks:` fields | ||
| 2. Cycle detection via Kahn's algorithm (adapted for shell/jq) | ||
| 3. Validation of blocker references (flag non-existent task IDs) | ||
| 4. Graph summary injected into `ai-context.sh` for dispatch decisions | ||
|
|
||
| This is the highest-value, lowest-risk change. It doesn't modify the existing unblocking or dispatch mechanics — it adds a read-only analysis layer that improves AI decision quality and catches data quality issues. | ||
|
|
||
| **Defer Enhancements 2 and 4** until Enhancement 1 is proven in production. Enhancement 4 (visualization) is a natural follow-up once the graph builder exists. | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Section 9.3 labels Enhancement 4 as Severity: medium 🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage. |
||
|
|
||
| ### 9.6 Conclusion | ||
|
|
||
| The original research is **accurate and well-structured**. The gap analysis (Section 3) and "What NOT to Adopt" recommendations (Section 5) remain fully valid after the AI-first migration. The key insight (Section 8) — graft graph-based resolution onto the existing architecture rather than replacing it — is the correct approach. | ||
|
|
||
| The AI-first migration shifts the implementation strategy: instead of the graph driving mechanical dispatch decisions, it should inform AI dispatch decisions as structured context. This is simpler to implement and more powerful, because the AI can weigh graph structure alongside other factors (model cost, worker availability, cross-repo fairness) that a mechanical wave executor cannot. | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On GitHub, a single newline inside a paragraph is typically rendered as a space, so the
**Reviewer:**and**Context:**lines may display as one paragraph; consider separating them with a blank line or formatting them as a list for consistent rendering.Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.