refactor: eval quality fixes — dedup, types, docs, tests#359
Conversation
Covers all 7 improvement points from architecture evaluation: ARCHITECTURE.md sync, integration tests, doc-tests, runtime error handling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace ADR-1 (don't split binary crate) with evolver extraction plan. Keep ADR on Board dual responsibility. Update crate count to 6. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
8 tasks: evolver extraction, ARCHITECTURE.md sync, integration tests, doc-tests, runtime graceful degradation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tory Extracts the standalone agent creation logic from the binary crate into opengoose-rig so it can be shared by both the binary and the upcoming opengoose-evolver crate without circular dependencies. runtime.rs re-exports the symbols to keep existing call sites working unchanged. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move the evolver module (2,490 LOC) into a standalone library crate.
This decouples the skill evolution pipeline from the CLI binary,
improving build parallelism and enabling independent testing.
- Create crates/opengoose-evolver with lib.rs, loop_driver.rs,
pipeline.rs, sweep.rs (all 55 tests migrated and passing)
- Replace crate::skills::{evolve, load} references with direct
opengoose_skills imports
- Move read_conversation_log to the evolver crate (its only consumer)
- Add local home_dir() and test_env_lock() to avoid binary crate deps
- Gate now-test-only re-exports in skills/evolve.rs and skills/load.rs
behind #[cfg(test)] to eliminate unused-import warnings
- Update runtime.rs to use opengoose_evolver::run
- Remove mod evolver from main.rs and delete old evolver directory
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add no_run doc-test to WorkMode trait demonstrating TaskMode.session_for(), and a runnable doc-test to SkillMetadata demonstrating JSON deserialization. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Consolidate three copies of home_dir() into opengoose-rig (single source of truth). Unify Board stamp/query APIs to use &RigId instead of raw &str, add AsRef<str> impl, and rename completed_by_rig to completed_by for naming consistency with claimed_by. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughThe PR extracts the evolver subsystem into a dedicated Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🔴 Stale tracing target filter: evolver logs no longer detected as structured after crate extraction
The evolver module was moved from crates/opengoose/src/evolver/ to crates/opengoose-evolver/src/. Tracing macros (e.g., info!, warn!) in the new crate will automatically use the target prefix opengoose_evolver::* (e.g., opengoose_evolver::loop_driver). However, is_structured_target() at crates/opengoose/src/tui/log_entry.rs:18 still checks target.starts_with("opengoose::evolver"), which no longer matches. This means evolver log messages in the TUI will not be marked as structured, breaking the log filtering behavior in the Logs tab. The test at crates/opengoose/src/tui/tui_layer.rs:188 also hardcodes the stale target "opengoose::evolver", masking the real issue.
(Refers to line 18)
Was this helpful? React with 👍 or 👎 to provide feedback.
| [package] | ||
| name = "opengoose-evolver" |
There was a problem hiding this comment.
🔴 AGENTS.md not updated to reflect new crate — violates mandatory rule file
AGENTS.md line 9 states "3개 크레이트만 존재" (only 3 crates exist), listing only opengoose, opengoose-board, and opengoose-rig. This PR creates a new crate opengoose-evolver (and the workspace already had opengoose-skills and opengoose-sandbox), but AGENTS.md was not updated. The PR does update docs/v0.2/ARCHITECTURE.md to say "6개 크레이트" but leaves the mandatory AGENTS.md rule file inconsistent with the actual codebase.
Prompt for agents
Update AGENTS.md line 9 to reflect the actual crate structure. Change "3개 크레이트만 존재: opengoose (CLI), opengoose-board (데이터), opengoose-rig (에이전트)" to "6개 크레이트: opengoose (CLI), opengoose-board (데이터), opengoose-rig (에이전트), opengoose-skills (스킬), opengoose-evolver (스킬 진화), opengoose-sandbox (실험적)". Also update the dependency rule on line 16 to include the evolver and skills crates in the dependency chain description.
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
crates/opengoose-board/src/work_item.rs (1)
42-55:⚠️ Potential issue | 🟠 MajorClose the path-separator validation gap in
RigId::try_new.
try_newblocks/but not\. IfRigIdis ever used as a path segment, this leaves a traversal/sanitization bypass on Windows-style separators.Suggested hardening
if s.contains('/') { return Err(format!("RigId cannot contain '/': {s}")); } + if s.contains('\\') { + return Err(format!("RigId cannot contain '\\\\': {s}")); + } Ok(Self(s))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/opengoose-board/src/work_item.rs` around lines 42 - 55, RigId::try_new currently forbids '/' but not Windows backslash, allowing path-separator bypass; update RigId::try_new to also reject '\' (backslash) characters (and optionally canonical path-separators if your platform has others) by adding a check similar to the existing '/' check that returns an Err with a clear message (e.g., "RigId cannot contain '\\'"). Ensure the check references the same s variable and follows the same error formatting used in try_new so validation behavior is consistent.docs/v0.2/ARCHITECTURE.md (1)
197-226:⚠️ Potential issue | 🟡 MinorSection 3 still shows two different evolver locations.
This adds
opengoose-evolveras a standalone crate, but the same crate tree above still listscrates/opengoose/src/evolver/*under the binary. That leaves the architecture page internally inconsistent about where the evolver lives after the extraction. Please remove or rewrite the old in-binary evolver subtree so the topology reads one way.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/v0.2/ARCHITECTURE.md` around lines 197 - 226, Section 3 currently documents the evolver in two places; remove the outdated in-binary subtree under crates/opengoose/src/evolver/* and update the topology to reference only the extracted crate opengoose-evolver (and its src/ files) so the dependency graph and the binary tree are consistent; specifically delete or rewrite the lines listing crates/opengoose/src/evolver/* and ensure opengoose only references opengoose-evolver in the dependency list and directory tree.crates/opengoose/src/web/api/rigs.rs (1)
87-112:⚠️ Potential issue | 🟠 MajorValidate the route param before turning it into
RigId.
RigId::new()is the unchecked constructor, so this endpoint now lets malformed path values skip theRigId::try_new()rules and flow intostamps_with_scores()/completed_by(). Parse the id once at the HTTP boundary and return400 Bad Requeston failure instead of constructing an unchecked domain id from user input.Suggested fix
- let rig = state - .board - .get_rig(&id) + let rig_id = opengoose_board::RigId::try_new(&id) + .map_err(|_| StatusCode::BAD_REQUEST)?; + + let rig = state + .board + .get_rig(rig_id.as_ref()) .await .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)? .ok_or(StatusCode::NOT_FOUND)?; - - let rig_id = opengoose_board::RigId::new(&id);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/opengoose/src/web/api/rigs.rs` around lines 87 - 112, Parse and validate the incoming route param before constructing a RigId: replace the unchecked opengoose_board::RigId::new(&id) with the fallible opengoose_board::RigId::try_new(&id) (or equivalent) and if it fails return StatusCode::BAD_REQUEST; then pass the validated rig_id into state.board.stamps_with_scores(...) and state.board.completed_by(...). Ensure any error mapping around stamps_with_scores / completed_by still returns INTERNAL_SERVER_ERROR only for backend failures, not for malformed route params.crates/opengoose-evolver/src/pipeline.rs (1)
336-364:⚠️ Potential issue | 🟠 MajorDon't leave failed evolver items in
Claimed.
prepare_context()claims the generated work item beforeexecute_action(). If any later step fails, the error arm here callsboard.abandon(ctx.evolver_item_id), but that is not a valid transition fromClaimed, so the item can stay stranded instead of becoming retryable or visibly failed. Use the valid claimed-state failure path here, such as the rig-aware release/mark-stuck flow, instead of abandoning directly.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/opengoose-evolver/src/pipeline.rs` around lines 336 - 364, process_stamp currently calls board.abandon(ctx.evolver_item_id) on action failure but prepare_context() already claimed the item, so abandoning is an invalid transition; replace that abandon call with the rig-aware claimed-failure path using the evolver rig (RigId::new("evolver")) — e.g., call the Board API that releases/marks-stuck a claimed item from a specific rig (use the same evolver_rig used on success) such as board.release_with_rig(ctx.evolver_item_id, &evolver_rig).await or board.mark_stuck(ctx.evolver_item_id, &evolver_rig, ...).await (choose the available method on Board), preserve the existing logging of both the original error and any error from the failure-handling call, and update the match arm in process_stamp to use that rig-aware method instead of board.abandon(...).
🧹 Nitpick comments (6)
crates/opengoose/src/tui/event/rigs.rs (1)
60-65: Consider batching trust lookups to avoid N+1 queries.
board.trust_level(...)is called once per rig in the loop. If this path runs frequently, switching to a single batched score/level fetch will scale better.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/opengoose/src/tui/event/rigs.rs` around lines 60 - 65, The loop is performing an N+1 call by calling board.trust_level(&rig_id) per rig; change to a batched lookup: add or use a bulk API (e.g., board.trust_levels(&[RigId]) or board.list_trust_levels) to fetch all rigs' trust levels in one call after board.list_rigs() and then map results by RigId before building infos; update the code that currently constructs RigId and calls board.trust_level inside the for loop (and keep checks against app.board.items.iter().any(...)) to use the pre-fetched trust map instead.crates/opengoose/src/commands/rigs.rs (1)
39-41: Trust command does duplicate score work.This path fetches score, then fetches trust level (which recomputes score). Consider a single-query/single-computation path to avoid extra DB work and potential drift between displayed
ptsandlevel.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/opengoose/src/commands/rigs.rs` around lines 39 - 41, The code currently calls board.weighted_score(&rig_id).await? and then board.trust_level(&rig_id).await?, causing duplicate score computation; change this to a single call that returns both values (e.g. add/consume a board.weighted_score_and_trust_level(&rig_id).await? or a board.trust_level_from_score(&rig_id, pts).await?), update the call site in rigs.rs to use that combined result (replace separate weighted_score and trust_level calls), and implement the combined method on opengoose_board to compute score and trust in one DB/query pass to avoid extra work and drift between pts and level.crates/opengoose-rig/tests/worker_integration.rs (1)
1-142: These are Board transition tests, not Board→Worker integration tests.Every scenario here drives
Boarddirectly, and the file never instantiatesWorkeror any otheropengoose_rigAPI. Even the retry/stuck case jumps straight tomark_stuck(), so regressions in the real claim/execute/retry loop can still slip through. I'd either add at least one test around the realWorkerwith a stub agent, or rename the suite/comments so the scope is clear.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/opengoose-rig/tests/worker_integration.rs` around lines 1 - 142, Rename the test suite and its module doc comment to reflect these are unit tests that exercise Board state transitions (e.g., change the file header and module name from "Board → Worker integration tests" to "Board state transition tests") or alternatively add at least one integration test that instantiates opengoose_rig::Worker and a stub agent to exercise the real claim/execute/retry loop (create a new async test that spawns Worker and uses functions like Worker::start/Worker::handle_loop with a stubbed agent to perform claim -> execute -> retry/mark_stuck flow); update or add tests with names such as post_claim_submit_lifecycle, claim_then_mark_stuck, or a new worker_integration_with_stub to make intent explicit.crates/opengoose/src/tui/log_entry.rs (1)
23-40: Surface the current-directory fallback here.These helpers now inherit
crate::home_dir()'s"."fallback, so a process without a resolvable home directory will start creating and pruning./.opengoose/logsunder whatever directory launched it. The previous code surfaced that as an error; at minimum I'd log a warning here so operators can tell why logs ended up in the working directory.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/opengoose/src/tui/log_entry.rs` around lines 23 - 40, create_session_log_file and cleanup_old_logs currently use crate::home_dir() which falls back to "." silently; detect when crate::home_dir() is the current-directory fallback (e.g., home_dir().as_os_str() == "." or home_dir().is_relative() && home_dir() == Path::new(".")) and emit a visible warning (via log::warn! or eprintln!) before creating/pruning logs so operators know logs will be created under the working directory; apply the same check+warning in both create_session_log_file and cleanup_old_logs (and before calling cleanup_old_logs_in) so the behavior is surfaced consistently.crates/opengoose-evolver/src/lib.rs (1)
80-83: Reuseevolve_session_id()for log lookup.
RealAgentCallerwrites evolver conversations underevolve_session_id(work_item_id), but this helper reconstructs the key by hand. If those ever drift, evolver will stop finding its own logs and silently lose conversation context. Reuse the shared helper here instead of duplicating the format string.Suggested change
pub(crate) fn read_conversation_log(work_item_id: i64) -> String { - let session_id = format!("task-{work_item_id}"); + let session_id = evolve_session_id(work_item_id); opengoose_rig::conversation_log::read_log(&session_id) .map(|content| opengoose_skills::evolution::prompts::summarize_for_prompt(&content, 4000)) .unwrap_or_default() }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/opengoose-evolver/src/lib.rs` around lines 80 - 83, The read_conversation_log helper currently constructs the session key with format!("task-{work_item_id}") instead of using the shared evolve_session_id function used by RealAgentCaller; change read_conversation_log to call evolve_session_id(work_item_id) to obtain the session_id and pass that to opengoose_rig::conversation_log::read_log so the evolver reads the same log key as RealAgentCaller (keep the rest of the mapping to summarize_for_prompt as-is).docs/superpowers/plans/2026-03-26-eval-quality-fixes.md (1)
29-29: Consider adding language specifiers to fenced code blocks.Several fenced code blocks are missing language identifiers. For better rendering and syntax highlighting, consider adding appropriate specifiers:
textfor expected outputplaintextfor directory structures- Language-specific identifiers for code snippets
This improves documentation readability in rendered Markdown.
Also applies to: 35-35, 72-72, 86-86, 105-105, 121-121, 219-219, 414-414, 488-488, 639-639
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/superpowers/plans/2026-03-26-eval-quality-fixes.md` at line 29, Several fenced code blocks in the document are missing language specifiers; update each triple-backtick fenced block (``` ... ```) to include an appropriate language identifier: use ```text for expected output, ```plaintext for directory structures, and proper language tags (e.g., ```bash, ```json, ```ts, ```js, etc.) for code snippets so syntax highlighting renders correctly; scan the file for all fenced code blocks (look for ``` markers) and apply the mapping consistently to the blocks noted in the review.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/opengoose-evolver/src/sweep.rs`:
- Around line 132-133: The sweep currently always calls caller.call(&prompt, 0)
which forces RealAgentCaller to reuse the global session "task-0"; update the
sweep to use a dedicated sweep session id (e.g., generate a fresh id per sweep
from a UUID/timestamp or a sweep-specific counter) and pass that id into
AgentCaller::call instead of 0, or alternatively extend the AgentCaller::call
signature to accept an explicit session key (string or u64) and update
RealAgentCaller to map that to its session namespace; locate the call site
(build_sweep_prompt and caller.call) and the AgentCaller/RealAgentCaller
implementations to wire the new session id through so each sweep gets an
isolated session.
In `@crates/opengoose-rig/src/agent_factory.rs`:
- Around line 60-68: The code calls agent.extend_system_prompt with
config.session_id but the agent is created/updated for the actual session ID
stored in session.id (used in agent.update_provider), so change the call to use
session.id instead of config.session_id; locate the agent.extend_system_prompt
invocation and replace the session identifier argument with session.id to ensure
the system prompt is attached to the same session the agent was updated for.
In `@crates/opengoose/src/runtime.rs`:
- Around line 15-16: The Operator UI still claims a Worker will auto-claim tasks
even when runtime.worker is None; update the prompt-generation and task-enqueue
logic to reflect actual worker availability: where the Operator prompt is built
(the code that currently prints the "separate Worker rig will automatically
claim and execute tasks" message), check the runtime struct's worker:
Option<Arc<opengoose_rig::rig::Worker>> and only show that message when
worker.is_some(); additionally, gate task creation/enqueue paths to prevent
submitting tasks when worker.is_none() (or show a clear warning and require
explicit confirmation), so both the prompt text and the enqueue code consult
runtime.worker before advertising or accepting work.
In `@docs/superpowers/specs/2026-03-26-eval-quality-fixes-design.md`:
- Around line 52-63: The fenced code blocks (e.g., the block showing the module
hierarchy around opengoose-board / opengoose-rig / opengoose-evolver and the
other blocks at ranges 69-77, 93-96, 102-121, 125-136, 218-226) need language
identifiers to satisfy markdownlint MD040; edit each triple-backtick fence
(including the example tree/diff blocks) to add an appropriate language token
such as `text`, `tree`, or `diff` (e.g., ```text or ```diff) so the linter
recognizes the block type and the document stops failing MD040. Ensure every
fenced block in the file is annotated consistently.
---
Outside diff comments:
In `@crates/opengoose-board/src/work_item.rs`:
- Around line 42-55: RigId::try_new currently forbids '/' but not Windows
backslash, allowing path-separator bypass; update RigId::try_new to also reject
'\' (backslash) characters (and optionally canonical path-separators if your
platform has others) by adding a check similar to the existing '/' check that
returns an Err with a clear message (e.g., "RigId cannot contain '\\'"). Ensure
the check references the same s variable and follows the same error formatting
used in try_new so validation behavior is consistent.
In `@crates/opengoose-evolver/src/pipeline.rs`:
- Around line 336-364: process_stamp currently calls
board.abandon(ctx.evolver_item_id) on action failure but prepare_context()
already claimed the item, so abandoning is an invalid transition; replace that
abandon call with the rig-aware claimed-failure path using the evolver rig
(RigId::new("evolver")) — e.g., call the Board API that releases/marks-stuck a
claimed item from a specific rig (use the same evolver_rig used on success) such
as board.release_with_rig(ctx.evolver_item_id, &evolver_rig).await or
board.mark_stuck(ctx.evolver_item_id, &evolver_rig, ...).await (choose the
available method on Board), preserve the existing logging of both the original
error and any error from the failure-handling call, and update the match arm in
process_stamp to use that rig-aware method instead of board.abandon(...).
In `@crates/opengoose/src/web/api/rigs.rs`:
- Around line 87-112: Parse and validate the incoming route param before
constructing a RigId: replace the unchecked opengoose_board::RigId::new(&id)
with the fallible opengoose_board::RigId::try_new(&id) (or equivalent) and if it
fails return StatusCode::BAD_REQUEST; then pass the validated rig_id into
state.board.stamps_with_scores(...) and state.board.completed_by(...). Ensure
any error mapping around stamps_with_scores / completed_by still returns
INTERNAL_SERVER_ERROR only for backend failures, not for malformed route params.
In `@docs/v0.2/ARCHITECTURE.md`:
- Around line 197-226: Section 3 currently documents the evolver in two places;
remove the outdated in-binary subtree under crates/opengoose/src/evolver/* and
update the topology to reference only the extracted crate opengoose-evolver (and
its src/ files) so the dependency graph and the binary tree are consistent;
specifically delete or rewrite the lines listing crates/opengoose/src/evolver/*
and ensure opengoose only references opengoose-evolver in the dependency list
and directory tree.
---
Nitpick comments:
In `@crates/opengoose-evolver/src/lib.rs`:
- Around line 80-83: The read_conversation_log helper currently constructs the
session key with format!("task-{work_item_id}") instead of using the shared
evolve_session_id function used by RealAgentCaller; change read_conversation_log
to call evolve_session_id(work_item_id) to obtain the session_id and pass that
to opengoose_rig::conversation_log::read_log so the evolver reads the same log
key as RealAgentCaller (keep the rest of the mapping to summarize_for_prompt
as-is).
In `@crates/opengoose-rig/tests/worker_integration.rs`:
- Around line 1-142: Rename the test suite and its module doc comment to reflect
these are unit tests that exercise Board state transitions (e.g., change the
file header and module name from "Board → Worker integration tests" to "Board
state transition tests") or alternatively add at least one integration test that
instantiates opengoose_rig::Worker and a stub agent to exercise the real
claim/execute/retry loop (create a new async test that spawns Worker and uses
functions like Worker::start/Worker::handle_loop with a stubbed agent to perform
claim -> execute -> retry/mark_stuck flow); update or add tests with names such
as post_claim_submit_lifecycle, claim_then_mark_stuck, or a new
worker_integration_with_stub to make intent explicit.
In `@crates/opengoose/src/commands/rigs.rs`:
- Around line 39-41: The code currently calls
board.weighted_score(&rig_id).await? and then board.trust_level(&rig_id).await?,
causing duplicate score computation; change this to a single call that returns
both values (e.g. add/consume a
board.weighted_score_and_trust_level(&rig_id).await? or a
board.trust_level_from_score(&rig_id, pts).await?), update the call site in
rigs.rs to use that combined result (replace separate weighted_score and
trust_level calls), and implement the combined method on opengoose_board to
compute score and trust in one DB/query pass to avoid extra work and drift
between pts and level.
In `@crates/opengoose/src/tui/event/rigs.rs`:
- Around line 60-65: The loop is performing an N+1 call by calling
board.trust_level(&rig_id) per rig; change to a batched lookup: add or use a
bulk API (e.g., board.trust_levels(&[RigId]) or board.list_trust_levels) to
fetch all rigs' trust levels in one call after board.list_rigs() and then map
results by RigId before building infos; update the code that currently
constructs RigId and calls board.trust_level inside the for loop (and keep
checks against app.board.items.iter().any(...)) to use the pre-fetched trust map
instead.
In `@crates/opengoose/src/tui/log_entry.rs`:
- Around line 23-40: create_session_log_file and cleanup_old_logs currently use
crate::home_dir() which falls back to "." silently; detect when
crate::home_dir() is the current-directory fallback (e.g.,
home_dir().as_os_str() == "." or home_dir().is_relative() && home_dir() ==
Path::new(".")) and emit a visible warning (via log::warn! or eprintln!) before
creating/pruning logs so operators know logs will be created under the working
directory; apply the same check+warning in both create_session_log_file and
cleanup_old_logs (and before calling cleanup_old_logs_in) so the behavior is
surfaced consistently.
In `@docs/superpowers/plans/2026-03-26-eval-quality-fixes.md`:
- Line 29: Several fenced code blocks in the document are missing language
specifiers; update each triple-backtick fenced block (``` ... ```) to include an
appropriate language identifier: use ```text for expected output, ```plaintext
for directory structures, and proper language tags (e.g., ```bash, ```json,
```ts, ```js, etc.) for code snippets so syntax highlighting renders correctly;
scan the file for all fenced code blocks (look for ``` markers) and apply the
mapping consistently to the blocks noted in the review.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8d53676a-c445-4ded-9c11-f458caa818a3
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (34)
Cargo.tomlcrates/opengoose-board/src/beads.rscrates/opengoose-board/src/board.rscrates/opengoose-board/src/rigs.rscrates/opengoose-board/src/stamp_ops.rscrates/opengoose-board/src/work_item.rscrates/opengoose-board/src/work_items/mod.rscrates/opengoose-board/src/work_items/queries.rscrates/opengoose-evolver/Cargo.tomlcrates/opengoose-evolver/src/lib.rscrates/opengoose-evolver/src/loop_driver.rscrates/opengoose-evolver/src/pipeline.rscrates/opengoose-evolver/src/sweep.rscrates/opengoose-rig/src/agent_factory.rscrates/opengoose-rig/src/conversation_log/io.rscrates/opengoose-rig/src/lib.rscrates/opengoose-rig/src/work_mode.rscrates/opengoose-rig/tests/worker_integration.rscrates/opengoose-skills/src/metadata.rscrates/opengoose/Cargo.tomlcrates/opengoose/src/cli/commands.rscrates/opengoose/src/cli/setup.rscrates/opengoose/src/commands/board.rscrates/opengoose/src/commands/rigs.rscrates/opengoose/src/main.rscrates/opengoose/src/runtime.rscrates/opengoose/src/skills/evolve.rscrates/opengoose/src/skills/load.rscrates/opengoose/src/tui/event/rigs.rscrates/opengoose/src/tui/log_entry.rscrates/opengoose/src/web/api/rigs.rsdocs/superpowers/plans/2026-03-26-eval-quality-fixes.mddocs/superpowers/specs/2026-03-26-eval-quality-fixes-design.mddocs/v0.2/ARCHITECTURE.md
💤 Files with no reviewable changes (1)
- crates/opengoose/src/main.rs
| let prompt = build_sweep_prompt(&skill_summaries, &failure_summaries); | ||
| let response = caller.call(&prompt, 0).await?; |
There was a problem hiding this comment.
Give sweeps their own session namespace.
Line 133 always passes 0 into AgentCaller::call(). RealAgentCaller turns that into a Goose session id, so every sweep reuses the same persisted conversation (task-0). That lets old sweep history bleed into later runs and makes decisions depend on stale context instead of the current dormant-skill set. Use a fresh sweep session id, or extend the trait so sweeps can pass an explicit session key instead of hard-coding 0.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/opengoose-evolver/src/sweep.rs` around lines 132 - 133, The sweep
currently always calls caller.call(&prompt, 0) which forces RealAgentCaller to
reuse the global session "task-0"; update the sweep to use a dedicated sweep
session id (e.g., generate a fresh id per sweep from a UUID/timestamp or a
sweep-specific counter) and pass that id into AgentCaller::call instead of 0, or
alternatively extend the AgentCaller::call signature to accept an explicit
session key (string or u64) and update RealAgentCaller to map that to its
session namespace; locate the call site (build_sweep_prompt and caller.call) and
the AgentCaller/RealAgentCaller implementations to wire the new session id
through so each sweep gets an isolated session.
| agent | ||
| .update_provider(provider, &session.id) | ||
| .await | ||
| .context("failed to set provider")?; | ||
|
|
||
| if let Some(prompt) = config.system_prompt { | ||
| agent | ||
| .extend_system_prompt(config.session_id.clone(), prompt) | ||
| .await; |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In the Goose Agent API, when session_manager.create_session(...)returns a session, issession.idguaranteed to equal the caller-provided session name, and shouldAgent::extend_system_prompt(...)be called withsession.id or with the requested name?
💡 Result:
No, session.id is not guaranteed to equal the caller-provided session name. Goose generates session IDs automatically (typically timestamp-based like yyyymmdd_hhmmss), and CLI flags like --name or -n attempt to resume existing sessions rather than create new ones with custom names. Use --session-id for explicit control, but names and IDs are distinct concepts. Use session.id (the generated/returned ID) with Agent::extend_system_prompt(...), as agents are created per-session using the session ID for isolation via AgentManager/SessionManager.
Citations:
- 1: Cant name a session in cli aaif-goose/goose#4893
- 2: Session manager aaif-goose/goose#4648
- 3: feat: AgentManager - foundation for unified execution (#4389) aaif-goose/goose#4684
- 4: Add session to agents aaif-goose/goose#4216
- 5: Expose GOOSE_SESSION_ID environment variable in CLI sessions aaif-goose/goose#6308
Use session.id instead of config.session_id for extend_system_prompt().
Goose generates session IDs automatically (distinct from caller-supplied names), and agents are isolated per session using the session ID. Line 67 should pass session.id to match the session used in update_provider() on line 61. Passing config.session_id will attach the role-specific system prompt to the wrong session or fail silently, causing the agent to run without its Operator/Worker/Evolver instructions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/opengoose-rig/src/agent_factory.rs` around lines 60 - 68, The code
calls agent.extend_system_prompt with config.session_id but the agent is
created/updated for the actual session ID stored in session.id (used in
agent.update_provider), so change the call to use session.id instead of
config.session_id; locate the agent.extend_system_prompt invocation and replace
the session identifier argument with session.id to ensure the system prompt is
attached to the same session the agent was updated for.
| pub worker: Option<Arc<opengoose_rig::rig::Worker>>, | ||
| } |
There was a problem hiding this comment.
Degraded mode still tells the Operator that a worker exists.
After this branch returns worker: None, the Operator prompt in Lines 63-71 still says a separate Worker rig will automatically claim and execute tasks and that the Operator should not handle them. In the degraded path that's false, so users can enqueue work that nothing will ever pick up. Plumb worker availability into the Operator/UI prompt, or gate task creation while the worker is offline.
Also applies to: 28-50
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/opengoose/src/runtime.rs` around lines 15 - 16, The Operator UI still
claims a Worker will auto-claim tasks even when runtime.worker is None; update
the prompt-generation and task-enqueue logic to reflect actual worker
availability: where the Operator prompt is built (the code that currently prints
the "separate Worker rig will automatically claim and execute tasks" message),
check the runtime struct's worker: Option<Arc<opengoose_rig::rig::Worker>> and
only show that message when worker.is_some(); additionally, gate task
creation/enqueue paths to prevent submitting tasks when worker.is_none() (or
show a clear warning and require explicit confirmation), so both the prompt text
and the enqueue code consult runtime.worker before advertising or accepting
work.
| ``` | ||
| opengoose-board (독립) | ||
| ↑ | ||
| opengoose-rig (board, goose) | ||
| ↑ | ||
| opengoose-evolver [NEW] (board, rig, skills, goose) | ||
| ↑ | ||
| opengoose (board, rig, skills, evolver) | ||
|
|
||
| opengoose-skills (독립) | ||
| opengoose-sandbox (독립, 실험적) | ||
| ``` |
There was a problem hiding this comment.
Add language identifiers to the fenced code blocks.
markdownlint is already flagging these fences with MD040, so this document will keep failing lint until the blocks are annotated with a language like text, tree, or diff.
Also applies to: 69-77, 93-96, 102-121, 125-136, 218-226
🧰 Tools
🪛 markdownlint-cli2 (0.22.0)
[warning] 52-52: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/superpowers/specs/2026-03-26-eval-quality-fixes-design.md` around lines
52 - 63, The fenced code blocks (e.g., the block showing the module hierarchy
around opengoose-board / opengoose-rig / opengoose-evolver and the other blocks
at ranges 69-77, 93-96, 102-121, 125-136, 218-226) need language identifiers to
satisfy markdownlint MD040; edit each triple-backtick fence (including the
example tree/diff blocks) to add an appropriate language token such as `text`,
`tree`, or `diff` (e.g., ```text or ```diff) so the linter recognizes the block
type and the document stops failing MD040. Ensure every fenced block in the file
is annotated consistently.
Summary
opengoose-rig에 단일pub fn home_dir()통합. evolver에서dirs의존 제거.rig_id: &str→&RigId로 통일.AsRef<str>구현 추가.completed_by_rig→completed_by이름 통일./// # Examples추가Test plan
cargo check— 클린 빌드cargo test— 모든 테스트 통과 (0 failures)cargo clippy --all-targets— 경고 없음cargo fmt— 포맷 적용🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
New Features
opengoose-evolvercrate for autonomous skill evolution workflowsAgentConfig,create_agent) inopengoose-rigfor flexible agent initializationRigIdtype with validation support (try_new())Improvements
RigIdin board APIsDocumentation
Tests