t1357.6: Add milestone validation worker for mission orchestrator#2519
t1357.6: Add milestone validation worker for mission orchestrator#2519alex-solovyev merged 5 commits intomainfrom
Conversation
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (4)
WalkthroughIntroduces a comprehensive milestone validation worker that automatically validates project milestones after features complete. Includes a 640-line bash script ( Changes
Sequence Diagram(s)sequenceDiagram
actor Orchestrator as Mission Orchestrator
participant MState as Mission State File
participant Validator as milestone-validate.sh
participant Checks as Validation Checks
participant Report as Result Reporter
Orchestrator->>Validator: Dispatch validation<br/>(--mission, --milestone, --repo, --mode)
Validator->>MState: Read mission state & current mode
Validator->>Validator: Detect UI milestone status
Validator->>Checks: Run deterministic checks<br/>(dependencies → build → tests)
Checks-->>Validator: Check results (PASS/FAIL/SKIP)
Validator->>Checks: Run advisory checks<br/>(lint → typecheck → browser QA)
Checks-->>Validator: Advisory results (WARN/SKIP)
Validator->>Report: Aggregate results & failures
Report-->>Validator: Formatted report (JSON/Markdown)
Validator-->>Orchestrator: Exit with status<br/>(0=pass, 1=blocking fail, 2=incomplete)
alt Validation Passes (exit 0)
Orchestrator->>MState: Log validation report
Orchestrator->>Orchestrator: Advance to next milestone
else Validation Fails (exit 1)
Orchestrator->>Orchestrator: Create fix tasks from report
Orchestrator->>Validator: Re-dispatch validation on fixes
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
This PR has a merge conflict (CONFLICTING state). A worker is already running for the related t1362 PR (#2520). Once that lands, this PR will need a rebase. Flagging for the next pulse cycle to dispatch a fix worker if still conflicting. |
|
Merge conflict detected (CONFLICTING state) and Codacy ACTION_REQUIRED. Dispatching a worker to fix the conflict and address Codacy findings. Resolves #2500. |
56926f4 to
44fd8a5
Compare
|
Supervisor pulse: PR has merge conflicts (CONFLICTING state) and Codacy ACTION_REQUIRED. Worker was running 7h+ with no resolution. Killed stuck worker. A new worker will be dispatched to resolve conflicts and address Codacy findings. Issue: #2500 |
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Fri Feb 27 23:59:43 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
Specialised worker dispatched after all features in a milestone complete. Auto-detects project tech stack (Node.js/Python/Rust/Go/Shell), runs full test suite + build + linter, optionally runs Playwright browser tests for UI milestones, reports pass/fail with specific issues, and creates fix tasks as GitHub issues on failure linked to the milestone. - milestone-validation-worker.sh: 1060-line validation runner script - milestone-validation.md: workflow documentation and integration guide - Updated mission-orchestrator.md Phase 4 to delegate to validation worker - Updated subagent-index.toon with new workflow and script entries
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.agents/scripts/milestone-validate.sh:
- Around line 20-24: The header documents an INCOMPLETE exit code (2) but the
script never returns exit 2; either remove the documented "2 - INCOMPLETE" entry
or implement the branch that emits exit 2 when feature-completion checks fail.
Edit milestone-validate.sh to (a) add a clear check in the validation flow
(e.g., inside the main/run_checks or validate_features logic) that detects
"features not all completed" and calls exit 2, or (b) remove/adjust the header
comment to match only the exit paths the script actually emits (exit 0, 1, 3).
Ensure the symbols mentioned (exit 0, exit 1, exit 2, exit 3 and the
validate_features/main/checks function) are consistent between code and
documentation.
- Around line 85-103: The option parsing reads $2 directly and can trigger
nounset or wrong exit codes; update the while/case that sets MISSION_PATH,
MILESTONE_NUM, REPO_PATH, and MODE to first check that the next token exists
(use "${2-}" or test -n "$2" and that it doesn't start with '-') and on failure
call the same argument-error path (exit 3); additionally validate MILESTONE_NUM
is numeric (e.g., grep -E '^[0-9]+$' or [[ "$MILESTONE_NUM" =~ ^[0-9]+$ ]]) and
validate MODE against allowed values before accepting it, and apply the same
guards to the second parsing block referenced (lines 132-149) so both parsing
locations use identical existence and value checks.
In @.agents/workflows/milestone-validation.md:
- Around line 116-117: Update the wording in the milestone-validation docs that
currently say “failing fast” / “no further checks run” to accurately reflect the
helper script behavior: state that the helper fails the milestone on the first
blocking check but still continues to run additional test/diagnostic steps to
collect extra diagnostics, and adjust the two occurrences (the paragraph
referencing “failing fast” at the current block and the similar text at the
later block) to this clarified description so operators aren’t misled.
- Around line 268-303: The outer fenced code block that starts with ```markdown
and contains inner fenced examples (e.g., inner ```text and other ``` blocks)
triggers markdownlint MD031/MD040; fix it by replacing the outer opening fence
```markdown with a longer fence (e.g., ````markdown) and the matching closing
``` with ```` so all inner ``` blocks remain literal; update the block that
begins "## Milestone {N} Validation Report" and its closing fence accordingly to
use the four-backtick variant.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📥 Commits
Reviewing files that changed from the base of the PR and between 33c4906 and 56926f4a3b5c3fe535810e85827daa4a240016bf.
📒 Files selected for processing (4)
.agents/scripts/milestone-validate.sh.agents/subagent-index.toon.agents/workflows/milestone-validation.md.agents/workflows/mission-orchestrator.md
| # Exit codes: | ||
| # 0 - PASS (all blocking checks passed) | ||
| # 1 - FAIL (one or more blocking checks failed) | ||
| # 2 - INCOMPLETE (features not all completed) | ||
| # 3 - ERROR (script error, invalid arguments) |
There was a problem hiding this comment.
Exit code contract is inconsistent: INCOMPLETE (2) is declared but unreachable.
The script documents exit code 2, but current control flow only emits 0/1/3. This can mislead orchestrator handling and downstream automation.
Also applies to: 628-637
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.agents/scripts/milestone-validate.sh around lines 20 - 24, The header
documents an INCOMPLETE exit code (2) but the script never returns exit 2;
either remove the documented "2 - INCOMPLETE" entry or implement the branch that
emits exit 2 when feature-completion checks fail. Edit milestone-validate.sh to
(a) add a clear check in the validation flow (e.g., inside the main/run_checks
or validate_features logic) that detects "features not all completed" and calls
exit 2, or (b) remove/adjust the header comment to match only the exit paths the
script actually emits (exit 0, 1, 3). Ensure the symbols mentioned (exit 0, exit
1, exit 2, exit 3 and the validate_features/main/checks function) are consistent
between code and documentation.
There was a problem hiding this comment.
Verified against current code. The script header (lines 29-32) documents 2 - Configuration error, not INCOMPLETE. The code emits return 2 in 9 places within parse_args() for config errors (missing args, bad paths, non-numeric milestone, etc.). The "INCOMPLETE" label was only in the PR body description, which has now been corrected. No code change needed — exit code 2 is reachable and correctly documented.
There was a problem hiding this comment.
@alex-solovyev Thank you for the clarification and for verifying against the current code. You're absolutely right—I apologize for the confusion. If the script header correctly documents exit code 2 as "Configuration error" and the code emits it in multiple places within parse_args() for config errors, then the exit code contract is properly implemented and documented.
I appreciate you taking the time to correct my misunderstanding. The exit code handling is sound, and no changes are needed.
🐰💭
Sorry for the
false alarm!
( •_•)
/>💔
| while [[ $# -gt 0 ]]; do | ||
| local arg="$1" | ||
| case "$arg" in | ||
| --mission) | ||
| MISSION_PATH="$2" | ||
| shift 2 | ||
| ;; | ||
| --milestone) | ||
| MILESTONE_NUM="$2" | ||
| shift 2 | ||
| ;; | ||
| --repo) | ||
| REPO_PATH="$2" | ||
| shift 2 | ||
| ;; | ||
| --mode) | ||
| MODE="$2" | ||
| shift 2 | ||
| ;; |
There was a problem hiding this comment.
Harden option-value parsing to avoid nounset crashes and wrong exit codes.
For value-taking flags, Line 89/93/97/101 reads $2 without checking it exists. With set -u, a missing value exits abruptly (code 1) instead of the documented argument-error path (code 3). Also validate --mode and ensure --milestone is numeric.
Suggested fix
parse_args() {
while [[ $# -gt 0 ]]; do
local arg="$1"
+ local need_value=false
+ case "$arg" in
+ --mission|--milestone|--repo|--mode) need_value=true ;;
+ esac
+ if [[ "$need_value" == "true" && $# -lt 2 ]]; then
+ print_error "Missing value for option: $arg"
+ show_usage
+ exit 3
+ fi
+
case "$arg" in
--mission)
MISSION_PATH="$2"
@@
if [[ -z "$MILESTONE_NUM" ]]; then
print_error "Missing required argument: --milestone"
show_usage
exit 3
fi
+ if ! [[ "$MILESTONE_NUM" =~ ^[0-9]+$ ]]; then
+ print_error "Invalid --milestone value: $MILESTONE_NUM (expected integer)"
+ exit 3
+ fi
@@
if [[ -z "$REPO_PATH" ]]; then
print_error "Missing required argument: --repo"
show_usage
exit 3
fi
+ if [[ -n "$MODE" && "$MODE" != "poc" && "$MODE" != "full" ]]; then
+ print_error "Invalid --mode value: $MODE (expected poc|full)"
+ exit 3
+ fiAlso applies to: 132-149
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.agents/scripts/milestone-validate.sh around lines 85 - 103, The option
parsing reads $2 directly and can trigger nounset or wrong exit codes; update
the while/case that sets MISSION_PATH, MILESTONE_NUM, REPO_PATH, and MODE to
first check that the next token exists (use "${2-}" or test -n "$2" and that it
doesn't start with '-') and on failure call the same argument-error path (exit
3); additionally validate MILESTONE_NUM is numeric (e.g., grep -E '^[0-9]+$' or
[[ "$MILESTONE_NUM" =~ ^[0-9]+$ ]]) and validate MODE against allowed values
before accepting it, and apply the same guards to the second parsing block
referenced (lines 132-149) so both parsing locations use identical existence and
value checks.
There was a problem hiding this comment.
This review was on the old milestone-validate.sh (force-pushed away). The current script milestone-validation-worker.sh was rewritten and already includes require_value() guards on all value-taking flags (--repo-path, --browser-url, --max-retries), numeric validation for milestone number (line 287), and proper ${2-} usage. Fixed in commit 6d9613ce. No --mode flag exists in the current script — mode is read from the mission file.
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify the current state of the milestone validation script
# Check if the old file exists
if [[ -f ".agents/scripts/milestone-validate.sh" ]]; then
echo "=== OLD FILE STILL EXISTS ==="
head -n 30 .agents/scripts/milestone-validate.sh
else
echo "=== OLD FILE NOT FOUND ==="
fi
echo ""
echo "=== CHECKING FOR NEW FILE ==="
# Check if the new file exists
if [[ -f ".agents/scripts/milestone-validation-worker.sh" ]]; then
echo "NEW FILE FOUND: milestone-validation-worker.sh"
echo ""
# Check for require_value function
echo "=== Checking for require_value() function ==="
rg -n "require_value" .agents/scripts/milestone-validation-worker.sh | head -n 5
echo ""
echo "=== Checking argument parsing around line 287 ==="
sed -n '280,295p' .agents/scripts/milestone-validation-worker.sh
echo ""
echo "=== Checking for --mode flag usage ==="
rg -n "\-\-mode" .agents/scripts/milestone-validation-worker.sh || echo "No --mode flag found"
else
echo "NEW FILE NOT FOUND"
fi
Length of output: 972
@alex-solovyev Thanks for the clarification! I've confirmed that the old file is gone and the new milestone-validation-worker.sh script includes all the proper safeguards:
- ✅
require_value()helper with${2-}guards on all value-taking flags - ✅ Numeric validation for milestone number
- ✅ No
--modeflag (reads from mission file instead)
The rewritten script addresses the concerns. Appreciate the update!
| The helper script runs these checks in order (failing fast on blocking checks): | ||
|
|
There was a problem hiding this comment.
Documentation of build-failure flow is inconsistent with helper behavior.
This section says “failing fast” and “no further checks run,” but the helper script still runs tests after build failure to collect extra diagnostics. Please align wording to avoid operator confusion.
Also applies to: 154-155
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.agents/workflows/milestone-validation.md around lines 116 - 117, Update the
wording in the milestone-validation docs that currently say “failing fast” / “no
further checks run” to accurately reflect the helper script behavior: state that
the helper fails the milestone on the first blocking check but still continues
to run additional test/diagnostic steps to collect extra diagnostics, and adjust
the two occurrences (the paragraph referencing “failing fast” at the current
block and the similar text at the later block) to this clarified description so
operators aren’t misled.
There was a problem hiding this comment.
This review was on the old force-pushed version. The current milestone-validation.md (194 lines) has never contained "failing fast" or "no further checks run" wording. The script's actual behavior — continuing to collect diagnostics after a blocking failure — is correctly described in the existing docs without misleading language.
There was a problem hiding this comment.
@alex-solovyev, understood — thank you for clarifying. Since the file was force-pushed and the current version (194 lines) no longer contains the "failing fast" wording, the original concern doesn't apply. I appreciate you pointing out the version mismatch!
| ```markdown | ||
| ## Milestone {N} Validation Report | ||
|
|
||
| **Status:** PASS | FAIL | INCOMPLETE | ||
| **Timestamp:** {ISO datetime} | ||
| **Duration:** {seconds}s | ||
|
|
||
| ### Checks | ||
|
|
||
| | Check | Status | Details | | ||
| |-------|--------|---------| | ||
| | Feature completion | PASS/FAIL | {N}/{total} features have merged PRs | | ||
| | Build | PASS/FAIL | {build command output summary} | | ||
| | Tests | PASS/FAIL | {X} passed, {Y} failed, {Z} skipped | | ||
| | Lint | PASS/WARN | {N} issues (advisory) | | ||
| | Type check | PASS/WARN | {N} errors (advisory) | | ||
| | Custom: {criterion} | PASS/FAIL/MANUAL | {details} | | ||
| | Browser QA | PASS/WARN/SKIP | {summary or "skipped: not a UI milestone"} | | ||
|
|
||
| ### Failures (if any) | ||
|
|
||
| #### {Check name}: {failure summary} | ||
|
|
||
| ```text | ||
| {exact error output, truncated to 50 lines} | ||
| ``` | ||
|
|
||
| **Affected files:** {list of files mentioned in errors} | ||
| **Suggested fix:** {1-2 sentence description of what needs to change} | ||
|
|
||
| ### Budget Impact | ||
|
|
||
| - **Validation time:** {minutes}m | ||
| - **Estimated fix effort:** {hours}h (if failures exist) | ||
| ``` | ||
|
|
There was a problem hiding this comment.
Fix nested fenced-code formatting in the report template block.
The sample block around Line 302 triggers markdownlint (MD031/MD040) due fence structure. Use a longer outer fence so inner fenced examples remain literal.
Suggested fix
-```markdown
+````markdown
## Milestone {N} Validation Report
@@
-```
+````🧰 Tools
🪛 GitHub Check: Codacy Static Code Analysis
[notice] 302-302: .agents/workflows/milestone-validation.md#L302
Fenced code blocks should be surrounded by blank lines
[notice] 302-302: .agents/workflows/milestone-validation.md#L302
Fenced code blocks should have a language specified
🪛 markdownlint-cli2 (0.21.0)
[warning] 302-302: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
[warning] 302-302: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.agents/workflows/milestone-validation.md around lines 268 - 303, The outer
fenced code block that starts with ```markdown and contains inner fenced
examples (e.g., inner ```text and other ``` blocks) triggers markdownlint
MD031/MD040; fix it by replacing the outer opening fence ```markdown with a
longer fence (e.g., ````markdown) and the matching closing ``` with ```` so all
inner ``` blocks remain literal; update the block that begins "## Milestone {N}
Validation Report" and its closing fence accordingly to use the four-backtick
variant.
There was a problem hiding this comment.
This review was on the old force-pushed version. The current milestone-validation.md is 194 lines — there is no line 268-303 and no nested fenced code blocks. All code blocks use standard triple-backtick fences with language specifiers and proper blank-line separation. No MD031/MD040 violations.
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify the current state of milestone-validation.md
echo "=== Line count ==="
wc -l .agents/workflows/milestone-validation.md
echo -e "\n=== Checking for nested fenced code blocks (4+ backticks) ==="
grep -n '````' .agents/workflows/milestone-validation.md || echo "No quadruple-backtick fences found"
echo -e "\n=== All fenced code blocks ==="
grep -n '^```' .agents/workflows/milestone-validation.md | head -20
echo -e "\n=== File preview (first 50 lines) ==="
head -50 .agents/workflows/milestone-validation.md
echo -e "\n=== File preview (last 50 lines) ==="
tail -50 .agents/workflows/milestone-validation.md
Length of output: 3789
@alex-solovyev, you're absolutely right—the current file is 194 lines and the review comment is obsolete. I've verified that the current version uses standard triple-backtick fences with proper language specifiers and blank-line separation throughout. No MD031/MD040 violations present.
The review was generated on an earlier version that's no longer in the branch. Current state looks good!
…tion The package manager detection logic (bun/pnpm/yarn/npm) was duplicated 4 times across run_test_suite, run_build, run_linter, and check_dependencies. Extract into a single detect_pkg_manager() function to address Codacy duplication findings.
44fd8a5 to
cdcd68f
Compare
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Sat Feb 28 00:03:27 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
- Add reset_validation_state() helper to clear state between retry attempts - Wrap validation checks in retry loop using MAX_RETRIES (default: 3) - On transient failures (flaky tests, build races), retry before creating fix tasks - Log retry attempts and include attempt count in progress log - Resolves ShellCheck SC2034 (MAX_RETRIES appears unused)
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Sat Feb 28 00:04:53 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
…ation Address CodeRabbit review finding: value-taking flags (--repo-path, --browser-url, --max-retries) now validate that the next argument exists and is not another flag before consuming it. --max-retries additionally validates the value is numeric. Uses a require_value() helper to keep the pattern DRY across all value-taking options.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Sat Feb 28 00:12:07 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
|
@coderabbitai full review All 4 review comments have been addressed — the review was on the old force-pushed version. The current code:
Please re-review the current state of the branch. |
|
I'll perform a full review of the current state of the branch now. ✅ Actions performedFull review triggered. |
All 4 issues verified as resolved in current code. CodeRabbit confirmed each in follow-up comments.
|
@coderabbitai full review All 4 review findings have been verified against the current code:
Additionally: merge conflict with main has been resolved. |
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Sat Feb 28 00:25:48 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
|
I'll perform a complete review of the current branch state now. ✅ Actions performedFull review triggered. |
|
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.agents/scripts/milestone-validation-worker.sh:
- Around line 240-247: The --max-retries handler currently accepts 0 which
causes the validation loop to be skipped; update the parsing logic around the
--max-retries case to reject non-positive values by checking that "$2" is a
positive integer (>0) before assigning MAX_RETRIES, returning an error (and
calling log_error) if the value is 0 or negative; update the same validation in
the duplicate block referenced (the similar handler at lines 1055-1079) so both
places enforce a strictly positive integer for MAX_RETRIES and keep using
require_value "$arg" "${2-}" to ensure a value exists.
- Around line 17-27: Add a --json CLI flag to
.agents/scripts/milestone-validation-worker.sh and wire it into the option
parsing so the script sets a boolean (e.g., json_output) alongside existing
flags; when json_output is true, suppress human-readable logs and emit a single
machine-readable JSON object to stdout at the end of the run summarizing
total_checks, failures_count, warnings_count, failures (array of failure objects
with id/message/context), warnings (array), and final exit_code; ensure all
places that currently print or exit on error (the main validation loop / result
aggregation code referenced in the script’s main logic and finalization block
around the current human report output and the region currently repeated at
947-993) accumulate structured data into those arrays and counts so the JSON can
be emitted deterministically, and keep existing exit status behavior unchanged
(script should still exit non-zero on failures) while returning the same numeric
exit code in the JSON exit_code field.
- Around line 841-844: The check_dependencies block currently calls record_fail
and then returns 1 (e.g., in the install error path shown), which under set -e
can abort the whole script; instead, change the failure handling inside the
check_dependencies function (and the similar location around the second
occurrence) to not return a non-zero status immediately: capture the
install_exit, call record_fail "Dependency installation" with the error message,
set or append to a persistent failure flag/array (e.g., dependency_failure=1 or
failures+=("dependency")), and then continue so the function returns 0; update
any callers to inspect that flag after check_dependencies completes and handle
reporting/mission status/fix-task creation accordingly. Ensure you update both
places referenced (the shown install_exit path and the other occurrence) and
keep record_fail usage unchanged.
- Around line 705-711: The lint/type-check blocks currently call record_fail
(using variables lint_exit, lint_output and invoking "$pkg_cmd run lint"), which
makes the milestone fail; change these to be advisory by calling record_pass
instead (e.g., replace record_fail "Linter ($pkg_cmd run lint)" "$issue_count
issues found" with record_pass "Linter ($pkg_cmd run lint)" "$issue_count issues
found"), keeping the existing issue_count computation from lint_output so the
message shows how many issues were found; apply the same replacement for the
other lint/type-check occurrences (the blocks around lines shown) so
lint/type-check no longer flips the milestone to failed.
- Around line 592-598: The ShellCheck invocation currently limits scanning to
top-level files by using find with -maxdepth 1; update the find command used
around repo_path/.agents/scripts (the block that sets sc_output and sc_exit and
calls shellcheck) to recurse into subdirectories (e.g., remove -maxdepth 1 and
ensure you still pass all *.sh files to shellcheck and capture stderr into
sc_output while preserving sc_exit handling) so that all .agents/scripts/**/*.sh
files are linted by shellcheck.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📥 Commits
Reviewing files that changed from the base of the PR and between 56926f4a3b5c3fe535810e85827daa4a240016bf and c068362.
📒 Files selected for processing (4)
.agents/scripts/milestone-validation-worker.sh.agents/subagent-index.toon.agents/workflows/milestone-validation.md.agents/workflows/mission-orchestrator.md
🚧 Files skipped from review as they are similar to previous changes (2)
- .agents/subagent-index.toon
- .agents/workflows/milestone-validation.md
| # Options: | ||
| # --repo-path <path> Path to project repository (default: inferred from mission file) | ||
| # --browser-tests Run Playwright browser tests (for UI milestones) | ||
| # --browser-url <url> Base URL for browser tests (default: http://localhost:3000) | ||
| # --max-retries <n> Max validation retry attempts (default: 3) | ||
| # --create-fix-tasks Create fix tasks on failure (default: true) | ||
| # --no-fix-tasks Skip fix task creation on failure | ||
| # --report-only Run validation but don't update mission state | ||
| # --verbose Verbose output | ||
| # --help Show this help message | ||
| # |
There was a problem hiding this comment.
Add a machine-consumable report mode for orchestrator integration.
Output is currently human-readable only. A --json mode is needed for deterministic upstream parsing of check counts, failures, warnings, and exit status.
Also applies to: 947-993
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.agents/scripts/milestone-validation-worker.sh around lines 17 - 27, Add a
--json CLI flag to .agents/scripts/milestone-validation-worker.sh and wire it
into the option parsing so the script sets a boolean (e.g., json_output)
alongside existing flags; when json_output is true, suppress human-readable logs
and emit a single machine-readable JSON object to stdout at the end of the run
summarizing total_checks, failures_count, warnings_count, failures (array of
failure objects with id/message/context), warnings (array), and final exit_code;
ensure all places that currently print or exit on error (the main validation
loop / result aggregation code referenced in the script’s main logic and
finalization block around the current human report output and the region
currently repeated at 947-993) accumulate structured data into those arrays and
counts so the JSON can be emitted deterministically, and keep existing exit
status behavior unchanged (script should still exit non-zero on failures) while
returning the same numeric exit code in the JSON exit_code field.
| --max-retries) | ||
| require_value "$arg" "${2-}" || return 2 | ||
| if ! echo "$2" | grep -qE '^[0-9]+$'; then | ||
| log_error "--max-retries requires a numeric value, got: $2" | ||
| return 2 | ||
| fi | ||
| MAX_RETRIES="$2" | ||
| shift 2 |
There was a problem hiding this comment.
Reject --max-retries 0 to prevent false PASS with zero checks.
0 is currently accepted, which skips the loop entirely and can return success without running any validation.
Suggested fix
- if ! echo "$2" | grep -qE '^[0-9]+$'; then
- log_error "--max-retries requires a numeric value, got: $2"
+ if ! echo "$2" | grep -qE '^[1-9][0-9]*$'; then
+ log_error "--max-retries requires a positive integer (>=1), got: $2"
return 2
fiAlso applies to: 1055-1079
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.agents/scripts/milestone-validation-worker.sh around lines 240 - 247, The
--max-retries handler currently accepts 0 which causes the validation loop to be
skipped; update the parsing logic around the --max-retries case to reject
non-positive values by checking that "$2" is a positive integer (>0) before
assigning MAX_RETRIES, returning an error (and calling log_error) if the value
is 0 or negative; update the same validation in the duplicate block referenced
(the similar handler at lines 1055-1079) so both places enforce a strictly
positive integer for MAX_RETRIES and keep using require_value "$arg" "${2-}" to
ensure a value exists.
| if [[ -d "$repo_path/.agents/scripts" ]]; then | ||
| # Run ShellCheck on scripts if available | ||
| if command -v shellcheck >/dev/null 2>&1; then | ||
| local sc_output | ||
| local sc_exit=0 | ||
| sc_output=$(find "$repo_path/.agents/scripts" -maxdepth 1 -name "*.sh" -exec shellcheck {} + 2>&1) || sc_exit=$? | ||
|
|
There was a problem hiding this comment.
ShellCheck scope misses nested scripts under .agents/scripts/**.
find ... -maxdepth 1 only validates top-level scripts and misses subdirectories.
Suggested fix
-sc_output=$(find "$repo_path/.agents/scripts" -maxdepth 1 -name "*.sh" -exec shellcheck {} + 2>&1) || sc_exit=$?
+sc_output=$(find "$repo_path/.agents/scripts" -type f -name "*.sh" -exec shellcheck {} + 2>&1) || sc_exit=$?📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if [[ -d "$repo_path/.agents/scripts" ]]; then | |
| # Run ShellCheck on scripts if available | |
| if command -v shellcheck >/dev/null 2>&1; then | |
| local sc_output | |
| local sc_exit=0 | |
| sc_output=$(find "$repo_path/.agents/scripts" -maxdepth 1 -name "*.sh" -exec shellcheck {} + 2>&1) || sc_exit=$? | |
| if [[ -d "$repo_path/.agents/scripts" ]]; then | |
| # Run ShellCheck on scripts if available | |
| if command -v shellcheck >/dev/null 2>&1; then | |
| local sc_output | |
| local sc_exit=0 | |
| sc_output=$(find "$repo_path/.agents/scripts" -type f -name "*.sh" -exec shellcheck {} + 2>&1) || sc_exit=$? |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.agents/scripts/milestone-validation-worker.sh around lines 592 - 598, The
ShellCheck invocation currently limits scanning to top-level files by using find
with -maxdepth 1; update the find command used around repo_path/.agents/scripts
(the block that sets sc_output and sc_exit and calls shellcheck) to recurse into
subdirectories (e.g., remove -maxdepth 1 and ensure you still pass all *.sh
files to shellcheck and capture stderr into sc_output while preserving sc_exit
handling) so that all .agents/scripts/**/*.sh files are linted by shellcheck.
| if [[ $lint_exit -eq 0 ]]; then | ||
| record_pass "Linter ($pkg_cmd run lint)" | ||
| else | ||
| local issue_count | ||
| issue_count=$(echo "$lint_output" | grep -cE '(error|warning)' || echo "unknown") | ||
| record_fail "Linter ($pkg_cmd run lint)" "$issue_count issues found" | ||
| fi |
There was a problem hiding this comment.
Lint/type-check is implemented as blocking, not advisory.
These paths call record_fail, which flips milestone result to failed. That conflicts with the documented behavior where lint/type-check should be advisory.
Suggested fix
- record_fail "Linter ($pkg_cmd run lint)" "$issue_count issues found"
+ record_warning "Linter ($pkg_cmd run lint)" "$issue_count issues found"
- record_fail "TypeScript type check" "$error_count type errors"
+ record_warning "TypeScript type check" "$error_count type errors"
- record_fail "Linter (ruff)" "$issue_count issues"
+ record_warning "Linter (ruff)" "$issue_count issues"Also applies to: 726-729, 746-749
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.agents/scripts/milestone-validation-worker.sh around lines 705 - 711, The
lint/type-check blocks currently call record_fail (using variables lint_exit,
lint_output and invoking "$pkg_cmd run lint"), which makes the milestone fail;
change these to be advisory by calling record_pass instead (e.g., replace
record_fail "Linter ($pkg_cmd run lint)" "$issue_count issues found" with
record_pass "Linter ($pkg_cmd run lint)" "$issue_count issues found"), keeping
the existing issue_count computation from lint_output so the message shows how
many issues were found; apply the same replacement for the other lint/type-check
occurrences (the blocks around lines shown) so lint/type-check no longer flips
the milestone to failed.
| if [[ $install_exit -ne 0 ]]; then | ||
| record_fail "Dependency installation" "$pkg_cmd install failed with exit code $install_exit" | ||
| return 1 | ||
| fi |
There was a problem hiding this comment.
Avoid early process termination on dependency install failure.
check_dependencies returns 1, and with set -e that can abort before report generation, mission status update, and fix-task creation.
Suggested fix
- record_fail "Dependency installation" "$pkg_cmd install failed with exit code $install_exit"
- return 1
+ record_fail "Dependency installation" "$pkg_cmd install failed with exit code $install_exit"
+ return 0Also applies to: 1064-1064
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.agents/scripts/milestone-validation-worker.sh around lines 841 - 844, The
check_dependencies block currently calls record_fail and then returns 1 (e.g.,
in the install error path shown), which under set -e can abort the whole script;
instead, change the failure handling inside the check_dependencies function (and
the similar location around the second occurrence) to not return a non-zero
status immediately: capture the install_exit, call record_fail "Dependency
installation" with the error message, set or append to a persistent failure
flag/array (e.g., dependency_failure=1 or failures+=("dependency")), and then
continue so the function returns 0; update any callers to inspect that flag
after check_dependencies completes and handle reporting/mission status/fix-task
creation accordingly. Ensure you update both places referenced (the shown
install_exit path and the other occurrence) and keep record_fail usage
unchanged.



Summary
milestone-validation-worker.sh(1104 lines) -- deterministic helper script that runs dependency install, build verification, test suite, lint, and browser QA checks against a repo after all features in a milestone complete. Auto-detects package manager (npm/yarn/pnpm/bun/pip/cargo) and build/test commands. Supports retry logic, fix task creation, and report-only mode. Exit codes: 0=PASS, 1=FAIL, 2=CONFIG_ERROR, 3=STATE_ERROR.milestone-validation.md(194 lines) -- subagent workflow doc defining the validation process: verify feature completion, run deterministic checks (deps, tests, build, lint), optional browser QA for UI milestones, generate structured report, and act on results (advance milestone or create fix tasks).mission-orchestrator.mdPhase 4 to dispatch the validation worker instead of inline checks, with cross-references to the new files and exit code handling.Details
The milestone validation worker is dispatched by the mission orchestrator (Phase 4) or pulse supervisor when all features in a milestone are detected as complete. It runs integration-level checks to verify that individually-passing features work together correctly.
Key design decisions:
--browser-testsflag)require_value()guards with${2-}for safe nounset handling--max-retries(default: 3) and state reset between attemptsCloses #2500
Summary by CodeRabbit