feat: add cross-model review workflow to compare-models-helper (t132.8) #791
Conversation
Dispatch the same review prompt to multiple AI models in parallel, collect results, and produce a diff summary. Enables second-opinion patterns for code review, security audit, and architecture review.

- cmd_cross_review(): dispatches via runner-helper.sh to N models in parallel
- Creates temporary runners per model, cleans up after completion
- Collects JSON responses, extracts text, saves to output directory
- Generates word-count comparison and unified diff for 2-model comparisons
- Default models: sonnet,opus (Anthropic second opinion)
- Configurable: --models, --timeout, --workdir, --output

Fixes: t132.8 ref:GH#740
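A rough sketch of the dispatch-and-wait pattern described above, assuming the variable names used in the review snippets later in this thread; the `models` default, the `$$`-suffixed runner names, and the reporting strings are illustrative assumptions, not the PR's exact code:

```bash
# Illustrative only: mirrors the per-model subshell pattern shown in the
# review below. Each model gets its own background subshell; the main loop
# then waits on each PID and reports per-model success or failure.
pids=() model_names=()
IFS=',' read -r -a model_names <<< "${models:-sonnet,opus}"
for model_tier in "${model_names[@]}"; do
    runner_name="cross-review-${model_tier}-$$"   # assumed naming scheme
    (
        # create runner -> run prompt -> extract text -> destroy runner
        :
    ) &
    pids+=("$!")
done
for i in "${!pids[@]}"; do
    if wait "${pids[$i]}"; then
        echo "  ${model_names[$i]}: done"
    else
        echo "  ${model_names[$i]}: failed" >&2
    fi
done
```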
Summary of Changes
Hello @alex-solovyev, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly.
All 8 subtasks of t132 (Cross-Provider Model Routing) are now complete:

- t132.1: Model-specific subagents (PR #758)
- t132.2: Provider/model registry (PR #761)
- t132.3: Model availability checker (PR #770)
- t132.4: Fallback chain config (PR #781)
- t132.5: Supervisor model resolution (PR #787)
- t132.6: Quality gate with escalation (PR #788)
- t132.7: Multi-provider runner/cron support (PR #789)
- t132.8: Cross-model review workflow (PR #791)

Also fixed stale git conflict markers in TODO.md.
Code Review
This pull request introduces a valuable cross-review feature for comparing model outputs in parallel. The overall implementation is solid, but a few important robustness and error-handling issues should be addressed. My review covers adding a cleanup mechanism to prevent orphaned resources, fixing a logic bug in the parallel job monitoring by correctly handling command exit statuses, and clarifying the final diff output with more robust error handling. These changes will make the new feature more reliable and user-friendly, and bring it in line with shell-scripting best practices.
```bash
#   --models "sonnet,opus,pro" [--workdir path] [--timeout N] [--output dir]
# Dispatches via runner-helper.sh in parallel, collects outputs, produces summary.
#######################################
cmd_cross_review() {
```
The function launches background processes to create temporary runners but lacks a trap to ensure these runners are cleaned up if the script is interrupted (e.g., with Ctrl+C). This can lead to orphaned resources and violates the repository style guide (rule #33), which requires trap cleanup for temporary resources.
To fix this, please add a trap at the beginning of the function to handle INT and TERM signals. This trap should iterate through the runner_names array and call runner-helper.sh destroy on each. You should then clear the trap using trap - INT TERM before the function returns normally.
References
- Repository style guide rule #33 requires that temporary files (and by extension, temporary resources like runners) must have a `trap` for cleanup on RETURN or EXIT to prevent orphaned resources.
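A minimal sketch of the cleanup pattern this comment describes, assuming `runner_names` accumulates the names created in the dispatch loop and `$runner_helper` points at runner-helper.sh (both names taken from the snippets below):

```bash
cmd_cross_review() {
    local runner_names=()

    # On Ctrl+C or termination, destroy any runners created so far so no
    # orphaned resources are left behind (style guide rule #33).
    trap 'for r in "${runner_names[@]}"; do
              "$runner_helper" destroy "$r" --force 2>/dev/null || true
          done' INT TERM

    # ... create runners, dispatch reviews, wait, summarize ...

    # Normal exit path: remove the handlers before returning.
    trap - INT TERM
}
```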
```bash
(
    "$runner_helper" create "$runner_name" \
        --model "$model_tier" \
        --description "Cross-review: $model_tier" \
        --workdir "$workdir" 2>/dev/null || true

    local result_file="${output_dir}/${model_tier}.txt"
    "$runner_helper" run "$runner_name" "$prompt" \
        --model "$model_tier" \
        --timeout "$review_timeout" \
        --format json 2>/dev/null > "${output_dir}/${model_tier}.json" || true

    # Extract text response from JSON
    if [[ -f "${output_dir}/${model_tier}.json" ]]; then
        jq -r '.parts[]? | select(.type == "text") | .text' \
            "${output_dir}/${model_tier}.json" 2>/dev/null > "$result_file" || true
    fi

    # Clean up runner
    "$runner_helper" destroy "$runner_name" --force 2>/dev/null || true
) &
```
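For context, the jq filter in this block implies a response payload shaped roughly as follows; the actual runner-helper output format is not shown in this PR, so the shape is an assumption:

```bash
# Feed an illustrative payload through the same filter used above.
jq -r '.parts[]? | select(.type == "text") | .text' <<'EOF'
{"parts": [{"type": "text", "text": "Looks correct; consider adding a test."}]}
EOF
# Prints: Looks correct; consider adding a test.
```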
The subshell for each model will always exit with status 0 because all commands within it are guarded with || true. This makes the wait check for failures in the main loop ineffective, as it won't detect when a model run fails. The script will incorrectly report 'done' even for failed runs.
This suggestion modifies the subshell to capture the exit status of the critical runner-helper run command and use it as the exit status for the entire subshell. This allows the wait command to correctly detect and report failures. This approach aligns with best practices for shell scripting, specifically by capturing command exit codes in variables for clearer error handling, as recommended by ShellCheck SC2181.
```bash
(
    local run_status=0
    "$runner_helper" create "$runner_name" \
        --model "$model_tier" \
        --description "Cross-review: $model_tier" \
        --workdir "$workdir" 2>/dev/null || true

    local result_file="${output_dir}/${model_tier}.txt"
    "$runner_helper" run "$runner_name" "$prompt" \
        --model "$model_tier" \
        --timeout "$review_timeout" \
        --format json 2>/dev/null > "${output_dir}/${model_tier}.json" || run_status=$?

    # Extract text response from JSON
    if [[ -f "${output_dir}/${model_tier}.json" ]]; then
        jq -r '.parts[]? | select(.type == "text") | .text' \
            "${output_dir}/${model_tier}.json" 2>/dev/null > "$result_file" || true
    fi

    # Clean up runner
    "$runner_helper" destroy "$runner_name" --force 2>/dev/null || true

    exit "$run_status"
) &
```

References
- In shell scripts, capture a command's exit code in a variable instead of using `$?` directly in conditionals. This aligns with ShellCheck SC2181 and improves clarity for multi-way branches.
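As a concrete illustration of that guideline, here is a multi-way branch on an exit code; grep's documented statuses (0 = match, 1 = no match, >1 = error) make a convenient example:

```bash
grep -q "$pattern" "$file"
grep_status=$?   # capture immediately; any later command would overwrite $?
case "$grep_status" in
    0) echo "match found" ;;
    1) echo "no match" ;;
    *) echo "grep failed with status $grep_status" >&2 ;;
esac
```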
| echo "Diff (${model_names[0]} vs ${model_names[1]}):" | ||
| diff --unified=3 "$file_a" "$file_b" 2>/dev/null | head -100 || echo " (files are identical or diff unavailable)" | ||
| echo "" |
The current logic for displaying a diff is flawed. If files differ, diff exits with 1, but its output is piped to head, which usually exits 0, preventing the || echo ... fallback from running. If files are identical, diff exits 0 and produces no output, but no message is shown to the user. This makes it unclear whether files are identical or if an error occurred.
The suggested code explicitly checks the exit status of diff to correctly handle all three cases: identical files, different files, and an error during the diff operation. This approach aligns with best practices for shell scripting, specifically by capturing command exit codes in variables for clearer error handling and multi-way branching, as recommended by ShellCheck SC2181.
| echo "Diff (${model_names[0]} vs ${model_names[1]}):" | |
| diff --unified=3 "$file_a" "$file_b" 2>/dev/null | head -100 || echo " (files are identical or diff unavailable)" | |
| echo "" | |
| echo "Diff (${model_names[0]} vs ${model_names[1]}):" | |
| diff_output=$(diff --unified=3 "$file_a" "$file_b" 2>/dev/null) | |
| diff_status=$? | |
| if [[ $diff_status -eq 0 ]]; then | |
| echo " (files are identical)" | |
| elif [[ $diff_status -eq 1 ]]; then | |
| echo "$diff_output" | head -n 100 | |
| else | |
| echo " (diff command failed)" | |
| fi | |
| echo "" |
References
- In shell scripts, capture a command's exit code in a variable instead of using `$?` directly in conditionals. This aligns with ShellCheck SC2181 and improves clarity for multi-way branches.



Summary
- Adds a `cross-review` command to compare-models-helper.sh for second-opinion review patterns

Usage
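The body of this section did not survive the page capture; below is a hypothetical invocation reconstructed from the flags listed in the PR description (argument order and the positional prompt are assumptions):

```bash
# Second-opinion review with the default sonnet,opus pair:
./compare-models-helper.sh cross-review "Review this diff for security issues"

# Explicit model list, timeout, working directory, and output directory:
./compare-models-helper.sh cross-review "Audit the auth flow" \
    --models "sonnet,opus,pro" \
    --workdir . \
    --timeout 600 \
    --output /tmp/cross-review
```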
How It Works
The same review prompt is dispatched via runner-helper.sh to each requested model in parallel. A temporary runner is created per model and destroyed after completion; JSON responses are collected, their text parts extracted to per-model files, and for two-model runs a word-count comparison and unified diff are produced.

Use Cases
- Code review second opinion (default sonnet,opus pairing)
- Security audit with independent model perspectives
- Architecture review comparisons

Testing
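The checks listed below might be reproduced with the following commands (the script path is an assumption):

```bash
bash -n compare-models-helper.sh               # syntax check only, no execution
shellcheck -S error compare-models-helper.sh   # report only error-severity findings
```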
- `bash -n`: syntax check passes
- `shellcheck -S error`: zero violations

Fixes: t132.8 ref:GH#740