
Conversation

@alex-solovyev (Collaborator)

Summary

  • Adds cross-review command to compare-models-helper.sh for second-opinion review patterns
  • Dispatches the same prompt to multiple AI models in parallel via runner-helper.sh
  • Collects results, generates word count comparison and unified diff

Usage

# Review code with Sonnet and Opus (default)
compare-models-helper.sh cross-review \
  --prompt "Review this code for security issues: $(cat src/auth.ts)"

# Custom model set with timeout
compare-models-helper.sh cross-review \
  --prompt "Audit the architecture" \
  --models "sonnet,opus,pro" \
  --timeout 900

# Specify output directory
compare-models-helper.sh cross-review \
  --prompt "Review PR changes" \
  --models "opus,pro" \
  --output /tmp/my-review
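
Each model's raw JSON response and extracted text are saved into the output directory as <model>.json and <model>.txt (per the implementation quoted in the review comments below), so a quick manual comparison after the last example might look like this (assumed file layout):

ls /tmp/my-review
# opus.json  opus.txt  pro.json  pro.txt
diff --unified=3 /tmp/my-review/opus.txt /tmp/my-review/pro.txt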

How It Works

  1. Creates temporary runners for each model (via runner-helper.sh)
  2. Dispatches the same prompt to all models in parallel (background processes; see the sketch after this list)
  3. Waits for all to complete (with per-model timeout)
  4. Extracts text responses from JSON output
  5. Displays each model's response
  6. Generates diff summary: word counts + unified diff (for 2-model comparisons)
  7. Cleans up temporary runners
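
The fan-out/fan-in shape of steps 2-3 looks roughly like the sketch below; run_one_model, models, and pids are illustrative names, not the script's actual identifiers (the real subshell body is quoted in the review comments further down).

# Illustrative sketch only: dispatch each model in the background, then wait.
pids=()
for model in "${models[@]}"; do
    run_one_model "$model" &   # hypothetical worker: create runner, run prompt, clean up
    pids+=("$!")
done
for pid in "${pids[@]}"; do
    wait "$pid" || echo "a model run failed (pid $pid)" >&2
done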

Use Cases

  • Code review: Get a second opinion from a different model
  • Security audit: Cross-validate findings across providers
  • Architecture review: Compare reasoning approaches
  • Quality assurance: Verify model agreement on critical decisions

Testing

  • bash -n: Syntax check passes
  • ShellCheck -S error: Zero violations
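
Both checks can be reproduced locally; the script path is taken from the changelog below, and the ShellCheck call uses its standard -S/--severity flag:

bash -n .agents/scripts/compare-models-helper.sh
shellcheck -S error .agents/scripts/compare-models-helper.sh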

Fixes: t132.8 ref:GH#740

Dispatch the same review prompt to multiple AI models in parallel, collect
results, and produce a diff summary. Enables second-opinion patterns for
code review, security audit, and architecture review.

- cmd_cross_review(): dispatches via runner-helper.sh to N models in parallel
- Creates temporary runners per model, cleans up after completion
- Collects JSON responses, extracts text, saves to output directory
- Generates word count comparison and unified diff for 2-model comparisons
- Default models: sonnet,opus (Anthropic second opinion)
- Configurable: --models, --timeout, --workdir, --output

Fixes: t132.8 ref:GH#740
@alex-solovyev merged commit 26c0033 into main Feb 9, 2026
6 checks passed
@gemini-code-assist

Summary of Changes

Hello @alex-solovyev, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant enhancement to the compare-models-helper.sh script by adding a cross-review command. This new functionality allows users to simultaneously query multiple AI models with the same prompt, facilitating a comparative analysis of their responses. The script handles parallel execution, result collection, and provides tools like word count comparisons and unified diffs to highlight differences between model outputs, thereby supporting more robust and multi-faceted AI-driven reviews.

Highlights

  • New cross-review command: A new command, cross-review, has been added to compare-models-helper.sh to facilitate second-opinion review patterns across multiple AI models.
  • Parallel AI Model Dispatch: The script now dispatches the same prompt to multiple AI models in parallel using runner-helper.sh, allowing for concurrent processing of review requests.
  • Result Aggregation and Comparison: It collects responses from all models, generates a word count comparison for each, and creates a unified diff when exactly two models are compared, highlighting differences in their outputs.
  • Configurable Options: Users can specify the prompt, a comma-separated list of models, a working directory, a timeout for each model, and an output directory for results.
Changelog
  • .agents/scripts/compare-models-helper.sh
    • Added the cross-review command to the script's main command dispatch logic.
    • Implemented the cmd_cross_review function, which orchestrates the parallel execution of AI model prompts, result collection, and comparison.
    • Updated the help documentation (cmd_help) to list the new cross-review command and provide detailed usage examples.
    • Included logic for argument parsing (--prompt, --models, --workdir, --timeout, --output), default model selection, output directory management, and error handling for missing runner-helper.sh or insufficient models (a generic parsing sketch follows this list).
    • Integrated parallel execution using background processes (&) and wait for model responses, including error handling for failed dispatches.
    • Added functionality to extract text responses from JSON output using jq and display individual model results.
    • Implemented a diff summary section, including word count comparison for all successful model responses and a unified diff for exactly two models.
    • Ensured cleanup of temporary runners after execution.
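
For readers unfamiliar with the pattern, option handling like that described above is usually a while/case loop; this is an illustrative sketch, not the script's actual parsing code (variable names follow those visible in the quoted snippets):

while [[ $# -gt 0 ]]; do
    case "$1" in
        --prompt)  prompt="$2";         shift 2 ;;
        --models)  models_csv="$2";     shift 2 ;;
        --workdir) workdir="$2";        shift 2 ;;
        --timeout) review_timeout="$2"; shift 2 ;;
        --output)  output_dir="$2";     shift 2 ;;
        *) echo "Unknown option: $1" >&2; return 1 ;;
    esac
done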
Activity
  • The pull request introduces a new feature (feat) related to cross-model review workflows.
  • It addresses task t132.8 and references GH#740.
  • Basic shell syntax check (bash -n) and static analysis (ShellCheck -S error) passed with zero violations.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder at the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@github-actions

github-actions bot commented Feb 9, 2026

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 59 code smells

[INFO] Recent monitoring activity:
Mon Feb 9 17:30:23 UTC 2026: Code review monitoring started
Mon Feb 9 17:30:23 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 59

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 59
  • VULNERABILITIES: 0

Generated on: Mon Feb 9 17:30:25 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@sonarqubecloud

sonarqubecloud bot commented Feb 9, 2026

alex-solovyev added a commit that referenced this pull request Feb 9, 2026
All 8 subtasks of t132 (Cross-Provider Model Routing) are now complete:
- t132.1: Model-specific subagents (PR #758)
- t132.2: Provider/model registry (PR #761)
- t132.3: Model availability checker (PR #770)
- t132.4: Fallback chain config (PR #781)
- t132.5: Supervisor model resolution (PR #787)
- t132.6: Quality gate with escalation (PR #788)
- t132.7: Multi-provider runner/cron support (PR #789)
- t132.8: Cross-model review workflow (PR #791)

Also fixed stale git conflict markers in TODO.md.

@gemini-code-assist bot left a comment


Code Review

This pull request introduces a valuable cross-review feature for comparing model outputs in parallel. The overall implementation is solid, but there are a few important issues regarding robustness and error handling that should be addressed. My review includes feedback on adding a crucial cleanup mechanism to prevent orphaned resources, fixing a logic bug in the parallel job monitoring by correctly handling command exit statuses, and improving the clarity of the final diff output by applying robust error handling. These changes will make the new feature more reliable and user-friendly and align with shell scripting best practices.

# --models "sonnet,opus,pro" [--workdir path] [--timeout N] [--output dir]
# Dispatches via runner-helper.sh in parallel, collects outputs, produces summary.
#######################################
cmd_cross_review() {


Severity: high

The function launches background processes to create temporary runners but lacks a trap to ensure these runners are cleaned up if the script is interrupted (e.g., with Ctrl+C). This can lead to orphaned resources and violates the repository style guide (rule #33), which requires trap cleanup for temporary resources.

To fix this, please add a trap at the beginning of the function to handle INT and TERM signals. This trap should iterate through the runner_names array and call runner-helper.sh destroy on each. You should then clear the trap using trap - INT TERM before the function returns normally.
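
A minimal sketch of that suggestion, assuming runner_names and runner_helper are the variables already in scope in cmd_cross_review (cleanup_cross_review_runners is an illustrative name):

cleanup_cross_review_runners() {
    local name
    for name in "${runner_names[@]}"; do
        "$runner_helper" destroy "$name" --force 2>/dev/null || true
    done
}
trap cleanup_cross_review_runners INT TERM

# ... dispatch models, wait, collect results ...

trap - INT TERM   # clear the trap before the function returns normally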

References
  1. Repository style guide rule #33 requires that temporary files (and by extension, temporary resources like runners) must have a trap for cleanup on RETURN or EXIT to prevent orphaned resources. (link)

Comment on lines +562 to +582
        (
            "$runner_helper" create "$runner_name" \
                --model "$model_tier" \
                --description "Cross-review: $model_tier" \
                --workdir "$workdir" 2>/dev/null || true

            local result_file="${output_dir}/${model_tier}.txt"
            "$runner_helper" run "$runner_name" "$prompt" \
                --model "$model_tier" \
                --timeout "$review_timeout" \
                --format json 2>/dev/null > "${output_dir}/${model_tier}.json" || true

            # Extract text response from JSON
            if [[ -f "${output_dir}/${model_tier}.json" ]]; then
                jq -r '.parts[]? | select(.type == "text") | .text' \
                    "${output_dir}/${model_tier}.json" 2>/dev/null > "$result_file" || true
            fi

            # Clean up runner
            "$runner_helper" destroy "$runner_name" --force 2>/dev/null || true
        ) &


Severity: high

The subshell for each model will always exit with status 0 because all commands within it are guarded with || true. This makes the wait check for failures in the main loop ineffective, as it won't detect when a model run fails. The script will incorrectly report 'done' even for failed runs.

This suggestion modifies the subshell to capture the exit status of the critical runner-helper run command and use it as the exit status for the entire subshell. This allows the wait command to correctly detect and report failures. This approach aligns with best practices for shell scripting, specifically by capturing command exit codes in variables for clearer error handling, as recommended by ShellCheck SC2181.

        (
            local run_status=0
            "$runner_helper" create "$runner_name" \
                --model "$model_tier" \
                --description "Cross-review: $model_tier" \
                --workdir "$workdir" 2>/dev/null || true

            local result_file="${output_dir}/${model_tier}.txt"
            "$runner_helper" run "$runner_name" "$prompt" \
                --model "$model_tier" \
                --timeout "$review_timeout" \
                --format json 2>/dev/null > "${output_dir}/${model_tier}.json" || run_status=$?

            # Extract text response from JSON
            if [[ -f "${output_dir}/${model_tier}.json" ]]; then
                jq -r '.parts[]? | select(.type == "text") | .text' \
                    "${output_dir}/${model_tier}.json" 2>/dev/null > "$result_file" || true
            fi

            # Clean up runner
            "$runner_helper" destroy "$runner_name" --force 2>/dev/null || true

            exit "$run_status"
        ) &
References
  1. In shell scripts, capture a command's exit code in a variable instead of using $? directly in conditionals. This aligns with ShellCheck SC2181 and improves clarity for multi-way branches.

Comment on lines +646 to +648
echo "Diff (${model_names[0]} vs ${model_names[1]}):"
diff --unified=3 "$file_a" "$file_b" 2>/dev/null | head -100 || echo " (files are identical or diff unavailable)"
echo ""


Severity: medium

The current logic for displaying a diff is flawed. If files differ, diff exits with 1, but its output is piped to head, which usually exits 0, preventing the || echo ... fallback from running. If files are identical, diff exits 0 and produces no output, but no message is shown to the user. This makes it unclear whether files are identical or if an error occurred.

The suggested code explicitly checks the exit status of diff to correctly handle all three cases: identical files, different files, and an error during the diff operation. This approach aligns with best practices for shell scripting, specifically by capturing command exit codes in variables for clearer error handling and multi-way branching, as recommended by ShellCheck SC2181.

Suggested change

echo "Diff (${model_names[0]} vs ${model_names[1]}):"
diff_status=0
diff_output=$(diff --unified=3 "$file_a" "$file_b" 2>/dev/null) || diff_status=$?
if [[ $diff_status -eq 0 ]]; then
    echo " (files are identical)"
elif [[ $diff_status -eq 1 ]]; then
    echo "$diff_output" | head -n 100
else
    echo " (diff command failed)"
fi
echo ""
References
  1. In shell scripts, capture a command's exit code in a variable instead of using $? directly in conditionals. This aligns with ShellCheck SC2181 and improves clarity for multi-way branches.
