
t1098: Wire compare-models to read live pattern data #1637

Merged
marcusquinn merged 3 commits into main from feature/t1098
Feb 18, 2026

Conversation

@marcusquinn
Owner

Summary

  • Wires live success rate data from pattern-tracker-helper.sh (memory.db) into compare-models-helper.sh output
  • Pattern data now appears in list, compare, recommend, and capabilities commands alongside static pricing/capability specs
  • Adds new patterns command for a focused view of model performance with optional --task-type filtering
  • Gracefully degrades when no pattern data exists (no errors, helpful guidance)
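
The graceful-degradation guard described above might look like the following minimal sketch. The function name comes from the change list; the body is an assumption for illustration, not the merged code:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: has_pattern_data() is named in the PR summary,
# but this body is an assumption, not the merged implementation.
set -euo pipefail

PATTERN_DB="${AIDEVOPS_MEMORY_DIR:-$HOME/.aidevops/.agent-workspace/memory}/memory.db"

# True only when the shared memory DB exists and is readable; pattern-aware
# commands check this first and skip pattern output (with a hint) when it fails.
has_pattern_data() {
    [[ -f "$PATTERN_DB" && -r "$PATTERN_DB" ]]
}

if has_pattern_data; then
    echo "pattern data available: $PATTERN_DB"
else
    echo "No pattern data yet. Record outcomes with pattern-tracker-helper.sh first."
fi
```

Checking readability before querying is what lets every command print a helpful hint instead of a sqlite3 error.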

Changes

compare-models-helper.sh:

  • Added pattern tracker integration functions: has_pattern_data(), get_tier_success_rate(), model_id_to_tier(), format_pattern_badge(), get_all_tier_patterns()
  • cmd_list: Appends tier success rates after pricing table
  • cmd_compare: Shows per-model success rate badges in capabilities section
  • cmd_recommend: Includes success rates alongside pricing for recommended models + pattern insights section
  • cmd_capabilities: Shows success rates in tier mapping display
  • cmd_patterns: New dedicated command showing tier performance table with pricing
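
For illustration, the "100% (n=322) success" badge seen in the example output could be produced by something like this sketch of format_pattern_badge(). The function name comes from the change list above; the integer-percentage arithmetic is an assumption:

```shell
# Hypothetical sketch: format_pattern_badge() is named in the change list,
# but this integer-percentage body is an assumption, not the merged code.
format_pattern_badge() {
    local successes="$1" failures="$2"
    local total=$((successes + failures))
    # Graceful degradation: no samples, no badge.
    (( total == 0 )) && { echo ""; return 0; }
    printf '[%d%% (n=%d) success]\n' $(( successes * 100 / total )) "$total"
}

format_pattern_badge 322 0   # prints: [100% (n=322) success]
```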

compare-models.md: Updated agent doc with pattern data workflow step and examples

Example Output

sonnet: $3.00/$15.00 per 1M tokens, 200K context — 100% (n=322) success
aidevops Tier Mapping:
  sonnet   -> claude-sonnet-4        (Code implementation, review, most tasks) [100% (n=322) success]
  opus     -> claude-opus-4          (Architecture decisions, complex reasoning) [98% (n=471) success]

Verification

  • ShellCheck: zero violations
  • Tested with real pattern data (825 patterns across 5 tiers)
  • Tested with missing DB (graceful degradation, no errors)
  • Tested --task-type filtering

Ref #1623

Adds live success rate data from pattern-tracker-helper.sh memory DB
to compare-models-helper.sh. Pattern data appears in:
- cmd_list: tier success rates appended after pricing table
- cmd_compare: per-model success rate badges in capabilities section
- cmd_recommend: success rates alongside pricing for recommended models
- cmd_capabilities: success rates in tier mapping display
- cmd_patterns: new dedicated command for focused pattern data view

Integration reads directly from the shared memory.db (same schema as
pattern-tracker-helper.sh) — no data duplication. Gracefully degrades
when no pattern data exists.
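
The shared store can also be inspected by hand. A hypothetical one-liner, assuming sqlite3 is on PATH and the `learnings` table schema matches the queries quoted in the review comments on this PR:

```shell
# Hypothetical inspection query; the `learnings` table with `type`/`tags`
# columns is inferred from the queries in the review comments below.
# Mirrors the PR's graceful degradation: a missing DB just prints a hint.
DB="${AIDEVOPS_MEMORY_DIR:-$HOME/.aidevops/.agent-workspace/memory}/memory.db"
sqlite3 "$DB" "SELECT type, COUNT(*) FROM learnings GROUP BY type;" 2>/dev/null \
    || echo "no pattern data recorded yet"
```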
@coderabbitai
Contributor

coderabbitai bot commented Feb 18, 2026

Warning

Rate limit exceeded

@marcusquinn has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 14 minutes and 40 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.


@gemini-code-assist

Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the compare-models-helper.sh script by incorporating live pattern data from the pattern tracker. This integration allows users to view dynamic success rates and performance metrics alongside static model specifications, providing a more informed basis for model selection. The changes extend to several existing commands and introduce a new dedicated command for pattern analysis, ensuring a robust and user-friendly experience even when pattern data is unavailable.

Highlights

  • Pattern Data Integration: Integrated live success rate data from pattern-tracker-helper.sh (memory.db) into the compare-models-helper.sh script, allowing model performance insights to be displayed alongside static pricing and capability specifications.
  • Enhanced Command Outputs: Updated existing commands (list, compare, recommend, capabilities) to display pattern data, such as tier success rates and sample counts, providing a more comprehensive view of model performance.
  • New 'patterns' Command: Added a dedicated patterns command to compare-models-helper.sh for a focused view of model performance, including success rates, failures, and pricing, with optional filtering by task type.
  • Graceful Degradation: Implemented logic to gracefully handle scenarios where no pattern data exists, providing helpful guidance to the user without generating errors.
  • Documentation Update: Updated compare-models.md to reflect the new pattern data workflow, including examples and a new step for incorporating pattern data into model comparisons.
Changelog
  • .agents/scripts/compare-models-helper.sh
    • Added new functions: has_pattern_data(), get_tier_success_rate(), model_id_to_tier(), format_pattern_badge(), get_all_tier_patterns() to read and format pattern data from memory.db.
    • Modified cmd_list to display live success rates from the pattern tracker.
    • Modified cmd_compare to show per-model success rate badges in the capabilities section.
    • Modified cmd_recommend to include success rates alongside pricing for recommended models and add a 'Pattern Tracker Insights' section.
    • Modified cmd_capabilities to show success rates in the tier mapping display.
    • Added a new cmd_patterns command for a focused view of model performance with task-type filtering.
    • Updated cmd_help to include the new patterns command and its examples.
  • .agents/tools/ai-assistants/compare-models.md
    • Updated the helper command list to include patterns.
    • Added a new section 'Step 5: Include Pattern Data (t1098)' detailing the integration and usage of pattern data.
    • Updated the 'Data sources' description to include live pattern tracker data.
    • Added 'Pattern-backed insights' as a new actionable advice point.
    • Added scripts/pattern-tracker-helper.sh to the related documents list.
Activity
  • The pull request was created by marcusquinn.
  • No human review comments or activities have been recorded yet.

@github-actions

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 22 code smells

[INFO] Recent monitoring activity:
Wed Feb 18 02:56:12 UTC 2026: Code review monitoring started
Wed Feb 18 02:56:12 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 22

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 22
  • VULNERABILITIES: 0

Generated on: Wed Feb 18 02:56:15 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@gemini-code-assist left a comment


Code Review

This pull request effectively integrates live pattern data from the pattern-tracker-helper.sh into the compare-models-helper.sh script, adding valuable real-world performance metrics to the model comparison tool. The new patterns command is a great addition for a focused view on model success rates. The code is well-structured and handles the absence of data gracefully.

My review focuses on a couple of areas for improvement:

  • Code Duplication: There's an opportunity to refactor duplicated database query logic into a shared helper function.
  • Shell Scripting Robustness: A suggestion is made to use a bash array for iterating over model tiers to make the script more robust and compliant with ShellCheck best practices, as required by the repository's style guide.

Overall, these are solid enhancements that make the tool more data-driven.

# Same DB as pattern-tracker-helper.sh — no duplication of storage.

readonly PATTERN_DB="${AIDEVOPS_MEMORY_DIR:-$HOME/.aidevops/.agent-workspace/memory}/memory.db"
readonly PATTERN_VALID_MODELS="haiku flash sonnet pro opus"


medium

Using a space-separated string for PATTERN_VALID_MODELS and relying on word-splitting in for loops (in get_all_tier_patterns on line 113 and cmd_list on line 286) can be fragile and is flagged by ShellCheck (SC2043). The style guide (rule 14) requires zero ShellCheck violations. It's more robust to define this as a bash array. The loops should then be updated to iterate over the array elements safely (e.g., for tier in "${PATTERN_VALID_MODELS[@]}"; do).

Suggested change
readonly PATTERN_VALID_MODELS="haiku flash sonnet pro opus"
readonly -a PATTERN_VALID_MODELS=(haiku flash sonnet pro opus)
References
  1. ShellCheck zero violations required -- targeted inline disables with reason comments only (link)
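
The fix the comment asks for is small; a runnable sketch of the array form and the safe loop it describes:

```shell
# Sketch of the reviewer's suggestion: a readonly bash array iterated
# with proper quoting, which ShellCheck accepts without inline disables.
readonly -a PATTERN_VALID_MODELS=(haiku flash sonnet pro opus)

for tier in "${PATTERN_VALID_MODELS[@]}"; do
    echo "tier: $tier"
done
```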

@marcusquinn (Owner, Author)

Fixed: converted PATTERN_VALID_MODELS to a bash array (readonly -a) and updated all 3 loop sites to use "${PATTERN_VALID_MODELS[@]}". ShellCheck passes clean.

Comment on lines 912 to 921
# Get pattern data
local filter=""
if [[ -n "$task_type" ]]; then
filter="AND (tags LIKE '%${task_type}%' OR content LIKE '%task:${task_type}%')"
fi
local model_filter="AND (tags LIKE '%model:${tier}%' OR content LIKE '%model:${tier}%')"

local successes failures
successes=$(sqlite3 "$PATTERN_DB" "SELECT COUNT(*) FROM learnings WHERE type IN ('SUCCESS_PATTERN', 'WORKING_SOLUTION') $model_filter $filter;" 2>/dev/null || echo "0")
failures=$(sqlite3 "$PATTERN_DB" "SELECT COUNT(*) FROM learnings WHERE type IN ('FAILURE_PATTERN', 'FAILED_APPROACH', 'ERROR_FIX') $model_filter $filter;" 2>/dev/null || echo "0")


medium

This block of code to get success and failure counts from the database is duplicated from the get_tier_success_rate function. To improve maintainability and follow the DRY (Don't Repeat Yourself) principle, this logic should be extracted into a new internal helper function. Both get_tier_success_rate and cmd_patterns can then call this new function.

For example, you could create a function like this:

_get_tier_pattern_counts() {
    local tier="$1"
    local task_type="${2:-}"
    [[ -f "$PATTERN_DB" ]] || { echo "0|0"; return 0; }

    local filter=""
    if [[ -n "$task_type" ]]; then
        filter="AND (tags LIKE '%${task_type}%' OR content LIKE '%task:${task_type}%')"
    fi

    local model_filter="AND (tags LIKE '%model:${tier}%' OR content LIKE '%model:${tier}%')"

    local successes failures
    successes=$(sqlite3 "$PATTERN_DB" "SELECT COUNT(*) FROM learnings WHERE type IN ('SUCCESS_PATTERN', 'WORKING_SOLUTION') $model_filter $filter;" 2>/dev/null || echo "0")
    failures=$(sqlite3 "$PATTERN_DB" "SELECT COUNT(*) FROM learnings WHERE type IN ('FAILURE_PATTERN', 'FAILED_APPROACH', 'ERROR_FIX') $model_filter $filter;" 2>/dev/null || echo "0")

    echo "${successes}|${failures}"
    return 0
}

Then get_tier_success_rate and cmd_patterns can be simplified by calling _get_tier_pattern_counts.
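
A hypothetical caller for that helper, splitting the successes|failures pair with IFS. The DB query is stubbed here so the sketch runs standalone; the real helper would query sqlite3 as shown in the suggestion:

```shell
# Stub standing in for the reviewer's _get_tier_pattern_counts(), which
# would query sqlite3; fixed counts keep this sketch self-contained.
_get_tier_pattern_counts() { echo "322|0"; }

# Hypothetical simplified get_tier_success_rate() built on the helper:
get_tier_success_rate() {
    local tier="$1" successes failures
    IFS='|' read -r successes failures <<< "$(_get_tier_pattern_counts "$tier")"
    local total=$((successes + failures))
    (( total == 0 )) && { echo "no data"; return 0; }
    echo "$(( successes * 100 / total ))% (n=${total})"
}

get_tier_success_rate sonnet   # prints: 100% (n=322)
```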

References
  1. In shell scripts, extract repeated logic into an internal helper function to improve maintainability. This applies even for standalone scripts where external source dependencies are avoided.

@marcusquinn (Owner, Author)

Fixed: extracted _get_tier_pattern_counts() internal helper that returns successes|failures. Both get_tier_success_rate and cmd_patterns now call it instead of duplicating the query logic.

@github-actions

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 22 code smells

[INFO] Recent monitoring activity:
Wed Feb 18 03:07:43 UTC 2026: Code review monitoring started
Wed Feb 18 03:07:44 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 22

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 22
  • VULNERABILITIES: 0

Generated on: Wed Feb 18 03:07:46 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@sonarqubecloud

@marcusquinn marcusquinn merged commit 6aa0909 into main Feb 18, 2026
10 of 11 checks passed
@marcusquinn marcusquinn deleted the feature/t1098 branch February 18, 2026 03:17
marcusquinn added a commit that referenced this pull request Feb 18, 2026
…d t1094.1 (t1107)

All t1094 deliverables confirmed implemented:
- Prompt strategy tracking → t1095 (pr:#1629)
- Output quality gradient + failure categorization → t1096 (pr:#1632)
- Token usage → t1095 (pr:#1629)
- A/B comparison → t1098+t1099 (pr:#1637, pr:#1634)
- Prompt-repeat strategy → t1097 (pr:#1631)
- Build-agent reference → t1094.1 (pr:#1633)

t1094 parent ready to be marked complete: verified:2026-02-18
