t1012: MODELS.md — live model leaderboard with success rates by marcusquinn · Pull Request #1341 · marcusquinn/aidevops

marcusquinn · 2026-02-13T00:40:12Z

Summary

Fixes supervisor-helper.sh syntax error (duplicate Phase 12 code from prior incomplete merge)
Hardens generate-models-md.sh: adds --output argument guard, fixes weighted average SQL to handle missing score criteria
Regenerates MODELS.md with latest pattern data (508 data points)

Verification (t1008 verify mode)

Prior PR #1305 was closed due to CodeRabbit review issues. This PR addresses:

Supervisor syntax error (CRITICAL): Removed orphaned duplicate code block at line 11049 that caused bash -n to fail. Verified: bash -n supervisor-helper.sh passes.
--output guard: --output without a value now exits with an error instead of aborting with an unbound-variable failure under set -u. Verified: generate-models-md.sh --output exits with an error.
Weighted average SQL: Replaced the fragile AVG(CASE...) * 4 with a per-response subquery that handles missing criteria correctly by normalizing the divisor via NULLIF.
Regen stamp on failure (dismissed): Stamp update is intentionally unconditional — it is a throttle, not a success tracker.
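
The weighted-average fix can be illustrated with a small, self-contained query. This is a sketch under stated assumptions, not the query from generate-models-md.sh: the scores and weights tables, the criterion names, and the weight values are all hypothetical, and it needs the sqlite3 command-line shell. The inner query computes one weighted score per response and divides by the sum of the weights that are actually present, so a response with missing criteria is scored over what it has.

```shell
# Sketch only: table names, criteria, and weights are hypothetical,
# not taken from the response-scoring database.
weighted_avg_demo() {
  sqlite3 -batch -noheader :memory: <<'SQL'
CREATE TABLE weights (criterion TEXT PRIMARY KEY, weight INTEGER);
INSERT INTO weights VALUES
  ('correctness', 40), ('completeness', 25), ('code_quality', 20), ('clarity', 15);

-- Response 1 has all four criteria; response 2 is missing two of them.
CREATE TABLE scores (model TEXT, response_id INTEGER, criterion TEXT, score REAL);
INSERT INTO scores VALUES
  ('model-a', 1, 'correctness', 5), ('model-a', 1, 'completeness', 4),
  ('model-a', 1, 'code_quality', 4), ('model-a', 1, 'clarity', 3),
  ('model-a', 2, 'correctness', 4), ('model-a', 2, 'clarity', 2);

-- Inner query: one normalized weighted score per response.
-- NULLIF turns a zero weight sum into NULL instead of a division by zero.
SELECT model, ROUND(AVG(weighted), 2)
FROM (
  SELECT s.model, s.response_id,
         SUM(s.score * w.weight) / NULLIF(SUM(w.weight), 0) AS weighted
  FROM scores s
  JOIN weights w ON w.criterion = s.criterion
  GROUP BY s.model, s.response_id
)
GROUP BY model;
SQL
}
```

In this made-up data, response 2 is scored as 190 / 55 (about 3.45) rather than 190 / 100 (1.9), which is what a fixed divisor would produce when criteria are missing.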

Test Results

bash -n generate-models-md.sh — pass
bash -n supervisor-helper.sh — pass
shellcheck generate-models-md.sh — clean (SC1091 info only)
generate-models-md.sh --quiet — generates valid MODELS.md
generate-models-md.sh --output (no value) — correctly errors with guard
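
For readers unfamiliar with the failure mode, here is a minimal sketch of an --output guard. It is not the code from generate-models-md.sh: the parse_args function, its variables, and the error messages are hypothetical. The point is that under set -u, reading "$2" when only one argument remains aborts the shell, so the argument count is checked first.

```shell
set -u

# Hypothetical option parser; names and messages are illustrative only.
parse_args() {
  output="MODELS.md"
  quiet=0
  while [ $# -gt 0 ]; do
    case "$1" in
      --output)
        # Check the count before touching "$2": with set -u an unset
        # positional parameter is a fatal "unbound variable" error.
        if [ $# -lt 2 ]; then
          echo "error: --output requires a value" >&2
          return 2
        fi
        output="$2"
        shift 2
        ;;
      --quiet)
        quiet=1
        shift
        ;;
      *)
        echo "error: unknown option: $1" >&2
        return 2
        ;;
    esac
  done
}
```

Calling parse_args --output with no value returns a non-zero status and prints a message to stderr, while parse_args --output out.md sets output to out.md.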

…1012) Queries three SQLite databases (model-registry, pattern-tracker, response-scoring) to produce a Markdown leaderboard showing all available models, success rates by tier and task type, quality scores, and head-to-head contest results.

Generated from live data: 17 models across 6 providers, 487 pattern data points, 18 scored responses. Shows success rates by model tier and task type.

Hourly throttled pulse phase iterates over known repos and regenerates MODELS.md when pattern data changes. Registered in subagent-index.toon.
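
The throttling behaviour described here, together with the note that the stamp update is intentionally unconditional, can be sketched as follows. This is an illustration under assumptions, not the supervisor's Phase 12 code: the function names, the stamp-file format (epoch seconds), and the interval argument are hypothetical, and change detection on the pattern data is left out.

```shell
# Hypothetical helpers; the real supervisor phase may differ.

# Succeeds when at least $2 seconds have passed since the time in stamp file $1.
throttle_due() {
  local stamp_file="$1" interval="$2" now last
  now=$(date +%s)
  last=0
  if [ -r "$stamp_file" ]; then
    last=$(cat "$stamp_file")
  fi
  # Treat an empty or non-numeric stamp as "never ran".
  case "$last" in
    ''|*[!0-9]*) last=0 ;;
  esac
  [ $((now - last)) -ge "$interval" ]
}

# Runs the given command at most once per interval.
run_throttled() {
  local stamp_file="$1" interval="$2"
  shift 2
  throttle_due "$stamp_file" "$interval" || return 0
  # Record the attempt before running the command, whatever its result:
  # the stamp is a throttle, so a failing command is not retried every pulse.
  date +%s > "$stamp_file"
  "$@"
}
```

A call such as run_throttled /tmp/models.stamp 3600 ./generate-models-md.sh --quiet (paths chosen for illustration) would run the generator at most once an hour, even if a run fails.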

… (t1012)
- Remove duplicate Phase 12 code block causing bash syntax error in supervisor
- Add --output argument guard to prevent unbound variable crash (set -u)
- Fix weighted average SQL to use per-response subquery (handles missing criteria)
- Regenerate MODELS.md with latest pattern data (508 data points)

gemini-code-assist · 2026-02-13T00:40:16Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

coderabbitai · 2026-02-13T00:40:20Z

Warning

Rate limit exceeded

@marcusquinn has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 3 minutes and 10 seconds before requesting another review.


github-actions · 2026-02-13T00:40:39Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 0 code smells

[INFO] Recent monitoring activity:
Fri Feb 13 00:40:35 UTC 2026: Code review monitoring started
Fri Feb 13 00:40:35 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 0

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 0
VULNERABILITIES: 0

Generated on: Fri Feb 13 00:40:38 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

sonarqubecloud · 2026-02-13T00:41:23Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

marcusquinn added 4 commits February 12, 2026 21:53

feat: add MODELS.md — initial leaderboard snapshot (t1012)

3fc960a

Generated from live data: 17 models across 6 providers, 487 pattern data points, 18 scored responses. Shows success rates by model tier and task type.

feat: add supervisor Phase 12 for MODELS.md auto-regeneration (t1012)

b970e25

Hourly throttled pulse phase iterates over known repos and regenerates MODELS.md when pattern data changes. Registered in subagent-index.toon.

marcusquinn merged commit 31cdc7f into main Feb 13, 2026
11 checks passed

github-actions bot mentioned this pull request Feb 13, 2026

t1012: MODELS.md — live model leaderboard with success rates from pattern tracker #1302

Closed

This was referenced Feb 13, 2026

[Supervisor:marcusquinn] 0 queued, 1 working at 19:50 UTC #1314

Closed

t1011: Model contest mode #1301

Closed

marcusquinn deleted the feature/t1012 branch February 21, 2026 01:59
