t1012: MODELS.md — live model leaderboard with success rates #1341

Merged: marcusquinn merged 4 commits into main from feature/t1012 on Feb 13, 2026
@marcusquinn
Owner

Summary

  • Fixes supervisor-helper.sh syntax error (duplicate Phase 12 code from prior incomplete merge)
  • Hardens generate-models-md.sh: adds --output argument guard, fixes weighted average SQL to handle missing score criteria
  • Regenerates MODELS.md with latest pattern data (508 data points)

Verification (t1008 verify mode)

Prior PR #1305 was closed due to CodeRabbit review issues. This PR addresses:

  1. Supervisor syntax error (CRITICAL): Removed orphaned duplicate code block at line 11049 that caused bash -n to fail. Verified: bash -n supervisor-helper.sh passes.
  2. --output guard: --output without value now exits with error instead of crashing on set -u. Verified: generate-models-md.sh --output exits with error.
  3. Weighted average SQL: Replaced fragile AVG(CASE...) * 4 with per-response subquery that correctly handles missing criteria via NULLIF divisor normalization.
  4. Regen stamp on failure (dismissed): Stamp update is intentionally unconditional — it is a throttle, not a success tracker.
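
The divisor normalization described in item 3 can be sketched as follows. This is a minimal illustration, not the query from generate-models-md.sh: the `scores` table, its columns, and the integer weights are hypothetical, and it assumes the `sqlite3` CLI is installed. Each response's weighted sum is divided by the sum of the weights it actually has, so a response missing some criteria is not scaled as if all of them were present. `NULLIF(..., 0)` turns a zero divisor into NULL instead of a division by zero.

```shell
#!/usr/bin/env bash
# Sketch only: the table and column names are hypothetical,
# not the real response-scoring schema.
weighted_scores_demo() {
  sqlite3 -batch -noheader -list :memory: <<'SQL'
CREATE TABLE scores (
  response_id INTEGER,
  criterion   TEXT,
  score       INTEGER,  -- rating for this criterion
  weight      INTEGER   -- relative weight of this criterion
);
INSERT INTO scores VALUES
  (1, 'correctness',  4, 4),
  (1, 'completeness', 5, 3),
  (1, 'quality',      3, 2),
  (1, 'clarity',      4, 1),
  -- Response 2 was scored on only two of the four criteria.
  (2, 'correctness',  5, 4),
  (2, 'completeness', 3, 3);
-- Per-response subquery: normalize by the weights that are present.
SELECT response_id, ROUND(weighted, 2)
FROM (
  SELECT response_id,
         SUM(score * weight) * 1.0 / NULLIF(SUM(weight), 0) AS weighted
  FROM scores
  GROUP BY response_id
)
ORDER BY response_id;
SQL
}
```

With this sample data, response 1 is divided by a total weight of 10 and response 2 by 7, so both land on the same scale even though response 2 has fewer criteria.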

Test Results

  • bash -n generate-models-md.sh — pass
  • bash -n supervisor-helper.sh — pass
  • shellcheck generate-models-md.sh — clean (SC1091 info only)
  • generate-models-md.sh --quiet — generates valid MODELS.md
  • generate-models-md.sh --output (no value) — correctly errors with guard
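
The `--output` guard exercised by the last check might look roughly like this. It is a sketch under assumptions, not the code in generate-models-md.sh: the function name `parse_output_arg`, the default path, and the error wording are all hypothetical. The point is that under `set -u`, reading `$2` when no value follows `--output` aborts with an "unbound variable" error, so the argument count is checked first.

```shell
#!/usr/bin/env bash
set -u

# Hypothetical helper: parses arguments and prints the chosen output path.
parse_output_arg() {
  local output="MODELS.md"   # assumed default, for illustration only
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --output)
        # Check the count before touching "$2"; the || short-circuits,
        # so "$2" is never expanded when it does not exist.
        if [ "$#" -lt 2 ] || [ -z "$2" ]; then
          echo "error: --output requires a value" >&2
          return 2
        fi
        output="$2"
        shift 2
        ;;
      --quiet)
        shift
        ;;
      *)
        echo "error: unknown argument: $1" >&2
        return 2
        ;;
    esac
  done
  printf '%s\n' "$output"
}
```

Calling `parse_output_arg --output` with no value prints the error to stderr and returns status 2 rather than tripping `set -u`.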

…1012)

Queries three SQLite databases (model-registry, pattern-tracker, response-scoring)
to produce a Markdown leaderboard showing all available models, success rates by
tier and task type, quality scores, and head-to-head contest results.
Generated from live data: 17 models across 6 providers, 487 pattern data points,
18 scored responses. Shows success rates by model tier and task type.
Hourly throttled pulse phase iterates over known repos and regenerates
MODELS.md when pattern data changes. Registered in subagent-index.toon.
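
A rough sketch of the throttle behaviour, assuming a stamp file that holds the epoch time of the last attempt. The function names and the stamp format are hypothetical, and the sketch covers only the throttling, not the change detection on pattern data. It writes the stamp whether or not the command succeeded, matching the PR description's note that the stamp is a throttle rather than a success tracker.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds when at least INTERVAL seconds have
# passed since the time recorded in STAMP_FILE (or no stamp exists).
should_run() {
  local stamp_file="$1" interval="$2" now last=0
  now="$(date +%s)"
  if [ -f "$stamp_file" ]; then
    last="$(cat "$stamp_file")"
  fi
  [ $(( now - ${last:-0} )) -ge "$interval" ]
}

# Hypothetical wrapper: run_throttled STAMP_FILE INTERVAL COMMAND [ARGS...]
run_throttled() {
  local stamp_file="$1" interval="$2" status=0
  shift 2
  # Inside the throttle window: skip quietly.
  should_run "$stamp_file" "$interval" || return 0
  "$@" || status=$?
  # Unconditional update: a failed run is not retried until the
  # next window opens.
  date +%s > "$stamp_file"
  return "$status"
}
```

For example, `run_throttled /tmp/models.stamp 3600 ./generate-models-md.sh --quiet` would attempt a regeneration at most once per hour, with the stamp path chosen here purely for illustration.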
… (t1012)

- Remove duplicate Phase 12 code block causing bash syntax error in supervisor
- Add --output argument guard to prevent unbound variable crash (set -u)
- Fix weighted average SQL to use per-response subquery (handles missing criteria)
- Regenerate MODELS.md with latest pattern data (508 data points)
@gemini-code-assist

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@coderabbitai
Contributor

coderabbitai bot commented Feb 13, 2026

Warning

Rate limit exceeded

@marcusquinn has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 3 minutes and 10 seconds before requesting another review.

@github-actions

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 0 code smells

[INFO] Recent monitoring activity:
Fri Feb 13 00:40:35 UTC 2026: Code review monitoring started
Fri Feb 13 00:40:35 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 0

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 0
  • VULNERABILITIES: 0

Generated on: Fri Feb 13 00:40:38 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@sonarqubecloud

@marcusquinn merged commit 31cdc7f into main on Feb 13, 2026
11 checks passed
@marcusquinn deleted the feature/t1012 branch on February 21, 2026 01:59