ci(perf): Add benchmark history with JSON-in-comment storage by yamadashy · Pull Request #1289 · yamadashy/repomix

yamadashy · 2026-03-22T03:48:49Z

Store benchmark history as JSON in an HTML comment within the PR comment body. Each benchmark run archives its results into a JSON array, which is parsed with JSON.parse instead of fragile HTML regex extraction.

How it works

post-pending job: Reads the existing comment; if it contains completed results, archives them into the JSON history array, then posts a pending comment with the history preserved
comment job: Reads JSON history from the pending comment, posts final results with a History <details> section rendered from JSON
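The round trip between the two jobs can be sketched in Node as below. This is a minimal sketch under stated assumptions, not the workflow's actual script: the marker strings `START` and `END`, the helper names `embedHistory` and `extractHistory`, and the entry shape are all hypothetical, and escaping `--` inside the JSON is one possible way to keep the payload from closing the HTML comment early.

```javascript
// Hypothetical delimiters; the real marker strings are not shown in this thread.
const START = "<!-- bench-history:start ";
const END = " bench-history:end -->";

// Append the history array to the visible comment body inside an HTML comment.
function embedHistory(visibleBody, history) {
  // Rewrite "--" as "-\u002d" so the JSON text can never contain "-->".
  // "--" can only occur inside JSON strings, where \u002d decodes back to "-".
  const json = JSON.stringify(history).replace(/--/g, "-\\u002d");
  return `${visibleBody}\n${START}${json}${END}`;
}

// Recover the history array; returns [] when the marker is absent or the JSON is corrupted.
function extractHistory(body) {
  const start = body.indexOf(START);
  if (start === -1) return [];
  const end = body.indexOf(END, start + START.length);
  if (end === -1) return [];
  try {
    const parsed = JSON.parse(body.slice(start + START.length, end));
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return [];
  }
}
```

With this shape, post-pending would call `extractHistory` on the old comment body and `embedHistory` on the new pending body, so the rendered HTML never has to be parsed to recover past runs.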

Key design decisions

JSON-in-comment: Data and view are separated — JSON is the source of truth, HTML is rendered from it
File-based passing: Old comment body is saved to $RUNNER_TEMP/old-comment.txt to avoid env var size/escaping issues
Shell for gh api: Commit message fetching stays in shell to avoid escaping nightmares in Node
5 entry cap: History is limited to 5 entries to prevent comment bloat
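The entry cap could look roughly like the following. This is a sketch only: the function name `archive`, the newest-first ordering, and the `{ sha }` entry shape are assumptions for illustration. The same-SHA check mirrors the `prevSha !== shortSha` guard described in a later commit on this PR.

```javascript
const MAX_HISTORY = 5; // cap stated in the PR description

// Put the previous run's entry at the front (newest first) and drop anything beyond the cap.
// `prev` is the entry recovered from the old comment, or null if there was none.
function archive(history, prev, currentSha) {
  // Nothing to archive, or the workflow was re-run on the same commit.
  if (!prev || prev.sha === currentSha) return history;
  return [prev, ...history].slice(0, MAX_HISTORY);
}
```

Because the function returns a new array instead of mutating its input, the caller can compare old and new history when debugging a lost entry.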

Checklist

Run npm run test
Run npm run lint

🤖 Generated with Claude Code

Store benchmark history as JSON in an HTML comment within the PR comment body, replacing the need for artifact-based or HTML regex parsing approaches.

- History data stored as ``
- post-pending job archives completed results into JSON history
- comment job reads JSON history and renders History section
- Both jobs share the same renderHistory() logic
- Capped at 5 history entries to prevent comment bloat
- File-based comment passing to avoid env var escaping issues

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-03-22T03:49:04Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 00fe2aa5-579f-4b74-9419-3c9c8e9585f1

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Modified .github/workflows/perf-benchmark.yml to replace shell-based PR comment generation with Node-driven rendering, adding WORKFLOW_RUN_URL environment variable, embedded JSON history tracking with 5-entry limit, and improved benchmark result formatting.

Changes

Cohort / File(s)	Summary
Benchmark Workflow Refactoring `.github/workflows/perf-benchmark.yml`	Refactored PR comment generation from shell-based to Node-driven approach; added WORKFLOW_RUN_URL environment variable to `post-pending` and `comment` jobs; implemented embedded JSON history extraction and archiving (5-entry limit) via inline Node script; improved benchmark result reading from JSON files and formatting with optional History details section.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

ci(benchmark): Add performance benchmark workflow for PRs #1252 — Direct refactor/enhancement of the perf-benchmark workflow that replaces shell-based comment generation with Node-based rendering and adds embedded JSON history tracking.
ci(perf): Add pending comment before benchmark runs #1267 — Earlier modification of the same perf-benchmark workflow's PR comment posting/updating mechanism that this PR further enhances with Node-driven rendering and history archiving.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title directly and clearly summarizes the main change: adding benchmark history with JSON storage in comments.
Description check	✅ Passed	The description comprehensively covers the change with detailed explanations of how it works, design decisions, and completed checklist items.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

github-actions · 2026-03-22T03:49:34Z

⚡ Performance Benchmark

Latest commit:	`29e5b6e` fix(ci): Add post-pending dependency to comment job
Status:	✅ Benchmark complete!
Ubuntu:	2.51s (±0.05s) → 2.53s (±0.02s) · +0.02s (+0.7%)
macOS:	1.60s (±0.19s) → 1.86s (±0.17s) · +0.26s (+16.2%)
Windows:	3.10s (±0.04s) → 3.10s (±0.02s) · -0.01s (-0.2%)

Details

Packing the repomix repository with node bin/repomix.cjs
Warmup: 2 runs (discarded)
Measurement: 10 runs / 20 on macOS (median ± IQR)
Workflow run

History

b34dab8 fix(ci): Suppress shellcheck SC2016 for inline Node scripts

Ubuntu:	2.56s (±0.01s) → 2.56s (±0.04s) · +0.01s (+0.2%)
macOS:	1.32s (±0.07s) → 1.32s (±0.10s) · -0.00s (-0.3%)
Windows:	3.03s (±0.13s) → 3.01s (±0.04s) · -0.02s (-0.7%)

ee092dd fix(ci): Skip archiving same SHA on benchmark rerun

Ubuntu:	2.46s (±0.02s) → 2.49s (±0.06s) · +0.03s (+1.3%)
macOS:	1.98s (±0.14s) → 2.01s (±0.19s) · +0.03s (+1.6%)
Windows:	3.00s (±0.05s) → 3.25s (±0.13s) · +0.26s (+8.6%)

3828132 fix(ci): Harden benchmark history JSON parsing and HTML escaping

Ubuntu:	2.62s (±0.03s) → 2.62s (±0.03s) · +0.00s (+0.0%)
macOS:	1.25s (±0.04s) → 1.28s (±0.06s) · +0.04s (+2.8%)
Windows:	3.63s (±0.68s) → 3.63s (±0.59s) · +0.01s (+0.1%)

ef865fb ci(perf): Add benchmark history with JSON-in-comment storage

Ubuntu:	2.68s (±0.05s) → 2.69s (±0.08s) · +0.01s (+0.6%)
macOS:	2.02s (±0.14s) → 2.13s (±0.49s) · +0.11s (+5.6%)
Windows:	3.37s (±0.07s) → 3.46s (±0.06s) · +0.09s (+2.6%)

cloudflare-workers-and-pages · 2026-03-22T03:51:16Z

Deploying repomix with Cloudflare Pages

Latest commit:	`29e5b6e`
Status:	✅ Deploy successful!
Preview URL:	https://8acfa38e.repomix.pages.dev
Branch Preview URL:	https://ci-perf-benchmark-json-histo.repomix.pages.dev

- Scope OS-row regex to main table only (exclude History section)
- Wrap JSON.parse in try/catch to handle corrupted comment bodies
- HTML-escape commit messages to prevent injection and comment breakage
- Use start/end delimiters for JSON comment to avoid --> conflicts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Prevent duplicate history entry when a workflow is re-run on the same commit by checking prevSha !== shortSha before archiving.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add shellcheck disable directives for SC2016 (expressions don't expand in single quotes) on node -e invocations where single quotes are intentionally used to pass JavaScript code without shell expansion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Ensure comment job waits for post-pending to finish before reading the PR comment body, preventing a race condition where post-pending could overwrite completed results with an in-progress status.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

claude · 2026-03-22T04:16:15Z

Code Review - PR #1289 (3rd Review)

Previous reviews covered HTML escaping, double-escaping, duplication, stale comments, and consistency risks. Here are new findings only:

Issues

1. False "Benchmark complete!" when all benchmarks fail

The comment job runs with if: always() && !cancelled(), so it executes even when all benchmark jobs fail and produce no artifacts. readResult() returns null for missing files, formatResult(null) returns "-", and the comment shows a success status with all dashes. This is misleading — it claims success when everything failed. Consider checking if any results were read before displaying the success status.
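One way to gate the status on what was actually read is sketched below. The function name `statusLine`, the shape of `results`, and the failure and partial wording are assumptions for illustration; only the success text matches what the benchmark comment shows today.

```javascript
// results: map of OS name to a parsed benchmark result, or null when the artifact is missing
// (mirroring readResult() returning null for missing files).
function statusLine(results) {
  const values = Object.values(results);
  const ok = values.filter((r) => r !== null).length;
  if (ok === 0) return "❌ Benchmark failed: no results were produced";
  if (ok < values.length) {
    return `⚠️ Benchmark partially complete (${ok}/${values.length} platforms)`;
  }
  return "✅ Benchmark complete!";
}
```

For example, `statusLine({ ubuntu: null, macos: null, windows: null })` reports a failure instead of a success line followed by a table of dashes.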

2. No error gate between Node script and file read

If the inline Node script crashes (syntax error, runtime exception), the shell continues to read the temp file which either has stale content from a previous run or does not exist. The actual Node error gets buried. Consider deleting the temp file before running Node so stale reads are impossible.

3. Silent JSON parse failure with no diagnostic output

Both try/catch blocks around JSON.parse silently discard errors. If the history JSON gets corrupted, all history is lost with zero indication of why. A one-line console.error in the catch would make debugging possible.
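A sketch of the suggested diagnostic, assuming a hypothetical helper `parseHistory` that wraps the existing JSON.parse call; the log message text is illustrative:

```javascript
// Parse the embedded history JSON, falling back to an empty history on failure
// while leaving a trace in the job log.
function parseHistory(jsonText) {
  try {
    const parsed = JSON.parse(jsonText);
    if (!Array.isArray(parsed)) throw new TypeError("history is not an array");
    return parsed;
  } catch (err) {
    // stderr output appears in the Actions step log, so the cause of lost history is visible.
    console.error(`benchmark history discarded: ${err.message}`);
    return [];
  }
}
```

The fallback behaviour is unchanged (history resets to empty), but a corrupted blob now leaves one line of evidence in the log.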

Suggestions (non-blocking)

4. Redundant API call to re-fetch comment body

Both jobs make two API calls: one paginated call to find the comment ID (which already returns the full body in the response), then a second call to fetch the body by ID. These could be combined into a single call by extracting both .id and .body from the paginated response.
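Since each comment object in the list response already carries both `id` and `body`, the lookup can return both at once. The sketch below assumes the paginated response has been parsed into an array of pages; `findBenchmarkComment` and the `marker` argument are hypothetical names.

```javascript
// pages: array of pages, each an array of comment objects from the
// "list issue comments" REST endpoint (each object has `id` and `body`).
// marker: a string that identifies the benchmark comment (hypothetical).
function findBenchmarkComment(pages, marker) {
  for (const comment of pages.flat()) {
    if (typeof comment.body === "string" && comment.body.includes(marker)) {
      // Return both fields so no second request by ID is needed.
      return { id: comment.id, body: comment.body };
    }
  }
  return null;
}
```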

5. Concurrent pushes can silently lose history

If two commits are pushed to a PR in quick succession, both post-pending jobs read the same comment, compute their own histories, and write back — the second write silently overwrites the first. This is distinct from the previously-noted eventual-consistency risk. Consider using the comment ID ETag or a simple SHA-based version check.

Reviewed with Claude Code

devin-ai-integration

Devin Review found 1 new potential issue.

View 6 additional findings in Devin Review.

devin-ai-integration · 2026-03-22T04:21:28Z

.github/workflows/perf-benchmark.yml

+      - name: Comment on PR
+        if: ${{ github.event.pull_request.head.repo.fork == false }}


🟡 Step summary no longer written for fork PRs due to merged steps

In the old code, GITHUB_STEP_SUMMARY was written in the "Generate benchmark report" step which had no if guard — it ran for all PRs, including forks. In the new code, the step summary write (fs.appendFileSync(summaryFile, ...) at line 352) is inside the "Comment on PR" step which is guarded by if: ${{ github.event.pull_request.head.repo.fork == false }} at line 243. This means fork PRs lose the step summary entirely. Previously, the step summary was fork PR authors' only way to view benchmark results (since the PR comment was also fork-guarded).

Prompt for agents

In .github/workflows/perf-benchmark.yml, the "comment" job (starting around line 229) currently has a single "Comment on PR" step (line 242) guarded by the fork check. To restore the old behavior where GITHUB_STEP_SUMMARY was written regardless of fork status, split this into two steps:

1. A "Generate benchmark report" step with NO `if` condition that runs the Node script to generate the comment body and write to GITHUB_STEP_SUMMARY. This step should write the body to $RUNNER_TEMP/new-comment.md.
2. A "Comment on PR" step with `if: ${{ github.event.pull_request.head.repo.fork == false }}` that reads $RUNNER_TEMP/new-comment.md and posts/updates the PR comment.

This matches the old code's pattern where report generation (including step summary) was unconditional, and only the PR comment posting was fork-guarded.

claude · 2026-03-22T04:24:15Z

Code Review - PR 1289 (4th Review)

Previous reviews (Claude 3rd review + CodeRabbit) comprehensively covered the major issues. After a fresh pass, most significant findings have already been raised. Here is what remains:

Already-raised issues still open (summary for tracking)

Issue	Raised by	Status
Commit messages with closing comment sequence can break JSON-in-comment	CodeRabbit	Open
Silent catch hides JSON corruption	Claude 3rd review	Open
False success status when all benchmarks fail	Claude 3rd review	Open
No error gate if Node script crashes before writing file	Claude 3rd review	Open
Regex parses all tables, not just the latest	CodeRabbit	Open

New finding (minor)

Asymmetric encoding of msg field between storage and rendering

In post-pending, prevMsg is extracted from already-rendered HTML (so it is HTML-escaped), then stored raw in the JSON history blob. In renderHistory, h.msg is injected directly into HTML without escaping, which works today because the value is already HTML-escaped from the prior render. But commitMsg for the current entry is escaped via esc() before rendering, then the HTML-escaped version gets stored in JSON on the next cycle.

This means history entries accumulate one layer of HTML escaping per cycle they survive. After 2 cycles, an ampersand becomes double-escaped. This is a concrete bug: commit messages containing special HTML characters will progressively double-escape in the History section.

Fix: Store the plain-text commit message in JSON (not the HTML-escaped version). When extracting prevMsg, unescape HTML entities back to plain text before storing in the JSON blob. In renderHistory, always apply esc() to h.msg before inserting into HTML.
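The fix can be sketched as an escape/unescape pair applied at the two boundaries. `esc`, `renderHistory`, and `h.msg` are names used in this review; `unesc`, the entity table, and the rendered markup are assumptions for illustration and may differ from the workflow's real output.

```javascript
const ENTITIES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;" };
const REVERSE = { amp: "&", lt: "<", gt: ">", quot: '"' };

// Plain text -> HTML-safe text. Apply exactly once, at render time.
function esc(text) {
  return text.replace(/[&<>"]/g, (ch) => ENTITIES[ch]);
}

// HTML-safe text -> plain text. Apply when lifting prevMsg out of rendered HTML.
// A single pass ensures "&amp;lt;" becomes "&lt;" rather than "<".
function unesc(html) {
  return html.replace(/&(amp|lt|gt|quot);/g, (_, name) => REVERSE[name]);
}

// JSON holds plain text, so every entry is escaped here regardless of its age.
function renderHistory(history) {
  return history.map((h) => `<code>${esc(h.sha)}</code> ${esc(h.msg)}`).join("\n");
}
```

With plain text in storage, an entry renders identically whether it has survived one cycle or five, because `unesc(esc(msg))` returns the original message.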

Overall assessment

This is a well-structured CI change. The JSON-in-comment approach is clever and the file-based passing avoids env var escaping issues. The main risks are around HTML/comment escaping edge cases. Once the injection and double-escaping issues are addressed, this looks good to merge.

Reviewed with Claude Code

codecov · 2026-03-22T04:47:20Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.18%. Comparing base (506d7cf) to head (29e5b6e).
⚠️ Report is 6 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #1289   +/-   ##
=======================================
  Coverage   87.18%   87.18%           
=======================================
  Files         115      115           
  Lines        4324     4324           
  Branches     1002     1002           
=======================================
  Hits         3770     3770           
  Misses        554      554

coderabbitai bot approved these changes Mar 22, 2026

devin-ai-integration bot reviewed Mar 22, 2026

yamadashy merged commit f4a5131 into main Mar 22, 2026
63 checks passed

yamadashy deleted the ci/perf-benchmark-json-history branch March 22, 2026 14:14

This was referenced Mar 23, 2026

chore(ci): Increase perf benchmark history limit from 5 to 50 #1294

Merged

perf(ci): Improve benchmark stability with interleaved execution #1348

Merged

ci(perf-benchmark): Enable GitHub autolink for commit SHAs in benchmark comments #1353

Merged
