123 changes: 123 additions & 0 deletions .github/workflows/benchmark.yml
@@ -0,0 +1,123 @@
# This workflow will run on the default branch when triggered by a `workflow_run` event.

name: Benchmark

on:
workflow_run:
workflows: [Test]
types: [completed]

concurrency:
group: ${{ github.workflow }}-${{ github.event.workflow_run.head_branch }}
Member: I'd appreciate a comment here explaining why this is needed and what it does.

Contributor Author: I can add that it cancels in-progress runs (for runs on the same branch as the PR) if the workflow is re-triggered during a current run (e.g. by another push to the PR and a successful Test workflow run).
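A sketch of how that explanation could be inlined as a comment above the `concurrency` block (the wording is mine, paraphrasing the author's reply, not text from the PR):

```yaml
# Only one benchmark run per PR branch at a time: if the Test workflow
# completes again for the same branch (e.g. after another push to the
# PR), cancel any benchmark run still in progress and start fresh.
concurrency:
  group: ${{ github.workflow }}-${{ github.event.workflow_run.head_branch }}
  cancel-in-progress: true
```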

cancel-in-progress: true

jobs:
benchmark:
runs-on: ubuntu-24.04
Member: I'd like to use the same host we benchmark the Polkadot SDK with.

Contributor Author: It seems like different ones are being used for benches in the Polkadot SDK, but two common ones are e.g. parity-large and parity-benchmark (both enabled in our revive repo). The differences can be seen here.

if: ${{ github.event.workflow_run.conclusion == 'success' && github.event.workflow_run.event == 'pull_request' }}
permissions:
pull-requests: write
steps:
- name: Checkout PR Branch
uses: actions/checkout@v4
with:
ref: ${{ github.event.workflow_run.head_sha }}
Comment on lines +21 to +24
Member: Why are we checking out the PR branch here only to check out the main branch first and then switch to the PR branch again?

Contributor Author: Essentially due to the get-solc and get-llvm actions used first. I figured we'd want to use these action files from the PR branch in case they were modified as part of the PR. An alternative would be to run those steps twice (once on main and then on the PR branch after switching).

Member: I mean, either way, if they are modified in an incompatible way, either main or the PR branch is broken. I was just thinking it would be simpler to first bench the CI branch and then check out main for the comparison.


- name: Set Up Rust Toolchain
uses: actions-rust-lang/setup-rust-toolchain@v1
with:
# Without this it will override our rust flags.
rustflags: ""

- name: Install Solc
uses: ./.github/actions/get-solc

- name: Download LLVM
uses: ./.github/actions/get-llvm
with:
target: x86_64-unknown-linux-gnu

- name: Set LLVM Environment Variables
run: |
echo "LLVM_SYS_181_PREFIX=$(pwd)/llvm-x86_64-unknown-linux-gnu" >> $GITHUB_ENV

- name: Install Benchmarking Tools
run: |
cargo install critcmp

- name: Checkout and Compile Main Branch
run: |
git fetch origin main
git checkout -f main
Comment on lines +50 to +51
Contributor: IMO, it makes sense to parameterize this by allowing a branch or tag to test against to be specified. That would help to nicely measure improvements/regressions over some release. Obviously, that can be done later, not in this PR.

Contributor Author (@elle-j, Dec 2, 2025): That can be a neat and valuable feature, thanks. I can add it in a follow-up 👍

Regarding your other comment and concurrency: jobs and workflows (not steps within jobs) can run concurrently by default. Which, thanks for the reminder, made me add a concurrency group so that this workflow only runs one at a time for the given PR branch (and cancels in-progress ones if it's triggered again while in progress).

I'd have to look up whether another CI machine is still more optimal for perf testing, though.
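The parameterization suggested above could look like the following sketch. The input name `baseline-ref` is invented here for illustration; it is not defined in this PR:

```yaml
# Hypothetical follow-up: expose the baseline to benchmark against as a
# workflow input instead of hard-coding `main`.
on:
  workflow_dispatch:
    inputs:
      baseline-ref:
        description: Branch or tag to benchmark against
        required: false
        default: main

# The baseline checkout step would then use the input, e.g.:
#   git fetch origin ${{ inputs.baseline-ref }}
#   git checkout -f ${{ inputs.baseline-ref }}
```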

make install

- name: Run Benchmarks on Main Branch
run: |
cargo bench --package resolc --bench compile -- --save-baseline main_resolc
cargo bench --package revive-yul --bench parse -- --save-baseline main_yul_parse
cargo bench --package revive-yul --bench lower -- --save-baseline main_yul_lower
timeout-minutes: 20
Contributor Author: We could also remove the timeout-minutes and add it only if we notice unusually long runs. (This took about 6-7 min on an M4 Pro.)

Member: I think we should narrow down the workflow trigger. Thinking about it, this will generate a lot of CI hours. I don't think it's necessary to run this workflow on every PR. Is there a simple way to configure it to be manually triggered by the PR author?
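One common pattern for such an opt-in trigger (a sketch of a general approach, not something this PR implements) is to run only when a `/bench` comment is posted on the PR, gated on the commenter's association. The `/bench` command name is an example:

```yaml
# Sketch: run the benchmark job only on demand, via a PR comment.
on:
  issue_comment:
    types: [created]

jobs:
  benchmark:
    # `issue_comment` also fires for plain issues; the `pull_request`
    # field is only present when the comment is on a pull request.
    if: >-
      github.event.issue.pull_request != null &&
      startsWith(github.event.comment.body, '/bench') &&
      contains(fromJSON('["OWNER", "MEMBER", "COLLABORATOR"]'),
               github.event.comment.author_association)
```

Note that with `issue_comment` the workflow runs in the default-branch context, so the PR head SHA would have to be resolved separately before checkout.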


- name: Checkout and Compile PR Branch
run: |
git checkout -f ${{ github.event.workflow_run.head_sha }}
make install

- name: Run Benchmarks on PR Branch
run: |
cargo bench --package resolc --bench compile -- --save-baseline pr_resolc
cargo bench --package revive-yul --bench parse -- --save-baseline pr_yul_parse
cargo bench --package revive-yul --bench lower -- --save-baseline pr_yul_lower
timeout-minutes: 20

- name: Compare Benchmarks
run: |
critcmp main_resolc pr_resolc > benchmarks_resolc
critcmp main_yul_parse pr_yul_parse > benchmarks_yul_parse
critcmp main_yul_lower pr_yul_lower > benchmarks_yul_lower

- name: Create Report
run: |
cat > BENCHMARK_REPORT.md << EOF
# Benchmarks

## Compile E2E

<details>
<summary>Results</summary>

\`\`\`
$(cat benchmarks_resolc)
\`\`\`

</details>

## Parse Yul

<details>
<summary>Results</summary>
Member: Hiding the results is definitely what we want here in order to not make it too noisy. Can we also have a summary or something on top that tells PR authors quickly whether the benchmarks have regressed, improved, or no significant change was detected?
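Such a summary could be derived from the saved critcmp output. A minimal sketch, assuming critcmp's two-baseline table layout in which each data row lists the benchmark name followed by, per baseline, a relative ratio and a timing token containing `±` (this layout is an assumption about critcmp's output, not a documented contract, so verify it against real output first):

```python
import re

def classify_rows(critcmp_text, threshold=0.05):
    """Count improved/regressed/unchanged benchmarks.

    Assumes each data row looks roughly like:
        name  <main_ratio> <time±err> ...  <pr_ratio> <time±err> ...
    where the faster baseline carries ratio 1.00.
    """
    counts = {"improved": 0, "regressed": 0, "unchanged": 0}
    for line in critcmp_text.splitlines():
        toks = line.split()
        # A ratio is a bare decimal immediately followed by a timing token.
        ratios = [
            float(t)
            for i, t in enumerate(toks[:-1])
            if re.fullmatch(r"\d+\.\d+", t) and "±" in toks[i + 1]
        ]
        if len(ratios) != 2:
            continue  # header, separator, or unexpected row
        main_ratio, pr_ratio = ratios
        if pr_ratio > main_ratio * (1 + threshold):
            counts["regressed"] += 1
        elif main_ratio > pr_ratio * (1 + threshold):
            counts["improved"] += 1
        else:
            counts["unchanged"] += 1
    return counts

# Hypothetical critcmp-style output used to exercise the parser.
sample = """\
group      main                          pr
-----      ----                          --
fib/10     1.00   51.0±1.2ns  ? ?/sec    1.20   61.0±2.0ns  ? ?/sec
fib/20     1.10   92.0±1.0ns  ? ?/sec    1.00   83.5±0.9ns  ? ?/sec
"""
print(classify_rows(sample))
```

The resulting counts could be rendered as a one-line verdict above the collapsed `<details>` sections.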


\`\`\`
$(cat benchmarks_yul_parse)
\`\`\`

</details>

## Lower Yul

<details>
<summary>Results</summary>

\`\`\`
$(cat benchmarks_yul_lower)
\`\`\`

</details>
EOF

- name: Post PR Comment with Benchmark Report
uses: marocchino/sticky-pull-request-comment@v2
with:
header: benchmark
number: ${{ github.event.workflow_run.pull_requests[0].number }}
path: BENCHMARK_REPORT.md