chore(do not merge): Experiment with caching for "Build Doc" Job by kevaundray · Pull Request #2271 · ethereum/execution-specs

kevaundray · 2026-02-21T20:10:51Z

🗒️ Description

I briefly looked into why this job takes so long.

I wrote some unorganized notes below for my own reference. (They could be wrong.)

From my understanding, "Build Doc" is creating diffs across each fork so we can see how a file has changed over forks.

If this is correct, then this job's run time will increase as more forks are added.

I believe it does a pairwise diff, i.e., if we have the three forks:

Frontier
Homestead
Amsterdam

Then it will create a diff for:

Frontier -> Homestead
Homestead -> Amsterdam

This is a file-by-file diff, so if Frontier has 38 files and Homestead has 38 files, then we would be diffing 38 file pairs.
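To make the pairing concrete, here is a minimal Python sketch of the adjacent-fork pairing described above. It is an illustration only, not docc's code: `adjacent_fork_pairs` and `file_pairs` are hypothetical helper names, and I'm assuming files are matched by their path relative to the fork directory.

```python
def adjacent_fork_pairs(forks):
    """Return (old, new) pairs for consecutive forks, in the order given."""
    return list(zip(forks, forks[1:]))


def file_pairs(old_files, new_files):
    """Match files by relative path (an assumption about how pairing works).

    A file that exists on only one side is paired with None.
    """
    old, new = set(old_files), set(new_files)
    return [
        (path if path in old else None, path if path in new else None)
        for path in sorted(old | new)
    ]


if __name__ == "__main__":
    print(adjacent_fork_pairs(["Frontier", "Homestead", "Amsterdam"]))
    print(file_pairs(["vm/gas.py"], ["vm/gas.py", "vm/new.py"]))
```

With n forks there are n - 1 adjacent pairs, so under this reading the number of fork pairs grows linearly with the number of forks.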

Take the Frontier -> Homestead diff as an example. Let's say the code wants to check the diff between frontier/vm/gas.py and homestead/vm/gas.py. It does the following:

Parses both files into ASTs, essentially converting the source code into a data structure.
Runs a TreeDiff algorithm that compares the two ASTs to find differences. (This is roughly O(n^2) in complexity.)
Applies the result of the tree diff to generate the HTML rendering. (I'm a bit vague here as I didn't look too deep into this part.)

Hypothesis: The tree diff algorithm is the expensive part. Note that even if there is no change between Frontier and Homestead, I believe it still runs the algorithm, only to conclude that nothing has changed.
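If the hypothesis holds, the cheapest fix is to short-circuit before the tree diff whenever the two sources are byte-identical. A rough sketch of that idea follows; `diff_pair` is a made-up name and `tree_diff` is a placeholder for whatever expensive comparison the real tool runs, so this is not how docc is actually wired.

```python
def diff_pair(old_source: bytes, new_source: bytes, tree_diff):
    """Run tree_diff only when the two sources differ.

    tree_diff is a placeholder callable standing in for the expensive
    AST comparison; None is returned to signal "no changes".
    """
    if old_source == new_source:
        return None  # identical input, so skip the expensive comparison
    return tree_diff(old_source, new_source)


if __name__ == "__main__":
    def fake_tree_diff(old, new):
        # Stand-in for the real algorithm, used only for demonstration.
        return f"{len(old)} -> {len(new)} bytes"

    print(diff_pair(b"GAS = 1\n", b"GAS = 1\n", fake_tree_diff))
    print(diff_pair(b"GAS = 1\n", b"GAS = 20\n", fake_tree_diff))
```

A plain byte comparison is linear in the file size, which is negligible next to a roughly quadratic tree diff.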

🔗 Related Issues or PRs

N/A.

✅ Checklist

All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
```
uvx tox -e static
```
All: PR title adheres to the repo standard - it will be used as the squash commit message and should start with type(scope):.
All: Considered updating the online docs in the ./docs/ directory.
All: Set appropriate labels for the changes (only maintainers can apply labels).
Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry to the list.
Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

Cute Animal Picture

codecov · 2026-02-21T20:34:37Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.86%. Comparing base (fa39bdc) to head (5cdd5ba).
⚠️ Report is 2 commits behind head on forks/amsterdam.

Additional details and impacted files

@@                 Coverage Diff                 @@
##           forks/amsterdam    #2271      +/-   ##
===================================================
- Coverage            86.00%   85.86%   -0.15%     
===================================================
  Files                  600      599       -1     
  Lines                39357    39390      +33     
  Branches              3770     3770              
===================================================
- Hits                 33850    33822      -28     
- Misses                4877     4938      +61     
  Partials               630      630

| Flag | Coverage Δ | |
| --- | --- | --- |
| unittests | `85.86% <ø> (-0.15%)` | ⬇️ |

kevaundray · 2026-02-21T20:57:40Z

It skips around 70% of pairs and fails after 40 minutes. Barring the actual error, this makes sense since it was previously taking an hour and we shaved off 30% of the work.

This gives weight to the claim that the bottleneck is the tree diffing, but we need better logs to confirm.

kevaundray · 2026-02-21T21:06:33Z

Seems the diffs rely on https://github.com/SamWilsn/fladrif

kevaundray · 2026-02-21T21:51:12Z

Currently it seems to build docs for all forks unconditionally -- I think most people will modify the latest fork (currently amsterdam), so a lot of this is also repeated work that should never fail.

I think one reasonable optimization here is to have two settings:

When it's a PR, only build docs for fork pairs that have changed. So for people building on top of amsterdam, build docs would only build for bpo5 -> amsterdam.
When it gets merged into forks/amsterdam, build for all forks and deploy the new docs.

The rationale here is that docs are not released on PRs anyway, so building everything is not needed.

kevaundray · 2026-02-21T22:09:32Z

Added the above suggestion, but it won't be noticeable in this PR because any change outside of a fork (like the GitHub Actions files) will by default trigger a full rebuild. It should be runnable locally by setting DOCC_DIFF_FORKS=amsterdam.
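The selection rule above can be sketched roughly as follows. This is a simplified illustration rather than the code in this PR: the `FORK_ROOT` prefix, the function names, and the choice to rebuild every pair that touches a changed fork are all assumptions on my part, and the sketch does not read DOCC_DIFF_FORKS.

```python
FORK_ROOT = "src/ethereum/forks/"  # assumed source layout, adjust as needed


def touched_forks(changed_paths, forks):
    """Return the forks touched by a change, or None if a full rebuild is needed."""
    touched = set()
    for path in changed_paths:
        if not path.startswith(FORK_ROOT):
            return None  # change outside any fork: rebuild everything
        name = path[len(FORK_ROOT):].split("/", 1)[0]
        if name not in forks:
            return None  # unknown directory: be conservative
        touched.add(name)
    return touched


def pairs_to_build(changed_paths, forks):
    """Select the adjacent (old, new) fork pairs whose diff may have changed."""
    all_pairs = list(zip(forks, forks[1:]))
    touched = touched_forks(changed_paths, forks)
    if touched is None:
        return all_pairs
    return [pair for pair in all_pairs if touched & set(pair)]


if __name__ == "__main__":
    forks = ["osaka", "bpo5", "amsterdam"]
    print(pairs_to_build(["src/ethereum/forks/amsterdam/vm/gas.py"], forks))
    print(pairs_to_build([".github/workflows/test.yaml"], forks))
```

Falling back to a full rebuild for anything outside a fork directory is the conservative choice, which is also why this PR itself still triggers the full build.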

kevaundray · 2026-02-21T22:34:23Z

Diffs are published here for reference.

@SamWilsn Do you know if folks are using this? (Asking because whenever I visit the specs, I usually just look at the code)

If so I think we could rewrite it in Rust to be faster, given that this will grow in time as we add more forks. I wrote a proof of concept here https://kevaundray.github.io/execution-specs-viewer/ that compiles in about a minute using Rust. It uses a different stack etc, but largely gives the features I think you originally intended?

kevaundray · 2026-02-22T14:51:22Z

So if we just build docs for amsterdam, it takes around 40 minutes, which I think is still pretty high

danceratopz · 2026-02-23T05:49:14Z

Thanks for taking a look at this, @kevaundray. Yes, the docs build is too slow. I got it down to about 4-5 minutes locally by tweaking the existing build tool, but I never PR'd these changes / followed up with @SamWilsn. Perhaps it's time to do that. My changes didn't change the fork diffset, however, which is also a great idea!

kevaundray · 2026-02-23T08:38:14Z

Ah nice, curious to know what you changed!

I guess I'm still curious to know if this is actually used, given that the work that needs to be done grows with the number of hard forks.

danceratopz · 2026-02-23T08:43:02Z

> Ah nice, curious to know what you changed!

Will try and get some PRs up soon! Mostly quite a few cache optimizations, plus replacing the libcst (Concrete Syntax Tree) library with Python's built-in ast module for parsing Python source.

> I guess I'm still curious to know if this is actually used, given that the work that needs to be done grows with the number of hard forks.

I think the diff is a really nice feature. I suspect it has a limited number of users, but there was recently this comment:

> The EELS diff documentation is also a great reference I wasn't aware of it. This could serve as a validation layer to cross-check PRSpec's automated spec extraction against the canonical diffs.

#2212 (comment)
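For readers who haven't used the built-in ast module mentioned above, here is a small standalone sketch of parsing two sources with it and checking whether they are structurally the same. It is not the change described in this comment, just an illustration of the standard-library calls `ast.parse` and `ast.dump`; `same_structure` is a hypothetical helper name.

```python
import ast


def same_structure(old_source: str, new_source: str) -> bool:
    """Compare two Python sources by their abstract syntax trees.

    ast.dump omits line and column attributes by default, so comments
    and whitespace-only edits do not count as differences.
    """
    return ast.dump(ast.parse(old_source)) == ast.dump(ast.parse(new_source))


if __name__ == "__main__":
    old = "def charge(gas):\n    return gas + 1\n"
    new = "def charge(gas):  # reformatted\n    return gas+1\n"
    print(same_structure(old, new))
```

Unlike a concrete syntax tree, the abstract tree drops comments and formatting, which is part of why it is cheaper to build but cannot reproduce the original text exactly.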

kevaundray · 2026-02-23T09:05:16Z

Ah interesting, I didn't get to the parser, amazing!

In that issue, it seemed like the user could benefit from just a normal diff between two forks but I could be mistaken as I didn't look carefully at their project.

Though I think that if the time can be brought down to 4-5 minutes, then my question becomes less consequential, since it is no longer the bottleneck.

SamWilsn · 2026-02-23T19:42:56Z

docc was started as a way to migrate away from our custom sphinx plugin, and I pretty much stopped working on it when we reached parity. There are a ton of low-hanging fruits, for example:

It's roughly architected to support parallelization, but doesn't do any.
Rendering to HTML has multiple passes from HTML -> syntax tree -> HTML.

> @SamWilsn Do you know if folks are using this? (Asking because whenever I visit the specs, I usually just look at the code)

I mean, I use it, especially whenever I'm trying to explain a change to someone else. One of the biggest selling points for creating EELS was the ability to show both snapshot (the source itself) and consensus-specs-style diffs.

That said, I don't think https://ethereum.github.io/execution-specs gets much attention. A lot of the fault there lies with me not continuing to improve docc. It's hardly usable in its current state.

> If so I think we could rewrite it in Rust to be faster, given that this will grow in time as we add more forks.

I'd love to write this in Rust! Would give an opportunity to fix a lot of the architectural issues in docc. I had originally used Python so the rest of the team would feel more comfortable, but I'm not sure that was actually a smart decision.

> I wrote a proof of concept here https://kevaundray.github.io/execution-specs-viewer/ that compiles in about a minute using Rust.

Holy shit that's impressive! Tree sitter was what I was going to use originally before I found libcst (which actually has a partial implementation in rust).

Regardless of where we go, I'd love to get your help on the frontend side of things.

> It uses a different stack etc, but largely gives the features I think you originally intended?

Most of the complexity of docc isn't in the diff functionality, it's in everything else. For example, docc is aware of typing and definitions, so you can actually navigate through the source:

Screencast_20260223_143534.webm

kevaundray · 2026-02-23T20:16:25Z

Thanks for the clarification and context! This all makes sense and I think the go-to definition functionality is very useful :)

> continuing to improve docc

When you get the cycles, it may be interesting to scope out your vision of it in an issue, so others can maybe come and implement it.

> I'd love to write this in Rust!

This was what I originally wanted to do!

Intuitively, I would think an architecture that allows one to do something like docc --old=forks/osaka --new=forks/amsterdam would scale the best. The tool itself would just create the diff and HTML between two forks, and this can be run in parallel in the CI. The CI would then be in charge of not running docc for old forks that haven't changed, because the result is cached.

Didn't dive too deep into this, so there might be something I was missing here.
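To illustrate the shape of that idea, here is a rough Python sketch of per-pair builds with a content-addressed cache and a thread pool. Everything in it is hypothetical: `build_one` stands in for a per-pair invocation like the one imagined above (no such docc flags exist as far as I know), and the in-memory `cache` set stands in for whatever cache the CI would provide.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Hash the relative paths and contents of every .py file under root."""
    h = hashlib.sha256()
    for path in sorted(root.rglob("*.py")):
        h.update(path.relative_to(root).as_posix().encode())
        h.update(b"\0")
        h.update(path.read_bytes())
        h.update(b"\0")
    return h.hexdigest()


def cache_key(old_root: Path, new_root: Path) -> str:
    """Derive one key per fork pair from the contents of both forks."""
    combined = tree_digest(old_root) + tree_digest(new_root)
    return hashlib.sha256(combined.encode()).hexdigest()


def build_pairs(pairs, build_one, cache, workers=4):
    """Run build_one(old, new) in parallel for every pair not yet cached.

    Returns the list of cache keys that were built in this call.
    """
    todo = [(cache_key(old, new), old, new) for old, new in pairs]
    todo = [item for item in todo if item[0] not in cache]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces evaluation so that worker exceptions surface here.
        list(pool.map(lambda item: build_one(item[1], item[2]), todo))
    cache.update(key for key, _, _ in todo)
    return [key for key, _, _ in todo]
```

Because the key depends only on the contents of the two forks, pairs between old forks would hit the cache on almost every run, and only pairs involving the fork under development would be rebuilt.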

> I'd love to get your help on the frontend side of things.

Ah I was using claude 🤣 it's pretty likely that you have more design skill points than me :)

> Regardless of where we go

I would leave this up to you and Dan, happy to follow what you folks think makes sense. My initial investigation was aimed at reducing the CI times since this gets run on every PR, but it seems Dan may have a way to reduce this down to less than 10 minutes, so it's not that much of a concern.

SamWilsn

Code-wise, I think this makes a lot of sense!

What about just disabling diffs when building on a branch? The artifacts don't get uploaded anywhere, and any missing references will get caught by the non-diff build anyway.

kevaundray · 2026-02-23T20:47:47Z

I think this would be much better

kevaundray · 2026-02-23T23:27:42Z

Closing due to #2296

kevaundray added 2 commits February 21, 2026 19:54

use uvx and caching like other jobs

f9653d9

skip work if sources are identical

369986a

fix ci error

c0f9e42

add fork pair filtering

8f2c6fb

kevaundray added 3 commits February 22, 2026 01:12

test only amsterdam times

f166b1c

fix

b6d9ccf

tox

5cdd5ba

SamWilsn reviewed Feb 23, 2026

kevaundray mentioned this pull request Feb 23, 2026

chore(do not merge): Experiment on docc #2296

Closed

7 tasks

kevaundray closed this Feb 23, 2026

kevaundray mentioned this pull request Feb 24, 2026

chore(ci): Skip diffs if not on default branch #2304

Merged

7 tasks
