chore(do not merge): Experiment with caching for "Build Doc" Job #2271

Closed
kevaundray wants to merge 7 commits into ethereum:forks/amsterdam from kevaundray:kw/experiment-build-docs-ci

Conversation

@kevaundray
Contributor

@kevaundray kevaundray commented Feb 21, 2026

🗒️ Description

Was briefly looking into this to check why it takes so long.

Wrote some unorganized notes below for my own reference. (It could be wrong)

From my understanding, "Build Doc" is creating diffs across each fork so we can see how a file has changed over forks.

If this is correct, then this job's run time will increase as more forks are added.

I believe it does a pairwise diff, i.e. if we have the three forks:

  • Frontier
  • Homestead
  • Amsterdam

Then it will create a diff for:

  • Frontier -> Homestead
  • Homestead -> Amsterdam

This is a file-by-file diff, so if Frontier has 38 files and Homestead has 38 files, then we would be diffing 38 file pairs.
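
The pairing described above can be sketched in a few lines. This is only an illustration, not the code the job runs: `adjacent_fork_pairs` is a hypothetical helper, and the fork list is the three-fork example from this description.

```python
def adjacent_fork_pairs(forks):
    """Pair each fork with its immediate successor, in chronological order."""
    return list(zip(forks, forks[1:]))


forks = ["frontier", "homestead", "amsterdam"]
for old, new in adjacent_fork_pairs(forks):
    print(f"{old} -> {new}")
```

With N forks this yields N - 1 pairs, which is why the amount of diffing work would grow with every fork that is added.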


Take the Frontier -> Homestead diff as an example. Let's say the code wants to check the diff between frontier/vm/gas.py and homestead/vm/gas.py. It does the following:

  • Parses both files into ASTs, essentially converting the source code into a data structure.
  • Runs a TreeDiff algorithm that compares the two ASTs to find differences. (This is roughly O(n^2) in complexity.)
  • Applies the result of the tree diff to generate the HTML rendering. (I'm a bit vague here as I didn't look too deep into this part.)

Hypothesis: The tree diff algorithm is the expensive part. Note that even if nothing has changed between Frontier and Homestead, I believe it still runs the algorithm, only to conclude that nothing has changed.
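
If the hypothesis holds, a cheap equality check before the tree diff would avoid most of the cost for files that did not change between two forks. The following is a minimal sketch under that assumption; `split_file_pairs` and its dictionary inputs are hypothetical and not part of docc.

```python
def split_file_pairs(old_files, new_files):
    """Split the paths present in both forks into (unchanged, changed) lists.

    Both arguments map a path relative to the fork root to the file contents.
    Paths that exist in only one fork are ignored in this sketch.
    """
    unchanged, changed = [], []
    for path in sorted(old_files.keys() & new_files.keys()):
        if old_files[path] == new_files[path]:
            unchanged.append(path)  # identical contents: the tree diff can be skipped
        else:
            changed.append(path)  # contents differ: run the expensive tree diff
    return unchanged, changed


frontier = {"vm/gas.py": "GAS_BASE = 2\n", "fork.py": "REWARD = 5\n"}
homestead = {"vm/gas.py": "GAS_BASE = 2\n", "fork.py": "REWARD = 3\n"}
unchanged, changed = split_file_pairs(frontier, homestead)
```

Only the paths in `changed` would then need to go through parsing and tree diffing.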

🔗 Related Issues or PRs

N/A.

✅ Checklist

  • All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    uvx tox -e static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).
  • Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
  • Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry to the list.
  • Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

@codecov

codecov bot commented Feb 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.86%. Comparing base (fa39bdc) to head (5cdd5ba).
⚠️ Report is 2 commits behind head on forks/amsterdam.

Additional details and impacted files
@@                 Coverage Diff                 @@
##           forks/amsterdam    #2271      +/-   ##
===================================================
- Coverage            86.00%   85.86%   -0.15%     
===================================================
  Files                  600      599       -1     
  Lines                39357    39390      +33     
  Branches              3770     3770              
===================================================
- Hits                 33850    33822      -28     
- Misses                4877     4938      +61     
  Partials               630      630              
Flag Coverage Δ
unittests 85.86% <ø> (-0.15%) ⬇️

@kevaundray
Contributor Author

It skips around 70% of pairs and fails after 40 minutes. Barring the actual error, this makes sense since it was previously taking an hour and we shaved off 30% of the work.

This gives weight to the claim that the bottleneck is the tree diffing, but we need better logs to confirm.

@kevaundray
Contributor Author

Seems the diffs rely on https://github.com/SamWilsn/fladrif

@kevaundray
Contributor Author

Currently it seems to build docs for all forks unconditionally -- I think most people will modify the latest fork (currently amsterdam), so a lot of this is also repeated work that should never fail.

I think one reasonable optimization here is to have two settings:

  • When it's a PR, only build docs for fork pairs that have changed. So for people building on top of amsterdam, build docs would only build for bpo5 -> amsterdam
  • When it gets merged into forks/amsterdam, build for all forks and deploy the new docs

Rationale here is that docs are not released on the PRs anyways, so building everything is not needed
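
The PR-time selection could look roughly like the sketch below. It assumes that a pair needs rebuilding whenever either of its two forks was touched; `pairs_to_rebuild` is a hypothetical helper, and the fork list is abbreviated for illustration.

```python
def pairs_to_rebuild(forks, changed_forks):
    """Return the adjacent (old, new) pairs in which at least one side changed."""
    changed = set(changed_forks)
    return [
        (old, new)
        for old, new in zip(forks, forks[1:])
        if old in changed or new in changed
    ]


# Abbreviated, illustrative fork list; the real list is much longer.
forks = ["frontier", "homestead", "bpo5", "amsterdam"]
print(pairs_to_rebuild(forks, {"amsterdam"}))
```

Because amsterdam is the newest fork, a change there touches a single pair. A change to an older fork would touch the pair on either side of it.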

@kevaundray
Contributor Author

kevaundray commented Feb 21, 2026

Added the above suggestion, but it won't be noticeable in this PR because any change outside of a fork (like the GitHub Actions files) will trigger a full rebuild by default. It should be runnable locally by setting DOCC_DIFF_FORKS=amsterdam.
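
A rough sketch of how a build script might combine the two rules from this comment: an explicit DOCC_DIFF_FORKS override, and a full rebuild whenever a changed path lies outside a fork. The comma-separated value format, the `src/ethereum/forks/` prefix, the `forks_to_diff` name, and the use of `None` to mean "full rebuild" are all assumptions made for illustration, not what the PR implements.

```python
import os

FORKS_PREFIX = "src/ethereum/forks/"  # assumed layout; adjust to the repository


def forks_to_diff(changed_paths, environ=os.environ):
    """Return the fork names to diff, or None to request a full rebuild."""
    override = environ.get("DOCC_DIFF_FORKS", "").strip()
    if override:
        # Assumed format: comma-separated fork names, e.g. "bpo5,amsterdam".
        return sorted({name.strip() for name in override.split(",") if name.strip()})

    forks = set()
    for path in changed_paths:
        if not path.startswith(FORKS_PREFIX):
            return None  # change outside any fork, e.g. a workflow file
        # The first path component after the prefix is taken as the fork name.
        forks.add(path[len(FORKS_PREFIX):].split("/", 1)[0])
    return sorted(forks)
```

Passing `environ` explicitly keeps the function easy to exercise without touching the real environment.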

@kevaundray
Contributor Author

kevaundray commented Feb 21, 2026

Diffs are published here for reference.

@SamWilsn Do you know if folks are using this? (Asking because whenever I visit the specs, I usually just look at the code)

If so I think we could rewrite it in Rust to be faster, given that this will grow in time as we add more forks. I wrote a proof of concept here https://kevaundray.github.io/execution-specs-viewer/ that compiles in about a minute using Rust. It uses a different stack etc, but largely gives the features I think you originally intended?

@kevaundray
Contributor Author

So if we just build docs for amsterdam, it takes around 40 minutes, which I think is still pretty high

@danceratopz
Member

danceratopz commented Feb 23, 2026

Thanks for taking a look at this, @kevaundray. Yes, the docs build is too slow. I got it down to about 4-5 minutes locally by tweaking the existing build tool, but I never PR'd these changes or followed up with @SamWilsn. Perhaps it's time to do that. My changes didn't change the fork diffset, however, which is also a great idea!

@kevaundray
Contributor Author

Ah nice, curious to know what you changed!

I guess I'm still curious to know if this is actually used, given that the work that needs to be done grows with the number of hardforks.

@danceratopz
Member

danceratopz commented Feb 23, 2026

Ah nice, curious to know what you changed!

Will try and get some PRs up soon! It was quite a few cache optimizations, and I replaced the libcst (Concrete Syntax Tree) library with Python's built-in ast module for parsing Python source.
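
One trade-off of such a swap, shown with the standard library only: `ast` discards comments and formatting, which a concrete syntax tree preserves. Whether that matters for the docs build is not stated in this thread; the snippet only illustrates the `ast` side, and the sample source is made up.

```python
import ast

source = "def gas_cost(x):  # charge per word\n    return x * 3\n"

tree = ast.parse(source)

# Names of the functions defined at module level.
functions = [node.name for node in tree.body if isinstance(node, ast.FunctionDef)]

# Round-tripping through ast drops the comment (ast.unparse needs Python 3.9+).
round_tripped = ast.unparse(tree)
print(functions)
print(round_tripped)
```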

I guess I'm still curious to know if this is actually used, given that the work that needs to be done grows with the number of hardforks.

I think the diff is a really nice feature, though I suspect it has a limited number of users. There was recently this comment:

The EELS diff documentation is also a great reference I wasn't aware of it. This could serve as a validation layer to cross-check PRSpec's automated spec extraction against the canonical diffs.

#2212 (comment)

@kevaundray
Contributor Author

Ah interesting, I didn't get to the parser, amazing!

In that issue, it seemed like the user could benefit from just a normal diff between two forks but I could be mistaken as I didn't look carefully at their project.

Though I think that if the time can be brought down to 4-5 minutes, then my question becomes less consequential, since the docs build is no longer the bottleneck.

@SamWilsn
Contributor

docc was started as a way to migrate away from our custom sphinx plugin, and I pretty much stopped working on it when we reached parity. There are a ton of low-hanging fruits, for example:

  • It's roughly architected to support parallelization, but doesn't do any.
  • Rendering to HTML has multiple passes from HTML -> syntax tree -> HTML.
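
As a generic illustration of the first point, independent work items such as per-pair diffs can be fanned out with the standard library. This is not docc code: `diff_all` and the `diff_one` callback are hypothetical, and the sketch assumes each pair can be processed without shared state.

```python
import concurrent.futures


def diff_all(pairs, diff_one, executor_cls=concurrent.futures.ProcessPoolExecutor,
             max_workers=None):
    """Apply diff_one(old, new) to every pair concurrently, keeping input order."""
    olds = [old for old, _ in pairs]
    news = [new for _, new in pairs]
    with executor_cls(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of its inputs.
        return list(pool.map(diff_one, olds, news))
```

With the default process pool, `diff_one` has to be a picklable module-level function, and on platforms that spawn worker processes the call belongs under an `if __name__ == "__main__":` guard. A thread pool can be passed instead for I/O-bound work.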

@SamWilsn Do you know if folks are using this? (Asking because whenever I visit the specs, I usually just look at the code)

I mean, I use it, especially whenever I'm trying to explain a change to someone else. One of the biggest selling points for creating EELS was the ability to show both snapshot (the source itself) and consensus-specs-style diffs.

That said, I don't think https://ethereum.github.io/execution-specs gets much attention. A lot of the fault there lies with me not continuing to improve docc. It's hardly usable in its current state.

If so I think we could rewrite it in Rust to be faster, given that this will grow in time as we add more forks.

I'd love to write this in Rust! Would give an opportunity to fix a lot of the architectural issues in docc. I had originally used Python so the rest of the team would feel more comfortable, but I'm not sure that was actually a smart decision.

I wrote a proof of concept here https://kevaundray.github.io/execution-specs-viewer/ that compiles in about a minute using Rust.

Holy shit that's impressive! Tree sitter was what I was going to use originally before I found libcst (which actually has a partial implementation in Rust).

Regardless of where we go, I'd love to get your help on the frontend side of things.

It uses a different stack etc, but largely gives the features I think you originally intended?

Most of the complexity of docc isn't in the diff functionality, it's in everything else. For example, docc is aware of typing and definitions, so you can actually navigate through the source:

Screencast_20260223_143534.webm

@kevaundray
Contributor Author

Thanks for the clarification and context! This all makes sense and I think the go-to definition functionality is very useful :)

continuing to improve docc

When you get the cycles, it may be interesting to scope out your vision of it in an issue, so others can maybe come and implement it.

I'd love to write this in Rust!

This was what I originally wanted to do!

Intuitively, I would think an architecture that allows one to do something like docc --old=forks/osaka --new=forks/amsterdam would scale the best. The tool itself would just create the diff and HTML between two forks, and this could be run in parallel in CI. The CI would then be in charge of not running docc for old forks that haven't changed, because their output is cached.

Didn't dive too deep into this, so there might be something I was missing here.
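
For the caching half of that idea, CI needs a key that changes exactly when either fork's sources change. A minimal sketch, assuming each fork lives in its own directory; `tree_digest` and `pair_cache_key` are hypothetical names, and nothing here is an existing docc feature.

```python
import hashlib
from pathlib import Path


def tree_digest(root):
    """Hash the relative paths and contents of every file under root."""
    root = Path(root)
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def pair_cache_key(old_root, new_root):
    """Derive one cache key for the (old fork, new fork) pair."""
    combined = f"{tree_digest(old_root)}:{tree_digest(new_root)}"
    return hashlib.sha256(combined.encode()).hexdigest()
```

If the key for a pair matches a stored one, the cached output for that pair could be reused and the diff skipped.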

I'd love to get your help on the frontend side of things.

Ah I was using claude 🤣 it's pretty likely that you have more design skill points than me :)

Regardless of where we go

Would leave this up to you and Dan, happy to follow whatever you folks think makes sense. My initial investigation was aimed at reducing the CI times since this gets run on every PR, but it seems Dan may have a way to reduce this down to less than 10 minutes, so it's not that much of a concern.

Contributor

@SamWilsn SamWilsn left a comment

Code-wise, I think this makes a lot of sense!

What about just disabling diffs when building on a branch? The artifacts don't get uploaded anywhere, and any missing references will get caught by the non-diff build anyway.

@kevaundray
Contributor Author

I think this would be much better

@kevaundray
Contributor Author

Closing due to #2296
