shortening CI and doc building time #28

edublancas · 2023-02-16T16:22:13Z

following up on the comment from @neelasha23 in our team brainstorming:

shortening the CI and doc building time is essential since we're blocked while waiting for them to finish. sharing a few thoughts:

ci

I recently added a flag to our repos so pytest prints tests that take >5 seconds to finish, this can help us spot slow tests and fix them. there might be cases where we have no choice, but in the majority of cases we should strive to have fast tests; we could go beyond printing logs and enforce this by failing slow tests. if we don't enforce it as a rule, then we'd have to run periodic checks to look for slow tests
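A minimal sketch of what "failing slow tests" could look like, using only the standard library. The decorator name `fail_if_slow` and the 5-second default are hypothetical (the limit mirrors the threshold mentioned above); this is not the flag that was added to the repos, just one possible way to enforce the rule per test.

```python
import functools
import time


def fail_if_slow(limit_seconds=5.0):
    """Fail the wrapped test if it runs longer than `limit_seconds`."""

    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = test_func(*args, **kwargs)
            elapsed = time.perf_counter() - start
            # Checked after the test body, so a real failure surfaces first
            assert elapsed <= limit_seconds, (
                f"{test_func.__name__} took {elapsed:.2f}s "
                f"(limit: {limit_seconds}s)"
            )
            return result

        return wrapper

    return decorator


# Hypothetical test: passes because it finishes well under the limit
@fail_if_slow(limit_seconds=5.0)
def test_quick_sum():
    assert sum(range(10)) == 45
```

Tests where we have no choice could get a larger limit explicitly, which also documents the known slow ones.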

another solution would be to find a vendor that can run our tests on more powerful hardware. I remember seeing some of them. This can be a quick win.

Finally, we can also try running pytest in parallel; there's an extension that allows doing that. the challenge there is that if our tests are not fully isolated, we might run into issues that will be hard to debug.

docs

similarly to the CI, we could print the running time for each notebook so we can spot slow ones. I think jupyter-book has an option to time out slow notebooks, so that's another option.
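A rough sketch of the timing idea, assuming each notebook can be executed with its own command. The helper `time_commands` is hypothetical, and the commands below are stand-ins so the sketch runs anywhere; in practice each entry would be whatever command executes one notebook.

```python
import subprocess
import sys
import time


def time_commands(commands):
    """Run each command and return (seconds, command) pairs, slowest first."""
    timings = []
    for command in commands:
        start = time.perf_counter()
        subprocess.run(command, check=True)
        timings.append((time.perf_counter() - start, command))
    return sorted(timings, key=lambda pair: pair[0], reverse=True)


# Stand-in commands (assumption): one deliberately slow, one fast
commands = [
    [sys.executable, "-c", "pass"],
    [sys.executable, "-c", "import time; time.sleep(0.5)"],
]

for seconds, command in time_commands(commands):
    print(f"{seconds:6.2f}s  {' '.join(command)}")
```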

we can also check if readthedocs has an option to build our docs on faster hardware.

running notebooks in parallel is another option, but in this case, we'll have to implement it ourselves. an alternative is a caching layer: readthedocs doesn't cache anything since it starts from an empty container, but jupyter-book has caching built in, so we could build something that fetches the latest successful docs build from master/main and reuses it. But I'm unsure if readthedocs has an API for downloading previous builds.

edublancas · 2023-02-23T19:30:35Z

@idomic is working on this one. He found an option to cache conda environments: https://dev.to/epassaro/caching-anaconda-environments-in-github-actions-5hde

some thoughts:

Since our projects are Python packages, they must have a setup.py file. All the package dependencies are declared there and we even define optional dependencies (for example, the ones required for testing/development).

the solution that Ido found relies on conda environment files (env.yaml) instead of setup.py files, but the caching action looks like a generic caching feature, so I think it'll work.

I think it will work if we take this step (from the proposed solution in the blog post):

      - name: Update environment
        run: mamba env update -n my-env -f environment.yml
        if: steps.cache.outputs.cache-hit != 'true'

and swap its run command for the series of steps that we run to prepare the environment.

for example, for sk-eval, the run step would be all the pip install commands that we have here: https://github.com/ploomber/sklearn-evaluation/blob/3b3f6aa8f568c0b444d132f369e123026fb1c303/.github/workflows/ci.yml#L42
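To illustrate why a generic cache should work with setup.py: such caches are typically keyed on a hash of the files that declare dependencies, so the cached environment is reused until one of those files changes. A minimal sketch of that idea, where `cache_key` and the `conda-env` prefix are hypothetical names and not part of any caching action:

```python
import hashlib
from pathlib import Path


def cache_key(prefix, dependency_files):
    """Build a cache key that changes whenever a dependency file changes."""
    digest = hashlib.sha256()
    # Sort so the key does not depend on the order the files are listed in
    for path in sorted(str(p) for p in dependency_files):
        digest.update(Path(path).read_bytes())
    return f"{prefix}-{digest.hexdigest()[:16]}"


if __name__ == "__main__":
    # Assumes a setup.py exists in the current directory
    print(cache_key("conda-env", ["setup.py"]))
```

With a key like this, editing the dependencies in setup.py produces a new key, which is what would trigger the install steps instead of a cache hit.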

edublancas · 2023-02-23T19:35:55Z

If the above doesn't work, we can generate a conda environment file after running the pip install commands. Since most of our repos use conda environments (even when they're calling pip install), conda env export > env.yaml will work.

edublancas · 2023-03-01T16:41:00Z

@idomic as discussed earlier - in the top comment here I mention a few things about speeding up the doc building process. let's assign someone to take a deep dive on this

edublancas · 2023-03-04T19:10:37Z

did some research on accelerating docs building

parallel builds

we can enable this locally for quick testing; however, it's not possible to enable this on readthedocs as it doesn't offer a way to add parameters to the command that builds the docs.

we can enable parallel notebook execution with the -j X option. This speeds up the jupysql docs build:

parallel:

python -m sphinx -T -E -W --keep-going -b html -d _build/doctrees -D  -j 4 .   54.40s user 10.51s system 85% cpu 1:16.00 total

serial:

python -m sphinx -T -E -W --keep-going -b html -d _build/doctrees -D  .   52.33s user 9.79s system 39% cpu 2:35.62 total
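As a quick check on the two wall-clock totals above, converting them to seconds gives the speedup directly (the helper `to_seconds` is just for this calculation):

```python
def to_seconds(clock):
    """Convert an 'M:SS.ss' wall-clock string to seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + float(seconds)


parallel = to_seconds("1:16.00")  # total with -j 4
serial = to_seconds("2:35.62")    # total without -j
speedup = serial / parallel

print(f"{serial:.2f}s -> {parallel:.2f}s ({speedup:.2f}x faster)")
# prints: 155.62s -> 76.00s (2.05x faster)
```

So the build roughly halves with 4 jobs, while user CPU time stays about the same (54.40s vs 52.33s).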

notes

looks like sphinx offers the -j parameter and plugins can decide to use it or ignore it. the notebook execution comes from the myst-nb package (documented here), but the parameter is passed to the sphinx CLI and then picked up by myst-nb.

One important thing I noted is that this is a "best-effort" thing: if some steps are not parallelizable, the parameter is ignored. I realized this because the flag didn't work at first and I saw some warnings about some latex thing not being parallelizable. I then tried commenting out all the variables in conf.py that refer to latex, ran it again, and this time it worked (the logs don't say anything, but I realized it was working because the logs printed all at once at the end as opposed to line by line, which happens when running serially).

caching

No solution yet, we need to implement something with S3

Another option is caching the notebooks. jupyter-book already offers a mechanism to cache previously executed notebooks.

The first thing I tried was to enable readthedocs offline formats so each build would produce a zip with all the HTML contents. However, this didn't work and the command triggered a full doc build again since jupyter-book uses a separate folder to store the notebook's state.

the only solution I can think of is to use readthedocs config to run a command before and after building the docs (this is supported). after a successful build, we can upload the full folder to S3 and before each build we can fetch the folder. this will speed things up a lot as the build will only have to run notebooks that have changed.
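A minimal sketch of the local half of that idea: packing the cache folder into one archive after a successful build and restoring it before the next one. The function names and paths are hypothetical, and the S3 upload/download itself is left out since it depends on the client and bucket we end up using.

```python
import tarfile
from pathlib import Path


def pack_cache(cache_dir, archive_path):
    """Compress the notebook cache folder into a single archive for upload."""
    cache_dir = Path(cache_dir)
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(cache_dir, arcname=cache_dir.name)
    return Path(archive_path)


def restore_cache(archive_path, target_parent):
    """Unpack a previously saved cache; return False if there is none yet."""
    archive_path = Path(archive_path)
    if not archive_path.exists():
        return False  # first build: nothing to restore, run every notebook
    # The archive is one we produced ourselves, so it is treated as trusted
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(target_parent)
    return True
```

The pre-build command would download the archive and call `restore_cache`; the post-build command would call `pack_cache` and upload the result. Returning False on a missing archive keeps the very first build from failing.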

neelasha23 · 2023-03-23T13:25:08Z

Which repo should we focus on for this issue? (I saw @edublancas has already fixed this for sk-eval.)

edublancas · 2023-03-23T14:21:38Z

let's focus on JupySQL: the tests are pretty fast on ubuntu but slow on windows and mac, although that appears to be a hardware problem.

let's start by implementing the documentation caching (as described in my earlier comment) and see how well it works.

idomic · 2023-03-23T14:24:32Z

It's also pretty sporadic: sometimes 5m, sometimes 15m, so I'm not sure a lot can be done there. Keep updating here.
If that fails, we can enable testing in parallel for both repos; it might cut the time by 30%.

idomic mentioned this issue Mar 1, 2023

re-add examples/ to CI ploomber/sklearn-evaluation#305

Closed

This was referenced Apr 3, 2023

Parallel tests ploomber/jupysql#352

Merged

CI Shortening ploomber/jupysql#332

Merged
