
CircleCI: Overhaul with parallelisation and parameters for a cleaner config #977

Merged (2 commits) on Apr 27, 2021

Conversation

Tonux599
Contributor

Tonux599 commented Feb 25, 2021

As per the commit message:

  • Bump CircleCI config version to 2.1.
  • Use commands and parameters to get rid of repeated commands; new boards can be added with just 5 lines at the bottom of the config (a rough sketch of the approach follows this list).
  • Made use of some parallelisation: a single board from each Coreboot version is built first, then all remaining boards are built in parallel.
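
As a rough sketch of what the commands/parameters approach can look like in a 2.1 config (the command and job names, Docker image, and make invocation below are assumptions for illustration, not the exact contents of this PR):

version: 2.1

commands:
  build_board:
    parameters:
      target:
        type: string
    steps:
      - run:
          name: Build << parameters.target >>
          command: make CPUS=16 BOARD=<< parameters.target >>   # illustrative make invocation

jobs:
  build:
    docker:
      - image: debian:10          # illustrative image
    parameters:
      target:
        type: string
    steps:
      - checkout
      - build_board:              # reusable command defined above
          target: << parameters.target >>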

Successful build here.
Currently rebuilding here to test that the cache is working. (Edit: it works.)

A few things to note here:

  • As long as a single job does not exceed the 5-hour limit, the CI can run for however long it needs to, so no matter how many boards we add it will not time out.
  • Unfortunately, saving and restoring workspaces can take a bit of time, so at the moment it takes longer to build everything than it does with the current config (we can make more changes to speed things up, but this PR is a good baseline for further optimisations). However, things will change if CircleCI ever increases its limit of 4 parallel runs, or if a payment plan is purchased to increase concurrency.
  • Finally, adding new boards is really easy now; at the bottom of the config, add:
- build:
    name: <name of board>
    target: <name of board>
    requires:
      - <name of the last **sequential** board>
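
For instance, a filled-in entry might look like the following (the board names are chosen purely for illustration, not taken from the actual config):

- build:
    name: x230
    target: x230
    requires:
      - qemu-coreboot   # whichever board was built last in the sequential stage (illustrative)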

@tlaurion
Collaborator

tlaurion commented Mar 2, 2021

Really interesting reality check #935 (comment)

Attempting a quick comparison on a single board:

@tlaurion
Collaborator

tlaurion commented Mar 23, 2021

@Tonux599 A little attempt at combining tlaurion@9f6e86d (see here for details) with the optimizations of this PR over here.

tlaurion added a commit to tlaurion/heads that referenced this pull request Apr 27, 2021
@tlaurion
Collaborator

@Tonux599 Trying to change CPUS=24 to CPUS=16, since the last attempt failed (not enough memory; the build failed because of parallelization, last time I checked).

Build happening on the commit referenced above.

@tlaurion
Collaborator

tlaurion commented Apr 27, 2021

@Tonux599 Not sure I understand why the cache save is happening before all builds are done:
[screenshot: 2021-04-27-161110]

Or maybe that is right at this point, the most complete cache will be saved later on, and the third "saving cache" here is just the upload. I will try to keep an eye on what is parallelized until the build is done:
[screenshot: 2021-04-27-161349]

EDIT:
No, I confirm: the third cache save is happening before the end of all builds:
[screenshot: 2021-04-27-162006]
[screenshot: 2021-04-27-162151]

@Tonux599: Or, if I understand correctly, those caches are only meant to capture the build cache of the 3 main boards (corresponding to the different coreboot versions), and nothing else?

@Tonux599
Contributor Author

My apologies for not tending to this more; I've been rather busy.

The reason save_cache is there is that all the items in that column (save_cache and the various boards) should already have access to a workspace in which coreboot has been built for each coreboot version we use. So we are creating a cache in which all the tools and coreboot toolchains are already built, so that all that is left to build is the .rom for each board, which generally doesn't take too long.
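
As a rough sketch of what that step might look like in the 2.1 config (the job name, image, cache key, and paths are assumptions for illustration, not the actual config), the cache is saved from a workspace that already contains the built toolchains:

jobs:
  save_cache_job:
    docker:
      - image: debian:10                 # illustrative image
    steps:
      - attach_workspace:                # reuse the workspace produced by the per-coreboot-version builds
          at: ~/heads
      - save_cache:
          key: heads-tools-v1-{{ checksum "Makefile" }}   # illustrative cache key
          paths:
            - ~/heads/build              # illustrative paths holding the tools and coreboot toolchains
            - ~/heads/crossgcc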

It might be worth noting, however, that before I got too bogged down with uni work I was looking at scrapping workspaces altogether: I had drafted a way in which we could just use caches throughout the build, which would have significantly reduced the time to build all boards whenever a suitable cache was present.

@tlaurion
Collaborator

@Tonux599 As of now, this approach (with CPUS=16, not 24) seems to be the best way to continue building Heads from now on.
I would love to see your working branch if you can point me to it.

@Tonux599
Contributor Author

@tlaurion Regrettably it failed to materialise.

It was something akin to this (n.b. I'm pretty sure caches were disabled in that draft), but the idea was that, instead of using workspaces, it would build a cache for each component (so musl-cross, cb_4.8.1_toolchain, cb_4.11_toolchain, cb_4.13_toolchain, and at some point all the rest of the tools) and then build all the boards in parallel using the caches created.

Along with implementing something like this, when we came around to rebuilding, the build would jump straight into building each board using the caches from the full build.
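
A rough sketch of that cache-per-component idea (the job name, make target, module path, and output path are hypothetical, since this draft never landed):

jobs:
  musl-cross:
    docker:
      - image: debian:10                                   # illustrative image
    steps:
      - checkout
      - restore_cache:
          keys:
            - musl-cross-{{ checksum "modules/musl-cross" }}
      - run: make musl-cross                               # hypothetical make target
      - save_cache:
          key: musl-cross-{{ checksum "modules/musl-cross" }}
          paths:
            - crossgcc                                     # hypothetical output path
# Each board job would then restore these component caches and run its own
# build, so all boards can fan out in parallel without passing workspaces.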

@tlaurion
Collaborator

@Tonux599 Do you want to apply:
sed 's/CPUS=24/CPUS=16/g' -i .circleci/config.yml

so that I can merge this, and we open a new issue for better optimization later on?

@tlaurion
Collaborator

tlaurion@7aafac7 was successful.

@tlaurion
Collaborator

tlaurion commented Apr 27, 2021

@Tonux599: Well, I mean I already verified it works, so if you want to add it here so that I can merge and master can be verified by the osresearch CircleCI instance (so that future commits on master are green again!), that would be cool (the last commits were not built successfully on master, which is why I tested this today).

The build of that commit on top of master was successful here:
https://app.circleci.com/pipelines/github/tlaurion/heads/720/workflows/7580286e-1c3d-47e5-b15d-0bf4bfa3721e

The reason it's still yellow here is that I've reissued a build to build from cache, to see how long it takes.
But we know that the workspace grows exponentially, and the build time of the separate boards is mostly tied to downloading those workspace layers and decompressing them, and to waiting for parallel builds to be picked up, with a maximum of 4 builds in parallel on the free plan (so 12 builds means 3x4: the next 8 builds have to wait for the prior 4 to finish, at about 30 minutes each including workspace decompression, which adds up fast).

But that is the best solution we have right now, with build time being longer because some of the modules are not part of the caches created by the basic boards (the ones built to create each coreboot workspace), and those workspaces are reused rather than combined caches.
(Since the resulting reused cache is incomplete, some modules are rebuilt instead of being restored from cache.)

I would merge so that the 6 hours of build time happen now, and so that other PRs can be merged upstream later on once the osresearch CircleCI build confirms success.

@Tonux599 Thanks for this PR!

@Tonux599
Contributor Author

I would merge so that the 6 hours of build time happen now, and so that other PRs can be merged upstream later on once the osresearch CircleCI build confirms success.

Yep, I'm happy for this to merge whenever, given it's already built successfully.

@Tonux599 Thanks for this PR!

Always happy to help! 😄

@tlaurion
Collaborator

Merging.

@Tonux599 Do you want to create an issue with your hints on future optimizations and linked docs?

tlaurion merged commit cae003e into linuxboot:master Apr 27, 2021
@Tonux599
Contributor Author

Merging.

@Tonux599 Do you want to create an issue with your hints on future optimizations and linked docs?

I will, but not tonight. I'll add it to my to-do list.
