Overhaul CI workflows to improve efficiency #392
Merged: ChrisCummins merged 28 commits into facebookresearch:development from ChrisCummins:ci-jobs on Sep 13, 2021
facebook-github-bot added the CLA Signed label on Sep 10, 2021
ChrisCummins force-pushed the ci-jobs branch 4 times, most recently from 756c2f0 to db3d161 (September 10, 2021 18:11)
Codecov Report
@@ Coverage Diff @@
## development #392 +/- ##
===============================================
- Coverage 85.91% 85.17% -0.74%
===============================================
Files 87 87
Lines 4743 4743
===============================================
- Hits 4075 4040 -35
- Misses 668 703 +35
Continue to review full report at Codecov.
ChrisCummins force-pushed the ci-jobs branch 4 times, most recently from 231f1dd to 62aa880 (September 11, 2021 12:48)
ChrisCummins changed the title from "🏗️ WIP: Overhaul CI workflows to improve efficiency" to "Overhaul CI workflows to improve efficiency" on Sep 11, 2021
ChrisCummins force-pushed the ci-jobs branch 6 times, most recently from 46e8ade to 2f4cef4 (September 12, 2021 22:23)
The asan CI job started failing with a permission error on the Csmith binary. Try setting the executable bit: PermissionError: [Errno 13] Permission denied: '/opt/hostedtoolcache/Python/3.9.6/x64/lib/python3.9/site-packages/compiler_gym/third_party/csmith/csmith/bin/csmith'
This is because other tests can clobber the shared FLAGS state, changing the behavior of the tests.
This adds a `pip install -r compiler_gym/requirements.txt` step to the `make install` target, as otherwise the package can be installed without resolving the required deps.
Don't run a dedicated coverage test job; instead, collect coverage reports from all test jobs and merge the results.
ChrisCummins force-pushed the ci-jobs branch from 4260e72 to 9fb7432 (September 13, 2021 13:54)
This was referenced Sep 28, 2021
This patch set overhauls the CI workflow to achieve a 59.4% reduction in total compute time and a 19.4% reduction in wall time.
Background
The "CI" workflow is run on every pull request and update to the stable and development branches of CompilerGym. It is responsible for running the test suite on a range of supported Python versions and operating systems to catch regressions and test new features. The CI workflow had grown to be very computationally hungry because of the large number of tests, lengthy build process, and large number of runtime configurations.
In the previous configuration, the CI workflow would spawn four job types:
- `bazel_test` (example run): these jobs would run the test suite using bazel.
- `install_test` (example run): these jobs would build the compiler_gym Python wheel and then run the test suite using pytest.
- `llvm-service-asan` (example run): the same as `install_test`, except that it would build the LLVM compiler service with address sanitizer support and run only the LLVM test suite with leak checking enabled.
- `pytest-cov` (example run): the same as `install_test`, except that it would run the test suite with code coverage enabled and upload the report to codecov.com.
In total, 12 jobs were spawned and processed independently and in parallel, requiring a massive 11 hours of compute time for every single change or PR update.
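As a rough sanity check on the headline numbers, the sketch below applies the stated 59.4% reduction to the stated 11 hours of compute time. The 11 hour figure is a rounded value from this description, so the result is only approximate, and the helper name `after_reduction` is made up for illustration.

```python
def after_reduction(before: float, fractional_reduction: float) -> float:
    """Return the value that remains after removing a fraction of `before`."""
    return before * (1.0 - fractional_reduction)


# Figures taken from this description: ~11 hours of compute per change in the
# old configuration, and a 59.4% reduction in total compute time.
OLD_COMPUTE_HOURS = 11.0
COMPUTE_REDUCTION = 0.594

NEW_COMPUTE_HOURS = after_reduction(OLD_COMPUTE_HOURS, COMPUTE_REDUCTION)
SAVED_COMPUTE_HOURS = OLD_COMPUTE_HOURS - NEW_COMPUTE_HOURS

print(f"approx. new compute time: {NEW_COMPUTE_HOURS:.1f} h")
print(f"approx. compute time saved: {SAVED_COMPUTE_HOURS:.1f} h")
```

Under these assumptions the new workflow would need roughly four and a half hours of compute per change, saving around six and a half hours.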
Much of this compute time is redundant and wasteful:
- `install_test` jobs build a Python wheel from scratch even though the Python wheel is version insensitive.
- `bazel_test` and `install_test` jobs run the same test suite redundantly.
This is made worse because there is no caching on account of GitHub's cache size limits.
New approach
This pull request uses the workflow artifacts mechanism to make the CI workflow more efficient by breaking it into a graph of smaller, dependent jobs.
The jobs have the following types:
- `build`: Build a pair of compiler_gym Python wheels for Linux and macOS and upload them as artifacts for use by other jobs.
- `build-asan-llvm-service`: Build the LLVM service with address sanitizer support (and build nothing else). Upload the artifact for use in other jobs.
- `test`: Once the `build` job has completed, download the wheel artifact and run the pytest suite on it, excluding the `examples/` and `llvm/` directories. Upload a code coverage report artifact.
- `test-examples`: Once the `build` job has completed, download the wheel artifact and run the pytest suite on it from the `examples/` directory. Upload a code coverage report artifact.
- `test-llvm-env`: Once the `build` job has completed, download the wheel artifact and run the pytest suite on it from the `llvm/` directory. Upload a code coverage report artifact.
- `test-llvm-env-linux-asan`: Once the `build` and `build-asan-llvm-service` jobs have completed, download the wheel artifact, repack it using the asan LLVM service build, and run the pytest suite on it from the `llvm/` directory.
- `upload-coverage-reports`: Download all code coverage report artifacts and combine them into a single upload to codecov.com.
Splitting the building and testing into separate jobs enables a single build to be shared across test runners.
Sharding the test suite into three subsets ("core", `examples`, and `llvm-env`) achieves greater test parallelism, reducing the wall time by 13 minutes.
Further, each of the build / test jobs is defined independently for macOS and Linux so that the tests for one platform do not block on the build for another. In total, 20 jobs are spawned.
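The dependency structure between the job types can be sketched as a small graph. The snippet below is an illustrative model, not the workflow definition itself: it ignores the per-platform duplication of jobs, and it assumes that `upload-coverage-reports` waits on the three test jobs that upload coverage reports. It uses `graphlib` from the Python standard library (Python 3.9 or later) to group jobs into waves that could run in parallel.

```python
from graphlib import TopologicalSorter

# Map each job type to the job types it must wait on, following the job
# descriptions above. Platform-specific variants are collapsed into one node.
JOB_DEPENDENCIES = {
    "build": set(),
    "build-asan-llvm-service": set(),
    "test": {"build"},
    "test-examples": {"build"},
    "test-llvm-env": {"build"},
    "test-llvm-env-linux-asan": {"build", "build-asan-llvm-service"},
    # Assumption: the upload job waits on the jobs that produce coverage reports.
    "upload-coverage-reports": {"test", "test-examples", "test-llvm-env"},
}


def schedule_waves(dependencies):
    """Group jobs into waves; every job in a wave has all its dependencies done."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


WAVES = schedule_waves(JOB_DEPENDENCIES)
for index, wave in enumerate(WAVES, start=1):
    print(f"wave {index}: {', '.join(wave)}")
```

In this model the two build jobs form the first wave, the four test jobs the second, and the coverage upload the third, which is why a single build can feed several test runners.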
Differences with old CI config
The functionality of the new CI workflow is not exactly equivalent to the old config:
`bazel test` locally.
Drawbacks
Implementation complexity: The new CI workflow is much more efficient, but this is achieved by having a more complex workflow configuration file. The CI workflow definition YAMLs have grown from 178 LOC spread across 3 files to 477 LOC in a single file. Worse, the YAML contains large amounts of duplicate code, as there is no mechanism for templating jobs and the `test-<foo>` and `build-<foo>` job definitions typically differ only in one or two lines.
Maintenance burden: Sharding the test suite so that test runners execute only a subset of the test suite is an entirely manual process and introduces a maintenance burden. Now any changes to the test suite will require updating the CI configs. Additionally, the test job shards have an unbalanced amount of work to do. For example, the LLVM tests take approximately 30 minutes, whereas the example tests take only 2 minutes. Further PRs may be required to achieve a more granular breakdown of test jobs.
Increased number of macOS jobs: GitHub's runners permit a maximum of 5 macOS jobs to run in parallel. The new CI configuration specifies 5 jobs. In the future as we add more macOS jobs we will not be able to achieve greater parallelism.
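To make the shard imbalance mentioned under the maintenance burden concrete, here is a minimal sketch using only the two durations given above (about 30 minutes for the LLVM tests and about 2 minutes for the example tests). The duration of the "core" shard is not stated in this description, so it is left out, and the function names are made up for illustration.

```python
# Approximate shard durations in minutes, as quoted in the drawbacks above.
SHARD_MINUTES = {
    "test-llvm-env": 30.0,
    "test-examples": 2.0,
}


def parallel_wall_time(shard_minutes):
    """When shards run in parallel, the slowest shard sets the wall time."""
    return max(shard_minutes.values())


def balanced_wall_time(shard_minutes):
    """Wall time if the same total work were split evenly across the shards."""
    return sum(shard_minutes.values()) / len(shard_minutes)


ACTUAL = parallel_wall_time(SHARD_MINUTES)
IDEAL = balanced_wall_time(SHARD_MINUTES)

print(f"slowest shard: {ACTUAL:.0f} min")
print(f"evenly balanced shards: {IDEAL:.0f} min")
```

With these two shards alone, the test stage is bounded by the 30 minute LLVM shard, while an even split of the same work would take about 16 minutes per shard. This is the gap that a more granular breakdown of test jobs would aim to close.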
Issue #385.