Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GHA workflow to validate pants BUILD files #5732

Merged
merged 3 commits into from
Sep 20, 2022
Merged

Conversation

cognifloyd
Copy link
Member

@cognifloyd cognifloyd commented Sep 16, 2022

Background

This is another part of introducing pants, as discussed in the TSC Meetings on 12 July 2022, 02 Aug 2022 and 06 Sept 2022. Pants has fine-grained per-file caching of results for lint, fmt (like black), test, etc. It also has lockfiles that work well for monorepos that have multiple python packages. With these lockfiles CI should not break when any of our dependencies or our transitive dependencies release new versions, because CI will continue to use the locked version until we explicitly relock with updates.

To keep PRs as manageable/reviewable as possible, introducing pants will take a series of PRs. I do not know yet how many PRs; I will break this up into logical steps with these goals:

  • introduce pants to the st2 repo, and
  • teach some (but not all) TSC members about pants step-by-step.

Other pants PRs include:

Due to scope change, this PR replaces #5727

Overview of this PR

It's helpful to review this PR one commit at a time.

This PR prepares us to run pants in GitHub Actions (GHA). There are 2 parts to this:

  1. Add a CI-specific pants config file: pants.ci.toml (used in addition to pants.toml)
  2. Prepare to add BUILD files to the repo: Add GHA workflow that makes sure BUILD files are up-to-date.

CI-specific pants config

The pants docs recommend adding a CI-specific pants config file called pants.ci.toml:
https://www.pantsbuild.org/docs/using-pants-in-ci#configuring-pants-for-ci-pantscitoml-optional

Please note this quote from the docs:

set the environment variable PANTS_CONFIG_FILES=pants.ci.toml to use this new config file, in addition to pants.toml.

So, pants.ci.toml does NOT replace pants.toml, it extends it. So, pants will still use the pants.toml, but we can tune settings specifically for GHA, and possibly CircleCI (if we need to use pants there).

We don't have to remember to add the PANTS_CONFIG_FILES env var, however, because that is covered by the reusable pants-init action, which injects the env var for us.

Pants GHA workflow

BUILD files are an important part of helping pants understand our codebase. As the docs say:

BUILD files provide metadata about your code (the timeout of a test, any dependencies which cannot be inferred, etc). BUILD files are typically located in the same directory as the code they describe. Unlike many other systems, Pants BUILD files are usually very succinct, as most metadata is either inferred from static analysis, assumed from sensible defaults, or generated for you.

In one of the next PRs, we will be adding BUILD files with metadata for pants across the repo. That PR will run ./pants tailor :: as described in initial config docs.

pants has some great tooling to ensure those BUILD files are up-to-date, so we will need a GHA workflow that validates those files. I want to enable a workflow as soon as we add the BUILD files, but--to simplify reviewing--I don't want to add the workflow in the same PR that we add the BUILD files. So, I'm adding the workflow in this PR without enabling it.

This is the key part of the workflow:

- name: Check BUILD files
run: |
./pants tailor --check update-build-files --check ::

Here, we run two pants goals in check mode:

  • tailor: "Auto-generate BUILD file targets for new source files."
  • update-build-files: "Format and fix safe deprecations in BUILD files."

tailor example: If we add a shell script in a directory that didn't have any shell scripts before, then tailor would add a shell_sources target in a BUILD file in that directory. Same for python, and many other things. I expect to discuss more about the contents of BUILD files in the PRs where we add, and then modify them.

update-build-files example: BUILD files use python syntax, so this runs black over them to reformat them. update-build-files also vastly simplifies upgrading between pants versions, because it can adjust the BUILD files to use the latest-and-greatest targets/fields/etc. As the docs say, it can: "Automatically fix deprecations, such as target type renames, that are safe because they do not change semantics."

@cognifloyd cognifloyd added this to the pants milestone Sep 16, 2022
@cognifloyd cognifloyd self-assigned this Sep 16, 2022
@pull-request-size pull-request-size bot added the size/M PR that changes 30-99 lines. Good size to review. label Sep 16, 2022
Comment on lines +5 to +21
# temporarily only allow manual runs until we have BUILD files and lockfiles
workflow_dispatch:
#push:
# branches:
# # only on merges to master branch
# - master
# # and version branches, which only include minor versions (eg: v3.4)
# - v[0-9]+.[0-9]+
# tags:
# # also version tags, which include bugfix releases (eg: v3.4.0)
# - v[0-9]+.[0-9]+.[0-9]+
#pull_request:
# type: [opened, reopened, edited]
# branches:
# # Only for PRs targeting those branches
# - master
# - v[0-9]+.[0-9]+
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will uncomment this and remove workflow_dispatch when we add the BUILD files.

Comment on lines +34 to +40
- name: Initialize Pants and its GHA caches
uses: pantsbuild/actions/init-pants@c0ce05ee4ba288bb2a729a2b77294e9cb6ab66f7
# This action adds an env var to make pants use both pants.ci.toml & pants.toml.
# This action also creates 3 GHA caches (1 is optional).
# - `pants-setup` has the bootsrapped pants install
# - `pants-named-caches` has pip/wheel and PEX caches
# - `pants-lmdb-store` has the fine-grained process cache.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/pantsbuild/actions/tree/main/init-pants

The action handles installing pants, adding the PANTS_CONFIG_FILES env var (to use our new pants.ci.toml file), and it manages the GHA caches for us. Using this action means that we don't have to repeat those steps in every workflow that uses pants.

One of these GHA cahces will include the pip cache, so we will also be able to simplify our cache usage in other workflows once we are using pants everywhere.

Comment on lines +50 to +51
# enable the optional lmdb_store cache since we're not using remote caching.
cache-lmdb-store: 'true'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we experience any issues with the 10GB GHA cache limit (a per-repo limit), we may need to add a workflow that periodically clears out older unused caches - Github expires unused caches after 7 days, but we might want a shorter timespan, as this cache is a per-commit cache. We can also bump the gha-cache-key input var if the cache gets too big to ignore it and start fresh.

Comment on lines +57 to +62
- name: Upload pants log
uses: actions/upload-artifact@v2
with:
name: pants-log-py${{ matrix.python-version }}
path: .pants.d/pants.log
if: always() # We want the log even on failures.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my experience, this log is often nearly empty, but it can be very useful when debugging CI issues. Adding this step is recommended here: https://www.pantsbuild.org/docs/using-pants-in-ci#tip-store-pants-logs-as-artifacts

We will want to add this step in each workflow that runs pants.

Comment on lines +9 to +13
[stats]
# "print metrics of your cache's performance at the end of the run,
# including the number of cache hits and the total time saved thanks
# to caching"
log = true
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommended under "Tip: check cache performance with [stats].log" at the end of the Directories to Cache section of the docs.

Comment on lines +4 to +7
[GLOBAL]
# Colors often work in CI, but the shell is usually not a TTY so Pants
# doesn't attempt to use them by default.
colors = true
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cognifloyd cognifloyd marked this pull request as ready for review September 16, 2022 21:47
@cognifloyd cognifloyd changed the title Pants GHA init Add GHA workflow to validate pants BUILD files Sep 16, 2022
Copy link
Member

@rush-skills rush-skills left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
infrastructure: ci/cd pantsbuild size/M PR that changes 30-99 lines. Good size to review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants