Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

attestations: batch attestation requests to reduce GH API load #18544

Open
1 task done
woodruffw opened this issue Oct 10, 2024 · 6 comments
Open
1 task done

attestations: batch attestation requests to reduce GH API load #18544

woodruffw opened this issue Oct 10, 2024 · 6 comments
Assignees
Labels
features New features in progress Maintainers are working on this

Comments

@woodruffw
Copy link
Member

Verification

Provide a detailed description of the proposed feature

This is a tracking issue for myself.

GitHub ratelimits their attestation APIs. These ratelimits are mostly fine when authenticated, but are severe when unauthenticated. However, even when authenticated, we're currently regularly getting close to them on CI (since the CI performs a lot of attestation verifications while performing bottle builds + test installs).

We can significantly reduce the amount of individual GH API calls we make by batching our attestation lookups. Instead of performing one gh attestation ... subcommand per attestation, we can instead pre-compute the list of attestations to look up and fetch them in a bulk fashion by requesting them in a single gh attestation ... invocation. gh attestation ... should then download them in an appropriate batched fashion internally.

What is the motivation for the feature?

Three main motivations:

  • Reliability/reducing our baseline GH API load: fewer separate downloads means fewer network requests that can fail for reasons outside of our control. It also means fewer hits counted against both our (CI) and user-level API tokens, making it less likely that users will see ratelimiting errors.
  • Performance: doing attestation downloads and verification in bulk upfront can be made embarrassingly parallel, either at the gh level or at the brew level. It also reduces the number of gh subprocesses brew needs to spawn, which are also slow.
  • Post gh transition: downloading the attestations up-front means we can more easily slot sigstore-ruby in for verification.

How will the feature be relevant to at least 90% of Homebrew users?

Reliability and performance, per above.

What alternatives to the feature have been considered?

Two options:

  1. Leave things as they are.
  2. Perform a more aggressive refactor that avoids the attestation APIs entirely, by instead stapling attestations to container manifests/metadata (is this the right term?) and retrieving them in-line with bottle downloads from GHCR. This would bypass the ratelimit concerns with attestations, but might have some undesirable knock-on effects (such as making 3p tap attestation support harder, since not all 3p taps use containers for bottle distribution).
@woodruffw woodruffw added the features New features label Oct 10, 2024
@woodruffw woodruffw self-assigned this Oct 10, 2024
@carlocab
Copy link
Member

Another option: check attestations at bottle fetch time instead of at bottle pour time.

test-bot does a lot of installing and uninstalling formulae in any given CI run, so checking them every time a formula is installed is pretty wasteful/expensive.

@Bo98
Copy link
Member

Bo98 commented Oct 10, 2024

Perform a more aggressive refactor that avoids the attestation APIs entirely, by instead stapling attestations to container manifests/metadata (is this the right term?) and retrieving them in-line with bottle downloads from GHCR.

Still think we should pursue this.

Batching is a good idea in general to reduce roundtrips but it's not really a full solution to the rate limit problem as there's so many situations we need to cover: dependencies, multiple named args, dependent upgrades, Brewfiles and scripts that do multiple brew install calls. The last of which is unfixable.

60/hour is not really acceptable for us to ship to GA.

For third-party taps: when containers aren't used then we do download a tab JSON so it could go there and that shouldn't be an issue.

@MikeMcQuaid
Copy link
Member

Another option: check attestations at bottle fetch time instead of at bottle pour time.

This makes sense to me for something to do for now.

Still think we should pursue this.

Agreed but, until someone does, we should try to improve things accordingly first.

60/hour is not really acceptable for us to ship to GA.

Agreed.

Copy link

github-actions bot commented Nov 4, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@github-actions github-actions bot added the stale No recent activity label Nov 4, 2024
@woodruffw woodruffw added in progress Maintainers are working on this and removed stale No recent activity labels Nov 4, 2024
@woodruffw
Copy link
Member Author

Leaving some notes for myself, since I'm planning on working on this in the coming weeks:

  1. We can batch fetch attestations directly via the GH API: https://docs.github.com/en/rest/users/attestations?apiVersion=2022-11-28
  2. We're now in a good state to directly integrate sigstore-ruby

@Bo98
Copy link
Member

Bo98 commented Nov 18, 2024

I did also briefly look into stapling attestations to GHCR recently and can look into drafting that maybe this week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
features New features in progress Maintainers are working on this
Projects
None yet
Development

No branches or pull requests

4 participants