This repository was archived by the owner on Aug 29, 2023. It is now read-only.

@galargh galargh commented Dec 22, 2021

Fixes #36 🤞

Description

When deploying changes to all the repositories, we hit a secondary rate limit because we open too many PRs too quickly. I believe the only problematic call in the process is the Create a Pull Request API call. The documentation mentions that calling the endpoint too quickly may trigger the limit.

That's why I propose to reorganise the dispatch/copy-workflow procedure in the following way:

  1. Prepare batches of up to 256 targets.
  2. For each batch (sequentially):
    1. Trigger copy-workflow with the batch; for each target (in up to 20 parallel jobs):
      1. Check out the default branch
      2. Copy files as instructed by the config
      3. Check if there are any changes
      4. Push the changes to the remote web3-bot/sync branch IF there are any changes
    2. Wait for the triggered copy-workflow to complete
    3. Find all the matrix jobs from the copy-workflow that pushed changes to the remote web3-bot/sync branch
    4. For all the found matrix jobs (sequentially):
      1. Create a PR
      2. Sleep for 3 seconds
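
For illustration, step 1 (batching) could be sketched as follows. This is a minimal sketch, not the code in this PR: make_batches is a hypothetical helper name, and it assumes the targets arrive one per line on stdin.

```shell
# Group targets (one per line on stdin) into batches of at most $1 entries.
# Prints one batch per line, with entries separated by spaces.
# The body runs in a subshell, so its variables do not leak to the caller.
make_batches() (
  size="$1"
  count=0
  batch=""
  while IFS= read -r target; do
    [ -n "$target" ] || continue            # ignore blank lines
    batch="${batch:+$batch }$target"
    count=$((count + 1))
    if [ "$count" -eq "$size" ]; then       # batch is full, emit it
      printf '%s\n' "$batch"
      batch=""
      count=0
    fi
  done
  # Emit the last, possibly smaller, batch.
  if [ "$count" -gt 0 ]; then printf '%s\n' "$batch"; fi
)
```

Each output line would then be handed to one sequential copy-workflow dispatch, e.g. `make_batches 256 < targets.txt`, where targets.txt is a placeholder file name.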

The main idea here is to do as much of the parallelisable work as possible inside copy-workflow and to parallelise it as much as possible. That leaves only the PR creation calls to be executed sequentially. Furthermore, by executing PR creation calls conditionally (only if needed), we keep the runtime small when we only update a handful of repos.

We could further optimise this procedure by performing the PR creation part in parallel to copy-workflow execution. I think that would be possible because the copy-workflow doesn't execute any GH API calls. However, I propose that we postpone any work in this direction until we deem it necessary because it would add even more complexity to the dispatch workflow.

Removing PR create API call from copy-workflow

To accomplish this, I removed peter-evans/create-pull-request completely and replaced it with:

git checkout -B web3-bot/sync
git push origin web3-bot/sync -f

These two commands ensure that our local state, after copying all the necessary files, is reflected in the remote web3-bot/sync branch. They are only executed if there are changes to push.
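
A minimal sketch of the conditional push, for illustration only. It assumes that changes are detected with git status --porcelain and committed right before the push; the function name and the commit message are placeholders, and in the actual workflow the condition may well live in a step-level `if:` instead.

```shell
# Push the working tree to web3-bot/sync only if copying files changed anything.
# Assumption: run from inside the checked-out target repository.
push_sync_branch_if_changed() {
  if [ -z "$(git status --porcelain)" ]; then
    echo "no changes, skipping push"
    return 0
  fi
  git add -A &&
    git commit -q -m "${COMMIT_MESSAGE:-chore: update CI config files}" &&
    git checkout -q -B web3-bot/sync &&      # create or reset the local branch
    git push -q -f origin web3-bot/sync      # overwrite the remote branch
}
```

The force push is what makes the remote branch mirror the local state even when web3-bot/sync already exists from an earlier deployment.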

Dispatching copy-workflow and waiting for the completion

Unfortunately, benc-uk/workflow-dispatch cannot wait for the dispatched workflow to complete. It can also be replaced by a simple gh workflow run call (gh is already installed on the runners), so I decided to do just that.

Sadly, gh workflow run doesn't know how to wait for the workflow to complete either, nor does it return the id of the dispatched run, so we have to be clever about acquiring it. I decided to:

  1. record a timestamp before dispatching copy-workflow;
  2. dispatch copy-workflow;
  3. request the latest run of the copy-workflow until its start date is after the recorded timestamp;
  4. record the id of that run of the copy-workflow.

This works because copy-workflows are dispatched sequentially, which means there is never more than one copy-workflow running at a time.

Once we have the id in hand, we can use gh run watch to wait for the run to complete.
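
The four steps above could look roughly like this. It is a sketch under assumptions rather than the exact code in this PR: the workflow input name `targets`, the POLL_INTERVAL variable and the function name are made up for illustration, and it relies on a gh version whose run list supports --json and --jq.

```shell
# Dispatch a workflow and print the id of the run that the dispatch created.
# $1: workflow file name, $2: value for the (hypothetical) "targets" input.
dispatch_and_get_run_id() (
  workflow="$1"
  targets="$2"
  # 1. Record a UTC timestamp in the same format as the API's createdAt.
  since="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  # 2. Dispatch the workflow.
  gh workflow run "$workflow" -f targets="$targets" >/dev/null || exit 1
  # 3. Poll the latest run until it is at least as recent as the timestamp.
  #    ISO 8601 UTC strings of equal format compare correctly as plain strings.
  while :; do
    id="$(gh run list --workflow "$workflow" --limit 1 \
      --json databaseId,createdAt \
      --jq ".[] | select(.createdAt >= \"$since\") | .databaseId")"
    # 4. Report the id of that run.
    if [ -n "$id" ]; then
      printf '%s\n' "$id"
      exit 0
    fi
    sleep "${POLL_INTERVAL:-5}"
  done
)
```

The caller would then wait with something like `gh run watch "$run_id" --exit-status`. Note that the loop has no timeout, so a real implementation would probably want to give up after a number of attempts.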

Finding all the repositories that need PRs created

Once again, the Actions framework doesn't make it easy to retrieve this information. It doesn't support job outputs from matrix jobs (or rather it does, but it only persists the output of the latest job, which is not useful here).

The workaround I came up with is to:

  • inside copy-workflow: conditionally execute the branch push step;
  • inside dispatch:
    1. find all the matrix jobs executed by the copy-workflow run;
    2. filter them by the status of the branch push step execution;
    3. map the remaining jobs to the job names which are in {owner}/{repository} format.
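
As an illustration of the dispatch side, the lookup could be sketched with the "list jobs for a workflow run" endpoint. The function name and the step name passed as the third argument are placeholders; the sketch assumes that a step skipped by its condition is reported with a conclusion other than "success".

```shell
# Print the names of the jobs in a run whose push step actually ran and succeeded.
# $1: repository that hosts copy-workflow ({owner}/{repository})
# $2: run id, $3: name of the push step (placeholder, depends on the workflow)
find_pushed_targets() {
  gh api --paginate "/repos/$1/actions/runs/$2/jobs?per_page=100" \
    --jq ".jobs[] | select(any(.steps[]?; .name == \"$3\" and .conclusion == \"success\")) | .name"
}
```

Because the matrix jobs are named after their targets, the output is directly a list of {owner}/{repository} entries that need a PR.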

Creating the PRs

PRs can be created with gh api -X POST "/repos/$job_name/pulls".

I also added a 3-second sleep after the GH API call. I chose 3 seconds because that value comes up in multiple places across the GH API, so there's a chance it's going to work here too.
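
A sketch of the sequential PR creation loop, under assumptions: the PR title, the PR_INTERVAL variable and the function name are placeholders, and the base branch is looked up from the repository's default_branch field, which may differ from how this PR does it.

```shell
# Create one PR per repository read from stdin ({owner}/{repository} per line),
# sleeping between calls to stay clear of the secondary rate limit.
create_prs() {
  while IFS= read -r repo; do
    [ -n "$repo" ] || continue
    # Look up the default branch to use as the PR base.
    base="$(gh api "/repos/$repo" --jq .default_branch </dev/null)" || continue
    gh api -X POST "/repos/$repo/pulls" \
      -f title="${PR_TITLE:-sync: update CI config files}" \
      -f head="web3-bot/sync" \
      -f base="$base" </dev/null >/dev/null ||
      echo "failed to create PR in $repo" >&2   # e.g. a PR already exists
    sleep "${PR_INTERVAL:-3}"
  done
}
```

Keeping the interval in a variable makes it cheap to raise or lower the delay if 3 seconds turns out to be the wrong value.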

Overhead

I ran dispatch workflow in my account to measure the overhead of this setup. To achieve this I:

  • replaced the GH API call to create a PR in dispatch with an echo,
  • removed everything but the push step from copy-workflow,
  • replaced the push step in copy-workflow with an echo,
  • made the push step always execute.

This means that the overhead numbers are for a case where we have to create PRs (and sleep for 3 seconds) for all the repositories in the configuration.

| Repositories | Repositories per batch | Max parallel factor | Time spent in copy-workflow | Time spent "creating PRs" | Total execution time |
| --- | --- | --- | --- | --- | --- |
| 176 | 100 | 10 | 2m 53s + 1m 51s = 4m 44s | 5m 27s + 4m 7s = 9m 34s | 15m 15s |
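
For context, the sleep alone puts a floor under the "creating PRs" time. The quick calculation below assumes the two addends in the table correspond to the two batches (100 and 76 repositories); sleep_floor is a throwaway helper name.

```shell
# Lower bound on PR creation time for a batch: one sleep per PR.
# $1: number of repositories, $2: sleep interval in seconds
sleep_floor() {
  total=$(( $1 * $2 ))
  printf '%dm %ds\n' $(( total / 60 )) $(( total % 60 ))
}

sleep_floor 100 3   # first batch:  prints "5m 0s"
sleep_floor 76 3    # second batch: prints "3m 48s"
```

Under that assumption, the measured 5m 27s and 4m 7s would leave roughly 27 and 19 seconds per batch for everything other than sleeping.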

Testing

Next step ideas if needed

  • increase/decrease the interval between PR creations
  • decouple PR creation from branch pushes completely - create a new job that creates PRs in targets that have existing branches but no associated PRs - trigger it once dispatch is done/manually/on schedule
  • create a workflow in target repositories that ensures a PR exists after a push to web3-bot/sync branch
  • move towards a pull model - create a workflow in each target that knows how to copy workflows from the unified CI repo and create a PR - trigger it from dispatch/manually/on schedule

Base automatically changed from testing-js to master January 7, 2022 10:06
@galargh galargh marked this pull request as ready for review January 10, 2022 10:47

galargh commented Feb 16, 2022

@marten-seemann Have you had a chance to look at this? I'm interested to hear what you think about this approach. I was thinking of trying it out on the next major release of unified CI for Go.

@laurentsenta laurentsenta mentioned this pull request Mar 18, 2022
2 tasks
@galargh galargh mentioned this pull request Mar 21, 2022
1 task
@galargh galargh requested a review from laurentsenta March 30, 2022 09:56
@galargh galargh mentioned this pull request Mar 31, 2022
3 tasks

galargh commented Apr 4, 2022

Merging - as per verbal approval from @laurentsenta

@galargh galargh merged commit 8c6afa9 into master Apr 4, 2022
@galargh galargh deleted the testing-rate-limit branch April 4, 2022 10:55

Development

Successfully merging this pull request may close these issues.

Opening many PRs triggers GitHub's abuse detection mechanism

3 participants