Improve CI cache usage by nineinchnick · Pull Request #14880 · trinodb/trino

nineinchnick · 2022-11-03T12:03:16Z

Description

I noticed we're still creating too many cache entries so it gets cleared too often, so we should not create them on forks, because such entries cannot be reused except for that fork and branch.

Ref: https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#restrictions-for-accessing-a-cache

I also noticed downloading the cache sometimes gets stuck, like here: https://github.com/trinodb/trino/actions/runs/3384101557/jobs/5620679970. There's a new setting in the cache action documented here: https://github.com/actions/cache/blob/main/workarounds.md#cache-segment-restore-timeout and it should be available because actions/setup-java is using actions/cache 3.0.4 (the JS lib, not the GH action): https://github.com/actions/setup-java/blob/main/package.json

Ref: https://github.com/actions/toolkit/blob/main/packages/cache/RELEASES.md#304

Non-technical explanation

n/a

Release notes

(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

Caches created on forks cannot be shared so don't create them to avoid wasting time and wasting cache limits.

.github/workflows/ci.yml

hashhar

BTW is this really a problem though? We can have around 6 cache entries with current size.

Only the latest one on master will get used anyway.

This change would mean PRs which change pom.xml will no longer benefit from caching (since the only reason a new cache entry would be created on fork would be if pom.xml changed).

nineinchnick · 2022-11-03T12:38:37Z

BTW is this really a problem though? We can have around 6 cache entries with current size.

We see a lot of failures related to fetching artifacts from Maven Central. I want to increase the chances of a cache hit by avoiding the eviction of cache entries, especially if most of them cannot be reused.

Also, we waste time (50s) on uploading a cache entry that won't be reused:

Only the latest one on master will get used anyway.

Only if it exists.

This change would mean PRs which change pom.xml will no longer benefit from caching (since the only reason a new cache entry would be created on fork would be if pom.xml changed).

They don't benefit from cache anyway, since they can't use the one from master, because the cache key is a hash of all pom files. We could work around it by using a different restore key: https://github.com/actions/cache/blob/main/workarounds.md#update-a-cache but:

this input param is not exposed in setup-java, we'd need to extract the usage of the cache action
there are relatively few commits in the last 3 months that change poms:

 release_changes | pom_changes | all_changes 
-----------------+-------------+-------------
            1818 |         696 |       11723

query:

SELECT
    count(*) FILTER (WHERE d.path_name like '%pom.xml' AND c.message like '[maven-release-plugin] %') AS release_changes
  , count(*) FILTER (WHERE d.path_name like '%pom.xml' AND c.message not like '[maven-release-plugin] %') AS pom_changes
  , count(*) AS all_changes
FROM git.default.commits c
JOIN git.default.diff_stats d ON d.commit_id = c.object_id
WHERE c.commit_time > NOW() - INTERVAL '3' MONTH
;

This is a different problem to solve in a separate PR though (restoring the cache vs creating a new cache entry).

Default timeout is 60 minutes, after which a stuck download would be retried. Lower it to start retrying downloads when they get stuck.

hashhar · 2022-11-03T12:49:08Z

Only the latest one on master will get used anyway.

Only if it exists.

Why would it not exist? The tip of master always creates a cache entry and since GHA rebases PRs onto master before running CI unless you get caught in the very tiny window where some PR is merged to master but it's cache is not yet uploaded only then would the cache not exist.

They don't benefit from cache anyway, since they can't use the one from master,

Only if you assume CI runs just once on such a PR which is rarely true.

For example if you go to https://github.com/trinodb/trino/actions/caches and see which one is getting used recently and when it was created you'll notice the one created from tip of master is most used one. But also others created from forks get used relatively often as well.

Note that the cache is LRU, not FIFO so as long as something uses the cache from tip of master it'll never get evicted - just eventually replaced when some other PR is merged to master.

nineinchnick · 2022-11-03T13:01:45Z

Only the latest one on master will get used anyway.

Only if it exists.

Why would it not exist? The tip of master always creates a cache entry and since GHA rebases PRs onto master before running CI unless you get caught in the very tiny window where some PR is merged to master but it's cache is not yet uploaded only then would the cache not exist.

They don't benefit from cache anyway, since they can't use the one from master,

Only if you assume CI runs just once on such a PR which is rarely true.

For example if you go to https://github.com/trinodb/trino/actions/caches and see which one is getting used recently and when it was created you'll notice the one created from tip of master is most used one. But also others created from forks get used relatively often as well.

Note that the cache is LRU, not FIFO so as long as something uses the cache from tip of master it'll never get evicted - just eventually replaced when some other PR is merged to master.

You're right. We should first make sure that forks can reuse the cache from master, then we can analyze how often they reuse their own cache. I extracted the other commit into #14881, will close this one and work on using restore-key.

nineinchnick · 2022-11-03T13:41:44Z

I just opened #14882 and trinodb/github-actions#15 if we decide to use this action in #14865.

Only cache Maven dependencies on main branch

31db227

Caches created on forks cannot be shared so don't create them to avoid wasting time and wasting cache limits.

cla-bot bot added the cla-signed label Nov 3, 2022

nineinchnick requested review from MiguelWeezardo, ebyhr and hashhar November 3, 2022 12:03

hashhar reviewed Nov 3, 2022

View reviewed changes

.github/workflows/ci.yml Outdated Show resolved Hide resolved

hashhar reviewed Nov 3, 2022

View reviewed changes

Set CI cache download timeout to enable retries

6259d34

Default timeout is 60 minutes, after which a stuck download would be retried. Lower it to start retrying downloads when they get stuck.

nineinchnick force-pushed the ci-cache branch from 04e6db7 to 6259d34 Compare November 3, 2022 12:43

nineinchnick closed this Nov 3, 2022

nineinchnick deleted the ci-cache branch November 3, 2022 14:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve CI cache usage#14880

Improve CI cache usage#14880
nineinchnick wants to merge 2 commits intotrinodb:masterfrom
nineinchnick:ci-cache

nineinchnick commented Nov 3, 2022

Uh oh!

Uh oh!

hashhar left a comment

Uh oh!

nineinchnick commented Nov 3, 2022 •

edited

Loading

Uh oh!

hashhar commented Nov 3, 2022 •

edited

Loading

Uh oh!

nineinchnick commented Nov 3, 2022

Uh oh!

nineinchnick commented Nov 3, 2022 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants

Conversation

nineinchnick commented Nov 3, 2022

Description

Non-technical explanation

Release notes

Uh oh!

Uh oh!

hashhar left a comment

Choose a reason for hiding this comment

Uh oh!

nineinchnick commented Nov 3, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hashhar commented Nov 3, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nineinchnick commented Nov 3, 2022

Uh oh!

nineinchnick commented Nov 3, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants

nineinchnick commented Nov 3, 2022 •

edited

Loading

hashhar commented Nov 3, 2022 •

edited

Loading

nineinchnick commented Nov 3, 2022 •

edited

Loading