GHProxy: Cleanup old caches #23621
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: alvaroaleman
}

go func() {
	for range time.NewTicker(cachePruneInterval).C {
iirc there's some gotcha with this construct that leaks tickers?
if ghproxy uses interrupts, prefer interrupts.Tick()
Yeah, the Ticker never gets garbage collected; however, it runs as long as the binary, so that doesn't matter
func writecachePartitionMetadata(basePath, tempDir string, expiresAt time.Time) error {
	if expiresAt.IsZero() {
		return nil
Won't this lead to leaks? Why not failsafe to writing metadata that expires at time.Now().Add(time.Hour) or something?
No, it won't, and we can't do that. The whole reason the expiry information is passed on from the client is that token validity varies, and in the case of a PAT it never expires, which is represented by an empty expiresAt. If we unconditionally added 1h here, we would always delete all caches after one hour; we can't do that.
Ah, I see - this is to handle PAT. Perhaps a comment would be good to clarify this since it's implicit behavior?
I guess I would have expected "no expiry header" -> "no call to writing metadata" rather than "no expiry header" -> "pass an invalid date" -> "do nothing" but maybe that's just me being confused by it all
Also I guess it would not hurt to have a default cache TTL for PAT entries, too, since they could hit the same issues that apps auth hits, on a smaller machine or with fewer inodes free? Setting the TTL to a week or something should not cause adverse effects.
Added a comment and made it a pointer to further clarify that this might not be set. The TTL is for the entire cache, not individual entries, so we can never evict a PAT cache.
85d5978 to b8a2c72
Currently, Ghproxy never cleans up caches. This can relatively quickly lead to inode exhaustion when apps auth is used, as it results in many relatively short-lived caches (1h).

This change adds pruning for those, which works as follows:
* The github client will add an expiry header if it sends a request with a token that expires
* Ghproxy will write the expiry time into a metadata file at the root of the cache partition
* A background routine in ghproxy will iterate over all cache partitions and delete them when they have expired
b8a2c72 to 47df9eb
/lgtm
I just caught a flake over in #23656 (comment) that I suspect is related to this, only because I've never seen ghproxy flake on me before and this just landed recently
Thanks. I suspect that is because the test relies on timings, and if things go too slowly, it will fail like this. I'll try to improve this through a fake clock.
Fixes #23407