Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GC: add support for GC policies #338

Merged
merged 1 commit into from
Jul 4, 2020
Merged

Conversation

rchincha
Copy link
Contributor

For our dist-spec implementation [1][2],
we have a use case where we use umoci's GC code to clean up orphaned
blobs. The issue is that as per dist-spec layer upload and manifest
updates are two different API calls and typically in that order, which
means layers begin life as orphans.

Adding the proposed patch allows us to:
a) mitigate the above issue
b) have a generic policy framework for future expansion

[1] https://github.com/opencontainers/distribution-spec
[2] https://github.com/anuvu/zot

oci/casext/gc.go Outdated Show resolved Hide resolved
@tych0
Copy link
Member

tych0 commented Jun 30, 2020

LGTM

Copy link
Member

@cyphar cyphar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea, but there are some things I'd like to see addressed. Thanks.

oci/casext/walk.go Outdated Show resolved Hide resolved
oci/casext/gc.go Outdated
@@ -25,6 +25,9 @@ import (
"golang.org/x/net/context"
)

// GCPolicy is a policy function that returns if a blob can be GC'ed
type GCPolicy func(desc ispec.Descriptor) (bool, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In line with switching to Paths, this should be a DescriptorPath.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Want to hide umoci internals from the policy func writers. So just pass in the standard ispec.Descriptor.

Copy link
Member

@cyphar cyphar Jul 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DescriptorPath isn't really an umoci internal, it's fairly publicly used if you want to do ResolveReference (so basically every umoci API user will see DescriptorPath at some point). For a bit of background -- the reason why DescriptorPath exists is that it allows programs to correctly handle arbitrary nesting of descriptors.

However I just re-read the original description of the PR and realised that passing descriptors makes no sense (sorry, it looks like @tych0 and I overlooked this) -- you are only calling policy(desc) if there is no descriptor in the black set (in other words, on blobs that have no descriptor pointing to them), so the descriptor will always be an empty struct. The new tests didn't catch this, so we'll need to update the tests so they can exercise this code properly.

oci/casext/gc.go Show resolved Hide resolved
oci/casext/gc.go Show resolved Hide resolved
@rchincha rchincha force-pushed the gc branch 3 times, most recently from d373e90 to 925e169 Compare July 1, 2020 02:01
Copy link
Member

@cyphar cyphar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like @tych0 and I both overlooked this -- the original change (using Digest) was actually correct. I misread the original purpose of the PR -- you want to be able to not garbage collect blobs which don't have any descriptors pointing to them. So therefore the policy function won't be able to take a descriptor, because in order for the policy to be checked there must be no descriptor for the blob.

This is unfortunate, since it means the policy function doesn't know the type of the blob, but that is the nature of the problem. I'm a little worried that the new tests didn't catch that the passed ispec.Descriptor was completely empty (even the digest was empty) so those tests should be updated to correctly exercise that code.

oci/casext/gc.go Outdated Show resolved Hide resolved
oci/casext/gc.go Outdated Show resolved Hide resolved
oci/casext/gc_test.go Outdated Show resolved Hide resolved
oci/casext/gc_test.go Outdated Show resolved Hide resolved
@tych0
Copy link
Member

tych0 commented Jul 1, 2020

This is unfortunate, since it means the policy function doesn't know the type of the blob, but that is the nature of the problem.

Ugh, yeah. That is unfortunate. Makes total sense though.

@rchincha rchincha force-pushed the gc branch 2 times, most recently from c05011e to 3382efc Compare July 2, 2020 17:31
@rchincha
Copy link
Contributor Author

rchincha commented Jul 2, 2020

Updated the PR with suggested changes.

@rchincha
Copy link
Contributor Author

rchincha commented Jul 2, 2020

https://travis-ci.org/github/opencontainers/umoci/jobs/704404847
Is this failure expected?
loadinternal: cannot find runtime/cgo

@cyphar
Copy link
Member

cyphar commented Jul 3, 2020

@rchincha That's not the cause of the failure (it's some weird warning we get when running under Travis that I haven't had a chance to debug). The cause of the failure was that the Open Build Service had some downtime in the past few days, so the package installations failed. EDIT: The actual cause of the failure was that opensuse/leap:latest switched over to Leap 15.2, which uncovered that some OBS repositories have been broken for a while (namely home:cyphar:bats). I've fixed that and re-triggered the CI run -- I will also fix up some of those aspects of our testing in a separate PR.

Copy link
Member

@cyphar cyphar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few nits.

oci/casext/gc.go Outdated Show resolved Hide resolved
oci/casext/gc.go Outdated Show resolved Hide resolved
oci/casext/gc.go Show resolved Hide resolved
oci/casext/gc.go Outdated Show resolved Hide resolved
oci/casext/gc.go Show resolved Hide resolved
oci/casext/gc_test.go Outdated Show resolved Hide resolved
oci/casext/gc_test.go Outdated Show resolved Hide resolved
oci/casext/gc_test.go Outdated Show resolved Hide resolved
For our dist-spec implementation [1][2],
we have a use case where we use umoci's GC code to clean up orphaned
blobs. The issue is that as per dist-spec layer upload and manifest
updates are two different API calls and typically in that order, which
means layers begin life as orphans.

Adding the proposed patch allows us to:
a) mitigate the above issue
b) have a generic policy framework for future expansion

[1] https://github.com/opencontainers/distribution-spec
[2] https://github.com/anuvu/zot

Signed-off-by: Ramkumar Chinchani <[email protected]>
@cyphar
Copy link
Member

cyphar commented Jul 4, 2020

LGTM, thanks and sorry for all the back-and-forth around Descriptors when it wasn't relevant to your PR. :D

@tych0
Copy link
Member

tych0 commented Jul 4, 2020

LGTM.

@tych0 tych0 merged commit 977db48 into opencontainers:master Jul 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants