ociruntime: handle images with high layer count #7630

Open
wants to merge 1 commit into
base: master

sluongng
Contributor

@sluongng sluongng commented Oct 2, 2024

When an action requires an image with more than 20 layers, our mount
fails with

create OCI bundle: create rootfs: mount overlayfs: no such file or directory

After some digging, it seems that 20 is the current limit on the number
of lowerdirs allowed in each mount call.

Add special logic to break down images with more than 20 layers into
groups of 20. For each group, create an overlayfs mount called
"merged" in the same bundle dir. The final overlayfs will then
be composed of these "merged" groups as lowerdirs.
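
To make the composition concrete, here is a minimal sketch of how the option string for such an overlayfs mount could be assembled. `overlayOptions` and the `/b/...` paths are hypothetical names for illustration and are not taken from this PR. The one fact it relies on is that overlayfs reads `lowerdir` as a colon-separated list with the topmost directory first, so directories given bottom-to-top have to be reversed.

```go
package main

import (
	"fmt"
	"strings"
)

// overlayOptions builds the option string for an overlayfs mount.
// dirs is given bottom-to-top (the order layers appear in a manifest);
// overlayfs wants the topmost lowerdir first, so the slice is reversed.
// upperDir and workDir may be left empty for a read-only mount.
// Hypothetical helper, not code from this PR.
func overlayOptions(dirs []string, upperDir, workDir string) string {
	lower := make([]string, len(dirs))
	for i, d := range dirs {
		lower[len(dirs)-1-i] = d
	}
	opts := "lowerdir=" + strings.Join(lower, ":")
	if upperDir != "" {
		opts += ",upperdir=" + upperDir + ",workdir=" + workDir
	}
	return opts
}

func main() {
	// Final mount composed of two "merged" group dirs (assumed paths).
	fmt.Println(overlayOptions(
		[]string{"/b/merged0", "/b/merged1"}, "/b/upper", "/b/work"))
}
```

The resulting string is what would be passed as the data argument of the mount call for either a group mount or the final mount.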

@sluongng sluongng requested a review from bduffany October 2, 2024 10:22
@sluongng sluongng force-pushed the sluongng/ociruntime-many-layers branch 2 times, most recently from f57618c to 9f79612 Compare October 2, 2024 10:43
func TestOverlayfsHighLayerCount(t *testing.T) {
	setupNetworking(t)

	image := "ghcr.io/avdv/nix-build@sha256:5f731adacf7290352fed6c1960dfb56ec3fdb31a376d0f2170961fbc96944d50"
Member

@bduffany bduffany Oct 2, 2024

The layering logic is starting to get complicated - it's kind of hard to understand the layer ordering now that the "groups" have to be in reverse-manifest-order and so do the individual layers within those merged groups.

So I feel like it'd be better to have a more explicit test on how we're handling the layers, rather than an integration test that uses a big image (which also makes the test more expensive). For example, create a fake slice of 30 image.Layer structs, pass that to a func like GetOverlayMounts(...) which returns some data structure representing the mounts, then assert that it returns 2 dirs, one with 10 layers and another with 20 layers (and make sure the order of the layers is correct). The "20" const could also be a parameter of that func and it could be set to 3 to make the test easier to follow.
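
A rough sketch of what such a function and its checks might look like follows. `planOverlayMounts`, `overlayPlan`, and `fakeLayers` are hypothetical stand-ins (plain strings replace `image.Layer`), and the sketch assumes the number of groups itself stays within the limit.

```go
package main

import "fmt"

// overlayPlan describes the mounts to perform (hypothetical structure).
type overlayPlan struct {
	// Groups[i] lists the lowerdirs of intermediate mount "merged<i>",
	// topmost layer first. Empty when no grouping is needed.
	Groups [][]string
	// Final lists the lowerdirs of the final mount, topmost first.
	Final []string
}

// reversed returns a reversed copy of in.
func reversed(in []string) []string {
	out := make([]string, len(in))
	for i, s := range in {
		out[len(in)-1-i] = s
	}
	return out
}

// planOverlayMounts splits layers (bottom-to-top, manifest order) into
// groups of at most limit. It does not handle the case where the number
// of groups itself exceeds limit.
func planOverlayMounts(layers []string, limit int) overlayPlan {
	if limit < 1 {
		limit = 1
	}
	if len(layers) <= limit {
		return overlayPlan{Final: reversed(layers)}
	}
	var p overlayPlan
	var merged []string
	for start := 0; start < len(layers); start += limit {
		end := start + limit
		if end > len(layers) {
			end = len(layers)
		}
		p.Groups = append(p.Groups, reversed(layers[start:end]))
		merged = append(merged, fmt.Sprintf("merged%d", len(merged)))
	}
	p.Final = reversed(merged)
	return p
}

// fakeLayers returns n placeholder layer names, bottom layer first.
func fakeLayers(n int) []string {
	out := make([]string, n)
	for i := range out {
		out[i] = fmt.Sprintf("layer%d", i)
	}
	return out
}

func main() {
	// A small limit keeps the ordering easy to read.
	p := planOverlayMounts(fakeLayers(7), 3)
	fmt.Println(p.Groups)
	fmt.Println(p.Final)
}
```

With 30 fake layers and a limit of 20, this yields two groups of 20 and 10 layers, and the final mount lists `merged1` before `merged0`.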

Contributor Author

Yeah with the latest version of this PR, I do think that the logic is a bit convoluted (multiple slices.Reverse calls) and could use some tests.

I will look for a way to export part of createRootfs() logic for testing.

Contributor Author

I gave this a try, but when I did the refactoring, making the test inspect the directory and mount structure felt... wrong? It felt like we were testing implementation details, which would lock us out of changing them in the future.

Instead, I think we should just test the higher-level effect which matters to our users.
So I added a test that creates X layers on top of the busybox container image.
Each layer includes an a.txt file with a different value, and we simply assert that the last layer wins by running cat /a.txt.

Let me know what you think.

@sluongng sluongng force-pushed the sluongng/ociruntime-many-layers branch 3 times, most recently from 084571c to 6a04038 Compare October 4, 2024 13:22
@sluongng sluongng requested a review from bduffany October 4, 2024 13:24
@sluongng sluongng marked this pull request as draft October 4, 2024 13:35
@sluongng
Contributor Author

sluongng commented Oct 4, 2024

Hmm it seems like the test is failing at 1 + 21 layers. (my test is working! 🎉 )

I will put the PR in draft so I can investigate the implementation a bit more.

@sluongng sluongng force-pushed the sluongng/ociruntime-many-layers branch 3 times, most recently from f95d891 to cb4679b Compare October 4, 2024 15:58
@sluongng sluongng marked this pull request as ready for review October 4, 2024 16:00
When an action requires an image with more than 20 layers, our mount
fails with

```
create OCI bundle: create rootfs: mount overlayfs: no such file or directory
```

After some digging, it seems that 20 is the current limit on the number
of lowerdirs allowed in each mount call.

Add special logic to break down images with more than 20 layers into
groups of 20. For each group, create an overlayfs mount called
"merged<group-id>" in the same bundle dir. The final overlayfs will then
be composed of these "merged" groups as lowerdirs.
@sluongng sluongng force-pushed the sluongng/ociruntime-many-layers branch from cb4679b to 9289657 Compare October 4, 2024 16:04
@sluongng
Contributor Author

sluongng commented Oct 4, 2024

Turns out it was just an extra slices.Reverse that I needed to remove 👀

The tests all pass now. I also ran the old test against the Nix image manually, and it passed as well.

@sluongng
Contributor Author

sluongng commented Oct 4, 2024

What is curious, though, is that with the current test setup I can create an OCI image with 2000 layers and it does not fail. So I wonder if the problem is not layer count alone but perhaps a combination of layer count and layer content 🤔

I did not dig into it too much, but I left a TODO there to investigate further in the future.


2 participants