-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix shared cache mounts result in overlay corruption #2637
Conversation
Unrelated but we still have the CNI panic even after upgrading to go-cni 1.1.2 (#2632): https://github.com/moby/buildkit/runs/5201057000?check_suite_focus=true#step:6:1033
@fuweid Upgrading to go-cni 1.1.3 could fix it? |
Yes! Sorry for rush fix in v1.1.2. I forgot to update cni opt. Sorry for that Ref:
|
Thanks @fuweid! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, thanks for tackling it!
// Don't need temporary mount wrapper for non-overlayfs mounts | ||
return mounts, release, nil | ||
} | ||
dir, err := ioutil.TempDir("", "buildkit") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should add a defer
that removes this dir if an error is encountered after this that causes the func to return early
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the review. Fixed this.
// no mount exist. release the current mount. | ||
sm.curMounts = nil | ||
if err := mount.Unmount(sm.curMountPoint, 0); err != nil { | ||
return err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't have to be fixed in this PR as it's a larger issue, but if buildkitd crashed or errors were encountered unmounting here I guess we could leak overlay mounts. Then if buildkitd started up again, you could run into the same issue of duplicating overlay mounts because the old ones are still sitting around. This is different from mounts made for containers w/ runc/containerd because those are typically mounted inside a mount namespace, meaning they will get auto cleaned up by the kernel when no processes are left in the namespace. The only remediation would be to manually unmount stuff from under /tmp or reboot the host.
Like I said, this is probably a pre-existing issue, but something to keep in mind. We might want a way of cleaning up any leftover mounts when buildkitd starts up (though even that can't be 100% robust if it's possible for users to change the configuration of where such tmp mounts are made). Maybe we could even run buildkitd inside its own mount namespace, but trying to get Go to do that is never easy and might have weird undesirable side effects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the comment. Let's work on it in following-up PRs. I think we can add a directory under /var/lib/buildkit/
for storing all temporary overlay mounts. When buildkitd starts, it should clean up all existing mounts under that directory.
Signed-off-by: Kohei Tokunaga <[email protected]>
Fixes #2334
Currently,
MutableRef.Mounts
can return writable overlayfs mounts that can be mounted to multiple directories.However, mounting a writable overlayfs to multiple places can result in an error.
This commit avoids this issue by the approach suggested in #2334 (comment).