Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Follow links in includedPaths to resolve incorrect caching when source path is behind symlink #2318

Merged
merged 4 commits into from
Sep 8, 2021

Conversation

aaronlehmann
Copy link
Collaborator

As discussed in #2300, includedPaths does not resolve symlinks when
looking up the source path in the prefix tree. If the user requests a
path that involves symlinks (for example, /a/foo when a symlink /a -> /b
exists), includedPaths will not find it, and will expect nothing to be
copied. This does not match the actual copy behavior implemented in
fsutil, which will follow symlinks in prefix components of a given path,
so it can end up caching an empty result even though the copy will
produce a non-empty result, which is quite bad.

To fix this, use getFollowLinks to resolve the path before walking it.
In the wildcard case, this is done to the non-wildcard prefix of the
path (if any), which matches the behavior in fsutil.

Fixes the skipped test, and also the repro case here:
https://gist.github.com/aaronlehmann/64054c9a2cff0d27e200cc107bba3d69

Fixes #2300

…e path is behind symlink

As discussed in moby#2300, includedPaths does not resolve symlinks when
looking up the source path in the prefix tree. If the user requests a
path that involves symlinks (for example, /a/foo when a symlink /a -> /b
exists), includedPaths will not find it, and will expect nothing to be
copied. This does not match the actual copy behavior implemented in
fsutil, which will follow symlinks in prefix components of a given path,
so it can end up caching an empty result even though the copy will
produce a non-empty result, which is quite bad.

To fix this, use getFollowLinks to resolve the path before walking it.
In the wildcard case, this is done to the non-wildcard prefix of the
path (if any), which matches the behavior in fsutil.

Fixes the repro case here:
https://gist.github.com/aaronlehmann/64054c9a2cff0d27e200cc107bba3d69

Fixes moby#2300

Signed-off-by: Aaron Lehmann <[email protected]>
@aaronlehmann
Copy link
Collaborator Author

cc @coryb @tonistiigi

@aaronlehmann
Copy link
Collaborator Author

Any comments on this PR?

Copy link
Collaborator

@coryb coryb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

// components before the last component, so
// handle last component in d1 specially.
for {
v, ok := root.Get(k)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I get this. Isn't this supposed to walk per component? see getFollowLinksWalk()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a call to getFollowLinks below (line 568) which does the walk by component. This loop is just to handle the case where the final component is a symlink, which getFollowLinks does not handle.

I've added a test case for this specific scenario.

return "", nil, false, nil
}

k, cr, err := getFollowLinks(root, convertPathToKey([]byte(d1)), true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could use getFollowLinksWalk so you can reuse linkWalked without double counting.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to use getFollowLinksWalk.

@@ -570,8 +607,7 @@ func (cc *cacheContext) includedPaths(ctx context.Context, m *mount, p string, o
}
} else {
if !strings.HasPrefix(fn+"/", p+"/") {
k, _, kOk = iter.Next()
continue
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you explain this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the beginning, we seek to the first path with this prefix (line 541). If we reach a path that don't match the prefix, we can stop iterating, because we're now outside the directory of interest. The old code would keep iterating, but there was no reason to - I think this was left over from before we had that initial seek. So breaking here is an optimization.

Signed-off-by: Aaron Lehmann <[email protected]>
@aaronlehmann
Copy link
Collaborator Author

Thanks for the thorough review!

@tonistiigi tonistiigi merged commit f5eb400 into moby:master Sep 8, 2021
@aaronlehmann aaronlehmann deleted the follow-links-includedpaths branch September 8, 2021 18:30
@crazy-max crazy-max added this to the v0.10.0 milestone Feb 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Copying a symlinked wildcard directory leads to incorrect caching
4 participants