Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix handling of the volume directive #334

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion integration/dockerfiles/Dockerfile_test_volume
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
FROM gcr.io/google-appengine/debian9@sha256:1d6a9a6d106bd795098f60f4abb7083626354fa6735e81743c7f8cfca11259f0
RUN mkdir /foo
RUN echo "hello" > /foo/hey
VOLUME /foo/bar /tmp
VOLUME /foo/bar /tmp /qux/quux
ENV VOL /baz/bat
VOLUME ["${VOL}"]
RUN echo "hello again" > /tmp/hey
13 changes: 13 additions & 0 deletions integration/dockerfiles/Dockerfile_test_volume_2
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
FROM gcr.io/google-appengine/debian9@sha256:1d6a9a6d106bd795098f60f4abb7083626354fa6735e81743c7f8cfca11259f0
VOLUME /foo1
RUN echo "hello" > /foo1/hello
WORKDIR /foo1/bar
ADD context/foo /foo1/foo
COPY context/foo /foo1/foo2
RUN mkdir /bar1
VOLUME /foo2
VOLUME /foo3
RUN echo "bar2"
VOLUME /foo4
RUN mkdir /bar3
VOLUME /foo5
3 changes: 0 additions & 3 deletions integration/integration_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -213,9 +213,6 @@ func TestLayers(t *testing.T) {
offset := map[string]int{
"Dockerfile_test_add": 10,
"Dockerfile_test_scratch": 3,
// the Docker built image combined some of the dirs defined by separate VOLUME commands into one layer
// which is why this offset exists
"Dockerfile_test_volume": 1,
}
for dockerfile, built := range imageBuilder.FilesBuilt {
t.Run("test_layer_"+dockerfile, func(t *testing.T) {
Expand Down
4 changes: 2 additions & 2 deletions pkg/commands/volume.go
Original file line number Diff line number Diff line change
Expand Up @@ -48,15 +48,15 @@ func (v *VolumeCommand) ExecuteCommand(config *v1.Config, buildArgs *dockerfile.
for _, volume := range resolvedVolumes {
var x struct{}
existingVolumes[volume] = x
err := util.AddPathToVolumeWhitelist(volume)
err := util.AddVolumePathToWhitelist(volume)
if err != nil {
return err
}

// Only create and snapshot the dir if it didn't exist already
if _, err := os.Stat(volume); os.IsNotExist(err) {
logrus.Infof("Creating directory %s", volume)
v.snapshotFiles = []string{volume}
v.snapshotFiles = append(v.snapshotFiles, volume)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a comment about this change. This is a bug I found where it is not respecting multiple volumes created by the same VOLUME directive.

The Dockerfile_test_volume test did not catch it because tmp is already a directory and so v.snapshotFiles did not get overridden. I added an extra volume for the directive to create to provide test coverage.

VOLUME /foo/bar /tmp /qux/quux

if err := os.MkdirAll(volume, 0755); err != nil {
return fmt.Errorf("Could not create directory for volume %s: %s", volume, err)
}
Expand Down
3 changes: 3 additions & 0 deletions pkg/constants/constants.go
Original file line number Diff line number Diff line change
Expand Up @@ -62,6 +62,9 @@ const (
// Docker command names
Cmd = "cmd"
Entrypoint = "entrypoint"

// VolumeCmdName is the name of the volume command
VolumeCmdName = "volume"
)

// KanikoBuildFiles is the list of files required to build kaniko
Expand Down
11 changes: 10 additions & 1 deletion pkg/executor/build.go
Original file line number Diff line number Diff line change
Expand Up @@ -138,6 +138,7 @@ func (s *stageBuilder) build(opts *config.KanikoOptions) error {
if err := s.snapshotter.Init(); err != nil {
return err
}
var volumes []string
args := dockerfile.NewBuildArgs(opts.BuildArgs)
for index, cmd := range s.stage.Commands {
finalCmd := index == len(s.stage.Commands)-1
Expand Down Expand Up @@ -167,6 +168,10 @@ func (s *stageBuilder) build(opts *config.KanikoOptions) error {
return err
}
files := command.FilesToSnapshot()
if cmd.Name() == constants.VolumeCmdName {
volumes = append(volumes, files...)
continue
}
var contents []byte

// If this is an intermediate stage, we only snapshot for the last command and we
Expand All @@ -188,17 +193,21 @@ func (s *stageBuilder) build(opts *config.KanikoOptions) error {
// the files that were changed, we'll snapshot those explicitly, otherwise we'll
// check if anything in the filesystem changed.
if files != nil {
if len(files) > 0 {
files = append(files, volumes...)
volumes = []string{}
}
contents, err = s.snapshotter.TakeSnapshot(files)
} else {
contents, err = s.snapshotter.TakeSnapshotFS()
volumes = []string{}
}
}
}
if err != nil {
return fmt.Errorf("Error taking snapshot of files for command %s: %s", command, err)
}

util.MoveVolumeWhitelistToWhitelist()
if contents == nil {
logrus.Info("No files were changed, appending empty layer to config. No layer added to image.")
continue
Expand Down
41 changes: 4 additions & 37 deletions pkg/snapshot/snapshot.go
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,6 @@ import (
"path/filepath"
"syscall"

"github.com/GoogleContainerTools/kaniko/pkg/constants"
"github.com/GoogleContainerTools/kaniko/pkg/util"
"github.com/sirupsen/logrus"
)
Expand Down Expand Up @@ -84,21 +83,6 @@ func (s *Snapshotter) TakeSnapshotFS() ([]byte, error) {
return contents, err
}

func shouldSnapshot(file string, snapshottedFiles map[string]bool) (bool, error) {
if val, ok := snapshottedFiles[file]; ok && val {
return false, nil
}
whitelisted, err := util.CheckWhitelist(file)
if err != nil {
return false, fmt.Errorf("Error checking for %s in whitelist: %s", file, err)
}
if whitelisted && !isBuildFile(file) {
logrus.Infof("Not adding %s to layer, as it's whitelisted", file)
return false, nil
}
return true, nil
}

// snapshotFiles creates a snapshot (tar) and adds the specified files.
// It will not add files which are whitelisted.
func (s *Snapshotter) snapshotFiles(f io.Writer, files []string) (bool, error) {
Expand All @@ -123,11 +107,7 @@ func (s *Snapshotter) snapshotFiles(f io.Writer, files []string) (bool, error) {
}
for _, file := range parentDirs {
file = filepath.Clean(file)
shouldSnapshot, err := shouldSnapshot(file, snapshottedFiles)
if err != nil {
return false, fmt.Errorf("Error checking if parent dir %s can be snapshotted: %s", file, err)
}
if !shouldSnapshot {
if val, ok := snapshottedFiles[file]; ok && val {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove shouldSnapshot() and isBuildFile()? I think those functions are necessary so that kaniko is able to build itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the specific files cases (ADD, COPY, WORKDIR) that call snapshotFiles() it is correct that they are added to the snapshot even if one of their parent directories is a volume. That is the main reason I removed checking the whitelist, which is what shouldSnapshot() was doing. Now that we aren't checking the whitelist there is no need for isBuildFile() either. That function's job was to make sure that build files that were whitelisted were added to the snapshot anyway.

The key thing to consider about this change is, is it ok to not check the whitelist at all for specific files cases, directives ADD, COPY and WORKDIR? (VOLUME is included too, but only if it's snapshot with another specific file case and not a system-wide snapshot.) Up to now, Kaniko has prevented adding whitelisted files and directories with those directives. It doesn't seem to be necessary to me. The main purpose of the whitelist seems to be to ignore files and directories when there is no way to know if the file was mounted or came from the base image. If a Dockerfile is explicitly trying to add files and directories with ADD, COPY and WORKDIR, I don't think there is a reason to prevent that. Let me know if you think I've missed something!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peter-evans that makes sense, thanks for explaining!

continue
}
snapshottedFiles[file] = true
Expand All @@ -148,35 +128,22 @@ func (s *Snapshotter) snapshotFiles(f io.Writer, files []string) (bool, error) {
// Next add the files themselves to the tar
for _, file := range files {
file = filepath.Clean(file)
shouldSnapshot, err := shouldSnapshot(file, snapshottedFiles)
if err != nil {
return false, fmt.Errorf("Error checking if file %s can be snapshotted: %s", file, err)
}
if !shouldSnapshot {
if val, ok := snapshottedFiles[file]; ok && val {
continue
}
snapshottedFiles[file] = true

if err = s.l.Add(file); err != nil {
if err := s.l.Add(file); err != nil {
return false, fmt.Errorf("Unable to add file %s to layered map: %s", file, err)
}
if err = t.AddFileToTar(file); err != nil {
if err := t.AddFileToTar(file); err != nil {
return false, fmt.Errorf("Error adding file %s to tar: %s", file, err)
}
filesAdded = true
}
return filesAdded, nil
}

func isBuildFile(file string) bool {
for _, buildFile := range constants.KanikoBuildFiles {
if file == buildFile {
return true
}
}
return false
}

// shapShotFS creates a snapshot (tar) of all files in the system which are not
// whitelisted and which have changed.
func (s *Snapshotter) snapShotFS(f io.Writer) (bool, error) {
Expand Down
82 changes: 47 additions & 35 deletions pkg/util/fs_util.go
Original file line number Diff line number Diff line change
Expand Up @@ -34,17 +34,30 @@ import (
"github.com/sirupsen/logrus"
)

var whitelist = []string{
"/kaniko",
// /var/run is a special case. It's common to mount in /var/run/docker.sock or something similar
// which leads to a special mount on the /var/run/docker.sock file itself, but the directory to exist
// in the image with no way to tell if it came from the base image or not.
"/var/run",
// similarly, we whitelist /etc/mtab, since there is no way to know if the file was mounted or came
// from the base image
"/etc/mtab",
type WhitelistEntry struct {
Path string
PrefixMatchOnly bool
}

var whitelist = []WhitelistEntry{
{
Path: "/kaniko",
PrefixMatchOnly: false,
},
{
// /var/run is a special case. It's common to mount in /var/run/docker.sock or something similar
// which leads to a special mount on the /var/run/docker.sock file itself, but the directory to exist
// in the image with no way to tell if it came from the base image or not.
Path: "/var/run",
PrefixMatchOnly: false,
},
{
// similarly, we whitelist /etc/mtab, since there is no way to know if the file was mounted or came
// from the base image
Path: "/etc/mtab",
PrefixMatchOnly: false,
},
}
var volumeWhitelist = []string{}

// GetFSFromImage extracts the layers of img to root
// It returns a list of all files extracted
Expand Down Expand Up @@ -136,13 +149,13 @@ func DeleteFilesystem() error {
func ChildDirInWhitelist(path, directory string) bool {
for _, d := range constants.KanikoBuildFiles {
dirPath := filepath.Join(directory, d)
if HasFilepathPrefix(dirPath, path) {
if HasFilepathPrefix(dirPath, path, false) {
return true
}
}
for _, d := range whitelist {
dirPath := filepath.Join(directory, d)
if HasFilepathPrefix(dirPath, path) {
dirPath := filepath.Join(directory, d.Path)
if HasFilepathPrefix(dirPath, path, d.PrefixMatchOnly) {
return true
}
}
Expand Down Expand Up @@ -266,7 +279,7 @@ func CheckWhitelist(path string) (bool, error) {
return false, err
}
for _, wl := range whitelist {
if HasFilepathPrefix(abs, wl) {
if HasFilepathPrefix(abs, wl.Path, wl.PrefixMatchOnly) {
return true, nil
}
}
Expand All @@ -278,7 +291,7 @@ func checkWhitelistRoot(root string) bool {
return false
}
for _, wl := range whitelist {
if HasFilepathPrefix(root, wl) {
if HasFilepathPrefix(root, wl.Path, wl.PrefixMatchOnly) {
return true
}
}
Expand All @@ -291,7 +304,7 @@ func checkWhitelistRoot(root string) bool {
// (1)(2)(3) (4) (5) (6) (7) (8) (9) (10) (11)
// Where (5) is the mount point relative to the process's root
// From: https://www.kernel.org/doc/Documentation/filesystems/proc.txt
func fileSystemWhitelist(path string) ([]string, error) {
func fileSystemWhitelist(path string) ([]WhitelistEntry, error) {
f, err := os.Open(path)
if err != nil {
return nil, err
Expand All @@ -314,7 +327,10 @@ func fileSystemWhitelist(path string) ([]string, error) {
}
if lineArr[4] != constants.RootDir {
logrus.Debugf("Appending %s from line: %s", lineArr[4], line)
whitelist = append(whitelist, lineArr[4])
whitelist = append(whitelist, WhitelistEntry{
Path: lineArr[4],
PrefixMatchOnly: false,
})
}
if err == io.EOF {
logrus.Debugf("Reached end of file %s", path)
Expand All @@ -337,7 +353,7 @@ func RelativeFiles(fp string, root string) ([]string, error) {
if err != nil {
return err
}
if whitelisted && !HasFilepathPrefix(path, root) {
if whitelisted && !HasFilepathPrefix(path, root, false) {
return nil
}
if err != nil {
Expand Down Expand Up @@ -400,22 +416,15 @@ func CreateFile(path string, reader io.Reader, perm os.FileMode, uid uint32, gid
return dest.Chown(int(uid), int(gid))
}

// AddPathToVolumeWhitelist adds the given path to the volume whitelist
// It will get snapshotted when the VOLUME command is run then ignored
// for subsequent commands.
func AddPathToVolumeWhitelist(path string) error {
logrus.Infof("adding %s to volume whitelist", path)
volumeWhitelist = append(volumeWhitelist, path)
return nil
}

// MoveVolumeWhitelistToWhitelist copies over all directories that were volume mounted
// in this step to be whitelisted for all subsequent docker commands.
func MoveVolumeWhitelistToWhitelist() error {
if len(volumeWhitelist) > 0 {
whitelist = append(whitelist, volumeWhitelist...)
volumeWhitelist = []string{}
}
// AddVolumePathToWhitelist adds the given path to the whitelist with
// PrefixMatchOnly set to true. Snapshotting will ignore paths prefixed
// with the volume, but the volume itself will not be ignored.
func AddVolumePathToWhitelist(path string) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what case would we need to whitelist volumes? I'm not totally sure we need to whitelist them at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, volumes do need to be whitelisted. This is because any file-system changes inside a volume should not be added to the snapshot. Consider this example:

FROM gcr.io/google-appengine/debian9@sha256:1d6a9a6d106bd795098f60f4abb7083626354fa6735e81743c7f8cfca11259f0
VOLUME /foo
RUN echo "hello world" > /foo/hello

After the volume /foo is created any file-system changes within that volume should not be in the snapshot. The file hello will not be in the layer created at the RUN directive. However, the directory foo itself should be included in the snapshot. Due to this complication I felt that the best way to handle it was by being able to flag whitelist entries as PrefixMatchOnly=true. This only allows matches where the whitelisted directory is a prefix of the path being checked. If it's an exact match it will not be whitelisted. This allows the volume directories to be snapshot, but anything inside them to be whitelisted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After I wrote the above comment I went back to double check. Turns out it's more complicated still. See the following example built with Docker and the container-diff analysis.

FROM gcr.io/google-appengine/debian9@sha256:1d6a9a6d106bd795098f60f4abb7083626354fa6735e81743c7f8cfca11259f0
VOLUME /foo
RUN echo “hello world” > /foo/hello
COPY run.sh /foo/run.sh
VOLUME /bar
COPY run.sh /bar/run.sh
VOLUME /baz
ADD run.sh /baz/run.sh
WORKDIR /baz/bat
RUN echo “hello world” > /bar/hello
RUN echo “hello world” > /baz/hello
RUN echo “hello world” > /tmp/hello
Analysis for kaniko-test Layer 1:
FILE        SIZE
/foo        0

Analysis for kaniko-test Layer 2:
FILE               SIZE
/foo               24B
/foo/run.sh        24B

Analysis for kaniko-test Layer 3:
FILE               SIZE
/bar               24B
/bar/run.sh        24B

Analysis for kaniko-test Layer 4:
FILE               SIZE
/baz               24B
/baz/run.sh        24B

Analysis for kaniko-test Layer 5:
FILE            SIZE
/baz            0
/baz/bat        0

Analysis for kaniko-test Layer 6:
FILE              SIZE
/tmp              12B
/tmp/hello        12B

File-system changes within volumes are whitelisted during system-wide snapshots. None of the hello files created within volumes appear in the layers. The only hello file that does appear is contained in /tmp, which isn't a volume. However, directives that specify files (ADD/COPY/WORKDIR) override the whitelist.

At the moment, TakeSnapshot(files) is checking the whitelist. I will try removing it and improve the volume integration test to cover these cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed checking the whitelist for TakeSnapshot(files) and it now works as expected. I added to Dockerfile_test_volume_2 to provide coverage of these specific file directive cases.

logrus.Infof("adding volume %s to whitelist", path)
whitelist = append(whitelist, WhitelistEntry{
Path: path,
PrefixMatchOnly: true,
})
return nil
}

Expand Down Expand Up @@ -515,7 +524,7 @@ func CopyFile(src, dest string) error {
}

// HasFilepathPrefix checks if the given file path begins with prefix
func HasFilepathPrefix(path, prefix string) bool {
func HasFilepathPrefix(path, prefix string, prefixMatchOnly bool) bool {
path = filepath.Clean(path)
prefix = filepath.Clean(prefix)
pathArray := strings.Split(path, "/")
Expand All @@ -524,6 +533,9 @@ func HasFilepathPrefix(path, prefix string) bool {
if len(pathArray) < len(prefixArray) {
return false
}
if prefixMatchOnly && len(pathArray) == len(prefixArray) {
return false
}
for index := range prefixArray {
if prefixArray[index] == pathArray[index] {
continue
Expand Down
Loading