Multi-stage builds fail when run in kind #2164

Open
abayer opened this issue Jul 6, 2022 · 8 comments
Labels
area/multi-stage builds: issues related to kaniko multi-stage builds
kind/bug: Something isn't working
priority/p3: agreed that this would be good to have, but no one is available at the moment.
work-around-available

Comments

abayer commented Jul 6, 2022

Actual behavior
When trying to run a multi-stage build with Kaniko in a kind cluster, specifically a build of https://github.com/GoogleContainerTools/skaffold/blob/main/examples/microservices/leeroy-web/Dockerfile, it fails with:

error building image: deleting file system after stage 0: unlinkat //product_uuid: device or resource busy

EDIT: Interestingly, it seems that adding --ignore-path=/product_uuid to the Kaniko args gets rid of the error. I don't know if this is something specific to that file, since I have stumbled across a reference to the exact same error, including //product_uuid, at https://github.com/mattmoor/mink/blob/b9148a39b2d8bbc69ca9aaf5e89a7613c0b179d8/.github/workflows/minkind-cli.yaml#L150-L155.

Expected behavior
I expect multi-stage builds to succeed on kind.

To Reproduce
Steps to reproduce the behavior:

  1. Run a Kaniko build of https://github.com/GoogleContainerTools/skaffold/blob/main/examples/microservices/leeroy-web/Dockerfile in a kind cluster.

Additional Information

Triage Notes for the Maintainers

Description Yes/No
Please check if this is a new feature you are proposing
Please check if the build works in docker but not in kaniko
Please check if this error is seen when you use --cache flag
Please check if your dockerfile is a multistage dockerfile
abayer added a commit to abayer/tektoncd-pipeline that referenced this issue Jul 6, 2022
abayer added a commit to abayer/tektoncd-pipeline that referenced this issue Jul 7, 2022
The existing `test/e2e-tests-kind.env` is specifically for the `PipelineRun` approach. These new files are for running the e2e tests, via `kind`, in Prow.

There are four new env files - one for just the go e2e tests each for `stable` and `alpha`, and one for just the yaml tests each for `stable` and `alpha`.

Additionally, `examples/v1beta1/taskruns/git-volume.yaml` is moved to `examples/v1beta1/taskruns/no-ci/git-volume.yaml`. This is because Kind nodes don't have `git` installed, which is necessary for git volumes to work. Also, `--ignore-path=/product_uuid` has been added to the Kaniko args in `examples/v1beta1/pipelineruns/pipelinerun.yaml` and `test/yamls/v1beta1/pipelineruns/pipelinerun.yaml` to work around an issue with Kaniko multi-stage builds on Kind (GoogleContainerTools/kaniko#2164).

Signed-off-by: Andrew Bayer <[email protected]>
abayer added a commit to abayer/tektoncd-pipeline that referenced this issue Jul 14, 2022
tekton-robot pushed a commit to tektoncd/pipeline that referenced this issue Jul 14, 2022

gawsoftpl commented Aug 10, 2022

This error still appears when building a multi-stage Docker image in a kind Kubernetes cluster.

kaniko INFO[0113] Deleting filesystem...                                                                              
kaniko error building image: deleting file system after stage 0: unlinkat //product_uuid: device or resource busy

Contributor

pbarker commented Sep 14, 2022

Also seeing this same error when building in kind, with latest


iamyeka commented Jan 10, 2023

Same issue


ducanhkl commented Jan 27, 2023

Same issue with kind and Kaniko.
My Dockerfile:

FROM maven:3.8.7-openjdk-18 AS builder

COPY src /usr/src/app/src
COPY pom.xml /usr/src/app

RUN mvn -f /usr/src/app/pom.xml clean package

FROM openjdk:18-jdk-alpine3.15

COPY --from=builder /usr/src/app/target/app.jar /usr/app/application.jar

ENTRYPOINT ["java", "-jar", "/usr/app/application.jar"]

Error:

INFO[0239] Taking snapshot of full filesystem...
INFO[0240] Saving file usr/src/app/target/app-0.0.1-SNAPSHOT.jar for later use 
INFO[0240] Deleting filesystem...                       
error building image: deleting file system after stage 0: unlinkat //product_uuid: device or resource busy
time="2023-01-27T10:41:03.882Z" level=info msg="sub-process exited" argo=true error="<nil>"
Error: exit status 1

But it works fine if I run Kaniko in Docker rather than in kind.

gawsoftpl commented

I fixed it by adding ignorePaths to the Skaffold YAML:

ignorePaths:
  - /product_uuid

For example, in an artifact entry:

  - image: job-runner
    context: ../../
    kaniko:
      cache: {}
      ignorePaths:
        - /product_uuid
      dockerfile: microservices/job-runner/docker/Dockerfile

For the plain Kaniko CLI, pass the argument:
--ignore-path=/product_uuid
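
For anyone generating the executor arguments programmatically, the workaround above can be sketched as a small helper. This is only an illustration: the function name `kaniko_args`, the example Dockerfile and context values, and the choice of `--no-push` (to keep the sketch free of any registry) are assumptions, not part of the original report. Only `--ignore-path=/product_uuid` comes from this thread.

```python
from typing import Iterable, List


def kaniko_args(dockerfile: str, context: str, ignore_paths: Iterable[str]) -> List[str]:
    """Build an argument list for the Kaniko executor (hypothetical helper).

    Each ignored path becomes its own --ignore-path flag, since the flag
    can be given multiple times.
    """
    args = [
        f"--dockerfile={dockerfile}",
        f"--context={context}",
        "--no-push",  # assumption: build only, do not push anywhere
    ]
    args += [f"--ignore-path={path}" for path in ignore_paths]
    return args


# Example values are placeholders for illustration only.
example = kaniko_args("Dockerfile", "dir:///workspace", ["/product_uuid"])
```

The resulting list can be dropped into the `args` of whatever container spec runs the executor.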

@aaron-prindle aaron-prindle added work-around-available area/multi-stage builds issues related to kaniko multi-stage builds kind/bug Something isn't working priority/p3 agreed that this would be good to have, but no one is available at the moment. priority/p2 High impact feature/bug. Will get a lot of users happy and removed priority/p2 High impact feature/bug. Will get a lot of users happy labels May 31, 2023

FrancisRussell commented Dec 8, 2023

So I've hit this, and for reliability reasons it was necessary for me to investigate what the cause was. I do not fully understand all the components involved, but I have a working theory.

To my understanding, Kaniko needs to be able to function without the ability to chroot. This means that in order to execute Dockerfile commands, it needs to obliterate the root filesystem of the container it's running in. As a result, the only places Kaniko can hold state are either in the memory of the build process or, potentially, in a filesystem location that is unlikely to conflict with the image being built and is also exempt from snapshotting.

Looking at kind, it seems that it wants to provide its own versions of the following files:

/sys/class/dmi/id/product_name
/sys/class/dmi/id/product_uuid
/sys/devices/virtual/dmi/id/product_uuid

However, rather than bind-mounting them to individual instances, these are copied into the root filesystem of the container and then bind-mounted from there. This is mostly inferred from https://github.com/kubernetes-sigs/kind/blob/main/images/base/files/kind/bin/mount-product-files.sh. When Kaniko clears the root filesystem, it attempts to delete these files. The bind mounting may explain why they cannot be deleted, given that Linux is usually very forgiving about deleting files that are in use.

From this I conclude that to ensure Kaniko works correctly, you probably want to ignore both /product_uuid and /product_name. It's unclear to me why the presence of /product_name hasn't been observed to be an issue.
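
One way to test this theory inside the build container is to look for mount entries that reference these files. The sketch below parses text in the `/proc/self/mountinfo` format and reports entries whose root or mount point ends in a given file name. The names `MountEntry`, `parse_mountinfo`, and `entries_mentioning` are made up for this illustration, and the parser ignores the octal escaping that mountinfo uses for unusual characters in paths. It is a diagnostic aid only and does not confirm the cause of the error.

```python
from typing import Iterable, List, NamedTuple


class MountEntry(NamedTuple):
    root: str         # path within the source filesystem that is mounted
    mount_point: str  # where the mount appears in this mount namespace


def parse_mountinfo(text: str) -> List[MountEntry]:
    """Extract (root, mount point) pairs from mountinfo-formatted text.

    In each mountinfo line, the fourth field is the root of the mount
    within its filesystem and the fifth field is the mount point.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        entries.append(MountEntry(root=fields[3], mount_point=fields[4]))
    return entries


def entries_mentioning(entries: Iterable[MountEntry], names: Iterable[str]) -> List[MountEntry]:
    """Return entries whose root or mount point ends in one of the file names."""
    suffixes = tuple("/" + name for name in names)
    return [
        entry
        for entry in entries
        if entry.root.endswith(suffixes) or entry.mount_point.endswith(suffixes)
    ]
```

Inside a pod, one could read `/proc/self/mountinfo` and call `entries_mentioning(parse_mountinfo(text), ["product_uuid", "product_name"])` to see whether either file is involved in a mount.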

Oddly, I've only seen this issue in CI and not on a local kind deployment. One factor appears to be the host system. Although product_uuid and product_name are always copied in, they are only bind-mounted if the host's /sys defines these paths, which in turn may depend on such things as the host's firmware, since they are derived from DMI information. On my local Linux machine, only /sys/class/dmi/id/product_name is present; /sys/class/dmi/id/product_uuid and /sys/devices/virtual/dmi/id/product_uuid are not, so product_uuid is never bind-mounted. I do not yet have an explanation for why Kaniko does not fail for me locally when deleting /product_name. After opening a root shell in a kind pod on my local machine, I also had no issue deleting /product_name.
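
To compare hosts (for example, a CI runner against a local machine), a quick check of which of the three DMI paths listed earlier exist may help. This is a minimal sketch: the function name `dmi_presence` and the injectable `exists` parameter are assumptions added for illustration, and the path list is taken from this comment.

```python
import os
from typing import Callable, Dict, Iterable

# Paths named earlier in this comment.
DMI_PATHS = (
    "/sys/class/dmi/id/product_name",
    "/sys/class/dmi/id/product_uuid",
    "/sys/devices/virtual/dmi/id/product_uuid",
)


def dmi_presence(
    paths: Iterable[str] = DMI_PATHS,
    exists: Callable[[str], bool] = os.path.exists,
) -> Dict[str, bool]:
    """Map each DMI path to whether it exists on this host."""
    return {path: exists(path) for path in paths}
```

Running `dmi_presence()` on the host that runs the kind node shows which files are present there and therefore, per the reasoning above, which ones kind would be expected to bind-mount.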


jandry commented Dec 18, 2023

I get exactly the same issue, but it's intermittent. I'm using version v1.12.1.

But I can't ignore the path in my case; I want to export it in the second image.


gba-foundever commented May 24, 2024

Seems similar to #1697 (but there is no solution in that thread either).
