fix(WCOW): fix file access failure for multistage builds #5289

@billywr billywr commented Sep 2, 2024

fix (WCOW) file access failure for multistage builds
Fixes: #5193

The failure occurs on Windows client SKUs (e.g., Windows 11). The same builds generally work on server SKUs such as WS2022, although similar failures can occur there in some cases.

@billywr billywr changed the title Fix WCOW COPY --from failure in multistage builds on Windows Fix COPY --from failure in multistage builds (WCOW) Sep 2, 2024
@billywr billywr changed the title Fix COPY --from failure in multistage builds (WCOW) Fix: cache/contenthash/checksum.go: fix failure in multistage builds (WCOW) Sep 2, 2024
Collaborator

@profnandaa profnandaa left a comment

And fix the CI build failures.

Can you add an explanation in the commit body and PR description that this failure is only observed on Windows client SKUs (e.g. Windows 11), but works okay on server SKUs, e.g. WS2022? I did, however, come across a similar failure on WS2022 too, in a different scenario. I'll need to check my notes and send them to you.

Review thread on cache/contenthash/checksum.go (outdated, resolved)
Collaborator

@profnandaa profnandaa left a comment

as per the comments above.

@billywr billywr force-pushed the wcow-copy-from-multistage-builds-failure branch 3 times, most recently from b77425f to 3c42087 Compare September 3, 2024 12:15
@billywr billywr force-pushed the wcow-copy-from-multistage-builds-failure branch from 3c42087 to f2af587 Compare September 3, 2024 12:50
@billywr billywr marked this pull request as ready for review September 4, 2024 07:03
@billywr billywr force-pushed the wcow-copy-from-multistage-builds-failure branch from f2af587 to d60c5c4 Compare September 4, 2024 07:29
@billywr billywr marked this pull request as draft September 4, 2024 08:56
Author

billywr commented Sep 4, 2024

WIP: working on a way to avoid lots of code duplication

@billywr billywr force-pushed the wcow-copy-from-multistage-builds-failure branch 6 times, most recently from 7b29abb to 518e948 Compare September 4, 2024 13:10
@profnandaa
Collaborator

> And fix the CI build failures.
>
> Can you add an explanation in the commit body and PR description that this failure is only observed on Windows client SKUs (e.g. Windows 11), but works okay on server SKUs, e.g. WS2022? I did, however, come across a similar failure on WS2022 too, in a different scenario. I'll need to check my notes and send them to you.

I could have been mistaken; I have not found it. I did find this, which I had noted as a repro, but I ran it on WS2022 and it runs OK. Can you just counter-check on Win11 with your fix:

FROM mcr.microsoft.com/windows/nanoserver:ltsc2022 as base
USER ContainerAdministrator
RUN echo aa> /foo
WORKDIR /bar

FROM base as base1
COPY hello.txt .

FROM base as base2
COPY --from=base1 /bar/hello.txt .
RUN exit 0

FROM mcr.microsoft.com/windows/nanoserver:ltsc2022
COPY --from=base2 /foo /f

@billywr billywr changed the title Fix: cache/contenthash/checksum.go: fix failure in multistage builds (WCOW) fix(WCOW): fix file access failure for multistage builds Sep 5, 2024
@billywr billywr force-pushed the wcow-copy-from-multistage-builds-failure branch 4 times, most recently from f6b6486 to 8fac5d2 Compare September 5, 2024 06:51
@billywr billywr marked this pull request as ready for review September 5, 2024 07:40
@billywr billywr marked this pull request as draft September 5, 2024 07:40
@billywr billywr force-pushed the wcow-copy-from-multistage-builds-failure branch from 8fac5d2 to 790713c Compare September 5, 2024 07:42
@billywr billywr marked this pull request as ready for review September 5, 2024 08:28
Collaborator

@profnandaa profnandaa left a comment

Just that minor nitpick, all else LGTM.

Review thread on cache/contenthash/checksum_windows.go (outdated, resolved)
@billywr billywr force-pushed the wcow-copy-from-multistage-builds-failure branch from 790713c to c2a772b Compare September 9, 2024 18:06
@billywr billywr force-pushed the wcow-copy-from-multistage-builds-failure branch 3 times, most recently from 065deca to 0e9a71f Compare September 24, 2024 05:29
@billywr billywr force-pushed the wcow-copy-from-multistage-builds-failure branch from 0e9a71f to 6e35a7a Compare September 26, 2024 06:49
// elevating the admin privileges to walk special files/directory
// like `System Volume Information`, etc. See similar in #4994
privileges := []string{winio.SeBackupPrivilege}
return winio.RunWithPrivileges(privileges, func() error {
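
For readers following the thread, here is a minimal, portable sketch of the pattern the hunk above relies on: the directory walk runs inside a callback, so that a platform-specific wrapper can enable privileges for the duration of the traversal. The names `runPrivileged` and `walkTree` are hypothetical and not taken from the PR; the default hook simply calls the callback, and a Windows-only file could instead delegate to `winio.RunWithPrivileges` as in the snippet under review.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// runPrivileged is a hypothetical hook (an assumption, not the PR's code).
// This portable default just invokes fn. On Windows it could be replaced by
// a wrapper around winio.RunWithPrivileges.
var runPrivileged = func(privileges []string, fn func() error) error {
	return fn()
}

// walkTree walks root inside the hook, so whatever the hook enables is in
// effect for the whole traversal and can be dropped when the walk returns.
func walkTree(root string, fn fs.WalkDirFunc) error {
	// The privilege name is passed as a plain string here to keep the
	// sketch free of Windows-only imports.
	return runPrivileged([]string{"SeBackupPrivilege"}, func() error {
		return filepath.WalkDir(root, fn)
	})
}

func main() {
	// Build a small throwaway tree to walk.
	dir, err := os.MkdirTemp("", "walk-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "hello.txt"), []byte("hi"), 0o644); err != nil {
		panic(err)
	}

	err = walkTree(dir, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err // surface access errors instead of hiding them
		}
		rel, relErr := filepath.Rel(dir, path)
		if relErr != nil {
			return relErr
		}
		fmt.Println(rel)
		return nil
	})
	if err != nil {
		panic(err)
	}
}
```

Keeping the hook as the only platform-dependent piece is one possible way to avoid duplicating the walk logic per OS; the PR itself may structure this differently.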
Collaborator

If I remember correctly, there was a list of paths that should be excluded: a number of metadata files, some of which (if not all) are listed here:

https://github.com/containerd/continuity/blob/50fa7de4fc5d1529fed1c4d6e3efad231bf5a232/fs/fstest/compare_windows.go#L19

Collaborator

I opened an issue for that some time back (#5011). I noticed that the list keeps growing, and maintaining a whitelist is going to be a chase game. However, we can rethink it.

Author

@gabriel-samfira Can we proceed with this as is and handle the whitelisting part in the other open issue? This is currently blocking some integration tests for WCOW.

Collaborator

Apologies for the delay. I completely missed your replies.

@profnandaa The best source of truth for that list of files is probably the server team at MSFT (you should be able to ping them). We need to mirror what they say we should exclude. But we do need to exclude those files. In some cases we can't walk them (and we shouldn't) even with elevated privileges.

Collaborator

Well noted, will follow up. Thanks!
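
The exclusion approach discussed in this thread could look roughly like the sketch below: skip a fixed set of top-level metadata entries while walking, rather than trying to read them. This is an illustration only. The helper names `shouldSkip` and `walkExcluding` are hypothetical, and the entries in `excludedNames` are assumptions for the example (`System Volume Information` is the one named in the diff comment); the authoritative list would have to come from the sources mentioned above.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// excludedNames is an illustrative list only (an assumption, not an
// authoritative set of Windows metadata paths).
var excludedNames = []string{
	"System Volume Information",
	"WcSandboxState",
}

// shouldSkip reports whether rel, a path relative to the walk root, names
// one of the excluded top-level entries. The comparison ignores case
// because Windows file names are case-insensitive.
func shouldSkip(rel string) bool {
	for _, name := range excludedNames {
		if strings.EqualFold(rel, name) {
			return true
		}
	}
	return false
}

// walkExcluding walks root and calls visit for every entry that is not
// excluded. Excluded directories are never descended into.
func walkExcluding(root string, visit func(rel string)) error {
	return filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		rel, relErr := filepath.Rel(root, path)
		if relErr != nil {
			return relErr
		}
		if shouldSkip(rel) {
			if d != nil && d.IsDir() {
				return filepath.SkipDir // do not read the directory at all
			}
			return nil // excluded plain file: ignore it and continue
		}
		if err != nil {
			return err
		}
		visit(rel)
		return nil
	})
}

func main() {
	// Build a throwaway tree containing one excluded directory.
	dir, err := os.MkdirTemp("", "exclude-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	meta := filepath.Join(dir, "System Volume Information")
	if err := os.Mkdir(meta, 0o755); err != nil {
		panic(err)
	}
	if err := os.WriteFile(filepath.Join(meta, "tracking.log"), nil, 0o644); err != nil {
		panic(err)
	}
	if err := os.WriteFile(filepath.Join(dir, "hello.txt"), []byte("hi"), 0o644); err != nil {
		panic(err)
	}

	if err := walkExcluding(dir, func(rel string) { fmt.Println(rel) }); err != nil {
		panic(err)
	}
}
```

Matching only on the path relative to the root keeps a user directory that happens to share one of these names deeper in the tree from being skipped. Whether that is the right scope is a design question for the follow-up issue.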

@gabriel-samfira gabriel-samfira merged commit c1dacbc into moby:master Oct 7, 2024
91 checks passed
Successfully merging this pull request may close these issues.

WCOW: COPY --from failing with failed to walk error on Windows 11
5 participants