-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Dockerfiles OCI compliant #408
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
auto: Download container is switched to alpine. The git container specified the /git directory as a volume. As such, all the files under /git would be lost after each script invoke. Alpine is used later in the build process anyway, so it shouldn't be any extra cost to switch to it
This behavior is buildah-specific?
docker-compose: I added selinux labeling. I tested this with real docker (not just podman!) and it seems to work fine. Though I suggest you try it too.
Will try!
Note: for caching to work properly, you still need to replace /root/.cache/pip with /root/.cache/pip,Z on selinux systems.
if this works on all other systems, then we can have it as default.
docker-compose.yml
Outdated
security_opt: | ||
- label=type:nvidia_container_t | ||
runtime: nvidia |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what would happen here if the machine does not have a gpu?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic question! I have another machine I can test with that doesn't have an nvidia gpu, I'll report back later today when I get some time to test. My initial thought is to move this statement to the gpu only ones if it explodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AbdBarho Yes, thank you for having me check this! Podman on my non-nvidia tablet complains that it's missing a security context label, so it shouldn't be here by default. Perhaps it's best left commented out? I'm fine with removing it from here and adding it to documentation too.
I didn't try it with Docker on the tablet, but the fact that I'm having issues with Podman makes me think it would also have issues.
EDIT: Can't just move it to the GPU ones since the CPU ones inherit from it. And interestingly enough, the runtime: nvidia
doesn't seem to have any effects, just the security_opt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we can use yaml anchors just like the base service?
x-gpu-config: &gpu_config
security_opt:
- label=type:nvidia_container_t
runtime: nvidia
services:
auto:
<<: *base_service
<<: *gpu_config
# ...
I should also test this setup as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AbdBarho I could do that too. I'd have to make auto-cpu be the "base" and then make auto extend it, otherwise both will get the label. Should I comment out the <<: *gpu_config
and let people with systems that need it to un-comment it themselves?
(I also don't want this to trip us up. I can remove it for now and if you feel like it'd be useful, it can be put in later)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, many users of this repo are not very tech savy, and I prefer to not have them fiddle around with the files
but I agree what I wrote is incorrect, this label should only exist on the GPU one...
we can swap the configs around, i.e. let the gpu inherit from cpu if possible.
worst case, we copy the config, I don't mind it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the gpu_config. Please do test!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That does not seem to work, at least on windows:
Error response from daemon: Unknown runtime specified nvidia
maybe we can move this problem to another issue / pr so we can finish this.
|
||
SHELL ["/bin/sh", "-ceuxo", "pipefail"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not possible to have shell config at all? no other way to define it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
podman will ignore it completely, at least it does for me.
With that said, I think changing the statement to && should cover e
, docker/podman already print the full RUN command, so that should cover x
.
For pipefail and u
, I think I can only suggest making complicated RUN command a separate script, like the clone.sh
one.
It looks like it is. It seems like Docker used to behave this way too, but something might have changed. This is what lead me to this head-scratching bug: https://stackoverflow.com/a/47575197 If you are set on using the Git container, I can just change the path the script points to
It does not, Docker throws a fit on it. I've been using a script to sed/replace the |
Latest commit:
|
@RedTopper I changed some final stuff so we can merge this:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you again
We can create other PRs or issues for the remaining topics
## Justification Closes issue AbdBarho#352 This update makes the Dockerfiles OCI compliant, making it easier to use Buildah or other image building techniques that require it ## Implementation This changes a few things, listed below: * auto: Download container is switched to alpine. The `git` container specified the `/git` directory as a volume. As such, all the files under `/git` would be lost after each script invoke. Alpine is used later in the build process anyway, so it shouldn't be any extra cost to switch to it * auto: "New" clone.sh script is copied into the container, which is basically just the previous clone script that was embedded in the Dockerfile. * all: `<<EOF` heredoc styles have been switched to `&& \` * all: I added NVIDIA_DRIVER_CAPABILITIES and NVIDIA_VISIBLE_DEVICES to expose my Nvidia card. This is most likely a selinux/podman problem, but shouldn't change anything with docker to add it. * docker-compose: I added selinux labeling. I tested this with real docker (not just podman!) and it seems to work fine. Though I suggest you try it too. ## Testing Locally builds with buildah. Note: for caching to work properly, you still need to replace `/root/.cache/pip` with `/root/.cache/pip,Z` on selinux systems. Note: I was having some trouble running invoke. Thought it was this PR, but it's a known issue. See invoke-ai/InvokeAI#3182 --------- Co-authored-by: AbdBarho <[email protected]>
## Justification Closes issue AbdBarho#352 This update makes the Dockerfiles OCI compliant, making it easier to use Buildah or other image building techniques that require it ## Implementation This changes a few things, listed below: * auto: Download container is switched to alpine. The `git` container specified the `/git` directory as a volume. As such, all the files under `/git` would be lost after each script invoke. Alpine is used later in the build process anyway, so it shouldn't be any extra cost to switch to it * auto: "New" clone.sh script is copied into the container, which is basically just the previous clone script that was embedded in the Dockerfile. * all: `<<EOF` heredoc styles have been switched to `&& \` * all: I added NVIDIA_DRIVER_CAPABILITIES and NVIDIA_VISIBLE_DEVICES to expose my Nvidia card. This is most likely a selinux/podman problem, but shouldn't change anything with docker to add it. * docker-compose: I added selinux labeling. I tested this with real docker (not just podman!) and it seems to work fine. Though I suggest you try it too. ## Testing Locally builds with buildah. Note: for caching to work properly, you still need to replace `/root/.cache/pip` with `/root/.cache/pip,Z` on selinux systems. Note: I was having some trouble running invoke. Thought it was this PR, but it's a known issue. See invoke-ai/InvokeAI#3182 --------- Co-authored-by: AbdBarho <[email protected]>
Justification
Closes issue #352
This update makes the Dockerfiles OCI compliant, making it easier to use Buildah or other image building techniques that require it
Implementation
This changes a few things, listed below:
git
container specified the/git
directory as a volume. As such, all the files under/git
would be lost after each script invoke. Alpine is used later in the build process anyway, so it shouldn't be any extra cost to switch to it<<EOF
heredoc styles have been switched to&& \
Testing
Locally builds with buildah.
Note: for caching to work properly, you still need to replace
/root/.cache/pip
with/root/.cache/pip,Z
on selinux systems.Note: I was having some trouble running invoke. Thought it was this PR, but it's a known issue. See invoke-ai/InvokeAI#3182