Create pods and containers such that generated systemd unit files are cohesive #443

Closed
jayache80 opened this issue Mar 1, 2022 · 13 comments
Labels
bug Something isn't working

@jayache80

jayache80 commented Mar 1, 2022

Describe the bug

Somehow, generated systemd unit files from a pod and containers using podman-compose are not cohesive; if I start the pod service, all containers will start, but if I stop the pod service, not all containers stop and the pod is left in a "degraded" state.

To Reproduce

Create a compose file called foo.yaml with these contents:

version: "2.1"
services:
  app:
    image: docker.io/python
    container_name: app
    ports:
      - 5000:5000
  web:
    image: docker.io/nginx
    container_name: web
    ports:
      - 8080:80

Then modify the podman-compose script so that pods are created with --infra=true. Next, create the pod and containers and generate the systemd unit files with:

podman-compose -f foo.yaml -p foo up -d
podman pod stop pod_foo
podman generate systemd pod_foo --files --name

Install the unit files to ~/.config/systemd/user/ and start the pod with:

systemctl --user start pod-pod_foo.service

and stop the pod with:

systemctl --user stop pod-pod_foo.service

Expected behavior

systemctl --user stop pod-pod_foo.service will stop the pod and all containers therein.

Actual behavior

systemctl --user stop pod-pod_foo.service only stops the infra container, and all other containers are left running and the pod is in a "degraded" state.

Output

For reference, the same deployment can be created with only podman commands using the following:

# Create pod
podman pod create --infra=true --name foo

# Create containers and run them in the pod
podman run --name app -dt --pod foo docker.io/python
podman run --name web -dt --pod foo docker.io/nginx

# The pod will be running now, so let's stop it:
podman pod stop foo

# Generate systemd unit files
podman generate systemd foo --files --name

That generates systemd unit files that work as expected, and look like this:

# pod-foo.service
# autogenerated by Podman 3.4.4
# Mon Feb 28 20:40:12 PST 2022

[Unit]
Description=Podman pod-foo.service
Documentation=man:podman-generate-systemd(1)
Wants=network-online.target
After=network-online.target
RequiresMountsFor=
Requires=container-app.service container-web.service
Before=container-app.service container-web.service

[Service]
Environment=PODMAN_SYSTEMD_UNIT=%n
Restart=on-failure
TimeoutStopSec=70
ExecStart=/usr/bin/podman start f1292dabc396-infra
ExecStop=/usr/bin/podman stop -t 10 f1292dabc396-infra
ExecStopPost=/usr/bin/podman stop -t 10 f1292dabc396-infra
PIDFile=/run/user/1000/containers/overlay-containers/e24a0d50b5666215946963a9219bb35b4f26f1ae75c1d09c151ea381f0155eab/userdata/conmon.pid
Type=forking

[Install]
WantedBy=default.target
# container-app.service
# autogenerated by Podman 3.4.4
# Mon Feb 28 20:40:12 PST 2022

[Unit]
Description=Podman container-app.service
Documentation=man:podman-generate-systemd(1)
Wants=network-online.target
After=network-online.target
RequiresMountsFor=/run/user/1000/containers
BindsTo=pod-foo.service
After=pod-foo.service

[Service]
Environment=PODMAN_SYSTEMD_UNIT=%n
Restart=on-failure
TimeoutStopSec=70
ExecStart=/usr/bin/podman start app
ExecStop=/usr/bin/podman stop -t 10 app
ExecStopPost=/usr/bin/podman stop -t 10 app
PIDFile=/run/user/1000/containers/overlay-containers/6ed0b973ec269542c7376eca7e302679f565cb292d703525b70884bf74513ada/userdata/conmon.pid
Type=forking

[Install]
WantedBy=default.target
# container-web.service
# autogenerated by Podman 3.4.4
# Mon Feb 28 20:40:12 PST 2022

[Unit]
Description=Podman container-web.service
Documentation=man:podman-generate-systemd(1)
Wants=network-online.target
After=network-online.target
RequiresMountsFor=/run/user/1000/containers
BindsTo=pod-foo.service
After=pod-foo.service

[Service]
Environment=PODMAN_SYSTEMD_UNIT=%n
Restart=on-failure
TimeoutStopSec=70
ExecStart=/usr/bin/podman start web
ExecStop=/usr/bin/podman stop -t 10 web
ExecStopPost=/usr/bin/podman stop -t 10 web
PIDFile=/run/user/1000/containers/overlay-containers/1c36d668d474366f427ccde2b344728c63bf1dc9d68bf30af34dabb2fe82ca1d/userdata/conmon.pid
Type=forking

[Install]
WantedBy=default.target

Notice that the units for the containers have BindsTo and After referencing pod-foo.service.
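One quick way to compare the two sets of generated files is to list the container units that have no BindsTo= line pointing at the pod unit. A minimal sketch, assuming the generated files sit in one directory and follow the container-*.service naming shown in this report; units_missing_bindsto is a made-up helper name, not a podman command:

```shell
# Print every container-*.service file in a directory that has no
# BindsTo=<pod unit> line, i.e. a unit that is not tied to the pod unit.
# units_missing_bindsto is a hypothetical helper name, not part of podman.
units_missing_bindsto() {
    dir="$1"       # directory holding the generated unit files
    pod_unit="$2"  # e.g. pod-pod_foo.service
    for unit in "$dir"/container-*.service; do
        [ -e "$unit" ] || continue
        # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
        grep -qxF "BindsTo=$pod_unit" "$unit" || printf '%s\n' "$unit"
    done
}
```

Called as units_missing_bindsto ~/.config/systemd/user pod-pod_foo.service, it should print both container units for the podman-compose case in this report and nothing for the raw podman case (with pod-foo.service as the second argument).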

However, with the podman-compose reproduction steps above, I get systemd unit files that look like this:

# pod-pod_foo.service
# autogenerated by Podman 3.4.4
# Mon Feb 28 21:15:15 PST 2022

[Unit]
Description=Podman pod-pod_foo.service
Documentation=man:podman-generate-systemd(1)
Wants=network-online.target
After=network-online.target
RequiresMountsFor=
Requires=container-app.service container-web.service
Before=container-app.service container-web.service

[Service]
Environment=PODMAN_SYSTEMD_UNIT=%n
Restart=on-failure
TimeoutStopSec=70
ExecStart=/usr/bin/podman start caf63e041bfc-infra
ExecStop=/usr/bin/podman stop -t 10 caf63e041bfc-infra
ExecStopPost=/usr/bin/podman stop -t 10 caf63e041bfc-infra
PIDFile=/run/user/1000/containers/overlay-containers/3e02551e71997ec7739d62405961a6761bbad34e73e7a348f6434330bd5a58bd/userdata/conmon.pid
Type=forking

[Install]
WantedBy=default.target
# container-app.service
# autogenerated by Podman 3.4.4
# Mon Feb 28 21:15:15 PST 2022

[Unit]
Description=Podman container-app.service
Documentation=man:podman-generate-systemd(1)
Wants=network-online.target
After=network-online.target
RequiresMountsFor=/run/user/1000/containers

[Service]
Environment=PODMAN_SYSTEMD_UNIT=%n
Restart=on-failure
TimeoutStopSec=70
ExecStart=/usr/bin/podman start app
ExecStop=/usr/bin/podman stop -t 10 app
ExecStopPost=/usr/bin/podman stop -t 10 app
PIDFile=/run/user/1000/containers/overlay-containers/1ab907341ee16531bace29b1ba64031f31eae0b21edbd9b60c5be6eb0af43c42/userdata/conmon.pid
Type=forking

[Install]
WantedBy=default.target
# container-web.service
# autogenerated by Podman 3.4.4
# Mon Feb 28 21:15:15 PST 2022

[Unit]
Description=Podman container-web.service
Documentation=man:podman-generate-systemd(1)
Wants=network-online.target
After=network-online.target
RequiresMountsFor=/run/user/1000/containers

[Service]
Environment=PODMAN_SYSTEMD_UNIT=%n
Restart=on-failure
TimeoutStopSec=70
ExecStart=/usr/bin/podman start web
ExecStop=/usr/bin/podman stop -t 10 web
ExecStopPost=/usr/bin/podman stop -t 10 web
PIDFile=/run/user/1000/containers/overlay-containers/116a9f7f05750d890dbff368dcfe3e28b0157c848ca9cc42df925d35c9954652/userdata/conmon.pid
Type=forking

[Install]
WantedBy=default.target
$ podman-compose version
podman-compose version: 1.0.4
['podman', '--version', '']
using podman version: 3.4.4
podman-composer version 1.0.4
podman --version
podman version 3.4.4
exit code: 0

(Using latest master)

Environment:

  • OS: Arch Linux

Additional context

containers/podman#13368 (comment)

@jayache80 jayache80 added the bug Something isn't working label Mar 1, 2022
@muayyad-alsadi
Collaborator

I have posted on the mailing list about this.

One suggestion was to use ExecStart=podman pod wait cnt1 cnt2 ...
in the systemd unit, something like this:

ExecStartPre=/bin/podman-compose up --no-start
ExecStartPre=/bin/podman pod start pod_mypod
ExecStart=/bin/podman pod wait cnt1 cnt2 ...

without any systemd unit for each container.

Another suggestion is to use the podman generate systemd foo --files --name approach above,
with the following modification:

@rhatdan we can use systemd's PartOf= on the container and the implied ConsistsOf= on the pod

       PartOf=
           Configures dependencies similar to Requires=, but limited to stopping and restarting of units. When systemd stops or restarts the units listed here, the action is propagated to this unit. Note that this is a one-way
           dependency — changes to this unit do not affect the listed units.

           When PartOf=b.service is used on a.service, this dependency will show as ConsistsOf=a.service in property listing of b.service.  ConsistsOf= dependency cannot be specified directly.

We might also use BindsTo= (I'm not sure whether on the container unit or on the pod unit).
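The PartOf= idea can be tried without regenerating anything, by adding a systemd drop-in to each container unit. This is only a sketch under the unit names used in this issue, and it has not been verified to cure the degraded state here; add_partof_dropin and the file name 10-partof.conf are made up for illustration:

```shell
# Write a drop-in that makes a container unit PartOf= the pod unit, so that
# stopping or restarting the pod unit propagates to the container unit.
# add_partof_dropin and 10-partof.conf are hypothetical names.
add_partof_dropin() {
    unit_dir="$1"        # e.g. ~/.config/systemd/user
    container_unit="$2"  # e.g. container-app.service
    pod_unit="$3"        # e.g. pod-pod_foo.service
    mkdir -p "$unit_dir/$container_unit.d" || return 1
    printf '[Unit]\nPartOf=%s\n' "$pod_unit" \
        > "$unit_dir/$container_unit.d/10-partof.conf"
}
```

After writing the drop-ins, systemctl --user daemon-reload is needed so that systemd picks them up.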

@jayache80
Author

I guess what I don't understand is: in what way do the containers and pods that podman-compose creates differ from those that the raw podman commands create? In each case we're running the same podman generate systemd command, but we get different output depending on whether the pod and containers were created with podman or with podman-compose. (I'm sorry I haven't spent enough time reading through the podman-compose code.)

@muayyad-alsadi
Collaborator

Try adding a network and defining different hostnames:

# Create pod
podman pod create --name foo --infra=false --share=

# create network
podman network create foo_net

# Create containers and run them in the pod
podman run --name app -dt --pod foo --net foo_net --network-alias=app -h apphost docker.io/python
podman run --name web -dt --pod foo --net foo_net --network-alias=web -h webhost docker.io/nginx

# The pod will be running now, so let's stop it:
podman pod stop foo

# Generate systemd unit files
podman generate systemd foo --files --name

what way do the containers and pods that podman-compose creates differ from those that the raw podman commands create?

Basically, the compose spec is more complex than the simple example you have shown.
It allows specifying multiple networks and different hostnames.
For a container to join a network, the pod must not have a shared network,
and to be able to use --hostname apphost, the pod must not have shared namespaces (--share="").

So the command we end up with is podman pod create --name foo --infra=false --share=
I use --infra=false instead of --infra=true because the presence of infra causes the status of the pod to be degraded instead of running (most likely because of --share="").

You can try removing --share= from the podman-compose source, just as you added the infra back,
but it will break stacks that need to specify a hostname (like the included ansible awx docker-compose), a user namespace, etc.

@Aposhian

Aposhian commented Mar 6, 2022

As a note, I am interested in developing a library/command to convert directly from compose files to systemd units for podman. That way podman-compose would not have to be invoked in the systemd unit files. This would be facilitated by #445

@muayyad-alsadi
Collaborator

@jayache80 you can now choose not to create a pod, or to create the pod with whatever arguments you want:

#442 (comment)

podman-compose --no-pod up -d
# or
podman-compose --pod-args='--infra=false --share=""' up -d
# or
podman-compose --pod-args='--infra=true --share=""' up -d

@muayyad-alsadi
Collaborator

After the latest push, please check my comment:

#307 (comment)

@jayache80 @Aposhian your feedback is highly appreciated

@jayache80
Author

jayache80 commented Mar 13, 2022

Generic pod troubleshooting

@muayyad-alsadi Thanks for addressing this. I tried to get an iteration of this "hello world" to work, but the pod always comes up as degraded as I've seen mentioned. This is foo.yaml:

version: "2.1"
services:
  app:
    image: docker.io/python
    container_name: app
    ports:
      - 5000:5000
  web:
    image: docker.io/nginx
    container_name: web
    ports:
      - 8080:80

And I try to start it with:

podman-compose -f foo.yaml --pod-args='--infra=true --share=""' up -d

The pod starts but the app (which just starts a python interpreter) immediately exits and the pod has Degraded status.

I tried adding --share=net but that doesn't seem to affect things.

I tried adding:

networks:
  - foonet

blocks to the yaml, but I'm fairly certain that's not supported, as that yields RuntimeError: missing networks: default.

Clearly I don't know what I'm doing so any suggestions on how to modify this toy yaml such that I can get a grasp on this basic use case would be much appreciated.

New systemd stuff

I ran:

sudo podman-compose systemd --action create-unit

followed by:

podman-compose systemd -a register -f foo.yaml

and then

systemctl --user enable --now podman-compose@foo

That created this systemd user file: /usr/lib/systemd/user/podman-compose@.service, the environment file ~/.config/containers/compose/projects/foo.env, and installed my service to ~/.config/systemd/user/default.target.wants/podman-compose@foo.service, and it started once. A subsequent stop and start yielded:

Mar 12 20:11:39 archdesktop web[164722]: 10-listen-on-ipv6-by-default.sh: info: can not modify /etc/nginx/conf.d/default.conf (read-only file system?)
Mar 12 20:11:39 archdesktop web[164722]: /docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
Mar 12 20:11:39 archdesktop web[164722]: /docker-entrypoint.sh: 22: /docker-entrypoint.d/20-envsubst-on-templates.sh: Transport endpoint is not connected
Mar 12 20:11:39 archdesktop podman[164736]: 2022-03-12 20:11:39.833547804 -0800 PST m=+0.048054610 container died 86bab07aedcc137cfff12329dd743cd9acdd5e15ac46c2da3dbd418472c078a5 (image=docker.io/library/nginx:latest, name=web)
Mar 12 20:11:39 archdesktop podman[164736]: 2022-03-12 20:11:39.977419018 -0800 PST m=+0.191925826 container cleanup 86bab07aedcc137cfff12329dd743cd9acdd5e15ac46c2da3dbd418472c078a5 (image=docker.io/library/nginx:latest, name=web, com.docker.compose.project.config_files=foo.yaml, com.docker.compose.container-number=1, com.docker.compose.project.working_dir=/home/archuser/code/fooooo, com.docker.compose.service=web, io.podman.compose.config-hash=e0200b17bbbdb68faf5e37baaa80f22e959acd9782153931020258e52ec4387a, io.podman.compose.project=fooooo, maintainer=NGINX Docker Maintainers <[email protected]>, io.podman.compose.version={__version__}, com.docker.compose.project=fooooo)
Mar 12 20:11:40 archdesktop systemd[675]: [email protected]: Failed with result 'exit-code'.

I'm really not sure what's going on there, possibly user error (but please ignore the fact that the service is now called fooooo).

But it's pretty neat that the systemd unit creation is integrated directly through podman-compose and coordinated by the registered .env file!

@muayyad-alsadi
Collaborator

muayyad-alsadi commented Mar 13, 2022

podman-compose -f foo.yaml --pod-args='--infra=true --share=""' up -d

I prefer having a pod with no infra and nothing shared (that's why I made it the default)

I tried adding --share=net but that doesn't seem to affect things.

Please use podman-compose down between the runs to make sure your changes take effect.

I'm really not sure what's going on there, possibly user error (but please ignore the fact that the service is now called fooooo).

I've just adjusted the unit file to use ExecStartPre=- instead of ExecStartPre= (the leading - tells systemd not to treat a failure of that command as a failure of the unit).
This might fix your problem.

You might need to recreate the unit file and reload the daemon:

sudo podman-compose systemd --action create-unit
sudo systemctl daemon-reload
systemctl --user daemon-reload

@jayache80
Author

Thanks for assisting @muayyad-alsadi

I prefer having a pod with no infra and nothing shared (that's why I made it the default)

Unfortunately even with --infra=false I never got the pod created without a Degraded status.

please use podman-compose down between the runs

Indeed this allows the service to be restarted. So I have to do something like this:

systemctl --user stop podman-compose@foo.service
podman-compose -f path/to/foo.yaml down
systemctl --user start podman-compose@foo.service

every time I want the service stopped and started. Is that expected? If so, I would argue that having to keep track of this and run an intermediate podman-compose command before the service can be restarted is undesirable. I envision a self-contained, fully abstracted systemd user service, something I don't even need to know is implemented with podman pods and containers. For example, in the original example, where the containers are created within a pod and the systemd service files are generated using only podman commands, the end result is a systemd user service that can be stopped and restarted without ever running another podman command, a la "set it and forget it".

@muayyad-alsadi
Collaborator

every time I want the service stopped and started

This should not be the case. down was only needed to apply your changes to --infra= and similar options.

commands like the following should work normally

systemctl --user stop podman-compose@foo.service
systemctl --user start podman-compose@foo.service

and even if you typed

podman pod stop pod_foo

the following should report the proper status even though the pod was stopped outside systemd:

systemctl --user status podman-compose@foo.service

without ever running another podman command ever again, a la "set it and forget it".

yes this is the objective and it works for me

I've just added a command to help you list registered compose stacks

$ podman-compose systemd -a list
simple
demo1
demo2

which lists the registered compose projects. Each one has an env file:

$ cat ~/.config/containers/compose/projects/simple.env
COMPOSE_PROJECT_DIR=/home/alsadi/proj/podman-compose/tests/simple
COMPOSE_FILE=docker-compose.yaml
COMPOSE_PROJECT_NAME=simple
COMPOSE_PATH_SEPARATOR=:

$ systemctl --user cat podman-compose@simple
# /usr/lib/systemd/user/podman-compose@.service

[Unit]
Description=%i rootless pod (podman-compose)

[Service]
Type=simple
EnvironmentFile=%h/.config/containers/compose/projects/%i.env
ExecStartPre=-/home/alsadi/proj/podman-compose/podman_compose.py up --no-start
ExecStartPre=/usr/bin/podman pod start pod_%i
ExecStart=/home/alsadi/proj/podman-compose/podman_compose.py wait
ExecStop=/usr/bin/podman pod stop pod_%i

[Install]
WantedBy=default.target
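
Judging from the simple.env file above, registering a project amounts to writing four variables that the template unit picks up through EnvironmentFile=. A rough sketch of writing such a file by hand, assuming that this format is all the unit needs; write_compose_env is a made-up helper, not a podman-compose command:

```shell
# Write an env file in the same four-variable format as simple.env above.
# write_compose_env is a hypothetical helper, not part of podman-compose.
write_compose_env() {
    project="$1"       # becomes COMPOSE_PROJECT_NAME and the file name
    project_dir="$2"   # directory containing the compose file
    compose_file="$3"  # e.g. docker-compose.yaml
    # Optional 4th argument: target directory (defaults to the path shown above)
    env_dir="${4:-$HOME/.config/containers/compose/projects}"
    mkdir -p "$env_dir" || return 1
    {
        printf 'COMPOSE_PROJECT_DIR=%s\n' "$project_dir"
        printf 'COMPOSE_FILE=%s\n' "$compose_file"
        printf 'COMPOSE_PROJECT_NAME=%s\n' "$project"
        printf 'COMPOSE_PATH_SEPARATOR=:\n'
    } > "$env_dir/$project.env"
}
```

In normal use podman-compose systemd -a register does this for you; the sketch is only meant to show what the unit's %i.env lookup expects to find.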

as you can see below the status is running (no infra, all defaults and it's running)

$ systemctl --user start podman-compose@nets_test4
$ podman pod ps
POD ID        NAME            STATUS      CREATED        INFRA ID    # OF CONTAINERS
387819eddcdf  pod_simple      Running     4 minutes ago              1
fd1ee601d026  pod_nets_test4  Running     39 hours ago               4
$ podman pod stats pod_nets_test4
POD           CID           NAME               CPU %       MEM USAGE/ LIMIT   MEM %       NET IO      BLOCK IO    PIDS
fd1ee601d026  5e53bf173607  nets_test4_web1_1  3.62%       888.8kB / 16.65GB  0.01%       -- / --     -- / --     1
fd1ee601d026  6300540a75b3  nets_test4_web2_1  3.50%       892.9kB / 16.65GB  0.01%       -- / --     -- / --     1
fd1ee601d026  cd0a56514e36  nets_test4_web4_1  5.41%       892.9kB / 16.65GB  0.01%       -- / --     -- / --     1
fd1ee601d026  5b2bfc24de33  nets_test4_web3_1  6.36%       888.8kB / 16.65GB  0.01%       -- / --     -- / --     1

@jayache80
Author

@muayyad-alsadi

this should not be the case

without ever running another podman command ever again, a la "set it and forget it".

yes this is the objective and it works for me

Glad to hear this. Perhaps it's an incompatible set of images/containers that I'm working with.

By chance, do you have no problems with a docker-compose.yaml like this:

version: "2.1"
services:
  app:
    image: docker.io/python
    container_name: app
    ports:
      - 5000:5000
  web:
    image: docker.io/nginx
    container_name: web
    ports:
      - 8080:80

or does it reproduce the issue that I'm getting?

@muayyad-alsadi
Collaborator

Regarding the degraded problem: it's not related to infra. It's because your compose file has two containers and one of them is dead:

$ podman ps -a -f 'pod=pod_foo'
CONTAINER ID  IMAGE                            COMMAND               CREATED        STATUS                    PORTS                   NAMES
527b1950cbc8  docker.io/library/python:latest  python3               6 minutes ago  Exited (0) 3 minutes ago  0.0.0.0:5000->5000/tcp  app
58d888884ea3  docker.io/library/nginx:latest   nginx -g daemon o...  5 minutes ago  Up 3 minutes ago          0.0.0.0:8080->80/tcp    web

Your app (the python container) exited because it does not run any command. I've changed it to run something like python -m http.server 5000:

$ cat docker-compose.yaml 
version: "2.1"
services:
  app:
    image: docker.io/python
    command: python -m http.server 5000
    working_dir: /etc
    container_name: app
    ports:
      - 5000:5000
  web:
    image: docker.io/nginx
    container_name: web
    ports:
      - 8080:80

and now it works perfectly

$ podman pod ps
POD ID        NAME            STATUS      CREATED             INFRA ID    # OF CONTAINERS
29a9ac685816  pod_foo         Running     About a minute ago              2

$ podman ps -a -f 'pod=pod_foo'
CONTAINER ID  IMAGE                            COMMAND               CREATED        STATUS            PORTS                   NAMES
251d4c289579  docker.io/library/python:latest  python -m http.se...  2 minutes ago  Up 2 minutes ago  0.0.0.0:5000->5000/tcp  app
fbe4fa45e4e7  docker.io/library/nginx:latest   nginx -g daemon o...  2 minutes ago  Up 2 minutes ago  0.0.0.0:8080->80/tcp    web
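
Since the pod shows as Degraded when only some of its containers are running, a quick check is to list the containers whose status is not "Up". A small sketch that filters name/status pairs; it assumes input produced by something like podman ps -a -f pod=pod_foo --format '{{.Names}}|{{.Status}}', and not_running is a made-up helper name:

```shell
# Read "name|status" lines on stdin and print the names of containers whose
# status does not start with "Up". not_running is a hypothetical helper.
not_running() {
    awk -F'|' '$2 !~ /^Up/ { print $1 }'
}
```

For example, podman ps -a -f pod=pod_foo --format '{{.Names}}|{{.Status}}' | not_running would have printed app for the first listing above and nothing for the second.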

My first attempt with systemd failed because it was pulling the images in the background until the start-pre step timed out;
it then tried to start the unfinished pod and failed:

Mar 15 12:05:25 alsadi-laptop.localdomain podman_compose.py[80278]: Copying blob sha256:38121472aa0128f87b31fde5c07080418cc17b4a8ee224767b59e24c592ff7d3
Mar 15 12:05:49 alsadi-laptop.localdomain systemd[1196]: [email protected]: start-pre operation timed out. Terminating.
Mar 15 12:05:50 alsadi-laptop.localdomain podman[83226]: Error: no containers in pod c870bcb0411200474ed55b708266fc5522d2dca72ed6e41dbc253842a5deacad have no dependencies, cannot start pod: no such container
Mar 15 12:05:50 alsadi-laptop.localdomain systemd[1196]: [email protected]: Control process exited, code=exited, status=125/n/a
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ An ExecStartPre= process belonging to unit UNIT has exited.
░░ 
░░ The process' exit code is 'exited' and its exit status is 125.
Mar 15 12:05:50 alsadi-laptop.localdomain systemd[1196]: [email protected]: Failed with result 'timeout'.

I'll solve this by printing a message that you should run podman-compose build and podman-compose pull (or podman-compose start -d) once, to make sure systemd won't be interactive or take a long time pulling images, etc.
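
The pull and build steps can be run once by hand before enabling the unit, so that ExecStartPre= has nothing slow left to do. A minimal sketch; warm_up and the COMPOSE override variable are made up for illustration:

```shell
# Command used to invoke podman-compose; override it for a dry run,
# e.g. COMPOSE="echo podman-compose". (COMPOSE is a hypothetical variable.)
COMPOSE="${COMPOSE:-podman-compose}"

# Pull images, then build any services that define a build section.
# Stops at the first failing step.
warm_up() {
    compose_file="$1"
    $COMPOSE -f "$compose_file" pull &&
        $COMPOSE -f "$compose_file" build
}
```

Running warm_up foo.yaml once, and only then systemctl --user enable --now podman-compose@foo, should keep the image download out of the unit's start timeout.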

Also, I noticed that changes to the compose file are not applied unless you use down.
There is already a ticket for this:

#409

@muayyad-alsadi
Collaborator

@jayache80

#409 is now fixed, changes are detected and applied automatically.
