Skip to content

Use systemd mount unit file for extra volume mount in compat mode#245

Closed
rhvgoyal wants to merge 1 commit intoprojectatomic:masterfrom
rhvgoyal:mount-unit-compat
Closed

Use systemd mount unit file for extra volume mount in compat mode#245
rhvgoyal wants to merge 1 commit intoprojectatomic:masterfrom
rhvgoyal:mount-unit-compat

Conversation

@rhvgoyal
Copy link
Collaborator

@rhvgoyal rhvgoyal commented Jun 2, 2017

We were using systemd mount unit file for extra volume mount. Then we
wanted to extend container-storage-setup for use by other run times and
we started doing mounts inline and got rid of dependency on systemd mount
unit.

But this does not work for compatibility mode. And reason being that now
it is possible that volume device is not up by the time container-storage-setup
runs and that means we will not mount it and either try to create new or
return.

We can't even wait for device to come up as in comaptibility mode we
don't save any metadata. So we don't know if we are running for first
time and we should create volume or we are restarting and we should
wait for volume.

So go back to creating a systemd mount unit file for mounting extra
volumes. Only for compatibility mode. In non-compatibility mode, we
need to explicitly activate configuration and that means by that time
volume has already been created and that means we can wait for volume.

Signed-off-by: Vivek Goyal vgoyal@redhat.com

@rhvgoyal
Copy link
Collaborator Author

rhvgoyal commented Jun 2, 2017

@rhatdan PTAL

@rhvgoyal
Copy link
Collaborator Author

rhvgoyal commented Jun 2, 2017

@dustymabe I think this should fix the issue you are seeing

@rhvgoyal rhvgoyal force-pushed the mount-unit-compat branch from 929e92b to a4559f0 Compare June 2, 2017 15:54
@rhvgoyal
Copy link
Collaborator Author

rhvgoyal commented Jun 2, 2017

Fixed a bug and pushed new patch with "set -x". Trying to debug why /var/lib/containers is not mounted.

@rhatdan
Copy link
Member

rhatdan commented Jun 2, 2017

LGTM other then the test failure.

# e.g if ${mount_dir} is /var/lib/containers
# then filename will be var-lib-containers.mount
filename=$(systemd_escaped_filename ${mount_dir})
unit_file_path="/etc/systemd/system/$filename"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not certain about re-engineering this now, but doing things like this is exactly why systemd generators were created. This is fresh on my mind since I recently wrote one for ostree.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I will have to write a generator of my own? How it is different than what I am doing now? I am also generating a unit file. Just that outside the systemd mechanism to call generators.

Also I am assuming that I need to drop a generator binary somewhere (instead of script) and we don't compile anything in this package.

So for this use case, it feels like that switching to generator mechanism might be work and less gain.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generators can be shell scripts, just look at:

# rpm -qf /usr/lib/systemd/system-generators/selinux-autorelabel-generator.sh 
policycoreutils-2.5-20.fc25.x86_64
# cat /usr/lib/systemd/system-generators/selinux-autorelabel-generator.sh
#!/bin/sh

# This systemd.generator(7) detects if SELinux is running and if the
# user requested an autorelabel, and if so sets the default target to
# selinux-autorelabel.target, which will cause the filesystem to be
# relabelled and then the system will reboot again and boot into the
# real default target.

PATH=/usr/sbin:$PATH
unitdir=/usr/lib/systemd/system

# If invoked with no arguments (for testing) write to /tmp.
earlydir="/tmp"
if [ -n "$2" ]; then
    earlydir="$2"
fi

set_target ()
{
    ln -sf "$unitdir/selinux-autorelabel.target" "$earlydir/default.target"
}

if selinuxenabled; then
    if test -f /.autorelabel; then
        set_target
    elif grep -sqE "\bautorelabel\b" /proc/cmdline; then
        set_target
    fi
fi

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The argument for writing a generator is that we're plugging into a well-understood mechanism designed for generating unit files.

To repeat again, it handles the daemon reload thing for us. Another thing is that the generated units are explicitly runtime only, stored in /run. That means it's much clearer that the source config files are the canonical thing. If it's in /etc a sysadmin might think they should edit it.

Actually, that's another thing - if a sysadmin wants to override or extend the unit, then units stored in /etc will take precendence over it. Which is another issue that I fixed with the ostree generator for /var.

WantedBy=docker-storage-setup.service
EOF
systemctl enable $filename >/dev/null 2>&1
systemctl start $filename
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally in scripts you want --no-block. Otherwise one can get into dependency loops.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think a daemon-reload is required here. Which note is something that the systemd generator model handles for you.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I will use --no-block.

daemon-reload makes sense too. Will use that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm..., If I use --no-block, that means by the time docker-storage-setup finishes, there is a possibility that mount is not finished. IOW, docker-storage-setup can return without mount actually being there and that means docker can start without mount actually being there. And I don't think we want that.

So I think it makes sense not to use --no-block here.

unit_file_path="/etc/systemd/system/$filename"

# If unit file already exists, nothing to do.
[ -f "$unit_file_path" ] && return 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pet peeve: this isn't atomic - if the system crashes in the middle we could have a partially written mount unit. Use a temporary file and rename into place. (Perhaps someday bash will support O_TMPFILE?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. other parts of scripts are already doing this logic. Will use it for unit file too.

# is still racy though. There is no guarantee that volume will be
# up by the time this code runs when unit file is not present already.
if extra_volume_exists $CONTAINER_ROOT_LV_NAME $VG; then
if ! setup_systemd_mount_unit_compat "$VG" "$CONTAINER_ROOT_LV_NAME" "$_RESOLVED_MOUNT_DIR_PATH"; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the admin changes the config file, we won't regenerate the mount unit, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If admin changes config file to start mounting on a different path, then a mount unit will be generated. Ideally I am expecting user to "reset" existing configuration and then make changes to config file. Otherwise there are many possibilities and many things can go wrong.

On a side note, to cover for all these corner cases, I have written new commands for container-storage-setup and at some point of time we should make use of it. These commands store their
own metadata and don't allow arbitrary changes to existing configuration.

@rhvgoyal rhvgoyal force-pushed the mount-unit-compat branch 2 times, most recently from c96c321 to edd260a Compare June 2, 2017 18:43
@rhvgoyal
Copy link
Collaborator Author

rhvgoyal commented Jun 2, 2017

@cgwalters Made two changes. generating a temp unit file and renaming it. And using systemd daemon-reload.

I am not thinking of moving to using generators for this yet.

PTAL


cat <<EOF > "${unit_file_path}.tmp"
[Unit]
Description=Mount $lv_name on $mount_dir directory.
Copy link
Member

@cgwalters cgwalters Jun 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, right now the sysadmin could edit this (and actually, might want to for something like changing the filesystem Type right?).

But perhaps let's give ourselves a mechanism that allows us to switch to a generator in the future with a simple comment here like:

#  WARNING: This file was auto-generated by container-storage-setup.
# In the future, this file might be moved to a different location.  If you want
# to edit it, remove this comment.

(Then in the future we could do if grep -q "WARNING: auto-generated by container-storage-setup"; then rm -f)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now I am not expecting anybody to touch this file. As of now, we create xfs filesystem on extra volume and there are no knobs to configure it. So I am not expecting anyobdy to touch this file.

Down the line when we introduce any config variable to allow specifying filesystem, then we can specify filesystem type accordingly.

But putting a warning that this file has been automatically generated by container-storage-setup is good anyway. Atleast system admin will know who created this file.

We were using systemd mount unit file for extra volume mount. Then we
wanted to extend container-storage-setup for use by other run times and
we started doing mounts inline and got rid of dependency on systemd mount
unit.

But this does not work for compatibility mode. And reason being that now
it is possible that volume device is not up by the time container-storage-setup
runs and that means we will not mount it and either try to create new or
return.

We can't even wait for device to come up as in comaptibility mode we
don't save any metadata. So we don't know if we are running for first
time and we should create volume or we are restarting and we should
wait for volume. 

So go back to creating a systemd mount unit file for mounting extra
volumes. Only for compatibility mode. In non-compatibility mode, we
need to explicitly activate configuration and that means by that time
volume has already been created and that means we can wait for volume.  

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
@rhvgoyal rhvgoyal force-pushed the mount-unit-compat branch from edd260a to 1b84349 Compare June 2, 2017 19:25
@rhvgoyal
Copy link
Collaborator Author

rhvgoyal commented Jun 2, 2017

Added WARNING to mount unit file.

@cgwalters
Copy link
Member

OK, I'll be honest - I haven't yet given this code a full review; the c-s-s code has changed a lot. But it does seem likely to get us to a better place.

@rh-atomic-bot r+ 1b84349

@rh-atomic-bot
Copy link

⌛ Testing commit 1b84349 with merge 9b77bcb...

@rh-atomic-bot
Copy link

☀️ Test successful - status-redhatci
Approved by: cgwalters
Pushing 9b77bcb to master...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants