Filesystems
OSv supports a variety of filesystems, which are described in the paragraphs below. In terms of layering, it comes with the VFS layer (see fs/vfs/*) and individual filesystem implementations found under fs/**/*, with the exception of ZFS.
At boot, OSv first mounts the BootFS filesystem (see vfs_init() and mount_rootfs()) and then mounts and pivots to a 'real' filesystem such as RoFS, ZFS, or Virtio-FS (for details see this code in loader.cc), unless the --nomount kernel option was specified. The root filesystem can be selected explicitly with the --rootfs option; otherwise, the loader tries to discover it by trying RoFS, then Virtio-FS, and finally ZFS.
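The selection order described above can be sketched as a small shell function. This is only an illustration of the documented behavior, not the code in loader.cc: the function name `pick_rootfs`, the list of "mountable" filesystems passed as arguments, and the `virtiofs` spelling are assumptions made for the example.

```shell
#!/bin/sh
# Illustrative sketch only: mirrors the documented order, not the loader.cc logic.
# pick_rootfs EXPLICIT [AVAILABLE...]
#   EXPLICIT   value given to --rootfs, or "" when the option is absent
#   AVAILABLE  filesystems that would mount successfully (hypothetical input)
pick_rootfs() {
    explicit=$1
    shift
    # An explicit --rootfs always wins.
    if [ -n "$explicit" ]; then
        echo "$explicit"
        return 0
    fi
    # Otherwise try RoFS, then Virtio-FS, and finally ZFS.
    for candidate in rofs virtiofs zfs; do
        for fs in "$@"; do
            if [ "$fs" = "$candidate" ]; then
                echo "$candidate"
                return 0
            fi
        done
    done
    # Nothing from the list is mountable.
    return 1
}

pick_rootfs "" zfs virtiofs   # no --rootfs given: Virtio-FS is tried before ZFS
pick_rootfs rofs zfs          # explicit --rootfs=rofs
```

The nested loop keeps the priority order fixed regardless of the order in which the available filesystems are listed.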
Please note that OSv also supports the /etc/fstab file, where one can add an extra filesystem mount point. In addition, one can mount an extra filesystem by adding the appropriate options to the command line like so:
./scripts/run.py --execute='--rootfs=rofs --mount-fs=zfs,/dev/vblk0.2,/data /hello'
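The value passed to --mount-fs in the command above has the shape `<type>,<device>,<mount point>`. A minimal sketch of composing it in a script follows; `mount_fs_opt` is a hypothetical helper introduced only for this example, and it does no validation of its arguments.

```shell
#!/bin/sh
# mount_fs_opt TYPE DEVICE MOUNT_POINT
# Hypothetical helper: prints a --mount-fs option in the form used above.
mount_fs_opt() {
    printf '%s=%s,%s,%s\n' --mount-fs "$1" "$2" "$3"
}

# Assemble the OSv command line, then the run.py invocation that carries it.
cmdline="--rootfs=rofs $(mount_fs_opt zfs /dev/vblk0.2 /data) /hello"
echo "./scripts/run.py --execute='$cmdline'"
```

Building the option from three separate arguments makes it harder to drop a comma or a quote when the device or mount point changes.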
To build an image with a specific type of filesystem, specify the fs option (which defaults to zfs) like so:
./scripts/build image=tests fs=rofs
The ext2/3/4 filesystem support was added in the form of a shared pluggable module libext on top of the lwext4 library. For more details on how to build images with ext support and have OSv mount a secondary disk with the ext filesystem, please read this readme.
The ZFS code is based on the FreeBSD implementation from around 2014 and has since been adapted to work in OSv. ZFS is a sophisticated filesystem that traces its roots to Solaris, and you can find some resources about it on this Wiki page. The majority of the ZFS code can be found under the subtree bsd/sys/cddl/. The ZFS filesystem driver has fairly recently been extracted from the kernel into a separate shared library, libsolaris.so, which is dynamically loaded at boot from a different filesystem (most likely BootFS or RoFS) before the ZFS filesystem can be mounted.
There are three ways ZFS can be mounted on OSv:
- The first and original one mounts ZFS at the root (/) from the 1st partition of the 1st disk, /dev/vblk0.1.
- The second one mounts ZFS from the 2nd partition of the 1st disk, /dev/vblk0.2, at an arbitrary non-root mount point, for example /data.
- Similarly, the third one mounts ZFS from the 1st partition of the 2nd or higher disk, for example /dev/vblk1.1, at an arbitrary non-root mount point as well.
Please note that both the second and third options assume that the root filesystem is non-ZFS, most likely RoFS or Virtio-FS.
The disadvantage of the 1st option is that the code and data live in the same read-write filesystem, whereas the other two options allow one to isolate code from mutable data. Ideally, one would put all code and configuration on a RoFS partition and the mutable data on a separate partition, either on the same disk (2) or on a different one (3). It has been shown that booting and mounting ZFS from a separate disk is also slightly faster (by 30-40ms) compared to the original option 1.
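The device names used in the three options follow a pattern that can be read off the examples: the disk index in the name starts at 0, while the partition number starts at 1. A minimal sketch of that mapping, where `vblk_dev` is a hypothetical helper that takes both values as 1-based ordinals as used in the prose:

```shell
#!/bin/sh
# vblk_dev DISK PARTITION
# Hypothetical helper: maps "Nth disk, Mth partition" (both 1-based)
# to the device name pattern seen in the examples above.
vblk_dev() {
    echo "/dev/vblk$(( $1 - 1 )).$2"
}

vblk_dev 1 1   # option 1: 1st partition of the 1st disk
vblk_dev 1 2   # option 2: 2nd partition of the 1st disk
vblk_dev 2 1   # option 3: 1st partition of the 2nd disk
```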
Below are examples of building and running OSv with ZFS for each of the three options.
Option 1, ZFS mounted at the root: this is the original and default method. Please note that libsolaris.so is part of loader.elf and is loaded from BootFS, which makes the kernel larger.
./scripts/build image=native-example fs=zfs #The fs defaults to zfs
./scripts/run.py
OSv v0.56.0-152-gfd716a77
...
devfs: created device vblk0.1 for a partition at offset:4194304 with size:532676608
virtio-blk: Add blk device instances 0 as vblk0, devsize=536870912
...
zfs: driver has been initialized!
VFS: mounting zfs at /zfs
zfs: mounting osv/zfs from device /dev/vblk0.1
...
Option 2, ZFS on the 2nd partition of the same disk: this fairly new method allows ZFS to be mounted at a non-root mount point such as /data and mixed with another filesystem on the same disk. Please note that libsolaris.so is placed on the root filesystem (typically RoFS) under /usr/lib/fs/ and loaded from it automatically. The build script automatically adds the relevant mount point line to /etc/fstab.
./scripts/build image=native-example,zfs fs=rofs_with_zfs #Has to add zfs module that adds /usr/lib/fs/libsolaris.so to RoFS
./scripts/run.py
OSv v0.56.0-152-gfd716a77
...
devfs: created device vblk0.1 for a partition at offset:4194304 with size:191488
devfs: created device vblk0.2 for a partition at offset:4385792 with size:532676608
virtio-blk: Add blk device instances 0 as vblk0, devsize=537062400
...
VFS: mounting rofs at /rofs
zfs: driver has been initialized!
VFS: initialized filesystem library: /usr/lib/fs/libsolaris.so
VFS: mounting devfs at /dev
VFS: mounting procfs at /proc
VFS: mounting sysfs at /sys
VFS: mounting ramfs at /tmp
VFS: mounting zfs at /data
zfs: mounting osv/zfs from device /dev/vblk0.2
...
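The offsets and sizes in the boot log above describe the on-disk layout, and the numbers can be cross-checked with shell arithmetic. The sketch below only uses values copied from that log; the variable names are made up for the example.

```shell
#!/bin/sh
# Values copied from the devfs/virtio-blk lines of the boot log above.
p1_offset=4194304;  p1_size=191488       # vblk0.1 (RoFS)
p2_offset=4385792;  p2_size=532676608    # vblk0.2 (ZFS)
devsize=537062400                        # vblk0

# The ZFS partition starts right where the RoFS partition ends ...
[ $(( p1_offset + p1_size )) -eq "$p2_offset" ] && echo "vblk0.2 starts where vblk0.1 ends"
# ... and runs to the end of the disk.
[ $(( p2_offset + p2_size )) -eq "$devsize" ] && echo "vblk0.2 ends at the end of the disk"

echo "first partition offset: $(( p1_offset / 1024 / 1024 )) MiB"
echo "ZFS partition size: $(( p2_size / 1024 / 1024 )) MiB"
```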
Option 3, ZFS on a separate disk: this fairly new method is similar to the one above in that it also allows ZFS to be mounted at a non-root mount point such as /data, but this time from a different disk. Please note that libsolaris.so is again placed on the root filesystem (typically RoFS) under /usr/lib/fs/ and loaded from it automatically. As above, the build script automatically adds the relevant mount point line to /etc/fstab.
./scripts/build image=native-example,zfs fs=rofs --create-zfs-disk #Creates empty disk at build/last/zfs_disk.img with ZFS filesystem
./scripts/run.py --second-disk-image build/last/zfs_disk.img
OSv v0.56.0-152-gfd716a77
...
devfs: created device vblk0.1 for a partition at offset:4194304 with size:1010688
virtio-blk: Add blk device instances 0 as vblk0, devsize=5204992
devfs: created device vblk1.1 for a partition at offset:512 with size:536870400
virtio-blk: Add blk device instances 1 as vblk1, devsize=536870912
...
VFS: mounting rofs at /rofs
zfs: driver has been initialized!
VFS: initialized filesystem library: /usr/lib/fs/libsolaris.so
VFS: mounting devfs at /dev
VFS: mounting procfs at /proc
VFS: mounting sysfs at /sys
VFS: mounting ramfs at /tmp
VFS: mounting zfs at /data
zfs: mounting osv/zfs from device /dev/vblk1.1
...
However, with a different disk setup, you can manually make OSv mount a different disk and partition by explicitly using the --mount-fs boot option like so:
#Build ZFS disk somehow differently and make sure the `build` does not append ZFS mount point (inspect build/last/fstab)
./scripts/run.py --execute='--rootfs=rofs --mount-fs=zfs,/dev/vblk1.1,/data /hello' --second-disk-image <disk_path>
Please note that in the examples above, the ZFS pool and filesystem are created using the zfs_loader.elf version of OSv, which executes zpool.so, zfs.so, and cpiod.so, among others. This is actually quite fast and efficient, but the build mechanism has recently been enhanced to create ZFS disks using zpool and zfs on a Linux host, provided you have OpenZFS installed (see more info here).
To that end, there is a fairly new script, zfs-image-on-host.sh, that can be used to either mount an existing OSv ZFS disk or create a new one. The latter can be orchestrated by the build script if one passes the option --use-openzfs like so:
./scripts/build image=native-example fs=zfs -j$(nproc) --use-openzfs
Help output from zfs-image-on-host.sh:
Manipulate ZFS images on host using OpenZFS - mount, unmount and build.
Usage: zfs-image-on-host.sh mount <image_path> <partition> <pool_name> <filesystem> |
build <image_path> <partition> <pool_name> <filesystem> <populate_image> |
unmount <pool_name>
Where:
image_path path to a qcow2 or raw ZFS image; defaults to build/last/usr.img
partition partition of disk above; defaults to 1
pool_name name of ZFS pool; defaults to osv
filesystem name of ZFS filesystem; defaults to zfs
populate_image boolean value to indicate if the image should be populated with content
from build/last/usr.manifest; defaults to true but only used with 'build' command
Examples:
zfs-image-on-host.sh mount # Mount OSv image from build/last/usr.img under /zfs
zfs-image-on-host.sh mount build/last/zfs_disk.img 1 # Mount OSv image from build/last/zfs_disk.img 2nd partition under /zfs
zfs-image-on-host.sh unmount # Unmount OSv image from /zfs
Using the same script, you can mount any ZFS disk on the host, inspect its files, modify them if you want, and unmount it. OSv will then see all the changes when run with the same disk:
./scripts/zfs-image-on-host.sh mount build/last/zfs_disk.img
Connected device /dev/nbd0 to the image build/last/zfs_disk.img
Imported pool osv
Mounted osv/zfs at /zfs
[wkozaczuk@fedora-mbpro osv]$ find /zfs/
/zfs/
/zfs/seaweedfs
/zfs/seaweedfs/logs
/zfs/seaweedfs/logs/weed.osv.osv.log.WARNING.20220726-181118.2
/zfs/seaweedfs/logs/weed.osv.osv.log.INFO.20220726-180155.2
/zfs/seaweedfs/logs/weed.WARNING
/zfs/seaweedfs/logs/weed.INFO
/zfs/seaweedfs/master
/zfs/seaweedfs/master/snapshot
find: ‘/zfs/seaweedfs/master/snapshot’: Permission denied
/zfs/seaweedfs/master/log
/zfs/seaweedfs/master/conf
./scripts/zfs-image-on-host.sh unmount
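The mount, inspect, and unmount sequence shown above can be wrapped so that the image is unmounted even when the inspection step fails. The following is a minimal sketch under stated assumptions: `inspect_image`, `run`, `ZFS_TOOL`, and `DRY_RUN` are names invented for this example, and dry-run mode (the default here) only prints the commands instead of running them. Whether the script needs elevated privileges on your host is not covered.

```shell
#!/bin/sh
# Hypothetical wrapper around the commands shown above.
ZFS_TOOL=${ZFS_TOOL:-./scripts/zfs-image-on-host.sh}
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# inspect_image IMAGE_PATH: mount, list files under /zfs, always unmount.
inspect_image() {
    image=$1
    run "$ZFS_TOOL" mount "$image" || return 1
    run find /zfs/
    status=$?
    run "$ZFS_TOOL" unmount
    return "$status"
}

inspect_image build/last/zfs_disk.img
```

Setting DRY_RUN=0 makes the same function execute the commands; the exit status of the inspection step is preserved after unmounting.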